[ovs-git] [openvswitch/ovs] 217ac4: dpif-netdev: Avoid deadlock with offloading during...

Timothy Redaelli noreply at github.com
Fri Jul 17 01:52:02 UTC 2020

  Branch: refs/heads/branch-2.13
  Home:   https://github.com/openvswitch/ovs
  Commit: 217ac490ac5eabb6d882527b6a6d6ee1ee64d9ed
  Author: Ilya Maximets <i.maximets at ovn.org>
  Date:   2020-07-17 (Fri, 17 Jul 2020)

  Changed paths:
    M AUTHORS.rst
    M lib/dpif-netdev.c

  Log Message:
  dpif-netdev: Avoid deadlock with offloading during PMD thread deletion.

The main thread will try to pause/stop all revalidators during datapath
reconfiguration via the datapath purge callback (dp_purge_cb) while
holding 'dp->port_mutex'.  A deadlock happens if any of the
revalidator threads is already waiting on 'dp->port_mutex' while
dumping offloaded flows:

           main thread                           revalidator
 ---------------------------------  ----------------------------------


                                    -> dp_netdev_flow_to_dpif_flow
                                    -> get_dpif_flow_status
                                    -> dpif_netdev_get_flow_offload_status()
                                    -> ovs_mutex_lock(&dp->port_mutex)
                                       <waiting for mutex here>

 -> reconfigure_pmd_threads()
 -> dp_netdev_del_pmd()
 -> dp_purge_cb()
 -> udpif_pause_revalidators()
 -> ovs_barrier_block(&udpif->pause_barrier)
    <waiting for revalidators to reach barrier>


We're not allowed to call the offloading API from the userspace
datapath without holding the global port mutex, due to thread safety
restrictions of the netdev-offload-dpdk module.  It's also not easy
to rework the datapath reconfiguration process in order to move the
actual PMD removal and datapath purge out of the port mutex.

So, for now, not sleeping on the mutex if it's not immediately
available seems like the easiest workaround.  This will have an impact
on the flow statistics update rate and on the ability to get the latest
statistics before removing the flow (the latest stats will be lost in
case we were not able to take the mutex).  However, this will allow us
to operate normally while avoiding the deadlock.
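
As a rough illustration of the "do not sleep on the mutex" idea, the
sketch below uses plain POSIX threads rather than the OVS wrappers.
All names in it (flow_stats_sketch, port_mutex_sketch, hw_stats_sketch,
try_query_offload_stats) are hypothetical stand-ins and are not taken
from lib/dpif-netdev.c; the copy from hw_stats_sketch merely stands in
for a query to the offload provider.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for per-flow statistics; not an OVS type. */
struct flow_stats_sketch {
    uint64_t n_packets;
    uint64_t n_bytes;
};

/* Hypothetical stand-in for the global port mutex. */
static pthread_mutex_t port_mutex_sketch = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for what the offload provider would report. */
static struct flow_stats_sketch hw_stats_sketch = { 10, 640 };

/* Tries to read the statistics without ever blocking on the mutex.
 * Returns true and fills '*out' if the mutex was free, false if it is
 * currently held by someone else ('*out' is left untouched then). */
static bool
try_query_offload_stats(struct flow_stats_sketch *out)
{
    if (pthread_mutex_trylock(&port_mutex_sketch) != 0) {
        /* Mutex is busy: give up instead of sleeping, so the caller
         * can still reach its pause barrier. */
        return false;
    }
    *out = hw_stats_sketch;
    pthread_mutex_unlock(&port_mutex_sketch);
    return true;
}
```

A caller that gets 'false' back simply skips this statistics update,
which is the reduced update rate mentioned above.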

Finally, to avoid flapping of flow attributes and statistics, we do
not fail the operation, but return the last statistics and attributes
returned by the offload provider.  Since those might be updated in
different threads, stores and reads are atomic.
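
The "return the last known values" part could look roughly like the
sketch below.  It uses C11 <stdatomic.h> and POSIX threads instead of
the OVS atomic and mutex wrappers, and every name in it (cached_flow,
get_offload_status, the hw_* parameters) is a hypothetical stand-in,
not the code from this commit.  The hw_* arguments stand in for
whatever the offload provider would report when queried.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-flow cache of the last values reported by the
 * offload provider.  Fields are atomic because different threads may
 * store and load them. */
struct cached_flow {
    atomic_bool offloaded;
    atomic_uint_fast64_t n_packets;
    atomic_uint_fast64_t n_bytes;
};

/* Refreshes the cache if 'mutex' can be taken without blocking, then
 * reports the cached values.  The operation never fails: if the mutex
 * is busy, the previously stored values are returned. */
static void
get_offload_status(struct cached_flow *flow, pthread_mutex_t *mutex,
                   uint64_t hw_packets, uint64_t hw_bytes,
                   bool *offloaded, uint64_t *n_packets, uint64_t *n_bytes)
{
    if (pthread_mutex_trylock(mutex) == 0) {
        /* Stand-in for querying the offload provider under the mutex. */
        atomic_store_explicit(&flow->offloaded, true,
                              memory_order_relaxed);
        atomic_store_explicit(&flow->n_packets, hw_packets,
                              memory_order_relaxed);
        atomic_store_explicit(&flow->n_bytes, hw_bytes,
                              memory_order_relaxed);
        pthread_mutex_unlock(mutex);
    }

    *offloaded = atomic_load_explicit(&flow->offloaded,
                                      memory_order_relaxed);
    *n_packets = atomic_load_explicit(&flow->n_packets,
                                      memory_order_relaxed);
    *n_bytes = atomic_load_explicit(&flow->n_bytes,
                                    memory_order_relaxed);
}
```

Note that in this sketch each field is atomic on its own; the three
values together are not guaranteed to form a single consistent
snapshot if another thread updates them concurrently.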

Reported-by: Frank Wang (王培辉) <wangpeihui at inspur.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2020-June/371753.html
Fixes: a309e4f52660 ("dpif-netdev: Update offloaded flows statistics.")
Acked-by: Kevin Traynor <ktraynor at redhat.com>
Acked-by: Ian Stokes <ian.stokes at intel.com>
Tested-by: Eli Britstein <elibr at mellanox.com>
Signed-off-by: Ilya Maximets <i.maximets at ovn.org>

  Commit: 8d0b409fcc26d9548150e6c2e6bb7e6505e7ad5b
  Author: Timothy Redaelli <tredaelli at redhat.com>
  Date:   2020-07-17 (Fri, 17 Jul 2020)

  Changed paths:
    M acinclude.m4

  Log Message:
  acinclude: Remove libmnl for MLX5 PMD.

libmnl has not been used by the MLX5 PMD since DPDK 19.08.

Signed-off-by: Timothy Redaelli <tredaelli at redhat.com>
Acked-by: Numan Siddique <numans at ovn.org>
Reviewed-by: David Marchand <david.marchand at redhat.com>
Signed-off-by: Ilya Maximets <i.maximets at ovn.org>

Compare: https://github.com/openvswitch/ovs/compare/bef407fa7f01...8d0b409fcc26
