[ovs-dev] [PATCH v1 00/23] dpif-netdev: Parallel offload processing

Gaëtan Rivet grive at u256.net
Tue Mar 16 17:48:39 UTC 2021


On Tue, Mar 16, 2021, at 17:23, David Marchand wrote:
> On Tue, Mar 16, 2021 at 4:45 PM David Marchand
> <david.marchand at redhat.com> wrote:
> > But, either I missed something or there is another issue.
> > I _also_ got the assert with the whole series applied, while
> > restarting (stop; sleep 10; start):
> >
> > 2021-03-16T15:40:30.970Z|00368|dpif_netdev|WARN|There's no available
> > (non-isolated) pmd thread on numa node 0. Queue 0 on port 'vhost0'
> > will be assigned to the pmd on core 31 (numa node 1). Expect reduced
> > performance.
> > 2021-03-16T15:40:30.970Z|00369|dpdk|INFO|VHOST_CONFIG: free connfd =
> > 114 for device '/var/lib/vhost_sockets/vhost6'
> > 2021-03-16T15:40:30.970Z|00370|netdev_dpdk|INFO|vHost Device
> > '/var/lib/vhost_sockets/vhost6' not found
> > 2021-03-16T15:40:30.989Z|00371|util|EMER|lib/netdev-offload.c:479:
> > assertion thread_is_hw_offload || thread_is_rcu failed in
> > netdev_offload_thread_init()
> >
> > 2021-03-16T15:40:41.013Z|00001|vlog|INFO|opened log file
> > /var/log/openvswitch/ovs-vswitchd.log
> > 2021-03-16T15:40:41.028Z|00002|ovs_numa|INFO|Discovered 28 CPU cores
> > on NUMA node 0
> > 2021-03-16T15:40:41.028Z|00003|ovs_numa|INFO|Discovered 28 CPU cores
> > on NUMA node 1
> > 2021-03-16T15:40:41.028Z|00004|ovs_numa|INFO|Discovered 2 NUMA nodes
> > and 56 CPU cores
> >
> > And looking at ovs threads, they look fine:
> > 70948    cpu_list=0-1,28-29    ctxt_switches=492,19    urcu3
> > 70950    cpu_list=0-1,28-29    ctxt_switches=4453,12    hw_offload5
> 
> Some more context (played with backtrace() + adding some more info
> manually in ovs_assert()):
> 
> 2021-03-16T17:13:34.278Z|00372|util|ERR|15: [ovs-vswitchd(_start+0x2e)
> [0x4e8e5e]]
> 2021-03-16T17:13:34.278Z|00373|util|ERR|14:
> [/lib64/libc.so.6(__libc_start_main+0xf3) [0x7fdfc1674803]]
> 2021-03-16T17:13:34.278Z|00374|util|ERR|13: [ovs-vswitchd(main+0x389)
> [0x4e7cd9]]
> 2021-03-16T17:13:34.278Z|00375|util|ERR|12:
> [ovs-vswitchd(bridge_exit+0x159) [0x59d6669]]
> 2021-03-16T17:13:34.278Z|00376|util|ERR|11: [ovs-vswitchd() [0x59d1f0e]]
> 2021-03-16T17:13:34.278Z|00377|util|ERR|10:
> [ovs-vswitchd(ofproto_destroy+0x110) [0x59e7af0]]
> 2021-03-16T17:13:34.278Z|00378|util|ERR|9: [ovs-vswitchd() [0x59df91b]]
> 2021-03-16T17:13:34.278Z|00379|util|ERR|8: [ovs-vswitchd() [0x59f34d2]]
> 2021-03-16T17:13:34.278Z|00380|util|ERR|7:
> [ovs-vswitchd(dpif_port_del+0x1f) [0x5a461df]]
> 2021-03-16T17:13:34.279Z|00381|util|ERR|6: [ovs-vswitchd() [0x5a3c207]]
> 2021-03-16T17:13:34.279Z|00382|util|ERR|5: [ovs-vswitchd()
> [0x5a3bcd5]] <-- do_del_port
> 2021-03-16T17:13:34.279Z|00383|util|ERR|4: [ovs-vswitchd()
> [0x5b27fc8]] <-- netdev_offload_dpdk_flow_flush
> 2021-03-16T17:13:34.279Z|00384|util|ERR|3: [ovs-vswitchd()
> [0x5b27f85]] <-- netdev_offload_dpdk_flow_destroy
> 2021-03-16T17:13:34.279Z|00385|util|ERR|2:
> [ovs-vswitchd(netdev_offload_thread_init+0x84) [0x5a684d4]]
> 2021-03-16T17:13:34.279Z|00386|util|ERR|1:
> [ovs-vswitchd(ovs_assert_failure+0x25) [0x5af3985]]
> 2021-03-16T17:13:34.279Z|00387|util|EMER|lib/netdev-offload.c:479:
> assertion thread_is_hw_offload || thread_is_rcu failed in
> netdev_offload_thread_init()
> 
> 
> -- 
> David Marchand
> 
>

Hey, thanks for taking the time to give more info!

I'm in a pickle now though.

To rewind a little: flush support was implemented after this series was written.
I added support for it as well, doing a proper parallel dispatch. However, it needs some ugly synchronization between the thread requesting the flush and the offload threads. The ovs_barrier used for this has a use-after-free (which does not affect the RCU thread or the revalidators, only very specific contexts such as this new one).
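
To make that concrete, the idea is roughly the sketch below. This is not the actual series code: only struct ovs_barrier and ovs_barrier_init()/block()/destroy() are the real lib/ovs-thread.h primitives, while flush_request, n_offload_threads, offload_thread_enqueue() and flush_own_flows() are made-up names for illustration:

/* Sketch only: the flush is dispatched to every offload thread and the
 * requester waits on an ovs_barrier until all of them are done. */
#include <stdlib.h>
#include "netdev.h"        /* struct netdev (OVS internal header). */
#include "ovs-thread.h"    /* struct ovs_barrier (OVS internal header). */
#include "util.h"          /* xmalloc() (OVS internal header). */

struct flush_request {
    struct netdev *netdev;        /* Port whose offloaded flows must go. */
    struct ovs_barrier *barrier;  /* Rendezvous with the requester. */
};

/* Illustrative helpers, not real OVS symbols. */
extern unsigned int n_offload_threads;
void offload_thread_enqueue(unsigned int tid, struct flush_request *);
void flush_own_flows(struct netdev *);

static void
flush_offloads(struct netdev *netdev)
{
    struct ovs_barrier barrier;
    unsigned int tid;

    /* One slot per offload thread, plus the requesting thread. */
    ovs_barrier_init(&barrier, n_offload_threads + 1);

    for (tid = 0; tid < n_offload_threads; tid++) {
        struct flush_request *req = xmalloc(sizeof *req);

        req->netdev = netdev;
        req->barrier = &barrier;
        offload_thread_enqueue(tid, req);
    }

    /* Wait for every offload thread to drain the flows it owns for this
     * netdev.  The barrier lives on the requester's stack, which is the
     * "ugly sync": destroying it while a peer is still on its way out of
     * ovs_barrier_block() is the use-after-free mentioned above. */
    ovs_barrier_block(&barrier);
    ovs_barrier_destroy(&barrier);
}

/* Handler run from each offload thread's main loop. */
static void
offload_thread_handle_flush(struct flush_request *req)
{
    flush_own_flows(req->netdev);   /* Destroy the flows this thread owns. */
    ovs_barrier_block(req->barrier);
    free(req);
}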

It was starting to become a series on its own, on top of an already large one, so I decided to keep it for later.
It seems I will need to do it in one fell swoop instead.

Sorry about this, I should have picked it up before. Well, at least the crash is pretty obvious :). At shutdown the main thread reaches netdev_offload_thread_init() through do_del_port() -> netdev_offload_dpdk_flow_flush(), and since it is neither an offload thread nor the RCU thread, the assertion fires.
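
For reference, the failing check boils down to something like the following. This is reconstructed from the assertion text in the log, not the actual v1 code, and the way the per-thread flags get set is a guess:

/* Reconstructed sketch; only the assertion expression itself comes from
 * the log above, the flag handling is a guess. */
#include <stdbool.h>
#include "util.h"    /* ovs_assert() (OVS internal header). */

static __thread bool thread_is_hw_offload;  /* Set by hw_offload threads. */
static __thread bool thread_is_rcu;         /* Set by the urcu thread. */

static unsigned int
netdev_offload_thread_init(void)
{
    /* Only offload threads and the RCU thread may take a per-thread
     * offload slot; the main thread calling in at shutdown is neither,
     * so it aborts here. */
    ovs_assert(thread_is_hw_offload || thread_is_rcu);

    return 0;   /* Would hand back this thread's offload slot id. */
}
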
Thanks again though; I'll send a proper v2 ASAP, but it will require a few additional patches.

Regards,
-- 
Gaetan Rivet

