[ovs-dev] [PATCH v1 00/23] dpif-netdev: Parallel offload processing
david.marchand at redhat.com
Tue Mar 16 15:45:14 UTC 2021
On Mon, Mar 15, 2021 at 1:29 PM Gaëtan Rivet <grive at u256.net> wrote:
> > Just a first and easy update, I noticed an assert is triggered when
> > stopping OVS (at least).
> > 2021-03-11T20:32:01.928Z|00350|util|EMER|lib/netdev-offload.c:479:
> > assertion thread_is_hw_offload || thread_is_rcu failed in
> > netdev_offload_thread_init()
> > I have not checked yet where the issue is, but I also noticed it halfway
> > through the series, and in that case it was not on stop.
> > You can probably catch it easily.
> > I have hw-offloads enabled, 2 pf ports, 2 representors ports in a
> > single bridge, 2 pmds, no additional configuration.
> > I usually have bi directional traffic running through OVS while I restart.
> This abort is added in patch 12: netdev-offload: Add multi-thread API
> It is relying on the offload thread being named 'hw_offload'.
> This name is changed by patch 22: dpif-netdev: Use one or more offload threads
> In the RFC series, a separate patch did a fix on the thread name.
> I was relying on this fix happening first. I only did compilation checks in-between after squashing this patch.
> I can either re-introduce the fix patch separately, or rewrite the check in patch 12, then update it in patch 22.
> In my opinion having the fix separate was better but I can go with either solution.
> Do you have a preference?
> (I did not have a setup this morning to do runtime checks, I will run them later. Reading the code, this is my current understanding and it makes sense for now.)
I would have a separate fix.
That's one thing.
But, either I missed something or there is another issue.
I _also_ got the assert with the whole series applied, while
restarting (stop; sleep 10; start):
2021-03-16T15:40:30.970Z|00368|dpif_netdev|WARN|There's no available
(non-isolated) pmd thread on numa node 0. Queue 0 on port 'vhost0'
will be assigned to the pmd on core 31 (numa node 1). Expect reduced
2021-03-16T15:40:30.970Z|00369|dpdk|INFO|VHOST_CONFIG: free connfd =
114 for device '/var/lib/vhost_sockets/vhost6'
'/var/lib/vhost_sockets/vhost6' not found
assertion thread_is_hw_offload || thread_is_rcu failed in
2021-03-16T15:40:41.013Z|00001|vlog|INFO|opened log file
2021-03-16T15:40:41.028Z|00002|ovs_numa|INFO|Discovered 28 CPU cores
on NUMA node 0
2021-03-16T15:40:41.028Z|00003|ovs_numa|INFO|Discovered 28 CPU cores
on NUMA node 1
2021-03-16T15:40:41.028Z|00004|ovs_numa|INFO|Discovered 2 NUMA nodes
and 56 CPU cores
And looking at ovs threads, they look fine:
70948 cpu_list=0-1,28-29 ctxt_switches=492,19 urcu3
70950 cpu_list=0-1,28-29 ctxt_switches=4453,12 hw_offload5
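For reference, a minimal sketch of the kind of name check the assert appears to perform, assuming it matches the calling thread's name against the "hw_offload" and "urcu" prefixes (thread names carry a numeric suffix, as in the listing above). The helper names below are hypothetical and are not taken from the series:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if 'name' starts with 'prefix'. */
static bool
name_has_prefix(const char *name, const char *prefix)
{
    return strncmp(name, prefix, strlen(prefix)) == 0;
}

/* Hypothetical stand-in for the condition in the failing assertion:
 * only offload and RCU threads are expected to pass.  Prefix matching
 * is an assumption, chosen because names look like "hw_offload5". */
bool
thread_may_init_offload_id(const char *name)
{
    bool thread_is_hw_offload = name_has_prefix(name, "hw_offload");
    bool thread_is_rcu = name_has_prefix(name, "urcu");

    return thread_is_hw_offload || thread_is_rcu;
}

/* Aborts, like the assertion in the log, when called with a thread
 * name that matches neither prefix. */
void
check_offload_thread(const char *name)
{
    assert(thread_may_init_offload_id(name));
}
```

Under that assumption, both names listed above ("urcu3" and "hw_offload5") would pass the check.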