[ovs-dev] [PATCH] dpif-netdev: Enter quiescent state after each offloading operation.

Eli Britstein elibr at mellanox.com
Thu Feb 27 09:00:54 UTC 2020


On 2/26/2020 2:31 PM, Ilya Maximets wrote:
> On 2/26/20 1:05 PM, Eli Britstein wrote:
>> On 2/24/2020 1:06 PM, Ilya Maximets wrote:
>>> On 2/23/20 3:32 PM, Eli Britstein wrote:
>>>> On 2/21/2020 4:54 PM, Ilya Maximets wrote:
>>>>> If the offloading queue is big and filled continuously, the
>>>>> offloading thread may have no chance to quiesce, blocking RCU
>>>>> callbacks and other threads waiting for synchronization.
>>>>>
>>>>> Fix that by entering a momentary quiescent state after each
>>>>> operation, since we're not holding any RCU-protected memory here.
>>>>>
>>>>> Fixes: 02bb2824e51d ("dpif-netdev: do hw flow offload in a thread")
>>>>> Reported-by: Eli Britstein <elibr at mellanox.com>
>>>>> Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2020-February/049768.html
>>>>> Signed-off-by: Ilya Maximets <i.maximets at ovn.org>
>>>>> ---
>>>>>     lib/dpif-netdev.c | 1 +
>>>>>     1 file changed, 1 insertion(+)
>>>>>
>>>>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>>>>> index d393aab5e..a798db45d 100644
>>>>> --- a/lib/dpif-netdev.c
>>>>> +++ b/lib/dpif-netdev.c
>>>>> @@ -2512,6 +2512,7 @@ dp_netdev_flow_offload_main(void *data OVS_UNUSED)
>>>>>             VLOG_DBG("%s to %s netdev flow\n",
>>>>>                      ret == 0 ? "succeed" : "failed", op);
>>>>>             dp_netdev_free_flow_offload(offload);
>>>>> +        ovsrcu_quiesce();
>>>> This seems to solve the responsiveness issue, but I have encountered a crash using it while there are a lot of flow deletions.
>>>>
>>>> #0  0x0000000001e462dc in get_unaligned_u32 (p=0x7f57f7813500) at lib/unaligned.h:86
>>>> #1  0x0000000001e46388 in hash_bytes (p_=0x7f57f7813500, n=16, basis=0) at lib/hash.c:38
>>>> #2  0x0000000001f7f836 in ufid_to_rte_flow_data_find (ufid=0x7f57f7813500) at lib/netdev-offload-dpdk.c:134
>>>> #3  0x0000000001f88693 in netdev_offload_dpdk_flow_del (netdev=0x42260da0, ufid=0x7f57f7813500, stats=0x0) at lib/netdev-offload-dpdk.c:3361
>>>> #4  0x0000000001e6a00d in netdev_flow_del (netdev=0x42260da0, ufid=0x7f57f7813500, stats=0x0) at lib/netdev-offload.c:296
>>> This might mean one of two things:
>>> 1. Incorrect usage of RCU-protected data inside netdev-offload-dpdk.
>>> 2. We're not allowed to quiesce if we have any item in the offloading
>>>    queue, i.e. a bad RCU usage scheme in dpif-netdev.
>>>
>>> Both cases are bad, and I'm not sure if we can easily fix case #2.
>>> I'll take a look and try to find the root cause.
>> It was my bad. Sorry. This is another issue that exists only in my branch (CT offloads), so it's not related. I was able to fix it.
> OK.  Good to know.
>
>> However, I wanted to reproduce this with the master branch, but then I cannot create a lot of flows, as I get: "upcall: datapath flow limit reached".
>>
>> I tried setting a large number using "ovs-appctl upcall/set-flow-limit 400000", but it didn't help.
> The number of flows in dpif-netdev is artificially limited by the MAX_FLOWS
> (65536) value.  There was a recent discussion about this limit, and I think
> we could just remove it.  I have a small patch for this and will send it soon.
> For now, you can just raise the hardcoded value of MAX_FLOWS in dpif-netdev.c.

The issue is not in dpif-netdev.c; it's in ofproto/ofproto-dpif-upcall.c.
If I'm not wrong, when this happens there is no attempt to install the
flow at all, so MAX_FLOWS in dpif-netdev.c is not the blocker anyway.

BTW, my setup is the PF and the VF representor in a bridge. I run testpmd
with macswap on the VF and send packets with 10k different source MACs
(the OpenFlow configuration is NORMAL). So I expect 20k flows in total,
which is still less than the 64k MAX_FLOWS.
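
For reference, this is roughly the configuration I use (port names and
devargs below are illustrative placeholders, not my exact ones):

    ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
    ovs-vsctl add-port br0 pf -- set Interface pf type=dpdk \
        options:dpdk-devargs=<PF PCI address>
    ovs-vsctl add-port br0 vf_rep -- set Interface vf_rep type=dpdk \
        options:dpdk-devargs=<VF representor devargs>
    ovs-ofctl del-flows br0
    ovs-ofctl add-flow br0 actions=NORMAL

testpmd runs on the VF with forward-mode=macswap, and the generator
cycles through the 10k source MACs.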

>
>>>> In other trials, I see ovs-appctl stuck again, but then I cannot attach gdb (or use pstack); it hangs. In this scenario, I see this in dmesg:
>>>>
>>>> [  404.450694] ovs-vswitchd[8574]: segfault at 7f8bd4e87280 ip 0000000001e463c3 sp 00007f8cfaf33028 error 4 in ovs-vswitchd[400000+200c000]
>>>>
>>>> Yanqin Wei <Yanqin.Wei at arm.com> suggested adding an ovsrcu_try_quiesce call (in the same place). It seems more stable (I haven't seen a crash like the one above), but it has the same stuck symptom.
>>> ovsrcu_try_quiesce() is the same as ovsrcu_quiesce() except that it fails
>>> if the mutex cannot be taken right now.  There is no real difference.  Your
>>> crashes are less frequent just because the thread enters the quiescent
>>> state less frequently.
>> If so, I think "ovsrcu_try_quiesce" is more suitable here. Don't you think?
> I don't think that taking one more mutex could do any harm in the
> offloading thread.  If we want to care about minor sleeping
> spikes, we should start by replacing the dp_flow_offload.mutex
> and sleeping on the condition variable.

Agreed, it's not critical. However, if it's not mandatory, I think it's
better to avoid it.
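
To make the difference concrete, here is a minimal sketch (not part of
the patch) of what the two variants would look like at the end of each
operation in dp_netdev_flow_offload_main(), assuming ovsrcu_try_quiesce()
returns nonzero (EBUSY) when it cannot take the internal mutex:

    dp_netdev_free_flow_offload(offload);

    /* Variant 1 (this patch): always report the quiescent state,
     * possibly blocking briefly on ovs-rcu's internal mutex. */
    ovsrcu_quiesce();

    /* Variant 2: best effort; if the mutex is contended, skip the
     * report and let a later iteration do it. */
    if (ovsrcu_try_quiesce()) {
        /* Not quiesced this round; RCU callbacks stay deferred. */
    }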

In addition, I'm thinking of making it multi-threaded (a thread per
port) to boost insertion/deletion rates; a rough sketch of the idea is
below.
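
Very roughly, something along these lines (all names here are
hypothetical, just to illustrate the direction, not an implementation):

    /* One offload queue and one worker per offload-capable port, so
     * insertions/deletions for different ports proceed in parallel
     * instead of serializing on the single dp_flow_offload queue. */
    struct per_port_offload {
        struct ovs_mutex mutex;
        pthread_cond_t cond;    /* Signaled when 'list' becomes non-empty. */
        struct ovs_list list;   /* Queued offload items for this port. */
        pthread_t thread;       /* Worker created with ovs_thread_create(). */
    };

Each per-port worker would still need to enter a quiescent state between
operations, exactly as this patch does for the single thread.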


