[ovs-dev] [PATCH] netdev-afxdp: Add delay when reconfiguring xsk.

Eelco Chaudron echaudro at redhat.com
Fri Dec 6 14:28:13 UTC 2019



On 6 Dec 2019, at 14:55, William Tu wrote:

> On Thu, Dec 5, 2019 at 5:43 AM William Tu <u9012063 at gmail.com> wrote:
>>
>> On Wed, Dec 4, 2019 at 10:06 PM William Tu <u9012063 at gmail.com> 
>> wrote:
>>>
>>> The patch works around an error when reconfigure the netdev-afxdp
>>> device into different modes. Currently, when OVS destroy xsk, the
>>> xsk_destruct() in linux kernel calls xdp_put_umem() and defers
>>> calling xdp_umem_release_deferred().
>>>
>>> This creates an -EBUSY error when xsk_socket__create() calls
>>>   xsk_bind()
>>>     xdp_umem_assign_dev()
>>>       xdp_get_umem_from_qid()
>>>
>>> And this is because xdp_clear_umem_at_qid() is deferred so not
>>> clearing umem yet. The issue can be reproduced by just changing
>>> xdp-mode multiple times, ex:
>>>   ovs-vsctl -- set interface afxdp-p0 \
>>>     options:n_rxq=1 type="afxdp" options:xdp-mode=best-effort
>>>   ovs-vsctl -- set interface afxdp-p0 \
>>>     options:n_rxq=1 type="afxdp" options:xdp-mode=generic
>>>
>>> The patch fixes it by adding a delay, hopefully the umem has
>>> been properly cleanup.
>>>
>>> Signed-off-by: William Tu <u9012063 at gmail.com>
>>> ---
>>>  lib/netdev-afxdp.c | 3 +++
>>>  1 file changed, 3 insertions(+)
>>>
>>> diff --git a/lib/netdev-afxdp.c b/lib/netdev-afxdp.c
>>> index ca2dfd005b9f..84897f1fd025 100644
>>> --- a/lib/netdev-afxdp.c
>>> +++ b/lib/netdev-afxdp.c
>>> @@ -661,6 +661,9 @@ netdev_afxdp_reconfigure(struct netdev *netdev)
>>>      struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
>>>      int err = 0;
>>>
>>> +    /* Time for umem to be released in kernel. */
>>> +    xnanosleep(500000);
>>> +
>>>      ovs_mutex_lock(&dev->mutex);
>>>
>>>      if (netdev->n_rxq == dev->requested_n_rxq
>>> --
>>> 2.7.4
>>>
>> Now I think it's a bad solution. Let me work on other way
>> to fix it.
>>
> This is an issue due to XSK async cleanup.
> We can wait for fix later.
> https://lore.kernel.org/netdev/CALDO+SbciaNy5EReV5YHvciOSJmdMRPBQdF6XbhxfBF6gvPFDw@mail.gmail.com/T/#m10cd49235090496dab98c24837dd1a2fbcb2e187

Waiting for the sync cleanup might make sense. Adding a xnanosleep() 
delay is not the way forward.



More information about the dev mailing list