[ovs-dev] [PATCH 2/2] netdev-afxdp: Add need_wakeup supprt.

Ilya Maximets i.maximets at samsung.com
Wed Sep 4 14:27:22 UTC 2019


On 04.09.2019 17:04, William Tu wrote:
> Hi Ilya,
> 
> Thanks for the feedback.
> 
> On Wed, Sep 4, 2019 at 5:19 AM Ilya Maximets <i.maximets at samsung.com> wrote:
>>
>> Hi William,
>>
>> Thanks for the patch!
>> One general comment is that we, probably, should make this feature
>> configurable.  There are 2 reasons:
>>
>> 1. We might want to run OVS on older kernel while building with more
>>    recent libbpf. In this case socket creation will fail due to
>>    unsupported flag and we'll not be able to make it work.
> 
> I think it will also work, it falls back to old behavior.
> From:
> commit 77cd0d7b3f257fd0e3096b4fdcff1a7d38e99e10
> Author: Magnus Karlsson <magnus.karlsson at intel.com>
> Date:   Wed Aug 14 09:27:17 2019 +0200
> "
>    This flag needs some simple driver support. If the driver does not
>     support this, the Rx flag is always zero and the Tx flag is always
>     one. This makes any application relying on this feature default to the
>     old behaviour of not requiring any syscalls in the Rx path and always
>     having to call sendto() in the Tx path.
> "

This part is about relation between xdp subsystem and the device driver.
If device driver doesn't support this feature, xdp subsystem will handle
this by always exposing need_wakeup flag on rings.
However, if bind_flags contains XDP_USE_NEED_WAKEUP, but xdp subsystem
doesn't support it, userspace/libbpf will get failure on socket creation.
To be honest, I didn't test that, but I think that it works this way.
You may try building OVS with libbpf from bpf-next and run it on master
kernel (without need_wakeup support).

> 
>> 2. Performance impact is not always positive. It depends on the
>>    number of available CPU cores, port types (phisical or virtual),
>>    interrupts affinity.  And this was proved by test results from
>>    Eelco.  So, it might be good to have control over the enabling/
>>    disabling the feature.
>>
>> However, I think that it's better to keep it enabled by default.
> 
> OK, I will make it enabled by default, and add another option.
> "options:need_wakeup=false" for user to disable it.

This might be better to call it 'options:use_need_wakeup' because
'need_wakeup=false' sounds like we don't need to ever wake it up,
but it's opposite in practice.

> 
>>
>> Some comments inline.
>>
>> Best regards, Ilya Maximets.
>>
>> On 27.08.2019 2:02, William Tu wrote:
>>> The patch adds support for using need_wakeup flag in AF_XDP rings.
>>> When this flag is used, it means that OVS has to explicitly wake
>>> up the kernel RX, using poll() syscall and wake up TX, using sendto()
>>> syscall. This feature improves the performance by avoiding unnecessary
>>> syscalls, so keeping more CPU time in userspace to process packets.
>>>
>>> On Intel Xeon E5-2620 v3 2.4GHz system, performance of physical port
>>> to physical port improves from 6.1Mpps to 7.3Mpps.
>>>
>>> Suggested-by: Eelco Chaudron <echaudro at redhat.com>
>>
>> Wasn't it me? :) Just curious.
> 
> Went back to check the previous email, and it was you not Eelco.
> Sorry for the mistake!

It's OK.

> 
>>
>>> Signed-off-by: William Tu <u9012063 at gmail.com>
>>> ---
>>>  acinclude.m4       |  5 +++++
>>>  lib/netdev-afxdp.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>>  2 files changed, 56 insertions(+)
>>>
>>> diff --git a/acinclude.m4 b/acinclude.m4
>>> index 116ffcf9096d..8821b821ec3c 100644
>>> --- a/acinclude.m4
>>> +++ b/acinclude.m4
>>> @@ -276,6 +276,11 @@ AC_DEFUN([OVS_CHECK_LINUX_AF_XDP], [
>>>                [Define to 1 if AF_XDP support is available and enabled.])
>>>      LIBBPF_LDADD=" -lbpf -lelf"
>>>      AC_SUBST([LIBBPF_LDADD])
>>> +
>>> +    AC_CHECK_DECL([xsk_ring_prod__needs_wakeup], [
>>> +      AC_DEFINE([HAVE_XDP_NEED_WAKEUP], [1],
>>> +        [XDP need wakeup support detected in xsk.h.])
>>> +    ], [], [#include <bpf/xsk.h>])
>>>    fi
>>>    AM_CONDITIONAL([HAVE_AF_XDP], test "$AF_XDP_ENABLE" = true)
>>>  ])
>>> diff --git a/lib/netdev-afxdp.c b/lib/netdev-afxdp.c
>>> index e5b058d08a09..d14d100e8fa3 100644
>>> --- a/lib/netdev-afxdp.c
>>> +++ b/lib/netdev-afxdp.c
>>> @@ -26,6 +26,7 @@
>>>  #include <linux/rtnetlink.h>
>>>  #include <linux/if_xdp.h>
>>>  #include <net/if.h>
>>> +#include <poll.h>
>>>  #include <stdlib.h>
>>>  #include <sys/resource.h>
>>>  #include <sys/socket.h>
>>> @@ -117,6 +118,48 @@ struct xsk_socket_info {
>>>      atomic_uint64_t tx_dropped;
>>>  };
>>>
>>> +#ifdef HAVE_XDP_NEED_WAKEUP
>>> +static inline void
>>> +xsk_rx_need_wakeup(struct xsk_umem_info *umem,
>>
>> Function name is misleading, because it actually tries to wake
>> rx up, not just checking. Something like 'xsk_rx_wakeup' or
>> 'xsk_rx_wakeup_if_needed' might be more suitable.
>> Naming suggestions are welcome.
> 
> OK, thanks
>>
>>> +                   struct netdev *netdev, int fd) {
>>> +    struct pollfd pfd;
>>> +    int ret;
>>> +
>>> +    if (xsk_ring_prod__needs_wakeup(&umem->fq)) {
>>> +        pfd.fd = fd;
>>> +        pfd.events = POLLIN;
>>> +
>>> +        ret = poll(&pfd, 1, 1000);
>>
>> Agree with Eelco that we shouldn't pass 1000 there. Looks dangerous.
> yes
> 
>>
>>> +        if (OVS_UNLIKELY(ret == 0)) {
>>> +            VLOG_WARN_RL(&rl, "%s: poll rx fd timeout.",
>>> +                    netdev_get_name(netdev));
>>
>> Indents are off here and below. Please, align arguments to the next
>> symbol after '('.
>>
>>> +        } else if (OVS_UNLIKELY(ret < 0)) {
>>> +            VLOG_WARN_RL(&rl, "%s: error polling rx fd: %s.",
>>> +                    netdev_get_name(netdev),
>>> +                    ovs_strerror(ret));
>>
>> 'ret' is always -1 on failure. errno should be there.
> 
> OK will fix the above two.
> 
> William
> 
> 


More information about the dev mailing list