[ovs-dev] [PATCHv16 2/2] netdev-afxdp: add new netdev type for AF_XDP.

William Tu u9012063 at gmail.com
Wed Jul 17 18:52:04 UTC 2019


On Wed, Jul 17, 2019 at 9:56 AM Ben Pfaff <blp at ovn.org> wrote:
>
> On Fri, Jul 12, 2019 at 04:50:56PM -0700, William Tu wrote:
> > The patch introduces experimental AF_XDP support for OVS netdev.
> > AF_XDP, the Address Family of the eXpress Data Path, is a new Linux socket
> > type built upon the eBPF and XDP technology.  It is aims to have comparable
> > performance to DPDK but cooperate better with existing kernel's networking
> > stack.  An AF_XDP socket receives and sends packets from an eBPF/XDP program
> > attached to the netdev, by-passing a couple of Linux kernel's subsystems
> > As a result, AF_XDP socket shows much better performance than AF_PACKET
> > For more details about AF_XDP, please see linux kernel's
> > Documentation/networking/af_xdp.rst. Note that by default, this feature is
> > not compiled in.
> >
> > Signed-off-by: William Tu <u9012063 at gmail.com>
>
> Thanks a lot!  I have a few comments.
>
> In pmd_perf_stats, is last_tsc always accessed in a thread-local manner
> or can it be read cross-thread?  If it can be read (or updated)
> cross-thread, then it seems unsafe to me on 32-bit architectures, which
> might reasonably read or write the two halves of it with separate
> instructions.  If so, then I'd suggest using atomic_uint64_t instead of
> plain uint64_t.

I think rdtsc is a per-core register and OVS always reads it, not updated.
So it's not possible that while in the middle of 32-bit system reading it,
the other half register gets updated. I'm not 100% sure, will let Ilya comment
about it.

>
> Looking at xmalloc_size_align(), I noticed a couple of things.  First,
> the syntax (runt ? : 0) uses a GCC extension.  The portable way to write
> it is (runt ? runt : 0), or just "runt" since it's a bool.  However, I
> notice that the older code was (runt ? CACHE_LINE_SIZE : 0), and by
> analogy the newer code should be (runt ? alignment : 0).  Are you
> convinced that the substitute is correct?

Yes, this is a mistake. It should be
(runt ? alignment : 0)

>
> I also have a couple of trivia:
>
> * One of the conditionals in automake.mk isn't needed because
>   LIBBPF_LDADD will be defined to empty if !HAVE_AF_XDP.
>
> * In dpif-netdev-perf.h, we don't need the casts because the standard C
>   promotions work fine for us in this case.
>
> I'm appending an incremental to address the trivia:
>
> diff --git a/lib/automake.mk b/lib/automake.mk
> index b07bb01c4ef7..7cb1c3373538 100644
> --- a/lib/automake.mk
> +++ b/lib/automake.mk
> @@ -9,15 +9,12 @@ lib_LTLIBRARIES += lib/libopenvswitch.la
>
>  lib_libopenvswitch_la_LIBADD = $(SSL_LIBS)
>  lib_libopenvswitch_la_LIBADD += $(CAPNG_LDADD)
> +lib_libopenvswitch_la_LIBADD += $(LIBBPF_LDADD)
>
>  if WIN32
>  lib_libopenvswitch_la_LIBADD += ${PTHREAD_LIBS}
>  endif
>
> -if HAVE_AF_XDP
> -lib_libopenvswitch_la_LIBADD += $(LIBBPF_LDADD)
> -endif
> -
>  lib_libopenvswitch_la_LDFLAGS = \
>          $(OVS_LTINFO) \
>          -Wl,--version-script=$(top_builddir)/lib/libopenvswitch.sym \
> diff --git a/lib/dpif-netdev-perf.h b/lib/dpif-netdev-perf.h
> index 6b6dfda7db1c..3a5e90b16a5f 100644
> --- a/lib/dpif-netdev-perf.h
> +++ b/lib/dpif-netdev-perf.h
> @@ -192,15 +192,11 @@ static inline uint64_t
>  rdtsc_syscall(struct pmd_perf_stats *s)
>  {
>      struct timespec val;
> -    uint64_t v;
> -
>      if (clock_gettime(CLOCK_MONOTONIC_RAW, &val) != 0) {
>         return s->last_tsc;
>      }
>
> -    v  = (uint64_t) val.tv_sec * 1000000000LL;
> -    v += (uint64_t) val.tv_nsec;
> -
> +    uint64_t v  = val.tv_sec * UINT64_C(1000000000) + val.tv_nsec;
>      return s->last_tsc = v;
>  }
>  #endif

Thank you, will add them in v17 patch.
William


More information about the dev mailing list