[ovs-dev] [PATCH RFCv4 0/4] AF_XDP netdev support for OVS

Eelco Chaudron echaudro at redhat.com
Wed Apr 17 12:01:46 UTC 2019


Hi William,

I think you applied the following patch to get it to compile? Or did you 
copy in the kernel headers?

https://www.spinics.net/lists/netdev/msg563507.html

//Eelco

On 2 Apr 2019, at 0:46, William Tu wrote:

> The patch series introduces AF_XDP support for OVS netdev.
> AF_XDP is a new address family working together with eBPF.
> In short, a socket with AF_XDP family can receive and send
> packets from an eBPF/XDP program attached to the netdev.
> For more details about AF_XDP, please see the Linux kernel's
> Documentation/networking/af_xdp.rst
>
> OVS has a couple of netdev types, e.g., system, tap, or
> internal.  The patch first adds a new netdev type called
> "afxdp", and implements its configuration, packet reception,
> and transmit functions.  Since the AF_XDP socket, xsk,
> operates in userspace, once ovs-vswitchd receives packets
> from xsk, the proposed architecture re-uses the existing
> userspace dpif-netdev datapath.  As a result, most of
> the packet processing happens in userspace instead of
> in the Linux kernel.
>
> Architecture
> ===========
>                _
>               |   +-------------------+
>               |   |    ovs-vswitchd   |<-->ovsdb-server
>               |   +-------------------+
>               |   |      ofproto      |<-->OpenFlow controllers
>               |   +--------+-+--------+
>               |   | netdev | |ofproto-|
>     userspace |   +--------+ |  dpif  |
>               |   | netdev | +--------+
>               |   |provider| |  dpif  |
>               |   +---||---+ +--------+
>               |       ||     |  dpif- |
>               |       ||     | netdev |
>               |_      ||     +--------+
>                       ||
>                _  +---||-----+--------+
>               |   | af_xdp prog +     |
>        kernel |   |   xsk_map         |
>               |_  +--------||---------+
>                            ||
>                         physical
>                            NIC
>
> To get started, create an OVS userspace bridge using dpif-netdev
> by setting the datapath_type to netdev:
>   # ovs-vsctl -- add-br br0 -- set Bridge br0 datapath_type=netdev
>
> Then attach a Linux netdev with type afxdp:
>   # ovs-vsctl add-port br0 afxdp-p0 -- \
>       set interface afxdp-p0 type="afxdp"
>
> Performance
> ===========
> For this version, v4, I mainly focus on getting the features right
> with the libbpf AF_XDP API, using the AF_XDP SKB mode, which is the
> slower set-up.  For the next version, I plan to measure the
> performance and add optimizations.
>
> Documentation
> =============
> Most of the design details are described in the paper presented at
> Linux Plumbers Conference 2018, "Bringing the Power of eBPF to Open
> vSwitch"[1], section 4, and slides[2].
> This patch uses a not-yet upstreamed feature called XDP_ATTACH[3],
> described in section 3.1, which is a built-in XDP program for
> AF_XDP.
> This greatly simplifies the management of XDP/eBPF programs.
>
> [1] http://vger.kernel.org/lpc_net2018_talks/ovs-ebpf-afxdp.pdf
> [2] http://vger.kernel.org/lpc_net2018_talks/ovs-ebpf-lpc18-presentation.pdf
> [3] http://vger.kernel.org/lpc_net2018_talks/lpc18_paper_af_xdp_perf-v2.pdf
>
> For the installation and configuration guide, see
>   # Documentation/intro/install/bpf.rst
>
> Test Cases
> ==========
> Test cases are created using namespaces and veth peers, with an
> AF_XDP socket attached to the veth (thus the SKB_MODE).  Running
> "make check-afxdp" on the patch series gives the following:
>
> AF_XDP netdev datapath-sanity
>
>   1: datapath - ping between two ports               ok
>   2: datapath - ping between two ports on vlan       ok
>   3: datapath - ping6 between two ports              ok
>   4: datapath - ping6 between two ports on vlan      ok
>   5: datapath - ping over vxlan tunnel               ok
>   6: datapath - ping over vxlan6 tunnel              ok
>   7: datapath - ping over gre tunnel                 ok
>   8: datapath - ping over erspan v1 tunnel           ok
>   9: datapath - ping over erspan v2 tunnel           ok
>  10: datapath - ping over ip6erspan v1 tunnel        ok
>  11: datapath - ping over ip6erspan v2 tunnel        ok
>  12: datapath - ping over geneve tunnel              ok
>  13: datapath - ping over geneve6 tunnel             ok
>  14: datapath - clone action                         ok
>  15: datapath - basic truncate action                ok
>
> conntrack
>
>  16: conntrack - controller                          ok
>  17: conntrack - force commit                        ok
>  18: conntrack - ct flush by 5-tuple                 ok
>  19: conntrack - IPv4 ping                           ok
>  20: conntrack - get_nconns and get/set_maxconns     ok
>  21: conntrack - IPv6 ping                           ok
>
> system-ovn
>
>  22: ovn -- 2 LRs connected via LS, gateway router, SNAT and DNAT ok
>  23: ovn -- 2 LRs connected via LS, gateway router, easy SNAT ok
>  24: ovn -- multiple gateway routers, SNAT and DNAT  ok
>  25: ovn -- load-balancing                           ok
>  26: ovn -- load-balancing - same subnet.            ok
>  27: ovn -- load balancing in gateway router         ok
>  28: ovn -- multiple gateway routers, load-balancing ok
>  29: ovn -- load balancing in router with gateway router port ok
>  30: ovn -- DNAT and SNAT on distributed router - N/S ok
>  31: ovn -- DNAT and SNAT on distributed router - E/W ok
>
> ---
> v1->v2:
> - add a list to maintain unused umem elements
> - remove copy from rx umem to ovs internal buffer
> - use hugetlb to reduce misses (not much difference)
> - use pmd mode netdev in OVS (huge performance improvement)
> - remove malloc dp_packet, instead put dp_packet in umem
>
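
A minimal sketch of the "list to maintain unused umem elements" idea
from the v1->v2 notes above.  This is not code from the patch series:
the names umem_pool, NUM_FRAMES and FRAME_SIZE are hypothetical, and it
assumes a fixed number of equal-sized frames whose offsets are kept on
a LIFO stack, so that the most recently released frame is handed out
first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_FRAMES 8      /* Hypothetical: number of frames in the umem. */
#define FRAME_SIZE 2048   /* Hypothetical: size of one frame in bytes. */

/* LIFO pool holding the byte offsets of currently unused umem frames. */
struct umem_pool {
    uint64_t addrs[NUM_FRAMES];
    size_t count;             /* Number of valid entries in 'addrs'. */
};

/* Fills the pool with the offset of every frame in the umem region. */
static void
umem_pool_init(struct umem_pool *pool)
{
    pool->count = 0;
    for (size_t i = 0; i < NUM_FRAMES; i++) {
        pool->addrs[pool->count++] = (uint64_t) i * FRAME_SIZE;
    }
}

/* Takes one unused frame.  Returns 0 and stores its offset in '*addr',
 * or returns -1 if no frame is available. */
static int
umem_pool_pop(struct umem_pool *pool, uint64_t *addr)
{
    if (pool->count == 0) {
        return -1;
    }
    *addr = pool->addrs[--pool->count];
    return 0;
}

/* Returns a frame to the pool once the packet in it has been consumed. */
static void
umem_pool_push(struct umem_pool *pool, uint64_t addr)
{
    assert(pool->count < NUM_FRAMES);
    assert(addr % FRAME_SIZE == 0);
    pool->addrs[pool->count++] = addr;
}
```

In such a scheme the receive path would pop frames to refill the ring
and push them back after processing, which avoids allocating a buffer
per packet.
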
> v2->v3:
> - rebase on the OVS master, 7ab4b0653784
>   ("configure: Check for more specific function to pull in pthread library.")
> - remove the dependency on libbpf and dpif-bpf.
>   instead, use the built-in XDP_ATTACH feature.
> - data structure optimizations for better performance, see[1]
> - more test cases support
> v3: https://mail.openvswitch.org/pipermail/ovs-dev/2018-November/354179.html
>
> v3->v4:
> - Use AF_XDP API provided by libbpf
> - Remove the dependency on XDP_ATTACH kernel patch set
> - Add documentation, bpf.rst
>
> William Tu (4):
>   Add libbpf build support.
>   netdev-afxdp: add new netdev type for AF_XDP
>   tests: add AF_XDP netdev test cases.
>   afxdp netdev: add documentation and configuration.
>
>  Documentation/automake.mk             |   1 +
>  Documentation/index.rst               |   1 +
>  Documentation/intro/install/bpf.rst   | 182 +++++++
>  Documentation/intro/install/index.rst |   1 +
>  acinclude.m4                          |  20 +
>  configure.ac                          |   1 +
>  lib/automake.mk                       |   7 +-
>  lib/dp-packet.c                       |  12 +
>  lib/dp-packet.h                       |  32 +-
>  lib/dpif-netdev.c                     |   2 +-
>  lib/netdev-afxdp.c                    | 491 +++++++++++++++++
>  lib/netdev-afxdp.h                    |  39 ++
>  lib/netdev-linux.c                    |  78 ++-
>  lib/netdev-provider.h                 |   1 +
>  lib/netdev.c                          |   1 +
>  lib/xdpsock.c                         | 179 +++++++
>  lib/xdpsock.h                         | 129 +++++
>  tests/automake.mk                     |  17 +
>  tests/system-afxdp-macros.at          | 153 ++++++
>  tests/system-afxdp-testsuite.at       |  26 +
>  tests/system-afxdp-traffic.at         | 978 ++++++++++++++++++++++++++++++++++
>  21 files changed, 2345 insertions(+), 6 deletions(-)
>  create mode 100644 Documentation/intro/install/bpf.rst
>  create mode 100644 lib/netdev-afxdp.c
>  create mode 100644 lib/netdev-afxdp.h
>  create mode 100644 lib/xdpsock.c
>  create mode 100644 lib/xdpsock.h
>  create mode 100644 tests/system-afxdp-macros.at
>  create mode 100644 tests/system-afxdp-testsuite.at
>  create mode 100644 tests/system-afxdp-traffic.at
>
> -- 
> 2.7.4

