[ovs-dev] [PATCH RFCv4 0/4] AF_XDP netdev support for OVS

Eelco Chaudron echaudro at redhat.com
Wed Apr 17 14:26:15 UTC 2019



On 17 Apr 2019, at 14:01, Eelco Chaudron wrote:

> Hi William,
>
> I think you applied the following patch to get it to compile? Or did 
> you copy in the kernel headers?
>
> https://www.spinics.net/lists/netdev/msg563507.html

I noticed you duplicated the macros, which resulted in all kinds of 
compile errors. So I removed them and applied the two patches above, 
which got me to the next step.

I’m building it with DPDK enabled, and that was causing all kinds of 
duplicate definition errors, as the kernel and DPDK headers re-use some 
structure names.

To get it all compiled and working I had to make the following changes:

$ git diff
diff --git a/lib/netdev-afxdp.c b/lib/netdev-afxdp.c
index b3bf2f044..47fb3342a 100644
--- a/lib/netdev-afxdp.c
+++ b/lib/netdev-afxdp.c
@@ -295,7 +295,7 @@ netdev_linux_rxq_xsk(struct xsk_socket_info *xsk,
      uint32_t idx_rx = 0, idx_fq = 0;
      int ret = 0;

-    unsigned int non_afxdp;
+    unsigned int non_afxdp = 0;

      /* See if there is any packet on RX queue,
       * if yes, idx_rx is the index having the packet.
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 47153dc60..77f2150ab 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -24,7 +24,7 @@
  #include <unistd.h>
  #include <linux/virtio_net.h>
  #include <sys/socket.h>
-#include <linux/if.h>
+//#include <linux/if.h>

  #include <rte_bus_pci.h>
  #include <rte_config.h>
diff --git a/lib/xdpsock.h b/lib/xdpsock.h
index 8df8fa451..a2ed1a136 100644
--- a/lib/xdpsock.h
+++ b/lib/xdpsock.h
@@ -28,7 +28,7 @@
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
-#include <net/ethernet.h>
+//#include <net/ethernet.h>
  #include <sys/resource.h>
  #include <sys/socket.h>
  #include <sys/mman.h>
@@ -43,14 +43,6 @@
  #include "ovs-atomic.h"
  #include "openvswitch/thread.h"

-/* bpf/xsk.h uses the following macros not defined in OVS,
- * so re-define them before include.
- */
-#define unlikely OVS_UNLIKELY
-#define likely OVS_LIKELY
-#define barrier() __asm__ __volatile__("": : :"memory")
-#define smp_rmb() barrier()
-#define smp_wmb() barrier()
  #include <bpf/xsk.h>

In addition, you need to run “make install_headers” for the kernel's 
libbpf and copy libbpf_util.h manually.
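
On the macro block removed from lib/xdpsock.h in the diff above: if such 
helpers ever have to stay, one common way to avoid redefinition clashes is 
to guard each definition with #ifndef. A minimal sketch, assuming GCC or 
Clang for __builtin_expect; count_nonzero() is a made-up helper that only 
exercises the macros, and this is not what the patch does:

```c
#include <assert.h>
#include <stddef.h>

/* Define each helper only when no earlier header has already done so. */
#ifndef likely
#define likely(x)   __builtin_expect(!!(x), 1)
#endif
#ifndef unlikely
#define unlikely(x) __builtin_expect(!!(x), 0)
#endif
#ifndef barrier
#define barrier()   __asm__ __volatile__("" : : : "memory")
#endif

/* Hypothetical helper: count the non-zero entries of an array. */
static size_t
count_nonzero(const int *v, size_t n)
{
    size_t hits = 0;

    assert(v != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (unlikely(v[i] != 0)) {
            hits++;
        }
    }
    barrier();  /* Compiler barrier only; no CPU fence is emitted. */
    return hits;
}
```

Note that the guards only help when the other header is included first. 
With the definitions placed before the #include, as in the original file, 
a header that defines the same names unconditionally would still clash.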

I was able to run a simple test (in on a physical port, out on the same 
physical port) without crashing, but the numbers seem low:

$ ovs-ofctl dump-flows ovs_pvp_br0
  cookie=0x0, duration=210.344s, table=0, n_packets=1784692, 
n_bytes=2694884920, in_port=eno1 actions=IN_PORT
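
As a quick sanity check on the counters in the flow dump above, dividing 
n_bytes by n_packets gives the average packet size, and dividing n_packets 
by the duration gives the average rate over the life of the flow. A minimal 
sketch; the function names are made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Average bytes per packet from the flow counters. */
static double
avg_packet_bytes(uint64_t n_bytes, uint64_t n_packets)
{
    assert(n_packets > 0);
    return (double) n_bytes / (double) n_packets;
}

/* Average packets per second over the lifetime of the flow. */
static double
avg_packets_per_sec(uint64_t n_packets, double duration_s)
{
    assert(duration_s > 0.0);
    return (double) n_packets / duration_s;
}
```

With the values above, avg_packet_bytes(2694884920, 1784692) is exactly 
1510 bytes, and avg_packets_per_sec(1784692, 210.344) is roughly 8,485 
packets per second. The rate is averaged over the flow's whole lifetime, 
so it is not directly comparable to per-run rates.
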

"Physical loopback test, L3 flows[port redirect]"
,Packet size
Number of flows,64,128,256,512,768,1024,1514
100,77574,77329,76605,76417,75539,75252,74617

The above is using two cores, but with a single DPDK core I get the 
following (on the same machine):

"Physical loopback test, L3 flows[port redirect]"
,Packet size
Number of flows,64,128,256,512,768,1024,1514
100,9527075,8445852,4528935,2349597,1586276,1197304,814854

For the kernel datapath the numbers are:

"Physical loopback test, L3 flows[port redirect]"
,Packet size
Number of flows,64,128,256,512,768,1024,1514
100,4862995,5521870,4528872,2349596,1586277,1197305,814854

But keep in mind that the kernel datapath uses roughly 
550/610/520/380/180/140/110% of the CPU for the respective packet sizes.
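
To compare the three results on an equal footing, the rates can be 
normalised to packets per second per fully used core. A minimal sketch, 
assuming the two AF_XDP cores were both fully busy (200%) and the single 
DPDK core counts as 100%; only the kernel datapath percentages are 
actually given above, and the function name is made up:

```c
#include <assert.h>

/* Scale a packet rate to what one fully busy core (100% CPU) delivers. */
static double
pps_per_core(double pps, double cpu_percent)
{
    assert(cpu_percent > 0.0);
    return pps * 100.0 / cpu_percent;
}
```

For 64-byte packets that gives about 884,181 pps per core for the kernel 
datapath (4862995 at 550%), 9,527,075 for DPDK, and 38,787 for AF_XDP 
under the 200% assumption.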

> On 2 Apr 2019, at 0:46, William Tu wrote:
>
>> The patch series introduces AF_XDP support for OVS netdev.
>> AF_XDP is a new address family working together with eBPF.
>> In short, a socket with AF_XDP family can receive and send
>> packets from an eBPF/XDP program attached to the netdev.
>> For more details about AF_XDP, please see linux kernel's
>> Documentation/networking/af_xdp.rst
>>
>> OVS has a couple of netdev types, e.g., system, tap, or
>> internal.  The patch first adds a new netdev type called
>> "afxdp", and implements its configuration, packet reception,
>> and transmit functions.  Since the AF_XDP socket, xsk,
>> operates in userspace, once ovs-vswitchd receives packets
>> from xsk, the proposed architecture re-uses the existing
>> userspace dpif-netdev datapath.  As a result, most of
>> the packet processing happens in userspace instead of
>> in the Linux kernel.
>>
>> Architecture
>> ===========
>>                _
>>               |   +-------------------+
>>               |   |    ovs-vswitchd   |<-->ovsdb-server
>>               |   +-------------------+
>>               |   |      ofproto      |<-->OpenFlow controllers
>>               |   +--------+-+--------+
>>               |   | netdev | |ofproto-|
>>     userspace |   +--------+ |  dpif  |
>>               |   | netdev | +--------+
>>               |   |provider| |  dpif  |
>>               |   +---||---+ +--------+
>>               |       ||     |  dpif- |
>>               |       ||     | netdev |
>>               |_      ||     +--------+
>>                       ||
>>                _  +---||-----+--------+
>>               |   | af_xdp prog +     |
>>        kernel |   |   xsk_map         |
>>               |_  +--------||---------+
>>                            ||
>>                         physical
>>                            NIC
>>
>> To get started, create an OVS userspace bridge using dpif-netdev
>> by setting the datapath_type to netdev:
>>   # ovs-vsctl -- add-br br0 -- set Bridge br0 datapath_type=netdev
>>
>> And attach a linux netdev with type afxdp:
>>   # ovs-vsctl add-port br0 afxdp-p0 -- \
>>       set interface afxdp-p0 type="afxdp"
>>
>> Performance
>> ===========
>> For this version, v4, I mainly focused on getting the features right
>> with the libbpf AF_XDP API, and used the AF_XDP SKB mode, which is the
>> slower set-up.
>> In the next version I plan to measure performance and add optimizations.
>>
>> Documentation
>> =============
>> Most of the design details are described in the paper presented at
>> Linux Plumbers 2018, "Bringing the Power of eBPF to Open vSwitch"[1],
>> section 4, and slides[2].
>> This patch uses a not-yet upstreamed feature called XDP_ATTACH[3],
>> described in section 3.1, which is a built-in XDP program for AF_XDP.
>> This greatly simplifies the management of XDP/eBPF programs.
>>
>> [1] http://vger.kernel.org/lpc_net2018_talks/ovs-ebpf-afxdp.pdf
>> [2] http://vger.kernel.org/lpc_net2018_talks/ovs-ebpf-lpc18-presentation.pdf
>> [3] http://vger.kernel.org/lpc_net2018_talks/lpc18_paper_af_xdp_perf-v2.pdf
>>
>> For installation and configuration guide, see
>>   # Documentation/intro/install/bpf.rst
>>
>> Test Cases
>> ==========
>> Test cases are created using namespaces and veth peers, with an AF_XDP
>> socket attached to the veth (thus the SKB_MODE).  Running
>> "make check-afxdp" shows the following:
>>
>> AF_XDP netdev datapath-sanity
>>
>>   1: datapath - ping between two ports               ok
>>   2: datapath - ping between two ports on vlan       ok
>>   3: datapath - ping6 between two ports              ok
>>   4: datapath - ping6 between two ports on vlan      ok
>>   5: datapath - ping over vxlan tunnel               ok
>>   6: datapath - ping over vxlan6 tunnel              ok
>>   7: datapath - ping over gre tunnel                 ok
>>   8: datapath - ping over erspan v1 tunnel           ok
>>   9: datapath - ping over erspan v2 tunnel           ok
>>  10: datapath - ping over ip6erspan v1 tunnel        ok
>>  11: datapath - ping over ip6erspan v2 tunnel        ok
>>  12: datapath - ping over geneve tunnel              ok
>>  13: datapath - ping over geneve6 tunnel             ok
>>  14: datapath - clone action                         ok
>>  15: datapath - basic truncate action                ok
>>
>> conntrack
>>
>>  16: conntrack - controller                          ok
>>  17: conntrack - force commit                        ok
>>  18: conntrack - ct flush by 5-tuple                 ok
>>  19: conntrack - IPv4 ping                           ok
>>  20: conntrack - get_nconns and get/set_maxconns     ok
>>  21: conntrack - IPv6 ping                           ok
>>
>> system-ovn
>>
>>  22: ovn -- 2 LRs connected via LS, gateway router, SNAT and DNAT ok
>>  23: ovn -- 2 LRs connected via LS, gateway router, easy SNAT ok
>>  24: ovn -- multiple gateway routers, SNAT and DNAT  ok
>>  25: ovn -- load-balancing                           ok
>>  26: ovn -- load-balancing - same subnet.            ok
>>  27: ovn -- load balancing in gateway router         ok
>>  28: ovn -- multiple gateway routers, load-balancing ok
>>  29: ovn -- load balancing in router with gateway router port ok
>>  30: ovn -- DNAT and SNAT on distributed router - N/S ok
>>  31: ovn -- DNAT and SNAT on distributed router - E/W ok
>>
>> ---
>> v1->v2:
>> - add a list to maintain unused umem elements
>> - remove copy from rx umem to ovs internal buffer
>> - use hugetlb to reduce misses (not much difference)
>> - use pmd mode netdev in OVS (huge performance improvement)
>> - remove malloc dp_packet, instead put dp_packet in umem
>>
>> v2->v3:
>> - rebase on the OVS master, 7ab4b0653784
>>   ("configure: Check for more specific function to pull in pthread library.")
>> - remove the dependency on libbpf and dpif-bpf.
>>   instead, use the built-in XDP_ATTACH feature.
>> - data structure optimizations for better performance, see[1]
>> - more test cases support
>> v3: https://mail.openvswitch.org/pipermail/ovs-dev/2018-November/354179.html
>>
>> v3->v4:
>> - Use AF_XDP API provided by libbpf
>> - Remove the dependency on XDP_ATTACH kernel patch set
>> - Add documentation, bpf.rst
>>
>> William Tu (4):
>>   Add libbpf build support.
>>   netdev-afxdp: add new netdev type for AF_XDP
>>   tests: add AF_XDP netdev test cases.
>>   afxdp netdev: add documentation and configuration.
>>
>>  Documentation/automake.mk             |   1 +
>>  Documentation/index.rst               |   1 +
>>  Documentation/intro/install/bpf.rst   | 182 +++++++
>>  Documentation/intro/install/index.rst |   1 +
>>  acinclude.m4                          |  20 +
>>  configure.ac                          |   1 +
>>  lib/automake.mk                       |   7 +-
>>  lib/dp-packet.c                       |  12 +
>>  lib/dp-packet.h                       |  32 +-
>>  lib/dpif-netdev.c                     |   2 +-
>>  lib/netdev-afxdp.c                    | 491 +++++++++++++++++
>>  lib/netdev-afxdp.h                    |  39 ++
>>  lib/netdev-linux.c                    |  78 ++-
>>  lib/netdev-provider.h                 |   1 +
>>  lib/netdev.c                          |   1 +
>>  lib/xdpsock.c                         | 179 +++++++
>>  lib/xdpsock.h                         | 129 +++++
>>  tests/automake.mk                     |  17 +
>>  tests/system-afxdp-macros.at          | 153 ++++++
>>  tests/system-afxdp-testsuite.at       |  26 +
>>  tests/system-afxdp-traffic.at         | 978 ++++++++++++++++++++++++++++++++++
>>  21 files changed, 2345 insertions(+), 6 deletions(-)
>>  create mode 100644 Documentation/intro/install/bpf.rst
>>  create mode 100644 lib/netdev-afxdp.c
>>  create mode 100644 lib/netdev-afxdp.h
>>  create mode 100644 lib/xdpsock.c
>>  create mode 100644 lib/xdpsock.h
>>  create mode 100644 tests/system-afxdp-macros.at
>>  create mode 100644 tests/system-afxdp-testsuite.at
>>  create mode 100644 tests/system-afxdp-traffic.at
>>
>> -- 
>> 2.7.4