[ovs-dev] [RFC PATCH 0/5] XDP offload using flow API provider

William Tu u9012063 at gmail.com
Wed Mar 11 21:17:26 UTC 2020


Adding a couple of people who might be interested in this feature.

On Tue, Mar 10, 2020 at 8:29 AM Toshiaki Makita
<toshiaki.makita1 at gmail.com> wrote:
>
> This patch set adds an XDP-based flow cache using the OVS netdev-offload
> flow API provider.  When XDP offload is enabled on an OVS device,
> packets are first processed by the XDP flow cache (with parsing and
> table lookup implemented in eBPF), and on a hit the actions are also
> executed in the XDP context, which has minimal overhead.
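>
> To make this concrete, below is a minimal sketch of that processing
> flow.  It is not the actual bpf/flowtable_afxdp.c; the key and action
> layouts (flow_key, flow_actions) and the map name are simplified
> assumptions of mine:
>
>   #include <linux/bpf.h>
>   #include <bpf/bpf_helpers.h>
>
>   struct flow_key {                   /* assumed, simplified key */
>       __u32 src_ip;
>       __u32 dst_ip;
>   };
>
>   struct flow_actions {               /* assumed, simplified action */
>       int out_ifindex;
>   };
>
>   struct {
>       __uint(type, BPF_MAP_TYPE_HASH);
>       __uint(max_entries, 1024);
>       __type(key, struct flow_key);
>       __type(value, struct flow_actions);
>   } flow_cache SEC(".maps");
>
>   SEC("xdp")
>   int flow_cache_prog(struct xdp_md *ctx)
>   {
>       struct flow_key key = { 0 };
>
>       /* ... parse Ethernet/IP headers from ctx->data into key ... */
>
>       struct flow_actions *acts = bpf_map_lookup_elem(&flow_cache, &key);
>       if (!acts)
>           return XDP_PASS;            /* miss: fall back to OVS datapath */
>
>       /* Hit: execute the cached action without leaving XDP context. */
>       return bpf_redirect(acts->out_ifindex, 0);
>   }
>
>   char _license[] SEC("license") = "GPL";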
>
> This provider is based on top of William's recently posted patch for
> custom XDP program loading.  When a custom XDP program is loaded, the
> provider detects whether the program supports the classifier, and if
> so, starts offloading flows to the XDP program.
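>
> The detection could, for example, check that the loaded BPF object
> exposes the maps the offload code needs.  The following is only a
> hedged sketch of that idea using libbpf; the map names ("flow_table",
> "subtbl_masks") are illustrative assumptions, not the actual
> interface:
>
>   #include <stdbool.h>
>   #include <bpf/libbpf.h>
>
>   static bool
>   xdp_supports_classifier(struct bpf_object *obj)
>   {
>       /* The classifier is usable only if the expected maps exist. */
>       return bpf_object__find_map_by_name(obj, "flow_table")
>              && bpf_object__find_map_by_name(obj, "subtbl_masks");
>   }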
>
> The patches are derived from xdp_flow [1], a similar mechanism
> implemented in the kernel.
>
>
> * Motivation
>
> While the userspace datapath using netdev-afxdp or netdev-dpdk shows
> good performance, there are use cases where packets are better
> processed in the kernel, for example TCP/IP connections or
> container-to-container connections.  The current solution is to use a
> tap device or af_packet, with extra kernel-to/from-userspace overhead.
> With XDP, a better solution is to steer packets earlier, in the XDP
> program, which decides whether to send them to the userspace datapath
> or keep them in the kernel.
>
> One problem with the current netdev-afxdp is that it forwards all
> packets to userspace.  The first patch from William (netdev-afxdp:
> Enable loading XDP program.) only provides the interface to load an
> XDP program; however, users usually don't know how to write their own
> XDP programs.
>
> XDP also supports HW offload, so it may be possible to offload flows
> to hardware through this provider in the future, although not
> currently.  The reason is that map-in-map is required for our program
> to support a classifier with subtables in XDP, but map-in-map is not
> offloadable.  If map-in-map becomes offloadable, HW offload of our
> program will also be doable.
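>
> For illustration, a classifier with subtables replaces the single
> hash table of the earlier sketch with an outer ARRAY_OF_MAPS indexed
> by subtable id, each entry being a hash map keyed by the masked flow
> key.  This assumes a libbpf and kernel recent enough for BTF-defined
> map-in-map; the names and sizes are mine, not the actual program's:
>
>   struct subtable {
>       __uint(type, BPF_MAP_TYPE_HASH);
>       __uint(max_entries, 1024);
>       __type(key, struct flow_key);        /* masked key */
>       __type(value, struct flow_actions);
>   };
>
>   struct {
>       __uint(type, BPF_MAP_TYPE_ARRAY_OF_MAPS);
>       __uint(max_entries, 16);             /* max number of subtables */
>       __type(key, __u32);                  /* subtable id */
>       __array(values, struct subtable);
>   } flow_table SEC(".maps");
>
>   /* Lookup walks the subtables in order; the first match wins:
>    *
>    *   for (__u32 i = 0; i < 16; i++) {
>    *       void *subtbl = bpf_map_lookup_elem(&flow_table, &i);
>    *       if (!subtbl)
>    *           continue;
>    *       acts = bpf_map_lookup_elem(subtbl, &masked_key);
>    *       if (acts)
>    *           break;
>    *   }
>    */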
>
>
> * How to use
>
> 1. Install clang/llvm >= 9, libbpf >= 0.0.4, and kernel >= 5.3.
>
> 2. make with --enable-afxdp
> This generates the XDP program "bpf/flowtable_afxdp.o".  Note that the
> BPF object is not installed anywhere by "make install" at this point.
>
> 3. Load custom XDP program
> E.g.
> $ ovs-vsctl add-port ovsbr0 veth0 -- set int veth0 options:xdp-mode=native \
>   options:xdp-obj="path/to/ovs/bpf/flowtable_afxdp.o"
> $ ovs-vsctl add-port ovsbr0 veth1 -- set int veth1 options:xdp-mode=native \
>   options:xdp-obj="path/to/ovs/bpf/flowtable_afxdp.o"
>
> 4. Enable XDP_REDIRECT
> If you use veth devices, make sure to load some (possibly dummy) XDP
> programs on the peers of the veth devices; otherwise packets
> redirected to a veth will be dropped.
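>
> For reference, such a dummy program can be as small as the sketch
> below.  Attaching an XDP program to the veth peer enables its XDP
> receive path, which XDP_REDIRECT into a veth needs:
>
>   #include <linux/bpf.h>
>   #include <bpf/bpf_helpers.h>
>
>   SEC("xdp")
>   int xdp_dummy(struct xdp_md *ctx)
>   {
>       return XDP_PASS;    /* pass every packet up the stack */
>   }
>
>   char _license[] SEC("license") = "GPL";
>
> Compile it with "clang -O2 -target bpf -c" and attach it with
> "ip link set dev <peer> xdp obj xdp_dummy.o sec xdp".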
>
> 5. Enable hw-offload
> $ ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
> This starts offloading flows to the XDP program.
>
> You should be able to see some maps installed, including "debug_stats".
> $ bpftool map
>
> If packets are successfully redirected by the XDP program,
> debug_stats[2] is incremented.
> $ bpftool map dump id <ID of debug_stats>
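>
> To read the counter programmatically instead of via bpftool, here is
> a hedged sketch with libbpf, assuming debug_stats is a plain
> BPF_MAP_TYPE_ARRAY of u64 counters (the real layout may differ, e.g.
> it may be a per-CPU map):
>
>   #include <stdint.h>
>   #include <stdio.h>
>   #include <bpf/bpf.h>
>
>   /* map_fd: e.g. from bpf_map_get_fd_by_id() for debug_stats. */
>   int print_redirect_count(int map_fd)
>   {
>       uint32_t idx = 2;       /* redirected-packet counter, as above */
>       uint64_t count = 0;
>
>       if (bpf_map_lookup_elem(map_fd, &idx, &count) < 0)
>           return -1;
>       printf("redirected: %llu\n", (unsigned long long)count);
>       return 0;
>   }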
>
> Currently only a very limited set of keys and output actions is
> supported.  For example, NORMAL action entries and IP-based matching
> work with the current key support.
>
>
> * Performance
>
> Tested two cases: 1) i40e to veth, 2) i40e to i40e.
> Test 1 measured the drop rate at the veth interface, with a redirect
> action from a physical interface (i40e 25G NIC, XXV710) to veth.  The
> CPU is a Xeon Silver 4114 (2.20 GHz).
>                                                                XDP_DROP
>                     +------+                      +-------+    +-------+
>  pktgen -- wire --> | eth0 | -- NORMAL ACTION --> | veth0 |----| veth2 |
>                     +------+                      +-------+    +-------+
>
> Test 2 uses i40e instead of veth, and measures the TX packet rate at
> the output device.
>
> Single-flow performance test results:
>
> 1) i40e-veth
>
>   a) no-zerocopy in i40e
>
>     - xdp   3.7 Mpps
>     - afxdp 820 kpps
>
>   b) zerocopy in i40e (veth does not have zc)
>
>     - xdp   1.8 Mpps
>     - afxdp 800 kpps
>
> 2) i40e-i40e
>
>   a) no-zerocopy
>
>     - xdp   3.0 Mpps
>     - afxdp 1.1 Mpps
>
>   b) zerocopy
>
>     - xdp   1.7 Mpps
>     - afxdp 4.0 Mpps
>
> ** xdp is better when zc is disabled.  The reason for the poor
>    performance with zc is that XDP_REDIRECT to another device
>    requires packet memory allocation and a memcpy into an xdp_frame
>    when zc is enabled.
>
> ** afxdp with zc is better than xdp without zc, but afxdp uses two
>    cores in this case: one for the pmd and the other for softirq.
>    When the pmd and softirq ran on the same core, performance was
>    extremely poor, as the pmd consumed the CPU.
>    When offloading to xdp, xdp uses only softirq, while the pmd still
>    consumes 100% of a CPU.  This means we probably need only one pmd
>    for xdp even when we want to use more cores for multi-flow.
>    I'll also test afxdp-nonpmd when it is applied.
>
>
> This patch set is based on top of commit 59e994426 ("datapath: Update
> kernel test list, news and FAQ").
>
> [1] https://lwn.net/Articles/802653/
>
> Toshiaki Makita (4):
>   netdev-offload: Add xdp flow api provider
>   netdev-offload: Register xdp flow api provider
>   tun_metadata: Use OVS_ALIGNED_VAR to align opts field
>   bpf: Add reference XDP program implementation for netdev-offload-xdp
>
> William Tu (1):
>   netdev-afxdp: Enable loading XDP program.
>
>  Documentation/intro/install/afxdp.rst |   59 ++
>  Makefile.am                           |   10 +-
>  NEWS                                  |    2 +
>  bpf/.gitignore                        |    4 +
>  bpf/Makefile.am                       |   56 ++
>  bpf/bpf_miniflow.h                    |  199 +++++
>  bpf/bpf_netlink.h                     |   34 +
>  bpf/flowtable_afxdp.c                 |  510 +++++++++++
>  configure.ac                          |    1 +
>  include/openvswitch/tun-metadata.h    |    6 +-
>  lib/automake.mk                       |    6 +-
>  lib/bpf-util.c                        |   38 +
>  lib/bpf-util.h                        |   22 +
>  lib/netdev-afxdp.c                    |  342 +++++++-
>  lib/netdev-afxdp.h                    |    3 +
>  lib/netdev-linux-private.h            |    5 +
>  lib/netdev-offload-provider.h         |    3 +
>  lib/netdev-offload-xdp.c              | 1116 +++++++++++++++++++++++++
>  lib/netdev-offload-xdp.h              |   49 ++
>  lib/netdev.c                          |    4 +-
>  20 files changed, 2452 insertions(+), 17 deletions(-)
>  create mode 100644 bpf/.gitignore
>  create mode 100644 bpf/Makefile.am
>  create mode 100644 bpf/bpf_miniflow.h
>  create mode 100644 bpf/bpf_netlink.h
>  create mode 100644 bpf/flowtable_afxdp.c
>  create mode 100644 lib/bpf-util.c
>  create mode 100644 lib/bpf-util.h
>  create mode 100644 lib/netdev-offload-xdp.c
>  create mode 100644 lib/netdev-offload-xdp.h
>
> --
> 2.24.1
>

