[ovs-dev] [RFC v2 PATCH 0/4] XDP offload using flow API provider

William Tu u9012063 at gmail.com
Wed May 20 23:52:44 UTC 2020


On Sat, May 16, 2020 at 7:43 AM Toshiaki Makita
<toshiaki.makita1 at gmail.com> wrote:
>
> Hi William,
>
> On 2020/05/08 0:08, Toshiaki Makita wrote:
> > On 2020/05/05 23:29, William Tu wrote:
> >> On Tue, Apr 21, 2020 at 11:47:00PM +0900, Toshiaki Makita wrote:
> >>> This patch adds an XDP-based flow cache using the OVS netdev-offload
> >>> flow API provider.  When XDP offload is enabled on an OVS device,
> >>> packets are first processed by the XDP flow cache (with parsing and
> >>> table lookup implemented in eBPF), and on a hit the actions are also
> >>> executed in the XDP context, which has minimal overhead.
> >>>
> >>> This provider is based on top of William's recently posted patch for
> >>> custom XDP load.  When a custom XDP is loaded, the provider detects if
> >>> the program supports classifier, and if supported it starts offloading
> >>> flows to the XDP program.
> >>>
> >>> The patches are derived from xdp_flow[1], which is a mechanism similar to
> >>> this but implemented in kernel.
> >>>
> >>>
> >>> * Motivation
> >>>
> >>> While the userspace datapath using netdev-afxdp or netdev-dpdk shows good
> >>> performance, there are use cases where packets are better processed in the
> >>> kernel, for example TCP/IP connections or container-to-container
> >>> connections.  The current solution is to use a tap device or af_packet,
> >>> with extra kernel-to/from-userspace overhead.  With XDP, a better solution
> >>> is to steer packets early in the XDP program and decide whether to send
> >>> them to the userspace datapath or keep them in the kernel.
> >>>
> >>> One problem with the current netdev-afxdp is that it forwards all packets
> >>> to userspace.  The first patch from William (netdev-afxdp: Enable loading
> >>> XDP program.) only provides the interface to load an XDP program; however,
> >>> users usually don't know how to write their own XDP program.
> >>>
> >>> XDP also supports HW-offload, so it may be possible to offload flows to
> >>> HW through this provider in the future, although not currently.  The
> >>> reason is that map-in-map is required for our program to support a
> >>> classifier with subtables in XDP, but map-in-map is not offloadable.
> >>> If map-in-map becomes offloadable, HW-offload of our program will also
> >>> become doable.
> >>>
> >>>
> >>> * How to use
> >>>
> >>> 1. Install clang/llvm >= 9, libbpf >= 0.0.4, and kernel >= 5.3.
> >>>
> >>> 2. make with --enable-afxdp --enable-bpf
> >>> --enable-bpf will generate XDP program "bpf/flowtable_afxdp.o".  Note that
> >>> the BPF object will not be installed anywhere by "make install" at this point.
> >>>
> >>> 3. Load custom XDP program
> >>> E.g.
> >>> $ ovs-vsctl add-port ovsbr0 veth0 -- set int veth0 options:xdp-mode=native \
> >>>    options:xdp-obj="path/to/ovs/bpf/flowtable_afxdp.o"
> >>> $ ovs-vsctl add-port ovsbr0 veth1 -- set int veth1 options:xdp-mode=native \
> >>>    options:xdp-obj="path/to/ovs/bpf/flowtable_afxdp.o"
> >>>
> >>> 4. Enable XDP_REDIRECT
> >>> If you use veth devices, make sure to load some (possibly dummy) programs
> >>> on the peers of veth devices.
> >>
> >> Hi Toshiaki,
> >>
> >> What kind of dummy program to put at the other side of veth?
> >
> > A program which just returns XDP_PASS should be sufficient.
> > e.g.
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/bpf/progs/xdp_dummy.c
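For reference, such a pass-through program is only a few lines. This is a sketch modeled on the linked xdp_dummy.c; the SEC() fallback is present only so the file also compiles natively, outside the usual clang -target bpf flow.

```c
#include <linux/bpf.h>   /* XDP_PASS, struct xdp_md */

/* SEC() places the symbol in a named ELF section so libbpf can find
 * the program when loading the object file. */
#ifndef SEC
#define SEC(name) __attribute__((section(name), used))
#endif

SEC("xdp")
int xdp_dummy_prog(struct xdp_md *ctx)
{
    (void)ctx;           /* no parsing: let every packet continue */
    return XDP_PASS;
}

char _license[] SEC("license") = "GPL";
```

Assuming iproute2 is available, it can be built with "clang -O2 -target bpf -c xdp_dummy.c -o xdp_dummy.o" and attached to each veth peer with "ip link set dev <peer> xdp obj xdp_dummy.o sec xdp".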
> >
> >
> >> I'm trying to create an end-to-end test using veth, similar to
> >> the ping test in tests/system-traffic.at
> >>
> >> At the other side of veth, I use
> >> $/bpf-next/samples/bpf# ./xdp_rxq_info -d p0 -S -a XDP_PASS
> >>
> >> but somehow around 90% of the ICMP packets are dropped; I'm still
> >> debugging the reason.
> >
> > I'm going to run the ping test based on the current master in a couple of days.
>
> Sorry for the delay.
> I can successfully ping between veth devices with the current master.
>
> veth0---veth1---ovs---veth2---veth3(in netns)
>
> ping between veth0 and veth3 succeeded without packet loss and with debug_stats[2]
> counted.
>
> Do you still have the problem?
>
Thanks, I will test it again tomorrow and get back to you!
William
