[ovs-dev] [PATCH v4 0/5] XDP offload using flow API provider

Toshiaki Makita toshiaki.makita1 at gmail.com
Mon Nov 2 16:36:29 UTC 2020


On 2020/11/02 18:37, Numan Siddique wrote:
> On Fri, Oct 30, 2020 at 10:49 AM Toshiaki Makita
> <toshiaki.makita1 at gmail.com> wrote:
>>
>> Hi all,
>>
It's been about 3 months since I submitted this patch set.
>> Could someone review this?
>> Or should I resubmit the patch set on the top of current master?
> 
> Since the patches don't apply cleanly, I think you can rebase and
> repost them and/or provide the
> ovs commit id on top of which these patches apply cleanly.

Hi Numan,

Thank you for the advice!

The patches are based on top of commit e8bf77748 ("odp-util: Fix clearing
match mask if set action is partially unnecessary.").
Actually I provided this information at the bottom of the cover letter.

You can also pull the changes from
https://github.com/tmakita/ovs.git (xdp_offload_v4 branch).

Thanks,
Toshiaki Makita

>> On 2020/08/15 10:54, Toshiaki Makita wrote:
>>> Ping.
>>> Any feedback is welcome.
>>>
>>> Thanks,
>>> Toshiaki Makita
>>>
>>> On 2020/07/31 11:55, Toshiaki Makita wrote:
>>>> This patch set adds an XDP-based flow cache using the OVS netdev-offload
>>>> flow API provider.  When XDP offload is enabled on an OVS device,
>>>> packets are first processed by the XDP flow cache (with parsing and
>>>> table lookup implemented in eBPF), and on a hit, the actions are also
>>>> executed in the context of XDP, which has minimal overhead.
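>>>>
>>>> As a much-simplified sketch of that fast path (a single hash table and
>>>> an Ethernet-address-only key; the map names, key layout, and sizes here
>>>> are illustrative, not the actual bpf/flowtable_afxdp.c code):
>>>>
>>>> #include <linux/bpf.h>
>>>> #include <bpf/bpf_helpers.h>
>>>>
>>>> struct flow_key {                     /* illustrative key, not the real one */
>>>>     __u32 in_port;
>>>>     __u8  eth_dst[6];
>>>>     __u8  eth_src[6];
>>>> };
>>>>
>>>> struct flow_actions {                 /* illustrative: output port only */
>>>>     __u32 out_port;
>>>> };
>>>>
>>>> struct {
>>>>     __uint(type, BPF_MAP_TYPE_HASH);
>>>>     __uint(max_entries, 8192);
>>>>     __type(key, struct flow_key);
>>>>     __type(value, struct flow_actions);
>>>> } flow_table SEC(".maps");
>>>>
>>>> struct {
>>>>     __uint(type, BPF_MAP_TYPE_DEVMAP);
>>>>     __uint(max_entries, 64);
>>>>     __uint(key_size, sizeof(__u32));
>>>>     __uint(value_size, sizeof(__u32));
>>>> } output_map SEC(".maps");
>>>>
>>>> SEC("xdp")
>>>> int flow_cache(struct xdp_md *ctx)
>>>> {
>>>>     void *data = (void *)(long)ctx->data;
>>>>     void *data_end = (void *)(long)ctx->data_end;
>>>>     struct flow_key key = { .in_port = ctx->ingress_ifindex };
>>>>     struct flow_actions *acts;
>>>>
>>>>     if (data + 12 > data_end)         /* need the Ethernet addresses */
>>>>         return XDP_PASS;
>>>>     __builtin_memcpy(key.eth_dst, data, 6);
>>>>     __builtin_memcpy(key.eth_src, (char *)data + 6, 6);
>>>>
>>>>     acts = bpf_map_lookup_elem(&flow_table, &key);
>>>>     if (!acts)
>>>>         return XDP_PASS;              /* miss: packet goes to userspace */
>>>>
>>>>     return bpf_redirect_map(&output_map, acts->out_port, 0);
>>>> }
>>>>
>>>> char _license[] SEC("license") = "GPL";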
>>>>
>>>> This provider is based on top of William's recently posted patch for
>>>> custom XDP program loading.  When a custom XDP program is loaded, the
>>>> provider detects whether the program supports the classifier, and if
>>>> so, it starts offloading flows to the XDP program.
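>>>>
>>>> The capability detection relies on metadata embedded in the BPF object
>>>> (the ".ovs_meta" section, see the v3 changelog below).  Purely as a
>>>> hypothetical illustration of the idea, not the actual layout:
>>>>
>>>> /* Hypothetical capability record read by ovs-vswitchd at load time. */
>>>> enum {
>>>>     META_KEY_ETH  = 1 << 0,    /* illustrative key-support flags */
>>>>     META_KEY_VLAN = 1 << 1,
>>>>     META_KEY_IPV4 = 1 << 2,
>>>> };
>>>>
>>>> struct ovs_xdp_meta {
>>>>     __u64 supported_keys;      /* bitmap of supported flow key fields */
>>>>     __u32 max_subtables;       /* classifier capacity */
>>>>     __u32 max_actions_len;
>>>> };
>>>>
>>>> struct ovs_xdp_meta meta SEC(".ovs_meta") = {
>>>>     .supported_keys  = META_KEY_ETH | META_KEY_VLAN | META_KEY_IPV4,
>>>>     .max_subtables   = 16,
>>>>     .max_actions_len = 32,
>>>> };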
>>>>
>>>> The patches are derived from xdp_flow [1], which is a similar mechanism
>>>> but implemented in the kernel.
>>>>
>>>>
>>>> * Motivation
>>>>
>>>> While the userspace datapath using netdev-afxdp or netdev-dpdk shows good
>>>> performance, there are use cases where packets are better processed in
>>>> the kernel, for example TCP/IP connections or container-to-container
>>>> connections.  The current solution is to use a tap device or af_packet,
>>>> with extra kernel-to/from-userspace overhead.  With XDP, a better
>>>> solution is to steer packets earlier in the XDP program and decide
>>>> whether to send them to the userspace datapath or keep them in the kernel.
>>>>
>>>> One problem with the current netdev-afxdp is that it forwards all
>>>> packets to userspace.  The first patch from William (netdev-afxdp:
>>>> Enable loading XDP program.) only provides the interface to load an XDP
>>>> program; however, users usually don't know how to write their own XDP
>>>> program.
>>>>
>>>> XDP also supports HW-offload, so it may be possible to offload flows to
>>>> HW through this provider in the future, although not currently.
>>>> The reason is that map-in-map is required for our program to support a
>>>> classifier with subtables in XDP, but map-in-map is not offloadable.
>>>> If map-in-map becomes offloadable, HW-offload of our program may also
>>>> become possible.
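>>>>
>>>> A sketch of the map-in-map layout in question, in BTF-style map
>>>> declarations (simplified; sizes are illustrative, and flow_key and
>>>> flow_actions are the sketch types from above):
>>>>
>>>> /* One inner hash map per subtable, i.e. per flow mask. */
>>>> struct subtable {
>>>>     __uint(type, BPF_MAP_TYPE_HASH);
>>>>     __uint(max_entries, 4096);
>>>>     __type(key, struct flow_key);     /* masked flow key */
>>>>     __type(value, struct flow_actions);
>>>> };
>>>>
>>>> struct {
>>>>     __uint(type, BPF_MAP_TYPE_ARRAY_OF_MAPS);
>>>>     __uint(max_entries, 16);          /* one slot per subtable */
>>>>     __uint(key_size, sizeof(__u32));
>>>>     __array(values, struct subtable);
>>>> } flow_tables SEC(".maps");
>>>>
>>>> At lookup time the program walks the outer array, masks the packet's
>>>> key with each subtable's mask, and looks the result up in the inner
>>>> hash map; it is this nesting that HW offload cannot handle today.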
>>>>
>>>>
>>>> * How to use
>>>>
>>>> 1. Install clang/llvm >= 9, libbpf >= 0.0.6 (included in kernel 5.5), and
>>>>      kernel >= 5.3.
>>>>
>>>> 2. make with --enable-afxdp --enable-xdp-offload
>>>> --enable-xdp-offload will generate the XDP program
>>>> "bpf/flowtable_afxdp.o".  Note that the BPF object is not installed
>>>> anywhere by "make install" at this point.
>>>>
>>>> 3. Load custom XDP program
>>>> E.g.
>>>> $ ovs-vsctl add-port ovsbr0 veth0 -- set int veth0 options:xdp-mode=native \
>>>>     options:xdp-obj="/path/to/ovs/bpf/flowtable_afxdp.o"
>>>> $ ovs-vsctl add-port ovsbr0 veth1 -- set int veth1 options:xdp-mode=native \
>>>>     options:xdp-obj="/path/to/ovs/bpf/flowtable_afxdp.o"
>>>>
>>>> 4. Enable XDP_REDIRECT
>>>> If you use veth devices, make sure to load some (possibly dummy) program
>>>> on the peers of the veth devices.  This patch set includes a program
>>>> which does nothing but return XDP_PASS.  You can use it for the veth
>>>> peer like this:
>>>> $ ip link set veth1 xdpdrv object /path/to/ovs/bpf/xdp_noop.o section xdp
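>>>>
>>>> For reference, the dummy program is essentially just the following
>>>> (a sketch; the shipped bpf/xdp_noop.c may differ in details such as
>>>> the function name and license string):
>>>>
>>>> #include <linux/bpf.h>
>>>> #include <bpf/bpf_helpers.h>
>>>>
>>>> SEC("xdp")
>>>> int xdp_noop(struct xdp_md *ctx)
>>>> {
>>>>     return XDP_PASS;   /* let every packet continue up the stack */
>>>> }
>>>>
>>>> char _license[] SEC("license") = "GPL";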
>>>>
>>>> Some HW NIC drivers require as many queues as there are cores on the
>>>> system.  Tweak the number of queues using "ethtool -L".
>>>>
>>>> 5. Enable hw-offload
>>>> $ ovs-vsctl set Open_vSwitch . other_config:offload-driver=linux_xdp
>>>> $ ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
>>>> This will start offloading flows to the XDP program.
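>>>>
>>>> Conceptually, "offloading" a flow here means installing a map entry
>>>> into the XDP program from userspace.  A hypothetical sketch (the real
>>>> logic lives in lib/netdev-offload-xdp.c; names are illustrative):
>>>>
>>>> #include <bpf/bpf.h>
>>>>
>>>> /* Install one masked flow into the inner hash map (subtable) that
>>>>  * corresponds to this flow's mask; subtable_fd is that map's fd. */
>>>> static int
>>>> xdp_flow_put(int subtable_fd, const struct flow_key *masked_key,
>>>>              const struct flow_actions *actions)
>>>> {
>>>>     return bpf_map_update_elem(subtable_fd, masked_key, actions,
>>>>                                BPF_ANY);
>>>> }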
>>>>
>>>> You should be able to see some maps installed, including "debug_stats".
>>>> $ bpftool map
>>>>
>>>> If packets are successfully redirected by the XDP program, the counter
>>>> at debug_stats[2] will increase.
>>>> $ bpftool map dump id <ID of debug_stats>
>>>>
>>>> Currently only a very limited set of keys and output actions is
>>>> supported.  For example, a NORMAL action entry and IP-based matching
>>>> work with the current key support.  VLAN actions used by port
>>>> tags/trunks are also supported.
>>>>
>>>>
>>>> * Performance
>>>>
>>>> Two cases were tested: 1) i40e to veth, 2) i40e to i40e.
>>>> Test 1 measured the drop rate at the veth interface with a redirect
>>>> action from the physical interface (i40e 25G NIC, XXV710) to veth.  The
>>>> CPU is a Xeon Silver 4114 (2.20 GHz).
>>>>                                                                  XDP_DROP
>>>>                       +------+                      +-------+    +-------+
>>>>    pktgen -- wire --> | eth0 | -- NORMAL ACTION --> | veth0 |----| veth2 |
>>>>                       +------+                      +-------+    +-------+
>>>>
>>>> Test 2 used i40e instead of veth, and measured the tx packet rate at
>>>> the output device.
>>>>
>>>> Single-flow performance test results:
>>>>
>>>> 1) i40e-veth
>>>>
>>>>     a) no-zerocopy in i40e
>>>>
>>>>       - xdp   3.7 Mpps
>>>>       - afxdp 980 kpps
>>>>
>>>>     b) zerocopy in i40e (veth does not have zc)
>>>>
>>>>       - xdp   1.9 Mpps
>>>>       - afxdp 980 kpps
>>>>
>>>> 2) i40e-i40e
>>>>
>>>>     a) no-zerocopy
>>>>
>>>>       - xdp   3.5 Mpps
>>>>       - afxdp 1.5 Mpps
>>>>
>>>>     b) zerocopy
>>>>
>>>>       - xdp   2.0 Mpps
>>>>       - afxdp 4.4 Mpps
>>>>
>>>> ** xdp is better when zc is disabled.  The reason for the poor
>>>>      performance with zc is that xdp_frame requires packet memory
>>>>      allocation and a memcpy on XDP_REDIRECT to other devices when zc
>>>>      is enabled.
>>>>
>>>> ** afxdp with zc is better than xdp without zc, but afxdp uses 2 cores
>>>>      in this case, one for the pmd and the other for softirq.  When the
>>>>      pmd and softirq were running on the same core, the performance was
>>>>      extremely poor, as the pmd consumes the CPU.  I also tested
>>>>      afxdp-nonpmd, running softirq and userspace processing on the same
>>>>      core, but the result was lower than (pmd results) / 2.
>>>>      With nonpmd, xdp performance was the same as xdp with pmd.  This
>>>>      means xdp uses only one core (for softirq only).  Even with pmd, we
>>>>      need only one pmd for xdp, even when we want to use more cores for
>>>>      multi-flow.
>>>>
>>>>
>>>> This patch set is based on top of commit e8bf77748 ("odp-util: Fix clearing
>>>> match mask if set action is partially unnecessary.").
>>>>
>>>> To make review easier, I have left the pre-squashed commits from v3 here:
>>>> https://github.com/tmakita/ovs/compare/xdp_offload_v3...tmakita:xdp_offload_v4_history?expand=1
>>>>
>>>> [1] https://lwn.net/Articles/802653/
>>>>
>>>> v4:
>>>> - Fix checkpatch errors.
>>>> - Fix duplicate flow api register.
>>>> - Don't call unnecessary flow api init callbacks when default flow api
>>>>     provider can be used.
>>>> - Fix typo in comments.
>>>> - Improve bpf Makefile.am to support automatic dependencies.
>>>> - Add a dummy XDP program for veth peers.
>>>> - Rename netdev_info to netdev_xdp_info.
>>>> - Use id-pool for free subtable entry management and devmap indexes.
>>>> - Rename --enable-bpf to --enable-xdp-offload.
>>>> - Compile xdp flow api provider only with --enable-xdp-offload.
>>>> - Tested again and updated performance numbers in the cover letter (got
>>>>     slightly better numbers).
>>>>
>>>> v3:
>>>> - Use ".ovs_meta" section to inform vswitchd of metadata like supported
>>>>     keys.
>>>> - Rewrite action loop logic in bpf to support multiple actions.
>>>> - Add missing linux/types.h in acinclude.m4, as per William Tu.
>>>> - Fix infinite reconfiguration loop when xsks_map is missing.
>>>> - Add vlan-related actions in bpf program.
>>>> - Fix CI build error.
>>>> - Fix inability to delete subtable entries.
>>>>
>>>> v2:
>>>> - Add uninit callback of netdev-offload-xdp.
>>>> - Introduce "offload-driver" other_config to specify offload driver.
>>>> - Add --enable-bpf (HAVE_BPF) config option to build bpf programs.
>>>> - Workaround incorrect UINTPTR_MAX in x64 clang bpf build.
>>>> - Fix boot.sh autoconf warning.
>>>>
>>>>
>>>> Toshiaki Makita (4):
>>>>     netdev-offload: Add "offload-driver" other_config to specify offload
>>>>       driver
>>>>     netdev-offload: Add xdp flow api provider
>>>>     bpf: Add reference XDP program implementation for netdev-offload-xdp
>>>>     bpf: Add dummy program for veth devices
>>>>
>>>> William Tu (1):
>>>>     netdev-afxdp: Enable loading XDP program.
>>>>
>>>>    .travis.yml                           |    2 +-
>>>>    Documentation/intro/install/afxdp.rst |   59 ++
>>>>    Makefile.am                           |    9 +-
>>>>    NEWS                                  |    2 +
>>>>    acinclude.m4                          |   60 ++
>>>>    bpf/.gitignore                        |    4 +
>>>>    bpf/Makefile.am                       |   83 ++
>>>>    bpf/bpf_compiler.h                    |   25 +
>>>>    bpf/bpf_miniflow.h                    |  179 ++++
>>>>    bpf/bpf_netlink.h                     |   63 ++
>>>>    bpf/bpf_workaround.h                  |   28 +
>>>>    bpf/flowtable_afxdp.c                 |  585 ++++++++++++
>>>>    bpf/xdp_noop.c                        |   31 +
>>>>    configure.ac                          |    2 +
>>>>    lib/automake.mk                       |    8 +
>>>>    lib/bpf-util.c                        |   38 +
>>>>    lib/bpf-util.h                        |   22 +
>>>>    lib/netdev-afxdp.c                    |  373 +++++++-
>>>>    lib/netdev-afxdp.h                    |    3 +
>>>>    lib/netdev-linux-private.h            |    5 +
>>>>    lib/netdev-offload-provider.h         |    8 +-
>>>>    lib/netdev-offload-xdp.c              | 1213 +++++++++++++++++++++++++
>>>>    lib/netdev-offload-xdp.h              |   49 +
>>>>    lib/netdev-offload.c                  |   42 +
>>>>    24 files changed, 2881 insertions(+), 12 deletions(-)
>>>>    create mode 100644 bpf/.gitignore
>>>>    create mode 100644 bpf/Makefile.am
>>>>    create mode 100644 bpf/bpf_compiler.h
>>>>    create mode 100644 bpf/bpf_miniflow.h
>>>>    create mode 100644 bpf/bpf_netlink.h
>>>>    create mode 100644 bpf/bpf_workaround.h
>>>>    create mode 100644 bpf/flowtable_afxdp.c
>>>>    create mode 100644 bpf/xdp_noop.c
>>>>    create mode 100644 lib/bpf-util.c
>>>>    create mode 100644 lib/bpf-util.h
>>>>    create mode 100644 lib/netdev-offload-xdp.c
>>>>    create mode 100644 lib/netdev-offload-xdp.h
>>>>
>> _______________________________________________
>> dev mailing list
>> dev at openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

