[ovs-dev] [RFC PATCHv2 00/13] OVS eBPF datapath.

William Tu u9012063 at gmail.com
Sat Jul 14 11:38:52 UTC 2018


Today OVS has three datapath types: Linux kernel (dpif-netlink),
userspace (dpif-netdev), and Windows.  This series adds another type of
OVS datapath: the eBPF datapath (dpif-bpf).

eBPF stands for extended Berkeley Packet Filter.  It enables userspace
applications to customize and extend the Linux kernel's functionality.
Implementing the OVS datapath in eBPF has three benefits:
- Flexibility: new features can be added as eBPF bytecode and loaded
  dynamically into the Linux kernel.
- Safety: the BPF verifier guarantees that the eBPF bytecode does not
  crash the kernel.
- Portability: the eBPF bytecode is platform-agnostic, so hopefully the
  same implementation can run on different platforms.

The implementation tries to re-implement in eBPF whatever is under the
Linux kernel's net/openvswitch/*.  However, this series is still far
from complete; a couple of eBPF limitations make this difficult.


OVS eBPF Architecture
=====================
OVS has the following architecture:
               _
              |   +-------------------+
              |   |    ovs-vswitchd   |
              |   +-------------------+
    userspace |   |      ofproto      |<-->OpenFlow controllers
              |   +--------+-+--------+  
              |   | netdev | |ofproto-|
              |   +--------+ |  dpif  |
                  | netdev | +--------+
   *eBPF hook --> |provider| |  dpif  |
                  +---||---+ +--------+
              |       ||     |  dpif  | <--- *eBPF provider
              |       ||     |provider|
              |_      ||     +---||---+
                      ||         ||
               _  +---||-----+---||---+
              |   |          |datapath| <--- *eBPF datapath
       kernel |   |          +--------+
              |   |                   |
              |_  +--------||---------+
                           ||
                        physical
                           NIC

And the patch adds:
- eBPF hook for attaching eBPF/XDP program to netdev,
  files: lib/netdev-linux.*
- eBPF dpif provider, an interface to communicate with eBPF datapath
  files: lib/dpif-bpf.*, lib/dpif-bpf-odp.*
- eBPF datapath, the implementation of OVS datapath in eBPF
  files: bpf/datapath.c, bpf/*.h 

Most of the design and implementation is described in the OSR2018
paper [1], "Building an Extensible Open vSwitch Datapath", and at the
OVS conference [2], "Offloading OVS Flow Processing using eBPF".
[1] https://dl.acm.org/citation.cfm?id=3139657
[2] http://openvswitch.org/support/ovscon2016/7/1120-tu.pdf 


eBPF/XDP
========
A single BPF object file, 'bpf/datapath.o', is generated and loaded into
all netdevs attached to OVS, at either the ingress or the egress of the
netdev.  A packet traversing the OVS eBPF datapath typically goes through
three stages: parse, lookup, and action.  Each stage consists of multiple
eBPF programs, and the stages are chained together by tail calls.
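
To illustrate how stages can be chained by tail calls, here is a minimal
userspace model in plain C.  It is only a sketch and not code from this
series: the stage indexes, struct pkt_ctx, tail_call() and run_pipeline()
are hypothetical names, and a function-pointer table stands in for a
BPF_MAP_TYPE_PROG_ARRAY map.  In a real eBPF program the hand-off would be
done with the bpf_tail_call() helper instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stage indexes; the real datapath uses its own numbering. */
enum { STAGE_PARSE, STAGE_LOOKUP, STAGE_ACTION, STAGE_MAX };

struct pkt_ctx {
    int trace[STAGE_MAX];   /* Stages visited, in order. */
    int n_trace;
};

typedef int (*stage_fn)(struct pkt_ctx *);

static int do_parse(struct pkt_ctx *ctx);
static int do_lookup(struct pkt_ctx *ctx);
static int do_action(struct pkt_ctx *ctx);

/* Stands in for a BPF_MAP_TYPE_PROG_ARRAY map. */
static const stage_fn prog_array[STAGE_MAX] = {
    [STAGE_PARSE]  = do_parse,
    [STAGE_LOOKUP] = do_lookup,
    [STAGE_ACTION] = do_action,
};

/* Models bpf_tail_call(): on success, control moves to the target program
 * and the caller does nothing further, which is why every caller below
 * uses it as its final statement.  An invalid or empty slot is reported
 * as -1 here. */
static int tail_call(struct pkt_ctx *ctx, int index)
{
    if (index < 0 || index >= STAGE_MAX || prog_array[index] == NULL) {
        return -1;
    }
    return prog_array[index](ctx);
}

/* Remember that a stage ran, so the order can be inspected afterwards. */
static void record(struct pkt_ctx *ctx, int stage)
{
    assert(ctx->n_trace < STAGE_MAX);
    ctx->trace[ctx->n_trace++] = stage;
}

static int do_parse(struct pkt_ctx *ctx)
{
    record(ctx, STAGE_PARSE);
    return tail_call(ctx, STAGE_LOOKUP);
}

static int do_lookup(struct pkt_ctx *ctx)
{
    record(ctx, STAGE_LOOKUP);
    return tail_call(ctx, STAGE_ACTION);
}

static int do_action(struct pkt_ctx *ctx)
{
    record(ctx, STAGE_ACTION);
    return 0;               /* End of the chain. */
}

/* Entry point of the model: start the chain at the parse stage. */
int run_pipeline(struct pkt_ctx *ctx)
{
    ctx->n_trace = 0;
    return tail_call(ctx, STAGE_PARSE);
}
```

Calling run_pipeline() on a struct pkt_ctx visits parse, lookup, and action
in that order.  The table indirection is what matters here: a stage only
names the slot it jumps to, so the program behind a slot can be replaced
without touching the callers.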

'objdump -h bpf/datapath.o' shows the sections of the OVS eBPF datapath
object file:

  1 tail-32       000018b0  0000000000000000  0000000000000000  00000040  2**3
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  2 tail-33       00001b08  0000000000000000  0000000000000000  000018f0  2**3
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  3 tail-0        000000b8  0000000000000000  0000000000000000  000033f8  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  4 tail-1        00000a48  0000000000000000  0000000000000000  000034b0  2**3
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
    <skip>
 16 tail-13       00000aa0  0000000000000000  0000000000000000  000095d8  2**3
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
 17 xdp           00000070  0000000000000000  0000000000000000  0000a078  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 18 af_xdp        00000010  0000000000000000  0000000000000000  0000a0e8  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 19 tail-35       000008b0  0000000000000000  0000000000000000  0000a0f8  2**3
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
 20 ingress       00000178  0000000000000000  0000000000000000  0000a9a8  2**3
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
 21 egress        00000188  0000000000000000  0000000000000000  0000ab20  2**3
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
 22 downcall      000003b0  0000000000000000  0000000000000000  0000aca8  2**3
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
 23 maps          000000fc  0000000000000000  0000000000000000  0000b058  2**2

The programs 'tail-{0-13}' implement the OVS actions; see bpf/action.h.
The programs ingress, egress, and downcall are the three possible entry
points at which a packet triggers the eBPF program; from there, each one
tail calls the next stage.  The programs xdp and af_xdp are still empty,
reserved for future integration.

Currently llvm/clang-4.0 doesn't work; we have to use version 3.8.

Testsuite
=========
We create a set of test cases under tests/system-bpf-traffic.at,
which is a subset of the kernel datapath testsuite (system-traffic.at).

'make check-bpf' kicks off the testing.  So far, this series produces
the following results:

BPF datapath-sanity

  1: datapath - basic BPF commands                   ok
  2: datapath - ping between two ports               ok
  3: datapath - http between two ports               ok
  4: datapath - ping between two ports on vlan       ok
  5: datapath - ping between two ports on cvlan      ok
  6: datapath - ping6 between two ports              ok
  7: datapath - ping6 between two ports on vlan      ok
  8: datapath - ping6 between two ports on cvlan     ok 
  9: datapath - ping over bond                       skipped (system-bpf-traffic.at:210)
 10: datapath - ping over vxlan tunnel               ok
 11: datapath - ping over vxlan6 tunnel              ok
 12: datapath - ping over gre tunnel                 ok
 13: datapath - ping over geneve tunnel              ok
 14: datapath - ping over geneve tunnel with TLV     ok
 15: datapath - ping over geneve6 tunnel             ok
 16: datapath - clone action                         FAILED (system-bpf-traffic.at:487)
 17: datapath - mpls actions                         FAILED (system-bpf-traffic.at:526)
 18: datapath - basic truncate action                FAILED (system-bpf-traffic.at:586)
 19: datapath - truncate and output to gre tunnel    FAILED (system-bpf-traffic.at:720)
 20: ovn -- 1 LR connects 2 LSes                     ok

The log of each test is saved at tests/system-bpf-testsuite.dir/<id>/

The series is based on top of OVS 2.9.1,
commit f8b6477aa019 ("Set release date for 2.9.1.")

It is also available on my GitHub,
# git clone https://github.com/williamtu/ovs-ebpf
at branch rfc or rfc2.

v1->v2:
- Many fixes: parser, vlan, byte-ordering, set and set_masked action.
- Two new test cases: OVN and Geneve TLV support.
- Verifier: stack utilization is much smaller, thanks to Paul and Alexei
  for pointing out the patch in 4.18-rc1:
  ("bpf: allow map helpers access to map values directly")

Details:
2df75277a33d bpf/parser.h: re-write bpf parser.
c22a49fa5a0a lib/dpif-bpf-odp: WIP: add MPLS
31a148de8d54 tests: filter out ERR only from revalidator.
09bcd43a97fb tests: add geneve TLV testcase.
f2c45671e679 lib/dpif-bpf-odp.c: fix byte ordering warning.
8096c7bfa2f5 bpf: fix set and set masked action.
1d053f9afdf4 tests: install bpf module when running make check-bpf
e2378e513f16 Merge pull request #2 from williamtu/rfc-pr2
1cd542d30db0 bpf: fix vlan and cvlan.
fa71e891ac7c bpf: make byte-ordering hard to break.
ed089428e1ce bpf: remove packet_length in bpf_flow_key.
a5d872233eed xdp: early drop ipv6 packet.


Joe Stringer (7):
  ovs-bpf: add documentation and configuration.
  netdev: add ebpf support for netdev provider.
  lib: implement perf event ringbuffer for upcall.
  lib/bpf: add support for managing bpf program/map.
  dpif: add 'dpif-bpf' provider.
  dpif-bpf-odp: Add bpf datapath interface and impl.
  utilities: Add ovs-bpfctl utility.

William Tu (6):
  bpf: implement OVS BPF datapath.
  vswitch/bridge.c: add bpf datapath initialization.
  tests: Add "make check-bpf" traffic target.
  vagrant: add ebpf support using ubuntu/bionic
  ofproto: disable megaflow for bpf datapath.
  xdp: early drop ipv6 packet.

 Documentation/automake.mk             |    1 +
 Documentation/index.rst               |    2 +-
 Documentation/intro/install/bpf.rst   |  142 +++
 Documentation/intro/install/index.rst |    1 +
 Makefile.am                           |   13 +-
 Vagrantfile-eBPF                      |   99 ++
 acinclude.m4                          |   39 +
 bpf/.gitignore                        |    4 +
 bpf/action.h                          |  715 ++++++++++++
 bpf/api.h                             |  279 +++++
 bpf/automake.mk                       |   60 +
 bpf/datapath.c                        |  192 ++++
 bpf/datapath.h                        |   71 ++
 bpf/generated_headers.h               |  182 +++
 bpf/helpers.h                         |  248 ++++
 bpf/lookup.h                          |  228 ++++
 bpf/maps.h                            |  170 +++
 bpf/odp-bpf.h                         |  255 +++++
 bpf/openvswitch.h                     |   49 +
 bpf/ovs-p4.h                          |   90 ++
 bpf/ovs-proto.p4                      |  329 ++++++
 bpf/parser.h                          |  344 ++++++
 bpf/xdp.h                             |   53 +
 configure.ac                          |    1 +
 include/linux/pkt_cls.h               |   21 +
 lib/automake.mk                       |   12 +
 lib/bpf.c                             |  524 +++++++++
 lib/bpf.h                             |   69 ++
 lib/dpif-bpf-odp.c                    |  945 ++++++++++++++++
 lib/dpif-bpf-odp.h                    |   47 +
 lib/dpif-bpf.c                        | 1996 +++++++++++++++++++++++++++++++++
 lib/dpif-netdev.c                     |   29 +-
 lib/dpif-provider.h                   |    1 +
 lib/dpif.c                            |    3 +
 lib/netdev-bsd.c                      |    2 +
 lib/netdev-dpdk.c                     |    2 +
 lib/netdev-dummy.c                    |    2 +
 lib/netdev-linux.c                    |  436 ++++++-
 lib/netdev-linux.h                    |    2 +
 lib/netdev-provider.h                 |   11 +
 lib/netdev-vport.c                    |  145 ++-
 lib/netdev.c                          |   25 +
 lib/netdev.h                          |    4 +
 lib/packets.h                         |    6 +-
 lib/perf-event.c                      |  288 +++++
 lib/perf-event.h                      |   43 +
 ofproto/ofproto-dpif-upcall.c         |    4 +
 ofproto/ofproto-dpif.c                |   69 +-
 tests/.gitignore                      |    1 +
 tests/automake.mk                     |   31 +-
 tests/ofproto-macros.at               |    7 +
 tests/system-bpf-macros.at            |  112 ++
 tests/system-bpf-testsuite.at         |   25 +
 tests/system-bpf-testsuite.patch      |   10 +
 tests/system-bpf-traffic.at           |  851 ++++++++++++++
 utilities/automake.mk                 |    9 +
 utilities/ovs-bpfctl.8.xml            |   45 +
 utilities/ovs-bpfctl.c                |  248 ++++
 vswitchd/bridge.c                     |   21 +
 59 files changed, 9560 insertions(+), 53 deletions(-)
 create mode 100644 Documentation/intro/install/bpf.rst
 create mode 100644 Vagrantfile-eBPF
 create mode 100644 bpf/.gitignore
 create mode 100644 bpf/action.h
 create mode 100644 bpf/api.h
 create mode 100644 bpf/automake.mk
 create mode 100644 bpf/datapath.c
 create mode 100644 bpf/datapath.h
 create mode 100644 bpf/generated_headers.h
 create mode 100644 bpf/helpers.h
 create mode 100644 bpf/lookup.h
 create mode 100644 bpf/maps.h
 create mode 100644 bpf/odp-bpf.h
 create mode 100644 bpf/openvswitch.h
 create mode 100644 bpf/ovs-p4.h
 create mode 100644 bpf/ovs-proto.p4
 create mode 100644 bpf/parser.h
 create mode 100644 bpf/xdp.h
 create mode 100644 lib/bpf.c
 create mode 100644 lib/bpf.h
 create mode 100644 lib/dpif-bpf-odp.c
 create mode 100644 lib/dpif-bpf-odp.h
 create mode 100644 lib/dpif-bpf.c
 create mode 100644 lib/perf-event.c
 create mode 100644 lib/perf-event.h
 create mode 100644 tests/system-bpf-macros.at
 create mode 100644 tests/system-bpf-testsuite.at
 create mode 100644 tests/system-bpf-testsuite.patch
 create mode 100644 tests/system-bpf-traffic.at
 create mode 100644 utilities/ovs-bpfctl.8.xml
 create mode 100644 utilities/ovs-bpfctl.c

-- 
2.7.4


