[ovs-dev] [RFC v7 00/13] Support multi-segment mbufs

Tiago Lam tiago.lam at intel.com
Wed May 23 16:47:15 UTC 2018


Overview
========
This patchset introduces support for multi-segment mbufs to OvS-DPDK.
Multi-segment mbufs are typically used when the size of an mbuf is
insufficient to contain the entirety of a packet's data. Instead, the
data is split across numerous mbufs, each carrying a portion, or
'segment', of the packet data. mbufs are chained via their 'next'
attribute (an mbuf pointer).

Use Cases
=========
i.  Handling oversized (guest-originated) frames, which are marked
    for hardware accelration/offload (TSO, for example).

    Packets which originate from a non-DPDK source may be marked for
    offload; as such, they may be larger than the permitted ingress
    interface's MTU, and may be stored in an oversized dp-packet. In
    order to transmit such packets over a DPDK port, their contents
    must be copied to a DPDK mbuf (via dpdk_do_tx_copy). However, in
    its current implementation, that function only copies data into
    a single mbuf; if the space available in the mbuf is exhausted,
    but not all packet data has been copied, then it is lost.
    Similarly, when cloning a DPDK mbuf, it must be considered
    whether that mbuf contains multiple segments. Both issues are
    resolved within this patchset.

ii. Handling jumbo frames.

    While OvS already supports jumbo frames, it does so by increasing
    mbuf size, such that the entirety of a jumbo frame may be handled
    in a single mbuf. This is certainly the preferred, and most
    performant approach (and remains the default). However, it places
    high demands on system memory; multi-segment mbufs may be
    prefereable for systems which are memory-constrained.

Enabling multi-segment mbufs
============================
Multi-segment and single-segment mbufs are mutually exclusive, and the
user must decide on which approach to adopt on init. The introduction
of a new OVSDB field, 'dpdk-multi-seg-mbufs', facilitates this.

This is a global boolean value, which determines how jumbo frames are
represented across all DPDK ports. In the absence of a user-supplied
value, 'dpdk-multi-seg-mbufs' defaults to false, i.e. multi-segment
mbufs must be explicitly enabled / single-segment mbufs remain the
default.

Setting the field is identical to setting existing DPDK-specific OVSDB
fields:

    ovs-vsctl set Open_vSwitch . other_config:dpdk-init=true
    ovs-vsctl set Open_vSwitch . other_config:dpdk-lcore-mask=0x10
    ovs-vsctl set Open_vSwitch . other_config:dpdk-socket-mem=4096,0
==> ovs-vsctl set Open_vSwitch . other_config:dpdk-multi-seg-mbufs=true

---

v7:  - Rebase on master 5e720da ("erspan: fix invalid erspan version.");
     - Address Ilya comments;
        - Fix non-DPDK build;
        - Serialise the access of non pmds to allocation and free of mbufs by
          using a newly introduced mutex.
     - Add a new set of tests that integrates with the recently added DPDK
       testsuite. These focus on allocating dp_packets, with a single or
       multiple mbufs, from an instantiated mempool and performing several
       operations on those, verifying if the data at the end matches what's
       expected;
     - Fix bugs found by the new tests:
        - dp_packet_shift() wasn't taking into account shift lefts;
        - dp_packet_resize__() was misusing and miscalculating the tailrooms
          and headrooms, ending up calculating the wrong number of segments
          that needed allocation;
        - An mbuf's end was being miscalculated both in dp_packet_tail,
          dp_packet_mbuf_tail() and dp_packet_end();
        - dp_packet_set_size() was not updating the number of chained segments
          'nb_segs';
     - Add support for multi-seg mbufs in dp_packet_clear().

v6:  - Rebase on master 7c0cb29 ("conntrack-tcp: Handle tcp session
       reuse.");
     - Further improve dp_packet_put_uninit() and dp_packet_shift() to
       support multi-seg mbufs;
     - Add support for multi-seg mbufs in dp_packet_l4_size() and
       improve other helper funcs, such as dp_packet_tail() and dp_
       packet_tailroom().
     - Add support for multi-seg mbufs in dp_packet_put(), dp_packet_
       put_zeros(), as well as dp_packet_resize__() - allocating new
       mbufs and linking them together;
     Restructured patchset:
     - Squash patch 5 into patch 6, since they were both related to
       copying data while handling multi-seg mbufs;
     - Split patch 4 into two separate patches - one that introduces the
       changes in helper functions to deal with multi-seg mbufs and
       two others for the shift() and put_uninit() functionality;
     - Move patch 4 to before patch 3, so that ihelper functions come
       before functionality improvement that rely on those helpers.

v5: - Rebased on master e5e22dc ("datapath-windows: Prevent ct-counters
      from getting redundantly incremented");
    - Sugesh's comments have been addressed:
      - Changed dp_packet_set_data() and dp_packet_set_size() logic to
        make them independent of each other;
      - Dropped patch 3 now that dp_packet_set_data() and dp_packet_set_
        size() are independent;
      - dp_packet_clone_with_headroom() now has split functions for
        handling DPDK sourced packets and non-DPDK packets;
    - Modified various functions in dp-packet.h to account for multi-seg
      mbufs - dp_packet_put_uninit(), dp_packet_tail(), dp_packet_tail()
      and dp_packet_at();
    - Added support for shifting packet data in multi-seg mbufs, using
      dp_packet_shift();
    - Fixed some minor inconsistencies.

    Note that some of the changes in v5 have been contributed by Mark
    Kavanagh as well.

v4: - restructure patchset
    - account for 128B ARM cacheline when sizing mbufs

Mark Kavanagh (4):
  netdev-dpdk: fix mbuf sizing
  dp-packet: Init specific mbuf fields.
  netdev-dpdk: copy large packet to multi-seg. mbufs
  netdev-dpdk: support multi-segment jumbo frames

Michael Qiu (1):
  dp-packet: copy data from multi-seg. DPDK mbuf

Tiago Lam (8):
  dp-packet: Fix allocated size on DPDK init.
  dp-packet: Fix data_len handling multi-seg mbufs.
  dp-packet: Handle multi-seg mbufs in helper funcs.
  dp-packet: Handle multi-seg mbufs in put*() funcs.
  dp-packet: Handle multi-seg mubfs in shift() func.
  netdev-dpdk: Serialise non-pmds mbufs' alloc/free.
  dp-packet: Handle multi-seg mbufs in resize__().
  dpdk-tests: Add test coverage for multi-seg mbufs.

 NEWS                           |   1 +
 lib/dp-packet.c                | 360 +++++++++++++++++++++++++++++++---
 lib/dp-packet.h                | 328 +++++++++++++++++++++++++------
 lib/dpdk.c                     |   7 +
 lib/netdev-dpdk.c              | 211 +++++++++++++++++---
 lib/netdev-dpdk.h              |   5 +
 tests/automake.mk              |  10 +-
 tests/dpdk-packet-mbufs.at     |   7 +
 tests/system-dpdk-testsuite.at |   1 +
 tests/test-dpdk-mbufs.c        | 434 +++++++++++++++++++++++++++++++++++++++++
 vswitchd/vswitch.xml           |  20 ++
 11 files changed, 1262 insertions(+), 122 deletions(-)
 create mode 100644 tests/dpdk-packet-mbufs.at
 create mode 100644 tests/test-dpdk-mbufs.c

-- 
2.7.4



More information about the dev mailing list