[ovs-dev] [PATCH v4 00/27] dpif-netdev: Parallel offload processing

Gaetan Rivet grive at u256.net
Wed Jun 9 13:09:08 UTC 2021


This patch series aims to improve the performance of hw-offload
management in dpif-netdev. In the current version, some setups
experience high memory usage and poor latency between the moment
a flow offload is decided and the moment it is applied in hardware.

This series starts by measuring key metrics for both issues.
Those patches are introduced first, to allow comparing the current
status with each improvement introduced.
The number of offloads enqueued and inserted, as well as the latency
from queue insertion to hardware insertion, is measured. A new
command 'ovs-appctl dpctl/offload-stats-show' is introduced
to show the current measurements.
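The latency statistics lend themselves to standard moving averages.
As a rough sketch of the two kinds the series uses (illustrative
names only, not the actual API of the mov-avg module introduced
later in the series):

```c
#include <stdbool.h>

/* Cumulative moving average: the mean of all samples seen so far. */
struct cma {
    double mean;
    unsigned long long count;
};

static void
cma_update(struct cma *c, double x)
{
    c->count++;
    c->mean += (x - c->mean) / c->count;
}

/* Exponential moving average: recent samples weigh more.
 * 'alpha' in (0, 1] controls how fast old samples decay. */
struct ema {
    double mean;
    double alpha;
    bool initialized;
};

static void
ema_update(struct ema *e, double x)
{
    if (!e->initialized) {
        e->mean = x;
        e->initialized = true;
    } else {
        e->mean += e->alpha * (x - e->mean);
    }
}
```

The cumulative average summarizes the whole run, while the
exponential one tracks recent behavior, which is what matters when
watching offload latency evolve.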

In my current performance test setup I measure an
average latency hovering between 1 and 2 seconds.
After the optimizations, it is reduced to 500 to 900 ms.
Finally, when using multiple threads and with proper driver
support[1], it is measured on the order of 1 ms.

A few modules are introduced:

  * An ID pool with reduced capabilities, simplifying its
    operations and allowing better performance in both
    single- and multi-threaded setups.

  * A lockless queue between the PMDs / revalidators and the
    offload thread(s). As the number of PMDs increases,
    contention on a shared queue can become high.
    This queue is designed to serve as a message queue
    between threads.

  * A moving average module for cumulative and exponential
    moving averages.
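The message-passing pattern of the lockless queue can be sketched
with a Vyukov-style intrusive MPSC queue. This is a simplified
illustration, not the mpsc-queue API submitted in this series:

```c
#include <stdatomic.h>
#include <stddef.h>

/* One node per message, embedded in the message (intrusive). */
struct mpsc_node {
    _Atomic(struct mpsc_node *) next;
};

struct mpsc_queue {
    _Atomic(struct mpsc_node *) head;  /* Written by all producers. */
    struct mpsc_node *tail;            /* Read by the single consumer. */
    struct mpsc_node stub;             /* Dummy node, never a message. */
};

static void
mpsc_init(struct mpsc_queue *q)
{
    atomic_store(&q->stub.next, NULL);
    atomic_store(&q->head, &q->stub);
    q->tail = &q->stub;
}

/* Producers: a single atomic exchange, no locks taken. */
static void
mpsc_push(struct mpsc_queue *q, struct mpsc_node *node)
{
    struct mpsc_node *prev;

    atomic_store(&node->next, NULL);
    prev = atomic_exchange(&q->head, node);
    atomic_store(&prev->next, node);
}

/* Single consumer: returns the oldest node, or NULL if the queue is
 * empty (or if a producer is mid-push, in which case retry later). */
static struct mpsc_node *
mpsc_pop(struct mpsc_queue *q)
{
    struct mpsc_node *tail = q->tail;
    struct mpsc_node *next = atomic_load(&tail->next);

    if (tail == &q->stub) {
        if (next == NULL) {
            return NULL;                /* Queue is empty. */
        }
        q->tail = next;                 /* Skip over the stub. */
        tail = next;
        next = atomic_load(&tail->next);
    }
    if (next != NULL) {
        q->tail = next;
        return tail;
    }
    if (tail != atomic_load(&q->head)) {
        return NULL;                    /* Producer in progress. */
    }
    mpsc_push(q, &q->stub);             /* Re-insert stub behind tail. */
    next = atomic_load(&tail->next);
    if (next != NULL) {
        q->tail = next;
        return tail;
    }
    return NULL;
}
```

The key property is that producers never contend on more than one
atomic exchange, so PMD threads stay cheap regardless of their count.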

The netdev-offload-dpdk module is made thread-safe.
Internal maps are made per-netdev instead of global, and locks are
taken for shorter critical sections within the module.

CI result: https://github.com/grivet/ovs/actions/runs/554918929

[1]: The rte_flow API was made thread-safe in the 20.11 DPDK
     release. Drivers that do not implement those operations
     concurrently are protected by a lock. Others allow
     better concurrency, which improves the results
     of this series.

v2:

  * Improved the MPSC queue API to simplify usage.

  * Moved flush operation from initiator thread to offload
    thread(s). This ensures offload metadata are shared only
    among the offload thread pool.

  * The flush operation needs additional thread synchronization.
    The ovs_barrier currently triggers a use-after-free (UAF). A
    unit test is added to validate its operation, along with a fix
    for the UAF.

CI result: https://github.com/grivet/ovs/actions/runs/741430135
           The error comes from a failure to download 'automake' on
           osx, unrelated to any change in this series.

v3:

  * Re-ordered the commits so that fixes come first. No conflict is
    currently seen, but this ordering could prevent conflicts if
    requested changes to the series were to move code in the same
    areas.

  * Modified the offload thread quiescing to use ovsrcu_quiesce(),
    and based next_rcu on the current time value (after quiescing
    has happened, however long it takes).

  * Added Reviewed-by tags to the relevant commits.

CI result: https://github.com/grivet/ovs/actions/runs/782655601

v4:

  * Modified the seq-pool to use batches of IDs with a spinlock
    instead of lockless rings.

  * The llring structure is removed.

  * Due to the extent of the changes to the structure, some
    Acked-by or Reviewed-by tags were not ported to the id-fpool
    patch.

CI result: https://github.com/grivet/ovs/actions/runs/921095015

The new pool 'id-fpool' (for ID fast pool) performs better than
seq-pool and id-pool. It also scales better than the suggestions made
on the mailing list: as the number of threads grows, its operations
get faster.
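Roughly, the batching idea behind id-fpool can be sketched as
follows: a shared region touched under a lock only once per batch,
with per-thread caches absorbing the rest. This is a simplified,
hypothetical single-lock version (the submitted module uses a
spinlock and differs in layout and detail):

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define BATCH_SZ 64
#define MAX_IDS 4096

/* Shared pool: a bump allocator plus a LIFO of returned IDs,
 * both protected by a single lock. */
struct id_pool_shared {
    pthread_mutex_t lock;
    uint32_t next_id;             /* Never-yet-allocated IDs start here. */
    uint32_t free_ids[MAX_IDS];   /* Returned IDs, LIFO. */
    size_t n_free;
};

/* Per-thread cache: IDs move in and out without locking; the shared
 * lock is only taken once every BATCH_SZ operations. */
struct id_cache {
    uint32_t ids[BATCH_SZ];
    size_t n;
};

static uint32_t
id_alloc(struct id_pool_shared *pool, struct id_cache *cache)
{
    if (cache->n == 0) {
        /* Cache empty: refill a whole batch under the lock. */
        pthread_mutex_lock(&pool->lock);
        while (cache->n < BATCH_SZ) {
            uint32_t id = pool->n_free
                          ? pool->free_ids[--pool->n_free]
                          : pool->next_id++;
            cache->ids[cache->n++] = id;
        }
        pthread_mutex_unlock(&pool->lock);
    }
    return cache->ids[--cache->n];
}

static void
id_free(struct id_pool_shared *pool, struct id_cache *cache, uint32_t id)
{
    if (cache->n == BATCH_SZ) {
        /* Cache full: spill the whole batch back to the shared pool. */
        pthread_mutex_lock(&pool->lock);
        while (cache->n > 0) {
            pool->free_ids[pool->n_free++] = cache->ids[--cache->n];
        }
        pthread_mutex_unlock(&pool->lock);
    }
    cache->ids[cache->n++] = id;
}
```

Amortizing the lock over BATCH_SZ operations is what makes the
per-operation cost shrink as more threads are added, consistent with
the 'rnd' numbers below.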

Executing 20 runs to compare id-fpool, seq-pool and the id-queue (name of
the unordered pool using a contiguous 2n-reallocated array), I get the
following numbers:

(The seq-pool and id-queue benchmarks were left only for the comparison,
they are not part of the id-fpool patch submitted with this series.)

$ pool-stats.sh 1000000 1
20 times './tests/ovstest test-id-fpool perf  1000000 1':
id-fpool new: avg   14.4 | stdev    1.3 | max    20 | min    14
id-fpool del: avg   12.9 | stdev    0.5 | max    14 | min    12
id-fpool mix: avg   34.5 | stdev    0.6 | max    36 | min    34
id-fpool rnd: avg   55.2 | stdev    0.7 | max    56 | min    54
seq-pool new: avg   22.3 | stdev    0.5 | max    23 | min    22
seq-pool del: avg   53.7 | stdev    0.7 | max    55 | min    53
seq-pool mix: avg   57.8 | stdev    0.8 | max    60 | min    57
seq-pool rnd: avg   70.3 | stdev    1.2 | max    74 | min    69
id-queue new: avg   12.4 | stdev    0.5 | max    13 | min    12
id-queue del: avg   10.1 | stdev    0.4 | max    11 | min     9
id-queue mix: avg   29.9 | stdev    0.3 | max    30 | min    29
id-queue rnd: avg   48.5 | stdev    0.5 | max    49 | min    48

$ pool-stats.sh 1000000 2
20 times './tests/ovstest test-id-fpool perf  1000000 2':
id-fpool new: avg   17.8 | stdev    3.3 | max    28 | min    13
id-fpool del: avg   14.9 | stdev    2.4 | max    20 | min    11
id-fpool mix: avg   43.0 | stdev    5.7 | max    56 | min    34
id-fpool rnd: avg   45.9 | stdev    5.6 | max    62 | min    36
seq-pool new: avg   39.6 | stdev    6.0 | max    49 | min    34
seq-pool del: avg   37.0 | stdev    5.3 | max    47 | min    33
seq-pool mix: avg  101.4 | stdev   16.3 | max   130 | min    89
seq-pool rnd: avg   81.6 | stdev   12.6 | max   105 | min    71
id-queue new: avg   20.9 | stdev    4.1 | max    32 | min    15
id-queue del: avg   17.2 | stdev    4.5 | max    28 | min    10
id-queue mix: avg   56.5 | stdev   10.9 | max    86 | min    38
id-queue rnd: avg   97.2 | stdev   15.7 | max   130 | min    64

$ pool-stats.sh 1000000 4
20 times './tests/ovstest test-id-fpool perf  1000000 4':
id-fpool new: avg   10.4 | stdev    2.8 | max    22 | min     7
id-fpool del: avg    8.5 | stdev    0.7 | max     9 | min     6
id-fpool mix: avg   19.6 | stdev    1.8 | max    22 | min    15
id-fpool rnd: avg   25.6 | stdev    2.4 | max    28 | min    20
seq-pool new: avg   47.7 | stdev    5.2 | max    52 | min    34
seq-pool del: avg   35.8 | stdev    3.3 | max    39 | min    28
seq-pool mix: avg  118.1 | stdev   14.8 | max   130 | min    81
seq-pool rnd: avg   89.3 | stdev    9.7 | max   101 | min    65
id-queue new: avg   83.2 | stdev   17.5 | max   126 | min    65
id-queue del: avg   81.8 | stdev   20.8 | max   128 | min    57
id-queue mix: avg  276.2 | stdev   57.1 | max   369 | min   171
id-queue rnd: avg  347.9 | stdev   44.1 | max   410 | min   236

Isolating the 'rnd' test:

1 thread:
---------
id-fpool rnd: avg   55.2 | stdev    0.7 | max    56 | min    54
seq-pool rnd: avg   70.3 | stdev    1.2 | max    74 | min    69
id-queue rnd: avg   48.5 | stdev    0.5 | max    49 | min    48

2 threads:
----------
id-fpool rnd: avg   45.9 | stdev    5.6 | max    62 | min    36
seq-pool rnd: avg   81.6 | stdev   12.6 | max   105 | min    71
id-queue rnd: avg   97.2 | stdev   15.7 | max   130 | min    64

4 threads:
----------
id-fpool rnd: avg   25.6 | stdev    2.4 | max    28 | min    20
seq-pool rnd: avg   89.3 | stdev    9.7 | max   101 | min    65
id-queue rnd: avg  347.9 | stdev   44.1 | max   410 | min   236


Gaetan Rivet (27):
  ovs-thread: Fix barrier use-after-free
  dpif-netdev: Rename flow offload thread
  tests: Add ovs-barrier unit test
  netdev: Add flow API uninit function
  netdev-offload-dpdk: Use per-netdev offload metadata
  netdev-offload-dpdk: Implement hw-offload statistics read
  dpctl: Add function to read hardware offload statistics
  dpif-netdev: Rename offload thread structure
  mov-avg: Add a moving average helper structure
  dpif-netdev: Implement hardware offloads stats query
  ovs-atomic: Expose atomic exchange operation
  mpsc-queue: Module for lock-free message passing
  id-fpool: Module for fast ID generation
  netdev-offload: Add multi-thread API
  dpif-netdev: Quiesce offload thread periodically
  dpif-netdev: Postpone flow offload item freeing
  dpif-netdev: Use id-fpool for mark allocation
  dpif-netdev: Introduce tagged union of offload requests
  dpif-netdev: Execute flush from offload thread
  netdev-offload-dpdk: Use per-thread HW offload stats
  netdev-offload-dpdk: Lock rte_flow map access
  netdev-offload-dpdk: Protect concurrent offload destroy/query
  dpif-netdev: Use lockless queue to manage offloads
  dpif-netdev: Make megaflow and mark mappings thread objects
  dpif-netdev: Replace port mutex by rwlock
  dpif-netdev: Use one or more offload threads
  netdev-dpdk: Remove rte-flow API access locks

 lib/automake.mk               |   5 +
 lib/dpctl.c                   |  36 ++
 lib/dpif-netdev.c             | 741 ++++++++++++++++++++++++--------
 lib/dpif-netlink.c            |   1 +
 lib/dpif-provider.h           |   7 +
 lib/dpif.c                    |   8 +
 lib/dpif.h                    |   9 +
 lib/id-fpool.c                | 279 ++++++++++++
 lib/id-fpool.h                |  66 +++
 lib/mov-avg.h                 | 171 ++++++++
 lib/mpsc-queue.c              | 251 +++++++++++
 lib/mpsc-queue.h              | 190 +++++++++
 lib/netdev-dpdk.c             |   6 -
 lib/netdev-offload-dpdk.c     | 277 ++++++++++--
 lib/netdev-offload-provider.h |   4 +
 lib/netdev-offload.c          |  92 +++-
 lib/netdev-offload.h          |  21 +
 lib/ovs-atomic-c++.h          |   3 +
 lib/ovs-atomic-clang.h        |   5 +
 lib/ovs-atomic-gcc4+.h        |   5 +
 lib/ovs-atomic-gcc4.7+.h      |   5 +
 lib/ovs-atomic-i586.h         |   5 +
 lib/ovs-atomic-locked.h       |   9 +
 lib/ovs-atomic-msvc.h         |  22 +
 lib/ovs-atomic-pthreads.h     |   5 +
 lib/ovs-atomic-x86_64.h       |   5 +
 lib/ovs-atomic.h              |   8 +-
 lib/ovs-thread.c              |  61 ++-
 lib/ovs-thread.h              |   6 +-
 tests/automake.mk             |   3 +
 tests/library.at              |  14 +
 tests/test-barrier.c          | 264 ++++++++++++
 tests/test-id-fpool.c         | 615 +++++++++++++++++++++++++++
 tests/test-mpsc-queue.c       | 772 ++++++++++++++++++++++++++++++++++
 vswitchd/vswitch.xml          |  16 +
 35 files changed, 3754 insertions(+), 233 deletions(-)
 create mode 100644 lib/id-fpool.c
 create mode 100644 lib/id-fpool.h
 create mode 100644 lib/mov-avg.h
 create mode 100644 lib/mpsc-queue.c
 create mode 100644 lib/mpsc-queue.h
 create mode 100644 tests/test-barrier.c
 create mode 100644 tests/test-id-fpool.c
 create mode 100644 tests/test-mpsc-queue.c

--
2.31.1


