[ovs-dev] [PATCH v4 00/27] dpif-netdev: Parallel offload processing
Gaetan Rivet
grive at u256.net
Wed Jun 9 13:09:08 UTC 2021
This patch series aims to improve the performance of the management
of hw-offloads in dpif-netdev. Currently, some setups experience high
memory usage and poor latency between a flow offload decision and its
execution in hardware.
This series starts by measuring key metrics for both issues. Those
patches are introduced first, to allow comparing the current status
with each subsequent improvement. The number of offloads enqueued and
inserted, as well as the latency from queue insertion to hardware
insertion, are measured. A new command
'ovs-appctl dpctl/offload-stats-show' is introduced to show the
current measurements.
In my current performance test setup I measure an average latency
hovering between 1 and 2 seconds. After the optimizations, it is
reduced to 500~900 ms. Finally, when using multiple threads and with
proper driver support[1], it is measured in the order of 1 ms.
A few modules are introduced:
  * An ID pool with reduced capabilities, simplifying its
    operations and allowing better performance in both
    single- and multi-thread setups.
  * A lockless queue between the PMDs / revalidators and the
    offload thread(s). As the number of PMDs increases,
    contention on a shared queue can be high. This queue is
    designed to serve as a message queue between threads
    (the underlying technique is sketched after this list).
  * A moving average module for Cumulative and Exponential
    moving averages.
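The queue itself lives in lib/mpsc-queue.c; purely to illustrate the
technique, here is a minimal sketch of the classic intrusive MPSC
queue algorithm (Dmitry Vyukov's design) that this kind of module is
typically built on. All names are hypothetical and do not match the
mpsc-queue.h API:

    #include <stdatomic.h>
    #include <stddef.h>

    /* Intrusive node: embedded in each message (e.g. offload request). */
    struct node {
        _Atomic(struct node *) next;
    };

    struct mpsc {
        _Atomic(struct node *) head;  /* Producers push here. */
        struct node *tail;            /* Consumer pops from here. */
        struct node stub;             /* Sentinel for the empty queue. */
    };

    static void
    mpsc_init(struct mpsc *q)
    {
        atomic_init(&q->stub.next, NULL);
        atomic_init(&q->head, &q->stub);
        q->tail = &q->stub;
    }

    /* Producer side: lock-free, callable from any thread, e.g. a PMD
     * or a revalidator posting an offload request. */
    static void
    mpsc_push(struct mpsc *q, struct node *n)
    {
        struct node *prev;

        atomic_store_explicit(&n->next, NULL, memory_order_relaxed);
        prev = atomic_exchange_explicit(&q->head, n, memory_order_acq_rel);
        atomic_store_explicit(&prev->next, n, memory_order_release);
    }

    /* Consumer side: single thread only, e.g. the offload thread.
     * Returns NULL when empty or when a producer is mid-push. */
    static struct node *
    mpsc_pop(struct mpsc *q)
    {
        struct node *tail = q->tail;
        struct node *next = atomic_load_explicit(&tail->next,
                                                 memory_order_acquire);

        if (tail == &q->stub) {
            if (!next) {
                return NULL;            /* Empty. */
            }
            q->tail = next;             /* Skip the sentinel. */
            tail = next;
            next = atomic_load_explicit(&tail->next, memory_order_acquire);
        }
        if (next) {
            q->tail = next;
            return tail;
        }
        if (tail != atomic_load_explicit(&q->head, memory_order_acquire)) {
            return NULL;                /* Producer mid-push: retry later. */
        }
        mpsc_push(q, &q->stub);         /* Re-insert sentinel to pop last node. */
        next = atomic_load_explicit(&tail->next, memory_order_acquire);
        if (next) {
            q->tail = next;
            return tail;
        }
        return NULL;
    }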
The netdev-offload-dpdk module is made thread-safe. Internal maps are
made per-netdev instead of global, and locks are taken for shorter
critical sections within the module.
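As a rough illustration of that pattern (hypothetical names, not the
actual netdev-offload-dpdk code): each port owns a map from flow UFID
to hardware flow handle, guarded by its own mutex, and the lock is
held only around the map access, never around the hardware operation
itself.

    #include "hash.h"
    #include "openvswitch/hmap.h"
    #include "openvswitch/types.h"
    #include "ovs-thread.h"
    #include "util.h"

    /* Hypothetical per-netdev offload metadata: one map and one lock
     * per port, instead of a module-global map and lock. */
    struct per_netdev_offload {
        struct ovs_mutex map_lock;      /* Guards 'ufid_to_flow' only. */
        struct hmap ufid_to_flow;       /* Contains "struct flow_entry"s. */
    };

    struct flow_entry {
        struct hmap_node node;          /* In 'ufid_to_flow'. */
        ovs_u128 ufid;
        void *hw_flow;                  /* E.g. a struct rte_flow pointer. */
    };

    static struct flow_entry *
    flow_entry_find(struct per_netdev_offload *data, const ovs_u128 *ufid)
    {
        uint32_t hash = hash_bytes(ufid, sizeof *ufid, 0);
        struct flow_entry *entry, *found = NULL;

        ovs_mutex_lock(&data->map_lock);
        HMAP_FOR_EACH_WITH_HASH (entry, node, hash, &data->ufid_to_flow) {
            if (ovs_u128_equals(entry->ufid, *ufid)) {
                found = entry;
                break;
            }
        }
        ovs_mutex_unlock(&data->map_lock);
        /* Any hardware query or destroy on 'found->hw_flow' happens
         * outside the critical section. */
        return found;
    }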
CI result: https://github.com/grivet/ovs/actions/runs/554918929
[1]: The rte_flow API was made thread-safe in the DPDK 20.11
     release. Drivers that do not implement those operations
     concurrently are protected by a lock within DPDK. Others allow
     better concurrency, which improves the results of this series.
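For drivers without that support, the serialization is equivalent to
wrapping each call in a lock, as in this illustrative helper (not
actual OVS or DPDK code; the real lock lives inside the DPDK ethdev
layer, which is why the last patch of this series can remove the
now-redundant caller-side locks from netdev-dpdk):

    #include <rte_flow.h>
    #include <rte_spinlock.h>

    /* Illustrative caller-side serialization of rte_flow calls, of
     * the kind made redundant by DPDK >= 20.11 taking an equivalent
     * lock internally for non-thread-safe drivers. */
    static rte_spinlock_t flow_api_lock = RTE_SPINLOCK_INITIALIZER;

    static struct rte_flow *
    guarded_flow_create(uint16_t port_id,
                        const struct rte_flow_attr *attr,
                        const struct rte_flow_item pattern[],
                        const struct rte_flow_action actions[],
                        struct rte_flow_error *error)
    {
        struct rte_flow *flow;

        rte_spinlock_lock(&flow_api_lock);
        flow = rte_flow_create(port_id, attr, pattern, actions, error);
        rte_spinlock_unlock(&flow_api_lock);
        return flow;
    }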
v2:
  * Improved the MPSC queue API to simplify usage.
  * Moved the flush operation from the initiator thread to the
    offload thread(s). This ensures offload metadata are shared only
    among the offload thread pool.
  * The flush operation needs additional thread synchronization.
    The ovs_barrier currently triggers a use-after-free (UAF). Add a
    unit test to validate its operation, and a fix for the UAF.
CI result: https://github.com/grivet/ovs/actions/runs/741430135
The error comes from a failure to download 'automake' on OSX,
unrelated to any change in this series.
v3:
  * Re-ordered commits so fixes come first. No conflicts are seen
    currently, but this ordering might prevent them if requested
    changes to the series move code in the same areas.
  * Modified the reduced quiescing of the offload thread to use
    ovsrcu_quiesce(), and based next_rcu on the current time value
    (after quiescing happened, however long it took).
  * Added Reviewed-by tags to the relevant commits.
CI result: https://github.com/grivet/ovs/actions/runs/782655601
v4:
  * Modified the seq-pool to use batches of IDs with a spinlock
    instead of lockless rings.
  * The llring structure is removed.
  * Due to the length of the changes to the structure, some Acked-by
    or Reviewed-by tags were not ported to the id-fpool patch.
CI result: https://github.com/grivet/ovs/actions/runs/921095015
The new pool 'id-fpool' (for ID fast pool) performs better than
seq-pool and id-pool. It also scales better than the suggestions made
on the mailing list: as the number of threads grows, its operations
get faster (the batching technique it relies on is sketched below).
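As an illustration of the batching idea (hypothetical names, not the
actual id-fpool API): each thread holds a private cache of IDs and
touches the shared lock only once per batch, so contention drops as
more IDs are handled between refills.

    #include <pthread.h>
    #include <stddef.h>
    #include <stdint.h>

    #define ID_BATCH 64

    /* Hypothetical sketch of batched ID allocation: the shared lock
     * is taken once per ID_BATCH operations instead of once per ID. */
    struct shared_pool {
        pthread_mutex_t lock;
        uint32_t next_id;       /* Start of the never-allocated range. */
        uint32_t *free_ids;     /* Stack of returned IDs. */
        size_t n_free;
    };

    struct thread_cache {
        uint32_t ids[ID_BATCH];
        size_t n;               /* Number of cached free IDs. */
    };

    static uint32_t
    id_alloc(struct shared_pool *pool, struct thread_cache *cache)
    {
        if (cache->n == 0) {
            /* Refill a whole batch under a single lock acquisition. */
            pthread_mutex_lock(&pool->lock);
            while (cache->n < ID_BATCH && pool->n_free > 0) {
                cache->ids[cache->n++] = pool->free_ids[--pool->n_free];
            }
            while (cache->n < ID_BATCH) {
                /* A real pool would bound or recycle 'next_id'. */
                cache->ids[cache->n++] = pool->next_id++;
            }
            pthread_mutex_unlock(&pool->lock);
        }
        return cache->ids[--cache->n];
    }

    static void
    id_free(struct shared_pool *pool, struct thread_cache *cache,
            uint32_t id)
    {
        if (cache->n == ID_BATCH) {
            /* Spill the full cache back under a single lock
             * acquisition.  Assumes 'free_ids' is large enough; a
             * real pool would grow it as needed. */
            pthread_mutex_lock(&pool->lock);
            while (cache->n > 0) {
                pool->free_ids[pool->n_free++] = cache->ids[--cache->n];
            }
            pthread_mutex_unlock(&pool->lock);
        }
        cache->ids[cache->n++] = id;
    }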
Executing 20 runs to compare id-fpool, seq-pool and id-queue (the
name of the unordered pool using a contiguous 2n-reallocated array),
I get the following numbers:
(The seq-pool and id-queue benchmarks were kept only for comparison;
they are not part of the id-fpool patch submitted with this series.)
$ pool-stats.sh 1000000 1
20 times './tests/ovstest test-id-fpool perf 1000000 1':
id-fpool new: avg 14.4 | stdev 1.3 | max 20 | min 14
id-fpool del: avg 12.9 | stdev 0.5 | max 14 | min 12
id-fpool mix: avg 34.5 | stdev 0.6 | max 36 | min 34
id-fpool rnd: avg 55.2 | stdev 0.7 | max 56 | min 54
seq-pool new: avg 22.3 | stdev 0.5 | max 23 | min 22
seq-pool del: avg 53.7 | stdev 0.7 | max 55 | min 53
seq-pool mix: avg 57.8 | stdev 0.8 | max 60 | min 57
seq-pool rnd: avg 70.3 | stdev 1.2 | max 74 | min 69
id-queue new: avg 12.4 | stdev 0.5 | max 13 | min 12
id-queue del: avg 10.1 | stdev 0.4 | max 11 | min 9
id-queue mix: avg 29.9 | stdev 0.3 | max 30 | min 29
id-queue rnd: avg 48.5 | stdev 0.5 | max 49 | min 48
$ pool-stats.sh 1000000 2
20 times './tests/ovstest test-id-fpool perf 1000000 2':
id-fpool new: avg 17.8 | stdev 3.3 | max 28 | min 13
id-fpool del: avg 14.9 | stdev 2.4 | max 20 | min 11
id-fpool mix: avg 43.0 | stdev 5.7 | max 56 | min 34
id-fpool rnd: avg 45.9 | stdev 5.6 | max 62 | min 36
seq-pool new: avg 39.6 | stdev 6.0 | max 49 | min 34
seq-pool del: avg 37.0 | stdev 5.3 | max 47 | min 33
seq-pool mix: avg 101.4 | stdev 16.3 | max 130 | min 89
seq-pool rnd: avg 81.6 | stdev 12.6 | max 105 | min 71
id-queue new: avg 20.9 | stdev 4.1 | max 32 | min 15
id-queue del: avg 17.2 | stdev 4.5 | max 28 | min 10
id-queue mix: avg 56.5 | stdev 10.9 | max 86 | min 38
id-queue rnd: avg 97.2 | stdev 15.7 | max 130 | min 64
$ pool-stats.sh 1000000 4
20 times './tests/ovstest test-id-fpool perf 1000000 4':
id-fpool new: avg 10.4 | stdev 2.8 | max 22 | min 7
id-fpool del: avg 8.5 | stdev 0.7 | max 9 | min 6
id-fpool mix: avg 19.6 | stdev 1.8 | max 22 | min 15
id-fpool rnd: avg 25.6 | stdev 2.4 | max 28 | min 20
seq-pool new: avg 47.7 | stdev 5.2 | max 52 | min 34
seq-pool del: avg 35.8 | stdev 3.3 | max 39 | min 28
seq-pool mix: avg 118.1 | stdev 14.8 | max 130 | min 81
seq-pool rnd: avg 89.3 | stdev 9.7 | max 101 | min 65
id-queue new: avg 83.2 | stdev 17.5 | max 126 | min 65
id-queue del: avg 81.8 | stdev 20.8 | max 128 | min 57
id-queue mix: avg 276.2 | stdev 57.1 | max 369 | min 171
id-queue rnd: avg 347.9 | stdev 44.1 | max 410 | min 236
Isolating the 'rnd' test:
1 thread:
---------
id-fpool rnd: avg 55.2 | stdev 0.7 | max 56 | min 54
seq-pool rnd: avg 70.3 | stdev 1.2 | max 74 | min 69
id-queue rnd: avg 48.5 | stdev 0.5 | max 49 | min 48
2 threads:
----------
id-fpool rnd: avg 45.9 | stdev 5.6 | max 62 | min 36
seq-pool rnd: avg 81.6 | stdev 12.6 | max 105 | min 71
id-queue rnd: avg 97.2 | stdev 15.7 | max 130 | min 64
4 threads:
----------
id-fpool rnd: avg 25.6 | stdev 2.4 | max 28 | min 20
seq-pool rnd: avg 89.3 | stdev 9.7 | max 101 | min 65
id-queue rnd: avg 347.9 | stdev 44.1 | max 410 | min 236
Gaetan Rivet (27):
ovs-thread: Fix barrier use-after-free
dpif-netdev: Rename flow offload thread
tests: Add ovs-barrier unit test
netdev: Add flow API uninit function
netdev-offload-dpdk: Use per-netdev offload metadata
netdev-offload-dpdk: Implement hw-offload statistics read
dpctl: Add function to read hardware offload statistics
dpif-netdev: Rename offload thread structure
mov-avg: Add a moving average helper structure
dpif-netdev: Implement hardware offloads stats query
ovs-atomic: Expose atomic exchange operation
mpsc-queue: Module for lock-free message passing
id-fpool: Module for fast ID generation
netdev-offload: Add multi-thread API
dpif-netdev: Quiesce offload thread periodically
dpif-netdev: Postpone flow offload item freeing
dpif-netdev: Use id-fpool for mark allocation
dpif-netdev: Introduce tagged union of offload requests
dpif-netdev: Execute flush from offload thread
netdev-offload-dpdk: Use per-thread HW offload stats
netdev-offload-dpdk: Lock rte_flow map access
netdev-offload-dpdk: Protect concurrent offload destroy/query
dpif-netdev: Use lockless queue to manage offloads
dpif-netdev: Make megaflow and mark mappings thread objects
dpif-netdev: Replace port mutex by rwlock
dpif-netdev: Use one or more offload threads
netdev-dpdk: Remove rte-flow API access locks
 lib/automake.mk               |   5 +
 lib/dpctl.c                   |  36 ++
 lib/dpif-netdev.c             | 741 ++++++++++++++++++++++++--------
 lib/dpif-netlink.c            |   1 +
 lib/dpif-provider.h           |   7 +
 lib/dpif.c                    |   8 +
 lib/dpif.h                    |   9 +
 lib/id-fpool.c                | 279 ++++++++++++
 lib/id-fpool.h                |  66 +++
 lib/mov-avg.h                 | 171 ++++++++
 lib/mpsc-queue.c              | 251 +++++++++++
 lib/mpsc-queue.h              | 190 +++++++++
 lib/netdev-dpdk.c             |   6 -
 lib/netdev-offload-dpdk.c     | 277 ++++++++++--
 lib/netdev-offload-provider.h |   4 +
 lib/netdev-offload.c          |  92 +++-
 lib/netdev-offload.h          |  21 +
 lib/ovs-atomic-c++.h          |   3 +
 lib/ovs-atomic-clang.h        |   5 +
 lib/ovs-atomic-gcc4+.h        |   5 +
 lib/ovs-atomic-gcc4.7+.h      |   5 +
 lib/ovs-atomic-i586.h         |   5 +
 lib/ovs-atomic-locked.h       |   9 +
 lib/ovs-atomic-msvc.h         |  22 +
 lib/ovs-atomic-pthreads.h     |   5 +
 lib/ovs-atomic-x86_64.h       |   5 +
 lib/ovs-atomic.h              |   8 +-
 lib/ovs-thread.c              |  61 ++-
 lib/ovs-thread.h              |   6 +-
 tests/automake.mk             |   3 +
 tests/library.at              |  14 +
 tests/test-barrier.c          | 264 ++++++++++++
 tests/test-id-fpool.c         | 615 +++++++++++++++++++++++++++
 tests/test-mpsc-queue.c       | 772 ++++++++++++++++++++++++++++++++++
 vswitchd/vswitch.xml          |  16 +
 35 files changed, 3754 insertions(+), 233 deletions(-)
create mode 100644 lib/id-fpool.c
create mode 100644 lib/id-fpool.h
create mode 100644 lib/mov-avg.h
create mode 100644 lib/mpsc-queue.c
create mode 100644 lib/mpsc-queue.h
create mode 100644 tests/test-barrier.c
create mode 100644 tests/test-id-fpool.c
create mode 100644 tests/test-mpsc-queue.c
--
2.31.1