[ovs-dev] [PATCH v3 0/8] conntrack: improve multithread scalability

Gaetan Rivet grive at u256.net
Tue Jun 15 23:22:45 UTC 2021

Conntrack runs within the datapath. Locks along this path are critical, and
their critical sections should be minimal. The global 'ct_lock' must be taken
before any action on connection state. This lock is needed for many conntrack
operations, which slows down the datapath.

The cleanup thread 'ct_clean' also takes this lock to do its job. As it can
hold it for a long time, the thread is limited in the number of connections
cleaned per round, and its calls are rate-limited.

* Timeout policy locking is contrived to avoid deadlock.
  Any time a connection state is updated, the connection is unlocked during
  the update, 'ct_lock' is taken, then the connection is locked again. The
  reverse is then done to unlock.

* Scalability is poor. The global 'ct_lock' needs to be taken before applying
  any change to a conn object. This is backward: local changes to smaller
  objects should be independent, and the global lock should only be taken once
  the rest of the work is done, to keep its critical section as small as
  possible.

This can be improved. Using RCU-friendly structures for connections, zone
limits and timeout policies improves read-first workloads and allows the
precedence of the global 'ct_lock' and the local 'conn->lock' to be inverted.
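
As an illustration of the inverted precedence, here is a minimal sketch using
plain pthread mutexes. The types, the 'n_established' counter and the function
name are hypothetical stand-ins, not the structures or code from this series:
the local lock is taken first, and the global lock is nested inside it only
for the short step that touches shared state.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical, simplified types: these are NOT the structures from
 * lib/conntrack-private.h. */
struct tracker {
    pthread_mutex_t ct_lock;   /* Global lock: protects 'n_established'. */
    int n_established;         /* Stand-in for shared conntrack state. */
};

struct conn {
    pthread_mutex_t lock;      /* Local lock: protects 'established'. */
    bool established;
};

/* Lock order: conn->lock first, then ct_lock nested inside it, and only
 * when shared state actually has to change. */
static void
conn_set_established(struct tracker *ct, struct conn *c)
{
    pthread_mutex_lock(&c->lock);
    if (!c->established) {
        c->established = true;           /* Local work, no global lock. */

        pthread_mutex_lock(&ct->ct_lock);
        ct->n_established++;             /* Short global critical section. */
        pthread_mutex_unlock(&ct->ct_lock);
    }
    pthread_mutex_unlock(&c->lock);
}
```

With this ordering, a repeated update of an already established connection
never touches the global lock at all.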

Running the conntrack benchmark we see these changes:
  ./tests/ovstest test-conntrack benchmark <N> 3000000 32

code \ N      1     2     4     8
  Before   2310  2766  6117 19838  (ms)
   After   2072  2084  2653  4541  (ms)

One thread in the benchmark executes the task of a PMD, while the 'ct_clean'
thread runs in the background as well.

GitHub Actions: https://github.com/grivet/ovs/actions/runs/574446345


An mpsc-queue is used instead of rculist to manage connection expiration lists.
PMDs and ct_clean all act as producers, while ct_clean is the sole consumer thread.
A PMD now needs to take the 'ct_lock' only when creating a new connection, and only
while inserting it in the conn CMAP. For any update, only the conn lock is now
required to properly change its state.
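
To make the producer/consumer split concrete, here is a minimal sketch of one
well-known intrusive multi-producer, single-consumer queue design built on an
atomic exchange, written with C11 atomics. All names are hypothetical; this is
not the lib/mpsc-queue.h API, and the implementation in this series may differ.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical names throughout: not the lib/mpsc-queue.h API. */
struct node {
    _Atomic(struct node *) next;
    int value;
};

struct queue {
    _Atomic(struct node *) head;  /* Most recent node; producers swap it. */
    struct node *tail;            /* Oldest node; consumer only. */
    struct node stub;             /* Dummy node: the list is never empty. */
};

static void
queue_init(struct queue *q)
{
    atomic_init(&q->stub.next, NULL);
    atomic_init(&q->head, &q->stub);
    q->tail = &q->stub;
}

/* May be called from any thread. A single atomic exchange claims the
 * insertion point, then the previous node is linked to the new one. */
static void
queue_push(struct queue *q, struct node *n)
{
    struct node *prev;

    atomic_store_explicit(&n->next, NULL, memory_order_relaxed);
    prev = atomic_exchange_explicit(&q->head, n, memory_order_acq_rel);
    atomic_store_explicit(&prev->next, n, memory_order_release);
}

/* Consumer thread only. Returns NULL if the queue is empty or if a
 * producer has swapped 'head' but not yet linked its node. */
static struct node *
queue_pop(struct queue *q)
{
    struct node *tail = q->tail;
    struct node *next = atomic_load_explicit(&tail->next,
                                             memory_order_acquire);

    if (tail == &q->stub) {
        if (!next) {
            return NULL;                 /* Only the stub: empty. */
        }
        q->tail = tail = next;           /* Skip over the stub. */
        next = atomic_load_explicit(&tail->next, memory_order_acquire);
    }
    if (next) {
        q->tail = next;
        return tail;
    }
    if (tail != atomic_load_explicit(&q->head, memory_order_acquire)) {
        return NULL;                     /* A producer is mid-insert. */
    }
    /* 'tail' is the last node: re-insert the stub behind it so that it
     * can be detached without leaving the list empty. */
    queue_push(q, &q->stub);
    next = atomic_load_explicit(&tail->next, memory_order_acquire);
    if (next) {
        q->tail = next;
        return tail;
    }
    return NULL;
}
```

In this sketch producers never take a lock and never wait on each other; only
the single consumer walks the list, which matches the roles of the PMDs and
ct_clean described above.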

The mpsc-queue implementation is identical to the one from the parallel offload series [1].

CI: https://github.com/grivet/ovs/actions/runs/772118640

[1]: https://patchwork.ozlabs.org/project/openvswitch/list/?series=238779


The last part of the series, which modified the rate limit of conntrack_clean,
is dropped. It is not necessary to improve scalability and can be done later if
needed.

CI: https://github.com/grivet/ovs/actions/runs/940610003

On my local development laptop, the benchmark gives different numbers from v1:
  ./tests/ovstest test-conntrack benchmark <N> 3000000 32

code \ N      1     2     4     8
  Before    598  1656 12612 39301  (ms)
   After    293   337   427   893  (ms)

I reproduced these numbers on a 24-core machine as well.
The benchmark is not very accurate, as no core pinning or isolation is done.

Gaetan Rivet (8):
  conntrack: Init hash basis first at creation
  ovs-atomic: Expose atomic exchange operation
  mpsc-queue: Module for lock-free message passing
  conntrack: Use mpsc-queue to store conn expirations
  conntrack: Use a cmap to store zone limits
  conntrack-tp: Use a cmap to store timeout policies
  conntrack: Inverse conn and ct lock precedence
  conntrack: Use an atomic conn expiration value

 lib/automake.mk           |   2 +
 lib/conntrack-private.h   |  97 +++--
 lib/conntrack-tp.c        | 100 ++---
 lib/conntrack.c           | 278 ++++++++++----
 lib/conntrack.h           |   4 +-
 lib/dpif-netdev.c         |   5 +-
 lib/mpsc-queue.c          | 251 +++++++++++++
 lib/mpsc-queue.h          | 190 ++++++++++
 lib/ovs-atomic-c++.h      |   3 +
 lib/ovs-atomic-clang.h    |   5 +
 lib/ovs-atomic-gcc4+.h    |   5 +
 lib/ovs-atomic-gcc4.7+.h  |   5 +
 lib/ovs-atomic-i586.h     |   5 +
 lib/ovs-atomic-locked.h   |   9 +
 lib/ovs-atomic-msvc.h     |  22 ++
 lib/ovs-atomic-pthreads.h |   5 +
 lib/ovs-atomic-x86_64.h   |   5 +
 lib/ovs-atomic.h          |   8 +-
 tests/automake.mk         |   1 +
 tests/library.at          |   5 +
 tests/test-mpsc-queue.c   | 772 ++++++++++++++++++++++++++++++++++++++
 21 files changed, 1608 insertions(+), 169 deletions(-)
 create mode 100644 lib/mpsc-queue.c
 create mode 100644 lib/mpsc-queue.h
 create mode 100644 tests/test-mpsc-queue.c

