[ovs-dev] [PATCH v2 0/4] dpif-netdev: Combine CD and DFC patch for datapath refactor

yipeng wang yipeng1.wang at intel.com
Mon Apr 2 17:17:31 UTC 2018

This patch set is v2 of the implementation that combines the CD and DFC designs.
Both patch sets refactor the datapath to avoid the costly sequential subtable
search. Rebased on 4299145c10953b5ba125ba2a95caa18e554f3f85.

CD and DFC patch sets:
CD: [PATCH v2 0/5] dpif-netdev: Cuckoo-Distributor implementation

DFC: [PATCH] dpif-netdev: Refactor datapath flow cache

1. The first commit is a rebase of Jan Scheurich's patch of
[PATCH] dpif-netdev: Refactor datapath flow cache
with a couple of bug fixes.

2. The second commit incorporates CD's way-associative design into DFC to
improve the hit rate.

3. The third commit changes the distributor to cache the index of a flow_table
entry, which improves memory efficiency.

4. The fourth commit splits DFC into EMC and SMC for better organization.
The lookup function is also rewritten to process packets in batches.
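
To make commits 2 and 3 concrete, here is a minimal Python sketch of a
way-associative signature cache that stores an index into a flow table instead
of a flow pointer. It is an illustration only, not the code in these patches:
the class name, the 4-way associativity, the 16-bit signature, the use of a
32-bit hash, and the eviction choice are all assumptions made for the example.

```python
from dataclasses import dataclass, field

WAYS = 4     # entries per bucket (assumed associativity)
EMPTY = -1   # marks an unused way


@dataclass
class SignatureCache:
    """Hypothetical way-associative cache: hash -> index into a flow table."""
    n_buckets: int                      # must be a power of two
    sigs: list = field(init=False)      # per-bucket signatures
    idxs: list = field(init=False)      # per-bucket flow-table indexes

    def __post_init__(self):
        assert self.n_buckets & (self.n_buckets - 1) == 0
        self.sigs = [[0] * WAYS for _ in range(self.n_buckets)]
        self.idxs = [[EMPTY] * WAYS for _ in range(self.n_buckets)]

    def _split(self, h):
        # Low bits pick the bucket, upper 16 bits of a 32-bit hash form
        # the signature (an assumed split for this sketch).
        return h & (self.n_buckets - 1), (h >> 16) & 0xFFFF

    def insert(self, h, flow_index):
        bucket, sig = self._split(h)
        sigs, idxs = self.sigs[bucket], self.idxs[bucket]
        # 1) Overwrite a way that already holds this signature.
        for way in range(WAYS):
            if idxs[way] != EMPTY and sigs[way] == sig:
                idxs[way] = flow_index
                return
        # 2) Otherwise take the first empty way.
        for way in range(WAYS):
            if idxs[way] == EMPTY:
                sigs[way], idxs[way] = sig, flow_index
                return
        # 3) Bucket full: evict a way chosen from other hash bits.
        way = (h >> 8) % WAYS
        sigs[way], idxs[way] = sig, flow_index

    def lookup(self, h):
        """Return a candidate flow-table index, or None on a miss."""
        bucket, sig = self._split(h)
        for s, i in zip(self.sigs[bucket], self.idxs[bucket]):
            if i != EMPTY and s == sig:
                return i
        return None
```

A hit only yields a candidate: two flows can share a signature, so the caller
still has to compare the packet against the flow found at the returned index.
Having several ways per bucket lets flows that map to the same bucket coexist,
which is where the hit-rate gain over a direct-mapped layout would come from.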

We did a phy-2-phy test to evaluate the performance improvement with this patch
set. The traffic pattern we use is based on Billy's original TREX script:

We augmented the script to generate flows whose bandwidths follow a power-law
distribution and whose accesses are skewed across subtables.

For example, n flows each have a bandwidth of w, n/4 flows each have a
bandwidth of 2w, n/9 flows each have a bandwidth of 3w, and so on (power-law
distribution, y = Cx^-2). For subtables, the second most accessed subtable
receives 1/2 the accesses of the most accessed subtable, the third most
accessed subtable receives 1/3, and so on (Zipf's law).
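
The two distributions described above can be sketched in a few lines of Python.
This is not the TREX script itself; the function names, the `max_level`
parameter, and the use of integer division for the flow counts are assumptions
made for illustration.

```python
def flow_bandwidth_classes(n, w, max_level):
    """Power law y = C * x^-2: n // k^2 flows each get bandwidth k * w.

    Returns a list of (flow_count, per_flow_bandwidth) pairs and skips
    levels whose flow count rounds down to zero.
    """
    classes = []
    for k in range(1, max_level + 1):
        count = n // (k * k)
        if count > 0:
            classes.append((count, k * w))
    return classes


def zipf_subtable_shares(n_subtables):
    """Zipf's law: the subtable of rank r gets a share proportional to 1/r."""
    weights = [1.0 / rank for rank in range(1, n_subtables + 1)]
    total = sum(weights)
    return [weight / total for weight in weights]


if __name__ == "__main__":
    print(flow_bandwidth_classes(900, 1, 3))
    print(zipf_subtable_shares(5))
```

With n = 900 and w = 1, the first three classes are 900 flows at bandwidth 1,
225 flows at bandwidth 2, and 100 flows at bandwidth 3.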

The CD/DFC size is 1 million entries. The speedup results are listed below:

#flow    #subtable    speedup
1000     1            1.015523746
1000     5            1.032199838
1000     10           1.050814738
1000     20           1.081794454
10000    1            1.201704118
10000    5            1.31634144
10000    10           1.402493331
10000    20           1.531133279
100000   1            1.11088487
100000   5            1.458748559
100000   10           1.683044348
100000   20           2.034441401
1000000  1            1.004339563
1000000  5            1.256745291
1000000  10           1.444329892
1000000  20           1.666275853

Both flow traffic and subtable accesses are skewed. The table shows the total
speedup. The largest improvement occurs when flows hit the DFC/CD and thus
bypass the megaflow cache entirely, and when there are multiple subtables.
When all flows hit the EMC, or when the flow count is larger than the CD/DFC
size, the improvement shrinks.
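
A rough back-of-the-envelope model suggests why the gain grows with the number
of subtables. The sketch below assumes that a megaflow lookup probes subtables
in descending order of access frequency, stops at the first match, and that
accesses follow Zipf's law as in the test traffic. These are simplifying
assumptions for illustration, not measurements from this patch set.

```python
def expected_subtable_probes(n_subtables):
    """Average number of subtables probed per sequential lookup.

    The subtable of rank r is hit with probability proportional to 1/r and
    costs r probes to reach, so the mean is n / H_n, where H_n is the
    n-th harmonic number.
    """
    weights = [1.0 / rank for rank in range(1, n_subtables + 1)]
    total = sum(weights)
    return sum(rank * w for rank, w in enumerate(weights, start=1)) / total


if __name__ == "__main__":
    for n in (1, 5, 10, 20):
        print(n, round(expected_subtable_probes(n), 2))
```

Under this model a lookup that hits the DFC/CD replaces an average of
n / H_n subtable probes with a single cache probe plus verification, so the
saving is nil with one subtable and increases as subtables are added.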

Changes in v2:
1. Add comments and follow the coding style in the cmap code (Ben's comment).
2. Fix a bug in the first commit that caused multiple unit tests to fail. Since
   DFC is per PMD, not per port, the port mask should be included in the rule.
3. Add commit 4. This commit separates DFC into EMC and SMC (signature match
   cache) for easier optimization and readability.
4. In commit 4, the DFC lookup is refactored to look up packets in batches.
5. Rebase and other minor changes.

Changes in the previous revision:
1. Rebase to master head.
2. The last commit is completely rewritten to use the flow_table as an indirect
   table. The CD/DFC distributor caches the index of the flow_table entry.
3. Incorporate commit 2 into commit 1. (Bhanu's comment)
4. Change DFC to be always on in commit 1. (Bhanu's comment)

RFC of this patch set:

Yipeng Wang (3):
  dpif-netdev: Use way-associative cache
  dpif-netdev: use flow_table as indirect table
  dpif-netdev: Split DFC cache and code optimization

Jan Scheurich (1):
  dpif-netdev: Refactor datapath flow cache

 lib/cmap.c             |  73 +++++++++
 lib/cmap.h             |   5 +
 lib/dpif-netdev-perf.h |   1 +
 lib/dpif-netdev.c      | 426 ++++++++++++++++++++++++++++++++++---------------
 4 files changed, 375 insertions(+), 130 deletions(-)

