[ovs-dev] [PATCH 00/12] Improve performance of OVS-DPDK classifier

Bodireddy, Bhanuprakash bhanuprakash.bodireddy at intel.com
Mon Oct 10 16:00:01 UTC 2016

Thanks Daniele and Jarno for reviewing and testing the patch series. We are working on the suggested changes and will send out v2 soon.

Bhanu Prakash.

From: Daniele Di Proietto [mailto:diproiettod at ovn.org]
Sent: Friday, October 7, 2016 11:45 PM
To: Bodireddy, Bhanuprakash <bhanuprakash.bodireddy at intel.com>
Cc: dev at openvswitch.org; Jarno Rajahalme <jarno at ovn.org>
Subject: Re: [ovs-dev] [PATCH 00/12] Improve performance of OVS-DPDK classifier

Thanks for posting this.
I quickly tried this with some simple flow tables and it seems to be beneficial.
I agree with Jarno's comments and I posted a couple more to the single patches.
I see two Signed-off-by tags but a single author. Should you add a Co-authored-by tag, perhaps?
Other than these I am fine with the series. Could you maybe send another version with the suggested changes, please?

2016-10-07 9:17 GMT-07:00 Bhanuprakash Bodireddy <bhanuprakash.bodireddy at intel.com>:
This patch series is aimed at improving the performance of OVS-DPDK.

With a few thousand flows installed, the EMC becomes inefficient due
to thrashing and the bottleneck moves to the dpcls. With the EMC
disabled, profiling with VTune showed that the significant performance
degradation is due to LLC thrashing, memory latency, machine clears and
expensive hash computation.

This first patch set improves dpcls performance by 15% (~1 Mpps)
when the EMC is disabled and OVS-DPDK is built with CFLAGS="-O2 -g".

Bhanuprakash Bodireddy (12):
  dpcls: Use 32 packet batches for lookups.
        Comment: ~120k throughput improvement.

  flow: Add comments to mf_get_next_in_map()
        Comment: Add comments to function.

  flow: Skip invoking expensive count_1bits() with zero input.
        Comment: ~630k throughput improvement.

  hash: Skip invoking mhash_add__() with zero input.
        Comment: ~150k throughput improvement.

  dpif-netdev: Clear flow batches inside packet_batch_execute.
        Comment: ~50k throughput improvement in the multiple-batches test case.

  cmap: Remove prefetching in cmap_find_batch().
        Comment: ~39k throughput improvement.

  dpif-netdev: Cache align netdev_flow_keys.
        Comment: ~170k throughput improvement in the EMC-enabled case.

  dpif-netdev: Reorder elements in dp_netdev_port structure.
  dpif: Reorder elements in dpif_upcall structure.
  ovsdb: Reorder elements in ovsdb_table_schema structure.
  netlink-socket: Reorder elements in nl_dump structure.
  timeval: Reorder elements in clock structure.
        Comment: Reorder member variables of the structures to reduce padding
                 bytes and thereby the memory footprint.
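
The structure-reordering patches listed above rely on the compiler's alignment
rules: a small member placed before a more strictly aligned one forces padding
between them. The following is only a minimal sketch of that effect. Both
structures and all member names are hypothetical and are not taken from the
OVS sources; the exact sizes depend on the target ABI.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical layout: each small member is followed by a member with
 * stricter alignment, so the compiler inserts padding after it. */
struct example_padded {
    uint8_t  flag_a;
    uint64_t counter;
    uint8_t  flag_b;
    uint32_t id;
};

/* Same members, ordered from largest to smallest alignment.  Only tail
 * padding (if any) remains. */
struct example_reordered {
    uint64_t counter;
    uint32_t id;
    uint8_t  flag_a;
    uint8_t  flag_b;
};

/* Prints both sizes so the saving can be checked on the target ABI. */
void print_struct_sizes(void)
{
    printf("padded:    %zu bytes\n", sizeof(struct example_padded));
    printf("reordered: %zu bytes\n", sizeof(struct example_reordered));
}
```

Calling print_struct_sizes() on the build machine shows whether a given
reordering actually saves bytes there; a tool such as pahole reports the same
information for real structures.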

 lib/cmap.c           |   4 --
 lib/dpif-netdev.c    | 118 +++++++++++++++++++++------------------------------
 lib/dpif.h           |  17 ++++----
 lib/flow.h           |  29 +++++++++++--
 lib/hash.h           |   2 +-
 lib/netlink-socket.h |   6 +--
 lib/timeval.c        |   4 +-
 ovsdb/table.h        |   4 +-
 8 files changed, 91 insertions(+), 93 deletions(-)
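
Two of the patches above (the count_1bits() and mhash_add__() ones) skip an
expensive helper when its input is zero. As a rough illustration of that
pattern only, and not the code from the series, the sketch below guards a
fixed-cost bit count with a zero check. The function names are hypothetical
stand-ins; whether such a guard pays off depends on how often the input is
zero and how well the branch predicts.

```c
#include <stdint.h>

/* Hypothetical stand-in for an expensive population count: a branch-free
 * SWAR bit count that costs the same regardless of the input value. */
static unsigned int example_count_bits(uint64_t x)
{
    x = x - ((x >> 1) & UINT64_C(0x5555555555555555));
    x = (x & UINT64_C(0x3333333333333333))
        + ((x >> 2) & UINT64_C(0x3333333333333333));
    x = (x + (x >> 4)) & UINT64_C(0x0f0f0f0f0f0f0f0f);
    return (unsigned int) ((x * UINT64_C(0x0101010101010101)) >> 56);
}

/* Guarded variant: returns 0 immediately for a zero input and only pays
 * for the full computation when at least one bit is set. */
unsigned int example_count_bits_guarded(uint64_t x)
{
    return x ? example_count_bits(x) : 0;
}
```

The guarded function returns the same result as the unguarded one for every
input, so it can be swapped in without changing behavior.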

