[ovs-dev] [PATCH v11 0/3] dpif-netdev: Detailed PMD performance metrics and supervision

Jan Scheurich jan.scheurich at ericsson.com
Thu Apr 12 15:32:10 UTC 2018


The run-time performance of PMDs is often difficult to understand and
troubleshoot. The existing PMD statistics counters only provide a
coarse-grained average picture. At packet rates of several Mpps, sporadic drops of
packet bursts happen at sub-millisecond time scales and are impossible to
capture and analyze with existing tools.

This patch set collects a large number of important PMD performance metrics
per PMD iteration, maintaining histograms and circular histories for
iteration metrics and millisecond averages. To capture sporadic drop
events, the patch set can be configured to monitor iterations for suspicious
metrics and to log the neighborhood of such iterations for off-line analysis.
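
As a rough illustration of the bookkeeping described above, the following
minimal sketch keeps a histogram and a circular history of per-iteration
samples and flags an iteration whose cycle count exceeds a threshold. All
names, sizes and the binning scheme here are hypothetical and chosen for
illustration only; they are not taken from the patches themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HIST_BINS   8   /* Hypothetical number of histogram bins. */
#define HISTORY_LEN 4   /* Hypothetical depth of the circular history. */

/* One sample per PMD iteration, keyed by the iteration counter. */
struct iter_sample {
    uint64_t iter_no;
    uint64_t cycles;
    uint32_t packets;
};

struct pmd_metrics {
    uint64_t cycle_hist[HIST_BINS];          /* Logarithmic bins. */
    struct iter_sample history[HISTORY_LEN]; /* Overwritten cyclically. */
    uint64_t next_iter;                      /* Next iteration number. */
};

/* Maps a value to bin floor(log2(value)), clamped to the last bin. */
static unsigned
bin_index(uint64_t value)
{
    unsigned bin = 0;

    while (value > 1 && bin < HIST_BINS - 1) {
        value >>= 1;
        bin++;
    }
    return bin;
}

/* Records one iteration in both the histogram and the history. */
static void
metrics_record(struct pmd_metrics *m, uint64_t cycles, uint32_t packets)
{
    struct iter_sample *s;

    assert(m != NULL);
    s = &m->history[m->next_iter % HISTORY_LEN];
    s->iter_no = m->next_iter;
    s->cycles = cycles;
    s->packets = packets;
    m->cycle_hist[bin_index(cycles)]++;
    m->next_iter++;
}

/* A sample is "suspicious" here if it used more cycles than allowed. */
static int
iteration_is_suspicious(const struct iter_sample *s, uint64_t max_cycles)
{
    return s->cycles > max_cycles;
}
```

Because the history is indexed modulo its length, the oldest sample is
overwritten first, so the neighborhood of a suspicious iteration remains
available for logging as long as it is dumped within HISTORY_LEN iterations.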

The extra cost for the performance metric collection and the supervision has
been measured to be in the order of 1% compared to the base commit in a PVP
setup with L3 pipeline over VXLAN tunnels. For that reason the metrics
collection is disabled by default and can be enabled at run-time through
configuration.

v10 -> v11:
* Rebased to master (commit 00a0a011d)
* Implemented comments on v10 by Ilya, Aaron and Ian.
* Replaced broken macro ATOMIC_LLONG_LOCK_FREE with working
  macro ATOMIC_ALWAYS_LOCK_FREE_8B.
* Changed iteration key in iteration history from TSC timestamp to
  iteration counter.
* Bugfix: Suspicious iteration logged was one off the actual suspicious
  iteration.

v9 -> v10:
* Implemented missed comment by Ilya on v8: use ATOMIC_LLONG_LOCK_FREE
* Fixed travis and checkpatch errors reported by Ian on v9.

v8 -> v9:
* Rebased to master (commit cb8cbbbe9)
* Implemented minor comments on v8 by Billy

v7 -> v8:
* Rebased on to master (commit 4e99b70df)
* Implemented comments from Ilya Maximets and Billy O'Mahony.
* Replaced netdev_rxq_length() introduced in v7 by optional out
  parameter for the remaining rx queue len in netdev_rxq_recv().
* Fixed thread synchronization issues in clearing PMD stats:
  - Use mutex to control whether to clear from main thread directly
    or in PMD at start of next iteration.
  - Use mutex to prevent concurrent clearing and printing of metrics.
* Added tx packet and batch stats to pmd-perf-show output.
* Delay warning for suspicious iteration to the iteration in which
  we also log the neighborhood to not pollute the logged iteration
  stats with logging costs.
* Corrected the exact number of iterations logged before and after a
  suspicious iteration.
* Introduced options -e and -ne in pmd-perf-log-set to control whether
  to *extend* the range of logged iterations when additional suspicious
  iterations are detected before the scheduled end of logging interval
  is reached.
* Exclude logging cycles from the iteration stats to avoid confusing
  ghost peaks.
* Performance impact compared to master less than 1% even with
  supervision enabled.
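
To make the effect of the extend option concrete, here is a minimal sketch of
how a logging interval around a suspicious iteration could be scheduled and
optionally extended. The structure, the function names and the "before" and
"after" parameters are hypothetical and only illustrate the behavior described
in the list above; the actual code in the series may be organized differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the iteration range scheduled for logging. */
struct log_window {
    uint64_t begin;   /* First iteration to log (inclusive). */
    uint64_t end;     /* Last iteration to log (inclusive). */
    bool armed;       /* True while a logging interval is scheduled. */
};

/* Called when iteration 'iter' is found suspicious.  A new interval covers
 * 'before' iterations ahead of it and 'after' iterations behind it.  If an
 * interval is already scheduled, its end is pushed out only when 'extend'
 * is set; otherwise the original end is kept. */
static void
log_window_on_suspicious(struct log_window *w, uint64_t iter,
                         uint64_t before, uint64_t after, bool extend)
{
    assert(w != NULL);
    if (!w->armed) {
        w->begin = iter > before ? iter - before : 0;
        w->end = iter + after;
        w->armed = true;
    } else if (extend && iter + after > w->end) {
        w->end = iter + after;
    }
}

/* True once the scheduled end is reached, i.e. when the warning and the
 * neighborhood would be written out together. */
static bool
log_window_due(const struct log_window *w, uint64_t iter)
{
    return w->armed && iter >= w->end;
}
```

With extension enabled, a second suspicious iteration at 103 inside a window
scheduled for 95..105 moves the end to 108; with extension disabled, the
window still ends at 105.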

v5 -> v7:
* Rebased on to dpdk_merge (commit e666668)
  - New base contains earlier refactoring parts of series.
* Implemented comments from Ilya Maximets and Billy O'Mahony.
* Replaced piggybacking qlen on dp_packet_batch with a new netdev API
  netdev_rxq_length().
* Thread-safe clearing of pmd counters in pmd_perf_start_iteration().
* Fixed bug in reporting datapath stats.
* Worked around a bug in DPDK rte_vhost_rx_queue_count(), which sometimes
  returns bogus values in the upper 16 bits of the uint32_t return value.
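
One plausible shape for such a work-around is to discard the upper 16 bits
before using the value. The sketch below assumes that the valid queue fill
level always fits in the lower 16 bits; the helper name and the additional
clamp to the ring size are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: keeps only the lower 16 bits of a raw queue count
 * and clamps the result to the ring size, so that garbage in the upper
 * half of the uint32_t cannot show up as an impossible fill level. */
static uint32_t
sanitize_queue_count(uint32_t raw_count, uint32_t ring_size)
{
    uint32_t count;

    assert(ring_size > 0 && ring_size <= UINT32_C(0x10000));
    count = raw_count & UINT32_C(0xFFFF);
    return count < ring_size ? count : ring_size;
}
```

In this sketch the caller would pass the value returned by the queue count
function together with the configured ring size and use the sanitized result
as the queue fill level.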

v4 -> v5:
* Rebased to master (commit e9de6c0)
* Implemented comments from Aaron Conole and Darrell Ball

v3 -> v4:
* Rebased to master (commit 4d0a31b)
  - Reverting changes to struct dp_netdev_pmd_thread.
* Make metrics collection configurable.
* Several bugfixes.

v2 -> v3:
* Rebased to OVS master (commit 3728b3b).
* Non-trivial adaptation to struct dp_netdev_pmd_thread.
  - refactored in commit a807c157 (Bhanu).
* No other changes compared to v2.

v1 -> v2:
* Rebased to OVS master (commit 7468ec788).
* No other changes compared to v1.

Jan Scheurich (3):
  netdev: Add optional qfill output parameter to rxq_recv()
  dpif-netdev: Detailed performance stats for PMDs
  dpif-netdev: Detection and logging of suspicious PMD iterations

 NEWS                        |   6 +
 lib/automake.mk             |   1 +
 lib/dpif-netdev-perf.c      | 685 +++++++++++++++++++++++++++++++++++++++++++-
 lib/dpif-netdev-perf.h      | 218 ++++++++++++--
 lib/dpif-netdev-unixctl.man | 216 ++++++++++++++
 lib/dpif-netdev.c           | 192 ++++++++++++-
 lib/netdev-bsd.c            |   8 +-
 lib/netdev-dpdk.c           |  41 ++-
 lib/netdev-dummy.c          |   8 +-
 lib/netdev-linux.c          |   7 +-
 lib/netdev-provider.h       |   7 +-
 lib/netdev.c                |   5 +-
 lib/netdev.h                |   3 +-
 manpages.mk                 |   2 +
 vswitchd/ovs-vswitchd.8.in  |  27 +-
 vswitchd/vswitch.xml        |  12 +
 16 files changed, 1362 insertions(+), 76 deletions(-)
 create mode 100644 lib/dpif-netdev-unixctl.man

-- 
1.9.1


