[ovs-dev] [PATCH v7 0/3] dpif-netdev: Detailed PMD performance metrics and supervision

Ilya Maximets i.maximets at samsung.com
Wed Jan 17 07:52:53 UTC 2018


On 17.01.2018 10:49, Ilya Maximets wrote:
> Not a full review, just a note about man pages.
> Current version of netdev-dpif-unixctl.man doesn't scale and looks bad
> with different terminal widths. I refactored it to look more like a man
> page. See the incremental patch below.
> 
> What changed:
> * Dash lists re-formatted using "\(em" tags.
> * Numbered list re-formatted using ".IP N.".
> * Command option lists re-formatted using usual paragraphs with indentation.
> * Output example indented as recommended by man pages guidelines.
> * Removed extra spaces in "pmd-perf-log-set" command header.
> * 'pmd-stats-clear' moved closer to 'pmd-stats-show' for integrity.
> 
> One more thing:
> Can we rename "netdev-dpif-unixctl.man" to "dpif-netdev-unixctl.man" ?
> I don't understand why it named so.
> 
> -----------------------------------------------------------------------
> diff --git a/lib/netdev-dpif-unixctl.man b/lib/netdev-dpif-unixctl.man
> index 82e372f..b17ffdb 100644
> --- a/lib/netdev-dpif-unixctl.man
> +++ b/lib/netdev-dpif-unixctl.man
> @@ -26,6 +26,12 @@ transmitting said packets.
>  
>  To reset these counters use \fBdpif-netdev/pmd-stats-clear\fR.
>  .
> +.IP "\fBdpif-netdev/pmd-stats-clear\fR [\fIdp\fR]"
> +Resets to zero the per pmd thread performance numbers shown by the
> +\fBdpif-netdev/pmd-stats-show\fR and \fBdpif-netdev/pmd-perf-show\fR commands.
> +It will NOT reset datapath or bridge statistics, only the values shown by
> +the above commands.
> +.
>  .IP "\fBdpif-netdev/pmd-perf-show\fR [\fB-nh\fR] [\fB-it\fR \fIiter_len\fR] \
>  [\fB-ms\fR \fIms_len\fR] [\fB-pmd\fR \fIcore\fR] [\fIdp\fR]"
>  Shows detailed performance metrics for one or all pmds threads of the
> @@ -36,46 +42,88 @@ configuration parameter "other_config:pmd-perf-metrics". By default it
>  is disabled. The run-time overhead, when enabled, is in the order of 1%.
>  
>  The covered metrics per iteration are:
> -  - used cycles
> -  - forwared packets
> -  - number of rx batches
> -  - packets/rx batch
> -  - max. vhostuser queue fill level
> -  - number of upcalls
> -  - cycles spent in upcalls
> -
> +.RS
> +.IP
> +.PD .4v
> +.IP \(em
> +used cycles
> +.IP \(em
> +forwared packets
> +.IP \(em
> +number of rx batches
> +.IP \(em
> +packets/rx batch
> +.IP \(em
> +max. vhostuser queue fill level
> +.IP \(em
> +number of upcalls
> +.IP \(em
> +cycles spent in upcalls
> +.PD
> +.RE
> +.IP
>  This raw recorded data is used threefold:
> -
> -1. In histograms for each of the following metrics:
> -   - cycles/iteration (logarithmic)
> -   - packets/iteration (logarithmic)
> -   - cycles/packet
> -   - packets/batch
> -   - max. vhostuser qlen (logarithmic)
> -   - upcalls
> -   - cycles/upcall (logarithmic)
> -   The histograms bins are divided linear or logarithmic.
> -
> -2. A cyclic history of the above metrics for 1024 iterations
> -
> -3. A cyclic history of the cummulative/average values per millisecond
> -   wall clock for the last 1024 milliseconds:
> -   - number of iterations
> -   - avg. cycles/iteration
> -   - packets (Kpps)
> -   - avg. packets/batch
> -   - avg. max vhost qlen
> -   - upcalls
> -   - avg. cycles/upcall
> -
> -The command options are
> -
> -    \fB-nh\fR:            Suppress the histograms
> -    \fB-it\fR \fIiter_len\fR:   Display the last iter_len iteration stats
> -    \fB-ms\fR \fIms_len\fR:     Display the last ms_len millisecond stats
> -
> +.RS
> +.IP
> +.PD .4v
> +.IP 1.
> +In histograms for each of the following metrics:
> +.RS
> +.IP \(em
> +cycles/iteration (logarithmic)
> +.IP \(em
> +packets/iteration (logarithmic)
> +.IP \(em
> +cycles/packet
> +.IP \(em
> +packets/batch
> +.IP \(em
> +max. vhostuser qlen (logarithmic)
> +.IP \(em
> +upcalls
> +.IP \(em
> +cycles/upcall (logarithmic)
> +The histograms bins are divided linear or logarithmic.
> +.RE
> +.IP 2.
> +A cyclic history of the above metrics for 1024 iterations
> +.IP 3.
> +A cyclic history of the cummulative/average values per millisecond wall
> +clock for the last 1024 milliseconds:
> +.RS
> +.IP \(em
> +number of iterations
> +.IP \(em
> +avg. cycles/iteration
> +.IP \(em
> +packets (Kpps)
> +.IP \(em
> +avg. packets/batch
> +.IP \(em
> +avg. max vhost qlen
> +.IP \(em
> +upcalls
> +.IP \(em
> +avg. cycles/upcall
> +.RE
> +.PD
> +.RE
> +.IP
> +.
> +The command options are:
> +.RS
> +.IP "\fB-nh\fR"
> +Suppress the histograms
> +.IP "\fB-it\fR \fIiter_len\fR"
> +Display the last iter_len iteration stats
> +.IP "\fB-ms\fR \fIms_len\fR"
> +Display the last ms_len millisecond stats
> +.RE
> +
> +.IP
>  The output always contains the following global PMD statistics:
> -
> +.RS
> +.IP
>  Time: 15:24:55.270 .br
>  Measurement duration: 1.008 s
>  
> @@ -91,7 +139,8 @@ pmd thread numa_id 0 core_id 1:
>    - Megaflow hits:      3262943  (90.7 %, 1.00 subtbl lookups/hit)
>    - Upcalls:                  0  ( 0.0 %, 0.0 us/upcall)
>    - Lost upcalls:             0  ( 0.0 %)
> -
> +.RE
> +.IP
>  Here "Packets" actually reflects the number of packets forwarded by the
>  datapath. "Datapath passes" matches the number of packet lookups as
>  reported by the \fBdpif-netdev/pmd-stats-show\fR command.
> @@ -99,29 +148,25 @@ reported by the \fBdpif-netdev/pmd-stats-show\fR command.
>  To reset the counters and start a new measurement use
>  \fBdpif-netdev/pmd-stats-clear\fR.
>  .
> -.IP "\fBdpif-netdev/pmd-stats-clear\fR [\fIdp\fR]"
> -Resets to zero the per pmd thread performance numbers shown by the
> -\fBdpif-netdev/pmd-stats-show\fR and \fBdpif-netdev/pmd-perf-show\fR commands.
> -It will NOT reset datapath or bridge statistics, only the values shown by
> -the above commands.
> -.
> -.IP "\fBdpif-netdev/pmd-perf-log-set\fR \fBon\fR|\fBoff\fR \
> - [\fB-b\fR \fIbefore\fR] [\fB-a\fR \fIafter\fR] [\fB-us\fR \fIusec\fR] \
> - [\fB-q\fR \fIqlen\fR]"
> +.IP "\fBdpif-netdev/pmd-perf-log-set\fR \fBon\fR | \fBoff\fR \
> +[\fB-b\fR \fIbefore\fR] [\fB-a\fR \fIafter\fR] [\fB-us\fR \fIusec\fR] \
> +[\fB-q\fR \fIqlen\fR]"
>  .
>  The userspace "netdev" datapath is able to supervise the PMD performance
>  metrics and detect iterations with suspicious statistics according to the
>  following criteria:
> -
> -- The iteration lasts longer than \fIusec\fR microseconds (default 250).
> -  This can be used to capture events where a PMD is blocked or
> -  interrupted for such a period of time that there is a risk for
> -  dropped packets on any of its Rx queues.
> -
> -- The max vhost qlen exceeds a threshold \fIqlen\fR (default 128). This can
> -  be used to infer virtio queue overruns and dropped packets inside a VM,
> -  which are not visible in OVS otherwise.
> -
> +.RS
> +.IP \(em
> +The iteration lasts longer than \fIusec\fR microseconds (default 250).
> +This can be used to capture events where a PMD is blocked or interrupted for
> +such a period of time that there is a risk for dropped packets on any of its Rx
> +queues.
> +.IP \(em
> +The max vhost qlen exceeds a threshold \fIqlen\fR (default 128). This can be
> +used to infer virtio queue overruns and dropped packets inside a VM, which are
> +not visible in OVS otherwise.
> +.RE
> +.IP
>  Such suspicious iterations can be logged together with their iteration
>  statistics in the \fBovs-vswitchd.log\fR to be able to correlate them to
>  packet drop or other events outside OVS.
> @@ -132,16 +177,19 @@ detecting suspicious iterations. By default supervision and logging is
>  disabled.
>  
>  The command options are:
> -
> -    \fB-b\fR \fIbefore\fR:
> -        The number of iterations before the suspicious iteration to be logged (default 5).
> -    \fB-a\fR \fIafter\fR:
> -        The number of iterations after the suspicious iteration to be logged (default 5).
> -    \fB-q\fR \fIqlen\fR:
> -        Suspicious vhost queue fill level threshold. Increase this to 512 if
> -        the Qemu supports 1024 virtio queue length (default 128).
> -    \fB-us\fR \fIusec\fR:
> -        change the duration threshold for a suspicious iteration (default 250 us).
> +.RS
> +.IP "\fB-b\fR \fIbefore\fR"
> +The number of iterations before the suspicious iteration to be logged
> +(default 5).
> +.IP "\fB-a\fR \fIafter\fR"
> +The number of iterations after the suspicious iteration to be logged
> +(default 5).
> +.IP "\fB-q\fR \fIqlen\fR"
> +Suspicious vhost queue fill level threshold. Increase this to 512 if the Qemu
> +supports 1024 virtio queue length (default 128).
> +.IP "\fB-us\fR \fIusec\fR"
> +Change the duration threshold for a suspicious iteration (default 250 us).
> +.RE

Sorry, one more .IP should be here.

>  
>  If more than 100 iterations before or after a suspicious iteration have
>  been looged once, OVS falls back to the safe default values (5/5) to
> -----------------------------------------------------------------------
> 
> Best regards, Ilya Maximets.
> 
> 
> On 16.01.2018 04:51, Jan Scheurich wrote:
>> The run-time performance of PMDs is often difficult to understand and 
>> trouble-shoot. The existing PMD statistics counters only provide a coarse 
>> grained average picture. At packet rates of several Mpps sporadic drops of
>> packet bursts happen at sub-millisecond time scales and are impossible to
>> capture and analyze with existing tools.
>>
>> This patch collects a large number of important PMD performance metrics
>> per PMD iteration, maintaining histograms and circular histories for
>> iteration metrics and millisecond averages. To capture sporadic drop
>> events, the patch set can be configured to monitor iterations for suspicious
>> metrics and to log the neighborhood of such iterations for off-line analysis.
>>
>> The extra cost for the performance metric collection and the supervision has
>> been measured to be in the order of 1% compared to the base commit in a PVP
>> setup with L3 pipeline over VXLAN tunnels. For that reason the metrics
>> collection is disabled by default and can be enabled at run-time through
>> configuration.
>>
>> v5 -> v7:
>> * Rebased on to dpdk_merge (commit e666668)
>>   - New base contains earlier refactoring parts of series.
>> * Implemented comments from Ilya Maximets and Billy O'Mahony.
>> * Replaced piggybacking qlen on dp_packet_batch with a new netdev API
>>   netdev_rxq_length().
>> * Thread-safe clearing of pmd counters in pmd_perf_start_iteration().
>> * Fixed bug in reporting datapath stats.
>> * Work-around a bug in DPDK rte_vhost_rx_queue_count() which sometimes
>>   returns bogus in the upper 16 bits of the uint32_t return value.
>>
>> v4 -> v5:
>> * Rebased to master (commit e9de6c0)
>> * Implemented comments from Aaron Conole and Darrel Ball
>>
>> v3 -> v4:
>> * Rebased to master (commit 4d0a31b)
>>   - Reverting changes to struct dp_netdev_pmd_thread.
>> * Make metrics collection configurable.
>> * Several bugfixes.
>>
>> v2 -> v3:
>> * Rebased to OVS master (commit 3728b3b).
>> * Non-trivial adaptation to struct dp_netdev_pmd_thread.
>>   - refactored in commit a807c157 (Bhanu).
>> * No other changes compared to v2.
>>
>> v1 -> v2:
>> * Rebased to OVS master (commit 7468ec788).
>> * No other changes compared to v1.
>>
>>
>> Jan Scheurich (3):
>>   netdev: Add rxq callback function rxq_length()
>>   dpif-netdev: Detailed performance stats for PMDs
>>   dpif-netdev: Detection and logging of suspicious PMD iterations
>>
>>  NEWS                        |   5 +
>>  lib/automake.mk             |   1 +
>>  lib/dp-packet.h             |   1 +
>>  lib/dpif-netdev-perf.c      | 475 +++++++++++++++++++++++++++++++++++++++++++-
>>  lib/dpif-netdev-perf.h      | 275 ++++++++++++++++++++++++-
>>  lib/dpif-netdev.c           | 184 +++++++++++++++--
>>  lib/netdev-bsd.c            |   1 +
>>  lib/netdev-dpdk.c           |  49 ++++-
>>  lib/netdev-dpdk.h           |  14 ++
>>  lib/netdev-dpif-unixctl.man | 156 +++++++++++++++
>>  lib/netdev-dummy.c          |   1 +
>>  lib/netdev-linux.c          |   1 +
>>  lib/netdev-provider.h       |   3 +
>>  lib/netdev-vport.c          |   1 +
>>  lib/netdev.c                |   9 +
>>  lib/netdev.h                |   1 +
>>  manpages.mk                 |   2 +
>>  vswitchd/ovs-vswitchd.8.in  |  27 +--
>>  vswitchd/vswitch.xml        |  12 ++
>>  19 files changed, 1158 insertions(+), 60 deletions(-)
>>  create mode 100644 lib/netdev-dpif-unixctl.man
>>
> 
> 


More information about the dev mailing list