[ovs-dev] [PATCH] netdev-dpdk: Add a statistic on vhost tx contention.

Ilya Maximets i.maximets at samsung.com
Mon Jul 8 14:14:05 UTC 2019


On 07.07.2019 19:13, David Marchand wrote:
> Add a statistic to help diagnose contention on the vhost txqs.
> This is usually seen as dropped packets on the physical ports for rates
> that are usually handled fine by OVS.
> 
> Signed-off-by: David Marchand <david.marchand at redhat.com>
> ---
> This patch applies on top of Kevin "vhost-retry" series [1].
> 
> 1: https://patchwork.ozlabs.org/project/openvswitch/list/?submitter=70482
> ---
>  Documentation/topics/dpdk/vhost-user.rst | 25 +++++++++++++++++++++++++
>  lib/netdev-dpdk.c                        | 20 ++++++++++++++++++--
>  2 files changed, 43 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/topics/dpdk/vhost-user.rst b/Documentation/topics/dpdk/vhost-user.rst
> index 724aa62..e276b21 100644
> --- a/Documentation/topics/dpdk/vhost-user.rst
> +++ b/Documentation/topics/dpdk/vhost-user.rst
> @@ -553,6 +553,31 @@ shown with::
>  
>    $ ovs-vsctl get Interface dpdkvhostclient0 statistics:tx_retries
>  
> +vhost tx contention
> +-------------------
> +
> +Contrary to physical ports, where the number of Transmit queues is decided
> +when the port is configured and does not change once set up, the number of
> +Transmit queues for a vhost-user or vhost-user-client interface is
> +negotiated with the guest when its driver enables a virtio interface, and
> +OVS has no control over it.
> +
> +Each PMD thread needs a Transmit queue to send packets on such a port.
> +To accommodate the cases where the number of PMD threads differs from the
> +number of Transmit queues enabled by a Virtual Machine, the Transmit queues
> +are distributed among the PMD threads and their accesses are protected by a
> +spinlock.
> +
> +When contention occurs (because two PMD threads are mapped to the same
> +Transmit queue), it can be hard to diagnose. It usually shows up as OVS
> +taking more cycles to handle the packets and can be seen as packets
> +getting dropped on the receive side (rx_dropped statistic on the physical
> +ports).
> +
> +A statistic for the port is available to catch those cases::
> +
> +  $ ovs-vsctl get Interface dpdkvhostclient0 statistics:tx_contentions


I'm concerned about this statistic. 'tx_retries' has at least a slight
connection to networking, and the end user is able to understand what it
means, but this one is fully abstracted from the network and requires deep
knowledge of the current netdev-dpdk implementation to understand what it
means.

I see 2 options for counters like this:

  1. coverage counter.
  2. PMD perf. stats.

The first one is definitely the correct place for such a statistic, but we'll
lose some information, like the port where this happened.

The second one seems suitable too, but it's hard to expose information from
netdev to the upper layer (see the awful vhost qfill). This, however, could
be solved by re-implementing dpif-netdev-perf with a 'coverage counters'-like
approach, as I suggested previously while discussing the original
'dpif-netdev-perf' patch set.

So, for now, I'm voting for a simple coverage counter.
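
For reference, a minimal sketch of what the coverage-counter variant could
look like (names are illustrative, not from the patch; 'vhost_tx_contention'
is a hypothetical counter and the surrounding function is abridged):

    /* Sketch: count contended acquisitions with an OVS coverage counter
     * (lib/coverage.h) instead of a per-port statistic. */
    #include "coverage.h"

    COVERAGE_DEFINE(vhost_tx_contention);

    /* Inside __netdev_dpdk_vhost_send(), replacing the unconditional lock: */
    if (unlikely(!rte_spinlock_trylock(&dev->tx_q[qid].tx_lock))) {
        COVERAGE_INC(vhost_tx_contention);    /* contended acquisition */
        rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
    }

The counter would then be visible in 'ovs-appctl coverage/show' output, at
the cost of losing the per-port breakdown.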


More thoughts:

Main idea: In OVS we already have too many ways of counting different events!

We have a 'stats_lock'; please don't introduce an "atomic" way of collecting
statistics. If you want atomics, please rewrite all of the stats in
netdev-dpdk to atomics (probably not a good idea).
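
A minimal sketch of that direction, assuming a single per-device
'tx_contentions' counter (a hypothetical plain uint64_t member of struct
netdev_dpdk) protected by the existing 'stats_lock', in the same way as
'tx_retries':

    if (unlikely(!rte_spinlock_trylock(&dev->tx_q[qid].tx_lock))) {
        rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
        /* Account the contention under the same lock as the other
         * per-device stats. */
        rte_spinlock_lock(&dev->stats_lock);
        dev->tx_contentions++;
        rte_spinlock_unlock(&dev->stats_lock);
    }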

Regarding the implementation:
Why are these stats per-queue? As you're using atomics, there is no need to
collect them separately. And you will not avoid cache invalidation anyway,
because the tx queues are not cacheline aligned/padded.

Since you're using atomic operations, the variable should be atomic_uint64_t.
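
In other words, if the per-port atomic approach is kept anyway, the member
and its accesses would use the ovs-atomic helpers consistently; roughly
(sketch only, with a single per-device counter):

    atomic_uint64_t tx_contentions;   /* e.g. one counter in struct
                                       * netdev_dpdk, not one per tx queue. */

    /* On a contended trylock: */
    uint64_t orig;
    atomic_add_relaxed(&dev->tx_contentions, 1, &orig);

    /* When filling the custom stats: */
    uint64_t tx_contentions;
    atomic_read_relaxed(&dev->tx_contentions, &tx_contentions);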


Best regards, Ilya Maximets.

> +
>  vhost-user Dequeue Zero Copy (experimental)
>  -------------------------------------------
>  
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index 0e70e44..844fb3c 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -139,10 +139,11 @@ BUILD_ASSERT_DECL((MAX_NB_MBUF / ROUND_DOWN_POW2(MAX_NB_MBUF / MIN_NB_MBUF))
>  #define XSTAT_RX_JABBER_ERRORS           "rx_jabber_errors"
>  
>  /* Size of vHost custom stats. */
> -#define VHOST_CUSTOM_STATS_SIZE          1
> +#define VHOST_CUSTOM_STATS_SIZE          2
>  
>  /* Names of vHost custom stats. */
>  #define VHOST_STAT_TX_RETRIES            "tx_retries"
> +#define VHOST_STAT_TX_CONTENTIONS        "tx_contentions"
>  
>  #define SOCKET0              0
>  
> @@ -340,6 +341,8 @@ struct dpdk_tx_queue {
>                                      * pmd threads (see 'concurrent_txq'). */
>      int map;                       /* Mapping of configured vhost-user queues
>                                      * to enabled by guest. */
> +    uint64_t tx_contentions;       /* Number of times a pmd has been waiting
> +                                    * for another to xmit on this queue. */
>  };
>  
>  /* dpdk has no way to remove dpdk ring ethernet devices
> @@ -2387,7 +2390,10 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
>          goto out;
>      }
>  
> -    rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
> +    if (unlikely(!rte_spinlock_trylock(&dev->tx_q[qid].tx_lock))) {
> +        rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
> +        atomic_count_inc64(&dev->tx_q[qid].tx_contentions);
> +    }
>  
>      cnt = netdev_dpdk_filter_packet_len(dev, cur_pkts, cnt);
>      /* Check has QoS has been configured for the netdev */
> @@ -2878,6 +2884,16 @@ netdev_dpdk_vhost_get_custom_stats(const struct netdev *netdev,
>      custom_stats->counters[0].value = dev->tx_retries;
>      rte_spinlock_unlock(&dev->stats_lock);
>  
> +    ovs_strlcpy(custom_stats->counters[1].name, VHOST_STAT_TX_CONTENTIONS,
> +                NETDEV_CUSTOM_STATS_NAME_SIZE);
> +    custom_stats->counters[1].value = 0;
> +    for (unsigned int i = 0; i < netdev->n_txq; i++) {
> +        uint64_t tx_contentions;
> +
> +        atomic_read_relaxed(&dev->tx_q[i].tx_contentions, &tx_contentions);
> +        custom_stats->counters[1].value += tx_contentions;
> +    }
> +
>      ovs_mutex_unlock(&dev->mutex);
>  
>      return 0;
> 

