[ovs-dev] [PATCH v8] netdev-dpdk: Add support for vHost dequeue zero copy (experimental)

Stokes, Ian ian.stokes at intel.com
Tue Jan 16 10:00:25 UTC 2018


> -----Original Message-----
> From: Loftus, Ciara
> Sent: Friday, January 5, 2018 4:14 PM
> To: dev at openvswitch.org
> Cc: Loftus, Ciara <ciara.loftus at intel.com>; Stokes, Ian
> <ian.stokes at intel.com>; jan.scheurich at ericsson.com; ktraynor at redhat.com;
> i.maximets at samsung.com
> Subject: [PATCH v8] netdev-dpdk: Add support for vHost dequeue zero copy
> (experimental)
> 

Hi Ciara,

This looks simpler than the previous revisions. I still need to do some testing, but I have a few comments inline below that could be addressed in a v9.

Thanks
Ian

> Zero copy is disabled by default. To enable it, set the 'dq-zero-copy'
> option to 'true' when configuring the Interface:
> 
> ovs-vsctl set Interface dpdkvhostuserclient0
> options:vhost-server-path=/tmp/dpdkvhostuserclient0
> options:dq-zero-copy=true
> 
> When packets from a vHost device with zero copy enabled are destined for a
> single 'dpdk' port, the number of tx descriptors on that 'dpdk' port must
> be set to a smaller value. 128 is recommended. This can be achieved like
> so:
> 
> ovs-vsctl set Interface dpdkport options:n_txq_desc=128
> 
> Note: The sum of the tx descriptors of all 'dpdk' ports the VM will send
> to should not exceed 128. Due to this requirement, the feature is
> considered 'experimental'.
> 
> Testing of the patch showed a 15% improvement when switching 512B packets
> between vHost devices on different VMs on the same host when zero copy was
> enabled on the transmitting device.
> 
> Signed-off-by: Ciara Loftus <ciara.loftus at intel.com>
> ---
> v8:
> * Disallow configurability after vHost device has been registered & update
> docs accordingly.
> * Give performance datapoint in commit message.
> 
>  Documentation/intro/install/dpdk.rst     |  2 +
>  Documentation/topics/dpdk/vhost-user.rst | 72
> ++++++++++++++++++++++++++++++++
>  NEWS                                     |  1 +
>  lib/netdev-dpdk.c                        |  9 +++-
>  vswitchd/vswitch.xml                     | 11 +++++
>  5 files changed, 94 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/intro/install/dpdk.rst
> b/Documentation/intro/install/dpdk.rst
> index 3fecb5c..087eb88 100644
> --- a/Documentation/intro/install/dpdk.rst
> +++ b/Documentation/intro/install/dpdk.rst
> @@ -518,6 +518,8 @@ The above command sets the number of rx queues for
> DPDK physical interface.
>  The rx queues are assigned to pmd threads on the same NUMA node in a
> round-robin fashion.
> 
> +.. _dpdk-queues-sizes:
> +
>  DPDK Physical Port Queue Sizes
>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> diff --git a/Documentation/topics/dpdk/vhost-user.rst
> b/Documentation/topics/dpdk/vhost-user.rst
> index 8447e2d..1a6c6d0 100644
> --- a/Documentation/topics/dpdk/vhost-user.rst
> +++ b/Documentation/topics/dpdk/vhost-user.rst
> @@ -458,3 +458,75 @@ Sample XML
>      </domain>
> 
>  .. _QEMU documentation: http://git.qemu-
> project.org/?p=qemu.git;a=blob;f=docs/specs/vhost-
> user.txt;h=7890d7169;hb=HEAD
> +
> +vhost-user Dequeue Zero Copy (experimental)
> +-------------------------------------------
> +
> +Normally when dequeuing a packet from a vHost User device, a memcpy
> +operation must be used to copy that packet from guest address space to
> +host address space. This memcpy can be removed by enabling dequeue zero-
> copy like so::
> +
> +    $ ovs-vsctl add-port br0 dpdkvhostuserclient0 -- set Interface \
> +        dpdkvhostuserclient0 type=dpdkvhostuserclient \
> +        options:vhost-server-path=/tmp/dpdkvhostclient0 \
> +        options:dq-zero-copy=true
> +
> +With this feature enabled, a reference (pointer) to the packet is
> +passed to the host, instead of a copy of the packet. Removing this
> +memcpy can give a performance improvement for some use cases, for
> +example switching large packets between different VMs. However additional
> packet loss may be observed.
> +
> +Note that the feature is disabled by default and must be explicitly
> +enabled by setting the 'dq-zero-copy' option to 'true' while specifying
> +the 'vhost-server-path' option as above. If you wish to split out the
> +command into multiple commands as below, ensure 'dq-zero-copy' is set
> +before
> +'vhost-server-path'::
> +
> +    $ ovs-vsctl set Interface dpdkvhostuserclient0 options:dq-zero-
> copy=true
> +    $ ovs-vsctl set Interface dpdkvhostuserclient0 \
> +        options:vhost-server-path=/tmp/dpdkvhostclient0
> +
> +The feature is only available to dpdkvhostuserclient port types.
> +
> +A limitation exists whereby if packets from a vHost port with
> +dq-zero-copy=true are destined for a 'dpdk' type port, the number of tx
> +descriptors (n_txq_desc) for that port must be reduced to a smaller
> +number, 128 being the recommended value. This can be achieved by issuing
> the following command::
> +
> +    $ ovs-vsctl set Interface dpdkport options:n_txq_desc=128
> +
> +Note: The sum of the tx descriptors of all 'dpdk' ports the VM will
> +send to should not exceed 128. For example, in case of a bond over two
> +physical ports in balance-tcp mode, one must divide 128 by the number of
> links in the bond.
> +
> +Refer to :ref:`<dpdk-queue-sizes>` for more information.

Compilation fails during the doc check stage on the reference above. The label added in dpdk.rst is 'dpdk-queues-sizes', but the reference here is to '<dpdk-queue-sizes>':

WARNING: undefined label: <dpdk-queue-sizes> (if the link has no caption the label must precede a section header)
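
On the bond note in the same hunk, the "divide 128 by the number of links" arithmetic could be made concrete. Below is a minimal sketch, not OVS code: both helper names are hypothetical, and rounding the result down to a power of two is an assumption about what n_txq_desc accepts rather than something this patch states.

```c
#include <assert.h>
#include <stdint.h>

/* Largest power of two that is <= v; returns 0 when v is 0. */
static uint32_t
sketch_round_down_pow2(uint32_t v)
{
    uint32_t p = 1;

    if (v == 0) {
        return 0;
    }
    while (p <= v / 2) {
        p *= 2;
    }
    return p;
}

/* Split a total tx descriptor budget (128 in the docs above) evenly across
 * the 'dpdk' ports the VM sends to, e.g. the links of a bond.
 * Rounding down to a power of two is an assumption, see the lead-in. */
static uint32_t
sketch_txq_desc_per_port(uint32_t budget, uint32_t n_links)
{
    assert(n_links > 0);
    return sketch_round_down_pow2(budget / n_links);
}
```

With a budget of 128, a two-link bond gets 64 descriptors per port. A three-link bond gets 32 under the rounding assumption, since 128 / 3 = 42.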

> +
> +The reason for this limitation is due to how the zero copy
> +functionality is implemented. The vHost device's 'tx used vring', a
> +virtio structure used for tracking used ie. sent descriptors, will only
> +be updated when the NIC frees the corresponding mbuf. If we don't free
> +the mbufs frequently enough, that vring will be starved and packets
> +will no longer be processed. One way to ensure we don't encounter this
> +scenario, is to configure n_txq_desc to a small enough number such that
> +the 'mbuf free threshold' for the NIC will be hit more often and thus
> +free mbufs more frequently. The value of 128 is suggested, but values
> +of 64 and 256 have been tested and verified to work too, with differing
> +performance characteristics. A value of 512 can be used too, if the
> +virtio queue size in the guest is increased to 1024 (available to
> configure in QEMU versions v2.10 and greater). This value can be set like
> so::
> +
> +    $ qemu-system-x86_64 ... -chardev
> socket,id=char1,path=<sockpath>,server
> +      -netdev type=vhost-user,id=mynet1,chardev=char1,vhostforce
> +      -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,
> +      tx_queue_size=1024
> +
> +Because of this limitation, this feature is condsidered 'experimental'.

Minor: typo above, 'condsidered' should be 'considered'.

> +
> +The feature currently does not fully work with QEMU >= v2.7 due to a
> +bug in DPDK which will be addressed in an upcoming release. The patch
> +to fix this issue can be found on `Patchwork
> +<http://dpdk.org/dev/patchwork/patch/32198/>`__
> +
> +Further information can be found in the `DPDK documentation
> +<http://dpdk.readthedocs.io/en/v17.05/prog_guide/vhost_lib.html>`__
> diff --git a/NEWS b/NEWS
> index 752e98f..1eaa32c 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -24,6 +24,7 @@ Post-v2.8.0
>     - DPDK:
>       * Add support for DPDK v17.11
>       * Add support for vHost IOMMU
> +     * Add support for vHost dequeue zero copy (experimental)
> 
>  v2.8.0 - 31 Aug 2017
>  --------------------
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index 8f22264..d100b31
> 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -1384,6 +1384,10 @@ netdev_dpdk_vhost_client_set_config(struct netdev
> *netdev,
>          path = smap_get(args, "vhost-server-path");
>          if (path && strcmp(path, dev->vhost_id)) {
>              strcpy(dev->vhost_id, path);
> +            /* check zero copy configuration */
> +            if (smap_get_bool(args, "dq-zero-copy", false)) {
> +                dev->vhost_driver_flags |=
> RTE_VHOST_USER_DEQUEUE_ZERO_COPY;
> +            }
>              netdev_request_reconfigure(netdev);
>          }
>      }
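
To illustrate the ordering requirement that the docs describe, here is a minimal standalone sketch of the logic in this hunk. It is not OVS code: 'struct sketch_dev', 'sketch_set_config()' and 'SKETCH_DEQUEUE_ZERO_COPY' are hypothetical stand-ins for 'struct netdev_dpdk', 'netdev_dpdk_vhost_client_set_config()' and 'RTE_VHOST_USER_DEQUEUE_ZERO_COPY', the smap lookups are replaced by plain arguments, and only the lines visible in the hunk are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for RTE_VHOST_USER_DEQUEUE_ZERO_COPY; the bit value
 * is illustrative only. */
#define SKETCH_DEQUEUE_ZERO_COPY (1ULL << 2)

/* Hypothetical stand-in for the relevant fields of struct netdev_dpdk. */
struct sketch_dev {
    char vhost_id[64];
    uint64_t vhost_driver_flags;
};

/* Models the hunk above: 'dq_zero_copy' is only examined when 'path' differs
 * from the stored vhost_id.  Returns true when a reconfigure would be
 * requested. */
static bool
sketch_set_config(struct sketch_dev *dev, const char *path, bool dq_zero_copy)
{
    assert(dev != NULL);

    if (path && strcmp(path, dev->vhost_id)) {
        /* Bounded copy; the sketch truncates rather than overflowing. */
        snprintf(dev->vhost_id, sizeof dev->vhost_id, "%s", path);
        if (dq_zero_copy) {
            dev->vhost_driver_flags |= SKETCH_DEQUEUE_ZERO_COPY;
        }
        return true;
    }
    return false;
}
```

In this model, a call that passes the path and dq_zero_copy=true together sets the flag. A call that sets the path first and then repeats the same path with dq_zero_copy=true returns false and leaves the flags untouched, which matches the "must be set before vhost-server-path" wording in the docs.
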
> @@ -3278,7 +3282,7 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev
> *netdev)  {
>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>      int err;
> -    uint64_t vhost_flags = 0;
> +    uint64_t vhost_flags = dev->vhost_driver_flags;
> 
>      ovs_mutex_lock(&dev->mutex);
> 
> @@ -3307,6 +3311,9 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev
> *netdev)
>              VLOG_INFO("vHost User device '%s' created in 'client' mode, "
>                        "using client socket '%s'",
>                        dev->up.name, dev->vhost_id);
> +            if (dev->vhost_driver_flags &
> RTE_VHOST_USER_DEQUEUE_ZERO_COPY) {
> +                VLOG_INFO("Zero copy enabled for vHost port %s", dev-
> >up.name);
> +            }
>          }
> 
>          err = rte_vhost_driver_callback_register(dev->vhost_id,
> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml index
> 018d644..16b3c3e 100644
> --- a/vswitchd/vswitch.xml
> +++ b/vswitchd/vswitch.xml
> @@ -2669,6 +2669,17 @@ ovs-vsctl add-port br0 p0 -- set Interface p0
> type=patch options:peer=p1 \
>          </p>
>        </column>
> 
> +      <column name="options" key="dq-zero-copy"
> +              type='{"type": "boolean"}'>
> +        <p>
> +          The value specifies whether or not to enable dequeue zero copy
> on
> +          the given interface.
> +          Must be set before vhost-server-path is specified.
> +          Only supported by dpdkvhostuserclient interfaces.
> +          The feature is considered experimental.
> +        </p>
> +      </column>
> +
>        <column name="options" key="n_rxq_desc"
>                type='{"type": "integer", "minInteger": 1, "maxInteger":
> 4096}'>
>          <p>
> --
> 2.7.5


