[ovs-dev] [PATCH v9] netdev-dpdk: Add support for vHost dequeue zero copy (experimental)
Ilya Maximets
i.maximets at samsung.com
Tue Jan 23 13:38:33 UTC 2018
On 19.01.2018 20:19, Ciara Loftus wrote:
> Zero copy is disabled by default. To enable it, set the 'dq-zero-copy'
> option to 'true' when configuring the Interface:
>
> ovs-vsctl set Interface dpdkvhostuserclient0
> options:vhost-server-path=/tmp/dpdkvhostuserclient0
> options:dq-zero-copy=true
>
> When packets from a vHost device with zero copy enabled are destined for
> a single 'dpdk' port, the number of tx descriptors on that 'dpdk' port
> must be set to a smaller value. 128 is recommended. This can be achieved
> like so:
>
> ovs-vsctl set Interface dpdkport options:n_txq_desc=128
>
> Note: The sum of the tx descriptors of all 'dpdk' ports the VM will send
> to should not exceed 128. Due to this requirement, the feature is
> considered 'experimental'.
>
> Testing of the patch showed a 15% improvement when switching 512B
> packets between vHost devices on different VMs on the same host when
> zero copy was enabled on the transmitting device.
>
> Signed-off-by: Ciara Loftus <ciara.loftus at intel.com>
> ---
> v9:
> * Rebase
> * Fix docs issue
> * Move variable assignment inside mutex
> * Reset dq-zero-copy value if vhost_driver_register fails
>
> Documentation/intro/install/dpdk.rst | 2 +
> Documentation/topics/dpdk/vhost-user.rst | 73 ++++++++++++++++++++++++++++++++
> NEWS | 1 +
> lib/netdev-dpdk.c | 11 ++++-
> vswitchd/vswitch.xml | 11 +++++
> 5 files changed, 97 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/intro/install/dpdk.rst b/Documentation/intro/install/dpdk.rst
> index 3fecb5c..087eb88 100644
> --- a/Documentation/intro/install/dpdk.rst
> +++ b/Documentation/intro/install/dpdk.rst
> @@ -518,6 +518,8 @@ The above command sets the number of rx queues for DPDK physical interface.
> The rx queues are assigned to pmd threads on the same NUMA node in a
> round-robin fashion.
>
> +.. _dpdk-queues-sizes:
> +
> DPDK Physical Port Queue Sizes
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> diff --git a/Documentation/topics/dpdk/vhost-user.rst b/Documentation/topics/dpdk/vhost-user.rst
> index 8447e2d..95517a6 100644
> --- a/Documentation/topics/dpdk/vhost-user.rst
> +++ b/Documentation/topics/dpdk/vhost-user.rst
> @@ -458,3 +458,76 @@ Sample XML
> </domain>
>
> .. _QEMU documentation: http://git.qemu-project.org/?p=qemu.git;a=blob;f=docs/specs/vhost-user.txt;h=7890d7169;hb=HEAD
> +
> +vhost-user Dequeue Zero Copy (experimental)
> +-------------------------------------------
> +
> +Normally when dequeuing a packet from a vHost User device, a memcpy operation
> +must be used to copy that packet from guest address space to host address
> +space. This memcpy can be removed by enabling dequeue zero-copy like so::
> +
> + $ ovs-vsctl add-port br0 dpdkvhostuserclient0 -- set Interface \
> + dpdkvhostuserclient0 type=dpdkvhostuserclient \
> + options:vhost-server-path=/tmp/dpdkvhostclient0 \
> + options:dq-zero-copy=true
> +
> +With this feature enabled, a reference (pointer) to the packet is passed to
> +the host, instead of a copy of the packet. Removing this memcpy can give a
> +performance improvement for some use cases, for example switching large packets
> +between different VMs. However, additional packet loss may be observed.
> +
> +Note that the feature is disabled by default and must be explicitly enabled
> +by setting the ``dq-zero-copy`` option to ``true`` while specifying the
> +``vhost-server-path`` option as above. If you wish to split out the command
> +into multiple commands as below, ensure ``dq-zero-copy`` is set before
> +``vhost-server-path``::
> +
> + $ ovs-vsctl set Interface dpdkvhostuserclient0 options:dq-zero-copy=true
> + $ ovs-vsctl set Interface dpdkvhostuserclient0 \
> + options:vhost-server-path=/tmp/dpdkvhostclient0
> +
> +The feature is only available to ``dpdkvhostuserclient`` port types.
> +
> +A limitation exists whereby if packets from a vHost port with
> +``dq-zero-copy=true`` are destined for a ``dpdk`` type port, the number of tx
> +descriptors (``n_txq_desc``) for that port must be reduced to a smaller number,
> +128 being the recommended value. This can be achieved by issuing the following
> +command::
> +
> + $ ovs-vsctl set Interface dpdkport options:n_txq_desc=128
> +
> +Note: The sum of the tx descriptors of all ``dpdk`` ports the VM will send to
> +should not exceed 128. For example, in case of a bond over two physical ports
> +in balance-tcp mode, one must divide 128 by the number of links in the bond.
> +
> +Refer to :ref:`dpdk-queues-sizes` for more information.
> +
> +This limitation stems from how the zero copy functionality is
> +implemented. The vHost device's 'tx used vring', a virtio structure used for
> +tracking used, i.e. sent, descriptors, will only be updated when the NIC frees
> +the corresponding mbuf. If we don't free the mbufs frequently enough, that
> +vring will be starved and packets will no longer be processed. One way to
> +ensure we don't encounter this scenario is to configure ``n_txq_desc`` to a
> +small enough number such that the 'mbuf free threshold' for the NIC will be hit
> +more often and thus free mbufs more frequently. The value of 128 is suggested,
> +but values of 64 and 256 have been tested and verified to work too, with
> +differing performance characteristics. A value of 512 can be used too, if the
> +virtio queue size in the guest is increased to 1024 (available to configure in
> +QEMU versions v2.10 and greater). This value can be set like so::
> +
> + $ qemu-system-x86_64 ... -chardev socket,id=char1,path=<sockpath>,server
> + -netdev type=vhost-user,id=mynet1,chardev=char1,vhostforce
> + -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,
> + tx_queue_size=1024
> +
> +Because of this limitation, this feature is considered 'experimental'.
> +
> +The feature currently does not fully work with QEMU >= v2.7 due to a bug in
> +DPDK which will be addressed in an upcoming release. The patch to fix this
> +issue can be found on
> +`Patchwork
> +<http://dpdk.org/dev/patchwork/patch/32198/>`__
> +
> +Further information can be found in the
> +`DPDK documentation
> +<http://dpdk.readthedocs.io/en/v17.05/prog_guide/vhost_lib.html>`__
> diff --git a/NEWS b/NEWS
> index c067b94..51f9e04 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -47,6 +47,7 @@ v2.9.0 - xx xxx xxxx
> "management" statistics.
> - ovs-ofctl dump-ports command now prints new of set custom statistics
> if available (for OpenFlow 1.4+).
> + * Add support for vHost dequeue zero copy (experimental)
> - vswitchd:
> * Datapath IDs may now be specified as 0x1 (etc.) instead of 16 digits.
> * Configuring a controller, or unconfiguring all controllers, now deletes
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index e32c7f6..8111eba 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -1519,6 +1519,12 @@ netdev_dpdk_vhost_client_set_config(struct netdev *netdev,
> path = smap_get(args, "vhost-server-path");
> if (path && strcmp(path, dev->vhost_id)) {
> strcpy(dev->vhost_id, path);
> + /* check zero copy configuration */
> + if (smap_get_bool(args, "dq-zero-copy", false)) {
> + dev->vhost_driver_flags |= RTE_VHOST_USER_DEQUEUE_ZERO_COPY;
> + } else {
> + dev->vhost_driver_flags &= ~RTE_VHOST_USER_DEQUEUE_ZERO_COPY;
> + }
> netdev_request_reconfigure(netdev);
> }
> }
> @@ -3521,7 +3527,7 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev *netdev)
> if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)
> && strlen(dev->vhost_id)) {
> /* Register client-mode device. */
> - vhost_flags |= RTE_VHOST_USER_CLIENT;
> + vhost_flags = dev->vhost_driver_flags |= RTE_VHOST_USER_CLIENT;
RTE_VHOST_USER_CLIENT should be set in dev->vhost_driver_flags only
after a successful driver register. Otherwise, we won't be able to do anything
with this device if registration fails.
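
The ordering described above could look roughly like the sketch below. It
uses mock types rather than the real netdev-dpdk structures: mock_dev,
mock_driver_register(), mock_client_reconfigure() and the MOCK_* flag values
are all hypothetical names made up for illustration, not OVS or DPDK
definitions. The idea is to build the registration flags in a local variable
and commit the client bit to the device only once registration has succeeded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins for the DPDK vhost flags (not the real values). */
#define MOCK_USER_CLIENT       (1ULL << 0)
#define MOCK_DEQUEUE_ZERO_COPY (1ULL << 2)

/* Hypothetical, trimmed-down device: only the field under discussion. */
struct mock_dev {
    uint64_t vhost_driver_flags;
};

/* Hypothetical stand-in for the driver register call.
 * Returns 0 on success, -1 if the socket path is missing or empty. */
static int
mock_driver_register(const char *path, uint64_t flags)
{
    (void) flags;
    return (path != NULL && path[0] != '\0') ? 0 : -1;
}

/* Registers the device in client mode.  The device flags are left
 * untouched on failure, so a later reconfigure can retry. */
static int
mock_client_reconfigure(struct mock_dev *dev, const char *path)
{
    uint64_t vhost_flags;
    int err;

    assert(dev != NULL);

    if (dev->vhost_driver_flags & MOCK_USER_CLIENT) {
        return 0;               /* Already registered. */
    }

    /* Local copy: zero copy (if configured) plus the client bit. */
    vhost_flags = dev->vhost_driver_flags | MOCK_USER_CLIENT;

    err = mock_driver_register(path, vhost_flags);
    if (err) {
        return err;             /* Device flags unchanged. */
    }

    /* Commit the client bit only after a successful register. */
    dev->vhost_driver_flags |= MOCK_USER_CLIENT;
    return 0;
}
```

With this shape, a failed register leaves the client bit clear, so the
"not yet registered" check at the top still passes on the next attempt.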
>
> /* Enable IOMMU support, if explicitly requested. */
> if (dpdk_vhost_iommu_enabled()) {
> @@ -3538,6 +3544,9 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev *netdev)
> VLOG_INFO("vHost User device '%s' created in 'client' mode, "
> "using client socket '%s'",
> dev->up.name, dev->vhost_id);
> + if (dev->vhost_driver_flags & RTE_VHOST_USER_DEQUEUE_ZERO_COPY) {
> + VLOG_INFO("Zero copy enabled for vHost port %s", dev->up.name);
> + }
> }
>
> err = rte_vhost_driver_callback_register(dev->vhost_id,
> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> index 58c0ebd..0e47c14 100644
> --- a/vswitchd/vswitch.xml
> +++ b/vswitchd/vswitch.xml
> @@ -2671,6 +2671,17 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 type=patch options:peer=p1 \
> </p>
> </column>
>
> + <column name="options" key="dq-zero-copy"
> + type='{"type": "boolean"}'>
> + <p>
> + The value specifies whether or not to enable dequeue zero copy on
> + the given interface.
> + Must be set before vhost-server-path is specified.
> + Only supported by dpdkvhostuserclient interfaces.
> + The feature is considered experimental.
> + </p>
> + </column>
> +
> <column name="options" key="n_rxq_desc"
> type='{"type": "integer", "minInteger": 1, "maxInteger": 4096}'>
> <p>
>
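
The documentation's rule that the tx descriptors of all 'dpdk' ports a VM
sends to should sum to at most 128 (e.g. divide 128 by the number of links in
a bond) can be illustrated with a small helper. This is only a sketch:
txq_desc_per_port() and round_down_pow2() are hypothetical names, and rounding
down to a power of two is an assumption on my side about what n_txq_desc
accepts, so check vswitch.xml before relying on it.

```c
#include <assert.h>

/* Largest power of two that is <= n.  Requires n >= 1. */
static unsigned int
round_down_pow2(unsigned int n)
{
    unsigned int p = 1;

    assert(n >= 1);
    while (p <= n / 2) {
        p *= 2;
    }
    return p;
}

/* Splits a total tx descriptor budget (128 in the patch's docs) evenly
 * across 'n_ports' ports, rounding each share down to a power of two
 * (assumed requirement, see above). */
static unsigned int
txq_desc_per_port(unsigned int budget, unsigned int n_ports)
{
    assert(n_ports >= 1 && budget >= n_ports);
    return round_down_pow2(budget / n_ports);
}
```

For a bond over two physical ports this gives 64 per port, which matches the
"divide 128 by the number of links" guidance; each result would then be
applied with 'ovs-vsctl set Interface <port> options:n_txq_desc=<value>'.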