[ovs-dev] [PATCH v2 2/2] netdev-dpdk: Enable optional dequeue zero copy for vHost User

Stokes, Ian ian.stokes at intel.com
Sun Oct 15 13:52:13 UTC 2017


Thanks for the v2 Ciara. Comments inline.


> Enabled per port like so:
> ovs-vsctl set Interface dpdkvhostuserclient0 options:dq-zero-copy=true
> 
> The feature is disabled by default and can only be enabled/disabled when a
> vHost port is down.
> 
> When packets from a vHost device with zero copy enabled are destined for a
> 'dpdk' port, the number of tx descriptors on that 'dpdk' port must be set
> to a smaller value. 128 is recommended. This can be achieved like
> so:
> 
> ovs-vsctl set Interface dpdkport options:n_txq_desc=128
> 
> Signed-off-by: Ciara Loftus <ciara.loftus at intel.com>
> ---
>  Documentation/howto/dpdk.rst             | 29 +++++++++++
>  Documentation/topics/dpdk/vhost-user.rst | 35 +++++++++++++
>  NEWS                                     |  3 ++
>  lib/netdev-dpdk.c                        | 89
> +++++++++++++++++++++++++++++++-
>  vswitchd/vswitch.xml                     | 11 ++++
>  5 files changed, 166 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst
> index d123819..d149a8e 100644
> --- a/Documentation/howto/dpdk.rst
> +++ b/Documentation/howto/dpdk.rst
> @@ -709,3 +709,32 @@ devices to bridge ``br0``. Once complete, follow the
> below steps:
>     Check traffic on multiple queues::
> 
>         $ cat /proc/interrupts | grep virtio
> +
> +PHY-VM-PHY (vHost Dequeue Zero Copy)
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Title underline is too short here, will cause an error on compilation.

> +
> +vHost dequeue zero copy functionality can  be validated using the
> +PHY-VM-PHY configuration. To begin, follow the steps described in
> +:ref:`dpdk-phy-phy` to create and initialize the database, start
> +ovs-vswitchd and add ``dpdk``-type and ``dpdkvhostuser``-type devices
> +and flows to bridge ``br0``. Once complete, follow the below steps:
> +
> +1. Enable dequeue zero copy on the vHost devices.
> +
> +       $ ovs-vsctl set Interface dpdkvhostuser0 options:dq-zero-copy=true
> +       $ ovs-vsctl set Interface dpdkvhostuser1
> + options:dq-zero-copy=true
> +
> +2. Reduce the number of txq descriptors on the phy ports.
> +
> +       $ ovs-vsctl set Interface phy0 options:n_txq_desc=128
> +       $ ovs-vsctl set Interface phy1 options:n_txq_desc=128
> +
> +3. Proceed with the test by launching the VM and configuring guest
> +forwarding, be it via the vHost loopback method or kernel forwarding
> +method, and sending traffic.
> +
> +It is essential that step 1 is performed before booting the VM,
> +otherwise the feature will not be enabled.
> +
> +Check for the log "VHOST_CONFIG: dequeue zero copy is enabled" to
> +ensure the successful enablement of the feature.

Minor nit but the info message I saw in the log during testing differed from what's above, it was as follows:

netdev_dpdk|INFO|Zero copy enabled for vHost socket /usr/local/var/run/openvswitch/vhost-user0

Consider changing the message in the docs to reflect this.

> diff --git a/Documentation/topics/dpdk/vhost-user.rst
> b/Documentation/topics/dpdk/vhost-user.rst
> index 74ac06e..267ccaf 100644
> --- a/Documentation/topics/dpdk/vhost-user.rst
> +++ b/Documentation/topics/dpdk/vhost-user.rst
> @@ -408,3 +408,38 @@ Sample XML
>      </domain>
> 
>  .. _QEMU documentation: http://git.qemu-
> project.org/?p=qemu.git;a=blob;f=docs/specs/vhost-
> user.txt;h=7890d7169;hb=HEAD
> +
> +vhost-user Dequeue Zero Copy
> +-------------------------------------
> +
> +Normally when dequeuing a packet from a vHost User device, a memcpy
> +operation must be used to copy that packet from guest address space to
> +host address space. This memcpy can be removed by enabling dequeue zero-
> copy like so:
> +
> +    $ ovs-vsctl set Interface dpdkvhostuserclient0
> + options:dq-zero-copy=true
> +
> +With this feature enabled, a reference (pointer) to the packet is
> +passed to the host, instead of a copy of the packet. Removing this
> +memcpy can give a performance improvement for some use cases, for
> +example switching large packets between different VMs.
> +
> +Note that the feature is disabled by default and must be explicitly
> +enabled by using the command above.
> +
> +The feature cannot be enabled when the device is active (ie. VM
> +booted). If you wish to enable the feature after the VM has booted, you
> +must shutdown the VM and bring it back up.

Does it make sense to explain how the feature can be disabled on a port that it was previously enabled on?

I don't see a command or an example of something like $ ovs-vsctl set Interface dpdkvhostuserclient0 options:dq-zero-copy=false.

Disabling the feature and the steps involved would be good for completeness of the documentation.

> +
> +The feature is available to both dpdkvhostuser and dpdkvhostuserclient
> +port types.
> +
> +A limitation exists whereby if packets from a vHost port with
> +dq-zero-copy=true are destined for a 'dpdk' type port, the number of tx
> +descriptors (n_txq_desc) for that port must be reduced to a smaller
> +number, 128 being the recommended value. This can be achieved by issuing
> the following command:
> +
> +    $ ovs-vsctl set Interface dpdkport options:n_txq_desc=128
> +

I'd like to see this limitation expanded upon.

In testing a VM -> Phy use case with testpmd in txmode in the VM, if you don't reduce the descriptors (i.e. use the default settings on the phy port) I saw that no traffic was being transmitted from the phy port. Experimenting with different values for the num descriptors yields difference performance levels i.e. 64 desc gave higher throughput in this simple case, and increasing the number of descriptors (128,256 etc.) reduced the throughput level.

I can see a user experimenting with this but from the docs it's not clear what's going on under the hood and why 128 should be selected with zero copy enabled, what factors impact on this selection?

A quick high level explanation would be good here with maybe a link to the DPDK docs for descriptors if its available.

> +Further information can be found in the `DPDK documentation
> +<http://dpdk.readthedocs.io/en/v17.05/prog_guide/vhost_lib.html>`__
> diff --git a/NEWS b/NEWS
> index a3c1a02..832a348 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -5,6 +5,9 @@ Post-v2.8.0
>         chassis "hostname" in addition to a chassis "name".
>     - Linux kernel 4.13
>       * Add support for compiling OVS with the latest Linux 4.13 kernel
> +   - DPDK:
> +     * Optional dequeue zero copy feature for vHost ports enabled per
> port
> +       via the boolean 'dq-zero-copy' option.
> 
>  v2.8.0 - xx xxx xxxx
>  ---------------------
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index 0b910cd..930b20e
> 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -379,6 +379,9 @@ struct netdev_dpdk {
>      /* True if vHost device is 'up' and has been reconfigured at least
> once */
>      bool vhost_reconfigured;
> 
> +    /* True if dq-zero-copy feature has successfully been enabled */
> +    bool dq_zc_enabled;
> +
>      /* Identifier used to distinguish vhost devices from each other. */
>      char vhost_id[PATH_MAX];
> 
> @@ -905,6 +908,7 @@ common_construct(struct netdev *netdev, dpdk_port_t
> port_no,
>      dev->max_packet_len = MTU_TO_FRAME_LEN(dev->mtu);
>      ovsrcu_index_init(&dev->vid, -1);
>      dev->vhost_reconfigured = false;
> +    dev->dq_zc_enabled = false;
>      dev->attached = false;
> 
>      ovsrcu_init(&dev->qos_conf, NULL);
> @@ -1413,6 +1417,29 @@ netdev_dpdk_ring_set_config(struct netdev *netdev,
> const struct smap *args,
>      return 0;
>  }
> 
> +static void
> +dpdk_vhost_set_config_helper(struct netdev_dpdk *dev,
> +                             const struct smap *args) {
> +    bool needs_reconfigure = false;
> +    bool zc_requested = smap_get_bool(args, "dq-zero-copy", false);
> +
> +    if (zc_requested &&
> +            !(dev->vhost_driver_flags &
> RTE_VHOST_USER_DEQUEUE_ZERO_COPY)) {
> +        dev->vhost_driver_flags |= RTE_VHOST_USER_DEQUEUE_ZERO_COPY;
> +        needs_reconfigure = true;
> +    } else if (!zc_requested &&
> +            (dev->vhost_driver_flags & RTE_VHOST_USER_DEQUEUE_ZERO_COPY))
> {
> +        dev->vhost_driver_flags &= ~RTE_VHOST_USER_DEQUEUE_ZERO_COPY;
> +        needs_reconfigure = true;
> +    }
> +
> +    /* Only try to change ZC mode when device is down */
> +    if (needs_reconfigure && (netdev_dpdk_get_vid(dev) == -1)) {
> +        netdev_request_reconfigure(&dev->up);
> +    }
> +}
> +
>  static int
>  netdev_dpdk_vhost_client_set_config(struct netdev *netdev,
>                                      const struct smap *args, @@ -1429,6
> +1456,23 @@ netdev_dpdk_vhost_client_set_config(struct netdev *netdev,
>              netdev_request_reconfigure(netdev);
>          }
>      }
> +
> +    dpdk_vhost_set_config_helper(dev, args);
> +
> +    ovs_mutex_unlock(&dev->mutex);
> +
> +    return 0;
> +}
> +
> +static int
> +netdev_dpdk_vhost_set_config(struct netdev *netdev,
> +                             const struct smap *args,
> +                             char **errp OVS_UNUSED) {
> +    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> +
> +    ovs_mutex_lock(&dev->mutex);
> +    dpdk_vhost_set_config_helper(dev, args);
>      ovs_mutex_unlock(&dev->mutex);
> 
>      return 0;
> @@ -2753,6 +2797,46 @@ netdev_dpdk_txq_map_clear(struct netdev_dpdk *dev)
>      }
>  }
> 
> +static void
> +vhost_change_zero_copy_mode(struct netdev_dpdk *dev, bool client_mode,
> +                            bool enable) {
> +    int err = rte_vhost_driver_unregister(dev->vhost_id);
> +
> +    if (err) {
> +        VLOG_ERR("Error unregistering vHost socket %s; can't change zero
> copy "
> +                "mode", dev->vhost_id);
> +    } else {
> +        err = dpdk_setup_vhost_device(dev, client_mode);
> +        if (err) {
> +            VLOG_ERR("Error changing zero copy mode for vHost socket %s",
> +                    dev->vhost_id);
> +        } else if (enable) {
> +            dev->dq_zc_enabled = true;
> +            VLOG_INFO("Zero copy enabled for vHost socket %s", dev-
> >vhost_id);
> +        } else {
> +            dev->dq_zc_enabled = false;
> +            VLOG_INFO("Zero copy disabled for vHost socket %s", dev-
> >vhost_id);
> +        }
> +    }
> +}
> +
> +static void
> +vhost_check_zero_copy_status(struct netdev_dpdk *dev) {
> +    bool mode = dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT;
> +
> +    if ((dev->vhost_driver_flags & RTE_VHOST_USER_DEQUEUE_ZERO_COPY)
> +                && !dev->dq_zc_enabled) {
> +        /* ZC disabled but requested to be enabled, enable it. */
> +        vhost_change_zero_copy_mode(dev, mode, true);
> +    } else if (!(dev->vhost_driver_flags &
> +            RTE_VHOST_USER_DEQUEUE_ZERO_COPY) && dev->dq_zc_enabled) {
> +        /* ZC enabled but requested to be disabled, disable it. */
> +        vhost_change_zero_copy_mode(dev, mode, false);
> +    }
> +}
> +
>  /*
>   * Remove a virtio-net device from the specific vhost port.  Use dev-
> >remove
>   * flag to stop any more packets from being sent or received to/from a VM
> and @@ -2798,6 +2882,7 @@ destroy_device(int vid)
>           */
>          ovsrcu_quiesce_start();
>          VLOG_INFO("vHost Device '%s' has been removed", ifname);
> +        netdev_request_reconfigure(&dev->up);
>      } else {
>          VLOG_INFO("vHost Device '%s' not found", ifname);
>      }
> @@ -3289,6 +3374,8 @@ dpdk_vhost_reconfigure_helper(struct netdev_dpdk
> *dev)
>              /* Carrier status may need updating. */
>              netdev_change_seq_changed(&dev->up);
>          }
> +    } else {
> +        vhost_check_zero_copy_status(dev);
>      }
> 
>      return 0;
> @@ -3450,7 +3537,7 @@ static const struct netdev_class dpdk_vhost_class =
>          NULL,
>          netdev_dpdk_vhost_construct,
>          netdev_dpdk_vhost_destruct,
> -        NULL,
> +        netdev_dpdk_vhost_set_config,
>          NULL,
>          netdev_dpdk_vhost_send,
>          netdev_dpdk_vhost_get_carrier,
> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml index
> 074535b..660b46f 100644
> --- a/vswitchd/vswitch.xml
> +++ b/vswitchd/vswitch.xml
> @@ -2621,6 +2621,17 @@
>          </p>
>        </column>
> 
> +      <column name="options" key="dq-zero-copy"
> +              type='{"type": "boolean"}'>
> +        <p>
> +          The value specifies whether or not to enable dequeue zero copy
> on
> +          the given interface.
> +          The port must be in an inactive state in order to enable or
> disable
> +          this feature.
> +          Only supported by dpdkvhostuserclient and dpdkvhostuser
> interfaces.
> +        </p>
> +      </column>
> +
>        <column name="options" key="n_rxq_desc"
>                type='{"type": "integer", "minInteger": 1, "maxInteger":
> 4096}'>
>          <p>
> --
> 2.7.5



More information about the dev mailing list