[ovs-dev] [RFC dpdk-latest 1/1] netdev-dpdk: integrate dpdk vhost pmd

Maxime Coquelin maxime.coquelin at redhat.com
Wed May 27 15:42:29 UTC 2020


Hi Siva,

On 5/19/20 1:49 PM, Sivaprasad Tummala wrote:
> The vHost PMD brings vHost User port types ('dpdkvhostuser' and
> 'dpdkvhostuserclient') under control of DPDK's librte_ether API, like
> other DPDK netdev types ('dpdk'). In doing so, direct
> calls to DPDK's librte_vhost library are removed and replaced with
> librte_ether API calls, for which most of the infrastructure is already
> in place.
> 
> To enable TSO, specific changes were required in the vhost PMD. The
> patch which enables these is  available on dpdk-master and here:
> https://patches.dpdk.org/patch/66052/

From the discussions I had earlier this year, I understand that the
primary motivation for this patch set is that it is a preliminary step
before introducing Vhost DMA support (work done by Intel based on IOAT).

Indeed, the initial RFC added DMA support to the Vhost PMD. The problem
was that this work implied adding Virtio ring specifics to the Vhost
PMD, which was not accepted because such low-level handling of the ring
belongs in the Vhost library.
After discussion, we agreed that DMA support would be added to the
Vhost library instead, by introducing a new set of Vhost APIs.

I think that this series will make it more complicated to add Vhost DMA
support to OVS (adding Jiayu and Patrick, who work on the Vhost DMA
series).

> Signed-off-by: Ciara Loftus <Ciara.Loftus at intel.com>
> Signed-off-by: Sivaprasad Tummala <Sivaprasad.Tummala at intel.com>
> 
> Tested-by: Sunil Pai G <sunil.pai.g at intel.com>
> ---
>  Documentation/topics/dpdk/vhost-user.rst |    3 +
>  NEWS                                     |    3 +
>  acinclude.m4                             |    4 +
>  include/openvswitch/netdev.h             |    1 +
>  lib/dpdk.c                               |   11 +
>  lib/dpdk.h                               |    2 +
>  lib/netdev-dpdk.c                        | 1384 ++++++++--------------
>  7 files changed, 535 insertions(+), 873 deletions(-)
> 

> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index 9d2b4104c..e02929174 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c

...

> -static const struct vhost_device_ops virtio_net_device_ops =
> -{
> -    .new_device =  new_device,
> -    .destroy_device = destroy_device,
> -    .vring_state_changed = vring_state_changed,
> -    .features_changed = NULL,
> -    .new_connection = NULL,
> -    .destroy_connection = destroy_connection,
> -    .guest_notified = vhost_guest_notified,

AFAICS, you don't provide an equivalent for this callback.

...

>  static int
> -vhost_common_construct(struct netdev *netdev)
> -    OVS_REQUIRES(dpdk_mutex)
> +dpdk_attach_vhost_pmd(struct netdev_dpdk *dev, int mode)
>  {
> -    int socket_id = rte_lcore_to_socket_id(rte_get_master_lcore());
> -    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> +    char *devargs;
> +    int err = 0;
> +    dpdk_port_t port_no = 0;
> +    uint32_t driver_id = 0;
> +    int iommu_enabled = 0;
> +    int zc_enabled = 0;
> +    int postcopy_enabled = 0;
> +    int linear_buf = 1;
> +    int ext_buf = 0;
> +    int tso = 0;
> +    char *driver_name;
>  
> -    dev->vhost_rxq_enabled = dpdk_rte_mzalloc(OVS_VHOST_MAX_QUEUE_NUM *
> -                                              sizeof *dev->vhost_rxq_enabled);
> -    if (!dev->vhost_rxq_enabled) {
> -        return ENOMEM;
> +    atomic_init(&dev->vhost_tx_retries_max, VHOST_ENQ_RETRY_DEF);
> +
> +    if (dev->vhost_driver_flags & RTE_VHOST_USER_DEQUEUE_ZERO_COPY) {
> +        zc_enabled = 1;
> +        /*
> +         * DPDK vHost library doesn't allow zero-copy with linear buffers
> +         * currently. Hence disabling the Linear buffer check until the
> +         * issue is fixed in DPDK.
> +         */

No, there is no issue.
It is not possible to support zero-copy and linear buffer at the same
time, simply because we cannot guarantee the driver in the guest won't
scatter the packets into multiple buffers.
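
To make the constraint concrete, here is a minimal sketch of the kind of
check that follows from it. The flag names and values below are
hypothetical stand-ins chosen for illustration, not the definitions from
rte_vhost.h, and the helper is not an existing DPDK or OVS function:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the vhost-user registration flags. The real
 * definitions live in DPDK's rte_vhost.h and are not reproduced here. */
#define SKETCH_DEQUEUE_ZERO_COPY  (UINT64_C(1) << 0)
#define SKETCH_LINEARBUF_SUPPORT  (UINT64_C(1) << 1)

/* Returns true if the flag combination is usable.
 *
 * With zero-copy dequeue, the guest buffers are handed over as they are.
 * If the guest driver scattered a packet across several buffers, it cannot
 * be presented as a single linear buffer without copying it, so the two
 * options are mutually exclusive. */
static bool
sketch_vhost_flags_valid(uint64_t flags)
{
    bool zero_copy = (flags & SKETCH_DEQUEUE_ZERO_COPY) != 0;
    bool linear = (flags & SKETCH_LINEARBUF_SUPPORT) != 0;

    return !(zero_copy && linear);
}
```

Under these assumptions, each flag on its own is accepted and only the
combination of both is rejected, so a caller has to pick one or the other
rather than wait for a fix.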

Thanks,
Maxime


