[ovs-dev] [RFC PATCH] netdev-dpdk: Integrate vHost User PMD

Loftus, Ciara ciara.loftus at intel.com
Thu Jun 21 13:15:00 UTC 2018


> On Fri, Jun 01, 2018 at 01:40:31PM +0000, Loftus, Ciara wrote:
> > >
> > > > On Mon, May 21, 2018 at 04:44:13PM +0100, Ciara Loftus wrote:
> > > > > The vHost PMD brings vHost User port types ('dpdkvhostuser' and
> > > > > 'dpdkvhostuserclient') under control of DPDK's librte_ether API, like
> > > > > all other DPDK netdev types ('dpdk' and 'dpdkr'). In doing so, direct
> > > > > calls to DPDK's librte_vhost library are removed and replaced with
> > > > > librte_ether API calls, for which most of the infrastructure is
> > > > > already in place.
> > > > >
> > > > > This change has a number of benefits, including:
> > > > > * Reduced codebase (~200LOC removed)
> > > > > * More features automatically enabled for vHost ports e.g. custom
> > > > >   stats and additional get_status information.
> > > > > * OVS can be ignorant of changes in the librte_vhost API between
> > > > >   DPDK releases, potentially making upgrades easier and the OVS
> > > > >   codebase less susceptible to change.
> > > > >
> > > > > The sum of all DPDK port types must not exceed RTE_MAX_ETHPORTS,
> > > > > which is set and can be modified in the DPDK configuration. Prior
> > > > > to this patch this only applied to 'dpdk' and 'dpdkr' ports, but
> > > > > now applies to all DPDK port types including vHost User.
> > > > >
> > > > > Performance (pps) of the different topologies p2p, pvp, pvvp and vv
> > > > > has been measured to remain within a +/- 5% margin of existing
> > > > > performance.
> > > >
> > > > Thanks for putting this together.
> > > >
> > > > I think when this idea was discussed, at least in my head we would
> > > > pretty much kill any vhost-specific info and use a standard eth API
> > > > instead. However, it doesn't look to be the case; we still have the
> > > > mtu and queue issues, special construct/destruct, send, etc., which
> > > > IMHO defeats the initial goal.
> > >
> > > I agree, I think that would be the ideal situation but it seems we're
> > > not there yet.
> > > I wonder if that is something that could be changed and fed back to
> > > DPDK? If we will always have to have the separate implementations, is
> > > that reflective of OVS requirements or a gap in the DPDK implementation
> > > of the vhost PMD?
> >
> > Hi Ian & Flavio,
> 
> Hi Ciara,
> 
> > Thank you both for your responses. I agree, right now we are not at
> > the ideal scenario using this API which would probably be closer to
> > having the netdev_dpdk and netdev_dpdk_vhost* classes equivalent.
> > However 4 functions have become common (get_carrier, get_stats,
> > get_custom_stats, get_features) and many of the remainder have some element
> > of commonality through helper functions (send, receive, status,
> > etc.). The hope would be that going forward we could narrow the gap
> > through both OVS and DPDK changes. I think it would be difficult to
> > narrow that gap if we opt for an "all or nothing" approach now.
> 
> It is all or nothing now because if we opt to apply this patch, we are
> all in and hoping that the API will evolve.  The problem I see is that
> as the API evolves, there will be breakage/bugs and OVS would be exposed.
> 
> So, I still see value in OVS moving to a common API, but I don't see a
> good trade-off for OVS as a project moving to it in its current state.
> DPDK support isn't experimental anymore.
> 
> > > > Leaving that aside for a moment, I wonder about imposed limitations
> > > > if we switch to the eth API too. I mean, things that we can do today
> > > > because OVS is managing vhost that we won't be able to do after the
> > > > API switch.
> > >
> > > I've been thinking of this situation also. But one concern is: by not
> > > using the vhost PMD, will there be features that are unavailable to
> > > vhost in OVS?
> > >
> > > Nothing comes to mind for now, and as long as we continue to access the
> > > DPDK vhost library that should be ok. However it's something we should
> > > keep an eye on in the future (for example, we recently had an example of
> > > a DPDK function that could not be used with DPDK compiled for shared
> > > libs).
> > >
> > > It would be interesting to see where the DPDK community are trending
> > > towards with vhost development in the future WRT this.
> >
> > Feature-wise, it appears the development of any new DPDK vHost
> > feature includes relevant support for that feature in the PMD.
> > Dequeue zero copy and vhost iommu are examples of these. So going
> > forward I don't see any issues there.
> >
> > In my development and testing of this patch I haven't come across
> > any limitations other than that out-of-the-box one is limited to a
> > maximum of 32 vHost ports as defined by RTE_MAX_ETHPORTS. I would be
> > interested to hear if that would be a concern for users.
> 
> I am aware of cases where there are multiple vhost-user interfaces
> to the same VM, so that limit sounds quite low, but I will double
> check.
> 
> > On the other hand there are many more new things we can do with the
> > API switch too, e.g. more information in get_status & custom
> > statistics, and hopefully more going forward. Although I understand
> > preserving existing functionality is critical.
> >
> > I understand this is a large patch and might take some time to review,
> > but I would definitely welcome any further high-level feedback,
> > especially around the topics above, from anybody in the community
> > interested in netdev-dpdk/vHost.
> 
> I'd like to thank you again for putting this together. One thing is
> talking about it and another is having this patch showing the reality.

Thanks for your reply Flavio and apologies for the delayed response, I was on vacation.

> 
> What would be the vHost PMD goal from the DPDK project point of view?
> Because if DPDK is planning to drop the vhost library public API, we
> will need to plan and switch anyways at some point and I agree with
> you when you say we should close the gap soon.

+ Maxime, Tiwei & Zhihong DPDK vHost/vHost PMD maintainers

I have posted a patch to the OVS mailing list which switches the codebase from using the DPDK vHost library API to the vHost PMD via the ether API. There is some discussion about the change and I think input from the DPDK side would be valuable.

1. As asked by Flavio, will the vHost library API continue to be public or is there a plan to move away from that?
2. Are there any upcoming planned changes to the vHost PMD?
3. Would changes to the vHost PMD be welcome, e.g. changes that would ease the integration with OVS like some of those mentioned below? Some context: we currently use the ether API for physical ports/vdevs and want to reuse this code for vHost with minimal vHost-specific logic; however, as it stands that's not totally possible at the moment.

> 
> However, if that's not the case, it sounds like when vhost PMD API is
> compatible enough, current OVS would be able to use it as an ordinary
> DPDK device with maybe some special parameters. Assuming that it would
> be mature/good enough, the only change for OVS would be to remove pretty
> much most of the vhost code.

I reviewed the code in the context of closing the gap between DPDK ethdevs and vhost ethdevs, and have put together a list of the 'special cases' for vHost (client) ports and some considerations.

1. netdev_dpdk_vhost_construct
Here we set dev->type = DPDK_DEV_VHOST.
Maybe we could remove the dev->type variable and rework the code such that each netdev_class ensures the right code is called for that port type with no need for the 'if dev->type' check. The trade-off would be the inflation of the code this change would cause, but the benefit would be having a common netdev_dpdk_construct function.
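
To make the idea concrete, a rough sketch of the direction (common_construct() and the init-hook names are illustrative, not what the patch currently does):

    /* Sketch only: the shared construct takes a per-class init hook, so the
     * common path no longer needs an 'if (dev->type == DPDK_DEV_VHOST)'
     * branch. */
    static int
    common_construct(struct netdev *netdev,
                     int (*class_init)(struct netdev_dpdk *dev))
    {
        struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);

        /* ... allocation/initialisation shared by all DPDK classes ... */

        return class_init(dev);
    }

    static int
    vhost_class_init(struct netdev_dpdk *dev)
    {
        /* vHost-only setup, e.g. socket path / vdev name generation. */
        return 0;
    }

    static int
    netdev_dpdk_vhost_construct(struct netdev *netdev)
    {
        return common_construct(netdev, vhost_class_init);
    }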

2. netdev_dpdk_vhost_destruct
Here we:
i. Unregister new/destroy/vsc callbacks.
Could be removed by making this automatic in DPDK, e.g. via a flag in rte_eth_dev_stop?
ii. Free vhost_pmd_id from the pool.
DPDK could be changed to accept non-unique vdev device names and to generate and manage their names, i.e. IDs.
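
For reference, the vHost-only teardown amounts to roughly the following (a sketch; the callback function names and 'vhost_pmd_id_pool' are illustrative):

    static void
    vhost_teardown(struct netdev_dpdk *dev)
    {
        /* 2.i: per-event callbacks must be unregistered by hand today.  If
         * DPDK dropped them automatically on stop/close, this would go
         * away. */
        rte_eth_dev_callback_unregister(dev->port_id,
                                        RTE_ETH_EVENT_INTR_LSC,
                                        link_status_changed_cb, dev);
        rte_eth_dev_callback_unregister(dev->port_id,
                                        RTE_ETH_EVENT_QUEUE_STATE,
                                        vring_state_changed_cb, dev);

        /* 2.ii: return the ID used to build a unique vdev name to the
         * pool; unnecessary if DPDK generated and managed unique vdev
         * names itself. */
        id_pool_free_id(vhost_pmd_id_pool, dev->vhost_pmd_id);
    }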

3. netdev_dpdk_vhost_send
i. Retrieve qid from tx_q map.
Could perhaps be avoided by configuring netdev->n_txq to reflect the number of enabled queues rather than the total number of queues. Either derive this information from vring_state_changed callbacks or via a new API. This wouldn't work if individual queues can be enabled/disabled in Virtio - not sure if this is the case? (A sketch of the current remapping follows this item.)
ii. Verify dev->vhost_reconfigured
This flag tells us the device is active and we've configured the mempool & queue info for the device.
We don't check these things before sending to a phy port, so perhaps this could be removed for vHost or introduced for phy for consistency.
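
For context, the vHost-only logic described in 3.i and 3.ii boils down to something like this at the top of the send path (a simplified sketch, not the exact patch code):

    /* Simplified sketch of the vHost-only checks before transmit.  If
     * netdev->n_txq tracked only enabled queues, the remap table (and
     * this helper) could disappear. */
    static bool
    vhost_tx_ready(struct netdev_dpdk *dev, int *qid)
    {
        /* 3.i: remap the OVS txq id onto an enabled virtio queue. */
        *qid = dev->tx_q[*qid % dev->up.n_txq].map;

        /* 3.ii: the device must be active and OVS must have finished
         * configuring mempool/queue info for it. */
        return *qid >= 0 && dev->vhost_reconfigured;
    }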

4. netdev_dpdk_get_status
There will always be extra code here for vHost to report vHost-specific info, e.g. socket path, client/server mode, etc.
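
E.g. on top of the common ethdev status we would still append something like the following (the key and field names are just illustrative):

    /* Sketch: vHost-specific entries added on top of the common ethdev
     * status.  Field/key names are illustrative. */
    smap_add(args, "mode", dev->vhost_client ? "client" : "server");
    smap_add(args, "socket", dev->vhost_socket_path);
    smap_add_format(args, "num_of_vrings", "%d", dev->vhost_n_vrings);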

5. netdev_dpdk_vhost_reconfigure
i. Set queue numbers. Phy ports allow the user to specify n_rxq and set n_txq to equal the number of pmd threads. vHost uses the number of queues configured in QEMU to set those values. This probably has to stay the same (see the sketch after this item for one way it could converge with the phy path).
ii. Remapping of queues - see point 3.i.
iii. Set vhost_reconfigured - see point 3.ii
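
The values themselves probably have to keep coming from QEMU, but if the PMD reported that count through the standard info query the code path could at least look like the phy one. A sketch of that assumption (not something the PMD is guaranteed to do today):

    /* Sketch: derive vHost queue counts from the ethdev info instead of
     * librte_vhost callbacks.  Assumes the vHost PMD reflects the number
     * of queue pairs configured in QEMU in max_rx_queues/max_tx_queues. */
    struct rte_eth_dev_info info;

    rte_eth_dev_info_get(dev->port_id, &info);
    dev->requested_n_rxq = info.max_rx_queues;
    dev->requested_n_txq = info.max_tx_queues;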

6. netdev_dpdk_vhost_rxq_recv
Verify dev->vhost_reconfigured - see point 3.ii

Thanks,
Ciara

> 
> --
> Flavio


