[ovs-dev] [PATCH ovs v3 0/2] Introduce dpdkvdpa netdev

Noa Levy noae at mellanox.com
Tue Oct 29 09:35:27 UTC 2019



> -----Original Message-----
> From: Roni Bar Yanai
> Sent: Monday, October 28, 2019 10:25 PM
> To: Ilya Maximets <i.maximets at ovn.org>; Noa Levy <noae at mellanox.com>;
> ovs-dev at openvswitch.org
> Cc: Oz Shlomo <ozsh at mellanox.com>; Majd Dibbiny
> <majd at mellanox.com>; Ameer Mahagneh <ameerm at mellanox.com>; Eli
> Britstein <elibr at mellanox.com>; William Tu <u9012063 at gmail.com>; Simon
> Horman <simon.horman at netronome.com>
> Subject: RE: [ovs-dev] [PATCH ovs v3 0/2] Introduce dpdkvdpa netdev
> 
> Hi Ilya,
> please see inline
> 
> >-----Original Message-----
> >From: Ilya Maximets <i.maximets at ovn.org>
> >Sent: Monday, October 28, 2019 3:46 PM
> >To: Noa Levy <noae at mellanox.com>; ovs-dev at openvswitch.org; Roni Bar
> >Yanai <roniba at mellanox.com>
> >Cc: Oz Shlomo <ozsh at mellanox.com>; Majd Dibbiny
> <majd at mellanox.com>;
> >Ameer Mahagneh <ameerm at mellanox.com>; Eli Britstein
> ><elibr at mellanox.com>; William Tu <u9012063 at gmail.com>; Simon Horman
> ><simon.horman at netronome.com>
> >Subject: Re: [ovs-dev] [PATCH ovs v3 0/2] Introduce dpdkvdpa netdev
> >
> >On 17.10.2019 13:16, Noa Ezra wrote:
> >> There are two approaches to communicate with a guest, using virtIO or
> >> SR-IOV.
> >> SR-IOV allows working with port representor which is attached to the
> >> OVS and a matching VF is given with pass-through to the VM.
> >> HW rules can process packets from up-link and direct them to the VF
> >> without going through SW (OVS) and therefore SR-IOV gives the best
> >> performance.
> >> However, SR-IOV architecture requires that the guest use a
> >> driver which is specific to the underlying HW. A specific HW driver has
> >> two main drawbacks:
> >> 1. Breaks virtualization in some sense (VM aware of the HW), can also
> >>     limit the type of images supported.
> >> 2. Less natural support for live migration.
> >>
> >> Using a virtIO interface solves both problems, but reduces performance
> >> and loses some functionality; for example, some HW offloads cannot be
> >> supported when working directly with virtIO.
> >> To resolve this conflict, we created a new netdev type: dpdkvdpa.
> >> The new netdev is basically similar to a regular dpdk netdev, but it
> >> has some additional functionality for transferring packets from
> >> virtIO guest (VM) to a VF and vice versa. With this solution we can
> >> benefit from both SR-IOV and virtIO.
> >> vDPA netdev is designed to support both SW and HW use-cases.
> >> HW mode will be used to configure vDPA capable devices. The support
> >> for this mode is in progress in the dpdk community.
> >> SW acceleration is used to leverage SR-IOV offloads to virtIO guests
> >> by relaying packets between VF and virtio devices and as a pre-step
> >> for supporting vDPA in HW mode.
> >>
> >> Running example:
> >> 1. Configure OVS bridge and ports:
> >> ovs-vsctl add-br br0-ovs -- set bridge br0-ovs datapath_type=netdev
> >> ovs-vsctl add-port br0-ovs pf -- set Interface pf type=dpdk \
> >>          options:dpdk-devargs=<pf pci id>
> >> ovs-vsctl add-port br0-ovs vdpa0 -- set Interface vdpa0 type=dpdkvdpa \
> >>          options:vdpa-socket-path=<sock path> \
> >>          options:vdpa-accelerator-devargs=<vf pci id> \
> >>          options:dpdk-devargs=<pf pci id>,representor=[id]
> >> 2. Run a virtIO guest (VM) in server mode that creates the socket of
> >>     the vDPA port.
> >> 3. Send traffic.
> >>
> >> Noa Ezra (2):
> >>    netdev-dpdk-vdpa: Introduce dpdkvdpa netdev
> >>    netdev-dpdk: Add dpdkvdpa port
> >>
> >>   Documentation/automake.mk           |   1 +
> >>   Documentation/topics/dpdk/index.rst |   1 +
> >>   Documentation/topics/dpdk/vdpa.rst  |  90 +++++
> >>   NEWS                                |   1 +
> >>   lib/automake.mk                     |   4 +-
> >>   lib/netdev-dpdk-vdpa.c              | 750 ++++++++++++++++++++++++++++++++++++
> >>   lib/netdev-dpdk-vdpa.h              |  54 +++
> >>   lib/netdev-dpdk.c                   | 162 ++++++++
> >>   vswitchd/vswitch.xml                |  25 ++
> >>   9 files changed, 1087 insertions(+), 1 deletion(-)
> >>   create mode 100644 Documentation/topics/dpdk/vdpa.rst
> >>   create mode 100755 lib/netdev-dpdk-vdpa.c
> >>   create mode 100644 lib/netdev-dpdk-vdpa.h
> >>
> >
> >Hi, everyone.
> >
> >So, I have a few questions (mostly to Roni?):
> >
> >1. What happened to idea of implementing this as a DPDK vdev?
> We wanted to solve both the OVS-kernel and OVS-DPDK issues.
> The main argument against it was that we allow defining ports that are not
> OF ports on the switch. Regardless of how useful we think this feature can be,
> it breaks something fundamental, and looking back I agree.
> The vdev was a workaround, but after speaking with some of our DPDK
> experts, they had similar arguments, this time for the DPDK world: you create a
> port that never gets packets and never sends packets. This totally doesn't
> make sense even if it can be useful in some cases. When we remove the
> kernel from the equation, we are left with vDPA.
> In fact we followed your suggestion of having SW vDPA and HW vDPA.
> This makes total sense; for example, OpenStack can configure vDPA and let
> OVS probe the device and go with HW or SW, keeping the same functionality
> and the same configuration.
> 
> >
> >2. What were the results of testing the "direct output optimization" patch [1]?
> >    At this point I also have to mention that OVS is going to have TSO support
> >    in one of the next releases (at least we have some progress on that, thanks
> >    to the idea of using extbuf).  This way, in combination with patch [1], there
> >    should be no benefit from having a separate netdev for forwarding purposes.
> >    Even without this patch, having TSO alone should provide good performance
> >    benefits, making it worth considering not having any extra netdev.
> 
> >
> >    [1]
> >https://patchwork.ozlabs.org/patch/1105880/
> Unfortunately the improvement was minor, a few percent. The performance
> difference between direct and vDPA SW is dramatic. I think that, as a system,
> vDPA is also much easier to follow: you configure one port, instead of having
> 3 ports plus a direct rule.
> We really hope TSO will be supported soon, and it can be a solution (with
> significantly lower performance) for kernel-OVS using direct; maybe it can
> have further optimizations.
> >
> >
> >3. Regarding implementation, I expected less code duplication. It's unclear
> >    why you're not reusing the hotplug capabilities of the existing dpdk netdev.
> >    I expected that this new netdev would call netdev_open() something like
> >    3 times with a few tweaks for enabling TSO or stripping/adding some options.
> >
> I'll let Noa answer here.

There is a separation between the dpdkvdpa netdev and the ports it opens under it.
By design, the dpdk ports are internal to the vDPA netdev. There are no netdev
instances for each one of them. Therefore, we can't use netdev_open() 3 times.
Regarding your comment for code duplication, we only implemented the necessary
functions for this netdev.
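
To illustrate the layout, here is a rough sketch with made-up names (every
sketch_* identifier is hypothetical and is not the structure used in the
patch). It assumes the internal ports are tracked as plain port ids owned by
the vDPA netdev, so OVS only ever sees one netdev instance:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not the code in the patch. */
#define SKETCH_PORT_INVALID UINT16_MAX

/* Stand-in for a netdev instance, the only object OVS itself sees. */
struct sketch_netdev {
    const char *name;
};

/* Ports owned by the vDPA netdev: plain port ids, no netdev instances. */
struct sketch_vdpa_relay {
    uint16_t vf_port_id;     /* VF used as the accelerator. */
    uint16_t vhost_port_id;  /* virtio (vhost) side facing the guest. */
};

struct sketch_netdev_vdpa {
    struct sketch_netdev up;         /* The single netdev exposed to OVS. */
    struct sketch_vdpa_relay relay;  /* Internal ports, opened by the netdev. */
};

/* Starts with no internal ports attached. */
static void
sketch_vdpa_init(struct sketch_netdev_vdpa *dev, const char *name)
{
    assert(dev && name);
    dev->up.name = name;
    dev->relay.vf_port_id = SKETCH_PORT_INVALID;
    dev->relay.vhost_port_id = SKETCH_PORT_INVALID;
}

/* Records the internal ports; returns false if either id is invalid. */
static bool
sketch_vdpa_attach(struct sketch_netdev_vdpa *dev,
                   uint16_t vf_port_id, uint16_t vhost_port_id)
{
    if (vf_port_id == SKETCH_PORT_INVALID
        || vhost_port_id == SKETCH_PORT_INVALID) {
        return false;
    }
    dev->relay.vf_port_id = vf_port_id;
    dev->relay.vhost_port_id = vhost_port_id;
    return true;
}

/* The relay can run only once both internal ports are known. */
static bool
sketch_vdpa_relay_ready(const struct sketch_netdev_vdpa *dev)
{
    return dev->relay.vf_port_id != SKETCH_PORT_INVALID
           && dev->relay.vhost_port_id != SKETCH_PORT_INVALID;
}
```

In this sketch a caller would run sketch_vdpa_init() once, then
sketch_vdpa_attach() with the VF and vhost port ids; since those ids never
become netdev instances, there is nothing to hand to netdev_open() for them.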

> >4. This code doesn't seem to make sense in sw mode without full HW offloading,
> >    i.e. it doesn't make sense to use this netdev while the VF representor is
> >    attached to the userspace datapath.  This is just because, according to the
> >    solution architecture, all the traffic will first be copied from the
> >    VM to the VF (parsed on the fly to enable TSO), and will immediately appear
> >    on the VF representor to be handled by OVS (same parsing with partial offload
> >    + actions execution).  I'm completely unsure if it is any faster than just
> >    sending packets from VM to VF without bypassing OVS processing.  Maybe
> >    even slower and like 3 times heavier for PCI bandwidth.
> >
> Agree, but internally we already have full hardware offload of vxlan, and of
> vxlan + connection tracking. We are planning to start submitting vxlan full
> offload first, and as far as I know it is planned for the next few weeks. We are
> putting a lot of effort into pushing it ASAP.
> I think we had better start reviewing vDPA now, since it takes time and the
> commits are not conflicting. vDPA and full offload are two features that add up
> to a full solution, and neither feature can work standalone (unless you are
> willing to use SR-IOV for the guest).
> Of course you won't use vDPA if you don't have full hardware offload; it will
> be slower.
> >5. "HW mode will be used to configure vDPA capable devices. The support
> >     for this mode is in progress in the dpdk community."
> >    AFAIU, the host side of vDPA support has been done for some time in DPDK.
> >    The API is prepared and there is a vhost-vdpa example that allows you to
> >    check the functionality.  As I understand, the missing part in dpdk right
> >    now is a DPDK virtio driver for the guest, but this is not a limitation for
> >    implementing vDPA support from the host side.  Am I missing something?
> Why is this not a limitation? How could it be used without it?
> Anyway, I'll check again internally, if we are ready.
> >
> >Best regards, Ilya Maximets.

