[ovs-dev] [PATCH ovs v3 2/2] netdev-dpdk: Add dpdkvdpa port

William Tu u9012063 at gmail.com
Wed Oct 30 21:21:40 UTC 2019


On Tue, Oct 29, 2019 at 4:20 AM Noa Levy <noae at mellanox.com> wrote:
>
>
>
> > -----Original Message-----
> > From: William Tu [mailto:u9012063 at gmail.com]
> > Sent: Monday, October 28, 2019 10:46 PM
> > To: Noa Levy <noae at mellanox.com>
> > Cc: ovs-dev at openvswitch.org; Oz Shlomo <ozsh at mellanox.com>; Majd
> > Dibbiny <majd at mellanox.com>; Ameer Mahagneh
> > <ameerm at mellanox.com>; Eli Britstein <elibr at mellanox.com>
> > Subject: Re: [ovs-dev] [PATCH ovs v3 2/2] netdev-dpdk: Add dpdkvdpa port
> >
> > Hi Noa,
> >
> > Thanks for your reply.
> >
> > > > > > Hi Noa,
> > > > > >
> > > > > > I have a couple more questions. I'm still at the learning stage
> > > > > > of this new feature, thanks in advance for your patience.
> > > > > >
> > > > > > On Thu, Oct 17, 2019 at 02:16:56PM +0300, Noa Ezra wrote:
> > > > > > > dpdkvdpa netdev works with 3 components:
> > > > > > > vhost-user socket, vdpa device: real vdpa device or a VF and
> > > > > > > representor of "vdpa device".
> > > > > >
> > > > > > What NIC card support this feature?
> > > > > > I don't have real vdpa device, can I use Intel X540 VF feature?
> > > > > >
> > > > >
> > > > > This feature will have two modes, SW and HW.
> > > > > The SW mode doesn't depend on a real vdpa device and allows you to
> > > > > use this feature even if you don't have a NIC that support it.
> > > Although you need to use representors, so you need your NIC to support it.
> > > > > The HW mode will be implemented in the future and will use a real
> > > > > vdpa device. It will be better to use the HW mode if you have a
> > > > > > NIC that support it.
> > > > >
> > > > > For now, we only support the SW mode, when vdpa will have support
> > > > > in dpdk, we will add the HW mode to OVS.
> > > > >
> > > > > > >
> > > > > > > In order to add a new vDPA port, add a new port to existing
> > > > > > > bridge with type dpdkvdpa and vDPA options:
> > > > > > > ovs-vsctl add-port br0 vdpa0 -- set Interface vdpa0 type=dpdkvdpa
> > > > > > >    options:vdpa-socket-path=<sock path>
> > > > > > >    options:vdpa-accelerator-devargs=<VF pci id>
> > > > > > >    options:dpdk-devargs=<vdpa pci id>,representor=[id]
> > > > > > >
> > > > > > > On this command OVS will create a new netdev:
> > > > > > > 1. Register vhost-user-client device.
> > > > > > > 2. Open and configure VF dpdk port.
> > > > > > > 3. Open and configure representor dpdk port.
> > > > > > >
> > > > > > > The new netdev will use netdev_rxq_recv() function in order to
> > > > > > > receive packets from VF and push to vhost-user and receive
> > > > > > > packets from vhost-user and push to VF.
> > > > > > >
> > > > > > > Signed-off-by: Noa Ezra <noae at mellanox.com>
> > > > > > > Reviewed-by: Oz Shlomo <ozsh at mellanox.com>
> > > > > > > ---
> > > > > > >  Documentation/automake.mk           |   1 +
> > > > > > >  Documentation/topics/dpdk/index.rst |   1 +
> > > > > > > >  Documentation/topics/dpdk/vdpa.rst  |  90 ++++++++++++++++++++
> > > > > > >  NEWS                                |   1 +
> > > > > > > >  lib/netdev-dpdk.c                   | 162 ++++++++++++++++++++++++++++++++++++
> > > > > > >  vswitchd/vswitch.xml                |  25 ++++++
> > > > > > > >  6 files changed, 280 insertions(+)
> > > > > > > >  create mode 100644 Documentation/topics/dpdk/vdpa.rst
> > > > > > >
> > > > > > > > diff --git a/Documentation/automake.mk b/Documentation/automake.mk
> > > > > > > index cd68f3b..ee574bc 100644
> > > > > > > --- a/Documentation/automake.mk
> > > > > > > +++ b/Documentation/automake.mk
> > > > > > > @@ -43,6 +43,7 @@ DOC_SOURCE = \
> > > > > > >         Documentation/topics/dpdk/ring.rst \
> > > > > > >         Documentation/topics/dpdk/vdev.rst \
> > > > > > >         Documentation/topics/dpdk/vhost-user.rst \
> > > > > > > +       Documentation/topics/dpdk/vdpa.rst \
> > > > > > >         Documentation/topics/fuzzing/index.rst \
> > > > > > >         Documentation/topics/fuzzing/what-is-fuzzing.rst \
> > > > > > >
> > > > > > > Documentation/topics/fuzzing/ovs-fuzzing-infrastructure.rst \
> > > > > > > diff --git a/Documentation/topics/dpdk/index.rst
> > > > > > > b/Documentation/topics/dpdk/index.rst
> > > > > > > index cf24a7b..c1d4ea7 100644
> > > > > > > --- a/Documentation/topics/dpdk/index.rst
> > > > > > > +++ b/Documentation/topics/dpdk/index.rst
> > > > > > > @@ -41,3 +41,4 @@ The DPDK Datapath
> > > > > > >     /topics/dpdk/pdump
> > > > > > >     /topics/dpdk/jumbo-frames
> > > > > > >     /topics/dpdk/memory
> > > > > > > +   /topics/dpdk/vdpa
> > > > > > > diff --git a/Documentation/topics/dpdk/vdpa.rst
> > > > > > > b/Documentation/topics/dpdk/vdpa.rst
> > > > > > > new file mode 100644
> >
> > <snip>
> >
> > > > > > > +2357,23 @@ netdev_dpdk_rxq_recv(struct netdev_rxq *rxq,
> > > > > > > +struct
> > > > > > dp_packet_batch *batch,
> > > > > > >      return 0;
> > > > > > >  }
> > > > > > >
> > > > > > > +static int
> > > > > > > +netdev_dpdk_vdpa_rxq_recv(struct netdev_rxq *rxq,
> > > > > > > +                          struct dp_packet_batch *batch,
> > > > > > > +                          int *qfill) {
> > > > > > > +    struct netdev_dpdk *dev = netdev_dpdk_cast(rxq->netdev);
> > > > > > > +    int fwd_rx;
> > > > > > > +    int ret;
> > > > > > > +
> > > > > > > > +    fwd_rx = netdev_dpdk_vdpa_rxq_recv_impl(dev->relay, rxq->queue_id);
> > > > > > I'm still not clear about the above function.
> > > > > > So netdev_dpdk_vdpa_recv_impl()
> > > > > > >     netdev_dpdk_vdpa_forward_traffic(), with a queue pair as parameter
> > > > > >         ...
> > > > > >         rte_eth_rx_burst(qpair->port_id_rx...)
> > > > > >         ...
> > > > > >         rte_eth_tx_burst(qpair->port_id_tx...)
> > > > > >
> > > > > > So looks like forwarding between vf to vhostuser and vice versa
> > > > > > is done in this function.
> > > > > >
> > > > > > > +    ret = netdev_dpdk_rxq_recv(rxq, batch, qfill);
> > > > > >
> > > > > > Then why do we call netdev_dpdk_rxq_recv() above again?
> > > > > > Are packets received above the same packets as
> > > > > > > rte_eth_rx_burst() previously called in
> > > > > > > netdev_dpdk_vdpa_forward_traffic()?
> > > > > >
> > > >
> > > > netdev_dpdk_vdpa_recv_impl() first calls: rte_eth_rx_burst and
> > > > rte_eth_tx_burst in order to forward between vf to vhostuser and
> > > > vice versa.
> > > > After rx_burst and tx_burst is done, we call netdev_dpdk_rxq_recv()
> > > > in order to receive packets for the representor.
> > > > The queue is different in rte_eth_rx_burst, rte_eth_tx_burst and
> > > > netdev_dpdk_rxq_recv.
> >
> > So what traffic goes into the queues seen by (rte_eth_rx_burst,
> > rte_eth_tx_burst)
> > and what traffic goes to queues seen by netdev_dpdk_rxq_recv()?
> >
> The traffic that goes through rte_eth_rx_burst and rte_eth_tx_burst is the
> traffic from vm to vf or from vf to vm (the "forwarder" traffic).
> The traffic that goes through netdev_dpdk_rxq_recv() is the packets sent to
> the representor's queues.
>
> > And if the HW mode is enabled, then we can remove calling the
> > rte_eth_rx_burst() and
> > rte_eth_tx_burst() because HW directly places packet into the virtio queue.
> > Do I understand correctly?
>
> Yes, you understand correctly. When the HW mode is enabled, the packets go
> directly to the virtio queue and the "forwarding" will take place in the HW
> and not through SW, so rte_eth_rx_burst() and rte_eth_tx_burst() won't be
> used. We will support both HW and SW modes, so we won't remove this SW
> implementation; we will add support for HW.
>

Hi Noa,

Thank you!
So when using HW mode, OVS does not need to handle packet forwarding
(no rte_eth_rx_burst and rte_eth_tx_burst).
But when using HW mode, does OVS still need to handle the vhost-user
vring kick and call events?
The kick is when a guest notifies the host/OVS because it has placed
buffers onto a virtqueue, and the call is when the host/OVS signals the
guest through the call file descriptor to tell it that there are
incoming packets.

Or are these two events also handled in the hardware, so that OVS does
not need to do anything?
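
To make clear which two events I mean, here is a minimal sketch of the
kick/call notification pattern using plain Linux eventfds, which is the
kind of file descriptor vhost-user passes for vring kick and call. This
is only an illustration of the mechanism; the names vring_notify,
vring_notify_init, vring_signal and vring_drain are made up for this mail
and are not OVS or DPDK APIs, and it says nothing about how the HW mode
will actually behave.

```c
/* Sketch only: models vring "kick" and "call" notifications with eventfds.
 * All names here are hypothetical and are not part of OVS or DPDK. */
#include <assert.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/eventfd.h>

struct vring_notify {
    int kick_fd;   /* guest -> host: buffers were placed on the virtqueue */
    int call_fd;   /* host -> guest: incoming packets are available */
};

/* Create both eventfds in non-blocking mode. Returns 0 on success. */
static int
vring_notify_init(struct vring_notify *n)
{
    n->kick_fd = eventfd(0, EFD_NONBLOCK);
    n->call_fd = eventfd(0, EFD_NONBLOCK);
    if (n->kick_fd < 0 || n->call_fd < 0) {
        if (n->kick_fd >= 0) {
            close(n->kick_fd);
        }
        if (n->call_fd >= 0) {
            close(n->call_fd);
        }
        return -1;
    }
    return 0;
}

/* Signal one event: the guest does this on kick_fd, the host on call_fd. */
static int
vring_signal(int fd)
{
    uint64_t one = 1;

    assert(fd >= 0);
    return write(fd, &one, sizeof one) == (ssize_t) sizeof one ? 0 : -1;
}

/* Return the number of pending signals and reset the counter.
 * Returns 0 when nothing is pending (the non-blocking read fails). */
static uint64_t
vring_drain(int fd)
{
    uint64_t count = 0;

    assert(fd >= 0);
    if (read(fd, &count, sizeof count) != (ssize_t) sizeof count) {
        return 0;
    }
    return count;
}
```

In the SW path I would expect something on the host side to drain kick_fd
and then process the virtqueue, and to signal call_fd after filling it.
My question is whether that part stays in software when HW mode is used.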

Thank you.
Regards,
William

