[ovs-dev] [PATCH v2 0/8] OVS-DPDK flow offload with rte_flow

Chandran, Sugesh sugesh.chandran at intel.com
Mon Sep 11 10:00:06 UTC 2017



Regards
_Sugesh


> -----Original Message-----
> From: Yuanhan Liu [mailto:yliu at fridaylinux.org]
> Sent: Monday, September 11, 2017 10:12 AM
> To: Chandran, Sugesh <sugesh.chandran at intel.com>
> Cc: dev at openvswitch.org
> Subject: Re: [ovs-dev] [PATCH v2 0/8] OVS-DPDK flow offload with rte_flow
> 
> On Sun, Sep 10, 2017 at 04:12:47PM +0000, Chandran, Sugesh wrote:
> > Hi Yuanhan,
> >
> > Thank you for sending out the patch series.
> 
> Hi Sugesh,
> 
> Thank you for taking the time to review it!
> 
> 
> > We are also looking into something similar to enable full offload in
> > OVS-DPDK.
> 
> Good to know!
> 
> > It is based on 'http://dpdk.org/ml/archives/dev/2017-September/074746.html'
> > and some other rte_flow extensions in DPDK.
> 
> I saw the patches; I will take some time to read them.
[Sugesh] Sure.
> 
> >
> > It is noted that the patch series doesn't work very well for some of our
> > requirements.
> > Please find the high-level comments below. I have also provided specific
> > comments on the individual patches.
> >
> > 1) It looks to me like the patch series enables/uses just one NIC
> > functionality (the MARK action). In an environment with multiple kinds of
> > hardware, it is necessary to have a feature discovery mechanism to decide
> > what gets installed in the hardware based on its capabilities, e.g.
> > MARK+QUEUE, MARK only, the number of supported flow entries, the supported
> > flow fields, etc. This is very important for supporting different hardware
> > NICs and making flow install easy.
> 
> Yes, you are right. I have also observed this issue while coding this patch.
[Sugesh] Ok.
> 
> > In our implementation we have feature discovery at OVS init. It also
> > populates the OVSDB to expose the device capabilities to higher management
> > layers. The new OVSDB table looks like the one below.
> 
> The solution I want to pursue, however, is different. I was thinking of
> introducing a few DPDK rte_flow APIs and structs to describe the NIC flow
> capabilities.
[Sugesh] Technically rte_flow is for flow programming, not for device capabilities.
That said, if DPDK can add such an API to rte_flow, I think it should be fine.
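For what it's worth, until such an API exists, one way to probe support at
init is to validate (not create) a minimal rule per action combination. A
rough sketch, assuming the existing rte_flow_validate() call; the helper
name and the probed rule are illustrative only:

#include <stdbool.h>
#include <rte_flow.h>

/* Illustrative sketch: ask the PMD whether it would accept a MARK + QUEUE
 * rule on this port, without actually installing anything. */
static bool
port_supports_mark_queue(uint16_t port_id, uint16_t rxq)
{
    struct rte_flow_attr attr = { .ingress = 1 };
    struct rte_flow_item pattern[] = {
        { .type = RTE_FLOW_ITEM_TYPE_ETH },
        { .type = RTE_FLOW_ITEM_TYPE_END },
    };
    struct rte_flow_action_mark mark = { .id = 1 };
    struct rte_flow_action_queue queue = { .index = rxq };
    struct rte_flow_action actions[] = {
        { .type = RTE_FLOW_ACTION_TYPE_MARK,  .conf = &mark },
        { .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue },
        { .type = RTE_FLOW_ACTION_TYPE_END },
    };
    struct rte_flow_error error;

    /* 0 means the PMD would accept the rule; a negative errno means not. */
    return rte_flow_validate(port_id, &attr, pattern, actions, &error) == 0;
}

The same probe could be repeated with MARK only, or with other patterns, to
build up a per-port capability picture at init.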
> 
> I think this would help in the long run, as the capabilities would be updated
> as new features are added (when new versions are released). With the solution
> you proposed, OVS won't be able to work with multiple DPDK versions (assuming
> they provide different rte_flow capabilities).
> 
> >   <table name="hw_offload">
> >     <p>
> >       Hardware switching configuration and capabilities.
> >     </p>
> >     <column name="name">
> >       The name of hardware acceleration device.
> >     </column>
> >     <column name="dev_id" type='{"type": "integer", "minInteger": 0, "maxInteger": 7}'>
> >       The integer device id of hardware accelerated NIC.
> >     </column>
> >      <column name="pci_id" type='{"type": "string"}'>
> >       The PCI ID of the hardware acceleration device. The broker id/PF id.
> >      </column>
> >      <column name="features" key="n_vhost_ports" type='{"type": "integer"}'>
> >       The number of supported vhost ports in the hardware switch.
> >      </column>
> >   </table>
> >
> > The features column can be extended with more fields as necessary.
> > IMO the proposed partial offload doesn't need to populate the OVSDB;
> > however, it is necessary to have some kind of feature discovery at init.
> >
> > 2) I feel it's better to keep the hardware offload functionality in netdev
> > as much as possible, similar to the kernel implementation. I see changes in
> > upcall and dpif.
> 
> I agree with you. But unfortunately, due to some driver or hardware
> limitations, that's the best I can get.
[Sugesh] Ok. 
> 
> > 3) The cost of flow install. PMDs are blocked while a hardware flow install
> > is happening. This is an issue when a lot of short-lived flows are being
> > installed in the DP.
> 
> I wasn't aware of it. Thank you for letting me know that!
> 
[Sugesh] Ok
> > One option to handle this would be to move the flow install into revalidate.
> > The advantage of this approach is that hardware offload would happen only
> > when a flow has been in use for at least some time, similar to how the
> > revalidator thread handles the flow modify operation.
> 
> Yes, it sounds workable. However, the MARK and QUEUE workaround won't
> work then: we need to record the rxq first. And again, I know the workaround
> is far from perfect.
> 
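For context, the receive-side half of that workaround reads the mark back
from the mbuf; a rough sketch below (PKT_RX_FDIR_ID and hash.fdir.hi are the
standard DPDK mbuf fields, the helper itself is illustrative):

#include <stdbool.h>
#include <stdint.h>
#include <rte_mbuf.h>

/* Illustrative sketch: recover the flow mark set by a matched MARK rule.
 * PMDs report it via the PKT_RX_FDIR_ID flag and mbuf->hash.fdir.hi. */
static bool
packet_get_flow_mark(const struct rte_mbuf *m, uint32_t *mark)
{
    if (m->ol_flags & PKT_RX_FDIR_ID) {
        *mark = m->hash.fdir.hi;
        return true;    /* mark found: flow can be looked up directly */
    }
    return false;       /* no mark: fall back to the software classifier */
}

The QUEUE action steers matched packets to a known rxq, which is why the rxq
has to be recorded before the rule can be installed.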
> > 4) AFAIK, this hardware programmability is per NIC, not per port; i.e. the
> > FDIR/RSS hash configuration is device specific. Won't this be an issue if a
> > NIC is shared between kernel and DPDK drivers?
> 
> That might be NIC specific. What do you mean by sharing between kernel
> and DPDK? In most NICs I'm aware of, it's required to unbind the kernel
> driver first. Thus, it won't be shared. For Mellanox, the control unit is based
> on queues, thus it could be shared correctly.
[Sugesh] What I meant by that is, consider a NIC with 4x10G ports,
2 ports bound to DPDK and 2 ports to the kernel.
If I remember correctly, the XL710 NIC can support a total of 8k exact-match
flow entries in its flow director. Similarly, some other resources are shared
across all the ports of the NIC. How are these resources properly managed
between the kernel and DPDK?
I agree that for Mellanox NICs this should be fine, but I'm not sure it works
on all the NICs out there. Changes to any global configuration made by one
side will adversely affect the other.
> 
> 	--yliu

