[ovs-dev] [PATCH v5 0/5] OVS-DPDK flow offload with rte_flow

Fri Jan 12 09:24:17 UTC 2018

>-----Original Message-----
>From: Yuanhan Liu [mailto:yliu at fridaylinux.org]
>Sent: 11. januar 2018 15:11
>To: Chandran, Sugesh <sugesh.chandran at intel.com>
>Cc: Finn Christensen <fc at napatech.com>; dev at openvswitch.org; Darrell Ball
><dball at vmware.com>; Simon Horman <simon.horman at netronome.com>;
>Stokes, Ian <ian.stokes at intel.com>
>Subject: Re: [PATCH v5 0/5] OVS-DPDK flow offload with rte_flow
>
>On Wed, Jan 10, 2018 at 02:38:13PM +0000, Chandran, Sugesh wrote:
>> Hi Yuanhan,
>>
>> Thank you for your reply!..and for the wishes too :)
>>
>> At very high level, my comments are on the approach taken for enabling
>hardware acceleration in OVS-DPDK. In general, the proposal works very well
>for the specific partial offload use case. i.e alleviate the miniflow extract and
>optimize the megaflow lookup. I totally agree on the performance advantage
>aspect of it. However I expect the hardware acceleration in OVS-DPDK should
>be in much more broad scope than just limited to a flow lookup optimization. I
>had raised a few comments around these point earlier when the previous
>versions of patches were released.
>>
>> Once again I am sharing my thoughts here to initiate a discussion on
>> this area. In my point of view, hardware acceleration in OVS-DPDK will
>> be,
>
>Thank you for sharing it!
>
>> 1)	Must understand the hardware and its capabilities. I would
>prefer a proactive hardware management than the reactive model. This
>approach let the software collect the relevant hardware information in
>advance before interacting with it. It will have impact on all the port specific
>operations including port init, device programming and etc.
>
>Yes, I agree, and I think we have also discussed it before. Likely, doing probes
>proactively seems to work.
>
>> 2)	Define hardware acceleration modes to support different
>acceleration methods. It will allow to work with various hardware devices
>based on its capabilities.

For 1) and 2):
I completely agree. I think this will be a key issue in full offload. The challenges
that we see with the next full-offload step are:
a. Knowing, on a per-port basis, which actions can be offloaded.
b. Based on this information (and if the output action is offloadable), full offload may be tried.
c. Falling back to partial offload if full offload is not possible.
Thus, how do we choose?
However, I think the first full-offload patch could simply try full offload and, if that fails, do partial?
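
To make the "try full, then partial, then none" idea concrete, here is a rough sketch in plain C. All names in it (port_caps, port_offload_ops, try_offload, the demo_* stubs) are made up for illustration and are not existing OVS or DPDK API; the capability flags are assumed to have been collected per port in advance.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; none of this is OVS or DPDK API. */
enum offload_mode { OFFLOAD_NONE, OFFLOAD_PARTIAL, OFFLOAD_FULL };

/* Per-port capabilities, assumed to be probed before any flow is installed. */
struct port_caps {
    bool output_offloadable;   /* Can the NIC execute the output action? */
    bool mark_rss_supported;   /* Can the NIC do mark + RSS (partial offload)? */
};

/* Per-port install hooks; each returns 0 on success, non-zero on failure. */
struct port_offload_ops {
    int (*install_full)(const void *flow);
    int (*install_partial)(const void *flow);
};

/* Try full offload first, fall back to partial, and finally to none. */
enum offload_mode
try_offload(const struct port_caps *caps, const struct port_offload_ops *ops,
            const void *flow)
{
    assert(caps != NULL && ops != NULL);

    if (caps->output_offloadable && ops->install_full
        && ops->install_full(flow) == 0) {
        return OFFLOAD_FULL;
    }
    if (caps->mark_rss_supported && ops->install_partial
        && ops->install_partial(flow) == 0) {
        return OFFLOAD_PARTIAL;
    }
    return OFFLOAD_NONE;       /* Flow stays in the software datapath. */
}

/* Stand-in hooks for experimenting; a real port would program the NIC. */
int demo_install_ok(const void *flow)   { (void) flow; return 0; }
int demo_install_fail(const void *flow) { (void) flow; return -1; }
```

With demo_install_fail as the full hook and demo_install_ok as the partial hook, try_offload() ends up in partial mode, which is the fallback behaviour described above. Whether the decision should be driven by capabilities known up front or simply by the install failing is exactly the open question.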

>> 3)	Its nice to have to make OVS software to use device + port
>model. This is helpful when enumerating device bound characteristics (such as
>supported protocols, tcam size, and etc). Also it helps when handling flows
>that span across different hardware acceleration devices.
>> 4)	Having device model as mentioned in (3) helps OVS to
>distribute hardware flow install into multiple threads than just single thread in
>the current implementation. May be its possible to have one flow thread per
>device to avoid the contention.
>
>Single thread is chosen for simplicity, and I do think it could be extended to
>multiple threads when that becomes necessary.
>
>> 5)	The action translation and flow validation for the hardware
>acceleration has to be device centric. So having one set of functions may not
>work for other use cases. Consider another partial hardware acceleration
>approach 'zero copy' where packets are copied directly to the VM queues by
>the NIC. For zero copy acceleration,  the actions cannot be 'mark and RSS '
>instead it  will be specific queue action. Similarly different acceleration mode
>will have its own validations and set of actions. May be its good to have
>function hooks for these operation based on acceleration mode.
>
>Sure, the ideal case would be to choose different flow actions (or combinations)
>based on the different models. And I think it would be easy to make such a change.
>
>>
>> Considering there are no DPDK support available for handle most of the
>above points, I don't mind pushing this patch into OVS mainline. However I
>would expect a refactoring of that code in the future to make it generic
>enough for supporting different use cases and modes. By having a generic
>approach it is possible to support various partial acceleration & full
>acceleration modes based on the available hardware.
>>
>
>Agree. And that's also what I have been thinking of. It's also easier to make
>decisions when we are at the step of doing real extension (say, adding the full
>offload).

Agree, we need to do this stepwise.

Here are some of my thoughts:
We see the next full-offload step potentially containing (roughly):
a. Do full offload if possible, with fallback to partial (and finally none).
b. Differentiate between the maps of fully offloaded flows and partially offloaded flows.
c. Add a task to the hw-offload thread to periodically call RTE FLOW query for fully
   offloaded flows and update the megaflow cache statistics accordingly.
d. Let virtual ports in HW be represented by a normal dpdk port, as a representor port.
We have this running in the lab at an early stage, but it is not yet updated to the latest
partial-offload patchset. Further, small additions to RTE FLOW are needed.
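
As a rough illustration of the periodic statistics task for fully offloaded flows, here is a sketch in plain C. The names (hw_counters, full_offload_flow, hw_query_fn, poll_full_offload_stats, demo_query) are hypothetical, the query callback only stands in for whatever would wrap the RTE FLOW query call, and the hardware counters are assumed to be cumulative rather than reset on read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; none of this is OVS or DPDK API. */

/* Cumulative counters as reported by the hardware (assumption: no reset). */
struct hw_counters {
    uint64_t packets;
    uint64_t bytes;
};

/* One entry in the (hypothetical) map of fully offloaded flows. */
struct full_offload_flow {
    void *hw_handle;           /* Opaque handle of the flow in hardware. */
    struct hw_counters last;   /* Counters seen at the previous poll. */
    uint64_t sw_packets;       /* Stand-in for the megaflow statistics. */
    uint64_t sw_bytes;
};

/* Stand-in for a wrapper around the hardware query; returns 0 on success. */
typedef int (*hw_query_fn)(void *hw_handle, struct hw_counters *out);

/* One polling pass: add the delta since the last poll to the software stats.
 * Returns the number of flows that were successfully updated. */
size_t
poll_full_offload_stats(struct full_offload_flow *flows, size_t n,
                        hw_query_fn query)
{
    size_t updated = 0;

    assert(query != NULL);
    for (size_t i = 0; i < n; i++) {
        struct hw_counters now;

        if (query(flows[i].hw_handle, &now) != 0) {
            continue;          /* Query failed; retry on the next pass. */
        }
        flows[i].sw_packets += now.packets - flows[i].last.packets;
        flows[i].sw_bytes += now.bytes - flows[i].last.bytes;
        flows[i].last = now;
        updated++;
    }
    return updated;
}

/* Demo query that treats the handle as a pointer to the counters. */
int
demo_query(void *hw_handle, struct hw_counters *out)
{
    if (hw_handle == NULL) {
        return -1;
    }
    *out = *(struct hw_counters *) hw_handle;
    return 0;
}
```

Only flows in the full-offload map would need this pass, since partially offloaded flows still traverse the software datapath and are counted there.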

Regards,
Finn

>
>	--yliu