[ovs-dev] [PATCH 0/7 RFC] OVS-DPDK flow offload with rte_flow

Darrell Ball dball at vmware.com
Wed Aug 30 01:32:10 UTC 2017



On 8/29/17, 4:55 AM, "Yuanhan Liu" <yliu at fridaylinux.org> wrote:

    On Tue, Aug 29, 2017 at 07:11:42AM +0000, Darrell Ball wrote:
    > 
    >     On 8/22/17, 11:24 PM, "Yuanhan Liu" <yliu at fridaylinux.org> wrote:
    >     
    >         Hi,
    >         
    >         Here is a joint work from Mellanox and Napatech, to enable the flow hw
    >         offload with the DPDK generic flow interface (rte_flow).
    >         
    >         The basic idea is to associate the flow with a mark id (a uint32_t number).
    >         Later, we then get the flow directly from the mark id, bypassing the heavy
    >         emc processing, including miniflow_extract.
    >         
    >         The association is done with CMAP in patch 1. It also reuses the flow
    >         APIs introduced while adding the tc offloads. The emc bypassing is done
    >         in patch 2. The flow offload is done in patch 4, which mainly does two
    >         things:
    >         
    >         - translate the ovs match to DPDK rte flow patterns
    >         - bind those patterns with a MARK action.
    >         
    >         Afterwards, the NIC will set the mark id in every pkt's mbuf when it
    >         matches the flow. That's basically how we could get the flow directly
    >         from the received mbuf.
    >         
    >         While testing with PHY-PHY forwarding with one core and one queue, I got
    >         almost 80% performance boost. For PHY-vhost forwarding, I got about 50%
    >         performance boost.
    >         
    >         
    >         That being said, this patchset still has unresolved issues. The
    >         major issue is that most NICs (for instance, Mellanox and Intel)
    >         probably cannot support a pure MARK action. It has to be used together
    >         with a QUEUE action, which in turn needs a queue index. That leads to the
    >         issue: the queue index is not given in the flow context. To make it work,
    >         patch 5 just sets the queue index to 0, which is obviously wrong. One
    >         possible solution is to record the rxq and pass it down to the flow
    >         creation stage. That would be much better, but it is still far from
    >         perfect, because it might stealthily change the steering rules, which
    >         may break the default RSS setup by OVS-DPDK.
    > 
    > If this cannot be solved by removing this restriction, I guess another alternative is to actively
    > manage flow-queue associations.
    
    Do you mean letting the user provide the set_queue action?

I mean that in the worst case, if the restriction cannot be lifted, we might need to do flow distribution
across queues with additional logic/tracking, because we would not really want to keep this restriction
without a workaround. This would be a fair bit of work, however.
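
To make the "additional logic/tracking" concrete, here is a minimal sketch in C. It assumes a fixed-size table indexed by mark id and a plain round-robin choice of rx queue; both are illustrative assumptions, not what the patchset does. All names (offload_table, offload_table_add, offload_table_lookup, MAX_MARKS) are hypothetical and do not come from OVS or DPDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_MARKS 1024            /* hypothetical table size */
#define INVALID_QUEUE UINT16_MAX  /* returned when no queue is assigned */

/* One slot per mark id: which flow it maps to and which rxq it was steered to. */
struct mark_entry {
    void *flow;     /* opaque pointer to the datapath flow */
    uint16_t rxq;   /* rx queue that would go into the QUEUE action */
    int in_use;
};

struct offload_table {
    struct mark_entry entries[MAX_MARKS];
    uint16_t n_rxq;     /* number of rx queues on the port */
    uint16_t next_rxq;  /* next queue to hand out (round-robin) */
};

static inline void
offload_table_init(struct offload_table *t, uint16_t n_rxq)
{
    assert(n_rxq > 0);
    memset(t, 0, sizeof *t);
    t->n_rxq = n_rxq;
}

/* Records the mark -> flow association and picks a queue round-robin.
 * Returns the chosen queue, or INVALID_QUEUE if the mark is out of range
 * or already taken. */
static inline uint16_t
offload_table_add(struct offload_table *t, uint32_t mark, void *flow)
{
    struct mark_entry *e;

    if (mark >= MAX_MARKS || t->entries[mark].in_use) {
        return INVALID_QUEUE;
    }
    e = &t->entries[mark];
    e->flow = flow;
    e->rxq = t->next_rxq;
    e->in_use = 1;
    t->next_rxq = (uint16_t) ((t->next_rxq + 1) % t->n_rxq);
    return e->rxq;
}

/* Returns the flow associated with a mark, or NULL if there is none. */
static inline void *
offload_table_lookup(const struct offload_table *t, uint32_t mark)
{
    if (mark >= MAX_MARKS || !t->entries[mark].in_use) {
        return NULL;
    }
    return t->entries[mark].flow;
}
```

Round-robin is only a placeholder policy here; a real implementation would presumably want the assignment to stay close to what RSS would have picked.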

Alternatively, pushing this to the user in the general case may be too much overhead; what do you think?
As a user specification, it could be optional, however.

    
    >         
    >         The reason I still want to send it out is to get more comments/thoughts
    >         from the community on this whole patchset. Meanwhile, I will try to resolve
    >         the QUEUE action issue.
    >         
    >         Note that it's disabled by default, which can be enabled by:
    >         
    >             $ ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
    > 
    > Maybe per in-port configuration would alleviate the issue to a certain degree.
    
    Yes, it could be done. I chose it for the following reasons:
    
    - the option is already there, used by tc offloads.
    - it also simplifies the (first) patchset a bit, IMO.

Of course, I understand.
    
    However, I'm okay with making it per port. What's your suggestion for
    this? Making "hw-offload" per port, or introducing another one? If so,
    what's your suggestion on the naming?
    
I am not suggesting dropping the global configuration.
Mainly, we would ‘consider’ additional per-interface configuration because of the QUEUE action
restriction we are discussing. It would reduce the scope of the queue remapping that HWOL introduces,
relative to what RSS would yield.
I would expect that if such configuration were done, typically multiple ports would be
configured, since traffic flows bidirectionally at least.

If we were to do this, one of the possibilities would be something like:
ovs-vsctl set Interface dpdk0 other_config:hw-offload=true

    Thanks for the review. BTW, would you please add me in 'to' or 'cc'
    list while replying to me?  Otherwise, it's easy to get missed: too
    many emails :/

of course

    
    	--yliu
    




