[ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with rte_flow

Yuanhan Liu yliu at fridaylinux.org
Tue Feb 6 09:10:28 UTC 2018


On Thu, Feb 01, 2018 at 01:35:15AM -0200, Flavio Leitner wrote:
> On Mon, Jan 29, 2018 at 02:59:42PM +0800, Yuanhan Liu wrote:
> > Hi,
> > 
> > Here is a joint work from Mellanox and Napatech, to enable the flow hw
> > offload with the DPDK generic flow interface (rte_flow).
> > 
> > The basic idea is to associate the flow with a mark id (a unit32_t number).
> > Later, we then get the flow directly from the mark id, which could bypass
> > some heavy CPU operations, including but not limiting to mini flow extract,
> > emc lookup, dpcls lookup, etc.
> > 
> > The association is done with CMAP in patch 1. The CPU workload bypassing
> > is done in patch 2. The flow offload is done in patch 3, which mainly does
> > two things:
> > 
> > - translate the ovs match to DPDK rte flow patterns
> > - bind those patterns with a RSS + MARK action.
> > 
> > Patch 5 makes the offload work happen in another thread, for leaving the
> > datapath as light as possible.
> > 
> > A PHY-PHY forwarding with 1000 mega flows (udp,tp_src=1000-1999) and 1
> > million streams (tp_src=1000-1999, tp_dst=2000-2999) show more than 260%
> > performance boost.
> > 
> > Note that it's disabled by default, which can be enabled by:
> 
> Hi,
> 
> First of all, thanks for working on this feature.
> 
> I have some general comments I'd like to discuss before going deeper
> on it.
> 
> The documentation is too simple.  It should mention the HW requirements
> in order to use this feature.

It listed the NICs that support this feature.

>  Also some important limitations, like no
> support for IP frags, MPLS or conntrack, for instance.

Yes, that could be added.

> It seems it would be possible to leave the HW offloading code outside
> of dpif-netdev.c which is quite long already. I hope it will improve
> isolation and code clarity too.

Yes. My thoughts was we can do it later, when we are going to add some
other flow models (say, full offload) in future. Moreover, it seems
that it's a custom in OVS to have long source code files. I was just
following it.

But I hve no objections to break it. Ian, should I break it now?
What's your plan to merge it?

> So far there is no synchronization between PMDs in the fast path.
> However, we got a new mutex to sync PMDs and a new thread to manage.
> Even without the patch adding the thread, there would be a new mutex
> in the fast path.  It seems the slow path today causes issues, so maybe
> the whole upcall processing could be pushed to another thread. I
> realize this is outside of the scope of this patchset, but it is
> something we should consider.
> 
> As an alternative solution, maybe we could use a DPDK ring to have a
> lockless way to push flows to the auxiliary thread.

It could be an option, but it brings a hard dependency on DPDK. Note that
the netdev datapath could leave without DPDK, IIUC.

> There are some memory allocations and deallocations in the fast path
> using OVS functions.  Perhaps it is better to use rte_* functions
> instead (another reason to split the code out of dpif-netdev.c)

Why do you think the one from DPDK is better? Also, it brings the DPDK
dependency to the netdev datapth.

> I am curious to know why there is no flow dump or flush?

Not quite sure I got it.

> The function to help debugging (dump_flow_pattern) should use an
> initial condition to return asap if debug is not enabled.
> E.g.:
>     if (VLOG_DROP_DBG(rl)) {
>         return;
>     }   

Good tip. Thanks!

> I am still wrapping my head around the RSS+MARK action and rte_flow
> API, so I can't really comment those yet.

Thanks for looking at it!

	--yliu


More information about the dev mailing list