[ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with rte_flow

Stokes, Ian ian.stokes at intel.com
Tue Feb 6 21:34:36 UTC 2018


> -----Original Message-----
> From: Yuanhan Liu [mailto:yliu at fridaylinux.org]
> Sent: Tuesday, February 6, 2018 9:10 AM
> To: Flavio Leitner <fbl at sysclose.org>; Stokes, Ian <ian.stokes at intel.com>
> Cc: dev at openvswitch.org; Simon Horman <simon.horman at netronome.com>
> Subject: Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with rte_flow
> 
> On Thu, Feb 01, 2018 at 01:35:15AM -0200, Flavio Leitner wrote:
> > On Mon, Jan 29, 2018 at 02:59:42PM +0800, Yuanhan Liu wrote:
> > > Hi,
> > >
> > > Here is a joint work from Mellanox and Napatech, to enable the flow
> > > hw offload with the DPDK generic flow interface (rte_flow).
> > >
> > > The basic idea is to associate the flow with a mark id (a unit32_t
> number).
> > > Later, we then get the flow directly from the mark id, which could
> > > bypass some heavy CPU operations, including but not limiting to mini
> > > flow extract, emc lookup, dpcls lookup, etc.
> > >
> > > The association is done with CMAP in patch 1. The CPU workload
> > > bypassing is done in patch 2. The flow offload is done in patch 3,
> > > which mainly does two things:
> > >
> > > - translate the ovs match to DPDK rte flow patterns
> > > - bind those patterns with a RSS + MARK action.
> > >
> > > Patch 5 makes the offload work happen in another thread, for leaving
> > > the datapath as light as possible.
> > >
> > > A PHY-PHY forwarding with 1000 mega flows (udp,tp_src=1000-1999) and
> > > 1 million streams (tp_src=1000-1999, tp_dst=2000-2999) show more
> > > than 260% performance boost.
> > >
> > > Note that it's disabled by default, which can be enabled by:
> >
> > Hi,
> >
> > First of all, thanks for working on this feature.
> >
> > I have some general comments I'd like to discuss before going deeper
> > on it.
> >
> > The documentation is too simple.  It should mention the HW
> > requirements in order to use this feature.
> 
> It listed the NICs that support this feature.
> 
> >  Also some important limitations, like no support for IP frags, MPLS
> > or conntrack, for instance.
> 
> Yes, that could be added.
> 
> > It seems it would be possible to leave the HW offloading code outside
> > of dpif-netdev.c which is quite long already. I hope it will improve
> > isolation and code clarity too.
> 
> Yes. My thoughts was we can do it later, when we are going to add some
> other flow models (say, full offload) in future. Moreover, it seems that
> it's a custom in OVS to have long source code files. I was just following
> it.
> 
> But I hve no objections to break it. Ian, should I break it now?
> What's your plan to merge it?

Hi Yuanhan, I believe I gave the same feedback on the first review pass I did. Previously I was happy to hold off on this work in order to make it into the 2.9 release but as that window has passed I think it makes sense to break it out now. It will save the refactor work later.

I should have more time later this week to continue reviewing, I'll provide some more feedback on the rest of the patchset then.

Thanks
Ian


> 
> > So far there is no synchronization between PMDs in the fast path.
> > However, we got a new mutex to sync PMDs and a new thread to manage.
> > Even without the patch adding the thread, there would be a new mutex
> > in the fast path.  It seems the slow path today causes issues, so
> > maybe the whole upcall processing could be pushed to another thread. I
> > realize this is outside of the scope of this patchset, but it is
> > something we should consider.
> >
> > As an alternative solution, maybe we could use a DPDK ring to have a
> > lockless way to push flows to the auxiliary thread.
> 
> It could be an option, but it brings a hard dependency on DPDK. Note that
> the netdev datapath could leave without DPDK, IIUC.
> 
> > There are some memory allocations and deallocations in the fast path
> > using OVS functions.  Perhaps it is better to use rte_* functions
> > instead (another reason to split the code out of dpif-netdev.c)
> 
> Why do you think the one from DPDK is better? Also, it brings the DPDK
> dependency to the netdev datapth.
> 
> > I am curious to know why there is no flow dump or flush?
> 
> Not quite sure I got it.
> 
> > The function to help debugging (dump_flow_pattern) should use an
> > initial condition to return asap if debug is not enabled.
> > E.g.:
> >     if (VLOG_DROP_DBG(rl)) {
> >         return;
> >     }
> 
> Good tip. Thanks!
> 
> > I am still wrapping my head around the RSS+MARK action and rte_flow
> > API, so I can't really comment those yet.
> 
> Thanks for looking at it!
> 
> 	--yliu


More information about the dev mailing list