[ovs-dev] knowing when a kernel flow is really deleted

Ben Pfaff blp at nicira.com
Thu Dec 15 20:34:51 UTC 2011


On Thu, Dec 15, 2011 at 11:42:21AM -0800, Jesse Gross wrote:
> >> >    3. Somehow actually eliminate the problem with deleting flows,
> >> >       so that when userspace receives the response to the flow
> >> >       deletion we know that no more packets can go through the
> >> >       flow.  I don't know how to do this efficiently.
> >>
> >> I'm not sure that this is really a question of efficiency, so much as
> >> it is complexity.  Basically you have to make userspace able to
> >> tolerate blocking while the flow is deleted and then use
> >> synchronize_rcu when removing flows.  Presumably this would mean that
> >> you need some kind of worker threads.
> >
> > synchronize_rcu() is the obvious solution but the efficiency I was
> > worried about is being able to delete flows at a reasonable pace.
> > Wouldn't using synchronize_rcu() throttle back the speed of flow
> > deletion to an unacceptable rate?  I remember from a long time ago
> > that synchronize_rcu() can be ridiculously slow.  Oh yeah, here's the
> > log message (which might be so old that it's not in the current OVS
> > repo, not sure).  It's referring to testing that we did on whatever
> > was the current XenServer version at the time, with a 2.6.18-based
> > kernel:
> 
> Yeah, that was why I was saying that you probably need a pool of
> worker threads that can block for a long time and then do whatever is
> associated with deleting the flow (I'm not saying that this is a good
> solution, just how I think this approach would look like).

I'm not sure that a pool of worker threads would actually help.  At
least the naive kernel side of the solution would be to call
synchronize_rcu() holding genl_lock, which means that only one thread
in the pool could make progress and all the other threads would just
be stuck waiting for it.  The main thread would block too if it made
any genl calls, even ones that don't delete flows.
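
To make the serialization argument concrete, here is a rough userspace
model, not kernel code.  It estimates when the last of a batch of flow
deletions completes if each deletion waits out one grace period, either
while holding a single global lock (the role genl_lock plays above) or
after dropping it.  The function name, the fixed grace period, and the
assumption that lock hold time is otherwise negligible are all made up
for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_WORKERS 64

/* Hypothetical model: returns the simulated time, in ms, at which the last
 * of 'n_flows' deletions completes when 'n_workers' threads each wait one
 * grace period of 'grace_ms' per deletion.  Returns -1 if 'n_workers' is
 * out of range.
 *
 * If 'wait_under_lock' is true, the grace period is waited out while
 * holding one global lock, so only one worker makes progress at a time. */
static long
simulate_flow_deletion(size_t n_flows, size_t n_workers, long grace_ms,
                       bool wait_under_lock)
{
    long worker_free[MAX_WORKERS] = { 0 };  /* When each worker is idle. */
    long lock_free = 0;                     /* When the lock is released. */
    long done = 0;
    size_t i, w;

    assert(grace_ms >= 0);
    if (n_workers == 0 || n_workers > MAX_WORKERS) {
        return -1;
    }

    for (i = 0; i < n_flows; i++) {
        size_t best = 0;
        long start, finish;

        /* Hand the next deletion to the worker that becomes idle first. */
        for (w = 1; w < n_workers; w++) {
            if (worker_free[w] < worker_free[best]) {
                best = w;
            }
        }

        start = worker_free[best];
        if (wait_under_lock && lock_free > start) {
            /* Someone else still holds the lock: wait for it. */
            start = lock_free;
        }
        finish = start + grace_ms;
        if (wait_under_lock) {
            /* The lock stays held for the whole grace period. */
            lock_free = finish;
        }

        worker_free[best] = finish;
        if (finish > done) {
            done = finish;
        }
    }
    return done;
}
```

With an assumed 10 ms grace period, 100 deletions finish at 1000 ms in
this model whether the pool has 1 worker or 8 when the wait happens
under the lock, versus 130 ms with 8 workers when it does not.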

A workaround would be to call synchronize_rcu() and send the genl
reply from some context that doesn't hold genl_lock, but that's hardly
elegant.  Also it means that the real reply would come after the one
generated automatically by af_netlink.c when NLM_F_ACK is used, which
violates the normal rules.
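
The ordering problem can be illustrated with a toy client, again only a
sketch under assumptions.  The message kinds below are hypothetical
stand-ins, not the real NLMSG_* constants, and the client is assumed to
treat the ACK as the end of the transaction, which is safe only when
the handler's reply goes out before the automatic ACK.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message kinds standing in for what a client reads from a
 * netlink socket; these are not the real NLMSG_* values. */
enum msg_kind {
    MSG_REPLY,  /* Data reply produced by the request handler. */
    MSG_ACK     /* Acknowledgment sent automatically for NLM_F_ACK. */
};

/* Models a client that stops reading at the first ACK.  Returns the number
 * of replies seen before that ACK and stores in '*leftover' the number of
 * messages still unread after it. */
static size_t
replies_before_ack(const enum msg_kind *msgs, size_t n, size_t *leftover)
{
    size_t replies = 0;
    size_t i;

    assert(leftover != NULL);
    assert(msgs != NULL || n == 0);

    *leftover = 0;
    for (i = 0; i < n; i++) {
        if (msgs[i] == MSG_ACK) {
            *leftover = n - i - 1;
            break;
        }
        replies++;
    }
    return replies;
}
```

Fed the usual order (reply, then ACK), this client sees one reply and
leaves nothing unread.  Fed the deferred order (ACK, then reply), it
sees no reply and leaves one message behind on the socket.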
