[ovs-dev] knowing when a kernel flow is really deleted

Jesse Gross jesse at nicira.com
Thu Dec 15 23:24:38 UTC 2011


On Thu, Dec 15, 2011 at 1:34 PM, Ben Pfaff <blp at nicira.com> wrote:
> On Thu, Dec 15, 2011 at 01:00:00PM -0800, Jesse Gross wrote:
>> On Thu, Dec 15, 2011 at 12:34 PM, Ben Pfaff <blp at nicira.com> wrote:
>> > A workaround would be to call synchronize_rcu() and send the genl
>> > reply from some context that doesn't hold genl_lock, but that's hardly
> elegant.  Also it means that the real reply would come after the one
>> > generated automatically by af_netlink.c when NLM_F_ACK is used, which
>> > violates the normal rules.
>>
>> Isn't that almost exactly the same as sending the message from the RCU
>> callback (including the Netlink ordering issue)?
>
> In the "send from RCU callback" case, I intended that the normal
> Netlink reply would be sent synchronously just after deletion, just as
> it is now.  The ordering would therefore stay just as it is now.  Only
> the broadcast reply would be deferred.

What do you consider the "normal Netlink reply"?  There are basically
3 different pieces:

 * Multicast flow information to the relevant group (excluding the
requester if NLM_F_ECHO is specified).
 * Unicast flow information to the requester if NLM_F_ECHO is specified.
 * Unicast Netlink acknowledgement if NLM_F_ACK is specified or there
was an error.

Are you talking about doing the last two as-is and moving the
multicast to the flow reclamation (but actually sending it to everyone
at that time)?
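
For reference, a rough sketch of how those three pieces are produced
synchronously by a genl delete handler (illustrative only, not the actual
datapath code: build_flow_reply() and flow_mcgroup are hypothetical, and the
genlmsg_* signatures vary across kernel versions):

static int flow_del_sketch(struct sk_buff *skb, struct genl_info *info)
{
        struct sk_buff *reply;
        u32 exclude;

        /* ... unlink the flow from the flow table here ... */

        /* 1. Multicast the flow information to the flow group,
         *    excluding the requester when NLM_F_ECHO is set (it gets
         *    a unicast copy below instead). */
        reply = build_flow_reply(info);                 /* hypothetical */
        if (IS_ERR(reply))
                return PTR_ERR(reply);
        exclude = (info->nlhdr->nlmsg_flags & NLM_F_ECHO) ? info->snd_pid : 0;
        genlmsg_multicast(reply, exclude, flow_mcgroup, GFP_KERNEL);

        /* 2. Unicast the same information back to the requester when
         *    NLM_F_ECHO was set on the request. */
        if (info->nlhdr->nlmsg_flags & NLM_F_ECHO) {
                reply = build_flow_reply(info);         /* fresh skb */
                if (!IS_ERR(reply))
                        genlmsg_reply(reply, info);
        }

        /* 3. The plain Netlink ack for NLM_F_ACK (or an error report)
         *    is generated by af_netlink.c from this return value. */
        return 0;
}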

> The "use synchronize_rcu() on the side then send the reply" case would
> change the message ordering.

I guess I don't see why you couldn't use exactly the same strategy as
you chose above.
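
(For concreteness, deferring only the multicast might look roughly like the
sketch below.  It assumes struct sw_flow already embeds a struct rcu_head
named 'rcu'; flow_send_notification() and flow_free() are hypothetical
helpers.  One wrinkle is that call_rcu() callbacks run in softirq context,
so any allocation there would need GFP_ATOMIC or punting to a workqueue.)

static void flow_notify_rcu(struct rcu_head *rcu)
{
        struct sw_flow *flow = container_of(rcu, struct sw_flow, rcu);

        /* Runs after a grace period, once no fast-path reader can
         * still hold a reference to the deleted flow. */
        flow_send_notification(flow);   /* hypothetical: build + multicast */
        flow_free(flow);                /* hypothetical: final free */
}

/* Delete path, after unlinking the flow and sending the synchronous
 * unicast reply/ack: */
call_rcu(&flow->rcu, flow_notify_rcu);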

> Could we just use the existing spinlock in sw_flow plus the 'dead'
> variable to ensure that no packets go through a flow after it's
> deleted?  On the delete side, take the spinlock and set 'dead'.
> On the fast path, take the spinlock and check 'dead' before executing
> the actions, release the spinlock after executing the actions.  We
> already have to take the spinlock for every packet anyway to update
> the stats, so it's not an additional cost.  I doubt that there's any
> parallel work for a given flow anyway (that would imply packet
> reordering).  We would have to avoid recursive locking somehow.

Hmm, I think that would probably work.  It's slightly convoluted but
certainly much less so than the alternatives.  How would you get
recursive locking?
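
For concreteness, my reading of the proposal is something like the sketch
below (illustrative only: the 'dead' field and flow_mark_dead() are new, the
rest follows the general shape of struct sw_flow):

struct sw_flow {
        spinlock_t lock;        /* already protects the stats */
        bool dead;              /* new: set once the flow is deleted */
        unsigned long used;
        u64 packet_count;
        u64 byte_count;
        /* ... key, actions, rcu head, etc. ... */
};

/* Delete side (process context): mark the flow dead under its lock. */
static void flow_mark_dead(struct sw_flow *flow)
{
        spin_lock_bh(&flow->lock);
        flow->dead = true;
        spin_unlock_bh(&flow->lock);
}

/* Fast path: the lock we already take for the per-packet stats update
 * also covers the 'dead' check and the action execution, so no packet
 * can run the actions of a flow that has been marked dead.  Holding
 * the lock across the actions is where recursion would bite, if the
 * actions ever re-entered the datapath and hit the same flow. */
static void flow_execute(struct sw_flow *flow, struct sk_buff *skb)
{
        spin_lock(&flow->lock);
        if (flow->dead) {
                spin_unlock(&flow->lock);
                kfree_skb(skb);
                return;
        }
        flow->used = jiffies;
        flow->packet_count++;
        flow->byte_count += skb->len;
        /* ... execute the flow's actions on skb ... */
        spin_unlock(&flow->lock);
}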


