[ovs-dev] [RFC 2/2] datapath: Cache netlink flow key, mask on flow_put.

Joe Stringer joestringer at nicira.com
Fri Jul 11 11:29:03 UTC 2014


On 11 July 2014 21:03, Thomas Graf <tgraf at noironetworks.com> wrote:

> > Correct. My main concern with doing it the first time it is required, is
> > how it's synchronised. Flow dump is RCU. I don't really know what the
> > threadsafety requirements are for this code, or what options are
> available
> > to address this. Is it viable to check the pointer is NULL, allocate the
> > cache, perform an atomic swap, and free the (potentially NULL) returned
> > pointer?
>
> Right, you would need the cmpxchg + RCU protection or take ovs_lock().
> Given the Netlink buffer is sized large enough, the cost of taking the
> lock once for a large number of flows might get amortized compared to
> the expensive translation. A benchmark would help.
>

I'm skeptical of taking the ovs_lock(). The current patch performs this
key/mask cache construction inside ovs_lock() as part of the critical
section for flow install. If we perform this during flow_dump, but extend
the ovs_lock to the entire flow dump operation, that has the potential to
make things worse due to the added contention. If I follow correctly, the
mspin_lock() bottleneck after the current patch is applied is already
measuring contention on the ovs_mutex when revalidators attempt to delete
flows. That's with flow_dump as RCU.


Another sidenote:
> Looking at the code, OVS does not handle NLM_F_DUMP_INTR in user space yet
> and the kernel dump does not call genl_dump_check_consistent() yet to
> provide
> the flag. So what can currently happen even without your patch is that if
> the dump requires more than one Netlink message, the RCU critical section
> is left and thus a list deletion or insertion of a flow may occur in the
> middle of the dump and thus provide a slightly incorrect set of flows.
>

Yes, I believe this is done deliberately for performance reasons. OVS
userspace doesn't assume that it's getting a consistent snapshot of the
flow table, it handles cases where flows are duplicated or missing in the
dump. An idea has been floated to keep track of flow adds/deletes in the
kernel to counteract this, but I got distracted by this idea first :-)


> The alternative that I mentioned below the commit message is that we
> > directly copy the key/mask from the original netlink message, and adjust
> > the mask for any fields that have wildcarded bits where the kernel only
> > supports full masking. This should be cheaper than converting to sw_flow
> > and back again as this patch does currently. Furthermore, it shouldn't
> > require locking, so the critical section can remain the same size as
> > currently.
>
> Right, needs verification of your assumption.
>

The kernel module is free to ignore the mask completely, or install a flow
that matches more specifically than the given mask:
https://github.com/openvswitch/ovs/blob/branch-2.3/datapath/README#L107-L113

So it's just a matter of figuring out which fields lack support for
wildcarding. This option also needs to consider the case where userspace
doesn't specify a mask.



More information about the dev mailing list