[ovs-dev] [PATCH] RFC: Pass more packet and flow key info to userspace.

Jesse Gross jesse at nicira.com
Wed Feb 6 00:57:28 UTC 2013

On Tue, Feb 5, 2013 at 8:53 AM, Rajahalme, Jarno (NSN - FI/Espoo)
<jarno.rajahalme at nsn.com> wrote
> On Jan 31, 2013, at 3:19 , ext Jesse Gross wrote:
>> On Wed, Jan 30, 2013 at 3:14 AM, Rajahalme, Jarno (NSN - FI/Espoo)
>> <jarno.rajahalme at nsn.com> wrote:
>>> Otherwise not so much use of the layer pointers, but dpif_flow_stats_extract()
>>> does packet_get_tcp_flags(), which needs the layer pointers.
>>> dpif_flow_stats_extract() is called on each miss packet (but the collected
>>> tcp_flags are used only with facets, for netflow and facet->tcp_flags).
>>> Other use is for controller action and special processing (stpp/cfm/lacp).
>>> So it seems that deferred layer pointer setting could be effective only if the
>>> stats extraction would be changed to skip tcp_flags for the cases where they
>>> are not going to be used (i.e. when not making facets, as might be expected
>>> under heavy load?).
>> That's a good point about the TCP flags.
>> I think it's actually not right that we don't use the TCP flags when
>> we don't create facets under load.  Not creating facets is an
>> optimization and that shouldn't affect the results that are seen
>> through NetFlow.
> Fixing this seem non-trivial, as the nf_flow resides within struct facet. Without a
> facet I do not think there is any memory of specific flows seen, so there is no
> basis for reporting such flows vie netflow or any other means. The design seems
> to be that tracking of short flows is not essential under load, see this comment for
> flow_miss_should_make_facet():
> /* Figures out whether a flow that missed in 'ofproto', whose details are in
>  * 'miss', is likely to be worth tracking in detail in userspace and (usually)
>  * installing a datapath flow.  The answer is usually "yes" (a return value of
>  * true).  However, for short flows the cost of bookkeeping is much higher than
>  * the benefits, so when the datapath holds a large number of flows we impose
>  * some heuristics to decide which flows are likely to be worth tracking. */
> If, however, netflow should not be affected, then a facet should be installed for
> every flow. In that case maybe the installation of kernel flows could be avoided
> for short flows under load?

Not creating facets was intended purely as a performance optimization
- i.e. is the cost of installing the flow in the kernel greater or
less than the cost of processing packets in userspace?  My guess is
that the side effects on NetFlow weren't considered since that's not
the primary use case of facets.  It's possible that we could create a
facet in userspace to track NetFlow information but still not install
the flow in the kernel.  My guess is that this will reduce the benefit
of the optimization somewhat since allocating facets has a cost but I
don't know by how much.

Our facets are also unusually fine grained compared to the flows
usually reported by NetFlow.  We could consider creating a
NetFlow-specific structure, which could be allocated less frequently
and only used when NetFlow is enabled.

>> I suspect that there are quite a lot of cases where NetFlow isn't used
>> (NXAST_FIN_TIMEOUT is the other case that comes to mind that uses the
>> flags) so it might be worthwhile to turn off collection when those
>> features aren't in use.
> It seems possible that netflow is turned on at runtime. Would it be acceptable
> that netflow data would be missing for the time before netflow was turned on?

Yes, that seems fine to me.

>>> Also, I noticed that the facets hmap uses the hash from flow_miss, and
>>> according to the comments that should be flow_hash(flow, 0). However,
>>> the only place the hash originates in the code is at handle_miss_upcalls(),
>>> so again, there is no harm in using a kernel provided hash instead. But the
>>> comments need to be updated if this change is adopted.
>> It would probably be good to benchmark the results of these two
>> changes independently to see which is more significant.
>> In general, I'm nervous about putting optimizations into the
>> userspace/kernel interface because it exposes implementation
>> information.  Keeping a neutral format can actually make it easier to
>> optimize in the future.  In particular, I know that Ben is thinking
>> about grouping together flows at a coarser granularity than the kernel
>> uses, which would make passing the hash up not that useful.
> While I would not see a problem in passing a key hash computed with an
> undisclosed algorithm up to userspace, I see your point.
> In my tests it seems that about 1/3 of the benefit is attainable with deferred
> layer pointer computation within the userspace.

Ben has been doing work optimizing flow setups, so it might be good
for him to comment on this and how it fits in with his current ideas.

More information about the dev mailing list