[ovs-dev] [xc 4/4] ofproto-dpif: Track relevant fields for wildcarding and add xout cache.

Justin Pettit jpettit at nicira.com
Fri Jun 7 09:03:36 UTC 2013

On Jun 6, 2013, at 2:02 PM, Ben Pfaff <blp at nicira.com> wrote:

> Every caller of lookup_xc() creates a new xc entry on failure.  Can
> any of this be factored into a helper?  If not, then all the callers
> of create_xc() use very similar 3 lines of code, so can that be
> integrated into create_xc()?

I think the interface would be too weird for lookup_xc(), since you have to provide all this creation arguments, which don't seem like they belong in a lookup interface.  My rewrite of handle_flow_miss_without_facet() in v2 of this patch makes those three lines a little less consistent.  However, let me know in your review of v2 if you think it would still be useful to factor out. 

> A struct xout_cache is pretty big and the initial xzalloc() is
> followed pretty quickly by copying into its '->xout->wc' and then (in
> insert_xc()) into '->flow'.  Can we save time by initializing only
> what we need?  (Most notably we never need to zero the 256 bytes in
> xc->xout.odp_actions_stub.)

Fair enough.  Done.

> handle_flow_miss_without_facet() previously translated the actions for
> each packet.  Now, it translates the actions once (or not at all, if
> there's an xc already).  Won't this prevent slow protocols (e.g. CFM,
> LACP, BFD) from working properly, because packets after the first
> won't go through translation?  I guess the unit tests pass; maybe I am
> missing something (or maybe the unit tests pass because they never
> trigger the governor).

You are correct.  (And, no, the unit tests must not trigger the governor.)  My next patch should address that.

> Both forks of the main "if" in handle_flow_miss_without_facet()
> contain the same xlate_out_copy() call.  I think it can be factored
> out.  Actually I think it can be moved into the body of "if
> (op->xout.odp_actions.size) {", and then we don't need the
> xlate_out_uninit() in the "else" case.
> In handle_flow_miss_without_facet(), is it necessary to do the
> lookup_xc() inside the LIST_FOR_EACH loop?  That is, can we move it
> above the loop and then just use 'xc' for the whole body of the loop?

Both of these should be addressed in my refactoring of handle_flow_miss_without_facet().

> All the callers of rule_dpif_lookup() that provide a nonnull 'wc' call
> flow_wildcards_init_catchall() on that 'wc' immediately beforehand.
> Should rule_dpif_lookup() do it internally?

I think this would make things more confusing because we couldn't initialize 'wc' in rule_dpif_lookup__(), since it is called recursively.

> I see opportunity for optimization in facet_create().  I don't know
> whether it would have any effect on real performance.  Anyway: the
> 'wc' is only used if a new 'xc' must be created, and we can figure
> that out based on just miss->flow.  So we could theoretically save
> time by passing NULL to rule_dpif_lookup() if there's an existing
> 'xc'.

I think that makes the code harder to read, and I don't think keeping track of 'wc' is that expensive.  As we discussed off-line, we can clean this up later if it shows up in profiling.

> The commit changes facet_create()'s comment to mention a 'wc'
> parameter, but facet_create() doesn't have a 'wc' parameter.


> facet_lookup_valid() now appears to always iterate over every
> xout_cache entry (!).  Partly I think that's just a mistake (it should
> not call invalidate_xc_tags() if the revalidate_set is empty, or
> invalidate_xc_tags() should do nothing if its revalidate_set parameter
> is empty).  But, why does it need to do that at all?  That is, why
> can't it just see whether 'facet->xc->xout.tags' intersects the
> revalidate_set, at least as an initial check?

You're right.  I put a guard in invalidate_xc_tags() and fixed facet_lookup_valid().

> New code in facet_revalidate() looks like this:
>    xc = lookup_xc(ofproto, &facet->flow);
>    if (!xc) {
>        ovs_assert(!facet->xc);
>        xlate_in_init(&xin, ofproto, &facet->flow, &facet->initial_vals,
>                      new_rule, 0, NULL);
>        xc = create_xc(&wc, &facet->initial_vals, facet);
>        xlate_actions(&xin, &xc->xout);
>        insert_xc(ofproto, xc, &facet->flow);
>        refresh_subfacets = true;
>    } else if (!facet->xc) {
>        add_facet_to_xc(xc, facet);
>        refresh_subfacets = true;
>    }
> I am not sure about the "fall through" case at the end.  I have a
> suspicion that, in this case, we always have xc == facet->xc, because
> I think that if the xc might have changed, then the caller would have
> destroyed that cache entry.  If that is correct, then we can delete a
> lot of code from facet_revalidate() that only fires if xc !=
> facet->xc.  But I am not entirely convinced that is correct.

As we discussed off-line, I don't think this was true of the original code, since rule_destruct() didn't clear anything, but we agree that was a bug.

> I am not sure that revalidation does the right thing if only the slow
> path of a translation changes.
> This new code will do a lot of unneeded kernel datapath flow
> reinstallations.  For example, if need_revalidate becomes true, then
> it will reinstall every flow in the kernel datapath, even if their ODP
> actions do not change.  This becomes extra-expensive because it does
> the reinstallations one-by-one instead of batching them up.  (The
> current code also does reinstallations one by one, but it has not been
> a problem because it only reinstalls flows whose actions change, which
> is rare.)

I want to look at facet_revalidate() fresher eyes in the morning.  I'm going to send out a v2 of this patch knowing that there will be a v3 with at least the facet_revalidate refactoring.

> In xout_cache_push_stats(), in the call to memset(), can we just write
> 0 instead of '\0'?  The latter looks like a string null terminator,
> which seems weird here.


> I don't understand this comment in facet_push_stats__():
>    /* Use local stats instead of 'facet->xc->stats' because this
>     * function may be called multiple times before
>     * xout_cache_push_stats(), which leads to high counts. */

Okay.  I rewrote it.  Let me know if it's still not clear.

> The split between the OFPACT_LEARN case in do_xlate_actions() and the
> implementation in xlate_learn_action() seems more awkward than before.
> Can we move the assignment to has_learn into xlate_learn_action()?


> and OFPACT_DEC_TTL need to add dependencies to 'wc', because their
> behavior differs based on the protocol.  For example,
> OFPACT_SET_IPV4_SRC is a no-op if the Ethertype is anything other than
> IPv4, so I would think it would need to mark the Ethertype as a
> dependency during translation.

I think I got them all, but please take another look.

> I think that OFPACT_OUTPUT_REG needs to depend on the field it reads.


> I think that OFPACT_POP_QUEUE needs to mark the skb_priority as a
> dependency.


> Did you consider dependencies for the MPLS actions?

Whoops.  Fixed.

> destroy_xc() contains a doubled semicolon ";;" at the end of a
> statement.


Thanks for the review.  I'll send out v2 in a moment.


More information about the dev mailing list