[ovs-dev] [ovs-dev,v21,2/8] Persist ovn flow tables

Ben Pfaff blp at ovn.org
Mon Jul 4 00:19:11 UTC 2016


On Sun, Jul 03, 2016 at 06:08:23PM -0500, Ryan Moats wrote:
> Ben Pfaff <blp at ovn.org> wrote on 07/03/2016 05:40:59 PM:
> 
> > From: Ben Pfaff <blp at ovn.org>
> > To: Ryan Moats/Omaha/IBM at IBMUS
> > Cc: dev at openvswitch.org
> > Date: 07/03/2016 05:41 PM
> > Subject: Re: [ovs-dev,v21,2/8] Persist ovn flow tables
> >
> > On Sun, Jul 03, 2016 at 10:35:27AM -0500, Ryan Moats wrote:
> > > Ensure that ovn flow tables are persisted so that changes to
> > > them chan be applied incrementally - this is a prereq for
> > > making lflow_run and physical_run incremental.
> > >
> > > Signed-off-by: Ryan Moats <rmoats at us.ibm.com>
> >
> > This time I think I've figured out properly what I'm concerned about.
> > Before, I think I was distracted enough by 'is_new' and the remaining
> > patches that I hadn't formulated it correctly yet.
> >
> > ofctrl needs to support the following operations on physical flows:
> >
> >     1. Add flow.
> >     2. Update flow.
> >     3. Remove flow.
> >
> > lflow generates physical flows from logical flows in a one-to-many
> > fashion.  That is, a single logical flow can yield zero, one, or
> > multiple physical flows.  Other sources can yield physical flows too,
> > and ofctrl needs those sources to pretend that they are generated from
> > something similar enough to a logical flow that it can be uniquely
> > identified with a UUID.  All that makes sense.
> >
> > Case #1, "add flow", for physical flows is implemented via
> > ofctrl_add_flow() with is_new == true.  This is straightforward enough.
> > Add the new flow to the flow table, add it to the list of physical flows
> > associated with the UUID passed in, suppress duplicates.
> >
> > Case #3, "remove flow", for physical flows is implemented via
> > ofctrl_remove_flow().  This removes all the physical flows associated
> > with a UUID.  This is almost as straightforward.  The implementation
> > here does appear to mean that, if there is a bug in the logical flow
> > table such that two different logical flows generate physical flows with
> > the same match, then there is a behavioral change versus the previous
> > implementation: previously, one of the flows to be added in the main
> > loop would end up in the flow table, but after this change the removal
> > will not reveal an underlying flow but just delete it, at least until
> > something happens to fully refresh flows instead of just incrementally
> > adding and removing them.  That is an edge case; it might be necessary
> > to fix it but it is not the top priority.
> 
> I'm going to pull your text for case 3 up here and let's deal with that
> first:
> 
> > Now let's go back to the edge case concern I expressed about case #3.
> > Thinking harder, I don't think it's so much of an edge case, at least
> > not to the extent that exhibiting it requires a buggy logical flow
> > table.  Suppose that a single transaction from ovn-northd deletes a
> > Logical_Flow with uuid U1 and adds a new Logical_Flow with uuid U2.
> > This will yield a call to ofctrl_remove_flow(U1) and (except in
> > pathological cases) one or more calls to ofctrl_add_flow(U2).  Suppose
> > that the sets of physical flows for U1 and U2 overlap in terms of their
> > key fields (table_id, pipeline, match, ...).  Then I believe that the
> > result will currently depend on the order of the calls to
> > ofctrl_remove_flow() and ofctrl_add_flow():
> >
> >         * If the removal precedes all the adds, all is well.
> >
> >         * If the removal follows any of the adds, the remove will
> >           falsely delete all the flows that should be added.
> >
> > Does any of this ring a bell?
> 
> I see where your analysis is going, but I think the problem is on the
> add side :) : On the delete side, the code checks for U1 and so none
> of the U2 flows will be removed.  However, I see the problem on the add
> side: because if U2 creates a duplicate flow to what's in U1, then
> right now the U2 flow is discarded silently and after U1 is deleted,
> the flow incorrectly disappears.  The simplest thing that comes to
> mind is to make two passes through the changed flows: the first to
> handle deletes and the second to handle adding/modifying flows.

That will only make the problem harder to trigger.  It will work if the
only way that overlaps can come up is from within Logical_Flow itself.
If ovn-controller has two independent pieces of code that can generate
overlapping flows, then the same problem can arise again.  That is
unlikely today, but it will be sitting in the code like a time bomb
waiting to surprise us months or years down the road with unpredictable
behavior.  With this approach, the cure would be to make two passes
through the entire pipeline of code in ovn-controller that can generate
flows, the first pass to remove and the second pass to add.  That would
be a pain.

I suggest that we avoid going down that path by making ofctrl track
flows with duplicate keys, when those keys have different UUIDs.  This
could be done within an hmap, if struct ovn_flow would add an ovs_list
of duplicates (probably preferred).  We'd want to use some kind of
deterministic method to determine which one actually gets installed, so
as to avoid oscillation; for example, it could be done based on UUID
ordering.  This will add a little bit of code now, but I think that it's
well worth it for predictability and for making code harder to screw up
in subtle ways by ordering things wrong.

> > Case #2, "update flow", is implemented via ofctrl_add_flow() with is_new
> > == false.  This is the one that I'm worried about and where I do not
> > understand the strategy.  What I'd expect is that the client would be
> > able to say, one way or another, "Here is the new set of physical flows
> > F2 associated with uuid U."  If some of the flows in F2 coincide with
> > the set of physical flows F1 that had previously been associated with U,
> > then it would make sense that nothing actually changes in the physical
> > flow table.  But there are many more possibilities.  For example, if F1
> > contains more flows than F2, then there needs to be a way to indicate
> > that some of the physical flows associated with U should be removed.
> > There is code in ovn_flow_lookup_by_uuid() that tries to tolerate some
> > kind of changes to flow matches from F1 to F2 (match fields that appear
> > or disappear), but I don't know a reason to believe that only those
> > changes can happen from F1 to F2; even after explanation, they look like
> > magic to me.
> >
> > Maybe there is a belief here that a given Logical_Flow has some kind of
> > consistency, e.g. that its match and action, etc. do not change after
> > the logical flow is added.  That might be a useful assumption, and we
> > could enforce it if we made some (all?) Logical_Flow columns immutable.
> > But it is not currently guaranteed to be true: I can use ovn-sbctl (or
> > whatever) to modify all of the columns in a Logical_Flow row in
> > arbitrary ways.  This means that F1 and F2 might also have nothing in
> > common for a given Logical_Flow U.
> 
> If that's not the case, then I think the simplest thing to do is to
> delete all the existing physical flows for the modified logical flow
> and then recreate them as if they were a new add. That sacrifices some
> performance, but it should be correct :)
> 
> Now, the question that arises is how to handle the flows created in
> physical.c, but I think that should be relatively simple to handle.

My thought is that the API should change to something like this, where
"key" is (table_id, priority, match) and "value" is actions.  These
"key" and "value" names wouldn't actually be used this way but I find it
a good conceptualization:

    ofctrl_add_flow(uuid, key, value)

        Add a flow that maps from key to value.  If there's a duplicate
        key with a different uuid, add the flow anyway but use a
        deterministic rule (e.g. based on UUID order) to determine which
        one will be in the real flow table.  If there's a duplicate key
        with the same uuid, do nothing and log a warning.

    ofctrl_remove_flow(uuid)

        Remove all the flows that have the given uuid.

We could also add a shortcut for the case where a key maps to exactly
one value, e.g.:

    ofctl_set_flow(uuid, key, value)

        Equivalent to:

            ofctrl_remove_flow(uuid);
            ofctrl_add_flow(uuid, key, value);

        but easy to optimize for the case where nothing actually
        changes.

Thanks,

Ben.



More information about the dev mailing list