[ovs-dev] [PATCH] ofproto-dpif: ovs-vswitchd.log showing unexpected flow

Andy Zhou azhou at nicira.com
Sat Aug 3 14:05:28 UTC 2013


Thanks Ben for the late night review an improving the commit message.


On Sat, Aug 3, 2013 at 12:18 AM, Ben Pfaff <blp at nicira.com> wrote:

> Thanks, Andy and Jesse, for the confirmation.  I rewrote the commit
> message and applied this to master and branch-1.11 as follows.
>
> I'm really bleary-eyed tired.  I hope the following is correct.
>
> Thanks,
>
> Ben.
>
> --8<--------------------------cut here-------------------------->8--
>
> From: Andy Zhou <azhou at nicira.com>
> Date: Fri, 2 Aug 2013 20:22:17 -0700
> Subject: [PATCH] ofproto-dpif: avoid losing track of kernel flows upon
>  reinstallation
>
> This commit fixes a problem whereby userspace can lose track of a
> flow installed in the kernel, instead believing that the flow is
> not installed.  The most visible consequence of this bug was a
> message in the ovs-vswitchd log warning about an unexpected flow
> in the kernel.  Other possible consequences included loss of
> statistics and failure to updates actions when the OpenFlow flow
> table changed.
>
> The problem arose in the following scenario.  Suppose userspace
> sets up a kernel flow due to an arriving packet.  Before kernel
> flow setup completes, another packet for that flow arrives.  The
> kernel sends the new packet to userspace after userspace has
> completed processing the batch of packets that set up the flow.
> Userspace then attempts to reinstall the kernel flow.  This fails
> with EEXIST, so userspace then marked the flow as not-installed,
> even though it was successfully installed before and remains
> installed.  The next time userspace dumped the kernel flow
> table to gather statistics, it would complain about an unexpected
> flow and delete it.
>
> In practice, we have seen these messages with netperf TCP_CRR tests and
> UDP stream tests.
>
> This patch fixes the problem by changing userspace so that, once
> it successfully installs a flow in the kernel, it will not reinstall
> it when it sees another packet for the flow in userspace.  This
> has the downside that, if something goes wrong and a flow
> disappears from the kernel (e.g. ovs-dpctl del-flows), then userspace
> won't reinstall it (until it tries to delete it).  (This is in fact
> the reason why until now userspace reinstalled flows it knew it
> already installed.)
>
> Some more background may be warranted.  There are two EEXIST error
> cases:
>
>        1. A subfacet was installed successfully in a previous (recent)
>           batch.  Now we've attempted to reinstall exactly the same
>           subfacet in this batch.
>
>        2. A subfacet was installed successfully in a previous (recent)
>           batch or earlier in the current batch.  We've attempted to
>           install a subfacet for an overlapping megaflow.
>
> Before megaflows, installation errors were ignored completely.
> Since megaflows were introduced, they have been handled by
> considering on any installation error that the given subfacet is
> not installed.  This works well for case #2 but causes case #1 to
> yield unexpected flows, as described at the top of the commit
> message.
>
> This commit adds the wrinkle that we never try to reinstall
> exactly the same subfacet that we know we installed successfully
> earlier (and haven't deleted) unless its actions change.  This
> ought to work just as well for case #2, and avoids the problem
> with case #1.
>
> Prepared with assistance from Ethan.
>
> Signed-off-by: Andy Zhou <azhou at nicira.com>
> [blp at nicira.com rewrote the commit message]
> Signed-off-by: Ben Pfaff <blp at nicira.com>
> ---
>  ofproto/ofproto-dpif.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
> index 3bf4fe3..29a93e6 100644
> --- a/ofproto/ofproto-dpif.c
> +++ b/ofproto/ofproto-dpif.c
> @@ -3460,7 +3460,7 @@ handle_flow_miss_with_facet(struct flow_miss *miss,
> struct facet *facet,
>          subfacet_update_stats(subfacet, stats);
>      }
>
> -    if (miss->upcall_type == DPIF_UC_MISS || subfacet->path != want_path)
> {
> +    if (subfacet->path != want_path) {
>          struct flow_miss_op *op = &ops[(*n_ops)++];
>          struct dpif_flow_put *put = &op->dpif_op.u.flow_put;
>
> --
> 1.7.10.4
>
>
>
> On Sat, Aug 03, 2013 at 12:01:51AM -0700, Andy Zhou wrote:
> > Yes, this is my understanding as well.
> > I did not mention the implication of mega flow in the commit message
> > because it is not the problem. But it is important in discussing about
> the
> > solution.
> >
> > Your summary provides a better and more complete picture.
> >
> > thanks for the review.
> >
> >
> > On Fri, Aug 2, 2013 at 11:36 PM, Ben Pfaff <blp at nicira.com> wrote:
> >
> > > On Fri, Aug 02, 2013 at 10:27:14PM -0700, Andy Zhou wrote:
> > > > In the case of mega flow would this cause multiple subfacets within a
> > > facet
> > > > to be considered installed?
> > > >
> > > >
> > > > On Fri, Aug 2, 2013 at 10:21 PM, Ben Pfaff <blp at nicira.com> wrote:
> > > >
> > > > > On Fri, Aug 02, 2013 at 08:22:17PM -0700, Andy Zhou wrote:
> > > > > > This patch prevents the same subfacet from trying to install the
> > > kernel
> > > > > > flow more than once.
> > > > > >
> > > > > > This is how a kernel flow can become unexpected to the user
> space.
> > > > > > When a incoming packet did not match any flow in the kernel, it
> > > > > > would be sent up to the user space. A subfacet would be created
> and
> > > > > > a corresponding kenrel flow installed. So far so good.
> > > > > >
> > > > > > Just before the kernel flow was installed, another packet of the
> same
> > > > > > flow were to arrive at the kernel, it would also be queued up
> > > > > > for user space.  This time, user space would find the subfacet
> just
> > > > > > created (subfacet_create()) due of the first packet, and attempt
> > > > > > to install the same kernel flow again, but could not (kernel
> returns
> > > > > EEXIST).
> > > > > > User space would then mark the subfacet as not installed
> > > > > (SF_NOT_INSTALLED).
> > > > >
> > > > > Can we just ignore EEXIST in the error handling loop at the bottom
> of
> > > > > handle_miss_upcalls()?
> > > > >
> > >
> > > OK, now I'm starting to see the whole picture.
> > >
> > > There are two EEXIST error cases:
> > >
> > >        1. A subfacet was installed successfully in a previous (recent)
> > >           batch.  Now we've attempted to reinstall exactly the same
> > >           subfacet in this batch.
> > >
> > >        2. A subfacet was installed successfully in a previous (recent)
> > >           batch or earlier in the current batch.  We've attempted to
> > >           install a subfacet for an overlapping megaflow.
> > >
> > > So far we have two ideas for handling these:
> > >
> > >        a) The current way, which is to consider on any installation
> > >           error that a given subfacet is not installed.  This works
> well
> > >           for case #2 but causes case #1 to yield unexpected flows, as
> > >           described in the commit message here.
> > >
> > >        b) The way proposed in this patch, which is the same as a) with
> > >           the added wrinkle that we never try to reinstall exactly the
> > >           same subfacet that we know we installed successfully earlier
> > >           (and haven't deleted) unless its actions change.  This ought
> > >           to work just as well as a) for case #2, and avoids the
> problem
> > >           with case #1.
> > >
> > > The (minor) flaw in b) is that if something goes wrong and a flow
> > > disappears from the kernel (e.g. ovs-dpctl del-flows), then userspace
> > > won't reinstall it (until it tries to delete it).
> > >
> > > If I understand the patch properly now, then I'm willing to commit it.
> > > Is that right?
> > >
> > > Thanks,
> > >
> > > Ben.
> > >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openvswitch.org/pipermail/ovs-dev/attachments/20130803/a8dcffba/attachment-0003.html>


More information about the dev mailing list