[ovs-dev] [PATCH] ovn.at: A "peer" is only for interconnected routers.

Ben Pfaff blp at ovn.org
Tue Jul 19 15:44:23 UTC 2016


On Mon, Jul 18, 2016 at 01:57:53PM -0700, Guru Shetty wrote:
> On 18 July 2016 at 13:11, Ben Pfaff <blp at ovn.org> wrote:
> 
> > On Mon, Jul 18, 2016 at 11:34:54AM -0700, Guru Shetty wrote:
> > > On 18 July 2016 at 10:35, Ben Pfaff <blp at ovn.org> wrote:
> > >
> > > > On Mon, Jul 18, 2016 at 12:00:30AM -0700, Gurucharan Shetty wrote:
> > > > > We should not use "peer" while connecting a router to a switch.
> > > > > (Doing so, will cause ovn-northd to constantly create and destroy
> > > > > port_binding records which causes CPU utilization of ovn-controller
> > to
> > > > > spike up.)
> > > > >
> > > > > Fixes: 31114af758c7e6 ("ovn-nbctl: Update logical router port
> > commands.")
> > > > > Signed-off-by: Gurucharan Shetty <guru at ovn.org>
> > > >
> > > > I updated the commit message to correctly say:
> > >     We should not use "peer" while connecting a router to a switch.
> > >     (Doing so, will cause ovn-northd to constantly create and destroy
> > >     logical_flow records which causes CPU utilization of ovn-controller
> > to
> > >     spike up.)
> > >
> > >
> > > > This seems like a bug in ovn-northd.  Did you investigate why it
> > > > happens?
> > > >
> > > I think I know ( I haven't put a debugger to confirm). We create a
> > logical
> > > flow incorrectly and add it via ovn_lflow_add() with a switch's datapath
> > > and a router's pipeline stage (S_ROUTER_IN_ARP_RESOLVE) here (This is
> > when
> > > we incorrectly set router's peer as a switch):
> > >
> > https://github.com/openvswitch/ovs/blob/master/ovn/northd/ovn-northd.c#L2614
> >
> > Oh, good spotting, can we fix it something like this?
> >
> > diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
> > index 7ce509d..97e3271 100644
> > --- a/ovn/northd/ovn-northd.c
> > +++ b/ovn/northd/ovn-northd.c
> > @@ -715,7 +736,10 @@ join_logical_ports(struct northd_context *ctx,
> >                  sizeof *op->od->router_ports * (op->od->n_router_ports +
> > 1));
> >              op->od->router_ports[op->od->n_router_ports++] = op;
> >          } else if (op->nbr && op->nbr->peer) {
> > -            op->peer = ovn_port_find(ports, op->nbr->peer);
> > +            struct ovn_port *peer = ovn_port_find(ports, op->nbr->peer);
> > +            if (peer && peer->nbr) {
> > +                op->peer = peer;
> > +            }
> >          }
> >      }
> >  }
> >
> Since the original patch in question fixed faulty tests independently, I
> had applied it as soon as I got your Ack. The above code snippet is good
> underlying fix. So you have my:
> Acked-by: Gurucharan Shetty <guru at ovn.org>

Thanks for the ack!

> 
> >
> > > In build_lflows, we go through each record of SB database's logical flows
> > > and then do a ovn_lflow_find() which returns a negative if the data was
> > > wrongly inserted, so it goes ahead and deleted the record from SB
> > database.
> > >
> > > In the next block, it will insert it back into the SB database. I will
> > send
> > > one additional fix. But, I think we should also assert in ovn_lflow_add
> > if
> > > we add a datapath with a different datapath's pipeline - not sure what
> > is a
> > > nice way to do that.
> >
> > Here's an idea (it compiles but I didn't test it):
> >
> Nice. I tested the below and the tests asserted with a wrong configuration.
> You have my:
> Acked-by: Gurucharan Shetty <guru at ovn.org>

Thanks for this ack too.

I posted these as proper patches in case anyone has a comment:
        https://patchwork.ozlabs.org/patch/650263/
        https://patchwork.ozlabs.org/patch/650264/

I forgot to apply your acks to them, but I'll do that in a day or so and
commit them if no one else comments.

> I think the below patch which Ryan Acked is probably still relevant to
> apply?
> https://patchwork.ozlabs.org/patch/649723/

I'll take a look, thanks.



More information about the dev mailing list