[ovs-dev] OVN: Broken pipe race

Alex Wang alexw at nicira.com
Fri Aug 21 22:12:25 UTC 2015


On Fri, Aug 21, 2015 at 3:02 PM, Ben Pfaff <blp at nicira.com> wrote:

> On Mon, Aug 17, 2015 at 11:24:48AM -0700, Alex Wang wrote:
> > Hey,
> >
> > Want to open a thread to discuss the following race I encountered while
> > unit testing ovn.
> >
> > The most simple case is when I run ovn-nbctl to add a lport in unit test:
> > 1. ovn-nbctl first creates/commits the logical_port entry in ovn-nb
> >     database.  the new entry's "up" column is empty,
> > 2. then assume ovn-nbctl execution got suspended after
> >     ovsdb_idl_txn_commit_block(),
> > 3. next, ovn-northd updates the ovn-sb database and finds that the
> >     new logical port is not bound, so it goes ahead and updates the "up"
> >     column of the entry to "false"...
> > 4. since ovn-nbctl is still running and is set to monitor everything, the
> >     ovsdb-server will try sending the "update" to ovn-nbctl...
> > 5. now consider this race:  if ovn-nbctl execution resumes and exits
> >     right before ovsdb-server sending the update,...  the send will fail
> >     with (Broken Pipe) error, resulting in a WARN log in ovsdb-server.log.
> >
> > Even if we set the "up" column to "false" at creation, we can still run
> > into similar race if the ovn-controller quickly binds the lport to chassis
> > and ovn-northd now updates "up" column to "true".
> >
> > I also found similar race for other command combinations...  e.g.
> > deleting vtep switch physical port and deleting ovs port while running
> > ovs-vtep simulator...
> >
> > I'm thinking instead of trying to fix every case (which may not even be
> > possible), we can try removing all monitor requests right after
> > ovsdb_idl_txn_commit_block() and then wait until we receive the
> > monitor request ack from ovsdb-server.  After that ovsdb-server will
> > never try sending anything to "*-*ctl" commands.
> >
> > Would like to hear what you think?~
>
> I think the warning is harmless (since we know the cause) so I'd be
> inclined to just ignore it in the testsuite.
>
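
For anyone who wants to see the race from step 5 in isolation, here is a
minimal sketch in Python (not the actual OVS C code).  It assumes a Unix
stream socketpair is a fair stand-in for the ovsdb-server <-> ovn-nbctl
connection; the function name and the payload are made up for illustration.

```python
import socket


def send_after_peer_exit():
    """Return the error raised when one end writes after the other has closed."""
    # Stand-in for the ovsdb-server <-> ovn-nbctl connection (assumption).
    server, client = socket.socketpair()
    # The "client" has committed its transaction and exits right away.
    client.close()
    try:
        # The "server" only now gets around to pushing the monitor update.
        server.sendall(b'{"method": "update"}')
    except OSError as err:
        # On a Unix stream socket this is normally EPIPE (broken pipe).
        return err
    finally:
        server.close()
    return None


if __name__ == "__main__":
    print(repr(send_after_peer_exit()))
```

The sender cannot know the peer is gone until it tries to write, which is
why the error only shows up on the server side, as a log message.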

That's the alternative I received from talking with Andy~

Makes sense, I'll ignore the warning.
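
Roughly, ignoring the warning in a test could mean filtering the known
benign line out before checking the log for unexpected WARNs.  A minimal
sketch in Python, assuming the log line contains "|WARN|" and the text
"Broken pipe"; the pattern, the helper name and the sample lines are
hypothetical, and this is not how the OVS testsuite itself is written.

```python
import re

# Hypothetical pattern for the benign message; real log text may differ.
BENIGN = re.compile(r"\|WARN\|.*Broken pipe", re.IGNORECASE)


def unexpected_warnings(log_lines):
    """Return the WARN lines that are not the known broken-pipe message."""
    return [
        line
        for line in log_lines
        if "|WARN|" in line and not BENIGN.search(line)
    ]


if __name__ == "__main__":
    # Made-up sample lines, for illustration only.
    sample = [
        "ts|00010|jsonrpc|WARN|unix: send error: Broken pipe",
        "ts|00011|reconnect|WARN|some other problem",
        "ts|00012|ovsdb|INFO|all good",
    ]
    for line in unexpected_warnings(sample):
        print(line)
```

Keeping the pattern narrow means any other WARN still fails the test.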

Thanks,
Alex Wang


