[ovs-dev] OVN: Broken pipe race
alexw at nicira.com
Fri Aug 21 22:12:25 UTC 2015
On Fri, Aug 21, 2015 at 3:02 PM, Ben Pfaff <blp at nicira.com> wrote:
> On Mon, Aug 17, 2015 at 11:24:48AM -0700, Alex Wang wrote:
> > Hey,
> > I want to open a thread to discuss the following race I encountered while
> > unit testing OVN.
> > The simplest case is when I run ovn-nbctl to add an lport in a unit test:
> > 1. ovn-nbctl first creates and commits the logical_port entry in the ovn-nb
> > database. The new entry's "up" column is empty.
> > 2. Then assume ovn-nbctl execution is suspended right after
> > ovsdb_idl_txn_commit_block().
> > 3. Next, ovn-northd updates the ovn-sb database and finds that the
> > new logical port is not bound, so it goes ahead and updates the "up"
> > column of the entry to "false".
> > 4. Since ovn-nbctl is still running and is set to monitor everything,
> > ovsdb-server will try to send the "update" to ovn-nbctl.
> > 5. Now consider this race: if ovn-nbctl execution resumes and exits
> > before ovsdb-server sends the update, the send fails with a
> > "Broken pipe" error, resulting in a WARN log in ovsdb-server.log.
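To make step 5 concrete, here is a minimal sketch of the underlying socket behavior, not of the OVS code itself. It assumes a Unix-like system and uses a plain socketpair as a stand-in for the ovsdb-server/ovn-nbctl connection; the function name and the payload are made up for illustration.

```python
import errno
import socket


def send_after_peer_exit():
    """Return the errno seen when sending to a peer that has already closed."""
    # Hypothetical stand-ins: "server" plays ovsdb-server, "client" plays ovn-nbctl.
    server, client = socket.socketpair()
    client.close()  # the ctl command resumes and exits before the update goes out
    try:
        # Illustrative payload only; not a complete OVSDB "update" notification.
        server.sendall(b'{"method":"update"}')
    except OSError as e:
        return e.errno
    finally:
        server.close()
    return None


if __name__ == "__main__":
    print(send_after_peer_exit() == errno.EPIPE)
```

The send itself is the only thing that fails here; nothing is lost on the database side, which fits the observation that the symptom is just a WARN line in the server log.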
> > Even if we set the "up" column to "false" at creation, we can still hit a
> > similar race if ovn-controller quickly binds the lport to a chassis and
> > ovn-northd then updates the "up" column to "true".
> > I also found similar races for other command combinations, e.g.
> > deleting a vtep switch physical port and deleting an ovs port while running
> > the ovs-vtep simulator.
> > I'm thinking that instead of trying to fix every case (which may not even be
> > possible), we could remove all monitor requests right after
> > ovsdb_idl_txn_commit_block() and wait until we receive the
> > corresponding ack from ovsdb-server. After that, ovsdb-server will
> > never try to send anything to the "*-*ctl" commands.
> > What do you think?
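For reference, a rough sketch of what "remove the monitor and wait for the ack" could look like on the wire. This assumes the removal maps to the "monitor_cancel" method from RFC 7047, which this thread does not actually confirm. The helper names and the ids are hypothetical, and this is not how the IDL is implemented.

```python
import json


def monitor_cancel_request(monitor_id, request_id):
    """Build a JSON-RPC "monitor_cancel" request as described in RFC 7047.

    monitor_id must match the id given in the original "monitor" request.
    """
    return json.dumps(
        {"method": "monitor_cancel", "params": [monitor_id], "id": request_id}
    )


def is_cancel_ack(reply_text, request_id):
    """Return True if reply_text is a successful reply to our cancel request."""
    reply = json.loads(reply_text)
    return (
        reply.get("id") == request_id
        and reply.get("error") is None
        and reply.get("result") == {}
    )


if __name__ == "__main__":
    # Hypothetical ids, for illustration only.
    print(monitor_cancel_request("nbctl-monitor", 7))
    print(is_cancel_ack('{"result": {}, "error": null, "id": 7}', 7))
```

The idea would be that the ctl command only exits once is_cancel_ack() holds for its request, so the server has nothing left to send to it.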
> I think the warning is harmless (since we know the cause) so I'd be
> inclined to just ignore it in the testsuite.
That's the alternative I got from talking with Andy.
Makes sense, I'll ignore the warning.