[ovs-dev] OVN: Broken pipe race

Ben Pfaff blp at nicira.com
Fri Aug 21 22:02:34 UTC 2015


On Mon, Aug 17, 2015 at 11:24:48AM -0700, Alex Wang wrote:
> Hey,
> 
> I want to open a thread to discuss the following race I encountered while
> unit testing ovn.
> 
> The simplest case is when I run ovn-nbctl to add an lport in a unit test:
> 1. ovn-nbctl first creates/commits the logical_port entry in the ovn-nb
>     database.  The new entry's "up" column is empty.
> 2. Then assume ovn-nbctl execution gets suspended after
>     ovsdb_idl_txn_commit_block().
> 3. Next, ovn-northd updates the ovn-sb database and finds that the
>     new logical port is not bound, so it goes ahead and updates the "up"
>     column of the entry to "false".
> 4. Since ovn-nbctl is still running and is set to monitor everything,
>     ovsdb-server will try sending the "update" to ovn-nbctl.
> 5. Now consider this race:  if ovn-nbctl execution resumes and exits right
>     before ovsdb-server sends the update, the send will fail with a
>     (Broken pipe) error, resulting in a WARN log in ovsdb-server.log.
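The failure in step 5 can be reproduced outside of OVS. What follows is a minimal Python sketch, not OVS code: a connected socket pair stands in for the ovsdb-server/ovn-nbctl connection, and the function name and payload are made up for illustration. It assumes a POSIX system, where socket.socketpair() returns a Unix-domain stream pair. Closing one end plays the role of ovn-nbctl exiting, and the send that follows fails with EPIPE, the error behind the "Broken pipe" message.

```python
import errno
import socket


def send_update_after_client_exit():
    """Return the errno seen when sending to a peer that already exited.

    Hypothetical stand-in for the race: server_end plays ovsdb-server,
    client_end plays ovn-nbctl.
    """
    server_end, client_end = socket.socketpair()
    # The client exits (closes its socket) before the update is sent.
    client_end.close()
    try:
        # Made-up payload; only the act of sending matters here.
        server_end.sendall(b'{"method": "update"}')
    except BrokenPipeError as exc:
        return exc.errno
    finally:
        server_end.close()
    return None


if __name__ == "__main__":
    err = send_update_after_client_exit()
    print("send failed with EPIPE:", err == errno.EPIPE)
```

The sender cannot tell in advance that the peer is gone; it only learns this when the send fails, which is why the warning depends on timing.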
> 
> Even if we set the "up" column to "false" at creation, we can still run into
> a similar race if ovn-controller quickly binds the lport to a chassis and
> ovn-northd then updates the "up" column to "true".
> 
> I also found similar races for other command combinations, e.g.
> deleting a vtep switch physical port and deleting an ovs port while running
> the ovs-vtep simulator.
> 
> I'm thinking that instead of trying to fix every case (which may not even
> be possible), we can try removing all monitor requests right after
> ovsdb_idl_txn_commit_block() and waiting until we receive the
> monitor request ack from ovsdb-server.  After that, ovsdb-server will
> never try sending anything to the "*-*ctl" commands.
> 
> I would like to hear what you think.

I think the warning is harmless (since we know the cause) so I'd be
inclined to just ignore it in the testsuite.
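
As a rough illustration of what "ignore it in the testsuite" could mean, here is a small Python sketch. It is not the OVS testsuite's actual mechanism: the function name, the ignore list, and the sample log lines are all made up, and the "|WARN|" field separator is an assumption about the log layout. The idea is only to drop WARN lines that match a known-harmless pattern before checking that no other warnings remain.

```python
import re

# Patterns for warnings considered harmless (hypothetical list).
IGNORED_PATTERNS = [re.compile(r"Broken pipe")]


def unexpected_warnings(log_text, ignored=IGNORED_PATTERNS):
    """Return WARN lines from log_text that match no ignored pattern."""
    remaining = []
    for line in log_text.splitlines():
        if "|WARN|" not in line:
            continue  # not a warning, nothing to check
        if any(pattern.search(line) for pattern in ignored):
            continue  # known-harmless warning, skip it
        remaining.append(line)
    return remaining


if __name__ == "__main__":
    # Made-up sample lines, for illustration only.
    sample = "\n".join([
        "ts|00001|jsonrpc|WARN|unix: send error: Broken pipe",
        "ts|00002|example|WARN|some other problem",
        "ts|00003|example|INFO|all good",
    ])
    print(unexpected_warnings(sample))
```

A test would then fail only if the returned list is non-empty, so the known race no longer causes spurious failures while other warnings are still caught.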


