[ovs-dev] [PATCH] ovsdb-idl.at: Wait all servers to join the cluster.

Flavio Leitner fbl at sysclose.org
Fri Sep 4 15:09:29 UTC 2020


On Fri, Sep 04, 2020 at 12:54:39AM +0200, Ilya Maximets wrote:
> On 9/4/20 12:05 AM, Flavio Leitner wrote:
> > On Thu, Sep 03, 2020 at 11:20:56PM +0200, Ilya Maximets wrote:
> >> On 6/11/20 1:45 AM, Flavio Leitner wrote:
> >>> The test 'Check Python IDL reconnects to leader - Python3
> >>> (leader only)' fails sometimes when the first ovsdb-server
> >>> gets killed before the others had joined the cluster.
> >>>
> >>> Fix the function ovsdb_cluster_start_idltest to wait for them
> >>> to join the cluster.
> >>
> >> Hi, Flavio.  Thanks for the fix and sorry for delays.
> >>
> >> Patch seems OK, but I'm not very comfortable with the code duplication
> >> between this function and OVS_WAIT_UNTIL macro.  Have you considered
> >> conversion of ovsdb_cluster_start_idltest() function into m4_define()
> >> macro so we could easily use OVS_WAIT_UNTIL inside of it?
> > 
> > I tried, but I ran into issues and I am not familiar with m4.
> 
> Could you try the following diff:

I tried the diff, but the test fails with it applied. The test logs are here:
http://people.redhat.com/~fleitner/2127.tar.bz2


#                             -*- compilation -*-
2127. ovsdb-idl.at:1840: testing Check Python IDL reconnects to leader - Python3 (leader only) ...
./ovsdb-idl.at:1840: ovsdb-tool create-cluster s1.db \
                        $abs_srcdir/idltest.ovsschema unix:s1.raft
./ovsdb-idl.at:1840: ovsdb-tool join-cluster s$i.db \
                          $schema_name unix:s$i.raft unix:s1.raft
./ovsdb-idl.at:1840: ovsdb-tool join-cluster s$i.db \
                          $schema_name unix:s$i.raft unix:s1.raft
./ovsdb-idl.at:1840: ovsdb-server -vraft -vconsole:warn --detach --no-chdir \
                   --log-file=s$i.log --pidfile=s$i.pid --unixctl=s$i \
                   --remote=punix:s$i.ovsdb ${2:+--remote="ptcp:0:"127.0.0.1} s$i.db
./ovsdb-idl.at:1840: ovsdb-server -vraft -vconsole:warn --detach --no-chdir \
                   --log-file=s$i.log --pidfile=s$i.pid --unixctl=s$i \
                   --remote=punix:s$i.ovsdb ${2:+--remote="ptcp:0:"127.0.0.1} s$i.db
./ovsdb-idl.at:1840: ovsdb-server -vraft -vconsole:warn --detach --no-chdir \
                   --log-file=s$i.log --pidfile=s$i.pid --unixctl=s$i \
                   --remote=punix:s$i.ovsdb ${2:+--remote="ptcp:0:"127.0.0.1} s$i.db
ovsdb-idl.at:1840: waiting until ovs-appctl -t $(pwd)/s$i cluster/status ${schema_name} \
                                           | grep -q 'Status: cluster member'...
ovsdb-idl.at:1840: wait succeeded immediately
ovsdb-idl.at:1840: waiting until ovs-appctl -t $(pwd)/s$i cluster/status ${schema_name} \
                                           | grep -q 'Status: cluster member'...
ovsdb-idl.at:1840: wait succeeded after 1 seconds
ovsdb-idl.at:1840: waiting until ovs-appctl -t $(pwd)/s$i cluster/status ${schema_name} \
                                           | grep -q 'Status: cluster member'...
ovsdb-idl.at:1840: wait succeeded immediately
ovsdb-idl.at:1840: waiting until TCP_PORT_1=`sed -n 's/.*0:.*: listening on port \([0-9]*\)$/\1/p' "s2.log"` && test X != X"$TCP_PORT_1"...
ovsdb-idl.at:1840: wait failed after 30 seconds
./ovs-macros.at:241: hard failure
2127. ovsdb-idl.at:1840: 2127. Check Python IDL reconnects to leader - Python3 (leader only) (ovsdb-idl.at:1840): FAILED (ovs-macros.at:241)
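
The cluster waits all succeed. The failing step is the wait for a
"listening on port" line in s2.log. One possible reading, which is an
assumption and not confirmed by the logs, is that the shell's $2 is
unset once the code runs outside a shell function, so
${2:+--remote=...} expands to nothing and no TCP remote is opened.
The sketch below only demonstrates that shell behaviour and the sed
expression. The function names and the sample log line are made up for
illustration.

```shell
#!/bin/sh
# Sketch: how ${2:+word} behaves and how the port is scraped from a log.
# The log line used below is a made-up sample, not taken from the real s2.log.

# build_remote_args (hypothetical helper) mimics the option list: the TCP
# remote is only added when the second positional argument is set and
# non-empty.
build_remote_args() {
    echo "--remote=punix:s1.ovsdb" ${2:+--remote="ptcp:0:"127.0.0.1}
}

with_tcp=$(build_remote_args idltest tcp)
without_tcp=$(build_remote_args idltest)

# extract_port (hypothetical helper) applies the same sed expression that
# the failing wait uses, against the file given as $1.
extract_port() {
    sed -n 's/.*0:.*: listening on port \([0-9]*\)$/\1/p' "$1"
}

tmp=$(mktemp)
echo '2020-09-04T15:00:00Z|00001|socket_util|INFO|0:127.0.0.1: listening on port 45678' > "$tmp"
port=$(extract_port "$tmp")

# An empty log stands in for a server started without the TCP remote.
: > "$tmp"
no_port=$(extract_port "$tmp")
rm -f "$tmp"

echo "with \$2:    $with_tcp"
echo "without \$2: $without_tcp"
echo "port from sample line: '$port', from empty log: '$no_port'"
```

If that reading is right, the wait can never succeed, because the
pattern has nothing to match in s2.log.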

-- 
fbl

