[ovs-dev] [PATCH] ovsdb-idl.at: Wait all servers to join the cluster.

Flavio Leitner fbl at sysclose.org
Fri Sep 4 19:25:59 UTC 2020


On Fri, Sep 04, 2020 at 08:07:41PM +0200, Ilya Maximets wrote:
> On 9/4/20 5:09 PM, Flavio Leitner wrote:
> > On Fri, Sep 04, 2020 at 12:54:39AM +0200, Ilya Maximets wrote:
> >> On 9/4/20 12:05 AM, Flavio Leitner wrote:
> >>> On Thu, Sep 03, 2020 at 11:20:56PM +0200, Ilya Maximets wrote:
> >>>> On 6/11/20 1:45 AM, Flavio Leitner wrote:
> >>>>> The test 'Check Python IDL reconnects to leader - Python3
> >>>>> (leader only)' sometimes fails when the first ovsdb-server
> >>>>> is killed before the others have joined the cluster.
> >>>>>
> >>>>> Fix the function ovsdb_cluster_start_idltest to wait for them
> >>>>> to join the cluster.
> >>>>
> >>>> Hi, Flavio.  Thanks for the fix and sorry for the delay.
> >>>>
> >>>> The patch seems OK, but I'm not very comfortable with the code
> >>>> duplication between this function and the OVS_WAIT_UNTIL macro.  Have
> >>>> you considered converting the ovsdb_cluster_start_idltest() function
> >>>> into an m4_define() macro so we could easily use OVS_WAIT_UNTIL inside it?
> >>>
> >>> I tried, but I ran into issues and I am not familiar with m4.
> >>
> >> Could you try the following diff:
> > 
> > I tried, but it fails. The test logs are here:
> > http://people.redhat.com/~fleitner/2127.tar.bz2
> 
> Oh, sorry, my fault.  We can't use shell expansions over m4 definitions.
> It should work with the following change:

No worries!

The test passes every time with the fix below.
Thanks for improving the fix!
fbl


> 
> diff --git a/tests/ovsdb-idl.at b/tests/ovsdb-idl.at
> index 075250a9c..30e896e3e 100644
> --- a/tests/ovsdb-idl.at
> +++ b/tests/ovsdb-idl.at
> @@ -46,7 +46,8 @@ m4_define([OVSDB_CLUSTER_START_IDLTEST],
>     for i in $(seq $n); do
>       AT_CHECK([ovsdb-server -vraft -vconsole:warn --detach --no-chdir \
>                     --log-file=s$i.log --pidfile=s$i.pid --unixctl=s$i \
> -                   --remote=punix:s$i.ovsdb ${2:+--remote=$2} s$i.db])
> +                   --remote=punix:s$i.ovsdb                           \
> +                   m4_if([$2], [], [], [--remote=$2]) s$i.db])
>     done
>     on_exit 'kill $(cat s*.pid)'
>  
> ---
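
A small shell sketch of why the old line lost the extra remote, based on
what the failing log shows: m4 replaced only the bare $2, so the shell
still received ${2:+...} and evaluated it against its own positional
parameters, which are unset in the generated test script. The values
below are illustrative only (the inner quotes from the log are dropped),
and `set --` is an assumption used to simulate a script with no arguments.

```shell
#!/bin/sh
# Simulate the generated test script: no positional parameters are set.
set --

# What the shell sees with the old line: ${2:+...} refers to the shell's
# own $2, which is unset, so the whole expansion disappears.
old="--remote=punix:s1.ovsdb ${2:+--remote=ptcp:0:127.0.0.1} s1.db"

# What the shell sees with m4_if when the macro was given a second
# argument: the option is already plain text before the shell runs.
new="--remote=punix:s1.ovsdb --remote=ptcp:0:127.0.0.1 s1.db"

echo "old: $old"
echo "new: $new"
```

With the old form no TCP remote is ever passed to ovsdb-server, which
would explain the 30-second wait for "listening on port" in the log.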
> 
> > 
> > 
> > #                             -*- compilation -*-
> > 2127. ovsdb-idl.at:1840: testing Check Python IDL reconnects to leader - Python3 (leader only) ...
> > ./ovsdb-idl.at:1840: ovsdb-tool create-cluster s1.db \
> >                         $abs_srcdir/idltest.ovsschema unix:s1.raft
> > ./ovsdb-idl.at:1840: ovsdb-tool join-cluster s$i.db \
> >                           $schema_name unix:s$i.raft unix:s1.raft
> > ./ovsdb-idl.at:1840: ovsdb-tool join-cluster s$i.db \
> >                           $schema_name unix:s$i.raft unix:s1.raft
> > ./ovsdb-idl.at:1840: ovsdb-server -vraft -vconsole:warn --detach --no-chdir \
> >                    --log-file=s$i.log --pidfile=s$i.pid --unixctl=s$i \
> >                    --remote=punix:s$i.ovsdb ${2:+--remote="ptcp:0:"127.0.0.1} s$i.db
> > ./ovsdb-idl.at:1840: ovsdb-server -vraft -vconsole:warn --detach --no-chdir \
> >                    --log-file=s$i.log --pidfile=s$i.pid --unixctl=s$i \
> >                    --remote=punix:s$i.ovsdb ${2:+--remote="ptcp:0:"127.0.0.1} s$i.db
> > ./ovsdb-idl.at:1840: ovsdb-server -vraft -vconsole:warn --detach --no-chdir \
> >                    --log-file=s$i.log --pidfile=s$i.pid --unixctl=s$i \
> >                    --remote=punix:s$i.ovsdb ${2:+--remote="ptcp:0:"127.0.0.1} s$i.db
> > ovsdb-idl.at:1840: waiting until ovs-appctl -t $(pwd)/s$i cluster/status ${schema_name} \
> >                                            | grep -q 'Status: cluster member'...
> > ovsdb-idl.at:1840: wait succeeded immediately
> > ovsdb-idl.at:1840: waiting until ovs-appctl -t $(pwd)/s$i cluster/status ${schema_name} \
> >                                            | grep -q 'Status: cluster member'...
> > ovsdb-idl.at:1840: wait succeeded after 1 seconds
> > ovsdb-idl.at:1840: waiting until ovs-appctl -t $(pwd)/s$i cluster/status ${schema_name} \
> >                                            | grep -q 'Status: cluster member'...
> > ovsdb-idl.at:1840: wait succeeded immediately
> > ovsdb-idl.at:1840: waiting until TCP_PORT_1=`sed -n 's/.*0:.*: listening on port \([0-9]*\)$/\1/p' "s2.log"` && test X != X"$TCP_PORT_1"...
> > ovsdb-idl.at:1840: wait failed after 30 seconds
> > ./ovs-macros.at:241: hard failure
> > 2127. ovsdb-idl.at:1840: 2127. Check Python IDL reconnects to leader - Python3 (leader only) (ovsdb-idl.at:1840): FAILED (ovs-macros.at:241)
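
For readers unfamiliar with the "waiting until ..." lines above, the
polling pattern can be sketched in plain shell as follows. This is not
the OVS_WAIT_UNTIL implementation; wait_until is a hypothetical helper
and the one-second polling interval is an assumption.

```shell
#!/bin/sh
# wait_until TIMEOUT COMMAND [ARGS...]
# Hypothetical helper: retry COMMAND once per second until it succeeds
# or TIMEOUT seconds have elapsed.  Returns 0 on success, 1 on timeout.
wait_until() {
    timeout=$1
    shift
    elapsed=0
    while ! "$@"; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# A condition that is already true succeeds without sleeping.
wait_until 5 true && echo "wait succeeded"

# In the test, the condition would be a check such as:
#   ovs-appctl -t $(pwd)/s$i cluster/status ${schema_name} \
#       | grep -q 'Status: cluster member'
# wrapped in a function or "sh -c" so that the pipeline runs as one command.
```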
> > 
> 

-- 
fbl

