[ovs-dev] [PATCH] OVN resource agent - make promotion synchronous

Numan Siddique nusiddiq at redhat.com
Tue Jul 9 07:37:51 UTC 2019


On Tue, Jul 9, 2019 at 1:04 PM Daniel Alvarez Sanchez <dalvarez at redhat.com>
wrote:

> Thanks a lot Michele.
> Just mentioning that this has been tested in an OpenStack environment
> successfully. A timeout is not needed for the while loop since
> pacemaker will enforce its own.
>
> On Tue, Jul 9, 2019 at 9:20 AM Michele Baldessari <michele at acksyn.org>
> wrote:
> >
> > Currently inside the ovsdb_server_promote() function we call 'promote_ovnnb'
> > and 'promote_ovnsb' and then just record the new master state in the
> > CIB.
> >
> > This creates a race because those two promote commands are asynchronous
> > so when we exit the ovsdb_server_promote() function the underlying DBs
> > are not guaranteed to be in master state. That means that clients might
> > connect to an instance that is in read-only mode.
> >
> > We add a simple sleep loop that waits until the underlying DB state
> > confirms the master state. We do not need to add a timeout to the loop
> > because, in case of an issue, the resource timeout set within pacemaker
> > will kick in and pacemaker will kill the resource agent script.
> >
> > Tested this within an OpenStack environment using OVN with roughly 20
> > reboots and was unable to trigger the issue (before the patch we would
> > trigger the issue after a couple of reboots at most).
> >
> > Signed-off-by: Michele Baldessari <michele at acksyn.org>
>

LGTM

Acked-by: Numan Siddique <nusiddiq at redhat.com>
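
For readers following the thread, the wait-for-promotion pattern in the
patch can be tried in isolation. The sketch below is only an illustration,
not the agent code: check_status_stub is a hypothetical stand-in for
ovsdb_server_check_status that reports master on its third call, and
POLL_INTERVAL is an assumed variable added so the delay can be shortened.
The numeric values are the standard OCF return codes. As in the patch,
there is no timeout in the loop; in the real agent pacemaker's operation
timeout is what bounds it.

```shell
#!/bin/sh
# Standard OCF return codes (normally provided by ocf-shellfuncs).
OCF_SUCCESS=0          # running, but not master
OCF_RUNNING_MASTER=8   # running as master

# Assumption: configurable poll interval, not part of the patch.
POLL_INTERVAL=${POLL_INTERVAL:-1}

# Hypothetical stub for ovsdb_server_check_status: simulates an
# asynchronous promotion that completes by the third status check.
calls=0
check_status_stub() {
    calls=$((calls + 1))
    if [ "$calls" -ge 3 ]; then
        return "$OCF_RUNNING_MASTER"
    fi
    return "$OCF_SUCCESS"
}

# Poll until the status check reports master, as the patch does.
wait_for_master() {
    check_status_stub
    state=$?
    while [ "$state" != "$OCF_RUNNING_MASTER" ]; do
        sleep "$POLL_INTERVAL"
        check_status_stub
        state=$?
    done
}

wait_for_master
echo "master confirmed after $calls status checks"
```

With the stub above, the loop sleeps twice before the third check reports
master and the function returns.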



> > ---
> >  ovn/utilities/ovndb-servers.ocf | 12 +++++++++++-
> >  1 file changed, 11 insertions(+), 1 deletion(-)
> >
> > diff --git a/ovn/utilities/ovndb-servers.ocf b/ovn/utilities/ovndb-servers.ocf
> > index 10313304cb7c..cd47426689ef 100755
> > --- a/ovn/utilities/ovndb-servers.ocf
> > +++ b/ovn/utilities/ovndb-servers.ocf
> > @@ -516,6 +516,8 @@ ovsdb_server_stop() {
> >  }
> >
> >  ovsdb_server_promote() {
> > +    local state
> > +
> >      ovsdb_server_check_status ignore_northd
> >      rc=$?
> >      case $rc in
> > @@ -540,7 +542,15 @@ ovsdb_server_promote() {
> >          ${OVN_CTL} --ovn-manage-ovsdb=no start_northd
> >      fi
> >
> > -    ocf_log debug "ovndb_servers: Promoting $host_name as the master"
> > +    ocf_log debug "ovndb_servers: Waiting for promotion $host_name as master to complete"
> > +    ovsdb_server_check_status
> > +    state=$?
> > +    while [ "$state" != "$OCF_RUNNING_MASTER" ]; do
> > +      sleep 1
> > +      ovsdb_server_check_status
> > +      state=$?
> > +    done
> > +    ocf_log debug "ovndb_servers: Promotion of $host_name as the master completed"
> >      # Record ourselves so that the agent has a better chance of doing
> >      # the right thing at startup
> >      ${CRM_ATTR_REPL_INFO} -v "$host_name"
> > --
> > 2.21.0
>
> Acked-by: Daniel Alvarez <dalvarez at redhat.com>
> >
> > _______________________________________________
> > dev mailing list
> > dev at openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev