[ovs-dev] [PATCH] OVN resource agent - make promotion synchronous
Numan Siddique
nusiddiq at redhat.com
Tue Jul 9 07:37:51 UTC 2019
On Tue, Jul 9, 2019 at 1:04 PM Daniel Alvarez Sanchez <dalvarez at redhat.com>
wrote:
> Thanks a lot Michele.
> Just mentioning that this has been tested in an OpenStack environment
> successfully. A timeout is not needed for the while loop since
> pacemaker will enforce its own.
>
> On Tue, Jul 9, 2019 at 9:20 AM Michele Baldessari <michele at acksyn.org>
> wrote:
> >
> > Currently inside the ovsdb_server_promote() function we call 'promote_ovnnb'
> > and 'promote_ovnsb' and then just record the new master state in the
> > CIB.
> >
> > This creates a race because those two promote commands are asynchronous
> > so when we exit the ovsdb_server_promote() function the underlying DBs
> > are not guaranteed to be in master state. That means that clients might
> > connect to an instance that is in read-only mode.
> >
> > We add a simple sleep loop where we wait for the underlying DB state to
> > confirm the master state. We do not need to add a timeout loop because
> > in case of an issue the resource timeout set within pacemaker will kick
> > in and the resource agent script will be killed by pacemaker.
> >
> > Tested this within an OpenStack environment using OVN with roughly 20
> > reboots and was unable to trigger the issue (before the patch we would
> > trigger the issue after a couple of reboots at most).
> >
> > Signed-off-by: Michele Baldessari <michele at acksyn.org>
>
LGTM
Acked-by: Numan Siddique <nusiddiq at redhat.com>
> > ---
> > ovn/utilities/ovndb-servers.ocf | 12 +++++++++++-
> > 1 file changed, 11 insertions(+), 1 deletion(-)
> >
> > diff --git a/ovn/utilities/ovndb-servers.ocf b/ovn/utilities/ovndb-servers.ocf
> > index 10313304cb7c..cd47426689ef 100755
> > --- a/ovn/utilities/ovndb-servers.ocf
> > +++ b/ovn/utilities/ovndb-servers.ocf
> > @@ -516,6 +516,8 @@ ovsdb_server_stop() {
> > }
> >
> > ovsdb_server_promote() {
> > + local state
> > +
> > ovsdb_server_check_status ignore_northd
> > rc=$?
> > case $rc in
> > @@ -540,7 +542,15 @@ ovsdb_server_promote() {
> > ${OVN_CTL} --ovn-manage-ovsdb=no start_northd
> > fi
> >
> > - ocf_log debug "ovndb_servers: Promoting $host_name as the master"
> > + ocf_log debug "ovndb_servers: Waiting for promotion $host_name as master to complete"
> > + ovsdb_server_check_status
> > + state=$?
> > + while [ "$state" != "$OCF_RUNNING_MASTER" ]; do
> > + sleep 1
> > + ovsdb_server_check_status
> > + state=$?
> > + done
> > + ocf_log debug "ovndb_servers: Promotion of $host_name as the master completed"
> > # Record ourselves so that the agent has a better chance of doing
> > # the right thing at startup
> > ${CRM_ATTR_REPL_INFO} -v "$host_name"
> > --
> > 2.21.0
>
> Acked-By: Daniel Alvarez <dalvarez at redhat.com>
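
For readers who want to try the wait loop outside of a cluster, here is a
minimal standalone sketch of the same polling pattern. It is not the agent
code: fake_check_status, wait_for_master, polls and POLL_INTERVAL are
hypothetical names made up for this illustration, and the stub stands in for
ovsdb_server_check_status. It assumes the status check returns OCF_SUCCESS
while the DB is still a backup and OCF_RUNNING_MASTER (8 in the OCF return
code set) once it is active. As in the patch, there is deliberately no
timeout in the loop; under pacemaker the operation timeout bounds it.

```shell
#!/bin/sh
# Standard OCF return codes used by this sketch.
OCF_SUCCESS=0
OCF_RUNNING_MASTER=8

# Hypothetical stand-in for ovsdb_server_check_status: reports "running,
# not master" for the first two polls, then "running as master".
polls=0
fake_check_status() {
    polls=$((polls + 1))
    if [ "$polls" -ge 3 ]; then
        return $OCF_RUNNING_MASTER
    fi
    return $OCF_SUCCESS
}

# Poll once per interval until the status check reports master.
# No timeout here: the caller (pacemaker, in the real agent) is
# expected to kill the script if promotion never completes.
wait_for_master() {
    fake_check_status
    state=$?
    while [ "$state" != "$OCF_RUNNING_MASTER" ]; do
        sleep "${POLL_INTERVAL:-1}"
        fake_check_status
        state=$?
    done
}

wait_for_master
echo "promotion confirmed after $polls polls (state=$state)"
```

Run standalone, the loop above has no upper bound, so anyone reusing the
pattern without pacemaker would need to add their own timeout.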
> >
> > _______________________________________________
> > dev mailing list
> > dev at openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev