[ovs-dev] [PATCH] OVN resource agent - make promotion synchronous

Daniel Alvarez Sanchez dalvarez at redhat.com
Tue Jul 9 07:27:43 UTC 2019


Thanks a lot, Michele.
Just to mention that this has been tested successfully in an OpenStack
environment. The while loop does not need a timeout of its own, since
pacemaker will enforce its own operation timeout.
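
For anyone who wants to try the wait loop outside the agent, here is a
minimal sketch of the same polling pattern. It is not the agent code:
check_status_stub and the polls counter are hypothetical stand-ins for
ovsdb_server_check_status, and the return codes are set by hand instead
of being sourced from the OCF shell function library (8 and 0 are the
standard OCF values for OCF_RUNNING_MASTER and OCF_SUCCESS).

```shell
#!/bin/sh
# Hand-set here for illustration; the real agent gets these from the
# OCF shell function library.
OCF_SUCCESS=0
OCF_RUNNING_MASTER=8

# Hypothetical stub standing in for ovsdb_server_check_status: it reports
# "running, not master" on the first two polls and "master" afterwards.
polls=0
check_status_stub() {
    polls=$((polls + 1))
    if [ "$polls" -ge 3 ]; then
        return "$OCF_RUNNING_MASTER"
    fi
    return "$OCF_SUCCESS"
}

# Same shape as the loop in the patch: poll once per second until the
# status check reports master. There is deliberately no timeout here;
# in the agent, pacemaker's operation timeout bounds the wait.
wait_for_master() {
    check_status_stub
    state=$?
    while [ "$state" != "$OCF_RUNNING_MASTER" ]; do
        sleep 1
        check_status_stub
        state=$?
    done
}

wait_for_master
echo "promoted after $polls polls"
```

Run standalone, the stub never stays out of master state forever, so the
loop always ends. In the agent, a DB that never reaches master state would
keep the loop spinning until pacemaker kills the promote operation.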

On Tue, Jul 9, 2019 at 9:20 AM Michele Baldessari <michele at acksyn.org> wrote:
>
> Currently inside the ovsdb_server_promote() function we call 'promote_ovnnb'
> and 'promote_ovnsb' and then just record the new master state in the
> CIB.
>
> This creates a race because those two promote commands are asynchronous
> so when we exit the ovsdb_server_promote() function the underlying DBs
> are not guaranteed to be in master state. That means that clients might
> connect to an instance that is in read-only mode.
>
> We add a simple sleep loop that waits for the underlying DB state to
> confirm the master state. The loop does not need a timeout of its own
> because, in case of an issue, the resource timeout set within pacemaker
> will kick in and pacemaker will kill the resource agent script.
>
> Tested this within an OpenStack environment using OVN with roughly 20
> reboots and was unable to trigger the issue (before the patch we would
> trigger the issue after a couple of reboots tops).
>
> Signed-off-by: Michele Baldessari <michele at acksyn.org>
> ---
>  ovn/utilities/ovndb-servers.ocf | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/ovn/utilities/ovndb-servers.ocf b/ovn/utilities/ovndb-servers.ocf
> index 10313304cb7c..cd47426689ef 100755
> --- a/ovn/utilities/ovndb-servers.ocf
> +++ b/ovn/utilities/ovndb-servers.ocf
> @@ -516,6 +516,8 @@ ovsdb_server_stop() {
>  }
>
>  ovsdb_server_promote() {
> +    local state
> +
>      ovsdb_server_check_status ignore_northd
>      rc=$?
>      case $rc in
> @@ -540,7 +542,15 @@ ovsdb_server_promote() {
>          ${OVN_CTL} --ovn-manage-ovsdb=no start_northd
>      fi
>
> -    ocf_log debug "ovndb_servers: Promoting $host_name as the master"
> +    ocf_log debug "ovndb_servers: Waiting for promotion $host_name as master to complete"
> +    ovsdb_server_check_status
> +    state=$?
> +    while [ "$state" != "$OCF_RUNNING_MASTER" ]; do
> +      sleep 1
> +      ovsdb_server_check_status
> +      state=$?
> +    done
> +    ocf_log debug "ovndb_servers: Promotion of $host_name as the master completed"
>      # Record ourselves so that the agent has a better chance of doing
>      # the right thing at startup
>      ${CRM_ATTR_REPL_INFO} -v "$host_name"
> --
> 2.21.0

Acked-by: Daniel Alvarez <dalvarez at redhat.com>
>
> _______________________________________________
> dev mailing list
> dev at openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

