[ovs-dev] [PATCH] ovn-ctl: Add "--no-leader-only" while cluster db init

taoyunxiang at cmss.chinamobile.com taoyunxiang at cmss.chinamobile.com
Tue Mar 17 03:19:48 UTC 2020


Hi Han and Numan,
     Thanks for your review, I have summited v2 change, ovn-ctl: Fix stucked while do cluster db init



Thanks,
Yun
--------------
taoyunxiang at cmss.chinamobile.com
>On Mon, Mar 9, 2020 at 3:50 AM Tao YunXiang <
>taoyunxiang at cmss.chinamobile.com> wrote:
>>
>> For cluster mode, only the first server(which is the leader) in a cluster
>> should do init. But, sometimes, the role of the first server would
>transfer
>> leadership, so we should add "--no-leader-only" option in case
>> the ovn-northd stucked.
>>
>Thanks for the patch. This is trickier than it appears.
>Let me describe the problem from my understanding. When creating the
>cluster, the "init" needs to be executed once. This should be done with the
>first server. In this case the script works well, because the
>"cluster_remote_addr" is NOT specified when starting the first server,
>which is the leader at that time. When another server joins the cluster,
>the "cluster_remote_addr" must be specified, so the "init" won't be
>executed.
>However, when a server needs to be restarted later, no matter if it is
>leader or not, it doesn't need to supply the "cluster_remote_addr"
>parameter, because that parameter is needed only for joining a cluster.
>When a server is restarted by executing this script, if
>"cluster_remote_addr" is NOT specified, it will stuck at the "init"
>command, because it is not leader and "--no-leader-only" is not specified
>here.
>
>So, the implementation doesn't really reflect the comment:
>>      # Initialize the database if it's running standalone,
>>      # active-passive, or is the first server in a cluster.
>It tried to run the init command even when it is not the first server in a
>cluster.
>
>To fix the problem, one way is to make sure it does exactly as what the
>comment said. However, it seems there is no easy way to distinguish if the
>server is the first server in a cluster or it is just a server that has
>already joined being restarted.
>
>Alternatively, adding "--no-leader-only" solves the problem as well, as
>proposed by this patch, because it is not harmful to run "init" again even
>if the DB is already initiated, just like how it is handled when it is
>standalone mode.
>
>Based on the above analysis, I am ok with the fix, but I suggest to update
>the commit message to clarify the problem more clearly. In addition, for
>the comment mentioned above, it should be changed to:
>
># Initialize the database if it's NOT joining a cluster.
>
>Finally, maybe the if condition can be removed completely, but I haven't
>tested and not 100% sure.
>
>Thanks,
>Han
>
>
>> Author: Tao YunXiang <taoyunxiang at cmss.chinamobile.com>
>> Co-authored-by: Liu Chang <liuchang at cmss.chinamobile.com>
>> Signed-off-by: Tao YunXiang <taoyunxiang at cmss.chinamobile.com>
>> Signed-off-by: Liu Chang <liuchang at cmss.chinamobile.com>
>>
>> ---
>>  utilities/ovn-ctl | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/utilities/ovn-ctl b/utilities/ovn-ctl
>> index 2a337ae27..71bdbc076 100755
>> --- a/utilities/ovn-ctl
>> +++ b/utilities/ovn-ctl
>> @@ -293,7 +293,7 @@ $cluster_remote_port
>>      # Initialize the database if it's running standalone,
>>      # active-passive, or is the first server in a cluster.
>>      if test -z "$cluster_remote_addr"; then
>> -        $(echo ovn-${db}ctl | tr _ -) init
>> +        $(echo ovn-${db}ctl | tr _ -) --no-leader-only init
>>      fi
>>
>>      if test $mode = cluster; then
>> --
>> 2.17.1
>>
>>
>>
>> _______________________________________________
>> dev mailing list
>> dev at openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev


More information about the dev mailing list