[ovs-dev] [PATCH] ovn pacemaker: Fix the promotion issue in other cluster nodes when the master node is reset

Numan Siddique nusiddiq at redhat.com
Wed May 23 10:34:58 UTC 2018


On Sat, May 19, 2018 at 3:12 AM, aginwala <aginwala at asu.edu> wrote:

> Sure.
>
> I tried with the settings you suggested, but it's still not able to promote
> a new master during a kernel panic :(
>
> Current DC: test7 (version 1.1.14-70404b0) - partition WITHOUT quorum
> 2 nodes and 3 resources configured
>
> Online: [ test7 ]
> OFFLINE: [ test6 ]
>
> Full list of resources:
>
>  VirtualIP (ocf::heartbeat:IPaddr2): Stopped
>  Master/Slave Set: ovndb_servers-master [ovndb_servers]
>      Stopped: [ test6 test7 ]
>
> While the node is stuck in the panic, no new leader gets promoted, which
> is bad; the cluster only recovers after I force-reboot the box. The
> promote logic should proceed without manual intervention, which is not
> happening here.
> Without your patch I see the same results: only after I reboot the stuck
> master does the cluster recover.
>
>
> Also, I noticed the IPaddr2 resource shows the error below:
> 2018-05-18T21:36:13.794Z|00005|ovsdb_error|ERR|unexpected ovsdb error: Server ID check failed: Self replicating is not allowed
> 2018-05-18T21:36:13.795Z|00006|ovsdb_jsonrpc_server|INFO|tcp:192.168.220.107:59864: disconnecting (making server read/write)
>
> So I think this kind of race condition is expected, as I am seeing it with
> the LB code too.
>
>
>
I tried your commands and for some reason they didn't work for me.

I tested with the below commands and it works as expected. When I trigger a
kernel panic on the master node, pacemaker promotes another node to master.
I suspect there is some issue with the ordering of the resources. If you
want, you can give it a shot using the commands below.

*******************************************
$ cat setup_pcs_resources.sh
rm -f tmp-cib*
pcs resource delete ip-192.168.121.100
pcs resource delete ovndb_servers

sleep 5
pcs status

pcs cluster cib tmp-cib.xml
cp tmp-cib.xml tmp-cib.xml.deltasrc

pcs -f tmp-cib.xml resource create ip-192.168.121.100 ocf:heartbeat:IPaddr2 \
    ip=192.168.121.100 op monitor interval=30s
pcs -f tmp-cib.xml resource create ovndb_servers ocf:ovn:ovndb-servers \
    manage_northd=no master_ip=192.168.121.100 nb_master_port=6641 \
    sb_master_port=6642 --master
pcs -f tmp-cib.xml resource meta ovndb_servers-master notify=true
pcs -f tmp-cib.xml constraint order start ip-192.168.121.100 then promote \
    ovndb_servers-master
pcs -f tmp-cib.xml constraint colocation add ip-192.168.121.100 with master \
    ovndb_servers-master

pcs cluster cib-push tmp-cib.xml diff-against=tmp-cib.xml.deltasrc

***********************************************************
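As a rough sanity check (not part of the setup script itself, just the
obvious follow-up), after pushing the CIB you can confirm the constraints
took effect and then repeat the kernel-panic test:

    pcs constraint                   # ordering should be: start the VIP, then promote ovndb_servers-master
    pcs status                       # VIP and ovndb_servers master on the same node
    echo c > /proc/sysrq-trigger     # run on the current master node
    crm_mon -1                       # run on the surviving node; after a short
                                     # delay it should show ovndb_servers promoted
                                     # there, with the VIP started on the same node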

Let me know how it goes.
>
> Regards,
>
> On Fri, May 18, 2018 at 12:02 PM, Numan Siddique <nusiddiq at redhat.com>
> wrote:
>
>>
>>
>> On Fri, May 18, 2018 at 11:53 PM, aginwala <aginwala at asu.edu> wrote:
>>
>>>
>>>
>>> On Thu, May 17, 2018 at 11:23 PM, Numan Siddique <nusiddiq at redhat.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Fri, May 18, 2018 at 4:24 AM, aginwala <aginwala at asu.edu> wrote:
>>>>
>>>>> Hi:
>>>>>
>>>>> I tried it and it didn't help; the IP resource always shows as
>>>>> Stopped. My private VIP is 192.168.220.108.
>>>>> # kernel panic on  active node
>>>>> root@test7:~# echo c > /proc/sysrq-trigger
>>>>>
>>>>>
>>>>> root@test6:~# crm stat
>>>>> Last updated: Thu May 17 22:46:38 2018 Last change: Thu May 17
>>>>> 22:45:03 2018 by root via cibadmin on test6
>>>>> Stack: corosync
>>>>> Current DC: test7 (version 1.1.14-70404b0) - partition with quorum
>>>>> 2 nodes and 3 resources configured
>>>>>
>>>>> Online: [ test6 test7 ]
>>>>>
>>>>> Full list of resources:
>>>>>
>>>>>  VirtualIP (ocf::heartbeat:IPaddr2): Started test7
>>>>>  Master/Slave Set: ovndb_servers-master [ovndb_servers]
>>>>>      Masters: [ test7 ]
>>>>>      Slaves: [ test6 ]
>>>>>
>>>>> root@test6:~# crm stat
>>>>> Last updated: Thu May 17 22:46:38 2018 Last change: Thu May 17
>>>>> 22:45:03 2018 by root via cibadmin on test6
>>>>> Stack: corosync
>>>>> Current DC: test6 (version 1.1.14-70404b0) - partition WITHOUT quorum
>>>>> 2 nodes and 3 resources configured
>>>>>
>>>>> Online: [ test6 ]
>>>>> OFFLINE: [ test7 ]
>>>>>
>>>>> Full list of resources:
>>>>>
>>>>>  VirtualIP (ocf::heartbeat:IPaddr2): Stopped
>>>>>  Master/Slave Set: ovndb_servers-master [ovndb_servers]
>>>>>      Slaves: [ test6 ]
>>>>>      Stopped: [ test7 ]
>>>>>
>>>>> root@test6:~# crm stat
>>>>> Last updated: Thu May 17 22:49:26 2018 Last change: Thu May 17
>>>>> 22:45:03 2018 by root via cibadmin on test6
>>>>> Stack: corosync
>>>>> Current DC: test6 (version 1.1.14-70404b0) - partition WITHOUT quorum
>>>>> 2 nodes and 3 resources configured
>>>>>
>>>>> Online: [ test6 ]
>>>>> OFFLINE: [ test7 ]
>>>>>
>>>>> Full list of resources:
>>>>>
>>>>>  VirtualIP (ocf::heartbeat:IPaddr2): Stopped
>>>>>  Master/Slave Set: ovndb_servers-master [ovndb_servers]
>>>>>      Stopped: [ test6 test7 ]
>>>>>
>>>>> I think this change is not needed, or something else is wrong when
>>>>> using the virtual IP resource.
>>>>>
>>>>
>>>> Hi Aliasgar, I think you haven't created the resource properly, or
>>>> haven't set the colocation constraints properly. What pcs/crm commands
>>>> did you use to create the OVN db resources? Can you share the output of
>>>> "pcs resource show ovndb_servers" and "pcs constraint"?
>>>> In the case of tripleo we create the resource like this -
>>>> https://github.com/openstack/puppet-tripleo/blob/master/manifests/profile/pacemaker/ovn_northd.pp#L80
>>>>
>>>
>>> >>>>> # I am using the same commands suggested upstream in the OVS
>>> document to create the resources; I am skipping the manage_northd option
>>> and keeping the default inactivity probe interval:
>>> http://docs.openvswitch.org/en/latest/topics/integration/#ha-for-ovn-db-servers-using-pacemaker
>>> # cat pcs_with_ipaddr2.sh
>>> pcs resource create VirtualIP ocf:heartbeat:IPaddr2 \
>>>   params ip="192.168.220.108" op monitor interval="30s"
>>> pcs resource create ovndb_servers ocf:ovn:ovndb-servers \
>>>      master_ip="192.168.220.108" \
>>>      op monitor interval="10s" \
>>>      op monitor role=Master interval="15s" --debug
>>> pcs resource master ovndb_servers-master ovndb_servers \
>>>     meta notify="true"
>>> pcs constraint order promote ovndb_servers-master then VirtualIP
>>>
>>
>> I think the ordering should be reversed. We want pacemaker to start the
>> IPaddr2 resource first and then start the ovndb_servers resource. Maybe
>> we need to update the document.
>>
>> Can you please try with the command "pcs constraint order VirtualIP then
>> ovndb_servers-master"? I think that's why, in your setup, the IPaddr2
>> resource is not started.
>>
>> Thanks
>> Numan
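To apply that change on an already-configured cluster, the old ordering
constraint has to be removed and the reversed one added. A minimal sketch;
the constraint id below is only illustrative, take the real one from the
output of "pcs constraint show --full":

    pcs constraint show --full
    pcs constraint remove order-ovndb_servers-master-VirtualIP-mandatory
    pcs constraint order start VirtualIP then promote ovndb_servers-master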
>>
>>
>>
>>
>>> pcs constraint colocation add VirtualIP with master ovndb_servers-master
>>> \
>>>     score=INFINITY
>>>
>>> # pcs resource show ovndb_servers
>>>  Resource: ovndb_servers (class=ocf provider=ovn type=ovndb-servers)
>>>   Attributes: master_ip=192.168.220.108
>>>   Operations: start interval=0s timeout=30s
>>> (ovndb_servers-start-interval-0s)
>>>               stop interval=0s timeout=20s (ovndb_servers-stop-interval-0s)
>>>               promote interval=0s timeout=50s
>>> (ovndb_servers-promote-interval-0s)
>>>               demote interval=0s timeout=50s
>>> (ovndb_servers-demote-interval-0s)
>>>               monitor interval=10s (ovndb_servers-monitor-interval-10s)
>>>               monitor interval=15s role=Master
>>> (ovndb_servers-monitor-interval-15s)
>>> # pcs constraint
>>> Location Constraints:
>>> Ordering Constraints:
>>>   promote ovndb_servers-master then start VirtualIP (kind:Mandatory)
>>> Colocation Constraints:
>>>   VirtualIP with ovndb_servers-master (score:INFINITY)
>>> (rsc-role:Started) (with-rsc-role:Master)
>>>
>>>>
>>>>
>>>>>
>>>>> Maybe you need promotion logic similar to what we have for LB with
>>>>> pacemaker in the discussion (I will submit a formal patch soon). I did
>>>>> test a kernel panic with the LB code change and it works fine; node2
>>>>> gets promoted. The below works fine for LB even with a kernel panic,
>>>>> without this change:
>>>>>
>>>>
>>>> This issue is not seen all the time. I have another setup where I don't
>>>> see this issue at all. The issue is seen when the IPaddr2 resource is
>>>> moved to another slave node and the ovsdb-servers start reporting as
>>>> master as soon as the IP address is configured.
>>>>
>>>> When the issue is seen we hit the code here -
>>>> https://github.com/openvswitch/ovs/blob/master/ovn/utilities/ovndb-servers.ocf#L412.
>>>> Ideally, when the promote action is called, the ovsdb servers will be
>>>> running as slaves/standby and the promote action promotes them to
>>>> master. But when the issue is seen, the ovsdb servers report their
>>>> status as active. Because of that we don't complete the full promote
>>>> action and return at L412. Later, when the notify action is called, we
>>>> demote the servers because of this -
>>>> https://github.com/openvswitch/ovs/blob/master/ovn/utilities/ovndb-servers.ocf#L176
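A quick way to see what state the servers themselves report at the moment
promote fires is to ask them directly; a small check, assuming the default
run directory and the unixctl socket names visible in the ps output further
down in this thread:

    ovs-appctl -t /var/run/openvswitch/ovnnb_db.ctl ovsdb-server/sync-status
    ovs-appctl -t /var/run/openvswitch/ovnsb_db.ctl ovsdb-server/sync-status

A node that pacemaker still considers a slave but whose ovsdb-servers
already report "state: active" is exactly the situation described above,
and it is what makes the promote action bail out early at L412.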
>>>>
>>>> >>> Yes, I agree! As you said, the settings work fine in one cluster,
>>> and if you use another cluster with the same settings, you may see
>>> surprises.
>>>
>>>
>>>> For a use case like yours (where a load balancer VIP is used), you may
>>>> not see this issue at all, since you will not be using the IPaddr2
>>>> resource as the master IP.
>>>>
>>> >>> Correct, I just wanted to report both settings to let you know the
>>> pacemaker behavior with IPaddr2 vs. the LB VIP.
>>>
>>>>
>>>>
>>>>> root@test-pace1-2365293:~# echo c > /proc/sysrq-trigger
>>>>> root@test-pace2-2365308:~# crm stat
>>>>> Last updated: Thu May 17 15:15:45 2018 Last change: Wed May 16
>>>>> 23:10:52 2018 by root via cibadmin on test-pace2-2365308
>>>>> Stack: corosync
>>>>> Current DC: test-pace1-2365293 (version 1.1.14-70404b0) - partition
>>>>> with quorum
>>>>> 2 nodes and 2 resources configured
>>>>>
>>>>> Online: [ test-pace1-2365293 test-pace2-2365308 ]
>>>>>
>>>>> Full list of resources:
>>>>>
>>>>>  Master/Slave Set: ovndb_servers-master [ovndb_servers]
>>>>>      Masters: [ test-pace1-2365293 ]
>>>>>      Slaves: [ test-pace2-2365308 ]
>>>>>
>>>>> root@test-pace2-2365308:~# crm stat
>>>>> Last updated: Thu May 17 15:15:45 2018 Last change: Wed May 16
>>>>> 23:10:52 2018 by root via cibadmin on test-pace2-2365308
>>>>> Stack: corosync
>>>>> Current DC: test-pace2-2365308 (version 1.1.14-70404b0) - partition
>>>>> WITHOUT quorum
>>>>> 2 nodes and 2 resources configured
>>>>>
>>>>> Online: [ test-pace2-2365308 ]
>>>>> OFFLINE: [ test-pace1-2365293 ]
>>>>>
>>>>> Full list of resources:
>>>>>
>>>>>  Master/Slave Set: ovndb_servers-master [ovndb_servers]
>>>>>      Slaves: [ test-pace2-2365308 ]
>>>>>      Stopped: [ test-pace1-2365293 ]
>>>>>
>>>>> root@test-pace2-2365308:~# ps aux | grep ovs
>>>>> root     15175  0.0  0.0  18048   372 ?        Ss   15:15   0:00
>>>>> ovsdb-server: monitoring pid 15176 (healthy)
>>>>> root     15176  0.0  0.0  18312  4096 ?        S    15:15   0:00
>>>>> ovsdb-server -vconsole:off -vfile:info --log-file=/var/log/openvswitch/ovsdb-server-nb.log
>>>>> --remote=punix:/var/run/openvswitch/ovnnb_db.sock
>>>>> --pidfile=/var/run/openvswitch/ovnnb_db.pid --unixctl=ovnnb_db.ctl
>>>>> --detach --monitor --remote=db:OVN_Northbound,NB_Global,connections
>>>>> --private-key=db:OVN_Northbound,SSL,private_key
>>>>> --certificate=db:OVN_Northbound,SSL,certificate
>>>>> --ca-cert=db:OVN_Northbound,SSL,ca_cert --ssl-protocols=db:OVN_Northbound,SSL,ssl_protocols
>>>>> --ssl-ciphers=db:OVN_Northbound,SSL,ssl_ciphers
>>>>> --remote=ptcp:6641:0.0.0.0 --sync-from=tcp:192.0.2.254:6641
>>>>> /etc/openvswitch/ovnnb_db.db
>>>>> root     15184  0.0  0.0  18048   376 ?        Ss   15:15   0:00
>>>>> ovsdb-server: monitoring pid 15185 (healthy)
>>>>> root     15185  0.0  0.0  18300  4480 ?        S    15:15   0:00
>>>>> ovsdb-server -vconsole:off -vfile:info --log-file=/var/log/openvswitch/ovsdb-server-sb.log
>>>>> --remote=punix:/var/run/openvswitch/ovnsb_db.sock
>>>>> --pidfile=/var/run/openvswitch/ovnsb_db.pid --unixctl=ovnsb_db.ctl
>>>>> --detach --monitor --remote=db:OVN_Southbound,SB_Global,connections
>>>>> --private-key=db:OVN_Southbound,SSL,private_key
>>>>> --certificate=db:OVN_Southbound,SSL,certificate
>>>>> --ca-cert=db:OVN_Southbound,SSL,ca_cert --ssl-protocols=db:OVN_Southbound,SSL,ssl_protocols
>>>>> --ssl-ciphers=db:OVN_Southbound,SSL,ssl_ciphers
>>>>> --remote=ptcp:6642:0.0.0.0 --sync-from=tcp:192.0.2.254:6642
>>>>> /etc/openvswitch/ovnsb_db.db
>>>>> root     15398  0.0  0.0  12940   972 pts/0    S+   15:15   0:00 grep
>>>>> --color=auto ovs
>>>>>
>>>>> >>> I just want to point out that I am also seeing the errors below
>>>>> when setting the target with the master IP using the IPaddr2 resource:
>>>>> 2018-05-17T21:58:51.889Z|00011|ovsdb_jsonrpc_server|ERR|ptcp:6641:192.168.220.108: listen failed: Cannot assign requested address
>>>>> 2018-05-17T21:58:51.889Z|00012|socket_util|ERR|6641:192.168.220.108: bind: Cannot assign requested address
>>>>> That needs to be handled too, since the existing code does throw this
>>>>> error. The error goes away only if I skip setting the target.
>>>>>
>>>>
>>>> In the case of tripleo, we handle this error by setting the sysctl
>>>> value net.ipv4.ip_nonlocal_bind to 1 -
>>>> https://github.com/openstack/puppet-tripleo/blob/master/manifests/profile/pacemaker/ovn_northd.pp#L67
>>>> >>> Sweet, I can try setting this to get rid of the socket error.
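For reference, that sysctl can be set on the fly and persisted; persisting
it under /etc/sysctl.d is an assumption about the target distro rather than
something taken from this thread:

    sysctl -w net.ipv4.ip_nonlocal_bind=1
    echo 'net.ipv4.ip_nonlocal_bind = 1' > /etc/sysctl.d/99-ovn-nonlocal-bind.conf
    sysctl --system

With nonlocal bind enabled, the ovsdb-servers can bind to the master VIP
even before IPaddr2 has moved the address onto the node, which is what the
"Cannot assign requested address" errors above are complaining about.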
>>>>
>>>>
>>>>>
>>>>>
>>>>>
>>>>> Regards,
>>>>> Aliasgar
>>>>>
>>>>>
>>>>> On Thu, May 17, 2018 at 3:04 AM, <nusiddiq at redhat.com> wrote:
>>>>>
>>>>>> From: Numan Siddique <nusiddiq at redhat.com>
>>>>>>
>>>>>> When a node 'A' in the pacemaker cluster running the OVN db servers as
>>>>>> master is brought down ungracefully ('echo b > /proc/sysrq-trigger' for
>>>>>> example), pacemaker is not able to promote any other node to master in
>>>>>> the cluster. When pacemaker selects a node 'B', for instance, to
>>>>>> promote, it moves the IPaddr2 resource (i.e. the master ip) to node
>>>>>> 'B'. As soon as the node is configured with the IP address, when the
>>>>>> issue is seen, the OVN db servers which were running as standby earlier
>>>>>> transition to active. Ideally this should not happen; the ovsdb-servers
>>>>>> are expected to remain in standby until they are promoted (this needs
>>>>>> separate investigation). When pacemaker calls the OVN OCF script's
>>>>>> promote action, the ovsdb_server_promote function returns almost
>>>>>> immediately without recording the present master. Later, in the notify
>>>>>> action, it demotes the OVN db servers again because the last known
>>>>>> master doesn't match node 'B's hostname. This results in pacemaker
>>>>>> promoting and demoting in a loop.
>>>>>>
>>>>>> This patch fixes the issue by not returning immediately when the
>>>>>> promote action is called while the OVN db servers are running as
>>>>>> active. It now continues with the ovsdb_server_promote function and
>>>>>> records the new master by setting the proper master score
>>>>>> ($CRM_MASTER -N $host_name -v ${master_score}).
>>>>>>
>>>>>> This issue is not seen when a node is brought down gracefully, because
>>>>>> pacemaker then calls the stop, start and promote actions before
>>>>>> promoting a node. It is not clear why pacemaker doesn't call the stop,
>>>>>> start and promote actions when a node is reset ungracefully.
>>>>>>
>>>>>> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1579025
>>>>>> Signed-off-by: Numan Siddique <nusiddiq at redhat.com>
>>>>>> ---
>>>>>>  ovn/utilities/ovndb-servers.ocf | 2 +-
>>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/ovn/utilities/ovndb-servers.ocf b/ovn/utilities/ovndb-servers.ocf
>>>>>> index 164b6bce6..23dc70056 100755
>>>>>> --- a/ovn/utilities/ovndb-servers.ocf
>>>>>> +++ b/ovn/utilities/ovndb-servers.ocf
>>>>>> @@ -409,7 +409,7 @@ ovsdb_server_promote() {
>>>>>>      rc=$?
>>>>>>      case $rc in
>>>>>>          ${OCF_SUCCESS}) ;;
>>>>>> -        ${OCF_RUNNING_MASTER}) return ${OCF_SUCCESS};;
>>>>>> +        ${OCF_RUNNING_MASTER}) ;;
>>>>>>          *)
>>>>>>              ovsdb_server_master_update $OCF_RUNNING_MASTER
>>>>>>              return ${rc}
>>>>>> --
>>>>>> 2.17.0
>>>>>>
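One way to confirm that the promote action now records the new master is to
look at the promotion score after a failover; this assumes the attribute
follows pacemaker's usual master-<resource> naming, so adjust the node and
resource names to your setup:

    crm_attribute --node test6 --name master-ovndb_servers --lifetime reboot --query
    crm_mon -A1    # shows transient node attributes, including master scores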
>>>>>> _______________________________________________
>>>>>> dev mailing list
>>>>>> dev at openvswitch.org
>>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>

