[ovs-discuss] ovndb_servers can't be promoted

Numan Siddique nusiddiq at redhat.com
Tue Nov 28 14:48:05 UTC 2017


On Tue, Nov 28, 2017 at 2:29 PM, Hui Xiang <xianghuir at gmail.com> wrote:

> Hi Numan,
>
>
> I finally figured out what goes wrong when running the ovndb-servers OCF
> script in my environment.
>
> 1. There are no ovnnb and ovnsb servers running by default in my
> environment. I expected pacemaker to start them, the way other typical
> resource agents do.
> When I create the ovndb_servers resource, nothing happens: no operation
> is executed except monitor, which made this really hard to debug for a while.
> The ovsdb_server_monitor() function first checks the status, which
> returns NOT_RUNNING here. Then the ovsdb_server_master_update()
> function executes "CRM_MASTER -D", which appears to stop every
> following action. I am not very clear on what it does.
>
> So, do ovn_nb and ovn_sb need to be running before the pacemaker
> ovndb_servers resource is created? Is there any documentation that
> covers this?
>
> 2. Without your patch, every node executes ovsdb_server_monitor, which
> returns OCF_SUCCESS.
> However, the first node of the three-node cluster executes the
> ovsdb_server_stop action, for the reason shown below:
> <27>Nov 28 15:35:11 node-1 pengine[1897010]:    error: clone_color:
> ovndb_servers:0 is running on node-1.domain.tld which isn't allowed
> Did I miss anything? I don't understand why it isn't allowed.
>
> 3. Regarding your patch [1]:
> It first reports "/usr/lib/ocf/resource.d/ovn/ovndb-servers: line 26:
> ocf_attribute_target: command not found ]" in my environment (pacemaker
> 1.1.12).
>

Thanks. I will come back to you on your other points. The function
"ocf_attribute_target" must have been added in 1.1.16-12.

I think it makes sense to either remove "ocf_attribute_target" or find a
way so that even older versions work.
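
As a rough sketch of the second option (an illustration only, not the v2
patch): the script could define a fallback when the sourced ocf-shellfuncs
does not provide "ocf_attribute_target". The fallback body here is an
assumed simplification, not the upstream implementation: it returns an
explicitly passed node name, else OCF_RESKEY_CRM_meta_on_node if set, else
the output of "uname -n".

```shell
# Define a fallback only if the function is not already available.
if ! command -v ocf_attribute_target >/dev/null 2>&1; then
    ocf_attribute_target() {
        # Assumption: simplified stand-in, not the upstream implementation.
        if [ -n "$1" ]; then
            # A node name was passed explicitly; return it unchanged.
            echo "$1"
        elif [ -n "$OCF_RESKEY_CRM_meta_on_node" ]; then
            # Node name pacemaker exported for this operation, if any.
            echo "$OCF_RESKEY_CRM_meta_on_node"
        else
            # Last resort: the local host name.
            uname -n
        fi
    }
fi
```

With a guard like this, newer installations would keep using the function
they already ship, and only older ones would pick up the fallback.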

I will spin a v2.
Thanks
Numan



> The log shows the same as in item 2, but I very briefly saw a different
> state in "pcs status", as shown below:
>  Master/Slave Set: ovndb_servers-master [ovndb_servers]
>      Slaves: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
> There is no promote action being executed.
>
>
> Thanks for looking into this and for your help.
>
> [1] - https://patchwork.ozlabs.org/patch/839022/
>
>
>
>
>
> On Fri, Nov 24, 2017 at 10:54 PM, Numan Siddique <nusiddiq at redhat.com>
> wrote:
>
>> Hi Hui Xiang,
>>
>> Can you please try this patch [1] and see if it works for you?
>> Please let me know how it goes. I am not sure, though, whether the patch
>> will fix the issue.
>>
>> In brief, the OVN OCF script doesn't add a monitor action for the "Master"
>> role, so pacemaker would not check the status of the ovn db servers
>> periodically. If the ovn db servers are killed, pacemaker won't know
>> about it.
>>
>>
>>
>>
>> You can also take a look at [2] to see how it is used in openstack
>> with a tripleo installation.
>>
>> [1] - https://patchwork.ozlabs.org/patch/839022/
>> [2] - https://github.com/openstack/puppet-tripleo/blob/master/manifests/profile/pacemaker/ovn_northd.pp
>>
>>
>> Thanks
>> Numan
>>
>> On Fri, Nov 24, 2017 at 3:00 PM, Hui Xiang <xianghuir at gmail.com> wrote:
>>
>>> Hi folks,
>>>
>>>   I am following the suggestions in doc [1] to configure ovndb_servers
>>> HA. However, I have had no luck: even after upgrading the pacemaker
>>> packages from 1.12 to 1.16 and trying almost every kind of change, no
>>> ovndb_servers master is promoted. Is there any special recipe to get it
>>> running? I am quite frustrated with it, sigh.
>>>
>>> It always showed:
>>>  Master/Slave Set: ovndb_servers-master [ovndb_servers]
>>>      Stopped: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
>>>
>>> Even when I tried the steps below:
>>> 1. pcs resource debug-stop ovndb_server on every node.
>>>    ovn-ctl status_ovnxb: running/backup
>>> 2. pcs resource debug-start ovndb_server on every node.
>>>    ovn-ctl status_ovnxb: running/backup
>>> 3. pcs resource debug-promote ovndb_server on one node.
>>>    ovn-ctl status_ovnxb: running/active
>>>
>>> With the above status, pcs status still shows:
>>>  Master/Slave Set: ovndb_servers-master [ovndb_servers]
>>>      Stopped: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
>>>
>>>
>>> [1]. https://github.com/openvswitch/ovs/blob/master/Documentation/topics/integration.rst
>>>
>>> Any hint is appreciated.