[ovs-discuss] Question to OVN DB pacemaker script

Han Zhou zhouhan at gmail.com
Wed May 9 15:32:23 UTC 2018


Hi Numan,

Thank you so much for the detailed answer! Please see my comments inline.

On Wed, May 9, 2018 at 7:41 AM, Numan Siddique <nusiddiq at redhat.com> wrote:

> Hi Han,
>
> Please see below for inline comments
>
> On Wed, May 9, 2018 at 5:17 AM, Han Zhou <zhouhan at gmail.com> wrote:
>
>> Hi Babu/Numan,
>>
>> I have a question regarding OVN pacemaker OCF script.
>> I see in the script MASTER_IP is used to start the active DB and standby
>> DBs will use that IP to sync from.
>>
>> In the Documentation/topics/integration.rst it is also mentioned:
>>
>> `master_ip` is the IP address on which the active database server is
>> expected to be listening, the slave node uses it to connect to the master
>> node.
>>
>> However, since the active node will change after failover, I wonder if we
>> should provide the IPs of all the nodes and let pacemaker decide,
>> dynamically, which IP is the master IP.
>>
>
>
>
>> I see in the documentation it is mentioned about using the IPAddr2
>> resource for virtual IP. Does it indicate that we should use the virtual IP
>> as the master IP?
>>
>
> That is true. If the master ip is not a virtual ip, then we will not be
> able to figure out which node is the master. We need to configure
> networking-ovn and ovn-controller to point to the right master node so that
> they can do write transactions on the DB.
>
> Below is how we have configured pacemaker OVN HA dbs in tripleo openstack
> deployment
>
>  - The tripleo deployment creates many virtual IPs (using IPAddr2); these
> IP addresses are the frontend IPs for keystone and all the other openstack
> API services, and haproxy is used to load balance the traffic (the
> deployment will mostly have 3 controllers, with all the openstack API
> services running on each node).
>
>  - We choose one of the IPaddr2 virtual IPs and set a colocation
> constraint when creating the OVN pacemaker HA db resource, i.e. we ask
> pacemaker to promote the ovsdb-servers running on the node configured with
> the virtual ip (i.e. the master_ip). Pacemaker will call the promote action
> [1] on the node where the master ip is configured.
>
> - tripleo configures "ovn_nb_connection=tcp:VIP:6641" and
> "ovn_sb_connection=tcp:VIP:6642" in neutron.conf, and runs "ovs-vsctl set
> open . external_ids:ovn-remote=tcp:VIP:6642" on all the nodes where the
> ovn-controller service is started.
>
> - Suppose the master ip node goes down for some reason. Pacemaker detects
> this, moves the virtual ip IPAddr2 resource to another node, and promotes
> the ovsdb-servers running on that node to master. This way, the
> neutron-servers and ovn-controllers can still talk to the same IP without
> even noticing that another node has become the master.
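For reference, the setup described above can be sketched roughly as follows.
This is an illustrative sketch only: the resource names, the VIP address
172.16.0.10, and the agent name ocf:ovn:ovndb-servers are example values,
not the actual tripleo configuration.

```shell
# Virtual IP managed by pacemaker
pcs resource create ovn-vip ocf:heartbeat:IPaddr2 ip=172.16.0.10 \
    cidr_netmask=24

# OVN DB servers as a master/slave resource, with the chosen virtual IP
# passed as master_ip
pcs resource create ovndb_servers ocf:ovn:ovndb-servers \
    master_ip=172.16.0.10 --master

# Promote the ovsdb-servers only on the node holding the virtual IP
pcs constraint colocation add master ovndb_servers-master with ovn-vip
pcs constraint order start ovn-vip then promote ovndb_servers-master

# neutron.conf on the controllers then points at the VIP:
#   [ovn]
#   ovn_nb_connection = tcp:172.16.0.10:6641
#   ovn_sb_connection = tcp:172.16.0.10:6642

# and every node running ovn-controller is configured with:
ovs-vsctl set open . external_ids:ovn-remote=tcp:172.16.0.10:6642
```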
>
>
>
> Since tripleo was using the IPaddr2 model, we thought this would be a good
> way to have master/slave HA for the ovsdb-servers.
>
>> However, this may not work in all scenarios, since the virtual IP works
>> only if it can be routed to all nodes, e.g. when all nodes are on the same
>> subnet.
>>
>
> You mean you want to create a pacemaker cluster with nodes belonging to
> different subnets? I had a chat with the pacemaker folks and this is
> possible. You can also create an IPAddr2 resource; pacemaker doesn't put
> any restrictions on it. But you need to solve the reachability of that ip
> from all the networks/nodes.
>

Yes, and this is why we can't use IPAddr2: the reachability problem (the
nodes are not in the same L2 domain, there is no BGP, etc.).


>> In those cases the IPAddr2 virtual IP won't work, and for the clients to
>> access the DB we can use a load-balancer VIP instead. But the problem is
>> still how to set the master_ip and how to make the standby connect to the
>> new active after failover.
>>
>
> I am a bit confused here. Your setup will still have the pacemaker
> cluster, right? Are you talking about having the OVN db servers in an
> active/passive setup on a non-pacemaker cluster? If so, I don't think the
> OVN OCF script can be used and you have to solve it differently. Correct me
> if I am wrong here.
>
>
> You mentioned above "However, since active node will change after failover,
> I wonder if we should provide all the IPs of each nodes, and let pacemaker
> to decide which IP is the master IP to be used, dynamically".
>
> We can definitely add this support. Whenever pacemaker promotes a node,
> the other nodes come to know about it, and the OVN OCF script can configure
> the ovsdb-servers on the slave nodes to connect to the new master. But how
> will you configure the neutron-server and ovn-controllers to talk to the
> new master?
> Are you planning to use a load balancer IP for this purpose? What if the
> load balancer ip resolves to a standby server?
>

We still have pacemaker to manage the cluster HA; we just don't use IPAddr2
for the VIP. To solve the VIP problem, we use a physical/soft load-balancer:
the VIP lives on the LB rather than being bound to an interface on the OVN
central node. There is no problem for the clients, but there is a small
problem in the OCF script. The script uses the master IP to start the active
OVSDB server, but the master IP (now the LB VIP) is not attached to the
node's interface, so this will fail. Now that you have explained the usage
of the master IP, I think a small change can solve the problem: don't use
the master IP when starting the active OVSDB service, i.e. listen on
0.0.0.0. The standby OVSDBs will continue using the master IP to sync from
the active. The standbys should not listen on any port (or at least on a
different port from the active, if they have to), so that the LB
health-check can identify the active member and point the VIP/master IP at
it.
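A rough sketch of what that change would mean for the ovsdb-server
invocations (the VIP 172.16.0.10, the ports, and the file paths are
illustrative; the real OCF script passes many more options):

```shell
# Active node: listen on 0.0.0.0 instead of binding the master IP
# (NB DB shown; the SB DB is analogous on port 6642)
ovsdb-server --detach --pidfile=ovnnb_db.pid \
    --remote=ptcp:6641:0.0.0.0 \
    /etc/openvswitch/ovnnb_db.db

# Standby nodes: replicate from the master IP (here the LB VIP) and do
# not listen on port 6641, so the LB health check can tell the active
# member apart from the standbys
ovsdb-server --detach --pidfile=ovnnb_db.pid \
    --sync-from=tcp:172.16.0.10:6641 \
    /etc/openvswitch/ovnnb_db.db

# The load-balancer health check then only sees the active member,
# e.g. with haproxy (illustrative):
#   listen ovn-nb
#       bind 172.16.0.10:6641
#       server node1 10.0.1.11:6641 check
#       server node2 10.0.2.12:6641 check
#       server node3 10.0.3.13:6641 check
```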

In addition, how do you configure northd for the NB/SB DBs? I think both
the master-ip/vip and the unix socket should work, but they behave
differently. If using the master-ip/vip, northd can be active on any one of
the nodes, not necessarily co-located with the NB/SB DBs, and the ovsdb
named lock ensures only one instance is active. However, it seems we can
also use the unix socket to always connect to the local NB/SB. Since NB/SB
are managed as a single pacemaker resource, they fail over together, so we
can consider ovn-northd part of the bundle (but not managed by pacemaker).
This way, although all the northds are running, only the one on the active
NB/SB node matters, and the ovsdb named lock is irrelevant here.
Any thoughts/experience on this?
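The two options could look roughly like this (the VIP address and the
socket paths are example values):

```shell
# Option 1: northd connects over the VIP and can run on any node; the
# ovsdb named lock ensures only one instance is actually active
ovn-northd --ovnnb-db=tcp:172.16.0.10:6641 \
           --ovnsb-db=tcp:172.16.0.10:6642

# Option 2: northd always connects to the local DBs over unix sockets;
# only the instance co-located with the active NB/SB does useful work
ovn-northd --ovnnb-db=unix:/var/run/openvswitch/ovnnb_db.sock \
           --ovnsb-db=unix:/var/run/openvswitch/ovnsb_db.sock
```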

Alternatively, we could also use separate pacemaker resources for NB and SB,
so that each component (NB/SB/northd) can be active/standby independently of
the others, on different nodes, but I am not sure whether that brings more
benefit or just more churn.

>
> Hope this helps.
>
> If you have a requirement to support this scenario (i.e without master_ip
> param), it can be done. But care should be taken when implementing it.
>
So far it seems we can still use master_ip, but with the small change
mentioned above.

>
> [1] - https://github.com/openvswitch/ovs/blob/master/ovn/utilities/ovndb-servers.ocf#L505
>        http://www.linux-ha.org/doc/dev-guides/_resource_agent_actions.html
>
>
>
>> I may have missed something here. Could you help explain how this is
>> expected to work?
>>
>
>>
>> Thanks,
>> Han
>>
>


More information about the discuss mailing list