[ovs-discuss] OVN Scale with RAFT: how to make raft cluster clients to balanced state again

Winson Wang windson.wang at gmail.com
Wed Aug 5 19:51:26 UTC 2020


Hello OVN Experts:

In a large-scale ovn-k8s cluster, several conditions can move the
ovn-controller clients' connections to SB central from a balanced state to
an unbalanced state.
Is there an ongoing project to address this problem?
If not, I have a proposal, though I am not sure whether it is doable.
Please share your thoughts.

The issue:

With a 3-node OVN SB RAFT cluster, all the ovn-controller clients initially
connect across the 3 nodes in a balanced state.

The following conditions will make the connections become unbalanced.

   - One RAFT node restarts, and all the ovn-controller clients connected to
     it reconnect to the two remaining cluster nodes.

   - In ovn-k8s, after a rolling upgrade of the SB RAFT pods, the last RAFT
     pod upgraded has no client connections.
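
To illustrate the second condition, here is a minimal Python sketch of a
rolling restart. It is a toy model, not OVN code: it assumes each client of a
restarting node reconnects to a randomly chosen other node and then stays
there. The node names and the helper names (restart_node, rolling_upgrade)
are made up for the example.

```python
import random
from collections import Counter


def restart_node(assignment, node, nodes, rng):
    """Move every client connected to `node` to a random other node.

    Assumption: a disconnected client picks one of the other remotes at
    random and does not move back when `node` comes up again.
    """
    others = [n for n in nodes if n != node]
    for client, current in assignment.items():
        if current == node:
            assignment[client] = rng.choice(others)


def rolling_upgrade(num_clients, nodes, seed=0):
    """Restart each node once, in order, and return connections per node."""
    rng = random.Random(seed)
    # Start balanced: clients are spread round-robin over the nodes.
    assignment = {c: nodes[c % len(nodes)] for c in range(num_clients)}
    for node in nodes:
        restart_node(assignment, node, nodes, rng)
    return Counter(assignment.values())


if __name__ == "__main__":
    counts = rolling_upgrade(300, ["sb-0", "sb-1", "sb-2"])
    for name in ["sb-0", "sb-1", "sb-2"]:
        print(name, counts[name])
```

Under these assumptions the node restarted last always ends with zero
connections, because nothing pushes clients back to it once it is up again.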


RAFT clients in an unbalanced state put more stress on the RAFT cluster,
which makes it less stable under load than it is in a balanced state.

The proposed solution:

Ovn-controller adds a new unix command, “reconnect”, that takes the IP of
the preferred SB node as its argument.

When an unbalanced state occurs, the unix command can make ovn-controller
reconnect to the new SB RAFT node with fast sync, which does not trigger
the whole DB downloading process.
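
As a rough sketch of how the proposal might be driven, the Python below
computes which clients would need to move, and to which node, to get back to
an even spread. It only produces a plan; the "reconnect" command itself is
the proposal above and is not assumed to exist. The function name
plan_rebalance and the client and node names are hypothetical, and the sketch
assumes every client is currently connected to one of the listed nodes.

```python
from collections import defaultdict


def plan_rebalance(assignment, nodes):
    """Return a list of (client, target_node) moves that evens out load.

    `assignment` maps client name -> node it is connected to now.
    Assumption: every node in `assignment` also appears in `nodes`.
    """
    by_node = defaultdict(list)
    for client, node in sorted(assignment.items()):
        by_node[node].append(client)

    # Every node gets `base` clients; the `extra` leftovers stay on the
    # currently busiest nodes so that fewer clients have to move.
    base, extra = divmod(len(assignment), len(nodes))
    ordered = sorted(nodes, key=lambda n: len(by_node[n]), reverse=True)
    target = {n: base + (1 if i < extra else 0) for i, n in enumerate(ordered)}

    # Collect clients from overloaded nodes...
    surplus = []
    for n in nodes:
        while len(by_node[n]) > target[n]:
            surplus.append(by_node[n].pop())

    # ...and hand them to underloaded nodes.
    moves = []
    for n in nodes:
        while len(by_node[n]) < target[n]:
            client = surplus.pop()
            by_node[n].append(client)
            moves.append((client, n))
    return moves


if __name__ == "__main__":
    # Unbalanced state after a rolling upgrade: sb-2 has no clients.
    current = {f"chassis-{i}": ("sb-0" if i % 2 == 0 else "sb-1")
               for i in range(300)}
    plan = plan_rebalance(current, ["sb-0", "sb-1", "sb-2"])
    print(len(plan), "clients to move")
```

Each (client, target_node) pair in the plan is one place where the proposed
command would be invoked on that client with the target node's IP.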


-- 
Winson