[ovs-discuss] OVN Scale with RAFT: how to make raft cluster clients to balanced state again

Girish Moodalbail gmoodalbail at gmail.com
Thu Aug 6 01:21:48 UTC 2020


On Wed, Aug 5, 2020 at 5:23 PM Han Zhou <zhouhan at gmail.com> wrote:

>
>
> On Wed, Aug 5, 2020 at 4:35 PM Girish Moodalbail <gmoodalbail at gmail.com>
> wrote:
>
>>
>>
>> On Wed, Aug 5, 2020 at 3:05 PM Han Zhou <hzhou at ovn.org> wrote:
>>
>>>
>>>
>>> On Wed, Aug 5, 2020 at 12:51 PM Winson Wang <windson.wang at gmail.com>
>>> wrote:
>>>
>>>> Hello OVN Experts:
>>>>
>>>> In a large-scale ovn-k8s cluster, several conditions can move the
>>>> ovn-controller clients' connections to the SB central nodes from a
>>>> balanced state to an unbalanced state.
>>>> Is there an ongoing project to address this problem?
>>>> If not, I have a proposal, though I am not sure whether it is doable.
>>>> Please share your thoughts.
>>>>
>>>> The issue:
>>>>
>>>> With a 3-node OVN SB RAFT cluster, the ovn-controller clients
>>>> initially connect to all 3 nodes in a balanced state.
>>>>
>>>> The following conditions make the connections unbalanced.
>>>>
>>>>    - One RAFT node restarts, and all the ovn-controller clients
>>>>      reconnect to the two remaining cluster nodes.
>>>>
>>>>    - In ovn-k8s, after a rolling upgrade of the SB RAFT pods, the last
>>>>      RAFT pod has no client connections.
>>>>
>>>>
>>>> RAFT clients in an unbalanced state put more stress on the RAFT
>>>> cluster, which makes it less stable under load than in a balanced
>>>> state.
>>>>
>>>> The proposed solution:
>>>>
>>>> ovn-controller adds a new UNIX command, "reconnect", that takes the IP
>>>> address of the preferred SB node as an argument.
>>>>
>>>> When an unbalanced state occurs, the UNIX command can trigger
>>>> ovn-controller to reconnect to a new SB RAFT node with fast sync,
>>>> which does not trigger a download of the whole DB.
>>>>
>>>>
>>> Thanks Winson. The proposal sounds good to me. Will you implement it?
>>>
>>
>> Han/Winson,
>>
>> The fast re-sync is for an ovsdb-server restart and does not apply to an
>> ovn-controller restart, right?
>>
>>
> Right, but the proposal is to provide a command just to reconnect, without
> restarting. In that case fast-resync should work.
>
>
>> If the ovsdb-client (ovn-controller) restarts, it loses all its state,
>> and when it starts again it still needs to download logical_flows,
>> port_bindings, and the other tables it cares about. So, fast re-sync
>> may not apply to this case.
>>
>> Also, ovn-controller should stash the IP address of the SB server to
>> which it is connected in the external_ids column of the Open_vSwitch
>> table. It would update this field whenever it reconnects to a different
>> SB server (because that ovsdb-server instance failed or restarted). When
>> ovn-controller itself restarts, it could check the value in this field
>> and try to connect to that server first, falling back to the default
>> connection approach on failure.
>>
>
> The imbalance is usually caused by failover on server side. When one
> server is down, all clients are expected to connect to the rest of the
> servers, and when the server is back, there is no motivation for the
> clients to reconnect again (unless you purposely restart the clients, which
> would bring 1/3 of the restarted clients back to the old server). So I
> don't understand how "stash the IP address" would work in this scenario.
>
> The proposal above by Winson is to purposely trigger a reconnection
> towards the desired server without restarting the clients, which I think
> solves this problem directly.
>

Right. This is what we discussed internally; however, when I read this email
on the list I confused it with the other thread (rolling update of
ovn-controller in a K8s cluster, which involves restarting ovn-controller).
Sorry for the noise.
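
To make the rebalancing idea concrete, here is a rough sketch of how an
operator-side script might decide which clients to move once a server is
back. It only computes a plan from per-server connection counts; the
function name plan_moves and the example IPs are made up for illustration,
and the "reconnect" command it would eventually drive is still only the
proposal discussed above, so nothing here calls a real OVN interface.

```python
from typing import Dict, List, Tuple


def plan_moves(counts: Dict[str, int]) -> List[Tuple[str, str, int]]:
    """Return (source, destination, n_clients) moves that even out the load.

    counts maps an SB server address to its current number of
    ovn-controller connections (hypothetical input; how the counts are
    collected is out of scope for this sketch).
    """
    servers = sorted(counts)
    base, extra = divmod(sum(counts.values()), len(servers))
    # When the total is not divisible, the busiest servers keep the
    # leftover clients so that fewer reconnections are forced.
    by_load = sorted(servers, key=lambda s: -counts[s])
    target = {s: base + (1 if i < extra else 0) for i, s in enumerate(by_load)}

    surplus = [[s, counts[s] - target[s]] for s in servers if counts[s] > target[s]]
    deficit = [[s, target[s] - counts[s]] for s in servers if counts[s] < target[s]]

    moves = []
    while surplus and deficit:
        n = min(surplus[0][1], deficit[0][1])
        moves.append((surplus[0][0], deficit[0][0], n))
        surplus[0][1] -= n
        deficit[0][1] -= n
        if surplus[0][1] == 0:
            surplus.pop(0)
        if deficit[0][1] == 0:
            deficit.pop(0)
    return moves


if __name__ == "__main__":
    # One server came back after a restart and has no clients yet.
    after_restart = {"10.0.0.1": 450, "10.0.0.2": 450, "10.0.0.3": 0}
    for src, dst, n in plan_moves(after_restart):
        print(f"move {n} clients from {src} to {dst}")
```

Each resulting move would then translate into asking that many
ovn-controllers currently on the source server to reconnect to the
destination, without restarting them.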

Regards,
~Girish