[ovs-dev] [PATCH] ovsdb-cs: Avoid unnecessary re-connections when updating remotes.

Tue Jun 29 19:57:44 UTC 2021

On 6/29/21 8:05 PM, Ben Pfaff wrote:
> On Tue, Jun 29, 2021 at 10:29:59AM -0700, Han Zhou wrote:
>> On Tue, Jun 29, 2021 at 8:43 AM Ben Pfaff <blp at ovn.org> wrote:
>>>
>>> On Tue, Jun 29, 2021 at 12:56:18PM +0200, Ilya Maximets wrote:
>>>> If a new database server is added to the cluster, or if one of the
>>>> database servers changed its IP address or port, then you need to
>>>> update the list of remotes for the client.  For example, if a new
>>>> OVN_Southbound database server is added, you need to update the
>>>> ovn-remote for the ovn-controller.
>>>>
>>>> However, in the current implementation, the ovsdb-cs module always
>>>> closes the current connection and creates a new one.  This can lead
>>>> to a storm of re-connections if all ovn-controllers are updated
>>>> simultaneously.  They can also start re-downloading the database
>>>> content, creating even more load on the database servers.
>>>>
>>>> Correct this by saving an existing connection if it is still in the
>>>> list of remotes after the update.
>>>>
>>>> 'reconnect' module will report connection state updates, but that
>>>> is OK since no real re-connection happened and we only updated the
>>>> state of a new 'reconnect' instance.
>>>>
>>>> If required, re-connection can be forced after the update of remotes
>>>> with ovsdb_cs_force_reconnect().
>>>
>>> I think one of the goals here was to keep the load balanced as servers
>>> are added.

Yes, I thought about that and that is a valid point.  It's more like
a trade-off here between stability of connections and trying to keep
the load balanced in some way.

>>> Maybe that's not a big deal, or maybe it would make sense to
>>> flip a coin for each of the new servers and switch over to it with
>>> probability 1/n where n is the number of servers.

That seems like an interesting approach, but I think that the resulting
probability of keeping the connection would be low.
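
To make the trade-off concrete, here is a rough sketch of the coin-flip
idea.  It assumes one possible reading of the proposal: n is the total
number of servers after the update, and the client flips an independent
coin with probability 1/n for each newly added server.  The function
names are made up for illustration and are not part of ovsdb-cs.

```python
import random


def keep_probability(n_servers: int, n_new: int) -> float:
    """Chance that a client keeps its connection when it flips an
    independent coin with probability 1/n_servers for each of the
    n_new added servers (assumed interpretation of the proposal)."""
    if n_servers < 1 or n_new < 0:
        raise ValueError("need n_servers >= 1 and n_new >= 0")
    return (1.0 - 1.0 / n_servers) ** n_new


def maybe_switch(current, new_servers, n_servers, rand=random.random):
    """Return the server to use after the update: the first new server
    whose coin flip succeeds, or the current one if none does."""
    for server in new_servers:
        if rand() < 1.0 / n_servers:
            return server
    return current
```

How likely a client is to stay put then depends on both n and the number
of servers added at once, which is what keep_probability() computes.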

>>
>> A similar load-balancing problem exists also when a server is down and then
>> recovered. Connections will obviously move away when it is down but they
>> won't automatically connect back when it is recovered. Apart from the
>> flipping-a-coin approach suggested by Ben, I saw a proposal [0] [1] in the
>> past that provides a CLI to reconnect to a specific server which leaves
>> this burden to CMS/operators. It is not ideal but still could be an
>> alternative to solve the problem.

I remember these patches.  And I think that the imbalance after one of
the servers goes down and comes back up (e.g. a temporary disconnection
of one of the cluster nodes) is a more important issue and at the same
time harder to solve, because it happens automatically, without any
intervention from the user/CMS.  And to some extent it's inevitable.
E.g. the cluster will almost always be imbalanced if 3 server nodes are
restarted one by one for an upgrade.  Luckily, worker nodes with
ovn-controllers need maintenance too, so load balance will eventually
be achieved.

One interesting side effect of the current patch is that you can mimic
the behavior of patches [0][1] like this:
  set ovn-remote=<new server>
  set ovn-remote=<new server><all other servers>
After the first command, the ovn-controller will re-connect to the new
server, and it will not re-connect again after all the other servers are
added back to the list.  But I agree that this looks more like a hack
than an actual way to do that.
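
The decision behind this side effect can be sketched roughly as follows.
This is a simplified model, not the actual ovsdb-cs code: the function
name is hypothetical, the addresses are placeholders, and picking the
first remote on re-connection is an assumption made only to keep the
example deterministic.

```python
def update_remotes(current, new_remotes):
    """Model of the patched behavior: keep the existing connection if its
    remote is still in the updated list, otherwise re-connect.

    Returns (remote_in_use, reconnected)."""
    if current is not None and current in new_remotes:
        # The current remote survived the update: no re-connection.
        return current, False
    # Assumption for this sketch only: connect to the first listed remote.
    return (new_remotes[0] if new_remotes else None), True


# Placeholder addresses, not taken from any real deployment.
old, new = "tcp:192.0.2.1:6642", "tcp:192.0.2.9:6642"

# Step 1: only the new server is listed, so the client moves to it.
remote, reconnected = update_remotes(old, [new])
# Step 2: the other servers are added back and the connection is kept.
remote, reconnected_again = update_remotes(remote, [new, old])
```

After the second call, remote is still the new server and
reconnected_again is False, which matches the two-command sequence above.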

For more or less automatic ways of solving the imbalance, there are a
few more ideas that we can explore:

- Try to measure the load on the ovsdb-server process and report it
  somehow in the _Server database, so the client might make a decision
  to re-connect to a less loaded server.  This might be some metric
  based on total number of clients or the time it takes to run a
  single event processing loop (poll interval).

- A bit more controlled way is to limit the number of clients per
  server, so the server will decline connection attempts above that
  limit.  CMS might have an idea of how many clients one server is
  able/allowed to handle.  E.g. for N servers and M clients, it might be
  reasonable to allow no more than 2M/N connections per server, to still
  be able to serve all clients if half of the servers are down.  Of
  course, it's up to the CMS/user to decide on the exact number.  This
  could be implemented as an extra column for the connection row in the
  database.
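
As a quick check of the 2M/N arithmetic from the second idea, here is a
small sketch.  The helper names are hypothetical, and rounding the limit
up to a whole number of connections is my assumption.

```python
def per_server_limit(n_servers: int, n_clients: int) -> int:
    """Connection cap of 2M/N per server, rounded up to an integer."""
    if n_servers < 1 or n_clients < 0:
        raise ValueError("need n_servers >= 1 and n_clients >= 0")
    return -(-2 * n_clients // n_servers)  # integer ceiling division


def can_serve_all(n_servers_up: int, limit: int, n_clients: int) -> bool:
    """True if the servers that are still up can accept every client."""
    return n_servers_up * limit >= n_clients


# Example: N = 4 servers, M = 1000 clients.
limit = per_server_limit(4, 1000)          # 500 connections per server
half_down = can_serve_all(2, limit, 1000)  # 2 * 500 >= 1000
three_down = can_serve_all(1, limit, 1000) # 1 * 500 < 1000
```

With this cap, the cluster in the example still fits all clients with
two of four servers down, but not with three down.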

>>
>> I think both approaches have their pros and cons. The smart way doesn't
>> require human intervention in theory, but when operating at scale people
>> usually want to be cautious and have more control over the changes. For
>> example, they may want to add the server to the cluster first, and then
>> gradually move 1/n connections to the new server after a grace period,
>> or they could be more conservative and only let the new server take new
>> connections without moving any existing connections. I'd support both
>> options and let the operators decide according to their requirements.

This sounds reasonable.

>>
>> Regarding the current patch, I think it's better to add a test case to
>> cover the scenario and confirm that existing connections didn't reset. With
>> that:
>> Acked-by: Han Zhou <hzhou at ovn.org>

I'll work on a unit test for this.

> 
> This seems reasonable; to be sure, I'm not arguing against Ilya's
> approach, just trying to explain my recollection of why it was done
> this way.
> 

Thanks.  We need more good ideas on how to handle load balancing and
connections in general.

Best regards, Ilya Maximets.