[ovs-dev] [PATCH] reconnect: Add graceful reconnect.
blp at ovn.org
Tue Jun 29 16:02:17 UTC 2021
On Tue, Jun 29, 2021 at 01:20:35PM +0200, Dumitru Ceara wrote:
> Until now clients that needed to reconnect immediately could only use
> reconnect_force_reconnect(). However, reconnect_force_reconnect()
> doesn't reset the backoff for connections that were alive long enough
> (more than backoff seconds).
> Moreover, the reconnect library cannot determine the exact reason why a
> client wishes to initiate a reconnection. In most cases reconnection
> happens because of a fatal error when communicating with the remote,
> e.g., in the ovsdb-cs layer, when invalid messages are received from
> ovsdb-server. In such cases it makes sense to not reset the backoff
> because the remote seems to be unhealthy.
> There are however cases when reconnection is needed for other reasons.
> One such example is when ovsdb-clients require "leader-only" connections
> to clustered ovsdb-server databases. Whenever the client determines
> that the remote is not a leader anymore, it decides to reconnect to a
> new remote from its list, searching for the new leader. Using
> jsonrpc_force_reconnect() (which calls reconnect_force_reconnect()) will
> not reset backoff even though the former leader is still likely in good
> shape.
> Since 3c2d6274bcee ("raft: Transfer leadership before creating
> snapshots.") leadership changes inside the clustered database happen
> more often and therefore "leader-only" clients need to reconnect more
> often too. Not resetting the backoff every time a leadership change
> happens will cause all reconnections to happen with the maximum backoff
> (8 seconds) resulting in significant latency.
> This commit also updates the Python reconnect and IDL implementations
> and adds tests for force-reconnect and graceful-reconnect.
> Reported-at: https://bugzilla.redhat.com/1977264
> Signed-off-by: Dumitru Ceara <dceara at redhat.com>
I only glanced over this, but my reaction is good. Thank you for
adding tests and writing such a thorough rationale!
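
For readers skimming the thread, the difference described in the quoted
rationale can be sketched with a toy model. This is not the OVS reconnect
library: the class name, method names, and the doubling rule are assumptions
made for illustration. Only the 8-second maximum backoff and the "reset versus
do not reset" distinction come from the commit message.

```python
class ToyReconnect:
    """Toy backoff model; hypothetical, not the OVS reconnect API."""

    def __init__(self, min_backoff=1, max_backoff=8):
        self.min_backoff = min_backoff  # seconds (assumed starting delay)
        self.max_backoff = max_backoff  # seconds (8 s per the commit message)
        self.backoff = 0                # delay before the next attempt

    def force_reconnect(self):
        # Models a reconnect that does NOT reset the backoff: the delay
        # keeps growing (doubling is an assumption) until it hits the cap.
        self.backoff = min(max(self.min_backoff, self.backoff * 2),
                           self.max_backoff)
        return self.backoff

    def graceful_reconnect(self):
        # Models a reconnect that resets the backoff because the remote is
        # presumed healthy (for example, it merely stopped being leader).
        self.backoff = 0
        return self.backoff


forced = ToyReconnect()
forced_delays = [forced.force_reconnect() for _ in range(5)]

graceful = ToyReconnect()
graceful_delays = [graceful.graceful_reconnect() for _ in range(5)]

print("forced:  ", forced_delays)
print("graceful:", graceful_delays)
```

In this model, repeated forced reconnects climb to the maximum delay and stay
there, which is the latency problem the patch describes for "leader-only"
clients, while graceful reconnects always retry immediately.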