[ovs-dev] [PATCH v2 0/9] OVSDB Relay Service Model. (Was: OVSDB 2-Tier deployment)

Dumitru Ceara dceara at redhat.com
Fri Jun 25 13:33:50 UTC 2021


On 6/12/21 3:59 AM, Ilya Maximets wrote:
> Replication can be used to scale out read-only access to the database.
> But there are clients that are not read-only, but read-mostly.
> One of the main examples is ovn-controller, which mostly monitors
> updates from the Southbound DB, but needs to claim ports by sending
> transactions that change some database tables.
> 
> Southbound database serves lots of connections: all connections
> from ovn-controllers and some service connections from cloud
> infrastructure, e.g. some OpenStack agents are monitoring updates.
> At high scale, and with a large database, ovsdb-server spends too
> much time processing monitor updates, so this load needs to be
> moved elsewhere.  This patch-set aims to introduce the functionality
> required to scale out read-mostly connections by introducing a new
> OVSDB 'relay' service model.
> 
> In this new service model ovsdb-server connects to an existing OVSDB
> server and maintains an in-memory copy of the database.  It serves
> read-only transactions and monitor requests on its own, but forwards
> write transactions to the relay source.
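The read/write split described above can be exercised from the command line with ovsdb-client, the standard OVSDB command-line utility.  This is only an illustrative sketch (the socket path is hypothetical, and the "comment" operation is used just as a harmless write from RFC 7047):

```shell
# A read, served locally by the relay (illustrative socket path):
ovsdb-client dump unix:db.sock OVN_Southbound

# A write: the relay forwards this transaction to its relay source.
# "comment" is a no-op write operation defined by RFC 7047:
ovsdb-client transact unix:db.sock \
    '["OVN_Southbound", {"op": "comment", "comment": "via relay"}]'
```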
> 
> Key differences from the active-backup replication:
> - support for "write" transactions.
> - no on-disk storage (likely faster operation).
> - support for multiple remotes (can connect to a clustered db).
> - doesn't try to keep the connection alive as long as possible,
>   but instead reconnects quickly to other remotes to avoid
>   missing updates.
> - no need to know the complete database schema beforehand,
>   only the schema name.
> - can be used along with other standalone and clustered databases
>   by the same ovsdb-server process (doesn't turn the whole
>   jsonrpc server into read-only mode).
> - supports the modern version of monitors (monitor_cond_since),
>   because it is based on ovsdb-cs.
> - can be chained, i.e. multiple relays can be connected
>   one to another in a row or in a tree-like form.
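As a sketch, the chaining mentioned in the last point could look like the following.  The socket paths and addresses here are illustrative, not taken from the series:

```shell
# First-tier relay, connected to the clustered Southbound DB:
ovsdb-server --remote=punix:relay1.sock \
    relay:OVN_Southbound:tcp:10.0.0.1:6642,tcp:10.0.0.2:6642

# Second-tier relay, using the first relay as its source:
ovsdb-server --remote=punix:relay2.sock \
    relay:OVN_Southbound:unix:relay1.sock
```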
> 
> Bringing all of the above functionality to the existing active-backup
> replication doesn't look right, as it would make it less reliable
> for the actual backup use case.  It would also be much harder to
> implement, because the current replication code is not based on
> ovsdb-cs or the idl, so all the required features would likely be
> duplicated, or replication would have to be fully rewritten on top
> of ovsdb-cs with severe modifications of the former.
> 
> Relay is somewhere in the middle between active-backup replication
> and the clustered model, taking a lot from both, and is therefore
> hard to implement on top of either of them.
> 
> To run ovsdb-server in relay mode, a user simply needs to run:
> 
>   ovsdb-server --remote=punix:db.sock relay:<schema-name>:<remotes>
> 
> e.g.
> 
>   ovsdb-server --remote=punix:db.sock relay:OVN_Southbound:tcp:127.0.0.1:6642
> 
> More details and examples can be found in the documentation added
> in the last patch of the series.
> 
> I actually tried to implement transaction forwarding on top of
> active-backup replication in v1 of this series, but it required
> a lot of tricky changes, including schema format changes, in order
> to bring the required information to the end clients, so I decided
> to fully rewrite the functionality in v2 with a different approach.
> 
> Future work:
> - Add support for transaction history (it could simply be inherited
>   from the transaction ids received from the relay source).  This
>   will allow clients to utilize monitor_cond_since while working
>   with a relay.

Hi Ilya,

I acked most of the patches in the series (except 7/9 which I think
might need a rather straightforward change) and I saw Mark also left
some comments.

I wonder, though, if the lack of monitor_cond_since will be a show
stopper for deploying this in production?  Or do you expect reconnects
to happen less often due to the multi-tier nature of new deployments?

I guess we need some scale test data with this deployed to have a better
idea.

In any case, very nice work!

Regards,
Dumitru
