[ovs-discuss] [OVN] DB backup and restore

Han Zhou hzhou at ovn.org
Fri Jul 31 02:19:21 UTC 2020


On Thu, Jul 30, 2020 at 7:04 PM Tony Liu <tonyliu0592 at hotmail.com> wrote:

> Hi,
>
> Just an update: I finally got this snapshot/rollback to work for me.
> The rollback is not live, though. Here is what I did.
>
> 1. Make a snapshot with ovsdb-client, assuming there are no ongoing
>    transactions and the data is consistent on all nodes. The
>    snapshot can be taken on any node. It doesn't include any
>    cluster info. That's probably why the man page says this is
>    for standalone and active-backup only. But that cluster info
>    seems not to be required for a restore.
>
> 2. To roll back/restore, stop the services on all nodes, starting
>    with the followers and ending with the leader.
>
> 3. Pick a node as the new leader and copy the snapshot into place as
>    the DB file. Then start the service. A cluster with a new cluster
>    ID will be created. The node will be allocated a new server ID
>    as well.
>
> 4. On the other two nodes, remove the DB file and restart the service
>    with remote-address pointing to the leader.
>
> Now the new cluster starts working with the rolled-back data.
>

The steps you gave may work, but they are unusual. It is better to just follow
the steps described in this section:

https://github.com/openvswitch/ovs/blob/master/Documentation/ref/ovsdb.7.rst#backing-up-and-restoring-a-database
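
As a rough sketch of what that section describes for a clustered database
(as I read it; please verify against the ovsdb(7) and ovsdb-tool(1) shipped
with your version): with all servers stopped and the old DB files moved
away, the first node builds a new cluster from the standalone backup with
"ovsdb-tool create-cluster", and the other nodes join it with
"ovsdb-tool join-cluster". The backup path and the tcp addresses below are
hypothetical placeholders, and the helper names are made up for
illustration. The script only builds and prints the command lines; it does
not run them.

```python
from typing import List


def create_cluster_cmd(db_file: str, backup_file: str, local_addr: str) -> List[str]:
    """Command for the first node: new cluster seeded from a standalone backup."""
    return ["ovsdb-tool", "create-cluster", db_file, backup_file, local_addr]


def join_cluster_cmd(db_file: str, db_name: str, local_addr: str,
                     remotes: List[str]) -> List[str]:
    """Command for every other node: join the cluster created above."""
    return ["ovsdb-tool", "join-cluster", db_file, db_name, local_addr, *remotes]


def restore_plan(db_file: str, backup_file: str, db_name: str,
                 addrs: List[str]) -> List[List[str]]:
    """One command per node, in order; each is meant to run on its own node
    while the database servers are stopped and the old DB file is removed."""
    first, *others = addrs
    plan = [create_cluster_cmd(db_file, backup_file, first)]
    for addr in others:
        plan.append(join_cluster_cmd(db_file, db_name, addr, [first]))
    return plan


# Hypothetical values: adjust the paths, database name and addresses.
plan = restore_plan(
    db_file="/var/lib/openvswitch/ovn-nb/ovnnb.db",
    backup_file="/tmp/ovnnb-backup.db",
    db_name="OVN_Northbound",
    addrs=["tcp:10.0.0.1:6643", "tcp:10.0.0.2:6643", "tcp:10.0.0.3:6643"],
)
for cmd in plan:
    print(" ".join(cmd))
```

After that, the servers are started again in the usual way for your
deployment, first node first.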


>
> "ovsdb-client restore" doesn't work for me, not sure why.
>
> ====
> ovsdb-client: ovsdb error: /dev/stdin: cannot identify file type
> ====
>
> I tried to restore both the snapshot created by backup and the
> directly copied DB file; neither of them works. Has anyone
> experienced this issue?
>
Maybe your command was wrong. Could you share your command line, and the
version used?
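
For comparison, this is the general shape of the invocation as documented
in ovsdb-client(1): "backup" writes the snapshot to stdout and "restore"
reads it from stdin. The sketch below is only an illustration under
assumptions: the helper names are made up, and the server socket and
snapshot path in the usage note are hypothetical and differ between
installations. It is not a diagnosis of the error above.

```python
import subprocess
from typing import List


def backup_cmd(server: str, database: str) -> List[str]:
    """ovsdb-client backup [server] [database] > snapshot"""
    return ["ovsdb-client", "backup", server, database]


def restore_cmd(server: str, database: str) -> List[str]:
    """ovsdb-client restore [server] [database] < snapshot"""
    return ["ovsdb-client", "restore", server, database]


def run_backup(server: str, database: str, snapshot_path: str) -> None:
    """Run the backup and write the snapshot (taken from stdout) to a file."""
    with open(snapshot_path, "wb") as out:
        subprocess.run(backup_cmd(server, database), stdout=out, check=True)


def run_restore(server: str, database: str, snapshot_path: str) -> None:
    """Run the restore, feeding the snapshot file on stdin."""
    with open(snapshot_path, "rb") as src:
        subprocess.run(restore_cmd(server, database), stdin=src, check=True)
```

A call would look like run_backup("unix:/var/run/ovn/ovnnb_db.sock",
"OVN_Northbound", "/tmp/ovnnb-backup.db"), with the socket path adjusted to
your installation. "ovsdb-client --version" prints the version.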


> To Numan, it would be great if you could share the details of how to
> use neutron-ovn-db-sync-util.
>
> Thanks!
>
> Tony
>
> From: Tony Liu <tonyliu0592 at hotmail.com>
> Sent: Thursday, July 30, 2020 4:51 PM
> To: Numan Siddique <nusiddiq at redhat.com>; Han Zhou <hzhou at ovn.org>
> Cc: Han Zhou <hzhou at ovn.org>; ovs-dev <ovs-dev at openvswitch.org>;
> ovs-discuss <ovs-discuss at openvswitch.org>
> Subject: Re: [ovs-dev] [ovs-discuss] [OVN] DB backup and restore
>
>
>
> Hi Numan,
>
> I found this comment you made a few years back.
>
> - At neutron-server startup, OVN ML2 driver syncs the neutron
> DB and OVN DB if sync mode is set to repair.
> - Admin can run the "neutron-ovn-db-sync-util" to sync the DBs.
>
> Could you share the details to try those two options?
>
>
> Thanks!
>
> Tony
>
> From: Tony Liu <tonyliu0592 at hotmail.com>
> Sent: Thursday, July 30, 2020 4:38 PM
> To: Han Zhou <hzhou at ovn.org>
> Cc: Han Zhou <hzhou at ovn.org>; ovs-dev <ovs-dev at openvswitch.org>;
> ovs-discuss <ovs-discuss at openvswitch.org>
> Subject: Re: [ovs-dev] [ovs-discuss] [OVN] DB backup and restore
>
> Hi,
>
> I have another thought after some digging. Since I am using
> OpenStack, all networking configuration comes from OpenStack.
> I could snapshot the OpenStack MariaDB, restore it, and run
> neutron-ovn-db-sync-util to update the OVN DB. Would that be a
> cleaner solution?
>
> BTW, I got this error when restoring the OVN DB:
> ovsdb-client: ovsdb error: /dev/stdin: cannot identify file type
>
> The file was created by the "backup" command.
>
>
> Thanks!
>
> Tony
>
> From: Tony Liu <tonyliu0592 at hotmail.com>
> Sent: Thursday, July 30, 2020 3:41 PM
> To: Han Zhou <hzhou at ovn.org>
> Cc: Han Zhou <hzhou at ovn.org>; ovs-dev <ovs-dev at openvswitch.org>;
> ovs-discuss <ovs-discuss at openvswitch.org>
> Subject: Re: [ovs-dev] [ovs-discuss] [OVN] DB backup and restore
>
> Hi,
>
> A quick question here, given this man page:
> http://www.openvswitch.org/support/dist-docs/ovsdb-client.1.txt
>
> It says the backup and restore commands are for OVSDB standalone and
> active-backup databases.
>
> Can they be used for a RAFT cluster? If not, what would be the concern,
> e.g. inconsistency?
>
> If I restore to a follower, is the request going to be forwarded to the
> leader to restore the DB for the whole cluster? I believe it's recommended
> to restore to the leader directly for performance's sake.
>
> I am going to give it a try anyway and see how it works. I will make sure
> there is no configuration update from the OpenStack side while running the
> snapshot and restore process.
>
> Thanks!
>
> Tony
>
> From: Han Zhou <hzhou at ovn.org>
> Sent: Thursday, July 30, 2020 12:23 PM
> To: Tony Liu <tonyliu0592 at hotmail.com>
> Cc: Han Zhou <hzhou at ovn.org>; ovs-discuss <ovs-discuss at openvswitch.org>;
> ovs-dev <ovs-dev at openvswitch.org>
> Subject: Re: [ovs-discuss] [OVN] DB backup and restore
>
> On Thu, Jul 30, 2020 at 10:56 AM Tony Liu <tonyliu0592 at hotmail.com> wrote:
> Hi Han,
>
> That doc helps. I will run some tests and update here. The use case I want
> to cover is snapshot/rollback and backup/restore.
>
> ========
> Actually, "at-least-once" consistency, because OVSDB does not have a session
> mechanism to drop duplicate transactions if a connection drops after the
> server commits it but before the client receives the result.
> ========
> I saw duplicated datapath bindings for the same logical switch once, if you
> recall. This may explain that. The ovn-northd connection to sb-db was dropped
> before it received the result, so ovn-northd initiated another transaction to
> create a datapath binding for the same logical switch.
>
> Yes, this is a possibility.
> However, in reality, this is usually not a problem:
>
> 1) If the DB schema has table keys properly defined, the redundant transaction
> from clients would be rejected by the DB server because of the key constraint
> check. In the datapath binding case, this doesn't work because of the poor
> definition of the datapath_binding table. It should have had a
> "logical_switch_router" column defined and set as a key (in addition to the
> "tunnel_key") instead of storing it in external_ids. The duplicated entries
> would then have been avoided. Other tables such as port_binding would never
> have such a problem.
>
> 2) OVSDB clients usually monitor and sync all data of interest from the
> server to a local copy, so when they do declarative processing, they can
> correct problems by themselves. In fact, ovn-northd does the check and deletes
> duplicated datapaths. I did a simple test and it did clean up by itself:
> 2020-07-30T18:55:53.057Z|00006|ovn_northd|INFO|ovn-northd lock acquired. This ovn-northd instance is now active.
> 2020-07-30T19:02:10.465Z|00007|ovn_northd|INFO|deleting Datapath_Binding abef9503-445e-4a52-ae88-4c826cbad9d6 with duplicate external-ids:logical-switch/router ee80c38b-2016-4cbc-9437-f73e3a59369e
>
> I am not sure why northd was stuck in your case, but I agree there must be
> something wrong. Please collect the northd logs if you encounter this again
> so we can dig further.
>
> I see two ways to improve it.
> 1) On the client side, if the connection is broken while waiting for the
>    result of a transaction, the client checks the transaction state,
>    committed or not, when it reconnects to the leader (maybe a different
>    node). Do we have such a check today?
>
> Clients do check. In this case, when the transaction was actually successful
> but appears to have failed from the client's point of view, the check
> doesn't help.
>
> 2) I see the client connection is dropped by the leader when it's busy. I
>    don't think this is a good way to control the traffic. The server can
>    cache and hold the request when it's busy, or even push back. Dropping
>    the connection is not a good option. Any thoughts here?
>
> The server doesn't make this kind of decision. It could simply be
> overloaded and disconnected from the cluster, or even worse, a node could
> crash after committing the transaction.
>
> Thanks,
> Han
>
>
> Thanks!
>
> Tony
>
> From: Han Zhou <hzhou at ovn.org>
> Sent: Wednesday, July 29, 2020 11:38 PM
> To: Tony Liu <tonyliu0592 at hotmail.com>
> Cc: ovs-discuss <ovs-discuss at openvswitch.org>;
> ovs-dev <ovs-dev at openvswitch.org>
> Subject: Re: [ovs-discuss] [OVN] DB backup and restore
>
> On Wed, Jul 29, 2020 at 10:58 PM Tony Liu <tonyliu0592 at hotmail.com> wrote:
> >
> > Hi,
> >
> > Is there any guidance on backing up and restoring the OVN nb-db and sb-db?
> >
> > Is /var/lib/openvswitch/ovn-[ns]b/ovn[ns]b.db the only database file?
> >
> > For a 3-node DB cluster, is the replication factor 3 (the data is
> > replicated onto all 3 nodes)?
> >
> > Are the DB files on the 3 nodes identical?
> >
> > If I stop a DB follower and empty the DB file on the follower node,
> > when I start it back up, is the whole DB going to be replicated to it?
> >
> > To back up the DB, is it OK to copy the DB file from any node, assuming
> > no transaction is ongoing?
> >
> > Is the following going to work to restore the DB?
> >
> > * Stop all 3 DBs.
> > * Copy the backup DB file to one node; empty the DB file on the other
> >   two nodes.
> > * Bootstrap the node with the DB file.
> > * Start the other two nodes to join the cluster.
> >
>
> For ovsdb operations, please refer to "man 7 ovsdb", or here:
> https://github.com/openvswitch/ovs/blob/master/Documentation/ref/ovsdb.7.rst
>
> >
> > Do I need to restore sb-db as well? Or can I restore nb-db only and let
> > ovn-northd sync the data from nb-db to sb-db? Chassis data should be
> > updated by ovn-controller?
> >
>
> You don't have to restore sb-db. ovn-northd and ovn-controllers will sync
> the data in SB DB.
> However, it may take quite some time to sync if the scale is large.
> Also, remember that the mac_binding table in SB will not be restored by
> ovn-controller because it is populated as a result of ARP packets handling
> by ovn-controller. The entries will be generated again only if new ARP
> packets are observed by ovn-controller.
>
> >
> > I am running a scaling test. It takes quite a lot of time to build the
> > configuration. I am wondering if I can back up and restore the DB to roll
> > back to some checkpoint, to avoid starting all over.
> >
> > Thanks!
> >
> > Tony
> >
> >
> > _______________________________________________
> > discuss mailing list
> > discuss at openvswitch.org<mailto:discuss at openvswitch.org>
> > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
>
>
> _______________________________________________
> dev mailing list
> dev at openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
>
>
>