[ovs-dev] [PATCH v3] ovn-ctl: Support starting clustered OVN dbs

Ben Pfaff blp at ovn.org
Wed Apr 4 21:42:49 UTC 2018


On Thu, Apr 05, 2018 at 12:02:07AM +0530, Numan Siddique wrote:
> On Wed, Apr 4, 2018 at 10:51 PM, Ben Pfaff <blp at ovn.org> wrote:
> 
> > Hi Numan, thanks for the new version.
> >
> > Did you (or someone else) re-test this with the changes that I made?  I
> > did not test them myself, so I have low confidence in them.  The
> > upgrade_cluster() function especially needs testing.
> >
> > Thanks,
> >
> >
> Hi Ben,
> 
> I did testing with upgrade_cluster and it worked fine. But I found one
> issue.
> 
> In my testing, I stopped all the servers in my 3 node cluster. Upgraded ovn
> db schema by adding a tst column in logical_switch table and updated the
> ovs and ovn rpms in all the 3 nodes.
> Then when I started the cluster on node 1,  all the services were started,
> but I saw the below error
> 
> *******
> 2018-04-04T14:28:13Z|00002|ovsdb_idl|WARN|Logical_Switch table in
> OVN_Northbound database lacks tst2 column (database needs upgrade?)
> Waiting for OVN_Northbound to come up                      [  OK  ]
> Upgrading database OVN_Northbound from schema version 5.10.2 to 5.10.3
> 2018-04-04T14:28:13Z|00001|ovsdb|WARN|/usr/share/openvswitch/ovn-nb.ovsschema:
> changed 2 columns in 'OVN_Northbound' database from ephemeral to
> persistent, including 'status' column in 'Connection' table, because
> clusters do not support ephemeral columns
> 2018-04-04T14:28:43Z|00002|fatal_signal|WARN|terminating with signal 14
> (Alarm clock)
> /usr/share/openvswitch/scripts/ovs-lib: line 600: 18504 Alarm clock
>      "$@"
>                                                            [FAILED]
> 2018-04-04T14:28:43Z|00002|ovsdb_idl|WARN|Logical_Switch table in
> OVN_Northbound database lacks tst2 column (database needs upgrade?)
> Waiting for OVN_Southbound to come up                      [  OK  ]
> Starting ovn-northd                                        [  OK  ]
> 
> [root at vm1 vagrant]# ovn-nbctl show
> 2018-04-04T14:29:31Z|00001|ovsdb_idl|WARN|Logical_Switch table in
> OVN_Northbound database lacks tst2 column (database needs upgrade?)
> switch b5a0e5d3-2587-4f15-a3c4-8262fb20c65d (sw0)
> ********
> 
> Even though I saw the above warning when I ran "ovn-nbctl show", everything
> worked fine. When I ran "ovn-nbctl list logical_switch" I could see the new
> column "tst".
>  And when I started the ovsdb-servers on the other nodes, the above warning
> went away.
> 
> I also tested by restarting the ovsdb-servers on all the nodes almost at
> the same time and I didn't notice the message
> "/usr/share/openvswitch/scripts/ovs-lib:
> line 600: 18504 Alarm clock ..."  by ovn-ctl.
> 
> 
> Other than this I didn't notice anything odd. I was thinking to share these
> details tomorrow :).
> 
> It would definitely help if someone could test this patch out.

Thank you for testing.

I see multiple issues here.

First, you mentioned a 'tst' column but the messages talk about 'tst2'.  I
guess that you meant 'tst2'.

Second is the message about ephemeral columns.  That is unavoidable for
now.  In some future version we should figure out a better way to handle
them.  Maybe my intern this summer will work on the issue.
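
To illustrate what the ephemeral-column warning describes, here is a
minimal Python sketch that walks a schema and makes every ephemeral
column persistent.  It is only an illustration, not the code that
ovsdb-server runs: the function name make_persistent and the toy schema
are made up for this example (the 'is_connected' column in particular is
an assumption), and only the "ephemeral" column key and the 'status'
column in the 'Connection' table come from the schema format and the log.

```python
import copy


def make_persistent(schema):
    """Return (new_schema, changed), with every ephemeral column made persistent.

    Hypothetical helper for illustration; the input schema is not modified.
    """
    result = copy.deepcopy(schema)
    changed = []
    for table_name, table in result["tables"].items():
        for column_name, column in table["columns"].items():
            # Dropping the "ephemeral" key leaves the column persistent,
            # which is the default when the key is absent.
            if column.pop("ephemeral", False):
                changed.append((table_name, column_name))
    return result, changed


# Toy schema, NOT copied from ovn-nb.ovsschema.
toy_schema = {
    "name": "OVN_Northbound",
    "tables": {
        "Connection": {
            "columns": {
                "target": {"type": "string"},
                "status": {"type": "string", "ephemeral": True},
                "is_connected": {"type": "boolean", "ephemeral": True},
            }
        },
        "Logical_Switch": {
            "columns": {"name": {"type": "string"}},
        },
    },
}

new_schema, changed = make_persistent(toy_schema)
print("changed %d columns from ephemeral to persistent" % len(changed))
```

Returning the list of changed columns is what would let a caller report a
count and an example column, in the style of the warning in the log.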

The "Alarm clock" message is more worrying.  It means that the database
took more than 30 seconds to upgrade, or at least that the ovsdb-client
command believed that it did.  For a simple test case, the upgrade
should actually take less than a second.  That means there is a bug
somewhere.
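
For anyone unfamiliar with where the "Alarm clock" text comes from: on
Linux, signal 14 is SIGALRM, its default action is to terminate the
process, and the shell reports such a death as "Alarm clock".  The
following Python sketch reproduces that mechanism on a POSIX system with
a 1-second timer.  It is an illustration under my own assumptions, not
what ovsdb-client does internally; the child program and the function
name run_with_alarm are invented for the example.

```python
import signal
import subprocess
import sys

# Child program: arm a 1-second alarm, then sleep well past it.
# No SIGALRM handler is installed, so the default action (terminate) applies.
CHILD = "import signal, time; signal.alarm(1); time.sleep(10)"


def run_with_alarm():
    """Run the child and return its exit status as reported by subprocess."""
    proc = subprocess.run([sys.executable, "-c", CHILD])
    # On POSIX, a negative return code -N means "killed by signal N".
    return proc.returncode


rc = run_with_alarm()
if rc < 0:
    print("child terminated by", signal.Signals(-rc).name)
else:
    print("child exited with status", rc)
```

A timer that fires only shows that the deadline passed, not why, which is
why the message by itself does not say where the bug is.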

None of these problems is a bug in the ovn-ctl script, though, so they
should not preclude applying the patch.

I applied this to master.

