[ovs-dev] [PATCH ovn v3 4/6] tests: Eliminate most "sleep" calls.
Ben Pfaff
blp at ovn.org
Tue Nov 10 00:26:59 UTC 2020
On Mon, Nov 09, 2020 at 02:44:27PM +0100, Dumitru Ceara wrote:
> On 11/6/20 4:36 AM, Ben Pfaff wrote:
> > Many of these could be replaced by "ovn-nbctl sync". Some weren't
> > really needed at all because they were adjacent to something that itself
> > called sync or otherwise used --wait. Some were more appropriately
> > done with explicit waits for what was really needed.
> >
> > I left some "sleep"s. Some were because they were "negative" sleeps:
> > they were giving time for something to happen that should *not* happen
> > (in other words, if you wait for it to happen, you'll wait forever,
> > unless there's a bug). Some were because I didn't know how to properly
> > wait for what they were waiting for, or because I didn't understand
> > the circumstances deeply enough.
> >
> > Signed-off-by: Ben Pfaff <blp at ovn.org>
> > ---
> > tests/ovn-northd.at | 7 ++
> > tests/ovn.at | 167 ++++++++++++--------------------------------
> > 2 files changed, 52 insertions(+), 122 deletions(-)
> >
> > diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> > index 949a8eee054e..5d670233561e 100644
> > --- a/tests/ovn-northd.at
> > +++ b/tests/ovn-northd.at
> > @@ -1120,6 +1120,7 @@ ovn-nbctl --wait=sb -- --id=@hc create \
> > Load_Balancer_Health_Check vip="10.0.0.10\:80" -- add Load_Balancer . \
> > health_check @hc
> > wait_row_count Service_Monitor 2
> > +check ovn-nbctl --wait=hv sync
>
> This should be "--wait=sb". ovn-controller is not started for tests in
> ovn-northd.at and "--wait=hv" would return immediately. This applies to
> all "ovn-nbctl --wait=hv sync" in ovn-northd.at.
I see what you mean.
This is surprising to me. When I defined this stuff years ago, my
intent was that --wait=hv would be stronger than --wait=sb. That is,
--wait=sb would cause ovn-nbctl to wait for the southbound database to
catch up with the northbound changes, and --wait=hb would cause it to
wait for the southbound database AND the hypervisors to catch up.
That's even how it's documented for ovn-nbctl:
<p>
With <code>--wait=sb</code>, before <code>ovn-nbctl</code> exits, it
waits for <code>ovn-northd</code> to bring the southbound database
up-to-date with the northbound database updates.
</p>
<p>
With <code>--wait=hv</code>, before <code>ovn-nbctl</code> exits, it
additionally waits for all OVN chassis (hypervisors and gateways) to
become up-to-date with the northbound database updates. (This can
become an indefinite wait if any chassis is malfunctioning.)
</p>
That's not what actually happens, though. As you point out, if there
are no hypervisors, then the hypervisors are always up-to-date, even if
the database is not.
I think this is a bug in ovn-nbctl. Arguably, it's a bug in the
database definition (perhaps hv_cfg shouldn't even be filled in but left
empty if there are no chassis) but I think it is too late to fix it
there. It is, however, easy enough to fix it in ovn-nbctl.
(And at the same time, I'll change these in ovn-northd to just say
--wait=sb. You have a point.)
How about this as an additional patch?
-8<--------------------------cut here-------------------------->8--
>From 7b373c22dbda2f808f3d5f3b8ae6eb67612dbe87 Mon Sep 17 00:00:00 2001
From: Ben Pfaff <blp at ovn.org>
Date: Mon, 9 Nov 2020 16:12:50 -0800
Subject: [PATCH ovn] ovn-nbctl: Make --wait=hv wait for southbound database in
corner case.
In the corner case where there are no chassis, --wait=hv didn't wait at
all, instead of waiting for the southbound database.
Reported-by: Dumitru Ceara <dceara at redhat.com>
Signed-off-by: Ben Pfaff <blp at ovn.org>
---
ovn-nb.xml | 27 ++++++++++++++++++++-------
utilities/ovn-nbctl.c | 2 +-
2 files changed, 21 insertions(+), 8 deletions(-)
diff --git a/ovn-nb.xml b/ovn-nb.xml
index 51b186b84419..d3e51f3e5e87 100644
--- a/ovn-nb.xml
+++ b/ovn-nb.xml
@@ -84,13 +84,26 @@
</column>
<column name="hv_cfg">
- Sequence number that <code>ovn-northd</code> sets to the smallest
- sequence number of all the chassis in the system, as reported in the
- <code>Chassis_Private</code> table in the southbound database. Thus,
- <ref column="hv_cfg"/> equals <ref column="nb_cfg"/> if all chassis are
- caught up with the northbound configuration (which may never happen, if
- any chassis is down). This value can regress, if a chassis was removed
- from the system and rejoins before catching up.
+ <p>
+ Sequence number that <code>ovn-northd</code> sets to the smallest
+ sequence number of all the chassis in the system, as reported in the
+ <code>Chassis_Private</code> table in the southbound database. Thus,
+ <ref column="hv_cfg"/> equals <ref column="nb_cfg"/> if all chassis are
+ caught up with the northbound configuration (which may never happen, if
+ any chassis is down). This value can regress, if a chassis was removed
+ from the system and rejoins before catching up.
+ </p>
+
+ <p>
+ If there are no chassis, then <code>ovn-northd</code> copies
+ <code>nb_cfg</code> to <ref column="hv_cfg"/>. Thus, in this case,
+ the (nonexistent) hypervisors are always considered to be caught up.
+ This means that hypervisors can be "caught up" even in cases where
+ <ref column="sb_cfg"/> would show that the southbound database is
+ not. To detect when both the hypervisors and the southbound database
+ are caught up, a client should take the smaller of <ref
+ column="sb_cfg"/> and <ref column="hv_cfg"/>.
+ </p>
</column>
<column name="hv_cfg_timestamp">
diff --git a/utilities/ovn-nbctl.c b/utilities/ovn-nbctl.c
index 4f28edc808ec..ee9849de53a9 100644
--- a/utilities/ovn-nbctl.c
+++ b/utilities/ovn-nbctl.c
@@ -6357,7 +6357,7 @@ do_nbctl(const char *args, struct ctl_command *commands, size_t n_commands,
NBREC_NB_GLOBAL_FOR_EACH (nb, idl) {
int64_t cur_cfg = (wait_type == NBCTL_WAIT_SB
? nb->sb_cfg
- : nb->hv_cfg);
+ : MIN(nb->sb_cfg, nb->hv_cfg));
if (cur_cfg >= next_cfg) {
if (print_wait_time) {
printf("Time spent on processing nb_cfg %"PRId64":\n",
--
2.26.2
More information about the dev
mailing list