[ovs-dev] [PATCH ovn v3 4/6] tests: Eliminate most "sleep" calls.

Dumitru Ceara dceara at redhat.com
Tue Nov 10 08:47:56 UTC 2020


On 11/10/20 1:26 AM, Ben Pfaff wrote:
> On Mon, Nov 09, 2020 at 02:44:27PM +0100, Dumitru Ceara wrote:
>> On 11/6/20 4:36 AM, Ben Pfaff wrote:
>>> Many of these could be replaced by "ovn-nbctl sync".  Some weren't
>>> really needed at all because they were adjacent to something that itself
>>> called sync or otherwise used --wait.  Some were more appropriately
>>> done with explicit waits for what was really needed.
>>>
>>> I left some "sleep"s.  Some were because they were "negative" sleeps:
>>> they were giving time for something to happen that should *not* happen
>>> (in other words, if you wait for it to happen, you'll wait forever,
>>> unless there's a bug).  Some were because I didn't know how to properly
>>> wait for what they were waiting for, or because I didn't understand
>>> the circumstances deeply enough.
>>>
>>> Signed-off-by: Ben Pfaff <blp at ovn.org>
>>> ---
>>>  tests/ovn-northd.at |   7 ++
>>>  tests/ovn.at        | 167 ++++++++++++--------------------------------
>>>  2 files changed, 52 insertions(+), 122 deletions(-)
>>>
>>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
>>> index 949a8eee054e..5d670233561e 100644
>>> --- a/tests/ovn-northd.at
>>> +++ b/tests/ovn-northd.at
>>> @@ -1120,6 +1120,7 @@ ovn-nbctl --wait=sb -- --id=@hc create \
>>>  Load_Balancer_Health_Check vip="10.0.0.10\:80" -- add Load_Balancer . \
>>>  health_check @hc
>>>  wait_row_count Service_Monitor 2
>>> +check ovn-nbctl --wait=hv sync
>>
>> This should be "--wait=sb".  ovn-controller is not started for tests in
>> ovn-northd.at and "--wait=hv" would return immediately.  This applies to
>> all "ovn-nbctl --wait=hv sync" in ovn-northd.at.
> 
> I see what you mean.
> 
> This is surprising to me.  When I defined this stuff years ago, my
> intent was that --wait=hv would be stronger than --wait=sb.  That is,
> --wait=sb would cause ovn-nbctl to wait for the southbound database to
> catch up with the northbound changes, and --wait=hb would cause it to
> wait for the southbound database AND the hypervisors to catch up.
> 
> That's even how it's documented for ovn-nbctl:
> 
>         <p>
>           With <code>--wait=sb</code>, before <code>ovn-nbctl</code> exits, it
>           waits for <code>ovn-northd</code> to bring the southbound database
>           up-to-date with the northbound database updates.
>         </p>
> 
>         <p>
>           With <code>--wait=hv</code>, before <code>ovn-nbctl</code> exits, it
>           additionally waits for all OVN chassis (hypervisors and gateways) to
>           become up-to-date with the northbound database updates.  (This can
>           become an indefinite wait if any chassis is malfunctioning.)
>         </p>
> 
> That's not what actually happens, though.  As you point out, if there
> are no hypervisors, then the hypervisors are always up-to-date, even if
> the database is not.
> 
> I think this is a bug in ovn-nbctl.  Arguably, it's a bug in the
> database definition (perhaps hv_cfg shouldn't even be filled in but left
> empty if there are no chassis) but I think it is too late to fix it
> there.  It is, however, easy enough to fix it in ovn-nbctl.
> 
> (And at the same time, I'll change these in ovn-northd to just say
> --wait=sb.  You have a point.)
> 
> How about this as an additional patch?
> 
> -8<--------------------------cut here-------------------------->8--
> 
> From 7b373c22dbda2f808f3d5f3b8ae6eb67612dbe87 Mon Sep 17 00:00:00 2001
> From: Ben Pfaff <blp at ovn.org>
> Date: Mon, 9 Nov 2020 16:12:50 -0800
> Subject: [PATCH ovn] ovn-nbctl: Make --wait=hv wait for southbound database in
>  corner case.
> 
> In the corner case where there are no chassis, --wait=hv didn't wait at
> all, instead of waiting for the southbound database.
> 
> Reported-by: Dumitru Ceara <dceara at redhat.com>
> Signed-off-by: Ben Pfaff <blp at ovn.org>
> ---
>  ovn-nb.xml            | 27 ++++++++++++++++++++-------
>  utilities/ovn-nbctl.c |  2 +-
>  2 files changed, 21 insertions(+), 8 deletions(-)
> 
> diff --git a/ovn-nb.xml b/ovn-nb.xml
> index 51b186b84419..d3e51f3e5e87 100644
> --- a/ovn-nb.xml
> +++ b/ovn-nb.xml
> @@ -84,13 +84,26 @@
>        </column>
>  
>        <column name="hv_cfg">
> -        Sequence number that <code>ovn-northd</code> sets to the smallest
> -        sequence number of all the chassis in the system, as reported in the
> -        <code>Chassis_Private</code> table in the southbound database.  Thus,
> -        <ref column="hv_cfg"/> equals <ref column="nb_cfg"/> if all chassis are
> -        caught up with the northbound configuration (which may never happen, if
> -        any chassis is down).  This value can regress, if a chassis was removed
> -        from the system and rejoins before catching up.
> +        <p>
> +          Sequence number that <code>ovn-northd</code> sets to the smallest
> +          sequence number of all the chassis in the system, as reported in the
> +          <code>Chassis_Private</code> table in the southbound database.  Thus,
> +          <ref column="hv_cfg"/> equals <ref column="nb_cfg"/> if all chassis are
> +          caught up with the northbound configuration (which may never happen, if
> +          any chassis is down).  This value can regress, if a chassis was removed
> +          from the system and rejoins before catching up.
> +        </p>
> +
> +        <p>
> +          If there are no chassis, then <code>ovn-northd</code> copies
> +          <code>nb_cfg</code> to <ref column="hv_cfg"/>.  Thus, in this case,
> +          the (nonexistent) hypervisors are always considered to be caught up.
> +          This means that hypervisors can be "caught up" even in cases where
> +          <ref column="sb_cfg"/> would show that the southbound database is
> +          not.  To detect when both the hypervisors and the southbound database
> +          are caught up, a client should take the smaller of <ref
> +          column="sb_cfg"/> and <ref column="hv_cfg"/>.
> +        </p>
>        </column>
>  
>        <column name="hv_cfg_timestamp">
> diff --git a/utilities/ovn-nbctl.c b/utilities/ovn-nbctl.c
> index 4f28edc808ec..ee9849de53a9 100644
> --- a/utilities/ovn-nbctl.c
> +++ b/utilities/ovn-nbctl.c
> @@ -6357,7 +6357,7 @@ do_nbctl(const char *args, struct ctl_command *commands, size_t n_commands,
>              NBREC_NB_GLOBAL_FOR_EACH (nb, idl) {
>                  int64_t cur_cfg = (wait_type == NBCTL_WAIT_SB
>                                     ? nb->sb_cfg
> -                                   : nb->hv_cfg);
> +                                   : MIN(nb->sb_cfg, nb->hv_cfg));
>                  if (cur_cfg >= next_cfg) {
>                      if (print_wait_time) {
>                          printf("Time spent on processing nb_cfg %"PRId64":\n",
> 

Looks good to me, thanks!



More information about the dev mailing list