[ovs-dev] [PATCH ovn 4/4] ovn-controller: Support configuration option ovn-delay-nb-cfg-report.

Han Zhou hzhou at ovn.org
Tue Sep 8 01:22:31 UTC 2020


On Mon, Sep 7, 2020 at 11:02 AM Numan Siddique <numans at ovn.org> wrote:
>
>
>
> On Sat, Sep 5, 2020 at 12:46 AM Numan Siddique <numans at ovn.org> wrote:
>>
>>
>>
>> On Sat, Sep 5, 2020, 12:04 AM Han Zhou <hzhou at ovn.org> wrote:
>>>
>>> On Fri, Sep 4, 2020 at 11:18 AM Numan Siddique <numans at ovn.org> wrote:
>>> >
>>> >
>>> >
>>> > On Wed, Sep 2, 2020 at 4:57 AM Han Zhou <hzhou at ovn.org> wrote:
>>> >>
>>> >> This option is added to avoid the flooding of nb_cfg updates from
>>> >> a large nubmer of hypervisors interfere with the original SB DB
>>> >> data distribution and handling, so that the e2e control plane latency
>>> >> can be measured more accurately during scale testing.
>>> >>
>>> >> Signed-off-by: Han Zhou <hzhou at ovn.org>
>
>
> Hi Han,
>
> If I understand correctly, if the option is configured, ovn-controller
will sleep before it updates
> the OVN Southbound database. Can you please explain a bit on how it helps
in latency measurement.
>

Thanks Numan.

In my testing with 3K HVs, when there is a change in SB DB, all the 3K HVs
will be notified by SB DB server, process the change, and update its nb_cfg
back to SB DB.
Because the OVSDB server is single threaded, there can be HVs start
updating nb_cfg back to SB before all HVs are notified of the SB change,
and once the updating starts, the server would become more busy which could
further delay the notification sending. At the same time, although nb_cfg
update wouldn't get flooded when private_chassis is introduced, with RAFT
cluster, the _Server database monitor still triggers the flooding of raft
index change for every nb_cfg update from every HV to every other HVs,
which can interfere the HVs for processing the actual change. To avoid
these problems, this patch provides the capability of delaying nb_cfg
updates so that in scale tests these noises will be isolated.

> Also I don't think this option will be used during live deployments. So
instead of adding this as configuration
> in the ovs db, I would suggest to add an ovn-appctl command -
debug/set-delay-nb-cfg-report (or  a more
> appropriate name) so that a control plane latency test suite can set this
option -
>    ovn-appctl -t ovn-controller  debug/set-delay-nb-cfg-report 3 (for
example).
>

Agree. I will send a v2.

> Thanks
> Numan
>
>
>
>>> >> ---
>>> >>  controller/ovn-controller.8.xml | 12 ++++++++++++
>>> >>  controller/ovn-controller.c     |  9 +++++++++
>>> >>  2 files changed, 21 insertions(+)
>>> >>
>>> >> diff --git a/controller/ovn-controller.8.xml
>>> b/controller/ovn-controller.8.xml
>>> >> index 6687731..4cb3f41 100644
>>> >> --- a/controller/ovn-controller.8.xml
>>> >> +++ b/controller/ovn-controller.8.xml
>>> >> @@ -233,6 +233,18 @@
>>> >>          The boolean flag indicates if the chassis is used as an
>>> >>          interconnection gateway.
>>> >>        </dd>
>>> >> +
>>> >> +      <dt><code>external_ids:ovn-delay-nb-cfg-report</code></dt>
>>> >> +      <dd>
>>> >> +        An integer that represents the time (in seconds) to sleep
before
>>> >> +        <code>ovn-controller</code> updating <code>nb_cfg</code>
column
>>> >> +        of <code>Chassis_Private</code> table after it completes
>>> processing
>>> >> +        the corresponding changes. By default it is 0 (meaning do
not
>>> sleep).
>>> >> +        This value can be configured when doing control plane
latency
>>> testing,
>>> >> +        to make sure the updates of <code>nb_cfg</code> to the
>>> Southbound DB
>>> >> +        from a large number of hypervisors do not interfere with the
>>> original
>>> >> +        Southbound DB data processing.
>>> >> +      </dd>
>>> >>      </dl>
>>> >>
>>> >>      <p>
>>> >> diff --git a/controller/ovn-controller.c
b/controller/ovn-controller.c
>>> >> index 89f10a6..5ab5a13 100644
>>> >> --- a/controller/ovn-controller.c
>>> >> +++ b/controller/ovn-controller.c
>>> >> @@ -2568,6 +2568,15 @@ main(int argc, char *argv[])
>>> >>
 sbrec_chassis_private_set_nb_cfg(chassis_private,
>>> cur_cfg);
>>> >>                      sbrec_chassis_private_set_nb_cfg_timestamp(
>>> >>                          chassis_private, time_wall_msec());
>>> >> +                    const struct ovsrec_open_vswitch *cfg =
>>> >> +                        ovsrec_open_vswitch_first(ovs_idl_loop.idl);
>>> >> +                    if (cfg) {
>>> >> +                        int delay_cfg_report =
>>> smap_get_int(&cfg->external_ids,
>>> >> +
>>> "ovn-delay-nb-cfg-report", 0);
>>> >> +                        if (delay_cfg_report) {
>>> >> +                            xsleep(delay_cfg_report);
>>> >> +                        }
>>> >> +                    }
>>> >
>>> >
>>> > Hi Han,
>>> >
>>> > Maybe we can also sync this option from the ovs config to chassis
>>> other_config in south db ?
>>> >
>>> > What do you think ?
>>> >
>>>
>>> Hi Numan,
>>>
>>> I am not sure what's the purpose of syncing this to SB? It is used only
by
>>> ovn-controller, something similar to ovn-remote-probe-interval, etc.
>>
>>
>> We do copy some of the configs from ovs db to chassis other_config. But
I agree this config doesnt need to go to sb and probably CMS wouldn't care
much know about it.
>>
>> Thanks
>> Numan
>>
>>>
>>> Thanks,
>>> Han
>>>
>>> > Thanks
>>> > Numan
>>> >
>>> >>
>>> >>                  }
>>> >>              }
>>> >>
>>> >> --
>>> >> 2.1.0
>>> >>
>>> >> _______________________________________________
>>> >> dev mailing list
>>> >> dev at openvswitch.org
>>> >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>> >>
>>> _______________________________________________
>>> dev mailing list
>>> dev at openvswitch.org
>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>>


More information about the dev mailing list