[ovs-dev] [PATCH ovn 1/2] Revert "northd: Don't poll ovsdb before the connection is fully established"

Fri Sep 17 20:34:30 UTC 2021

On Thu, Sep 16, 2021 at 8:05 PM Zhen Wang <zhewang at nvidia.com> wrote:
>
> From: zhen wang <zhewang at nvidia.com>
>
> This reverts commit 1e59feea933610b28fd4442243162ce35595cfee.
> Above commit introduced a bug when multiple ovn-northd instances work in HA
> mode. If SB leader and active ovn-northd instance got killed by system power
> outage, standby ovn-northd instance would never detect the failure.
>

Thanks Zhen! I added Renat and Numan, who worked on the reverted commit, to
CC so that they can comment on whether this is OK.

For the commit message, I think it can be decoupled from the HA scenario
that the other patch in this series is supposed to fix. The issue this patch
fixes is that northd does not send probes until the initial NB download is
complete. So if the DB server goes down ungracefully before northd reads the
NB_Global options, northd never probes and therefore never reconnects to the
new leader. (This is related to RAFT, but whether there are multiple northds
is irrelevant.)
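
To illustrate the failure mode, here is a minimal, self-contained sketch. It
is not OVN code: struct conn_model and detect_failure_ms() are hypothetical
names, and the "probe after one idle interval, give up after a second one"
timing is a simplifying assumption about how an inactivity probe behaves.

```c
#include <stdbool.h>

/* Hypothetical model of an idle client connection (not an OVN/OVS type). */
struct conn_model {
    int probe_interval_ms;  /* 0 means inactivity probes are disabled. */
    bool peer_alive;        /* false: peer died without closing the socket. */
};

/* Returns the time (ms) at which the client would declare the peer dead,
 * or -1 if it would not notice within 'horizon_ms'.
 *
 * Assumption: after one idle interval a probe is sent, and after a second
 * interval with no reply the connection is dropped. */
long long
detect_failure_ms(const struct conn_model *c, long long horizon_ms)
{
    if (c->peer_alive) {
        return -1;          /* Nothing to detect. */
    }
    if (c->probe_interval_ms <= 0) {
        return -1;          /* No probes: a dead peer looks like an idle one. */
    }

    long long t = 2LL * c->probe_interval_ms;
    return t <= horizon_ms ? t : -1;
}
```

Under these assumptions, an initial interval of 0 with a dead peer is never
detected, whereas an initial interval of 5000 ms is detected after about 10
seconds, at which point the client can reconnect to another server.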

As to the original commit that is reverted by this one:

    northd: Don't poll ovsdb before the connection is fully established

    Set initial SB and NB DBs probe interval to 0 to avoid connection
    flapping.

    Before configured in northd_probe_interval value is actually applied
    to southbound and northbound database connections, both connections
    must be fully established, otherwise ovnnb_db_run() will return
    without retrieving configuration data from northbound DB. In cases
    when southbound database is big enough, default interval of 5 seconds
    will kill and retry the connection before it is fully established, no
    matter what is set in northd_probe_interval. Client reconnect will
    cause even more load to ovsdb-server and cause cascade effect, so
    northd can never stabilise. We have more than 2000 ports in our lab,
    and northd could not start before this patch, holding at 100% CPU
    utilisation both itself and ovsdb-server.

    After connections are established, any value in northd_probe_interval,
    or default DEFAULT_PROBE_INTERVAL_MSEC is applied correctly.

I am not sure how the commit would help. There are at most 3 - 5 northds
(in practice). Suppose there are tens or hundreds of ovn-controllers keeping
SB busy: that adds only 3 - 5 more clients retrying the SB connection a few
times. If NB is not that busy (most likely), these northd clients should get
the proper probe settings applied soon without causing further issues. So I
don't think the default 5-second probe would cause a cascade effect during
the initial period. @Renat @Numan, please correct me if I am wrong.

Thanks,
Han

> Signed-off-by: zhen wang <zhewang at nvidia.com>
> ---
>  northd/northd.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/northd/northd.c b/northd/northd.c
> index 688a6e4ef..b7e64470f 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -74,8 +74,8 @@ static bool use_ct_inv_match = true;
>
>  /* Default probe interval for NB and SB DB connections. */
>  #define DEFAULT_PROBE_INTERVAL_MSEC 5000
> -static int northd_probe_interval_nb = 0;
> -static int northd_probe_interval_sb = 0;
> +static int northd_probe_interval_nb = DEFAULT_PROBE_INTERVAL_MSEC;
> +static int northd_probe_interval_sb = DEFAULT_PROBE_INTERVAL_MSEC;
>  #define MAX_OVN_TAGS 4096
>
>  /* Pipeline stages. */
> --
> 2.20.1
>