[ovs-dev] [PATCH] ovn-controller: Fix busy loop when sb disconnected.

Numan Siddique nusiddiq at redhat.com
Tue Apr 2 10:59:40 UTC 2019


On Mon, Apr 1, 2019 at 1:15 AM Han Zhou <zhouhan at gmail.com> wrote:

> From: Han Zhou <hzhou8 at ebay.com>
>
> In the main loop, if the SB DB is disconnected when there is a pending
> transaction, there can be busy loop causing 100% CPU of ovn-controller,
> until SB DB is connected again.
>
> The root cause is that when a transaction is pending, ovsdb_idl_loop_run()
> will return NULL for ovsdb_idl_txn, and chassis_run() returns NULL when
> ovnsb_idl_txn is NULL, so the condition if (br_int && chassis) is not
> satisfied and so ofctrl_run() is not executed in the main loop. If there
> is any message pending from br-int.mgmt, such as OFPT_BARRIER_REPLY or
> OFPTYPE_ECHO_REQUEST, the main loop will be woken up again and again
> because those messages are not processed because ofctrl_run() is not
> invoked.
>
> This patch fixes the problem by returning chassis_rec by chassis_run()
> whenever the chassis_rec is found, even if ovnsb_idl_txn is NULL (if
> ovnsb_idl_txn is NULL, it just don't do any update to the DB). With
> this fix, ofctrl_run() can still be invoked even when there is a pending
> transaction, so the busy loop doesn't happen any more in this situation.
>

Since ofctrl_run(), doesn't take 'chassis' record as input, can't we call
this function
when just br_int is non NULL ?

i.e

if (br_int) {
    ofctrl_run(..);
}

With the propoed patch other functions -
patch_run(), pinctrl_run(), lflow_run(), physical_run()
would get called even if the ovnsb_idl_txn is NULL.

Thanks
Numan


> Signed-off-by: Han Zhou <hzhou8 at ebay.com>
> ---
>  ovn/controller/chassis.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/ovn/controller/chassis.c b/ovn/controller/chassis.c
> index 3ea908d..8fe5565 100644
> --- a/ovn/controller/chassis.c
> +++ b/ovn/controller/chassis.c
> @@ -82,8 +82,10 @@ chassis_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
>              const char *chassis_id,
>              const struct ovsrec_bridge *br_int)
>  {
>

I think the comment of this function needs to be updated accordingly.


> +    const struct sbrec_chassis *chassis_rec
> +        = chassis_lookup_by_name(sbrec_chassis_by_name, chassis_id);
>      if (!ovnsb_idl_txn) {
> -        return NULL;
> +        return chassis_rec;
>      }
>
>      const struct ovsrec_open_vswitch *cfg;
> @@ -148,8 +150,6 @@ chassis_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
>      ds_chomp(&iface_types, ',');
>      const char *iface_types_str = ds_cstr(&iface_types);
>
> -    const struct sbrec_chassis *chassis_rec
> -        = chassis_lookup_by_name(sbrec_chassis_by_name, chassis_id);
>      const char *encap_csum = smap_get_def(&cfg->external_ids,
>                                            "ovn-encap-csum", "true");
>      int n_encaps = count_1bits(req_tunnels);
> --
> 2.1.0
>
> _______________________________________________
> dev mailing list
> dev at openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>


More information about the dev mailing list