[ovs-dev] [PATCH] ovn-controller: Fix busy loop when sb disconnected.

Han Zhou zhouhan at gmail.com
Wed Apr 3 04:35:27 UTC 2019


Thanks Numan for the review. Please see my response below.

On Tue, Apr 2, 2019 at 3:59 AM Numan Siddique <nusiddiq at redhat.com> wrote:
>
>
>
> On Mon, Apr 1, 2019 at 1:15 AM Han Zhou <zhouhan at gmail.com> wrote:
>>
>> From: Han Zhou <hzhou8 at ebay.com>
>>
>> In the main loop, if the SB DB is disconnected when there is a pending
>> transaction, there can be busy loop causing 100% CPU of ovn-controller,
>> until SB DB is connected again.
>>
>> The root cause is that when a transaction is pending,
ovsdb_idl_loop_run()
>> will return NULL for ovsdb_idl_txn, and chassis_run() returns NULL when
>> ovnsb_idl_txn is NULL, so the condition if (br_int && chassis) is not
>> satisfied and so ofctrl_run() is not executed in the main loop. If there
>> is any message pending from br-int.mgmt, such as OFPT_BARRIER_REPLY or
>> OFPTYPE_ECHO_REQUEST, the main loop will be woken up again and again
>> because those messages are not processed because ofctrl_run() is not
>> invoked.
>>
>> This patch fixes the problem by returning chassis_rec by chassis_run()
>> whenever the chassis_rec is found, even if ovnsb_idl_txn is NULL (if
>> ovnsb_idl_txn is NULL, it just don't do any update to the DB). With
>> this fix, ofctrl_run() can still be invoked even when there is a pending
>> transaction, so the busy loop doesn't happen any more in this situation.
>
>
> Since ofctrl_run(), doesn't take 'chassis' record as input, can't we call
this function
> when just br_int is non NULL ?
>
> i.e
>
> if (br_int) {
>     ofctrl_run(..);
> }
>

I wanted to make as small change as possible and to avoid changing the
order of the function calls. However, your suggest seems good, because it
can avoid busy loop before SB DB is connected (when chassis == NULL, and
there is OF messages pending on br-int.mgmt. I will move ofctrl_run()
before the if-condition in v2.

> With the propoed patch other functions - patch_run(), pinctrl_run(),
lflow_run(), physical_run()
> would get called even if the ovnsb_idl_txn is NULL.
>
What I think is, if the function doesn't depend on ovnsb_idl_txn, they
should be called; if it does depend on ovnsb_idl_txn, it is checked in the
function before using it, e.g. pinctrl_run() and patch_run(). So it is not
harmful to call them anyway. Do you see a problem there?

> Thanks
> Numan
>
>>
>> Signed-off-by: Han Zhou <hzhou8 at ebay.com>
>> ---
>>  ovn/controller/chassis.c | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/ovn/controller/chassis.c b/ovn/controller/chassis.c
>> index 3ea908d..8fe5565 100644
>> --- a/ovn/controller/chassis.c
>> +++ b/ovn/controller/chassis.c
>> @@ -82,8 +82,10 @@ chassis_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
>>              const char *chassis_id,
>>              const struct ovsrec_bridge *br_int)
>>  {
>
>
> I think the comment of this function needs to be updated accordingly.
>
Sure, I will update it in v2.

>>
>> +    const struct sbrec_chassis *chassis_rec
>> +        = chassis_lookup_by_name(sbrec_chassis_by_name, chassis_id);
>>      if (!ovnsb_idl_txn) {
>> -        return NULL;
>> +        return chassis_rec;
>>      }
>>
>>      const struct ovsrec_open_vswitch *cfg;
>> @@ -148,8 +150,6 @@ chassis_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
>>      ds_chomp(&iface_types, ',');
>>      const char *iface_types_str = ds_cstr(&iface_types);
>>
>> -    const struct sbrec_chassis *chassis_rec
>> -        = chassis_lookup_by_name(sbrec_chassis_by_name, chassis_id);
>>      const char *encap_csum = smap_get_def(&cfg->external_ids,
>>                                            "ovn-encap-csum", "true");
>>      int n_encaps = count_1bits(req_tunnels);
>> --
>> 2.1.0
>>
>> _______________________________________________
>> dev mailing list
>> dev at openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev


More information about the dev mailing list