[ovs-dev] [PATCH ovn v3] binding: Fix the crashes seen when port binding type changes.

Dumitru Ceara dceara at redhat.com
Tue Mar 23 16:24:40 UTC 2021


On 3/22/21 7:44 PM, Numan Siddique wrote:
> On Fri, Mar 19, 2021 at 8:10 PM Dumitru Ceara <dceara at redhat.com> wrote:
>>
>> On 3/19/21 1:05 PM, numans at ovn.org wrote:
>>> From: Numan Siddique <numans at ovn.org>
>>>
>>> When a port binding type changes from type 'A' to type 'B', then
>>> there are many code paths in the existing binding.c which results
>>> in crashes due to use-after-free or NULL references.
>>>
>>> Below crashes are seen when a container lport is changed to a normal
>>> lport and then deleted.
>>>
>>> ***
>>>  (gdb) bt
>>>    0  in raise () from /lib64/libc.so.6
>>>    1  in abort () from /lib64/libc.so.6
>>>    2  ovs_abort_valist ("%s: assertion %s failed in %s()") at lib/util.c:419
>>>    3  vlog_abort_valist ("%s: assertion %s failed in %s()") at lib/vlog.c:1249
>>>    4  vlog_abort ("%s: assertion %s failed in %s()") at lib/vlog.c:1263
>>>    5  ovs_assert_failure (where="lib/ovsdb-idl.c:4653",
>>>                           function="ovsdb_idl_txn_write__",
>>>                           condition="row->new_datum != NULL") at lib/util.c:86
>>>    6  ovsdb_idl_txn_write__ () at lib/ovsdb-idl.c:4695
>>>    7  ovsdb_idl_txn_write_clone () at lib/ovsdb-idl.c:4757
>>>    8  sbrec_port_binding_set_chassis () at lib/ovn-sb-idl.c:25946
>>>    9  release_lport () at controller/binding.c:971
>>>   10  release_local_binding_children () at controller/binding.c:1039
>>>   11  release_local_binding () at controller/binding.c:1056
>>>   12  consider_iface_release (iface_rec=.. iface_id="bb43e818-b2ee-4329-b67e-218556580056") at controller/binding.c:1880
>>>   13  binding_handle_ovs_interface_changes () at controller/binding.c:1998
>>>   14  runtime_data_ovs_interface_handler () at controller/ovn-controller.c:1481
>>>   15  engine_compute () at lib/inc-proc-eng.c:306
>>>   16  engine_run_node () at lib/inc-proc-eng.c:352
>>>   17  engine_run () at lib/inc-proc-eng.c:377
>>>   18  main () at controller/ovn-controller.c:2826
>>>
>>> The present code creates a 'struct local_binding' instance for a
>>> container lport and adds this object to the parent local binding
>>> children list.  And if the container lport is changed to a normal
>>> vif, then there is no way to access the local binding object created
>>> earlier.  This patch fixes these type of issues by refactoring the
>>> 'local binding' code of binding.c.  This patch now creates only one
>>> instance of 'struct local_binding' for every OVS interface with
>>> external_ids:iface-id set.  A new structure 'struct binding_lport' is
>>> added which is created for a VIF, container and virtual port bindings
>>> and is stored in 'binding_lports' shash.  'struct local_binding' now
>>> maintains a list of binding_lports which it maps to.
>>>
>>> When a container lport is changed to a normal lport, we now can
>>> easily access the 'binding_lport' object of the container lport
>>> fron the 'binding_lports' shash.
>>>
>>> Reported-by: Dumitru Ceara <dceara at redhat.com>
>>> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1936328
>>> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1936331
>>>
>>> Co-authored-by: Dumitru Ceara <dceara at redhat.com>
>>> Signed-off-by: Dumitru Ceara <dceara at redhat.com>
>>> Signed-off-by: Numan Siddique <numans at ovn.org>
>>> ---
>>
>> Hi Numan,
>>
>> I didn't do a full review of the code yet but one thing comes to mind.
>>
>> Wouldn't it simplify the code if we always dealt with an updated port
>> binding record in the same way we handle a "delete" folowed by an "insert"?
>>
>> We already store all the relevant information about the old port binding
>> (and if we don't we maybe should) so we could process every port binding
>> update as:
>> a. first delete the port binding resources based on the information we
>> have in memory about the old port binding record.
>> b. insert the new port binding based on the new contents of the record.
>>
> Hi Dumitru,

Hi Numan,

> 
> Thanks for the review comments.  I am thinking about your suggestion.
> I'm not very sure if this would make the code easier. I have few questions
> for you and few observations.
> 
> 1.  Let's say we have a lport by name - "lp1" of type VIF.  When
>      an OVS interface is created for it we claim the lport and set the
> port_binding.chassis
>      column. Once the ovn-controller gets the update jsonrpc message
> from ovsdb-server,
>     what should we do ? Because it will be a port_binding update ?
>     From what I understand from your comment, we should consider it as
> delete first - which means we need
>     to delete the binding_lport object for this and remove the lport
> 'lp1' from the b_ctx_out->local_lports,
>    b_ctx_out->local_lport_ids etc And then consider it as an insert,
> in which case
>     we will create the binding_lport again and also add back in
> local_lports and local_lport_ids.
>     We will also end up calling remove_local_lport_ids() first and
> then update_local_lport_ids().
>     This may cause unnecessary deletions and additions in tracked_datapaths.
>     Correct me If I'm wrong here.
> 
>     Is this what you think should happen ?
> 
> 2.  Lets say the lport - "lp1" changes its type from VIF to external.
> If suppose it was bound
>      as normal VIF, then we need to clear the chassis column of the
> port bindings.  But if we consider
>     it as delete first, then we will not be sure if we can call
> 'sbrec_port_binding_set_chassis(pb, NULL)
>     unless we add a check if the pb can be accessed or not.  I think
> this case will be tricky to handle.
>     I'm not sure if we can consider a port binding update as a delete
> followed by insert.

I was kind of vague in my initial question, I apologize.  I was thinking
more of abstracting the operations we do for various Southbound updates.
For example, a:
- southbound port binding delete triggers:
  a. a "binding release" in the binding.c in-memory binding
     representation.
- southbound port binding insert triggers:
  a. a "binding create" in the binding.c in-memory binding
     representation (e.g., when the OVS interface is added first).
  b. a "binding claim" in the binding.c in-memory binding
     representation.
- soutbound port binding update triggers:
  a. if the update is "destructive" (e.g., port binding type changes),
     execute the "binding release" followed by "binding create"/"claim"
     operations from the two above cases.

It was more of a thought as I was struggling to imagine all possible
combinations of updates but I guess we can discuss more options like
this as a follow up if/when we decide to refactor/harden binding.c.

> 
>> Currently binding_handle_port_binding_changes() seems to have a lot of
>> special cases to deal exactly with this kind of change.  It's quite hard
>> to validate if we missed any.
> 
> You mean this patch adds a lot of special cases ? or  these special
> handling already exists in the present code
> and this patch also makes it complicated ?

This patch adds some cases too but that's expected, I guess, because
it's fixing bugs :)

> 
> Let me know if I misunderstood you and also your comments.
> 
> Thanks
> 
>>
> 
> 
> 
> 
> 
>> I have a couple more comments below.
>>
>> Thanks,
>> Dumitru
>>
>>>
>>> v2 -> v3
>>> ----
>>>  * Fixed a crash seen when a container port is changed to normal port
>>>    and then deleted in the same transaction.  Added the relevant test case
>>>    for this.
>>>
>>> v1 -> v2
>>> ----
>>>  * Fixed a crash seen when 2 container ports are changed to normal ports
>>>    in the same transaction.  Added the relevant test case for this.
>>>
>>>  controller/binding.c        | 968 +++++++++++++++++++++++-------------
>>>  controller/binding.h        |  32 +-
>>>  controller/ovn-controller.c |  25 +-
>>>  controller/physical.c       |  13 +-
>>>  tests/ovn.at                | 247 +++++++++
>>>  5 files changed, 887 insertions(+), 398 deletions(-)
>>>
>>> diff --git a/controller/binding.c b/controller/binding.c
>>> index 4e6c756969..40276d1b12 100644
>>> --- a/controller/binding.c
>>> +++ b/controller/binding.c
>>> @@ -597,6 +597,23 @@ remove_local_lport_ids(const struct sbrec_port_binding *pb,
>>>      }
>>>  }
>>>
>>
>> <snip>
>>
>>> +
>>> +static void
>>> +local_binding_destroy(struct local_binding *lbinding,
>>> +                      struct shash *binding_lports)
>>
>> I'm sure there's a good reason to pass binding_lports as NULL sometimes
>> but it's not really clear why and makes me wonder if there are cases
>> when we can leak stuff.
>>
>> A comment might be helpful.
> 
> Ack.
> 
>>
>>> +{
>>> +    struct binding_lport *b_lport;
>>> +    LIST_FOR_EACH_POP (b_lport, list_node, &lbinding->binding_lports) {
>>> +        b_lport->lbinding = NULL;
>>> +        if (binding_lports) {
>>> +            binding_lport_delete(binding_lports, b_lport);
>>> +        }
>>> +    }
>>> +
>>> +    free(lbinding->name);
>>> +    free(lbinding);
>>> +}
>>> +
>>> +static void
>>> +local_binding_delete(struct local_binding *lbinding,
>>> +                     struct shash *local_bindings,
>>> +                     struct shash *binding_lports)
>>> +{
>>> +    shash_find_and_delete(local_bindings, lbinding->name);
>>> +    local_binding_destroy(lbinding, binding_lports);
>>> +}
>>> +
>>> +/* Returns the primary binding lport if present in lbinding's
>>> + * binding lports list.  A binding lport is considered primary
>>> + * if binding lport's type is LP_VIF and the name matches
>>> + * with the 'lbinding'.
>>> + */
>>> +static struct binding_lport *
>>> +local_binding_get_primary_lport(struct local_binding *lbinding)
>>> +{
>>> +    if (!lbinding) {
>>> +        return NULL;
>>> +    }
>>> +
>>> +    struct binding_lport *b_lport;
>>
>> I might be wrong, but isn't it easier to just store the "primary lport"
>> as a separate field in 'struct local_binding' instead of having to walk
>> the list every time we need it?
> 
> I thought about it.  My concern was when an existing primary lport changes
> its type to 'external' or other type, how to handle it ? or if the lport type
> changes from container to VIF which becomes primart and how to handle it.
> 
> May be it can be handled much more easier than I thought. I'll explore more
> and try to address your comment.

Cool, thanks!

> 
>>
>> <snip>
>>
>>> diff --git a/controller/binding.h b/controller/binding.h
>>> index c9ebef4b17..f75a787dad 100644
>>> --- a/controller/binding.h
>>> +++ b/controller/binding.h
>>> @@ -56,7 +56,7 @@ struct binding_ctx_in {
>>>
>>>  struct binding_ctx_out {
>>>      struct hmap *local_datapaths;
>>> -    struct shash *local_bindings;
>>> +    struct local_binding_data *lbinding_data;
>>>
>>>      /* sset of (potential) local lports. */
>>>      struct sset *local_lports;
>>> @@ -86,28 +86,16 @@ struct binding_ctx_out {
>>>      struct hmap *tracked_dp_bindings;
>>>  };
>>>
>>> -enum local_binding_type {
>>> -    BT_VIF,
>>> -    BT_CONTAINER,
>>> -    BT_VIRTUAL
>>> +struct local_binding_data {
>>> +    struct shash local_bindings;
>>> +    struct shash binding_lports;
>>>  };
>>
>> This structure is already refering to "local bindings".  Maybe it's more
>> clear if we define it like below?
>>
>> struct local_binding_data {
>>     struct shash bindings;
>>     struct shash lports;
>> };
> 
> Ack. Sounds good.
> 
> Thanks
> Numan
> 

I noticed you sent out a v4, I'll continue the review on it.

Thanks,
Dumitru



More information about the dev mailing list