[ovs-dev] [PATCH ovn v3] binding: Fix the crashes seen when port binding type changes.

Numan Siddique numans at ovn.org
Mon Mar 22 18:44:04 UTC 2021


On Fri, Mar 19, 2021 at 8:10 PM Dumitru Ceara <dceara at redhat.com> wrote:
>
> On 3/19/21 1:05 PM, numans at ovn.org wrote:
> > From: Numan Siddique <numans at ovn.org>
> >
> > When a port binding type changes from type 'A' to type 'B', then
> > there are many code paths in the existing binding.c which results
> > in crashes due to use-after-free or NULL references.
> >
> > Below crashes are seen when a container lport is changed to a normal
> > lport and then deleted.
> >
> > ***
> >  (gdb) bt
> >    0  in raise () from /lib64/libc.so.6
> >    1  in abort () from /lib64/libc.so.6
> >    2  ovs_abort_valist ("%s: assertion %s failed in %s()") at lib/util.c:419
> >    3  vlog_abort_valist ("%s: assertion %s failed in %s()") at lib/vlog.c:1249
> >    4  vlog_abort ("%s: assertion %s failed in %s()") at lib/vlog.c:1263
> >    5  ovs_assert_failure (where="lib/ovsdb-idl.c:4653",
> >                           function="ovsdb_idl_txn_write__",
> >                           condition="row->new_datum != NULL") at lib/util.c:86
> >    6  ovsdb_idl_txn_write__ () at lib/ovsdb-idl.c:4695
> >    7  ovsdb_idl_txn_write_clone () at lib/ovsdb-idl.c:4757
> >    8  sbrec_port_binding_set_chassis () at lib/ovn-sb-idl.c:25946
> >    9  release_lport () at controller/binding.c:971
> >   10  release_local_binding_children () at controller/binding.c:1039
> >   11  release_local_binding () at controller/binding.c:1056
> >   12  consider_iface_release (iface_rec=.. iface_id="bb43e818-b2ee-4329-b67e-218556580056") at controller/binding.c:1880
> >   13  binding_handle_ovs_interface_changes () at controller/binding.c:1998
> >   14  runtime_data_ovs_interface_handler () at controller/ovn-controller.c:1481
> >   15  engine_compute () at lib/inc-proc-eng.c:306
> >   16  engine_run_node () at lib/inc-proc-eng.c:352
> >   17  engine_run () at lib/inc-proc-eng.c:377
> >   18  main () at controller/ovn-controller.c:2826
> >
> > The present code creates a 'struct local_binding' instance for a
> > container lport and adds this object to the parent local binding
> > children list.  And if the container lport is changed to a normal
> > vif, then there is no way to access the local binding object created
> > earlier.  This patch fixes these type of issues by refactoring the
> > 'local binding' code of binding.c.  This patch now creates only one
> > instance of 'struct local_binding' for every OVS interface with
> > external_ids:iface-id set.  A new structure 'struct binding_lport' is
> > added which is created for a VIF, container and virtual port bindings
> > and is stored in 'binding_lports' shash.  'struct local_binding' now
> > maintains a list of binding_lports which it maps to.
> >
> > When a container lport is changed to a normal lport, we now can
> > easily access the 'binding_lport' object of the container lport
> > fron the 'binding_lports' shash.
> >
> > Reported-by: Dumitru Ceara <dceara at redhat.com>
> > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1936328
> > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1936331
> >
> > Co-authored-by: Dumitru Ceara <dceara at redhat.com>
> > Signed-off-by: Dumitru Ceara <dceara at redhat.com>
> > Signed-off-by: Numan Siddique <numans at ovn.org>
> > ---
>
> Hi Numan,
>
> I didn't do a full review of the code yet but one thing comes to mind.
>
> Wouldn't it simplify the code if we always dealt with an updated port
> binding record in the same way we handle a "delete" folowed by an "insert"?
>
> We already store all the relevant information about the old port binding
> (and if we don't we maybe should) so we could process every port binding
> update as:
> a. first delete the port binding resources based on the information we
> have in memory about the old port binding record.
> b. insert the new port binding based on the new contents of the record.
>
Hi Dumitru,

Thanks for the review comments.  I am thinking about your suggestion.
I'm not very sure if this would make the code easier. I have few questions
for you and few observations.

1.  Let's say we have a lport by name - "lp1" of type VIF.  When
     an OVS interface is created for it we claim the lport and set the
port_binding.chassis
     column. Once the ovn-controller gets the update jsonrpc message
from ovsdb-server,
    what should we do ? Because it will be a port_binding update ?
    From what I understand from your comment, we should consider it as
delete first - which means we need
    to delete the binding_lport object for this and remove the lport
'lp1' from the b_ctx_out->local_lports,
   b_ctx_out->local_lport_ids etc And then consider it as an insert,
in which case
    we will create the binding_lport again and also add back in
local_lports and local_lport_ids.
    We will also end up calling remove_local_lport_ids() first and
then update_local_lport_ids().
    This may cause unnecessary deletions and additions in tracked_datapaths.
    Correct me If I'm wrong here.

    Is this what you think should happen ?

2.  Lets say the lport - "lp1" changes its type from VIF to external.
If suppose it was bound
     as normal VIF, then we need to clear the chassis column of the
port bindings.  But if we consider
    it as delete first, then we will not be sure if we can call
'sbrec_port_binding_set_chassis(pb, NULL)
    unless we add a check if the pb can be accessed or not.  I think
this case will be tricky to handle.
    I'm not sure if we can consider a port binding update as a delete
followed by insert.

> Currently binding_handle_port_binding_changes() seems to have a lot of
> special cases to deal exactly with this kind of change.  It's quite hard
> to validate if we missed any.

You mean this patch adds a lot of special cases ? or  these special
handling already exists in the present code
and this patch also makes it complicated ?

Let me know if I misunderstood you and also your comments.

Thanks

>





> I have a couple more comments below.
>
> Thanks,
> Dumitru
>
> >
> > v2 -> v3
> > ----
> >  * Fixed a crash seen when a container port is changed to normal port
> >    and then deleted in the same transaction.  Added the relevant test case
> >    for this.
> >
> > v1 -> v2
> > ----
> >  * Fixed a crash seen when 2 container ports are changed to normal ports
> >    in the same transaction.  Added the relevant test case for this.
> >
> >  controller/binding.c        | 968 +++++++++++++++++++++++-------------
> >  controller/binding.h        |  32 +-
> >  controller/ovn-controller.c |  25 +-
> >  controller/physical.c       |  13 +-
> >  tests/ovn.at                | 247 +++++++++
> >  5 files changed, 887 insertions(+), 398 deletions(-)
> >
> > diff --git a/controller/binding.c b/controller/binding.c
> > index 4e6c756969..40276d1b12 100644
> > --- a/controller/binding.c
> > +++ b/controller/binding.c
> > @@ -597,6 +597,23 @@ remove_local_lport_ids(const struct sbrec_port_binding *pb,
> >      }
> >  }
> >
>
> <snip>
>
> > +
> > +static void
> > +local_binding_destroy(struct local_binding *lbinding,
> > +                      struct shash *binding_lports)
>
> I'm sure there's a good reason to pass binding_lports as NULL sometimes
> but it's not really clear why and makes me wonder if there are cases
> when we can leak stuff.
>
> A comment might be helpful.

Ack.

>
> > +{
> > +    struct binding_lport *b_lport;
> > +    LIST_FOR_EACH_POP (b_lport, list_node, &lbinding->binding_lports) {
> > +        b_lport->lbinding = NULL;
> > +        if (binding_lports) {
> > +            binding_lport_delete(binding_lports, b_lport);
> > +        }
> > +    }
> > +
> > +    free(lbinding->name);
> > +    free(lbinding);
> > +}
> > +
> > +static void
> > +local_binding_delete(struct local_binding *lbinding,
> > +                     struct shash *local_bindings,
> > +                     struct shash *binding_lports)
> > +{
> > +    shash_find_and_delete(local_bindings, lbinding->name);
> > +    local_binding_destroy(lbinding, binding_lports);
> > +}
> > +
> > +/* Returns the primary binding lport if present in lbinding's
> > + * binding lports list.  A binding lport is considered primary
> > + * if binding lport's type is LP_VIF and the name matches
> > + * with the 'lbinding'.
> > + */
> > +static struct binding_lport *
> > +local_binding_get_primary_lport(struct local_binding *lbinding)
> > +{
> > +    if (!lbinding) {
> > +        return NULL;
> > +    }
> > +
> > +    struct binding_lport *b_lport;
>
> I might be wrong, but isn't it easier to just store the "primary lport"
> as a separate field in 'struct local_binding' instead of having to walk
> the list every time we need it?

I thought about it.  My concern was when an existing primary lport changes
its type to 'external' or other type, how to handle it ? or if the lport type
changes from container to VIF which becomes primart and how to handle it.

May be it can be handled much more easier than I thought. I'll explore more
and try to address your comment.

>
> <snip>
>
> > diff --git a/controller/binding.h b/controller/binding.h
> > index c9ebef4b17..f75a787dad 100644
> > --- a/controller/binding.h
> > +++ b/controller/binding.h
> > @@ -56,7 +56,7 @@ struct binding_ctx_in {
> >
> >  struct binding_ctx_out {
> >      struct hmap *local_datapaths;
> > -    struct shash *local_bindings;
> > +    struct local_binding_data *lbinding_data;
> >
> >      /* sset of (potential) local lports. */
> >      struct sset *local_lports;
> > @@ -86,28 +86,16 @@ struct binding_ctx_out {
> >      struct hmap *tracked_dp_bindings;
> >  };
> >
> > -enum local_binding_type {
> > -    BT_VIF,
> > -    BT_CONTAINER,
> > -    BT_VIRTUAL
> > +struct local_binding_data {
> > +    struct shash local_bindings;
> > +    struct shash binding_lports;
> >  };
>
> This structure is already refering to "local bindings".  Maybe it's more
> clear if we define it like below?
>
> struct local_binding_data {
>     struct shash bindings;
>     struct shash lports;
> };

Ack. Sounds good.

Thanks
Numan

>
> _______________________________________________
> dev mailing list
> dev at openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>


More information about the dev mailing list