[ovs-dev] [PATCH ovn v3] binding: Fix the crashes seen when port binding type changes.

Dumitru Ceara dceara at redhat.com
Fri Mar 19 14:39:59 UTC 2021


On 3/19/21 1:05 PM, numans at ovn.org wrote:
> From: Numan Siddique <numans at ovn.org>
> 
> When a port binding type changes from type 'A' to type 'B', then
> there are many code paths in the existing binding.c which results
> in crashes due to use-after-free or NULL references.
> 
> Below crashes are seen when a container lport is changed to a normal
> lport and then deleted.
> 
> ***
>  (gdb) bt
>    0  in raise () from /lib64/libc.so.6
>    1  in abort () from /lib64/libc.so.6
>    2  ovs_abort_valist ("%s: assertion %s failed in %s()") at lib/util.c:419
>    3  vlog_abort_valist ("%s: assertion %s failed in %s()") at lib/vlog.c:1249
>    4  vlog_abort ("%s: assertion %s failed in %s()") at lib/vlog.c:1263
>    5  ovs_assert_failure (where="lib/ovsdb-idl.c:4653",
>                           function="ovsdb_idl_txn_write__",
>                           condition="row->new_datum != NULL") at lib/util.c:86
>    6  ovsdb_idl_txn_write__ () at lib/ovsdb-idl.c:4695
>    7  ovsdb_idl_txn_write_clone () at lib/ovsdb-idl.c:4757
>    8  sbrec_port_binding_set_chassis () at lib/ovn-sb-idl.c:25946
>    9  release_lport () at controller/binding.c:971
>   10  release_local_binding_children () at controller/binding.c:1039
>   11  release_local_binding () at controller/binding.c:1056
>   12  consider_iface_release (iface_rec=.. iface_id="bb43e818-b2ee-4329-b67e-218556580056") at controller/binding.c:1880
>   13  binding_handle_ovs_interface_changes () at controller/binding.c:1998
>   14  runtime_data_ovs_interface_handler () at controller/ovn-controller.c:1481
>   15  engine_compute () at lib/inc-proc-eng.c:306
>   16  engine_run_node () at lib/inc-proc-eng.c:352
>   17  engine_run () at lib/inc-proc-eng.c:377
>   18  main () at controller/ovn-controller.c:2826
> 
> The present code creates a 'struct local_binding' instance for a
> container lport and adds this object to the parent local binding
> children list.  And if the container lport is changed to a normal
> vif, then there is no way to access the local binding object created
> earlier.  This patch fixes these type of issues by refactoring the
> 'local binding' code of binding.c.  This patch now creates only one
> instance of 'struct local_binding' for every OVS interface with
> external_ids:iface-id set.  A new structure 'struct binding_lport' is
> added which is created for a VIF, container and virtual port bindings
> and is stored in 'binding_lports' shash.  'struct local_binding' now
> maintains a list of binding_lports which it maps to.
> 
> When a container lport is changed to a normal lport, we now can
> easily access the 'binding_lport' object of the container lport
> fron the 'binding_lports' shash.
> 
> Reported-by: Dumitru Ceara <dceara at redhat.com>
> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1936328
> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1936331
> 
> Co-authored-by: Dumitru Ceara <dceara at redhat.com>
> Signed-off-by: Dumitru Ceara <dceara at redhat.com>
> Signed-off-by: Numan Siddique <numans at ovn.org>
> ---

Hi Numan,

I didn't do a full review of the code yet but one thing comes to mind.

Wouldn't it simplify the code if we always dealt with an updated port
binding record in the same way we handle a "delete" folowed by an "insert"?

We already store all the relevant information about the old port binding
(and if we don't we maybe should) so we could process every port binding
update as:
a. first delete the port binding resources based on the information we
have in memory about the old port binding record.
b. insert the new port binding based on the new contents of the record.

Currently binding_handle_port_binding_changes() seems to have a lot of
special cases to deal exactly with this kind of change.  It's quite hard
to validate if we missed any.

I have a couple more comments below.

Thanks,
Dumitru

> 
> v2 -> v3
> ----
>  * Fixed a crash seen when a container port is changed to normal port
>    and then deleted in the same transaction.  Added the relevant test case
>    for this.
> 
> v1 -> v2
> ----
>  * Fixed a crash seen when 2 container ports are changed to normal ports
>    in the same transaction.  Added the relevant test case for this.
> 
>  controller/binding.c        | 968 +++++++++++++++++++++++-------------
>  controller/binding.h        |  32 +-
>  controller/ovn-controller.c |  25 +-
>  controller/physical.c       |  13 +-
>  tests/ovn.at                | 247 +++++++++
>  5 files changed, 887 insertions(+), 398 deletions(-)
> 
> diff --git a/controller/binding.c b/controller/binding.c
> index 4e6c756969..40276d1b12 100644
> --- a/controller/binding.c
> +++ b/controller/binding.c
> @@ -597,6 +597,23 @@ remove_local_lport_ids(const struct sbrec_port_binding *pb,
>      }
>  }
>  

<snip>

> +
> +static void
> +local_binding_destroy(struct local_binding *lbinding,
> +                      struct shash *binding_lports)

I'm sure there's a good reason to pass binding_lports as NULL sometimes
but it's not really clear why and makes me wonder if there are cases
when we can leak stuff.

A comment might be helpful.

> +{
> +    struct binding_lport *b_lport;
> +    LIST_FOR_EACH_POP (b_lport, list_node, &lbinding->binding_lports) {
> +        b_lport->lbinding = NULL;
> +        if (binding_lports) {
> +            binding_lport_delete(binding_lports, b_lport);
> +        }
> +    }
> +
> +    free(lbinding->name);
> +    free(lbinding);
> +}
> +
> +static void
> +local_binding_delete(struct local_binding *lbinding,
> +                     struct shash *local_bindings,
> +                     struct shash *binding_lports)
> +{
> +    shash_find_and_delete(local_bindings, lbinding->name);
> +    local_binding_destroy(lbinding, binding_lports);
> +}
> +
> +/* Returns the primary binding lport if present in lbinding's
> + * binding lports list.  A binding lport is considered primary
> + * if binding lport's type is LP_VIF and the name matches
> + * with the 'lbinding'.
> + */
> +static struct binding_lport *
> +local_binding_get_primary_lport(struct local_binding *lbinding)
> +{
> +    if (!lbinding) {
> +        return NULL;
> +    }
> +
> +    struct binding_lport *b_lport;

I might be wrong, but isn't it easier to just store the "primary lport"
as a separate field in 'struct local_binding' instead of having to walk
the list every time we need it?

<snip>

> diff --git a/controller/binding.h b/controller/binding.h
> index c9ebef4b17..f75a787dad 100644
> --- a/controller/binding.h
> +++ b/controller/binding.h
> @@ -56,7 +56,7 @@ struct binding_ctx_in {
>  
>  struct binding_ctx_out {
>      struct hmap *local_datapaths;
> -    struct shash *local_bindings;
> +    struct local_binding_data *lbinding_data;
>  
>      /* sset of (potential) local lports. */
>      struct sset *local_lports;
> @@ -86,28 +86,16 @@ struct binding_ctx_out {
>      struct hmap *tracked_dp_bindings;
>  };
>  
> -enum local_binding_type {
> -    BT_VIF,
> -    BT_CONTAINER,
> -    BT_VIRTUAL
> +struct local_binding_data {
> +    struct shash local_bindings;
> +    struct shash binding_lports;
>  };

This structure is already refering to "local bindings".  Maybe it's more
clear if we define it like below?

struct local_binding_data {
    struct shash bindings;
    struct shash lports;
};



More information about the dev mailing list