[ovs-dev] [PATCH ovn] controller: Avoid unnecessary load balancer flow processing.

Han Zhou hzhou at ovn.org
Fri Jul 9 01:46:54 UTC 2021


On Thu, Jul 8, 2021 at 1:10 PM Mark Michelson <mmichels at redhat.com> wrote:
>
> Acked-by: Mark Michelson <mmichels at redhat.com>
>
Thanks Dumitru and Mark!

>
> If I'm to extrapolate this a bit more, it sounds like any time a
> Datapath_Binding is updated, it results in the potential for many
> Logical_Flows to be re-calculated. In this particular case, it was
> triggered based on modifying a wide-reaching load balancer. I suppose
> the same thing would happen if Datapath_Binding external_ids were updated.
>
Yes. Luckily it seems external_ids of datapath_binding shouldn't change
very often. But I agree this is a general problem of the I-P.

> I wonder if it's worth investigating trying to create a new reference
> type for the purposes of the IDL. The idea would be that if a column in
> table A references rows in table B, then if the rows in table B are
> updated, then any relevant rows in table A will reported as updated as
> well. However, any columns in tables C, D, E, etc. that reference table
> A will not result in the rows in tables C, D, E, etc. being reported as
> updated.
>
> Essentially, it's the same as what you've done here, except we can have
> the IDL fill in the sbrec_load_balancer in the Datapath_Binding record
> instead of having to look it up when needed.
>
It may be helpful to have a new reference type, but deciding when and where
to use it would be very tricky. If not careful enough, we might be creating
some dependency in the code without getting proper change notification.

> Alternatively, would it be worth listing the relevant columns that a
> particular type "cares" about? For instance, Logical_Datapaths are only
> affected if the tunnel_key is updated on a Datapath_Binding. The
> load_balancers and external_ids are irrelevant. Would it be worthwhile
> to be able to note in the reference the columns that should elicit an
> update?

This sounds more reliable. Ideally, if referenced columns are noted, the
generated IDL structure of A's field B should only contains the columns
that are interested by A. This way we can make sure it is impossible to
secretly use a column without being notified of updates that possibly need
to be handled by I-P. I am not sure how hard this would be.

>
> Is it worth it? Or is this such a niche use case that it's not worth
> implementing?

I think we can see if there are more such scenarios that justify the IDL
change. I am ok with changing the schema for the current use case to
replace the reference by just raw uuid. However, we should never directly
use IDL from the ctx and access any table, this would break the I-P
dependency tracking. Please see more comments inlined.

>
> On 7/7/21 10:56 AM, Dumitru Ceara wrote:
> > Whenever a Load_Balancer is updated, e.g., a VIP is added, the following
> > sequence of events happens:
> >
> > 1. The Southbound Load_Balancer record is updated.
> > 2. The Southbound Datapath_Binding records on which the Load_Balancer is
> >     applied are updated.
> > 3. Southbound ovsdb-server sends updates about the Load_Balancer and
> >     Datapath_Binding records to ovn-controller.
> > 4. The IDL layer in ovn-controller processes the updates at #3, but
> >     because of the SB schema references between tables [0] all logical
> >     flows referencing the updated Datapath_Binding are marked as
> >     "updated".  The same is true for Logical_DP_Group records
> >     referencing the Datapath_Binding, and also for all logical flows
> >     pointing to the new "updated" datapath groups.
> > 5. ovn-controller ends up recomputing (removing/readding) all flows for
> >     all these tracked updates.
> >
> >  From the SB Schema:
> >          "Datapath_Binding": {
> >              "columns": {
> >                  [...]
> >                  "load_balancers": {"type": {"key": {"type": "uuid",
> >                                                     "refTable":
"Load_Balancer",
> >                                                     "refType": "weak"},
> >                                              "min": 0,
> >                                              "max": "unlimited"}},
> >          [...]
> >          "Load_Balancer": {
> >              "columns": {
> >                  "datapaths": {
> >                  [...]
> >                      "type": {"key": {"type": "uuid",
> >                                       "refTable": "Datapath_Binding"},
> >                               "min": 0, "max": "unlimited"}},
> >          [...]
> >          "Logical_DP_Group": {
> >              "columns": {
> >                  "datapaths":
> >                      {"type": {"key": {"type": "uuid",
> >                                        "refTable": "Datapath_Binding",
> >                                        "refType": "weak"},
> >                                "min": 0, "max": "unlimited"}}},
> >          [...]
> >          "Logical_Flow": {
> >              "columns": {
> >                  "logical_datapath":
> >                      {"type": {"key": {"type": "uuid",
> >                                        "refTable": "Datapath_Binding"},
> >                                "min": 0, "max": 1}},
> >                  "logical_dp_group":
> >                      {"type": {"key": {"type": "uuid",
> >                                        "refTable": "Logical_DP_Group"},
> >
> > In order to avoid this unnecessary Logical_Flow notification storm we
> > now remove the explicit reference from Datapath_Binding to
> > Load_Balancer and instead store raw UUIDs.
> >
> > This means that on the ovn-controller side we need to perform a
> > Load_Balancer table lookup by UUID whenever a new datapath is added,
> > but that doesn't happen too often and the cost of the lookup is
> > negligible compared to the huge cost of processing the unnecessary
> > logical flow updates.
> >
> > This change is backwards compatible because the contents stored in the
> > database are not changed, just that the schema constraints are relaxed a
> > bit.
> >
> > Some performance measurements, on a scale test deployment simulating an
> > ovn-kubernetes deployment with 120 nodes and a large load balancer
> > with 16K VIPs associated to each node's logical switch, the event
> > processing loop time in ovn-controller, when adding a new VIP, is
> > reduced from ~39 seconds to ~8 seconds.
> >
> > There's no need to change the northd DDlog implementation.
> >
> > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1978605
> > Signed-off-by: Dumitru Ceara <dceara at redhat.com>
> > ---
> >   controller/lflow.c          |  6 ++++--
> >   controller/lflow.h          |  2 ++
> >   controller/ovn-controller.c |  2 ++
> >   lib/inc-proc-eng.h          |  1 +
> >   northd/ovn-northd.c         | 14 ++++++--------
> >   ovn-sb.ovsschema            |  6 ++----
> >   6 files changed, 17 insertions(+), 14 deletions(-)
> >
> > diff --git a/controller/lflow.c b/controller/lflow.c
> > index 60aa011ff..877bd49b0 100644
> > --- a/controller/lflow.c
> > +++ b/controller/lflow.c
> > @@ -1744,8 +1744,10 @@ lflow_processing_end:
> >       /* Add load balancer hairpin flows if the datapath has any load
balancers
> >        * associated. */
> >       for (size_t i = 0; i < dp->n_load_balancers; i++) {
> > -        consider_lb_hairpin_flows(dp->load_balancers[i],
> > -                                  l_ctx_in->local_datapaths,
> > +        const struct sbrec_load_balancer *lb =
> > +            sbrec_load_balancer_get_for_uuid(l_ctx_in->sb_idl,
> > +                                             &dp->load_balancers[i]);

We should never directly access DB tables this way within the I-P engine.
We will lose track of the dependencies. Now that we removed the reference,
we should add load_balancer table as a new dependency, get the table data
through EN_OVSDB_GET, and explicitly add a change handler to avoid
recompute.

> > +        consider_lb_hairpin_flows(lb, l_ctx_in->local_datapaths,
> >                                     l_ctx_out->flow_table);
> >       }
> >
> > diff --git a/controller/lflow.h b/controller/lflow.h
> > index c17ff6dd4..5ba7af894 100644
> > --- a/controller/lflow.h
> > +++ b/controller/lflow.h
> > @@ -40,6 +40,7 @@
> >   #include "openvswitch/list.h"
> >
> >   struct ovn_extend_table;
> > +struct ovsdb_idl;
> >   struct ovsdb_idl_index;
> >   struct ovn_desired_flow_table;
> >   struct hmap;
> > @@ -126,6 +127,7 @@ void lflow_resource_destroy(struct
lflow_resource_ref *);
> >   void lflow_resource_clear(struct lflow_resource_ref *);
> >
> >   struct lflow_ctx_in {
> > +    struct ovsdb_idl *sb_idl;
> >       struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath;
> >       struct ovsdb_idl_index *sbrec_logical_flow_by_logical_datapath;
> >       struct ovsdb_idl_index *sbrec_logical_flow_by_logical_dp_group;
> > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> > index 9050380f3..24443bad1 100644
> > --- a/controller/ovn-controller.c
> > +++ b/controller/ovn-controller.c
> > @@ -2016,6 +2016,7 @@ init_lflow_ctx(struct engine_node *node,
> >           engine_get_input_data("port_groups", node);
> >       struct shash *port_groups = &pg_data->port_groups_cs_local;
> >
> > +    l_ctx_in->sb_idl = engine_get_context()->ovnsb_idl;
> >       l_ctx_in->sbrec_multicast_group_by_name_datapath =
> >           sbrec_mc_group_by_name_dp;
> >       l_ctx_in->sbrec_logical_flow_by_logical_datapath =
> > @@ -3124,6 +3125,7 @@ main(int argc, char *argv[])
> >           }
> >
> >           struct engine_context eng_ctx = {
> > +            .ovnsb_idl = ovnsb_idl_loop.idl,
> >               .ovs_idl_txn = ovs_idl_txn,
> >               .ovnsb_idl_txn = ovnsb_idl_txn,
> >               .client_ctx = &ctrl_engine_ctx
> > diff --git a/lib/inc-proc-eng.h b/lib/inc-proc-eng.h
> > index 7e9f5bb70..506da51a1 100644
> > --- a/lib/inc-proc-eng.h
> > +++ b/lib/inc-proc-eng.h
> > @@ -64,6 +64,7 @@
> >   #define ENGINE_MAX_OVSDB_INDEX 256
> >
> >   struct engine_context {
> > +    struct ovsdb_idl *ovnsb_idl;
> >       struct ovsdb_idl_txn *ovs_idl_txn;
> >       struct ovsdb_idl_txn *ovnsb_idl_txn;
> >       void *client_ctx;
> > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > index 570c6a3ef..8649a590b 100644
> > --- a/northd/ovn-northd.c
> > +++ b/northd/ovn-northd.c
> > @@ -3529,19 +3529,17 @@ build_ovn_lbs(struct northd_context *ctx,
struct hmap *datapaths,
> >               continue;
> >           }
> >
> > -        const struct sbrec_load_balancer **sbrec_lbs =
> > -            xmalloc(od->nbs->n_load_balancer * sizeof *sbrec_lbs);
> > +        struct uuid *lb_uuids =
> > +            xmalloc(od->nbs->n_load_balancer * sizeof *lb_uuids);
> >           for (size_t i = 0; i < od->nbs->n_load_balancer; i++) {
> >               const struct uuid *lb_uuid =
> >                   &od->nbs->load_balancer[i]->header_.uuid;
> >               lb = ovn_northd_lb_find(lbs, lb_uuid);
> > -            sbrec_lbs[i] = lb->slb;
> > +            lb_uuids[i] = lb->slb->header_.uuid;
> >           }
> > -
> > -        sbrec_datapath_binding_set_load_balancers(
> > -            od->sb, (struct sbrec_load_balancer **)sbrec_lbs,
> > -            od->nbs->n_load_balancer);
> > -        free(sbrec_lbs);
> > +        sbrec_datapath_binding_set_load_balancers(od->sb, lb_uuids,
> > +
 od->nbs->n_load_balancer);
> > +        free(lb_uuids);

I believe we need to change the deletion part as well. Now that we don't
have the weak reference, we need to take care of the LB deletions by
removing the reference explicitly from table datapath_binding.

Thanks,
Han


> >       }
> >   }
> >
> > diff --git a/ovn-sb.ovsschema b/ovn-sb.ovsschema
> > index bbf60781d..57fbc3ca4 100644
> > --- a/ovn-sb.ovsschema
> > +++ b/ovn-sb.ovsschema
> > @@ -1,7 +1,7 @@
> >   {
> >       "name": "OVN_Southbound",
> >       "version": "20.18.0",
> > -    "cksum": "1816525029 26536",
> > +    "cksum": "779623696 26386",
> >       "tables": {
> >           "SB_Global": {
> >               "columns": {
> > @@ -166,9 +166,7 @@
> >                        "type": {"key": {"type": "integer",
> >                                         "minInteger": 1,
> >                                         "maxInteger": 16777215}}},
> > -                "load_balancers": {"type": {"key": {"type": "uuid",
> > -                                                   "refTable":
"Load_Balancer",
> > -                                                   "refType": "weak"},
> > +                "load_balancers": {"type": {"key": {"type": "uuid"},
> >                                               "min": 0,
> >                                               "max": "unlimited"}},
> >                   "external_ids": {
> >
>
> _______________________________________________
> dev mailing list
> dev at openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev


More information about the dev mailing list