[ovs-dev] [PATCH ovn v13 4/7] ovn-controller: Use the tracked runtime data changes for flow calculation.

numans at ovn.org numans at ovn.org
Fri Jun 19 11:10:28 UTC 2020


From: Venkata Anil <anilvenkata at redhat.com>

This patch processes the logical flows of tracked datapaths
and tracked logical ports. To handle tracked logical port
changes, a reference from port bindings to the logical flows
that use them is maintained.

Below are the results of some testing done with an ovn-fake-multinode
setup, comparing these incremental processing (I-P) improvement patches
with master.

Test setup
----------
 1. ovn-central fake node running OVN dbs and 2 compute nodes running
    ovn-controller.

 2. Before running the tests, the setup used an existing OVN db with the
    below resources
   No of logical switches     - 53
   No of logical ports        - 1256
   No of logical routers      - 9
   No of logical router ports - 56
   No of port groups          - 152
   No of logical flows        - 45447

   Port bindings on compute-1 -  19
   Port bindings on compute-2 -  18
   No of OF flows on compute-1 - 84996
   No of OF flows on compute-2 - 84901

 3. The test does the following
    - Creates 2 logical switches (one per compute node) and connects
      each to a logical router.
    - Creates 100 logical ports (50 per lswitch), adds a simple ACL, and
      creates an address set for each port.
    - Binds each port on the respective compute node and pings the IP of
      the port (from another port belonging to the same lswitch created
      earlier).

Below are the results with OVN master

+--------------------------------------------------------------------------------------------------------------------------+
|                                                   Response Times (sec)                                                   |
+--------------------------------------------------+--------+--------+--------+--------+--------+--------+---------+-------+
| action                                           | min    | median | 90%ile | 95%ile | max    | avg    | success | count |
+--------------------------------------------------+--------+--------+--------+--------+--------+--------+---------+-------+
| ovn.create_or_update_network_policy              | 1.02   | 1.266  | 1.522  | 1.552  | 2.053  | 1.284  | 100.0%  | 100   |
| ovn.create_or_update_network_policy_address_sets | 0.222  | 0.276  | 0.289  | 0.294  | 0.318  | 0.272  | 100.0%  | 100   |
| ovn.create_port_group_acls                       | 0.462  | 0.54   | 0.57   | 0.576  | 0.588  | 0.531  | 50.0%   | 100   |
| ovn.create_or_update_name_space                  | 0.346  | 0.476  | 0.554  | 0.569  | 0.595  | 0.47   | 100.0%  | 100   |
| ovn_network.bind_port                            | 1.235  | 1.349  | 1.404  | 1.425  | 1.45   | 1.351  | 100.0%  | 100   |
| ovn.bind_ovs_vm                                  | 0.394  | 0.458  | 0.494  | 0.501  | 0.536  | 0.458  | 100.0%  | 100   |
| ovn.bind_internal_vm                             | 0.795  | 0.894  | 0.938  | 0.95   | 0.978  | 0.893  | 100.0%  | 100   |
| ovn_network.wait_port_ping                       | 7.388  | 7.669  | 7.774  | 7.83   | 8.6    | 7.682  | 100.0%  | 100   |
| total                                            | 10.546 | 11.007 | 11.401 | 11.526 | 12.222 | 11.044 | 100.0%  | 100   |
+--------------------------------------------------+--------+--------+--------+--------+--------+--------+---------+-------+
Load duration: 1106.4100859165192
Full duration: 1108.1457152366638

Below are the results with these I-P improvement patches

+-----------------------------------------------------------------------------------------------------------------------+
|                                                 Response Times (sec)                                                  |
+--------------------------------------------------+-------+--------+--------+--------+-------+-------+---------+-------+
| action                                           | min   | median | 90%ile | 95%ile | max   | avg   | success | count |
+--------------------------------------------------+-------+--------+--------+--------+-------+-------+---------+-------+
| ovn.create_or_update_network_policy              | 0.97  | 1.229  | 1.451  | 1.46   | 1.944 | 1.24  | 100.0%  | 100   |
| ovn.create_or_update_network_policy_address_sets | 0.224 | 0.267  | 0.28   | 0.282  | 0.291 | 0.266 | 100.0%  | 100   |
| ovn.create_port_group_acls                       | 0.462 | 0.526  | 0.539  | 0.544  | 0.564 | 0.52  | 50.0%   | 100   |
| ovn.create_or_update_name_space                  | 0.33  | 0.467  | 0.549  | 0.556  | 0.562 | 0.463 | 100.0%  | 100   |
| ovn_network.bind_port                            | 1.231 | 1.313  | 1.386  | 1.4    | 1.446 | 1.317 | 100.0%  | 100   |
| ovn.bind_ovs_vm                                  | 0.392 | 0.436  | 0.464  | 0.475  | 0.502 | 0.438 | 100.0%  | 100   |
| ovn.bind_internal_vm                             | 0.816 | 0.876  | 0.936  | 0.967  | 0.997 | 0.879 | 100.0%  | 100   |
| ovn_network.wait_port_ping                       | 0.097 | 0.141  | 0.177  | 0.18   | 0.203 | 0.136 | 100.0%  | 100   |
| total                                            | 2.977 | 3.406  | 3.709  | 3.748  | 4.181 | 3.411 | 100.0%  | 100   |
+--------------------------------------------------+-------+--------+--------+--------+-------+-------+---------+-------+
Load duration: 343.18983340263367
Full duration: 344.80813431739807

Acked-by: Han Zhou <hzhou at ovn.org>
Acked-by: Dumitru Ceara <dceara at redhat.com>
Co-Authored-by: Numan Siddique <numans at ovn.org>
Signed-off-by: Venkata Anil <anilvenkata at redhat.com>
Signed-off-by: Numan Siddique <numans at ovn.org>
---
 controller/lflow.c          |  87 +++++++++++++++++++++++++++++
 controller/lflow.h          |  12 +++-
 controller/ovn-controller.c | 107 ++++++++++++++++++------------------
 tests/ovn-performance.at    |  12 ++--
 4 files changed, 157 insertions(+), 61 deletions(-)

diff --git a/controller/lflow.c b/controller/lflow.c
index 01214a3a6..c76ba4e92 100644
--- a/controller/lflow.c
+++ b/controller/lflow.c
@@ -59,6 +59,10 @@ struct condition_aux {
     struct ovsdb_idl_index *sbrec_port_binding_by_name;
     const struct sbrec_chassis *chassis;
     const struct sset *active_tunnels;
+    const struct sbrec_logical_flow *lflow;
+    /* Resource reference to store the port name referenced
+     * in is_chassis_resident() to the logical flow. */
+    struct lflow_resource_ref *lfrr;
 };
 
 static bool
@@ -68,6 +72,8 @@ consider_logical_flow(const struct sbrec_logical_flow *lflow,
                       struct controller_event_options *controller_event_opts,
                       struct lflow_ctx_in *l_ctx_in,
                       struct lflow_ctx_out *l_ctx_out);
+static void lflow_resource_add(struct lflow_resource_ref *, enum ref_type,
+                               const char *ref_name, const struct uuid *);
 
 static bool
 lookup_port_cb(const void *aux_, const char *port_name, unsigned int *portp)
@@ -120,6 +126,14 @@ is_chassis_resident_cb(const void *c_aux_, const char *port_name)
     if (!pb) {
         return false;
     }
+
+    /* Store the port_name to lflow reference. */
+    int64_t dp_id = pb->datapath->tunnel_key;
+    char buf[16];
+    snprintf(buf, sizeof(buf), "%"PRId64"_%"PRId64, dp_id, pb->tunnel_key);
+    lflow_resource_add(c_aux->lfrr, REF_TYPE_PORTBINDING, buf,
+                       &c_aux->lflow->header_.uuid);
+
     if (strcmp(pb->type, "chassisredirect")) {
         /* for non-chassisredirect ports */
         return pb->chassis && pb->chassis == c_aux->chassis;
@@ -594,6 +608,8 @@ consider_logical_flow(const struct sbrec_logical_flow *lflow,
         .sbrec_port_binding_by_name = l_ctx_in->sbrec_port_binding_by_name,
         .chassis = l_ctx_in->chassis,
         .active_tunnels = l_ctx_in->active_tunnels,
+        .lflow = lflow,
+        .lfrr = l_ctx_out->lfrr
     };
     expr = expr_simplify(expr, is_chassis_resident_cb, &cond_aux);
     expr = expr_normalize(expr);
@@ -649,6 +665,8 @@ consider_logical_flow(const struct sbrec_logical_flow *lflow,
                 int64_t dp_id = lflow->logical_datapath->tunnel_key;
                 char buf[16];
                 snprintf(buf, sizeof(buf), "%"PRId64"_%"PRId64, dp_id, port_id);
+                lflow_resource_add(l_ctx_out->lfrr, REF_TYPE_PORTBINDING, buf,
+                                   &lflow->header_.uuid);
                 if (!sset_contains(l_ctx_in->local_lport_ids, buf)) {
                     VLOG_DBG("lflow "UUID_FMT
                              " port %s in match is not local, skip",
@@ -847,3 +865,72 @@ lflow_destroy(void)
     expr_symtab_destroy(&symtab);
     shash_destroy(&symtab);
 }
+
+bool
+lflow_add_flows_for_datapath(const struct sbrec_datapath_binding *dp,
+                             struct lflow_ctx_in *l_ctx_in,
+                             struct lflow_ctx_out *l_ctx_out)
+{
+    bool handled = true;
+    struct hmap dhcp_opts = HMAP_INITIALIZER(&dhcp_opts);
+    struct hmap dhcpv6_opts = HMAP_INITIALIZER(&dhcpv6_opts);
+    const struct sbrec_dhcp_options *dhcp_opt_row;
+    SBREC_DHCP_OPTIONS_TABLE_FOR_EACH (dhcp_opt_row,
+                                       l_ctx_in->dhcp_options_table) {
+        dhcp_opt_add(&dhcp_opts, dhcp_opt_row->name, dhcp_opt_row->code,
+                     dhcp_opt_row->type);
+    }
+
+
+    const struct sbrec_dhcpv6_options *dhcpv6_opt_row;
+    SBREC_DHCPV6_OPTIONS_TABLE_FOR_EACH (dhcpv6_opt_row,
+                                         l_ctx_in->dhcpv6_options_table) {
+       dhcp_opt_add(&dhcpv6_opts, dhcpv6_opt_row->name, dhcpv6_opt_row->code,
+                    dhcpv6_opt_row->type);
+    }
+
+    struct hmap nd_ra_opts = HMAP_INITIALIZER(&nd_ra_opts);
+    nd_ra_opts_init(&nd_ra_opts);
+
+    struct controller_event_options controller_event_opts;
+    controller_event_opts_init(&controller_event_opts);
+
+    struct sbrec_logical_flow *lf_row = sbrec_logical_flow_index_init_row(
+        l_ctx_in->sbrec_logical_flow_by_logical_datapath);
+    sbrec_logical_flow_index_set_logical_datapath(lf_row, dp);
+
+    const struct sbrec_logical_flow *lflow;
+    SBREC_LOGICAL_FLOW_FOR_EACH_EQUAL (
+        lflow, lf_row, l_ctx_in->sbrec_logical_flow_by_logical_datapath) {
+        /* Remove the lflow from flow_table if present before processing it. */
+        ofctrl_remove_flows(l_ctx_out->flow_table, &lflow->header_.uuid);
+
+        if (!consider_logical_flow(lflow, &dhcp_opts, &dhcpv6_opts,
+                                   &nd_ra_opts, &controller_event_opts,
+                                   l_ctx_in, l_ctx_out)) {
+            handled = false;
+            break;
+        }
+    }
+
+    dhcp_opts_destroy(&dhcp_opts);
+    dhcp_opts_destroy(&dhcpv6_opts);
+    nd_ra_opts_destroy(&nd_ra_opts);
+    controller_event_opts_destroy(&controller_event_opts);
+    return handled;
+}
+
+bool
+lflow_handle_flows_for_lport(const struct sbrec_port_binding *pb,
+                             struct lflow_ctx_in *l_ctx_in,
+                             struct lflow_ctx_out *l_ctx_out)
+{
+    bool changed;
+    int64_t dp_id = pb->datapath->tunnel_key;
+    char pb_ref_name[16];
+    snprintf(pb_ref_name, sizeof(pb_ref_name), "%"PRId64"_%"PRId64,
+             dp_id, pb->tunnel_key);
+
+    return lflow_handle_changed_ref(REF_TYPE_PORTBINDING, pb_ref_name,
+                                    l_ctx_in, l_ctx_out, &changed);
+}
diff --git a/controller/lflow.h b/controller/lflow.h
index f02f709d7..ae02eaf5e 100644
--- a/controller/lflow.h
+++ b/controller/lflow.h
@@ -48,6 +48,8 @@ struct sbrec_dhcp_options_table;
 struct sbrec_dhcpv6_options_table;
 struct sbrec_logical_flow_table;
 struct sbrec_mac_binding_table;
+struct sbrec_datapath_binding;
+struct sbrec_port_binding;
 struct simap;
 struct sset;
 struct uuid;
@@ -72,7 +74,8 @@ struct uuid;
 
 enum ref_type {
     REF_TYPE_ADDRSET,
-    REF_TYPE_PORTGROUP
+    REF_TYPE_PORTGROUP,
+    REF_TYPE_PORTBINDING
 };
 
 /* Maintains the relationship for a pair of named resource and
@@ -117,6 +120,7 @@ void lflow_resource_clear(struct lflow_resource_ref *);
 
 struct lflow_ctx_in {
     struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath;
+    struct ovsdb_idl_index *sbrec_logical_flow_by_logical_datapath;
     struct ovsdb_idl_index *sbrec_port_binding_by_name;
     const struct sbrec_dhcp_options_table *dhcp_options_table;
     const struct sbrec_dhcpv6_options_table *dhcpv6_options_table;
@@ -157,4 +161,10 @@ void lflow_destroy(void);
 
 void lflow_expr_destroy(struct hmap *lflow_expr_cache);
 
+bool lflow_add_flows_for_datapath(const struct sbrec_datapath_binding *,
+                                  struct lflow_ctx_in *,
+                                  struct lflow_ctx_out *);
+bool lflow_handle_flows_for_lport(const struct sbrec_port_binding *,
+                                  struct lflow_ctx_in *,
+                                  struct lflow_ctx_out *);
 #endif /* controller/lflow.h */
diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 29712d2f6..a389ee879 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -1595,6 +1595,11 @@ static void init_lflow_ctx(struct engine_node *node,
                 engine_get_input("SB_port_binding", node),
                 "name");
 
+    struct ovsdb_idl_index *sbrec_logical_flow_by_dp =
+        engine_ovsdb_node_get_index(
+                engine_get_input("SB_logical_flow", node),
+                "logical_datapath");
+
     struct ovsdb_idl_index *sbrec_mc_group_by_name_dp =
         engine_ovsdb_node_get_index(
                 engine_get_input("SB_multicast_group", node),
@@ -1642,6 +1647,8 @@ static void init_lflow_ctx(struct engine_node *node,
 
     l_ctx_in->sbrec_multicast_group_by_name_datapath =
         sbrec_mc_group_by_name_dp;
+    l_ctx_in->sbrec_logical_flow_by_logical_datapath =
+        sbrec_logical_flow_by_dp;
     l_ctx_in->sbrec_port_binding_by_name = sbrec_port_binding_by_name;
     l_ctx_in->dhcp_options_table  = dhcp_table;
     l_ctx_in->dhcpv6_options_table = dhcpv6_table;
@@ -1796,7 +1803,8 @@ flow_output_sb_mac_binding_handler(struct engine_node *node, void *data)
 }
 
 static bool
-flow_output_sb_port_binding_handler(struct engine_node *node, void *data)
+flow_output_sb_port_binding_handler(struct engine_node *node,
+                                    void *data)
 {
     struct ed_type_runtime_data *rt_data =
         engine_get_input_data("runtime_data", node);
@@ -1804,56 +1812,13 @@ flow_output_sb_port_binding_handler(struct engine_node *node, void *data)
     struct ed_type_flow_output *fo = data;
     struct ovn_desired_flow_table *flow_table = &fo->flow_table;
 
-    /* XXX: now we handle port-binding changes for physical flow processing
-     * only, but port-binding change can have impact to logical flow
-     * processing, too, in below circumstances:
-     *
-     *  - When a port-binding for a lport is inserted/deleted but the lflow
-     *    using that lport doesn't change.
-     *
-     *    This can happen only when the lport name is used by ACL match
-     *    condition, which is specified by user. Even in that case, if the port
-     *    is actually bound on the current chassis it will trigger recompute on
-     *    that chassis since ovs interface would be updated. So the only
-     *    situation this would have real impact is when user defines an ACL
-     *    that includes lport that is not on current chassis, and there is a
-     *    port-binding creation/deletion related to that lport.e.g.: an ACL is
-     *    defined:
-     *
-     *    to-lport 1000 'outport=="A" && inport=="B"' allow-related
-     *
-     *    If "A" is on current chassis, but "B" is lport that hasn't been
-     *    created yet. When a lport "B" is created and bound on another
-     *    chassis, the ACL will not take effect on the current chassis until a
-     *    recompute is triggered later. This case doesn't seem to be a problem
-     *    for real world use cases because usually lport is created before
-     *    being referenced by name in ACLs.
-     *
-     *  - When is_chassis_resident(<lport>) is used in lflow. In this case the
-     *    port binding is not a regular VIF. It can be either "patch" or
-     *    "external", with ha-chassis-group assigned.  In current
-     *    "runtime_data" handling, port-binding changes for these types always
-     *    trigger recomputing. So it is fine even if we do not handle it here.
-     *    (due to the ovsdb tracking support for referenced table changes,
-     *    ha-chassis-group changes will appear as port-binding change).
-     *
-     *  - When a mac-binding doesn't change but the port-binding related to
-     *    that mac-binding is deleted. In this case the neighbor flow generated
-     *    for the mac-binding should be deleted. This would not cause any real
-     *    issue for now, since the port-binding related to mac-binding is
-     *    always logical router port, and any change to logical router port
-     *    would just trigger recompute.
-     *
-     * Although there is no correctness issue so far (except the unusual ACL
-     * use case, which doesn't seem to be a real problem), it might be better
-     * to handle this more gracefully, without the need to consider these
-     * tricky scenarios.  One approach is to maintain a mapping between lport
-     * names and the lflows that uses them, and reprocess the related lflows
-     * when related port-bindings change.
-     */
     struct physical_ctx p_ctx;
     init_physical_ctx(node, rt_data, &p_ctx);
 
+    /* We handle port-binding changes for physical flow processing
+     * only. flow_output runtime data handler takes care of processing
+     * logical flows for any port binding changes.
+     */
     physical_handle_port_binding_changes(&p_ctx, flow_table);
 
     engine_set_node_state(node, EN_UPDATED);
@@ -1941,6 +1906,9 @@ _flow_output_resource_ref_handler(struct engine_node *node, void *data,
             updated = &pg_data->updated;
             deleted = &pg_data->deleted;
             break;
+
+        /* This ref type is handled in the flow_output_runtime_data_handler. */
+        case REF_TYPE_PORTBINDING:
         default:
             OVS_NOT_REACHED();
     }
@@ -2031,16 +1999,42 @@ flow_output_runtime_data_handler(struct engine_node *node,
         return false;
     }
 
-    if (!hmap_is_empty(&rt_data->tracked_dp_bindings)) {
-        /* We are not yet handling the tracked datapath binding
-         * changes. Return false to trigger full recompute. */
-        return false;
+    struct hmap *tracked_dp_bindings = &rt_data->tracked_dp_bindings;
+    if (hmap_is_empty(tracked_dp_bindings)) {
+        if (rt_data->local_lports_changed) {
+            engine_set_node_state(node, EN_UPDATED);
+        }
+        return true;
     }
 
-    if (rt_data->local_lports_changed) {
-        engine_set_node_state(node, EN_UPDATED);
+    struct lflow_ctx_in l_ctx_in;
+    struct lflow_ctx_out l_ctx_out;
+    struct ed_type_flow_output *fo = data;
+    init_lflow_ctx(node, rt_data, fo, &l_ctx_in, &l_ctx_out);
+
+    struct physical_ctx p_ctx;
+    init_physical_ctx(node, rt_data, &p_ctx);
+
+    struct tracked_binding_datapath *tdp;
+    HMAP_FOR_EACH (tdp, node, tracked_dp_bindings) {
+        if (tdp->is_new) {
+            if (!lflow_add_flows_for_datapath(tdp->dp, &l_ctx_in,
+                                              &l_ctx_out)) {
+                return false;
+            }
+        } else {
+            struct shash_node *shash_node;
+            SHASH_FOR_EACH (shash_node, &tdp->lports) {
+                struct tracked_binding_lport *lport = shash_node->data;
+                if (!lflow_handle_flows_for_lport(lport->pb, &l_ctx_in,
+                                                  &l_ctx_out)) {
+                    return false;
+                }
+            }
+        }
     }
 
+    engine_set_node_state(node, EN_UPDATED);
     return true;
 }
 
@@ -2099,6 +2093,9 @@ main(int argc, char *argv[])
         = chassis_index_create(ovnsb_idl_loop.idl);
     struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath
         = mcast_group_index_create(ovnsb_idl_loop.idl);
+    struct ovsdb_idl_index *sbrec_logical_flow_by_logical_datapath
+        = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
+                                  &sbrec_logical_flow_col_logical_datapath);
     struct ovsdb_idl_index *sbrec_port_binding_by_name
         = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
                                   &sbrec_port_binding_col_logical_port);
@@ -2258,6 +2255,8 @@ main(int argc, char *argv[])
     engine_ovsdb_node_add_index(&en_sb_chassis, "name", sbrec_chassis_by_name);
     engine_ovsdb_node_add_index(&en_sb_multicast_group, "name_datapath",
                                 sbrec_multicast_group_by_name_datapath);
+    engine_ovsdb_node_add_index(&en_sb_logical_flow, "logical_datapath",
+                                sbrec_logical_flow_by_logical_datapath);
     engine_ovsdb_node_add_index(&en_sb_port_binding, "name",
                                 sbrec_port_binding_by_name);
     engine_ovsdb_node_add_index(&en_sb_port_binding, "key",
diff --git a/tests/ovn-performance.at b/tests/ovn-performance.at
index 08acd3bae..a12757e18 100644
--- a/tests/ovn-performance.at
+++ b/tests/ovn-performance.at
@@ -274,7 +274,7 @@ for i in 1 2; do
     )
 
     # Add router port to $ls
-    OVN_CONTROLLER_EXPECT_HIT(
+    OVN_CONTROLLER_EXPECT_NO_HIT(
         [hv1 hv2], [lflow_run],
         [ovn-nbctl --wait=hv lrp-add lr1 $lrp 02:00:00:00:0$i:01 10.0.$i.1/24]
     )
@@ -282,7 +282,7 @@ for i in 1 2; do
         [hv1 hv2], [lflow_run],
         [ovn-nbctl --wait=hv lsp-add $ls $lsp]
     )
-    OVN_CONTROLLER_EXPECT_HIT(
+    OVN_CONTROLLER_EXPECT_NO_HIT(
         [hv1 hv2], [lflow_run],
         [ovn-nbctl --wait=hv lsp-set-type $lsp router]
     )
@@ -353,8 +353,8 @@ for i in 1 2; do
     )
 
     # Bind port $lp and wait for it to come up
-    OVN_CONTROLLER_EXPECT_HIT_COND(
-        [hv$i hv$j], [lflow_run], [>0 =0],
+    OVN_CONTROLLER_EXPECT_NO_HIT(
+        [hv$i hv$j], [lflow_run],
         [as hv$i ovs-vsctl add-port br-int $vif -- set Interface $vif external-ids:iface-id=$lp &&
          ovn-nbctl wait-until Logical_Switch_Port $lp 'up=true' &&
          ovn-nbctl --wait=hv sync]
@@ -404,8 +404,8 @@ for i in 1 2; do
     lp=lp$i
 
     # Delete port $lp
-    OVN_CONTROLLER_EXPECT_HIT_COND(
-        [hv$i hv$j], [lflow_run], [>0 =0],
+    OVN_CONTROLLER_EXPECT_NO_HIT(
+        [hv$i hv$j], [lflow_run],
         [ovn-nbctl --wait=hv lsp-del $lp]
     )
 done
-- 
2.26.2


