[ovs-git] [ovn-org/ovn] 3ee3d4: ovn-ctl: Handle cluster db upgrades for run_(nb/sb...

numansiddique noreply at github.com
Tue Mar 2 18:22:39 UTC 2021


  Branch: refs/heads/branch-20.06
  Home:   https://github.com/ovn-org/ovn
  Commit: 3ee3d4ee41c19df7e700d59950ceb8816686c0fd
      https://github.com/ovn-org/ovn/commit/3ee3d4ee41c19df7e700d59950ceb8816686c0fd
  Author: Numan Siddique <numans at ovn.org>
  Date:   2021-03-02 (Tue, 02 Mar 2021)

  Changed paths:
    M utilities/ovn-ctl

  Log Message:
  -----------
  ovn-ctl: Handle cluster db upgrades for run_(nb/sb)_ovsdb

when ovn-ctl run_(nb_sb)_ovsdb is called, the ovsdb-server is started without
passing --detach and --monoitor and the process is exec'd.

For cluster mode, upgrade_cluster is never called and hence the dbs are not upraded
to new schema. CMS has to handle the db upgrade separately.

This patch handles the db upgrade by starting ovsdb-server in background and then
waits on ovsdb-server to complete.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1868392
Acked-by: Mark Michelson <mmichels at redhat.com>
Signed-off-by: Numan Siddique <numans at ovn.org>
(cherry picked from commit 67e2f386cc838d0b0f9b4b5da7fe611e1113b70c)
Signed-off-by: Frode Nordahl <frode.nordahl at canonical.com>


  Commit: 44aaf7caa3302a08c3afa5409c91b11b95d56556
      https://github.com/ovn-org/ovn/commit/44aaf7caa3302a08c3afa5409c91b11b95d56556
  Author: Han Zhou <hzhou at ovn.org>
  Date:   2021-03-02 (Tue, 02 Mar 2021)

  Changed paths:
    M controller/ofctrl.c

  Log Message:
  -----------
  ofctrl.c: Maintain references between installed flows and desired flows.

Currently there is no link maintained between installed flows and desired
flows. This patch maintains the mapping between them, which will be useful
for a future patch that incrementally processes the flow installation without
having to do the full comparison between them.

This patch also refactors the struct ovn_flow with two different wrapper
types: desired_flow and installed_flow, and the related static functions,
to make the code easier to read and avoid misuses of the struct.

Acked-by: Mark Michelson <mmichels at redhat.com>
Signed-off-by: Han Zhou <hzhou at ovn.org>

(cherry-picked from master commit 354d3853d40cbce89a434632f67daed7fc992d8b)


  Commit: 2a290298dbe25b7ce2a777d00158c86af16e0c18
      https://github.com/ovn-org/ovn/commit/2a290298dbe25b7ce2a777d00158c86af16e0c18
  Author: Han Zhou <hzhou at ovn.org>
  Date:   2021-03-02 (Tue, 02 Mar 2021)

  Changed paths:
    M controller/ofctrl.c

  Log Message:
  -----------
  ofctrl.c: Refactor - move openflow msg construction to functions.

These functions will be reused in multiple places in a future patch.

Acked-by: Mark Michelson <mmichels at redhat.com>
Signed-off-by: Han Zhou <hzhou at ovn.org>
(cherry picked from commit 23063cf4178c05f5d6b3e4ec6d323ccc88df6101)
Signed-off-by: Frode Nordahl <frode.nordahl at canonical.com>


  Commit: 95f8066633d67fe6252c1ae5c1f318e348673a8c
      https://github.com/ovn-org/ovn/commit/95f8066633d67fe6252c1ae5c1f318e348673a8c
  Author: Han Zhou <hzhou at ovn.org>
  Date:   2021-03-02 (Tue, 02 Mar 2021)

  Changed paths:
    M controller/ofctrl.c
    M controller/ofctrl.h

  Log Message:
  -----------
  ofctrl: Incremental processing for flow installation by tracking.

With incremental processing for flow computation, the bottle neck of
ovn-controller in large scale environment is in the flow installation
(ofctrl_put()), which does full comparison between the two big flow tables: the
installed flows and desired flows.

This patch implements tracking desired flow changes when flows are
incrementally computed by I-P engine, and then incrementally processing the
flow installation using the tracked information in ofctrl_put(). It falls back
to the full comparison whenever tracking information is unavailable, e.g. when
I-P engine triggers full recompute.

In ovn-scale-test with 3000 HVs and 30k lports, the end-to-end latency between
the moment a lflow is updated in SB DB and the moment when all the 3K HVs
complete OVS flow updating has reduced around 60% (from 1s to 400ms).

Below is the perf result for a ovn-controller processing a new port-binding:

Beore:
+   96.76%     0.00%  ovn-controller  [unknown]           [k] 0xffffffffffffffff
+   90.21%     0.00%  ovn-controller  ovn-controller      [.] main
+   39.93%     1.19%  ovn-controller  ovn-controller      [.] ofctrl_put
+   31.27%    12.47%  ovn-controller  ovn-controller      [.] ovn_flow_lookup
+   22.12%     3.12%  ovn-controller  ovn-controller      [.] encaps_run
+   18.69%     2.77%  ovn-controller  ovn-controller      [.] minimatch_equal
+   17.63%     4.11%  ovn-controller  ovn-controller      [.] patch_run
+   15.91%     0.00%  ovn-controller  ovn-controller      [.] add_bridge_mappings (inlined)
+   14.03%    12.08%  ovn-controller  ovn-controller      [.] minimask_equal
+   12.41%     0.00%  ovn-controller  ovn-controller      [.] chassis_tunnel_add (inlined)
+   11.40%     0.00%  ovn-controller  ovn-controller      [.] tunnel_add (inlined)

After:
+   94.59%     0.00%  ovn-controller  [unknown]           [k] 0xffffffffffffffff
+   83.56%     0.09%  ovn-controller  ovn-controller      [.] main
+   35.37%     3.13%  ovn-controller  ovn-controller      [.] encaps_run
+   27.54%     7.53%  ovn-controller  ovn-controller      [.] patch_run
+   24.86%     0.00%  ovn-controller  ovn-controller      [.] add_bridge_mappings (inlined)
+   20.01%     0.00%  ovn-controller  ovn-controller      [.] chassis_tunnel_add (inlined)
+   18.51%     0.00%  ovn-controller  ovn-controller      [.] tunnel_add (inlined)
+   14.08%     0.17%  ovn-controller  ovn-controller      [.] physical_run
+   11.37%    11.28%  ovn-controller  ovn-controller      [.] next_real_row
+   10.50%     2.59%  ovn-controller  ovn-controller      [.] consider_port_binding
...
+    2.14%     0.32%  ovn-controller  ovn-controller      [.] ofctrl_put                                                                                                                                                                                                                 ▒

Before the optimization, ofctrl_put took 40% of CPU, and now it disappears from
the hot spots.

Acked-by: Mark Michelson <mmichels at redhat.com>
Signed-off-by: Han Zhou <hzhou at ovn.org>


  Commit: 1fad5e6cacd2283944bb56b8bfe01d9e80035686
      https://github.com/ovn-org/ovn/commit/1fad5e6cacd2283944bb56b8bfe01d9e80035686
  Author: Han Zhou <hzhou at ovn.org>
  Date:   2021-03-02 (Tue, 02 Mar 2021)

  Changed paths:
    M controller/ofctrl.c

  Log Message:
  -----------
  ofctrl.c: Merge opposite changes of tracked flows before installing.

This patch optimizes the previous patch that incrementally processes
flow installation by merging the "add-after-delete" flow changes as
much as possible to avoid unnecessary OpenFlow updates.

Acked-by: Mark Michelson <mmichels at redhat.com>
Signed-off-by: Han Zhou <hzhou at ovn.org>
(cherry picked from commit f4e508dd7a6cfbfc2e3250a8c11a8d0fdc1dfdd0)
Signed-off-by: Frode Nordahl <frode.nordahl at canonical.com>


  Commit: 23f8b4e930cb1adf7ff8aaed64c4cf70aed255f0
      https://github.com/ovn-org/ovn/commit/23f8b4e930cb1adf7ff8aaed64c4cf70aed255f0
  Author: Han Zhou <hzhou at ovn.org>
  Date:   2021-03-02 (Tue, 02 Mar 2021)

  Changed paths:
    M controller/ofctrl.c
    M tests/ovn.at

  Log Message:
  -----------
  ofctrl.c: Fix duplicated flow handling in I-P while merging opposite changes.

In a scenario in I-P when a desired flow is removed and an exactly same flow is
added back in the same iteration, the merge_tracked_flows() function will merge
the operation without firing any OpenFlow action. However, if there are
multiple desired flows (e.g. F1 and F2) matching the same installed flow, and
if the one being re-added (say, F1) is the one currently installed in OVS, the
current implementation will update the currently installed flow to F2 in the
data structure while the actual OVS flow is not updated (still F1). So far
there is still no problem, but the member desired_flow of the installed flow
is pointing to the wrong desired flow.

Now there is only one place where the desired_flow member is consulted, in
update_installed_flows_by_track() for handling a tracked *modified* flow. In
reality flow modification happens only when conjunction flows need to be
combined, which would never happen together with the above scenario of
merge_tracked_flows(). So this wrong desired_flow pointer doesn't cause any
real problem yet.

However, there is a place that can already utilize this active desired_flow
information, which is when handling a tracked flow deletion in
update_installed_flows_by_track(): we can check if the flow being deleted is
the currently active one installed in OVS. If not, we don't need to send and
OpenFlow modify to OVS. This optimization is added in this patch, apart from
fixing the problem of merge_tracked_flows(). Without the fix, this optimization
would cause the test case added in this patch fail.

Fixes: f4e508dd7 ("ofctrl.c: Merge opposite changes of tracked flows before installing.")
Acked-by: Dumitru Ceara <dceara at redhat.com>
Signed-off-by: Han Zhou <hzhou at ovn.org>
Signed-off-by: Numan Siddique <numans at ovn.org>

(cherry-picked from commit 9d2e8d32fb9865513b70408a665184a67564390d)


  Commit: 2f82b25ff84618b18ba457f45a323b290ed7411f
      https://github.com/ovn-org/ovn/commit/2f82b25ff84618b18ba457f45a323b290ed7411f
  Author: Han Zhou <hzhou at ovn.org>
  Date:   2021-03-02 (Tue, 02 Mar 2021)

  Changed paths:
    M controller/ofctrl.c

  Log Message:
  -----------
  ofctrl.c: Avoid repeatedly linking an installed flow and a desired flow.

In update_installed_flows_by_compare() there are two loops. The first
loop iterates the installed flows and find its peer in desired flows to:

1. uninstall flows that are not needed anymore

2. update flows if needed

At the same time, it links the desired flow found for the installed flow
which also set the desired flow as the current active installed flow.

The second loop iterates the desired flows and find its peer in installed
flows to install missing flows. At the same time it will detect if there
are conflict desired flows matching same installed flow then just link
them.

However, currently in the second loop, it blindly link the desired flows to the
installed flows, without checking if it is already linked in the first loop.
Lucky enough, this won't cause any real problem so far, because when there are
conflict flows, the one found in the first loop will be set as active in the
installed_flow, and in the function link_installed_to_desired() checks if it is
already the active desired flow it just does nothing but return.  However, the
check in the link_installed_to_desired() is confusing because a desired_flow
may be linked to the installed_flow already but not the active flow, and the
check is insufficient. It should be rather an assertion and let the caller
ensure that a pair of desired_flow and installed_flow is never linked twice.

For the above reason, this patch does the following changes:

1. Removes the check in link_installed_to_desired() and convert it to an assert.

2. Before calling link_installed_to_desired() in the above mentioned loop,
   check if the desired flow is already installed.

Acked-by: Dumitru Ceara <dceara at redhat.com>
Signed-off-by: Han Zhou <hzhou at ovn.org>
(cherry picked from commit 7cab7bd1268ba67429954da4f73de91090acf779)
Signed-off-by: Frode Nordahl <frode.nordahl at canonical.com>
Signed-off-by: Numan Siddique <numans at ovn.org>


  Commit: 52709a5816dfe1c313dcdca58856c25d0c4ab191
      https://github.com/ovn-org/ovn/commit/52709a5816dfe1c313dcdca58856c25d0c4ab191
  Author: Dumitru Ceara <dceara at redhat.com>
  Date:   2021-03-02 (Tue, 02 Mar 2021)

  Changed paths:
    M controller/ofctrl.c

  Log Message:
  -----------
  ofctrl.c: Only merge actions for conjunctive flows.

In ofctrl_add_or_append_flow() when merging flow actions make sure we only
do that for conjunctive flows.  All other actions can not be merged with
action "conjunction".

CC: Mark Michelson <mmichels at redhat.com>
Fixes: e659bab31a91 ("Combine conjunctions with identical matches into one flow.")
Signed-off-by: Dumitru Ceara <dceara at redhat.com>
Signed-off-by: Han Zhou <hzhou at ovn.org>
(cherry picked from commit dadae4f800ccb1f2759378f0bd804dd002e31605)
Signed-off-by: Frode Nordahl <frode.nordahl at canonical.com>
Signed-off-by: Numan Siddique <numans at ovn.org>


  Commit: a58e415fafc5904e7e3bcd1d444e462fb100f9fc
      https://github.com/ovn-org/ovn/commit/a58e415fafc5904e7e3bcd1d444e462fb100f9fc
  Author: Dumitru Ceara <dceara at redhat.com>
  Date:   2021-03-02 (Tue, 02 Mar 2021)

  Changed paths:
    M controller/ofctrl.c

  Log Message:
  -----------
  ofctrl.c: Do not change flow ordering when merging opposite changes.

Instead of removing the old desired flow from the list and inserting the new
one at the end we now directly replace the old flow with the new one while
maintaining the same ordering.

For now order of the flows is not relevant but further commits require
maintaining the order of desired flows.

Signed-off-by: Dumitru Ceara <dceara at redhat.com>
Signed-off-by: Han Zhou <hzhou at ovn.org>
(cherry picked from commit e49ce9a33f38f29c44e3c30afcc189b5f6a9ef8e)
Signed-off-by: Frode Nordahl <frode.nordahl at canonical.com>
Signed-off-by: Numan Siddique <numans at ovn.org>


  Commit: 799dcf96f79167306c631db3687b818e51d96ce2
      https://github.com/ovn-org/ovn/commit/799dcf96f79167306c631db3687b818e51d96ce2
  Author: Dumitru Ceara <dceara at redhat.com>
  Date:   2021-03-02 (Tue, 02 Mar 2021)

  Changed paths:
    M controller/ofctrl.c

  Log Message:
  -----------
  ofctrl.c: Simplify active desired flow selection.

Always consider the first "desired flow" in the list of flows that refer an
"installed flow" as the active flow.  This simplifies the logic of the flow
update code and is used in a further commit to implement a partial ordering
of desired flows within installed flows.

Signed-off-by: Dumitru Ceara <dceara at redhat.com>
Signed-off-by: Han Zhou <hzhou at ovn.org>
Signed-off-by: Numan Siddique <numans at ovn.org>

(cherry-picked from commit 107bb25029350bd0f7dfeeb0ef3053adbd504e3e)


  Commit: 1ec698353aa7664e54fdb471837408829c7872cc
      https://github.com/ovn-org/ovn/commit/1ec698353aa7664e54fdb471837408829c7872cc
  Author: Dumitru Ceara <dceara at redhat.com>
  Date:   2021-03-02 (Tue, 02 Mar 2021)

  Changed paths:
    M controller/ofctrl.c

  Log Message:
  -----------
  ofctrl.c: Always log the most recent flow changes.

Fixes: 6f0b1e02d9ab ("ofctrl: Incremental processing for flow installation by tracking.")
Signed-off-by: Dumitru Ceara <dceara at redhat.com>
Signed-off-by: Han Zhou <hzhou at ovn.org>
(cherry picked from commit 33c15c145988daa6172928dc870f3a0225515f50)
Signed-off-by: Frode Nordahl <frode.nordahl at canonical.com>
Signed-off-by: Numan Siddique <numans at ovn.org>


  Commit: b585f1d64f2263bd585c7bae06e52e952bf4ebe5
      https://github.com/ovn-org/ovn/commit/b585f1d64f2263bd585c7bae06e52e952bf4ebe5
  Author: Dumitru Ceara <dceara at redhat.com>
  Date:   2021-03-02 (Tue, 02 Mar 2021)

  Changed paths:
    M controller/ofctrl.c
    M tests/ovn.at

  Log Message:
  -----------
  ofctrl.c: Add a predictable resolution for conflicting flow actions.

Until now, in case the ACL configuration generates openflows that have
the same match but different actions, ovn-controller was using the
following approach:
1. If the flow being added contains conjunctive actions, merge its
   actions with the already existing flow.
2. Otherwise, if the flow is being added incrementally
   (update_installed_flows_by_track), don't install the new flow but
   instead keep the old one.
3. Otherwise, (update_installed_flows_by_compare), don't install the
   new flow but instead keep the old one.

Even though one can argue that having an ACL with a match that includes
the match of another ACL is a misconfiguration, it can happen that the
users provision OVN like this.  Depending on the order of reading and
installing the logical flows, the above operations can yield
unpredictable results, e.g., allow specific traffic but then after
ovn-controller is restarted (or a recompute happens) that specific
traffic starts being dropped.

A simple example of ACL configuration is:
ovn-nbctl acl-add ls to-lport 3 '(ip4.src==10.0.0.1 ||
ip4.src==10.0.0.2) && (ip4.dst == 10.0.0.3 || ip4.dst == 10.0.0.4)' allow
ovn-nbctl acl-add ls to-lport 3 'ip4.src==10.0.0.1' allow
ovn-nbctl acl-add ls to-lport 2 'arp' allow
ovn-nbctl acl-add ls to-lport 1 'ip4' drop

This is a pattern used by most CMSs:
- define a default deny policy.
- punch holes in the default deny policy based on user specific
  configurations.

Without this commit the behavior for traffic from 10.0.0.1 to 10.0.0.5
is unpredictable.  Depending on the order of operations traffic might be
dropped or allowed.

It's also quite hard to force the CMS to ensure that such match overlaps
never occur.

To address this issue we now ensure that all desired flows refering the
same installed flow are partially sorted in the following way:
- first all flows with action "allow".
- then all flows with action "drop".
- then a single flow with action "conjunction" (resulting from merging
  all flows with the same match and action conjunction).

This ensures that "allow" flows have precedence over "drop" flows which
in turn have precedence over "conjunction" flows.  Essentially less
restrictive flows are always preferred over more restrictive flows whenever a match
conflict happens.

CC: Daniel Alvarez <dalvarez at redhat.com>
Reported-at: https://bugzilla.redhat.com/1871931
Signed-off-by: Dumitru Ceara <dceara at redhat.com>
Acked-by: Mark Gray <mark.d.gray at redhat.com>
Signed-off-by: Han Zhou <hzhou at ovn.org>
(cherry picked from commit 986b3d5e4ad6f05245d021ba699c957246294a22)
Signed-off-by: Frode Nordahl <frode.nordahl at canonical.com>
Signed-off-by: Numan Siddique <numans at ovn.org>


  Commit: 373e04048198830d039a4d7cce17945989d97a33
      https://github.com/ovn-org/ovn/commit/373e04048198830d039a4d7cce17945989d97a33
  Author: Dumitru Ceara <dceara at redhat.com>
  Date:   2021-03-02 (Tue, 02 Mar 2021)

  Changed paths:
    M tests/ovn.at

  Log Message:
  -----------
  ovn.at: Make some of the tests more predictable.

Fix some of the race conditions that are present in the current OVN test
suite.

Signed-off-by: Dumitru Ceara <dceara at redhat.com>
Signed-off-by: Ben Pfaff <blp at ovn.org>
(cherry picked from commit d6594a4695084e1b0f4fb3170547942876d3a81a)
Signed-off-by: Frode Nordahl <frode.nordahl at canonical.com>
Signed-off-by: Numan Siddique <numans at ovn.org>


  Commit: 31007ef76e192cca1f03765e72e0cc2cabccc3fc
      https://github.com/ovn-org/ovn/commit/31007ef76e192cca1f03765e72e0cc2cabccc3fc
  Author: Dumitru Ceara <dceara at redhat.com>
  Date:   2021-03-02 (Tue, 02 Mar 2021)

  Changed paths:
    M tests/ofproto-macros.at

  Log Message:
  -----------
  tests: Add ofctl_strip_all() to filter OVS flow outputs.

Extend the already existing ofctl_strip() to also ignore n_packets,
n_bytes, cookie.  Use some helper functions from
tests/system-userspace-packet-type-aware.at, which will be removed
by a future commit.

Signed-off-by: Dumitru Ceara <dceara at redhat.com>
Signed-off-by: Numan Siddique <numans at ovn.org>
(cherry picked from commit 0879a0aa9f332bdc689769a1bc2f156f1c3149a5)
Signed-off-by: Frode Nordahl <frode.nordahl at canonical.com>


  Commit: 6a7660698db573519c349e80444ee141f2e4280a
      https://github.com/ovn-org/ovn/commit/6a7660698db573519c349e80444ee141f2e4280a
  Author: Dumitru Ceara <dceara at redhat.com>
  Date:   2021-03-02 (Tue, 02 Mar 2021)

  Changed paths:
    M tests/ovn.at

  Log Message:
  -----------
  tests: Fix test "ovn -- Superseding ACLs with conjunction".

The test was checking the output of "ovs-ofctl dump-flows" without
taking into account that some fields don't have predictable values,
e.g., hard_age.

Instead, use ofctl_strip_all().

Fixes: 986b3d5e4ad6 ("ofctrl.c: Add a predictable resolution for conflicting flow actions.")
Reported-by: Numan Siddique <numans at ovn.org>
Signed-off-by: Dumitru Ceara <dceara at redhat.com>
Signed-off-by: Numan Siddique <numans at ovn.org>
(cherry picked from commit 6e3d69e6b5153e26c869a03ec7dd1aaa40488005)
Signed-off-by: Frode Nordahl <frode.nordahl at canonical.com>


  Commit: 3c9e38f4f082681a6bf715ccdc89a4fedffee268
      https://github.com/ovn-org/ovn/commit/3c9e38f4f082681a6bf715ccdc89a4fedffee268
  Author: Numan Siddique <numans at ovn.org>
  Date:   2021-03-02 (Tue, 02 Mar 2021)

  Changed paths:
    M controller/ofctrl.c
    M tests/ovn.at

  Log Message:
  -----------
  ofctrl: Fix the assert seen when flood removing flows.

In one of the scaled deployments, ovn-controller is asserting with the
below stack trace

***
 (gdb) bt
   0  raise () from /lib64/libc.so.6
   1  abort () from /lib64/libc.so.6
   2  ovs_abort_valist ("%s: assertion %s failed in %s()") at lib/util.c:419
   3  vlog_abort_valist ("%s: assertion %s failed in %s()") at lib/vlog.c:1249
   4  vlog_abort ("%s: assertion %s failed in %s()") at lib/vlog.c:1263
   5  ovs_assert_failure (where="controller/ofctrl.c:1198",
                          function="flood_remove_flows_for_sb_uuid",
                          condition="ovs_list_is_empty(&f->list_node)") at lib/util.c:86
   6  flood_remove_flows_for_sb_uuid (sb_uuid=...538,
        flood_remove_nodes=...ed0) at controller/ofctrl.c:1205
   7  flood_remove_flows_for_sb_uuid (sb_uuid=...898,
        flood_remove_nodes=...ed0) at controller/ofctrl.c:1230
   8  flood_remove_flows_for_sb_uuid (sb_uuid=...bf0,
        flood_remove_nodes=...ed0) at controller/ofctrl.c:1230
   9  ofctrl_flood_remove_flows (flood_remove_nodes=...ed0) at controller/ofctrl.c:1250
   10 lflow_handle_changed_ref (ref_type=REF_TYPE_PORTGROUP,
        ref_name= "5564_pg_64...bac") at controller/lflow.c:612
   11 _flow_output_resource_ref_handler (ref_type=REF_TYPE_PORTGROUP)
        at controller/ovn-controller.c:2181
   12 engine_compute () at lib/inc-proc-eng.c:306
   13 engine_run_node (recompute_allowed=true) at lib/inc-proc-eng.c:352
   14 engine_run (recompute_allowed=true) at lib/inc-proc-eng.c:377
   15 main () at controller/ovn-controller.c:2794
***

This assertion is seen when a port group gets updated and it is referenced by many
logical flows (with conj actions).  The function ofctrl_flood_remove_flows(), calls
flood_remove_flows_for_sb_uuid() for each sb uuid in the hmap - flood_remove_nodes
using HMAP_FOR_EACH (flood_remove_nodes). flood_remove_flows_for_sb_uuid() also takes
the hmap 'flood_remove_nodes' as an argument and it inserts few items into it when
it has to call itself recursively.  When an item is inserted, its possible that the
hmap may get expanded.  And if this happens, the HMAP_FOR_EACH () skips few entries
causing some of the desired flows not getting cleared.

Later when ofctrl_add_or_append_flow() is called, there would be multiple
'struct sb_flow_ref' references for the same desired flow.  And this causes the
above assertion later when the same port group gets updated.

This patch fixes this issue by cloning the hmap 'flood_remove_nodes' and using it to
iterate the flood remove nodes.  Also a test case is added to cover this scenario.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1928012
Fixes: 580aea72e26f ("ovn-controller: Fix conjunction handling with incremental processing.")
Suggested-by: Ilya Maximetes <i.maximets at ovn.org>
Acked-by: Ilya Maximetes <i.maximets at ovn.org>
Signed-off-by: Numan Siddique <numans at ovn.org>

(cherry-picked from master commit 858d1dd716db1a1e664a7c1737fd34f04fcbda5e)


  Commit: 8932f4f8e27b568e9646b4d9d60a332ed8dac68f
      https://github.com/ovn-org/ovn/commit/8932f4f8e27b568e9646b4d9d60a332ed8dac68f
  Author: Dumitru Ceara <dceara at redhat.com>
  Date:   2021-03-02 (Tue, 02 Mar 2021)

  Changed paths:
    M controller/ofctrl.c
    M tests/ovn.at

  Log Message:
  -----------
  ofctrl: Do not link a desired flow twice.

Depending on the logical flow matches, multiple SB flows can point to
the same desired flow.  If it happens that the desired flow conflicts
with a more restrictive (already installed) flow, then we shouldn't try
to add the desired flow multiple times to the list maintained for the
installed flow.

Fixes: 6f0b1e02d9ab ("ofctrl: Incremental processing for flow installation by tracking.")
Signed-off-by: Dumitru Ceara <dceara at redhat.com>
Acked-by: Mark Gray <mark.d.gray at redhat.com>
Signed-off-by: Numan Siddique <numans at ovn.org>

(cherry-picked from master commit 6975c649f932633042ca54df2d8f8f0eb866c344)


  Commit: 79d991664938ba0e6423f344f252865840d3d8d5
      https://github.com/ovn-org/ovn/commit/79d991664938ba0e6423f344f252865840d3d8d5
  Author: Numan Siddique <numans at ovn.org>
  Date:   2021-03-02 (Tue, 02 Mar 2021)

  Changed paths:
    M controller/lflow.c
    M tests/ovn.at

  Log Message:
  -----------
  ofctrl: Fix the assert seen when flood removing flows with conj actions.

In one of the scaled deployments, ovn-controller is asserting with the
below stack trace

    ***
     (gdb) bt
       0  raise () from /lib64/libc.so.6
       1  abort () from /lib64/libc.so.6
       2  ovs_abort_valist ("%s: assertion %s failed in %s()") at lib/util.c:419
       3  vlog_abort_valist ("%s: assertion %s failed in %s()") at lib/vlog.c:1249
       4  vlog_abort ("%s: assertion %s failed in %s()") at lib/vlog.c:1263
       5  ovs_assert_failure (where="controller/ofctrl.c:1216",
                              function="flood_remove_flows_for_sb_uuid",
                              condition="ovs_list_is_empty(&f->list_node)") at lib/util.c:86
       6  flood_remove_flows_for_sb_uuid (sb_uuid=...134,
            flood_remove_nodes=...ef0) at controller/ofctrl.c:1216
       9  ofctrl_flood_remove_flows (flood_remove_nodes=...ef0) at controller/ofctrl.c:1280
       10 lflow_handle_changed_ref (ref_type=REF_TYPE_PORTGROUP,
            ref_name= "5564_pg_14...aac") at controller/lflow.c:522
       11 _flow_output_resource_ref_handler (ref_type=REF_TYPE_PORTGROUP)
            at controller/ovn-controller.c:2220
       12 engine_compute () at lib/inc-proc-eng.c:306
       13 engine_run_node (recompute_allowed=true) at lib/inc-proc-eng.c:352
       14 engine_run (recompute_allowed=true) at lib/inc-proc-eng.c:377
       15 main () at controller/ovn-controller.c:2887
    ***

The reason for this assert is different from the assertion bug fixed in
the commit 858d1dd716("ofctrl: Fix the assert seen when flood removing
flows.")

The above assertion is seen if lflow.c calls ofctrl_add_or_append_flow()
twice for the same flow uuid 'S'.  When this function is called for
the first time, a new desired flow 'f' is created and 'f->references'
will have [(S, f)].  When it is called for the 2nd time, the list
'f->references' is updated as [(S, f), (S,f)].  Later when
flood_remove_flows_for_sb_uuid() is called for 'S', the above assertion
is seen.

This scenarion can happen when ovn-controller receives an update message
from SB ovsdb-server in which a new logical flow 'F' belonging to a
datapath 'D' is added which would result in conjunction flows.  If this
datapath 'D' is added to 'local_datapaths' in the same loop, then
logical flow 'F' is first processed in lflow_add_flows_for_datapath()
and the second time in lflow_handle_changed_flows().

This issue can be fixed in 2 ways
1. Flood remove all the tracked flows in lflow_handle_changed_flows()
   first and then process the modified or newly added logical flows.

2. In the function ofctrl_add_or_append_flow(), check if there is
   already a reference for the flow uuid 'S' in the existing desired
   flows and only append the actions if it doesn't exist.

This patch has taken the approach (1) to ensure correctness of flows.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1929978
Signed-off-by: Numan Siddique <numans at ovn.org>
Acked-by: Dumitru Ceara <dceara at redhat.com>
Signed-off-by: Numan Siddique <numans at ovn.org>

(cherry-picked from master commit c6c61b4e3462fb5201a61a226c2acaf6f4caf917)


Compare: https://github.com/ovn-org/ovn/compare/5ee35ecef3d2...79d991664938


More information about the git mailing list