[ovs-git] [ovn-org/ovn] c60c9a: ofctrl: Fix the assert seen when flood removing fl...

numansiddique noreply at github.com
Wed Feb 17 02:35:12 UTC 2021


  Branch: refs/heads/branch-20.12
  Home:   https://github.com/ovn-org/ovn
  Commit: c60c9af9ed6d15c53066f3d7a15a63926fef8748
      https://github.com/ovn-org/ovn/commit/c60c9af9ed6d15c53066f3d7a15a63926fef8748
  Author: Numan Siddique <numans at ovn.org>
  Date:   2021-02-17 (Wed, 17 Feb 2021)

  Changed paths:
    M controller/ofctrl.c
    M tests/ovn.at

  Log Message:
  -----------
  ofctrl: Fix the assert seen when flood removing flows.

In one of the scaled deployments, ovn-controller hits an assertion with the
stack trace below:

***
 (gdb) bt
   0  raise () from /lib64/libc.so.6
   1  abort () from /lib64/libc.so.6
   2  ovs_abort_valist ("%s: assertion %s failed in %s()") at lib/util.c:419
   3  vlog_abort_valist ("%s: assertion %s failed in %s()") at lib/vlog.c:1249
   4  vlog_abort ("%s: assertion %s failed in %s()") at lib/vlog.c:1263
   5  ovs_assert_failure (where="controller/ofctrl.c:1198",
                          function="flood_remove_flows_for_sb_uuid",
                          condition="ovs_list_is_empty(&f->list_node)") at lib/util.c:86
   6  flood_remove_flows_for_sb_uuid (sb_uuid=...538,
        flood_remove_nodes=...ed0) at controller/ofctrl.c:1205
   7  flood_remove_flows_for_sb_uuid (sb_uuid=...898,
        flood_remove_nodes=...ed0) at controller/ofctrl.c:1230
   8  flood_remove_flows_for_sb_uuid (sb_uuid=...bf0,
        flood_remove_nodes=...ed0) at controller/ofctrl.c:1230
   9  ofctrl_flood_remove_flows (flood_remove_nodes=...ed0) at controller/ofctrl.c:1250
   10 lflow_handle_changed_ref (ref_type=REF_TYPE_PORTGROUP,
        ref_name= "5564_pg_64...bac") at controller/lflow.c:612
   11 _flow_output_resource_ref_handler (ref_type=REF_TYPE_PORTGROUP)
        at controller/ovn-controller.c:2181
   12 engine_compute () at lib/inc-proc-eng.c:306
   13 engine_run_node (recompute_allowed=true) at lib/inc-proc-eng.c:352
   14 engine_run (recompute_allowed=true) at lib/inc-proc-eng.c:377
   15 main () at controller/ovn-controller.c:2794
***

This assertion is seen when a port group that is referenced by many logical
flows (with conj actions) gets updated.  The function
ofctrl_flood_remove_flows() calls flood_remove_flows_for_sb_uuid() for each
sb uuid in the hmap 'flood_remove_nodes', iterating with HMAP_FOR_EACH.
flood_remove_flows_for_sb_uuid() also takes the hmap 'flood_remove_nodes' as
an argument and inserts a few items into it when it has to call itself
recursively.  When an item is inserted, it's possible that the hmap gets
expanded.  If this happens, HMAP_FOR_EACH () skips a few entries, so some of
the desired flows are not cleared.
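
The hazard described above, and the general shape of a fix, can be sketched
in isolation.  The example below is only an illustration under stated
assumptions: it uses a toy chained hash table (the names 'struct table',
table_insert(), process_all_snapshot() and count_processed() are
hypothetical), not the OVS hmap API and not the code changed by this patch.
It copies the nodes present at call time into an array first and then walks
that array, so insertions that expand the table cannot cause entries to be
skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy chained hash table: a hypothetical stand-in, not the OVS hmap API. */
struct node {
    unsigned int key;
    bool processed;
    struct node *next;
};

struct table {
    struct node **buckets;
    size_t n_buckets;           /* Always a power of two. */
    size_t n;                   /* Number of nodes. */
};

static void
table_init(struct table *t)
{
    t->n_buckets = 2;
    t->n = 0;
    t->buckets = calloc(t->n_buckets, sizeof *t->buckets);
    assert(t->buckets);
}

/* Doubles the bucket array and relinks every node, changing walk order. */
static void
table_expand(struct table *t)
{
    size_t new_n = t->n_buckets * 2;
    struct node **nb = calloc(new_n, sizeof *nb);
    assert(nb);
    for (size_t i = 0; i < t->n_buckets; i++) {
        for (struct node *n = t->buckets[i], *next; n; n = next) {
            next = n->next;
            n->next = nb[n->key & (new_n - 1)];
            nb[n->key & (new_n - 1)] = n;
        }
    }
    free(t->buckets);
    t->buckets = nb;
    t->n_buckets = new_n;
}

static void
table_insert(struct table *t, unsigned int key)
{
    if (t->n >= t->n_buckets) {
        table_expand(t);
    }
    struct node *n = calloc(1, sizeof *n);
    assert(n);
    n->key = key;
    n->next = t->buckets[key & (t->n_buckets - 1)];
    t->buckets[key & (t->n_buckets - 1)] = n;
    t->n++;
}

/* Processes every node present at call time and returns how many there were.
 * Each step inserts a new node, as the recursive flood removal does, so the
 * walk runs over a snapshot array instead of the live table. */
static size_t
process_all_snapshot(struct table *t, unsigned int first_new_key)
{
    size_t k = 0;
    struct node **snap = malloc((t->n + 1) * sizeof *snap);
    assert(snap);
    for (size_t i = 0; i < t->n_buckets; i++) {
        for (struct node *n = t->buckets[i]; n; n = n->next) {
            snap[k++] = n;
        }
    }
    for (size_t i = 0; i < k; i++) {
        snap[i]->processed = true;
        table_insert(t, first_new_key + (unsigned int) i); /* May expand. */
    }
    free(snap);
    return k;
}

/* Counts nodes whose 'processed' flag is set. */
static size_t
count_processed(const struct table *t)
{
    size_t count = 0;
    for (size_t i = 0; i < t->n_buckets; i++) {
        for (struct node *n = t->buckets[i]; n; n = n->next) {
            count += n->processed;
        }
    }
    return count;
}
```

For example, after inserting eight keys and calling
process_all_snapshot(&t, 100), all eight original nodes are marked processed
even though the table expanded during the walk.  The snapshot holds node
pointers, which stay valid here because expansion only relinks nodes and
never frees them.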

Later, when ofctrl_add_or_append_flow() is called, there are multiple
'struct sb_flow_ref' references for the same desired flow.  This triggers the
above assertion the next time the same port group gets updated.

This patch fixes this issue by cloning the hmap 'flood_remove_nodes' and using
the clone to iterate over the flood remove nodes.  A test case is also added to
cover this scenario.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1928012
Fixes: 580aea72e26f ("ovn-controller: Fix conjunction handling with incremental processing.")
Suggested-by: Ilya Maximets <i.maximets at ovn.org>
Acked-by: Ilya Maximets <i.maximets at ovn.org>
Signed-off-by: Numan Siddique <numans at ovn.org>

(cherry-picked from master commit 858d1dd716db1a1e664a7c1737fd34f04fcbda5e)



