[ovs-git] [ovn-org/ovn] 671315: ofctrl: Fix the assert seen when flood removing fl...

numansiddique noreply at github.com
Tue Feb 23 12:20:50 UTC 2021


  Branch: refs/heads/branch-20.12
  Home:   https://github.com/ovn-org/ovn
  Commit: 6713159d97b6eaa29ecc6364351c4eeabdd65a26
      https://github.com/ovn-org/ovn/commit/6713159d97b6eaa29ecc6364351c4eeabdd65a26
  Author: Numan Siddique <numans at ovn.org>
  Date:   2021-02-23 (Tue, 23 Feb 2021)

  Changed paths:
    M controller/lflow.c
    M tests/ovn.at

  Log Message:
  -----------
  ofctrl: Fix the assert seen when flood removing flows with conj actions.

In one of the scaled deployments, ovn-controller is asserting with the
below stack trace

    ***
     (gdb) bt
       0  raise () from /lib64/libc.so.6
       1  abort () from /lib64/libc.so.6
       2  ovs_abort_valist ("%s: assertion %s failed in %s()") at lib/util.c:419
       3  vlog_abort_valist ("%s: assertion %s failed in %s()") at lib/vlog.c:1249
       4  vlog_abort ("%s: assertion %s failed in %s()") at lib/vlog.c:1263
       5  ovs_assert_failure (where="controller/ofctrl.c:1216",
                              function="flood_remove_flows_for_sb_uuid",
                              condition="ovs_list_is_empty(&f->list_node)") at lib/util.c:86
       6  flood_remove_flows_for_sb_uuid (sb_uuid=...134,
            flood_remove_nodes=...ef0) at controller/ofctrl.c:1216
       9  ofctrl_flood_remove_flows (flood_remove_nodes=...ef0) at controller/ofctrl.c:1280
       10 lflow_handle_changed_ref (ref_type=REF_TYPE_PORTGROUP,
            ref_name= "5564_pg_14...aac") at controller/lflow.c:522
       11 _flow_output_resource_ref_handler (ref_type=REF_TYPE_PORTGROUP)
            at controller/ovn-controller.c:2220
       12 engine_compute () at lib/inc-proc-eng.c:306
       13 engine_run_node (recompute_allowed=true) at lib/inc-proc-eng.c:352
       14 engine_run (recompute_allowed=true) at lib/inc-proc-eng.c:377
       15 main () at controller/ovn-controller.c:2887
    ***

The reason for this assert is different from the assertion bug fixed in
the commit 858d1dd716("ofctrl: Fix the assert seen when flood removing
flows.")

The above assertion is seen if lflow.c calls ofctrl_add_or_append_flow()
twice for the same flow uuid 'S'.  When this function is called for
the first time, a new desired flow 'f' is created and 'f->references'
will have [(S, f)].  When it is called for the 2nd time, the list
'f->references' is updated as [(S, f), (S,f)].  Later when
flood_remove_flows_for_sb_uuid() is called for 'S', the above assertion
is seen.

This scenarion can happen when ovn-controller receives an update message
from SB ovsdb-server in which a new logical flow 'F' belonging to a
datapath 'D' is added which would result in conjunction flows.  If this
datapath 'D' is added to 'local_datapaths' in the same loop, then
logical flow 'F' is first processed in lflow_add_flows_for_datapath()
and the second time in lflow_handle_changed_flows().

This issue can be fixed in 2 ways
1. Flood remove all the tracked flows in lflow_handle_changed_flows()
   first and then process the modified or newly added logical flows.

2. In the function ofctrl_add_or_append_flow(), check if there is
   already a reference for the flow uuid 'S' in the existing desired
   flows and only append the actions if it doesn't exist.

This patch has taken the approach (1) to ensure correctness of flows.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1929978
Signed-off-by: Numan Siddique <numans at ovn.org>
Acked-by: Dumitru Ceara <dceara at redhat.com>
Signed-off-by: Numan Siddique <numans at ovn.org>

(cherry-picked from master commit c6c61b4e3462fb5201a61a226c2acaf6f4caf917)




More information about the git mailing list