[ovs-dev] [PATCH ovn] ofctrl: Fix the assert seen when flood removing flows.
Numan Siddique
nusiddiq at redhat.com
Tue Feb 16 08:08:56 UTC 2021
On Tue, Feb 16, 2021, 12:33 PM <numans at ovn.org> wrote:
> From: Numan Siddique <numans at ovn.org>
>
> In one of the scaled deployments, ovn-controller is asserting with the
> below stack trace
>
> ****************
> (gdb) bt
> #0 0x00007fa6aed4970f in raise () from /lib64/libc.so.6
> #1 0x00007fa6aed33b25 in abort () from /lib64/libc.so.6
> #2 0x000055d46594a714 in ovs_abort_valist (err_no=err_no at entry=0,
> format=format at entry=0x55d465a2a190 "%s: assertion %s failed in %s()",
> args=args at entry=0x7ffd24307ba0) at lib/util.c:419
> #3 0x000055d465952504 in vlog_abort_valist (module_=<optimized out>,
> message=0x55d465a2a190 "%s: assertion %s failed in %s()", args=args at entry=0x7ffd24307ba0)
> at lib/vlog.c:1249
> #4 0x000055d4659525aa in vlog_abort (module=module at entry=0x55d465ce6880
> <this_module>, message=message at entry=0x55d465a2a190 "%s: assertion %s
> failed in %s()") at lib/vlog.c:1263
> #5 0x000055d46594a42b in ovs_assert_failure (where=where at entry=0x55d465a05529
> "controller/ofctrl.c:1198", function=function at entry=0x55d465a05ca0
> <__func__.33798> "flood_remove_flows_for_sb_uuid",
> condition=condition at entry=0x55d465a05a80
> "ovs_list_is_empty(&f->list_node)") at lib/util.c:86
> #6 0x000055d465877aa2 in flood_remove_flows_for_sb_uuid
> (flow_table=flow_table at entry=0x55d46688d080, sb_uuid=sb_uuid at entry=0x55d53dbec538,
> flood_remove_nodes=flood_remove_nodes at entry=0x7ffd24307ed0) at
> controller/ofctrl.c:1205
> #7 0x000055d465877c2e in flood_remove_flows_for_sb_uuid
> (flow_table=flow_table at entry=0x55d46688d080, sb_uuid=sb_uuid at entry=0x55d546553898,
> flood_remove_nodes=flood_remove_nodes at entry=0x7ffd24307ed0) at
> controller/ofctrl.c:1230
> #8 0x000055d465877c2e in flood_remove_flows_for_sb_uuid
> (flow_table=flow_table at entry=0x55d46688d080, sb_uuid=sb_uuid at entry=0x55d55eafebf0,
> flood_remove_nodes=flood_remove_nodes at entry=0x7ffd24307ed0) at
> controller/ofctrl.c:1230
> #9 0x000055d465877dc2 in ofctrl_flood_remove_flows
> (flow_table=0x55d46688d080, flood_remove_nodes=flood_remove_nodes at entry=0x7ffd24307ed0)
> at controller/ofctrl.c:1250
> #10 0x000055d465872b24 in lflow_handle_changed_ref
> (ref_type=ref_type at entry=REF_TYPE_PORTGROUP, ref_name=ref_name at entry=0x55d49375a050
> "5564_pg_6415729e_58ec_4b8b_bc99_2ceef5c44bac", l_ctx_in=l_ctx_in at entry
> =0x7ffd24307fd0,
> l_ctx_out=l_ctx_out at entry=0x7ffd24307f90, changed=changed at entry=0x7ffd24307f8f)
> at controller/lflow.c:612
> #11 0x000055d46588f2f8 in _flow_output_resource_ref_handler
> (node=<optimized out>, data=<optimized out>, ref_type=REF_TYPE_PORTGROUP)
> at controller/ovn-controller.c:2181
> #12 0x000055d4658a8163 in engine_compute (recompute_allowed=<optimized
> out>, node=<optimized out>) at lib/inc-proc-eng.c:306
> #13 engine_run_node (recompute_allowed=true, node=0x7ffd2430d4b0) at
> lib/inc-proc-eng.c:352
> #14 engine_run (recompute_allowed=recompute_allowed at entry=true) at
> lib/inc-proc-eng.c:377
> #15 0x000055d46586613d in main (argc=<optimized out>, argv=<optimized
> out>) at controller/ovn-controller.c:2794
>
> ***************
>
> This assertion is seen when a port group gets updated and it is referenced
> by many
> logical flows (with conj actions). The function
> ofctrl_flood_remove_flows(), calls flood_remove_flows_for_sb_uuid() for
> each sb uuid in the hmap - flood_remove_nodes using HMAP_FOR_EACH
> (flood_remove_nodes).
> flood_remove_flows_for_sb_uuid() also takes the hmap
> 'flood_remove_nodes' as an argument and it inserts few items into it
> when it has to call itself recursively. When an item is inserted, its
> possible that
> the hmap may get expanded. And if this happens, the HMAP_FOR_EACH ()
> skips few entries causing some of the desired flows not getting cleared.
>
> Later when ofctrl_add_or_append_flow() is called, there would be
> multiple 'struct sb_flow_ref' references for the same desired flow.
> And this causes the above assertion later when the same port group gets
> updated.
>
> This patch fixes this issue by cloning the hmap 'flood_remove_nodes' and
> using it to iterate the flood remove nodes.
>
> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1928012
> CC: Han Zhou <hzhou at ovn.org>
> CC: Dumitru Ceara <dceara at ovn.org>
> Signed-off-by: Numan Siddique <numans at ovn.org>
>
Hi Han and Dumitru,
Would you mind taking a look at this patch when you can.
Thanks
Numan
---
> controller/ofctrl.c | 18 ++++++++++++++++++
> 1 file changed, 18 insertions(+)
>
> diff --git a/controller/ofctrl.c b/controller/ofctrl.c
> index 9d62e1260..42443acb2 100644
> --- a/controller/ofctrl.c
> +++ b/controller/ofctrl.c
> @@ -1247,10 +1247,28 @@ ofctrl_flood_remove_flows(struct
> ovn_desired_flow_table *flow_table,
> struct hmap *flood_remove_nodes)
> {
> struct ofctrl_flood_remove_node *ofrn;
> +
> + /* flood_remove_flows_for_sb_uuid() modifies the hmap by inserting
> + * few entries when it calls itself recursively. Clone the
> + * hmap 'flood_remove_nodes' before calling. Inserting an item to
> + * hmap may expand it. */
> + struct hmap flood_remove_uuids =
> HMAP_INITIALIZER(&flood_remove_uuids);
> HMAP_FOR_EACH (ofrn, hmap_node, flood_remove_nodes) {
> + ofctrl_flood_remove_add_node(&flood_remove_uuids, &ofrn->sb_uuid);
> + }
> +
> + /* Iterate using the cloned hmap - 'flood_remove_uuids', but pass
> + * the hmap 'flood_remove_nodes' provided by the caller.
> + * flood_remove_flows_for_sb_uuid() will delete the other lflows
> + * referenced by the sb_uuid, which needs to be re-added later
> + * if those sb_uuids were not deleted. The caller
> + * (in lflow.c) will add re-add lflows which are not deleted. */
> + HMAP_FOR_EACH_POP (ofrn, hmap_node, &flood_remove_uuids) {
> flood_remove_flows_for_sb_uuid(flow_table, &ofrn->sb_uuid,
> flood_remove_nodes);
> + free(ofrn);
> }
> + hmap_destroy(&flood_remove_uuids);
>
> /* remove any related group and meter info */
> HMAP_FOR_EACH (ofrn, hmap_node, flood_remove_nodes) {
> --
> 2.29.2
>
> _______________________________________________
> dev mailing list
> dev at openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
>
More information about the dev
mailing list