[ovs-dev] [PATCH ovn] ofctrl: Don't append actions for conj flows if called twice for same flow uuid.

Numan Siddique numans at ovn.org
Mon Feb 22 14:29:46 UTC 2021


On Mon, Feb 22, 2021 at 6:27 PM Dumitru Ceara <dceara at redhat.com> wrote:
>
> On 2/22/21 11:58 AM, Numan Siddique wrote:
> > On Mon, Feb 22, 2021 at 3:13 PM Dumitru Ceara <dceara at redhat.com> wrote:
> >>
> >> On 2/22/21 10:23 AM, Numan Siddique wrote:
> >>> On Mon, Feb 22, 2021 at 2:17 PM Dumitru Ceara <dceara at redhat.com> wrote:
> >>>>
> >>>> On 2/21/21 12:34 PM, numans at ovn.org wrote:
> >>>>> From: Numan Siddique <numans at ovn.org>
> >>>>>
> >>>>> In one of the scaled deployments, ovn-controller is asserting with the
> >>>>> below stack trace
> >>>>>
> >>>>>        ***
> >>>>>         (gdb) bt
> >>>>>           0  raise () from /lib64/libc.so.6
> >>>>>           1  abort () from /lib64/libc.so.6
> >>>>>           2  ovs_abort_valist ("%s: assertion %s failed in %s()") at lib/util.c:419
> >>>>>           3  vlog_abort_valist ("%s: assertion %s failed in %s()") at lib/vlog.c:1249
> >>>>>           4  vlog_abort ("%s: assertion %s failed in %s()") at lib/vlog.c:1263
> >>>>>           5  ovs_assert_failure (where="controller/ofctrl.c:1216",
> >>>>>                                  function="flood_remove_flows_for_sb_uuid",
> >>>>>                                  condition="ovs_list_is_empty(&f->list_node)") at lib/util.c:86
> >>>>>           6  flood_remove_flows_for_sb_uuid (sb_uuid=...134,
> >>>>>                flood_remove_nodes=...ef0) at controller/ofctrl.c:1216
> >>>>>           9  ofctrl_flood_remove_flows (flood_remove_nodes=...ef0) at controller/ofctrl.c:1280
> >>>>>           10 lflow_handle_changed_ref (ref_type=REF_TYPE_PORTGROUP,
> >>>>>                ref_name= "5564_pg_14...aac") at controller/lflow.c:522
> >>>>>           11 _flow_output_resource_ref_handler (ref_type=REF_TYPE_PORTGROUP)
> >>>>>                at controller/ovn-controller.c:2220
> >>>>>           12 engine_compute () at lib/inc-proc-eng.c:306
> >>>>>           13 engine_run_node (recompute_allowed=true) at lib/inc-proc-eng.c:352
> >>>>>           14 engine_run (recompute_allowed=true) at lib/inc-proc-eng.c:377
> >>>>>           15 main () at controller/ovn-controller.c:2887
> >>>>>        ***
> >>>>>
> >>>>> The reason for this assert is different from the assertion bug fixed in
> >>>>> the commit 858d1dd716("ofctrl: Fix the assert seen when flood removing
> >>>>> flows.")
> >>>>>
> >>>>> The above assertion is seen if lflow.c calls ofctrl_add_or_append_flow()
> >>>>> twice for the same flow uuid 'S'.  When this function is called for
> >>>>> the first time, a new desired flow 'f' is created and 'f->references'
> >>>>> will have [(S, f)].  When it is called for the 2nd time, the list
> >>>>> 'f->references' is updated as [(S, f), (S,f)].  Later when
> >>>>> flood_remove_flows_for_sb_uuid() is called for 'S', the above assertion
> >>>>> is seen.
> >>>>>
> >>>>> This scenarion can happen when ovn-controller receives an update message
> >>>>> from SB ovsdb-server in which a new logical flow 'F' belonging to a
> >>>>> datapath 'D' is added which would result in conjunction flows.  If this
> >>>>> datapath 'D' is added to 'local_datapaths' in the same loop, then
> >>>>> logical flow 'F' is first processed in lflow_add_flows_for_datapath()
> >>>>> and the second time in lflow_handle_changed_flows().
> >>>>>
> >>>>> This issue can be fixed in 2 ways
> >>>>> 1. Flood remove all the tracked flows in lflow_handle_changed_flows()
> >>>>>       first and then process the modified or newly added logical flows.
> >>>>>
> >>>>> 2. In the function ofctrl_add_or_append_flow(), check if there is
> >>>>>       already a reference for the flow uuid 'S' in the existing desired
> >>>>>       flows and only append the actions if it doesn't exist.
> >>>>>
> >>>>> This patch has taken the approach (2) as that seems better and
> >>>>> effecient.
> >>>>>
> >>>>> Signed-off-by: Numan Siddique <numans at ovn.org>
> >>>>> ---
> >>>>
> >>>> Hi Numan,
> >>>>
> >>>> Nice work tracking this down!
> >>>>
> >>>> This is not a full review, just an initial question to see if there's
> >>>> also an option #3 to fix this issue:
> >>>>
> >>>> Can we try to avoid processing the same lflow twice?  E.g., in
> >>>> lflow_add_flows_for_datapath(), call consider_logical_flow__() only for
> >>>> logical flows that are not on the table's 'track_list', so have not
> >>>> changed in the SB.
> >>>
> >>> Hi Dumitru,
> >>>
> >>> Thanks for the comments. I think this would require running  a for
> >>> loop of tracked
> >>> flows for every logical flow for the datapath and checking if it is in
> >>> the tracked list or not.
> >>
> >> OVS IDL doesn't explicitly expose an API to check if a record has been
> >> changed (i.e., is on the track list).  We could add one, and use it in
> >> lflow_add_flows_for_datapath() when iterating over lflows, to just skip
> >> records that were marked as updates already.
> >>
> >>> I think we could do this too. But in my opinion we should support
> >>> ofctrl_add_or_append_flow() to be called twice for the same uuid.
> >>
> >> I'm not sure about this.  Why should we support adding the same flow
> >> twice?  It seems like a bug to me.
> >
> > Sorry for the confusion.  What I mean is if the user calls it twice, we
> > should warn about it and return.
> >
> > Like how we do here -
> > https://github.com/ovn-org/ovn/blob/master/controller/ofctrl.c#L1037
> > in ofctrl_check_and_add_flow().
> >
> >
> >>
> >>> When ofctrl_add_or_append_flow() is called twice, it should result in a no-op.
> >>
> >> What if we implement (3) and also add an assert or check in
> >> link_flow_to_sb() to not allow the same flow to be added twice to a SB uuid?
> >>
> >>>
> >>> We could end up in a similar scenario later when incrementally processing
> >>> other changes.
> >>
> >> As far as I see, we currently call ofctrl_add_or_append_flow() only for
> >> logical flows so, at least for now, no other incrementally processed SB
> >> record types can cause this issue.
> >>
> >>>
> >>> In my opinion we can take (2) to not result in this assert scenario.
> >>> We can also take your suggested approach (3) so that we don't call
> >>> ofctrl_add_or_append_flow() twice for the reported scenario.
> >>
> >> Right now ofctrl_add_or_append_flow() has the implicit requirement that
> >> a single flow shouldn't be added twice.  This seems reasonable to me,
> >> and I think it should be the responsibility of the caller to avoid that.
> >
> >>
> >> However, as you mentioned, we could also check inside the ofctrl_*()
> >> implementation and make sure the caller isn't adding the same flow
> >> twice, but I think that can be simplified by adding the check to
> >> link_flow_to_sb().
> >
> > You mean we should  check if a desired flow references an sb_uuid
> > in the link_flow_to_sb() instead of doing the way this patch does ?
> >
> >
> >   The way I see it we have two options when such a
> >> check would fail:
> >> - either assert
> >> - or ignore the second addition while logging an error but this might be
> >> easily overlooked and so the real problem of adding the same flow twice
> >> will not be addressed.
> >
> > I don't see any real problem if a flow for a given uuid is attemted to
> > added twice.
> > We would just ignore the 2nd one.
> >
> > I took the approach (2) over (1) because
> > 1.  it is done in a similar way in check_and_add_flow()
> > 2. (1) would result in unnecessary addition, removal and addition. so I chose
> >      effeciency over redundancy.
> >
> > If (2) complicates the code (IMHO it doesn't. We just need to traverse
> > the desired
> > flow->references to check if there is an entry already or not),
> > I think we should just take the approach (1).
> > Because we have followed the approach of deletion and then addition
> > for all the changes. I think we should do the same here.
> > For every tracked flow, flood delete and then call consider_logical_flow()
> > for every new or updated tracked flow.
> >
>
> Thanks for the discussion, Numan!
>
> In this case, I would prefer option (1) of "flood removing" all tracked
> flows in lflow_handle_changed_flows().  This should ensure correctness
> and would keep the code simpler.
>
> We can revisit this approach and optimize it if/when the ovn-controller
> refactoring happens.

Ok. Sounds good. I will spin v2 accordingly.

Numan

>
> Regards,
> Dumitru
>
> _______________________________________________
> dev mailing list
> dev at openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>


More information about the dev mailing list