[ovs-dev] [PATCH v3 2/3] ovsdb-idl: Preserve references for deleted rows.

Han Zhou hzhou at ovn.org
Mon Mar 22 19:45:05 UTC 2021


On Mon, Mar 22, 2021 at 4:27 AM Dumitru Ceara <dceara at redhat.com> wrote:
>
> On 3/19/21 6:55 PM, Han Zhou wrote:
> > On Fri, Mar 19, 2021 at 1:42 AM Dumitru Ceara <dceara at redhat.com> wrote:
> >>
> >> On 3/19/21 2:41 AM, Han Zhou wrote:
> >>>
> >>>
> >>> On Thu, Mar 18, 2021 at 6:27 PM Han Zhou <hzhou at ovn.org
> >>> <mailto:hzhou at ovn.org>> wrote:
> >>>>
> >>>>
> >>>>
> >>>> On Thu, Mar 18, 2021 at 11:07 AM Ilya Maximets <i.maximets at ovn.org
> >>> <mailto:i.maximets at ovn.org>> wrote:
> >>>>>
> >>>>> On 3/12/21 1:07 PM, Dumitru Ceara wrote:
> >>>>>> Considering two DB rows, 'a' from table A and 'b' from table B
> > (with
> >>>>>> column 'ref_a' a reference to table A):
> >>>>>> a = {A._uuid=<U1>}
> >>>>>> b = {B._uuid=<U2>, B.ref_a=<U1>}
> >>>>>>
> >>>>>> Assuming both records are present in the IDL client's in-memory
> >>> view of
> >>>>>> the database, depending whether row 'b' is also deleted in the same
> >>>>>> transaction or not, deletion of row 'a' should generate the
> > following
> >>>>>> tracked changes:
> >>>>>>
> >>>>>> 1. only row 'a' is deleted:
> >>>>>> - for table A:
> >>>>>>   - deleted records: a = {A._uuid=<U1>}
> >>>>>> - for table B:
> >>>>>>   - updated records: b = {B._uuid=<U2>, B.ref_a=[]}
> >>>>>>
> >>>>>> 2. row 'a' and row 'b' are deleted in the same update:
> >>>>>> - for table A:
> >>>>>>   - deleted records: a = {A._uuid=<U1>}
> >>>>>> - for table B:
> >>>>>>   - deleted records: b = {B._uuid=<U2>, B.ref_a=<U1>}
> >>>>>>
> >>>>>> To ensure this, we now delay reparsing row backrefs for deleted
> > rows
> >>>>>> until all updates in the current run have been processed.
> >>>>>>
> >>>>>> Without this change, in scenario 2 above, the tracked changes for
> >>> table
> >>>>>> B would be:
> >>>>>> - deleted records: b = {B._uuid=<U2>, B.ref_a=[]}
> >>>>>>
> >>>>>> In particular, for strong references, row 'a' can never be deleted
> > in
> >>>>>> a transaction that happens strictly before row 'b' is deleted.  In
> >>> some
> >>>>>> cases [0] both rows are deleted in the same transaction and having
> >>>>>> B.ref_a=[] would violate the integrity of the database from client
> >>>>>> perspective.  This would force the client to always validate that
> >>>>>> strong reference fields are non-NULL.  This is not really an option
> >>>>>> because the information in the original reference is required for
> >>>>>> incrementally processing the record deletion.
> >>>>>>
> >>>>>> [0] with ovn-monitor-all=true, the following command triggers a
> > crash
> >>>>>>     in ovn-controller because a strong reference field becomes
> > NULL:
> >>>>>>     $ ovn-nbctl --wait=hv -- lr-add r -- lrp-add r rp
> >>> 00:00:00:00:00:01 1.0.0.1/24 <http://1.0.0.1/24>
> >>>>>>     $ ovn-nbctl lr-del r
> >>>>>>
> >>>>>> Reported-at: https://bugzilla.redhat.com/1932642
> >>> <https://bugzilla.redhat.com/1932642>
> >>>>>> Fixes: 72aeb243a52a ("ovsdb-idl: Tracking - preserve data for
> >>> deleted rows.")
> >>>>>> Signed-off-by: Dumitru Ceara <dceara at redhat.com
> >>> <mailto:dceara at redhat.com>>
> >>>>>> ---
> >>>>>>  lib/ovsdb-idl.c     |  135 ++++++++++++++++++++++++++--------
> >>>>>>  tests/ovsdb-idl.at <http://ovsdb-idl.at>  |  203
> >>> +++++++++++++++++++++++++++++++++++++++++++++++++++
> >>>>>>  tests/test-ovsdb.c  |   50 +++++++++++++
> >>>>>>  tests/test-ovsdb.py |   14 ++++
> >>>>>>  4 files changed, 370 insertions(+), 32 deletions(-)
> >>>>>>
> >>>>>> diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c
> >>>>>> index 9e1e787..1542da9 100644
> >>>>>> --- a/lib/ovsdb-idl.c
> >>>>>> +++ b/lib/ovsdb-idl.c
> >>>>>> @@ -92,6 +92,7 @@ struct ovsdb_idl {
> >>>>>>      struct ovsdb_idl_txn *txn;
> >>>>>>      struct hmap outstanding_txns;
> >>>>>>      bool verify_write_only;
> >>>>>> +    struct ovs_list orphan_rows; /* All deleted but still
> >>> referenced rows. */
> >>>>>
> >>>>> The name here is a bit confusing.  I mean it's not a list
> >>>>> of orphan rows, at least it's not a list of all the orphan
> >>>>> rows.  So, we should, probably, rename it to something like
> >>>>> 'deleted_rows' instead and also rename all related functions.
> >>>>>
> >>>>> What do you think?
> >>>>>
> >>>>
> >>>> I agree with Ilya that the "orphan_rows" is confusing. However,
> >>> "deleted_rows" may also be confusing, because the current code also
> >>> maintains deleted rows in the track_list of each table (no matter if
> >>> tracking is enabled or not). If tracking is enabled, the rows are
> >>> destroyed (for real) when ovsdb_idl_track_clear() is called; if not,
it
> >>> is destroyed in ovsdb_idl_row_destroy_postprocess(). The part of logic
> >>> was really confusing. This patch changes that behavior further by
> >>> introducing this orphan_rows list as a temporary place (for some
> >>> situations) before moving to the track_list. Since this patch already
> >>> did some refactoring (which is great), I'd suggest doing a little more
> >>> to avoid the above confusion: for deleted rows, shall we just put the
> >>> untracked rows in a "deleted_untracked_rows" list (i.e. rename the
> >>> "orphan_rows" of this patch), and let the "track_list" only hold the
> >>> tracked ones, and update the ovsdb_idl_row_destroy_postprocess() to
> >>> destroy deleted_untracked_rows without worrying the track_list.
> >>> track_list should exist only if tracking is enabled, and should be
taken
> >>> care by ovsdb_idl_track_clear().
> >>
> >> Thanks, Ilya and Han, for the review and suggestions.
> >>
> >> Renaming 'orphan_rows' definitely makes sense, I'll do that.  However,
> >> regarding using a different list for tables without change tracking I'm
> >> not so sure if that's OK.  Even if the table doesn't have change
> >> tracking its deleted rows should still first go to the "temporary"
> >> orphan_rows list.
> >>
> >> For example:
> >>
> >> Table T1 (no change tracking):
> >> - Row a = {A.uuid=<U1>}
> >>
> >> Table T2 (change tracking enabled):
> >> - Row b = {B.uuid=<U2>, B.ref_to_A=<U1>}
> >>
> >> If in a transaction both rows 'a' and 'b' are deleted, and updates are
> >> received in this order, if we wouldn't add 'a' to 'orphan_rows' then
> >> we'd be reparsing backrefs for 'a' immediately leading to:
> >>
> >> Row b (updated) = {B.uuid=<U2>, B.ref_to_A=None}
> >>
> >> Then the update to delete 'b' is processed:
> >>
> >> Row b (deleted) = {B.uuid=<U2>, B.ref_to_A=None}
> >>
> >> The above is wrong and breaks consistency if ref_to_A is a strong
> > reference.
> >>
> > Sorry that I didn't make it clear enough. The "deleted_untracked_rows"
list
> > in my suggestion is just a renaming of your "orphan_rows". The real
change
> > I suggested was that let's don't use it as a temporary place to hold
these
> > rows. We can make it clear that deleted_untracked_rows is for deleted
> > untracked rows and the "track_list" of each table contains only tracked
> > rows.
>
> I think there's some confusion: 'orphan_rows' doesn't contain deleted
> untracked rows.  What it actually stores is "rows deleted in the current
> IDL run regardless of change tracking enabled or not".
>
I think we are saying the same thing from different perspectives. For
"deleted untracked rows" I am not talking about whether change tracking is
enabled or not, but about whether the deleted row is in "track_list" or
not. In this patch, a deleted row is in either "track_list" (what I called
"tracked") or "orphan_rows" (what I called "deleted and untracked"). You
always call "ovsdb_idl_row_untrack_change" before putting a row to
"orphan_rows", so I think it would be clear to rename "orphan_rows" as
"deleted_untracked_rows".

> >
> > In the above example, row a (without change tracking) will be put to
> > deleted_untracked_rows, but no need to move to "track_list" later.
> >
> > In ovsdb_idl_row_destroy_postprocess, we can destroy
> > deleted_untracked_rows, and no need to touch "track_list".
> >
> > Would this be more clear?
>
> The problem is with deleted rows that are part of tables with change
> tracking enabled.  Taking the example from the commit log:
>
> Considering two DB rows, 'a' from table A and 'b' from table B (with
> column 'ref_a' a reference to table A), both tables with change tracking
> enabled:
> a = {A._uuid=<U1>}
> b = {B._uuid=<U2>, B.ref_a=<U1>}
>
> If row 'a' and row 'b' are deleted in the same update then the IDL user
> should be notified of the following events.
> - for table A:
>   - deleted records: a = {A._uuid=<U1>}
> - for table B:
>   - deleted records: b = {B._uuid=<U2>, B.ref_a=<U1>}
>
> To achieve this we need to make sure all "delete" updates are processed
> before reparsing any backrefs.  This allows to check in
> ovsdb_idl_row_reparse_backrefs() if an arc's source (B in the example
> above) is also deleted in the current run, and skip unparsing B.ref_a so
> that that the IDL user (e.g., ovn-controller) can still access it.
>

In fact I didn't see the check you mentioned above. What's in the code is:
+        /* Skip reparsing deleted refs.  If ref's table has change tracking
+         * enabled, the users need the original information (and that will
be
+         * cleaned up by ovsdb_idl_track_clear()).  Otherwise, 'row' is
+         * already orphan and is not accessible to the IDL user and will be
+         * cleaned up at the end of the current IDL run.
+         */
+        if (ovsdb_idl_row_get_seqno(row, OVSDB_IDL_CHANGE_DELETE)) {
+            continue;
+        }
+

1) "row" is "a" in the above example, and "ref" is "b". It seems you should
check if "ref" is deleted?

2) The comments says "if ref's table has change tracking enabled", but in
fact the above code didn't check if change tracking is enabled or not.
Should we check it?

On the other hand, I understand the problem you are trying to solve, but I
don't see why it is related to my comment regarding clearly distinguishing
deleted_untracked_rows v.s. track_list.

> >
> >>>>
> >>>> What do you think?
> >>>>
> >>>
> >>> I took a second look at the patch, and it seems when tracking is not
> >>> enabled it would result in the rows in "orphan_rows" leaked, because
in
> >>> ovsdb_idl_process_orphans() the rows are moved to "track_list" ONLY IF
> >>> tracking is enabled for the table. If I am correct, would the above
> >>> suggested refactoring help avoiding such kinds of problems?
> >>
> >> Actually, there's no leak AFAICT.  ovsdb_idl_process_orphans() moves
> >> rows to 'track_list' in two scenarios:
> >> a. they're not referenced anymore (ovs_list_is_empty(&row->dst_arcs))
> >> b. they're in a table with change tracking enabled.
> >>
> >> If the table doesn't have change tracking enabled and the row is not
> >> referenced anymore it will be added to 'track_list' and will be cleaned
> >> up properly by ovsdb_idl_row_destroy_postprocess() (case "a" above).
> >>
> >> If the table doesn't have change tracking enabled but the row (let's
> >> call it 'R1') is still referenced (i.e., it became "orphan") then we
> >> shouldn't delete it yet.  When, in the future, the rows referring to
> >> this orphan row are deleted/updated (let's call them 'R2'),
> >> ovsdb_idl_row_clear_arcs(R2, true) is called and the orphan row R1 is
> >> now properly deleted.
> >>
> >
> > Ok, it seems no leak at all. I think I was confused by this check:
> > +        /* Orphan rows that are still unreferenced or are part of
tables
> > that
> > +         * have hange tracking enabled should be added to their table's
> > +         * 'track_list'.
> > +         */
> > +        if (ovs_list_is_empty(&row->dst_arcs)
> > +                || ovsdb_idl_track_is_set(row->table)) {
> > +            ovsdb_idl_row_track_change(row, OVSDB_IDL_CHANGE_DELETE);
> > +        }
> >
> > For R1, since "ovs_list_is_empty(&row->dst_arcs)" is true, it will be
moved
> > to track_list and finally get destroyed.
>
> R1 is still referenced by R2 at this point in the example above, that is
> there is an arc from R2 to R1.

Ok, my mistake.

>
> > But I think this "if" can be removed because deleted rows should always
go
> > to "track_list" so that they can be destroyed (assume not doing the
> > refactoring I suggested earlier).
>
> R1 should not be destroyed until R2 is destroyed too.  The complexity
> here exists because for tables with change tracking enabled, the current
> design is to always add deleted rows (even if they're still referenced)
> to 'track_list'.  But, even for those, ovsdb_idl_track_clear() will not
> destroy them as long as they're still referenced.

I am confused again. According to the above code:
1) If "row" is not referenced, ovs_list_is_empty(&row->dst_arcs) is true
and the row will be put to track_list.
2) If "row" is referenced
   2.1)  if the table is tracked, ovsdb_idl_track_is_set(row->table) is
true and the row will be put to track_list.
   2.2)  if the table is not tracked, the row will NOT be put to
track_list. The row will NOT be put to track_list.

If I understand correctly, in the current implementation (and in the
patch), everything in track_list will finally get released (either in
post_processing or in track_clear).
For 2.2), since the row is NOT put to track_list, how will it be released?

And for "But, even for those, ovsdb_idl_track_clear() will not destroy them
as long as they're still referenced.", do you mean it is possible that a
row is NOT destroyed even after the IDL iteration after user calling
track_clear()? If yes, why is that necessary? And how is it achieved (in
track_clear() it goes through each track_list and release all is_orphaned()
rows, so how could it not get released)?

>
> > Before this check, ovsdb_idl_row_reparse_backrefs(row) is called to make
> > sure the backrefs are cleared, so "ovs_list_is_empty(&row->dst_arcs)"
> > should always be true, right?
>
> No, even if there was an update to delete R1, R2 might still exist
> (e.g., monitor condition change causes R1 to be deleted from the local
> database view) and then ovsdb_idl_row_reparse_backrefs(R1) would create
> an arc from R2 to R1.
>
> I agree that the existing change tracking code is complex and can
> potentially be simplified and/or refactored.  However, that's a big
> effort and I think should not be in the scope of this bug fix.  I did
> have to refactor some things but I limited myself to only changing parts
> that are absolutely required to fix the bug.
>
Sure, I agree that refactoring shouldn't be required in this patch. I am
totally ok with it being merged without the refactoring I suggested, but
regardless of the refactoring, could you still have my above concerns
clarified? Thanks a lot for your patience!

Thanks,
Han

> For the time being, without this fix ovn-controller is crashing because
> of the inconsistent view of the database the IDL change tracking code
> creates.
>
> I think this should be fixed as soon as possible.  In my opinion all
> other refactoring and hardening should be done as follow up.
>
> Regards,
> Dumitru
>
> >
> >>>
> >>> Thanks,
> >>> Han
> >>>
> >>>>>
> >>>>> For the rest of the code... It seems fine, but it's really hard
> >>>>> to review and I still have a feeling that something might go
> >>>>> wrong (not more wrong than in current code, though), but I can't
> >>>>> think of any example.  So, it's, probably, OK.
> >>
> >> Thanks, I'm not sure how we can improve the stability/reliability of
> >> this code without major refactoring/rewrite.  Until then though, the
> >> fact that we break consistency of the IDL client's view of the database
> >> is something that needs to be addressed because it's taken for granted
> >> by ovn-controller in quite a few cases.
> >>
> >>>>>
> >>>>> Best regards, Ilya Maximets.
> >>
> >> Regards,
> >> Dumitru
> >>
> >
>


More information about the dev mailing list