[ovs-dev] [PATCH] ovsdb-idl: Fix *_is_new() IDL functions

Mark Gray mark.d.gray at redhat.com
Sat Oct 10 15:42:53 UTC 2020


On 08/10/2020 19:57, Dumitru Ceara wrote:
> On 10/2/20 1:13 AM, Han Zhou wrote:
>>
>>
>> On Thu, Oct 1, 2020 at 2:30 AM Mark Gray <mark.d.gray at redhat.com
>> <mailto:mark.d.gray at redhat.com>> wrote:
>>>
>>> On 30/09/2020 21:04, Han Zhou wrote:
>>>> On Wed, Sep 30, 2020 at 2:21 AM Dumitru Ceara <dceara at redhat.com
>> <mailto:dceara at redhat.com>> wrote:
>>>>>
>>>>> On 9/29/20 8:34 PM, Mark Gray wrote:
>>>>>> Currently all functions of the type *_is_new() always return
>>>>>> 'false'. This patch resolves this issue by using the
>>>>>> 'OVSDB_IDL_CHANGE_INSERT' 'change_seqno' instead of the
>>>>>> 'OVSDB_IDL_CHANGE_MODIFY' 'change_seqno' to determine if a row
>>>>>> is new and by resetting the 'OVSDB_IDL_CHANGE_INSERT'
>>>>>> 'change_seqno' on clear.
>>>>>>
>>>>>> Further to this, the code is also updated to match this
>>>>>> behaviour:
>>>>>>
>>>>>> When a row is inserted, the 'OVSDB_IDL_CHANGE_INSERT'
>>>>>> 'change_seqno' is updated to match the new database
>>>>>> change_seqno. The 'OVSDB_IDL_CHANGE_MODIFY' 'change_seqno'
>>>>>> is not set for inserted rows (only for updated rows).
>>>>>>
>>>>>> At the end of a run, ovsdb_idl_db_track_clear() should be
>>>>>> called to clear all tracking information, this includes
>>>>>> resetting all row 'change_seqno' to zero. This will ensure
>>>>>> that subsequent runs will not see a previously 'new' row.
>>>>>>
>>>>>> add_tracked_change_for_references() is updated to only
>>>>>> track rows that reference the current row.
>>>>>>
>>>>>> Signed-off-by: Mark Gray <mark.d.gray at redhat.com
>> <mailto:mark.d.gray at redhat.com>>
>>>>>> Suggested-by: Dumitru Ceara <dceara at redhat.com
>> <mailto:dceara at redhat.com>>
>>>>>> Reported-at: https://bugzilla.redhat.com/1883562
>>>>>
>>>>> Hi Mark,
>>>>>
>>>>> Thanks for working on this, it definitely looks like a nasty bug to me!
>>>>>
>>>>> We were lucky to not hit it in ovn-controller before. That's just
>>>>> because all the "update" operations were handled there as "delete" +
>>>>> "insert" but that is not mandatory and might change.>>
>>>>
>>>> Agree. Thanks for this critical finding! It was my bad - commit
>> ca545a787ac
>>>> introduced this bug, which was fixing another bug related to is_new. I
>>>> should have added test cases to avoid the miss. Apologize.
>>>
>>> No worries, I will reference this commit in the commit message.
>>>>
>>>> For this fix, I don't understand why we need to check
>>>> OVSDB_IDL_CHANGE_INSERT. I think the root cause is that in
>>>> add_tracked_change_for_references() it updated the
>> OVSDB_IDL_CHANGE_MODIFY
>>>> seqno for the initial inserted *new* row, while it shouldn't. So
>> should the
>>>> fix only include the changes in add_tracked_change_for_references() and
>>>> ovsdb_idl_row_change__()? Why do we need to change the implementation of
>>>> _is_new() itself? Could you explain?
>>>
>>> In the case of a newly inserted row, the change_seqno triplet will still
>>> be {X, 0, 0}. It will remain as such until the row is modified, at which
>>> point it will become {X, Y, 0}. This means that the _*is_new() function
>>> will continue to return true until the row is modified which is not
>>> correct behaviour.
>>
>> Hmm, this is not a problem because if a row inserted in a previous run
>> is not modified/deleted in the current run, it won't be added to
>> tracking so no one would call _is_new() for the row. However, there will
>> be a problem when a row is deleted (see below).
>>
>>> If we reset the change_seqno triplet on each run, it
>>> will still not work. In the end, it is probably better to use the
>>> OVSDB_IDL_CHANGE_INSERT element to track this.
>>>
>> Indeed, we should use OVSDB_IDL_CHANGE_INSERT to check _is_new() because
>> if row is deleted without any modification after insertion, the
>> OVSDB_IDL_CHANGE_MODIFY seqno would still be zero when it is being
>> deleted. In that case calling the _is_new() function would return true
>> for a deleted row, which is wrong. So, your change LGTM.
>>
>>> On a seperate point, I don't expect add_tracked_change_for_references()
>>> will ever get called on insert as nothing will be referenced at that
>>> stage? Am I correct in my understanding?
>>>
>> Yes, I think so. More accurately, it will get called once on the
>> inserted row but will not trigger any recursive calls.
>>
>>>>>> ---
>>>>>>  lib/ovsdb-idl.c     | 39 +++++++++++++++++++++++++++------------
>>>>>>  ovsdb/ovsdb-idlc.in <http://ovsdb-idlc.in> |  2 +-
>>>>>>  2 files changed, 28 insertions(+), 13 deletions(-)
>>>>>>
>>>>>> diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c
>>>>>> index d8f221ca6..3cfd73871 100644
>>>>>> --- a/lib/ovsdb-idl.c
>>>>>> +++ b/lib/ovsdb-idl.c
>>>>>> @@ -1959,6 +1959,11 @@ ovsdb_idl_db_track_clear(struct
>> ovsdb_idl_db *db)
>>>>>>                      free(row->updated);
>>>>>>                      row->updated = NULL;
>>>>>>                  }
>>>>>> +
>>>>>> +                row->change_seqno[OVSDB_IDL_CHANGE_INSERT] =
>>>>>> +                    row->change_seqno[OVSDB_IDL_CHANGE_MODIFY] =
>>>>>> +                    row->change_seqno[OVSDB_IDL_CHANGE_DELETE] = 0;
>>>>>> +
>>>>>>                  ovs_list_remove(&row->track_node);
>>>>>>                  ovs_list_init(&row->track_node);
>>>>>>                  if (ovsdb_idl_row_is_orphan(row) &&
>>>> row->tracked_old_datum) {
>>>>>> @@ -2684,24 +2689,27 @@ ovsdb_idl_process_update2(struct
>>>> ovsdb_idl_table *table,
>>>>>>      return OVSDB_IDL_UPDATE_DB_CHANGED;
>>>>>>  }
>>>>>>
>>>>>> -/* Recursively add rows to tracked change lists for current row
>>>>>> - * and the rows that reference this row. */
>>>>>> +/* Recursively add rows to tracked change lists for all rows that
>>>> reference
>>>>>> +   'row'. */
>>>>>>  static void
>>>>>>  add_tracked_change_for_references(struct ovsdb_idl_row *row)
>>>>>>  {
>>>>>> -    if (ovs_list_is_empty(&row->track_node) &&
>>>>>> -            ovsdb_idl_track_is_set(row->table)) {
>>>>>> -        ovs_list_push_back(&row->table->track_list,
>>>>>> -                           &row->track_node);
>>>>>> -        row->change_seqno[OVSDB_IDL_CHANGE_MODIFY]
>>>>>> -            = row->table->change_seqno[OVSDB_IDL_CHANGE_MODIFY]
>>>>>> -            = row->table->db->change_seqno + 1;
>>>>>> -
>>>>>>          const struct ovsdb_idl_arc *arc;
>>>>>>          LIST_FOR_EACH (arc, dst_node, &row->dst_arcs) {
>>>>>> -            add_tracked_change_for_references(arc->src);
>>>>>> +            struct ovsdb_idl_row *ref = arc->src;
>>>>>> +
>>>>>> +            if (ovs_list_is_empty(&ref->track_node) &&
>>>>>> +                ovsdb_idl_track_is_set(ref->table)) {
>>>>>> +                    ovs_list_push_back(&ref->table->track_list,
>>>>>> +                                       &ref->track_node);
>>>>>> +
>>>>>> +                ref->change_seqno[OVSDB_IDL_CHANGE_MODIFY]
>>>>>> +                    =
>> ref->table->change_seqno[OVSDB_IDL_CHANGE_MODIFY]
>>>>>> +                    = ref->table->db->change_seqno + 1;
>>>>>> +
>>>>>> +                add_tracked_change_for_references(ref);
>>>>>> +            }
>>>>>>          }
>>>>>> -    }
>>>>>>  }
>>>>>>
>>>>>>
>>>>>> @@ -2767,7 +2775,14 @@ ovsdb_idl_row_change__(struct ovsdb_idl_row
>>>> *row, const struct json *row_json,
>>>>>>                      row->change_seqno[change]
>>>>>>                          = row->table->change_seqno[change]
>>>>>>                          = row->table->db->change_seqno + 1;
>>>>>> +
>>>>>>                      if (table->modes[column_idx] & OVSDB_IDL_TRACK) {
>>>>>> +                        if (ovs_list_is_empty(&row->track_node) &&
>>>>>> +                            ovsdb_idl_track_is_set(row->table)) {
>>>>>> +                          
>>  ovs_list_push_back(&row->table->track_list,
>>>>>> +                                               &row->track_node);
>>>>>> +                        }
>>>>>> +
>>>>>>                          add_tracked_change_for_references(row);
>>>>>>                          if (!row->updated) {
>>>>>>                              row->updated =
>>>> bitmap_allocate(class->n_columns);
>>>>>> diff --git a/ovsdb/ovsdb-idlc.in <http://ovsdb-idlc.in>
>> b/ovsdb/ovsdb-idlc.in <http://ovsdb-idlc.in>
>>>>>> index 698fe25f3..89c3eb682 100755
>>>>>> --- a/ovsdb/ovsdb-idlc.in <http://ovsdb-idlc.in>
>>>>>> +++ b/ovsdb/ovsdb-idlc.in <http://ovsdb-idlc.in>
>>>>>> @@ -282,7 +282,7 @@ const struct %(s)s
>>>> *%(s)s_table_track_get_first(const struct %(s)s_table *);
>>>>>>  /* Returns true if 'row' was inserted since the last change tracking
>>>> reset. */
>>>>>>  static inline bool %(s)s_is_new(const struct %(s)s *row)
>>>>>>  {
>>>>>> -    return %(s)s_row_get_seqno(row, OVSDB_IDL_CHANGE_MODIFY) == 0;
>>>>>> +    return %(s)s_row_get_seqno(row, OVSDB_IDL_CHANGE_INSERT) > 0;
>>>>>
>>>>> This causes some issues with records created by the IDL client (e.g.,
>>>>> ovn-controller). Those are created with ovsdb_idl_txn_insert() and we
>>>>> don't set the OVSDB_IDL_CHANGE_INSERT change_seqno in that case.
>>>>>
>>>>> Essentially *_is_new() will never return true for rows inserted
>> locally.
>>>>>
>>>> Hi Dumitru, I have a different understanding here. I think this is not a
>>>> problem. Whatever changes made locally by the IDL will be discarded when
>>>> committing to server and then rely on the notifications from the
>> server to
>>>> finally update the IDL data, which updates the change_seqno accordingly.
>>>
>>> Dumitru appears to be right. There is a failure on one ovn UT with this
>>> change. It gets resolved by the fix that Dumitru suggests.
>>>
>> I am confused here. Which UT fails and what's the suggested fix? If the
>> row is added locally, I don't expect it to be tracked at all in the
>> current iteration (while it will be in the iteration after the
>> transaction commit). Did I miss anything?
>>
>> Thanks,
>> Han
>>
> 
> Hi Han,
> 
> One way to see the issue is:
> 
> make sandbox
> ovn-sbctl destroy chassis .
> 
> At this point ovn-controller continuously tries to transact insert
> Chassis_Private even though Chassis_Private was only updated:
> 
> 2020-10-08T18:42:27.784Z|00024|jsonrpc|DBG|unix:sb1.ovsdb: send request,
> method="transact",
> params=["OVN_Southbound",{"uuid-name":"row3e499288_5f8c_4859_b82a_adcc474eb114","row":{"name":"chassis-1","chassis":["named-uuid","row6534b1df_bcbb_4b91_8eb1_5bc823f1673e"]},"op":"insert","table":"Chassis_Private"},{"uuid-name":"row9d86c159_0be5_4d13_a8d6_a3414a855511","row":{"ip":"127.0.0.1","options":["map",[["csum","true"]]],"chassis_name":"chassis-1","type":"geneve"},"op":"insert","table":"Encap"},{"uuid-name":"row6534b1df_bcbb_4b91_8eb1_5bc823f1673e","row":{"name":"chassis-1","hostname":"sandbox","other_config":["map",[["datapath-type",""],["iface-types","dummy,dummy-internal,dummy-pmd,erspan,geneve,gre,gtpu,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan"],["is-interconn","false"],["ovn-bridge-mappings",""],["ovn-chassis-mac-mappings",""],["ovn-cms-options",""],["ovn-enable-lflow-cache","true"],["ovn-monitor-all","false"]]],"encaps":["named-uuid","row9d86c159_0be5_4d13_a8d6_a3414a855511"],"external_ids":["map",[["datapath-type",""],["iface-types","dummy,dummy-internal,dummy-pmd,erspan,geneve,gre,gtpu,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan"],["is-interconn","false"],["ovn-bridge-mappings",""],["ovn-chassis-mac-mappings",""],["ovn-cms-options",""],["ovn-enable-lflow-cache","true"],["ovn-monitor-all","false"]]]},"op":"insert","table":"Chassis"},{"comment":"ovn-controller:
> registering chassis 'chassis-1'","op":"comment"}], id=14
> 
> While, if I set OVSDB_IDL_CHANGE_INSERT in ovsdb_idl_txn_insert() and
> redo the same steps as above, ovn-controller doesn't try to insert
> Chassis_Private but instead it updates the record.
> 
> The only use of _is_new() for potentially locally created records is
> here afaik:
> https://github.com/ovn-org/ovn/blob/9d2e8d32fb9865513b70408a665184a67564390d/controller/ovn-controller.c#L184
> 
> I didn't dig further though.

Sorry for the delay. The following test fails:

 90: ovn -- ensure one gw controller restart in HA doesn't bounce the
master FAILED (ovs-macros.at:220)

<snip>
ovn.at:12213: waiting until test 1 = `ovn-sbctl --bare --columns=_uuid
find Chassis name=gw2 | wc -l`...
ovn.at:12213: wait failed after 30 seconds
./ovs-macros.at:220: hard failure
90. ovn.at:12139: 90. ovn -- ensure one gw controller restart in HA
doesn't bounce the master (ovn.at:12139): FAILED (ovs-macros.at:220)
<snip>

This is similar failure as the repoducer highlighted by Dumitru above.
It is resolved by:

+++ b/lib/ovsdb-idl.c
@@ -4858,6 +4858,11 @@ ovsdb_idl_txn_insert(struct ovsdb_idl_txn *txn,
     hmap_insert(&row->table->rows, &row->hmap_node, uuid_hash(&row->uuid));
     hmap_insert(&txn->txn_rows, &row->txn_node, uuid_hash(&row->uuid));
     ovsdb_idl_add_to_indexes(row);
+
+    row->change_seqno[OVSDB_IDL_CHANGE_INSERT]
+        = row->table->change_seqno[OVSDB_IDL_CHANGE_INSERT]
+        = row->table->db->change_seqno + 1;
+


> 
> Regards,
> Dumitru
> 



More information about the dev mailing list