[ovs-dev] [PATCH] ovsdb-idl: Fix *_is_new() IDL functions

Mark Gray mark.d.gray at redhat.com
Mon Oct 12 14:32:43 UTC 2020


On 10/10/2020 16:42, Mark Gray wrote:
> On 08/10/2020 19:57, Dumitru Ceara wrote:
>> On 10/2/20 1:13 AM, Han Zhou wrote:
>>>
>>>
>>> On Thu, Oct 1, 2020 at 2:30 AM Mark Gray <mark.d.gray at redhat.com
>>> <mailto:mark.d.gray at redhat.com>> wrote:
>>>>
>>>> On 30/09/2020 21:04, Han Zhou wrote:
>>>>> On Wed, Sep 30, 2020 at 2:21 AM Dumitru Ceara <dceara at redhat.com
>>> <mailto:dceara at redhat.com>> wrote:
>>>>>>
>>>>>> On 9/29/20 8:34 PM, Mark Gray wrote:
>>>>>>> Currently all functions of the type *_is_new() always return
>>>>>>> 'false'. This patch resolves this issue by using the
>>>>>>> 'OVSDB_IDL_CHANGE_INSERT' 'change_seqno' instead of the
>>>>>>> 'OVSDB_IDL_CHANGE_MODIFY' 'change_seqno' to determine if a row
>>>>>>> is new and by resetting the 'OVSDB_IDL_CHANGE_INSERT'
>>>>>>> 'change_seqno' on clear.
>>>>>>>
>>>>>>> Further to this, the code is also updated to match this
>>>>>>> behaviour:
>>>>>>>
>>>>>>> When a row is inserted, the 'OVSDB_IDL_CHANGE_INSERT'
>>>>>>> 'change_seqno' is updated to match the new database
>>>>>>> change_seqno. The 'OVSDB_IDL_CHANGE_MODIFY' 'change_seqno'
>>>>>>> is not set for inserted rows (only for updated rows).
>>>>>>>
>>>>>>> At the end of a run, ovsdb_idl_db_track_clear() should be
>>>>>>> called to clear all tracking information, this includes
>>>>>>> resetting all row 'change_seqno' to zero. This will ensure
>>>>>>> that subsequent runs will not see a previously 'new' row.
>>>>>>>
>>>>>>> add_tracked_change_for_references() is updated to only
>>>>>>> track rows that reference the current row.
>>>>>>>
>>>>>>> Signed-off-by: Mark Gray <mark.d.gray at redhat.com
>>> <mailto:mark.d.gray at redhat.com>>
>>>>>>> Suggested-by: Dumitru Ceara <dceara at redhat.com
>>> <mailto:dceara at redhat.com>>
>>>>>>> Reported-at: https://bugzilla.redhat.com/1883562
>>>>>>
>>>>>> Hi Mark,
>>>>>>
>>>>>> Thanks for working on this, it definitely looks like a nasty bug to me!
>>>>>>
>>>>>> We were lucky to not hit it in ovn-controller before. That's just
>>>>>> because all the "update" operations were handled there as "delete" +
>>>>>> "insert" but that is not mandatory and might change.>>
>>>>>
>>>>> Agree. Thanks for this critical finding! It was my bad - commit
>>> ca545a787ac
>>>>> introduced this bug, which was fixing another bug related to is_new. I
>>>>> should have added test cases to avoid the miss. Apologize.
>>>>
>>>> No worries, I will reference this commit in the commit message.
>>>>>
>>>>> For this fix, I don't understand why we need to check
>>>>> OVSDB_IDL_CHANGE_INSERT. I think the root cause is that in
>>>>> add_tracked_change_for_references() it updated the
>>> OVSDB_IDL_CHANGE_MODIFY
>>>>> seqno for the initial inserted *new* row, while it shouldn't. So
>>> should the
>>>>> fix only include the changes in add_tracked_change_for_references() and
>>>>> ovsdb_idl_row_change__()? Why do we need to change the implementation of
>>>>> _is_new() itself? Could you explain?
>>>>
>>>> In the case of a newly inserted row, the change_seqno triplet will still
>>>> be {X, 0, 0}. It will remain as such until the row is modified, at which
>>>> point it will become {X, Y, 0}. This means that the _*is_new() function
>>>> will continue to return true until the row is modified which is not
>>>> correct behaviour.
>>>
>>> Hmm, this is not a problem because if a row inserted in a previous run
>>> is not modified/deleted in the current run, it won't be added to
>>> tracking so no one would call _is_new() for the row. However, there will
>>> be a problem when a row is deleted (see below).
>>>
>>>> If we reset the change_seqno triplet on each run, it
>>>> will still not work. In the end, it is probably better to use the
>>>> OVSDB_IDL_CHANGE_INSERT element to track this.
>>>>
>>> Indeed, we should use OVSDB_IDL_CHANGE_INSERT to check _is_new() because
>>> if row is deleted without any modification after insertion, the
>>> OVSDB_IDL_CHANGE_MODIFY seqno would still be zero when it is being
>>> deleted. In that case calling the _is_new() function would return true
>>> for a deleted row, which is wrong. So, your change LGTM.
>>>
>>>> On a seperate point, I don't expect add_tracked_change_for_references()
>>>> will ever get called on insert as nothing will be referenced at that
>>>> stage? Am I correct in my understanding?
>>>>
>>> Yes, I think so. More accurately, it will get called once on the
>>> inserted row but will not trigger any recursive calls.
>>>
>>>>>>> ---
>>>>>>>  lib/ovsdb-idl.c     | 39 +++++++++++++++++++++++++++------------
>>>>>>>  ovsdb/ovsdb-idlc.in <http://ovsdb-idlc.in> |  2 +-
>>>>>>>  2 files changed, 28 insertions(+), 13 deletions(-)
>>>>>>>
>>>>>>> diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c
>>>>>>> index d8f221ca6..3cfd73871 100644
>>>>>>> --- a/lib/ovsdb-idl.c
>>>>>>> +++ b/lib/ovsdb-idl.c
>>>>>>> @@ -1959,6 +1959,11 @@ ovsdb_idl_db_track_clear(struct
>>> ovsdb_idl_db *db)
>>>>>>>                      free(row->updated);
>>>>>>>                      row->updated = NULL;
>>>>>>>                  }
>>>>>>> +
>>>>>>> +                row->change_seqno[OVSDB_IDL_CHANGE_INSERT] =
>>>>>>> +                    row->change_seqno[OVSDB_IDL_CHANGE_MODIFY] =
>>>>>>> +                    row->change_seqno[OVSDB_IDL_CHANGE_DELETE] = 0;
>>>>>>> +
>>>>>>>                  ovs_list_remove(&row->track_node);
>>>>>>>                  ovs_list_init(&row->track_node);
>>>>>>>                  if (ovsdb_idl_row_is_orphan(row) &&
>>>>> row->tracked_old_datum) {
>>>>>>> @@ -2684,24 +2689,27 @@ ovsdb_idl_process_update2(struct
>>>>> ovsdb_idl_table *table,
>>>>>>>      return OVSDB_IDL_UPDATE_DB_CHANGED;
>>>>>>>  }
>>>>>>>
>>>>>>> -/* Recursively add rows to tracked change lists for current row
>>>>>>> - * and the rows that reference this row. */
>>>>>>> +/* Recursively add rows to tracked change lists for all rows that
>>>>> reference
>>>>>>> +   'row'. */
>>>>>>>  static void
>>>>>>>  add_tracked_change_for_references(struct ovsdb_idl_row *row)
>>>>>>>  {
>>>>>>> -    if (ovs_list_is_empty(&row->track_node) &&
>>>>>>> -            ovsdb_idl_track_is_set(row->table)) {
>>>>>>> -        ovs_list_push_back(&row->table->track_list,
>>>>>>> -                           &row->track_node);
>>>>>>> -        row->change_seqno[OVSDB_IDL_CHANGE_MODIFY]
>>>>>>> -            = row->table->change_seqno[OVSDB_IDL_CHANGE_MODIFY]
>>>>>>> -            = row->table->db->change_seqno + 1;
>>>>>>> -
>>>>>>>          const struct ovsdb_idl_arc *arc;
>>>>>>>          LIST_FOR_EACH (arc, dst_node, &row->dst_arcs) {
>>>>>>> -            add_tracked_change_for_references(arc->src);
>>>>>>> +            struct ovsdb_idl_row *ref = arc->src;
>>>>>>> +
>>>>>>> +            if (ovs_list_is_empty(&ref->track_node) &&
>>>>>>> +                ovsdb_idl_track_is_set(ref->table)) {
>>>>>>> +                    ovs_list_push_back(&ref->table->track_list,
>>>>>>> +                                       &ref->track_node);
>>>>>>> +
>>>>>>> +                ref->change_seqno[OVSDB_IDL_CHANGE_MODIFY]
>>>>>>> +                    =
>>> ref->table->change_seqno[OVSDB_IDL_CHANGE_MODIFY]
>>>>>>> +                    = ref->table->db->change_seqno + 1;
>>>>>>> +
>>>>>>> +                add_tracked_change_for_references(ref);
>>>>>>> +            }
>>>>>>>          }
>>>>>>> -    }
>>>>>>>  }
>>>>>>>
>>>>>>>
>>>>>>> @@ -2767,7 +2775,14 @@ ovsdb_idl_row_change__(struct ovsdb_idl_row
>>>>> *row, const struct json *row_json,
>>>>>>>                      row->change_seqno[change]
>>>>>>>                          = row->table->change_seqno[change]
>>>>>>>                          = row->table->db->change_seqno + 1;
>>>>>>> +
>>>>>>>                      if (table->modes[column_idx] & OVSDB_IDL_TRACK) {
>>>>>>> +                        if (ovs_list_is_empty(&row->track_node) &&
>>>>>>> +                            ovsdb_idl_track_is_set(row->table)) {
>>>>>>> +                          
>>>  ovs_list_push_back(&row->table->track_list,
>>>>>>> +                                               &row->track_node);
>>>>>>> +                        }
>>>>>>> +
>>>>>>>                          add_tracked_change_for_references(row);
>>>>>>>                          if (!row->updated) {
>>>>>>>                              row->updated =
>>>>> bitmap_allocate(class->n_columns);
>>>>>>> diff --git a/ovsdb/ovsdb-idlc.in <http://ovsdb-idlc.in>
>>> b/ovsdb/ovsdb-idlc.in <http://ovsdb-idlc.in>
>>>>>>> index 698fe25f3..89c3eb682 100755
>>>>>>> --- a/ovsdb/ovsdb-idlc.in <http://ovsdb-idlc.in>
>>>>>>> +++ b/ovsdb/ovsdb-idlc.in <http://ovsdb-idlc.in>
>>>>>>> @@ -282,7 +282,7 @@ const struct %(s)s
>>>>> *%(s)s_table_track_get_first(const struct %(s)s_table *);
>>>>>>>  /* Returns true if 'row' was inserted since the last change tracking
>>>>> reset. */
>>>>>>>  static inline bool %(s)s_is_new(const struct %(s)s *row)
>>>>>>>  {
>>>>>>> -    return %(s)s_row_get_seqno(row, OVSDB_IDL_CHANGE_MODIFY) == 0;
>>>>>>> +    return %(s)s_row_get_seqno(row, OVSDB_IDL_CHANGE_INSERT) > 0;
>>>>>>
>>>>>> This causes some issues with records created by the IDL client (e.g.,
>>>>>> ovn-controller). Those are created with ovsdb_idl_txn_insert() and we
>>>>>> don't set the OVSDB_IDL_CHANGE_INSERT change_seqno in that case.
>>>>>>
>>>>>> Essentially *_is_new() will never return true for rows inserted
>>> locally.
>>>>>>
>>>>> Hi Dumitru, I have a different understanding here. I think this is not a
>>>>> problem. Whatever changes made locally by the IDL will be discarded when
>>>>> committing to server and then rely on the notifications from the
>>> server to
>>>>> finally update the IDL data, which updates the change_seqno accordingly.
>>>>
>>>> Dumitru appears to be right. There is a failure on one ovn UT with this
>>>> change. It gets resolved by the fix that Dumitru suggests.
>>>>
>>> I am confused here. Which UT fails and what's the suggested fix? If the
>>> row is added locally, I don't expect it to be tracked at all in the
>>> current iteration (while it will be in the iteration after the
>>> transaction commit). Did I miss anything?
>>>
>>> Thanks,
>>> Han
>>>
>>
>> Hi Han,
>>
>> One way to see the issue is:
>>
>> make sandbox
>> ovn-sbctl destroy chassis .
>>
>> At this point ovn-controller continuously tries to transact insert
>> Chassis_Private even though Chassis_Private was only updated:
>>
>> 2020-10-08T18:42:27.784Z|00024|jsonrpc|DBG|unix:sb1.ovsdb: send request,
>> method="transact",
>> params=["OVN_Southbound",{"uuid-name":"row3e499288_5f8c_4859_b82a_adcc474eb114","row":{"name":"chassis-1","chassis":["named-uuid","row6534b1df_bcbb_4b91_8eb1_5bc823f1673e"]},"op":"insert","table":"Chassis_Private"},{"uuid-name":"row9d86c159_0be5_4d13_a8d6_a3414a855511","row":{"ip":"127.0.0.1","options":["map",[["csum","true"]]],"chassis_name":"chassis-1","type":"geneve"},"op":"insert","table":"Encap"},{"uuid-name":"row6534b1df_bcbb_4b91_8eb1_5bc823f1673e","row":{"name":"chassis-1","hostname":"sandbox","other_config":["map",[["datapath-type",""],["iface-types","dummy,dummy-internal,dummy-pmd,erspan,geneve,gre,gtpu,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan"],["is-interconn","false"],["ovn-bridge-mappings",""],["ovn-chassis-mac-mappings",""],["ovn-cms-options",""],["ovn-enable-lflow-cache","true"],["ovn-monitor-all","false"]]],"encaps":["named-uuid","row9d86c159_0be5_4d13_a8d6_a3414a855511"],"external_ids":["map",[["datapath-type",""],["iface-types","dummy,dummy-internal,dummy-pmd,erspan,geneve,gre,gtpu,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan"],["is-interconn","false"],["ovn-bridge-mappings",""],["ovn-chassis-mac-mappings",""],["ovn-cms-options",""],["ovn-enable-lflow-cache","true"],["ovn-monitor-all","false"]]]},"op":"insert","table":"Chassis"},{"comment":"ovn-controller:
>> registering chassis 'chassis-1'","op":"comment"}], id=14
>>
>> While, if I set OVSDB_IDL_CHANGE_INSERT in ovsdb_idl_txn_insert() and
>> redo the same steps as above, ovn-controller doesn't try to insert
>> Chassis_Private but instead it updates the record.
>>
>> The only use of _is_new() for potentially locally created records is
>> here afaik:
>> https://github.com/ovn-org/ovn/blob/9d2e8d32fb9865513b70408a665184a67564390d/controller/ovn-controller.c#L184
>>
>> I didn't dig further though.
> 
> Sorry for the delay. The following test fails:
> 
>  90: ovn -- ensure one gw controller restart in HA doesn't bounce the
> master FAILED (ovs-macros.at:220)
> 
> <snip>
> ovn.at:12213: waiting until test 1 = `ovn-sbctl --bare --columns=_uuid
> find Chassis name=gw2 | wc -l`...
> ovn.at:12213: wait failed after 30 seconds
> ./ovs-macros.at:220: hard failure
> 90. ovn.at:12139: 90. ovn -- ensure one gw controller restart in HA
> doesn't bounce the master (ovn.at:12139): FAILED (ovs-macros.at:220)
> <snip>
> 
> This is similar failure as the repoducer highlighted by Dumitru above.
> It is resolved by:
> 
> +++ b/lib/ovsdb-idl.c
> @@ -4858,6 +4858,11 @@ ovsdb_idl_txn_insert(struct ovsdb_idl_txn *txn,
>      hmap_insert(&row->table->rows, &row->hmap_node, uuid_hash(&row->uuid));
>      hmap_insert(&txn->txn_rows, &row->txn_node, uuid_hash(&row->uuid));
>      ovsdb_idl_add_to_indexes(row);
> +
> +    row->change_seqno[OVSDB_IDL_CHANGE_INSERT]
> +        = row->table->change_seqno[OVSDB_IDL_CHANGE_INSERT]
> +        = row->table->db->change_seqno + 1;
> +
> 
> 
>>
>> Regards,
>> Dumitru
>>
> 

v2 published at
https://mail.openvswitch.org/pipermail/ovs-dev/2020-October/375962.html


More information about the dev mailing list