[ovs-dev] [PATCH] ovsdb-idl: Fix *_is_new() IDL functions

Dumitru Ceara dceara at redhat.com
Tue Oct 20 09:23:42 UTC 2020

On 10/20/20 10:28 AM, Mark Gray wrote:
> On 19/10/2020 22:21, Han Zhou wrote:
>>>>> If we make the change so that _is_new() can be used to evaluate
>> against local (uncommitted) changes, it may work but can be confusing. If a
>> row R is inserted by current transaction locally, _is_new(R) returns true,
>> but when the transaction is committed the local copy will be discarded, and
>> if the commit is successful the IDL will finally receive a server
>> notification and the _is_new(R) evaluation will return true again. I think
>> this could cause problems for the code that relies on _is_new(R). Moreover,
>> if we update the OVSDB_IDL_CHANGE_INSERT seq for locally inserted rows, we
>> may need to do the same for locally modified/deleted rows, to be
>> consistent, which needs more careful considerations for the implications of
>> the change.
>>>>>>>>> Because of the above concerns, I'd rather suggest to stick to the
>> original design (maybe document it more clearly) that the functions used
>> only for tracked changes.
> Thanks for the feedback.
> I think I agree with your points here and Ilya's below. It is something
> to keep in mind as we may need this type of functionality in the future.
> However, we should be able to resolve all our issues without it at this
> stage.
> I will refactor this patch to remove the change that allows the use of
> _is_new() on potentially locally inserted records but I will wait until
> Dumitru implements the change to disallow changing the system-id
> on-the-fly. Dumitru, if you need any help here, please let me know.

Hi Mark,

I sent a patch to remove the need for sbrec_chassis_is_new() in ovn-controller:


I also tested it with the "ovsdb-idl: Fix *_is_new() IDL functions" patch
applied to OVS and OVN tests pass now.

> When this is all done, I will then resubmit the change in OVN which I
> originally started on which was fixing I-P for changes in
> requested-tnl-key!
>>>> One even bigger concern from my side is what process should not take
>> any actions
>>>> until ovsdb-server accepts the change.  Otherwise it might be very
>> tricky to
>>>> handle rejected transactions.  Even if it might be not a likely event
>> in OVN case.
>> This is a valid concern, which I've thought about before. It seems OVSDB
>> IDL was supposed to be used in a declarative way (ideally), i.e. level
>> triggered instead of edge triggered. The IDL client typically runs in a
>> loop that takes current status of the IDL as input and computes a desired
>> output within the transaction and commits back to the server. This way, it
>> doesn't matter if a DB change is made in current transaction.
>> However, there are use cases when we do need to handle changes (edge
>> triggered), typically, in ovn-controller incremental processing.
>> Fundamental question is, how to handle the OVSDB changes made in current
>> I-P iteration. Fortunately, ovn-controller doesn't really modify too much
>> data in OVSDB (SB and local conf DB). For SB DB, they are only chassis
>> (which we don't include in incremental processing), MAC_binding and
>> Port_binding, and I don't see any side-effect yet, because those local
>> changes are not handled and they will be handled in the next iteration when
>> updates received from SB server, and handled through the change-tracking
>> (which _is_new is also used to evaluate the type of change for the tracked
>> records). For OVS_IDL (local conf DB), the first version of I-P doesn't do
>> any incremental processing (always recompute), but the recent version does
>> some I-P for interface changes. I think it is similar to the SB I-P, since
>> we don't really handle the data changed by current iteration (maybe need
>> more careful check for exceptions).
>> Other than ovn-controller I-P, I don't see any use case where we have to
>> rely on the edge-trigger yet. (For the current only use case for the
>> chassis regarding chassis-private monitoring, please see below)
>>> Actually, the single use of _is_new() on potentially locally inserted
>> records
>>> in ovn-controller is exactly to avoid taking actions on records that have
>> just
>>> been created locally without ovsdb-server accepting the transaction [0]:
>> https://github.com/ovn-org/ovn/blob/master/controller/ovn-controller.c#L184
>>> We avoid using chassis->header_.uuid as a clause in the monitor condition
>> if
>>> chassis is not yet committed to the DB.
>> I think using the chassis name in monitor condition should have avoided the
>> problem (if not considering supporting the system-id change on-the-fly).

Hi Han,

You're right, the patch I mentioned above fixes this.  We still need a change
to completely remove support for changing system-id on-the-fly and to document
how an operator can safely change system-id.  It's still on my list but I
didn't get to it yet.


>>>>> For the only place the _is_new() currently used for evaluating
>> non-tracked change [0], it was introduced in [1], which we have discussed
>> earlier that it is better to avoid the complex logic for handling system-id
>> change, but let the operator to follow the steps of completely deleting and
>> re-adding a chassis [2]. So I guess this kind of usage would not be
>> necessary at least for existing use cases.
>>> As mentioned above, even if we refactor [1] by implementing [2] I'm not
>> sure if
>>> we get rid of the need of checking sbrec_chassis_is_new(chassis) in
>>> update_sb_monitors().  I have to look again closely at that part.
>>> Regards,
>>> Dumitru
>>>>> P.S. Ilya found an old patch that I submitted a year ago [3] but never
>> merged for fixing the _is_new() problem (that I've forgotten myself), which
>> is similar to v1 but didn't use the "INSERT" seqno. I think the current
>> patch (v1) from Mark is better, so @Ilya Maximets <mailto:i.maximets at ovn.org>
>> please ignore my old patch.
>>>> OK.  I'll mark it as superseded.
>>>>> [0]
>> https://github.com/ovn-org/ovn/blob/9d2e8d32fb9865513b70408a665184a67564390d/controller/ovn-controller.c#L184
>>>>> [1] commit dce1af31b5 ("chassis: Fix chassis_private record updates
>> when the system-id changes.")
>>>>> [2]
>> https://mail.openvswitch.org/pipermail/ovs-dev/2020-September/374801.html
>>>>> [3]
>> https://patchwork.ozlabs.org/project/openvswitch/patch/1564100775-80657-1-git-send-email-hzhou8@ebay.com/
>>>>> Thanks,
>>>>> Han
>>>>>>>> I didn't dig further though.
>>>>>>> Sorry for the delay. The following test fails:
>>>>>>>  90: ovn -- ensure one gw controller restart in HA doesn't bounce the
>>>>>>> master FAILED (ovs-macros.at:220 <http://ovs-macros.at:220>)
>>>>>>> <snip>
>>>>>>> ovn.at:12213 <http://ovn.at:12213>: waiting until test 1 =
>> `ovn-sbctl --bare --columns=_uuid
>>>>>>> find Chassis name=gw2 | wc -l`...
>>>>>>> ovn.at:12213 <http://ovn.at:12213>: wait failed after 30 seconds
>>>>>>> ./ovs-macros.at:220 <http://ovs-macros.at:220>: hard failure
>>>>>>> 90. ovn.at:12139 <http://ovn.at:12139>: 90. ovn -- ensure one gw
>> controller restart in HA
>>>>>>> doesn't bounce the master (ovn.at:12139 <http://ovn.at:12139>):
>> FAILED (ovs-macros.at:220 <http://ovs-macros.at:220>)
>>>>>>> <snip>
>>>>>>> This is similar failure as the repoducer highlighted by Dumitru
>> above.
>>>>>>> It is resolved by:
>>>>>>> +++ b/lib/ovsdb-idl.c
>>>>>>> @@ -4858,6 +4858,11 @@ ovsdb_idl_txn_insert(struct ovsdb_idl_txn
>> *txn,
>>>>>>>      hmap_insert(&row->table->rows, &row->hmap_node,
>> uuid_hash(&row->uuid));
>>>>>>>      hmap_insert(&txn->txn_rows, &row->txn_node,
>> uuid_hash(&row->uuid));
>>>>>>>      ovsdb_idl_add_to_indexes(row);
>>>>>>> +
>>>>>>> +    row->change_seqno[OVSDB_IDL_CHANGE_INSERT]
>>>>>>> +        = row->table->change_seqno[OVSDB_IDL_CHANGE_INSERT]
>>>>>>> +        = row->table->db->change_seqno + 1;
>>>>>>> +
>>>>>>>> Regards,
>>>>>>>> Dumitru
>>>>>> v2 published at
>> https://mail.openvswitch.org/pipermail/ovs-dev/2020-October/375962.html

More information about the dev mailing list