[ovs-dev] [PATCH ovn 0/2] Make ovn-northd recover from NB/SB inconsistencies.

Dumitru Ceara dceara at redhat.com
Thu Apr 30 13:19:34 UTC 2020


On 4/30/20 1:07 PM, Dumitru Ceara wrote:
> On 4/30/20 12:06 PM, Numan Siddique wrote:
>> On Thu, Apr 30, 2020 at 1:41 PM Numan Siddique <numans at ovn.org> wrote:
>>
>>>
>>>
>>> On Thu, Apr 30, 2020 at 8:43 AM Han Zhou <hzhou at ovn.org> wrote:
>>>
>>>> On Wed, Apr 29, 2020 at 4:50 PM Han Zhou <hzhou at ovn.org> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Apr 29, 2020 at 2:45 PM Dumitru Ceara <dceara at redhat.com> wrote:
>>>>>>
>>>>>> On 4/29/20 9:57 PM, Han Zhou wrote:
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Apr 29, 2020 at 12:17 PM Numan Siddique <numans at ovn.org> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Apr 29, 2020 at 9:57 PM Dumitru Ceara <dceara at redhat.com> wrote:
>>>>>>>>>
>>>>>>>>> In some cases, if the NB/SB databases ovn-northd connects to are
>>>>>>>>> inconsistent, ovn-northd might generate transactions that fail
>>>>>>>>> continuously due to failed integrity checks on the SB database
>>>>>>>>> server.
>>>>>>>>>
>>>>>>>>> The first patch of the series addresses inconsistencies due to
>>>>>>>>> stale Datapath_Binding records in the SB database.
>>>>>>>>>
>>>>>>>>> The second patch of the series addresses inconsistencies due to
>>>>>>>>> stale tunnel_key values in various SB database table records.
>>>>>>>>>
>>>>>>>>> Reported-by: Dan Williams <dcbw at redhat.com>
>>>>>>>>> Reported-at: https://bugzilla.redhat.com/1828637
>>>>>>>>> Signed-off-by: Dumitru Ceara <dceara at redhat.com>
>>>>>>>>>
>>>>>>>>> Dumitru Ceara (2):
>>>>>>>>>       ovn-northd: Clear SB records depending on stale datapaths.
>>>>>>>>>       ovn-northd: Fix tunnel_key allocation for SB records.
>>>>>>>>
>>>>>>>>
>>>>>>>> Hi Dumitru,
>>>>>>>>
>>>>>>>> I did some testing in my ovn-fake-multinode setup. These are my
>>>>>>>> observations.
>>>>>>>>
>>>>>>>> I created a logical switch sw0 with 4 logical ports, so the next
>>>>>>>> tunnel key should be 5. I stopped ovn-northd and created a couple
>>>>>>>> of port_binding entries manually using "ovn-sbctl create
>>>>>>>> port_binding" with tunnel keys 5 and 6. I also created a logical
>>>>>>>> port in sw0. Then I started ovn-northd. ovn-northd deletes the
>>>>>>>> port_binding entries added by me and creates the port_binding entry
>>>>>>>> for the new logical port with tunnel_key=5, all in the same
>>>>>>>> transaction.
>>>>>>>>
>>>>>>>> I think ovn-northd syncs the south db based on the contents of
>>>>>>>> the north db.
>>>>>>>>
>>>>>>>> There's no harm in having your patches, but I'm not really sure if
>>>>>>>> they resolve the issue we have observed.
>>>>>>>>
>>>>>>>> Just to brief everyone about the issue we are seeing, we see the
>>>>>>>> below logs in ovn-northd:
>>>>>>>>
>>>>>>>> *******
>>>>>>>> 2020-04-16T23:02:33Z|00127|ovsdb_idl|WARN|transaction error:
>>>>>>>> {"details":"Transaction causes multiple rows in \"Port_Binding\"
>>>>>>>> table to have identical values (23eb9016-45f9-4158-be35-77b2713b9a0f
>>>>>>>> and 7) for index on columns \"datapath\" and \"tunnel_key\".  First
>>>>>>>> row, with UUID e4f11a7b-09b6-454f-a125-34cc4b144ef6, had the
>>>>>>>> following index values before the transaction:
>>>>>>>> bdbb436e-f98c-4651-9b80-6e8b95044560 and 7.  Second row, with UUID
>>>>>>>> d37cc3f1-8633-440f-b145-8222a0d4723c, existed in the database before
>>>>>>>> this transaction and was not modified by the
>>>>>>>> transaction.","error":"constraint violation"}
>>>>>>>> ******
>>>>>>>>
>>>>>>>> And because of this constraint violation error, ovn-northd cannot
>>>>>>>> write to the SB DB any further until it is restarted.
>>>>>>>>
>>>>>>>> In my opinion this can only happen if ovn-northd doesn't see the
>>>>>>>> port binding row (which is actually present in the DB) in its IDL
>>>>>>>> in-memory db.
>>>>>>>> I suspect this could have happened when ovn-northd reconnects to
>>>>>>>> the same master or connects to a new master and doesn't get the
>>>>>>>> proper updates.
>>>>>>>>
>>>>>>>> Maybe in this case, the IDL should request the db contents with
>>>>>>>> txn id = 0, so that it receives the complete dump of the db.
>>>>>>>>
>>>>>>>> Is it possible that ovn-northd sees a port binding with a tunnel
>>>>>>>> key 'x' and still allocates the same tunnel key 'x' to a new
>>>>>>>> logical port? If so, then your patches definitely make sense.
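To make the failure mode concrete, here is a minimal, hypothetical sketch of a
first-free-key allocator of the kind used for tunnel keys (the names below are
illustrative, not the actual ovn-northd code). It only consults the keys
visible in the caller's in-memory view, so a Port_Binding that exists in the
SB database but is missing from that view can have its key handed out again,
and the resulting insert then trips the (datapath, tunnel_key) unique index on
the server:

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical example: return the first key in [min, max] that does
     * not appear in 'used', i.e. in the caller's view of the database. */
    static bool
    key_in_use(const uint32_t *used, size_t n_used, uint32_t key)
    {
        for (size_t i = 0; i < n_used; i++) {
            if (used[i] == key) {
                return true;
            }
        }
        return false;
    }

    static uint32_t
    allocate_tunnel_key(const uint32_t *used, size_t n_used,
                        uint32_t min, uint32_t max)
    {
        for (uint32_t key = min; key <= max; key++) {
            if (!key_in_use(used, n_used, key)) {
                /* Looks free locally; a stale view can make this return a
                 * key that is already taken on the server. */
                return key;
            }
        }
        return 0;   /* No free key left in the range. */
    }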
>>>>>>>>
>>>>>>>> @Han - Have you seen this issue in your deployments? Do you have
>>>>>>>> any comments here?
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Numan
>>>>>>>>
>>>>>>> I never saw such an issue before, but I am not sure if this is
>>>>>>> possible due to bugs. Currently there is a bug fix under review:
>>>>>>>
>>>>>>> https://patchwork.ozlabs.org/project/openvswitch/patch/20200422183842.6303.99600.stgit@dceara.remote.csb/
>>>>>>>
>>>>>>> However, northd doesn't conditionally monitor the rows, so I am not
>>>>>>> sure if this is the root cause of the northd inconsistency issue
>>>>>>> discussed here.
>>>>>>>
>>>>>>> I don't think we should fix northd (or ovn-controller) to handle
>>>>>>> inconsistency of ovsdb. Consistency should be expected from ovsdb,
>>>>>>> and we should fix ovsdb/IDL when there is such a bug. Otherwise,
>>>>>>> there might be too many places to fix and even re-design. My
>>>>>>> understanding is that if the ovsdb IDL sees temporarily stale data,
>>>>>>> the current northd/ovn-controller logic should be able to correct
>>>>>>> itself once the data is up-to-date. Moreover, northd connects to the
>>>>>>> leader only in clustered mode, which avoids the possibility of
>>>>>>> northd seeing stale data (unless there is a bug).
>>>>>>>
>>>>>>> To summarize, I think we need to find the root cause of the
>>>>>>> inconsistency between the IDL and the server and fix it there,
>>>>>>> instead of changing ovn-northd to accommodate the inconsistency.
>>>>>>> (Consistency is the biggest advantage of OVSDB; it eases the
>>>>>>> application implementation.)
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Han
>>>>>>
>>>>>> Hi Han, Numan,
>>>>>>
>>>>>> I might have misused "inconsistency" in this context. What I meant was
>>>>>> more on the note of "discrepancies between NB and SB databases".
>>>>>>
>>>>>> This is a very simple reproducer for the port_binding tunnel_key
>>>>>> issue, no clustering of NB/SB dbs involved:
>>>>>>
>>>>>> # Create two logical switches with one port each.
>>>>>> $ ovn-nbctl ls-add ls1
>>>>>> $ ovn-nbctl lsp-add ls1 p1
>>>>>> $ ovn-nbctl ls-add ls2
>>>>>> $ ovn-nbctl lsp-add ls2 p2
>>>>>> $ ovn-nbctl --wait=sb sync
>>>>>>
>>>>>> # At this point PB for p1 has tunnel_key=1
>>>>>> # At this point PB for p2 has tunnel_key=2
>>>>>>
>>>>>> # Simulate the SB db going away (could be network
>>>>>> # issues or crash or some other event).
>>>>>> $ ovn-ctl stop_sb_ovsdb
>>>>>>
>>>>>> # CMS decides to move p2 from ls2 to ls1 and removes
>>>>>> # ls2 completely.
>>>>>> $ ovn-nbctl ls-del ls2
>>>>>> $ ovn-nbctl lsp-add ls1 p2
>>>>>>
>>>>>> # Simulate SB DB coming back online.
>>>>>> $ ovn-ctl start_sb_ovsdb
>>>>>>
>>>>>> At this point ovn-northd will try to set the datapath field in PB2 to
>>>>>> point to datapath_binding corresponding to ls1 but will *not* change
>>>>>> tunnel_key.
>>>>>>
>>>>>> We get:
>>>>>> 2020-04-29T20:52:41.327Z|00016|ovsdb_idl|WARN|transaction error:
>>>>>> {"details":"Transaction causes multiple rows in \"Port_Binding\" table
>>>>>> to have identical values (1b1c4b39-c045-448d-a532-8edbe5544e13 and 1)
>>>>>> for index on columns \"datapath\" and \"tunnel_key\".  First row, with
>>>>>> UUID e20219fa-ef67-49a2-81cd-739fa80d2bd4, existed in the database
>>>>>> before this transaction and was not modified by the transaction.
>>>>>> Second row, with UUID 50b0e240-8a4d-4e98-8e2f-97c94811d1b1, had the
>>>>>> following index values before the transaction:
>>>>>> a9b5959f-2f48-44e7-b6bb-f7148c28e4b5 and 1.","error":"constraint
>>>>>> violation"}
>>>>>>
>>>>>> And ovn-northd keeps retrying the same transaction at every iteration
>>>>>> from this point on and fails continuously.
>>>>>>
>>>>>> For the stale datapath issue (patch #1 in the series) a similar
>>>>>> reproducer is:
>>>>>>
>>>>>> # Create a logical router with one router port.
>>>>>> $ ovn-nbctl lr-add lr
>>>>>> $ ovn-nbctl lrp-add lr p 00:00:00:00:00:01 1.1.1.1/24
>>>>>>
>>>>>> # Simulate that a mac binding was created for the router
>>>>>> # port.
>>>>>> $ dp=$(ovn-sbctl --bare --columns _uuid list datapath .)
>>>>>> $ ovn-sbctl create mac_binding logical_port="p" ip="1.1.1.2" datapath="$dp"
>>>>>> $ ovn-nbctl --wait=sb sync
>>>>>>
>>>>>> # Simulate the SB db going away (could be network
>>>>>> # issues or crash or some other event).
>>>>>> $ ovn-ctl stop_sb_ovsdb
>>>>>>
>>>>>> # CMS decides to delete lr.
>>>>>> $ ovn-nbctl lr-del lr
>>>>>>
>>>>>> # CMS decides to readd lr and router port.
>>>>>> $ ovn-nbctl lr-add lr
>>>>>> $ ovn-nbctl lrp-add lr p 00:00:00:00:00:01 1.1.1.1/24
>>>>>>
>>>>>> # Simulate SB DB coming back online.
>>>>>> $ ovn-ctl start_sb_ovsdb
>>>>>>
>>>>>> At this point ovn-northd will try to clear the old datapath record
>>>>>> from the SB DB *without* destroying the mac binding record.
>>>>>>
>>>>>> We get:
>>>>>> 2020-04-29T21:41:42.145Z|00013|ovsdb_idl|WARN|transaction error:
>>>>>> {"details":"cannot delete Datapath_Binding row
>>>>>> de8d19d6-d67b-499b-8825-12d34ec60946 because of 1 remaining
>>>>>> reference(s)","error":"referential integrity violation"}
>>>>>>
>>>>>> I think both situations above should be addressed by ovn-northd and
>>>>>> stale datapath/mac_binding/port_binding/etc. records should be
>>>>>> purged. I guess there might be other scenarios that would trigger
>>>>>> constraint violations too, but this is what I found so far.
>>>>>>
>>>>>> If you agree, I can send a v2 and add tests for the two simplified
>>>>>> scenarios I mentioned above.
>>>>>>
>>>>>> What do you think?
>>>>
>>>
>>> Thanks, Dumitru, for the explanation. It would be great to add these
>>> tests in v2.
>>>
>>> Thanks
>>> Numan
>>>
>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Dumitru
>>>>>>
>>>>>
>>>>> Thanks Dumitru for explaining. Now I understand the problem. So it
>>>>> has nothing to do with OVSDB consistency itself, but just with
>>>>> northd's logic. I don't even need to stop the SB to reproduce it.
>>>>> Here is how I reproduced it:
>>>>> $ ovn-nbctl ls-add ls1
>>>>> $ ovn-nbctl ls-add ls2
>>>>> $ ovn-nbctl lsp-add ls1 lsp1
>>>>> $ ovn-nbctl lsp-add ls2 lsp2
>>>>> $ ovn-nbctl lsp-del ls2 -- lsp-add ls1 lsp2
>>>>
>>>> Sorry for the typo. The last command was:
>>>> $ ovn-nbctl lsp-del lsp2 -- lsp-add ls1 lsp2
>>>>
>>>
>> I applied these 2 patches locally and ran the below commands, which are
>> the same as the commands shared by Han above.
>>
>> $ovn-nbctl ls-add ls1
>> $ovn-nbctl ls-add ls2
>> $ovn-nbctl lsp-add ls1 lsp1
>> $ovn-nbctl lsp-add ls2 lsp2
>> $ovs-vsctl add-port br-int p1 -- set Interface p1 external_ids:iface-id=lsp2
>>
>> $ovn-sbctl list port_binding
>> _uuid               : bbf2f7e4-b61b-4ce8-adb6-4d17e410b87b
>> chassis             : ff506354-ac7b-4463-b42d-d89bddf319c7
>> datapath            : ef316369-0f2c-4246-adbd-8c187bd95e41
>> ...
>> ...
>> tunnel_key          : 1
>> type                : ""
>> virtual_parent      : []
>>
>> _uuid               : 7cca89fa-55f9-4326-8188-6678838467bb
>> chassis             : []
>> datapath            : 21263028-a511-457a-824b-39a1219084c8
>> ...
>> logical_port        : lsp1
>> ...
>> tunnel_key          : 1
>> type                : ""
>> virtual_parent      : []
>>
>> $ovn-nbctl lsp-del lsp2 -- lsp-add ls1 lsp2
>>
>> $ovn-sbctl list port_binding
>> _uuid               : bbf2f7e4-b61b-4ce8-adb6-4d17e410b87b
>> chassis             : ff506354-ac7b-4463-b42d-d89bddf319c7
>> datapath            : 21263028-a511-457a-824b-39a1219084c8
>> ...
>> logical_port        : lsp2
>> ...
>> tunnel_key          : 1
>> type                : ""
>> virtual_parent      : []
>>
>> _uuid               : 7cca89fa-55f9-4326-8188-6678838467bb
>> chassis             : []
>> datapath            : 21263028-a511-457a-824b-39a1219084c8
>> ...
>> logical_port        : lsp1
>> ...
>> tunnel_key          : 2
>> type                : ""
>> virtual_parent      : []
>>
>>
>> I notice that the same port_binding record for lsp2 is being reused.
>> Is that intentional?
> 
> This happens because of the order in which ovn_port entries are processed:
> 
> https://github.com/ovn-org/ovn/blob/master/northd/ovn-northd.c#L3453
> 
> The "both" list is populated in join_logical_ports() and depends on the
> order of Logical_Switch/Router_Port records in the
> od->nbs->ports/od->nbr->ports arrays, which is not under ovn-northd's
> control.
> 
> https://github.com/ovn-org/ovn/blob/master/northd/ovn-northd.c#L2022
> https://github.com/ovn-org/ovn/blob/master/northd/ovn-northd.c#L2103
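Roughly, and with hypothetical names rather than the actual
join_logical_ports() code, the shape of that pass is sketched below: NB ports
are visited in whatever order the database lists them, and a port whose name
matches an existing SB Port_Binding reuses that record (tunnel_key included),
while the remaining ports get fresh keys allocated later. That is why a
leftover record can be picked up before the port that previously owned its
tunnel_key is processed:

    /* Hypothetical sketch of the port-joining pass, not the real code. */
    #include <stddef.h>

    struct sb_port_binding;                  /* existing SB record */

    struct nb_port {
        const char *name;                    /* NB logical port name */
        struct sb_port_binding *sb;          /* reused SB record, or NULL */
    };

    static void
    join_ports(struct nb_port *ports, size_t n_ports,
               struct sb_port_binding *(*sb_lookup)(const char *name))
    {
        for (size_t i = 0; i < n_ports; i++) {
            /* Ports are visited in database order; ovn-northd does not
             * control this order. */
            ports[i].sb = sb_lookup(ports[i].name);

            /* A match means the port goes on the "both" list and keeps the
             * old record, including its tunnel_key.  Ports without a match
             * get a new Port_Binding and a freshly allocated tunnel_key
             * later, which can collide with a reused stale key. */
        }
    }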
> 
>>
>> Ideally the old port binding record for lsp2 should get deleted and a
>> new one should get created.
> 
> So even if we delete the old port binding and recreate it, we'd still
> get a conflict in some cases because lsp2 would be processed before lsp1.
> 
>>
>> I found another issue with the below commands (tested in sandbox env)
>>
>> $ovn-nbctl ls-add ls1
>> $ovn-nbctl ls-add ls2
>> $ovn-nbctl lsp-add ls1 lsp1
>> $ovn-nbctl lsp-add ls2 lsp2
>> $ovn-nbctl lsp-set-type lsp2 external
>> $ovn-nbctl ha-chassis-group-add chg1
>> $ovn-nbctl ha-chassis-group-add-chassis chg1 chassis-1 30
>> $ovn-nbctl set logical_switch_port lsp2 ha_chassis_group=<chg1_uuid>
>> $ovn-nbctl lsp-del lsp2 -- lsp-add ls1 lsp2 -> This fails with the below
>> logs in ovn-northd
>>
>> *******
>> 2020-04-30T09:59:48.319Z|00007|ovsdb_idl|WARN|transaction error:
>> {"details":"cannot delete HA_Chassis_Group row
>> 6e0c88d7-20f6-473a-bd0a-9eea60b639e6 because of 1 remaining
>> reference(s)","error":"referential integrity violation"}
>> *******
> 
> I'll look into this. Looks like patch #2 of the series should take care
> of HA_Chassis_Group too.
> 
>>
>> I think it's better if the stale port binding entry is deleted instead
>> of being reused. What do you think?
> 
> As mentioned above, this wouldn't help much, and it would actually
> create larger transactions, so it seems inefficient.
> 
> Thanks,
> Dumitru
> 
>>
>> Thanks
>> Numan
>>
>>
>>
>>
>>>>>
>>>>> 2020-04-29T23:46:17.675Z|00007|ovsdb_idl|WARN|transaction error:
>>>>> {"details":"Transaction causes multiple rows in \"Port_Binding\" table
>>>>> to have identical values (be595a3b-3904-4229-9ba2-884b27a86b75 and 1)
>>>>> for index on columns \"datapath\" and \"tunnel_key\".  First row, with
>>>>> UUID d4cc6ec5-4817-47c9-aa83-9985d3b7b452, existed in the database
>>>>> before this transaction and was not modified by the transaction.
>>>>> Second row, with UUID b874ab93-d97a-4583-8ac3-c353a40b180d, had the
>>>>> following index values before the transaction:
>>>>> 6940ad91-83c5-4fe9-bab5-4fbec6714b0d and 1.","error":"constraint
>>>>> violation"}
>>>>>
>>>>> I will take a closer look at the fix.
>>>>>
>>>>> Thanks,
>>>>> Han

Hi Han, Numan,

I sent a v2 including a new patch (#3) for the ha_chassis_group issue
Numan mentioned and also added unit tests for each of the 3 issues.

https://patchwork.ozlabs.org/project/openvswitch/list/?series=173823

Regards,
Dumitru


