[ovs-dev] [PATCH ovn 0/2] Make ovn-northd recover from NB/SB inconsistencies.

Dumitru Ceara dceara at redhat.com
Wed Apr 29 21:44:58 UTC 2020


On 4/29/20 9:57 PM, Han Zhou wrote:
> 
> 
> On Wed, Apr 29, 2020 at 12:17 PM Numan Siddique <numans at ovn.org> wrote:
>>
>>
>>
>> On Wed, Apr 29, 2020 at 9:57 PM Dumitru Ceara <dceara at redhat.com> wrote:
>>>
>>> In some cases, if the NB/SB databases ovn-northd connects to are
>>> inconsistent, ovn-northd might generate transactions that fail
>>> continuously due to failed integrity checks on the SB database server.
>>>
>>> The first patch of the series addresses inconsistencies due to stale
>>> Datapath_Binding records in the SB database.
>>>
>>> The second patch of the series addresses inconsistencies due to stale
>>> tunnel_key values in various SB database table records.
>>>
>>> Reported-by: Dan Williams <dcbw at redhat.com>
>>> Reported-at: https://bugzilla.redhat.com/1828637
>>> Signed-off-by: Dumitru Ceara <dceara at redhat.com>
>>>
>>> Dumitru Ceara (2):
>>>       ovn-northd: Clear SB records depending on stale datapaths.
>>>       ovn-northd: Fix tunnel_key allocation for SB records.
>>
>>
>> Hi Dumitru,
>>
>> I did some testing in my ovn-fake-multinode setup. These are my
>> observations.
>>
>> I created a logical switch sw0 with 4 logical ports, so the next
>> tunnel key should be 5.
>> I stopped ovn-northd and created a couple of port_binding entries
>> manually using "ovn-sbctl create port_binding" with tunnel keys 5
>> and 6.
>> I also created a logical port in sw0. Then I started ovn-northd.
>> ovn-northd deletes the port_binding entries added by me and creates
>> the port_binding entry for the logical port with tunnel_key=5 in the
>> same transaction.
>>
>> I think ovn-northd syncs the south db based on the contents of the
>> north db.
>>
>> There's no harm in having your patches, but I'm not really sure they
>> resolve the issue we have observed.
>>
>> Just to brief everyone on the issue we are seeing, we see the logs
>> below in ovn-northd.
>>
>> *******
>> 2020-04-16T23:02:33Z|00127|ovsdb_idl|WARN|transaction error:
>> {"details":"Transaction causes multiple rows in \"Port_Binding\" table
>> to have identical values (23eb9016-45f9-4158-be35-77b2713b9a0f and 7)
>> for index on columns \"datapath\" and \"tunnel_key\".  First row, with
>> UUID e4f11a7b-09b6-454f-a125-34cc4b144ef6, had the following index
>> values before the transaction: bdbb436e-f98c-4651-9b80-6e8b95044560
>> and 7.  Second row, with UUID d37cc3f1-8633-440f-b145-8222a0d4723c,
>> existed in the database before this transaction and was not modified
>> by the transaction.","error":"constraint violation"}
>> *******
>>
>> And because of this constraint violation error, ovn-northd cannot
>> write to the SB DB any further until it is restarted.
>>
>> In my opinion this can only happen if ovn-northd doesn't see the
>> port_binding row (which is actually present in the DB) in its IDL
>> in-memory db.
>> I suspect this could have happened when ovn-northd reconnects to the
>> same master or connects to the new master and doesn't get the proper
>> updates.
>>
>> Maybe in this case the IDL should request the db contents with
>> txn id = 0, so that it receives a complete dump of the db.
>>
>> Is it possible that ovn-northd sees a port binding with tunnel key
>> 'x' and still allocates the same tunnel key 'x' to a new logical
>> port?
>> If so, then your patches definitely make sense.
>>
>> @Han - Have you seen this issue in your deployments? Do you have any
>> comments here?
>>
>> Thanks
>> Numan
>>
> I have never seen such an issue before, but I am not sure whether it is
> possible due to bugs. Currently there is a bug fix under review:
> https://patchwork.ozlabs.org/project/openvswitch/patch/20200422183842.6303.99600.stgit@dceara.remote.csb/.
> However, northd doesn't conditionally monitor the rows, so I am not sure
> if this is the root cause of the northd inconsistency issue discussed here.
> 
> I don't think we should add fixes in northd (or ovn-controller) to handle
> inconsistency in ovsdb. Consistency should be expected from ovsdb, and we
> should fix ovsdb/IDL when there is such a bug. Otherwise, there might be
> too many places to fix, or even re-design. My understanding is that if the
> ovsdb IDL sees temporarily stale data, the current northd/ovn-controller
> logic should be able to correct itself once the data is up to date.
> Moreover, in clustered mode northd connects only to the leader, which
> avoids the possibility of northd seeing stale data (unless there is a bug).
> 
> To summarize, I think we need to find the root cause of the
> inconsistency between the IDL and the server and fix it there, instead of
> changing ovn-northd to accommodate the inconsistency. (Consistency is
> the biggest advantage of OVSDB; it eases the application implementation.)
> 
> Thanks,
> Han

Hi Han, Numan,

I might have misused "inconsistency" in this context. What I meant was
more along the lines of "discrepancies between the NB and SB databases".

This is a very simple reproducer for the port_binding tunnel_key issue,
no clustering of NB/SB dbs involved:

# Create two logical switches with one port each.
$ ovn-nbctl ls-add ls1
$ ovn-nbctl lsp-add ls1 p1
$ ovn-nbctl ls-add ls2
$ ovn-nbctl lsp-add ls2 p2
$ ovn-nbctl --wait=sb sync

# At this point the PB for p1 has tunnel_key=1.
# The PB for p2 also has tunnel_key=1 (port tunnel keys are only
# unique within a datapath, as the error below shows).
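
# (Optional check, assuming a standard sandbox setup: the assigned
# tunnel keys can be listed with something like the command below.)
$ ovn-sbctl --columns=logical_port,tunnel_key list port_binding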

# Simulate the SB db going away (could be network
# issues or crash or some other event).
$ ovn-ctl stop_sb_ovsdb

# CMS decides to move p2 from ls2 to ls1 and removes
# ls2 completely.
$ ovn-nbctl ls-del ls2
$ ovn-nbctl lsp-add ls1 p2

# Simulate SB DB coming back online.
$ ovn-ctl start_sb_ovsdb

At this point ovn-northd will try to set the datapath field in p2's
Port_Binding to point to the Datapath_Binding corresponding to ls1 but
will *not* change its tunnel_key.

We get:
2020-04-29T20:52:41.327Z|00016|ovsdb_idl|WARN|transaction error:
{"details":"Transaction causes multiple rows in \"Port_Binding\" table
to have identical values (1b1c4b39-c045-448d-a532-8edbe5544e13 and 1)
for index on columns \"datapath\" and \"tunnel_key\".  First row, with
UUID e20219fa-ef67-49a2-81cd-739fa80d2bd4, existed in the database
before this transaction and was not modified by the transaction.  Second
row, with UUID 50b0e240-8a4d-4e98-8e2f-97c94811d1b1, had the following
index values before the transaction:
a9b5959f-2f48-44e7-b6bb-f7148c28e4b5 and 1.","error":"constraint violation"}

And ovn-northd keeps retrying the same transaction at every iteration
from this point on and fails continuously.
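
For completeness, p2's Port_Binding (which keeps the tunnel_key it was
allocated while on ls2) can be inspected with something like:

$ ovn-sbctl find port_binding logical_port=p2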

For the stale datapath issue (patch #1 in the series) a similar
reproducer is:

# Create a logical router with one router port.
$ ovn-nbctl lr-add lr
$ ovn-nbctl lrp-add lr p 00:00:00:00:00:01 1.1.1.1/24

# Simulate that a mac binding was created for the router
# port.
$ dp=$(ovn-sbctl --bare --columns _uuid list datapath .)
$ ovn-sbctl create mac_binding logical_port="p" ip="1.1.1.2" datapath="$dp"
$ ovn-nbctl --wait=sb sync
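
# (Optional check: confirm that the Mac_Binding references the
# router's datapath; the column selection here is just one way to
# inspect it.)
$ ovn-sbctl --columns=logical_port,ip,datapath list mac_binding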

# Simulate the SB db going away (could be network
# issues or crash or some other event).
$ ovn-ctl stop_sb_ovsdb

# CMS decides to delete lr.
$ ovn-nbctl lr-del lr

# CMS decides to re-add lr and the router port.
$ ovn-nbctl lr-add lr
$ ovn-nbctl lrp-add lr p 00:00:00:00:00:01 1.1.1.1/24

# Simulate SB DB coming back online.
$ ovn-ctl start_sb_ovsdb

At this point ovn-northd will try to delete the old Datapath_Binding
record from the SB DB *without* destroying the Mac_Binding record that
still references it.

We get:
2020-04-29T21:41:42.145Z|00013|ovsdb_idl|WARN|transaction error:
{"details":"cannot delete Datapath_Binding row
de8d19d6-d67b-499b-8825-12d34ec60946 because of 1 remaining
reference(s)","error":"referential integrity violation"}
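
The one remaining reference is the Mac_Binding created earlier, which
still points at the stale Datapath_Binding left over from the original
lr; it can be found with something like:

$ ovn-sbctl find mac_binding logical_port=p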

I think both situations above should be addressed by ovn-northd and
stale datapath/mac_binding/port_binding/etc. records should be purged.
There might be other scenarios that trigger constraint violations too,
but this is what I've found so far.

If you agree, I can send a v2 and add tests for the two simplified
scenarios I mentioned above.

What do you think?

Thanks,
Dumitru


