[ovs-dev] Help diagnosing an OVN issue

Mark Michelson mmichels at redhat.com
Wed May 16 12:21:34 UTC 2018


Hi folks,

I'm trying to fix an OVN issue and could use some help diagnosing it.

The gist of it is that if you run

ovn-nbctl set-ssl <some key> <some cert> <some cacert>

and then later attempt to run

ovn-nbctl set-ssl <some other key> <some other cert> <some other cacert>

the ovn-nbctl process hangs indefinitely.

Using git-bisect, I was able to find the commit[1] that introduced this 
problem. The commit in question makes it so that "singleton" tables 
(those with maxRows of 1) have an extra check added when inserting to 
ensure that the table is currently empty. If it is not empty, then the 
attempt to commit will fail because a "where" clause test failed.

When we run `ovn-nbctl set-ssl`, we create a transaction that deletes 
the current row in the SSL table in the northbound database and inserts 
a new row with the provided data. Because the SSL table is a singleton 
table, we also add the operation that checks if the table is currently 
empty. What I believe is happening is that the extra singleton check is 
failing since the SSL table has data in it prior to when the transaction 
is committed. However, the check should not be performed since part of 
our transaction is to delete the current content in the table.
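
To make that hypothesis concrete, here is a toy Python model of such a transaction. It is only a sketch under stated assumptions: the operation names (`delete_all`, `assert_empty`, `insert`), the `run_transaction` helper, and the idea that operations simply run in order against the table are all hypothetical and are not taken from the ovsdb-idl or ovsdb-server code. The row values are placeholders.

```python
# Toy model of a singleton table (maxRows of 1). All names here are
# hypothetical; this is not the real ovsdb-idl or ovsdb-server logic.

def run_transaction(table, operations):
    """Apply operations in order to a copy of `table` (a list of rows).

    Returns (committed, resulting_rows). If an "assert_empty" check
    fails, nothing is committed and the original rows are returned.
    """
    rows = list(table)
    for op in operations:
        if op["op"] == "delete_all":
            rows.clear()
        elif op["op"] == "assert_empty":
            if rows:
                return False, list(table)
        elif op["op"] == "insert":
            rows.append(op["row"])
        else:
            raise ValueError(f"unknown op {op['op']!r}")
    return True, rows


# Placeholder rows standing in for the SSL table contents.
old_ssl = {"private_key": "old-key", "certificate": "old-cert", "ca_cert": "old-cacert"}
new_ssl = {"private_key": "new-key", "certificate": "new-cert", "ca_cert": "new-cacert"}

# Assumed ordering: the emptiness check runs while the old row is still there.
check_first = [
    {"op": "assert_empty"},
    {"op": "delete_all"},
    {"op": "insert", "row": new_ssl},
]

# Same delete-and-insert transaction with the check left out.
no_check = [
    {"op": "delete_all"},
    {"op": "insert", "row": new_ssl},
]

print(run_transaction([old_ssl], check_first))
print(run_transaction([old_ssl], no_check))
```

In this model the first transaction is rejected and the old row stays in place, while the second replaces the row. That matches what I see when the check is compiled out, but the model says nothing about how the real server orders or evaluates the operations.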

I tried two approaches to fixing this:

1) Wait until we have looked at all of the transaction rows. If there 
was an insert and no delete, add the singleton check on to the end of 
the json array of operations to perform. If there was an insert and also 
a delete, then don't perform the singleton check at all.

2) Add the singleton check in as normal if inserting. If we also come 
across a delete, then remove the singleton check from the json array of 
operations to perform.
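
The two approaches can be sketched roughly as follows. This is a simplified Python illustration, not the actual C patch: `kinds` is a hypothetical list of "insert"/"delete" markers standing in for the transaction rows of one singleton table, `singleton_check` is a made-up operation name, and the handling of a delete that is seen before the insert in approach 2 is my assumption.

```python
# Simplified illustration of the two approaches. Names are hypothetical and
# the real code builds a JSON array of operations in C.

CHECK = "singleton_check"


def build_ops_approach_1(kinds, table="SSL"):
    """Look at every transaction row first, then decide about the check."""
    ops = [{"op": kind, "table": table} for kind in kinds]
    if "insert" in kinds and "delete" not in kinds:
        # The check is appended to the end of the operations array.
        ops.append({"op": CHECK, "table": table})
    return ops


def build_ops_approach_2(kinds, table="SSL"):
    """Add the check when inserting; drop it again if a delete shows up."""
    ops = []
    saw_delete = False
    for kind in kinds:
        if kind == "insert":
            # Assumption: a delete seen earlier also suppresses the check.
            if not saw_delete:
                ops.append({"op": CHECK, "table": table})
            ops.append({"op": "insert", "table": table})
        elif kind == "delete":
            saw_delete = True
            # Remove any check that was already added to the array.
            ops = [op for op in ops if op["op"] != CHECK]
            ops.append({"op": "delete", "table": table})
    return ops


for kinds in (["insert"], ["delete", "insert"], ["insert", "delete"]):
    print(kinds)
    print("  approach 1:", [op["op"] for op in build_ops_approach_1(kinds)])
    print("  approach 2:", [op["op"] for op in build_ops_approach_2(kinds)])
```

In this sketch both approaches omit the check when a delete is present. For an insert-only transaction they differ in position: approach 1 places the check after the insert, approach 2 places it before. Note also that approach 2 removes an element from the middle of the array, which shifts the position of every later operation.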

With approach 1, I ended up not being able to start a sandbox at all 
because the operation to set-ssl on the southbound database hung 
forever. With approach 2, the sandbox starts properly, but when I 
try to reproduce the issue, I get the following error:

2018-05-15T20:50:45Z|00007|ovsdb_idl|WARN|"insert" reply "uuid" is missing

I have confirmed that if I simply "#if 0" the singleton check, then 
everything works just fine. Does anyone know why approach 2 is not working?

Thanks,
Mark

[1] 25540a777ff6c81ff71ace04ffabfd0df93e5e2d
ovsdb-idl: Tolerate initialization races for singleton tables

