[ovs-dev] Help diagnosing an OVN issue

Mark Michelson mmichels at redhat.com
Wed May 16 19:26:41 UTC 2018


For those keeping track, I found the source of the problem. Each insert 
operation records its index in the json array of operations. By removing 
an item from the array, I shifted the insert operation to a different 
index than the one it had recorded.
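
To illustrate the index shift, here is a toy Python model. It is only a
sketch, not the ovsdb-idl code. The ordering of the operations, the
"singleton_check" placeholder, and the helper names are all assumptions
made for the example. The only protocol facts it relies on are that an
insert result carries a "uuid" member and a delete result carries a
"count" member.

```python
# Toy model of a transaction's operations array. All names here are
# hypothetical; this is not how ovsdb-idl.c is actually written.

def simulate_reply(operations):
    """Fake a server reply: one result per operation, in the same order."""
    results = []
    for op in operations:
        if op["op"] == "insert":
            # An insert result reports the new row's UUID.
            results.append(
                {"uuid": ["uuid", "00000000-0000-0000-0000-000000000001"]})
        elif op["op"] == "delete":
            # A delete result reports how many rows were removed.
            results.append({"count": 1})
        else:
            # Placeholder result for the singleton check.
            results.append({})
    return results


def insert_reply_has_uuid(remove_singleton_check):
    """Return True if the result at the recorded insert index has a uuid."""
    operations = [
        {"op": "singleton_check", "table": "SSL"},  # placeholder operation
        {"op": "insert", "table": "SSL"},
        {"op": "delete", "table": "SSL"},
    ]
    insert_index = 1  # recorded at the time the insert was appended

    if remove_singleton_check:
        # Approach 2: drop the check but leave the recorded index alone.
        del operations[0]

    results = simulate_reply(operations)
    return "uuid" in results[insert_index]


print(insert_reply_has_uuid(False))
print(insert_reply_has_uuid(True))
```

In this model, once the check is removed the recorded index points at the
delete result, which has no "uuid" member. That is consistent with the
'"insert" reply "uuid" is missing' warning quoted below.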

I also have found a much better way of fixing this problem that does not 
require json array manipulation. I'll be posting a patch soon.

On 05/16/2018 08:21 AM, Mark Michelson wrote:
> Hi folks,
> 
> I'm trying to fix an OVN issue.
> 
> The gist of it is that if you run
> 
> ovn-nbctl set-ssl <some key> <some cert> <some cacert>
> 
> And then later attempt to run
> 
> ovn-nbctl set-ssl <some other key> <some other cert> <some other cacert>
> 
> the ovn-nbctl process hangs indefinitely.
> 
> Using git-bisect, I was able to find the commit[1] that introduced this 
> problem. The commit in question makes it so that "singleton" tables 
> (those with maxRows of 1) have an extra check added when inserting to 
> ensure that the table is currently empty. If it is not empty, then the 
> attempt to commit will fail because a "where" clause test failed.
> 
> When we run `ovn-nbctl set-ssl`, we create a transaction that deletes 
> the current row in the SSL table in the northbound database and inserts 
> a new row with the provided data. Because the SSL table is a singleton 
> table, we also add the operation that checks if the table is currently 
> empty. What I believe is happening is that the extra singleton check is 
> failing since the SSL table has data in it prior to when the transaction 
> is committed. However, the check should not be performed since part of 
> our transaction is to delete the current content in the table.
> 
> I tried two approaches to fixing this:
> 
> 1) Wait until we have looked at all of the transaction rows. If there 
> was an insert and no delete, add the singleton check on to the end of 
> the json array of operations to perform. If there was an insert and also 
> a delete, then don't perform the singleton check at all.
> 
> 2) Add the singleton check in as normal if inserting. If we also come 
> across a delete, then remove the singleton check from the json array of 
> operations to perform.
> 
> With approach 1, I ended up not being able to start a sandbox at all 
> because the operation to set-ssl on the southbound database hung 
> forever. With approach 2, the sandbox starts properly. But when I 
> attempt to reproduce the issue, I get the following error:
> 
> 2018-05-15T20:50:45Z|00007|ovsdb_idl|WARN|"insert" reply "uuid" is missing
> 
> I have confirmed that if I simply "#if 0" the singleton check, then 
> everything works just fine. Does anyone know why approach 2 is not working?
> 
> Thanks,
> Mark
> 
> [1] 25540a777ff6c81ff71ace04ffabfd0df93e5e2d
> ovsdb-idl: Tolerate initialization races for singleton tables
