[ovs-dev] Help diagnosing an OVN issue
Mark Michelson
mmichels at redhat.com
Wed May 16 12:21:34 UTC 2018
Hi folks,
I'm trying to fix an OVN issue.
The gist of it is that if you run
ovn-nbctl set-ssl <some key> <some cert> <some cacert>
and then later attempt to run
ovn-nbctl set-ssl <some other key> <some other cert> <some other cacert>
the ovn-nbctl process hangs indefinitely.
Using git-bisect, I was able to find the commit[1] that introduced this
problem. The commit in question makes it so that "singleton" tables
(those with maxRows of 1) have an extra check added when inserting to
ensure that the table is currently empty. If it is not empty, then the
attempt to commit will fail because a "where" clause test failed.
When we run `ovn-nbctl set-ssl`, we create a transaction that deletes
the current row in the SSL table in the northbound database and inserts
a new row with the provided data. Because the SSL table is a singleton
table, we also add the operation that checks if the table is currently
empty. What I believe is happening is that the extra singleton check is
failing since the SSL table has data in it prior to when the transaction
is committed. However, the check should not be performed since part of
our transaction is to delete the current content in the table.
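To make the suspected failure concrete, here is a toy model of it. This is not OVSDB code: the table is a plain list, "check_empty" is a made-up stand-in for the singleton check, and the row contents are invented. It assumes that operations in a transaction run in the order listed and that a failed operation aborts the whole transaction.

```python
# Toy model (not OVSDB code): a table is a list of rows, and operations
# run strictly in the order they appear in the transaction.

def run_transaction(table, operations):
    """Apply operations to a copy of table; return (ok, resulting_rows)."""
    rows = list(table)
    for op in operations:
        if op["op"] == "check_empty":      # stand-in for the singleton check
            if rows:
                return False, list(table)  # abort: original contents kept
        elif op["op"] == "delete":
            rows.clear()
        elif op["op"] == "insert":
            rows.append(op["row"])
    return True, rows


old_row = {"private_key": "old-key"}
new_row = {"private_key": "new-key"}

# Check placed ahead of the delete: it sees the existing row and fails.
with_check = [{"op": "check_empty"}, {"op": "delete"},
              {"op": "insert", "row": new_row}]
# Same transaction without the check: the delete empties the table first.
without_check = [{"op": "delete"}, {"op": "insert", "row": new_row}]

print(run_transaction([old_row], with_check))
print(run_transaction([old_row], without_check))
```

In this model the first transaction is rejected and the old row stays in place, while the second one replaces the row. That matches what I think is happening on the second `ovn-nbctl set-ssl`, but the model is only an illustration of the ordering, not a confirmation of it.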
I tried two approaches to fixing this:
1) Wait until we have looked at all of the transaction rows. If there
was an insert and no delete, add the singleton check on to the end of
the json array of operations to perform. If there was an insert and also
a delete, then don't perform the singleton check at all.
2) Add the singleton check in as normal if inserting. If we also come
across a delete, then remove the singleton check from the json array of
operations to perform.
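As a rough sketch, the two approaches look like this. The function names, the "singleton_check" placeholder, and the list-of-strings input are all hypothetical and do not correspond to anything in ovsdb-idl.c. For approach 2 the sketch also assumes that a delete seen before the insert suppresses the check.

```python
# Hypothetical sketch of both approaches; names are illustrative only.

SINGLETON_CHECK = {"op": "singleton_check"}  # stand-in for the real check


def approach_1(changes):
    """Decide about the check only after seeing every row change."""
    ops = [{"op": kind} for kind in changes]
    if "insert" in changes and "delete" not in changes:
        ops.append(dict(SINGLETON_CHECK))  # appended at the end of the array
    return ops


def approach_2(changes):
    """Add the check on insert; drop it again if a delete turns up."""
    ops = []
    saw_delete = False
    for kind in changes:
        if kind == "delete":
            saw_delete = True
            # Remove a check that an earlier insert already queued.
            ops = [op for op in ops if op["op"] != "singleton_check"]
        elif kind == "insert" and not saw_delete:
            ops.append(dict(SINGLETON_CHECK))
        ops.append({"op": kind})
    return ops


print(approach_1(["delete", "insert"]))
print(approach_2(["insert", "delete"]))
```

Both functions leave the check out whenever the transaction contains a delete and an insert. They differ in where the check lands when it is kept (after the insert in approach 1, before it in approach 2), and approach 2 shifts the position of every operation queued after the check when it removes it.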
With approach 1, I ended up not being able to start a sandbox at all
because the operation to set-ssl on the southbound database hung
forever. With approach 2, the sandbox starts properly. But when I
attempt to reproduce the issue, I get the following error:
2018-05-15T20:50:45Z|00007|ovsdb_idl|WARN|"insert" reply "uuid" is missing
I have confirmed that if I simply "#if 0" the singleton check, then
everything works just fine. Does anyone know why approach 2 is not working?
Thanks,
Mark
[1] 25540a777ff6c81ff71ace04ffabfd0df93e5e2d
ovsdb-idl: Tolerate initialization races for singleton tables