[ovs-dev] [PATCH ovn] ovs: Bump submodule version to latest ovsdb-cs changes.

Ilya Maximets i.maximets at ovn.org
Thu Mar 11 19:08:18 UTC 2021


On 3/11/21 7:53 PM, Ben Pfaff wrote:
> On Thu, Mar 11, 2021 at 07:37:19PM +0100, Ilya Maximets wrote:
>> A few bugfixes for the ovsdb-cs code that are required for the
>> OVN build were recently accepted into OVS,
>> e.g. ac09cbfcb70a ("ovsdb-cs: Fix use-after-free for the request id.")
>>
>> This will also include all the changes necessary to build
>> the ddlog version of northd.
>>
>> Signed-off-by: Ilya Maximets <i.maximets at ovn.org>
>> ---
>>  ovs | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/ovs b/ovs
>> index 50e5523b9..ac09cbfcb 160000
>> --- a/ovs
>> +++ b/ovs
>> @@ -1 +1 @@
>> -Subproject commit 50e5523b9b2b154e5fafc5acdcdec85e9cc5a330
>> +Subproject commit ac09cbfcb70ac6f443f039d5934448bd80f74493
> 
> You could consider bumping all the way to 39b937f06434 ("raft: Add
> 'stop-raft-rpc' failure test command."), which would add the commits
> listed below.

OVN itself doesn't use the raft code, so it doesn't really matter
whether these commits are included or not.  I don't see a big
difference between bumping to 39b937f06434 and bumping to the current
tip of the master branch.  I think it's probably better to just stick
to the necessary changes.

OTOH, someone might try to actually build and use OVS from this
submodule... We probably need to discourage users from doing that
somehow.

> But I support this either way.
> 
> Acked-by: Ben Pfaff <blp at ovn.org>
> 
> $ git log ac09cbfcb70ac6f443f039d5934448bd80f74493...39b937f064347884614002c9bdac79382a144fca
> commit 39b937f064347884614002c9bdac79382a144fca
> Author: Ilya Maximets <i.maximets at ovn.org>
> Date:   Fri Feb 19 18:00:19 2021 +0100
> 
>     raft: Add 'stop-raft-rpc' failure test command.
>     
>     This command will stop sending and receiving any RAFT-related
>     traffic or accepting new connections.  Useful to simulate
>     network problems between cluster members.
>     
>     There is no unit test that uses it yet, but it's convenient for
>     manual testing.
>     
>     Acked-by: Han Zhou <hzhou at ovn.org>
>     Signed-off-by: Ilya Maximets <i.maximets at ovn.org>
> 
> commit 4c1d9ef14af3da3b904c1ceb4a6823bb3c5fd3e3
> Author: Ilya Maximets <i.maximets at ovn.org>
> Date:   Fri Feb 19 17:42:43 2021 +0100
> 
>     raft: Report disconnected in cluster/status if candidate retries election.
>     
>     If election times out for a server in 'candidate' role it sets
>     'candidate_retrying' flag that notifies that storage is disconnected
>     and client should re-connect.  However, cluster/status command
>     reports 'Status: cluster member' and that is misleading.
>     Reporting "disconnected from the cluster (election timeout)" instead.
>     
>     Reported-by: Carlos Goncalves <cgoncalves at redhat.com>
>     Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1929690
>     Fixes: 1b1d2e6daa56 ("ovsdb: Introduce experimental support for clustered databases.")
>     Acked-by: Han Zhou <hzhou at ovn.org>
>     Signed-off-by: Ilya Maximets <i.maximets at ovn.org>
> 
> commit 14b2b0aad7ae9254bad8b8c2cc9d5386065ab42f
> Author: Ilya Maximets <i.maximets at ovn.org>
> Date:   Wed Feb 17 16:44:57 2021 +0100
> 
>     raft: Reintroduce jsonrpc inactivity probes.
>     
>     It's not enough to just have heartbeats.
>     
>     RAFT heartbeats are unidirectional, i.e. leader sends them to followers
>     but not the other way around.  Missing heartbeats provoke followers to
>     start election, but if leader will not receive any replies it will not
>     do anything while there is a quorum, i.e. there are enough other
>     servers to make decisions.
>     
>     This leads to situation that while TCP connection is established,
>     leader will continue to blindly send messages to it.  In our case this
>     leads to growing send backlog.  Connection will be terminated
>     eventually due to excessive send backlog, but this might take a
>     lot of time and waste process memory.  At the same time 'candidate'
>     will continue to send vote requests to the dead connection on its
>     side.
>     
>     To fix that we need to reintroduce inactivity probes that will drop
>     connection if there was no incoming traffic for a long time and remote
>     server doesn't reply to the "echo" request.  Probe interval might be
>     chosen based on an election timeout to avoid issues described in commit
>     db5a066c17bd.
>     
>     Reported-by: Carlos Goncalves <cgoncalves at redhat.com>
>     Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1929690
>     Fixes: db5a066c17bd ("raft: Disable RAFT jsonrpc inactivity probe.")
>     Acked-by: Han Zhou <hzhou at ovn.org>
>     Signed-off-by: Ilya Maximets <i.maximets at ovn.org>
> 
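
For readers unfamiliar with the inactivity probes mentioned in the last
commit above, here is a minimal sketch of the idea in Python.  It is not
the OVS implementation: the class name, method names and the returned
action strings are all hypothetical, and the choice of one probe
interval for both phases is an assumption made for brevity.  It only
illustrates the rule from the commit message: if nothing was received
for a long time, send an "echo" request, and drop the connection if the
remote side still stays silent.

```python
class InactivityProbe:
    """Hypothetical sketch of a connection inactivity probe.

    Times are plain integers in milliseconds supplied by the caller,
    so the logic can be exercised without a real clock or socket.
    """

    def __init__(self, probe_interval_ms, now_ms=0):
        self.probe_interval_ms = probe_interval_ms
        self.last_activity_ms = now_ms
        self.echo_sent_ms = None  # None means no echo is outstanding.

    def on_receive(self, now_ms):
        # Any incoming traffic (including an echo reply) proves that
        # the remote side is alive, so cancel the outstanding probe.
        self.last_activity_ms = now_ms
        self.echo_sent_ms = None

    def run(self, now_ms):
        """Return the action the caller should take at time now_ms."""
        if self.echo_sent_ms is not None:
            # An echo request is outstanding and nothing came back.
            if now_ms - self.echo_sent_ms >= self.probe_interval_ms:
                return "disconnect"
            return "wait"
        if now_ms - self.last_activity_ms >= self.probe_interval_ms:
            # Connection has been silent for too long: probe it.
            self.echo_sent_ms = now_ms
            return "send_echo"
        return "wait"
```

A caller would invoke run() from its main loop and on_receive() for
every incoming message.  A silent peer is then dropped after roughly two
probe intervals instead of lingering until the send backlog grows too
large.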


