[ovs-discuss] Re: [HELP] Question about Raft

txfh2007 txfh2007 at aliyun.com
Tue Mar 10 08:13:44 UTC 2020


Hi Han:

    Thanks for your kind reply! I have tried your patch, and the candidate problem is fixed in my environment. My 3-node Raft cluster now works well.
    Another question: I see you have also submitted an ovsdb-cluster testsuite. How can I run these tests on my own setup?

Thanks
Timo


------------------------------------------------------------------

On Sat, Mar 7, 2020 at 2:33 AM txfh2007 <txfh2007 at aliyun.com> wrote:
>
> Hi Han:
>
>     Thanks for your reply! There is one point on which I can't agree with you: "If S2 or S3 already becomes leader, their term won't be lower than S2. " In my test, in step 3, S3 is leader and its term is lower than S2's. The reason is that while S2 is disconnected from S1 and S3, S2 keeps incrementing its term and sending vote requests until its connection recovers. Meanwhile, S3 becomes leader and stops incrementing its term. So it is possible for S2's term to be larger than S3's, and that is why in step 3 S2 replies "stale term" to S3's append-entry request.

Hi Timo,

Sorry that my answer wasn't accurate enough and caused confusion. My answer focused on the "candidate forever" scenario you reported, so I didn't take the more common scenario (a reconnected server can have a larger term) into account, but of course the more common scenario does exist. Please see my rephrased answer below, and let me know if it resolves the confusion.

Thanks,
Han
>
> Timo
>
>
> On Fri, Mar 6, 2020 at 1:13 AM txfh2007 via discuss <ovs-discuss at openvswitch.org> wrote:
> >
> > Hi Han && all:
> >
> >     I have a question about RAFT: I have tried the latest OVN-2.30 and found that, under some conditions, there is one node whose role is always "Candidate" (as reported by the cluster/status command) although it acts as a Follower. My cluster still works well, but it seems odd that a server's role is always Candidate. As far as I know, a server's role is normally Follower or Leader.
>
>
> Hi Timo, I happened to fix the problem yesterday and here is the patch: https://patchwork.ozlabs.org/patch/1250116/. Details of my analysis are in the commit message, and a test case is added to cover this scenario.
>
>
>
> >     After digging into the related code, I think I can describe how to reproduce this scenario:
> >         1. It is a three-server cluster: one Leader (S2), two Followers (S1, S3)
> >         2. Disconnect the Leader (S2) from the other two servers, so S2 will increment its term and send vote requests, and meanwhile S1 and S3 will choose a new Leader (let's say it's S3)
>
>
> When S1 and S3 choose a new leader, they (one of them, or both) would have to increase the term, too.
>
>
>
> >         3. Recover the connection between S2 and the other two nodes. Then, if S2 receives an append-entry request from S3, S2 will reply "stale term" because S3's term is lower
>
>
> If S2 or S3 already becomes leader, their term won't be lower than S2. From this point on, the steps below shouldn't happen. Instead, it is possible that when S2 receives an append-request from the new leader, it has the same term, and it updates the leader without switching from candidate to follower, thus resulting in the candidate state forever.
>

Rephrase:

If S1 (not S2, sorry for the typo above) or S3 has already become leader, its term may be the same as S2's when S2's connection is restored. When S2 then receives an append-request from the new leader, it observes the same term and updates its leader without switching from candidate to follower (this is a bug in the implementation, fixed in the patch I posted, which was merged yesterday), thus resulting in the candidate state forever. In this situation, the candidate no longer increases its term or initiates vote-requests, because it regularly receives append-requests (heartbeats) and responds to them like a follower. The only difference is that it announces itself as "disconnected from cluster" to its clients, so all the clients will disconnect from it.
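
As an illustration only, the same-term case can be sketched as a toy model in Python. This is not the actual OVS Raft code: the `Server` class, the `on_append_request` function, and the `fixed` flag are hypothetical names introduced here, and the term number 5 is arbitrary.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Role(Enum):
    FOLLOWER = "follower"
    CANDIDATE = "candidate"
    LEADER = "leader"


@dataclass
class Server:
    name: str
    term: int
    role: Role
    leader: Optional[str] = None


def on_append_request(server: Server, sender: str, sender_term: int,
                      fixed: bool) -> str:
    """Toy append-request handler; `fixed` selects the patched behavior."""
    if sender_term < server.term:
        return "stale term"
    if sender_term > server.term:
        # Newer term: adopt it and fall back to follower.
        server.term = sender_term
        server.role = Role.FOLLOWER
    elif fixed and server.role is Role.CANDIDATE:
        # Same term: a leader already exists, so the candidate steps down.
        server.role = Role.FOLLOWER
    server.leader = sender
    return "ok"


# S2 is a candidate in term 5; S3 won the election for the same term 5.
buggy = Server("S2", term=5, role=Role.CANDIDATE)
on_append_request(buggy, "S3", 5, fixed=False)

patched = Server("S2", term=5, role=Role.CANDIDATE)
on_append_request(patched, "S3", 5, fixed=True)

print(buggy.role.value, buggy.leader)      # candidate S3
print(patched.role.value, patched.leader)  # follower S3
```

Without the same-term branch, the server records the new leader but keeps the candidate role, which matches the "candidate forever" symptom described above.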

On the other hand, if S2's connection is restored after more election timer timeouts, its term can be larger than the new leader's. In this case, the "candidate forever" problem is not triggered. First, the candidate sends a vote-request with a larger term, but the new leader rejects it because it is the leader itself, and the follower also rejects it because of the logic in "raft_should_suppress_disruptive_server()". However, the candidate then receives an append-request from the new leader, which has a smaller term. It replies with an append-reply with reason "stale term" that carries its own term number. When the leader receives this reply, it sees a term number larger than its own, so it updates its term to the larger term and steps down to follower. The cluster then starts an election again, which ends up with one leader and two followers as usual.
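
The larger-term case can be sketched in the same spirit. Again this is a toy Python model under assumptions, not the OVS implementation: `Node`, `append_request`, and `on_append_reply` are hypothetical names, the terms 6 and 9 are arbitrary, and the vote-request suppression step is left out.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    term: int
    role: str  # "leader", "follower" or "candidate"


def append_request(receiver: Node, leader_term: int) -> dict:
    """Receiver side: reject a request from an older term, reporting own term."""
    if leader_term < receiver.term:
        return {"result": "stale term", "term": receiver.term}
    receiver.term = leader_term
    receiver.role = "follower"
    return {"result": "ok", "term": receiver.term}


def on_append_reply(leader: Node, reply: dict) -> None:
    """Leader side: a reply carrying a larger term forces a step-down."""
    if reply["term"] > leader.term:
        leader.term = reply["term"]
        leader.role = "follower"


s3 = Node("S3", term=6, role="leader")
s2 = Node("S2", term=9, role="candidate")  # kept timing out while isolated

reply = append_request(s2, s3.term)
on_append_reply(s3, reply)
print(reply["result"], s3.term, s3.role)  # stale term 9 follower
```

After the step-down no server is leader, so a new election follows, as described above.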

>
>
> >         4. After S3 gets S2's reply, S3 will change its term to S2's value and change its role to follower and then candidate (at this point S1/S2/S3 are all in the candidate role)
> >         5. Then if S2 gets S3's vote request and votes for S3, S3 will become the new leader, but S2's role is still candidate

If all 3 ended up as candidates in the same term, as mentioned in your step 4, each of them votes only for itself, so no leader is elected in that term and they have to increase the term (at random times) and hold another election. To my understanding, the only way to end up with a candidate forever is when 2 servers become candidates competing in the *same term*.
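
The vote counting behind this argument can be sketched as follows. This is a simplified Python illustration, not OVS code: the `elect` function is a hypothetical helper, and it assumes each server casts exactly one vote per term and that a candidate always votes for itself.

```python
from typing import Dict, List, Optional


def elect(candidates: List[str], followers: Dict[str, str]) -> Optional[str]:
    """Count votes for a single term and return the winner, if any.

    candidates: servers that are candidates in this term (each self-votes).
    followers: maps each follower to the candidate whose vote-request
               it granted (one vote per server per term).
    """
    cluster_size = len(candidates) + len(followers)
    majority = cluster_size // 2 + 1
    votes = {name: 1 for name in candidates}
    for granted_to in followers.values():
        votes[granted_to] += 1
    for name, count in votes.items():
        if count >= majority:
            return name
    return None


# Step 4 as described: all three servers are candidates in the same term.
print(elect(["S1", "S2", "S3"], {}))      # None
# Two candidates in the same term, with follower S1 voting for S3.
print(elect(["S2", "S3"], {"S1": "S3"}))  # S3
```

In the second call S3 wins with two of three votes while S2 is still a candidate in that same term, which is the situation where the same-term append-request bug could leave S2 stuck.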

> >
> >     I guess the reason is that the term of S3's vote request is equal to S2's term. S2 will change to follower only if it receives a vote request whose term is larger than its own.
> >     Am I right? And is the candidate role (while actually acting as a follower) reasonable?
> >
> > Thanks
> > Timo
> >
