[ovs-discuss] [HELP] Question about Raft
hzhou at ovn.org
Tue Mar 10 23:03:19 UTC 2020
To run the ovsdb-cluster tests, just:
# make check-ovsdb-cluster
You can also specify which test case to run:
# make check-ovsdb-cluster TESTSUITEFLAGS="<test case number>"
You can list all cases by:
# make check-ovsdb-cluster TESTSUITEFLAGS="--list"
On Tue, Mar 10, 2020 at 1:13 AM txfh2007 <txfh2007 at aliyun.com> wrote:
> Hi Han:
> Thanks for your kind reply! I have tried your patch and the
> candidate problem is fixed in my environment. Now my 3-node Raft
> environment works well.
> Another question: I found you have also submitted an ovsdb-cluster
> testsuite. How can I run these tests on my own setup?
> On Sat, Mar 7, 2020 at 2:33 AM txfh2007 <txfh2007 at aliyun.com> wrote:
> > Hi Han:
> > Thanks for your reply! There is one point that I can't agree with:
> > "If S2 or S3 already becomes leader, their term won't be lower than
> > S2." In my test, in step 3, S3 is the leader and its term is lower than
> > S2's. The reason is that while S2 is disconnected from S1 and S3, S2
> > keeps incrementing its term and sending vote requests until its
> > connection recovers. Meanwhile, S3 becomes leader and stops
> > incrementing its term. So it is possible that S2's term is larger
> > than S3's, and that's why in step 3 S2 replies "stale term" to S3's
> > append entry request.
> Hi Timo,
> Sorry that my answer wasn't accurate enough and caused confusion. My
> answer focused on the "candidate forever" scenario you reported, so
> I didn't take the more common scenario (that a reconnected server can have
> a larger term) into account, but of course the more common scenario does
> exist. Please see my rephrased answer below, and let me know if it solves the
> > Timo
> > On Fri, Mar 6, 2020 at 1:13 AM txfh2007 via discuss <
> ovs-discuss at openvswitch.org> wrote:
> > >
> > > Hi Han && all:
> > >
> > > I have a question about Raft. I have tried the latest OVN-2.30
> > > and found that under some conditions there is one node whose role is
> > > always "Candidate" (as reported by the cluster/status command) but which
> > > acts as a Follower. My cluster still works well, but it seems odd that
> > > a server's role is always Candidate. As far as I know, a server's role
> > > is normally Follower or Leader.
> > Hi Timo, I happened to fix the problem yesterday and here is the patch:
> > https://patchwork.ozlabs.org/patch/1250116/. Details of my analysis are in
> > the commit message, and a test case is added to cover this scenario.
> > > After digging into related code, I think I can try to describe how
> to reproduce this scenario:
> > > 1. It is three servers cluster: One Leader(S2), Two
> > > 2. Disconnect the Leader (S2) from the other two servers, so S2
> > > increments its term and sends vote requests, and meanwhile S1 and S3
> > > choose a new Leader (let's say it's S3)
> > When S1 and S3 choose a new leader, they (one of them, or both) would
> have to increase the term, too.
> > > 3. Recover the connection between S2 and the other two nodes. Then, if
> > > S2 receives an append entry request from S3, S2 will reply "stale term"
> > > because S3's term is lower
> > If S2 or S3 already becomes leader, their term won't be lower than S2.
> From this point on, the steps below shouldn't happen. Instead, it is
> possible that when S2 receives an append-request from the new leader, it has
> the same term, and it updates the leader without switching from candidate
> to follower, thus resulting in the candidate state forever.
> If S1 (not S2, sorry for the typo above) or S3 has already become leader, it
> is possible that its term is the same as S2's when S2's connection is
> restored. When S2 then receives an append-request from the new leader,
> it observes the same term and updates the leader without switching from
> candidate to follower (this is a bug in the implementation, fixed in the
> patch I posted, which was merged yesterday), thus resulting in the
> candidate state forever. In this situation, the candidate doesn't
> increase its term or initiate vote-requests any more, because it receives
> append-requests (heartbeats) regularly and responds like a follower. The
> only difference is that it announces itself as "disconnected from cluster"
> to its clients, so all the clients will disconnect from it.
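
The same-term case described above can be sketched in a few lines of Python. This is only an illustrative model, not the ovsdb-server code: the `Server` class, the `on_append_request` function and the `fixed` flag are hypothetical names introduced here to contrast the behaviour before and after the patch as described in this thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Server:
    name: str
    term: int
    role: str = "follower"          # "follower", "candidate" or "leader"
    leader: Optional[str] = None


def on_append_request(server: Server, leader_name: str, leader_term: int,
                      fixed: bool = True) -> str:
    """Handle an append-request (heartbeat) and return the reply reason.

    fixed=False models the behaviour described in the thread: on an equal
    term the leader is recorded but a candidate never steps down.
    """
    if leader_term < server.term:
        return "stale term"
    if leader_term > server.term:
        # A larger term always forces the receiver back to follower.
        server.term = leader_term
        server.role = "follower"
    elif fixed and server.role == "candidate":
        # Same term: a leader already exists, so the candidate steps down.
        server.role = "follower"
    server.leader = leader_name
    return "ok"


buggy = Server("S2", term=5, role="candidate")
on_append_request(buggy, "S3", 5, fixed=False)
print(buggy.role, buggy.leader)     # role stays "candidate"

patched = Server("S2", term=5, role="candidate")
on_append_request(patched, "S3", 5, fixed=True)
print(patched.role, patched.leader)
```

In the unfixed variant S2 accepts heartbeats and knows the leader, yet its role never leaves "candidate", which matches the symptom reported via cluster/status.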
> On the other hand, if S2's connection is restored after more election
> timer timeouts, its term can be larger than the new leader's. In this case,
> it won't trigger the "candidate forever" problem. First, the candidate
> will send a vote-request with a larger term, but the new leader will reject
> the vote-request because it is the leader itself, and the follower will also
> reject the vote-request because of the logic of
> "raft_should_suppress_disruptive_server()". However, the candidate will
> receive an append-request from the new leader, which has a smaller term. It
> replies with an append-reply with reason "stale term" but carrying its own
> term number. When the leader receives this reply, it sees a larger term
> number than its own, so it updates its term to the larger term and steps down
> to follower. The cluster will then start an election again, which will end up
> with one leader and two followers as usual.
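
The larger-term case above can be traced with a small Python sketch. It is a simplified model under stated assumptions, not the OVS implementation: servers are plain dictionaries, and `handle_append_request` and `handle_append_reply` are hypothetical helpers that only model the term comparison described in the paragraph.

```python
def handle_append_request(receiver, leader_term):
    """Receiver side: reply with a reason plus the receiver's own term."""
    if leader_term < receiver["term"]:
        return {"reason": "stale term", "term": receiver["term"]}
    receiver["term"] = leader_term
    receiver["role"] = "follower"
    return {"reason": "ok", "term": receiver["term"]}


def handle_append_reply(leader, reply):
    """Leader side: a larger term in the reply forces the leader to step down."""
    if reply["term"] > leader["term"]:
        leader["term"] = reply["term"]
        leader["role"] = "follower"


# S2 was partitioned for several election timeouts, so its term ran ahead.
s2 = {"name": "S2", "term": 7, "role": "candidate"}
s3 = {"name": "S3", "term": 4, "role": "leader"}

reply = handle_append_request(s2, s3["term"])
handle_append_reply(s3, reply)
print(reply["reason"], s3["role"], s3["term"])
```

After this exchange no server is leader, so a fresh election follows; that election is outside the scope of this sketch.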
> > > 4. After S3 gets S2's reply, S3 will change its term to S2's
> > > value and change its role to follower and then candidate (at this point
> > > S1/S2/S3 are all in the candidate role)
> > > 5. Then if S2 gets S3's vote request and votes for S3, S3 will
> > > become the new leader, but S2's role is still candidate
> If all 3 ended up as candidates in the same term, as mentioned in your step 4,
> each of them only votes for itself, so no leader is elected in that term
> and they will have to increase the term (at a random time) and re-elect
> again. To my understanding, the only way to end up with a candidate
> forever is when 2 servers become candidates competing in
> the *same term*.
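
A toy vote count in Python illustrates why three candidates in the same term cannot elect a leader while two candidates can. The `elect` function is a hypothetical simplification: it assumes every server casts exactly one vote per term, that a candidate votes for itself, and that a non-candidate grants its vote to the first candidate in the list.

```python
def elect(servers, candidates):
    """Return the winning candidate for one term, or None on a split vote."""
    votes = {c: 0 for c in candidates}
    for s in servers:
        if s in votes:
            votes[s] += 1               # a candidate votes for itself
        elif candidates:
            # Assumption: the first candidate's request arrives first.
            votes[candidates[0]] += 1
    majority = len(servers) // 2 + 1
    winners = [c for c, n in votes.items() if n >= majority]
    return winners[0] if winners else None


cluster = ["S1", "S2", "S3"]
print(elect(cluster, ["S1", "S2", "S3"]))   # split vote, nobody wins
print(elect(cluster, ["S3", "S2"]))         # S1's vote gives S3 a majority
```

In the second call the losing candidate (S2) is the server that, per the discussion above, remained a candidate in the same term as the elected leader.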
> > >
> > > I guess the reason is that the term of S3's vote request is equal to
> > > S2's term. S2 will change to follower only if it receives a vote request
> > > whose term value is larger than its own.
> > > Am I right? And is the candidate role (which actually behaves as a
> > > follower) reasonable?
> > >
> > > Thanks
> > > Timo
> > >
> > Hi Timo,