[ovs-git] [openvswitch/ovs] 26949a: raft: Fix next_index in install_snapshot reply han...

Han Zhou noreply at github.com
Fri Mar 6 23:02:21 UTC 2020


  Branch: refs/heads/branch-2.12
  Home:   https://github.com/openvswitch/ovs
  Commit: 26949af5c40dd4ffd97d8f375cece37bc5eb1318
      https://github.com/openvswitch/ovs/commit/26949af5c40dd4ffd97d8f375cece37bc5eb1318
  Author: Han Zhou <hzhou at ovn.org>
  Date:   2020-03-06 (Fri, 06 Mar 2020)

  Changed paths:
    M ovsdb/raft.c

  Log Message:
  -----------
  raft: Fix next_index in install_snapshot reply handling.

When a leader handles an install_snapshot reply, the next_index for
the follower should be log_start instead of log_end, because new
entries may have been added to the leader's log after the
install_snapshot procedure was initiated.  Also, the leader should send
all the accumulated entries to the follower in the following
append-request message, instead of sending 0 entries, to speed up
convergence.

Without this fix, there is no functional problem, but it takes
unnecessary extra rounds of append-requests, each answered with
"inconsistency" by the follower, before the follower finally converges.

Signed-off-by: Han Zhou <hzhou at ovn.org>
Signed-off-by: Ben Pfaff <blp at ovn.org>
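
The extra rounds described in the log message above can be illustrated
with a toy model.  This is only a sketch, not the code in ovsdb/raft.c:
the toy_* names are hypothetical, and it assumes the textbook Raft rule
that the leader decrements next_index by one after each "inconsistency"
reply and sends all pending entries once a request is accepted.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only; these are not the OVS data structures. */
struct toy_leader {
    uint64_t log_start;     /* Index of the first entry after the snapshot. */
    uint64_t log_end;       /* One past the index of the last entry. */
};

struct toy_follower {
    uint64_t log_end;       /* One past the last entry the follower holds. */
};

/* Returns the number of append-requests the leader sends before the
 * follower's log matches the leader's, starting from 'next_index'.
 *
 * Assumption: on "inconsistency" the leader retries with next_index - 1;
 * on success it has sent every entry in [next_index, log_end). */
static unsigned int
toy_rounds_to_converge(const struct toy_leader *leader,
                       struct toy_follower follower, uint64_t next_index)
{
    unsigned int rounds = 0;

    assert(leader->log_start <= leader->log_end);
    while (follower.log_end < leader->log_end) {
        rounds++;
        if (next_index > follower.log_end) {
            /* The follower lacks entry next_index - 1: "inconsistency". */
            next_index--;
        } else {
            /* Accepted: the follower appends everything up to log_end. */
            follower.log_end = leader->log_end;
            next_index = leader->log_end;
        }
    }
    return rounds;
}
```

In this model, a leader with log_start 10 and log_end 15, whose follower
has just installed the snapshot (follower log_end 10), needs 6 requests
when it starts from next_index = log_end and only 1 when it starts from
next_index = log_start.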


  Commit: 874895b49192b02127eb9189c37f614469db3fa2
      https://github.com/openvswitch/ovs/commit/874895b49192b02127eb9189c37f614469db3fa2
  Author: Han Zhou <hzhou at ovn.org>
  Date:   2020-03-06 (Fri, 06 Mar 2020)

  Changed paths:
    M ovsdb/raft.c
    M tests/ovsdb-cluster.at

  Log Message:
  -----------
  raft: Fix the problem of stuck in candidate role forever.

Sometimes a server can stay in the candidate role forever, even if the
server already sees the new leader and handles append-requests normally.
However, because of the wrong role, it appears to be disconnected from
the cluster, and so its clients are disconnected.

This problem happens when 2 servers become candidates in the same
term, and one of them is elected as leader in that term. It can be
reproduced by the test cases added in this patch.

The root cause is that the current implementation only changes the role
to follower when a bigger term is observed (in raft_receive_term__()).
According to the Raft paper, if another candidate becomes leader in
the same term, the candidate should change to follower.

This patch fixes it by changing the role to follower when leader
is being updated in raft_update_leader().

Signed-off-by: Han Zhou <hzhou at ovn.org>
Signed-off-by: Ben Pfaff <blp at ovn.org>
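
The role change described in the log message above can be sketched as a
small state machine.  This is an illustration under assumptions, not the
code in ovsdb/raft.c: the toy_* names and the integer leader_id are
hypothetical stand-ins, and only the two rules from the message are
modeled (step down on a bigger term, and step down when a leader is
learned in the same term).

```c
#include <stdbool.h>
#include <stdint.h>

enum toy_role { TOY_FOLLOWER, TOY_CANDIDATE, TOY_LEADER };

/* Toy model only; these are not the OVS data structures. */
struct toy_server {
    enum toy_role role;
    uint64_t term;
    int leader_id;          /* -1 while no leader is known for 'term'. */
};

/* Handles the term carried by an incoming message.  Returns false if the
 * message is stale.  A strictly bigger term always forces the follower
 * role; an equal term changes nothing here. */
static bool
toy_receive_term(struct toy_server *s, uint64_t term)
{
    if (term < s->term) {
        return false;
    }
    if (term > s->term) {
        s->term = term;
        s->role = TOY_FOLLOWER;
        s->leader_id = -1;
    }
    return true;
}

/* Records the leader for the current term.  A candidate that learns of a
 * leader in its own term gives up its candidacy and becomes a follower. */
static void
toy_update_leader(struct toy_server *s, int leader_id)
{
    if (s->leader_id != leader_id) {
        s->leader_id = leader_id;
        if (s->role == TOY_CANDIDATE) {
            s->role = TOY_FOLLOWER;
        }
    }
}

/* An append-request from 'leader_id' sent in 'term'. */
static void
toy_handle_append_request(struct toy_server *s, uint64_t term, int leader_id)
{
    if (toy_receive_term(s, term)) {
        toy_update_leader(s, leader_id);
    }
}
```

With only the term check, a candidate in term 5 that receives an
append-request for term 5 would keep the candidate role; with the extra
check in toy_update_leader() it becomes a follower of the sender.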


Compare: https://github.com/openvswitch/ovs/compare/209cd3a647c5...874895b49192

