[ovs-git] [openvswitch/ovs] 692a09: raft-rpc: Fix message format.

Han Zhou noreply at github.com
Fri Mar 6 22:30:00 UTC 2020


  Branch: refs/heads/master
  Home:   https://github.com/openvswitch/ovs
  Commit: 692a09cb5e2a1ba8aaddd3340d80ae47fcda3ae2
      https://github.com/openvswitch/ovs/commit/692a09cb5e2a1ba8aaddd3340d80ae47fcda3ae2
  Author: Han Zhou <hzhou at ovn.org>
  Date:   2020-03-06 (Fri, 06 Mar 2020)

  Changed paths:
    M ovsdb/raft-rpc.c

  Log Message:
  -----------
  raft-rpc: Fix message format.

Signed-off-by: Han Zhou <hzhou at ovn.org>
Signed-off-by: Ben Pfaff <blp at ovn.org>


  Commit: bda1f6b60588a45b71fa812f260921793df39aef
      https://github.com/openvswitch/ovs/commit/bda1f6b60588a45b71fa812f260921793df39aef
  Author: Han Zhou <hzhou at ovn.org>
  Date:   2020-03-06 (Fri, 06 Mar 2020)

  Changed paths:
    M ovsdb/ovsdb-server.c
    M tests/ovsdb-cluster.at

  Log Message:
  -----------
  ovsdb-server: Don't disconnect clients after raft install_snapshot.

When a "schema" field is found in read_db(), there can be two cases:
1. There is a schema change in the clustered DB and the "schema" is the new
one.
2. An install_snapshot RPC happened, which caused log compaction on the
server, so the next log entry is just the snapshot, which always contains a
"schema" field even though the schema hasn't changed.

The current implementation doesn't handle case 2 and always assumes the schema
has changed, so it disconnects all clients of the server. This can cause
stability problems in a large-scale environment when a large number of clients
are connected at the time it happens.

Signed-off-by: Han Zhou <hzhou at ovn.org>
Signed-off-by: Ben Pfaff <blp at ovn.org>
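
The distinction between the two cases in the message above can be sketched
as follows. This is only an illustration, not the code from the commit:
'struct toy_schema' and toy_schema_changed() are hypothetical names, and
comparing a name and a checksum is an assumed stand-in for whatever
comparison ovsdb-server really performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a parsed schema.  The real ovsdb
 * schema structure holds much more than this. */
struct toy_schema {
    const char *name;
    const char *cksum;
};

/* Returns true only if 'incoming' differs from 'current' (case 1).
 * A snapshot that merely repeats the current schema (case 2) returns
 * false, so the caller can leave its client connections alone. */
bool
toy_schema_changed(const struct toy_schema *current,
                   const struct toy_schema *incoming)
{
    assert(incoming != NULL);
    if (!current) {
        return true;        /* Nothing loaded yet: treat as a change. */
    }
    return strcmp(current->name, incoming->name) != 0
           || strcmp(current->cksum, incoming->cksum) != 0;
}
```

A caller would disconnect clients only when toy_schema_changed() returns
true, which keeps connections intact after an install_snapshot.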


  Commit: 2833885f7ab565ce07f40de2ab8d415dc0390329
      https://github.com/openvswitch/ovs/commit/2833885f7ab565ce07f40de2ab8d415dc0390329
  Author: Han Zhou <hzhou at ovn.org>
  Date:   2020-03-06 (Fri, 06 Mar 2020)

  Changed paths:
    M ovsdb/raft.c
    M tests/ovsdb-cluster.at

  Log Message:
  -----------
  raft: Fix raft_is_connected() when there is no leader yet.

If the current server has never known a leader, its status should be
"disconnected" from the cluster. Without this patch, when a server in a
cluster is restarted, it appears as connected before it has successfully
reconnected to the cluster, which is wrong.

Signed-off-by: Han Zhou <hzhou at ovn.org>
Signed-off-by: Ben Pfaff <blp at ovn.org>


  Commit: bb66a0a6eb7971556504a294f5cf796d1d72db25
      https://github.com/openvswitch/ovs/commit/bb66a0a6eb7971556504a294f5cf796d1d72db25
  Author: Han Zhou <hzhou at ovn.org>
  Date:   2020-03-06 (Fri, 06 Mar 2020)

  Changed paths:
    M ovsdb/ovsdb.c
    M ovsdb/ovsdb.h
    M ovsdb/transaction.c
    M ovsdb/trigger.c

  Log Message:
  -----------
  raft: Avoid busy loop during leader election.

When a server doesn't see a leader yet, e.g. during leader re-election,
a transaction arriving from a client causes a busy loop at 100% CPU.
With debug logging enabled it looks like this:

2020-02-28T04:04:35.631Z|00059|poll_loop|DBG|wakeup due to 0-ms timeout at ../ovsdb/trigger.c:164
2020-02-28T04:04:35.631Z|00062|poll_loop|DBG|wakeup due to 0-ms timeout at ../ovsdb/trigger.c:164
2020-02-28T04:04:35.631Z|00065|poll_loop|DBG|wakeup due to 0-ms timeout at ../ovsdb/trigger.c:164
2020-02-28T04:04:35.631Z|00068|poll_loop|DBG|wakeup due to 0-ms timeout at ../ovsdb/trigger.c:164
2020-02-28T04:04:35.631Z|00071|poll_loop|DBG|wakeup due to 0-ms timeout at ../ovsdb/trigger.c:164
2020-02-28T04:04:35.631Z|00074|poll_loop|DBG|wakeup due to 0-ms timeout at ../ovsdb/trigger.c:164
2020-02-28T04:04:35.631Z|00077|poll_loop|DBG|wakeup due to 0-ms timeout at ../ovsdb/trigger.c:164
...

The problem is that in ovsdb_trigger_try(), all cluster errors are treated
as temporary errors and retried immediately. This patch fixes it by introducing
'run_triggers_now', which tells whether an immediate retry is needed. When the
cluster error has the detail 'not leader', we don't retry immediately but
wait for the next poll event to trigger the retry. When the 'not leader'
status changes, there must be an event, i.e. a raft RPC that changes the
status, so the trigger is guaranteed to run again, without a busy loop.

Signed-off-by: Han Zhou <hzhou at ovn.org>
Signed-off-by: Ben Pfaff <blp at ovn.org>
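
The retry decision described above can be sketched like this. Only the
name 'run_triggers_now' and the 'not leader' detail string come from the
message. 'struct toy_db' and both helper functions are hypothetical, and
the convention that a timeout of 0 means "wake immediately" while -1 means
"block until an event" is an assumption made for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, reduced database state. */
struct toy_db {
    bool run_triggers_now;  /* Retry triggers without waiting? */
};

/* Records a cluster error.  Every detail except "not leader" is treated
 * as temporary and asks for an immediate retry. */
void
toy_note_cluster_error(struct toy_db *db, const char *detail)
{
    assert(db != NULL && detail != NULL);
    db->run_triggers_now = strcmp(detail, "not leader") != 0;
}

/* Poll timeout in milliseconds: 0 wakes up immediately, -1 blocks until
 * some other event (e.g. a raft RPC) wakes the loop. */
int
toy_poll_timeout_ms(const struct toy_db *db)
{
    return db->run_triggers_now ? 0 : -1;
}
```

With 'not leader' the loop sleeps instead of spinning on a 0-ms timeout,
which is the behavior the log excerpt above shows going wrong.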


  Commit: b5e8810443a552b0adc5ff05b483c30de63f5ab9
      https://github.com/openvswitch/ovs/commit/b5e8810443a552b0adc5ff05b483c30de63f5ab9
  Author: Han Zhou <hzhou at ovn.org>
  Date:   2020-03-06 (Fri, 06 Mar 2020)

  Changed paths:
    M ovsdb/raft.c

  Log Message:
  -----------
  raft: Avoid sending unnecessary heartbeat when becoming leader.

When a node becomes leader, it sends out a heartbeat to all followers
and then immediately sends another append-request, for a no-op command
execution, to all followers. This results in two consecutive
append-requests being sent to each follower, and the first, the heartbeat
append-request, is unnecessary. This patch removes the heartbeat.

Signed-off-by: Han Zhou <hzhou at ovn.org>
Signed-off-by: Ben Pfaff <blp at ovn.org>
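
To make the saving concrete, the number of append-requests sent right
after an election can be counted as below. The function is a hypothetical
illustration, not code from ovsdb/raft.c.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Number of append-requests a new leader sends right after election: one
 * no-op append-request per follower, plus one heartbeat per follower if
 * 'send_heartbeat' is true (the behavior before this patch). */
unsigned int
toy_append_requests_on_election(unsigned int n_followers,
                                bool send_heartbeat)
{
    unsigned int per_follower = send_heartbeat ? 2 : 1;

    assert(n_followers <= UINT_MAX / 2);    /* Guard against overflow. */
    return n_followers * per_follower;
}
```

For a cluster with four followers this gives 8 messages with the
heartbeat and 4 without it.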


  Commit: 99c2dc8d04b3b697edfa02b06e127edad6ad5b28
      https://github.com/openvswitch/ovs/commit/99c2dc8d04b3b697edfa02b06e127edad6ad5b28
  Author: Han Zhou <hzhou at ovn.org>
  Date:   2020-03-06 (Fri, 06 Mar 2020)

  Changed paths:
    M ovsdb/raft.c

  Log Message:
  -----------
  raft: Send all missing logs in one single append_request.

When a follower needs to "catch up", the leader can send N entries in
a single append_request instead of only one entry per message.

The function raft_send_append_request() already supports this, so
this patch just calculates the correct "n" and uses it.

Signed-off-by: Han Zhou <hzhou at ovn.org>
Signed-off-by: Ben Pfaff <blp at ovn.org>
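
One plausible way to compute "n" is shown below. It assumes that the
leader's log ends just before index 'log_end' and that 'next_index' is the
first index the follower is missing. The function name is hypothetical and
the exact calculation in the patch may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Number of entries the follower is missing, assuming the leader's last
 * entry has index 'log_end' - 1 and the follower needs everything from
 * 'next_index' onward. */
uint64_t
toy_n_missing_entries(uint64_t next_index, uint64_t log_end)
{
    assert(next_index <= log_end);
    return log_end - next_index;
}
```

Under these assumptions a follower at next_index 5 with log_end 10 gets
all 5 entries in one append_request rather than in 5 separate messages.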


  Commit: 315e88cb4dd9c524ac111323f9d064678cf06a5e
      https://github.com/openvswitch/ovs/commit/315e88cb4dd9c524ac111323f9d064678cf06a5e
  Author: Han Zhou <hzhou at ovn.org>
  Date:   2020-03-06 (Fri, 06 Mar 2020)

  Changed paths:
    M ovsdb/raft.c

  Log Message:
  -----------
  raft: Fix next_index in install_snapshot reply handling.

When a leader handles an install_snapshot reply, the next_index for
the follower should be log_start instead of log_end, because new
entries can be added to the leader's log after it initiates the
install_snapshot procedure.  Also, it should send all the accumulated
entries to the follower in the following append-request message, instead
of sending 0 entries, to speed up convergence.

Without this fix, there is no functional problem, but it takes
unnecessary extra rounds of append-requests answered with "inconsistency"
by the follower, although the logs eventually converge.

Signed-off-by: Han Zhou <hzhou at ovn.org>
Signed-off-by: Ben Pfaff <blp at ovn.org>
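
The effect of choosing log_start can be sketched as follows. The sketch
assumes the snapshot covers everything before 'log_start' and that the
leader's log holds the indexes from 'log_start' up to, but not including,
'log_end'. The structures and the function are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical view of the leader's log: entries [log_start, log_end). */
struct toy_log {
    uint64_t log_start;
    uint64_t log_end;
};

/* Hypothetical per-follower state kept by the leader. */
struct toy_follower {
    uint64_t next_index;
};

/* Handles a successful install_snapshot reply: points the follower at
 * the first entry after the snapshot and returns how many accumulated
 * entries the following append-request should carry. */
uint64_t
toy_on_install_snapshot_reply(const struct toy_log *log,
                              struct toy_follower *f)
{
    assert(log->log_start <= log->log_end);
    f->next_index = log->log_start;
    return log->log_end - log->log_start;
}
```

With log_start 101 and log_end 104, the follower's next_index becomes 101
and the next append-request carries 3 entries. Starting from log_end
instead would skip entries 101 to 103, which the follower would report as
an inconsistency.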


Compare: https://github.com/openvswitch/ovs/compare/44810e6d411e...315e88cb4dd9

