[ovs-git] [openvswitch/ovs] e78d6f: raft: Reintroduce jsonrpc inactivity probes.

William Tu noreply at github.com
Mon Mar 1 20:18:15 UTC 2021


  Branch: refs/heads/branch-2.13
  Home:   https://github.com/openvswitch/ovs
  Commit: e78d6ffba7cae6d4f356234c42943b6cb228f2e9
      https://github.com/openvswitch/ovs/commit/e78d6ffba7cae6d4f356234c42943b6cb228f2e9
  Author: Ilya Maximets <i.maximets at ovn.org>
  Date:   2021-03-01 (Mon, 01 Mar 2021)

  Changed paths:
    M ovsdb/raft.c

  Log Message:
  -----------
  raft: Reintroduce jsonrpc inactivity probes.

It's not enough to just have heartbeats.

RAFT heartbeats are unidirectional, i.e. the leader sends them to
followers but not the other way around.  Missing heartbeats provoke
followers to start an election, but if the leader does not receive any
replies it will not do anything while there is a quorum, i.e. while
there are enough other servers to make decisions.

As a result, while the TCP connection is established, the leader
continues to blindly send messages to it.  In our case this leads to a
growing send backlog.  The connection will eventually be terminated
due to excessive send backlog, but this might take a lot of time and
waste process memory.  At the same time, the 'candidate' will continue
to send vote requests to the dead connection on its side.

To fix that we need to reintroduce inactivity probes that will drop
the connection if there was no incoming traffic for a long time and
the remote server doesn't reply to the "echo" request.  The probe
interval can be chosen based on the election timeout to avoid the
issues described in commit db5a066c17bd.
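
The probe behaviour described above can be illustrated with a minimal
sketch.  This is not the code from ovsdb/raft.c; the names ProbeState,
on_receive and tick are hypothetical, and the sketch simply assumes
that the caller derives probe_interval_ms from the election timeout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProbeState:
    """Hypothetical per-connection state for an inactivity probe."""
    probe_interval_ms: int               # assumed to come from the election timeout
    last_rx_ms: int = 0                  # time of the last incoming message
    probe_sent_ms: Optional[int] = None  # time the pending "echo" was sent, if any


def on_receive(state: ProbeState, now_ms: int) -> None:
    """Any incoming traffic (including an echo reply) resets the probe."""
    state.last_rx_ms = now_ms
    state.probe_sent_ms = None


def tick(state: ProbeState, now_ms: int) -> str:
    """Return the action to take: 'idle', 'send_echo', 'waiting' or 'drop'."""
    if state.probe_sent_ms is None:
        # No probe outstanding: send one after a full interval of silence.
        if now_ms - state.last_rx_ms >= state.probe_interval_ms:
            state.probe_sent_ms = now_ms
            return "send_echo"
        return "idle"
    # Probe outstanding: drop the connection if it stays unanswered.
    if now_ms - state.probe_sent_ms >= state.probe_interval_ms:
        return "drop"
    return "waiting"
```

With this model a silent peer is detected after at most two probe
intervals, without waiting for the send backlog to overflow.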

Reported-by: Carlos Goncalves <cgoncalves at redhat.com>
Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1929690
Fixes: db5a066c17bd ("raft: Disable RAFT jsonrpc inactivity probe.")
Acked-by: Han Zhou <hzhou at ovn.org>
Signed-off-by: Ilya Maximets <i.maximets at ovn.org>


  Commit: 1dee0a67e591847f97979c9fef36a23bf66f1596
      https://github.com/openvswitch/ovs/commit/1dee0a67e591847f97979c9fef36a23bf66f1596
  Author: Ilya Maximets <i.maximets at ovn.org>
  Date:   2021-03-01 (Mon, 01 Mar 2021)

  Changed paths:
    M ovsdb/raft.c

  Log Message:
  -----------
  raft: Report disconnected in cluster/status if candidate retries election.

If an election times out for a server in the 'candidate' role, it sets
the 'candidate_retrying' flag, which signals that the storage is
disconnected and the client should re-connect.  However, the
cluster/status command still reports 'Status: cluster member', which is
misleading.  Report "disconnected from the cluster (election timeout)"
instead.
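
The intended reporting can be sketched as follows.  This is only an
illustration: status_line is a hypothetical helper, the "Status: "
prefix on the new message is an assumption, and the other states that
cluster/status may report are left out.

```python
def status_line(candidate_retrying: bool) -> str:
    """Hypothetical helper mirroring the status text described above."""
    if candidate_retrying:
        # The election timed out while in the 'candidate' role: clients
        # should treat the storage as disconnected and re-connect.
        return "Status: disconnected from the cluster (election timeout)"
    return "Status: cluster member"
```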

Reported-by: Carlos Goncalves <cgoncalves at redhat.com>
Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1929690
Fixes: 1b1d2e6daa56 ("ovsdb: Introduce experimental support for clustered databases.")
Acked-by: Han Zhou <hzhou at ovn.org>
Signed-off-by: Ilya Maximets <i.maximets at ovn.org>


  Commit: 732e17d6b64797972153e7e9d861facb3d03c539
      https://github.com/openvswitch/ovs/commit/732e17d6b64797972153e7e9d861facb3d03c539
  Author: William Tu <u9012063 at gmail.com>
  Date:   2021-03-01 (Mon, 01 Mar 2021)

  Changed paths:
    M Documentation/topics/dpdk/qos.rst
    M vswitchd/vswitch.xml

  Log Message:
  -----------
  Documentation: Fix DPDK qos example.

Fix the example use case based on the description.
EIR and CIR are measured in bytes/sec, and the example considers
64-byte packets counted by IP packet size, i.e. without the 14-byte
Ethernet header.
So fix the 1000pps example by: (64 - 14) * 1000 = 50,000
If the frame includes the 4-byte FCS, then it's
(64 - 14 - 4) * 1000 = 46,000
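
The arithmetic above can be checked with a small sketch.  The function
name and parameters are hypothetical and only restate the calculation
from this message; they are not part of any OVS or DPDK API.

```python
def policer_rate_bytes_per_sec(pps: int,
                               frame_bytes: int = 64,
                               eth_header_bytes: int = 14,
                               fcs_bytes: int = 0) -> int:
    """Bytes/sec for a given packet rate, counting IP packet size only.

    The Ethernet header (and the FCS, if the frame size includes it)
    is subtracted from the frame size before multiplying by the rate.
    """
    return (frame_bytes - eth_header_bytes - fcs_bytes) * pps


print(policer_rate_bytes_per_sec(1000))               # 50000
print(policer_rate_bytes_per_sec(1000, fcs_bytes=4))  # 46000
```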

Fixes: e61bdffc2a98 ("netdev-dpdk: Add new DPDK RFC 4115 egress policer")
Signed-off-by: William Tu <u9012063 at gmail.com>
Acked-by: Eelco Chaudron <echaudro at redhat.com>
Signed-off-by: Ilya Maximets <i.maximets at ovn.org>


Compare: https://github.com/openvswitch/ovs/compare/aea5fefdacf0...732e17d6b647

