[ovs-git] [openvswitch/ovs] efc59b: raft: Set threshold on backlog for raft connections.

Ilya Maximets noreply at github.com
Tue Nov 10 08:08:40 UTC 2020


  Branch: refs/heads/branch-2.13
  Home:   https://github.com/openvswitch/ovs
  Commit: efc59be2eadbf5b425db7476646d4c8a16c10b96
      https://github.com/openvswitch/ovs/commit/efc59be2eadbf5b425db7476646d4c8a16c10b96
  Author: Ilya Maximets <i.maximets at ovn.org>
  Date:   2020-11-10 (Tue, 10 Nov 2020)

  Changed paths:
    M NEWS
    M lib/jsonrpc.c
    M lib/jsonrpc.h
    M ovsdb/raft.c

  Log Message:
  -----------
  raft: Set threshold on backlog for raft connections.

RAFT messages can be fairly big.  If something abnormal happens to
one of the servers in a cluster, it may not be able to process all the
incoming messages in a timely manner.  This results in jsonrpc backlog
growth on the sender's side.  This can happen, for example, if a
follower gets many new clients at once that it needs to serve, or if it
decides to take a snapshot during a period with a high number of
database changes.
If the backlog grows large enough, it becomes harder and harder for the
follower to process incoming raft messages: it sends outdated replies
and starts receiving snapshots and the whole raft log from the leader.
Sometimes the backlog grows too high (about 60GB in this example):

      jsonrpc|INFO|excessive sending backlog, jsonrpc: ssl:<ip>,
                   num of msgs: 15370, backlog: 61731060773.

In this case the OS might actually decide to kill the sender to free
some memory.  In any case, it could take a lot of time for such a server
to catch up with the rest of the cluster if it has so much data to
receive and process.

This change introduces backlog thresholds for jsonrpc connections.
If the sending backlog exceeds particular values (500 messages or
4GB in size), the connection is dropped and re-created.  This
drops all of the current backlog and starts over, increasing the
chances of cluster recovery.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1888829
Acked-by: Dumitru Ceara <dceara at redhat.com>
Signed-off-by: Ilya Maximets <i.maximets at ovn.org>
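
The threshold check described in the log message above can be sketched
roughly as follows.  This is a minimal illustration, not the code from
lib/jsonrpc.c or ovsdb/raft.c: the struct, the function names and the
macros are hypothetical, and reading "4GB" as 4 * 2^30 bytes is an
assumption.  Resetting the counters stands in for dropping and
re-creating the connection.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limits mirroring the defaults named in the log message.
 * Treating "4GB" as 4 * 2^30 bytes is an assumption. */
#define MAX_BACKLOG_MSGS  500
#define MAX_BACKLOG_BYTES (UINT64_C(4) * 1024 * 1024 * 1024)

/* Hypothetical per-connection accounting of the send queue. */
struct conn_backlog {
    size_t n_msgs;      /* Messages queued but not yet sent. */
    uint64_t n_bytes;   /* Total size of the queued messages. */
};

/* Returns true if either limit is exceeded. */
static bool
backlog_exceeds_threshold(const struct conn_backlog *b)
{
    return b->n_msgs > MAX_BACKLOG_MSGS || b->n_bytes > MAX_BACKLOG_BYTES;
}

/* Accounts for one newly queued message of 'msg_bytes' bytes.  If a
 * limit is exceeded, discards the whole backlog (standing in for a
 * reconnect) and returns true; otherwise returns false. */
static bool
backlog_enqueue(struct conn_backlog *b, uint64_t msg_bytes)
{
    b->n_msgs++;
    b->n_bytes += msg_bytes;
    if (backlog_exceeds_threshold(b)) {
        b->n_msgs = 0;
        b->n_bytes = 0;
        return true;
    }
    return false;
}
```

With these assumed limits, the backlog from the log excerpt (15370
messages, 61731060773 bytes) would exceed both thresholds.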


  Commit: 1440833435c59e0ee6b9d7ea4e92accd6a0c4b65
      https://github.com/openvswitch/ovs/commit/1440833435c59e0ee6b9d7ea4e92accd6a0c4b65
  Author: Ilya Maximets <i.maximets at ovn.org>
  Date:   2020-11-10 (Tue, 10 Nov 2020)

  Changed paths:
    M NEWS
    M ovsdb/ovsdb-server.1.in
    M ovsdb/raft.c

  Log Message:
  -----------
  raft: Make backlog thresholds configurable.

New appctl 'cluster/set-backlog-threshold' to configure thresholds
on the backlog of raft jsonrpc connections.  It could be used, for
example, in some extreme conditions where the size of a database is
expected to be very large, i.e. comparable with the default 4GB
threshold.

Acked-by: Dumitru Ceara <dceara at redhat.com>
Signed-off-by: Ilya Maximets <i.maximets at ovn.org>
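
As a rough illustration of making such limits configurable, the sketch
below parses two decimal string arguments into a pair of limits and
rejects malformed input without touching the current values.  It is not
the handler from ovsdb/raft.c: the struct, the function names and the
argument order (message limit first, byte limit second) are assumptions
made for this example.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical container for the two configurable limits. */
struct backlog_limits {
    uint64_t max_msgs;
    uint64_t max_bytes;
};

/* Parses 's' as an unsigned decimal number.  Rejects empty strings,
 * signs, leading whitespace, trailing garbage and out-of-range values. */
static bool
parse_u64(const char *s, uint64_t *out)
{
    char *end;
    unsigned long long v;

    if (!s || *s < '0' || *s > '9') {
        return false;
    }
    errno = 0;
    v = strtoull(s, &end, 10);
    if (errno || *end != '\0') {
        return false;
    }
    *out = v;
    return true;
}

/* Updates 'l' from two string arguments.  On any parse error leaves
 * 'l' unchanged and returns false. */
static bool
set_backlog_limits(struct backlog_limits *l,
                   const char *n_msgs, const char *n_bytes)
{
    uint64_t msgs, bytes;

    if (!parse_u64(n_msgs, &msgs) || !parse_u64(n_bytes, &bytes)) {
        return false;
    }
    l->max_msgs = msgs;
    l->max_bytes = bytes;
    return true;
}
```

Validating both arguments before assigning either keeps the limits
consistent when only one of the two values is malformed.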


  Commit: 3e581f43a9895ef99397389d671466eca0350b91
      https://github.com/openvswitch/ovs/commit/3e581f43a9895ef99397389d671466eca0350b91
  Author: Yi-Hung Wei <yihung.wei at gmail.com>
  Date:   2020-11-10 (Tue, 10 Nov 2020)

  Changed paths:
    M ovsdb/log.c

  Log Message:
  -----------
  ovsdb: Remove read permission of *.db from others.

Currently, when an ovsdb *.db file is created by ovsdb-tool, read
permission is granted to others.  This may raise security concerns; for
example, IPsec pre-shared keys are stored in ovs-vswitchd.conf.db.
This patch addresses the concerns by removing permission for others.

Reported-by: Antonin Bas <abas at vmware.com>
Acked-by: Mark Gray <mark.d.gray at redhat.com>
Signed-off-by: Yi-Hung Wei <yihung.wei at gmail.com>
Signed-off-by: Ilya Maximets <i.maximets at ovn.org>
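
One common way to get this effect on POSIX systems is to pass a creation
mode without any "others" bits to open().  The sketch below shows that
idea only; the helper names are hypothetical and the mode 0660 is an
assumption, not a quote from the actual ovsdb/log.c change.  Since the
process umask can only clear permission bits, a file created this way
never ends up readable by others.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Creates (or truncates) 'path' with read/write permission for the
 * owner and group only.  The mode is applied only when the file is
 * newly created; an existing file keeps its current permissions.
 * Returns a file descriptor, or -1 on error. */
static int
create_private_db(const char *path)
{
    return open(path, O_RDWR | O_CREAT | O_TRUNC, 0660);
}

/* Returns 1 if the file behind 'fd' grants no permissions to others,
 * 0 if it grants any, and -1 if fstat() fails. */
static int
fd_hidden_from_others(int fd)
{
    struct stat s;

    if (fstat(fd, &s)) {
        return -1;
    }
    return (s.st_mode & S_IRWXO) == 0;
}
```

Database files that already exist are not affected by a creation mode;
tightening those would need a separate chmod.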


  Commit: d43d10e10c837a0f2523d05c804b056ccf2d526a
      https://github.com/openvswitch/ovs/commit/d43d10e10c837a0f2523d05c804b056ccf2d526a
  Author: Eli Britstein <elibr at nvidia.com>
  Date:   2020-11-10 (Tue, 10 Nov 2020)

  Changed paths:
    M lib/netdev-offload-dpdk.c

  Log Message:
  -----------
  netdev-offload-dpdk: Preserve HW statistics for modified flows.

When a flow is modified, carry over the HW statistics of the old HW
flow to the new one.

Fixes: 3c7330ebf036 ("netdev-offload-dpdk: Support offload of output action.")
Signed-off-by: Eli Britstein <elibr at nvidia.com>
Reviewed-by: Gaetan Rivet <gaetanr at nvidia.com>
Acked-by: Sriharsha Basavapatna <sriharsha.basavapatna at broadcom.com>
Tested-by: Emma Finn <emma.finn at intel.com>
Signed-off-by: Ilya Maximets <i.maximets at ovn.org>
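
The idea of carrying counters across a modification can be sketched as
below.  This is a simplified illustration with hypothetical types and
names; it does not use the rte_flow API and does not reproduce the code
in lib/netdev-offload-dpdk.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical counters read back from hardware for one flow. */
struct hw_flow_stats {
    uint64_t n_packets;
    uint64_t n_bytes;
};

/* Hypothetical handle for an offloaded flow. */
struct hw_flow {
    int rule_id;
    struct hw_flow_stats stats;
};

/* Builds the replacement flow for a modification.  If 'old' is not
 * NULL, its counters are copied so that the totals reported for the
 * flow do not restart from zero; a brand-new flow starts at zero. */
static struct hw_flow
hw_flow_modify(const struct hw_flow *old, int new_rule_id)
{
    struct hw_flow new_flow = { new_rule_id, { 0, 0 } };

    if (old) {
        new_flow.stats = old->stats;
    }
    return new_flow;
}
```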


Compare: https://github.com/openvswitch/ovs/compare/dc5b4e8f694d...d43d10e10c83

