[ovs-git] [openvswitch/ovs] bc8727: raft: Set threshold on backlog for raft connections.

Ilya Maximets noreply at github.com
Tue Nov 10 08:08:35 UTC 2020


  Branch: refs/heads/branch-2.14
  Home:   https://github.com/openvswitch/ovs
  Commit: bc87273c18cf97b90f1fdc983f960640d956e8d6
      https://github.com/openvswitch/ovs/commit/bc87273c18cf97b90f1fdc983f960640d956e8d6
  Author: Ilya Maximets <i.maximets at ovn.org>
  Date:   2020-11-10 (Tue, 10 Nov 2020)

  Changed paths:
    M NEWS
    M lib/jsonrpc.c
    M lib/jsonrpc.h
    M ovsdb/raft.c

  Log Message:
  -----------
  raft: Set threshold on backlog for raft connections.

RAFT messages can be fairly big.  If something abnormal happens to
one of the servers in a cluster, it may not be able to process all the
incoming messages in a timely manner.  This results in jsonrpc backlog
growth on the sender's side.  This can happen, for example, if a
follower gets many new clients at once that it needs to serve, or if it
decides to take a snapshot during a period with a high number of
database changes.
If the backlog grows large enough, it becomes harder and harder for the
follower to process incoming raft messages: it sends outdated replies
and starts receiving snapshots and the whole raft log from the leader.
Sometimes the backlog grows too high (60GB in this example):

      jsonrpc|INFO|excessive sending backlog, jsonrpc: ssl:<ip>,
                   num of msgs: 15370, backlog: 61731060773.

In this case the OS might actually decide to kill the sender to free
some memory.  In any case, it could take a lot of time for such a server
to catch up with the rest of the cluster if it has so much data to
receive and process.

Introduce backlog thresholds for jsonrpc connections.
If the sending backlog exceeds particular values (500 messages or
4GB in size), the connection will be dropped and re-created.  This
allows dropping all of the current backlog and starting over, which
increases the chances of cluster recovery.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1888829
Acked-by: Dumitru Ceara <dceara at redhat.com>
Signed-off-by: Ilya Maximets <i.maximets at ovn.org>
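
The threshold rule described in the log message above can be sketched
roughly as follows.  This is only an illustration, not the code from
lib/jsonrpc.c or ovsdb/raft.c: all struct, macro and function names
here are hypothetical, "4GB" is assumed to mean 4 GiB, and a limit of
zero is assumed to mean "no limit".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration; not the OVS jsonrpc API. */
struct backlog_limits {
    size_t max_msgs;     /* Assumption: 0 disables the message limit. */
    uint64_t max_bytes;  /* Assumption: 0 disables the byte limit. */
};

struct conn_backlog {
    size_t n_msgs;       /* Messages queued but not yet sent. */
    uint64_t n_bytes;    /* Total size of the queued messages. */
};

/* Defaults named in the log message; 4GB is assumed to be 4 GiB. */
#define DEFAULT_MAX_MSGS 500
#define DEFAULT_MAX_BYTES (UINT64_C(4) << 30)

/* Returns true if either limit is strictly exceeded. */
static bool
backlog_exceeds(const struct conn_backlog *b, const struct backlog_limits *l)
{
    return (l->max_msgs && b->n_msgs > l->max_msgs)
           || (l->max_bytes && b->n_bytes > l->max_bytes);
}

/* Accounts for one more queued message.  If a limit is exceeded, the
 * whole backlog is discarded, which models dropping and re-creating the
 * connection, and true is returned so the caller can reconnect. */
static bool
backlog_enqueue(struct conn_backlog *b, const struct backlog_limits *l,
                uint64_t msg_bytes)
{
    b->n_msgs++;
    b->n_bytes += msg_bytes;
    if (backlog_exceeds(b, l)) {
        b->n_msgs = 0;
        b->n_bytes = 0;
        return true;
    }
    return false;
}
```

With the defaults above, the backlog from the quoted log line (15370
messages, 61731060773 bytes) is over both limits, so under this sketch
the connection would have been reset long before reaching that size.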


  Commit: 346e1a22eb9ce1f85ade449f89d61b8110e7695a
      https://github.com/openvswitch/ovs/commit/346e1a22eb9ce1f85ade449f89d61b8110e7695a
  Author: Ilya Maximets <i.maximets at ovn.org>
  Date:   2020-11-10 (Tue, 10 Nov 2020)

  Changed paths:
    M NEWS
    M ovsdb/ovsdb-server.1.in
    M ovsdb/raft.c

  Log Message:
  -----------
  raft: Make backlog thresholds configurable.

Add a new appctl command, 'cluster/set-backlog-threshold', to configure
thresholds on the backlog of raft jsonrpc connections.  It could be
used, for example, in some extreme conditions where the size of a
database is expected to be very large, i.e. comparable with the default
4GB threshold.

Acked-by: Dumitru Ceara <dceara at redhat.com>
Signed-off-by: Ilya Maximets <i.maximets at ovn.org>


  Commit: 474ee09f2527f0502d28a9d96cb41b61070c5f33
      https://github.com/openvswitch/ovs/commit/474ee09f2527f0502d28a9d96cb41b61070c5f33
  Author: Yi-Hung Wei <yihung.wei at gmail.com>
  Date:   2020-11-10 (Tue, 10 Nov 2020)

  Changed paths:
    M ovsdb/log.c

  Log Message:
  -----------
  ovsdb: Remove read permission of *.db from others.

Currently, when an ovsdb *.db file is created by ovsdb-tool, it grants
read permission to others.  This may raise security concerns; for
example, IPsec pre-shared keys are stored in ovs-vswitchd.conf.db.
This patch addresses the concerns by removing permission for others.

Reported-by: Antonin Bas <abas at vmware.com>
Acked-by: Mark Gray <mark.d.gray at redhat.com>
Signed-off-by: Yi-Hung Wei <yihung.wei at gmail.com>
Signed-off-by: Ilya Maximets <i.maximets at ovn.org>
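
As an illustration of the idea in the log message above, a file can be
created without any permission bits for others by passing a mode such
as 0660 to open().  This is a minimal sketch using plain POSIX calls;
the helper names are hypothetical and the exact mode used in
ovsdb/log.c is an assumption here, not taken from the patch.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: creates a new file that is readable and writable
 * by the owner and group only.  The process umask can remove further
 * bits but can never add access for others.  Returns a file descriptor,
 * or -1 on error. */
static int
create_db_file(const char *path)
{
    return open(path, O_RDWR | O_CREAT | O_EXCL, 0660);
}

/* Returns 1 if others have any access to 'path', 0 if they have none,
 * and -1 if the file cannot be examined. */
static int
others_can_access(const char *path)
{
    struct stat s;

    if (stat(path, &s)) {
        return -1;
    }
    return (s.st_mode & S_IRWXO) != 0;
}
```

Calling others_can_access() on a file made by create_db_file() is
expected to return 0, whereas a file created with mode 0666 under a
typical umask of 022 would still be readable by others.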


  Commit: 869fd7162b6d76e270aa658cbf91ca3f9f98241c
      https://github.com/openvswitch/ovs/commit/869fd7162b6d76e270aa658cbf91ca3f9f98241c
  Author: Eli Britstein <elibr at nvidia.com>
  Date:   2020-11-10 (Tue, 10 Nov 2020)

  Changed paths:
    M lib/netdev-offload-dpdk.c

  Log Message:
  -----------
  netdev-offload-dpdk: Preserve HW statistics for modified flows.

When a flow is modified, carry the HW statistics of the old HW flow
over to the new one.

Fixes: 3c7330ebf036 ("netdev-offload-dpdk: Support offload of output action.")
Signed-off-by: Eli Britstein <elibr at nvidia.com>
Reviewed-by: Gaetan Rivet <gaetanr at nvidia.com>
Acked-by: Sriharsha Basavapatna <sriharsha.basavapatna at broadcom.com>
Tested-by: Emma Finn <emma.finn at intel.com>
Signed-off-by: Ilya Maximets <i.maximets at ovn.org>
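
The behavior described in the log message above can be pictured with a
small sketch.  It assumes that modifying an offloaded flow installs a
fresh HW flow whose counters start at zero; the structs and functions
below are hypothetical and are not the ones in
lib/netdev-offload-dpdk.c.

```c
#include <stdint.h>

/* Hypothetical types for illustration only. */
struct hw_flow_stats {
    uint64_t n_packets;
    uint64_t n_bytes;
};

struct hw_flow {
    int rule_id;                 /* Stand-in for the HW flow handle. */
    struct hw_flow_stats stats;  /* Counters accumulated so far. */
};

/* Installs a new HW flow; its counters start from zero. */
static struct hw_flow
hw_flow_install(int rule_id)
{
    struct hw_flow f = { rule_id, { 0, 0 } };
    return f;
}

/* Replaces 'flow' with a newly installed one, then copies the old
 * counters over so that the modification does not reset them. */
static void
hw_flow_modify(struct hw_flow *flow, int new_rule_id)
{
    struct hw_flow_stats old = flow->stats;

    *flow = hw_flow_install(new_rule_id);
    flow->stats = old;
}
```

Without the final copy in hw_flow_modify(), every modification in this
sketch would make the flow's packet and byte counts appear to restart
from zero.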


Compare: https://github.com/openvswitch/ovs/compare/c3f0673d05c0...869fd7162b6d

