[ovs-git] [openvswitch/ovs] 16e3a8: ovsdb-server: Fix schema leak while reading db.
Mark Michelson
noreply at github.com
Thu May 28 16:57:09 UTC 2020
Branch: refs/heads/master
Home: https://github.com/openvswitch/ovs
Commit: 16e3a80cf646f6c53d22ef98599d5aecb8310414
https://github.com/openvswitch/ovs/commit/16e3a80cf646f6c53d22ef98599d5aecb8310414
Author: Ilya Maximets <i.maximets at ovn.org>
Date: 2020-05-28 (Thu, 28 May 2020)
Changed paths:
M ovsdb/ovsdb-server.c
Log Message:
-----------
ovsdb-server: Fix schema leak while reading db.
parse_txn() function doesn't always take ownership of the 'schema'
passed. So, if the schema of the clustered db has same version as the
one that already in use, parse_txn() will not use it, resulting with a
memory leak:
7,827 (56 direct, 7,771 indirect) bytes in 1 blocks are definitely lost
at 0x483BB1A: calloc (vg_replace_malloc.c:762)
by 0x44AD02: xcalloc (util.c:121)
by 0x40E70E: ovsdb_schema_create (ovsdb.c:41)
by 0x40EA6D: ovsdb_schema_from_json (ovsdb.c:217)
by 0x415EDD: ovsdb_storage_read (storage.c:280)
by 0x408968: read_db (ovsdb-server.c:607)
by 0x40733D: main_loop (ovsdb-server.c:227)
by 0x40733D: main (ovsdb-server.c:469)
While we could put ovsdb_schema_destroy() in a few places inside
'parse_txn()', from the users' point of view it seems better to have a
constant argument and just clone the 'schema' if needed. The caller
will be responsible for destroying the 'schema' it owns.
Fixes: 1b1d2e6daa56 ("ovsdb: Introduce experimental support for clustered databases.")
Acked-by: Han Zhou <hzhou at ovn.org>
Signed-off-by: Ilya Maximets <i.maximets at ovn.org>
Commit: 8c2c503bdb0da1ce6044a53d462f905fd4f8acf5
https://github.com/openvswitch/ovs/commit/8c2c503bdb0da1ce6044a53d462f905fd4f8acf5
Author: Ilya Maximets <i.maximets at ovn.org>
Date: 2020-05-28 (Thu, 28 May 2020)
Changed paths:
M ovsdb/raft-private.c
M ovsdb/raft-private.h
M ovsdb/raft.c
Log Message:
-----------
raft: Avoid sending equal snapshots.
Snapshots are huge. In some cases we could receive several outdated
append replies from the remote server. This could happen in high
scale cases if the remote server is overloaded and not able to process
all the raft requests in time. As an action to each outdated append
reply we're sending full database snapshot. While remote server is
already overloaded those snapshots will stuck in jsonrpc backlog for
a long time making it grow up to few GB. Since remote server wasn't
able to timely process incoming messages it will likely not able to
process snapshots leading to the same situation with low chances to
recover. Remote server will likely stuck in 'candidate' state, other
servers will grow their memory consumption due to growing jsonrpc
backlogs:
jsonrpc|INFO|excessive sending backlog, jsonrpc: ssl:192.16.0.3:6644,
num of msgs: 3795, backlog: 8838994624.
This patch is trying to avoid that situation by avoiding sending of
equal snapshot install requests. This helps maintain reasonable memory
consumption and allows the cluster to recover on a larger scale.
Acked-by: Han Zhou <hzhou at ovn.org>
Signed-off-by: Ilya Maximets <i.maximets at ovn.org>
Commit: a6117059904bb692039c926221964dd6d49b3bfd
https://github.com/openvswitch/ovs/commit/a6117059904bb692039c926221964dd6d49b3bfd
Author: Eiichi Tsukata <eiichi.tsukata at nutanix.com>
Date: 2020-05-28 (Thu, 28 May 2020)
Changed paths:
M lib/classifier.c
M lib/classifier.h
M tests/test-classifier.c
Log Message:
-----------
classifier: Prevent tries vs n_tries race leading to NULL dereference.
Currently classifier tries and n_tries can be updated not atomically,
there is a race condition which can lead to NULL dereference.
The race can happen when main thread updates a classifier tries and
n_tries in classifier_set_prefix_fields() and at the same time revalidator
or handler thread try to lookup them in classifier_lookup__(). Such race
can be triggered when user changes prefixes of flow_table.
Race(user changes flow_table prefixes: ip_dst,ip_src => none):
[main thread] [revalidator/handler thread]
===========================================================
/* cls->n_tries == 2 */
for (int i = 0; i < cls->n_tries; i++) {
trie_init(cls, i, NULL);
/* n_tries == 0 */
cls->n_tries = n_tries;
/* cls->tries[i]->feild is NULL */
trie_ctx_init(&trie_ctx[i],&cls->tries[i]);
/* trie->field is NULL */
ctx->be32ofs = trie->field->flow_be32ofs;
To prevent the race, instead of re-introducing internal mutex
implemented in the commit fccd7c092e09 ("classifier: Remove internal
mutex."), this patch makes trie field RCU protected and checks it after
read.
Fixes: fccd7c092e09 ("classifier: Remove internal mutex.")
Signed-off-by: Eiichi Tsukata <eiichi.tsukata at nutanix.com>
Signed-off-by: Ilya Maximets <i.maximets at ovn.org>
Commit: 21ad0088bedbd9b4070f4c5e3e6b29c7d9349beb
https://github.com/openvswitch/ovs/commit/21ad0088bedbd9b4070f4c5e3e6b29c7d9349beb
Author: Ilya Maximets <i.maximets at ovn.org>
Date: 2020-05-28 (Thu, 28 May 2020)
Changed paths:
M AUTHORS.rst
Log Message:
-----------
AUTHORS: Add Eiichi Tsukata.
Signed-off-by: Ilya Maximets <i.maximets at ovn.org>
Commit: 9ba57fc7cccca85a753bc3d5c12271defb5619c1
https://github.com/openvswitch/ovs/commit/9ba57fc7cccca85a753bc3d5c12271defb5619c1
Author: Han Zhou <hzhou at ovn.org>
Date: 2020-05-28 (Thu, 28 May 2020)
Changed paths:
M acinclude.m4
M datapath/datapath.c
M datapath/datapath.h
M datapath/linux/compat/include/linux/skbuff.h
Log Message:
-----------
datapath: Add hash info to upcall.
This patch backports below upstream patches, and add __skb_set_hash
to compat for older kernels.
commit b5ab1f1be6180a2e975eede18731804b5164a05d
Author: Jakub Kicinski <kuba at kernel.org>
Date: Mon Mar 2 21:05:18 2020 -0800
openvswitch: add missing attribute validation for hash
Add missing attribute validation for OVS_PACKET_ATTR_HASH
to the netlink policy.
Fixes: bd1903b7c459 ("net: openvswitch: add hash info to upcall")
Signed-off-by: Jakub Kicinski <kuba at kernel.org>
Reviewed-by: Greg Rose <gvrose8192 at gmail.com>
Signed-off-by: David S. Miller <davem at davemloft.net>
commit bd1903b7c4596ba6f7677d0dfefd05ba5876707d
Author: Tonghao Zhang <xiangxia.m.yue at gmail.com>
Date: Wed Nov 13 23:04:49 2019 +0800
net: openvswitch: add hash info to upcall
When using the kernel datapath, the upcall don't
include skb hash info relatived. That will introduce
some problem, because the hash of skb is important
in kernel stack. For example, VXLAN module uses
it to select UDP src port. The tx queue selection
may also use the hash in stack.
Hash is computed in different ways. Hash is random
for a TCP socket, and hash may be computed in hardware,
or software stack. Recalculation hash is not easy.
Hash of TCP socket is computed:
tcp_v4_connect
-> sk_set_txhash (is random)
__tcp_transmit_skb
-> skb_set_hash_from_sk
There will be one upcall, without information of skb
hash, to ovs-vswitchd, for the first packet of a TCP
session. The rest packets will be processed in Open vSwitch
modules, hash kept. If this tcp session is forward to
VXLAN module, then the UDP src port of first tcp packet
is different from rest packets.
TCP packets may come from the host or dockers, to Open vSwitch.
To fix it, we store the hash info to upcall, and restore hash
when packets sent back.
+---------------+ +-------------------------+
| Docker/VMs | | ovs-vswitchd |
+----+----------+ +-+--------------------+--+
| ^ |
| | |
| | upcall v restore packet hash
(not recalculate)
| +-+--------------------+--+
| tap netdev | | vxlan module
+---------------> +--> Open vSwitch ko +-->
or internal type | |
+-------------------------+
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-October/364062.html
Signed-off-by: Tonghao Zhang <xiangxia.m.yue at gmail.com>
Acked-by: Pravin B Shelar <pshelar at ovn.org>
Signed-off-by: David S. Miller <davem at davemloft.net>
Tested-by: Aliasgar Ginwala <aginwala at ebay.com>
Acked-by: Tonghao Zhang <xiangxia.m.yue at gmail.com>
Signed-off-by: Han Zhou <hzhou at ovn.org>
Signed-off-by: Ilya Maximets <i.maximets at ovn.org>
Commit: 89b522aee379f7ebd21ec67ffb622118af7e9db1
https://github.com/openvswitch/ovs/commit/89b522aee379f7ebd21ec67ffb622118af7e9db1
Author: Mark Michelson <mmichels at redhat.com>
Date: 2020-05-28 (Thu, 28 May 2020)
Changed paths:
M lib/ovsdb-idl.c
M lib/ovsdb-idl.h
Log Message:
-----------
ovsdb-idl: Add function to reset min_index.
If an administrator removes all of the databases in a cluster from
disk, then ovsdb IDL clients will have a problem. The databases will all
reset their stored indexes to 0, so The IDL client's min_index will be
higher than the indexes of all databases in the cluster. This results in
the client constantly connecting to databases, detecting the data as
"stale", and then attempting to connect to another.
This function provides a way to reset the IDL to an initial state with
min_index of 0. This way, the client will not wrongly detect the
database data as stale and will recover properly.
Notice that this function is not actually used anywhere in this patch.
This will be used by OVN, though, since OVN is the primary user of
clustered OVSDB.
Signed-off-by: Mark Michelson <mmichels at redhat.com>
Acked-by: Han Zhou <hzhou at ovn.org>
Signed-off-by: Ilya Maximets <i.maximets at ovn.org>
Compare: https://github.com/openvswitch/ovs/compare/ede446678f5d...89b522aee379
More information about the git
mailing list