[ovs-dev] ovn northbound ovsdb-server’s memory usage problem

Lei Huang huang.f.lei at gmail.com
Mon Mar 7 07:02:33 UTC 2016


Hi,


During a scalability test, we found that the OVN northbound ovsdb-server's
memory usage becomes very high while creating and binding ports. The test
steps are:

1. Create 1000 sandboxes.

2. On the northbound database, create 5 lswitches and 200 lports on each
lswitch.

3. Bind each lport created in step 2 to a randomly chosen sandbox, then use
‘ovn-nbctl wait-until Logical_Port <port-name> up=true’ to wait until the
port is ‘up’ in the northbound database.

4. Go to step 2. (A rough shell sketch of one iteration is given below.)
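
For concreteness, one iteration of steps 2 and 3 looks roughly like the
shell sketch below. This is not the exact test code: pick_random_sandbox
and in_sandbox stand in for the test harness's helpers, and
lswitch-add/lport-add are the ovn-nbctl command names in the version under
test.

# Sketch of one iteration; "$i" is the iteration number.
for sw in $(seq 1 5); do
    lsw="lswitch-${i}-${sw}"
    ovn-nbctl lswitch-add "$lsw"
    for p in $(seq 1 200); do
        lp="${lsw}-port-${p}"
        ovn-nbctl lport-add "$lsw" "$lp"
        # Bind the lport to a random sandbox by setting iface-id on an
        # interface there; that sandbox's ovn-controller then claims the
        # port and 'up' eventually becomes true in the northbound database.
        sb=$(pick_random_sandbox)      # hypothetical helper
        in_sandbox "$sb" ovs-vsctl add-port br-int "$lp" -- \
            set Interface "$lp" external_ids:iface-id="$lp"
        ovn-nbctl wait-until Logical_Port "$lp" up=true
    done
done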


The VmRSS memory usage (in kB) is as follows:


lport number   ovn-northd   nb ovsdb-server   sb ovsdb-server
           0         1292              1124              7104
        5000        21704           6476420             17424
       10000        41680          25446148             31808
       15000        61600          56893928            133864
       20000        81844         100955012            176868
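
The VmRSS values are the ones reported in /proc/<pid>/status; a reading
like the above can be taken with something along these lines (the pidfile
names here are only illustrative, not necessarily the exact ones in the
sandbox):

# Illustrative VmRSS (kB) sampling; the pidfile names are assumptions.
for name in ovn-northd ovsdb-server-nb ovsdb-server-sb; do
    pid=$(cat "/home/rally/controller-sandbox/${name}.pid")
    awk -v n="$name" '/^VmRSS:/ {print n, $2, $3}' "/proc/${pid}/status"
done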


The memory log from the northbound ovsdb-server is as follows:

2016-03-02T09:48:32.024Z|00003|memory|INFO|1124 kB peak resident set size after 10.0 seconds
2016-03-02T16:24:46.430Z|00005|memory|INFO|peak resident set size grew 150% in last 23774.4 seconds, from 1124 kB to 2808 kB
2016-03-02T16:25:06.432Z|00007|memory|INFO|peak resident set size grew 130% in last 20.0 seconds, from 2808 kB to 6448 kB
2016-03-02T16:25:16.433Z|00009|memory|INFO|peak resident set size grew 54% in last 10.0 seconds, from 6448 kB to 9948 kB
2016-03-02T16:25:26.434Z|00011|memory|INFO|peak resident set size grew 52% in last 10.0 seconds, from 9948 kB to 15072 kB
2016-03-02T16:25:36.436Z|00013|memory|INFO|peak resident set size grew 178% in last 10.0 seconds, from 15072 kB to 41956 kB
2016-03-02T16:26:36.439Z|00015|memory|INFO|peak resident set size grew 140% in last 60.0 seconds, from 41956 kB to 100696 kB
2016-03-02T16:27:36.444Z|00017|memory|INFO|peak resident set size grew 78% in last 60.0 seconds, from 100696 kB to 179108 kB
2016-03-02T16:28:46.444Z|00019|memory|INFO|peak resident set size grew 55% in last 70.0 seconds, from 179108 kB to 277412 kB
2016-03-02T16:39:26.953Z|00021|memory|INFO|peak resident set size grew 50% in last 640.5 seconds, from 277412 kB to 417472 kB
2016-03-02T16:41:06.960Z|00023|memory|INFO|peak resident set size grew 67% in last 100.0 seconds, from 417472 kB to 696732 kB
2016-03-02T16:44:17.180Z|00025|memory|INFO|peak resident set size grew 54% in last 190.2 seconds, from 696732 kB to 1071080 kB
2016-03-02T16:55:02.317Z|00027|memory|INFO|peak resident set size grew 68% in last 645.1 seconds, from 1071080 kB to 1794276 kB
2016-03-02T17:03:22.544Z|00029|memory|INFO|peak resident set size grew 50% in last 500.2 seconds, from 1794276 kB to 2698100 kB
2016-03-02T17:12:18.359Z|00031|memory|INFO|peak resident set size grew 54% in last 535.8 seconds, from 2698100 kB to 4158336 kB
2016-03-02T17:27:59.897Z|00070|memory|INFO|peak resident set size grew 55% in last 941.5 seconds, from 4158336 kB to 6458136 kB
2016-03-02T17:51:04.005Z|00136|memory|INFO|peak resident set size grew 53% in last 1384.1 seconds, from 6458136 kB to 9896844 kB
2016-03-02T18:37:52.913Z|00243|memory|INFO|peak resident set size grew 57% in last 2808.9 seconds, from 9896844 kB to 15539132 kB
2016-03-02T19:21:33.977Z|00414|memory|INFO|peak resident set size grew 51% in last 2621.1 seconds, from 15539132 kB to 23403744 kB
2016-03-02T20:27:53.938Z|00661|memory|INFO|peak resident set size grew 51% in last 3980.0 seconds, from 23403744 kB to 35348044 kB
2016-03-02T22:11:13.200Z|00957|memory|INFO|peak resident set size grew 53% in last 6199.3 seconds, from 35348044 kB to 53943084 kB
2016-03-03T00:49:32.196Z|01306|memory|INFO|peak resident set size grew 51% in last 9499.0 seconds, from 53943084 kB to 81658988 kB



On another, smaller test environment, I checked the northbound ovsdb-server
with valgrind:

$ valgrind --leak-check=full --show-leak-kinds=all ovsdb-server --no-chdir
--pidfile=ovsdb-server-nb.pid --unixctl=ovsdb-server-nb.ctl -vconsole:off
-vsyslog:off -vfile:info --log-file=ovsdb-server-nb.log
--remote=punix:/home/rally/controller-sandbox/db-nb.sock conf-nb.db ovnnb.db


But no obvious memory leak was found. So I made ovsdb-server exit without
running its cleanup code, so that valgrind reports whatever is still
allocated at exit and we can see what the memory is used for. The code
change is below:


diff --git a/ovsdb/ovsdb-server.c b/ovsdb/ovsdb-server.c
index fa662b1..4498eaa 100644
--- a/ovsdb/ovsdb-server.c
+++ b/ovsdb/ovsdb-server.c
@@ -339,7 +339,7 @@ main(int argc, char *argv[])
                               ovsdb_server_disable_monitor2, jsonrpc);
 
     main_loop(jsonrpc, &all_dbs, unixctl, &remotes, run_process, &exiting);
-
+    return 0;
     ovsdb_jsonrpc_server_destroy(jsonrpc);
     SHASH_FOR_EACH_SAFE(node, next, &all_dbs) {
         struct db *db = node->data;

After 100 ports had been created and bound, the memory used was about
200 MB, and the valgrind output was as follows:

==23712== 24,994,032 bytes in 520,709 blocks are possibly lost in loss record 228 of 229
==23712==    at 0x4C2CC70: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==23712==    by 0x44140C: xcalloc (in /usr/local/sbin/ovsdb-server)
==23712==    by 0x44145A: xzalloc (in /usr/local/sbin/ovsdb-server)
==23712==    by 0x410DFF: ovsdb_monitor_changes_update (in /usr/local/sbin/ovsdb-server)
==23712==    by 0x4110F6: ovsdb_monitor_change_cb (in /usr/local/sbin/ovsdb-server)
==23712==    by 0x41780A: ovsdb_txn_for_each_change (in /usr/local/sbin/ovsdb-server)
==23712==    by 0x4119A0: ovsdb_monitor_commit (in /usr/local/sbin/ovsdb-server)
==23712==    by 0x41767C: ovsdb_txn_commit_ (in /usr/local/sbin/ovsdb-server)
==23712==    by 0x41775D: ovsdb_txn_commit (in /usr/local/sbin/ovsdb-server)
==23712==    by 0x419011: ovsdb_execute (in /usr/local/sbin/ovsdb-server)
==23712==    by 0x414D2E: ovsdb_trigger_try (in /usr/local/sbin/ovsdb-server)
==23712==    by 0x414A8B: ovsdb_trigger_init (in /usr/local/sbin/ovsdb-server)
==23712==
==23712== 124,299,840 bytes in 520,709 blocks are possibly lost in loss record 229 of 229
==23712==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==23712==    by 0x441487: xmalloc (in /usr/local/sbin/ovsdb-server)
==23712==    by 0x40F975: clone_monitor_row_data (in /usr/local/sbin/ovsdb-server)
==23712==    by 0x410E70: ovsdb_monitor_changes_update (in /usr/local/sbin/ovsdb-server)
==23712==    by 0x4110F6: ovsdb_monitor_change_cb (in /usr/local/sbin/ovsdb-server)
==23712==    by 0x41780A: ovsdb_txn_for_each_change (in /usr/local/sbin/ovsdb-server)
==23712==    by 0x4119A0: ovsdb_monitor_commit (in /usr/local/sbin/ovsdb-server)
==23712==    by 0x41767C: ovsdb_txn_commit_ (in /usr/local/sbin/ovsdb-server)
==23712==    by 0x41775D: ovsdb_txn_commit (in /usr/local/sbin/ovsdb-server)
==23712==    by 0x419011: ovsdb_execute (in /usr/local/sbin/ovsdb-server)
==23712==    by 0x414D2E: ovsdb_trigger_try (in /usr/local/sbin/ovsdb-server)
==23712==    by 0x414A8B: ovsdb_trigger_init (in /usr/local/sbin/ovsdb-server)
==23712==
==23712== LEAK SUMMARY:
==23712==    definitely lost: 159 bytes in 4 blocks
==23712==    indirectly lost: 398 bytes in 13 blocks
==23712==      possibly lost: 212,065,579 bytes in 3,963,543 blocks
==23712==    still reachable: 1,156,295 bytes in 9,686 blocks
==23712==         suppressed: 0 bytes in 0 blocks


It seems that about 150 MB of memory is held by the ovsdb monitor code
(ovsdb_monitor_changes_update / clone_monitor_row_data). I am looking into
this problem, but since I am not familiar with ovsdb-server's source, it
may take some time. Do you have any idea about this problem?


BR

Huang Lei


