[ovs-discuss] [OVN] Cluster mode ovsdb memory keeps increasing

刘梦馨 liumengxinfly at gmail.com
Mon Dec 16 05:55:59 UTC 2019


After more iterations (6 in my environment) the RSS usage stabilized at
759128 KB.
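
For anyone who wants to reproduce this, one way to drive the iterations and
sample the NB RSS is roughly the loop below. It is only a sketch: the script
name create-delete-lsp.sh and the pidfile path are assumptions and may differ
on other installs.

for run in {1..10}; do
  ./create-delete-lsp.sh   # the add/delete script quoted further down in this thread
  pid=$(cat /var/run/openvswitch/ovnnb_db.pid)
  # print the resident set size of the NB ovsdb-server after this run
  echo "run $run: $(grep VmRSS /proc/$pid/status)"
done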

This is a really simplified test. In our real environment we run about 3000
containers with lots of other operations (setting routes, load balancers, all
the ovn-sb operations, etc.). The memory consumption can quickly go up to
6 GB (nb and sb together) and lead to a system OOM. Is that reasonable
resource consumption in your experience? I don't remember the exact numbers
for the standalone db's resource consumption, but in the same environment it
never led to an OOM.

Han Zhou <hzhou at ovn.org> wrote on Mon, Dec 16, 2019 at 1:05 PM:

> Thanks for the details. I tried the same command with a for loop.
>
> After the first 4 iterations, the RSS of the first NB server increased to
> 572888 KB. After that, it stayed the same for the next 3 iterations, so it
> seems to just build up memory buffers and then stay at that level without
> increasing further; it doesn't look like a memory leak. Could you try
> more iterations and see if it still increases continuously?
>
> Thanks,
> Han
>
> On Sun, Dec 15, 2019 at 7:54 PM 刘梦馨 <liumengxinfly at gmail.com> wrote:
> >
> > Hi, Han
> >
> > In my test scenario, I use ovn-ctl to start a one-node OVN with a
> > cluster-mode db and no chassis bound to ovn-sb, just to check the memory
> > usage of ovn-nb (a rough ovn-ctl sketch follows the script below).
> > Then I use a script to add a logical switch, add 1000 ports, set dynamic
> > addresses, and then delete the logical switch.
> >
> > #!/bin/bash
> > ovn-nbctl ls-add ls1
> > for i in {1..1000}; do
> >   ovn-nbctl lsp-add ls1 ls1-vm$i
> >   ovn-nbctl lsp-set-addresses ls1-vm$i dynamic
> > done
> > ovn-nbctl ls-del ls1
> >
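> > For reference, starting the one-node clustered NB/SB with ovn-ctl looks
> > roughly like this. It is only a sketch: the script path, the placeholder
> > address 192.0.2.10, and the exact options may differ by install/version.
> >
> > /usr/share/openvswitch/scripts/ovn-ctl start_northd \
> >     --db-nb-cluster-local-addr=192.0.2.10 \
> >     --db-sb-cluster-local-addr=192.0.2.10
> >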
> > I run this script repeatedly and watch the memory change.
> >
> > After 5 runs (5000 lsp adds and deletes), the RSS of nb increased to 667 MB.
> > The nb file grew to 119 MB and was not automatically compacted. After a
> > manual compaction the db file size went back to 11 KB, but the memory
> > usage didn't change.
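> >
> > A manual compaction like the one above can be triggered with something
> > like this (the ctl socket path is an assumption for this install):
> >
> > ovs-appctl -t /var/run/openvswitch/ovnnb_db.ctl ovsdb-server/compact OVN_Northbound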
> >
> >
> >
> > Han Zhou <hzhou at ovn.org> wrote on Sat, Dec 14, 2019 at 3:40 AM:
> >>
> >>
> >>
> >> On Wed, Dec 11, 2019 at 12:51 AM 刘梦馨 <liumengxinfly at gmail.com> wrote:
> >> >
> >> >
> >> > We are using ovs/ovn 2.12.0 to implement our container network.
> >> > After switching from standalone ovndb to cluster mode ovndb, we noticed
> >> > that the memory consumption of both ovnnb and ovnsb keeps increasing
> >> > after each operation and never decreases.
> >> >
> >> > We did some profiling with valgrind. The leak check reported a 16-byte
> >> > leak in fork_and_wait_for_startup, which is obviously not the main cause.
> >> > Later we used massif to profile the memory consumption, and the result is
> >> > in the attachment.
> >> >
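> >> > For context, a massif run of this kind looks roughly like the following
> >> > (a sketch; the actual ovsdb-server command line is whatever ovn-ctl
> >> > generates on the host):
> >> >
> >> > valgrind --tool=massif --pages-as-heap=yes ovsdb-server <usual NB options>
> >> > ms_print massif.out.<pid>
> >> >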
> >> > Most of the memory comes from two parts: ovsthread_wrapper
> >> > (ovs-thread.c:378), which allocates a subprogram_name, and jsonrpc_send
> >> > (jsonrpc.c:253), as shown below (I skipped the duplicated jsonrpc stacks).
> >> >
> >> > However, I found that both parts have a related free operation nearby,
> >> > so I don't know how to explore this memory issue further. I'm not
> >> > aware of the differences here between cluster mode and standalone mode.
> >> >
> >> > Can anyone give some advice or hints? Thanks!
> >> >
> >> > 100.00% (357,920,768B) (page allocation syscalls) mmap/mremap/brk, --alloc-fns, etc.
> >> > ->78.52% (281,038,848B) 0x66FDD49: mmap (in /usr/lib64/libc-2.17.so)
> >> > | ->37.50% (134,217,728B) 0x66841EF: new_heap (in /usr/lib64/libc-2.17.so)
> >> > | | ->37.50% (134,217,728B) 0x6684C22: arena_get2.isra.3 (in /usr/lib64/libc-2.17.so)
> >> > | |   ->37.50% (134,217,728B) 0x668AACC: malloc (in /usr/lib64/libc-2.17.so)
> >> > | |     ->37.50% (134,217,728B) 0x4FDC613: xmalloc (util.c:138)
> >> > | |       ->37.50% (134,217,728B) 0x4FDC78E: xvasprintf (util.c:202)
> >> > | |         ->37.50% (134,217,728B) 0x4FDC877: xasprintf (util.c:343)
> >> > | |           ->37.50% (134,217,728B) 0x4FA548D: ovsthread_wrapper (ovs-thread.c:378)
> >> > | |             ->37.50% (134,217,728B) 0x5BE5E63: start_thread (in /usr/lib64/libpthread-2.17.so)
> >> > | |               ->37.50% (134,217,728B) 0x670388B: clone (in /usr/lib64/libc-2.17.so)
> >> > | |
> >> > | ->36.33% (130,023,424B) 0x6686DF3: sysmalloc (in /usr/lib64/libc-2.17.so)
> >> > | | ->36.33% (130,023,424B) 0x6687CA8: _int_malloc (in /usr/lib64/libc-2.17.so)
> >> > | |   ->28.42% (101,711,872B) 0x66890C0: _int_realloc (in /usr/lib64/libc-2.17.so)
> >> > | |   | ->28.42% (101,711,872B) 0x668B160: realloc (in /usr/lib64/libc-2.17.so)
> >> > | |   |   ->28.42% (101,711,872B) 0x4FDC9A3: xrealloc (util.c:149)
> >> > | |   |     ->28.42% (101,711,872B) 0x4F1DEB2: ds_reserve (dynamic-string.c:63)
> >> > | |   |       ->28.42% (101,711,872B) 0x4F1DED3: ds_put_uninit (dynamic-string.c:73)
> >> > | |   |         ->28.42% (101,711,872B) 0x4F1DF0B: ds_put_char__ (dynamic-string.c:82)
> >> > | |   |           ->26.37% (94,371,840B) 0x4F2B09F: json_serialize_string (dynamic-string.h:93)
> >> > | |   |           | ->12.01% (42,991,616B) 0x4F2B3EA: json_serialize (json.c:1651)
> >> > | |   |           | | ->12.01% (42,991,616B) 0x4F2B3EA: json_serialize (json.c:1651)
> >> > | |   |           | |   ->12.01% (42,991,616B) 0x4F2B3EA: json_serialize (json.c:1651)
> >> > | |   |           | |     ->12.01% (42,991,616B) 0x4F2B540: json_serialize (json.c:1626)
> >> > | |   |           | |       ->12.01% (42,991,616B) 0x4F2B540: json_serialize (json.c:1626)
> >> > | |   |           | |         ->12.01% (42,991,616B) 0x4F2B540: json_serialize (json.c:1626)
> >> > | |   |           | |           ->12.01% (42,991,616B) 0x4F2B540: json_serialize (json.c:1626)
> >> > | |   |           | |             ->12.01% (42,991,616B) 0x4F2B3EA: json_serialize (json.c:1651)
> >> > | |   |           | |               ->12.01% (42,991,616B) 0x4F2B540: json_serialize (json.c:1626)
> >> > | |   |           | |                 ->12.01% (42,991,616B) 0x4F2D82A: json_to_ds (json.c:1525)
> >> > | |   |           | |                   ->12.01% (42,991,616B) 0x4F2EA49: jsonrpc_send (jsonrpc.c:253)
> >> > | |   |           | |                     ->12.01% (42,991,616B) 0x4C3A68A: ovsdb_jsonrpc_server_run (jsonrpc-server.c:1104)
> >> > | |   |           | |                       ->12.01% (42,991,616B) 0x10DCC1: main (ovsdb-server.c:209)
> >> >
> >> > _______________________________________________
> >> > discuss mailing list
> >> > discuss at openvswitch.org
> >> > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
> >>
> >> Thanks for reporting the issue. Could you describe your test scenario
> >> (the operations), the scale, the db file size, and the memory (RSS) data of
> >> the NB/SB?
> >> Clustered mode maintains some extra data, such as RAFT logs, compared to
> >> standalone, but it should not increase forever, because the RAFT logs get
> >> compacted periodically.
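> >>
> >> For example, the RAFT state of the NB server can be inspected with
> >> something like this (the ctl socket path is an assumption for the install):
> >>
> >> ovs-appctl -t /var/run/openvswitch/ovnnb_db.ctl cluster/status OVN_Northbound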
> >>
> >> Thanks,
> >> Han
> >
> >
> >
> > --
> > 刘梦馨
> > Blog: http://oilbeater.com
> > Weibo: @oilbeater
>


-- 
刘梦馨
Blog: http://oilbeater.com
Weibo: @oilbeater <http://weibo.com/oilbeater>