[ovs-discuss] OVN Database sizes - Auto compact feature

Daniel Alvarez Sanchez dalvarez at redhat.com
Thu Mar 8 20:54:05 UTC 2018


I agree with you, Mark. I tried to check how much it would shrink with 1800
ports in the system:

[stack at ovn ovs]$ sudo ovn-nbctl list Logical_Switch_Port | grep uuid | wc -l
1809
[stack at ovn ovs]$ sudo ovn-sbctl list Logical_Flow | grep uuid | wc -l
50780
[stack at ovn ovs]$ ls -alh ovn*.db
-rw-r--r--. 1 stack stack 15M Mar  8 15:56 ovnnb_db.db
-rw-r--r--. 1 stack stack 61M Mar  8 15:56 ovnsb_db.db
[stack at ovn ovs]$ sudo ovs-appctl -t /usr/local/var/run/openvswitch/ovnsb_db.ctl ovsdb-server/compact
[stack at ovn ovs]$ sudo ovs-appctl -t /usr/local/var/run/openvswitch/ovnnb_db.ctl ovsdb-server/compact
[stack at ovn ovs]$ ls -alh ovn*.db
-rw-r--r--. 1 stack stack 5.8M Mar  8 20:45 ovnnb_db.db
-rw-r--r--. 1 stack stack  23M Mar  8 20:45 ovnsb_db.db

As you can see, with ~50K lflows the minimum size of the Southbound database
would be ~23M, while the NB database is much smaller. Still, I think we need
to do something so that the compaction task is not delayed this much
unnecessarily. Or maybe we want some sort of configuration for this
(e.g. normal, aggressive, ...), since in some situations it may help to
have the full log of the DB (although that can also be achieved through
periodic backups). That said, I'm not a big fan of such configs...
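Just so we're talking about the same thing, this is roughly my mental model
of the size-based trigger (a Python sketch only; the real logic is C inside
ovsdb-server, and the function name and 10 MB floor here are my assumptions
from reading this thread, not the actual code):

```python
# Illustrative sketch of the auto-compaction trigger discussed in this
# thread: the on-disk log must grow to ~4x the size of the last snapshot
# (and past some minimum floor) before ovsdb-server compacts it online.
# should_compact() and MIN_SIZE are hypothetical names, not real ovsdb code.

MIN_SIZE = 10 * 1024 * 1024  # assumed 10 MB floor


def should_compact(file_size, snapshot_size):
    return file_size > MIN_SIZE and file_size > 4 * snapshot_size


# A 750 MB "shrunk" database won't compact again until it exceeds ~3 GB:
snap = 750 * 2**20
print(should_compact(2000 * 2**20, snap))  # False: 2 GB is still below 4x
print(should_compact(3100 * 2**20, snap))  # True: past the 4x mark
```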



On Thu, Mar 8, 2018 at 9:31 PM, Mark Michelson <mmichels at redhat.com> wrote:

> Most of the data in this thread has been pretty easily explainable based
> on what I've seen in the code compared with the nature of the data in the
> southbound database.
>
> The southbound database tends to have more data in it than other databases
> in OVS, due especially to the Logical_Flow table. The result is that auto
> shrinking of the database does not shrink it down by as much as other
> databases. You can see in Daniel's graphs that each time the southbound
> database is shrunk, its "base" size ends up noticeably larger than it
> previously was.
>
> Couple that with the fact that the database has to increase to 4x its
> previous snapshot size in order to be shrunk, and you can end up with a
> situation after a while where the "shrunk" southbound database is 750MB,
> and it won't shrink again until it exceeds 3GB.
>
> To fix this, I think there are a few things that can be done:
>
> * Somehow make the southbound database have less data in it. I don't have
> any real good ideas for how to do this, and doing this in a
> backwards-compatible way will be difficult.
>
> * Ease the requirements for shrinking a database. For instance, once the
> database reaches a certain size, maybe it doesn't need to grow by 4x in
> order to be a candidate for shrinking. Maybe it only needs to double in
> size. Or, there could be some time cutoff where the database always will be
> shrunk. So for instance, every hour, always shrink the database, no matter
> how much activity has occurred in it (okay, maybe not if there have been 0
> transactions).


Maybe we can just do the shrink if the last compaction took place more than
24h ago, regardless of the other conditions.
I can send a patch for this if you like the idea. It's some sort of
"cleanup task", just in case, and seems harmless.
What do you say?
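Something like this is what I have in mind (again just a Python sketch; the
actual patch would touch the C trigger logic in ovsdb-server, and the names
and the 24h constant are mine, not existing code):

```python
# Sketch of the proposed "cleanup task": force a compaction whenever the
# last one happened more than 24 hours ago and there was any activity at
# all, regardless of the size-based conditions. All names are illustrative.

DAY_MS = 24 * 60 * 60 * 1000
MIN_SIZE = 10 * 1024 * 1024  # assumed floor from the existing size rule


def should_compact(file_size, snapshot_size, elapsed_ms, n_txns):
    if n_txns > 0 and elapsed_ms > DAY_MS:
        return True  # time-based fallback: at most ~24h between compactions
    return file_size > MIN_SIZE and file_size > 4 * snapshot_size


# A database that never hits 4x growth still gets compacted once a day,
# but an idle database (0 transactions) is left alone:
print(should_compact(20 * 2**20, 15 * 2**20, 25 * 3600 * 1000, 42))  # True
print(should_compact(20 * 2**20, 15 * 2**20, 25 * 3600 * 1000, 0))   # False
```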

>
>
> On 03/07/2018 02:50 PM, Ben Pfaff wrote:
>
>> OK.
>>
>> I guess we need to investigate this issue from the basics.
>>
>> On Wed, Mar 07, 2018 at 09:02:02PM +0100, Daniel Alvarez Sanchez wrote:
>>
>>> With the OVS 2.8 branch it never shrank when I started to delete the
>>> ports, since the DB sizes didn't grow, which makes sense to me. The
>>> conditions weren't met for further compaction.
>>> See attached image.
>>>
>>> NB:
>>> 2018-03-07T18:25:49.269Z|00009|ovsdb_file|INFO|/opt/stack/data/ovs/ovnnb_db.db: compacting database online (647.317 seconds old, 436 transactions, 10505382 bytes)
>>> 2018-03-07T18:35:51.414Z|00012|ovsdb_file|INFO|/opt/stack/data/ovs/ovnnb_db.db: compacting database online (602.089 seconds old, 431 transactions, 29551917 bytes)
>>> 2018-03-07T18:45:52.263Z|00015|ovsdb_file|INFO|/opt/stack/data/ovs/ovnnb_db.db: compacting database online (600.563 seconds old, 463 transactions, 52843231 bytes)
>>> 2018-03-07T18:55:53.810Z|00016|ovsdb_file|INFO|/opt/stack/data/ovs/ovnnb_db.db: compacting database online (601.128 seconds old, 365 transactions, 57618931 bytes)
>>>
>>>
>>> SB:
>>> 2018-03-07T18:33:24.927Z|00009|ovsdb_file|INFO|/opt/stack/data/ovs/ovnsb_db.db: compacting database online (1102.840 seconds old, 775 transactions, 10505486 bytes)
>>> 2018-03-07T18:43:27.569Z|00012|ovsdb_file|INFO|/opt/stack/data/ovs/ovnsb_db.db: compacting database online (602.394 seconds old, 445 transactions, 15293972 bytes)
>>> 2018-03-07T18:53:31.664Z|00015|ovsdb_file|INFO|/opt/stack/data/ovs/ovnsb_db.db: compacting database online (603.605 seconds old, 385 transactions, 19282371 bytes)
>>> 2018-03-07T19:03:42.116Z|00031|ovsdb_file|INFO|/opt/stack/data/ovs/ovnsb_db.db: compacting database online (607.542 seconds old, 371 transactions, 23538784 bytes)
>>>
>>>
>>>
>>>
>>> On Wed, Mar 7, 2018 at 7:18 PM, Daniel Alvarez Sanchez
>>> <dalvarez at redhat.com> wrote:
>>>
>>>> No worries, I just triggered the test now running OVS compiled out of
>>>> the 2.8 branch (2.8.3). I'll post the results and investigate too.
>>>>
>>>> I have just sent a patch to fix the timing issue we can see in the
>>>> traces I posted. I applied it and it works. I believe it's good to fix,
>>>> as it gives us an idea of how frequent the compaction is, and also to
>>>> backport if you agree with it.
>>>>
>>>> Thanks!
>>>>
>>>> On Wed, Mar 7, 2018 at 7:13 PM, Ben Pfaff <blp at ovn.org> wrote:
>>>>
>>>>> OK, thanks.
>>>>>
>>>>> If this is a lot of trouble, let me know and I'll investigate directly
>>>>> instead of on the basis of a suspected regression.
>>>>>
>>>>> On Wed, Mar 07, 2018 at 07:06:50PM +0100, Daniel Alvarez Sanchez wrote:
>>>>>
>>>>>> All right, I'll repeat it with code in branch-2.8.
>>>>>> Will post the results once the test finishes.
>>>>>> Daniel
>>>>>>
>>>>>> On Wed, Mar 7, 2018 at 7:03 PM, Ben Pfaff <blp at ovn.org> wrote:
>>>>>>
>>>>>>> On Wed, Mar 07, 2018 at 05:53:15PM +0100, Daniel Alvarez Sanchez
>>>>>>> wrote:
>>>>>>>
>>>>>> Repeated the test with 1000 ports this time. See attached image.
>>>>>>>> For some reason, the sizes grow while deleting the ports (the
>>>>>>>> deletion task starts at around x=2500). The weird thing is why
>>>>>>>> they keep growing and the online compaction doesn't shrink them the
>>>>>>>> way it does when I trigger it through the ovs-appctl tool.
>>>>>>>>
>>>>>>>> I suspect this is a bug and eventually it will grow and grow unless
>>>>>>>> we manually compact the db.
>>>>>>>>
>>>>>>>
>>>>>>> Would you mind trying out an older ovsdb-server, for example the one
>>>>>>> from OVS 2.8?  Some of the logic in ovsdb-server around compaction
>>>>>>> changed in OVS 2.9, so it would be nice to know whether this was a
>>>>>>> regression or an existing bug.
>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>>
>>
>>
>