[ovs-dev] ovs-vswitchd too large memory consumption with OVN stateless ACL

Ilya Maximets i.maximets at ovn.org
Sat Nov 20 04:26:23 UTC 2021


On 11/20/21 03:39, Han Zhou wrote:
> 
> 
> On Fri, Nov 19, 2021 at 3:11 PM Ilya Maximets <i.maximets at ovn.org> wrote:
>>
>> On 11/19/21 19:12, Vladislav Odintsov wrote:
>> > Hi,
>> >
>> > I’m testing OVN stateless ACL rules with `$port_group_ipVERSION` in match portion.
>> > There’s a strange behaviour and sometimes I got configuration, which totally kills my transport nodes, where logical switch ports reside.
>> > ovs-vswitchd and ovn-controller processes utilise 100% 1 core CPU each and ovs-vswitchd consumes all free memory and repeatedly got killed by OOM-killer. It consumes 5GB memory in 5-10 seconds!
>> >
>> > I reproduced this with OVS 2.13.4 & OVN main, but also tried with actual OVS master branch and the problem still reproduces.
>> >
>> > Below are steps to reproduce:
>>
>> <snip>
>>
>> >
>> > I couldn’t get any source of the problem except to find the steps to reproduce.
>> > Can somebody please take a look on this?
>> > This looks like a potential serious problem for OVN transport nodes.
>>
>> This indeed looks like a serious issue.
>> And thanks for the great detailed report!  That was really easy to reproduce.
>>
>> I think, I found the main problem.  Could you try the following patch:
>>   https://patchwork.ozlabs.org/project/openvswitch/patch/20211119230738.2765297-1-i.maximets@ovn.org/
>> ?
> 
> Thanks Vladislav for reporting and thanks Ilya for the quick fix!
> The fix looks good to me. However, I think there are more problems revealed by this bug report to be addressed.
> I could also reproduce it easily and I see at least 3 problems:
> 
> 1) The simple ACL condition shouldn't generate the huge number of flows (>60k) in the first place. The ovn-controller expression parser doesn't handle != for const sets efficiently. It can be optimized to combine most of the matches. For the example in this report, I'd expect at most hundreds of flows in total. I have some ideas but need to try it out.

The IPs and masks in these 60K flows look kind of funky.  It's
really hard to tell whether they are legit or just random trash
put into the flows.  I know that OVN has problems with negations,
and that negations are not easy to represent in OpenFlow, but
these look very much random from my perspective, so making them
look better is also a debuggability point.

In the ideal world, IMO, OVN should try to do something like this:
  https://github.com/openvswitch/ovs-issues/issues/222#issuecomment-904813522
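
To illustrate why negations are awkward in OpenFlow, here is a minimal
sketch of one textbook way to expand a single "ip != ADDR" into plain
value/mask matches.  It is not taken from OVN or from the comment linked
above; the names 'struct masked_ip', 'negate_ipv4' and 'count_matches'
are made up for this example.  Entry i matches every address that agrees
with ADDR on the first i bits and differs in the next bit, so the 32
entries are disjoint and together cover everything except ADDR:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: one masked IPv4 match, like an OpenFlow
 * "nw_src=value/mask" field.  Not an OVS or OVN structure. */
struct masked_ip {
    uint32_t value;
    uint32_t mask;
};

/* Expands "ip != addr" into 32 disjoint value/mask pairs.  Pair 'bit'
 * keeps the top 'bit' bits of 'addr', flips the next bit and wildcards
 * the rest.  Returns the number of pairs written (always 32). */
static size_t
negate_ipv4(uint32_t addr, struct masked_ip out[32])
{
    for (int bit = 0; bit < 32; bit++) {
        uint32_t flip = UINT32_C(1) << (31 - bit);
        uint32_t mask = ~(flip - 1);    /* Top 'bit + 1' bits set. */

        out[bit].mask = mask;
        out[bit].value = (addr ^ flip) & mask;
    }
    return 32;
}

/* Counts how many of the 'n' pairs match 'ip'. */
static size_t
count_matches(const struct masked_ip *m, size_t n, uint32_t ip)
{
    size_t hits = 0;

    for (size_t i = 0; i < n; i++) {
        if ((ip & m[i].mask) == m[i].value) {
            hits++;
        }
    }
    return hits;
}
```

With this scheme a single negated address already costs 32 matches, and
negating a set of addresses multiplies that unless the results are
merged, which is presumably where a smarter expansion would pay off.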

> 2) The memory spike problem caused by  in OVS as explained and fixed by Ilya. Really great finding and fix! It is definitely required even if 1) is solved, because we have real situations when a large number of flows will be generated and installed at once.

Yes.  For sure.  I've seen setups with millions of flows, and
ovn-controller may try to add all of them in a single bundle in some
cases.  And I've seen ovs-vswitchd consuming GBs of RAM.
The patch should noticeably improve the performance in general for
OVN deployments.  We will also need to backport this fix down to
OVS 2.13 LTS, I think.

> 3) What's left unclear to me, related to 2), is that after the bundle processing is finished, the quiescent state should be entered, and the RCU thread should free the temporarily allocated memory, right? But at least in my test I don't see the memory goes down. With 60K flows OVS has 3.3G RES which is unreasonable.

This is again a weird glibc "I will not release these fastbins"
situation.  You can verify that by applying the following patch and
noticing that the memory drops immediately after the processing is
finished:

diff --git a/lib/ovs-rcu.c b/lib/ovs-rcu.c
index 1866bd308..b8301c311 100644
--- a/lib/ovs-rcu.c
+++ b/lib/ovs-rcu.c
@@ -16,6 +16,7 @@
 
 #include <config.h>
 #include <errno.h>
+#include <malloc.h>
 #include "ovs-rcu.h"
 #include "fatal-signal.h"
 #include "guarded-list.h"
@@ -348,6 +349,7 @@ ovsrcu_call_postponed(void)
         free(cbset->cbs);
         free(cbset);
     }
+    malloc_trim(0);
 
     return true;
 }
---
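
The same effect can be observed outside of OVS.  Below is a minimal
standalone sketch, assuming Linux with glibc: it allocates and frees
many small chunks, then calls malloc_trim(0), sampling the resident set
size from /proc/self/statm along the way.  The names 'struct
trim_result', 'rss_pages' and 'churn_and_trim' are made up for this
example, and the 64-byte chunk size used below is only an assumption
about what glibc treats as a small chunk on a typical 64-bit build:

```c
#include <malloc.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical result type for this example only. */
struct trim_result {
    bool ok;                /* False if an allocation failed. */
    long rss_after_free;    /* Resident pages after free(), or -1. */
    long rss_after_trim;    /* Resident pages after malloc_trim(), or -1. */
    int trimmed;            /* Return value of malloc_trim(0). */
};

/* Reads the resident set size, in pages, from /proc/self/statm.
 * Returns -1 if the file is unavailable or cannot be parsed. */
static long
rss_pages(void)
{
    long size, resident = -1;
    FILE *f = fopen("/proc/self/statm", "r");

    if (f) {
        if (fscanf(f, "%ld %ld", &size, &resident) != 2) {
            resident = -1;
        }
        fclose(f);
    }
    return resident;
}

/* Allocates 'n' chunks of 'chunk_size' bytes, frees them all, and then
 * asks glibc to return free heap memory to the kernel. */
static struct trim_result
churn_and_trim(size_t n, size_t chunk_size)
{
    struct trim_result r = { false, -1, -1, 0 };
    void **ptrs = malloc(n * sizeof *ptrs);
    size_t allocated = 0;

    if (!ptrs) {
        return r;
    }
    for (; allocated < n; allocated++) {
        ptrs[allocated] = malloc(chunk_size);
        if (!ptrs[allocated]) {
            break;
        }
    }
    r.ok = allocated == n;
    for (size_t i = 0; i < allocated; i++) {
        free(ptrs[i]);
    }
    free(ptrs);

    r.rss_after_free = rss_pages();
    r.trimmed = malloc_trim(0);     /* 1 if memory was released. */
    r.rss_after_trim = rss_pages();
    return r;
}
```

Calling churn_and_trim(1000000, 64) and comparing rss_after_free with
rss_after_trim should show whether glibc was holding on to the freed
chunks; the exact numbers depend on the glibc version and tunables.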

We need some generic solution for this kind of stuff.  Ideally,
fixed in glibc, but it doesn't look like they're going to change
that behavior.  jemalloc could also be an option to avoid all
that glibc weirdness.

One interesting thing, though, is that without the fix OVS consumes
either 3GB or 12GB.  It's one of these two numbers, varying between
runs without any code changes.  I'd like to dig deeper
into that later.  I don't see that thing if the patch is applied,
so hopefully it's just another outcome of the same problem that
I fixed.  But I'll try to investigate where these extra 8GB
are coming from.  RAM usage with the fix is only 140MB in the test.

Best regards, Ilya Maximets.

> 
> Thanks,
> Han

