[ovs-discuss] ovs-vswitchd mlockall and stack size

Ben Pfaff blp at nicira.com
Tue Jul 8 16:47:50 UTC 2014


I guess that the biggest effect on stack size would be the flow table,
and in particular how much recursion flow processing causes.  There are
a few tests that force as-deep-as-possible recursion:

    AT_SETUP([ofproto-dpif - infinite resubmit])
    AT_SETUP([ofproto-dpif - exponential resubmit chain])

I don't think that forcing all packets to userspace would have much of
an effect.  (The closest equivalent would be to disable megaflows;
there's an "ovs-appctl" command for that: look in "ovs-appctl help".)

Another hint toward the maximum stack requirement is to look through the
generated asm for stack usage, e.g.:

        objdump -dr vswitchd/ovs-vswitchd|sed -n 's/^.*sub.*$0x\([0-9a-f]\{1,\}\),%esp/\1/p'|sort|uniq|less

which shows that we have at least one place where we allocate 327,788
bytes on the stack (!).  I hope that is not in the flow processing path!

On Tue, Jul 08, 2014 at 05:36:07PM +0100, Anoob Soman wrote:
> I have been running tests with 1MB stack size and ovs-vswitchd seems
> to hold up pretty well. I will try to do some more experiments to find
> out the max depth of the stack, but I am afraid this will totally
> depend on the test I am running. Any suggestion on what sort of test
> I should be running? Moreover, the "force-miss-model" other-config is
> missing from 2.1.x, as there is no concept of facets. Is there a way
> that I can force all packets to be processed in userspace, other
> than doing "ovs-dpctl del-flows" periodically?
> 
> Thanks,
> Anoob.
> On 08/07/14 17:15, Ben Pfaff wrote:
> >On Tue, Jul 08, 2014 at 05:08:43PM +0100, Anoob Soman wrote:
> >>Since openvswitch has moved to a multi-threaded model, RSS usage of
> >>ovs-vswitchd has increased quite significantly compared to the last
> >>release we used (ovs-1.4.x). Part of the problem is using mlockall
> >>(with MCL_CURRENT|MCL_FUTURE) on ovs-vswitchd, which causes every
> >>pthread stack's and the heap's virtual address space to be locked
> >>into RAM. ovs-vswitchd (2.1.x) running on an 8 vCPU dom0 (10
> >>pthreads) uses around 89MB of RSS (80MB just for stacks), without
> >>any VMs running on the host. One way to reduce RSS would be to
> >>reduce the number of "n-handler-threads" and
> >>"n-revalidator-threads", but I am not sure about the performance
> >>impact of having these thread counts reduced. I am wondering if the
> >>stack size of the pthreads can be reduced (using
> >>pthread_attr_setstack). By default the pthread max stack size is
> >>8MB, and mlockall locks all of this 8MB into RAM. What would be an
> >>optimal stack size to use?
> >I think it would be very reasonable to reduce the stack sizes, but I
> >don't know the "correct" size off-hand.  Since you're looking at the
> >problem already, perhaps you should consider some experiments.
> 
