[ovs-discuss] max mega flow 64k per pmd or per dpcls?

Hui Xiang xianghuir at gmail.com
Fri Jun 30 13:33:40 UTC 2017


Thank you very much, Bodireddy; I appreciate your reply.



On Fri, Jun 30, 2017 at 5:19 PM, Bodireddy, Bhanuprakash <bhanuprakash.bodireddy at intel.com> wrote:

> >
> >Thanks Bodireddy.
> >
> >Sorry, I am a bit confused about the EMC size occupied per PMD; here [1]
> >tells a different story.
>
> Initially the EMC had 1024 entries, and the patch [1] increased it to 8k.
> With that change, in simple test scenarios most of the flows will hit the
> EMC and we can achieve wire speed for smaller packets.
> With this patch the EMC size is now ~4MB per PMD thread.
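>
> A quick back-of-envelope check of that figure (a sketch: the ~512-byte
> entry size below is an assumption; the real structs live in
> lib/dpif-netdev.c and vary between releases):
>
>     /* Rough arithmetic behind the ~4MB EMC figure (illustrative only). */
>     #include <stdio.h>
>
>     #define EM_FLOW_HASH_ENTRIES (1 << 13)  /* 8192 entries after [1] */
>     #define ASSUMED_ENTRY_BYTES  512        /* flow ptr + packed key  */
>
>     int main(void)
>     {
>         double mb = (double)EM_FLOW_HASH_ENTRIES * ASSUMED_ENTRY_BYTES
>                     / (1024.0 * 1024.0);
>         printf("EMC per PMD thread: ~%.1f MB\n", mb);  /* prints ~4.0 */
>         return 0;
>     }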
>
> It should be noted that there can be multiple PMD threads, and a whole
> lot of other threads running on the compute node. Cache is limited, and
> all of these threads contend for LLC resources. This leads to cache
> thrashing and impacts performance.
>
> To alleviate this problem, Intel's Resource Director Technology (RDT) is
> used to partition the LLC and assign cache ways to different threads
> based on priority.
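>
> For example, with the pqos utility from Intel's intel-cmt-cat package
> (illustrative values; the class-of-service id, way mask, and core list
> depend on the platform and deployment):
>
>     # Give class-of-service 1 four dedicated LLC ways, then associate
>     # the PMD cores (here cores 2 and 3) with that class:
>     pqos -e "llc:1=0x000f"
>     pqos -a "llc:1=2,3"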
>
> >
> >Do you mean that in real scenarios OVS-DPDK can be memory bound on the EMC?
> OvS is flexible and results are use-case dependent. With some use cases,
> the EMC quickly gets saturated and packets will be sent to the classifier.
> Some of the bottlenecks I referred to are in the classifier.
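>
> In code terms, the per-packet cascade looks roughly like the sketch
> below (simplified; the names approximate lib/dpif-netdev.c rather than
> quoting the exact OVS API):
>
>     /* Simplified sketch of the PMD fast-path lookup cascade. */
>     struct pkt;                    /* stand-in for the parsed flow key  */
>     struct flow;                   /* stand-in for a datapath flow      */
>     struct emc;                    /* exact-match cache, 8k entries/PMD */
>     struct dpcls;                  /* per-PMD megaflow classifier       */
>
>     struct flow *emc_lookup(struct emc *, const struct pkt *);
>     struct flow *dpcls_lookup(struct dpcls *, const struct pkt *);
>     void emc_insert(struct emc *, const struct pkt *, struct flow *);
>     struct flow *upcall(const struct pkt *);  /* slow path to ofproto */
>
>     struct flow *fastpath(struct emc *cache, struct dpcls *cls,
>                           const struct pkt *p)
>     {
>         struct flow *f = emc_lookup(cache, p);  /* cheapest: exact hit */
>         if (!f) {
>             f = dpcls_lookup(cls, p);           /* wildcard/megaflow   */
>             if (f) {
>                 emc_insert(cache, p, f);        /* promote into EMC    */
>             } else {
>                 f = upcall(p);                  /* full miss           */
>             }
>         }
>         return f;
>     }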
>
> >I thought the EMC should fit entirely in the LLC.
> As pointed out earlier, there may be a lot of threads on the compute
> node, so this assumption may not always hold.
>
> >
> >If the megaflows are only partly in the LLC, then the cost of copying
> >between memory and the LLC should be large; isn't that unlike what is
> >described as the 'fast path' in userspace, compared with the kernel
> >datapath? And if most of the megaflows are in memory, the reason every
> >PMD has its own dpcls instance is to follow the rule that a PMD thread
> >should keep as much data local as possible, yet not every PMD keeps it
> >in its local cache. If that is true, I can't see why 64k is the limit,
> >unless it is an empirical best value derived from vtune/perf results.
> >
> >You probably have hyper-threading enabled, with 35MB and 28 cores.
>
> I have an E5-2695 v3, dual socket with 14 cores per socket, so I get 56
> logical cores with HT enabled.
>
> - Bhanuprakash.
>
> >
> >[1] https://mail.openvswitch.org/pipermail/ovs-dev/2015-May/298999.html
> >
> >
> >
> >On Thu, Jun 29, 2017 at 10:23 PM, Bodireddy, Bhanuprakash
> ><bhanuprakash.bodireddy at intel.com> wrote:
> >>
> >>I guess the answer is that the typical LLC is 2.5MB per core, so that
> >>there are 64k flows per thread.
> >
> >AFAIK, the number of flows here may not have anything to do with the LLC.
> >There is also the EMC (8k entries), ~4MB per PMD thread.
> >Yes, the performance will be nice with simple test cases (P2P with 1 PMD
> >thread) as most of this fits into the LLC, but in real scenarios OvS-DPDK
> >can be memory bound.
> >
> >BTW, my DUT has a 35MB LLC and 28 cores, so the assumption of 2.5MB/core
> >isn't right.
> >
> >- Bhanuprakash.
> >
> >>
> >>On Fri, Jun 23, 2017 at 11:15 AM, Hui Xiang <xianghuir at gmail.com> wrote:
> >>Thanks Darrell,
> >>
> >>More questions:
> >>Why not allocate 64k for each dpcls? Does the 64k just fit in the L3
> >>cache, or somewhere else? How was such an exact number calculated? If
> >>more ports are added for polling, can I increase the 64k limit to a
> >>bigger value to avoid contention? Thanks.
> >>
> >>Hui.
> >>
> >>
>
>