[ovs-dev] Tunable flow eviction threshold

Simon Horman horms at verge.net.au
Thu Jul 28 01:54:57 UTC 2011


On Wed, Jul 27, 2011 at 09:31:19AM -0700, Ben Pfaff wrote:
> On Wed, Jul 27, 2011 at 03:20:48PM +0900, Simon Horman wrote:
> > On Tue, Jul 26, 2011 at 05:02:25PM -0700, Ben Pfaff wrote:
> > >     1. To determine what flows should be retained in the kernel, and
> > >        what flows should be discarded, based on the number of packets
> > >        that have recently passed through that flow.  We also need to
> > >        do this to maintain accurate OpenFlow flow statistics.
> > > 
> > >        Currently this is implemented by periodically dumping stats for
> > >        every kernel flow, clearly O(n).  I don't know a better way.
> > 
> > 	I guess this hurts a lot as that code's purpose
> > 	is to mitigate the effects of large numbers of flows.
> > 
> > 	Off the top of my head, I wonder whether it would work
> > 	to ask the kernel only for flows over a certain age.
> > 	Or otherwise push some of the logic into the kernel where
> > 	access to the data ought to be cheaper.
> > 	Or perhaps to have an algorithm that can deal with
> > 	a partial dump from the kernel - e.g. a dump of up to 10,000 flows.
> 
> One strategy that I have considered is to be able to ask only for flows
> that have a non-zero packet count.  That would help with the common
> case where a large number of flows is caused by a port scan or some
> other activity that generates 1-packet flows.  It wouldn't help at
> all in your case.
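
That sounds useful.  For my own understanding, I read that as a
kernel-side predicate applied while walking the flow table during a
dump, along the lines of the sketch below.  It is entirely
hypothetical: the type and field names are invented for illustration
and do not match the actual datapath code.

/* Hypothetical sketch of "dump only flows with a non-zero packet
 * count": a predicate the kernel's flow-dump loop could apply.
 * The struct and names are invented; they are not the datapath's. */
#include <stdbool.h>
#include <stdint.h>

struct example_flow {
    uint64_t packet_count;      /* Packets since stats were last read. */
};

/* Decides whether a flow should be included in the dump when the
 * caller asked for used flows only. */
static bool
should_dump(const struct example_flow *flow, bool used_only)
{
    return !used_only || flow->packet_count > 0;
}
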
> 
> It's already possible to do a partial dump.  Just read, e.g., the first
> 10,000 flows from the Netlink socket and leave the rest unread.  Then
> later read the next 10,000 flows from the socket, and so on.  But I
> don't see how it really solves a problem.  It divides the dump into
> pieces but it doesn't make obtaining it any faster.
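
To make sure we mean the same thing, the chunked reading would look
roughly like the sketch below.  It is a minimal illustration, not
ovs-vswitchd's actual code: issuing the dump request (the genetlink
family, NLM_F_DUMP) and parsing each record are omitted, and 'sock' is
assumed to be a NETLINK_GENERIC socket with a flow dump already in
progress.

#include <stdbool.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Reads up to 'max' flow records from 'sock', leaving any remainder
 * queued in the kernel.  Returns false when NLMSG_DONE ends the dump,
 * true if the caller should come back later for the rest. */
static bool
read_flow_chunk(int sock, int max, int *consumed)
{
    char buf[65536];

    *consumed = 0;
    while (*consumed < max) {
        int len = recv(sock, buf, sizeof buf, 0);
        if (len <= 0) {
            return false;       /* Error or EOF; treat dump as over. */
        }
        for (struct nlmsghdr *nlh = (struct nlmsghdr *) buf;
             NLMSG_OK(nlh, len);
             nlh = NLMSG_NEXT(nlh, len)) {
            if (nlh->nlmsg_type == NLMSG_DONE) {
                return false;   /* Dump complete. */
            }
            /* ...parse one flow record from NLMSG_DATA(nlh)... */
            ++*consumed;
        }
    }
    return true;                /* Stopped early; kernel holds the rest. */
}

As you say, that only spreads the cost out: the kernel generates the
next batch lazily each time the socket is read, so each call is cheap,
but the total work over a full dump is still O(n).
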
> 
> By "flows over a certain age", do you mean flows that have not been used
> in a certain amount of time?  I can see how that helps with eviction,
> although it doesn't help with maintaining OpenFlow flow stats.
> 
> Pushing some of the logic into the kernel might make sense.  I wouldn't
> want to make the kernel implementation less general-purpose in the
> process, though.

Agreed. This conversation has continued elsewhere in this thread.
I'll read those emails more thoroughly before responding in a more
meaningful fashion (hopefully).

> > > > My current test is rather abusive - I'm using the kernel packet generator
> > > > to send packets for many flows at roughly the same per-flow packet rate.
> > > > In that scenario all flows are equal, so I think that classification would
> > > > probably break down. Indeed, it plays havoc with the current eviction algorithm.
> > > > However, I do agree that some kind of prioritisation algorithm could work
> > > > well with less abusive (and more common?) workloads.
> > > 
> > > What does your flow table look like?
> > 
> > Does this answer the question?
> > 
> > sudo ovs-ofctl add-flow br3 "in_port=1 ip nw_dst=10.0.0.0/8 idle_timeout=0 action=output:3"
> > 
> > Where port 3 is a dummy interface.
> >
> > My test involves sending 128-byte packets at 450,000 packets/s (~0.5 Gbit/s).
> > 
> > By tuning the eviction threshold, and with the table resize fix
> > I sent a week or so back, I can get up to 128k flows and still
> > have a little of one CPU left over.
> > 
> > Empirically, I seem to have hit a hard limit of 128k flows.
> 
> Thanks for the info.  I might want to try to replicate your results.  I
> have never used the kernel packet generator: would you mind giving me a
> recipe for producing this behavior?

Sure, it's a bit raw, but it's attached and is exactly what I have been using.
In particular, the bit about "eth4" is left over from earlier
testing work.
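
In outline the script just drives the kernel's /proc/net/pktgen
interface.  The C equivalent below is a rough sketch of that shape
rather than the attached script itself: the device name "eth1", the
packet size, and the destination range are placeholders, so see the
attachment for the exact commands.

#include <stdio.h>
#include <stdlib.h>

/* Writes a single pktgen command to one of the control files under
 * /proc/net/pktgen, exiting on error. */
static void
pg(const char *file, const char *cmd)
{
    FILE *f = fopen(file, "w");
    if (!f || fprintf(f, "%s\n", cmd) < 0 || fclose(f) == EOF) {
        perror(file);
        exit(1);
    }
}

int
main(void)
{
    /* Bind the transmit device to pktgen's first kernel thread. */
    pg("/proc/net/pktgen/kpktgend_0", "rem_device_all");
    pg("/proc/net/pktgen/kpktgend_0", "add_device eth1");

    /* 128-byte packets, run until stopped, and randomize the
     * destination address within 10/8. */
    pg("/proc/net/pktgen/eth1", "pkt_size 128");
    pg("/proc/net/pktgen/eth1", "count 0");
    pg("/proc/net/pktgen/eth1", "dst_min 10.0.0.1");
    pg("/proc/net/pktgen/eth1", "dst_max 10.255.255.254");
    pg("/proc/net/pktgen/eth1", "flag IPDST_RND");

    /* Start transmitting; this write blocks while traffic runs, so
     * write "stop" to pgctrl from elsewhere to end the run. */
    pg("/proc/net/pktgen/pgctrl", "start");
    return 0;
}

Randomising the destination address within 10/8 is one way to make
each packet look like a new flow to the datapath while still matching
the single OpenFlow rule above.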

> Your test case really strikes at the basic assumption I've always made
> about network traffic, which is that when there is heavy traffic there
> are identifiable "heavy hitters", so that one may obtain good
> performance by focusing on keeping the CPU time used to process those
> to a minimum.
> 
> The limit should be higher than 128 kflow.  Comments in datapath/table.h
> imply that the limit should be 256 kflow with a 64-bit kernel or 1 Mflow
> with a 32-bit kernel.  Maybe there's another bug there.

I was also expecting the limit to be 1M flows, and to run out
of CPU at around 200,000 flows for the particular test I am using.
It is of course possible that I am misreading the results.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: pktgen.sh
Type: application/x-sh
Size: 1476 bytes
Desc: not available
URL: <http://mail.openvswitch.org/pipermail/ovs-dev/attachments/20110728/f8265f0e/attachment-0004.sh>

