[ovs-dev] Statistics Collection Performance Impact

Simon Horman horms at verge.net.au
Sun Dec 11 02:20:50 UTC 2011


On Mon, Nov 28, 2011 at 02:03:10PM -0800, Ben Pfaff wrote:
> On Thu, Nov 10, 2011 at 09:14:50AM -0800, Ben Pfaff wrote:
> > On Thu, Nov 10, 2011 at 06:04:07PM +0900, Simon Horman wrote:
> > > On Wed, Nov 09, 2011 at 08:14:36PM -0800, Ben Pfaff wrote:
> > > > On Thu, Nov 10, 2011 at 10:20:24AM +0900, Simon Horman wrote:
> > > > > On Wed, Nov 09, 2011 at 01:19:42PM -0800, Ben Pfaff wrote:
> > > > > > On Mon, Nov 07, 2011 at 04:20:19PM +0900, Simon Horman wrote:
> > > > > > > Although a very simple and possibly naïve approach, I wonder if
> > > > > > > dynamically extending the interval at which statistics are collected is
> > > > > > > a worthwhile approach to mitigating the performance impact of statistics
> > > > > > > collection.
> > > > > > 
> > > > > > It's one approach.  It has the nice property of being very simple.
> > > > > > 
> > > > > > Another approach that I've been pondering myself is, when there are many
> > > > > > datapath flows, to read out the kernel stats for only some of them at a
> > > > > > time and apply the expiration algorithm to just those.  We could run
> > > > > > expiration just as frequently overall, but it would apply to any given
> > > > > > flow less frequently.
> > > > > 
> > > > > I had also considered that and I think it is an approach worth
> > > > > investigating.  It seems to me that the main challenge will be to
> > > > > arrange things such that a partial statistics update can occur while
> > > > > still allowing all statistics to be updated over time.  I wonder if
> > > > > partitioning the flow hash would be a reasonable way to achieve this.
> > > > 
> > > > I think that it works OK already.  Just start the flow dump and read as
> > > > many flows as you want to deal with at a time, then stop, keeping the
> > > > dump in progress.  Then when you want to keep going later, just keep
> > > > reading the dump.
> > > 
> > > Good point. I'll see about testing that.
> > 
> > Great, I look forward to hearing how well it works.
> 
> Did you get a chance to experiment with this?  I really am interested
> in doing it this way, if it is effective.

Hi Ben,

sorry for the long delay. I finally had a stab at implementing this idea
but as yet it does not seem to be working.

My implementation simply keeps a static struct dpif_flow_dump in
update_stats(), reading a limited number of flows on each call and
restarting the dump as necessary using dpif_flow_dump_start() and
dpif_flow_dump_done().
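
In rough outline it looks something like the sketch below. This is
simplified: error handling and the facet lookup are elided, the dpif
signatures are from memory, and MAX_FLOWS_PER_CALL is a batch-size
constant I have invented for illustration.

static void
update_stats(struct ofproto_dpif *p)
{
    /* Persist the dump across calls so each call resumes where the
     * previous one left off. */
    static struct dpif_flow_dump dump;
    static bool dump_started = false;

    const struct dpif_flow_stats *stats;
    const struct nlattr *key;
    size_t key_len;
    int n = 0;

    if (!dump_started) {
        dpif_flow_dump_start(&dump, p->dpif);
        dump_started = true;
    }

    while (n < MAX_FLOWS_PER_CALL
           && dpif_flow_dump_next(&dump, &key, &key_len,
                                  NULL, NULL, &stats)) {
        /* ... look up the facet for 'key' and fold 'stats' into it,
         * as the existing full-dump loop does ... */
        n++;
    }

    if (n < MAX_FLOWS_PER_CALL) {
        /* Dump exhausted: finish it and start over on the next call. */
        dpif_flow_dump_done(&dump);
        dump_started = false;
    }
}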

The result seems to be that flows are expiring prematurely all over
the place. I assume this is because their stats are not refreshed
before the expiration logic examines them. Do I need to keep track of
more state in order to take into account that the flow_table is
changing between calls to update_stats()?
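
One idea I am considering, though I have not tried it yet: tag each
facet with the generation of the dump that last refreshed its stats,
and have the expiration pass skip any facet that the current partial
dump has not yet reached. This is entirely hypothetical; 'dump_seq',
the new facet member, and the helper names below do not exist:

/* Bumped each time a complete pass over the datapath flows finishes. */
static uint64_t dump_seq;

struct facet {
    /* ... existing members ... */
    uint64_t dump_seq;      /* Value of the global dump_seq when this
                             * facet's stats were last refreshed. */
};

/* In the expiration logic: only trust the idle timeout for facets
 * whose stats were refreshed during the current pass; otherwise a
 * facet that the partial dump has not yet reached would be expired
 * on stale statistics. */
if (facet->dump_seq == dump_seq && facet_is_idle(facet)) {
    expire_facet(p, facet);
}

Does that seem like a reasonable direction, or is there existing state
I should be reusing?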


