[ovs-dev] [PATCH v2] Simplify kernel sFlow implementation

Jesse Gross jesse at nicira.com
Fri Aug 19 01:55:22 UTC 2011


On Aug 19, 2011 6:41 AM, "Neil McKee" <neil.mckee at inmon.com> wrote:
>
>
> On Aug 18, 2011, at 4:15 AM, Jesse Gross wrote:
>
>> On Aug 18, 2011 12:21 AM, "Ben Pfaff" <blp at nicira.com> wrote:
>> >
>> > [bringing Neil McKee into the conversation since he knows more about
>> > this than me]
>> >
>> > On Wed, Aug 17, 2011 at 02:17:03PM +0800, Jesse Gross wrote:
>> > > The one piece that I'm the least sure of is how to handle stats.  In
>> > > my ideal world, we would drop the sample pool altogether and have
>> > > userspace figure it out for itself.  The information is clearly
>> > > present in userspace to calculate any packet count that may be
>> > > required but it might not be instantaneously accurate.  I'm unsure of
>> > > the consequences of this. I think it's reasonable to report some kind
>> > > of stats (current flow stats seems to fit) but if you're trying to get
>> > > the current port counters then that isn't really useful.  Directly
>> > > reporting port count seems a little out of place to me.  Ben, (or
>> >            ^^^^^^^^^ packet count?
>>
>> Yes, sorry, I meant packet count on the port (as opposed to the flow),
>> which is essentially what the sample pool is.
>>
>> > > anyone else) do you have any thoughts?
>> >
>> > My understanding is that the sflow sample pool should be
>> > asymptotically correct, that is, it should be within O(1) of the
>> > correct value as n increases to infinity.  So I think that it's OK if
>> > the stats are always instantaneously behind a bit.
>>
>> OK, I buy that and it means that userspace already has all the
>> information it needs.
>
> Yes, if you are certain that the counters available in user-space are
> never incremented in a place where the current sample pool would not be (or
> vice-versa) then you could eliminate that atomic_inc() step from the
> datapath.  However since execute_actions() is called in two places it seems
> like this might be quite hard to verify for all cases(?)

Userspace actually has a pretty good idea of the stats (and a lot of work
has gone into making this the case over time).  In fact, by making sFlow sit
on top of all of this infrastructure instead of being a special case, it
automatically picks up all of this work and any future work.

Two examples:
* I recently noticed a bug while working on something else: packets sent
by userspace (typically the first packet in a flow) will never be sampled by
sFlow.
* Atomic operations are quite slow, which means that enabling sFlow results
in a major performance hit.

The solution proposed here wouldn't have had either one of these problems
because the normal stats counters are fast and get more development time,
testing, etc.
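
To make the idea concrete, here is a rough sketch of deriving the sample
pool in userspace from port packet counters that are already maintained,
rather than doing an atomic_inc() per packet in the datapath.  All of the
names below (struct port_stats, struct sflow_sampler, sampler_init(),
sampler_update_pool()) are hypothetical and are not taken from the patch,
and the 32-bit width of the pool is an assumption about the field that
gets reported:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-port counters, as userspace might already track them. */
struct port_stats {
    uint64_t rx_packets;
    uint64_t tx_packets;
};

/* Hypothetical userspace sFlow sampler state for one port. */
struct sflow_sampler {
    uint64_t base_rx;      /* rx_packets at the time sampling was enabled. */
    uint32_t sample_pool;  /* Packets that could have been sampled so far
                            * (assumed 32-bit, truncated on overflow). */
};

/* Records the counter value at the moment sampling is enabled. */
static void
sampler_init(struct sflow_sampler *s, const struct port_stats *stats)
{
    assert(s && stats);
    s->base_rx = stats->rx_packets;
    s->sample_pool = 0;
}

/* Recomputes the sample pool from the ordinary port counter.  No
 * per-packet atomic operation is needed; the value may lag slightly
 * behind the datapath but does not drift over time. */
static uint32_t
sampler_update_pool(struct sflow_sampler *s, const struct port_stats *stats)
{
    assert(stats->rx_packets >= s->base_rx);
    s->sample_pool = (uint32_t) (stats->rx_packets - s->base_rx);
    return s->sample_pool;
}
```

In this sketch, sampler_update_pool() would be called whenever a sample is
about to be reported, so the pool is only as stale as the port counters
themselves.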

> Another check worth confirming is that the random number generator you are
> using will always converge exactly to the intended 1-in-N mean (before too
> long).  That gives you another fall-back position.

I'm not sure what you mean by "fall-back position"?
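
For the convergence check itself, something along these lines could be
used to confirm that a generator samples close to 1-in-N over a long run.
This is only an illustration: xorshift32() is a stand-in generator and is
not the one the datapath uses, and count_samples() is a hypothetical
helper:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in PRNG (Marsaglia xorshift32); 'state' must be nonzero. */
static uint32_t
xorshift32(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Simulates 'n_packets' packets with a 1-in-'n' sampling decision made
 * independently per packet and returns how many were sampled. */
static uint64_t
count_samples(uint32_t seed, uint32_t n, uint64_t n_packets)
{
    assert(seed != 0 && n != 0);

    uint32_t state = seed;
    uint64_t samples = 0;
    for (uint64_t i = 0; i < n_packets; i++) {
        if (xorshift32(&state) % n == 0) {
            samples++;
        }
    }
    return samples;
}
```

Comparing count_samples(seed, N, n_packets) against n_packets / N for a
few seeds and values of N shows how far the observed rate is from the
intended mean.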