[ovs-dev] sFlow extension for tunnels / MPLS - question about user-space flow-cache

Jesse Gross jesse at nicira.com
Mon Apr 6 22:23:40 UTC 2015


On Sat, Apr 4, 2015 at 8:45 AM, Neil McKee <neil.mckee at inmon.com> wrote:
>
> On Fri, Apr 3, 2015 at 3:14 PM, Jesse Gross <jesse at nicira.com> wrote:
>>
>> On Wed, Apr 1, 2015 at 10:14 PM, Neil McKee <neil.mckee at inmon.com> wrote:
>> > I've been looking at filling in the sFlow structures to report on tunnel
>> > encap and other transformations. sFlow sampling is best done on ingress
>> > only, so I can't use the egress-sampling action that the IPFIX
>> > implementation uses to get the tunnel info. So how should I look up the
>> > list of actions for a flow when an sFlow sample appears in
>> > ofproto-dpif-upcall.c:process_upcall()? There is not enough room for all
>> > the fields we need in the (8-byte) userdata-cookie that goes with the
>> > compiled actions into the kernel and is passed back on the upcall. But it
>> > should be possible to look up the whole list of actions in user-space when
>> > we process the upcall. Here are some possible ways. Please comment:
>>
>> If you're trying to get the outer tunnel information for both ingress
>> and where a packet will eventually egress, I don't think that having
>> the list of kernel actions is sufficient. It can tell you some things,
>> like the tunnel type and destination IP address but not UDP source
>> port, for example.
>
>
> The UDP source port of the tunnel is not so important for sFlow monitoring.
> The usage-model for sFlow is to run it on all ports of all switches,  so if
> you need to know those details you can always pick them up at the next
> switch where the encapsulated packets will be sampled at ingress.
>
> sFlow is strict about some things (such as the nature of the
> packet-count-based random sampling or the freshness of the pushed-counters)
> but the annotation structures allow you to encode "unknown" for any field,
> especially if getting that field is more trouble than it's worth.  I think
> the UDP source port of a tunnel is a prime example.  It's just a random
> number.  It doesn't add much to the story.  It's no problem to leave it out.
>
> Even if we can only supply the outer ip_dst,  outer ip-proto, and VNI (where
> applicable),  that's most of the value.
>
>>
>> Can you explain why sampling must be done at the beginning of the
>> pipeline? Is it just trying to avoid other transforms like vlan tags?
>
>
> While there is no strict rule about ingress v. egress sampling in the
> standard,  the practical considerations are compelling.  The most obvious is
> that ingress-sampling is your one chance to see and count the packet as it
> came from the host,  before it gets mangled in various ways and sent on,
> possibly going out via multiple interfaces and tunnels.
>
> A less obvious consideration is the value of uniformity from hop to hop
> through the fabric.  I know of only one legacy product family that used
> egress-sampling (by necessity given the ASIC design - it was egress or
> nothing).  Every other implementation has ingress-sampling.   When all the
> devices have ingress sampling then the solution tessellates well and you end
> up with good visibility in both directions on every link in the network.
>
> Parsing the actions to extract tunnel info is working well for me in
> prototype.  I can report egress tunnels whether they were tunnel-SET or
> tunnel-PUSH operations, and I can populate some of the sFlow-MPLS
> structures too.  There are even some sFlow v4/v6 NAT structures that might
> be worth exporting if it looks like actions are rewriting addresses and
> ports.  Hard to predict all the crazy things OVS will be used for,  so it's
> nice to have a mechanism that is future-proofed.
>
> Just need a reliable way to get the actions.

If we have the actions in userspace and it works well enough to get
them from there, then that seems like the preferred way to go.
However, at this point, I don't see a huge problem with having the
kernel pass them up either.
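
As a rough illustration of the userspace approach discussed above, the
sketch below walks a flow's action list and fills in the outer ip_dst,
outer ip-proto and VNI, leaving everything else "unknown". All of the
types and names here (sketch_action, sketch_tunnel_info,
sketch_egress_tunnel) are hypothetical simplifications made up for this
example; they are not the real OVS datapath action structures or any
existing OVS API, and reporting only the first tunnel action found is an
assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified action kinds; NOT the real OVS datapath enum. */
enum sketch_action_type {
    SKETCH_ACT_OUTPUT,
    SKETCH_ACT_TUNNEL_SET,
    SKETCH_ACT_TUNNEL_PUSH,
    SKETCH_ACT_PUSH_MPLS
};

/* Hypothetical flattened action; real actions are variable-length. */
struct sketch_action {
    enum sketch_action_type type;
    uint32_t ip_dst;    /* Outer destination IPv4 address, host order. */
    uint8_t ip_proto;   /* Outer IP protocol number. */
    uint32_t vni;       /* Only meaningful when has_vni is true. */
    bool has_vni;
};

/* What the sampler would report; fields stay "unknown" unless found.
 * The tunnel UDP source port is deliberately left out. */
struct sketch_tunnel_info {
    bool known;         /* False if no tunnel action was seen. */
    uint32_t ip_dst;
    uint8_t ip_proto;
    bool vni_known;
    uint32_t vni;
};

/* Scan the action list and report the first tunnel-SET or tunnel-PUSH
 * action encountered.  Everything else (output, MPLS push) is skipped. */
static struct sketch_tunnel_info
sketch_egress_tunnel(const struct sketch_action *acts, size_t n)
{
    struct sketch_tunnel_info info = { false, 0, 0, false, 0 };

    assert(acts != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        const struct sketch_action *a = &acts[i];

        if (a->type == SKETCH_ACT_TUNNEL_SET
            || a->type == SKETCH_ACT_TUNNEL_PUSH) {
            info.known = true;
            info.ip_dst = a->ip_dst;
            info.ip_proto = a->ip_proto;
            info.vni_known = a->has_vni;
            info.vni = a->has_vni ? a->vni : 0;
            break;
        }
    }
    return info;
}
```

A flow that sends packets out through more than one tunnel would need a
list of results rather than a single struct; the sketch ignores that case.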


