[ovs-dev] [PATCH] sFlow export: include standard tunnel structures (for GRE, VXLAN etc.)

Jesse Gross jesse at nicira.com
Tue Oct 22 22:46:54 UTC 2013


On Mon, Oct 21, 2013 at 2:33 PM, Romain Lenglet <rlenglet at vmware.com> wrote:
> ----- Original Message -----
>> From: "Romain Lenglet" <rlenglet at vmware.com>
>> To: "Jesse Gross" <jesse at nicira.com>
>> Cc: dev at openvswitch.org
>> Sent: Friday, October 18, 2013 6:46:05 PM
>> Subject: Re: [ovs-dev] [PATCH] sFlow export: include standard tunnel structures (for GRE, VXLAN etc.)
>>
>> ----- Original Message -----
>> > From: "Jesse Gross" <jesse at nicira.com>
>> > To: "Romain Lenglet" <rlenglet at vmware.com>
>> > Cc: "Neil Mckee" <neil.mckee at inmon.com>, dev at openvswitch.org
>> > Sent: Friday, October 18, 2013 6:23:23 PM
>> > Subject: Re: [ovs-dev] [PATCH] sFlow export: include standard tunnel
>> > structures (for GRE, VXLAN etc.)
>> >
>> > On Fri, Oct 18, 2013 at 5:58 PM, Romain Lenglet <rlenglet at vmware.com>
>> > wrote:
>> > > ----- Original Message -----
>> > >> From: "Jesse Gross" <jesse at nicira.com>
>> > >> To: "Romain Lenglet" <rlenglet at vmware.com>
>> > >> Cc: "Neil Mckee" <neil.mckee at inmon.com>, dev at openvswitch.org
>> > >> Sent: Friday, October 18, 2013 5:50:05 PM
>> > >> Subject: Re: [ovs-dev] [PATCH] sFlow export: include standard tunnel
>> > >> structures (for GRE, VXLAN etc.)
>> > >>
>> > >> On Fri, Oct 18, 2013 at 5:43 PM, Romain Lenglet <rlenglet at vmware.com>
>> > >> wrote:
>> > >> > ----- Original Message -----
>> > >> >> From: "Romain Lenglet" <rlenglet at vmware.com>
>> > >> >> To: "Neil Mckee" <neil.mckee at inmon.com>
>> > >> >> Cc: dev at openvswitch.org
>> > >> >> Sent: Wednesday, October 9, 2013 10:30:17 AM
>> > >> >> Subject: Re: [ovs-dev] [PATCH] sFlow export: include standard tunnel
>> > >> >> structures (for GRE, VXLAN etc.)
>> > >> >>
>> > >> >> On Oct 8, 2013, at 10:09 PM, Neil Mckee <neil.mckee at inmon.com> wrote:
>> > >> >> > +    /* Indicate 0==unknown for the src_port. It may be set to a
>> > >> >> > random
>> > >> >> > +       number on a flow-by-flow basis to increase entropy for ECMP
>> > >> >> > fabrics.
>> > >> >> > +       The assumption being made here is that it is not so
>> > >> >> > important
>> > >> >> > to
>> > >> >> > +       report this.  At least not important enough to justify the
>> > >> >> > effort
>> > >> >> > +       of making it accessible here. */
>> > >> >>
>> > >> >> Exporting the UDP source port is essential.
>> > >> >> You also have to export the tunnel key: GRE key (32- or 64-bit), VNI
>> > >> >> (24-bit), etc.
>> > >> >> I don't see how this feature could be useful without the UDP source
>> > >> >> port
>> > >> >> and
>> > >> >> tunnel key.
>> > >> >
>> > >> > I thought more about this. Exporting the source UDP port is really
>> > >> > important. Since the source port is calculated in the tunnel port at
>> > >> > egress during encapsulation and is lost at ingress during
>> > >> > decapsulation,
>> > >> > and the sampling here is done before encapsulation or after
>> > >> > decapsulation,
>> > >> > the easiest way I can imagine to determine the source port is to redo
>> > >> > the
>> > >> > hashing here. This would require factorizing the hashing code into a
>> > >> > function that can be used in userspace in this code.
>> > >>
>> > >> I don't think that it's really viable to regenerate the hash used to
>> > >> compute the source port. In the best case, we are the ones generating
>> > >> it but the kernel hash function might change or the hash might come
>> > >> from the NIC. In the worst case, when we receive a packet the hash
>> > >> could have been generated by a non-OVS device with an unknown hash
>> > >> algorithm.
>> > >>
>> > >
>> > > Yes, agreed, that's a problem.
>> > > The only other alternative I can imagine to get the source UDP port is to
>> > > do
>> > > the sampling in the port (esp. in the tunnel port) in the datapath.
>> > > This would be quite intrusive and complicated, as it would require the
>> > > ports
>> > > to do sampling and upcalls.
>> > > I'd prefer to avoid that.
>> > > Do you see any other alternative?
>> >
>> > I guess it's not entirely clear to me at this point why it's important
>> > to record the UDP source port. Can you explain?
>>
>> Identifying all the flows for a tunnel in the network is useful to detect
>> changes in the routing of tunnel flows, which can e.g. be due to network
>> failures (e.g. a link went down, and the flows are rerouted), and might
>> impact the tunnel as a whole. This is useful for root cause analysis.
>> If we didn't get all the tunnel flow headers from the hosts, we would lose
>> some of the information.
>
> More importantly, we want to be able to map a logical flow to a specific
> tunnel flow (i.e. the tunnel's IP+transport header), to determine the path
> taken by a logical flow in the physical fabric.
> This is possible because the tunnel header, incl. the transport source port,
> uniquely identifies that tunnel flow in the physical network.
> If we don't have the source port from OVS, we can't do that mapping.
>
> Here's a proposal:
>
> - Factorize get_src_port() out of datapath/vport-lisp.c to be shared by all
>   vport types.
>
> - Modify datapath/vport-vxlan.c to call get_src_port() instead of
>   vxlan_src_port(). The VXLAN RFC doesn't specify any specific hashing
>   algorithm, so it should be fine to just use the same get_src_port()
>   hashing as for LISP.
>
> - Always calculate the hash in kernelspace for each packet sent in an
>   upcall, or only for some types of upcalls e.g. sFlow / IPFIX sampling
>   upcalls, and send it in the upcall so that userspace gets the transport
>   source port independently of the input or output tunnel type.

To clarify, I think this would need to have two parts:
 - For received packets, include the source port of the outer UDP
header. This can't simply be computed because the original sender
might have used an unknown hash algorithm.
 - Compute the hash for all packets because they might be sent to a tunnel port.

Is that right?

The second one in particular seems a little odd to me. The other thing
that I think is important to be careful of is how this will interact
with megaflows. In the traditional OVS case with a very wide exact
match, it was likely (although perhaps not guaranteed) that the hash
computed for the source port was fixed for a given flow. This is
definitely not true any longer and while it may not matter if it is
only needed on a per-sampled-packet basis, it affects where and how it
is attached to a flow or upcall.
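
To make the megaflow concern concrete, here is a rough sketch of how a
per-packet hash is typically scaled into a UDP source port range. It is
only an illustration: struct inner_flow, toy_flow_hash() and
src_port_from_hash() are hypothetical names, the FNV-1a hash is a
stand-in for whatever hash the kernel or NIC actually supplies, and the
port range in the usage note is an assumed example rather than anything
read from the datapath.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical inner-packet fields that feed the hash. */
struct inner_flow {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
    uint8_t proto;
};

/* Fold the four bytes of v into an FNV-1a hash state. */
static uint32_t fnv1a_mix(uint32_t h, uint32_t v)
{
    int i;

    for (i = 0; i < 4; i++) {
        h ^= (v >> (8 * i)) & 0xff;
        h *= 16777619u;
    }
    return h;
}

/* Stand-in flow hash (assumption: NOT the hash used by the datapath). */
static uint32_t toy_flow_hash(const struct inner_flow *f)
{
    uint32_t h = 2166136261u;

    h = fnv1a_mix(h, f->src_ip);
    h = fnv1a_mix(h, f->dst_ip);
    h = fnv1a_mix(h, ((uint32_t)f->src_port << 16) | f->dst_port);
    h = fnv1a_mix(h, f->proto);
    return h;
}

/* Scale a 32-bit hash into the half-open port range [low, high). */
static uint16_t src_port_from_hash(uint32_t hash, uint16_t low, uint16_t high)
{
    uint32_t range;

    assert(low < high);
    range = (uint32_t)high - low;
    return (uint16_t)((((uint64_t)hash * range) >> 32) + low);
}
```

With an assumed range of 32768 to 61000, calling
src_port_from_hash(toy_flow_hash(&f), 32768, 61000) gives a stable port
for a fixed set of inner fields. Two packets that match the same
megaflow but differ in, say, their inner L4 ports will generally hash
to different values, so the result would have to travel with each
sampled packet rather than with the flow.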


