[ovs-dev] STT Implementation Thoughts

Sat Mar 31 01:47:43 UTC 2012

On Fri, Mar 30, 2012 at 02:49:52PM -0700, Jesse Gross wrote:
> On Tue, Mar 27, 2012 at 12:40 AM, Simon Horman <horms at verge.net.au> wrote:
> > On Mon, Mar 26, 2012 at 02:36:31PM +0900, Simon Horman wrote:
> >> On Wed, Mar 21, 2012 at 12:09:09PM -0700, Jesse Gross wrote:
> >> > On Wed, Mar 21, 2012 at 1:52 AM, Simon Horman <horms at verge.net.au> wrote:
> >> > > Hi Jesse, Hi all,
> >> > >
> >> > > I am currently investigating how STT[1] may be implemented for Open vSwitch.
> >> > > In the course of this it has struck me that a property particular
> >> > > to STT is that it uses the IP protocol number of TCP but isn't actually TCP,
> >> > > which makes me wonder how its receive side may be hooked into the Linux
> >> > > kernel.
> >> > >
> >> > > My thought so far is that some modifications may need to be made to
> >> > > tcp_v4_rcv() and/or tcp_v6_rcv() to skip TCP processing of packets
> >> > > for sockets that have been bound by STT. But I was wondering if you
> >> > > have any thoughts on this.
> >> >
> >> > I agree it is an issue.  I think what we want is something analogous
> >> > to encap_rcv in the UDP stack, which is used for exactly this purpose
> >> > (for example, the OVS CAPWAP implementation uses it, as do L2TP and
> >> > UDP encapsulated IPsec upstream).  It's perhaps somewhat more likely
> >> > to be contentious since with UDP the packets are still processed by
> >> > the UDP stack just without the userspace termination part whereas this
> >> > takes the TCP state machine out of the picture but at least there is
> >> > precedent.
> >>
> >> Agreed, it does seem that there could be some cause for contention there.
> >>
> >> The next thing that I am puzzling over is how to select a source port.
> >> The draft makes reference to using a hash, which seems like a nice idea.
> >> However I am concerned about ensuring that a) the port isn't already in use
> >> and b) nothing else uses it while STT is using it. It seems to me that
> >> an obvious but not necessarily very efficient way to do this would be to
> >> create a sock using sock_create() and bind it using a modified
> >> version of inet_bind() or similar. Do you have any thoughts on this?
> >
> > Ok, scratch that for the most part. I see that the CAPWAP implementation
> > makes use of sock_create_kern() to create a socket. So it seems that it
> > would be reasonable for an STT implementation to do similar.
> 
> I'm actually not sure that it's necessary to really allocate source
> ports at all.  The only thing that needs to be unique is the
> SIP/DIP/SPORT/DPORT 4-tuple.  It's important for the kernel to do port
> allocation if you can have multiple userspace programs that are trying
> to connect to the same remote IP/port combination but in this case,
> STT effectively "owns" that remote peer (and any program that tries to
> establish a TCP connection to it will fail anyways since the remote
> will presumably expect it to be STT traffic).  As a result, as long as
> there is only one STT stack running, nobody should stomp on that
> unique identifier.

Hi Jesse,

thanks, you make a good point which I had overlooked.
I agree that it's probably not necessary to allocate the source port.
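
To make the conclusion above concrete, here is a rough sketch of picking the
source port purely from a hash of the inner flow, with no socket and no bind.
Everything in it is an assumption for illustration: struct flow_key,
flow_hash() and stt_select_sport() are hypothetical names rather than OVS or
kernel code, FNV-1a stands in for whatever hash the datapath already
computes, and the 49152-65535 range is simply the usual dynamic port range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical key describing the inner (encapsulated) flow. */
struct flow_key {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
    uint8_t  proto;
};

/* Assumed port range: the dynamic/ephemeral ports. */
#define STT_SPORT_MIN 49152u
#define STT_SPORT_MAX 65535u

/* Mix the low nbytes bytes of v into an FNV-1a hash state. */
static uint32_t fnv1a_mix(uint32_t h, uint32_t v, int nbytes)
{
    for (int i = 0; i < nbytes; i++) {
        h ^= (v >> (8 * i)) & 0xffu;
        h *= 16777619u;
    }
    return h;
}

/* Hash the inner flow fields; a stand-in for the real flow hash. */
static uint32_t flow_hash(const struct flow_key *k)
{
    uint32_t h = 2166136261u;   /* FNV-1a 32-bit offset basis */

    h = fnv1a_mix(h, k->src_ip, 4);
    h = fnv1a_mix(h, k->dst_ip, 4);
    h = fnv1a_mix(h, k->src_port, 2);
    h = fnv1a_mix(h, k->dst_port, 2);
    h = fnv1a_mix(h, k->proto, 1);
    return h;
}

/*
 * Map the flow hash into the assumed port range. Nothing is allocated or
 * reserved: the same inner flow always yields the same outer source port.
 */
static uint16_t stt_select_sport(const struct flow_key *k)
{
    uint32_t range = STT_SPORT_MAX - STT_SPORT_MIN + 1;

    assert(k != NULL);
    return (uint16_t)(STT_SPORT_MIN + flow_hash(k) % range);
}
```

Since the port is a pure function of the inner flow, packets of one flow keep
the same outer 4-tuple, while different flows will tend to spread across
source ports. This only holds up under the condition Jesse describes, that a
single STT stack is talking to the remote peer.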