[ovs-dev] [PATCH] datapath: Make port numbers for UDP-based tunnels userspace configurable.

Jesse Gross jesse at nicira.com
Tue Nov 1 21:12:46 UTC 2011


On Nov 1, 2011, at 1:19 PM, Ben Pfaff <blp at nicira.com> wrote:

> On Fri, Oct 28, 2011 at 06:37:42PM -0700, Jesse Gross wrote:
>> On Fri, Oct 28, 2011 at 11:29 AM, Ben Pfaff <blp at nicira.com> wrote:
>>> On Thu, Oct 27, 2011 at 04:38:39PM -0700, Jesse Gross wrote:
>>>> On Wed, Oct 26, 2011 at 3:27 PM, Ben Pfaff <blp at nicira.com> wrote:
>>>>> diff --git a/datapath/tunnel.c b/datapath/tunnel.c
>>>>> index 372d90e..047961f 100644
>>>>> --- a/datapath/tunnel.c
>>>>> +++ b/datapath/tunnel.c
>>>>> +static int tnl_sock_bind(const struct tnl_ops *tnl_ops, struct tnl_mutable_config *mutable)
>>>>> +{
>>>>> +#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,26)
>>>>> + ?? ?? ?? struct sockaddr_in sin;
>>>>> + ?? ?? ?? struct tnl_sock *ts;
>>>>> + ?? ?? ?? int err;
>>>>> +
>>>>> + ?? ?? ?? if (tnl_ops->ipproto != IPPROTO_UDP)
>>>>> + ?? ?? ?? ?? ?? ?? ?? return 0;
>>>>> +
>>>>> + ?? ?? ?? list_for_each_entry (ts, &tnl_socks, list) {
>>>>> + ?? ?? ?? ?? ?? ?? ?? if (ts->port == mutable->dst_port) {
>>>>> + ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? if (udp_sk(ts->socket->sk)->encap_rcv != tnl_ops->udp_rcv)
>>>>> + ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? return -EADDRINUSE;
>>>>
>>>> When I originally proposed this, I thought that we could directly use
>>>> the socket lookup to find the tunnel. ??I now see that it isn't
>>>> possible because of the key (and there is no way to know whether the
>>>> key is important until after you do the tunnel lookup). ??Is there any
>>>> benefit to binding the socket more specifically to the tunnel using
>>>> the remote and local (if there is one) IPs? ??The main benefit of this
>>>> would be to enable finer grained binding of sockets for different
>>>> tunnel types. ??I don't know whether that is more or less confusing
>>>> though. ??It also makes the UDP socket lookup marginally faster.
>>>
>>> If I do that I think I'll need a hash table for the tnl_socks since
>>> there's likely to be many of them (one per local/remote IP pair) if
>>> someone configures a bunch of tunnels instead of just one (for a given
>>> port).
>>>
>>> I'm willing to do it, your paragraph doesn't make me sure whether you
>>> really want it though.
>>
>> My gut feeling is that it's probably the right thing to do (and
>> actually follow through to use the socket for tunnel lookup, modulo
>> the key; in the keyless case we don't even have to anything else).  It
>> doesn't excite me that much because it results in parallel code paths
>> for UDP based protocols vs. everything else.  However, I think for
>> upstreaming where we only have VXLAN it will look a lot more correct
>> if we do it this way and it should perform slightly better as well.
>
> OK, I have this work in progress but I'm not sure what you have in
> mind for using the UDP socket as part of the lookup.  There's one case
> where we can obviously short-circuit the logic: if there's exactly one
> tunnel associated with a tnl_sock then we can just check whether that
> one is appropriate for the packet.  Otherwise I don't see what we can
> do much better than the existing tnl_find_port().

Yes, that's the case that I'm thinking of but I think it is actually
rather common. If we bind the socket to the specific local/remote
addresses/ports then the only case where we will have more than one
port per socket is when there is more than one key defined.  That
means that if we are using the flow key based approach then we don't
need a lookup and never have worry about the wildcarded local port or
multicast.

> Also it's not clear to me that the UDP socket lookup logic is a real
> winner performance-wise, see the comment
> /* UDP is nearly always wildcards out the wazoo, it makes no sense to try
> * harder than this. -DaveM
> */
> on __udp4_lib_lookup() and the lovely scoring function in there.

Yes but we always need to do the UDP lookup and this allows us to
avoid doing the same thing all over again.  UDP will also match sooner
with fewer wildcards.



More information about the dev mailing list