[ovs-dev] [RFC PATCH 3/3] tunneling: Allow tunnel fragmentation by default.

Rajahalme, Jarno (NSN - FI/Espoo) jarno.rajahalme at nsn.com
Thu Mar 28 12:41:29 UTC 2013


On Mar 27, 2013, at 1:47 , ext Jesse Gross wrote:

> On Mon, Mar 25, 2013 at 12:03 PM, Jarno Rajahalme
> <jarno.rajahalme at nsn.com> wrote:
>> Changes the default tunnel dont_fragment from "true" (don't
>> fragment) to "false" (allow fragmentation).  Tunnel outer headers
>> will not have the DF bit set by default, and if "df=true" option is
>> given for a tunnel, also local fragmentation will be disabled.
>> The name of the option is changed from "df_default" to "df" to be in
>> line with the rest of the tunneling code.
>> 
>> Signed-off-by: Jarno Rajahalme <jarno.rajahalme at nsn.com>
> 
> I can see the desire to make these two settings consistent, although
> it really seems preferable to me to have DF on in most situations to
> avoid possible repeated fragmentation.  I also don't know that there's
> much benefit to turning local_df off since the alternative is to
> simply drop the packet (it will also generate an ICMP message but in
> the case of tunnels, the sender will never get it).

I have no need to insist, but it seems to me that DF should be used (only) when doing path MTU discovery, i.e., when you are prepared to receive the associated ICMP messages and decrease your message size accordingly. Since OVS no longer does that for tunnels, maybe DF use should also be retired. Related to this, I'd think most implementations fragment only as a last resort to avoid dropping packets. So, by setting DF and not doing PMTUD we are essentially saying that it is OK to drop the tunneled packets if they don't happen to fit the MTU of a link somewhere down the path.

So, if we retire the use of DF by default, we can drop the local_df (read as: "local_do_fragment" :-) setting and let the tunnel config choose between "OK to fragment" and "DO NOT fragment" for the whole path, including the local stack.

One strategy to avoid unnecessary fragmentation would be to not use the maximum fragment size when you must fragment. For example, if you have a 1600-byte IP packet to transmit over a link with an MTU of 1500 bytes, it would be better to fragment it into 812-byte and 808-byte packets than into, say, 1500-byte and 120-byte packets. That way the risk of further fragmentation (e.g., due to yet another layer of tunneling) would be smaller.
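
The arithmetic behind that example can be sketched roughly as follows. This is only an illustration, not OVS or kernel code: the function names are hypothetical, and it assumes a 20-byte IPv4 header without options, with every fragment payload except the last being a multiple of 8 bytes (as the IPv4 fragment offset field requires).

```python
IP_HEADER_LEN = 20  # assumption: IPv4 header without options


def greedy_fragments(packet_len, mtu, header_len=IP_HEADER_LEN):
    """Fill every fragment up to the MTU; the last one gets the remainder."""
    if packet_len <= mtu:
        return [packet_len]
    # Largest payload per fragment, rounded down to a multiple of 8.
    max_payload = (mtu - header_len) // 8 * 8
    if max_payload <= 0:
        raise ValueError("MTU too small to carry any payload")
    remaining = packet_len - header_len
    sizes = []
    while remaining > 0:
        chunk = min(max_payload, remaining)
        sizes.append(chunk + header_len)
        remaining -= chunk
    return sizes


def balanced_fragments(packet_len, mtu, header_len=IP_HEADER_LEN):
    """Use the same number of fragments, but make them roughly equal."""
    if packet_len <= mtu:
        return [packet_len]
    max_payload = (mtu - header_len) // 8 * 8
    if max_payload <= 0:
        raise ValueError("MTU too small to carry any payload")
    payload = packet_len - header_len
    # Minimum number of fragments needed (integer ceiling division).
    count = -(-payload // max_payload)
    # Even share of the payload, rounded up to a multiple of 8.
    share = -(-payload // (count * 8)) * 8
    remaining = payload
    sizes = []
    while remaining > 0:
        chunk = min(share, remaining)
        sizes.append(chunk + header_len)
        remaining -= chunk
    return sizes


print(greedy_fragments(1600, 1500))    # [1500, 120]
print(balanced_fragments(1600, 1500))  # [812, 808]
```

Both variants produce two fragments here, but in the balanced case each fragment keeps close to 700 bytes of headroom below 1500, so another layer of encapsulation is less likely to force a second round of fragmentation.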

Finally, the option name "df_default" seems like a remnant from the time when we had the "df_inherit" option. IMO that should be fixed regardless.

  Jarno



