[ovs-dev] Path MTU discovery on GRE interfaces

Matthias May matthias.may at westermo.com
Fri Jun 25 15:31:52 UTC 2021

On 24/06/2021 05:51, Jesse Gross wrote:
> On Wed, Jun 23, 2021 at 10:06 AM Ben Pfaff <blp at ovn.org> wrote:
>     [updating Jesse's email address]
>     On Wed, Jun 23, 2021 at 04:48:29PM +0200, Matthias May via dev wrote:
>     > I'm currently fighting with issues where TCP/UDP frames that are larger than the MTU of a GRE tunnel are dropped.
>     > I'm aware of the whys and how to work around the issue, but while looking for solutions I stumbled over the fact that:
>     > * [1] added PMTUD support to OVS
>     > * [2] disabled the feature in v1.9.0 and removed it in v1.10.0
>     >
>     > Even after some significant time looking through the history I haven't found a reason why this was removed, just
>     > that it was removed.
>     >
>     > I started some preliminary work to add PMTUD support to OVS (again), but the fact that it was removed 8 years ago
>     > seems to me like a red flag to not do it (again).
>     >
>     > Could someone fluent with the OVS history from 8 years ago shed some light on why PMTUD support was dropped?
>     > Any pointers to a thread on this topic?
>     It was a layering violation.  This caused problems like, for example,
>     not having a good IP address to send the "frag needed" message from.
> In terms of the history, I believe what happened is that PMTUD support was added before the kernel module was
> upstreamed. When we later submitted the code upstream, we knew that it would not fly due to the layering violations so
> support was removed before submitting.
> However, as Dan mentioned, I believe that check_pkt_len can be used to implement essentially the same behavior and it is
> upstream as it is more generic. It should still only be used in the context of an L3 operation to avoid introducing the
> same layering issues though.

Thank you for your input.
I haven't done anything with check_pkt_len yet, but this seems promising.

Currently I simply ignore the DF bit and force fragmentation on the tunnel between the two sites.
After all, the proper solution is to set the MTU correctly on all involved devices.
--> For devices that behave properly I already don't have an issue.
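
To make the trade-off concrete, here is a minimal sketch of the arithmetic and the per-packet decision a tunnel endpoint faces. It assumes L3 GRE over IPv4 with the standard header sizes (20-byte outer IPv4 header without options, 4-byte base GRE header, 4 bytes per optional GRE field). The function names and the three outcomes are purely illustrative and are not OVS code or OVS behavior.

```python
# Sketch only: header sizes are the standard IPv4/GRE values; the function
# names and the three-way decision are hypothetical, not taken from OVS.

IPV4_HEADER = 20      # outer IPv4 header without options
GRE_BASE_HEADER = 4   # GRE header without optional fields
GRE_OPTION = 4        # each of: key, checksum, sequence number


def gre_tunnel_mtu(underlay_mtu, key=False, checksum=False, sequence=False):
    """Largest inner IP packet that fits in one outer packet (L3 GRE over IPv4)."""
    options = sum((key, checksum, sequence)) * GRE_OPTION
    return underlay_mtu - IPV4_HEADER - GRE_BASE_HEADER - options


def handle_packet(inner_len, df_set, tunnel_mtu, ignore_df=False):
    """Return what a tunnel endpoint would do with an inner IP packet."""
    if inner_len <= tunnel_mtu:
        return "forward"
    if df_set and not ignore_df:
        # Classic PMTUD: drop and report the usable MTU to the sender.
        return "icmp-frag-needed"
    # DF clear, or DF deliberately ignored as in the workaround described above.
    return "fragment"
```

With a 1500-byte underlay and plain GRE, gre_tunnel_mtu(1500) leaves 1476 bytes for the inner packet, so a full-size 1500-byte packet with DF set falls into the "icmp-frag-needed" branch unless ignore_df=True is passed, in which case it is fragmented instead.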

This is to work around some "industrial" devices that simply don't have the option to reduce the MTU, don't implement
DHCP option 26 (interface MTU), and where the performance hit from fragmentation is too high.
IMO these devices are already broken. Being forced to work with them means that breaking L2/L3 layering to make
them work a bit better is probably the least of the issues.