[ovs-dev] Path MTU discovery on GRE interfaces

Dan Williams dcbw at redhat.com
Wed Jun 23 17:25:11 UTC 2021


On Wed, 2021-06-23 at 10:06 -0700, Ben Pfaff wrote:
> [updating Jesse's email address]
> 
> On Wed, Jun 23, 2021 at 04:48:29PM +0200, Matthias May via dev wrote:
> > I'm currently fighting with issues where TCP/UDP frames that are
> > larger than the MTU of a GRE tunnel are dropped.
> > I'm aware of the whys and how to work around the issue, but while
> > looking for solutions I stumbled over the fact that:
> > * [1] added PMTUD support to OVS
> > * [2] disabled and then removed the feature, in v1.9.0 and v1.10.0
> >   respectively
> > 
> > Even after some significant time looking through the history I
> > haven't found a reason why this was removed, just that it
> > was removed.
> > 
> > I started some preliminary work to add PMTUD support to OVS
> > (again), but the fact that it was removed 8 years ago seems
> > to me like a red flag to not do it (again).
> > 
> > Could someone fluent with the OVS history from 8 years ago shed
> > some light on why PMTUD support was dropped?
> > Any pointers to a thread on this topic?
> 
> It was a layering violation.  This caused problems like, for example,
> not having a good IP address to send the "frag needed" message from.

See also Aaron Conole's recent attempt to do some fragmentation
handling when delivering to OVS ports with a smaller MTU. 

Since the tunnels have a smaller MTU for encapsulated traffic by
necessity, things that need to send through the tunnel (like a
container) must have a smaller MTU. But when something outside of the
container's host sends a large UDP packet to the container, OVS fails
to deliver that packet to the container's OVS port because the port's
MTU is too small.
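
To put rough numbers on the smaller tunnel MTU, here is a minimal
sketch of the overhead arithmetic. It assumes GRE over IPv4 with no IP
options and an inner Ethernet frame without a VLAN tag; the header
sizes are the standard protocol values, and the function and constant
names are made up for illustration, not taken from OVS.

```python
# Illustrative arithmetic only: standard header sizes, hypothetical names.
OUTER_IPV4 = 20  # outer IPv4 header without options
GRE_BASE = 4     # GRE header with no optional fields
GRE_KEY = 4      # optional GRE key field
INNER_ETH = 14   # inner Ethernet header (L2 GRE, no VLAN tag)


def inner_mtu(underlay_mtu, with_key=False, l2=True):
    """Return the largest inner IP packet that fits in one tunnel packet."""
    overhead = OUTER_IPV4 + GRE_BASE
    if with_key:
        overhead += GRE_KEY
    if l2:
        overhead += INNER_ETH
    return underlay_mtu - overhead


if __name__ == "__main__":
    print(inner_mtu(1500))                 # 1462
    print(inner_mtu(1500, with_key=True))  # 1458
```

So on a 1500-byte underlay, anything sending through such a tunnel
would need an MTU of about 1462 (or 1458 with a key) to avoid the
problem, under the assumptions above.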

We finally landed on using check_pkt_len to detect this condition and
punt the ICMP reply to ovn-controller, but check_pkt_len isn't easily
hardware offloadable :( And it would be great to just fragment this
traffic to the right MTU in the first place, rather than have to send
an ICMP reply or punt the fragmentation up to a controller.
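
As a rough illustration of the two options (ICMP reply versus
fragmenting to the port's MTU), here is a minimal sketch of the
decision for a single IPv4 packet. It is not OVS or OVN code: the
function name, the return values, and the 20-byte header default are
assumptions for the example. It only relies on the standard IPv4 rule
that every fragment except the last carries a payload that is a
multiple of 8 bytes.

```python
def plan_delivery(total_len, port_mtu, dont_fragment, header_len=20):
    """Decide how an IPv4 packet of total_len bytes could reach a port.

    Returns (action, fragments); each fragment is a tuple of
    (offset in 8-byte units, payload bytes, more-fragments flag).
    """
    payload = total_len - header_len
    if total_len <= port_mtu:
        return ("forward", [(0, payload, False)])
    if dont_fragment:
        # DF set: the only option is an ICMP "fragmentation needed" reply.
        return ("icmp_frag_needed", [])

    # Non-final fragments must carry a multiple of 8 payload bytes.
    max_payload = (port_mtu - header_len) // 8 * 8
    if max_payload <= 0:
        raise ValueError("port MTU too small to carry any payload")

    fragments = []
    offset = 0
    while offset < payload:
        size = min(max_payload, payload - offset)
        more = offset + size < payload
        fragments.append((offset // 8, size, more))
        offset += size
    return ("fragment", fragments)


if __name__ == "__main__":
    print(plan_delivery(1500, 1400, dont_fragment=False))
    print(plan_delivery(1500, 1400, dont_fragment=True))
```

With a 1500-byte packet and a 1400-byte port MTU, the first call
yields two fragments (1376 and 104 payload bytes), and the second call
falls back to the ICMP path because DF is set.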

Dan


