[ovs-dev] [PATCH net 0/2] vxlan: Set a large MTU on ovs-created vxlan devices

David Wragg david at weave.works
Wed Jan 6 23:25:56 UTC 2016


David Miller <davem at davemloft.net> writes:
>> Prior to 4.3, openvswitch vxlan vports could transmit vxlan packets of
>> any size, constrained only by the ability to transmit the resulting
>> UDP packets.  4.3 introduced vxlan netdevs corresponding to vxlan
>> vports.  These netdevs have an MTU, which limits the size of a packet
>> that can be successfully vxlan-encapsulated.  The default value for
>> this MTU is 1500, which is awkwardly small, and leads to a conspicuous
>> change in behaviour for userspace.
>> 
>> These two patches set the MTU on openvswitch-created vxlan devices to
>> be 65465 (the maximum IP packet size minus the vxlan-on-IPv6
>> overhead), effectively restoring the behaviour prior to 4.3.  In order
>> to accomplish this, the first patch removes the MTU constraint of 1500
>> for vxlan netdevs without an underlying device.
>
> Is this really the right thing to do?

I'm certainly open to suggestions of better ways to solve the problem.

To be clear, the problem from our perspective is that a use of the
kernel openvswitch that worked fine in 4.2 and earlier is hobbled in
4.3.  Previously the MTU of an openvswitch-based vxlan overlay network
was constrained only by the MTU of the physical network.  In 4.3, we
can't take advantage of physical networks that support jumbo frames,
causing a huge hit to throughput across the overlay network.

The specific limit of 1500 seems very arbitrary.  For a vxlan overlay
network on top of a traditional ethernet network, the "correct" MTU for
the vxlan netdevs is 1450 rather than 1500.  And in general with
openvswitch, the destination for vxlan packets is determined on a
packet-by-packet basis, possibly involving different path MTUs of the
underlying network.  There is no single "correct" value.
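
As a rough illustration of where the 1450 and 65465 figures come from,
here is a small Python sketch of the overhead arithmetic.  It assumes
the standard header sizes (no IP options, no IPv6 extension headers, no
VLAN tag on the inner frame).  The constant and function names are made
up for this example and are not kernel identifiers.

```python
# Header sizes in bytes, assuming no IP options or IPv6 extension headers.
INNER_ETH_HLEN = 14    # encapsulated Ethernet header
VXLAN_HLEN = 8         # vxlan header
UDP_HLEN = 8           # outer UDP header
IPV4_HLEN = 20         # outer IPv4 header
IPV6_HLEN = 40         # outer IPv6 header
IP_MAX_PACKET = 65535  # maximum IP packet size (hypothetical name)


def vxlan_overhead(outer_ip_hlen):
    """Bytes added in front of the inner payload by vxlan encapsulation."""
    return outer_ip_hlen + UDP_HLEN + VXLAN_HLEN + INNER_ETH_HLEN


def overlay_mtu(underlay_mtu, outer_ip_hlen):
    """Largest inner packet that fits in one underlay packet unfragmented."""
    return underlay_mtu - vxlan_overhead(outer_ip_hlen)


if __name__ == "__main__":
    # Traditional ethernet underlay, vxlan over IPv4: 1500 - 50
    print(overlay_mtu(1500, IPV4_HLEN))
    # Upper bound used by the patches: 65535 - 70 (vxlan-on-IPv6 overhead)
    print(IP_MAX_PACKET - vxlan_overhead(IPV6_HLEN))
```

With a 9000-byte jumbo-frame underlay and IPv4 outer headers, the same
arithmetic gives 8950 for the overlay, which is the kind of value the
1500 limit rules out.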

> Won't we get a lot of fragmentation
> by using such a large MTU, especially since you're making it the default
> for OVS setups?

In the openvswitch vxlan vport transmit path, I can't find anywhere
that dev->mtu is used.  It would be surprising if it were, since the
relevant parts of vxlan.c have not changed much since 4.2, when no
netdev was involved in that path.  So I don't see how the MTU set on
the netdev would cause extra fragmentation there.

Considering non-openvswitch scenarios, when using vxlan netdevs
directly, a vxlan netdev locked to an underlying device supporting jumbo
frames can use a larger MTU.  It's only vxlan netdevs without an
underlying device that have the limit of 1500 imposed.  But why
shouldn't there be the same flexibility to select an MTU for best
performance in both cases?  Aren't the fragmentation concerns the same?

> Things like path MTU discovery hinge strongly upon accurate MTU settings.
> Otherwise they won't function properly.

True.  But in what sense is 1500 accurate?  Users of the kernel
openvswitch code have always had to get this right, making sure that the
MTU set on a vxlan overlay network fits within the underlying network
paths involved.

David


