[ovs-discuss] MTU considerations for OVN

Jesse Gross jesse at kernel.org
Fri May 6 19:08:27 UTC 2016


On Fri, May 6, 2016 at 11:53 AM, Ryan Moats <rmoats at us.ibm.com> wrote:
> Jesse Gross <jesse at kernel.org> wrote on 05/06/2016 11:11:10 AM:
>
>> From: Jesse Gross <jesse at kernel.org>
>> To: Ryan Moats/Omaha/IBM at IBMUS
>> Cc: Matt Kassawara <mkassawara at gmail.com>, discuss
>> <discuss at openvswitch.org>, Thomas Graf <tgraf at suug.ch>
>> Date: 05/06/2016 11:11 AM
>> Subject: Re: [ovs-discuss] MTU considerations for OVN
>>
>> On Fri, May 6, 2016 at 8:40 AM, Ryan Moats <rmoats at us.ibm.com> wrote:
>> > "discuss" <discuss-bounces at openvswitch.org> wrote on 05/04/2016 06:09:04 PM:
>> >
>> >> From: Jesse Gross <jesse at kernel.org>
>> >> To: Matt Kassawara <mkassawara at gmail.com>
>> >> Cc: discuss <discuss at openvswitch.org>
>> >> Date: 05/04/2016 06:09 PM
>> >> Subject: Re: [ovs-discuss] MTU considerations for OVN
>> >> Sent by: "discuss" <discuss-bounces at openvswitch.org>
>> >>
>> >> On Tue, May 3, 2016 at 3:50 PM, Matt Kassawara <mkassawara at gmail.com>
>> >> wrote:
>> >> > Jesse,
>> >> >
>> >> > I'm resurrecting this thread after a fairly lengthy discussion of
>> >> > MTU with Ben at the recent OpenStack summit. Have you given the
>> >> > topic any further thought toward implementation in a reasonable
>> >> > way? Can you elaborate on the architectural limitations? At the
>> >> > moment, the OpenStack implementation of OVN doesn't use DPDK.
>> >>
>> >> The issue that I alluded to before is that when OVS (and by extension
>> >> OVN) does L3 processing the packets aren't traversing the Linux IP
>> >> stack and so the usual MTU checks don't apply. Instead OVS just does a
>> >> single combined lookup for all flow processing and then applies some
>> >> actions like set SMAC/DMAC and decrement TTL. Not only is there no
>> >> code to check the outgoing MTU but there's no obvious outgoing device
>> >> to fetch the desired MTU from.
>> >
>> > I'm not 100% sure why this would be an issue - IIRC (based on my
>> > scanning the code) when a packet is going to be output, it looks
>> > like the MTU of the physical device is checked and a fragmentation
>> > decision made.  Isn't that good enough for our purposes?
>>
>> Which check in particular do you have in mind?
>>
>> There are two possibilities that I can think of:
>>  * ovs_vport_send() has one but the device it looks at for the MTU is
>> a tunnel device, which has an essentially infinite MTU. The real MTU
>> that we would need to check also depends on the destination IP address
>> of the tunnel but we haven't done a route lookup at this point.
>>  * ip_finish_output() in the IP stack. This one does have the
>> information that we need but it is outside of the tunnel. Any ICMP
>> packets that are generated will be processed through the hypervisor's
>> IP stack and won't make it back to the VM. In addition, this check
>> doesn't handle GSO packets.
>
> I see, I was misreading code... my mistake.
>
> I certainly dislike the idea of separating the MTU calculation from the
> datapath. What I was hoping to find was that it would be possible to do
> the fragmentation check on the tunnel after the route has been looked up
> and the outgoing device is known, but looking through this, I'm not
> seeing a good way to do this cleanly (yet) ...

I agree.

There was a thread a while back on the netdev mailing list related to
this, but it reached no real conclusion:
https://www.spinics.net/lists/netdev/msg257830.html
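
To make the first bullet quoted above concrete, here is a minimal Python
sketch of the arithmetic involved. It is not OVS or kernel code. ROUTES,
underlay_mtu(), inner_mtu() and fits() are hypothetical names, the
routing table is made up, and the overhead assumes Geneve over IPv4 with
the option length passed in as a parameter. It only illustrates that the
usable MTU depends on a route lookup for the tunnel destination, while a
check against the tunnel device itself never fails:

```python
import ipaddress

# Hypothetical underlay routing table: prefix -> MTU of the egress device.
ROUTES = {
    "10.0.0.0/24": 9000,  # e.g. a jumbo-frame segment
    "0.0.0.0/0": 1500,    # default route
}

# Stand-in for the "essentially infinite" MTU of the tunnel device.
TUNNEL_DEV_MTU = 65535

# Encapsulation overhead in bytes (Geneve over IPv4, assumed for this sketch).
OUTER_IPV4 = 20
OUTER_UDP = 8
GENEVE_BASE = 8   # fixed Geneve header, excluding options
INNER_ETH = 14    # inner Ethernet header carried inside the tunnel


def underlay_mtu(tunnel_dst):
    """Longest-prefix match of the tunnel destination against ROUTES."""
    addr = ipaddress.ip_address(tunnel_dst)
    matches = []
    for prefix, mtu in ROUTES.items():
        net = ipaddress.ip_network(prefix)
        if addr in net:
            matches.append((net.prefixlen, mtu))
    # The most specific matching route wins.
    return max(matches)[1]


def inner_mtu(tunnel_dst, geneve_opts_len=0):
    """Largest inner IP packet that fits after encapsulation."""
    overhead = (OUTER_IPV4 + OUTER_UDP + GENEVE_BASE
                + geneve_opts_len + INNER_ETH)
    return underlay_mtu(tunnel_dst) - overhead


def fits(inner_ip_len, tunnel_dst, geneve_opts_len=0):
    """True if the inner packet can be sent without fragmentation."""
    return inner_ip_len <= inner_mtu(tunnel_dst, geneve_opts_len)


if __name__ == "__main__":
    pkt_len = 1500
    # A check against the tunnel device alone always passes.
    print(pkt_len <= TUNNEL_DEV_MTU)
    # The same packet against the routed underlay paths.
    print(fits(pkt_len, "192.0.2.10"), fits(pkt_len, "10.0.0.5"))
```

The same 1500-byte inner packet is fine toward one tunnel endpoint and
too large toward another, which is why the check cannot be made before
the route for the tunnel destination is known.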


