[ovs-discuss] OVN - MTU path discovery

Han Zhou zhouhan at gmail.com
Fri Jul 27 20:48:46 UTC 2018


On Fri, Jul 27, 2018 at 12:49 AM, Miguel Angel Ajo Pelayo <
majopela at redhat.com> wrote:

>
>
>
> On 24 July 2018 at 17:43:51, Han Zhou (zhouhan at gmail.com) wrote:
>
>
>
> On Tue, Jul 24, 2018 at 8:26 AM, Miguel Angel Ajo Pelayo <
> majopela at redhat.com> wrote:
>
>>
>>
>>
>> On 24 July 2018 at 17:20:59, Han Zhou (zhouhan at gmail.com) wrote:
>>
>>
>>
>> On Thu, Jul 12, 2018 at 7:03 AM, Miguel Angel Ajo Pelayo <
>> majopela at redhat.com> wrote:
>>
>>> I believe we need to emit ICMP "fragmentation needed" messages to have
>>> proper support for different MTUs (on router sides). I wonder how it
>>> works the other way around (when the external net is 1500 and the
>>> internal net is 1500 minus the Geneve overhead).
>>>
>>
>> I think this is expected, since the GW chassis forwards packets without
>> going through the IP stack.
>> One solution might be using a network namespace on the GW node as an
>> intermediate hop, so that the IP stack on the GW will handle the
>> fragmentation (or reply with ICMP when DF is set). Of course this will add
>> some latency and also increase the complexity of the deployment, so I'd
>> rather tune the MTU properly to avoid the problem. But if east-west
>> performance is more important and HV <-> HV jumbo frames are supported,
>> then the namespace trick is probably worth it just to make external
>> traffic work regardless of internal MTU settings. Does this make sense?
>>
>>
>> I believe we should avoid that path at all costs; it’s the way the
>> neutron reference implementation was built, and it’s slower. It also adds
>> a lot of complexity.
>>
>>
>> Sometimes the MTUs will simply be mismatched: the internal network/LS has
>> a bigger MTU to increase performance, while the external network is on
>> the standard 1500. In some cases this could be circumvented by giving the
>> external router a leg with a big MTU just for OVN, but… if we look at how
>> people use OpenStack, for example, that would probably render most
>> deployments incompatible with OVN.
>>
>>
>> For example, customers tend to have several provider networks + external
>> networks, like legacy networks, different providers, etc.
>>
>>
>>
>>
>>> Is there any way to match packet_size > X on a flow?
>>>
>>> How could we implement this?
>>>
>> I didn't find anything for matching packet_size in ovs-fields.7. Even if
>> we could do this in OVN (e.g. through a controller action in the slow
>> path), I wonder whether it would really be better than relying on the IP
>> stack. Maybe blp or someone else could shed some light on this :)
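For context, what such a slow-path action would ultimately have to emit is an ICMP destination-unreachable, "fragmentation needed and DF set" message carrying the next-hop MTU (RFC 1191). The following is a sketch of building that message by hand; the helper names are mine, not an OVN API:

```python
import struct

def icmp_checksum(data: bytes) -> int:
    """Standard Internet checksum (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"
    s = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    s = (s >> 16) + (s & 0xFFFF)
    s += s >> 16
    return ~s & 0xFFFF

def frag_needed(next_hop_mtu: int, original: bytes) -> bytes:
    """ICMP type 3 (destination unreachable), code 4 (fragmentation
    needed and DF set), carrying the next-hop MTU per RFC 1191 plus the
    IP header and first 8 bytes of the offending datagram."""
    header = struct.pack("!BBHHH", 3, 4, 0, 0, next_hop_mtu)
    payload = original[:28]  # 20-byte IP header + 8 data bytes
    csum = icmp_checksum(header + payload)
    return struct.pack("!BBHHH", 3, 4, csum, 0, next_hop_mtu) + payload
```

The router would send this back to the external sender so its TCP stack shrinks the segment size, which is exactly what a Linux IP stack does for free when the packet traverses a namespace.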
>>
>> I think that would be undesirable also.
>>
>>
>> I wonder how it works now, when the external network is generally on a
>> 1500 MTU while Geneve has a lower MTU.
>>
> Do you mean for example: VM has MTU: 1400, while external network and eth0
> (tunnel physical interface) of HVs and GWs are all 1500 MTU? Why would
> there be a problem in this case? Or did I misunderstand?
>
>
> In that case some handling is also necessary at some point. Imagine you
> have established a TCP connection through a floating IP (DNAT): when the
> packets traverse the router from the external network to the internal
> network, if the router is not handling MTU, a 1500-byte packet will be
> transmitted over the 1400 network, and either Geneve fragments and
> defragments it (very bad for performance) or, if the packet went over a
> VLAN, it would be dropped when arriving at the final hypervisor.
>
>
> Am I right, or am I missing something? I need to actually try it and look
> at the traffic/packets.
>
In my example above all physical interfaces have MTU 1500; only the VM's
internal MTU setting is 1400. In this case I don't think there is any IP
fragmentation or dropping happening, because the MSS of the TCP connection
should be adjusted by the handshake to fit the 1400 MTU (or smaller, if the
remote endpoint has an MTU < 1400).
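The MSS arithmetic behind this can be sketched as follows (assuming IPv4 and no TCP options): each endpoint advertises MSS = its MTU minus 40 bytes of IP + TCP headers, and the connection uses the smaller of the two advertised values.

```python
# MSS negotiation sketch: why a 1400-MTU VM talking to a 1500-MTU host
# never needs fragmentation when both sides see the SYN/SYN-ACK MSS.
IP_HDR = 20   # IPv4 header, no options
TCP_HDR = 20  # TCP header, no options

def advertised_mss(mtu: int) -> int:
    """MSS each endpoint advertises in its SYN, derived from its MTU."""
    return mtu - IP_HDR - TCP_HDR

def negotiated_mss(local_mtu: int, remote_mtu: int) -> int:
    """Effective segment size: the smaller of the two advertised values."""
    return min(advertised_mss(local_mtu), advertised_mss(remote_mtu))

# VM with MTU 1400 talking to an external host with MTU 1500:
print(negotiated_mss(1400, 1500))  # 1360 -> every segment fits in 1400
```

This only works because both SYN packets reach the endpoints; it breaks down in the DNAT case above, where a mid-path link is smaller than either endpoint's MTU and no ICMP is generated.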

Or maybe you are talking about some different settings?
