[ovs-dev] [PATCH net] gso: do GSO for local skb with size bigger than MTU

Fri Jan 9 05:42:56 UTC 2015

于 2015年01月09日 03:55, Jesse Gross 写道:
> On Thu, Jan 8, 2015 at 1:39 AM, Fan Du <fengyuleidian0615 at gmail.com> wrote:
>> 于 2015年01月08日 04:52, Jesse Gross 写道:
>>>>
>>>> My understanding is:
>>>>> controller sets the forwarding rules into kernel datapath, any flow not
>>>>> matching
>>>>> with the rules are threw to controller by upcall. Once the rule decision
>>>>> is
>>>>> made
>>>>> by controller, then, this flow packet is pushed down to datapath to be
>>>>> forwarded
>>>>> again according to the new rule.
>>>>>
>>>>> So I'm not sure whether pushing the over-MTU-sized packet or pushing the
>>>>> forged ICMP
>>>>> without encapsulation to controller is required by current ovs
>>>>> implementation. By doing
>>>>> so, such over-MTU-sized packet is treated as a event for the controller
>>>>> to
>>>>> be take
>>>>> care of.
>>>
>>> If flows are implementing routing (again, they are doing things like
>>> decrementing the TTL) then it is necessary for them to also handle
>>> this situation using some potentially new primitives (like a size
>>> check). Otherwise you end up with issues like the ones that I
>>> mentioned above like needing to forge addresses because you don't know
>>> what the correct ones are.
>>
>>
>> Thanks for explaining, Jesse!
>>
>> btw, I don't get it about "to forge addresses", building ICMP message
>> with Guest packet doesn't require to forge address when not encapsulating
>> ICMP message with outer headers.
>
> Your patch has things like this (for the inner IP header):
>
> +                               new_ip->saddr = orig_ip->daddr;
> +                               new_ip->daddr = orig_ip->saddr;
>
> These addresses are owned by the endpoints, not the host generating
> generating the ICMP message, so I would consider that to be forging
> addresses.
>
>> If the flows aren't doing things to
>>>
>>> implement routing, then you really have a flat L2 network and you
>>> shouldn't be doing this type of behavior at all as I described in the
>>> original plan.
>>
>>
>> For flows implementing routing scenario:
>> First of all, over-MTU-sized packet could only be detected once the flow
>> as been consulted(each port could implement a 'check' hook to do this),
>> and just before send to the actual port.
>>
>> Then pushing the over-MTU-sized packet back to controller, it's the
>> controller
>> who will will decide whether to build ICMP message, or whatever routing
>> behaviour
>> it may take. And sent it back with the port information. This ICMP message
>> will
>> travel back to Guest.
>>
>> Why does the flow has to use primitive like a "check size"? "check size"
>> will only take effect after do_output. I'm not very clear with this
>> approach.
>
> Checking the size obviously needs to be an action that would take
> place before outputting in order for it to have any effect. Attaching
> a check to a port does not fit in very well with the other primitives
> of OVS, so I think an action is the obvious place to put it.

If flow is defined as:

	CHECK_SIZE -> OUTPUT

Then traversing actions at CHECK_SIZE needs to find the exactly OUTPUT port,
thus get its underlay encapsulation method as well as valid route for physical
NIC MTU, with those information can calculation whether GSOed packets
exceeds physical MTU. This is feasible anyway at the first look. After this,
it's the controller responsibility to handle such event.

If the CHECK_SIZE returns positive(over-MTU-sized packets show up), then call
output_userspace to push this packet upper wards.

I'm not sure this vague idea is the expected behaviour as required by "L3 processing".

>> And not all scenario involving flow with routing behaviour, just set up a
>> vxlan tunnel, and attach KVM guest or Docker onto it for playing or
>> developing.
>> This wouldn't necessarily require user to set additional specific flows to
>> make
>> over-MTU-sized packet pass through the tunnel correctly. In such scenario, I
>> think the original patch in this thread to fragment tunnel packet is still
>> needed
>> OR workout a generic component to build ICMP for all type tunnel in L2
>> level.
>> Both of those will act as a backup plan as there is no such specific flow as
>> default.
>
> In these cases, we should find a way to adjust the MTU, preferably
> automatically using virtio.
>

-- 
No zuo no die but I have to try.