[ovs-dev] [PATCH 2/3] datapath: compat: unset skb encapsulation bit

pravin shelar pshelar at ovn.org
Tue Jul 26 19:30:28 UTC 2016


On Tue, Jul 26, 2016 at 11:14 AM, Jesse Gross <jesse at kernel.org> wrote:
> On Tue, Jul 26, 2016 at 10:56 AM, pravin shelar <pshelar at ovn.org> wrote:
>> On Tue, Jul 26, 2016 at 10:06 AM, Jesse Gross <jesse at kernel.org> wrote:
>>> On Mon, Jul 25, 2016 at 5:49 PM, Pravin B Shelar <pshelar at ovn.org> wrote:
>>>> OVS compat layer can handle tunnel GSO packets. but it does
>>>> keep skb encapsulation on for packet handled in GSO. This can
>>>> confuse some NIC drivers. I have seen this issue on intel devices:
>>>>
>>>>   i40e 0000:42:00.0: TX driver issue detected, PF reset issued
>>>>
>>>> Following patch resets this bit in case compat layer handles the packet.
>>>>
>>>> VMware-BZ: 1698877
>>>> Signed-off-by: Pravin B Shelar <pshelar at ovn.org>
>>>
>>> In upstream, this is done as part of the GSO code (for example, in
>>> __skb_udp_tunnel_segment()) so that probably makes more sense and is
>>> safer if this is GSO specific. There is already code in
>>> ovs_iptunnel_handle_offloads() that will clear the encapsulation bit
>>> in the case of checksum offload on the outer header.
>>>
>> ovs_iptunnel_handle_offloads() is not equivalent to
>> skb_udp_tunnel_segment(). If handle-offload clear the bit it would not
>> be possible to check the encapsulation bit after handle-offload call
>> in tunnel implementation. At this point this bit is not checked, so it
>> is not issue. But that would be different behavior compared to
>> upstream.
>
> I was actually referring to existing behavior in
> ovs_iptunnel_handle_offlads(). Here is the code that I was talking
> about:
>
> #if LINUX_VERSION_CODE >= KERNEL_VERSION(3,8,0)
>         /* If packet is not gso and we are resolving any partial checksum,
>          * clear encapsulation flag. This allows setting CHECKSUM_PARTIAL
>          * on the outer header without confusing devices that implement
>          * NETIF_F_IP_CSUM with encapsulation.
>          */
>         if (csum_help)
>                 skb->encapsulation = 0;
> #endif
>
> In the case of outer checksums being enabled and offloaded and no GSO,
> we will have cleared skb->encapsulation bit already, so there should
> be no confusion for the NIC driver. As a result, this seems like a
> GSO-only problem and we can just mirror what upstream does and clear
> the bit in the GSO code. I'm a little bit nervous about
> indiscriminately clearing it in all cases, since it seems like it is
> possible that we will accidentally do it in some case where we are
> trying to use more of the network stack.
>

So you are fine with clearing the bit in ip_local_out() but only for
GSO packets.

>>> Something that I noticed while looking at this is it looks like the
>>> recent patch that moved the check for gso_type_mask into a GSO-only
>>> block in ovs_iptunnel_handle_offloads() might cause a bit of a
>>> performance regression. Even though that field is GSO-specific, it is
>>> also used to control whether we resolve partial checksums. Even in
>>> cases where we do need to use the OVS offload compat code, I suspect
>>> we could take better advantage of hardware offloads - not computing
>>> the checksum when !skb->encapsulation (since every kernel can do UDP
>>> checksum offload), using scatter/gather in GSO, checking for backport
>>> support when clearing type in rpl_udp_tunnel_handle_offloads().
>>
>> I am not sure I understand it correctly. the recent change actually
>> using compat gso code only for GSO packet, earlier it was using it for
>> non GSO packets too. By not setting "OVS_GSO_CB(skb)->fix_segment" we
>> are using networking stack.
>
> That's true - it will default to NULL. In that case, I don't know if
> this is entirely safe though. When encapsulation offloads first went
> in, many drivers essentially assumed that this meant VXLAN. It wasn't
> until 3.18 that ndo_features_check was available and drivers started
> implementing it. While this affects TSO more than checksum, issues are
> still possible if we try to pass a different type of encapsulation.

I agree, I am trying to fix these issues.



More information about the dev mailing list