[ovs-discuss] skb_warn_bad_offload+0xc8/0xd3() kernel warning on linux v3.14.57+ovs 2.5.0

Jesse Gross jesse at kernel.org
Wed Jul 13 14:48:09 UTC 2016


On Wed, Jul 13, 2016 at 4:05 AM, 张东亚 <fortitude.zhang at gmail.com> wrote:
> Hi,
>
> We recently upgraded to OVS 2.5 and use the datapath shipped in the OVS
> source tree (that is, we do not use the datapath built into the kernel).
> However, when we test VXLAN performance between two VMs on different
> compute nodes, the bandwidth reaches only 5Gbps on a pair of 10Gbps Intel
> NICs. We are still using kernel v3.14.57.
>
> We found many instances of the following warning in dmesg (see the stack
> in the PS) and think it is causing the poor performance.
>
> After reading the kernel source and the OVS 2.5 datapath code, it seems
> that gso_type 513 (0x201) causes the problem. Our VM tap interface does
> not have the tx-udp_tnl-segmentation (NETIF_F_GSO_UDP_TUNNEL_BIT) flag
> enabled by default, so net_gso_ok returns false, which causes
> __skb_gso_segment to be called. Since tcpv4_gso_complete has set
> skb->ip_summed to CHECKSUM_UNNECESSARY, skb_warn_bad_offload is then
> called.
>
> My question is whether this is the desired behavior, or whether OVS should
> clear the NETIF_F_GSO_UDP_TUNNEL_BIT bit when sending the inner packet
> (which may be GSO'ed) to the tap interface on the host?
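
For readers following the analysis above, here is a rough userspace sketch
of how gso_type 513 breaks down and why the check fails. It is not kernel
code. The bit positions are taken from the SKB_GSO_* enum as it appears in
v3.14 and will differ on other kernel versions; decode_gso_type() and
gso_ok() are hypothetical helper names, and gso_ok() only models the idea
behind net_gso_ok() (the real function shifts gso_type into the netdev
feature space before comparing).

```python
# Bit positions follow the SKB_GSO_* enum as found in Linux v3.14.
# Assumption: other kernel versions number these bits differently.
SKB_GSO_FLAGS = {
    "SKB_GSO_TCPV4": 1 << 0,
    "SKB_GSO_UDP": 1 << 1,
    "SKB_GSO_DODGY": 1 << 2,
    "SKB_GSO_TCP_ECN": 1 << 3,
    "SKB_GSO_TCPV6": 1 << 4,
    "SKB_GSO_FCOE": 1 << 5,
    "SKB_GSO_GRE": 1 << 6,
    "SKB_GSO_IPIP": 1 << 7,
    "SKB_GSO_SIT": 1 << 8,
    "SKB_GSO_UDP_TUNNEL": 1 << 9,
    "SKB_GSO_MPLS": 1 << 10,
}


def decode_gso_type(gso_type):
    """Return the names of all GSO flags set in gso_type (hypothetical helper)."""
    return [name for name, bit in SKB_GSO_FLAGS.items() if gso_type & bit]


def gso_ok(gso_type, device_gso_bits):
    """Simplified model of the net_gso_ok() idea: the device can take the
    packet unsegmented only if it supports every offload the packet asks for."""
    return (device_gso_bits & gso_type) == gso_type


if __name__ == "__main__":
    # Hypothetical tap device that handles plain TCP offload but not tunnels.
    tap_bits = SKB_GSO_FLAGS["SKB_GSO_TCPV4"] | SKB_GSO_FLAGS["SKB_GSO_TCPV6"]
    print(hex(513), decode_gso_type(513))
    print("tap accepts gso_type 513:", gso_ok(513, tap_bits))
```

Under these assumptions, 0x201 is the TCPv4 bit plus the UDP tunnel bit, so
a device without the tunnel feature fails the check and the packet falls
back to software segmentation.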

This is an upstream kernel problem, fixed by:

    commit a09a4c8dd1ec7f830e1fb9e59eb72bddc965d168
    Author: Jesse Gross <jesse at kernel.org>

    tunnels: Remove encapsulation offloads on decap.

    If a packet is either locally encapsulated or processed through GRO
    it is marked with the offloads that it requires. However, when it is
    decapsulated these tunnel offload indications are not removed. This
    means that if we receive an encapsulated TCP packet, aggregate it with
    GRO, decapsulate, and retransmit the resulting frame on a NIC that does
    not support encapsulation, we won't be able to take advantage of hardware
    offloads even though it is just a simple TCP packet at this point.

    This fixes the problem by stripping off encapsulation offload indications
    when packets are decapsulated.

    The performance impacts of this bug are significant. In a test where a
    Geneve encapsulated TCP stream is sent to a hypervisor, GRO'ed,
    decapsulated, and bridged to a VM performance is improved by 60%
    (5Gbps->8Gbps) as a result of avoiding unnecessary segmentation at the
    VM tap interface.

    Reported-by: Ramu Ramamurthy <sramamur at linux.vnet.ibm.com>
    Fixes: 68c33163 ("v4 GRE: Add TCP segmentation offload for GRE")
    Signed-off-by: Jesse Gross <jesse at kernel.org>
    Signed-off-by: David S. Miller <davem at davemloft.net>
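
The idea in that commit message can be illustrated with a small userspace
sketch. This is not the kernel patch itself. The bit values assume the
v3.14 SKB_GSO_* layout, TUNNEL_GSO_MASK is a hypothetical mask chosen for
this example rather than the kernel's own definition, and both function
names are made up for illustration.

```python
# Bit values as in the Linux v3.14 SKB_GSO_* enum.
# Assumption: other kernel versions use different bit positions.
SKB_GSO_TCPV4 = 1 << 0
SKB_GSO_GRE = 1 << 6
SKB_GSO_IPIP = 1 << 7
SKB_GSO_SIT = 1 << 8
SKB_GSO_UDP_TUNNEL = 1 << 9

# Hypothetical mask for this sketch: the tunnel-related GSO bits.
TUNNEL_GSO_MASK = SKB_GSO_GRE | SKB_GSO_IPIP | SKB_GSO_SIT | SKB_GSO_UDP_TUNNEL


def strip_encap_offloads(gso_type):
    """Model of the fix: after decapsulation, drop the tunnel offload bits
    so only the inner packet's requirements remain."""
    return gso_type & ~TUNNEL_GSO_MASK


def needs_software_gso(gso_type, device_gso_bits):
    """True if the device lacks any offload the packet is marked with."""
    return (device_gso_bits & gso_type) != gso_type


if __name__ == "__main__":
    tap_bits = SKB_GSO_TCPV4          # hypothetical tap: TCP offload only
    before = 0x201                    # gso_type reported in this thread
    after = strip_encap_offloads(before)
    print("before decap fix:", needs_software_gso(before, tap_bits))
    print("after decap fix: ", needs_software_gso(after, tap_bits))
```

With the tunnel bit left in place, the tap device in this model forces
software segmentation. Once the bit is cleared on decap, the packet is
marked as plain TCP and passes through unsegmented.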

The problem is not really related to OVS, so the right solution is to
upgrade your base kernel. However, since it does affect OVS users, we
have backported code into the OVS tree to avoid it. Unfortunately, the
fix is too large to bring to OVS 2.5, so you'll need to use the current
master branch.


