[ovs-git] [openvswitch/ovs] 893b99: datapath: do not propagate headroom updates to int...

GitHub noreply at github.com
Fri Feb 16 08:49:38 UTC 2018


  Branch: refs/heads/branch-2.9
  Home:   https://github.com/openvswitch/ovs
  Commit: 893b9906cad89662c901f2cc4f1b051410b05eec
      https://github.com/openvswitch/ovs/commit/893b9906cad89662c901f2cc4f1b051410b05eec
  Author: paolo abeni <pabeni at redhat.com>
  Date:   2018-02-16 (Fri, 16 Feb 2018)

  Changed paths:
    M datapath/vport-internal_dev.c

  Log Message:
  -----------
  datapath: do not propagate headroom updates to internal port

Upstream commit:
    commit 183dea5818315c0a172d21ecbcd2554894bf01e3
    Author: Paolo Abeni <pabeni at redhat.com>
    Date:   Thu Nov 30 15:35:33 2017 +0100

    openvswitch: do not propagate headroom updates to internal port

    After commit 3a927bc7cf9d ("ovs: propagate per dp max headroom to
    all vports") the needed_headroom for the internal vport is updated
    according to the max needed headroom in its datapath.

    That avoids the pskb_expand_head() costs when sending/forwarding
    packets towards tunnel devices, at least for some scenarios.

    We still require such copy when using the ovs-preferred configuration
    for vxlan tunnels:

        br_int
       /      \
    tap      vxlan
               (remote_ip:X)

    br_phy
         \
        NIC

    where the route towards the IP 'X' is via 'br_phy'.

    When forwarding traffic from the tap towards the vxlan device, we
    will call pskb_expand_head() in vxlan_build_skb() because
    br_phy->needed_headroom is equal to tun->needed_headroom.

    With this change we avoid updating the internal vport needed_headroom,
    so that in the above scenario no head copy is needed, giving a 5%
    performance improvement in a UDP throughput test.

    As a trade-off, packets sent from the internal port towards a tunnel
    device will now experience the head copy overhead. The rationale is
    that the latter use-case is less relevant performance-wise.
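
The headroom reasoning above can be sketched in user-space C. This is a simplified model, not the kernel code: `struct dev_model`, `tunnel_min_headroom()` and `needs_head_copy()` are hypothetical names, the encapsulation length is an illustrative parameter, and the rounding the kernel applies to the link-layer reservation is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two net_device fields that matter here. */
struct dev_model {
    size_t hard_header_len;  /* link-layer header length, e.g. 14 for Ethernet */
    size_t needed_headroom;  /* extra headroom the device asks senders to reserve */
};

/* Headroom the tunnel wants in front of the inner packet: room for the
 * underlay device (its link-layer header plus its needed_headroom) and for
 * the encapsulation headers. Kernel-side rounding is not modelled. */
static size_t tunnel_min_headroom(const struct dev_model *underlay,
                                  size_t encap_len)
{
    assert(underlay != NULL);
    return underlay->hard_header_len + underlay->needed_headroom + encap_len;
}

/* A head copy (pskb_expand_head() in the kernel) is needed when the packet
 * arrives with less headroom than the tunnel requires. */
static bool needs_head_copy(size_t skb_headroom,
                            const struct dev_model *underlay,
                            size_t encap_len)
{
    return skb_headroom < tunnel_min_headroom(underlay, encap_len);
}
```

In this model, raising the underlay bridge's `needed_headroom` to the tunnel's value inflates the minimum headroom the tunnel computes, so a packet from the tap that would otherwise fit now needs the copy. Leaving the internal port's value alone keeps the requirement small.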

    Signed-off-by: paolo abeni <pabeni at redhat.com>
    Acked-by: Pravin B Shelar <pshelar at ovn.org>
    Signed-off-by: David S. Miller <davem at davemloft.net>

Cc: paolo abeni <pabeni at redhat.com>
Signed-off-by: Greg Rose <gvrose8192 at gmail.com>
Acked-by: Pravin B Shelar <pshelar at ovn.org>


  Commit: fcb77796c4d133c3ac37dc00b8fe8398c1c75f20
      https://github.com/openvswitch/ovs/commit/fcb77796c4d133c3ac37dc00b8fe8398c1c75f20
  Author: Eric Garver <e at erig.me>
  Date:   2018-02-16 (Fri, 16 Feb 2018)

  Changed paths:
    M datapath/flow.c

  Log Message:
  -----------
  datapath: Fix pop_vlan action for double tagged frames

Upstream commit:
    commit c48e74736fccf25fb32bb015426359e1c2016e3b
    Author: Eric Garver <e at erig.me>
    Date:   Wed Dec 20 15:09:22 2017 -0500

    openvswitch: Fix pop_vlan action for double tagged frames

    skb_vlan_pop() expects skb->protocol to be a valid TPID for double
    tagged frames. So set skb->protocol to the TPID and let skb_vlan_pop()
    shift the true ethertype into position for us.
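
A minimal sketch of the selection described above, written as stand-alone C rather than kernel code. `struct parsed_key` and `protocol_for_vlan_pop()` are hypothetical names, and values are in host byte order for readability (the kernel keeps them in network byte order).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum {
    TYPE_IPV4   = 0x0800,  /* IPv4 ethertype */
    TPID_8021Q  = 0x8100,  /* customer VLAN TPID */
    TPID_8021AD = 0x88A8   /* service VLAN TPID */
};

/* Hypothetical summary of what flow extraction learned about the frame. */
struct parsed_key {
    bool     inner_vlan_present;  /* frame carried a second (inner) tag */
    uint16_t inner_tpid;          /* TPID of that inner tag, if present */
    uint16_t eth_type;            /* ethertype found after all VLAN tags */
};

/* Value to leave in the packet's protocol field before a later VLAN pop:
 * the inner TPID for double tagged frames, the true ethertype otherwise. */
static uint16_t protocol_for_vlan_pop(const struct parsed_key *key)
{
    assert(key != NULL);
    return key->inner_vlan_present ? key->inner_tpid : key->eth_type;
}
```

For a frame tagged 802.1ad over 802.1Q carrying IPv4, the function returns the 802.1Q TPID, so the pop sees that another tag follows. For a single tagged frame it returns the IPv4 ethertype directly.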

    Fixes: 5108bbaddc37 ("openvswitch: add processing of L3 packets")
    Signed-off-by: Eric Garver <e at erig.me>
    Reviewed-by: Jiri Benc <jbenc at redhat.com>
    Signed-off-by: David S. Miller <davem at davemloft.net>

Cc: Eric Garver <e at erig.me>
Fixes: a27c454ee0 ("datapath: add processing of L3 packets")
Signed-off-by: Greg Rose <gvrose8192 at gmail.com>
Acked-by: Pravin B Shelar <pshelar at ovn.org>


  Commit: a948fa4b1ea3ec7ec86cc37ca518dee416beb538
      https://github.com/openvswitch/ovs/commit/a948fa4b1ea3ec7ec86cc37ca518dee416beb538
  Author: Ed Swierk <eswierk at skyportsystems.com>
  Date:   2018-02-16 (Fri, 16 Feb 2018)

  Changed paths:
    M datapath/conntrack.c

  Log Message:
  -----------
  datapath: Remove padding from packet before L3+ conntrack processing

Upstream commit:
    commit 9382fe71c0058465e942a633869629929102843d
    Author: Ed Swierk <eswierk at skyportsystems.com>
    Date:   Wed Jan 31 18:48:02 2018 -0800

    openvswitch: Remove padding from packet before L3+ conntrack processing

    IPv4 and IPv6 packets may arrive with lower-layer padding that is not
    included in the L3 length. For example, a short IPv4 packet may have
    up to 6 bytes of padding following the IP payload when received on an
    Ethernet device with a minimum packet length of 64 bytes.

    Higher-layer processing functions in netfilter (e.g. nf_ip_checksum(),
    and help() in nf_conntrack_ftp) assume skb->len reflects the length of
    the L3 header and payload, rather than referring back to
    ip_hdr->tot_len or ipv6_hdr->payload_len, and get confused by
    lower-layer padding.

    In the normal IPv4 receive path, ip_rcv() trims the packet to
    ip_hdr->tot_len before invoking netfilter hooks. In the IPv6 receive
    path, ip6_rcv() does the same using ipv6_hdr->payload_len. Similarly
    in the br_netfilter receive path, br_validate_ipv4() and
    br_validate_ipv6() trim the packet to the L3 length before invoking
    netfilter hooks.

    Currently in the OVS conntrack receive path, ovs_ct_execute() pulls
    the skb to the L3 header but does not trim it to the L3 length before
    calling nf_conntrack_in(NF_INET_PRE_ROUTING). When
    nf_conntrack_proto_tcp encounters a packet with lower-layer padding,
    nf_ip_checksum() fails, causing a "nf_ct_tcp: bad TCP checksum" log
    message. While extra zero bytes don't affect the checksum, the length
    in the IP pseudoheader does. That length is based on skb->len, and
    without trimming, it doesn't match the length the sender used when
    computing the checksum.
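
The effect of the pseudoheader length can be illustrated with a small user-space checksum routine. This is a sketch of the standard Internet checksum over an IPv4 pseudoheader plus TCP segment, not the netfilter code; `sum_bytes()` and `tcp_checksum_v4()` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add a byte range to a running sum, reading big-endian 16-bit words.
 * An odd trailing byte is padded with a zero low byte. */
static uint32_t sum_bytes(uint32_t sum, const uint8_t *p, size_t len)
{
    size_t i;

    assert(p != NULL);
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)p[i] << 8) | p[i + 1];
    if (len & 1)
        sum += (uint32_t)p[len - 1] << 8;
    return sum;
}

/* One's-complement checksum over the IPv4 pseudoheader (source, destination,
 * protocol 6, L4 length) and l4_len bytes of TCP segment. Returns 0 when the
 * checksum already stored in the segment matches l4_len. */
static uint16_t tcp_checksum_v4(const uint8_t src[4], const uint8_t dst[4],
                                const uint8_t *seg, size_t l4_len)
{
    uint32_t sum = 0;

    sum = sum_bytes(sum, src, 4);
    sum = sum_bytes(sum, dst, 4);
    sum += 6;                  /* protocol number for TCP */
    sum += (uint32_t)l4_len;   /* length field of the pseudoheader */
    sum = sum_bytes(sum, seg, l4_len);
    while (sum >> 16)          /* fold carries back into the low 16 bits */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}
```

Verifying a 20-byte segment with `l4_len` set to 20 yields 0. Verifying the same bytes followed by 6 zero padding bytes with `l4_len` set to 26 yields a non-zero result, because only the length term changed.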

    In ovs_ct_execute(), trim the skb to the L3 length before higher-layer
    processing.
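
As a rough illustration of the trimming step, the following stand-alone C function derives the L3 length from the header fields named above. It is a sketch under stated assumptions, not the code added to datapath/conntrack.c: `l3_trimmed_len()` is a hypothetical name, it works on a plain byte buffer instead of an skb, and it ignores IPv6 jumbograms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Given a buffer that starts at the L3 header, return the length it should
 * be trimmed to so that lower-layer padding is dropped. Returns 0 if the
 * packet is not IPv4/IPv6 or its header length field is inconsistent. */
static size_t l3_trimmed_len(const uint8_t *l3, size_t buf_len)
{
    size_t l3_len;

    assert(l3 != NULL);
    if (buf_len < 1)
        return 0;

    switch (l3[0] >> 4) {      /* IP version nibble */
    case 4:
        if (buf_len < 20)
            return 0;
        /* tot_len covers the IPv4 header and payload */
        l3_len = ((size_t)l3[2] << 8) | l3[3];
        if (l3_len < 20)
            return 0;
        break;
    case 6:
        if (buf_len < 40)
            return 0;
        /* payload_len excludes the fixed 40-byte IPv6 header */
        l3_len = 40 + (((size_t)l3[4] << 8) | l3[5]);
        break;
    default:
        return 0;
    }

    if (l3_len > buf_len)      /* header claims more bytes than were received */
        return 0;
    return l3_len;
}
```

For the short IPv4 packet mentioned earlier (40 bytes of IP header and payload inside a 46-byte Ethernet payload), the function returns 40, dropping the 6 padding bytes.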

    Signed-off-by: Ed Swierk <eswierk at skyportsystems.com>
    Acked-by: Pravin B Shelar <pshelar at ovn.org>
    Signed-off-by: David S. Miller <davem at davemloft.net>

Cc: Ed Swierk <eswierk at skyportsystems.com>
Signed-off-by: Greg Rose <gvrose8192 at gmail.com>
Acked-by: Pravin B Shelar <pshelar at ovn.org>


Compare: https://github.com/openvswitch/ovs/compare/3774206996c8...a948fa4b1ea3

