[ovs-dev] [PATCH v2 08/12] userspace: Handling of versatile tunnel ports

Ben Pfaff blp at ovn.org
Tue Jun 20 02:19:41 UTC 2017


Thanks for reporting that.  I changed this paragraph to:

  <p>
    Open vSwitch 2.8 and later implement the ``packet type-aware pipeline''
    concept introduced in OpenFlow 1.5.  Such a pipeline does not have any root
    fields.  Instead, a new metadata field, <ref field="packet_type"/>,
    indicates the basic type of the packet, which can be Ethernet, IPv4, IPv6,
    or another type.  For backward compatibility, by default Open vSwitch 2.8
    imitates the behavior of Open vSwitch 2.7 and earlier.  Later versions of
    Open vSwitch may change the default, and in the meantime controllers can
    turn off this legacy behavior, on a port-by-port basis, by setting
    <code>other_config:packet_type</code> to <code>ptap</code> in the
    <code>Interface</code> table.  This is significant only for ports that can
    handle non-Ethernet packets, which are currently just LISP, VXLAN-GPE, and
    GRE tunnel ports.  See <code>ovs-vswitchd.conf.db</code>(5) for more
    information.
  </p>
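
For readers unfamiliar with the field, here is a minimal sketch of how a
packet_type value breaks down into a name space and a name space type.  It is
not OVS code: the sketch_* names are made up for illustration, and it keeps the
value in host byte order, whereas the patch keeps it in network byte order and
uses pt_ns() and pt_ns_type().  The (1,0x800) value for IPv4 is taken from the
meta-flow.xml text in the patch; 0x86dd is the standard IPv6 Ethertype.

```c
#include <stdint.h>

/* Hypothetical names for this sketch only.  Name space 0 holds the
 * ONF-defined types (type 0 is Ethernet); name space 1 holds Ethertypes. */
enum { SKETCH_NS_ONF = 0, SKETCH_NS_ETHERTYPE = 1 };

/* Packs a 16-bit name space and a 16-bit name space type into one 32-bit
 * packet_type value (host byte order in this sketch). */
static uint32_t
sketch_packet_type(uint16_t ns, uint16_t ns_type)
{
    return ((uint32_t) ns << 16) | ns_type;
}

/* Extracts the name space (upper 16 bits). */
static uint16_t
sketch_pt_ns(uint32_t packet_type)
{
    return (uint16_t) (packet_type >> 16);
}

/* Extracts the name space type (lower 16 bits). */
static uint16_t
sketch_pt_ns_type(uint32_t packet_type)
{
    return (uint16_t) (packet_type & 0xffff);
}

/* Maps the three packet types named in the paragraph above to strings;
 * everything else is reported as "other". */
static const char *
sketch_pt_name(uint32_t packet_type)
{
    if (packet_type == sketch_packet_type(SKETCH_NS_ONF, 0)) {
        return "ethernet";
    } else if (packet_type == sketch_packet_type(SKETCH_NS_ETHERTYPE, 0x0800)) {
        return "ipv4";
    } else if (packet_type == sketch_packet_type(SKETCH_NS_ETHERTYPE, 0x86dd)) {
        return "ipv6";
    }
    return "other";
}
```

In this encoding an Ethernet packet has packet_type 0, which is why the patch
can compare against htonl(PT_ETH) without caring about byte order.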


On Mon, Jun 19, 2017 at 03:33:33PM +0000, Zoltán Balogh wrote:
> Hi Ben,
> 
> In the lib/meta-flow.xml, you introduced the 'packet type-aware pipeline' concept. 
> You mentioned, controllers can turn off legacy behavior by setting 'other-config:packet-type' bridge property to 'ptap'.
> As far as I know, you discussed on Friday, there will be only one property for the tunnel ports and no bridge property.
> 
> Best regards,
> Zoltan
> 
> 
> > -----Original Message-----
> > From: ovs-dev-bounces at openvswitch.org [mailto:ovs-dev-bounces at openvswitch.org] On Behalf Of Ben Pfaff
> > Sent: Monday, June 19, 2017 1:30 AM
> > To: dev at openvswitch.org
> > Cc: Ben Pfaff <blp at ovn.org>
> > Subject: [ovs-dev] [PATCH v2 08/12] userspace: Handling of versatile tunnel ports
> > 
> > In netdev_gre_build_header() and netdev_vxlan_build_header(), the GRE protocol
> > and the VXLAN-GPE next_protocol are set based on the packet_type of the flow.
> > For an Ethernet packet, the GRE protocol is set to ETH_TYPE_TEB. Otherwise, if
> > the name space is OFPHTN_ETHERTYPE, it is set according to the name space type.
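
The GRE half of that rule can be sketched in isolation as follows.  This is an
illustration under stated assumptions, not the code from the patch: the
sketch_* and SKETCH_* names are hypothetical, values are kept in host byte
order (the patch works in network byte order), and the "cannot encapsulate"
case is reported with a bool instead of unlocking the mutex and returning 1.
0x6558 is the Transparent Ethernet Bridging Ethertype that ETH_TYPE_TEB stands
for.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constants for this sketch only. */
#define SKETCH_NS_ONF        0       /* ONF name space; type 0 is Ethernet. */
#define SKETCH_NS_ETHERTYPE  1       /* Name space whose types are Ethertypes. */
#define SKETCH_ETH_TYPE_TEB  0x6558  /* Transparent Ethernet Bridging. */

/* Chooses the GRE protocol field for a packet with the given packet_type
 * (host byte order, name space in the upper 16 bits).
 *
 * Returns true and stores the protocol in '*proto' if the packet can be
 * carried, false if its packet_type has no GRE representation. */
static bool
sketch_gre_protocol(uint32_t packet_type, uint16_t *proto)
{
    uint16_t ns = (uint16_t) (packet_type >> 16);
    uint16_t ns_type = (uint16_t) (packet_type & 0xffff);

    if (ns == SKETCH_NS_ONF && ns_type == 0) {
        /* Ethernet frame: carried as transparent Ethernet bridging. */
        *proto = SKETCH_ETH_TYPE_TEB;
        return true;
    } else if (ns == SKETCH_NS_ETHERTYPE) {
        /* Layer-3 packet: the Ethertype becomes the GRE protocol. */
        *proto = ns_type;
        return true;
    }
    return false;
}
```

The VXLAN-GPE case has the same shape, except that only a few Ethertypes map
to a next_protocol value and everything else is dropped.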
> > 
> > Signed-off-by: Jan Scheurich <jan.scheurich at ericsson.com>
> > Signed-off-by: Ben Pfaff <blp at ovn.org>
> > ---
> >  NEWS                          |   6 +--
> >  lib/meta-flow.xml             |  28 ++++++-----
> >  lib/netdev-bsd.c              |   1 +
> >  lib/netdev-dpdk.c             |   1 +
> >  lib/netdev-dummy.c            |   1 +
> >  lib/netdev-linux.c            |   1 +
> >  lib/netdev-native-tnl.c       |  23 ++++++---
> >  lib/netdev-provider.h         |   6 +++
> >  lib/netdev-vport.c            | 106 ++++++++++++++++++++++++++++++------------
> >  lib/netdev-vport.h            |   1 -
> >  lib/netdev.c                  |   8 ++++
> >  lib/netdev.h                  |  29 +++++++++++-
> >  ofproto/ofproto-dpif-xlate.c  |  35 ++++++++------
> >  ofproto/ofproto-dpif.c        |   4 +-
> >  ofproto/tunnel.c              |  27 ++++++++---
> >  tests/tunnel-push-pop-ipv6.at |   4 +-
> >  tests/tunnel-push-pop.at      |   4 +-
> >  vswitchd/vswitch.xml          |  94 ++++++++++++++++++++++++++++++-------
> >  18 files changed, 277 insertions(+), 102 deletions(-)
> > 
> > diff --git a/NEWS b/NEWS
> > index a2f5a6dc8e54..8b0ad6191325 100644
> > --- a/NEWS
> > +++ b/NEWS
> > @@ -59,11 +59,9 @@ Post-v2.7.0
> >       * OVN services are no longer restarted automatically after upgrade.
> >     - Add --cleanup option to command 'ovs-appctl exit' (see ovs-vswitchd(8)).
> >     - L3 tunneling:
> > -     * Add "layer3" options for tunnel ports that support non-Ethernet (L3)
> > -       payload (GRE, VXLAN-GPE).
> > +     * Use new tunnel port option "packet_type" to configure L2 vs. L3.
> >       * New vxlan tunnel extension "gpe" to support VXLAN-GPE tunnels.
> > -     * Transparently pop and push Ethernet headers at transmit/reception
> > -       of packets to/from L3 tunnels.
> > +     * New support for non-Ethernet (L3) payloads in GRE and VXLAN-GPE.
> >     - The BFD detection multiplier is now user-configurable.
> >     - New support for HW offloading
> >       * HW offloading is disabled by default.
> > diff --git a/lib/meta-flow.xml b/lib/meta-flow.xml
> > index 856e1ba8cf7b..dc2731e2a260 100644
> > --- a/lib/meta-flow.xml
> > +++ b/lib/meta-flow.xml
> > @@ -26,19 +26,25 @@
> >      networking technology in use are called called <dfn>root fields</dfn>.
> >      Open vSwitch 2.7 and earlier considered Ethernet fields to be root fields,
> >      and this remains the default mode of operation for Open vSwitch bridges.
> > -    In this mode, when a packet is received from a non-Ethernet interfaces,
> > -    such as a layer-3 LISP or GRE tunnel, Open vSwitch force-fits it to this
> > +    When a packet is received from a non-Ethernet interface, such as a layer-3
> > +    LISP tunnel, Open vSwitch 2.7 and earlier force-fit the packet to this
> >      Ethernet-centric point of view by pretending that an Ethernet header is
> >      present whose Ethernet type that indicates the packet's actual type (and
> >      whose source and destination addresses are all-zero).
> >    </p>
> > 
> >    <p>
> > -    Open vSwitch 2.8 and later supports the ``packet type-aware pipeline''
> > -    concept introduced in OpenFlow 1.5.  A bridge configured to be packet
> > -    type-aware can handle packets of multiple networking technologies, such as
> > -    Ethernet, IP, ARP, MPLS, or NSH in parallel.  Such a bridge does not have
> > -    any root fields.
> > +    Open vSwitch 2.8 and later implement the ``packet type-aware pipeline''
> > +    concept introduced in OpenFlow 1.5.  Such a pipeline does not have any root
> > +    fields.  Instead, a new metadata field, <ref field="packet_type"/>,
> > +    indicates the basic type of the packet, which can be Ethernet, IPv4, IPv6,
> > +    or another type.  For backward compatibility, by default Open vSwitch 2.8
> > +    imitates the behavior of Open vSwitch 2.7 and earlier.  Later versions of
> > +    Open vSwitch may change the default, and in the meantime controllers can
> > +    turn off this legacy behavior by setting
> > +    <code>other-config:packet-type</code> to <code>ptap</code> in the
> > +    <code>Bridge</code> table.  (See <code>ovs-vswitchd.conf.db</code>(5) for
> > +    more information.)
> >    </p>
> > 
> >    <p>
> > @@ -332,14 +338,6 @@ tcp,tp_src=0x07c0/0xfff0
> >      <dt><code>mplsm</code></dt>  <dd><code>eth_type=0x8848</code></dd>
> >    </dl>
> > 
> > -  <p>
> > -    These shorthand notations continue to work in packet type-aware bridges.
> > -    The absence of a packet_type match implies
> > -    <code>packet_type=ethernet</code>, so that shorthands match on Ethernet
> > -    packets with the implied eth_type. Please note that the shorthand
> > -    <code>ip</code> does not match packets of packet_type (1,0x800) for IPv4.
> > -  </p>
> > -
> > 
> >    <h2>Evolution of OpenFlow Fields</h2>
> > 
> > diff --git a/lib/netdev-bsd.c b/lib/netdev-bsd.c
> > index f863a189cd5e..6cc83d347795 100644
> > --- a/lib/netdev-bsd.c
> > +++ b/lib/netdev-bsd.c
> > @@ -1517,6 +1517,7 @@ netdev_bsd_update_flags(struct netdev *netdev_, enum netdev_flags off,
> >                                                       \
> >      GET_FEATURES,                                    \
> >      NULL, /* set_advertisement */                    \
> > +    NULL, /* get_pt_mode */                          \
> >      NULL, /* set_policing */                         \
> >      NULL, /* get_qos_type */                         \
> >      NULL, /* get_qos_capabilities */                 \
> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> > index bba4de378888..5ad92446950f 100644
> > --- a/lib/netdev-dpdk.c
> > +++ b/lib/netdev-dpdk.c
> > @@ -3276,6 +3276,7 @@ unlock:
> >      GET_STATS,                                                \
> >      GET_FEATURES,                                             \
> >      NULL,                       /* set_advertisements */      \
> > +    NULL,                       /* get_pt_mode */             \
> >                                                                \
> >      netdev_dpdk_set_policing,                                 \
> >      netdev_dpdk_get_qos_types,                                \
> > diff --git a/lib/netdev-dummy.c b/lib/netdev-dummy.c
> > index d189a8615e05..51d29d54a4ac 100644
> > --- a/lib/netdev-dummy.c
> > +++ b/lib/netdev-dummy.c
> > @@ -1382,6 +1382,7 @@ netdev_dummy_update_flags(struct netdev *netdev_,
> >                                                                  \
> >      NULL,                       /* get_features */              \
> >      NULL,                       /* set_advertisements */        \
> > +    NULL,                       /* get_pt_mode */               \
> >                                                                  \
> >      NULL,                       /* set_policing */              \
> >      NULL,                       /* get_qos_types */             \
> > diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
> > index f5dc30fbc188..a8c296002cc4 100644
> > --- a/lib/netdev-linux.c
> > +++ b/lib/netdev-linux.c
> > @@ -2842,6 +2842,7 @@ netdev_linux_update_flags(struct netdev *netdev_, enum netdev_flags off,
> >                                                                  \
> >      GET_FEATURES,                                               \
> >      netdev_linux_set_advertisements,                            \
> > +    NULL,                       /* get_pt_mode */               \
> >                                                                  \
> >      netdev_linux_set_policing,                                  \
> >      netdev_linux_get_qos_types,                                 \
> > diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c
> > index c7a29934537e..7f3cf984e887 100644
> > --- a/lib/netdev-native-tnl.c
> > +++ b/lib/netdev-native-tnl.c
> > @@ -463,10 +463,13 @@ netdev_gre_build_header(const struct netdev *netdev,
> > 
> >      greh = netdev_tnl_ip_build_header(data, params, IPPROTO_GRE);
> > 
> > -    if (tnl_cfg->is_layer3) {
> > -        greh->protocol = params->flow->dl_type;
> > -    } else {
> > +    if (params->flow->packet_type == htonl(PT_ETH)) {
> >          greh->protocol = htons(ETH_TYPE_TEB);
> > +    } else if (pt_ns(params->flow->packet_type) == OFPHTN_ETHERTYPE) {
> > +        greh->protocol = pt_ns_type_be(params->flow->packet_type);
> > +    } else {
> > +        ovs_mutex_unlock(&dev->mutex);
> > +        return 1;
> >      }
> >      greh->flags = 0;
> > 
> > @@ -575,8 +578,10 @@ netdev_vxlan_build_header(const struct netdev *netdev,
> >          put_16aligned_be32(&vxh->vx_flags, htonl(VXLAN_FLAGS | VXLAN_HF_GPE));
> >          put_16aligned_be32(&vxh->vx_vni,
> >                             htonl(ntohll(params->flow->tunnel.tun_id) << 8));
> > -        if (tnl_cfg->is_layer3) {
> > -            switch (ntohs(params->flow->dl_type)) {
> > +        if (params->flow->packet_type == htonl(PT_ETH)) {
> > +            vxh->vx_gpe.next_protocol = VXLAN_GPE_NP_ETHERNET;
> > +        } else if (pt_ns(params->flow->packet_type) == OFPHTN_ETHERTYPE) {
> > +            switch (pt_ns_type(params->flow->packet_type)) {
> >              case ETH_TYPE_IP:
> >                  vxh->vx_gpe.next_protocol = VXLAN_GPE_NP_IPV4;
> >                  break;
> > @@ -586,9 +591,11 @@ netdev_vxlan_build_header(const struct netdev *netdev,
> >              case ETH_TYPE_TEB:
> >                  vxh->vx_gpe.next_protocol = VXLAN_GPE_NP_ETHERNET;
> >                  break;
> > +            default:
> > +                goto drop;
> >              }
> >          } else {
> > -            vxh->vx_gpe.next_protocol = VXLAN_GPE_NP_ETHERNET;
> > +            goto drop;
> >          }
> >      } else {
> >          put_16aligned_be32(&vxh->vx_flags, htonl(VXLAN_FLAGS));
> > @@ -600,6 +607,10 @@ netdev_vxlan_build_header(const struct netdev *netdev,
> >      data->header_len += sizeof *vxh;
> >      data->tnl_type = OVS_VPORT_TYPE_VXLAN;
> >      return 0;
> > +
> > +drop:
> > +    ovs_mutex_unlock(&dev->mutex);
> > +    return 1;
> >  }
> > 
> >  struct dp_packet *
> > diff --git a/lib/netdev-provider.h b/lib/netdev-provider.h
> > index 79143d2c8dfb..3c3c181135de 100644
> > --- a/lib/netdev-provider.h
> > +++ b/lib/netdev-provider.h
> > @@ -474,6 +474,12 @@ struct netdev_class {
> >      int (*set_advertisements)(struct netdev *netdev,
> >                                enum netdev_features advertise);
> > 
> > +    /* Returns 'netdev''s configured packet_type mode.
> > +     *
> > +     * This function may be set to null if it would always return
> > +     * NETDEV_PT_LEGACY_L2. */
> > +    enum netdev_pt_mode (*get_pt_mode)(const struct netdev *netdev);
> > +
> >      /* Attempts to set input rate limiting (policing) policy, such that up to
> >       * 'kbits_rate' kbps of traffic is accepted, with a maximum accumulative
> >       * burst size of 'kbits' kb.
> > diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c
> > index 640cdbe9ca7c..f1db38cbb10f 100644
> > --- a/lib/netdev-vport.c
> > +++ b/lib/netdev-vport.c
> > @@ -98,18 +98,6 @@ netdev_vport_is_patch(const struct netdev *netdev)
> >      return class->get_config == get_patch_config;
> >  }
> > 
> > -bool
> > -netdev_vport_is_layer3(const struct netdev *dev)
> > -{
> > -    if (is_vport_class(netdev_get_class(dev))) {
> > -        struct netdev_vport *vport = netdev_vport_cast(dev);
> > -
> > -        return vport->tnl_cfg.is_layer3;
> > -    }
> > -
> > -    return false;
> > -}
> > -
> >  static bool
> >  netdev_vport_needs_dst_port(const struct netdev *dev)
> >  {
> > @@ -407,6 +395,30 @@ parse_tunnel_ip(const char *value, bool accept_mcast, bool *flow,
> >      return 0;
> >  }
> > 
> > +enum tunnel_layers {
> > +    TNL_L2 = 1 << 0,       /* 1 if a tunnel type can carry Ethernet traffic. */
> > +    TNL_L3 = 1 << 1        /* 1 if a tunnel type can carry L3 traffic. */
> > +};
> > +static enum tunnel_layers
> > +tunnel_supported_layers(const char *type,
> > +                        const struct netdev_tunnel_config *tnl_cfg)
> > +{
> > +    if (!strcmp(type, "lisp")) {
> > +        return TNL_L3;
> > +    } else if (!strcmp(type, "gre")) {
> > +        return TNL_L2 | TNL_L3;
> > +    } else if (!strcmp(type, "vxlan") && tnl_cfg->exts & (1 << OVS_VXLAN_EXT_GPE)) {
> > +        return TNL_L2 | TNL_L3;
> > +    } else {
> > +        return TNL_L2;
> > +    }
> > +}
> > +static enum netdev_pt_mode
> > +default_pt_mode(enum tunnel_layers layers)
> > +{
> > +    return layers == TNL_L3 ? NETDEV_PT_LEGACY_L3 : NETDEV_PT_LEGACY_L2;
> > +}
> > +
> >  static int
> >  set_tunnel_config(struct netdev *dev_, const struct smap *args, char **errp)
> >  {
> > @@ -414,16 +426,14 @@ set_tunnel_config(struct netdev *dev_, const struct smap *args, char **errp)
> >      const char *name = netdev_get_name(dev_);
> >      const char *type = netdev_get_type(dev_);
> >      struct ds errors = DS_EMPTY_INITIALIZER;
> > -    bool needs_dst_port, has_csum, optional_layer3;
> > +    bool needs_dst_port, has_csum;
> >      uint16_t dst_proto = 0, src_proto = 0;
> >      struct netdev_tunnel_config tnl_cfg;
> >      struct smap_node *node;
> > -    bool is_layer3 = false;
> >      int err;
> > 
> >      has_csum = strstr(type, "gre") || strstr(type, "geneve") ||
> >                 strstr(type, "stt") || strstr(type, "vxlan");
> > -    optional_layer3 = !strcmp(type, "gre");
> >      memset(&tnl_cfg, 0, sizeof tnl_cfg);
> > 
> >      /* Add a default destination port for tunnel ports if none specified. */
> > @@ -437,7 +447,6 @@ set_tunnel_config(struct netdev *dev_, const struct smap *args, char **errp)
> > 
> >      if (!strcmp(type, "lisp")) {
> >          tnl_cfg.dst_port = htons(LISP_DST_PORT);
> > -        tnl_cfg.is_layer3 = true;
> >      }
> > 
> >      if (!strcmp(type, "stt")) {
> > @@ -501,9 +510,10 @@ set_tunnel_config(struct netdev *dev_, const struct smap *args, char **errp)
> >              }
> >          } else if (!strcmp(node->key, "key") ||
> >                     !strcmp(node->key, "in_key") ||
> > -                   !strcmp(node->key, "out_key")) {
> > +                   !strcmp(node->key, "out_key") ||
> > +                   !strcmp(node->key, "packet_type")) {
> >              /* Handled separately below. */
> > -        } else if (!strcmp(node->key, "exts")) {
> > +        } else if (!strcmp(node->key, "exts") && !strcmp(type, "vxlan")) {
> >              char *str = xstrdup(node->value);
> >              char *ext, *save_ptr = NULL;
> > 
> > @@ -515,7 +525,6 @@ set_tunnel_config(struct netdev *dev_, const struct smap *args, char **errp)
> >                      tnl_cfg.exts |= (1 << OVS_VXLAN_EXT_GBP);
> >                  } else if (!strcmp(type, "vxlan") && !strcmp(ext, "gpe")) {
> >                      tnl_cfg.exts |= (1 << OVS_VXLAN_EXT_GPE);
> > -                    optional_layer3 = true;
> >                  } else {
> >                      ds_put_format(&errors, "%s: unknown extension '%s'\n",
> >                                    name, ext);
> > @@ -528,21 +537,44 @@ set_tunnel_config(struct netdev *dev_, const struct smap *args, char **errp)
> >          } else if (!strcmp(node->key, "egress_pkt_mark")) {
> >              tnl_cfg.egress_pkt_mark = strtoul(node->value, NULL, 10);
> >              tnl_cfg.set_egress_pkt_mark = true;
> > -        } else if (!strcmp(node->key, "layer3")) {
> > -            if (!strcmp(node->value, "true")) {
> > -                is_layer3 = true;
> > -            }
> >          } else {
> >              ds_put_format(&errors, "%s: unknown %s argument '%s'\n", name,
> >                            type, node->key);
> >          }
> >      }
> > 
> > -    if (optional_layer3 && is_layer3) {
> > -       tnl_cfg.is_layer3 = is_layer3;
> > -    } else if (!optional_layer3 && is_layer3) {
> > -        ds_put_format(&errors, "%s: unknown %s argument '%s'\n",
> > -                      name, type, "layer3");
> > +    enum tunnel_layers layers = tunnel_supported_layers(type, &tnl_cfg);
> > +    const char *full_type = (strcmp(type, "vxlan") ? type
> > +                             : tnl_cfg.exts & (1 << OVS_VXLAN_EXT_GPE) ? "VXLAN-GPE"
> > +                             : "VXLAN (without GPE)");
> > +    const char *packet_type = smap_get(args, "packet_type");
> > +    if (!packet_type) {
> > +        tnl_cfg.pt_mode = default_pt_mode(layers);
> > +    } else if (!strcmp(packet_type, "legacy_l2")) {
> > +        tnl_cfg.pt_mode = NETDEV_PT_LEGACY_L2;
> > +        if (!(layers & TNL_L2)) {
> > +            ds_put_format(&errors, "%s: legacy_l2 configured on %s tunnel "
> > +                          "that cannot carry L2 traffic\n",
> > +                          name, full_type);
> > +            err = EINVAL;
> > +            goto out;
> > +        }
> > +    } else if (!strcmp(packet_type, "legacy_l3")) {
> > +        tnl_cfg.pt_mode = NETDEV_PT_LEGACY_L3;
> > +        if (!(layers & TNL_L3)) {
> > +            ds_put_format(&errors, "%s: legacy_l3 configured on %s tunnel "
> > +                          "that cannot carry L3 traffic\n",
> > +                          name, full_type);
> > +            err = EINVAL;
> > +            goto out;
> > +        }
> > +    } else if (!strcmp(packet_type, "ptap")) {
> > +        tnl_cfg.pt_mode = NETDEV_PT_AWARE;
> > +    } else {
> > +        ds_put_format(&errors, "%s: unknown packet_type '%s'\n",
> > +                      name, packet_type);
> > +        err = EINVAL;
> > +        goto out;
> >      }
> > 
> >      if (!ipv6_addr_is_set(&tnl_cfg.ipv6_dst) && !tnl_cfg.ip_dst_flow) {
> > @@ -675,9 +707,12 @@ get_tunnel_config(const struct netdev *dev, struct smap *args)
> >          smap_add(args, "csum", "true");
> >      }
> > 
> > -    if (tnl_cfg.is_layer3 && (!strcmp("gre", type) ||
> > -        !strcmp("vxlan", type))) {
> > -        smap_add(args, "layer3", "true");
> > +    enum tunnel_layers layers = tunnel_supported_layers(type, &tnl_cfg);
> > +    if (tnl_cfg.pt_mode != default_pt_mode(layers)) {
> > +        smap_add(args, "packet_type",
> > +                 tnl_cfg.pt_mode == NETDEV_PT_LEGACY_L2 ? "legacy_l2"
> > +                 : tnl_cfg.pt_mode == NETDEV_PT_LEGACY_L3 ? "legacy_l3"
> > +                 : "ptap");
> >      }
> > 
> >      if (!tnl_cfg.dont_fragment) {
> > @@ -809,6 +844,14 @@ get_stats(const struct netdev *netdev, struct netdev_stats *stats)
> >      return 0;
> >  }
> > 
> > +static enum netdev_pt_mode
> > +get_pt_mode(const struct netdev *netdev)
> > +{
> > +    struct netdev_vport *dev = netdev_vport_cast(netdev);
> > +
> > +    return dev->tnl_cfg.pt_mode;
> > +}
> > +
> >  
> 
> >  #ifdef __linux__
> >  static int
> > @@ -873,6 +916,7 @@ netdev_vport_get_ifindex(const struct netdev *netdev_)
> >                                                              \
> >      NULL,                       /* get_features */          \
> >      NULL,                       /* set_advertisements */    \
> > +    get_pt_mode,                                            \
> >                                                              \
> >      NULL,                       /* set_policing */          \
> >      NULL,                       /* get_qos_types */         \
> > diff --git a/lib/netdev-vport.h b/lib/netdev-vport.h
> > index 048aa6ebf223..9d756a265c4f 100644
> > --- a/lib/netdev-vport.h
> > +++ b/lib/netdev-vport.h
> > @@ -31,7 +31,6 @@ void netdev_vport_tunnel_register(void);
> >  void netdev_vport_patch_register(void);
> > 
> >  bool netdev_vport_is_patch(const struct netdev *);
> > -bool netdev_vport_is_layer3(const struct netdev *);
> > 
> >  char *netdev_vport_patch_peer(const struct netdev *netdev);
> > 
> > diff --git a/lib/netdev.c b/lib/netdev.c
> > index 765bf4b9ccad..a7840a84e594 100644
> > --- a/lib/netdev.c
> > +++ b/lib/netdev.c
> > @@ -727,6 +727,14 @@ netdev_set_tx_multiq(struct netdev *netdev, unsigned int n_txq)
> >      return error;
> >  }
> > 
> > +enum netdev_pt_mode
> > +netdev_get_pt_mode(const struct netdev *netdev)
> > +{
> > +    return (netdev->netdev_class->get_pt_mode
> > +            ? netdev->netdev_class->get_pt_mode(netdev)
> > +            : NETDEV_PT_LEGACY_L2);
> > +}
> > +
> >  /* Sends 'batch' on 'netdev'.  Returns 0 if successful (for every packet),
> >   * otherwise a positive errno value.  Returns EAGAIN without blocking if
> >   * at least one the packets cannot be queued immediately.  Returns EMSGSIZE
> > diff --git a/lib/netdev.h b/lib/netdev.h
> > index 31846fabf9af..998f942e29d9 100644
> > --- a/lib/netdev.h
> > +++ b/lib/netdev.h
> > @@ -71,6 +71,32 @@ struct smap;
> >  struct sset;
> >  struct ovs_action_push_tnl;
> > 
> > +enum netdev_pt_mode {
> > +    /* The netdev is packet type aware.  It can potentially carry any kind of
> > +     * packet.  This "modern" mode is appropriate for both netdevs that handle
> > +     * only a single kind of packet (such as a virtual or physical Ethernet
> > +     * interface) and for those that can handle multiple (such as VXLAN-GPE or
> > +     * Geneve). */
> > +    NETDEV_PT_AWARE,
> > +
> > +    /* The netdev sends and receives only Ethernet frames.  The netdev cannot
> > +     * carry packets other than Ethernet frames.  This is a legacy mode for
> > +     * backward compatibility with controllers that are not prepared to handle
> > +     * OpenFlow 1.5+ "packet_type". */
> > +    NETDEV_PT_LEGACY_L2,
> > +
> > +    /* The netdev sends and receives only IPv4 and IPv6 packets.  The netdev
> > +     * cannot carry Ethernet frames or other kinds of packets.
> > +     *
> > +     * IPv4 and IPv6 packets carried over the netdev are treated as Ethernet:
> > +     * when they are received, they are converted to Ethernet by adding a dummy
> > +     * header with the proper Ethertype; on transmission, the Ethernet header is
> > +     * stripped.  This is a legacy mode for backward compatibility with
> > +     * controllers that are not prepared to handle OpenFlow 1.5+
> > +     * "packet_type". */
> > +    NETDEV_PT_LEGACY_L3,
> > +};
> > +
> >  /* Configuration specific to tunnels. */
> >  struct netdev_tunnel_config {
> >      bool in_key_present;
> > @@ -100,7 +126,7 @@ struct netdev_tunnel_config {
> > 
> >      bool csum;
> >      bool dont_fragment;
> > -    bool is_layer3;
> > +    enum netdev_pt_mode pt_mode;
> >  };
> > 
> >  void netdev_run(void);
> > @@ -140,6 +166,7 @@ void netdev_mtu_user_config(struct netdev *, bool);
> >  bool netdev_mtu_is_user_config(struct netdev *);
> >  int netdev_get_ifindex(const struct netdev *);
> >  int netdev_set_tx_multiq(struct netdev *, unsigned int n_txq);
> > +enum netdev_pt_mode netdev_get_pt_mode(const struct netdev *);
> > 
> >  /* Packet reception. */
> >  int netdev_rxq_open(struct netdev *, struct netdev_rxq **, int id);
> > diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> > index c7a7a371d32b..26cf3ba286cf 100644
> > --- a/ofproto/ofproto-dpif-xlate.c
> > +++ b/ofproto/ofproto-dpif-xlate.c
> > @@ -165,7 +165,7 @@ struct xport {
> > 
> >      bool may_enable;                 /* May be enabled in bonds. */
> >      bool is_tunnel;                  /* Is a tunnel port. */
> > -    bool is_layer3;                  /* Is a layer 3 port. */
> > +    enum netdev_pt_mode pt_mode;     /* packet_type handling. */
> > 
> >      struct cfm *cfm;                 /* CFM handle or null. */
> >      struct bfd *bfd;                 /* BFD handle or null. */
> > @@ -905,7 +905,7 @@ xlate_xport_set(struct xport *xport, odp_port_t odp_port,
> >      xport->state = state;
> >      xport->stp_port_no = stp_port_no;
> >      xport->is_tunnel = is_tunnel;
> > -    xport->is_layer3 = netdev_vport_is_layer3(netdev);
> > +    xport->pt_mode = netdev_get_pt_mode(netdev);
> >      xport->may_enable = may_enable;
> >      xport->odp_port = odp_port;
> > 
> > @@ -2691,7 +2691,10 @@ xlate_normal(struct xlate_ctx *ctx)
> > 
> >      /* Learn source MAC. */
> >      bool is_grat_arp = is_gratuitous_arp(flow, wc);
> > -    if (ctx->xin->allow_side_effects && !in_port->is_layer3) {
> > +    if (ctx->xin->allow_side_effects
> > +        && flow->packet_type == htonl(PT_ETH)
> > +        && in_port->pt_mode != NETDEV_PT_LEGACY_L3
> > +    ) {
> >          update_learning_table(ctx, in_xbundle, flow->dl_src, vlan,
> >                                is_grat_arp);
> >      }
> > @@ -3351,15 +3354,19 @@ compose_output_action__(struct xlate_ctx *ctx, ofp_port_t ofp_port,
> >          return;
> >      }
> > 
> > -    if (flow->packet_type == htonl(PT_ETH) && xport->is_layer3) {
> > -        /* Ethernet packet to L3 outport -> pop ethernet header. */
> > -        flow->packet_type = PACKET_TYPE_BE(OFPHTN_ETHERTYPE,
> > -                                           ntohs(flow->dl_type));
> > -    } else if (flow->packet_type != htonl(PT_ETH) && !xport->is_layer3) {
> > -        /* L2 outport and non-ethernet packet_type -> add dummy eth header. */
> > -        flow->packet_type = htonl(PT_ETH);
> > -        flow->dl_dst = eth_addr_zero;
> > -        flow->dl_src = eth_addr_zero;
> > +    if (flow->packet_type == htonl(PT_ETH)) {
> > +        /* Strip Ethernet header for legacy L3 port. */
> > +        if (xport->pt_mode == NETDEV_PT_LEGACY_L3) {
> > +            flow->packet_type = PACKET_TYPE_BE(OFPHTN_ETHERTYPE,
> > +                                               ntohs(flow->dl_type));
> > +        }
> > +    } else {
> > +        /* Add dummy Ethernet header for legacy L2 port. */
> > +        if (xport->pt_mode == NETDEV_PT_LEGACY_L2) {
> > +            flow->packet_type = htonl(PT_ETH);
> > +            flow->dl_dst = eth_addr_zero;
> > +            flow->dl_src = eth_addr_zero;
> > +        }
> >      }
> > 
> >      if (xport->peer) {
> > @@ -6391,8 +6398,8 @@ xlate_actions(struct xlate_in *xin, struct xlate_out *xout)
> >      struct xport *in_port = get_ofp_port(xbridge,
> >                                           ctx.base_flow.in_port.ofp_port);
> > 
> > -    if (flow->packet_type != htonl(PT_ETH) && in_port && in_port->is_layer3 &&
> > -        ctx.table_id == 0) {
> > +    if (flow->packet_type != htonl(PT_ETH) && in_port &&
> > +        in_port->pt_mode == NETDEV_PT_LEGACY_L3 && ctx.table_id == 0) {
> >          /* Add dummy Ethernet header to non-L2 packet if it's coming from a
> >           * L3 port. So all packets will be L2 packets for lookup.
> >           * The dl_type has already been set from the packet_type. */
> > diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
> > index cc325ddd7a37..d19d486d9d81 100644
> > --- a/ofproto/ofproto-dpif.c
> > +++ b/ofproto/ofproto-dpif.c
> > @@ -2893,7 +2893,7 @@ bundle_update(struct ofbundle *bundle)
> >      bundle->floodable = true;
> >      LIST_FOR_EACH (port, bundle_node, &bundle->ports) {
> >          if (port->up.pp.config & OFPUTIL_PC_NO_FLOOD
> > -            || netdev_vport_is_layer3(port->up.netdev)
> > +            || netdev_get_pt_mode(port->up.netdev) == NETDEV_PT_LEGACY_L3
> >              || (bundle->ofproto->stp && !stp_forward_in_state(port->stp_state))
> >              || (bundle->ofproto->rstp && !rstp_forward_in_state(port->rstp_state))) {
> >              bundle->floodable = false;
> > @@ -2942,7 +2942,7 @@ bundle_add_port(struct ofbundle *bundle, ofp_port_t ofp_port,
> >          port->bundle = bundle;
> >          ovs_list_push_back(&bundle->ports, &port->bundle_node);
> >          if (port->up.pp.config & OFPUTIL_PC_NO_FLOOD
> > -            || netdev_vport_is_layer3(port->up.netdev)
> > +            || netdev_get_pt_mode(port->up.netdev) == NETDEV_PT_LEGACY_L3
> >              || (bundle->ofproto->stp && !stp_forward_in_state(port->stp_state))
> >              || (bundle->ofproto->rstp && !rstp_forward_in_state(port->rstp_state))) {
> >              bundle->floodable = false;
> > diff --git a/ofproto/tunnel.c b/ofproto/tunnel.c
> > index fa99b3102862..c6856a09ef4e 100644
> > --- a/ofproto/tunnel.c
> > +++ b/ofproto/tunnel.c
> > @@ -50,7 +50,7 @@ struct tnl_match {
> >      bool in_key_flow;
> >      bool ip_src_flow;
> >      bool ip_dst_flow;
> > -    bool is_layer3;
> > +    enum netdev_pt_mode pt_mode;
> >  };
> > 
> >  struct tnl_port {
> > @@ -164,7 +164,7 @@ tnl_port_add__(const struct ofport_dpif *ofport, const struct netdev *netdev,
> >      tnl_port->match.ip_dst_flow = cfg->ip_dst_flow;
> >      tnl_port->match.in_key_flow = cfg->in_key_flow;
> >      tnl_port->match.odp_port = odp_port;
> > -    tnl_port->match.is_layer3 = netdev_vport_is_layer3(netdev);
> > +    tnl_port->match.pt_mode = netdev_get_pt_mode(netdev);
> > 
> >      map = tnl_match_map(&tnl_port->match);
> >      existing_port = tnl_find_exact(&tnl_port->match, *map);
> > @@ -564,8 +564,20 @@ tnl_find(const struct flow *flow) OVS_REQ_RDLOCK(rwlock)
> >                      match.in_key_flow = in_key_flow;
> >                      match.ip_dst_flow = ip_dst_flow;
> >                      match.ip_src_flow = ip_src == IP_SRC_FLOW;
> > -                    match.is_layer3 = flow->packet_type != htonl(PT_ETH);
> > 
> > +                    /* Look for a legacy L2 or L3 tunnel port first. */
> > +                    if (pt_ns(flow->packet_type) == OFPHTN_ETHERTYPE) {
> > +                        match.pt_mode = NETDEV_PT_LEGACY_L3;
> > +                    } else {
> > +                        match.pt_mode = NETDEV_PT_LEGACY_L2;
> > +                    }
> > +                    tnl_port = tnl_find_exact(&match, map);
> > +                    if (tnl_port) {
> > +                        return tnl_port;
> > +                    }
> > +
> > +                    /* Then check for a packet type aware port. */
> > +                    match.pt_mode = NETDEV_PT_AWARE;
> >                      tnl_port = tnl_find_exact(&match, map);
> >                      if (tnl_port) {
> >                          return tnl_port;
> > @@ -614,11 +626,12 @@ tnl_match_fmt(const struct tnl_match *match, struct ds *ds)
> >      } else {
> >          ds_put_format(ds, ", key=%#"PRIx64, ntohll(match->in_key));
> >      }
> > -    if (match->is_layer3) {
> > -        ds_put_cstr(ds, ", layer3");
> > -    }
> > 
> > -    ds_put_format(ds, ", dp port=%"PRIu32, match->odp_port);
> > +    const char *pt_mode
> > +        = (match->pt_mode == NETDEV_PT_LEGACY_L2 ? "legacy_l2"
> > +           : match->pt_mode == NETDEV_PT_LEGACY_L3 ? "legacy_l3"
> > +           : "ptap");
> > +    ds_put_format(ds, ", %s, dp port=%"PRIu32, pt_mode, match->odp_port);
> >  }
> > 
> >  static void
> > diff --git a/tests/tunnel-push-pop-ipv6.at b/tests/tunnel-push-pop-ipv6.at
> > index 228a9af43573..9ff7c897c0b7 100644
> > --- a/tests/tunnel-push-pop-ipv6.at
> > +++ b/tests/tunnel-push-pop-ipv6.at
> > @@ -13,7 +13,7 @@ AT_CHECK([ovs-vsctl add-port int-br t2 -- set Interface t2 type=vxlan \
> >                      -- add-port int-br t4 -- set Interface t4 type=geneve \
> >                         options:remote_ip=flow options:key=123 ofport_request=5\
> >                      -- add-port int-br t5 -- set Interface t5 type=gre \
> > -                       options:remote_ip=2001:cafe::92 options:key=455 options:layer3=true ofport_request=6\
> > +                       options:remote_ip=2001:cafe::92 options:key=455 options:packet_type=legacy_l3 ofport_request=6\
> >                         ], [0])
> > 
> >  AT_CHECK([ovs-appctl dpif/show], [0], [dnl
> > @@ -27,7 +27,7 @@ dummy@ovs-dummy: hit:0 missed:0
> >  		t2 2/4789: (vxlan: key=123, remote_ip=2001:cafe::92)
> >  		t3 4/4789: (vxlan: csum=true, out_key=flow, remote_ip=2001:cafe::93)
> >  		t4 5/6081: (geneve: key=123, remote_ip=flow)
> > -		t5 6/3: (gre: key=455, layer3=true, remote_ip=2001:cafe::92)
> > +		t5 6/3: (gre: key=455, packet_type=legacy_l3, remote_ip=2001:cafe::92)
> >  ])
> > 
> >  dnl First setup dummy interface IP address, then add the route
> > diff --git a/tests/tunnel-push-pop.at b/tests/tunnel-push-pop.at
> > index 5a2c423839db..c376e719e2ff 100644
> > --- a/tests/tunnel-push-pop.at
> > +++ b/tests/tunnel-push-pop.at
> > @@ -15,7 +15,7 @@ AT_CHECK([ovs-vsctl add-port int-br t2 -- set Interface t2 type=vxlan \
> >                      -- add-port int-br t5 -- set Interface t5 type=geneve \
> >                         options:remote_ip=1.1.2.93 options:out_key=flow options:egress_pkt_mark=1234 ofport_request=6\
> >                      -- add-port int-br t6 -- set Interface t6 type=gre \
> > -                       options:remote_ip=1.1.2.92 options:key=456 options:layer3=true ofport_request=7\
> > +                       options:remote_ip=1.1.2.92 options:key=456 options:packet_type=legacy_l3 ofport_request=7\
> >                      -- add-port int-br t7 -- set Interface t7 type=vxlan \
> >                         options:remote_ip=1.1.2.92 options:key=345 options:exts=gpe ofport_request=8\
> >                         ], [0])
> > @@ -32,7 +32,7 @@ dummy@ovs-dummy: hit:0 missed:0
> >  		t3 4/4789: (vxlan: csum=true, out_key=flow, remote_ip=1.1.2.93)
> >  		t4 5/6081: (geneve: key=123, remote_ip=flow)
> >  		t5 6/6081: (geneve: egress_pkt_mark=1234, out_key=flow, remote_ip=1.1.2.93)
> > -		t6 7/3: (gre: key=456, layer3=true, remote_ip=1.1.2.92)
> > +		t6 7/3: (gre: key=456, packet_type=legacy_l3, remote_ip=1.1.2.92)
> >  		t7 8/4789: (vxlan: key=345, remote_ip=1.1.2.92)
> >  ])
> > 
> > diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> > index 9bb828faa8eb..de68f89886a5 100644
> > --- a/vswitchd/vswitch.xml
> > +++ b/vswitchd/vswitch.xml
> > @@ -2397,6 +2397,31 @@
> >          including tunnel monitoring.
> >        </column>
> > 
> > +      <group title="Tunnel Options: lisp only">
> > +        <column name="options" key="packet_type"
> > +                type='{"type": "string", "enum": ["set",
> > +                      ["legacy_l3", "ptap"]]}'>
> > +          <p>
> > +            A LISP tunnel sends and receives only IPv4 and IPv6 packets.  This
> > +            option controls how the tunnel represents the packets that it
> > +            sends and receives:
> > +          </p>
> > +
> > +          <ul>
> > +            <li>
> > +              By default, or if this option is <code>legacy_l3</code>, the
> > +              tunnel represents packets as Ethernet frames for compatibility
> > +              with legacy OpenFlow controllers that expect this behavior.
> > +            </li>
> > +            <li>
> > +              If this option is <code>ptap</code>, the tunnel represents
> > +              packets using the <code>packet_type</code> mechanism introduced
> > +              in OpenFlow 1.5.
> > +            </li>
> > +          </ul>
> > +        </column>
> > +      </group>
> > +
> >        <group title="Tunnel Options: vxlan only">
> > 
> >          <column name="options" key="exts">
> > @@ -2416,21 +2441,42 @@
> >                <code>gpe</code>: Support for Generic Protocol Encapsulation in
> >                accordance with IETF draft
> >                <code>https://tools.ietf.org/html/draft-ietf-nvo3-vxlan-gpe</code>.
> > +              Without this option, a VXLAN packet always encapsulates an
> > +              Ethernet frame.  With this option, a VXLAN packet may also
> > +              encapsulate an IPv4, IPv6, NSH, or MPLS packet.
> >              </li>
> >            </ul>
> >          </column>
> > 
> > -        <column name="options" key="layer3" type='{"type": "boolean"}'>
> > +        <column name="options" key="packet_type"
> > +                type='{"type": "string", "enum": ["set",
> > +                      ["legacy_l2", "legacy_l3", "ptap"]]}'>
> >            <p>
> > -            By default, or if set to false, the tunnel carries L2 packets (with
> > -            an Ethernet header).  If set to true, the tunnel carries L3 packets
> > -            (without an Ethernet header present).
> > +            This option controls what types of packets the tunnel sends and
> > +            receives and how it represents them:
> >            </p>
> > 
> > -          <p>
> > -            To set this option to true, the <code>gpe</code> extension must
> > -            also be enabled in <ref column="options" key="exts"/>.
> > -          </p>
> > +          <ul>
> > +            <li>
> > +              By default, or if this option is <code>legacy_l2</code>, the
> > +              tunnel sends and receives only Ethernet frames.
> > +            </li>
> > +            <li>
> > +              If this option is <code>legacy_l3</code>, the tunnel sends and
> > +              receives only non-Ethernet (L3) packets, but the packets are
> > +              represented as Ethernet frames for compatibility with legacy
> > +              OpenFlow controllers that expect this behavior.  This requires
> > +              enabling <code>gpe</code> in <ref column="options" key="exts"/>.
> > +            </li>
> > +            <li>
> > +              If this option is <code>ptap</code>, Open vSwitch represents
> > +              packets in the tunnel using the <code>packet_type</code>
> > +              mechanism introduced in OpenFlow 1.5.  This mechanism supports
> > +              any kind of packet, but actually sending and receiving
> > +              non-Ethernet packets requires additionally enabling
> > +              <code>gpe</code> in <ref column="options" key="exts"/>.
> > +            </li>
> > +          </ul>
> >          </column>
> >        </group>
> > 
> > @@ -2439,18 +2485,32 @@
> >            <code>gre</code> interfaces support these options.
> >          </p>
> > 
> > -        <column name="options" key="layer3" type='{"type": "boolean"}'>
> > +        <column name="options" key="packet_type"
> > +                type='{"type": "string", "enum": ["set",
> > +                      ["legacy_l2", "legacy_l3", "ptap"]]}'>
> >            <p>
> > -            By default, or if set to false, the tunnel carries L2 packets (with
> > -            an Ethernet header).  If set to true, the tunnel carries L3 packets
> > -            (without an Ethernet header present).
> > +            This option controls what types of packets the tunnel sends and
> > +            receives and how it represents them:
> >            </p>
> > 
> > -          <p>
> > -            A single GRE tunnel cannot carry both L2 and L3 packets, but the
> > -            same effect can be realized by creating two tunnels with different
> > -            <code>layer3</code> settings and otherwise the same configuration.
> > -          </p>
> > +          <ul>
> > +            <li>
> > +              By default, or if this option is <code>legacy_l2</code>, the
> > +              tunnel sends and receives only Ethernet frames.
> > +            </li>
> > +            <li>
> > +              If this option is <code>legacy_l3</code>, the tunnel sends and
> > +              receives only non-Ethernet (L3) packets, but the packets are
> > +              represented as Ethernet frames for compatibility with legacy
> > +              OpenFlow controllers that expect this behavior.
> > +            </li>
> > +            <li>
> > +              If this option is <code>ptap</code>, the tunnel sends and
> > +              receives any kind of packet.  Open vSwitch represents packets in
> > +              the tunnel using the <code>packet_type</code> mechanism
> > +              introduced in OpenFlow 1.5.
> > +            </li>
> > +          </ul>
> >          </column>
> >        </group>
> > 
> > --
> > 2.10.2
> > 
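For readers following the tnl_find() hunk in ofproto/tunnel.c above, here is a minimal standalone sketch of the lookup order it introduces: try the legacy port that matches the packet's class first, then fall back to a packet-type-aware (ptap) port.  Everything in it is a hypothetical stand-in rather than OVS code: the names pt_mode, port, find_exact() and find_port() are made up, the real match has many more fields than a key, and the is_ethernet flag replaces the patch's pt_ns(flow->packet_type) check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the patch's types; not the real OVS
 * definitions. */
enum pt_mode { PT_LEGACY_L2, PT_LEGACY_L3, PT_AWARE };

struct port {
    const char *name;
    unsigned key;           /* Stands in for the whole tunnel match. */
    enum pt_mode mode;
};

/* Exact-match lookup on (key, mode), playing the role that
 * tnl_find_exact() plays in the patch. */
static const struct port *
find_exact(const struct port *ports, size_t n, unsigned key,
           enum pt_mode mode)
{
    assert(ports != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (ports[i].key == key && ports[i].mode == mode) {
            return &ports[i];
        }
    }
    return NULL;
}

/* Two-pass lookup: the legacy port for the packet's class wins; a
 * packet-type-aware port is used only if no legacy port matches. */
static const struct port *
find_port(const struct port *ports, size_t n, unsigned key,
          bool is_ethernet)
{
    enum pt_mode legacy = is_ethernet ? PT_LEGACY_L2 : PT_LEGACY_L3;
    const struct port *p = find_exact(ports, n, key, legacy);

    return p ? p : find_exact(ports, n, key, PT_AWARE);
}
```

With a legacy_l2 port and a ptap port configured identically otherwise, this ordering would send Ethernet packets to the legacy_l2 port and everything else to the ptap port, which appears to be the intent of the "legacy first" comment in the hunk.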


