[ovs-dev] [PATCH v2 08/12] userspace: Handling of versatile tunnel ports

Zoltán Balogh zoltan.balogh at ericsson.com
Tue Jun 20 10:09:39 UTC 2017


Hi Ben,

I guess you meant "options:packet_type" instead of "other_config:packet_type", since this is an interface property, not a bridge property.
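
To make the per-port semantics concrete, here is a minimal, self-contained C sketch of how a "packet_type" option value could be mapped to a mode and checked against what the tunnel type can carry. It follows the logic of set_tunnel_config() in the quoted patch, but every sketch_* name and the boolean gpe flag are simplified stand-ins assumed for illustration, not actual OVS identifiers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the patch's enum netdev_pt_mode values. */
enum sketch_pt_mode {
    SKETCH_PT_AWARE,
    SKETCH_PT_LEGACY_L2,
    SKETCH_PT_LEGACY_L3
};

/* Bit flags describing what a tunnel type can carry. */
enum sketch_layers {
    SKETCH_L2 = 1 << 0,   /* Tunnel type can carry Ethernet traffic. */
    SKETCH_L3 = 1 << 1    /* Tunnel type can carry L3 traffic. */
};

/* Returns the layers supported by tunnel 'type'.  'gpe' says whether the
 * GPE extension is enabled (only meaningful for "vxlan"). */
static int
sketch_layers_for(const char *type, bool gpe)
{
    if (!strcmp(type, "lisp")) {
        return SKETCH_L3;
    } else if (!strcmp(type, "gre") || (!strcmp(type, "vxlan") && gpe)) {
        return SKETCH_L2 | SKETCH_L3;
    }
    return SKETCH_L2;
}

/* Maps the option string 'value' (NULL if the option is unset) to a mode.
 * Returns true on success, false if 'value' is unknown or names a legacy
 * mode that the tunnel type cannot carry. */
static bool
sketch_parse_packet_type(const char *type, bool gpe, const char *value,
                         enum sketch_pt_mode *mode)
{
    int layers = sketch_layers_for(type, gpe);

    if (!value) {
        /* Default: legacy L3 only for L3-only tunnels, else legacy L2. */
        *mode = (layers == SKETCH_L3
                 ? SKETCH_PT_LEGACY_L3 : SKETCH_PT_LEGACY_L2);
        return true;
    } else if (!strcmp(value, "legacy_l2")) {
        *mode = SKETCH_PT_LEGACY_L2;
        return (layers & SKETCH_L2) != 0;
    } else if (!strcmp(value, "legacy_l3")) {
        *mode = SKETCH_PT_LEGACY_L3;
        return (layers & SKETCH_L3) != 0;
    } else if (!strcmp(value, "ptap")) {
        *mode = SKETCH_PT_AWARE;
        return true;
    }
    return false;
}
```

With this sketch, calling sketch_parse_packet_type("gre", false, "ptap", &mode) selects the packet type-aware mode, while asking for "legacy_l2" on a LISP port is rejected because LISP carries only L3 traffic.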

Best regards,
Zoltan

> -----Original Message-----
> From: Ben Pfaff [mailto:blp at ovn.org]
> Sent: Tuesday, June 20, 2017 4:20 AM
> To: Zoltán Balogh <zoltan.balogh at ericsson.com>
> Cc: dev at openvswitch.org
> Subject: Re: [ovs-dev] [PATCH v2 08/12] userspace: Handling of versatile tunnel ports
> 
> Thanks for reporting that.  I changed this paragraph to:
> 
>   <p>
>     Open vSwitch 2.8 and later implement the ``packet type-aware pipeline''
>     concept introduced in OpenFlow 1.5.  Such a pipeline does not have any root
>     fields.  Instead, a new metadata field, <ref field="packet_type"/>,
>     indicates the basic type of the packet, which can be Ethernet, IPv4, IPv6,
>     or another type.  For backward compatibility, by default Open vSwitch 2.8
>     imitates the behavior of Open vSwitch 2.7 and earlier.  Later versions of
>     Open vSwitch may change the default, and in the meantime controllers can
>     turn off this legacy behavior, on a port-by-port basis, by setting
>     <code>other_config:packet_type</code> to <code>ptap</code> in the
>     <code>Interface</code> table.  This is significant only for ports that can
>     handle non-Ethernet packets, which is currently just LISP, VXLAN-GPE, and
>     GRE tunnel ports.  See <code>ovs-vswitchd.conf.db</code>(5) for more
>     information.
>   </p>
> 
> 
> On Mon, Jun 19, 2017 at 03:33:33PM +0000, Zoltán Balogh wrote:
> > Hi Ben,
> >
> > In the lib/meta-flow.xml, you introduced the 'packet type-aware pipeline' concept.
> > You mentioned that controllers can turn off legacy behavior by setting the 'other-config:packet-type' bridge property to 'ptap'.
> > As far as I know, you discussed on Friday that there will be only one property, for the tunnel ports, and no bridge property.
> >
> > Best regards,
> > Zoltan
> >
> >
> > > -----Original Message-----
> > > From: ovs-dev-bounces at openvswitch.org [mailto:ovs-dev-bounces at openvswitch.org] On Behalf Of Ben Pfaff
> > > Sent: Monday, June 19, 2017 1:30 AM
> > > To: dev at openvswitch.org
> > > Cc: Ben Pfaff <blp at ovn.org>
> > > Subject: [ovs-dev] [PATCH v2 08/12] userspace: Handling of versatile tunnel ports
> > >
> > > > In netdev_gre_build_header() and netdev_vxlan_build_header(), the GRE protocol
> > > > and VXLAN-GPE next_protocol are set based on the packet_type of the flow.  For an
> > > > Ethernet packet, GRE uses ETH_TYPE_TEB.  Otherwise, if the name space is
> > > > OFPHTN_ETHERTYPE, the protocol is set according to the name space type.
> > >
> > > Signed-off-by: Jan Scheurich <jan.scheurich at ericsson.com>
> > > Signed-off-by: Ben Pfaff <blp at ovn.org>
> > > ---
> > >  NEWS                          |   6 +--
> > >  lib/meta-flow.xml             |  28 ++++++-----
> > >  lib/netdev-bsd.c              |   1 +
> > >  lib/netdev-dpdk.c             |   1 +
> > >  lib/netdev-dummy.c            |   1 +
> > >  lib/netdev-linux.c            |   1 +
> > >  lib/netdev-native-tnl.c       |  23 ++++++---
> > >  lib/netdev-provider.h         |   6 +++
> > >  lib/netdev-vport.c            | 106 ++++++++++++++++++++++++++++++------------
> > >  lib/netdev-vport.h            |   1 -
> > >  lib/netdev.c                  |   8 ++++
> > >  lib/netdev.h                  |  29 +++++++++++-
> > >  ofproto/ofproto-dpif-xlate.c  |  35 ++++++++------
> > >  ofproto/ofproto-dpif.c        |   4 +-
> > >  ofproto/tunnel.c              |  27 ++++++++---
> > >  tests/tunnel-push-pop-ipv6.at |   4 +-
> > >  tests/tunnel-push-pop.at      |   4 +-
> > >  vswitchd/vswitch.xml          |  94 ++++++++++++++++++++++++++++++-------
> > >  18 files changed, 277 insertions(+), 102 deletions(-)
> > >
> > > diff --git a/NEWS b/NEWS
> > > index a2f5a6dc8e54..8b0ad6191325 100644
> > > --- a/NEWS
> > > +++ b/NEWS
> > > @@ -59,11 +59,9 @@ Post-v2.7.0
> > >       * OVN services are no longer restarted automatically after upgrade.
> > >     - Add --cleanup option to command 'ovs-appctl exit' (see ovs-vswitchd(8)).
> > >     - L3 tunneling:
> > > -     * Add "layer3" options for tunnel ports that support non-Ethernet (L3)
> > > -       payload (GRE, VXLAN-GPE).
> > > +     * Use new tunnel port option "packet_type" to configure L2 vs. L3.
> > >       * New vxlan tunnel extension "gpe" to support VXLAN-GPE tunnels.
> > > -     * Transparently pop and push Ethernet headers at transmit/reception
> > > -       of packets to/from L3 tunnels.
> > > +     * New support for non-Ethernet (L3) payloads in GRE and VXLAN-GPE.
> > >     - The BFD detection multiplier is now user-configurable.
> > >     - New support for HW offloading
> > >       * HW offloading is disabled by default.
> > > diff --git a/lib/meta-flow.xml b/lib/meta-flow.xml
> > > index 856e1ba8cf7b..dc2731e2a260 100644
> > > --- a/lib/meta-flow.xml
> > > +++ b/lib/meta-flow.xml
> > > @@ -26,19 +26,25 @@
> > >      networking technology in use are called called <dfn>root fields</dfn>.
> > >      Open vSwitch 2.7 and earlier considered Ethernet fields to be root fields,
> > >      and this remains the default mode of operation for Open vSwitch bridges.
> > > -    In this mode, when a packet is received from a non-Ethernet interfaces,
> > > -    such as a layer-3 LISP or GRE tunnel, Open vSwitch force-fits it to this
> > > > +    When a packet is received from a non-Ethernet interface, such as a layer-3
> > > +    LISP tunnel, Open vSwitch 2.7 and earlier force-fit the packet to this
> > >      Ethernet-centric point of view by pretending that an Ethernet header is
> > >      present whose Ethernet type that indicates the packet's actual type (and
> > >      whose source and destination addresses are all-zero).
> > >    </p>
> > >
> > >    <p>
> > > -    Open vSwitch 2.8 and later supports the ``packet type-aware pipeline''
> > > -    concept introduced in OpenFlow 1.5.  A bridge configured to be packet
> > > -    type-aware can handle packets of multiple networking technologies, such as
> > > -    Ethernet, IP, ARP, MPLS, or NSH in parallel.  Such a bridge does not have
> > > -    any root fields.
> > > +    Open vSwitch 2.8 and later implement the ``packet type-aware pipeline''
> > > +    concept introduced in OpenFlow 1.5.  Such a pipeline does not have any root
> > > +    fields.  Instead, a new metadata field, <ref field="packet_type"/>,
> > > +    indicates the basic type of the packet, which can be Ethernet, IPv4, IPv6,
> > > +    or another type.  For backward compatibility, by default Open vSwitch 2.8
> > > +    imitates the behavior of Open vSwitch 2.7 and earlier.  Later versions of
> > > +    Open vSwitch may change the default, and in the meantime controllers can
> > > +    turn off this legacy behavior by setting
> > > +    <code>other-config:packet-type</code> to <code>ptap</code> in the
> > > > +    <code>Bridge</code> table.  (See <code>ovs-vswitchd.conf.db</code>(5) for
> > > +    more information.)
> > >    </p>
> > >
> > >    <p>
> > > @@ -332,14 +338,6 @@ tcp,tp_src=0x07c0/0xfff0
> > >      <dt><code>mplsm</code></dt>  <dd><code>eth_type=0x8848</code></dd>
> > >    </dl>
> > >
> > > -  <p>
> > > -    These shorthand notations continue to work in packet type-aware bridges.
> > > -    The absence of a packet_type match implies
> > > -    <code>packet_type=ethernet</code>, so that shorthands match on Ethernet
> > > -    packets with the implied eth_type. Please note that the shorthand
> > > -    <code>ip</code> does not match packets of packet_type (1,0x800) for IPv4.
> > > -  </p>
> > > -
> > >
> > >    <h2>Evolution of OpenFlow Fields</h2>
> > >
> > > diff --git a/lib/netdev-bsd.c b/lib/netdev-bsd.c
> > > index f863a189cd5e..6cc83d347795 100644
> > > --- a/lib/netdev-bsd.c
> > > +++ b/lib/netdev-bsd.c
> > > @@ -1517,6 +1517,7 @@ netdev_bsd_update_flags(struct netdev *netdev_, enum netdev_flags off,
> > >                                                       \
> > >      GET_FEATURES,                                    \
> > >      NULL, /* set_advertisement */                    \
> > > +    NULL, /* get_pt_mode */                          \
> > >      NULL, /* set_policing */                         \
> > >      NULL, /* get_qos_type */                         \
> > >      NULL, /* get_qos_capabilities */                 \
> > > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> > > index bba4de378888..5ad92446950f 100644
> > > --- a/lib/netdev-dpdk.c
> > > +++ b/lib/netdev-dpdk.c
> > > @@ -3276,6 +3276,7 @@ unlock:
> > >      GET_STATS,                                                \
> > >      GET_FEATURES,                                             \
> > >      NULL,                       /* set_advertisements */      \
> > > +    NULL,                       /* get_pt_mode */             \
> > >                                                                \
> > >      netdev_dpdk_set_policing,                                 \
> > >      netdev_dpdk_get_qos_types,                                \
> > > diff --git a/lib/netdev-dummy.c b/lib/netdev-dummy.c
> > > index d189a8615e05..51d29d54a4ac 100644
> > > --- a/lib/netdev-dummy.c
> > > +++ b/lib/netdev-dummy.c
> > > @@ -1382,6 +1382,7 @@ netdev_dummy_update_flags(struct netdev *netdev_,
> > >                                                                  \
> > >      NULL,                       /* get_features */              \
> > >      NULL,                       /* set_advertisements */        \
> > > +    NULL,                       /* get_pt_mode */               \
> > >                                                                  \
> > >      NULL,                       /* set_policing */              \
> > >      NULL,                       /* get_qos_types */             \
> > > diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
> > > index f5dc30fbc188..a8c296002cc4 100644
> > > --- a/lib/netdev-linux.c
> > > +++ b/lib/netdev-linux.c
> > > @@ -2842,6 +2842,7 @@ netdev_linux_update_flags(struct netdev *netdev_, enum netdev_flags off,
> > >                                                                  \
> > >      GET_FEATURES,                                               \
> > >      netdev_linux_set_advertisements,                            \
> > > +    NULL,                       /* get_pt_mode */               \
> > >                                                                  \
> > >      netdev_linux_set_policing,                                  \
> > >      netdev_linux_get_qos_types,                                 \
> > > diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c
> > > index c7a29934537e..7f3cf984e887 100644
> > > --- a/lib/netdev-native-tnl.c
> > > +++ b/lib/netdev-native-tnl.c
> > > @@ -463,10 +463,13 @@ netdev_gre_build_header(const struct netdev *netdev,
> > >
> > >      greh = netdev_tnl_ip_build_header(data, params, IPPROTO_GRE);
> > >
> > > -    if (tnl_cfg->is_layer3) {
> > > -        greh->protocol = params->flow->dl_type;
> > > -    } else {
> > > +    if (params->flow->packet_type == htonl(PT_ETH)) {
> > >          greh->protocol = htons(ETH_TYPE_TEB);
> > > +    } else if (pt_ns(params->flow->packet_type) == OFPHTN_ETHERTYPE) {
> > > +        greh->protocol = pt_ns_type_be(params->flow->packet_type);
> > > +    } else {
> > > +        ovs_mutex_unlock(&dev->mutex);
> > > +        return 1;
> > >      }
> > >      greh->flags = 0;
> > >
> > > @@ -575,8 +578,10 @@ netdev_vxlan_build_header(const struct netdev *netdev,
> > >          put_16aligned_be32(&vxh->vx_flags, htonl(VXLAN_FLAGS | VXLAN_HF_GPE));
> > >          put_16aligned_be32(&vxh->vx_vni,
> > >                             htonl(ntohll(params->flow->tunnel.tun_id) << 8));
> > > -        if (tnl_cfg->is_layer3) {
> > > -            switch (ntohs(params->flow->dl_type)) {
> > > +        if (params->flow->packet_type == htonl(PT_ETH)) {
> > > +            vxh->vx_gpe.next_protocol = VXLAN_GPE_NP_ETHERNET;
> > > +        } else if (pt_ns(params->flow->packet_type) == OFPHTN_ETHERTYPE) {
> > > +            switch (pt_ns_type(params->flow->packet_type)) {
> > >              case ETH_TYPE_IP:
> > >                  vxh->vx_gpe.next_protocol = VXLAN_GPE_NP_IPV4;
> > >                  break;
> > > @@ -586,9 +591,11 @@ netdev_vxlan_build_header(const struct netdev *netdev,
> > >              case ETH_TYPE_TEB:
> > >                  vxh->vx_gpe.next_protocol = VXLAN_GPE_NP_ETHERNET;
> > >                  break;
> > > +            default:
> > > +                goto drop;
> > >              }
> > >          } else {
> > > -            vxh->vx_gpe.next_protocol = VXLAN_GPE_NP_ETHERNET;
> > > +            goto drop;
> > >          }
> > >      } else {
> > >          put_16aligned_be32(&vxh->vx_flags, htonl(VXLAN_FLAGS));
> > > @@ -600,6 +607,10 @@ netdev_vxlan_build_header(const struct netdev *netdev,
> > >      data->header_len += sizeof *vxh;
> > >      data->tnl_type = OVS_VPORT_TYPE_VXLAN;
> > >      return 0;
> > > +
> > > +drop:
> > > +    ovs_mutex_unlock(&dev->mutex);
> > > +    return 1;
> > >  }
> > >
> > >  struct dp_packet *
> > > diff --git a/lib/netdev-provider.h b/lib/netdev-provider.h
> > > index 79143d2c8dfb..3c3c181135de 100644
> > > --- a/lib/netdev-provider.h
> > > +++ b/lib/netdev-provider.h
> > > @@ -474,6 +474,12 @@ struct netdev_class {
> > >      int (*set_advertisements)(struct netdev *netdev,
> > >                                enum netdev_features advertise);
> > >
> > > +    /* Returns 'netdev''s configured packet_type mode.
> > > +     *
> > > +     * This function may be set to null if it would always return
> > > +     * NETDEV_PT_LEGACY_L2. */
> > > +    enum netdev_pt_mode (*get_pt_mode)(const struct netdev *netdev);
> > > +
> > >      /* Attempts to set input rate limiting (policing) policy, such that up to
> > >       * 'kbits_rate' kbps of traffic is accepted, with a maximum accumulative
> > >       * burst size of 'kbits' kb.
> > > diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c
> > > index 640cdbe9ca7c..f1db38cbb10f 100644
> > > --- a/lib/netdev-vport.c
> > > +++ b/lib/netdev-vport.c
> > > @@ -98,18 +98,6 @@ netdev_vport_is_patch(const struct netdev *netdev)
> > >      return class->get_config == get_patch_config;
> > >  }
> > >
> > > -bool
> > > -netdev_vport_is_layer3(const struct netdev *dev)
> > > -{
> > > -    if (is_vport_class(netdev_get_class(dev))) {
> > > -        struct netdev_vport *vport = netdev_vport_cast(dev);
> > > -
> > > -        return vport->tnl_cfg.is_layer3;
> > > -    }
> > > -
> > > -    return false;
> > > -}
> > > -
> > >  static bool
> > >  netdev_vport_needs_dst_port(const struct netdev *dev)
> > >  {
> > > @@ -407,6 +395,30 @@ parse_tunnel_ip(const char *value, bool accept_mcast, bool *flow,
> > >      return 0;
> > >  }
> > >
> > > +enum tunnel_layers {
> > > +    TNL_L2 = 1 << 0,       /* 1 if a tunnel type can carry Ethernet traffic. */
> > > +    TNL_L3 = 1 << 1        /* 1 if a tunnel type can carry L3 traffic. */
> > > +};
> > > +static enum tunnel_layers
> > > +tunnel_supported_layers(const char *type,
> > > +                        const struct netdev_tunnel_config *tnl_cfg)
> > > +{
> > > +    if (!strcmp(type, "lisp")) {
> > > +        return TNL_L3;
> > > +    } else if (!strcmp(type, "gre")) {
> > > +        return TNL_L2 | TNL_L3;
> > > > +    } else if (!strcmp(type, "vxlan") && tnl_cfg->exts & (1 << OVS_VXLAN_EXT_GPE)) {
> > > +        return TNL_L2 | TNL_L3;
> > > +    } else {
> > > +        return TNL_L2;
> > > +    }
> > > +}
> > > +static enum netdev_pt_mode
> > > +default_pt_mode(enum tunnel_layers layers)
> > > +{
> > > +    return layers == TNL_L3 ? NETDEV_PT_LEGACY_L3 : NETDEV_PT_LEGACY_L2;
> > > +}
> > > +
> > >  static int
> > >  set_tunnel_config(struct netdev *dev_, const struct smap *args, char **errp)
> > >  {
> > > @@ -414,16 +426,14 @@ set_tunnel_config(struct netdev *dev_, const struct smap *args, char **errp)
> > >      const char *name = netdev_get_name(dev_);
> > >      const char *type = netdev_get_type(dev_);
> > >      struct ds errors = DS_EMPTY_INITIALIZER;
> > > -    bool needs_dst_port, has_csum, optional_layer3;
> > > +    bool needs_dst_port, has_csum;
> > >      uint16_t dst_proto = 0, src_proto = 0;
> > >      struct netdev_tunnel_config tnl_cfg;
> > >      struct smap_node *node;
> > > -    bool is_layer3 = false;
> > >      int err;
> > >
> > >      has_csum = strstr(type, "gre") || strstr(type, "geneve") ||
> > >                 strstr(type, "stt") || strstr(type, "vxlan");
> > > -    optional_layer3 = !strcmp(type, "gre");
> > >      memset(&tnl_cfg, 0, sizeof tnl_cfg);
> > >
> > >      /* Add a default destination port for tunnel ports if none specified. */
> > > @@ -437,7 +447,6 @@ set_tunnel_config(struct netdev *dev_, const struct smap *args, char **errp)
> > >
> > >      if (!strcmp(type, "lisp")) {
> > >          tnl_cfg.dst_port = htons(LISP_DST_PORT);
> > > -        tnl_cfg.is_layer3 = true;
> > >      }
> > >
> > >      if (!strcmp(type, "stt")) {
> > > @@ -501,9 +510,10 @@ set_tunnel_config(struct netdev *dev_, const struct smap *args, char **errp)
> > >              }
> > >          } else if (!strcmp(node->key, "key") ||
> > >                     !strcmp(node->key, "in_key") ||
> > > -                   !strcmp(node->key, "out_key")) {
> > > +                   !strcmp(node->key, "out_key") ||
> > > +                   !strcmp(node->key, "packet_type")) {
> > >              /* Handled separately below. */
> > > -        } else if (!strcmp(node->key, "exts")) {
> > > +        } else if (!strcmp(node->key, "exts") && !strcmp(type, "vxlan")) {
> > >              char *str = xstrdup(node->value);
> > >              char *ext, *save_ptr = NULL;
> > >
> > > @@ -515,7 +525,6 @@ set_tunnel_config(struct netdev *dev_, const struct smap *args, char **errp)
> > >                      tnl_cfg.exts |= (1 << OVS_VXLAN_EXT_GBP);
> > >                  } else if (!strcmp(type, "vxlan") && !strcmp(ext, "gpe")) {
> > >                      tnl_cfg.exts |= (1 << OVS_VXLAN_EXT_GPE);
> > > -                    optional_layer3 = true;
> > >                  } else {
> > >                      ds_put_format(&errors, "%s: unknown extension '%s'\n",
> > >                                    name, ext);
> > > @@ -528,21 +537,44 @@ set_tunnel_config(struct netdev *dev_, const struct smap *args, char **errp)
> > >          } else if (!strcmp(node->key, "egress_pkt_mark")) {
> > >              tnl_cfg.egress_pkt_mark = strtoul(node->value, NULL, 10);
> > >              tnl_cfg.set_egress_pkt_mark = true;
> > > -        } else if (!strcmp(node->key, "layer3")) {
> > > -            if (!strcmp(node->value, "true")) {
> > > -                is_layer3 = true;
> > > -            }
> > >          } else {
> > >              ds_put_format(&errors, "%s: unknown %s argument '%s'\n", name,
> > >                            type, node->key);
> > >          }
> > >      }
> > >
> > > -    if (optional_layer3 && is_layer3) {
> > > -       tnl_cfg.is_layer3 = is_layer3;
> > > -    } else if (!optional_layer3 && is_layer3) {
> > > -        ds_put_format(&errors, "%s: unknown %s argument '%s'\n",
> > > -                      name, type, "layer3");
> > > +    enum tunnel_layers layers = tunnel_supported_layers(type, &tnl_cfg);
> > > > +    const char *full_type = (strcmp(type, "vxlan") ? type
> > > > +                             : tnl_cfg.exts & (1 << OVS_VXLAN_EXT_GPE) ? "VXLAN-GPE"
> > > > +                             : "VXLAN (without GPE)");
> > > +    const char *packet_type = smap_get(args, "packet_type");
> > > +    if (!packet_type) {
> > > +        tnl_cfg.pt_mode = default_pt_mode(layers);
> > > +    } else if (!strcmp(packet_type, "legacy_l2")) {
> > > +        tnl_cfg.pt_mode = NETDEV_PT_LEGACY_L2;
> > > +        if (!(layers & TNL_L2)) {
> > > +            ds_put_format(&errors, "%s: legacy_l2 configured on %s tunnel "
> > > +                          "that cannot carry L2 traffic\n",
> > > +                          name, full_type);
> > > +            err = EINVAL;
> > > +            goto out;
> > > +        }
> > > +    } else if (!strcmp(packet_type, "legacy_l3")) {
> > > +        tnl_cfg.pt_mode = NETDEV_PT_LEGACY_L3;
> > > +        if (!(layers & TNL_L3)) {
> > > +            ds_put_format(&errors, "%s: legacy_l3 configured on %s tunnel "
> > > +                          "that cannot carry L3 traffic\n",
> > > +                          name, full_type);
> > > +            err = EINVAL;
> > > +            goto out;
> > > +        }
> > > +    } else if (!strcmp(packet_type, "ptap")) {
> > > +        tnl_cfg.pt_mode = NETDEV_PT_AWARE;
> > > +    } else {
> > > +        ds_put_format(&errors, "%s: unknown packet_type '%s'\n",
> > > +                      name, packet_type);
> > > +        err = EINVAL;
> > > +        goto out;
> > >      }
> > >
> > >      if (!ipv6_addr_is_set(&tnl_cfg.ipv6_dst) && !tnl_cfg.ip_dst_flow) {
> > > @@ -675,9 +707,12 @@ get_tunnel_config(const struct netdev *dev, struct smap *args)
> > >          smap_add(args, "csum", "true");
> > >      }
> > >
> > > -    if (tnl_cfg.is_layer3 && (!strcmp("gre", type) ||
> > > -        !strcmp("vxlan", type))) {
> > > -        smap_add(args, "layer3", "true");
> > > +    enum tunnel_layers layers = tunnel_supported_layers(type, &tnl_cfg);
> > > +    if (tnl_cfg.pt_mode != default_pt_mode(layers)) {
> > > +        smap_add(args, "packet_type",
> > > +                 tnl_cfg.pt_mode == NETDEV_PT_LEGACY_L2 ? "legacy_l2"
> > > +                 : tnl_cfg.pt_mode == NETDEV_PT_LEGACY_L3 ? "legacy_l3"
> > > +                 : "ptap");
> > >      }
> > >
> > >      if (!tnl_cfg.dont_fragment) {
> > > @@ -809,6 +844,14 @@ get_stats(const struct netdev *netdev, struct netdev_stats *stats)
> > >      return 0;
> > >  }
> > >
> > > +static enum netdev_pt_mode
> > > +get_pt_mode(const struct netdev *netdev)
> > > +{
> > > +    struct netdev_vport *dev = netdev_vport_cast(netdev);
> > > +
> > > +    return dev->tnl_cfg.pt_mode;
> > > +}
> > > +
> > >
> >
> > >  #ifdef __linux__
> > >  static int
> > > @@ -873,6 +916,7 @@ netdev_vport_get_ifindex(const struct netdev *netdev_)
> > >                                                              \
> > >      NULL,                       /* get_features */          \
> > >      NULL,                       /* set_advertisements */    \
> > > +    get_pt_mode,                                            \
> > >                                                              \
> > >      NULL,                       /* set_policing */          \
> > >      NULL,                       /* get_qos_types */         \
> > > diff --git a/lib/netdev-vport.h b/lib/netdev-vport.h
> > > index 048aa6ebf223..9d756a265c4f 100644
> > > --- a/lib/netdev-vport.h
> > > +++ b/lib/netdev-vport.h
> > > @@ -31,7 +31,6 @@ void netdev_vport_tunnel_register(void);
> > >  void netdev_vport_patch_register(void);
> > >
> > >  bool netdev_vport_is_patch(const struct netdev *);
> > > -bool netdev_vport_is_layer3(const struct netdev *);
> > >
> > >  char *netdev_vport_patch_peer(const struct netdev *netdev);
> > >
> > > diff --git a/lib/netdev.c b/lib/netdev.c
> > > index 765bf4b9ccad..a7840a84e594 100644
> > > --- a/lib/netdev.c
> > > +++ b/lib/netdev.c
> > > @@ -727,6 +727,14 @@ netdev_set_tx_multiq(struct netdev *netdev, unsigned int n_txq)
> > >      return error;
> > >  }
> > >
> > > +enum netdev_pt_mode
> > > +netdev_get_pt_mode(const struct netdev *netdev)
> > > +{
> > > +    return (netdev->netdev_class->get_pt_mode
> > > +            ? netdev->netdev_class->get_pt_mode(netdev)
> > > +            : NETDEV_PT_LEGACY_L2);
> > > +}
> > > +
> > >  /* Sends 'batch' on 'netdev'.  Returns 0 if successful (for every packet),
> > >   * otherwise a positive errno value.  Returns EAGAIN without blocking if
> > >   * at least one the packets cannot be queued immediately.  Returns EMSGSIZE
> > > diff --git a/lib/netdev.h b/lib/netdev.h
> > > index 31846fabf9af..998f942e29d9 100644
> > > --- a/lib/netdev.h
> > > +++ b/lib/netdev.h
> > > @@ -71,6 +71,32 @@ struct smap;
> > >  struct sset;
> > >  struct ovs_action_push_tnl;
> > >
> > > +enum netdev_pt_mode {
> > > +    /* The netdev is packet type aware.  It can potentially carry any kind of
> > > +     * packet.  This "modern" mode is appropriate for both netdevs that handle
> > > +     * only a single kind of packet (such as a virtual or physical Ethernet
> > > +     * interface) and for those that can handle multiple (such as VXLAN-GPE or
> > > +     * Geneve). */
> > > +    NETDEV_PT_AWARE,
> > > +
> > > +    /* The netdev sends and receives only Ethernet frames.  The netdev cannot
> > > +     * carry packets other than Ethernet frames.  This is a legacy mode for
> > > > +     * backward compatibility with controllers that are not prepared to handle
> > > +     * OpenFlow 1.5+ "packet_type". */
> > > +    NETDEV_PT_LEGACY_L2,
> > > +
> > > +    /* The netdev sends and receives only IPv4 and IPv6 packets.  The netdev
> > > +     * cannot carry Ethernet frames or other kinds of packets.
> > > +     *
> > > +     * IPv4 and IPv6 packets carried over the netdev are treated as Ethernet:
> > > +     * when they are received, they are converted to Ethernet by adding a dummy
> > > > +     * header with the proper Ethertype; on transmission, the Ethernet header is
> > > > +     * stripped.  This is a legacy mode for backward compatibility with
> > > +     * controllers that are not prepared to handle OpenFlow 1.5+
> > > +     * "packet_type". */
> > > +    NETDEV_PT_LEGACY_L3,
> > > +};
> > > +
> > >  /* Configuration specific to tunnels. */
> > >  struct netdev_tunnel_config {
> > >      bool in_key_present;
> > > @@ -100,7 +126,7 @@ struct netdev_tunnel_config {
> > >
> > >      bool csum;
> > >      bool dont_fragment;
> > > -    bool is_layer3;
> > > +    enum netdev_pt_mode pt_mode;
> > >  };
> > >
> > >  void netdev_run(void);
> > > @@ -140,6 +166,7 @@ void netdev_mtu_user_config(struct netdev *, bool);
> > >  bool netdev_mtu_is_user_config(struct netdev *);
> > >  int netdev_get_ifindex(const struct netdev *);
> > >  int netdev_set_tx_multiq(struct netdev *, unsigned int n_txq);
> > > +enum netdev_pt_mode netdev_get_pt_mode(const struct netdev *);
> > >
> > >  /* Packet reception. */
> > >  int netdev_rxq_open(struct netdev *, struct netdev_rxq **, int id);
> > > diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> > > index c7a7a371d32b..26cf3ba286cf 100644
> > > --- a/ofproto/ofproto-dpif-xlate.c
> > > +++ b/ofproto/ofproto-dpif-xlate.c
> > > @@ -165,7 +165,7 @@ struct xport {
> > >
> > >      bool may_enable;                 /* May be enabled in bonds. */
> > >      bool is_tunnel;                  /* Is a tunnel port. */
> > > -    bool is_layer3;                  /* Is a layer 3 port. */
> > > +    enum netdev_pt_mode pt_mode;     /* packet_type handling. */
> > >
> > >      struct cfm *cfm;                 /* CFM handle or null. */
> > >      struct bfd *bfd;                 /* BFD handle or null. */
> > > @@ -905,7 +905,7 @@ xlate_xport_set(struct xport *xport, odp_port_t odp_port,
> > >      xport->state = state;
> > >      xport->stp_port_no = stp_port_no;
> > >      xport->is_tunnel = is_tunnel;
> > > -    xport->is_layer3 = netdev_vport_is_layer3(netdev);
> > > +    xport->pt_mode = netdev_get_pt_mode(netdev);
> > >      xport->may_enable = may_enable;
> > >      xport->odp_port = odp_port;
> > >
> > > @@ -2691,7 +2691,10 @@ xlate_normal(struct xlate_ctx *ctx)
> > >
> > >      /* Learn source MAC. */
> > >      bool is_grat_arp = is_gratuitous_arp(flow, wc);
> > > -    if (ctx->xin->allow_side_effects && !in_port->is_layer3) {
> > > +    if (ctx->xin->allow_side_effects
> > > +        && flow->packet_type == htonl(PT_ETH)
> > > +        && in_port->pt_mode != NETDEV_PT_LEGACY_L3
> > > +    ) {
> > >          update_learning_table(ctx, in_xbundle, flow->dl_src, vlan,
> > >                                is_grat_arp);
> > >      }
> > > @@ -3351,15 +3354,19 @@ compose_output_action__(struct xlate_ctx *ctx, ofp_port_t ofp_port,
> > >          return;
> > >      }
> > >
> > > -    if (flow->packet_type == htonl(PT_ETH) && xport->is_layer3) {
> > > -        /* Ethernet packet to L3 outport -> pop ethernet header. */
> > > -        flow->packet_type = PACKET_TYPE_BE(OFPHTN_ETHERTYPE,
> > > -                                           ntohs(flow->dl_type));
> > > -    } else if (flow->packet_type != htonl(PT_ETH) && !xport->is_layer3) {
> > > -        /* L2 outport and non-ethernet packet_type -> add dummy eth header. */
> > > -        flow->packet_type = htonl(PT_ETH);
> > > -        flow->dl_dst = eth_addr_zero;
> > > -        flow->dl_src = eth_addr_zero;
> > > +    if (flow->packet_type == htonl(PT_ETH)) {
> > > +        /* Strip Ethernet header for legacy L3 port. */
> > > +        if (xport->pt_mode == NETDEV_PT_LEGACY_L3) {
> > > +            flow->packet_type = PACKET_TYPE_BE(OFPHTN_ETHERTYPE,
> > > +                                               ntohs(flow->dl_type));
> > > +        }
> > > +    } else {
> > > +        /* Add dummy Ethernet header for legacy L2 port. */
> > > +        if (xport->pt_mode == NETDEV_PT_LEGACY_L2) {
> > > +            flow->packet_type = htonl(PT_ETH);
> > > +            flow->dl_dst = eth_addr_zero;
> > > +            flow->dl_src = eth_addr_zero;
> > > +        }
> > >      }
> > >
> > >      if (xport->peer) {
> > > @@ -6391,8 +6398,8 @@ xlate_actions(struct xlate_in *xin, struct xlate_out *xout)
> > >      struct xport *in_port = get_ofp_port(xbridge,
> > >                                           ctx.base_flow.in_port.ofp_port);
> > >
> > > -    if (flow->packet_type != htonl(PT_ETH) && in_port && in_port->is_layer3 &&
> > > -        ctx.table_id == 0) {
> > > +    if (flow->packet_type != htonl(PT_ETH) && in_port &&
> > > +        in_port->pt_mode == NETDEV_PT_LEGACY_L3 && ctx.table_id == 0) {
> > >          /* Add dummy Ethernet header to non-L2 packet if it's coming from a
> > >           * L3 port. So all packets will be L2 packets for lookup.
> > >           * The dl_type has already been set from the packet_type. */
> > > diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
> > > index cc325ddd7a37..d19d486d9d81 100644
> > > --- a/ofproto/ofproto-dpif.c
> > > +++ b/ofproto/ofproto-dpif.c
> > > @@ -2893,7 +2893,7 @@ bundle_update(struct ofbundle *bundle)
> > >      bundle->floodable = true;
> > >      LIST_FOR_EACH (port, bundle_node, &bundle->ports) {
> > >          if (port->up.pp.config & OFPUTIL_PC_NO_FLOOD
> > > -            || netdev_vport_is_layer3(port->up.netdev)
> > > +            || netdev_get_pt_mode(port->up.netdev) == NETDEV_PT_LEGACY_L3
> > >              || (bundle->ofproto->stp && !stp_forward_in_state(port->stp_state))
> > >              || (bundle->ofproto->rstp && !rstp_forward_in_state(port->rstp_state))) {
> > >              bundle->floodable = false;
> > > @@ -2942,7 +2942,7 @@ bundle_add_port(struct ofbundle *bundle, ofp_port_t ofp_port,
> > >          port->bundle = bundle;
> > >          ovs_list_push_back(&bundle->ports, &port->bundle_node);
> > >          if (port->up.pp.config & OFPUTIL_PC_NO_FLOOD
> > > -            || netdev_vport_is_layer3(port->up.netdev)
> > > +            || netdev_get_pt_mode(port->up.netdev) == NETDEV_PT_LEGACY_L3
> > >              || (bundle->ofproto->stp && !stp_forward_in_state(port->stp_state))
> > >              || (bundle->ofproto->rstp && !rstp_forward_in_state(port->rstp_state))) {
> > >              bundle->floodable = false;
> > > diff --git a/ofproto/tunnel.c b/ofproto/tunnel.c
> > > index fa99b3102862..c6856a09ef4e 100644
> > > --- a/ofproto/tunnel.c
> > > +++ b/ofproto/tunnel.c
> > > @@ -50,7 +50,7 @@ struct tnl_match {
> > >      bool in_key_flow;
> > >      bool ip_src_flow;
> > >      bool ip_dst_flow;
> > > -    bool is_layer3;
> > > +    enum netdev_pt_mode pt_mode;
> > >  };
> > >
> > >  struct tnl_port {
> > > @@ -164,7 +164,7 @@ tnl_port_add__(const struct ofport_dpif *ofport, const struct netdev *netdev,
> > >      tnl_port->match.ip_dst_flow = cfg->ip_dst_flow;
> > >      tnl_port->match.in_key_flow = cfg->in_key_flow;
> > >      tnl_port->match.odp_port = odp_port;
> > > -    tnl_port->match.is_layer3 = netdev_vport_is_layer3(netdev);
> > > +    tnl_port->match.pt_mode = netdev_get_pt_mode(netdev);
> > >
> > >      map = tnl_match_map(&tnl_port->match);
> > >      existing_port = tnl_find_exact(&tnl_port->match, *map);
> > > @@ -564,8 +564,20 @@ tnl_find(const struct flow *flow) OVS_REQ_RDLOCK(rwlock)
> > >                      match.in_key_flow = in_key_flow;
> > >                      match.ip_dst_flow = ip_dst_flow;
> > >                      match.ip_src_flow = ip_src == IP_SRC_FLOW;
> > > -                    match.is_layer3 = flow->packet_type != htonl(PT_ETH);
> > >
> > > +                    /* Look for a legacy L2 or L3 tunnel port first. */
> > > +                    if (pt_ns(flow->packet_type) == OFPHTN_ETHERTYPE) {
> > > +                        match.pt_mode = NETDEV_PT_LEGACY_L3;
> > > +                    } else {
> > > +                        match.pt_mode = NETDEV_PT_LEGACY_L2;
> > > +                    }
> > > +                    tnl_port = tnl_find_exact(&match, map);
> > > +                    if (tnl_port) {
> > > +                        return tnl_port;
> > > +                    }
> > > +
> > > +                    /* Then check for a packet type aware port. */
> > > +                    match.pt_mode = NETDEV_PT_AWARE;
> > >                      tnl_port = tnl_find_exact(&match, map);
> > >                      if (tnl_port) {
> > >                          return tnl_port;
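
To make the lookup order in the tnl_find() hunk above concrete, here is a minimal standalone sketch. It is not the OVS code: the types and names (pt_mode, pt_namespace, find_exact, find_port) are stand-ins invented for illustration, and a linear scan replaces the real hash-map lookup. It only shows the idea that a legacy port matching the packet's namespace wins, and a packet-type-aware port is tried second.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the OVS definitions; values are illustrative. */
enum pt_mode { PT_LEGACY_L2, PT_LEGACY_L3, PT_AWARE };
enum pt_namespace { NS_ONF = 0, NS_ETHERTYPE = 1 };

struct port {
    const char *name;
    uint64_t key;
    enum pt_mode mode;
};

/* Returns the port whose key and mode both match, or NULL if none does. */
static const struct port *
find_exact(const struct port *ports, size_t n, uint64_t key,
           enum pt_mode mode)
{
    for (size_t i = 0; i < n; i++) {
        if (ports[i].key == key && ports[i].mode == mode) {
            return &ports[i];
        }
    }
    return NULL;
}

/* Two-pass lookup: first a legacy port chosen by the packet's namespace
 * (EtherType namespace means a non-Ethernet packet, so legacy L3; anything
 * else means legacy L2), then a packet-type-aware port as the fallback. */
static const struct port *
find_port(const struct port *ports, size_t n, uint64_t key,
          enum pt_namespace ns)
{
    enum pt_mode legacy = ns == NS_ETHERTYPE ? PT_LEGACY_L3 : PT_LEGACY_L2;
    const struct port *p = find_exact(ports, n, key, legacy);

    return p ? p : find_exact(ports, n, key, PT_AWARE);
}
```

With one legacy_l3 port and one ptap port sharing the same key, an L3 packet would land on the legacy_l3 port and an Ethernet packet on the ptap port, since no legacy_l2 port exists for that key.
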
> > > @@ -614,11 +626,12 @@ tnl_match_fmt(const struct tnl_match *match, struct ds *ds)
> > >      } else {
> > >          ds_put_format(ds, ", key=%#"PRIx64, ntohll(match->in_key));
> > >      }
> > > -    if (match->is_layer3) {
> > > -        ds_put_cstr(ds, ", layer3");
> > > -    }
> > >
> > > -    ds_put_format(ds, ", dp port=%"PRIu32, match->odp_port);
> > > +    const char *pt_mode
> > > +        = (match->pt_mode == NETDEV_PT_LEGACY_L2 ? "legacy_l2"
> > > +           : match->pt_mode == NETDEV_PT_LEGACY_L3 ? "legacy_l3"
> > > +           : "ptap");
> > > +    ds_put_format(ds, ", %s, dp port=%"PRIu32, pt_mode, match->odp_port);
> > >  }
> > >
> > >  static void
> > > diff --git a/tests/tunnel-push-pop-ipv6.at b/tests/tunnel-push-pop-ipv6.at
> > > index 228a9af43573..9ff7c897c0b7 100644
> > > --- a/tests/tunnel-push-pop-ipv6.at
> > > +++ b/tests/tunnel-push-pop-ipv6.at
> > > @@ -13,7 +13,7 @@ AT_CHECK([ovs-vsctl add-port int-br t2 -- set Interface t2 type=vxlan \
> > >                      -- add-port int-br t4 -- set Interface t4 type=geneve \
> > >                         options:remote_ip=flow options:key=123 ofport_request=5\
> > >                      -- add-port int-br t5 -- set Interface t5 type=gre \
> > > -                       options:remote_ip=2001:cafe::92 options:key=455 options:layer3=true ofport_request=6\
> > > +                       options:remote_ip=2001:cafe::92 options:key=455 options:packet_type=legacy_l3
> > > ofport_request=6\
> > >                         ], [0])
> > >
> > >  AT_CHECK([ovs-appctl dpif/show], [0], [dnl
> > > > @@ -27,7 +27,7 @@ dummy@ovs-dummy: hit:0 missed:0
> > >  		t2 2/4789: (vxlan: key=123, remote_ip=2001:cafe::92)
> > >  		t3 4/4789: (vxlan: csum=true, out_key=flow, remote_ip=2001:cafe::93)
> > >  		t4 5/6081: (geneve: key=123, remote_ip=flow)
> > > -		t5 6/3: (gre: key=455, layer3=true, remote_ip=2001:cafe::92)
> > > +		t5 6/3: (gre: key=455, packet_type=legacy_l3, remote_ip=2001:cafe::92)
> > >  ])
> > >
> > >  dnl First setup dummy interface IP address, then add the route
> > > diff --git a/tests/tunnel-push-pop.at b/tests/tunnel-push-pop.at
> > > index 5a2c423839db..c376e719e2ff 100644
> > > --- a/tests/tunnel-push-pop.at
> > > +++ b/tests/tunnel-push-pop.at
> > > @@ -15,7 +15,7 @@ AT_CHECK([ovs-vsctl add-port int-br t2 -- set Interface t2 type=vxlan \
> > >                      -- add-port int-br t5 -- set Interface t5 type=geneve \
> > >                         options:remote_ip=1.1.2.93 options:out_key=flow options:egress_pkt_mark=1234
> > > ofport_request=6\
> > >                      -- add-port int-br t6 -- set Interface t6 type=gre \
> > > -                       options:remote_ip=1.1.2.92 options:key=456 options:layer3=true ofport_request=7\
> > > +                       options:remote_ip=1.1.2.92 options:key=456 options:packet_type=legacy_l3
> ofport_request=7\
> > >                      -- add-port int-br t7 -- set Interface t7 type=vxlan \
> > >                         options:remote_ip=1.1.2.92 options:key=345 options:exts=gpe ofport_request=8\
> > >                         ], [0])
> > > > @@ -32,7 +32,7 @@ dummy@ovs-dummy: hit:0 missed:0
> > >  		t3 4/4789: (vxlan: csum=true, out_key=flow, remote_ip=1.1.2.93)
> > >  		t4 5/6081: (geneve: key=123, remote_ip=flow)
> > >  		t5 6/6081: (geneve: egress_pkt_mark=1234, out_key=flow, remote_ip=1.1.2.93)
> > > -		t6 7/3: (gre: key=456, layer3=true, remote_ip=1.1.2.92)
> > > +		t6 7/3: (gre: key=456, packet_type=legacy_l3, remote_ip=1.1.2.92)
> > >  		t7 8/4789: (vxlan: key=345, remote_ip=1.1.2.92)
> > >  ])
> > >
> > > diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> > > index 9bb828faa8eb..de68f89886a5 100644
> > > --- a/vswitchd/vswitch.xml
> > > +++ b/vswitchd/vswitch.xml
> > > @@ -2397,6 +2397,31 @@
> > >          including tunnel monitoring.
> > >        </column>
> > >
> > > +      <group title="Tunnel Options: lisp only">
> > > +        <column name="options" key="packet_type"
> > > +                type='{"type": "string", "enum": ["set",
> > > +                      ["legacy_l3", "ptap"]]}'>
> > > +          <p>
> > > +            A LISP tunnel sends and receives only IPv4 and IPv6 packets.  This
> > > > +            option controls how the tunnel represents the packets that it
> > > +            sends and receives:
> > > +          </p>
> > > +
> > > +          <ul>
> > > +            <li>
> > > +              By default, or if this option is <code>legacy_l3</code>, the
> > > +              tunnel represents packets as Ethernet frames for compatibility
> > > +              with legacy OpenFlow controllers that expect this behavior.
> > > +            </li>
> > > +            <li>
> > > +              If this option is <code>ptap</code>, the tunnel represents
> > > +              packets using the <code>packet_type</code> mechanism introduced
> > > +              in OpenFlow 1.5.
> > > +            </li>
> > > +          </ul>
> > > +        </column>
> > > +      </group>
> > > +
> > >        <group title="Tunnel Options: vxlan only">
> > >
> > >          <column name="options" key="exts">
> > > @@ -2416,21 +2441,42 @@
> > >                <code>gpe</code>: Support for Generic Protocol Encapsulation in
> > >                accordance with IETF draft
> > >                <code>https://tools.ietf.org/html/draft-ietf-nvo3-vxlan-gpe</code>.
> > > +              Without this option, a VXLAN packet always encapsulates an
> > > > +              Ethernet frame.  With this option, a VXLAN packet may also
> > > +              encapsulate an IPv4, IPv6, NSH, or MPLS packet.
> > >              </li>
> > >            </ul>
> > >          </column>
> > >
> > > -        <column name="options" key="layer3" type='{"type": "boolean"}'>
> > > +        <column name="options" key="packet_type"
> > > +                type='{"type": "string", "enum": ["set",
> > > +                      ["legacy_l2", "legacy_l3", "ptap"]]}'>
> > >            <p>
> > > -            By default, or if set to false, the tunnel carries L2 packets (with
> > > -            an Ethernet header).  If set to true, the tunnel carries L3 packets
> > > -            (without an Ethernet header present).
> > > +            This option controls what types of packets the tunnel sends and
> > > +            receives and how it represents them:
> > >            </p>
> > >
> > > -          <p>
> > > -            To set this option to true, the <code>gpe</code> extension must
> > > -            also be enabled in <ref column="options" key="exts"/>.
> > > -          </p>
> > > +          <ul>
> > > +            <li>
> > > +              By default, or if this option is <code>legacy_l2</code>, the
> > > +              tunnel sends and receives only Ethernet frames.
> > > +            </li>
> > > +            <li>
> > > +              If this option is <code>legacy_l3</code>, the tunnel sends and
> > > > +              receives only non-Ethernet (L3) packets, but the packets are
> > > +              represented as Ethernet frames for compatibility with legacy
> > > +              OpenFlow controllers that expect this behavior.  This requires
> > > +              enabling <code>gpe</code> in <ref column="options" key="exts"/>.
> > > +            </li>
> > > +            <li>
> > > +              If this option is <code>ptap</code>, Open vSwitch represents
> > > +              packets in the tunnel using the <code>packet_type</code>
> > > +              mechanism introduced in OpenFlow 1.5.  This mechanism supports
> > > +              any kind of packet, but actually sending and receiving
> > > +              non-Ethernet packets requires additionally enabling
> > > +              <code>gpe</code> in <ref column="options" key="exts"/>.
> > > +            </li>
> > > +          </ul>
> > >          </column>
> > >        </group>
> > >
> > > @@ -2439,18 +2485,32 @@
> > >            <code>gre</code> interfaces support these options.
> > >          </p>
> > >
> > > -        <column name="options" key="layer3" type='{"type": "boolean"}'>
> > > +        <column name="options" key="packet_type"
> > > +                type='{"type": "string", "enum": ["set",
> > > +                      ["legacy_l2", "legacy_l3", "ptap"]]}'>
> > >            <p>
> > > -            By default, or if set to false, the tunnel carries L2 packets (with
> > > -            an Ethernet header).  If set to true, the tunnel carries L3 packets
> > > -            (without an Ethernet header present).
> > > +            This option controls what types of packets the tunnel sends and
> > > +            receives and how it represents them:
> > >            </p>
> > >
> > > -          <p>
> > > -            A single GRE tunnel cannot carry both L2 and L3 packets, but the
> > > -            same effect can be realized by creating two tunnels with different
> > > -            <code>layer3</code> settings and otherwise the same configuration.
> > > -          </p>
> > > +          <ul>
> > > +            <li>
> > > +              By default, or if this option is <code>legacy_l2</code>, the
> > > +              tunnel sends and receives only Ethernet frames.
> > > +            </li>
> > > +            <li>
> > > +              If this option is <code>legacy_l3</code>, the tunnel sends and
> > > > +              receives only non-Ethernet (L3) packets, but the packets are
> > > +              represented as Ethernet frames for compatibility with legacy
> > > +              OpenFlow controllers that expect this behavior.
> > > +            </li>
> > > +            <li>
> > > +              If this option is <code>ptap</code>, the tunnel sends and
> > > +              receives any kind of packet.  Open vSwitch represents packets in
> > > +              the tunnel using the <code>packet_type</code> mechanism
> > > +              introduced in OpenFlow 1.5.
> > > +            </li>
> > > +          </ul>
> > >          </column>
> > >        </group>
> > >
> > > --
> > > 2.10.2
> > >