[ovs-dev] [PATCH v6] Bareudp Tunnel Support
Martin Varghese
martinvarghesenokia at gmail.com
Mon Jun 29 13:31:47 UTC 2020
From: Martin Varghese <martin.varghese at nokia.com>
There are various L3 encapsulation standards using UDP being discussed to
leverage the UDP based load balancing capability of different networks.
MPLSoUDP (__ https://tools.ietf.org/html/rfc7510) is one among them.
The Bareudp tunnel provides a generic L3 encapsulation support for
tunnelling different L3 protocols like MPLS, IP, NSH etc. inside a UDP
tunnel.
An example to create bareudp device to tunnel MPLS traffic is
given
$ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \
type=bareudp options:remote_ip=2.1.1.3
options:local_ip=2.1.1.2 \
options:payload_type=0x8847 options:dst_port=6635 \
options:packet_type="legacy_l3" \
ofport_request=$bareudp_egress_port
The bareudp device supports special handling for MPLS & IP as
they can have multiple ethertypes. MPLS procotcol can have ethertypes
ETH_P_MPLS_UC (unicast) & ETH_P_MPLS_MC (multicast). IP protocol can have
ethertypes ETH_P_IP (v4) & ETH_P_IPV6 (v6).
The bareudp device to tunnel L3 traffic with multiple ethertypes
(MPLS & IP) can be created by passing the L3 protocol name as string in
the field payload_type. An example to create bareudp device to tunnel
MPLS unicast & multicast traffic is given below.
$ ovs-vsctl add-port br_mpls udp_port -- set interface
udp_port \
type=bareudp options:remote_ip=2.1.1.3
options:local_ip=2.1.1.2 \
options:payload_type=mpls options:dst_port=6635 \
options:packet_type="legacy_l3"
Signed-off-by: Martin Varghese <martin.varghese at nokia.com>
---
Changes in v2:
- Removed vport-bareudp module.
Changes in v3:
- Added net-next upstream commit id and message to commit message.
Changes in v4:
- Removed kernel datapath changes.
Changes in v5:
- Fixed release notes errors.
- Fixed coding errors in dpif-nelink-rtnl.c.
Changes in v6:
- Added code to enable rx metadata collection in the kernel device.
- Added version history.
Documentation/automake.mk | 1 +
Documentation/faq/bareudp.rst | 62 +++++++++++++++++++++++
Documentation/faq/index.rst | 1 +
Documentation/faq/releases.rst | 1 +
NEWS | 4 ++
datapath/linux/compat/include/linux/openvswitch.h | 10 ++++
lib/dpif-netlink-rtnl.c | 55 ++++++++++++++++++++
lib/dpif-netlink.c | 5 ++
lib/netdev-vport.c | 27 +++++++++-
lib/netdev.h | 1 +
ofproto/ofproto-dpif-xlate.c | 1 +
tests/system-layer3-tunnels.at | 47 +++++++++++++++++
12 files changed, 213 insertions(+), 2 deletions(-)
create mode 100644 Documentation/faq/bareudp.rst
diff --git a/Documentation/automake.mk b/Documentation/automake.mk
index f85c432..ea3475f 100644
--- a/Documentation/automake.mk
+++ b/Documentation/automake.mk
@@ -88,6 +88,7 @@ DOC_SOURCE = \
Documentation/faq/terminology.rst \
Documentation/faq/vlan.rst \
Documentation/faq/vxlan.rst \
+ Documentation/faq/bareudp.rst \
Documentation/internals/index.rst \
Documentation/internals/authors.rst \
Documentation/internals/bugs.rst \
diff --git a/Documentation/faq/bareudp.rst b/Documentation/faq/bareudp.rst
new file mode 100644
index 0000000..9266daa
--- /dev/null
+++ b/Documentation/faq/bareudp.rst
@@ -0,0 +1,62 @@
+..
+ Licensed under the Apache License, Version 2.0 (the "License"); you may
+ not use this file except in compliance with the License. You may obtain
+ a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ License for the specific language governing permissions and limitations
+ under the License.
+
+ Convention for heading levels in Open vSwitch documentation:
+
+ ======= Heading 0 (reserved for the title in a document)
+ ------- Heading 1
+ ~~~~~~~ Heading 2
+ +++++++ Heading 3
+ ''''''' Heading 4
+
+ Avoid deeper levels because they do not render well.
+
+=======
+Bareudp
+=======
+
+Q: What is Bareudp?
+
+ A: There are various L3 encapsulation standards using UDP being discussed
+ to leverage the UDP based load balancing capability of different
+ networks. MPLSoUDP (__ https://tools.ietf.org/html/rfc7510) is one among
+ them.
+
+ The Bareudp tunnel provides a generic L3 encapsulation support for
+ tunnelling different L3 protocols like MPLS, IP, NSH etc. inside a UDP
+ tunnel.
+
+ An example to create bareudp device to tunnel MPLS traffic is given
+ below.::
+
+ $ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \
+ type=bareudp options:remote_ip=2.1.1.3 options:local_ip=2.1.1.2 \
+ options:payload_type=0x8847 options:dst_port=6635 \
+ options:packet_type="legacy_l3" \
+ ofport_request=$bareudp_egress_port
+
+ The bareudp device supports special handling for MPLS & IP as they can
+ have multiple ethertypes.
+ MPLS procotcol can have ethertypes ETH_P_MPLS_UC (unicast) &
+ ETH_P_MPLS_MC (multicast). IP protocol can have ethertypes ETH_P_IP (v4)
+ & ETH_P_IPV6 (v6).
+
+ The bareudp device to tunnel L3 traffic with multiple ethertypes
+ (MPLS & IP) can be created by passing the L3 protocol name as string in
+ the field payload_type. An example to create bareudp device to tunnel
+ MPLS unicast & multicast traffic is given below.::
+
+ $ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \
+ type=bareudp options:remote_ip=2.1.1.3 options:local_ip=2.1.1.2 \
+ options:payload_type=mpls options:dst_port=6635 \
+ options:packet_type="legacy_l3"
diff --git a/Documentation/faq/index.rst b/Documentation/faq/index.rst
index 334b828..1dd2998 100644
--- a/Documentation/faq/index.rst
+++ b/Documentation/faq/index.rst
@@ -30,6 +30,7 @@ Open vSwitch FAQ
.. toctree::
:maxdepth: 2
+ bareudp
configuration
contributing
design
diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst
index e5cef39..9915839 100644
--- a/Documentation/faq/releases.rst
+++ b/Documentation/faq/releases.rst
@@ -136,6 +136,7 @@ Q: Are all features available with all datapaths?
Tunnel - ERSPAN 4.18 2.10 2.10 NO
Tunnel - ERSPAN-IPv6 4.18 2.10 2.10 NO
Tunnel - GTP-U NO NO 2.14 NO
+ Tunnel - Bareudp 5.7 NO 2.14 NO
QoS - Policing YES 1.1 2.6 NO
QoS - Shaping YES 1.1 NO NO
sFlow YES 1.0 1.0 NO
diff --git a/NEWS b/NEWS
index 0116b3e..f5aa840 100644
--- a/NEWS
+++ b/NEWS
@@ -23,6 +23,10 @@ Post-v2.13.0
- Tunnels: TC Flower offload
* Tunnel Local endpoint address masked match are supported.
* Tunnel Romte endpoint address masked match are supported.
+ - Bareudp Tunnel
+ * Bareudp device support is present in linux kernel from version 5.7
+ * Kernel bareudp device is not backported to ovs tree.
+ * Userspace datapath support is not added
v2.13.0 - 14 Feb 2020
diff --git a/datapath/linux/compat/include/linux/openvswitch.h b/datapath/linux/compat/include/linux/openvswitch.h
index cc41bbe..3073faa 100644
--- a/datapath/linux/compat/include/linux/openvswitch.h
+++ b/datapath/linux/compat/include/linux/openvswitch.h
@@ -240,6 +240,7 @@ enum ovs_vport_type {
OVS_VPORT_TYPE_GRE, /* GRE tunnel. */
OVS_VPORT_TYPE_VXLAN, /* VXLAN tunnel. */
OVS_VPORT_TYPE_GENEVE, /* Geneve tunnel. */
+ OVS_VPORT_TYPE_BAREUDP, /* Bareudp tunnel. */
OVS_VPORT_TYPE_LISP = 105, /* LISP tunnel */
OVS_VPORT_TYPE_STT = 106, /* STT tunnel */
OVS_VPORT_TYPE_ERSPAN = 107, /* ERSPAN tunnel. */
@@ -308,6 +309,15 @@ enum {
#define OVS_VXLAN_EXT_MAX (__OVS_VXLAN_EXT_MAX - 1)
+enum {
+ OVS_BAREUDP_EXT_UNSPEC,
+ OVS_BAREUDP_EXT_MULTIPROTO_MODE,
+ /* place new values here to fill gap. */
+ __OVS_BAREUDP_EXT_MAX,
+};
+
+#define OVS_BAREUDP_EXT_MAX (__OVS_BAREUDP_EXT_MAX - 1)
+
/* OVS_VPORT_ATTR_OPTIONS attributes for tunnels.
*/
enum {
diff --git a/lib/dpif-netlink-rtnl.c b/lib/dpif-netlink-rtnl.c
index fd157ce..3f6842a 100644
--- a/lib/dpif-netlink-rtnl.c
+++ b/lib/dpif-netlink-rtnl.c
@@ -58,6 +58,19 @@ VLOG_DEFINE_THIS_MODULE(dpif_netlink_rtnl);
#define IFLA_GENEVE_UDP_ZERO_CSUM6_RX 10
#endif
+#ifndef __IFLA_BAREUDP_MAX
+#define IFLA_BAREUDP_MAX 0
+#endif
+#if IFLA_BAREUDP_MAX < 4
+#define IFLA_BAREUDP_PORT 1
+#define IFLA_BAREUDP_ETHERTYPE 2
+#define IFLA_BAREUDP_SRCPORT_MIN 3
+#define IFLA_BAREUDP_MULTIPROTO_MODE 4
+#define IFLA_BAREUDP_RX_COLLECT_METADATA 5
+#endif
+
+#define BAREUDP_MPLS_SRCPORT_MIN 49153
+
static const struct nl_policy rtlink_policy[] = {
[IFLA_LINKINFO] = { .type = NL_A_NESTED },
};
@@ -81,6 +94,10 @@ static const struct nl_policy geneve_policy[] = {
[IFLA_GENEVE_UDP_ZERO_CSUM6_RX] = { .type = NL_A_U8 },
[IFLA_GENEVE_PORT] = { .type = NL_A_U16 },
};
+static const struct nl_policy bareudp_policy[] = {
+ [IFLA_BAREUDP_PORT] = { .type = NL_A_U16 },
+ [IFLA_BAREUDP_ETHERTYPE] = { .type = NL_A_U16 },
+};
static const char *
vport_type_to_kind(enum ovs_vport_type type,
@@ -113,6 +130,8 @@ vport_type_to_kind(enum ovs_vport_type type,
}
case OVS_VPORT_TYPE_GTPU:
return NULL;
+ case OVS_VPORT_TYPE_BAREUDP:
+ return "bareudp";
case OVS_VPORT_TYPE_NETDEV:
case OVS_VPORT_TYPE_INTERNAL:
case OVS_VPORT_TYPE_LISP:
@@ -243,6 +262,24 @@ dpif_netlink_rtnl_geneve_verify(const struct netdev_tunnel_config *tnl_cfg,
return err;
}
+static int
+dpif_netlink_rtnl_bareudp_verify(const struct netdev_tunnel_config *tnl_cfg,
+ const char *kind, struct ofpbuf *reply)
+{
+ struct nlattr *bareudp[ARRAY_SIZE(bareudp_policy)];
+ int err;
+
+ err = rtnl_policy_parse(kind, reply, bareudp_policy, bareudp,
+ ARRAY_SIZE(bareudp_policy));
+ if (!err) {
+ if ((tnl_cfg->dst_port != nl_attr_get_be16(bareudp[IFLA_BAREUDP_PORT]))
+ || (tnl_cfg->payload_ethertype
+ != nl_attr_get_be16(bareudp[IFLA_BAREUDP_ETHERTYPE]))) {
+ err = EINVAL;
+ }
+ }
+ return err;
+}
static int
dpif_netlink_rtnl_verify(const struct netdev_tunnel_config *tnl_cfg,
@@ -275,6 +312,9 @@ dpif_netlink_rtnl_verify(const struct netdev_tunnel_config *tnl_cfg,
case OVS_VPORT_TYPE_GENEVE:
err = dpif_netlink_rtnl_geneve_verify(tnl_cfg, kind, reply);
break;
+ case OVS_VPORT_TYPE_BAREUDP:
+ err = dpif_netlink_rtnl_bareudp_verify(tnl_cfg, kind, reply);
+ break;
case OVS_VPORT_TYPE_NETDEV:
case OVS_VPORT_TYPE_INTERNAL:
case OVS_VPORT_TYPE_LISP:
@@ -357,6 +397,20 @@ dpif_netlink_rtnl_create(const struct netdev_tunnel_config *tnl_cfg,
nl_msg_put_u8(&request, IFLA_GENEVE_UDP_ZERO_CSUM6_RX, 1);
nl_msg_put_be16(&request, IFLA_GENEVE_PORT, tnl_cfg->dst_port);
break;
+ case OVS_VPORT_TYPE_BAREUDP:
+ nl_msg_put_be16(&request, IFLA_BAREUDP_ETHERTYPE,
+ tnl_cfg->payload_ethertype);
+ if ((tnl_cfg->payload_ethertype == htons(ETH_TYPE_MPLS)) ||
+ (tnl_cfg->payload_ethertype == htons(ETH_TYPE_MPLS_MCAST))) {
+ nl_msg_put_u16(&request, IFLA_BAREUDP_SRCPORT_MIN,
+ BAREUDP_MPLS_SRCPORT_MIN);
+ }
+ nl_msg_put_be16(&request, IFLA_BAREUDP_PORT, tnl_cfg->dst_port);
+ if (tnl_cfg->exts & (1 << OVS_BAREUDP_EXT_MULTIPROTO_MODE)) {
+ nl_msg_put_flag(&request, IFLA_BAREUDP_MULTIPROTO_MODE);
+ }
+ nl_msg_put_flag(&request, IFLA_BAREUDP_RX_COLLECT_METADATA);
+ break;
case OVS_VPORT_TYPE_NETDEV:
case OVS_VPORT_TYPE_INTERNAL:
case OVS_VPORT_TYPE_LISP:
@@ -470,6 +524,7 @@ dpif_netlink_rtnl_port_destroy(const char *name, const char *type)
case OVS_VPORT_TYPE_ERSPAN:
case OVS_VPORT_TYPE_IP6ERSPAN:
case OVS_VPORT_TYPE_IP6GRE:
+ case OVS_VPORT_TYPE_BAREUDP:
return dpif_netlink_rtnl_destroy(name);
case OVS_VPORT_TYPE_NETDEV:
case OVS_VPORT_TYPE_INTERNAL:
diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
index 18322e8..2ad0e64 100644
--- a/lib/dpif-netlink.c
+++ b/lib/dpif-netlink.c
@@ -749,6 +749,9 @@ get_vport_type(const struct dpif_netlink_vport *vport)
case OVS_VPORT_TYPE_GTPU:
return "gtpu";
+ case OVS_VPORT_TYPE_BAREUDP:
+ return "bareudp";
+
case OVS_VPORT_TYPE_UNSPEC:
case __OVS_VPORT_TYPE_MAX:
break;
@@ -784,6 +787,8 @@ netdev_to_ovs_vport_type(const char *type)
return OVS_VPORT_TYPE_GRE;
} else if (!strcmp(type, "gtpu")) {
return OVS_VPORT_TYPE_GTPU;
+ } else if (!strcmp(type, "bareudp")) {
+ return OVS_VPORT_TYPE_BAREUDP;
} else {
return OVS_VPORT_TYPE_UNSPEC;
}
diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c
index 0252b61..c86d420 100644
--- a/lib/netdev-vport.c
+++ b/lib/netdev-vport.c
@@ -112,7 +112,7 @@ netdev_vport_needs_dst_port(const struct netdev *dev)
return (class->get_config == get_tunnel_config &&
(!strcmp("geneve", type) || !strcmp("vxlan", type) ||
!strcmp("lisp", type) || !strcmp("stt", type) ||
- !strcmp("gtpu", type)));
+ !strcmp("gtpu", type) || !strcmp("bareudp",type)));
}
const char *
@@ -219,6 +219,8 @@ netdev_vport_construct(struct netdev *netdev_)
dev->tnl_cfg.dst_port = port ? htons(port) : htons(STT_DST_PORT);
} else if (!strcmp(type, "gtpu")) {
dev->tnl_cfg.dst_port = port ? htons(port) : htons(GTPU_DST_PORT);
+ } else if (!strcmp(type, "bareudp")) {
+ dev->tnl_cfg.dst_port = htons(port);
}
dev->tnl_cfg.dont_fragment = true;
@@ -438,6 +440,8 @@ tunnel_supported_layers(const char *type,
return TNL_L2 | TNL_L3;
} else if (!strcmp(type, "gtpu")) {
return TNL_L3;
+ } else if (!strcmp(type, "bareudp")) {
+ return TNL_L3;
} else {
return TNL_L2;
}
@@ -745,6 +749,16 @@ set_tunnel_config(struct netdev *dev_, const struct smap *args, char **errp)
goto out;
}
}
+ } else if (!strcmp(node->key, "payload_type")) {
+ if (strcmp(node->key, "mpls")) {
+ tnl_cfg.payload_ethertype = htons(ETH_TYPE_MPLS);
+ tnl_cfg.exts |= (1 << OVS_BAREUDP_EXT_MULTIPROTO_MODE);
+ } else if ((strcmp(node->key, "ip"))) {
+ tnl_cfg.payload_ethertype = htons(ETH_TYPE_IP);
+ tnl_cfg.exts |= (1 << OVS_BAREUDP_EXT_MULTIPROTO_MODE);
+ } else {
+ tnl_cfg.payload_ethertype = htons(atoi(node->value));
+ }
} else {
ds_put_format(&errors, "%s: unknown %s argument '%s'\n", name,
type, node->key);
@@ -917,7 +931,8 @@ get_tunnel_config(const struct netdev *dev, struct smap *args)
(!strcmp("vxlan", type) && dst_port != VXLAN_DST_PORT) ||
(!strcmp("lisp", type) && dst_port != LISP_DST_PORT) ||
(!strcmp("stt", type) && dst_port != STT_DST_PORT) ||
- (!strcmp("gtpu", type) && dst_port != GTPU_DST_PORT)) {
+ (!strcmp("gtpu", type) && dst_port != GTPU_DST_PORT) ||
+ !strcmp("bareudp", type)) {
smap_add_format(args, "dst_port", "%d", dst_port);
}
}
@@ -1243,6 +1258,14 @@ netdev_vport_tunnel_register(void)
},
{{NULL, NULL, 0, 0}}
},
+ { "udp_sys",
+ {
+ TUNNEL_FUNCTIONS_COMMON,
+ .type = "bareudp",
+ .get_ifindex = NETDEV_VPORT_GET_IFINDEX,
+ },
+ {{NULL, NULL, 0, 0}}
+ },
};
static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
diff --git a/lib/netdev.h b/lib/netdev.h
index fdbe0e1..f15bca5 100644
--- a/lib/netdev.h
+++ b/lib/netdev.h
@@ -107,6 +107,7 @@ struct netdev_tunnel_config {
bool out_key_flow;
ovs_be64 out_key;
+ ovs_be16 payload_ethertype;
ovs_be16 dst_port;
bool ip_src_flow;
diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index e0ede2c..6e07960 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -3573,6 +3573,7 @@ propagate_tunnel_data_to_flow(struct xlate_ctx *ctx, struct eth_addr dmac,
case OVS_VPORT_TYPE_VXLAN:
case OVS_VPORT_TYPE_GENEVE:
case OVS_VPORT_TYPE_GTPU:
+ case OVS_VPORT_TYPE_BAREUDP:
nw_proto = IPPROTO_UDP;
break;
case OVS_VPORT_TYPE_LISP:
diff --git a/tests/system-layer3-tunnels.at b/tests/system-layer3-tunnels.at
index 1232964..5d9ea93 100644
--- a/tests/system-layer3-tunnels.at
+++ b/tests/system-layer3-tunnels.at
@@ -152,3 +152,50 @@ AT_CHECK([tail -1 stdout], [0],
OVS_VSWITCHD_STOP
AT_CLEANUP
+
+AT_SETUP([layer3 - ping over MPLS Bareudp])
+OVS_TRAFFIC_VSWITCHD_START([_ADD_BR([br1])])
+ADD_NAMESPACES(at_ns0, at_ns1)
+
+ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24", "36:b1:ee:7c:01:01")
+ADD_VETH(p1, at_ns1, br1, "10.1.1.2/24", "36:b1:ee:7c:01:02")
+
+ADD_OVS_TUNNEL([bareudp], [br0], [at_bareudp0], [8.1.1.3], [8.1.1.2/24],
+ [ options:local_ip=8.1.1.2 options:packet_type="legacy_l3" options:payload_type=mpls options:dst_port=6635])
+
+ADD_OVS_TUNNEL([bareudp], [br1], [at_bareudp1], [8.1.1.2], [8.1.1.3/24],
+ [options:local_ip=8.1.1.3 options:packet_type="legacy_l3" options:payload_type=mpls options:dst_port=6635])
+
+AT_DATA([flows0.txt], [dnl
+table=0,priority=100,dl_type=0x0800 actions=push_mpls:0x8847,set_mpls_label:3,output:at_bareudp0
+table=0,priority=100,dl_type=0x8847 in_port=at_bareudp0 actions=pop_mpls:0x0800,set_field:36:b1:ee:7c:01:01->dl_dst,set_field:36:b1:ee:7c:01:02->dl_src,output:ovs-p0
+table=0,priority=10 actions=normal
+])
+
+AT_DATA([flows1.txt], [dnl
+table=0,priority=100,dl_type=0x0800 actions=push_mpls:0x8847,set_mpls_label:3,output:at_bareudp1
+table=0,priority=100,dl_type=0x8847 in_port=at_bareudp1 actions=pop_mpls:0x0800,set_field:36:b1:ee:7c:01:02->dl_dst,set_field:36:b1:ee:7c:01:01->dl_src,output:ovs-p1
+table=0,priority=10 actions=normal
+])
+
+AT_CHECK([ip link add patch0 type veth peer name patch1])
+on_exit 'ip link del patch0'
+
+AT_CHECK([ip link set dev patch0 up])
+AT_CHECK([ip link set dev patch1 up])
+AT_CHECK([ovs-vsctl add-port br0 patch0])
+AT_CHECK([ovs-vsctl add-port br1 patch1])
+
+
+AT_CHECK([ovs-ofctl -O OpenFlow13 add-flows br0 flows0.txt])
+AT_CHECK([ovs-ofctl -O OpenFlow13 add-flows br1 flows1.txt])
+
+NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.2 | FORMAT_PING], [0], [dnl
+3 packets transmitted, 3 received, 0% packet loss, time 0ms
+])
+
+NS_CHECK_EXEC([at_ns1], [ping -q -c 3 -i 0.3 -w 2 10.1.1.1 | FORMAT_PING], [0], [dnl
+3 packets transmitted, 3 received, 0% packet loss, time 0ms
+])
+OVS_TRAFFIC_VSWITCHD_STOP
+AT_CLEANUP
--
1.8.3.1
More information about the dev
mailing list