[ovs-dev] [PATCH v4] Bareudp Tunnel Support
Gregory Rose
gvrose8192 at gmail.com
Thu Jun 4 16:21:25 UTC 2020
On 6/4/2020 5:57 AM, Martin Varghese wrote:
> On Tue, Jun 02, 2020 at 11:59:37AM -0700, Gregory Rose wrote:
>>
>> On 5/25/2020 8:31 PM, Martin Varghese wrote:
>>> From: Martin Varghese <martin.varghese at nokia.com>
>>>
>>> There are various L3 encapsulation standards using UDP being discussed to
>>> leverage the UDP based load balancing capability of different networks.
>>> MPLSoUDP (__ https://tools.ietf.org/html/rfc7510) is one among them.
>>>
>>> The Bareudp tunnel provides a generic L3 encapsulation tunnelling support
>>> for tunnelling different L3 protocols like MPLS, IP, NSH etc. inside a UDP
>>> tunnel.
>>>
>>> An example to create bareudp device to tunnel MPLS traffic is
>>> given
>>>
>>> $ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \
>>> type=bareudp options:remote_ip=2.1.1.3
>>> options:local_ip=2.1.1.2 \
>>> options:payload_type=0x8847 options:dst_port=6635 \
>>> options:packet_type="legacy_l3" \
>>> ofport_request=$bareudp_egress_port
>>>
>>> The bareudp device supports special handling for MPLS & IP as
>>> they can have multiple ethertypes. MPLS procotcol can have ethertypes
>>> ETH_P_MPLS_UC (unicast) & ETH_P_MPLS_MC (multicast). IP protocol can have
>>> ethertypes ETH_P_IP (v4) & ETH_P_IPV6 (v6).
>>>
>>> The bareudp device to tunnel L3 traffic with multiple ethertypes
>>> (MPLS & IP) can be created by passing the L3 protocol name as string in
>>> the field payload_type. An example to create bareudp device to tunnel
>>> MPLS unicast & multicast traffic is given below.::
>>>
>>> $ ovs-vsctl add-port br_mpls udp_port -- set interface
>>> udp_port \
>>> type=bareudp options:remote_ip=2.1.1.3
>>> options:local_ip=2.1.1.2 \
>>> options:payload_type=mpls options:dst_port=6635 \
>>> options:packet_type="legacy_l3"
>>>
>>> Signed-off-by: Martin Varghese <martin.varghese at nokia.com>
>>
>> Hi Martin,
>>
>> I built and installed a 5.7 kernel with bareudp configured and
>> the system traffic test worked fine. I did not see any regressions
>> in any other system traffic tests.
>>
>> The documentation and code looks pretty good but I do have a few
>> comments below.
>>
>> Thanks!
>>
>> - Greg
>>
>>> ---
>>> Documentation/automake.mk | 1 +
>>> Documentation/faq/bareudp.rst | 62 +++++++++++++++++++++++
>>> Documentation/faq/index.rst | 1 +
>>> Documentation/faq/releases.rst | 1 +
>>> NEWS | 3 ++
>>> datapath/linux/compat/include/linux/openvswitch.h | 10 ++++
>>> lib/dpif-netlink-rtnl.c | 53 +++++++++++++++++++
>>> lib/dpif-netlink.c | 7 ++-
>>> lib/netdev-vport.c | 24 ++++++++-
>>> lib/netdev.h | 1 +
>>> ofproto/ofproto-dpif-xlate.c | 1 +
>>> tests/system-layer3-tunnels.at | 47 +++++++++++++++++
>>> 12 files changed, 209 insertions(+), 2 deletions(-)
>>> create mode 100644 Documentation/faq/bareudp.rst
>>>
>>> diff --git a/Documentation/automake.mk b/Documentation/automake.mk
>>> index f85c432..ea3475f 100644
>>> --- a/Documentation/automake.mk
>>> +++ b/Documentation/automake.mk
>>> @@ -88,6 +88,7 @@ DOC_SOURCE = \
>>> Documentation/faq/terminology.rst \
>>> Documentation/faq/vlan.rst \
>>> Documentation/faq/vxlan.rst \
>>> + Documentation/faq/bareudp.rst \
>>> Documentation/internals/index.rst \
>>> Documentation/internals/authors.rst \
>>> Documentation/internals/bugs.rst \
>>> diff --git a/Documentation/faq/bareudp.rst b/Documentation/faq/bareudp.rst
>>> new file mode 100644
>>> index 0000000..257f20e
>>> --- /dev/null
>>> +++ b/Documentation/faq/bareudp.rst
>>> @@ -0,0 +1,62 @@
>>> +..
>>> + Licensed under the Apache License, Version 2.0 (the "License"); you may
>>> + not use this file except in compliance with the License. You may obtain
>>> + a copy of the License at
>>> +
>>> + http://www.apache.org/licenses/LICENSE-2.0
>>> +
>>> + Unless required by applicable law or agreed to in writing, software
>>> + distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
>>> + WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
>>> + License for the specific language governing permissions and limitations
>>> + under the License.
>>> +
>>> + Convention for heading levels in Open vSwitch documentation:
>>> +
>>> + ======= Heading 0 (reserved for the title in a document)
>>> + ------- Heading 1
>>> + ~~~~~~~ Heading 2
>>> + +++++++ Heading 3
>>> + ''''''' Heading 4
>>> +
>>> + Avoid deeper levels because they do not render well.
>>> +
>>> +=======
>>> +Bareudp
>>> +=======
>>> +
>>> +Q: What is Bareudp?
>>> +
>>> + A: There are various L3 encapsulation standards using UDP being discussed
>>> + to leverage the UDP based load balancing capability of different
>>> + networks. MPLSoUDP (__ https://tools.ietf.org/html/rfc7510) is one among
>>> + them.
>>> +
>>> + The Bareudp tunnel provides a generic L3 encapsulation tunnelling
>>> + support for tunnelling different L3 protocols like MPLS, IP, NSH etc.
>>> + inside a UDP tunnel.
>>> +
>>> + An example to create bareudp device to tunnel MPLS traffic is given
>>> + below.::
>>> +
>>> + $ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \
>>> + type=bareudp options:remote_ip=2.1.1.3 options:local_ip=2.1.1.2 \
>>> + options:payload_type=0x8847 options:dst_port=6635 \
>>> + options:packet_type="legacy_l3" \
>>> + ofport_request=$bareudp_egress_port
>>> +
>>> + The bareudp device supports special handling for MPLS & IP as they can
>>> + have multiple ethertypes.
>>> + MPLS procotcol can have ethertypes ETH_P_MPLS_UC (unicast) &
>>> + ETH_P_MPLS_MC (multicast). IP protocol can have ethertypes ETH_P_IP (v4)
>>> + & ETH_P_IPV6 (v6).
>>> +
>>> + The bareudp device to tunnel L3 traffic with multiple ethertypes
>>> + (MPLS & IP) can be created by passing the L3 protocol name as string in
>>> + the field payload_type. An example to create bareudp device to tunnel
>>> + MPLS unicast & multicast traffic is given below.::
>>> +
>>> + $ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \
>>> + type=bareudp options:remote_ip=2.1.1.3 options:local_ip=2.1.1.2 \
>>> + options:payload_type=mpls options:dst_port=6635 \
>>> + options:packet_type="legacy_l3"
>>> diff --git a/Documentation/faq/index.rst b/Documentation/faq/index.rst
>>> index 334b828..1dd2998 100644
>>> --- a/Documentation/faq/index.rst
>>> +++ b/Documentation/faq/index.rst
>>> @@ -30,6 +30,7 @@ Open vSwitch FAQ
>>> .. toctree::
>>> :maxdepth: 2
>>> + bareudp
>>> configuration
>>> contributing
>>> design
>>> diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst
>>> index e5cef39..93cb10a 100644
>>> --- a/Documentation/faq/releases.rst
>>> +++ b/Documentation/faq/releases.rst
>>> @@ -136,6 +136,7 @@ Q: Are all features available with all datapaths?
>>> Tunnel - ERSPAN 4.18 2.10 2.10 NO
>>> Tunnel - ERSPAN-IPv6 4.18 2.10 2.10 NO
>>> Tunnel - GTP-U NO NO 2.14 NO
>>> + Tunnel - Bareudp 5.6 NO 2.14 NO
>>
>> 5.7
>>
> Noted
>
>>> QoS - Policing YES 1.1 2.6 NO
>>> QoS - Shaping YES 1.1 NO NO
>>> sFlow YES 1.0 1.0 NO
>>> diff --git a/NEWS b/NEWS
>>> index 3dbd8ec..e75b537 100644
>>> --- a/NEWS
>>> +++ b/NEWS
>>> @@ -16,6 +16,9 @@ Post-v2.13.0
>>> by enabling interrupt mode.
>>> - Userspace datapath:
>>> * Add support for conntrack zone-based timeout policy.
>>> + - Bareudp Tunnel
>>> + * Userspace datapath support is not added.
>>> + * Kernel datapath support not backported to ovs tree.
>>
>> Maybe another line about kernel datapath is supported in the
>> Linux 5.7 kernel? I know you will update the releases doc
>> but I think it would help to mention it here as well.
>>
> I will update
>
>>> v2.13.0 - 14 Feb 2020
>>> diff --git a/datapath/linux/compat/include/linux/openvswitch.h b/datapath/linux/compat/include/linux/openvswitch.h
>>> index f7c3b2e..09f313a 100644
>>> --- a/datapath/linux/compat/include/linux/openvswitch.h
>>> +++ b/datapath/linux/compat/include/linux/openvswitch.h
>>> @@ -240,6 +240,7 @@ enum ovs_vport_type {
>>> OVS_VPORT_TYPE_GRE, /* GRE tunnel. */
>>> OVS_VPORT_TYPE_VXLAN, /* VXLAN tunnel. */
>>> OVS_VPORT_TYPE_GENEVE, /* Geneve tunnel. */
>>> + OVS_VPORT_TYPE_BAREUDP, /* Bareudp tunnel. */
>>> OVS_VPORT_TYPE_LISP = 105, /* LISP tunnel */
>>> OVS_VPORT_TYPE_STT = 106, /* STT tunnel */
>>> OVS_VPORT_TYPE_ERSPAN = 107, /* ERSPAN tunnel. */
>>> @@ -308,6 +309,15 @@ enum {
>>> #define OVS_VXLAN_EXT_MAX (__OVS_VXLAN_EXT_MAX - 1)
>>> +enum {
>>> + OVS_BAREUDP_EXT_UNSPEC,
>>> + OVS_BAREUDP_EXT_MULTIPROTO_MODE,
>>> + /* place new values here to fill gap. */
>>> + __OVS_BAREUDP_EXT_MAX,
>>> +};
>>> +
>>> +#define OVS_BAREUDP_EXT_MAX (__OVS_BAREUDP_EXT_MAX - 1)
>>> +
>>> /* OVS_VPORT_ATTR_OPTIONS attributes for tunnels.
>>> */
>>> enum {
>>> diff --git a/lib/dpif-netlink-rtnl.c b/lib/dpif-netlink-rtnl.c
>>> index fd157ce..b93b1fc 100644
>>> --- a/lib/dpif-netlink-rtnl.c
>>> +++ b/lib/dpif-netlink-rtnl.c
>>> @@ -58,6 +58,18 @@ VLOG_DEFINE_THIS_MODULE(dpif_netlink_rtnl);
>>> #define IFLA_GENEVE_UDP_ZERO_CSUM6_RX 10
>>> #endif
>>> +#ifndef __IFLA_BAREUDP_MAX
>>> +#define IFLA_BAREUDP_MAX 0
>>> +#endif
>>> +#if IFLA_BAREUDP_MAX < 4
>>> +#define IFLA_BAREUDP_PORT 1
>>> +#define IFLA_BAREUDP_ETHERTYPE 2
>>> +#define IFLA_BAREUDP_SRCPORT_MIN 3
>>> +#define IFLA_BAREUDP_MULTIPROTO_MODE 4
>>> +#endif
>>> +
>>> +#define BAREUDP_MPLS_SRCPORT_MIN 49153
>>> +
>>> static const struct nl_policy rtlink_policy[] = {
>>> [IFLA_LINKINFO] = { .type = NL_A_NESTED },
>>> };
>>> @@ -81,6 +93,10 @@ static const struct nl_policy geneve_policy[] = {
>>> [IFLA_GENEVE_UDP_ZERO_CSUM6_RX] = { .type = NL_A_U8 },
>>> [IFLA_GENEVE_PORT] = { .type = NL_A_U16 },
>>> };
>>> +static const struct nl_policy bareudp_policy[] = {
>>> + [IFLA_BAREUDP_PORT] = { .type = NL_A_U16 },
>>> + [IFLA_BAREUDP_ETHERTYPE] = { .type = NL_A_U16 },
>>> +};
>>> static const char *
>>> vport_type_to_kind(enum ovs_vport_type type,
>>> @@ -113,6 +129,8 @@ vport_type_to_kind(enum ovs_vport_type type,
>>> }
>>> case OVS_VPORT_TYPE_GTPU:
>>> return NULL;
>>> + case OVS_VPORT_TYPE_BAREUDP:
>>> + return "bareudp";
>>> case OVS_VPORT_TYPE_NETDEV:
>>> case OVS_VPORT_TYPE_INTERNAL:
>>> case OVS_VPORT_TYPE_LISP:
>>> @@ -243,6 +261,24 @@ dpif_netlink_rtnl_geneve_verify(const struct netdev_tunnel_config *tnl_cfg,
>>> return err;
>>> }
>>> +static int
>>> +dpif_netlink_rtnl_bareudp_verify(const struct netdev_tunnel_config *tnl_cfg,
>>> + const char *kind, struct ofpbuf *reply)
>>> +{
>>> + struct nlattr *bareudp[ARRAY_SIZE(bareudp_policy)];
>>> + int err;
>>> +
>>> + err = rtnl_policy_parse(kind, reply, bareudp_policy, bareudp,
>>> + ARRAY_SIZE(bareudp_policy));
>>> + if (!err) {
>>> + if ((tnl_cfg->dst_port != nl_attr_get_be16(bareudp[IFLA_BAREUDP_PORT]))
>>> + || (tnl_cfg->payload_ethertype
>>> + != nl_attr_get_be16(bareudp[IFLA_BAREUDP_ETHERTYPE]))) {
>>> + err = EINVAL;
>>> + }
>>> + }
>>> + return err;
>>> +}
>>> static int
>>> dpif_netlink_rtnl_verify(const struct netdev_tunnel_config *tnl_cfg,
>>> @@ -275,6 +311,9 @@ dpif_netlink_rtnl_verify(const struct netdev_tunnel_config *tnl_cfg,
>>> case OVS_VPORT_TYPE_GENEVE:
>>> err = dpif_netlink_rtnl_geneve_verify(tnl_cfg, kind, reply);
>>> break;
>>> + case OVS_VPORT_TYPE_BAREUDP:
>>> + err = dpif_netlink_rtnl_bareudp_verify(tnl_cfg, kind, reply);
>>> + break;
>>> case OVS_VPORT_TYPE_NETDEV:
>>> case OVS_VPORT_TYPE_INTERNAL:
>>> case OVS_VPORT_TYPE_LISP:
>>> @@ -362,6 +401,19 @@ dpif_netlink_rtnl_create(const struct netdev_tunnel_config *tnl_cfg,
>>> case OVS_VPORT_TYPE_LISP:
>>> case OVS_VPORT_TYPE_STT:
>>> case OVS_VPORT_TYPE_GTPU:
>>> + case OVS_VPORT_TYPE_BAREUDP:
>>> + nl_msg_put_be16(&request, IFLA_BAREUDP_ETHERTYPE,
>>> + tnl_cfg->payload_ethertype);
>>> + if ((tnl_cfg->payload_ethertype == htons(ETH_TYPE_MPLS)) ||
>>> + (tnl_cfg->payload_ethertype == htons(ETH_TYPE_MPLS_MCAST))) {
>>> + nl_msg_put_u16(&request, IFLA_BAREUDP_SRCPORT_MIN,
>>> + BAREUDP_MPLS_SRCPORT_MIN);
>>> + }
>>> + nl_msg_put_be16(&request, IFLA_BAREUDP_PORT, tnl_cfg->dst_port);
>>> + if (tnl_cfg->exts & (1 << OVS_BAREUDP_EXT_MULTIPROTO_MODE)) {
>>> + nl_msg_put_flag(&request, IFLA_BAREUDP_MULTIPROTO_MODE);
>>> + }
>>> + break;
>>> case OVS_VPORT_TYPE_UNSPEC:
>>> case __OVS_VPORT_TYPE_MAX:
>>> default:
>>> @@ -470,6 +522,7 @@ dpif_netlink_rtnl_port_destroy(const char *name, const char *type)
>>> case OVS_VPORT_TYPE_ERSPAN:
>>> case OVS_VPORT_TYPE_IP6ERSPAN:
>>> case OVS_VPORT_TYPE_IP6GRE:
>>> + case OVS_VPORT_TYPE_BAREUDP:
>>> return dpif_netlink_rtnl_destroy(name);
>>> case OVS_VPORT_TYPE_NETDEV:
>>> case OVS_VPORT_TYPE_INTERNAL:
>>> diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
>>> index dc64210..ce7e5e6 100644
>>> --- a/lib/dpif-netlink.c
>>> +++ b/lib/dpif-netlink.c
>>> @@ -740,7 +740,7 @@ get_vport_type(const struct dpif_netlink_vport *vport)
>>> return "erspan";
>>> case OVS_VPORT_TYPE_IP6ERSPAN:
>>> - return "ip6erspan";
>>> + return "ip6erspan";
>>
>> Why is this changed? From the diff I can't even tell what has changed.
>> It seems unrelated.
>>
> There was a trialing space at that line. May be i should not mix it with this patch.
> We could add a seperate patch to fix it. Anyways fine for me.
Right, let's not intermix formatting fixes with functional changes.
>>> case OVS_VPORT_TYPE_IP6GRE:
>>> return "ip6gre";
>>> @@ -748,6 +748,9 @@ get_vport_type(const struct dpif_netlink_vport *vport)
>>> case OVS_VPORT_TYPE_GTPU:
>>> return "gtpu";
>>> + case OVS_VPORT_TYPE_BAREUDP:
>>> + return "bareudp";
>>> +
>>> case OVS_VPORT_TYPE_UNSPEC:
>>> case __OVS_VPORT_TYPE_MAX:
>>> break;
>>> @@ -783,6 +786,8 @@ netdev_to_ovs_vport_type(const char *type)
>>> return OVS_VPORT_TYPE_GRE;
>>> } else if (!strcmp(type, "gtpu")) {
>>> return OVS_VPORT_TYPE_GTPU;
>>> + } else if (!strcmp(type, "bareudp")) {
>>> + return OVS_VPORT_TYPE_BAREUDP;
>>> } else {
>>> return OVS_VPORT_TYPE_UNSPEC;
>>> }
>>> diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c
>>> index 0252b61..3aca319 100644
>>> --- a/lib/netdev-vport.c
>>> +++ b/lib/netdev-vport.c
>>> @@ -112,7 +112,7 @@ netdev_vport_needs_dst_port(const struct netdev *dev)
>>> return (class->get_config == get_tunnel_config &&
>>> (!strcmp("geneve", type) || !strcmp("vxlan", type) ||
>>> !strcmp("lisp", type) || !strcmp("stt", type) ||
>>> - !strcmp("gtpu", type)));
>>> + !strcmp("gtpu", type) || !strcmp("bareudp",type)));
>>> }
>>> const char *
>>> @@ -219,6 +219,8 @@ netdev_vport_construct(struct netdev *netdev_)
>>> dev->tnl_cfg.dst_port = port ? htons(port) : htons(STT_DST_PORT);
>>> } else if (!strcmp(type, "gtpu")) {
>>> dev->tnl_cfg.dst_port = port ? htons(port) : htons(GTPU_DST_PORT);
>>> + } else if (!strcmp(type, "bareudp")) {
>>> + dev->tnl_cfg.dst_port = htons(port);
>>> }
>>
>> I notice that for GENEVE, VXLAN, LISP, STT and GTPU the code checks
>> if port has a value and then if it does not it will assign a
>> default port. Is there some reason you do not follow this
>> format for the BAREUDP port?
>>
> unlike Genveve, VXLAN etc. bareudp can be configured with any UDP port and
> there is no default port.
So it's OK for the port value to zero?
Thanks,
- Greg
[snipped]
More information about the dev
mailing list