[ovs-dev] [PATCH v4] Bareudp Tunnel Support

Gregory Rose gvrose8192 at gmail.com
Thu Jun 4 16:21:25 UTC 2020


On 6/4/2020 5:57 AM, Martin Varghese wrote:
> On Tue, Jun 02, 2020 at 11:59:37AM -0700, Gregory Rose wrote:
>>
>> On 5/25/2020 8:31 PM, Martin Varghese wrote:
>>> From: Martin Varghese <martin.varghese at nokia.com>
>>>
>>> There are various L3 encapsulation standards using UDP being discussed to
>>> leverage the UDP based load balancing capability of different networks.
>>> MPLSoUDP (__ https://tools.ietf.org/html/rfc7510) is one among them.
>>>
>>> The Bareudp tunnel provides a generic L3 encapsulation tunnelling support
>>> for tunnelling different L3 protocols like MPLS, IP, NSH etc. inside a UDP
>>> tunnel.
>>>
>>> An example to create bareudp device to tunnel MPLS traffic is
>>> given
>>>
>>> $ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \
>>>               type=bareudp options:remote_ip=2.1.1.3
>>>               options:local_ip=2.1.1.2 \
>>>               options:payload_type=0x8847 options:dst_port=6635 \
>>>               options:packet_type="legacy_l3" \
>>>               ofport_request=$bareudp_egress_port
>>>
>>> The bareudp device supports special handling for MPLS & IP as
>>> they can have multiple ethertypes. MPLS procotcol can have ethertypes
>>> ETH_P_MPLS_UC (unicast) & ETH_P_MPLS_MC (multicast). IP protocol can have
>>> ethertypes ETH_P_IP (v4) & ETH_P_IPV6 (v6).
>>>
>>> The bareudp device to tunnel L3 traffic with multiple ethertypes
>>> (MPLS & IP) can be created by passing the L3 protocol name as string in
>>> the field payload_type. An example to create bareudp device to tunnel
>>> MPLS unicast & multicast traffic is given below.::
>>>
>>> $ ovs-vsctl add-port  br_mpls udp_port -- set interface
>>>              udp_port \
>>>              type=bareudp options:remote_ip=2.1.1.3
>>>              options:local_ip=2.1.1.2 \
>>>              options:payload_type=mpls options:dst_port=6635 \
>>>              options:packet_type="legacy_l3"
>>>
>>> Signed-off-by: Martin Varghese <martin.varghese at nokia.com>
>>
>> Hi Martin,
>>
>> I built and installed a 5.7 kernel with bareudp configured and
>> the system traffic test worked fine.  I did not see any regressions
>> in any other system traffic tests.
>>
>> The documentation and code looks pretty good but I do have a few
>> comments below.
>>
>> Thanks!
>>
>> - Greg
>>
>>> ---
>>>   Documentation/automake.mk                         |  1 +
>>>   Documentation/faq/bareudp.rst                     | 62 +++++++++++++++++++++++
>>>   Documentation/faq/index.rst                       |  1 +
>>>   Documentation/faq/releases.rst                    |  1 +
>>>   NEWS                                              |  3 ++
>>>   datapath/linux/compat/include/linux/openvswitch.h | 10 ++++
>>>   lib/dpif-netlink-rtnl.c                           | 53 +++++++++++++++++++
>>>   lib/dpif-netlink.c                                |  7 ++-
>>>   lib/netdev-vport.c                                | 24 ++++++++-
>>>   lib/netdev.h                                      |  1 +
>>>   ofproto/ofproto-dpif-xlate.c                      |  1 +
>>>   tests/system-layer3-tunnels.at                    | 47 +++++++++++++++++
>>>   12 files changed, 209 insertions(+), 2 deletions(-)
>>>   create mode 100644 Documentation/faq/bareudp.rst
>>>
>>> diff --git a/Documentation/automake.mk b/Documentation/automake.mk
>>> index f85c432..ea3475f 100644
>>> --- a/Documentation/automake.mk
>>> +++ b/Documentation/automake.mk
>>> @@ -88,6 +88,7 @@ DOC_SOURCE = \
>>>   	Documentation/faq/terminology.rst \
>>>   	Documentation/faq/vlan.rst \
>>>   	Documentation/faq/vxlan.rst \
>>> +	Documentation/faq/bareudp.rst \
>>>   	Documentation/internals/index.rst \
>>>   	Documentation/internals/authors.rst \
>>>   	Documentation/internals/bugs.rst \
>>> diff --git a/Documentation/faq/bareudp.rst b/Documentation/faq/bareudp.rst
>>> new file mode 100644
>>> index 0000000..257f20e
>>> --- /dev/null
>>> +++ b/Documentation/faq/bareudp.rst
>>> @@ -0,0 +1,62 @@
>>> +..
>>> +      Licensed under the Apache License, Version 2.0 (the "License"); you may
>>> +      not use this file except in compliance with the License. You may obtain
>>> +      a copy of the License at
>>> +
>>> +          http://www.apache.org/licenses/LICENSE-2.0
>>> +
>>> +      Unless required by applicable law or agreed to in writing, software
>>> +      distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
>>> +      WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
>>> +      License for the specific language governing permissions and limitations
>>> +      under the License.
>>> +
>>> +      Convention for heading levels in Open vSwitch documentation:
>>> +
>>> +      =======  Heading 0 (reserved for the title in a document)
>>> +      -------  Heading 1
>>> +      ~~~~~~~  Heading 2
>>> +      +++++++  Heading 3
>>> +      '''''''  Heading 4
>>> +
>>> +      Avoid deeper levels because they do not render well.
>>> +
>>> +=======
>>> +Bareudp
>>> +=======
>>> +
>>> +Q: What is Bareudp?
>>> +
>>> +    A: There are various L3 encapsulation standards using UDP being discussed
>>> +       to leverage the UDP based load balancing capability of different
>>> +       networks. MPLSoUDP (__ https://tools.ietf.org/html/rfc7510) is one among
>>> +       them.
>>> +
>>> +       The Bareudp tunnel provides a generic L3 encapsulation tunnelling
>>> +       support for tunnelling different L3 protocols like MPLS, IP, NSH etc.
>>> +       inside a UDP tunnel.
>>> +
>>> +       An example to create bareudp device to tunnel MPLS traffic is given
>>> +       below.::
>>> +
>>> +           $ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \
>>> +             type=bareudp options:remote_ip=2.1.1.3 options:local_ip=2.1.1.2 \
>>> +             options:payload_type=0x8847 options:dst_port=6635 \
>>> +             options:packet_type="legacy_l3" \
>>> +             ofport_request=$bareudp_egress_port
>>> +
>>> +       The bareudp device supports special handling for MPLS & IP as they can
>>> +       have multiple ethertypes.
>>> +       MPLS procotcol can have ethertypes ETH_P_MPLS_UC (unicast) &
>>> +       ETH_P_MPLS_MC (multicast). IP protocol can have ethertypes ETH_P_IP (v4)
>>> +       & ETH_P_IPV6 (v6).
>>> +
>>> +       The bareudp device to tunnel L3 traffic with multiple ethertypes
>>> +       (MPLS & IP) can be created by passing the L3 protocol name as string in
>>> +       the field payload_type. An example to create bareudp device to tunnel
>>> +       MPLS unicast & multicast traffic is given below.::
>>> +
>>> +            $ ovs-vsctl add-port  br_mpls udp_port -- set interface udp_port \
>>> +              type=bareudp options:remote_ip=2.1.1.3 options:local_ip=2.1.1.2 \
>>> +              options:payload_type=mpls options:dst_port=6635 \
>>> +              options:packet_type="legacy_l3"
>>> diff --git a/Documentation/faq/index.rst b/Documentation/faq/index.rst
>>> index 334b828..1dd2998 100644
>>> --- a/Documentation/faq/index.rst
>>> +++ b/Documentation/faq/index.rst
>>> @@ -30,6 +30,7 @@ Open vSwitch FAQ
>>>   .. toctree::
>>>      :maxdepth: 2
>>> +   bareudp
>>>      configuration
>>>      contributing
>>>      design
>>> diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst
>>> index e5cef39..93cb10a 100644
>>> --- a/Documentation/faq/releases.rst
>>> +++ b/Documentation/faq/releases.rst
>>> @@ -136,6 +136,7 @@ Q: Are all features available with all datapaths?
>>>       Tunnel - ERSPAN                 4.18           2.10         2.10     NO
>>>       Tunnel - ERSPAN-IPv6            4.18           2.10         2.10     NO
>>>       Tunnel - GTP-U                  NO             NO           2.14     NO
>>> +    Tunnel - Bareudp                5.6            NO           2.14     NO
>>
>> 5.7
>>
> Noted
>   
>>>       QoS - Policing                  YES            1.1          2.6      NO
>>>       QoS - Shaping                   YES            1.1          NO       NO
>>>       sFlow                           YES            1.0          1.0      NO
>>> diff --git a/NEWS b/NEWS
>>> index 3dbd8ec..e75b537 100644
>>> --- a/NEWS
>>> +++ b/NEWS
>>> @@ -16,6 +16,9 @@ Post-v2.13.0
>>>          by enabling interrupt mode.
>>>      - Userspace datapath:
>>>        * Add support for conntrack zone-based timeout policy.
>>> +   - Bareudp Tunnel
>>> +     * Userspace datapath support is not added.
>>> +     * Kernel datapath support not backported to ovs tree.
>>
>> Maybe another line about kernel datapath is supported in the
>> Linux 5.7 kernel?  I know you will update the releases doc
>> but I think it would help to mention it here as well.
>>
> I will update
>   
>>>   v2.13.0 - 14 Feb 2020
>>> diff --git a/datapath/linux/compat/include/linux/openvswitch.h b/datapath/linux/compat/include/linux/openvswitch.h
>>> index f7c3b2e..09f313a 100644
>>> --- a/datapath/linux/compat/include/linux/openvswitch.h
>>> +++ b/datapath/linux/compat/include/linux/openvswitch.h
>>> @@ -240,6 +240,7 @@ enum ovs_vport_type {
>>>   	OVS_VPORT_TYPE_GRE,      /* GRE tunnel. */
>>>   	OVS_VPORT_TYPE_VXLAN,	 /* VXLAN tunnel. */
>>>   	OVS_VPORT_TYPE_GENEVE,	 /* Geneve tunnel. */
>>> +	OVS_VPORT_TYPE_BAREUDP,  /* Bareudp tunnel. */
>>>   	OVS_VPORT_TYPE_LISP = 105,  /* LISP tunnel */
>>>   	OVS_VPORT_TYPE_STT = 106, /* STT tunnel */
>>>   	OVS_VPORT_TYPE_ERSPAN = 107, /* ERSPAN tunnel. */
>>> @@ -308,6 +309,15 @@ enum {
>>>   #define OVS_VXLAN_EXT_MAX (__OVS_VXLAN_EXT_MAX - 1)
>>> +enum {
>>> +        OVS_BAREUDP_EXT_UNSPEC,
>>> +        OVS_BAREUDP_EXT_MULTIPROTO_MODE,
>>> +        /* place new values here to fill gap. */
>>> +        __OVS_BAREUDP_EXT_MAX,
>>> +};
>>> +
>>> +#define OVS_BAREUDP_EXT_MAX (__OVS_BAREUDP_EXT_MAX - 1)
>>> +
>>>   /* OVS_VPORT_ATTR_OPTIONS attributes for tunnels.
>>>    */
>>>   enum {
>>> diff --git a/lib/dpif-netlink-rtnl.c b/lib/dpif-netlink-rtnl.c
>>> index fd157ce..b93b1fc 100644
>>> --- a/lib/dpif-netlink-rtnl.c
>>> +++ b/lib/dpif-netlink-rtnl.c
>>> @@ -58,6 +58,18 @@ VLOG_DEFINE_THIS_MODULE(dpif_netlink_rtnl);
>>>   #define IFLA_GENEVE_UDP_ZERO_CSUM6_RX 10
>>>   #endif
>>> +#ifndef __IFLA_BAREUDP_MAX
>>> +#define IFLA_BAREUDP_MAX 0
>>> +#endif
>>> +#if IFLA_BAREUDP_MAX < 4
>>> +#define IFLA_BAREUDP_PORT 1
>>> +#define IFLA_BAREUDP_ETHERTYPE 2
>>> +#define IFLA_BAREUDP_SRCPORT_MIN 3
>>> +#define IFLA_BAREUDP_MULTIPROTO_MODE 4
>>> +#endif
>>> +
>>> +#define BAREUDP_MPLS_SRCPORT_MIN 49153
>>> +
>>>   static const struct nl_policy rtlink_policy[] = {
>>>       [IFLA_LINKINFO] = { .type = NL_A_NESTED },
>>>   };
>>> @@ -81,6 +93,10 @@ static const struct nl_policy geneve_policy[] = {
>>>       [IFLA_GENEVE_UDP_ZERO_CSUM6_RX] = { .type = NL_A_U8 },
>>>       [IFLA_GENEVE_PORT] = { .type = NL_A_U16 },
>>>   };
>>> +static const struct nl_policy bareudp_policy[] = {
>>> +    [IFLA_BAREUDP_PORT] = { .type = NL_A_U16 },
>>> +    [IFLA_BAREUDP_ETHERTYPE] = { .type = NL_A_U16 },
>>> +};
>>>   static const char *
>>>   vport_type_to_kind(enum ovs_vport_type type,
>>> @@ -113,6 +129,8 @@ vport_type_to_kind(enum ovs_vport_type type,
>>>           }
>>>       case OVS_VPORT_TYPE_GTPU:
>>>           return NULL;
>>> +    case OVS_VPORT_TYPE_BAREUDP:
>>> +        return "bareudp";
>>>       case OVS_VPORT_TYPE_NETDEV:
>>>       case OVS_VPORT_TYPE_INTERNAL:
>>>       case OVS_VPORT_TYPE_LISP:
>>> @@ -243,6 +261,24 @@ dpif_netlink_rtnl_geneve_verify(const struct netdev_tunnel_config *tnl_cfg,
>>>       return err;
>>>   }
>>> +static int
>>> +dpif_netlink_rtnl_bareudp_verify(const struct netdev_tunnel_config *tnl_cfg,
>>> +                                const char *kind, struct ofpbuf *reply)
>>> +{
>>> +    struct nlattr *bareudp[ARRAY_SIZE(bareudp_policy)];
>>> +    int err;
>>> +
>>> +    err = rtnl_policy_parse(kind, reply, bareudp_policy, bareudp,
>>> +                            ARRAY_SIZE(bareudp_policy));
>>> +    if (!err) {
>>> +        if ((tnl_cfg->dst_port != nl_attr_get_be16(bareudp[IFLA_BAREUDP_PORT]))
>>> +            || (tnl_cfg->payload_ethertype
>>> +                != nl_attr_get_be16(bareudp[IFLA_BAREUDP_ETHERTYPE]))) {
>>> +            err = EINVAL;
>>> +        }
>>> +    }
>>> +    return err;
>>> +}
>>>   static int
>>>   dpif_netlink_rtnl_verify(const struct netdev_tunnel_config *tnl_cfg,
>>> @@ -275,6 +311,9 @@ dpif_netlink_rtnl_verify(const struct netdev_tunnel_config *tnl_cfg,
>>>       case OVS_VPORT_TYPE_GENEVE:
>>>           err = dpif_netlink_rtnl_geneve_verify(tnl_cfg, kind, reply);
>>>           break;
>>> +    case OVS_VPORT_TYPE_BAREUDP:
>>> +        err = dpif_netlink_rtnl_bareudp_verify(tnl_cfg, kind, reply);
>>> +        break;
>>>       case OVS_VPORT_TYPE_NETDEV:
>>>       case OVS_VPORT_TYPE_INTERNAL:
>>>       case OVS_VPORT_TYPE_LISP:
>>> @@ -362,6 +401,19 @@ dpif_netlink_rtnl_create(const struct netdev_tunnel_config *tnl_cfg,
>>>       case OVS_VPORT_TYPE_LISP:
>>>       case OVS_VPORT_TYPE_STT:
>>>       case OVS_VPORT_TYPE_GTPU:
>>> +    case OVS_VPORT_TYPE_BAREUDP:
>>> +        nl_msg_put_be16(&request, IFLA_BAREUDP_ETHERTYPE,
>>> +                        tnl_cfg->payload_ethertype);
>>> +        if ((tnl_cfg->payload_ethertype == htons(ETH_TYPE_MPLS)) ||
>>> +            (tnl_cfg->payload_ethertype ==  htons(ETH_TYPE_MPLS_MCAST))) {
>>> +            nl_msg_put_u16(&request, IFLA_BAREUDP_SRCPORT_MIN,
>>> +                            BAREUDP_MPLS_SRCPORT_MIN);
>>> +        }
>>> +        nl_msg_put_be16(&request, IFLA_BAREUDP_PORT, tnl_cfg->dst_port);
>>> +        if (tnl_cfg->exts & (1 << OVS_BAREUDP_EXT_MULTIPROTO_MODE)) {
>>> +            nl_msg_put_flag(&request, IFLA_BAREUDP_MULTIPROTO_MODE);
>>> +        }
>>> +        break;
>>>       case OVS_VPORT_TYPE_UNSPEC:
>>>       case __OVS_VPORT_TYPE_MAX:
>>>       default:
>>> @@ -470,6 +522,7 @@ dpif_netlink_rtnl_port_destroy(const char *name, const char *type)
>>>       case OVS_VPORT_TYPE_ERSPAN:
>>>       case OVS_VPORT_TYPE_IP6ERSPAN:
>>>       case OVS_VPORT_TYPE_IP6GRE:
>>> +    case OVS_VPORT_TYPE_BAREUDP:
>>>           return dpif_netlink_rtnl_destroy(name);
>>>       case OVS_VPORT_TYPE_NETDEV:
>>>       case OVS_VPORT_TYPE_INTERNAL:
>>> diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
>>> index dc64210..ce7e5e6 100644
>>> --- a/lib/dpif-netlink.c
>>> +++ b/lib/dpif-netlink.c
>>> @@ -740,7 +740,7 @@ get_vport_type(const struct dpif_netlink_vport *vport)
>>>           return "erspan";
>>>       case OVS_VPORT_TYPE_IP6ERSPAN:
>>> -        return "ip6erspan";
>>> +        return "ip6erspan";
>>
>> Why is this changed?  From the diff I can't even tell what has changed.
>> It seems unrelated.
>>
> There was a trialing space at that line. May be i should not mix it with this patch.
> We could add a seperate patch to fix it. Anyways fine for me.

Right, let's not intermix formatting fixes with functional changes.

>>>       case OVS_VPORT_TYPE_IP6GRE:
>>>           return "ip6gre";
>>> @@ -748,6 +748,9 @@ get_vport_type(const struct dpif_netlink_vport *vport)
>>>       case OVS_VPORT_TYPE_GTPU:
>>>           return "gtpu";
>>> +    case OVS_VPORT_TYPE_BAREUDP:
>>> +        return "bareudp";
>>> +
>>>       case OVS_VPORT_TYPE_UNSPEC:
>>>       case __OVS_VPORT_TYPE_MAX:
>>>           break;
>>> @@ -783,6 +786,8 @@ netdev_to_ovs_vport_type(const char *type)
>>>           return OVS_VPORT_TYPE_GRE;
>>>       } else if (!strcmp(type, "gtpu")) {
>>>           return OVS_VPORT_TYPE_GTPU;
>>> +    } else if (!strcmp(type, "bareudp")) {
>>> +        return OVS_VPORT_TYPE_BAREUDP;
>>>       } else {
>>>           return OVS_VPORT_TYPE_UNSPEC;
>>>       }
>>> diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c
>>> index 0252b61..3aca319 100644
>>> --- a/lib/netdev-vport.c
>>> +++ b/lib/netdev-vport.c
>>> @@ -112,7 +112,7 @@ netdev_vport_needs_dst_port(const struct netdev *dev)
>>>       return (class->get_config == get_tunnel_config &&
>>>               (!strcmp("geneve", type) || !strcmp("vxlan", type) ||
>>>                !strcmp("lisp", type) || !strcmp("stt", type) ||
>>> -             !strcmp("gtpu", type)));
>>> +             !strcmp("gtpu", type) || !strcmp("bareudp",type)));
>>>   }
>>>   const char *
>>> @@ -219,6 +219,8 @@ netdev_vport_construct(struct netdev *netdev_)
>>>           dev->tnl_cfg.dst_port = port ? htons(port) : htons(STT_DST_PORT);
>>>       } else if (!strcmp(type, "gtpu")) {
>>>           dev->tnl_cfg.dst_port = port ? htons(port) : htons(GTPU_DST_PORT);
>>> +    } else if (!strcmp(type, "bareudp")) {
>>> +        dev->tnl_cfg.dst_port = htons(port);
>>>       }
>>
>> I notice that for GENEVE, VXLAN, LISP, STT and GTPU the code checks
>> if port has a value and then if it does not it will assign a
>> default port.  Is there some reason you do not follow this
>> format for the BAREUDP port?
>>
> unlike Genveve, VXLAN  etc. bareudp can be configured with any UDP port and
> there is no default port.

So it's OK for the port value to zero?

Thanks,

- Greg

[snipped]


More information about the dev mailing list