[ovs-dev] [PATCH v7] datapath: Add support for VXLAN tunnels to Open vSwitch

Jesse Gross jesse at nicira.com
Wed Dec 5 18:10:58 UTC 2012


On Wed, Dec 5, 2012 at 8:53 AM, Kyle Mestery <kmestery at cisco.com> wrote:
> Add support for VXLAN tunnels to Open vSwitch. Add support
> for setting the destination UDP port on a per-port basis.
> This is done by adding a "dst_port" parameter to the port
> configuration. This is only applicable currently to VXLAN
> tunnels.
>
> Please note this currently does not implement any sort of multicast
> learning. With this patch, VXLAN tunnels must be configured similar
> to GRE tunnels (e.g. point to point). A subsequent patch will implement
> a VXLAN control plane in userspace to handle multicast learning.
>
> This patch set is based on one posted by Ben Pfaff on Oct. 12, 2011
> to the ovs-dev mailing list:
>
> http://openvswitch.org/pipermail/dev/2011-October/012051.html
>
> The patch has been maintained, updated, and freshened by me and a
> version of it is available at the following github repository:
>
> https://github.com/mestery/ovs-vxlan/tree/vxlan
>
> I've tested this patch with multiple VXLAN tunnels between hosts
> using different UDP port numbers. Performance is on par (though
> slightly faster) than comparable GRE tunnels.
>
> See the following IETF draft for additional information about VXLAN:
> http://tools.ietf.org/html/draft-mahalingam-dutt-dcops-vxlan-02
>
> Signed-off-by: Kyle Mestery <kmestery at cisco.com>
> ---
> Note: v7 addresses comments from Jesse.
>
> Note: v6 of this patch addresses additional comments from
> Jesse. Specifically, I've removed the hash table of
> port/socket mappings and moved to a linked list. I've also
> cleaned up the code around this quite a bit.
>
> Note: v5 of this patch addresses comments from Jesse
> and Chris.
>
> Note: v4 of this patch removes VXLAN over IPSEC support,
> per an offline conversation with Jesse.

Thanks for the new version Kyle.  I ran into a couple of last issues:

There were a bunch of sparse warnings related to destination port byte
order change:
  CHECK   /home/jgross/openvswitch/datapath/linux/tunnel.c
/home/jgross/openvswitch/datapath/linux/tunnel.c:1092:35: warning:
incorrect type in assignment (different base types)
/home/jgross/openvswitch/datapath/linux/tunnel.c:1092:35:    expected
unsigned short [unsigned] [usertype] dst_port
/home/jgross/openvswitch/datapath/linux/tunnel.c:1092:35:    got
restricted __be16 [usertype] <noident>
/home/jgross/openvswitch/datapath/linux/tunnel.c:1251:46: warning:
cast to restricted __be16
/home/jgross/openvswitch/datapath/linux/tunnel.c:1251:46: warning:
cast to restricted __be16
/home/jgross/openvswitch/datapath/linux/tunnel.c:1251:46: warning:
cast to restricted __be16
/home/jgross/openvswitch/datapath/linux/tunnel.c:1251:46: warning:
cast to restricted __be16
  CHECK   /home/jgross/openvswitch/datapath/linux/vlan.c
  CHECK   /home/jgross/openvswitch/datapath/linux/vport.c
  CHECK   /home/jgross/openvswitch/datapath/linux/vport-capwap.c
  CHECK   /home/jgross/openvswitch/datapath/linux/vport-generic.c
  CHECK   /home/jgross/openvswitch/datapath/linux/vport-gre.c
  CHECK   /home/jgross/openvswitch/datapath/linux/vport-internal_dev.c
  CHECK   /home/jgross/openvswitch/datapath/linux/vport-netdev.c
  CHECK   /home/jgross/openvswitch/datapath/linux/vport-patch.c
  CHECK   /home/jgross/openvswitch/datapath/linux/vport-vxlan.c
/home/jgross/openvswitch/datapath/linux/vport-vxlan.c:136:20: warning:
incorrect type in assignment (different base types)
/home/jgross/openvswitch/datapath/linux/vport-vxlan.c:136:20:
expected restricted __be16 [usertype] dest
/home/jgross/openvswitch/datapath/linux/vport-vxlan.c:136:20:    got
unsigned short const [unsigned] [usertype] dst_port
/home/jgross/openvswitch/datapath/linux/vport-vxlan.c:223:22: warning:
incorrect type in assignment (different base types)
/home/jgross/openvswitch/datapath/linux/vport-vxlan.c:223:22:
expected restricted __be16 [assigned] [usertype] sin_port
/home/jgross/openvswitch/datapath/linux/vport-vxlan.c:223:22:    got
unsigned short [unsigned] [usertype] port
/home/jgross/openvswitch/datapath/linux/vport-vxlan.c:292:45: warning:
incorrect type in argument 2 (different base types)
/home/jgross/openvswitch/datapath/linux/vport-vxlan.c:292:45:
expected unsigned short [unsigned] [usertype] port
/home/jgross/openvswitch/datapath/linux/vport-vxlan.c:292:45:    got
restricted __be16 [usertype] <noident>
/home/jgross/openvswitch/datapath/linux/vport-vxlan.c:320:26: warning:
incorrect type in assignment (different base types)
/home/jgross/openvswitch/datapath/linux/vport-vxlan.c:320:26:
expected unsigned short [unsigned] [usertype] port
/home/jgross/openvswitch/datapath/linux/vport-vxlan.c:320:26:    got
restricted __be16 [usertype] <noident>
/home/jgross/openvswitch/datapath/linux/vport-vxlan.c:359:43: warning:
incorrect type in argument 2 (different base types)
/home/jgross/openvswitch/datapath/linux/vport-vxlan.c:359:43:
expected unsigned short [unsigned] [usertype] port
/home/jgross/openvswitch/datapath/linux/vport-vxlan.c:359:43:    got
restricted __be16 [usertype] <noident>
/home/jgross/openvswitch/datapath/linux/vport-vxlan.c:395:42: warning:
incorrect type in argument 2 (different base types)
/home/jgross/openvswitch/datapath/linux/vport-vxlan.c:395:42:
expected unsigned short [unsigned] [usertype] port
/home/jgross/openvswitch/datapath/linux/vport-vxlan.c:395:42:    got
restricted __be16 [usertype] <noident>

I was originally planning on just fixing this up before pushing.
Here's the patch that I was going to use:

diff --git a/datapath/tunnel.h b/datapath/tunnel.h
index c0b50e7..b7de7a9 100644
--- a/datapath/tunnel.h
+++ b/datapath/tunnel.h
@@ -117,7 +117,7 @@ struct tnl_mutable_config {
        u32     flags;
        u8      tos;
        u8      ttl;
-       u16     dst_port;
+       __be16  dst_port;

        /* Multicast configuration. */
        int     mlink;
diff --git a/datapath/vport-vxlan.c b/datapath/vport-vxlan.c
index 32a79f1..1843121 100644
--- a/datapath/vport-vxlan.c
+++ b/datapath/vport-vxlan.c
@@ -78,7 +78,7 @@ struct if_name {
  */
 struct vxlan_port {
        struct list_head list;
-       u16 port;
+       __be16 port;
        struct list_head if_names;
        struct socket *vxlan_rcv_socket;
        int count;
@@ -86,7 +86,7 @@ struct vxlan_port {

 static LIST_HEAD(vxlan_ports);

-static struct vxlan_port *vxlan_port_exists(struct net *net, u16 port)
+static struct vxlan_port *vxlan_port_exists(struct net *net, __be16 port)
 {
        struct vxlan_port *vxlan_port;

@@ -356,7 +356,7 @@ static int vxlan_set_options(struct vport *vport,
struct nlat

        config = rtnl_dereference(tnl_vport->mutable);

-       old_port = vxlan_port_exists(net, htons(config->dst_port));
+       old_port = vxlan_port_exists(net, config->dst_port);

        err = vxlan_tunnel_setup(net, vname, options, &vxlan_port);
        if (err)
@@ -392,7 +392,7 @@ static void vxlan_tnl_destroy(struct vport *vport)
        config = rtnl_dereference(tnl_vport->mutable);

        vxlan_port = vxlan_port_exists(ovs_dp_get_net(vport->dp),
-                                        htons(config->dst_port));
+                                        config->dst_port);

        vxlan_tunnel_release(vxlan_port);

> diff --git a/datapath/vport-vxlan.c b/datapath/vport-vxlan.c
> new file mode 100644
> index 0000000..32a79f1
> --- /dev/null
> +++ b/datapath/vport-vxlan.c
> +/**
> + * struct if_names - List of names
> + * @list: list element.
> + * @name: The name of the interface.
> + */
> +struct if_name {
> +       struct list_head list;
> +       char name[IFNAMSIZ];
> +};

I think that all of the references to if_name are now write-only.  Can
we remove them?

> +static int vxlan_set_options(struct vport *vport, struct nlattr *options)
> +{
> +       int err;
> +       const char *vname = vport->ops->get_name(vport);
> +       struct net *net = ovs_dp_get_net(vport->dp);
> +       struct tnl_vport *tnl_vport = tnl_vport_priv(vport);
> +       struct tnl_mutable_config *config;
> +       struct vxlan_port *old_port = NULL;
> +       struct vxlan_port *vxlan_port = NULL;
> +
> +       config = rtnl_dereference(tnl_vport->mutable);
> +
> +       old_port = vxlan_port_exists(net, htons(config->dst_port));
> +
> +       err = vxlan_tunnel_setup(net, vname, options, &vxlan_port);
> +       if (err)
> +               goto out;
> +
> +       err = ovs_tnl_set_options(vport, options);
> +
> +       if (err)
> +               vxlan_tunnel_release(vxlan_port);
> +       else {
> +               if (old_port) {
> +                       /* Release old socket */
> +                       vxlan_tunnel_release(old_port);

We should always have an old_port, right?  Can we just assume that
like in vxlan_tnl_destroy()?

> +static struct vport *vxlan_tnl_create(const struct vport_parms *parms)
> +{
> +       int err;
> +       struct vport *vport;
> +       struct vxlan_port *vxlan_port = NULL;
> +
> +       err = vxlan_tunnel_setup(ovs_dp_get_net(parms->dp), parms->name,
> +                                               parms->options, &vxlan_port);
> +       if (err)
> +               return ERR_PTR(err);
> +
> +       vport = ovs_tnl_create(parms, &ovs_vxlan_vport_ops, &ovs_vxlan_tnl_ops);
> +
> +       if (!vport)
> +               vxlan_tunnel_release(vxlan_port);

ovs_tnl_create() does an ERR_PTR on its return value and therefore
will never return NULL.  We should use IS_ERR instead for this check.



More information about the dev mailing list