[ovs-dev] [PATCH v7] datapath: Add support for VXLAN tunnels to Open vSwitch
Kyle Mestery (kmestery)
kmestery at cisco.com
Wed Dec 5 18:14:48 UTC 2012
On Dec 5, 2012, at 12:10 PM, Jesse Gross <jesse at nicira.com> wrote:
> On Wed, Dec 5, 2012 at 8:53 AM, Kyle Mestery <kmestery at cisco.com> wrote:
>> Add support for VXLAN tunnels to Open vSwitch. Add support
>> for setting the destination UDP port on a per-port basis.
>> This is done by adding a "dst_port" parameter to the port
>> configuration. This is only applicable currently to VXLAN
>> tunnels.
>>
>> Please note this currently does not implement any sort of multicast
>> learning. With this patch, VXLAN tunnels must be configured similar
>> to GRE tunnels (e.g. point to point). A subsequent patch will implement
>> a VXLAN control plane in userspace to handle multicast learning.
>>
>> This patch set is based on one posted by Ben Pfaff on Oct. 12, 2011
>> to the ovs-dev mailing list:
>>
>> http://openvswitch.org/pipermail/dev/2011-October/012051.html
>>
>> The patch has been maintained, updated, and freshened by me and a
>> version of it is available at the following github repository:
>>
>> https://github.com/mestery/ovs-vxlan/tree/vxlan
>>
>> I've tested this patch with multiple VXLAN tunnels between hosts
>> using different UDP port numbers. Performance is on par (though
>> slightly faster) than comparable GRE tunnels.
>>
>> See the following IETF draft for additional information about VXLAN:
>> http://tools.ietf.org/html/draft-mahalingam-dutt-dcops-vxlan-02
>>
>> Signed-off-by: Kyle Mestery <kmestery at cisco.com>
>> ---
>> Note: v7 addresses comments from Jesse.
>>
>> Note: v6 of this patch addresses additional comments from
>> Jesse. Specifically, I've removed the hash table of
>> port/socket mappings and moved to a linked list. I've also
>> cleaned up the code around this quite a bit.
>>
>> Note: v5 of this patch addresses comments from Jesse
>> and Chris.
>>
>> Note: v4 of this patch removes VXLAN over IPSEC support,
>> per an offline conversation with Jesse.
>
> Thanks for the new version Kyle. I ran into a couple of last issues:
>
Thanks for the quick turnaround! I'll clean all this up. And I know I may
have asked this before, but is there a Makefile target to run sparse with?
New patch coming out soon!
Thanks,
Kyle
> There were a bunch of sparse warnings related to destination port byte
> order change:
> CHECK /home/jgross/openvswitch/datapath/linux/tunnel.c
> /home/jgross/openvswitch/datapath/linux/tunnel.c:1092:35: warning:
> incorrect type in assignment (different base types)
> /home/jgross/openvswitch/datapath/linux/tunnel.c:1092:35: expected
> unsigned short [unsigned] [usertype] dst_port
> /home/jgross/openvswitch/datapath/linux/tunnel.c:1092:35: got
> restricted __be16 [usertype] <noident>
> /home/jgross/openvswitch/datapath/linux/tunnel.c:1251:46: warning:
> cast to restricted __be16
> /home/jgross/openvswitch/datapath/linux/tunnel.c:1251:46: warning:
> cast to restricted __be16
> /home/jgross/openvswitch/datapath/linux/tunnel.c:1251:46: warning:
> cast to restricted __be16
> /home/jgross/openvswitch/datapath/linux/tunnel.c:1251:46: warning:
> cast to restricted __be16
> CHECK /home/jgross/openvswitch/datapath/linux/vlan.c
> CHECK /home/jgross/openvswitch/datapath/linux/vport.c
> CHECK /home/jgross/openvswitch/datapath/linux/vport-capwap.c
> CHECK /home/jgross/openvswitch/datapath/linux/vport-generic.c
> CHECK /home/jgross/openvswitch/datapath/linux/vport-gre.c
> CHECK /home/jgross/openvswitch/datapath/linux/vport-internal_dev.c
> CHECK /home/jgross/openvswitch/datapath/linux/vport-netdev.c
> CHECK /home/jgross/openvswitch/datapath/linux/vport-patch.c
> CHECK /home/jgross/openvswitch/datapath/linux/vport-vxlan.c
> /home/jgross/openvswitch/datapath/linux/vport-vxlan.c:136:20: warning:
> incorrect type in assignment (different base types)
> /home/jgross/openvswitch/datapath/linux/vport-vxlan.c:136:20:
> expected restricted __be16 [usertype] dest
> /home/jgross/openvswitch/datapath/linux/vport-vxlan.c:136:20: got
> unsigned short const [unsigned] [usertype] dst_port
> /home/jgross/openvswitch/datapath/linux/vport-vxlan.c:223:22: warning:
> incorrect type in assignment (different base types)
> /home/jgross/openvswitch/datapath/linux/vport-vxlan.c:223:22:
> expected restricted __be16 [assigned] [usertype] sin_port
> /home/jgross/openvswitch/datapath/linux/vport-vxlan.c:223:22: got
> unsigned short [unsigned] [usertype] port
> /home/jgross/openvswitch/datapath/linux/vport-vxlan.c:292:45: warning:
> incorrect type in argument 2 (different base types)
> /home/jgross/openvswitch/datapath/linux/vport-vxlan.c:292:45:
> expected unsigned short [unsigned] [usertype] port
> /home/jgross/openvswitch/datapath/linux/vport-vxlan.c:292:45: got
> restricted __be16 [usertype] <noident>
> /home/jgross/openvswitch/datapath/linux/vport-vxlan.c:320:26: warning:
> incorrect type in assignment (different base types)
> /home/jgross/openvswitch/datapath/linux/vport-vxlan.c:320:26:
> expected unsigned short [unsigned] [usertype] port
> /home/jgross/openvswitch/datapath/linux/vport-vxlan.c:320:26: got
> restricted __be16 [usertype] <noident>
> /home/jgross/openvswitch/datapath/linux/vport-vxlan.c:359:43: warning:
> incorrect type in argument 2 (different base types)
> /home/jgross/openvswitch/datapath/linux/vport-vxlan.c:359:43:
> expected unsigned short [unsigned] [usertype] port
> /home/jgross/openvswitch/datapath/linux/vport-vxlan.c:359:43: got
> restricted __be16 [usertype] <noident>
> /home/jgross/openvswitch/datapath/linux/vport-vxlan.c:395:42: warning:
> incorrect type in argument 2 (different base types)
> /home/jgross/openvswitch/datapath/linux/vport-vxlan.c:395:42:
> expected unsigned short [unsigned] [usertype] port
> /home/jgross/openvswitch/datapath/linux/vport-vxlan.c:395:42: got
> restricted __be16 [usertype] <noident>
>
> I was originally planning on just fixing this up before pushing.
> Here's the patch that I was going to use:
>
> diff --git a/datapath/tunnel.h b/datapath/tunnel.h
> index c0b50e7..b7de7a9 100644
> --- a/datapath/tunnel.h
> +++ b/datapath/tunnel.h
> @@ -117,7 +117,7 @@ struct tnl_mutable_config {
> u32 flags;
> u8 tos;
> u8 ttl;
> - u16 dst_port;
> + __be16 dst_port;
>
> /* Multicast configuration. */
> int mlink;
> diff --git a/datapath/vport-vxlan.c b/datapath/vport-vxlan.c
> index 32a79f1..1843121 100644
> --- a/datapath/vport-vxlan.c
> +++ b/datapath/vport-vxlan.c
> @@ -78,7 +78,7 @@ struct if_name {
> */
> struct vxlan_port {
> struct list_head list;
> - u16 port;
> + __be16 port;
> struct list_head if_names;
> struct socket *vxlan_rcv_socket;
> int count;
> @@ -86,7 +86,7 @@ struct vxlan_port {
>
> static LIST_HEAD(vxlan_ports);
>
> -static struct vxlan_port *vxlan_port_exists(struct net *net, u16 port)
> +static struct vxlan_port *vxlan_port_exists(struct net *net, __be16 port)
> {
> struct vxlan_port *vxlan_port;
>
> @@ -356,7 +356,7 @@ static int vxlan_set_options(struct vport *vport,
> struct nlat
>
> config = rtnl_dereference(tnl_vport->mutable);
>
> - old_port = vxlan_port_exists(net, htons(config->dst_port));
> + old_port = vxlan_port_exists(net, config->dst_port);
>
> err = vxlan_tunnel_setup(net, vname, options, &vxlan_port);
> if (err)
> @@ -392,7 +392,7 @@ static void vxlan_tnl_destroy(struct vport *vport)
> config = rtnl_dereference(tnl_vport->mutable);
>
> vxlan_port = vxlan_port_exists(ovs_dp_get_net(vport->dp),
> - htons(config->dst_port));
> + config->dst_port);
>
> vxlan_tunnel_release(vxlan_port);
>
>> diff --git a/datapath/vport-vxlan.c b/datapath/vport-vxlan.c
>> new file mode 100644
>> index 0000000..32a79f1
>> --- /dev/null
>> +++ b/datapath/vport-vxlan.c
>> +/**
>> + * struct if_names - List of names
>> + * @list: list element.
>> + * @name: The name of the interface.
>> + */
>> +struct if_name {
>> + struct list_head list;
>> + char name[IFNAMSIZ];
>> +};
>
> I think that all of the references to if_name are now write-only. Can
> we remove them?
>
>> +static int vxlan_set_options(struct vport *vport, struct nlattr *options)
>> +{
>> + int err;
>> + const char *vname = vport->ops->get_name(vport);
>> + struct net *net = ovs_dp_get_net(vport->dp);
>> + struct tnl_vport *tnl_vport = tnl_vport_priv(vport);
>> + struct tnl_mutable_config *config;
>> + struct vxlan_port *old_port = NULL;
>> + struct vxlan_port *vxlan_port = NULL;
>> +
>> + config = rtnl_dereference(tnl_vport->mutable);
>> +
>> + old_port = vxlan_port_exists(net, htons(config->dst_port));
>> +
>> + err = vxlan_tunnel_setup(net, vname, options, &vxlan_port);
>> + if (err)
>> + goto out;
>> +
>> + err = ovs_tnl_set_options(vport, options);
>> +
>> + if (err)
>> + vxlan_tunnel_release(vxlan_port);
>> + else {
>> + if (old_port) {
>> + /* Release old socket */
>> + vxlan_tunnel_release(old_port);
>
> We should always have an old_port, right? Can we just assume that
> like in vxlan_tnl_destroy()?
>
>> +static struct vport *vxlan_tnl_create(const struct vport_parms *parms)
>> +{
>> + int err;
>> + struct vport *vport;
>> + struct vxlan_port *vxlan_port = NULL;
>> +
>> + err = vxlan_tunnel_setup(ovs_dp_get_net(parms->dp), parms->name,
>> + parms->options, &vxlan_port);
>> + if (err)
>> + return ERR_PTR(err);
>> +
>> + vport = ovs_tnl_create(parms, &ovs_vxlan_vport_ops, &ovs_vxlan_tnl_ops);
>> +
>> + if (!vport)
>> + vxlan_tunnel_release(vxlan_port);
>
> ovs_tnl_create() does an ERR_PTR on its return value and therefore
> will never return NULL. We should use IS_ERR instead for this check.
More information about the dev
mailing list