[ovs-dev] [PATCHv3.2] Add support for LISP tunneling

Lorand Jakab lojakab at cisco.com
Fri Feb 22 05:52:04 UTC 2013


LISP is an experimental layer 3 tunneling protocol, described in RFC
6830.  This patch adds support for LISP tunneling.  Since LISP
encapsulated packets do not carry an Ethernet header, it is removed
before encapsulation, and added with hardcoded source and destination
MAC addresses after decapsulation.  The hardcoded MAC chosen for this
purpose is the locally administered address 02:00:00:00:00:00.  Flow
actions can be used to rewrite this MAC for correct reception.  As such,
this patch is intended to be used for static network configurations, or
with a LISP capable controller.
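
For illustration only, the header re-creation described above can be
sketched in userspace C.  This is not the kernel code from the patch: the
function name build_placeholder_eth() and the PLACEHOLDER_ETH_HLEN constant
are hypothetical, and the sketch works on a plain byte buffer rather than an
sk_buff.  It picks the EtherType from the IP version nibble of the inner
packet and fills both MAC fields with 02:00:00:00:00:00.

```c
#include <stdint.h>
#include <string.h>

#define PLACEHOLDER_ETH_HLEN 14	/* 6 (dst) + 6 (src) + 2 (EtherType) */

/*
 * Fill hdr with the placeholder Ethernet header for a decapsulated packet.
 * inner points at the first byte of the inner IP header.
 * Returns 0 on success, -1 if the inner packet is neither IPv4 nor IPv6.
 */
static int build_placeholder_eth(uint8_t hdr[PLACEHOLDER_ETH_HLEN],
				 const uint8_t *inner)
{
	uint16_t ethertype;

	switch (inner[0] >> 4) {	/* IP version nibble */
	case 4:
		ethertype = 0x0800;	/* ETH_P_IP */
		break;
	case 6:
		ethertype = 0x86DD;	/* ETH_P_IPV6 */
		break;
	default:
		return -1;		/* non-IP inner packets are dropped */
	}

	memset(hdr, 0, PLACEHOLDER_ETH_HLEN);
	hdr[0] = 0x02;			/* destination: 02:00:00:00:00:00 */
	hdr[6] = 0x02;			/* source:      02:00:00:00:00:00 */
	hdr[12] = (uint8_t)(ethertype >> 8);	/* EtherType, network order */
	hdr[13] = (uint8_t)(ethertype & 0xff);
	return 0;
}
```

A flow that matches dl_dst=02:00:00:00:00:00 can then rewrite the
destination to the real MAC of the receiving VM.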

Signed-off-by: Lorand Jakab <lojakab at cisco.com>
Signed-off-by: Kyle Mestery <kmestery at cisco.com>
---
Changes in v3.2:
  * Combine skb_push() and ethh assignment
  * Combine netdev_vport_is_XXX functions and use them for
    needs_dst_port
  * Fix endianness sparse warnings
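
For readers following the endianness fix, here is a minimal userspace
sketch of the tunnel ID to Instance ID mapping.  It assumes the tunnel ID
is held as a host-order uint64_t, which is a simplification: the patch
itself operates on a __be64 with separate shifts per byte order.  The
function names are hypothetical.

```c
#include <stdint.h>

/* Store the low 24 bits of a host-order tunnel ID as a 3-byte,
 * network-order Instance ID.  Higher bits are discarded. */
static void tun_id_to_iid(uint64_t tun_id, uint8_t iid[3])
{
	iid[0] = (uint8_t)(tun_id >> 16);
	iid[1] = (uint8_t)(tun_id >> 8);
	iid[2] = (uint8_t)tun_id;
}

/* Rebuild a host-order tunnel ID from a 3-byte Instance ID. */
static uint64_t iid_to_tun_id(const uint8_t iid[3])
{
	return ((uint64_t)iid[0] << 16) |
	       ((uint64_t)iid[1] << 8) |
	       (uint64_t)iid[2];
}
```

Because only 24 bits fit in the header, the round trip is lossless only
for tunnel IDs below 2^24.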

Changes in v3.1:
  * Include ethernet header in port statistics for consistency (Jarno)

Changes in v3 - address Jesse's review:
  * Move news item to post-1.10
  * Make destination UDP port configurable
  * Factor out VXLAN common function get_src_port() to tunnel.c
  * Byte swap on labels, rather than switch()
  * Combine __skb_pull() and skb_postpull_rcsum() into
    skb_pull_rcsum()
  * Don't send ICMP Port Unreachable when tunnel is not found
  * Move OVS_VPORT_TYPE_LISP into the non-upstream range
  * Other minor fixes
  * Rebase on latest master once more (CAPWAP removed)
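
As background for the factored-out source port helper, the sketch below
shows the scaling arithmetic in isolation.  It is a userspace
approximation: the function name and the explicit low/high parameters are
assumptions made for the example, whereas the kernel helper reads the range
with inet_get_local_port_range() and the hash from the flow.

```c
#include <stdint.h>

/*
 * Map a 32-bit flow hash onto the inclusive port range [low, high].
 * Multiplying by the range size and keeping the upper 32 bits of the
 * 64-bit product scales the hash without a division or modulo.
 */
static uint16_t scale_hash_to_port(uint32_t hash, int low, int high)
{
	uint32_t range = (uint32_t)(high - low) + 1;

	return (uint16_t)((((uint64_t)hash * range) >> 32) + low);
}
```

Packets of the same flow share a hash and therefore a source port, while
different flows spread across the range, which helps ECMP and other
hash-based load balancing in the underlay.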

Changes in v2.1:
  * Rebase on latest master

Changes in v2 - address Jarno's review:
  * Add support for network namespaces
  * Update header structure for better variable naming
  * Support for encap/decap IPv6, filter non-IPvX packets
  * Use .send hook instead of new .pre_tunnel
  * Other minor fixes
---
 AUTHORS                       |   1 +
 FAQ                           |   2 +-
 Makefile.am                   |   1 +
 NEWS                          |   3 +
 README                        |   2 +-
 README-lisp                   |  68 ++++++
 datapath/Modules.mk           |   1 +
 datapath/linux/.gitignore     |   1 +
 datapath/tunnel.c             |  15 ++
 datapath/tunnel.h             |   2 +
 datapath/vport-lisp.c         | 488 ++++++++++++++++++++++++++++++++++++++++++
 datapath/vport-vxlan.c        |  17 +-
 datapath/vport.c              |   1 +
 datapath/vport.h              |   1 +
 include/linux/openvswitch.h   |   1 +
 include/openflow/nicira-ext.h |   6 +-
 lib/dpif-linux.c              |   5 +
 lib/netdev-vport.c            |  34 +--
 vswitchd/vswitch.xml          |  17 +-
 19 files changed, 627 insertions(+), 39 deletions(-)
 create mode 100644 README-lisp
 create mode 100644 datapath/vport-lisp.c

diff --git a/AUTHORS b/AUTHORS
index 42358a6..14d7331 100644
--- a/AUTHORS
+++ b/AUTHORS
@@ -48,6 +48,7 @@ Keith Amidon            keith at nicira.com
 Krishna Kondaka         kkondaka at vmware.com
 Kyle Mestery            kmestery at cisco.com
 Leo Alterman            lalterman at nicira.com
+Lorand Jakab            lojakab at cisco.com
 Luca Giraudo            lgiraudo at nicira.com
 Martin Casado           casado at nicira.com
 Mehak Mahajan           mmahajan at nicira.com
diff --git a/FAQ b/FAQ
index 41e7c07..1203694 100644
--- a/FAQ
+++ b/FAQ
@@ -167,7 +167,7 @@ Q: What features are not available in the Open vSwitch kernel datapath
 
 A: The kernel module in upstream Linux 3.3 and later does not include
    tunnel virtual ports, that is, interfaces with type "gre",
-   "ipsec_gre", "gre64", "ipsec_gre64", or "vxlan".  It is
+   "ipsec_gre", "gre64", "ipsec_gre64", "vxlan", or "lisp".  It is
    possible to create tunnels in Linux and attach them to Open vSwitch
    as system devices.  However, they cannot be dynamically created
    through the OVSDB protocol or set the tunnel ids as a flow action.
diff --git a/Makefile.am b/Makefile.am
index 328b248..b6c13a3 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -56,6 +56,7 @@ EXTRA_DIST = \
 	OPENFLOW-1.1+ \
 	PORTING \
 	README-gcov \
+	README-lisp \
 	REPORTING-BUGS \
 	SubmittingPatches \
 	WHY-OVS \
diff --git a/NEWS b/NEWS
index e703685..f34ba68 100644
--- a/NEWS
+++ b/NEWS
@@ -3,6 +3,9 @@ post-v1.10.0
     - Stable bond mode has been removed.
     - The autopath action has been removed.
     - CAPWAP tunneling support removed.
+    - New support for the data encapsulation format of the LISP tunnel
+      protocol (RFC 6830).  An external control plane or manual flow
+      setup is required for EID-to-RLOC mapping.
 
 
 v1.10.0 - xx xxx xxxx
diff --git a/README b/README
index 0973b3c..f6ffa84 100644
--- a/README
+++ b/README
@@ -24,7 +24,7 @@ vSwitch supports the following features:
     * NIC bonding with or without LACP on upstream switch
     * NetFlow, sFlow(R), and mirroring for increased visibility
     * QoS (Quality of Service) configuration, plus policing
-    * GRE, GRE over IPSEC, and VXLAN tunneling
+    * GRE, GRE over IPSEC, VXLAN, and LISP tunneling
     * 802.1ag connectivity fault management
     * OpenFlow 1.0 plus numerous extensions
     * Transactional configuration database with C and Python bindings
diff --git a/README-lisp b/README-lisp
new file mode 100644
index 0000000..7c9071a
--- /dev/null
+++ b/README-lisp
@@ -0,0 +1,68 @@
+Using LISP tunneling
+====================
+
+LISP is a layer 3 tunneling mechanism, meaning that encapsulated packets do
+not carry Ethernet headers, and ARP requests shouldn't be sent over the
+tunnel.  Because of this, there are some additional steps required for setting
+up LISP tunnels in Open vSwitch, until support for L3 tunnels improves.
+
+This guide assumes a point-to-point tunnel between two VMs connected to OVS
+bridges on different hypervisors connected via IPv4.  Of course, more than one
+VM may be connected to any of the hypervisors, using the same LISP tunnel, and
+a hypervisor may be connected to several hypervisors over different LISP
+tunnels.
+
+There are several scenarios:
+
+  1) the VMs have IP addresses in the same subnet and the hypervisors are also
+     in a single subnet (although a different one from the VMs');
+  2) the VMs have IP addresses in the same subnet but the hypervisors are
+     separated by a router;
+  3) the VMs are in different subnets.
+
+In cases 1) and 3) ARP resolution can work as normal: ARP traffic is
+configured not to go through the LISP tunnel.  For case 1) ARP is able to
+reach the other VM, if both OVS instances default to MAC address learning.
+Case 3) requires the hypervisor be configured as the default router for the
+VMs.
+
+In case 2) the VMs expect ARP replies from each other, but this is not
+possible over a layer 3 tunnel.  One solution is to have static MAC address
+entries preconfigured on the VMs (e.g., `arp -f /etc/ethers` on startup on
+Unix based VMs), or have the hypervisor do proxy ARP.
+
+On the receiving side, the packet arrives without the original MAC header.
+The LISP tunneling code attaches a header with hardcoded source and destination
+MAC address 02:00:00:00:00:00.  This address has all bits set to 0, except the
+locally administered bit, in order to avoid potential collisions with existing
+allocations.  In order for packets to reach their intended destination, the
+destination MAC address needs to be rewritten.  This can be done using the
+flow table.
+
+See below for an example setup, and the associated flow rules to enable LISP
+tunneling.
+
+               +---+                               +---+
+               |VM1|                               |VM2|
+               +---+                               +---+
+                 |                                   |
+            +--[tap0]--+                       +--[tap0]---+
+            |          |                       |           |
+        [lisp0] OVS1 [eth0]-----------------[eth0] OVS2 [lisp0]
+            |          |                       |           |
+            +----------+                       +-----------+
+
+On each hypervisor, interfaces tap0, eth0, and lisp0 are added to a single
+bridge instance, and become numbered 1, 2, and 3 respectively:
+
+    ovs-vsctl add-br br0
+    ovs-vsctl add-port br0 tap0
+    ovs-vsctl add-port br0 eth0
+    ovs-vsctl add-port br0 lisp0 -- set Interface lisp0 type=lisp options:remote_ip=<OVSx_IP>
+
+Flows on br0 are configured as follows:
+
+    priority=3,dl_dst=02:00:00:00:00:00,action=mod_dl_dst:<VMx_MAC>,output:1
+    priority=2,in_port=1,dl_type=0x0806,action=NORMAL
+    priority=1,in_port=1,dl_type=0x0800,vlan_tci=0,nw_src=<EID_prefix>,action=output:3
+    priority=0,action=NORMAL
diff --git a/datapath/Modules.mk b/datapath/Modules.mk
index d04750b..9941123 100644
--- a/datapath/Modules.mk
+++ b/datapath/Modules.mk
@@ -18,6 +18,7 @@ openvswitch_sources = \
 	vport.c \
 	vport-gre.c \
 	vport-internal_dev.c \
+	vport-lisp.c \
 	vport-netdev.c \
 	vport-vxlan.c
 
diff --git a/datapath/linux/.gitignore b/datapath/linux/.gitignore
index d5d063a..16fbc8a 100644
--- a/datapath/linux/.gitignore
+++ b/datapath/linux/.gitignore
@@ -34,6 +34,7 @@
 /vport-generic.c
 /vport-gre.c
 /vport-internal_dev.c
+/vport-lisp.c
 /vport-netdev.c
 /vport-patch.c
 /vport-vxlan.c
diff --git a/datapath/tunnel.c b/datapath/tunnel.c
index 3964208..a05cf54 100644
--- a/datapath/tunnel.c
+++ b/datapath/tunnel.c
@@ -512,6 +512,21 @@ free_frags:
 	return sent_len;
 }
 
+/* Compute source UDP port for outgoing packet.
+ * Currently we use the flow hash.
+ */
+u16 ovs_tnl_get_src_port(struct sk_buff *skb)
+{
+	int low;
+	int high;
+	unsigned int range;
+	u32 hash = OVS_CB(skb)->flow->hash;
+
+	inet_get_local_port_range(&low, &high);
+	range = (high - low) + 1;
+	return (((u64) hash * range) >> 32) + low;
+}
+
 int ovs_tnl_send(struct vport *vport, struct sk_buff *skb)
 {
 	struct tnl_vport *tnl_vport = tnl_vport_priv(vport);
diff --git a/datapath/tunnel.h b/datapath/tunnel.h
index 54c34ef..2bdfe91 100644
--- a/datapath/tunnel.h
+++ b/datapath/tunnel.h
@@ -42,6 +42,7 @@
 #define TNL_T_PROTO_GRE		0
 #define TNL_T_PROTO_GRE64	1
 #define TNL_T_PROTO_VXLAN	3
+#define TNL_T_PROTO_LISP	4
 
 /* These flags are only needed when calling tnl_find_port(). */
 #define TNL_T_KEY_EXACT		(1 << 10)
@@ -158,6 +159,7 @@ int ovs_tnl_get_options(const struct vport *, struct sk_buff *);
 const char *ovs_tnl_get_name(const struct vport *vport);
 int ovs_tnl_send(struct vport *vport, struct sk_buff *skb);
 void ovs_tnl_rcv(struct vport *vport, struct sk_buff *skb);
+u16 ovs_tnl_get_src_port(struct sk_buff *skb);
 
 struct vport *ovs_tnl_find_port(struct net *net, __be32 saddr, __be32 daddr,
 				__be64 key, int tunnel_type,
diff --git a/datapath/vport-lisp.c b/datapath/vport-lisp.c
new file mode 100644
index 0000000..0f01395
--- /dev/null
+++ b/datapath/vport-lisp.c
@@ -0,0 +1,488 @@
+/*
+ * Copyright (c) 2011 Nicira, Inc.
+ * Copyright (c) 2013 Cisco Systems, Inc.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+ * 02110-1301, USA
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/version.h>
+#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,26)
+
+#include <linux/in.h>
+#include <linux/ip.h>
+#include <linux/list.h>
+#include <linux/net.h>
+#include <linux/udp.h>
+
+#include <net/icmp.h>
+#include <net/ip.h>
+#include <net/udp.h>
+
+#include "datapath.h"
+#include "tunnel.h"
+#include "vport.h"
+
+
+/*
+ *  LISP encapsulation header:
+ *
+ *  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ *  |N|L|E|V|I|flags|            Nonce/Map-Version                  |
+ *  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ *  |                 Instance ID/Locator Status Bits               |
+ *  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ *
+ */
+
+/**
+ * struct lisphdr - LISP header
+ * @nonce_present: Flag indicating the presence of a 24 bit nonce value.
+ * @locator_status_bits_present: Flag indicating the presence of Locator Status
+ *                               Bits (LSB).
+ * @solicit_echo_nonce: Flag indicating the use of the echo noncing mechanism.
+ * @map_version_present: Flag indicating the use of mapping versioning.
+ * @instance_id_present: Flag indicating the presence of a 24 bit Instance ID.
+ * @reserved_flags: 3 bits reserved for future flags.
+ * @nonce: 24 bit nonce value.
+ * @map_version: 24 bit mapping version.
+ * @locator_status_bits: Locator Status Bits: 32 bits when instance_id_present
+ *                       is not set, 8 bits when it is.
+ * @instance_id: 24 bit Instance ID
+ */
+struct lisphdr {
+#ifdef __LITTLE_ENDIAN_BITFIELD
+	__u8 reserved_flags:3;
+	__u8 instance_id_present:1;
+	__u8 map_version_present:1;
+	__u8 solicit_echo_nonce:1;
+	__u8 locator_status_bits_present:1;
+	__u8 nonce_present:1;
+#else
+	__u8 nonce_present:1;
+	__u8 locator_status_bits_present:1;
+	__u8 solicit_echo_nonce:1;
+	__u8 map_version_present:1;
+	__u8 instance_id_present:1;
+	__u8 reserved_flags:3;
+#endif
+	union {
+		__u8 nonce[3];
+		__u8 map_version[3];
+	} u1;
+	union {
+		__be32 locator_status_bits;
+		struct {
+			__u8 instance_id[3];
+			__u8 locator_status_bits;
+		} word2;
+	} u2;
+};
+
+#define LISP_HLEN (sizeof(struct udphdr) + sizeof(struct lisphdr))
+
+static inline int lisp_hdr_len(const struct tnl_mutable_config *mutable,
+			       const struct ovs_key_ipv4_tunnel *tun_key)
+{
+	return LISP_HLEN;
+}
+
+/**
+ * struct lisp_port - Keeps track of open UDP ports
+ * @list: list element.
+ * @port: The UDP port number in network byte order.
+ * @lisp_rcv_socket: The socket created for this port number.
+ * @count: How many ports are using this socket/port.
+ */
+struct lisp_port {
+	struct list_head list;
+	__be16 port;
+	struct socket *lisp_rcv_socket;
+	int count;
+};
+
+static LIST_HEAD(lisp_ports);
+
+static struct lisp_port *lisp_port_exists(struct net *net, __be16 port)
+{
+	struct lisp_port *lisp_port;
+
+	list_for_each_entry(lisp_port, &lisp_ports, list) {
+		if (lisp_port->port == port &&
+			net_eq(sock_net(lisp_port->lisp_rcv_socket->sk), net))
+			return lisp_port;
+	}
+
+	return NULL;
+}
+
+static inline struct lisphdr *lisp_hdr(const struct sk_buff *skb)
+{
+	return (struct lisphdr *)(udp_hdr(skb) + 1);
+}
+
+static int lisp_tnl_send(struct vport *vport, struct sk_buff *skb)
+{
+	int tnl_len;
+	int network_offset = skb_network_offset(skb);
+
+	/* We only encapsulate IPv4 and IPv6 packets */
+	switch (skb->protocol) {
+	case htons(ETH_P_IP):
+	case htons(ETH_P_IPV6):
+		/* Pop off "inner" Ethernet header */
+		skb_pull(skb, network_offset);
+		tnl_len = ovs_tnl_send(vport, skb);
+		return tnl_len > 0 ? tnl_len + network_offset : tnl_len;
+	default:
+		kfree_skb(skb);
+		return 0;
+	}
+}
+
+/* Convert 64 bit tunnel ID to 24 bit Instance ID. */
+static void tunnel_id_to_instance_id(__be64 tun_id, __u8 *iid)
+{
+
+#ifdef __BIG_ENDIAN
+	iid[0] = (__force __u8)(tun_id >> 16);
+	iid[1] = (__force __u8)(tun_id >> 8);
+	iid[2] = (__force __u8)tun_id;
+#else
+	iid[0] = (__force __u8)((__force u64)tun_id >> 40);
+	iid[1] = (__force __u8)((__force u64)tun_id >> 48);
+	iid[2] = (__force __u8)((__force u64)tun_id >> 56);
+#endif
+}
+
+/* Convert 24 bit Instance ID to 64 bit tunnel ID. */
+static __be64 instance_id_to_tunnel_id(__u8 *iid)
+{
+#ifdef __BIG_ENDIAN
+	return (iid[0] << 16) | (iid[1] << 8) | iid[2];
+#else
+	return (__force __be64)(((__force u64)iid[0] << 40) |
+				((__force u64)iid[1] << 48) |
+				((__force u64)iid[2] << 56));
+#endif
+}
+
+static struct sk_buff *lisp_build_header(const struct vport *vport,
+					 const struct tnl_mutable_config *mutable,
+					 struct dst_entry *dst,
+					 struct sk_buff *skb,
+					 int tunnel_hlen)
+{
+	struct udphdr *udph = udp_hdr(skb);
+	struct lisphdr *lisph = (struct lisphdr *)(udph + 1);
+	const struct ovs_key_ipv4_tunnel *tun_key = OVS_CB(skb)->tun_key;
+	__be64 out_key;
+	u32 flags;
+
+	tnl_get_param(mutable, tun_key, &flags, &out_key);
+
+	udph->dest = mutable->dst_port;
+	udph->source = htons(ovs_tnl_get_src_port(skb));
+	udph->check = 0;
+	udph->len = htons(skb->len - skb_transport_offset(skb));
+
+	lisph->nonce_present = 0;	/* We don't support echo nonce algorithm */
+	lisph->locator_status_bits_present = 1;	/* Set LSB */
+	lisph->solicit_echo_nonce = 0;	/* No echo noncing */
+	lisph->map_version_present = 0;	/* No mapping versioning, nonce instead */
+	lisph->instance_id_present = 1;	/* Store the tun_id as Instance ID  */
+	lisph->reserved_flags = 0;	/* Reserved flags, set to 0  */
+
+	lisph->u1.nonce[0] = 0;
+	lisph->u1.nonce[1] = 0;
+	lisph->u1.nonce[2] = 0;
+
+	tunnel_id_to_instance_id(out_key, &lisph->u2.word2.instance_id[0]);
+	lisph->u2.word2.locator_status_bits = 1;
+
+	/*
+	 * Allow our local IP stack to fragment the outer packet even if the
+	 * DF bit is set as a last resort.  We also need to force selection of
+	 * an IP ID here because Linux will otherwise leave it at 0 if the
+	 * packet originally had DF set.
+	 */
+	skb->local_df = 1;
+	__ip_select_ident(ip_hdr(skb), dst, 0);
+
+	return skb;
+}
+
+/* Called with rcu_read_lock and BH disabled. */
+static int lisp_rcv(struct sock *sk, struct sk_buff *skb)
+{
+	struct vport *vport;
+	struct lisphdr *lisph;
+	const struct tnl_mutable_config *mutable;
+	struct iphdr *iph, *inner_iph;
+	struct ovs_key_ipv4_tunnel tun_key;
+	__be64 key;
+	u32 tunnel_flags = 0;
+	struct ethhdr *ethh;
+	__be16 protocol;
+
+	if (unlikely(!pskb_may_pull(skb, LISP_HLEN)))
+		goto error;
+
+	lisph = lisp_hdr(skb);
+
+	skb_pull_rcsum(skb, LISP_HLEN);
+
+	if (lisph->instance_id_present != 1)
+		key = 0;
+	else
+		key = instance_id_to_tunnel_id(&lisph->u2.word2.instance_id[0]);
+
+	iph = ip_hdr(skb);
+	vport = ovs_tnl_find_port(dev_net(skb->dev), iph->daddr, iph->saddr,
+		key, TNL_T_PROTO_LISP, &mutable);
+	if (unlikely(!vport))
+		goto error;
+
+	if (mutable->flags & TNL_F_IN_KEY_MATCH || !mutable->key.daddr)
+		tunnel_flags = OVS_TNL_F_KEY;
+	else
+		key = 0;
+
+	/* Save outer tunnel values */
+	tnl_tun_key_init(&tun_key, iph, key, tunnel_flags);
+	OVS_CB(skb)->tun_key = &tun_key;
+
+	/* Drop non-IP inner packets */
+	inner_iph = (struct iphdr *)(lisph + 1);
+	switch (inner_iph->version) {
+	case 4:
+		protocol = htons(ETH_P_IP);
+		break;
+	case 6:
+		protocol = htons(ETH_P_IPV6);
+		break;
+	default:
+		goto error;
+	}
+
+	/* Add Ethernet header */
+	ethh = (struct ethhdr *)skb_push(skb, ETH_HLEN);
+	memset(ethh, 0, ETH_HLEN);
+	ethh->h_dest[0] = 0x02;
+	ethh->h_source[0] = 0x02;
+	ethh->h_proto = protocol;
+
+	ovs_tnl_rcv(vport, skb);
+	goto out;
+
+error:
+	kfree_skb(skb);
+out:
+	return 0;
+}
+
+/* Arbitrary value.  Irrelevant as long as it's not 0 since we set the handler. */
+#define UDP_ENCAP_LISP 1
+static int lisp_socket_init(struct lisp_port *lisp_port, struct net *net)
+{
+	int err;
+	struct sockaddr_in sin;
+
+	err = sock_create_kern(AF_INET, SOCK_DGRAM, 0,
+			       &lisp_port->lisp_rcv_socket);
+	if (err)
+		goto error;
+
+	/* release net ref. */
+	sk_change_net(lisp_port->lisp_rcv_socket->sk, net);
+
+	sin.sin_family = AF_INET;
+	sin.sin_addr.s_addr = htonl(INADDR_ANY);
+	sin.sin_port = lisp_port->port;
+
+	err = kernel_bind(lisp_port->lisp_rcv_socket, (struct sockaddr *)&sin,
+			  sizeof(struct sockaddr_in));
+	if (err)
+		goto error_sock;
+
+	udp_sk(lisp_port->lisp_rcv_socket->sk)->encap_type = UDP_ENCAP_LISP;
+	udp_sk(lisp_port->lisp_rcv_socket->sk)->encap_rcv = lisp_rcv;
+
+	udp_encap_enable();
+
+	return 0;
+
+error_sock:
+	sk_release_kernel(lisp_port->lisp_rcv_socket->sk);
+error:
+	pr_warn("cannot register lisp protocol handler: %d\n", err);
+	return err;
+}
+
+static void lisp_tunnel_release(struct lisp_port *lisp_port)
+{
+	lisp_port->count--;
+
+	if (lisp_port->count == 0) {
+		/* Release old socket */
+		sk_release_kernel(lisp_port->lisp_rcv_socket->sk);
+		list_del(&lisp_port->list);
+		kfree(lisp_port);
+	}
+}
+
+static int lisp_tunnel_setup(struct net *net, struct nlattr *options,
+			     struct lisp_port **lport)
+{
+	struct nlattr *a;
+	int err;
+	u16 dst_port;
+	struct lisp_port *lisp_port = NULL;
+
+	*lport = NULL;
+
+	if (!options) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	a = nla_find_nested(options, OVS_TUNNEL_ATTR_DST_PORT);
+	if (a && nla_len(a) == sizeof(u16)) {
+		dst_port = nla_get_u16(a);
+	} else {
+		/* Require destination port from userspace. */
+		err = -EINVAL;
+		goto out;
+	}
+
+	/* Verify if we already have a socket created for this port */
+	lisp_port = lisp_port_exists(net, htons(dst_port));
+	if (lisp_port) {
+		lisp_port->count++;
+		err = 0;
+		*lport = lisp_port;
+		goto out;
+	}
+
+	/* Add a new socket for this port */
+	lisp_port = kzalloc(sizeof(struct lisp_port), GFP_KERNEL);
+	if (!lisp_port) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	lisp_port->port = htons(dst_port);
+	lisp_port->count = 1;
+	list_add_tail(&lisp_port->list, &lisp_ports);
+
+	err = lisp_socket_init(lisp_port, net);
+	if (err)
+		goto error;
+
+	*lport = lisp_port;
+	goto out;
+
+error:
+	list_del(&lisp_port->list);
+	kfree(lisp_port);
+out:
+	return err;
+}
+
+static int lisp_tnl_set_options(struct vport *vport, struct nlattr *options)
+{
+	int err;
+	struct net *net = ovs_dp_get_net(vport->dp);
+	struct tnl_vport *tnl_vport = tnl_vport_priv(vport);
+	struct tnl_mutable_config *config;
+	struct lisp_port *old_port = NULL;
+	struct lisp_port *lisp_port = NULL;
+
+	config = rtnl_dereference(tnl_vport->mutable);
+
+	old_port = lisp_port_exists(net, config->dst_port);
+
+	err = lisp_tunnel_setup(net, options, &lisp_port);
+	if (err)
+		goto out;
+
+	err = ovs_tnl_set_options(vport, options);
+
+	if (err)
+		lisp_tunnel_release(lisp_port);
+	else {
+		/* Release old socket */
+		lisp_tunnel_release(old_port);
+	}
+out:
+	return err;
+}
+
+static const struct tnl_ops ovs_lisp_tnl_ops = {
+	.tunnel_type	= TNL_T_PROTO_LISP,
+	.ipproto	= IPPROTO_UDP,
+	.hdr_len	= lisp_hdr_len,
+	.build_header	= lisp_build_header,
+};
+
+static void lisp_tnl_destroy(struct vport *vport)
+{
+	struct lisp_port *lisp_port;
+	struct tnl_vport *tnl_vport = tnl_vport_priv(vport);
+	struct tnl_mutable_config *config;
+
+	config = rtnl_dereference(tnl_vport->mutable);
+
+	lisp_port = lisp_port_exists(ovs_dp_get_net(vport->dp),
+				     config->dst_port);
+
+	lisp_tunnel_release(lisp_port);
+
+	ovs_tnl_destroy(vport);
+}
+
+static struct vport *lisp_tnl_create(const struct vport_parms *parms)
+{
+	int err;
+	struct vport *vport;
+	struct lisp_port *lisp_port = NULL;
+
+	err = lisp_tunnel_setup(ovs_dp_get_net(parms->dp), parms->options,
+				&lisp_port);
+	if (err)
+		return ERR_PTR(err);
+
+	vport = ovs_tnl_create(parms, &ovs_lisp_vport_ops, &ovs_lisp_tnl_ops);
+
+	if (IS_ERR(vport))
+		lisp_tunnel_release(lisp_port);
+
+	return vport;
+}
+
+const struct vport_ops ovs_lisp_vport_ops = {
+	.type		= OVS_VPORT_TYPE_LISP,
+	.flags		= VPORT_F_TUN_ID,
+	.create		= lisp_tnl_create,
+	.destroy	= lisp_tnl_destroy,
+	.get_name	= ovs_tnl_get_name,
+	.get_options	= ovs_tnl_get_options,
+	.set_options	= lisp_tnl_set_options,
+	.send		= lisp_tnl_send,
+};
+#else
+#warning LISP tunneling will not be available on kernels before 2.6.26
+#endif /* Linux kernel < 2.6.26 */
diff --git a/datapath/vport-vxlan.c b/datapath/vport-vxlan.c
index 413452e..388d9fb 100644
--- a/datapath/vport-vxlan.c
+++ b/datapath/vport-vxlan.c
@@ -90,21 +90,6 @@ static inline struct vxlanhdr *vxlan_hdr(const struct sk_buff *skb)
 	return (struct vxlanhdr *)(udp_hdr(skb) + 1);
 }
 
-/* Compute source port for outgoing packet.
- * Currently we use the flow hash.
- */
-static u16 get_src_port(struct sk_buff *skb)
-{
-	int low;
-	int high;
-	unsigned int range;
-	u32 hash = OVS_CB(skb)->flow->hash;
-
-        inet_get_local_port_range(&low, &high);
-        range = (high - low) + 1;
-	return (((u64) hash * range) >> 32) + low;
-}
-
 static struct sk_buff *vxlan_build_header(const struct vport *vport,
 					  const struct tnl_mutable_config *mutable,
 					  struct dst_entry *dst,
@@ -120,7 +105,7 @@ static struct sk_buff *vxlan_build_header(const struct vport *vport,
 	tnl_get_param(mutable, tun_key, &flags, &out_key);
 
 	udph->dest = mutable->dst_port;
-	udph->source = htons(get_src_port(skb));
+	udph->source = htons(ovs_tnl_get_src_port(skb));
 	udph->check = 0;
 	udph->len = htons(skb->len - skb_transport_offset(skb));
 
diff --git a/datapath/vport.c b/datapath/vport.c
index 149201c..0a0835e 100644
--- a/datapath/vport.c
+++ b/datapath/vport.c
@@ -43,6 +43,7 @@ static const struct vport_ops *base_vport_ops_list[] = {
 	&ovs_gre64_vport_ops,
 #if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,26)
 	&ovs_vxlan_vport_ops,
+	&ovs_lisp_vport_ops,
 #endif
 };
 
diff --git a/datapath/vport.h b/datapath/vport.h
index 57662ee..94c9e97 100644
--- a/datapath/vport.h
+++ b/datapath/vport.h
@@ -225,5 +225,6 @@ extern const struct vport_ops ovs_internal_vport_ops;
 extern const struct vport_ops ovs_gre_vport_ops;
 extern const struct vport_ops ovs_gre64_vport_ops;
 extern const struct vport_ops ovs_vxlan_vport_ops;
+extern const struct vport_ops ovs_lisp_vport_ops;
 
 #endif /* vport.h */
diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
index 7ee31a2..63d1cac 100644
--- a/include/linux/openvswitch.h
+++ b/include/linux/openvswitch.h
@@ -186,6 +186,7 @@ enum ovs_vport_type {
 	OVS_VPORT_TYPE_GRE,	 /* GRE tunnel. */
 	OVS_VPORT_TYPE_VXLAN,    /* VXLAN tunnel */
 	OVS_VPORT_TYPE_GRE64 = 104, /* GRE tunnel with 64-bit keys */
+	OVS_VPORT_TYPE_LISP = 105,  /* LISP tunnel */
 	__OVS_VPORT_TYPE_MAX
 };
 
diff --git a/include/openflow/nicira-ext.h b/include/openflow/nicira-ext.h
index cb02b1e..c4ff904 100644
--- a/include/openflow/nicira-ext.h
+++ b/include/openflow/nicira-ext.h
@@ -1537,9 +1537,9 @@ OFP_ASSERT(sizeof(struct nx_action_output_reg) == 24);
 
 /* Tunnel ID.
  *
- * For a packet received via a GRE or VXLAN tunnel including a (32-bit) key, the
- * key is stored in the low 32-bits and the high bits are zeroed.  For other
- * packets, the value is 0.
+ * For a packet received via a GRE, VXLAN or LISP tunnel including a (32-bit)
+ * key, the key is stored in the low 32-bits and the high bits are zeroed.  For
+ * other packets, the value is 0.
  *
  * All zero bits, for packets not received via a keyed tunnel.
  *
diff --git a/lib/dpif-linux.c b/lib/dpif-linux.c
index 0f61453..1b23b39 100644
--- a/lib/dpif-linux.c
+++ b/lib/dpif-linux.c
@@ -442,6 +442,9 @@ get_vport_type(const struct dpif_linux_vport *vport)
     case OVS_VPORT_TYPE_VXLAN:
         return "vxlan";
 
+    case OVS_VPORT_TYPE_LISP:
+        return "lisp";
+
     case OVS_VPORT_TYPE_UNSPEC:
     case __OVS_VPORT_TYPE_MAX:
         break;
@@ -467,6 +470,8 @@ netdev_to_ovs_vport_type(const struct netdev *netdev)
         return OVS_VPORT_TYPE_GRE;
     } else if (!strcmp(type, "vxlan")) {
         return OVS_VPORT_TYPE_VXLAN;
+    } else if (!strcmp(type, "lisp")) {
+        return OVS_VPORT_TYPE_LISP;
     } else {
         return OVS_VPORT_TYPE_UNSPEC;
     }
diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c
index 88817ac..d696404 100644
--- a/lib/netdev-vport.c
+++ b/lib/netdev-vport.c
@@ -44,6 +44,8 @@ VLOG_DEFINE_THIS_MODULE(netdev_vport);
 /* Default to the OTV port, per the VXLAN IETF draft. */
 #define VXLAN_DST_PORT 8472
 
+#define LISP_DST_PORT 4341
+
 #define DEFAULT_TTL 64
 
 struct netdev_dev_vport {
@@ -112,14 +114,13 @@ netdev_vport_is_patch(const struct netdev *netdev)
 }
 
 static bool
-netdev_vport_is_vxlan(const struct netdev *netdev)
+netdev_vport_needs_dst_port(const struct netdev_dev *dev)
 {
-    const struct netdev_dev *dev = netdev_get_dev(netdev);
     const struct netdev_class *class = netdev_dev_get_class(dev);
     const char *type = netdev_dev_get_type(dev);
 
-    return (class->get_config == get_tunnel_config
-            && !strcmp("vxlan", type));
+    return (class->get_config == get_tunnel_config &&
+            (!strcmp("vxlan", type) || !strcmp("lisp", type)));
 }
 
 const char *
@@ -129,20 +130,21 @@ netdev_vport_get_dpif_port(const struct netdev *netdev)
     const struct netdev_class *class = netdev_dev_get_class(dev);
     const char *dpif_port;
 
-    if (netdev_vport_is_vxlan(netdev)) {
+    if (netdev_vport_needs_dst_port(dev)) {
         const struct netdev_dev_vport *vport = netdev_vport_get_dev(netdev);
         const char *type = netdev_dev_get_type(dev);
-        static char dpif_port_vxlan[IFNAMSIZ];
+        static char dpif_port_combined[IFNAMSIZ];
 
         /*
          * Note: IFNAMSIZ is 16 bytes long. The maximum length of a VXLAN
-         * port name below is 15 bytes. Still, assert here on the size of
-         * strlen(type) in case that changes in the future.
+         * or LISP port name below is 15 or 14 bytes respectively. Still,
+         * assert here on the size of strlen(type) in case that changes
+         * in the future.
          */
         ovs_assert(strlen(type) + 10 < IFNAMSIZ);
-        snprintf(dpif_port_vxlan, IFNAMSIZ, "%s_sys_%d", type,
+        snprintf(dpif_port_combined, IFNAMSIZ, "%s_sys_%d", type,
                  ntohs(vport->tnl_cfg.dst_port));
-        return dpif_port_vxlan;
+        return dpif_port_combined;
     } else {
         dpif_port = (is_vport_class(class)
                      ? vport_class_cast(class)->dpif_port
@@ -318,7 +320,7 @@ set_tunnel_config(struct netdev_dev *dev_, const struct smap *args)
     ipsec_mech_set = false;
     memset(&tnl_cfg, 0, sizeof tnl_cfg);
 
-    needs_dst_port = !strcmp(type, "vxlan");
+    needs_dst_port = netdev_vport_needs_dst_port(dev_);
     tnl_cfg.ipsec = strstr(type, "ipsec");
     tnl_cfg.dont_fragment = true;
 
@@ -403,10 +405,15 @@ set_tunnel_config(struct netdev_dev *dev_, const struct smap *args)
     }
 
     /* Add a default destination port for VXLAN if none specified. */
-    if (needs_dst_port && !tnl_cfg.dst_port) {
+    if (!strcmp(type, "vxlan") && !tnl_cfg.dst_port) {
         tnl_cfg.dst_port = htons(VXLAN_DST_PORT);
     }
 
+    /* Add a default destination port for LISP if none specified. */
+    if (!strcmp(type, "lisp") && !tnl_cfg.dst_port) {
+        tnl_cfg.dst_port = htons(LISP_DST_PORT);
+    }
+
     if (tnl_cfg.ipsec) {
         static pid_t pid = 0;
         if (pid <= 0) {
@@ -686,7 +693,8 @@ netdev_vport_tunnel_register(void)
         TUNNEL_CLASS("ipsec_gre", "gre_system"),
         TUNNEL_CLASS("gre64", "gre64_system"),
         TUNNEL_CLASS("ipsec_gre64", "gre64_system"),
-        TUNNEL_CLASS("vxlan", "vxlan_system")
+        TUNNEL_CLASS("vxlan", "vxlan_system"),
+        TUNNEL_CLASS("lisp", "lisp_system")
     };
 
     int i;
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index 4911998..f52b9cc 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -1249,6 +1249,13 @@
 	    </p>
           </dd>
 
+          <dt><code>lisp</code></dt>
+          <dd>
+            A layer 3 tunnel over the experimental, UDP-based Locator/ID
+            Separation Protocol (RFC 6830). LISP is currently supported only
+            with the Linux kernel datapath with kernel version 2.6.26 or later.
+          </dd>
+
           <dt><code>patch</code></dt>
           <dd>
             A pair of virtual devices that act as a patch cable.
@@ -1265,7 +1272,7 @@
       <p>
         These options apply to interfaces with <ref column="type"/> of
         <code>gre</code>, <code>ipsec_gre</code>, <code>gre64</code>,
-        <code>ipsec_gre64</code>, and <code>vxlan</code>.
+        <code>ipsec_gre64</code>, <code>vxlan</code>, and <code>lisp</code>.
       </p>
 
       <p>
@@ -1308,8 +1315,8 @@
             key="in_key"/> at all.
           </li>
           <li>
-            A positive 24-bit (for VXLAN), 32-bit (for GRE) or 64-bit (for
-            GRE64) number.  The tunnel receives only packets with the
+            A positive 24-bit (for VXLAN and LISP), 32-bit (for GRE) or 64-bit
+            (for GRE64) number.  The tunnel receives only packets with the
             specified key.
           </li>
           <li>
@@ -1335,8 +1342,8 @@
             key="out_key"/> at all.
           </li>
           <li>
-            A positive 24-bit (for VXLAN), 32-bit (for GRE) or 64-bit (for
-            GRE64) number.  Packets sent through the tunnel will have the
+            A positive 24-bit (for VXLAN and LISP), 32-bit (for GRE) or 64-bit
+            (for GRE64) number.  Packets sent through the tunnel will have the
             specified key.
           </li>
           <li>
-- 
1.7.12.4



