[ovs-dev] [PATCH 10/11] tunnel: Geneve TLV handling support for OpenFlow.

Jesse Gross jesse at nicira.com
Fri Jun 19 23:13:24 UTC 2015


The current support for Geneve in OVS is exactly equivalent to VXLAN:
it is possible to set and match on the VNI but not on any options
contained in the header. This patch enables the use of options.

The goal for Geneve support is not to add support for any particular option
but to allow end users or controllers to specify what they would like to
match. That is, the full range of Geneve's capabilities should be exposed
without modifying the code (the one exception being options that require
per-packet computation in the fast path).

The main issue with supporting Geneve options is how to integrate the
fields into the existing OpenFlow pipeline. All existing operations
are referred to by their NXM/OXM field name - matches, action generation,
arithmetic operations (e.g. transfer to a register). However, the Geneve
option space is exactly the same as the OXM space, so a direct mapping
is not feasible. Instead, we create a pool of 64 NXMs that are then
dynamically mapped onto Geneve option TLVs using OpenFlow. Once mapped,
these fields become first-class citizens in the OpenFlow pipeline.
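
To make the mapping idea concrete, here is a minimal sketch of a slot table
that binds a fixed pool of fields to (class, type) pairs. It is not the code
in this patch: the names sketch_slot, sketch_table, sketch_lookup and
sketch_map_add are hypothetical, and a linear scan stands in for the hash map
and allocation bitmap that lib/tun-metadata.c uses. The length check assumes
that option data is a multiple of 4 bytes and at most 124 bytes.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NUM_SLOTS 64     /* Pool size from the description above. */
#define SKETCH_MAX_OPT_LEN 124  /* Per-field limit stated for tunnelMD. */

/* Hypothetical slot: the Geneve option that tun_metadata<N> refers to. */
struct sketch_slot {
    bool valid;
    uint16_t opt_class;
    uint8_t type;
    uint8_t len;                /* Option data length in bytes. */
};

struct sketch_table {
    struct sketch_slot slots[SKETCH_NUM_SLOTS];
};

/* Returns the slot index mapped to (opt_class, type), or -1 if none. */
int
sketch_lookup(const struct sketch_table *t, uint16_t opt_class, uint8_t type)
{
    for (int i = 0; i < SKETCH_NUM_SLOTS; i++) {
        const struct sketch_slot *s = &t->slots[i];

        if (s->valid && s->opt_class == opt_class && s->type == type) {
            return i;
        }
    }
    return -1;
}

/* Binds slot 'idx' to (opt_class, type).  Returns 0 on success, or -1 if the
 * index is out of range, the length is invalid, the slot is already in use,
 * or the option is already mapped to some slot. */
int
sketch_map_add(struct sketch_table *t, unsigned int idx,
               uint16_t opt_class, uint8_t type, uint8_t len)
{
    if (idx >= SKETCH_NUM_SLOTS || len % 4 || len > SKETCH_MAX_OPT_LEN) {
        return -1;
    }
    if (t->slots[idx].valid || sketch_lookup(t, opt_class, type) >= 0) {
        return -1;
    }

    t->slots[idx].valid = true;
    t->slots[idx].opt_class = opt_class;
    t->slots[idx].type = type;
    t->slots[idx].len = len;
    return 0;
}
```

With a zero-initialized table, sketch_map_add(&t, 0, 0xffff, 0, 4) binds slot
0, after which sketch_lookup(&t, 0xffff, 0) returns 0. A second attempt to map
the same option to another slot is rejected.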

An example of how to use Geneve options:
ovs-ofctl add-geneve-map br0 "{class=0xffff,type=0,len=4}->tun_metadata0"
ovs-ofctl add-flow br0 "in_port=LOCAL,actions=set_field:0xffffffff->tun_metadata0,1"

This adds a 4-byte option (filled with all 1's) to all packets
coming from the LOCAL port and then sends them out port 1.
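
For reference, the option generated by this example would look roughly as
follows on the wire. This is a sketch only: sketch_encode_geneve_opt is a
hypothetical helper, not a function in this patch, and the byte layout
(16-bit class, 8-bit type, length counted in 4-byte words) is taken from the
Geneve specification rather than from the code below.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Writes one Geneve option TLV into 'buf'.  Returns the number of bytes
 * written, or 0 if 'data_len' is not a multiple of 4, exceeds 124 bytes,
 * or does not fit in 'buf_size' together with the 4-byte option header. */
size_t
sketch_encode_geneve_opt(uint8_t *buf, size_t buf_size,
                         uint16_t opt_class, uint8_t type,
                         const uint8_t *data, size_t data_len)
{
    if (data_len % 4 || data_len > 124 || buf_size < 4 + data_len) {
        return 0;
    }

    buf[0] = (uint8_t) (opt_class >> 8);    /* Class, network byte order. */
    buf[1] = (uint8_t) (opt_class & 0xff);
    buf[2] = type;                          /* Top bit is the critical flag. */
    buf[3] = (uint8_t) (data_len / 4);      /* Length in 4-byte words. */
    if (data_len) {
        memcpy(buf + 4, data, data_len);
    }
    return 4 + data_len;
}
```

For class=0xffff, type=0 and four bytes of 0xff, this yields the eight bytes
ff ff 00 01 ff ff ff ff.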

A limitation of this patch is that although the option table is specified
for a particular switch over OpenFlow, it is currently global to all
switches. This will be addressed in a future patch.

Based on work originally done by Madhu Challa.

Signed-off-by: Jesse Gross <jesse at nicira.com>
---
 NEWS                         |   1 +
 build-aux/extract-ofp-fields |  13 +-
 lib/automake.mk              |   1 +
 lib/flow.c                   |  18 +-
 lib/flow.h                   |   4 +-
 lib/match.c                  |   9 +-
 lib/match.h                  |   2 +
 lib/meta-flow.c              |  22 ++
 lib/meta-flow.h              | 188 ++++++++++
 lib/nx-match.c               |   6 +-
 lib/nx-match.h               |   3 +
 lib/odp-util.c               |  62 +---
 lib/odp-util.h               |   2 +-
 lib/ofp-util.c               |   2 +-
 lib/packets.h                |   2 +
 lib/tun-metadata.c           | 792 +++++++++++++++++++++++++++++++++++++++++++
 lib/tun-metadata.h           |  80 ++++-
 ofproto/ofproto-dpif-rid.h   |   2 +-
 ofproto/ofproto-dpif-xlate.c |   2 +-
 ofproto/ofproto.c            |  19 +-
 tests/ofproto.at             |  68 +++-
 tests/ovs-ofctl.at           |   2 +
 tests/tunnel.at              |  65 ++++
 utilities/ovs-ofctl.8.in     |  17 +
 vswitchd/vswitch.xml         |   7 +-
 25 files changed, 1312 insertions(+), 77 deletions(-)
 create mode 100644 lib/tun-metadata.c

diff --git a/NEWS b/NEWS
index e9d1afa..f06468d 100644
--- a/NEWS
+++ b/NEWS
@@ -1,5 +1,6 @@
 Post-v2.4.0
 ---------------------
+   - Support for matching and generating options with Geneve tunnels.
 
 
 v2.4.0 - xx xxx xxxx
diff --git a/build-aux/extract-ofp-fields b/build-aux/extract-ofp-fields
index 042f633..e0284f9 100755
--- a/build-aux/extract-ofp-fields
+++ b/build-aux/extract-ofp-fields
@@ -14,12 +14,13 @@ VERSION = {"1.0": 0x01,
            "1.4": 0x05,
            "1.5": 0x06}
 
-TYPES = {"u8":   (1, False),
-         "be16": (2, False),
-         "be32": (4, False),
-         "MAC":  (6, False),
-         "be64": (8, False),
-         "IPv6": (16, False)}
+TYPES = {"u8":       (1,   False),
+         "be16":     (2,   False),
+         "be32":     (4,   False),
+         "MAC":      (6,   False),
+         "be64":     (8,   False),
+         "IPv6":     (16,  False),
+         "tunnelMD": (124, True)}
 
 FORMATTING = {"decimal":            ("MFS_DECIMAL",      1,   8),
               "hexadecimal":        ("MFS_HEXADECIMAL",  1, 127),
diff --git a/lib/automake.mk b/lib/automake.mk
index b95d254..6e1f13d 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -248,6 +248,7 @@ lib_libopenvswitch_la_SOURCES = \
 	lib/tnl-ports.c \
 	lib/tnl-ports.h \
 	lib/token-bucket.c \
+	lib/tun-metadata.c \
 	lib/tun-metadata.h \
 	lib/type-props.h \
 	lib/unaligned.h \
diff --git a/lib/flow.c b/lib/flow.c
index 3e99d5e..7350a17 100644
--- a/lib/flow.c
+++ b/lib/flow.c
@@ -123,7 +123,7 @@ struct mf_ctx {
  * away.  Some GCC versions gave warnings on ALWAYS_INLINE, so these are
  * defined as macros. */
 
-#if (FLOW_WC_SEQ != 31)
+#if (FLOW_WC_SEQ != 32)
 #define MINIFLOW_ASSERT(X) ovs_assert(X)
 BUILD_MESSAGE("FLOW_WC_SEQ changed: miniflow_extract() will have runtime "
                "assertions enabled. Consider updating FLOW_WC_SEQ after "
@@ -766,7 +766,7 @@ flow_get_metadata(const struct flow *flow, struct match *flow_metadata)
 {
     int i;
 
-    BUILD_ASSERT_DECL(FLOW_WC_SEQ == 31);
+    BUILD_ASSERT_DECL(FLOW_WC_SEQ == 32);
 
     match_init_catchall(flow_metadata);
     if (flow->tunnel.tun_id != htonll(0)) {
@@ -784,6 +784,7 @@ flow_get_metadata(const struct flow *flow, struct match *flow_metadata)
     if (flow->tunnel.gbp_flags) {
         match_set_tun_gbp_flags(flow_metadata, flow->tunnel.gbp_flags);
     }
+    tun_metadata_get_fmd(&flow->tunnel.metadata, flow_metadata);
     if (flow->metadata != htonll(0)) {
         match_set_metadata(flow_metadata, flow->metadata);
     }
@@ -942,7 +943,7 @@ void flow_wildcards_init_for_packet(struct flow_wildcards *wc,
     memset(&wc->masks, 0x0, sizeof wc->masks);
 
     /* Update this function whenever struct flow changes. */
-    BUILD_ASSERT_DECL(FLOW_WC_SEQ == 31);
+    BUILD_ASSERT_DECL(FLOW_WC_SEQ == 32);
 
     if (flow->tunnel.ip_dst) {
         if (flow->tunnel.flags & FLOW_TNL_F_KEY) {
@@ -957,6 +958,11 @@ void flow_wildcards_init_for_packet(struct flow_wildcards *wc,
         WC_MASK_FIELD(wc, tunnel.tp_dst);
         WC_MASK_FIELD(wc, tunnel.gbp_id);
         WC_MASK_FIELD(wc, tunnel.gbp_flags);
+
+        if (flow->tunnel.metadata.opt_map) {
+            wc->masks.tunnel.metadata.opt_map = flow->tunnel.metadata.opt_map;
+            WC_MASK_FIELD(wc, tunnel.metadata.opts);
+        }
     } else if (flow->tunnel.tun_id) {
         WC_MASK_FIELD(wc, tunnel.tun_id);
     }
@@ -1041,7 +1047,7 @@ uint64_t
 flow_wc_map(const struct flow *flow)
 {
     /* Update this function whenever struct flow changes. */
-    BUILD_ASSERT_DECL(FLOW_WC_SEQ == 31);
+    BUILD_ASSERT_DECL(FLOW_WC_SEQ == 32);
 
     uint64_t map = (flow->tunnel.ip_dst) ? MINIFLOW_MAP(tunnel) : 0;
 
@@ -1093,7 +1099,7 @@ void
 flow_wildcards_clear_non_packet_fields(struct flow_wildcards *wc)
 {
     /* Update this function whenever struct flow changes. */
-    BUILD_ASSERT_DECL(FLOW_WC_SEQ == 31);
+    BUILD_ASSERT_DECL(FLOW_WC_SEQ == 32);
 
     memset(&wc->masks.metadata, 0, sizeof wc->masks.metadata);
     memset(&wc->masks.regs, 0, sizeof wc->masks.regs);
@@ -1648,7 +1654,7 @@ flow_push_mpls(struct flow *flow, int n, ovs_be16 mpls_eth_type,
         flow->mpls_lse[0] = set_mpls_lse_values(ttl, tc, 1, htonl(label));
 
         /* Clear all L3 and L4 fields and dp_hash. */
-        BUILD_ASSERT(FLOW_WC_SEQ == 31);
+        BUILD_ASSERT(FLOW_WC_SEQ == 32);
         memset((char *) flow + FLOW_SEGMENT_2_ENDS_AT, 0,
                sizeof(struct flow) - FLOW_SEGMENT_2_ENDS_AT);
         flow->dp_hash = 0;
diff --git a/lib/flow.h b/lib/flow.h
index 70554e4..384a031 100644
--- a/lib/flow.h
+++ b/lib/flow.h
@@ -39,7 +39,7 @@ struct match;
 /* This sequence number should be incremented whenever anything involving flows
  * or the wildcarding of flows changes.  This will cause build assertion
  * failures in places which likely need to be updated. */
-#define FLOW_WC_SEQ 31
+#define FLOW_WC_SEQ 32
 
 /* Number of Open vSwitch extension 32-bit registers. */
 #define FLOW_N_REGS 8
@@ -157,7 +157,7 @@ BUILD_ASSERT_DECL(sizeof(struct flow) % sizeof(uint64_t) == 0);
 /* Remember to update FLOW_WC_SEQ when changing 'struct flow'. */
 BUILD_ASSERT_DECL(offsetof(struct flow, igmp_group_ip4) + sizeof(uint32_t)
                   == sizeof(struct flow_tnl) + 192
-                  && FLOW_WC_SEQ == 31);
+                  && FLOW_WC_SEQ == 32);
 
 /* Incremental points at which flow classification may be performed in
  * segments.
diff --git a/lib/match.c b/lib/match.c
index 7d0b409..ca9492f 100644
--- a/lib/match.c
+++ b/lib/match.c
@@ -21,6 +21,7 @@
 #include "dynamic-string.h"
 #include "ofp-util.h"
 #include "packets.h"
+#include "tun-metadata.h"
 
 /* Converts the flow in 'flow' into a match in 'match', with the given
  * 'wildcards'. */
@@ -31,6 +32,7 @@ match_init(struct match *match,
     match->flow = *flow;
     match->wc = *wc;
     match_zero_wildcarded_fields(match);
+    memset(&match->tun_md, 0, sizeof match->tun_md);
 }
 
 /* Converts a flow into a match.  It sets the wildcard masks based on
@@ -44,6 +46,8 @@ match_wc_init(struct match *match, const struct flow *flow)
     flow_wildcards_init_for_packet(&match->wc, flow);
     WC_MASK_FIELD(&match->wc, regs);
     WC_MASK_FIELD(&match->wc, metadata);
+
+    memset(&match->tun_md, 0, sizeof match->tun_md);
 }
 
 /* Initializes 'match' as a "catch-all" match that matches every packet. */
@@ -52,6 +56,7 @@ match_init_catchall(struct match *match)
 {
     memset(&match->flow, 0, sizeof match->flow);
     flow_wildcards_init_catchall(&match->wc);
+    memset(&match->tun_md, 0, sizeof match->tun_md);
 }
 
 /* For each bit or field wildcarded in 'match', sets the corresponding bit or
@@ -897,6 +902,7 @@ format_flow_tunnel(struct ds *s, const struct match *match)
         format_flags(s, flow_tun_flag_to_string, tnl->flags, '|');
         ds_put_char(s, ',');
     }
+    tun_metadata_match_format(s, match);
 }
 
 /* Appends a string representation of 'match' to 's'.  If 'priority' is
@@ -912,7 +918,7 @@ match_format(const struct match *match, struct ds *s, int priority)
 
     int i;
 
-    BUILD_ASSERT_DECL(FLOW_WC_SEQ == 31);
+    BUILD_ASSERT_DECL(FLOW_WC_SEQ == 32);
 
     if (priority != OFP_DEFAULT_PRIORITY) {
         ds_put_format(s, "priority=%d,", priority);
@@ -1226,6 +1232,7 @@ minimatch_expand(const struct minimatch *src, struct match *dst)
 {
     miniflow_expand(&src->flow, &dst->flow);
     minimask_expand(&src->mask, &dst->wc);
+    memset(&dst->tun_md, 0, sizeof dst->tun_md);
 }
 
 /* Returns true if 'a' and 'b' match the same packets, false otherwise.  */
diff --git a/lib/match.h b/lib/match.h
index 10aa0af..9cec9da 100644
--- a/lib/match.h
+++ b/lib/match.h
@@ -19,6 +19,7 @@
 
 #include "flow.h"
 #include "packets.h"
+#include "tun-metadata.h"
 
 struct ds;
 
@@ -33,6 +34,7 @@ struct ds;
 struct match {
     struct flow flow;
     struct flow_wildcards wc;
+    struct tun_metadata_allocation tun_md;
 };
 
 /* Initializer for a "struct match" that matches every packet. */
diff --git a/lib/meta-flow.c b/lib/meta-flow.c
index c4295c5..c5c8102 100644
--- a/lib/meta-flow.c
+++ b/lib/meta-flow.c
@@ -33,6 +33,7 @@
 #include "random.h"
 #include "shash.h"
 #include "socket-util.h"
+#include "tun-metadata.h"
 #include "unaligned.h"
 #include "util.h"
 #include "openvswitch/vlog.h"
@@ -189,6 +190,12 @@ mf_is_all_wild(const struct mf_field *mf, const struct flow_wildcards *wc)
         return !wc->masks.tunnel.gbp_id;
     case MFF_TUN_GBP_FLAGS:
         return !wc->masks.tunnel.gbp_flags;
+    CASE_MFF_TUN_METADATA: {
+        union mf_value value;
+
+        tun_metadata_read(&wc->masks.tunnel.metadata, mf, &value);
+        return is_all_zeros(&value.tun_metadata, mf->n_bytes);
+    }
     case MFF_METADATA:
         return !wc->masks.metadata;
     case MFF_IN_PORT:
@@ -480,6 +487,7 @@ mf_is_value_valid(const struct mf_field *mf, const union mf_value *value)
     case MFF_TUN_FLAGS:
     case MFF_TUN_GBP_ID:
     case MFF_TUN_GBP_FLAGS:
+    CASE_MFF_TUN_METADATA:
     case MFF_METADATA:
     case MFF_IN_PORT:
     case MFF_SKB_PRIORITY:
@@ -602,6 +610,9 @@ mf_get_value(const struct mf_field *mf, const struct flow *flow,
     case MFF_TUN_TOS:
         value->u8 = flow->tunnel.ip_tos;
         break;
+    CASE_MFF_TUN_METADATA:
+        tun_metadata_read(&flow->tunnel.metadata, mf, value);
+        break;
 
     case MFF_METADATA:
         value->be64 = flow->metadata;
@@ -816,6 +827,9 @@ mf_set_value(const struct mf_field *mf,
     case MFF_TUN_TTL:
         match_set_tun_ttl(match, value->u8);
         break;
+    CASE_MFF_TUN_METADATA:
+        tun_metadata_set_match(mf, value, NULL, match);
+        break;
 
     case MFF_METADATA:
         match_set_metadata(match, value->be64);
@@ -1099,6 +1113,9 @@ mf_set_flow_value(const struct mf_field *mf,
     case MFF_TUN_TTL:
         flow->tunnel.ip_ttl = value->u8;
         break;
+    CASE_MFF_TUN_METADATA:
+        tun_metadata_write(&flow->tunnel.metadata, mf, value);
+        break;
 
     case MFF_METADATA:
         flow->metadata = value->be64;
@@ -1363,6 +1380,9 @@ mf_set_wild(const struct mf_field *mf, struct match *match)
     case MFF_TUN_TTL:
         match_set_tun_ttl_masked(match, 0, 0);
         break;
+    CASE_MFF_TUN_METADATA:
+        tun_metadata_set_match(mf, NULL, NULL, match);
+        break;
 
     case MFF_METADATA:
         match_set_metadata_masked(match, htonll(0), htonll(0));
@@ -1617,6 +1637,9 @@ mf_set(const struct mf_field *mf,
     case MFF_TUN_TOS:
         match_set_tun_tos_masked(match, value->u8, mask->u8);
         break;
+    CASE_MFF_TUN_METADATA:
+        tun_metadata_set_match(mf, value, mask, match);
+        break;
 
     case MFF_METADATA:
         match_set_metadata_masked(match, value->be64, mask->be64);
diff --git a/lib/meta-flow.h b/lib/meta-flow.h
index 01b129c..7592da4 100644
--- a/lib/meta-flow.h
+++ b/lib/meta-flow.h
@@ -137,6 +137,8 @@ struct match;
  *
  *         MAC: A six-byte field whose value is an Ethernet address.
  *         IPv6: A 16-byte field whose value is an IPv6 address.
+ *         tunnelMD: A variable length field, up to 124 bytes, that carries
+ *                   tunnel metadata.
  *
  *   Maskable:
  *
@@ -443,6 +445,155 @@ enum OVS_PACKED_ENUM mf_field_id {
      */
     MFF_TUN_GBP_FLAGS,
 
+#if TUN_METADATA_NUM_OPTS == 64
+    /* "tun_metadata<N>".
+     *
+     * Encapsulation metadata for tunnels.
+     *
+     * Each NXM can be dynamically mapped onto a particular tunnel
+     * field using OpenFlow. The individual NXMs can each carry up to
+     * 124 bytes of data and a combined total of 256 across all
+     * allocated fields.
+     *
+     * Type: tunnelMD.
+     * Maskable: bitwise.
+     * Formatting: hexadecimal.
+     * Prerequisites: none.
+     * Access: read/write.
+     * NXM: NXM_NX_TUN_METADATA0(40) since v2.5.        <0>
+     * NXM: NXM_NX_TUN_METADATA1(41) since v2.5.        <1>
+     * NXM: NXM_NX_TUN_METADATA2(42) since v2.5.        <2>
+     * NXM: NXM_NX_TUN_METADATA3(43) since v2.5.        <3>
+     * NXM: NXM_NX_TUN_METADATA4(44) since v2.5.        <4>
+     * NXM: NXM_NX_TUN_METADATA5(45) since v2.5.        <5>
+     * NXM: NXM_NX_TUN_METADATA6(46) since v2.5.        <6>
+     * NXM: NXM_NX_TUN_METADATA7(47) since v2.5.        <7>
+     * NXM: NXM_NX_TUN_METADATA8(48) since v2.5.        <8>
+     * NXM: NXM_NX_TUN_METADATA9(49) since v2.5.        <9>
+     * NXM: NXM_NX_TUN_METADATA10(50) since v2.5.       <10>
+     * NXM: NXM_NX_TUN_METADATA11(51) since v2.5.       <11>
+     * NXM: NXM_NX_TUN_METADATA12(52) since v2.5.       <12>
+     * NXM: NXM_NX_TUN_METADATA13(53) since v2.5.       <13>
+     * NXM: NXM_NX_TUN_METADATA14(54) since v2.5.       <14>
+     * NXM: NXM_NX_TUN_METADATA15(55) since v2.5.       <15>
+     * NXM: NXM_NX_TUN_METADATA16(56) since v2.5.       <16>
+     * NXM: NXM_NX_TUN_METADATA17(57) since v2.5.       <17>
+     * NXM: NXM_NX_TUN_METADATA18(58) since v2.5.       <18>
+     * NXM: NXM_NX_TUN_METADATA19(59) since v2.5.       <19>
+     * NXM: NXM_NX_TUN_METADATA20(60) since v2.5.       <20>
+     * NXM: NXM_NX_TUN_METADATA21(61) since v2.5.       <21>
+     * NXM: NXM_NX_TUN_METADATA22(62) since v2.5.       <22>
+     * NXM: NXM_NX_TUN_METADATA23(63) since v2.5.       <23>
+     * NXM: NXM_NX_TUN_METADATA24(64) since v2.5.       <24>
+     * NXM: NXM_NX_TUN_METADATA25(65) since v2.5.       <25>
+     * NXM: NXM_NX_TUN_METADATA26(66) since v2.5.       <26>
+     * NXM: NXM_NX_TUN_METADATA27(67) since v2.5.       <27>
+     * NXM: NXM_NX_TUN_METADATA28(68) since v2.5.       <28>
+     * NXM: NXM_NX_TUN_METADATA29(69) since v2.5.       <29>
+     * NXM: NXM_NX_TUN_METADATA30(70) since v2.5.       <30>
+     * NXM: NXM_NX_TUN_METADATA31(71) since v2.5.       <31>
+     * NXM: NXM_NX_TUN_METADATA32(72) since v2.5.       <32>
+     * NXM: NXM_NX_TUN_METADATA33(73) since v2.5.       <33>
+     * NXM: NXM_NX_TUN_METADATA34(74) since v2.5.       <34>
+     * NXM: NXM_NX_TUN_METADATA35(75) since v2.5.       <35>
+     * NXM: NXM_NX_TUN_METADATA36(76) since v2.5.       <36>
+     * NXM: NXM_NX_TUN_METADATA37(77) since v2.5.       <37>
+     * NXM: NXM_NX_TUN_METADATA38(78) since v2.5.       <38>
+     * NXM: NXM_NX_TUN_METADATA39(79) since v2.5.       <39>
+     * NXM: NXM_NX_TUN_METADATA40(80) since v2.5.       <40>
+     * NXM: NXM_NX_TUN_METADATA41(81) since v2.5.       <41>
+     * NXM: NXM_NX_TUN_METADATA42(82) since v2.5.       <42>
+     * NXM: NXM_NX_TUN_METADATA43(83) since v2.5.       <43>
+     * NXM: NXM_NX_TUN_METADATA44(84) since v2.5.       <44>
+     * NXM: NXM_NX_TUN_METADATA45(85) since v2.5.       <45>
+     * NXM: NXM_NX_TUN_METADATA46(86) since v2.5.       <46>
+     * NXM: NXM_NX_TUN_METADATA47(87) since v2.5.       <47>
+     * NXM: NXM_NX_TUN_METADATA48(88) since v2.5.       <48>
+     * NXM: NXM_NX_TUN_METADATA49(89) since v2.5.       <49>
+     * NXM: NXM_NX_TUN_METADATA50(90) since v2.5.       <50>
+     * NXM: NXM_NX_TUN_METADATA51(91) since v2.5.       <51>
+     * NXM: NXM_NX_TUN_METADATA52(92) since v2.5.       <52>
+     * NXM: NXM_NX_TUN_METADATA53(93) since v2.5.       <53>
+     * NXM: NXM_NX_TUN_METADATA54(94) since v2.5.       <54>
+     * NXM: NXM_NX_TUN_METADATA55(95) since v2.5.       <55>
+     * NXM: NXM_NX_TUN_METADATA56(96) since v2.5.       <56>
+     * NXM: NXM_NX_TUN_METADATA57(97) since v2.5.       <57>
+     * NXM: NXM_NX_TUN_METADATA58(98) since v2.5.       <58>
+     * NXM: NXM_NX_TUN_METADATA59(99) since v2.5.       <59>
+     * NXM: NXM_NX_TUN_METADATA60(100) since v2.5.      <60>
+     * NXM: NXM_NX_TUN_METADATA61(101) since v2.5.      <61>
+     * NXM: NXM_NX_TUN_METADATA62(102) since v2.5.      <62>
+     * NXM: NXM_NX_TUN_METADATA63(103) since v2.5.      <63>
+     * OXM: none.
+     */
+    MFF_TUN_METADATA0,
+    MFF_TUN_METADATA1,
+    MFF_TUN_METADATA2,
+    MFF_TUN_METADATA3,
+    MFF_TUN_METADATA4,
+    MFF_TUN_METADATA5,
+    MFF_TUN_METADATA6,
+    MFF_TUN_METADATA7,
+    MFF_TUN_METADATA8,
+    MFF_TUN_METADATA9,
+    MFF_TUN_METADATA10,
+    MFF_TUN_METADATA11,
+    MFF_TUN_METADATA12,
+    MFF_TUN_METADATA13,
+    MFF_TUN_METADATA14,
+    MFF_TUN_METADATA15,
+    MFF_TUN_METADATA16,
+    MFF_TUN_METADATA17,
+    MFF_TUN_METADATA18,
+    MFF_TUN_METADATA19,
+    MFF_TUN_METADATA20,
+    MFF_TUN_METADATA21,
+    MFF_TUN_METADATA22,
+    MFF_TUN_METADATA23,
+    MFF_TUN_METADATA24,
+    MFF_TUN_METADATA25,
+    MFF_TUN_METADATA26,
+    MFF_TUN_METADATA27,
+    MFF_TUN_METADATA28,
+    MFF_TUN_METADATA29,
+    MFF_TUN_METADATA30,
+    MFF_TUN_METADATA31,
+    MFF_TUN_METADATA32,
+    MFF_TUN_METADATA33,
+    MFF_TUN_METADATA34,
+    MFF_TUN_METADATA35,
+    MFF_TUN_METADATA36,
+    MFF_TUN_METADATA37,
+    MFF_TUN_METADATA38,
+    MFF_TUN_METADATA39,
+    MFF_TUN_METADATA40,
+    MFF_TUN_METADATA41,
+    MFF_TUN_METADATA42,
+    MFF_TUN_METADATA43,
+    MFF_TUN_METADATA44,
+    MFF_TUN_METADATA45,
+    MFF_TUN_METADATA46,
+    MFF_TUN_METADATA47,
+    MFF_TUN_METADATA48,
+    MFF_TUN_METADATA49,
+    MFF_TUN_METADATA50,
+    MFF_TUN_METADATA51,
+    MFF_TUN_METADATA52,
+    MFF_TUN_METADATA53,
+    MFF_TUN_METADATA54,
+    MFF_TUN_METADATA55,
+    MFF_TUN_METADATA56,
+    MFF_TUN_METADATA57,
+    MFF_TUN_METADATA58,
+    MFF_TUN_METADATA59,
+    MFF_TUN_METADATA60,
+    MFF_TUN_METADATA61,
+    MFF_TUN_METADATA62,
+    MFF_TUN_METADATA63,
+#else
+#error "Need to update MFF_TUN_METADATA* to match TUN_METADATA_NUM_OPTS"
+#endif
+
     /* "metadata".
      *
      * A scratch pad value standardized in OpenFlow 1.1+.  Initially zero, at
@@ -1433,6 +1584,42 @@ struct mf_bitmap {
 #error "Need to update CASE_MFF_XREGS to match FLOW_N_XREGS"
 #endif
 
+/* Use this macro as CASE_MFF_TUN_METADATA: in a switch statement to choose
+ * all of the MFF_TUN_METADATAn cases. */
+#define CASE_MFF_TUN_METADATA                         \
+    case MFF_TUN_METADATA0: case MFF_TUN_METADATA1:   \
+    case MFF_TUN_METADATA2: case MFF_TUN_METADATA3:   \
+    case MFF_TUN_METADATA4: case MFF_TUN_METADATA5:   \
+    case MFF_TUN_METADATA6: case MFF_TUN_METADATA7:   \
+    case MFF_TUN_METADATA8: case MFF_TUN_METADATA9:   \
+    case MFF_TUN_METADATA10: case MFF_TUN_METADATA11: \
+    case MFF_TUN_METADATA12: case MFF_TUN_METADATA13: \
+    case MFF_TUN_METADATA14: case MFF_TUN_METADATA15: \
+    case MFF_TUN_METADATA16: case MFF_TUN_METADATA17: \
+    case MFF_TUN_METADATA18: case MFF_TUN_METADATA19: \
+    case MFF_TUN_METADATA20: case MFF_TUN_METADATA21: \
+    case MFF_TUN_METADATA22: case MFF_TUN_METADATA23: \
+    case MFF_TUN_METADATA24: case MFF_TUN_METADATA25: \
+    case MFF_TUN_METADATA26: case MFF_TUN_METADATA27: \
+    case MFF_TUN_METADATA28: case MFF_TUN_METADATA29: \
+    case MFF_TUN_METADATA30: case MFF_TUN_METADATA31: \
+    case MFF_TUN_METADATA32: case MFF_TUN_METADATA33: \
+    case MFF_TUN_METADATA34: case MFF_TUN_METADATA35: \
+    case MFF_TUN_METADATA36: case MFF_TUN_METADATA37: \
+    case MFF_TUN_METADATA38: case MFF_TUN_METADATA39: \
+    case MFF_TUN_METADATA40: case MFF_TUN_METADATA41: \
+    case MFF_TUN_METADATA42: case MFF_TUN_METADATA43: \
+    case MFF_TUN_METADATA44: case MFF_TUN_METADATA45: \
+    case MFF_TUN_METADATA46: case MFF_TUN_METADATA47: \
+    case MFF_TUN_METADATA48: case MFF_TUN_METADATA49: \
+    case MFF_TUN_METADATA50: case MFF_TUN_METADATA51: \
+    case MFF_TUN_METADATA52: case MFF_TUN_METADATA53: \
+    case MFF_TUN_METADATA54: case MFF_TUN_METADATA55: \
+    case MFF_TUN_METADATA56: case MFF_TUN_METADATA57: \
+    case MFF_TUN_METADATA58: case MFF_TUN_METADATA59: \
+    case MFF_TUN_METADATA60: case MFF_TUN_METADATA61: \
+    case MFF_TUN_METADATA62: case MFF_TUN_METADATA63
+
 /* Prerequisites for matching a field.
  *
  * A field may only be matched if the correct lower-level protocols are also
@@ -1551,6 +1738,7 @@ union mf_value {
     uint8_t u8;
 };
 BUILD_ASSERT_DECL(sizeof(union mf_value) == 128);
+BUILD_ASSERT_DECL(sizeof(union mf_value) >= GENEVE_MAX_OPT_SIZE);
 
 /* Part of a field. */
 struct mf_subfield {
diff --git a/lib/nx-match.c b/lib/nx-match.c
index f768dac..f883944 100644
--- a/lib/nx-match.c
+++ b/lib/nx-match.c
@@ -31,6 +31,7 @@
 #include "openflow/nicira-ext.h"
 #include "packets.h"
 #include "shash.h"
+#include "tun-metadata.h"
 #include "unaligned.h"
 #include "util.h"
 #include "openvswitch/vlog.h"
@@ -678,7 +679,7 @@ nxm_put_unmasked(struct ofpbuf *b, enum mf_field_id field,
     ofpbuf_put(b, value, n_bytes);
 }
 
-static void
+void
 nxm_put(struct ofpbuf *b, enum mf_field_id field, enum ofp_version version,
         const void *value, const void *mask, size_t n_bytes)
 {
@@ -890,7 +891,7 @@ nx_put_raw(struct ofpbuf *b, enum ofp_version oxm, const struct match *match,
     int match_len;
     int i;
 
-    BUILD_ASSERT_DECL(FLOW_WC_SEQ == 31);
+    BUILD_ASSERT_DECL(FLOW_WC_SEQ == 32);
 
     /* Metadata. */
     if (match->wc.masks.dp_hash) {
@@ -1003,6 +1004,7 @@ nx_put_raw(struct ofpbuf *b, enum ofp_version oxm, const struct match *match,
                 flow->tunnel.gbp_id, match->wc.masks.tunnel.gbp_id);
     nxm_put_8m(b, MFF_TUN_GBP_FLAGS, oxm,
                flow->tunnel.gbp_flags, match->wc.masks.tunnel.gbp_flags);
+    tun_metadata_to_nx_match(b, oxm, match);
 
     /* Registers. */
     if (oxm < OFP15_VERSION) {
diff --git a/lib/nx-match.h b/lib/nx-match.h
index fe0b68c..db20987 100644
--- a/lib/nx-match.h
+++ b/lib/nx-match.h
@@ -70,6 +70,9 @@ enum ofperr nx_pull_entry(struct ofpbuf *, const struct mf_field **,
                           union mf_value *value, union mf_value *mask);
 enum ofperr nx_pull_header(struct ofpbuf *, const struct mf_field **,
                            bool *masked);
+void nxm_put(struct ofpbuf *b, enum mf_field_id field,
+             enum ofp_version version, const void *value,
+             const void *mask, size_t n_bytes);
 void nx_put_entry(struct ofpbuf *, enum mf_field_id, enum ofp_version,
                   const union mf_value *value, const union mf_value *mask);
 void nx_put_header(struct ofpbuf *, enum mf_field_id, enum ofp_version,
diff --git a/lib/odp-util.c b/lib/odp-util.c
index 75f64c5..efdc651 100644
--- a/lib/odp-util.c
+++ b/lib/odp-util.c
@@ -35,6 +35,7 @@
 #include "packets.h"
 #include "simap.h"
 #include "timeval.h"
+#include "tun-metadata.h"
 #include "unaligned.h"
 #include "util.h"
 #include "uuid.h"
@@ -1317,45 +1318,10 @@ ovs_frag_type_to_string(enum ovs_frag_type type)
     }
 }
 
-#define GENEVE_OPT(class, type) ((OVS_FORCE uint32_t)(class) << 8 | (type))
-static int
-parse_geneve_opts(const struct nlattr *attr)
-{
-    int opts_len = nl_attr_get_size(attr);
-    const struct geneve_opt *opt = nl_attr_get(attr);
-
-    while (opts_len > 0) {
-        int len;
-
-        if (opts_len < sizeof(*opt)) {
-            return -EINVAL;
-        }
-
-        len = sizeof(*opt) + opt->length * 4;
-        if (len > opts_len) {
-            return -EINVAL;
-        }
-
-        switch (GENEVE_OPT(opt->opt_class, opt->type)) {
-        default:
-            if (opt->type & GENEVE_CRIT_OPT_TYPE) {
-                return -EINVAL;
-            }
-        };
-
-        opt = opt + len / sizeof(*opt);
-        opts_len -= len;
-    };
-
-    return 0;
-}
-
 static enum odp_key_fitness
 odp_tun_key_from_attr__(const struct nlattr *attr,
-                        const struct nlattr *flow_attrs OVS_UNUSED,
-                        size_t flow_attr_len OVS_UNUSED,
-                        const struct flow_tnl *src_tun OVS_UNUSED,
-                        struct flow_tnl *tun)
+                        const struct nlattr *flow_attrs, size_t flow_attr_len,
+                        const struct flow_tnl *src_tun, struct flow_tnl *tun)
 {
     unsigned int left;
     const struct nlattr *a;
@@ -1424,15 +1390,14 @@ odp_tun_key_from_attr__(const struct nlattr *attr,
 
             break;
         }
-        case OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS: {
-            if (parse_geneve_opts(a)) {
+        case OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS:
+            if (tun_metadata_from_geneve_nlattr(a, flow_attrs, flow_attr_len,
+                                                &src_tun->metadata,
+                                                &tun->metadata)) {
                 return ODP_FIT_ERROR;
             }
-            /* It is necessary to reproduce options exactly (including order)
-             * so it's easiest to just echo them back. */
-            unknown = true;
             break;
-        }
+
         default:
             /* Allow this to show up as unexpected, if there are unknown
              * tunnel attribute, eventually resulting in ODP_FIT_TOO_MUCH. */
@@ -1458,8 +1423,8 @@ odp_tun_key_from_attr(const struct nlattr *attr, struct flow_tnl *tun)
 
 static void
 tun_key_to_attr(struct ofpbuf *a, const struct flow_tnl *tun_key,
-                const struct flow_tnl *tun_flow_key OVS_UNUSED,
-                const struct ofpbuf *key_buf OVS_UNUSED)
+                const struct flow_tnl *tun_flow_key,
+                const struct ofpbuf *key_buf)
 {
     size_t tun_key_ofs;
 
@@ -1503,6 +1468,13 @@ tun_key_to_attr(struct ofpbuf *a, const struct flow_tnl *tun_key,
         nl_msg_end_nested(a, vxlan_opts_ofs);
     }
 
+    if (tun_key == tun_flow_key) {
+        tun_metadata_to_geneve_nlattr_flow(&tun_key->metadata, a);
+    } else {
+        tun_metadata_to_geneve_nlattr_mask(key_buf, &tun_key->metadata,
+                                           &tun_flow_key->metadata, a);
+    }
+
     nl_msg_end_nested(a, tun_key_ofs);
 }
 
diff --git a/lib/odp-util.h b/lib/odp-util.h
index b6e964c..763e3f9 100644
--- a/lib/odp-util.h
+++ b/lib/odp-util.h
@@ -135,7 +135,7 @@ void odp_portno_names_destroy(struct hmap *portno_names);
  * add another field and forget to adjust this value.
  */
 #define ODPUTIL_FLOW_KEY_BYTES 512
-BUILD_ASSERT_DECL(FLOW_WC_SEQ == 31);
+BUILD_ASSERT_DECL(FLOW_WC_SEQ == 32);
 
 /* A buffer with sufficient size and alignment to hold an nlattr-formatted flow
  * key.  An array of "struct nlattr" might not, in theory, be sufficiently
diff --git a/lib/ofp-util.c b/lib/ofp-util.c
index 9e8215b..ca9bef2 100644
--- a/lib/ofp-util.c
+++ b/lib/ofp-util.c
@@ -196,7 +196,7 @@ ofputil_netmask_to_wcbits(ovs_be32 netmask)
 void
 ofputil_wildcard_from_ofpfw10(uint32_t ofpfw, struct flow_wildcards *wc)
 {
-    BUILD_ASSERT_DECL(FLOW_WC_SEQ == 31);
+    BUILD_ASSERT_DECL(FLOW_WC_SEQ == 32);
 
     /* Initialize most of wc. */
     flow_wildcards_init_catchall(wc);
diff --git a/lib/packets.h b/lib/packets.h
index 23fb40a..688e7373 100644
--- a/lib/packets.h
+++ b/lib/packets.h
@@ -26,6 +26,7 @@
 #include "openvswitch/types.h"
 #include "random.h"
 #include "hash.h"
+#include "tun-metadata.h"
 #include "util.h"
 
 struct dp_packet;
@@ -44,6 +45,7 @@ struct flow_tnl {
     ovs_be16 gbp_id;
     uint8_t  gbp_flags;
     uint8_t  pad1[5];        /* Pad to 64 bits. */
+    struct tun_metadata metadata;
 };
 
 /* Unfortunately, a "struct flow" sometimes has to handle OpenFlow port
diff --git a/lib/tun-metadata.c b/lib/tun-metadata.c
new file mode 100644
index 0000000..b09085e
--- /dev/null
+++ b/lib/tun-metadata.c
@@ -0,0 +1,792 @@
+/*
+ * Copyright (c) 2015 Nicira, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include <config.h>
+#include <errno.h>
+#include <stdbool.h>
+
+#include "bitmap.h"
+#include "compiler.h"
+#include "hmap.h"
+#include "match.h"
+#include "nx-match.h"
+#include "odp-netlink.h"
+#include "ofp-util.h"
+#include "ovs-thread.h"
+#include "ovs-rcu.h"
+#include "packets.h"
+#include "tun-metadata.h"
+
+struct tun_meta_entry {
+    struct hmap_node node;
+    uint32_t key; /* Class and Type */
+    struct tun_metadata_loc loc;
+    bool valid;
+};
+
+struct tun_table {
+    struct tun_meta_entry entries[TUN_METADATA_NUM_OPTS];
+    /* Each bit represents 4 bytes of allocated space */
+    unsigned long alloc_map[BITMAP_N_LONGS(TUN_METADATA_TOT_OPT_SIZE / 4)];
+    struct hmap key_hmap;
+};
+BUILD_ASSERT_DECL(TUN_METADATA_TOT_OPT_SIZE % 4 == 0);
+
+static struct ovs_mutex tab_mutex = OVS_MUTEX_INITIALIZER;
+static OVSRCU_TYPE(struct tun_table *) metadata_tab;
+
+#define PRESENT_OPT_FOR_EACH(IDX, TMP, BITMAP)      \
+    for (TMP = BITMAP, IDX = raw_ctz(TMP);          \
+         TMP;                                       \
+         TMP &= ~(1ULL << IDX), IDX = raw_ctz(TMP)) \
+
+static enum ofperr tun_metadata_add_entry(struct tun_table *map, uint8_t idx,
+                                          uint16_t opt_class, uint8_t type,
+                                          uint8_t len) OVS_REQUIRES(tab_mutex);
+static void tun_metadata_del_entry(struct tun_table *map, uint8_t idx)
+            OVS_REQUIRES(tab_mutex);
+static void memcpy_to_metadata(struct tun_metadata *dst, const void *src,
+                               const struct tun_metadata_loc *);
+static void memcpy_from_metadata(void *dst, const struct tun_metadata *src,
+                                 const struct tun_metadata_loc *);
+
+static uint32_t
+tun_meta_key(ovs_be16 class, uint8_t type)
+{
+    return (OVS_FORCE uint16_t)class << 8 | type;
+}
+
+static ovs_be16
+tun_key_class(uint32_t key)
+{
+    return (OVS_FORCE ovs_be16)(key >> 8);
+}
+
+static uint8_t
+tun_key_type(uint32_t key)
+{
+    return key & 0xff;
+}
+
+static struct tun_table *
+table_alloc(const struct tun_table *old_map) OVS_REQUIRES(tab_mutex)
+{
+    struct tun_table *new_map;
+
+    new_map = xzalloc(sizeof *new_map);
+
+    if (old_map) {
+        struct tun_meta_entry *entry;
+
+        *new_map = *old_map;
+        hmap_init(&new_map->key_hmap);
+
+        HMAP_FOR_EACH (entry, node, &old_map->key_hmap) {
+            struct tun_meta_entry *new_entry;
+            struct tun_metadata_loc_chain *chain;
+
+            new_entry = &new_map->entries[entry - old_map->entries];
+            hmap_insert(&new_map->key_hmap, &new_entry->node, entry->node.hash);
+
+            chain = &new_entry->loc.c;
+            while (chain->next) {
+                chain->next = xmemdup(chain->next, sizeof *chain->next);
+                chain = chain->next;
+            }
+        }
+    } else {
+        hmap_init(&new_map->key_hmap);
+    }
+
+    return new_map;
+}
+
+static void
+table_free(struct tun_table *map) OVS_REQUIRES(tab_mutex)
+{
+    struct tun_meta_entry *entry;
+
+    if (!map) {
+        return;
+    }
+
+    HMAP_FOR_EACH (entry, node, &map->key_hmap) {
+        tun_metadata_del_entry(map, entry - map->entries);
+    }
+
+    free(map);
+}
+
+void
+tun_metadata_init(void)
+{
+    ovs_mutex_lock(&tab_mutex);
+
+    if (!ovsrcu_get_protected(struct tun_table *, &metadata_tab)) {
+        ovsrcu_set(&metadata_tab, table_alloc(NULL));
+    }
+
+    ovs_mutex_unlock(&tab_mutex);
+}
+
+enum ofperr
+tun_metadata_table_mod(struct ofputil_geneve_table_mod *gtm)
+{
+    struct tun_table *old_map, *new_map;
+    struct ofputil_geneve_map *ofp_map;
+    enum ofperr err = 0;
+
+    ovs_mutex_lock(&tab_mutex);
+
+    old_map = ovsrcu_get_protected(struct tun_table *, &metadata_tab);
+
+    switch (gtm->command) {
+    case NXGTMC_ADD:
+        new_map = table_alloc(old_map);
+
+        LIST_FOR_EACH (ofp_map, list_node, &gtm->mappings) {
+            err = tun_metadata_add_entry(new_map, ofp_map->index,
+                                         ofp_map->option_class,
+                                         ofp_map->option_type,
+                                         ofp_map->option_len);
+            if (err) {
+                table_free(new_map);
+                goto out;
+            }
+        }
+        break;
+
+    case NXGTMC_DELETE:
+        new_map = table_alloc(old_map);
+
+        LIST_FOR_EACH (ofp_map, list_node, &gtm->mappings) {
+            tun_metadata_del_entry(new_map, ofp_map->index);
+        }
+        break;
+
+    case NXGTMC_CLEAR:
+        new_map = table_alloc(NULL);
+        break;
+
+    default:
+        OVS_NOT_REACHED();
+    }
+
+    ovsrcu_set(&metadata_tab, new_map);
+    ovsrcu_postpone(table_free, old_map);
+
+out:
+    ovs_mutex_unlock(&tab_mutex);
+    return err;
+}
+
+void
+tun_metadata_table_request(struct ofputil_geneve_table_reply *gtr)
+{
+    struct tun_table *map = ovsrcu_get(struct tun_table *, &metadata_tab);
+    int i;
+
+    gtr->max_option_space = TUN_METADATA_TOT_OPT_SIZE;
+    gtr->max_fields = TUN_METADATA_NUM_OPTS;
+    list_init(&gtr->mappings);
+
+    for (i = 0; i < TUN_METADATA_NUM_OPTS; i++) {
+        struct tun_meta_entry *entry = &map->entries[i];
+        struct ofputil_geneve_map *ofp_map;
+
+        if (!entry->valid) {
+            continue;
+        }
+
+        ofp_map = xmalloc(sizeof *ofp_map);
+        ofp_map->option_class = ntohs(tun_key_class(entry->key));
+        ofp_map->option_type = tun_key_type(entry->key);
+        ofp_map->option_len = entry->loc.len;
+        ofp_map->index = i;
+
+        list_push_back(&gtr->mappings, &ofp_map->list_node);
+    }
+}
+
+void
+tun_metadata_read(const struct tun_metadata *metadata,
+                  const struct mf_field *mf, union mf_value *value)
+{
+    struct tun_table *map = ovsrcu_get(struct tun_table *, &metadata_tab);
+    unsigned int idx = mf->id - MFF_TUN_METADATA0;
+    struct tun_metadata_loc *loc;
+
+    if (!map) {
+        memset(value->tun_metadata, 0, mf->n_bytes);
+        return;
+    }
+
+    loc = &map->entries[idx].loc;
+
+    memset(value->tun_metadata, 0, mf->n_bytes - loc->len);
+    memcpy_from_metadata(value->tun_metadata + mf->n_bytes - loc->len,
+                         metadata, loc);
+}
+
+void
+tun_metadata_write(struct tun_metadata *metadata,
+                   const struct mf_field *mf, const union mf_value *value)
+{
+    struct tun_table *map = ovsrcu_get(struct tun_table *, &metadata_tab);
+    unsigned int idx = mf->id - MFF_TUN_METADATA0;
+    struct tun_metadata_loc *loc;
+
+    if (!map || !map->entries[idx].valid) {
+        return;
+    }
+
+    loc = &map->entries[idx].loc;
+
+    metadata->opt_map |= 1ULL << idx;
+    memcpy_to_metadata(metadata, value->tun_metadata + mf->n_bytes - loc->len,
+                       loc);
+}
+
+static const struct tun_metadata_loc *
+metadata_loc_from_match(struct tun_table *map, struct match *match,
+                        unsigned int idx, unsigned int field_len)
+{
+    ovs_assert(idx < TUN_METADATA_NUM_OPTS);
+
+    if (map) {
+        if (map->entries[idx].valid) {
+            return &map->entries[idx].loc;
+        } else {
+            return NULL;
+        }
+    }
+
+    if (match->tun_md.alloc_offset + field_len >= TUN_METADATA_TOT_OPT_SIZE ||
+        match->tun_md.loc[idx].len) {
+        return NULL;
+    }
+
+    match->tun_md.loc[idx].len = field_len;
+    match->tun_md.loc[idx].c.offset = match->tun_md.alloc_offset;
+    match->tun_md.loc[idx].c.len = field_len;
+    match->tun_md.loc[idx].c.next = NULL;
+    match->tun_md.alloc_offset += field_len;
+    match->tun_md.valid = true;
+
+    return &match->tun_md.loc[idx];
+}
+
+void
+tun_metadata_set_match(const struct mf_field *mf, const union mf_value *value,
+                       const union mf_value *mask, struct match *match)
+{
+    struct tun_table *map = ovsrcu_get(struct tun_table *, &metadata_tab);
+    const struct tun_metadata_loc *loc;
+    unsigned int idx = mf->id - MFF_TUN_METADATA0;
+    unsigned int field_len;
+    unsigned int data_offset;
+    union mf_value data;
+
+    field_len = mf_field_len(mf, value, mask);
+    loc = metadata_loc_from_match(map, match, idx, field_len);
+    if (!loc) {
+        return;
+    }
+
+    data_offset = mf->n_bytes - loc->len;
+
+    if (!value) {
+        memset(data.tun_metadata, 0, loc->len);
+    } else if (!mask) {
+        memcpy(data.tun_metadata, value->tun_metadata + data_offset, loc->len);
+    } else {
+        int i;
+        for (i = 0; i < loc->len; i++) {
+            data.tun_metadata[i] = value->tun_metadata[data_offset + i] &
+                                   mask->tun_metadata[data_offset + i];
+        }
+    }
+    match->flow.tunnel.metadata.opt_map |= 1ULL << idx;
+    memcpy_to_metadata(&match->flow.tunnel.metadata, data.tun_metadata, loc);
+
+    if (!value) {
+        memset(data.tun_metadata, 0, loc->len);
+    } else if (!mask) {
+        memset(data.tun_metadata, 0xff, loc->len);
+    } else {
+        memcpy(data.tun_metadata, mask->tun_metadata + data_offset, loc->len);
+    }
+    match->wc.masks.tunnel.metadata.opt_map |= 1ULL << idx;
+    memcpy_to_metadata(&match->wc.masks.tunnel.metadata, data.tun_metadata, loc);
+}
+
+void
+tun_metadata_get_fmd(const struct tun_metadata *metadata,
+                     struct match *flow_metadata)
+{
+    struct tun_table *map;
+    int i;
+    uint64_t bm;
+
+    map = metadata->tab;
+    if (!map) {
+        map = ovsrcu_get(struct tun_table *, &metadata_tab);
+    }
+
+    PRESENT_OPT_FOR_EACH (i, bm, metadata->opt_map) {
+        union mf_value opts;
+        const struct tun_metadata_loc *old_loc = &map->entries[i].loc;
+        const struct tun_metadata_loc *new_loc;
+
+        new_loc = metadata_loc_from_match(NULL, flow_metadata, i, old_loc->len);
+
+        memcpy_from_metadata(opts.tun_metadata, metadata, old_loc);
+        memcpy_to_metadata(&flow_metadata->flow.tunnel.metadata,
+                           opts.tun_metadata, new_loc);
+
+        memset(opts.tun_metadata, 0xff, old_loc->len);
+        memcpy_to_metadata(&flow_metadata->wc.masks.tunnel.metadata,
+                           opts.tun_metadata, new_loc);
+    }
+}
+
+static uint32_t
+tun_meta_hash(uint32_t key)
+{
+    return hash_int(key, 0);
+}
+
+static struct tun_meta_entry *
+tun_meta_find_key(const struct hmap *hmap, uint32_t key)
+{
+    struct tun_meta_entry *entry;
+
+    HMAP_FOR_EACH_WITH_HASH (entry, node, tun_meta_hash(key), hmap) {
+        if (entry->key == key) {
+            return entry;
+        }
+    }
+    return NULL;
+}
+
+static void
+memcpy_to_metadata(struct tun_metadata *dst, const void *src,
+                   const struct tun_metadata_loc *loc)
+{
+    const struct tun_metadata_loc_chain *chain = &loc->c;
+    int addr = 0;
+
+    while (chain) {
+        memcpy(dst->opts + chain->offset, (uint8_t *)src + addr,
+               chain->len);
+        addr += chain->len;
+        chain = chain->next;
+    }
+}
+
+static void
+memcpy_from_metadata(void *dst, const struct tun_metadata *src,
+                     const struct tun_metadata_loc *loc)
+{
+    const struct tun_metadata_loc_chain *chain = &loc->c;
+    int addr = 0;
+
+    while (chain) {
+        memcpy((uint8_t *)dst + addr, src->opts + chain->offset,
+               chain->len);
+        addr += chain->len;
+        chain = chain->next;
+    }
+}
+
+static int
+tun_metadata_alloc_chain(struct tun_table *map, uint8_t len,
+                         struct tun_metadata_loc_chain *loc)
+                         OVS_REQUIRES(tab_mutex)
+{
+    int alloc_len = len / 4;
+    int scan_start = 0;
+    int scan_end = TUN_METADATA_TOT_OPT_SIZE / 4;
+    int pos_start, pos_end, pos_len;
+    int best_start = 0, best_len = 0;
+
+    while (true) {
+        pos_start = bitmap_scan(map->alloc_map, 0, scan_start, scan_end);
+        if (pos_start == scan_end) {
+            break;
+        }
+
+        pos_end = bitmap_scan(map->alloc_map, 1, pos_start,
+                              MIN(pos_start + alloc_len, scan_end));
+        pos_len = pos_end - pos_start;
+        if (pos_len == alloc_len) {
+            goto found;
+        }
+
+        if (pos_len > best_len) {
+            best_start = pos_start;
+            best_len = pos_len;
+        }
+        scan_start = pos_end + 1;
+    }
+
+    if (best_len == 0) {
+        return ENOSPC;
+    }
+
+    pos_start = best_start;
+    pos_len = best_len;
+
+found:
+    bitmap_set_multiple(map->alloc_map, pos_start, pos_len, 1);
+    loc->offset = pos_start * 4;
+    loc->len = pos_len * 4;
+
+    return 0;
+}
+
+static enum ofperr
+tun_metadata_add_entry(struct tun_table *map, uint8_t idx, uint16_t opt_class,
+                       uint8_t type, uint8_t len) OVS_REQUIRES(tab_mutex)
+{
+    struct tun_meta_entry *entry;
+    struct tun_metadata_loc_chain *cur_chain, *prev_chain;
+
+    ovs_assert(idx < TUN_METADATA_NUM_OPTS);
+
+    entry = &map->entries[idx];
+    if (entry->valid) {
+        return OFPERR_NXGTMFC_ALREADY_MAPPED;
+    }
+
+    entry->key = tun_meta_key(htons(opt_class), type);
+    if (tun_meta_find_key(&map->key_hmap, entry->key)) {
+        return OFPERR_NXGTMFC_DUP_ENTRY;
+    }
+
+    entry->valid = true;
+    hmap_insert(&map->key_hmap, &entry->node,
+                tun_meta_hash(entry->key));
+
+    entry->loc.len = len;
+    cur_chain = &entry->loc.c;
+    memset(cur_chain, 0, sizeof *cur_chain);
+    prev_chain = NULL;
+
+    while (len) {
+        int err;
+
+        if (!cur_chain) {
+            cur_chain = xzalloc(sizeof *cur_chain);
+        }
+
+        err = tun_metadata_alloc_chain(map, len, cur_chain);
+        if (err) {
+            tun_metadata_del_entry(map, idx);
+            return OFPERR_NXGTMFC_TABLE_FULL;
+        }
+
+        len -= cur_chain->len;
+
+        if (prev_chain) {
+            prev_chain->next = cur_chain;
+        }
+        prev_chain = cur_chain;
+        cur_chain = NULL;
+    }
+
+    return 0;
+}
+
+static void
+tun_metadata_del_entry(struct tun_table *map, uint8_t idx)
+                       OVS_REQUIRES(tab_mutex)
+{
+    struct tun_meta_entry *entry;
+    struct tun_metadata_loc_chain *chain;
+
+    if (idx >= TUN_METADATA_NUM_OPTS) {
+        return;
+    }
+
+    entry = &map->entries[idx];
+    if (!entry->valid) {
+        return;
+    }
+
+    chain = &entry->loc.c;
+    while (chain) {
+        struct tun_metadata_loc_chain *next = chain->next;
+
+        bitmap_set_multiple(map->alloc_map, chain->offset / 4,
+                            chain->len / 4, 0);
+        if (chain != &entry->loc.c) {
+            free(chain);
+        }
+        chain = next;
+    }
+
+    entry->valid = false;
+    hmap_remove(&map->key_hmap, &entry->node);
+    memset(&entry->loc, 0, sizeof entry->loc);
+}
+
+int
+tun_metadata_from_geneve_nlattr(const struct nlattr *attr,
+                                const struct nlattr *flow_attrs,
+                                size_t flow_attr_len,
+                                const struct tun_metadata *flow_metadata,
+                                struct tun_metadata *metadata)
+{
+    bool is_mask = !!flow_attrs;
+    struct tun_table *map;
+    const struct nlattr *flow;
+    int opts_len;
+    const struct geneve_opt *flow_opt;
+    const struct geneve_opt *opt = nl_attr_get(attr);
+
+    if (!is_mask) {
+        map = ovsrcu_get(struct tun_table *, &metadata_tab);
+        metadata->tab = map;
+    } else {
+        map = flow_metadata->tab;
+    }
+
+    if (!map) {
+        return 0;
+    }
+
+    if (is_mask) {
+        const struct nlattr *tnl_key;
+        int mask_len = nl_attr_get_size(attr);
+
+        tnl_key = nl_attr_find__(flow_attrs, flow_attr_len, OVS_KEY_ATTR_TUNNEL);
+        if (!tnl_key) {
+            return mask_len ? EINVAL : 0;
+        }
+
+        flow = nl_attr_find_nested(tnl_key, OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS);
+        if (!flow) {
+            return mask_len ? EINVAL : 0;
+        }
+
+        if (mask_len != nl_attr_get_size(flow)) {
+            return EINVAL;
+        }
+    } else {
+        flow = attr;
+    }
+
+    opts_len = nl_attr_get_size(flow);
+    flow_opt = nl_attr_get(flow);
+
+    while (opts_len > 0) {
+        int len;
+        struct tun_meta_entry *entry;
+
+        if (opts_len < sizeof(*opt)) {
+            return EINVAL;
+        }
+
+        len = sizeof(*opt) + flow_opt->length * 4;
+        if (len > opts_len) {
+            return EINVAL;
+        }
+
+        entry = tun_meta_find_key(&map->key_hmap,
+                                  tun_meta_key(flow_opt->opt_class,
+                                               flow_opt->type));
+        if (entry) {
+            if (entry->loc.len == flow_opt->length * 4) {
+                memcpy_to_metadata(metadata, opt + 1, &entry->loc);
+                metadata->opt_map |= 1ULL << (entry - map->entries);
+            } else {
+                return EINVAL;
+            }
+        } else if (flow_opt->type & GENEVE_CRIT_OPT_TYPE) {
+            return EINVAL;
+        }
+
+        opt = opt + len / sizeof(*opt);
+        flow_opt = flow_opt + len / sizeof(*opt);
+        opts_len -= len;
+    }
+
+    return 0;
+}
+
+void
+tun_metadata_to_geneve_nlattr_flow(const struct tun_metadata *flow,
+                                   struct ofpbuf *b)
+{
+    struct tun_table *map;
+    size_t nlattr_offset;
+    uint64_t bm;
+    int i;
+
+    if (!flow->opt_map) {
+        return;
+    }
+
+    map = flow->tab;
+    if (!map) {
+        map = ovsrcu_get(struct tun_table *, &metadata_tab);
+    }
+
+    /* For all intents and purposes, the Geneve options are nested
+     * attributes even if this doesn't show up directly to netlink. It's
+     * similar enough that we can use the same mechanism. */
+    nlattr_offset = nl_msg_start_nested(b, OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS);
+
+    PRESENT_OPT_FOR_EACH (i, bm, flow->opt_map) {
+        struct tun_meta_entry *entry = &map->entries[i];
+        struct geneve_opt *opt;
+
+        opt = ofpbuf_put_uninit(b, sizeof *opt + entry->loc.len);
+
+        opt->opt_class = tun_key_class(entry->key);
+        opt->type = tun_key_type(entry->key);
+        opt->length = entry->loc.len / 4;
+        opt->r1 = 0;
+        opt->r2 = 0;
+        opt->r3 = 0;
+
+        memcpy_from_metadata(opt + 1, flow, &entry->loc);
+    }
+
+    nl_msg_end_nested(b, nlattr_offset);
+}
+
+void
+tun_metadata_to_geneve_nlattr_mask(const struct ofpbuf *key,
+                                   const struct tun_metadata *mask,
+                                   const struct tun_metadata *flow,
+                                   struct ofpbuf *b)
+{
+    struct tun_table *map = flow->tab;
+    const struct nlattr *tnl_key, *geneve_key;
+    struct nlattr *geneve_mask;
+    struct geneve_opt *opt;
+    int opts_len;
+
+    if (!map) {
+        return;
+    }
+
+    tnl_key = nl_attr_find(key, 0, OVS_KEY_ATTR_TUNNEL);
+    if (!tnl_key) {
+        return;
+    }
+
+    geneve_key = nl_attr_find_nested(tnl_key,
+                                     OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS);
+    if (!geneve_key) {
+        return;
+    }
+
+    geneve_mask = ofpbuf_tail(b);
+    nl_msg_put(b, geneve_key, geneve_key->nla_len);
+
+    /* All of these options have already been validated, so no need
+     * for sanity checking. */
+    opt = CONST_CAST(struct geneve_opt *, nl_attr_get(geneve_mask));
+    opts_len = nl_attr_get_size(geneve_mask);
+
+    while (opts_len > 0) {
+        struct tun_meta_entry *entry;
+        int len = sizeof(*opt) + opt->length * 4;
+
+        entry = tun_meta_find_key(&map->key_hmap,
+                                  tun_meta_key(opt->opt_class, opt->type));
+        if (entry) {
+            memcpy_from_metadata(opt + 1, mask, &entry->loc);
+        } else {
+            memset(opt + 1, 0, opt->length * 4);
+        }
+
+        opt->opt_class = htons(0xffff);
+        opt->type = 0xff;
+        opt->length = 0x1f;
+        opt->r1 = 0;
+        opt->r2 = 0;
+        opt->r3 = 0;
+
+        opt = opt + len / sizeof(*opt);
+        opts_len -= len;
+    }
+}
+
+static const struct tun_metadata_loc *
+metadata_loc_from_match_read(struct tun_table *map, const struct match *match,
+                             unsigned int idx)
+{
+    if (match->tun_md.valid) {
+        return &match->tun_md.loc[idx];
+    }
+
+    return &map->entries[idx].loc;
+}
+
+void
+tun_metadata_to_nx_match(struct ofpbuf *b, enum ofp_version oxm,
+                         const struct match *match)
+{
+    struct tun_table *map = ovsrcu_get(struct tun_table *, &metadata_tab);
+    const struct tun_metadata *metadata = &match->flow.tunnel.metadata;
+    const struct tun_metadata *mask = &match->wc.masks.tunnel.metadata;
+    uint64_t bm;
+    int i;
+
+    PRESENT_OPT_FOR_EACH (i, bm, mask->opt_map) {
+        const struct tun_metadata_loc *loc;
+        union mf_value opts;
+        union mf_value mask_opts;
+
+        loc = metadata_loc_from_match_read(map, match, i);
+        memcpy_from_metadata(opts.tun_metadata, metadata, loc);
+        memcpy_from_metadata(mask_opts.tun_metadata, mask, loc);
+        nxm_put(b, MFF_TUN_METADATA0 + i, oxm, opts.tun_metadata,
+                mask_opts.tun_metadata, loc->len);
+    }
+}
+
+void
+tun_metadata_match_format(struct ds *s, const struct match *match)
+{
+    struct tun_table *map = ovsrcu_get(struct tun_table *, &metadata_tab);
+    const struct tun_metadata *metadata = &match->flow.tunnel.metadata;
+    const struct tun_metadata *mask = &match->wc.masks.tunnel.metadata;
+    unsigned int i;
+    uint64_t bm;
+
+    PRESENT_OPT_FOR_EACH (i, bm, mask->opt_map) {
+        const struct tun_metadata_loc *loc;
+        union mf_value opts;
+
+        loc = metadata_loc_from_match_read(map, match, i);
+
+        ds_put_format(s, "tun_metadata%u=", i);
+        memcpy_from_metadata(opts.tun_metadata, metadata, loc);
+        ds_put_hex(s, opts.tun_metadata, loc->len);
+
+        memcpy_from_metadata(opts.tun_metadata, mask, loc);
+        if (!is_all_ones(opts.tun_metadata, loc->len)) {
+            ds_put_char(s, '/');
+            ds_put_hex(s, opts.tun_metadata, loc->len);
+        }
+        ds_put_char(s, ',');
+    }
+}
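A minimal standalone sketch of copying an option value into and out of a fragmented location, assuming each fragment records its own byte offset into the option space. It is illustrative only and not part of the patch: `frag`, `scatter`, `gather`, and `OPT_SPACE` are hypothetical stand-ins for `struct tun_metadata_loc_chain`, `memcpy_to_metadata()`, `memcpy_from_metadata()`, and `TUN_METADATA_TOT_OPT_SIZE`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OPT_SPACE 256           /* Hypothetical; mirrors the option space. */

/* Hypothetical stand-in for one block of a fragmented option location. */
struct frag {
    const struct frag *next;    /* Next fragment, or NULL at the end. */
    uint8_t offset;             /* Byte offset into the option space. */
    uint8_t len;                /* Length of this fragment in bytes. */
};

/* Copies the contiguous value at 'src' into 'space', one fragment at a
 * time.  'pos' counts the source bytes consumed so far; the destination
 * address comes from each fragment's own offset. */
static void
scatter(uint8_t space[OPT_SPACE], const void *src, const struct frag *f)
{
    size_t pos = 0;

    for (; f; f = f->next) {
        assert((size_t) f->offset + f->len <= OPT_SPACE);
        memcpy(space + f->offset, (const uint8_t *) src + pos, f->len);
        pos += f->len;
    }
}

/* Inverse of scatter(): reassembles the fragments into the contiguous
 * buffer at 'dst'. */
static void
gather(void *dst, const uint8_t space[OPT_SPACE], const struct frag *f)
{
    size_t pos = 0;

    for (; f; f = f->next) {
        assert((size_t) f->offset + f->len <= OPT_SPACE);
        memcpy((uint8_t *) dst + pos, space + f->offset, f->len);
        pos += f->len;
    }
}
```

With two 4-byte fragments at offsets 4 and 16, an 8-byte value is split across bytes 4..7 and 16..19 of the option space, and `gather()` returns the original 8 bytes.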
diff --git a/lib/tun-metadata.h b/lib/tun-metadata.h
index e7a6d3e..8ba0757 100644
--- a/lib/tun-metadata.h
+++ b/lib/tun-metadata.h
@@ -17,6 +17,84 @@
 #ifndef TUN_METADATA_H
 #define TUN_METADATA_H 1
 
-#define TUN_METADATA_NUM_OPTS 0
+#include <stdint.h>
+
+#include "dynamic-string.h"
+#include "netlink.h"
+#include "ofpbuf.h"
+#include "openflow/openflow.h"
+
+struct match;
+struct mf_field;
+union mf_value;
+struct ofputil_geneve_table_mod;
+struct ofputil_geneve_table_reply;
+struct tun_table;
+
+#define TUN_METADATA_NUM_OPTS 64
+#define TUN_METADATA_TOT_OPT_SIZE 256
+struct tun_metadata {
+    uint8_t opts[TUN_METADATA_TOT_OPT_SIZE];
+    uint64_t opt_map;
+    struct tun_table *tab;
+};
+BUILD_ASSERT_DECL(sizeof(((struct tun_metadata *)0)->opt_map) * 8 >=
+                  TUN_METADATA_NUM_OPTS);
+
+/* The location of an option is ideally a single offset/len pair.  If the
+ * option space is fragmented, it is instead a linked list of such
+ * blocks. */
+struct tun_metadata_loc_chain {
+    struct tun_metadata_loc_chain *next;
+    uint8_t offset;
+    uint8_t len;
+};
+
+struct tun_metadata_loc {
+    int len;
+    struct tun_metadata_loc_chain c;
+};
+
+/* Allocation of options to be used inside a match structure. This is
+ * important if we don't have access to a global allocation table - either
+ * because there isn't one (ovs-ofctl) or if we need to keep the allocation
+ * outside of packet processing context (Packet-In). These structures never
+ * have dynamically allocated memory because the address space is never
+ * fragmented. */
+struct tun_metadata_allocation {
+    struct tun_metadata_loc loc[TUN_METADATA_NUM_OPTS];
+    uint8_t alloc_offset;
+    bool valid;
+};
+
+void tun_metadata_init(void);
+
+enum ofperr tun_metadata_table_mod(struct ofputil_geneve_table_mod *);
+void tun_metadata_table_request(struct ofputil_geneve_table_reply *);
+
+void tun_metadata_read(const struct tun_metadata *,
+                       const struct mf_field *, union mf_value *);
+void tun_metadata_write(struct tun_metadata *,
+                        const struct mf_field *, const union mf_value *);
+void tun_metadata_set_match(const struct mf_field *,
+                            const union mf_value *value,
+                            const union mf_value *mask, struct match *);
+void tun_metadata_get_fmd(const struct tun_metadata *,
+                          struct match *flow_metadata);
+
+int tun_metadata_from_geneve_nlattr(const struct nlattr *attr,
+                                    const struct nlattr *flow_attrs,
+                                    size_t flow_attr_len,
+                                    const struct tun_metadata *flow_metadata,
+                                    struct tun_metadata *metadata);
+void tun_metadata_to_geneve_nlattr_flow(const struct tun_metadata *flow,
+                                        struct ofpbuf *);
+void tun_metadata_to_geneve_nlattr_mask(const struct ofpbuf *key,
+                                        const struct tun_metadata *mask,
+                                        const struct tun_metadata *flow,
+                                        struct ofpbuf *);
+void tun_metadata_to_nx_match(struct ofpbuf *b, enum ofp_version oxm,
+                              const struct match *);
+void tun_metadata_match_format(struct ds *, const struct match *);
 
 #endif /* tun-metadata.h */
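A minimal standalone sketch of the match-local allocation described in the comment on `struct tun_metadata_allocation`: because space is only ever handed out in order and never released, a single running offset is enough and no fragment list is needed. This is illustrative only and not part of the patch. The names `span`, `match_alloc`, `match_alloc_init`, and `match_alloc_reserve` are hypothetical, and the running offset is an `unsigned int` here (an assumption made so that the full 256 bytes can be used without wrapping).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NUM_OPTS 64             /* Hypothetical; number of mappable fields. */
#define OPT_SPACE 256           /* Hypothetical; option space in bytes. */

/* Hypothetical contiguous location of one field's data. */
struct span {
    unsigned int offset;        /* Byte offset into the option space. */
    unsigned int len;           /* Length in bytes; 0 means unassigned. */
};

struct match_alloc {
    struct span loc[NUM_OPTS];
    unsigned int next_offset;   /* First byte not yet handed out. */
    bool valid;                 /* True once any field has been assigned. */
};

static void
match_alloc_init(struct match_alloc *a)
{
    assert(a != NULL);
    memset(a, 0, sizeof *a);
}

/* Reserves 'len' bytes for field 'idx'.  Returns the new location, or NULL
 * if 'idx' is out of range, 'len' is zero, the field already has a
 * location, or the remaining space is too small. */
static const struct span *
match_alloc_reserve(struct match_alloc *a, unsigned int idx, unsigned int len)
{
    assert(a != NULL);
    if (idx >= NUM_OPTS || !len || a->loc[idx].len
        || len > OPT_SPACE - a->next_offset) {
        return NULL;
    }

    a->loc[idx].offset = a->next_offset;
    a->loc[idx].len = len;
    a->next_offset += len;
    a->valid = true;

    return &a->loc[idx];
}
```

Reserving 4 bytes for field 0 and then 8 bytes for field 5 yields offsets 0 and 4; a second reservation for field 0 is refused.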
diff --git a/ofproto/ofproto-dpif-rid.h b/ofproto/ofproto-dpif-rid.h
index 81a61a2..dc533ce 100644
--- a/ofproto/ofproto-dpif-rid.h
+++ b/ofproto/ofproto-dpif-rid.h
@@ -91,7 +91,7 @@ struct rule;
 /* Metadata for restoring pipeline context after recirculation.  Helpers
  * are inlined below to keep them together with the definition for easier
  * updates. */
-BUILD_ASSERT_DECL(FLOW_WC_SEQ == 31);
+BUILD_ASSERT_DECL(FLOW_WC_SEQ == 32);
 
 struct recirc_metadata {
     /* Metadata in struct flow. */
diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index d3ae567..b9bee5e 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -2769,7 +2769,7 @@ compose_output_action__(struct xlate_ctx *ctx, ofp_port_t ofp_port,
 
     /* If 'struct flow' gets additional metadata, we'll need to zero it out
      * before traversing a patch port. */
-    BUILD_ASSERT_DECL(FLOW_WC_SEQ == 31);
+    BUILD_ASSERT_DECL(FLOW_WC_SEQ == 32);
     memset(&flow_tnl, 0, sizeof flow_tnl);
 
     if (!xport) {
diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
index ae7edcb..523130b 100644
--- a/ofproto/ofproto.c
+++ b/ofproto/ofproto.c
@@ -56,6 +56,7 @@
 #include "smap.h"
 #include "sset.h"
 #include "timeval.h"
+#include "tun-metadata.h"
 #include "unaligned.h"
 #include "unixctl.h"
 #include "openvswitch/vlog.h"
@@ -566,6 +567,7 @@ ofproto_create(const char *datapath_name, const char *datapath_type,
         ofproto->ogf.max_groups[i] = OFPG_MAX;
         ofproto->ogf.ofpacts[i] = (UINT64_C(1) << N_OFPACTS) - 1;
     }
+    tun_metadata_init();
 
     error = ofproto->ofproto_class->construct(ofproto);
     if (error) {
@@ -6924,15 +6926,24 @@ handle_geneve_table_mod(struct ofconn *ofconn, const struct ofp_header *oh)
         return error;
     }
 
+    error = tun_metadata_table_mod(&gtm);
+
     ofputil_uninit_geneve_table(&gtm.mappings);
-    return OFPERR_OFPBRC_BAD_TYPE;
+    return error;
 }
 
 static enum ofperr
-handle_geneve_table_request(struct ofconn *ofconn OVS_UNUSED,
-                            const struct ofp_header *oh OVS_UNUSED)
+handle_geneve_table_request(struct ofconn *ofconn, const struct ofp_header *oh)
 {
-    return OFPERR_OFPBRC_BAD_TYPE;
+    struct ofputil_geneve_table_reply gtr;
+    struct ofpbuf *b;
+
+    tun_metadata_table_request(&gtr);
+    b = ofputil_encode_geneve_table_reply(oh, &gtr);
+    ofputil_uninit_geneve_table(&gtr.mappings);
+
+    ofconn_send_reply(ofconn, b);
+    return 0;
 }
 
 static enum ofperr
diff --git a/tests/ofproto.at b/tests/ofproto.at
index 9c5f0bb..8164672 100644
--- a/tests/ofproto.at
+++ b/tests/ofproto.at
@@ -1497,7 +1497,7 @@ OVS_VSWITCHD_START
       instructions: meter,apply_actions,clear_actions,write_actions,write_metadata$goto
       Write-Actions and Apply-Actions features:
         actions: output group set_field strip_vlan push_vlan mod_nw_ttl dec_ttl set_mpls_ttl dec_mpls_ttl push_mpls pop_mpls set_queue
-        supported on Set-Field: tun_id tun_src tun_dst tun_gbp_id tun_gbp_flags metadata in_port in_port_oxm pkt_mark reg0 reg1 reg2 reg3 reg4 reg5 reg6 reg7 xreg0 xreg1 xreg2 xreg3 eth_src eth_dst vlan_tci vlan_vid vlan_pcp mpls_label mpls_tc ip_src ip_dst ipv6_src ipv6_dst ipv6_label nw_tos ip_dscp nw_ecn nw_ttl arp_op arp_spa arp_tpa arp_sha arp_tha tcp_src tcp_dst udp_src udp_dst sctp_src sctp_dst nd_target nd_sll nd_tll
+        supported on Set-Field: tun_id tun_src tun_dst tun_gbp_id tun_gbp_flags tun_metadata0 tun_metadata1 tun_metadata2 tun_metadata3 tun_metadata4 tun_metadata5 tun_metadata6 tun_metadata7 tun_metadata8 tun_metadata9 tun_metadata10 tun_metadata11 tun_metadata12 tun_metadata13 tun_metadata14 tun_metadata15 tun_metadata16 tun_metadata17 tun_metadata18 tun_metadata19 tun_metadata20 tun_metadata21 tun_metadata22 tun_metadata23 tun_metadata24 tun_metadata25 tun_metadata26 tun_metadata27 tun_metadata28 tun_metadata29 tun_metadata30 tun_metadata31 tun_metadata32 tun_metadata33 tun_metadata34 tun_metadata35 tun_metadata36 tun_metadata37 tun_metadata38 tun_metadata39 tun_metadata40 tun_metadata41 tun_metadata42 tun_metadata43 tun_metadata44 tun_metadata45 tun_metadata46 tun_metadata47 tun_metadata48 tun_metadata49 tun_metadata50 tun_metadata51 tun_metadata52 tun_metadata53 tun_metadata54 tun_metadata55 tun_metadata56 tun_metadata57 tun_metadata58 tun_metadata59 tun_metadata60 tun_metadata61 tun_metadata62 tun_metadata63 metadata in_port in_port_oxm pkt_mark reg0 reg1 reg2 reg3 reg4 reg5 reg6 reg7 xreg0 xreg1 xreg2 xreg3 eth_src eth_dst vlan_tci vlan_vid vlan_pcp mpls_label mpls_tc ip_src ip_dst ipv6_src ipv6_dst ipv6_label nw_tos ip_dscp nw_ecn nw_ttl arp_op arp_spa arp_tpa arp_sha arp_tha tcp_src tcp_dst udp_src udp_dst sctp_src sctp_dst nd_target nd_sll nd_tll
     matching:
       dp_hash: arbitrary mask
       recirc_id: exact match or wildcard
@@ -1507,6 +1507,70 @@ OVS_VSWITCHD_START
       tun_dst: arbitrary mask
       tun_gbp_id: arbitrary mask
       tun_gbp_flags: arbitrary mask
+      tun_metadata0: arbitrary mask
+      tun_metadata1: arbitrary mask
+      tun_metadata2: arbitrary mask
+      tun_metadata3: arbitrary mask
+      tun_metadata4: arbitrary mask
+      tun_metadata5: arbitrary mask
+      tun_metadata6: arbitrary mask
+      tun_metadata7: arbitrary mask
+      tun_metadata8: arbitrary mask
+      tun_metadata9: arbitrary mask
+      tun_metadata10: arbitrary mask
+      tun_metadata11: arbitrary mask
+      tun_metadata12: arbitrary mask
+      tun_metadata13: arbitrary mask
+      tun_metadata14: arbitrary mask
+      tun_metadata15: arbitrary mask
+      tun_metadata16: arbitrary mask
+      tun_metadata17: arbitrary mask
+      tun_metadata18: arbitrary mask
+      tun_metadata19: arbitrary mask
+      tun_metadata20: arbitrary mask
+      tun_metadata21: arbitrary mask
+      tun_metadata22: arbitrary mask
+      tun_metadata23: arbitrary mask
+      tun_metadata24: arbitrary mask
+      tun_metadata25: arbitrary mask
+      tun_metadata26: arbitrary mask
+      tun_metadata27: arbitrary mask
+      tun_metadata28: arbitrary mask
+      tun_metadata29: arbitrary mask
+      tun_metadata30: arbitrary mask
+      tun_metadata31: arbitrary mask
+      tun_metadata32: arbitrary mask
+      tun_metadata33: arbitrary mask
+      tun_metadata34: arbitrary mask
+      tun_metadata35: arbitrary mask
+      tun_metadata36: arbitrary mask
+      tun_metadata37: arbitrary mask
+      tun_metadata38: arbitrary mask
+      tun_metadata39: arbitrary mask
+      tun_metadata40: arbitrary mask
+      tun_metadata41: arbitrary mask
+      tun_metadata42: arbitrary mask
+      tun_metadata43: arbitrary mask
+      tun_metadata44: arbitrary mask
+      tun_metadata45: arbitrary mask
+      tun_metadata46: arbitrary mask
+      tun_metadata47: arbitrary mask
+      tun_metadata48: arbitrary mask
+      tun_metadata49: arbitrary mask
+      tun_metadata50: arbitrary mask
+      tun_metadata51: arbitrary mask
+      tun_metadata52: arbitrary mask
+      tun_metadata53: arbitrary mask
+      tun_metadata54: arbitrary mask
+      tun_metadata55: arbitrary mask
+      tun_metadata56: arbitrary mask
+      tun_metadata57: arbitrary mask
+      tun_metadata58: arbitrary mask
+      tun_metadata59: arbitrary mask
+      tun_metadata60: arbitrary mask
+      tun_metadata61: arbitrary mask
+      tun_metadata62: arbitrary mask
+      tun_metadata63: arbitrary mask
       metadata: arbitrary mask
       in_port: exact match or wildcard
       in_port_oxm: exact match or wildcard
@@ -1581,7 +1645,7 @@ AT_CHECK(
 # Check that the configuration was updated.
 mv expout orig-expout
 sed 's/classifier/main/
-77s/1000000/1024/' < orig-expout > expout
+141s/1000000/1024/' < orig-expout > expout
 AT_CHECK([ovs-ofctl -O OpenFlow13 dump-table-features br0 | sed '/^$/d
 /^OFPST_TABLE_FEATURES/d'], [0], [expout])
 OVS_VSWITCHD_STOP
diff --git a/tests/ovs-ofctl.at b/tests/ovs-ofctl.at
index 6c48569..824af57 100644
--- a/tests/ovs-ofctl.at
+++ b/tests/ovs-ofctl.at
@@ -17,6 +17,8 @@ for test_case in \
     'tun_gbp_id=0/0x1                            NXM,OXM' \
     'tun_gbp_flags=0                             NXM,OXM' \
     'tun_gbp_flags=0/0x1                         NXM,OXM' \
+    'tun_metadata0=0                             NXM,OXM' \
+    'tun_metadata0=0/0x1                         NXM,OXM' \
     'metadata=0                                  NXM,OXM,OpenFlow11' \
     'metadata=1/1                                NXM,OXM,OpenFlow11' \
     'in_port=1                                   any' \
diff --git a/tests/tunnel.at b/tests/tunnel.at
index 7ff1ba4..ce4cb1e 100644
--- a/tests/tunnel.at
+++ b/tests/tunnel.at
@@ -411,3 +411,68 @@ AT_CHECK([tail -1 stdout], [0],
 ])
 OVS_VSWITCHD_STOP
 AT_CLEANUP
+
+AT_SETUP([tunnel - Geneve metadata])
+OVS_VSWITCHD_START([add-port br0 p1 -- set Interface p1 type=geneve \
+                    options:remote_ip=1.1.1.1 ofport_request=1 \
+                    -- add-port br0 p2 -- set Interface p2 type=dummy \
+                    ofport_request=2])
+OVS_VSWITCHD_DISABLE_TUNNEL_PUSH_POP
+
+AT_CHECK([ovs-ofctl add-geneve-map br0 "{class=0xffff,type=0,len=4}->tun_metadata0,{class=0xffff,type=1,len=8}->tun_metadata1"])
+
+AT_DATA([flows.txt], [dnl
+in_port=2,actions=set_field:0xa->tun_metadata0,set_field:0x1234567890abcdef->tun_metadata1,1
+tun_metadata0=0xb/0xf,actions=2
+])
+AT_CHECK([ovs-ofctl add-flows br0 flows.txt])
+
+dnl Option generation
+AT_CHECK([ovs-appctl ofproto/trace ovs-dummy 'in_port(2),eth(src=50:54:00:00:00:05,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=1,tos=0,ttl=128,frag=no),icmp(type=8,code=0)'], [0], [stdout])
+AT_CHECK([tail -1 stdout], [0],
+  [Datapath actions: set(tunnel(dst=1.1.1.1,ttl=64,geneve({class=0xffff,type=0,len=4,0xa}{class=0xffff,type=0x1,len=8,0x1234567890abcdef}),flags(df))),6081
+])
+
+dnl Option match
+AT_CHECK([ovs-appctl ofproto/trace ovs-dummy 'recirc_id(0),tunnel(tun_id=0x0,src=1.1.1.1,dst=1.1.1.2,ttl=64,geneve({class=0xffff,type=0,len=4,0xb}),flags(df,key)),in_port(6081),skb_mark(0),eth_type(0x0800),ipv4(frag=no)'], [0], [stdout])
+AT_CHECK([tail -2 stdout], [0],
+  [Megaflow: pkt_mark=0,recirc_id=0,ip,tun_id=0,tun_src=1.1.1.1,tun_dst=1.1.1.2,tun_tos=0,tun_ttl=64,df|key,tun_metadata0=0xb/0xf,in_port=1,nw_frag=no
+Datapath actions: 2
+])
+
+dnl Skip unknown option
+AT_CHECK([ovs-appctl ofproto/trace ovs-dummy 'recirc_id(0),tunnel(tun_id=0x0,src=1.1.1.1,dst=1.1.1.2,ttl=64,geneve({class=0xffff,type=0,len=4,0xb}{class=0xffff,type=2,len=4,0xc}),flags(df,key)),in_port(6081),skb_mark(0),eth_type(0x0800),ipv4(frag=no)'], [0], [stdout])
+AT_CHECK([tail -2 stdout], [0],
+  [Megaflow: pkt_mark=0,recirc_id=0,ip,tun_id=0,tun_src=1.1.1.1,tun_dst=1.1.1.2,tun_tos=0,tun_ttl=64,df|key,tun_metadata0=0xb/0xf,in_port=1,nw_frag=no
+Datapath actions: 2
+])
+
+dnl Check mapping table constraints
+AT_CHECK([ovs-ofctl add-geneve-map br0 "{class=0xffff,type=2,len=124}->tun_metadata2,{class=0xffff,type=3,len=124}->tun_metadata3"], [1], [ignore],
+[OFPT_ERROR (xid=0x4): NXGTMFC_TABLE_FULL
+NXT_GENEVE_TABLE_MOD (xid=0x4):
+ ADD mapping table:
+ class	type	length	match field
+ -----	----	------	-----------
+ 0xffff	0x2	124	tun_metadata2
+ 0xffff	0x3	124	tun_metadata3
+])
+
+dnl Allocation and match with fragmented address space
+AT_CHECK([ovs-ofctl add-geneve-map br0 "{class=0xffff,type=2,len=124}->tun_metadata2"])
+AT_CHECK([ovs-ofctl add-geneve-map br0 "{class=0xffff,type=3,len=4}->tun_metadata3"])
+AT_CHECK([ovs-ofctl add-geneve-map br0 "{class=0xffff,type=4,len=112}->tun_metadata4"])
+AT_CHECK([ovs-ofctl del-geneve-map br0 "{class=0xffff,type=3,len=4}->tun_metadata3"])
+AT_CHECK([ovs-ofctl add-geneve-map br0 "{class=0xffff,type=3,len=8}->tun_metadata3"])
+
+AT_CHECK([ovs-ofctl add-flow br0 tun_metadata3=0x1234567890abcdef,actions=2])
+AT_CHECK([ovs-appctl ofproto/trace ovs-dummy 'recirc_id(0),tunnel(tun_id=0x0,src=1.1.1.1,dst=1.1.1.2,ttl=64,geneve({class=0xffff,type=3,len=8,0x1234567890abcdef}),flags(df,key)),in_port(6081),skb_mark(0),eth_type(0x0800),ipv4(frag=no)'], [0], [stdout])
+AT_CHECK([tail -2 stdout], [0],
+  [Megaflow: pkt_mark=0,recirc_id=0,ip,tun_id=0,tun_src=1.1.1.1,tun_dst=1.1.1.2,tun_tos=0,tun_ttl=64,df|key,tun_metadata0=0/0xf,tun_metadata3=0x1234567890abcdef,in_port=1,nw_frag=no
+Datapath actions: 2
+])
+
+AT_CHECK([ovs-ofctl del-geneve-map br0])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP
diff --git a/utilities/ovs-ofctl.8.in b/utilities/ovs-ofctl.8.in
index f83760a..9852e61 100644
--- a/utilities/ovs-ofctl.8.in
+++ b/utilities/ovs-ofctl.8.in
@@ -418,6 +418,10 @@ commands. The format for \fIoptions\fR is given in \fBOption Syntax\fR below.
 Note that mappings should not be changed while they are in active use by
 a flow. The result of doing so is undefined.
 
+Currently, the Geneve mapping table is shared between all OpenFlow
+switches in a given instance of Open vSwitch. This restriction will
+be lifted in the future to allow for easier management.
+
 .IP "\fBadd\-geneve\-map \fIswitch options\fR"
 Add each option entry to \fIswitch\fR's tables. Duplicate fields are
 rejected.
@@ -1175,6 +1179,19 @@ set.
 For more information, please see the corresponding IETF draft:
 https://tools.ietf.org/html/draft-smith-vxlan-group-policy
 .
+.IP "\fBtun_metadata\fIidx\fB=\fIvalue\fR[\fB/\fImask\fR]"
+Matches \fIvalue\fR either exactly or with optional \fImask\fR in
+tunnel metadata field number \fIidx\fR (numbered from 0 to 63).
+Tunnel metadata fields can be dynamically mapped onto the data
+contained in the options of Geneve packets using the commands
+described in the section \fBOpenFlow Switch Geneve Option Table
+Commands\fR. Once mapped, the length of the field varies
+according to the size of the option. Before updating a mapping in
+the option table, remove any flows that reference it;
+otherwise the result is undefined.
+.IP
+These fields were introduced in Open vSwitch 2.5.
+.
 .IP "\fBreg\fIidx\fB=\fIvalue\fR[\fB/\fImask\fR]"
 Matches \fIvalue\fR either exactly or with optional \fImask\fR in
 register number \fIidx\fR.  The valid range of \fIidx\fR depends on
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index c43bfd1..fddd45b 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -1828,12 +1828,11 @@
 
           <dt><code>geneve</code></dt>
           <dd>
-            An Ethernet over Geneve (<code>http://tools.ietf.org/html/draft-gross-geneve-00</code>)
+            An Ethernet over Geneve (<code>http://tools.ietf.org/html/draft-ietf-nvo3-geneve-00</code>)
             IPv4 tunnel.
 
-            Geneve supports options as a means to transport additional metadata,
-            however, currently only the 24-bit VNI is supported. This is planned
-            to be extended in the future.
+            A description of how to match and set Geneve options can be found
+            in the <code>ovs-ofctl</code> manual page.
           </dd>
 
           <dt><code>gre</code></dt>
-- 
2.1.0
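
For readers following the "Check mapping table constraints" and "Allocation
and match with fragmented address space" cases in tests/tunnel.at, the byte
arithmetic can be sketched as below. This is only a rough model, not the
allocator in this patch: the 256-byte total, the OptionSpace class, and the
first-fit hole list are all assumptions made for illustration. The lengths
are the ones used in the test.

```python
# Hypothetical model of the option-space budget exercised by tests/tunnel.at.
# ASSUMPTION: 256 bytes of total option space (not stated in this excerpt).
TOTAL_OPT_SPACE = 256


class OptionSpace:
    def __init__(self, total=TOTAL_OPT_SPACE):
        self.free = [(0, total)]  # sorted (offset, length) holes
        self.maps = {}            # field index -> list of (offset, length)

    def free_bytes(self):
        return sum(length for _, length in self.free)

    def add(self, requests):
        """Map every (idx, length) pair in requests, or none of them."""
        if any(idx in self.maps for idx, _ in requests):
            raise ValueError("duplicate field")
        if sum(length for _, length in requests) > self.free_bytes():
            raise MemoryError("TABLE_FULL")
        for idx, length in requests:
            chunks = []
            while length:
                # First fit: consume holes in offset order, splitting
                # the field across several holes when necessary.
                off, hole = self.free.pop(0)
                take = min(hole, length)
                chunks.append((off, take))
                if take < hole:
                    self.free.insert(0, (off + take, hole - take))
                length -= take
            self.maps[idx] = chunks

    def delete(self, idx):
        self.free.extend(self.maps.pop(idx))
        self.free.sort()


space = OptionSpace()
space.add([(0, 4), (1, 8)])            # tun_metadata0, tun_metadata1

try:
    space.add([(2, 124), (3, 124)])    # 12 + 248 bytes exceeds 256
    rejected = False
except MemoryError:
    rejected = True

space.add([(2, 124)])
space.add([(3, 4)])
space.add([(4, 112)])
space.delete(3)                        # leaves a 4-byte hole mid-space
space.add([(3, 8)])                    # spans the hole and the 4-byte tail
```

Under these assumptions the double 124-byte request is rejected as a unit,
and the final 8-byte mapping for tun_metadata3 is stored in two
non-contiguous 4-byte pieces, which is the situation the last trace in the
test is meant to exercise.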



