[ovs-dev] [PATCH v7] Extend OVS IPFIX exporter to export tunnel headers

Ben Pfaff blp at nicira.com
Tue Aug 5 17:05:09 UTC 2014


On Thu, Jul 31, 2014 at 04:12:30AM -0700, Wenyu Zhang wrote:
> Extend IPFIX exporter to export tunnel headers when both input and output
> of the port.
> Add three other_config options in IPFIX table: enable-input-sampling,
> enable-output-sampling and enable-tunnel-sampling, to control whether
> sampling tunnel info, on which direction (input or output).
> Insert sampling action before output action and the output tunnel port
> is sent to datapath in the sampling action.
> Make datapath collect output tunnel info and send it back to userpace
> in upcall message with a new additional optional attribute.
> Add a tunnel ports map to make the tunnel port lookup faster in sampling
> upcalls in IPFIX exporter. Make the IPFIX exporter generate IPFIX template
> sets with enterprise elements for the tunnel info, save the tunnel info
> in IPFIX cache entries, and send IPFIX DATA with tunnel info.
> Add flowDirection element in IPFIX templates.
> 
> Signed-off-by: Wenyu Zhang <wenyuz at vmware.com>
> Acked-by: Romain Lenglet <rlenglet at vmware.com>

Important
---------

This needs a rebase because openvswitch.h has moved.

There are sparse warnings and errors.  I'll append a patch to fix
these.

In datapath/flow_netlink.c, ipv4_tun_from_nlattr() is missing a
"break" in the OVS_TUNNEL_KEY_ATTR_TP_DST case.


Minor
-----
The comment on 'out_tun_key' in the definition of struct dpif_upcall
is wrong.

The comments in enum ovs_tunnel_key_attr say that
OVS_TUNNEL_KEY_ATTR_TP_SRC and OVS_TUNNEL_KEY_ATTR_TP_DST are u16s but
they appear to be be16s instead.

I don't understand the comment for OVS_PACKET_ATTR_OUT_TUNNEL_KEY.
How is the port specified on OVS_USERSPACE_ATTR_TUNNEL_OUT_PORT used?

I believe that OVS_USERSPACE_ATTR_TUNNEL_OUT_PORT is a kernel port
number.  However, the odp_put_userspace_action() parameter for this
value has 'pid' in the name (and I see a few other related uses of
"pid").  A PID is a different concept.

NUM_DPIF_IPFIX_TUNNEL is not actually the number of tunnel types since
the list skips 0x04 and 0x06.

IPFIX_PROTO_TUNNEL_UNKNOWN and IPFIX_PROTO_TUNNEL_KNOWN seem like odd
names since I think they actually mean "not tunneled" and "tunneled".

Please break long lines, e.g.:
    #define IPFIX_ENTERPRISE_ENTITY(ENUM, ID, SIZE, NAME, ENTERPRISE)  IPFIX_ENTITY_ID_##ENUM = IPFIX_SET_ENTERPRISE(ID),
as:
    #define IPFIX_ENTERPRISE_ENTITY(ENUM, ID, SIZE, NAME, ENTERPRISE) \
        IPFIX_ENTITY_ID_##ENUM = IPFIX_SET_ENTERPRISE(ID),

I don't like adding a new member to flow_tnl just to copy out a few
bytes individually in one place.  The diff I appended includes a
suggestion.
 
In dpif_ipfix_add_tunnel_port(), the check for dip == NULL should be
removed.  xmalloc() never returns NULL.

dpif_ipfix_add_tunnel_port() consistently uses strncmp() instead of
strcmp().  Why?  (Please use strcmp().)

Why is Geneve restricted to a 24-bit key?

Please don't parenthesize the expression on 'return' (see
dpif_ipfix_get_tunnel_port()).

ipfix_send_template_msg() now contains a fair bit of cut-and-paste
duplication.  Can you try to reduce that?

In ipfix_cache_entry_init(), please write pointer declarators with a
leading but not a trailing space, e.g.:
        struct ipfix_data_record_flow_key_tunnel *data_tunnel;

Just below that, also please omit parentheses when taking the size of
an expression, e.g.:
        data_tunnel = ofpbuf_put_zeros(&msg, sizeof *data_tunnel +
                                             tunnel_port->tunnel_key_length);

Please put a space before { in cases like this:
            if (dupcall->out_tun_key){
I see this in a few places.

I'm not sure why tnl_xlate_init() has this new code:
        /* The tp_src and tp_dst members in flow_tnl are set to be always
         * wildcarded, not to unwildcard them here. */
        wc->masks.tunnel.tp_src = 0;
        wc->masks.tunnel.tp_dst = 0;
Does it have any visible effect?  It appears to me that these fields
are already 0, as initialized by the caller.

It would be nice to document the VMware enterprise entities in
vswitch.xml, or at least point to documentation.

Here are changes that I suggest folding in.  They include "sparse"
fixes, updates to comments and code to fit the OVS style (including a
lot of spelling errors in comments), and a couple of other minor
things I noticed.

diff --git a/datapath-windows/ovsext/OvsNetProto.h b/datapath-windows/ovsext/OvsNetProto.h
index 5e98206..a21ab5c 100644
--- a/datapath-windows/ovsext/OvsNetProto.h
+++ b/datapath-windows/ovsext/OvsNetProto.h
@@ -80,6 +80,7 @@ typedef UINT64 IP6UnitLength;
 #define IPPROTO_ICMP    1
 #define IPPROTO_IGMP    2
 #define IPPROTO_UDP     17
+#define IPPROTO_GRE     47
 #define IPPROTO_TCP     6
 #define IPPROTO_RSVD    0xff
 
diff --git a/include/sparse/netinet/in.h b/include/sparse/netinet/in.h
index c63dca7..83ad2cc 100644
--- a/include/sparse/netinet/in.h
+++ b/include/sparse/netinet/in.h
@@ -66,6 +66,7 @@ struct sockaddr_in6 {
 #define IPPROTO_UDP 17
 #define IPPROTO_ROUTING 43
 #define IPPROTO_FRAGMENT 44
+#define IPPROTO_GRE 47
 #define IPPROTO_AH 51
 #define IPPROTO_ICMPV6 58
 #define IPPROTO_NONE 59
diff --git a/lib/odp-util.c b/lib/odp-util.c
index 49d3b0f..c64201e 100644
--- a/lib/odp-util.c
+++ b/lib/odp-util.c
@@ -523,7 +523,7 @@ parse_odp_userspace_action(const char *s, struct ofpbuf *actions)
     uint32_t pid;
     union user_action_cookie cookie;
     struct ofpbuf buf;
-    uint32_t tunnel_out_port;
+    odp_port_t tunnel_out_port;
     int n = -1;
     void *user_data = NULL;
     size_t user_data_size = 0;
@@ -593,7 +593,7 @@ parse_odp_userspace_action(const char *s, struct ofpbuf *actions)
                             &output, &n1) ) {
             n += n1;
             cookie.type = USER_ACTION_COOKIE_IPFIX;
-            cookie.ipfix.output_odp_port = output;
+            cookie.ipfix.output_odp_port = u32_to_odp(output);
             user_data = &cookie;
             user_data_size = sizeof cookie.ipfix;
         } else if (ovs_scan(&s[n], ",userdata(%n",
@@ -3550,7 +3550,7 @@ odp_key_fitness_to_string(enum odp_key_fitness fitness)
 size_t
 odp_put_userspace_action(uint32_t pid,
                          const void *userdata, size_t userdata_size,
-                         const uint32_t *tunnel_out_pid,
+                         const odp_port_t *tunnel_out_port,
                          struct ofpbuf *odp_actions)
 {
     size_t userdata_ofs;
@@ -3577,8 +3577,9 @@ odp_put_userspace_action(uint32_t pid,
     } else {
         userdata_ofs = 0;
     }
-    if (tunnel_out_pid) {
-        nl_msg_put_u32(odp_actions, OVS_USERSPACE_ATTR_TUNNEL_OUT_PORT, *tunnel_out_pid);
+    if (tunnel_out_port) {
+        nl_msg_put_odp_port(odp_actions, OVS_USERSPACE_ATTR_TUNNEL_OUT_PORT,
+                            *tunnel_out_port);
     }
     nl_msg_end_nested(odp_actions, offset);
 
diff --git a/lib/odp-util.h b/lib/odp-util.h
index 04f657c..f511091 100644
--- a/lib/odp-util.h
+++ b/lib/odp-util.h
@@ -230,14 +230,14 @@ union user_action_cookie {
 
     struct {
         uint16_t type;          /* USER_ACTION_COOKIE_IPFIX. */
-        uint32_t output_odp_port; /* The output odp port on datapath to get tunnel info. */
+        odp_port_t output_odp_port; /* The output odp port on datapath to get tunnel info. */
     } ipfix;
 };
 BUILD_ASSERT_DECL(sizeof(union user_action_cookie) == 16);
 
 size_t odp_put_userspace_action(uint32_t pid,
                                 const void *userdata, size_t userdata_size,
-                                const uint32_t *tunnel_out_pid,
+                                const odp_port_t *tunnel_out_port,
                                 struct ofpbuf *odp_actions);
 void odp_put_tunnel_action(const struct flow_tnl *tunnel,
                            struct ofpbuf *odp_actions);
diff --git a/lib/packets.h b/lib/packets.h
index e54b287..0258745 100644
--- a/lib/packets.h
+++ b/lib/packets.h
@@ -33,10 +33,7 @@ struct ds;
 
 /* Tunnel information used in flow key and metadata. */
 struct flow_tnl {
-    union {
-        ovs_be64 tun_id;
-        uint8_t tun_id_octarray[8];
-    };
+    ovs_be64 tun_id;
     ovs_be32 ip_src;
     ovs_be32 ip_dst;
     uint16_t flags;
@@ -68,7 +65,7 @@ struct pkt_metadata {
 };
 
 #define PKT_METADATA_INITIALIZER(PORT) \
-    (struct pkt_metadata){ 0, 0, { {0}, 0, 0, 0, 0, 0, 0, 0}, 0, 0, {(PORT)} }
+    (struct pkt_metadata){ .in_port.odp_port = PORT }
 
 bool dpid_from_string(const char *s, uint64_t *dpidp);
 
diff --git a/ofproto/ipfix-enterprise-entities.def b/ofproto/ipfix-enterprise-entities.def
index 013cee2..ea94586 100644
--- a/ofproto/ipfix-enterprise-entities.def
+++ b/ofproto/ipfix-enterprise-entities.def
@@ -3,9 +3,7 @@
 #define IPFIX_ENTERPRISE_ENTITY(ENUM, ID, SIZE, NAME, ENTERPRISE)
 #endif
 
-#ifndef IPFIX_ENTERPRISE_VMWARE
 #define IPFIX_ENTERPRISE_VMWARE 6876
-#endif
 
 IPFIX_ENTERPRISE_ENTITY(TUNNEL_TYPE, 891, 1, tunnelType, IPFIX_ENTERPRISE_VMWARE)
 IPFIX_ENTERPRISE_ENTITY(TUNNEL_KEY, 892, 0, tunnelKey, IPFIX_ENTERPRISE_VMWARE)
@@ -15,5 +13,4 @@ IPFIX_ENTERPRISE_ENTITY(TUNNEL_PROTOCOL_IDENTIFIER, 895, 1, tunnelProtocolIdenti
 IPFIX_ENTERPRISE_ENTITY(TUNNEL_SOURCE_TRANSPORT_PORT, 896, 2, tunnelSourceTransportPort, IPFIX_ENTERPRISE_VMWARE)
 IPFIX_ENTERPRISE_ENTITY(TUNNEL_DESTINATION_TRANSPORT_PORT, 897, 2, tunnelDestinationTransportPort, IPFIX_ENTERPRISE_VMWARE)
 
-#undef IPFIX_ENTERPRISE_VMWARE
 #undef IPFIX_ENTERPRISE_ENTITY
diff --git a/ofproto/ofproto-dpif-ipfix.c b/ofproto/ofproto-dpif-ipfix.c
index e742fba..6071629 100644
--- a/ofproto/ofproto-dpif-ipfix.c
+++ b/ofproto/ofproto-dpif-ipfix.c
@@ -45,12 +45,12 @@ static struct ovs_mutex mutex = OVS_MUTEX_INITIALIZER;
 /* The standard layer2SegmentId (ID 351) element is included in vDS to send
  * the VxLAN tunnel's VNI. It is 64-bit long, the most significant byte is
  * used to indicate the type of tunnel (0x01 = VxLAN, 0x02 = GRE) and the three
- * lowest significant bytes hold the value of the layer 2 overlay network
+ * least significant bytes hold the value of the layer 2 overlay network
  * segment identifier: a 24-bit VxLAN tunnel's VNI or a 24-bit GRE tunnel's
- * TNI. This is not compatible with GRE-64, as implementd in OVS, as its
+ * TNI. This is not compatible with GRE-64, as implemented in OVS, as its
  * tunnel IDs are 64-bit.
  *
- * Two new enterprise information elements are defined which are simlar to
+ * Two new enterprise information elements are defined which are similar to
  * laryerSegmentId but support 64-bit IDs:
  *     tunnelType (ID 891) and tunnelKey (ID 892).
  *
@@ -374,7 +374,7 @@ BUILD_ASSERT_DECL(sizeof(struct ipfix_data_record_aggregated_ip) == 32);
     (sizeof(struct ipfix_set_header) \
      + MAX_DATA_RECORD_LEN)
 
-/* Max length of an IPFIX message. Arbitrarily set to accomodate low
+/* Max length of an IPFIX message. Arbitrarily set to accommodate low
  * MTU. */
 #define MAX_MESSAGE_LEN 1024
 
@@ -1027,7 +1027,7 @@ ipfix_define_template_entity(enum ipfix_entity_id id,
         field->field_length = htons(size);
     } else {
         /* RFC 5101, Section 7. Variable-Length Information Element */
-        field->field_length = 0xffff;
+        field->field_length = OVS_BE16_MAX;
     }
     if (enterprise) {
         field->enterprise = htonl(enterprise);
@@ -1099,7 +1099,7 @@ ipfix_define_template_fields(enum ipfix_proto_l2 l2, enum ipfix_proto_l3 l3,
         }
     }
 
-    if (tunnel != IPFIX_PROTO_TUNNEL_UNKNOWN){
+    if (tunnel != IPFIX_PROTO_TUNNEL_UNKNOWN) {
         DEF(TUNNEL_SOURCE_IPV4_ADDRESS);
         DEF(TUNNEL_DESTINATION_IPV4_ADDRESS);
         DEF(TUNNEL_PROTOCOL_IDENTIFIER);
@@ -1165,10 +1165,10 @@ ipfix_send_template_msgs(struct dpif_ipfix_exporter *exporter,
                     continue;
                 }
                 for (tunnel = 0; tunnel < NUM_IPFIX_PROTO_TUNNEL; tunnel++) {
-                    /* When the size of the tmplate packet reaches
+                    /* When the size of the template packet reaches
                      * MAX_MESSAGE_LEN(1024), send it out.
-                     * And then reinit the msg to construct a new packet for
-                     * the following templates.
+                     * And then reinitialize the message to construct a new
+                     * packet for the following templates.
                      */
                     if (ofpbuf_size(&msg) >= MAX_MESSAGE_LEN) {
                         /* Send template message. */
@@ -1179,7 +1179,7 @@ ipfix_send_template_msgs(struct dpif_ipfix_exporter *exporter,
                         ipfix_send_msg(exporter->collectors, &msg);
                         ofpbuf_uninit(&msg);
 
-                        /* Reinit the msg buf. */
+                        /* Reinitialize the message buffer. */
                         ofpbuf_use_stub(&msg, msg_stub, sizeof msg_stub);
 
                         ipfix_init_header(export_time_sec, exporter->seq_number,
@@ -1491,16 +1491,17 @@ ipfix_cache_entry_init(struct ipfix_flow_cache_entry *entry,
 
     if (tunnel == IPFIX_PROTO_TUNNEL_KNOWN) {
         struct ipfix_data_record_flow_key_tunnel * data_tunnel;
+        const uint8_t *tun_id;
 
         data_tunnel = ofpbuf_put_zeros(&msg, sizeof(*data_tunnel) +
                                              tunnel_port->tunnel_key_length);
         data_tunnel->tunnel_source_ipv4_address = tunnel_key->ip_src;
         data_tunnel->tunnel_destination_ipv4_address = tunnel_key->ip_dst;
         /* The tunnel_protocol_identifier is from tunnel_proto array, which
-         * contians protocol_identifiers of each tunnel type.
+         * contains protocol_identifiers of each tunnel type.
          * For the tunnel type on the top of IPSec, which uses the protocol
          * identifier of the upper tunnel type is used, the tcp_src and tcp_dst
-         * are decided based on the protocol identifires.
+         * are decided based on the protocol identifiers.
          * E.g:
          * The protocol identifier of DPIF_IPFIX_TUNNEL_IPSEC_GRE is IPPROTO_GRE,
          * and both tp_src and tp_dst are zero.
@@ -1511,9 +1512,10 @@ ipfix_cache_entry_init(struct ipfix_flow_cache_entry *entry,
         data_tunnel->tunnel_destination_transport_port = tunnel_key->tp_dst;
         data_tunnel->tunnel_type = tunnel_port->tunnel_type;
         data_tunnel->tunnel_key_length = tunnel_port->tunnel_key_length;
-        /* tun_id is in network order, and tunnel key is in low bits*/
+        /* tun_id is in network order, and tunnel key is in low bits. */
+        tun_id = (const uint8_t *) &tunnel_key->tun_id;
         memcpy(data_tunnel->tunnel_key,
-               &tunnel_key->tun_id_octarray[8 - tunnel_port->tunnel_key_length],
+               &tun_id[8 - tunnel_port->tunnel_key_length],
                tunnel_port->tunnel_key_length);
     }
 
@@ -1684,7 +1686,7 @@ dpif_ipfix_bridge_sample(struct dpif_ipfix *di, struct ofpbuf *packet,
             tunnel_port = dpif_ipfix_find_port(di, input_odp_port);
         }
         if (output_odp_port != ODPP_NONE && output_tunnel_key) {
-            /* Output tunnel, output_tunnel_key must valid. */
+            /* Output tunnel, output_tunnel_key must be valid. */
             tunnel_key = output_tunnel_key;
             tunnel_port = dpif_ipfix_find_port(di, output_odp_port);
         }
diff --git a/ofproto/ofproto-dpif-ipfix.h b/ofproto/ofproto-dpif-ipfix.h
index 0222054..1d293a5 100644
--- a/ofproto/ofproto-dpif-ipfix.h
+++ b/ofproto/ofproto-dpif-ipfix.h
@@ -48,7 +48,7 @@ void dpif_ipfix_set_options(
 
 void dpif_ipfix_bridge_sample(struct dpif_ipfix *, struct ofpbuf *,
                               const struct flow *,
-                              uint32_t, uint32_t, const struct flow_tnl *);
+                              odp_port_t, odp_port_t, const struct flow_tnl *);
 void dpif_ipfix_flow_sample(struct dpif_ipfix *, struct ofpbuf *,
                             const struct flow *, uint32_t, uint16_t, uint32_t,
                             uint32_t);
diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
index 6edda74..3ee9fe9 100644
--- a/ofproto/ofproto-dpif-upcall.c
+++ b/ofproto/ofproto-dpif-upcall.c
@@ -906,7 +906,7 @@ convert_upcall(struct udpif *udpif, struct upcall *upcall)
             memcpy(&cookie, nl_attr_get(dupcall->userdata),
                    sizeof cookie.ipfix);
 
-            if (dupcall->out_tun_key){
+            if (dupcall->out_tun_key) {
                 memset(&output_tunnel_key, 0, sizeof output_tunnel_key);
                 odp_tun_key_from_attr(dupcall->out_tun_key,
                                       &output_tunnel_key);
diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index ded98fe..18e3538 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -2154,7 +2154,7 @@ compose_sample_action(const struct xbridge *xbridge,
                       const uint32_t probability,
                       const union user_action_cookie *cookie,
                       const size_t cookie_size,
-                      const uint32_t *tunnel_out_port)
+                      const odp_port_t *tunnel_out_port)
 {
     size_t sample_offset, actions_offset;
     odp_port_t odp_port;
@@ -2261,25 +2261,26 @@ compose_ipfix_action(const struct xbridge *xbridge,
 {
     uint32_t probability;
     union user_action_cookie cookie;
-    uint32_t *tunnel_out_port = NULL;
+    odp_port_t *tunnel_out_port = NULL;
 
     if (!xbridge->ipfix || flow->in_port.ofp_port == OFPP_NONE) {
         return;
     }
 
-    /* For input case, output_odp_port is ODPP_NONE, which is an invalid port number. */
+    /* For input case, output_odp_port is ODPP_NONE, which is an invalid port
+     * number. */
     if (output_odp_port == ODPP_NONE &&
         !dpif_ipfix_get_bridge_exporter_input_sampling(xbridge->ipfix)) {
         return;
     }
 
-    /* For output case, output_pid is valid*/
+    /* For output case, output_odp_port is valid. */
     if (output_odp_port != ODPP_NONE) {
         if (!dpif_ipfix_get_bridge_exporter_output_sampling(xbridge->ipfix)) {
             return;
         }
-        /* If enable-tunnel-sampling is enabled, need put an additional option
-         * attribute: OVS_USERSPACE_ATTR_TUNNEL_OUT_PORT
+        /* If tunnel sampling is enabled, put an additional option attribute:
+         * OVS_USERSPACE_ATTR_TUNNEL_OUT_PORT.
          */
         if (dpif_ipfix_get_bridge_exporter_tunnel_sampling(xbridge->ipfix) &&
             dpif_ipfix_get_tunnel_port(xbridge->ipfix, output_odp_port) ) {

Thanks,

Ben.



More information about the dev mailing list