[ovs-dev] [threaded reval 2/2] ofproto: Handle flow installation and eviction in upcall.

Ben Pfaff blp at nicira.com
Tue Dec 17 05:58:33 UTC 2013


On Sun, Dec 15, 2013 at 05:57:57PM -0800, Ethan Jackson wrote:
> This patch moves flow installation and eviction from ofproto-dpif and
> the main thread, into ofproto-dpif-upcall.  This performs
> significantly better (approximately 2x TCP_CRR improvement), and
> allows ovs-vswitchd to maintain significantly larger datapath flow
> tables.  On top of that, it significantly simplifies the code,
> retiring "strut facet" and friends.

"struct"

> Signed-off-by: Ethan Jackson <ethan at nicira.com>

This is good work!  I especially like how this is a net removal of
code.

I would mention in NEWS that flow-eviction-threshold has gone away.
You might want to mention that there is a new flow-limit parameter
(but if you do please also recommend reading the documentation since
the meaning is different).

ofproto-dpif-upcall.h now has two prototypes for
udpif_get_memory_usage().

The definition of FLOW_MISS_MAX_BATCH could be moved into
ofproto-dpif-upcall.c.

I think that the remaining comment in ofproto-dpif-upcall.h is wrong:
I don't think it hands anything up to the main module anymore.

I wasn't sure what this comment on rule_dpif and group_dpif meant, maybe
you could rephrase it:
     *   - Do include packets and bytes from datapath flows which have not been
     *   pushed by ofproto-dpif-upcall. */

Eww:
+        /* We know stats are relatively fresh since a dump just happened, so

Why does udpif_set_threads() destroy each revalidator thread's ukeys?
Can't the thread itself do that before it exits?xs

This might be premature optimization, but did you consider putting
'key_buf' at the end of struct udpif_key so that all of the frequently
used data is in the first cache line?  Ditto for 'key_buf' and
'mask_buf' in struct udpif_flow_dump.

I think that the 'op' member of struct udpif_flow_dump is only used
inside revalidate_udumps().  It adds over 40 bytes to each instance.
Maybe it doesn't matter because there's a limited number of instances
in flight at a given time?  But otherwise we could easily allocate
them on the stack in revalidate_udumps().

'key' in udpif_key, and 'key' and 'mask' in udpif_flow_dump, seem to
be unneeded: they always point to the beginning of the corresponding
_buf.  Maybe they are convenient enough to keep, though.

udpif_get_n_flows() has a mysteriously 'static' local variable.

In handle_upcalls(), I like to do this sort of comment, where we're
actually saying something about the alternative case:
        if (miss->upcall_type == DPIF_UC_MISS) {
            /* For non-miss upcalls, there's a flow in the datapath which this
             * packet was accounted to.  Presumably the revalidators will deal
             * with pushing its stats eventually. */
            xin.resubmit_stats = &miss->stats;
        }
as:
        if (miss->upcall_type == DPIF_UC_MISS) {
            xin.resubmit_stats = &miss->stats;
        } else {
            /* There's a flow in the datapath which this packet was accounted
             * to.  The revalidators will push its stats eventually. */
        }
but I'll understand if you think that's weird.

A struct odputil_keybuf is kind of big because of the odputil_keybuf
member, but revalidate_udumps() uses xzalloc() to allocate one.  Can
we use xmalloc() instead and then just zero the members we must?

Could you document n-revalidator-threads in vswitch.xml?

I have a few more stylistic and other nonessential comments.  I shall
present them in the form of an interpretive dance expressing my joy at
the deletion of code.  I mean, I shall present them as an incremental
patch.

diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
index a74015f..a7f6dc3 100644
--- a/ofproto/ofproto-dpif-upcall.c
+++ b/ofproto/ofproto-dpif-upcall.c
@@ -31,6 +31,7 @@
 #include "ofpbuf.h"
 #include "ofproto-dpif-ipfix.h"
 #include "ofproto-dpif-sflow.h"
+#include "ofproto-dpif-xlate.h"
 #include "packets.h"
 #include "poll-loop.h"
 #include "seq.h"
@@ -38,6 +39,7 @@
 #include "vlog.h"
 
 #define MAX_QUEUE_LENGTH 512
+#define FLOW_MISS_MAX_BATCH 50
 #define REVALIDATE_MAX_BATCH 50
 
 VLOG_DEFINE_THIS_MODULE(ofproto_dpif_upcall);
@@ -45,8 +47,8 @@ VLOG_DEFINE_THIS_MODULE(ofproto_dpif_upcall);
 COVERAGE_DEFINE(upcall_queue_overflow);
 
 /* A thread that processes each upcall handed to it by the dispatcher thread,
- * forwards the upcall's packet, and then queues it to the main ofproto_dpif
- * to possibly set up a kernel flow as a cache. */
+ * forwards the upcall's packet, and possibly sets up a kernel flow as a
+ * cache. */
 struct handler {
     struct udpif *udpif;               /* Parent udpif. */
     pthread_t thread;                  /* Thread ID. */
@@ -64,6 +66,9 @@ struct handler {
                                           'mutex'. */
 };
 
+/* A thread that processes each kernel flow handed to it by the flow_dumper
+ * thread, updates OpenFlow statistics, and updates or removes the kernel flow
+ * as necessary. */
 struct revalidator {
     struct udpif *udpif;               /* Parent udpif. */
     char *name;                        /* Thread name. */
@@ -81,11 +86,14 @@ struct revalidator {
 
 /* An upcall handler for ofproto_dpif.
  *
- * udpif is implemented as a "dispatcher" thread that reads upcalls from the
- * kernel.  It processes each upcall just enough to figure out its next
- * destination.  For a "miss" upcall (MISS_UPCALL), this is one of several
- * "handler" threads (see struct handler).  Other upcalls are queued to the
- * main ofproto_dpif. */
+ * udpif has two logically separate pieces:
+ *
+ *    - A "dispatcher" thread that reads upcalls from the kernel and dispatches
+ *      them to one of several "handler" threads (see struct handler).
+ *
+ *    - A "flow_dumper" thread that reads the kernel flow table and dispatches
+ *      flows to one of several "revalidator" threads (see struct
+ *      revalidator). */
 struct udpif {
     struct list list_node;             /* In all_udpifs list. */
 
@@ -432,7 +440,7 @@ udpif_get_memory_usage(struct udpif *udpif, struct simap *usage)
         ovs_mutex_lock(&revalidator->mutex);
         simap_increase(usage, "revalidator dumps", revalidator->n_udumps);
 
-        /* XXX: This isn't technically thread safe becuase the revalidator
+        /* XXX: This isn't technically thread safe because the revalidator
          * ukeys maps isn't protected by a mutex since it's per thread. */
         simap_increase(usage, "revalidator keys",
                        hmap_count(&revalidator->ukeys));
@@ -891,7 +899,7 @@ handle_upcalls(struct handler *handler, struct list *upcalls)
     atomic_read(&udpif->flow_limit, &flow_limit);
     may_put = udpif_get_n_flows(udpif) < flow_limit;
 
-    /* Extract the flow from each upcall.  Construct in misses a hash table
+    /* Extract the flow from each upcall.  Construct in 'misses' a hash table
      * that maps each unique flow to a 'struct flow_miss'.
      *
      * Most commonly there is a single packet per flow_miss, but there are
@@ -1042,9 +1050,9 @@ handle_upcalls(struct handler *handler, struct list *upcalls)
         xin.may_learn = true;
 
         if (miss->upcall_type == DPIF_UC_MISS) {
-            /* For non miss upcalls, there's a flow in the datapath which this
+            /* For non-miss upcalls, there's a flow in the datapath which this
              * packet was accounted to.  Presumably the revalidators will deal
-             * with pushing it's stats eventually. */
+             * with pushing its stats eventually. */
             xin.resubmit_stats = &miss->stats;
         }
 
@@ -1284,8 +1292,8 @@ revalidate_ukey(struct udpif *udpif, struct udpif_flow_dump *udump,
 
     /* Since the kernel is free to ignore wildcarded bits in the mask, we can't
      * directly check that the masks are the same.  Instead we check that the
-     * mask in the kernel is more specific i.e. less wildcarded, then what
-     * we've calculated here.  This guarnatees we don't catch any packets we
+     * mask in the kernel is more specific i.e. less wildcarded, than what
+     * we've calculated here.  This guarantees we don't catch any packets we
      * shouldn't with the megaflow. */
     udump32 = (uint32_t *) &udump_mask;
     xout32 = (uint32_t *) &xout.wc.masks;
@@ -1297,12 +1305,8 @@ revalidate_ukey(struct udpif *udpif, struct udpif_flow_dump *udump,
     ok = true;
 
 exit:
-    if (actions) {
-        ofpbuf_delete(actions);
-    }
-    if (xoutp) {
-        xlate_out_uninit(xoutp);
-    }
+    ofpbuf_delete(actions);
+    xlate_out_uninit(xoutp);
     return ok;
 }
 
diff --git a/ofproto/ofproto-dpif-upcall.h b/ofproto/ofproto-dpif-upcall.h
index 9f445c2..85d4a28 100644
--- a/ofproto/ofproto-dpif-upcall.h
+++ b/ofproto/ofproto-dpif-upcall.h
@@ -15,23 +15,14 @@
 #ifndef OFPROTO_DPIF_UPCALL_H
 #define OFPROTO_DPIF_UPCALL_H
 
-#define FLOW_MISS_MAX_BATCH 50
+#include <stddef.h>
 
-#include "dpif.h"
-#include "flow.h"
-#include "hmap.h"
-#include "list.h"
-#include "odp-util.h"
-#include "ofpbuf.h"
-#include "ofproto-dpif-xlate.h"
-
-struct seq;
 struct dpif;
 struct dpif_backer;
+struct seq;
+struct simap;
 
-/* udif is responsible for retrieving upcalls from the kernel, processing miss
- * upcalls, and handing more complex ones up to the main ofproto-dpif
- * module. */
+/* udif retrieves upcalls from the kernel and processing them. */
 
 struct udpif *udpif_create(struct dpif_backer *, struct dpif *);
 void udpif_set_threads(struct udpif *, size_t n_handlers,
@@ -44,8 +35,6 @@ void udpif_get_memory_usage(struct udpif *, struct simap *usage);
 
 struct seq *udpif_dump_seq(struct udpif *);
 
-void udpif_get_memory_usage(struct udpif *, struct simap *usage);
-
 void udpif_flush(void);
 
 #endif /* ofproto-dpif-upcall.h */
diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index bf2d45d..f5b7428 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -213,7 +213,7 @@ struct dpif_completion {
     struct ofoperation *op;
 };
 
-/* Reasons that we might need to revalidate every datapath flow , and
+/* Reasons that we might need to revalidate every datapath flow, and
  * corresponding coverage counters.
  *
  * A value of 0 means that there is no need to revalidate.
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index 28a6c2b..949bbfd 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -126,7 +126,7 @@
       <column name="other_config" key="flow-limit"
               type='{"type": "integer", "minInteger": 0}'>
         <p>
-          A number of flows as a nonnegative integer.  This sets the maxmimum
+          The maximum
           number of flows allowed in the datapath flow table.  Internally OVS
           will choose a flow limit which will likely be lower than this number,
           based on real time network conditions.



More information about the dev mailing list