[ovs-dev] [threaded reval 2/2] ofproto: Handle flow installation and eviction in upcall.
Ben Pfaff
blp at nicira.com
Tue Dec 17 05:58:33 UTC 2013
On Sun, Dec 15, 2013 at 05:57:57PM -0800, Ethan Jackson wrote:
> This patch moves flow installation and eviction from ofproto-dpif and
> the main thread, into ofproto-dpif-upcall. This performs
> significantly better (approximately 2x TCP_CRR improvement), and
> allows ovs-vswitchd to maintain significantly larger datapath flow
> tables. On top of that, it significantly simplifies the code,
> retiring "strut facet" and friends.
"struct"
> Signed-off-by: Ethan Jackson <ethan at nicira.com>
This is good work! I especially like how this is a net removal of
code.
I would mention in NEWS that flow-eviction-threshold has gone away.
You might want to mention that there is a new flow-limit parameter
(but if you do please also recommend reading the documentation since
the meaning is different).
ofproto-dpif-upcall.h now has two prototypes for
udpif_get_memory_usage().
The definition of FLOW_MISS_MAX_BATCH could be moved into
ofproto-dpif-upcall.c.
I think that the remaining comment in ofproto-dpif-upcall.h is wrong:
I don't think it hands anything up to the main module anymore.
I wasn't sure what this comment on rule_dpif and group_dpif meant, maybe
you could rephrase it:
* - Do include packets and bytes from datapath flows which have not been
* pushed by ofproto-dpif-upcall. */
Eww:
+ /* We know stats are relatively fresh since a dump just happened, so
Why does udpif_set_threads() destroy each revalidator thread's ukeys?
Can't the thread itself do that before it exits?xs
This might be premature optimization, but did you consider putting
'key_buf' at the end of struct udpif_key so that all of the frequently
used data is in the first cache line? Ditto for 'key_buf' and
'mask_buf' in struct udpif_flow_dump.
I think that the 'op' member of struct udpif_flow_dump is only used
inside revalidate_udumps(). It adds over 40 bytes to each instance.
Maybe it doesn't matter because there's a limited number of instances
in flight at a given time? But otherwise we could easily allocate
them on the stack in revalidate_udumps().
'key' in udpif_key, and 'key' and 'mask' in udpif_flow_dump, seem to
be unneeded: they always point to the beginning of the corresponding
_buf. Maybe they are convenient enough to keep, though.
udpif_get_n_flows() has a mysteriously 'static' local variable.
In handle_upcalls(), I like to do this sort of comment, where we're
actually saying something about the alternative case:
if (miss->upcall_type == DPIF_UC_MISS) {
/* For non-miss upcalls, there's a flow in the datapath which this
* packet was accounted to. Presumably the revalidators will deal
* with pushing its stats eventually. */
xin.resubmit_stats = &miss->stats;
}
as:
if (miss->upcall_type == DPIF_UC_MISS) {
xin.resubmit_stats = &miss->stats;
} else {
/* There's a flow in the datapath which this packet was accounted
* to. The revalidators will push its stats eventually. */
}
but I'll understand if you think that's weird.
A struct odputil_keybuf is kind of big because of the odputil_keybuf
member, but revalidate_udumps() uses xzalloc() to allocate one. Can
we use xmalloc() instead and then just zero the members we must?
Could you document n-revalidator-threads in vswitch.xml?
I have a few more stylistic and other nonessential comments. I shall
present them in the form of an interpretive dance expressing my joy at
the deletion of code. I mean, I shall present them as an incremental
patch.
diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
index a74015f..a7f6dc3 100644
--- a/ofproto/ofproto-dpif-upcall.c
+++ b/ofproto/ofproto-dpif-upcall.c
@@ -31,6 +31,7 @@
#include "ofpbuf.h"
#include "ofproto-dpif-ipfix.h"
#include "ofproto-dpif-sflow.h"
+#include "ofproto-dpif-xlate.h"
#include "packets.h"
#include "poll-loop.h"
#include "seq.h"
@@ -38,6 +39,7 @@
#include "vlog.h"
#define MAX_QUEUE_LENGTH 512
+#define FLOW_MISS_MAX_BATCH 50
#define REVALIDATE_MAX_BATCH 50
VLOG_DEFINE_THIS_MODULE(ofproto_dpif_upcall);
@@ -45,8 +47,8 @@ VLOG_DEFINE_THIS_MODULE(ofproto_dpif_upcall);
COVERAGE_DEFINE(upcall_queue_overflow);
/* A thread that processes each upcall handed to it by the dispatcher thread,
- * forwards the upcall's packet, and then queues it to the main ofproto_dpif
- * to possibly set up a kernel flow as a cache. */
+ * forwards the upcall's packet, and possibly sets up a kernel flow as a
+ * cache. */
struct handler {
struct udpif *udpif; /* Parent udpif. */
pthread_t thread; /* Thread ID. */
@@ -64,6 +66,9 @@ struct handler {
'mutex'. */
};
+/* A thread that processes each kernel flow handed to it by the flow_dumper
+ * thread, updates OpenFlow statistics, and updates or removes the kernel flow
+ * as necessary. */
struct revalidator {
struct udpif *udpif; /* Parent udpif. */
char *name; /* Thread name. */
@@ -81,11 +86,14 @@ struct revalidator {
/* An upcall handler for ofproto_dpif.
*
- * udpif is implemented as a "dispatcher" thread that reads upcalls from the
- * kernel. It processes each upcall just enough to figure out its next
- * destination. For a "miss" upcall (MISS_UPCALL), this is one of several
- * "handler" threads (see struct handler). Other upcalls are queued to the
- * main ofproto_dpif. */
+ * udpif has two logically separate pieces:
+ *
+ * - A "dispatcher" thread that reads upcalls from the kernel and dispatches
+ * them to one of several "handler" threads (see struct handler).
+ *
+ * - A "flow_dumper" thread that reads the kernel flow table and dispatches
+ * flows to one of several "revalidator" threads (see struct
+ * revalidator). */
struct udpif {
struct list list_node; /* In all_udpifs list. */
@@ -432,7 +440,7 @@ udpif_get_memory_usage(struct udpif *udpif, struct simap *usage)
ovs_mutex_lock(&revalidator->mutex);
simap_increase(usage, "revalidator dumps", revalidator->n_udumps);
- /* XXX: This isn't technically thread safe becuase the revalidator
+ /* XXX: This isn't technically thread safe because the revalidator
* ukeys maps isn't protected by a mutex since it's per thread. */
simap_increase(usage, "revalidator keys",
hmap_count(&revalidator->ukeys));
@@ -891,7 +899,7 @@ handle_upcalls(struct handler *handler, struct list *upcalls)
atomic_read(&udpif->flow_limit, &flow_limit);
may_put = udpif_get_n_flows(udpif) < flow_limit;
- /* Extract the flow from each upcall. Construct in misses a hash table
+ /* Extract the flow from each upcall. Construct in 'misses' a hash table
* that maps each unique flow to a 'struct flow_miss'.
*
* Most commonly there is a single packet per flow_miss, but there are
@@ -1042,9 +1050,9 @@ handle_upcalls(struct handler *handler, struct list *upcalls)
xin.may_learn = true;
if (miss->upcall_type == DPIF_UC_MISS) {
- /* For non miss upcalls, there's a flow in the datapath which this
+ /* For non-miss upcalls, there's a flow in the datapath which this
* packet was accounted to. Presumably the revalidators will deal
- * with pushing it's stats eventually. */
+ * with pushing its stats eventually. */
xin.resubmit_stats = &miss->stats;
}
@@ -1284,8 +1292,8 @@ revalidate_ukey(struct udpif *udpif, struct udpif_flow_dump *udump,
/* Since the kernel is free to ignore wildcarded bits in the mask, we can't
* directly check that the masks are the same. Instead we check that the
- * mask in the kernel is more specific i.e. less wildcarded, then what
- * we've calculated here. This guarnatees we don't catch any packets we
+ * mask in the kernel is more specific i.e. less wildcarded, than what
+ * we've calculated here. This guarantees we don't catch any packets we
* shouldn't with the megaflow. */
udump32 = (uint32_t *) &udump_mask;
xout32 = (uint32_t *) &xout.wc.masks;
@@ -1297,12 +1305,8 @@ revalidate_ukey(struct udpif *udpif, struct udpif_flow_dump *udump,
ok = true;
exit:
- if (actions) {
- ofpbuf_delete(actions);
- }
- if (xoutp) {
- xlate_out_uninit(xoutp);
- }
+ ofpbuf_delete(actions);
+ xlate_out_uninit(xoutp);
return ok;
}
diff --git a/ofproto/ofproto-dpif-upcall.h b/ofproto/ofproto-dpif-upcall.h
index 9f445c2..85d4a28 100644
--- a/ofproto/ofproto-dpif-upcall.h
+++ b/ofproto/ofproto-dpif-upcall.h
@@ -15,23 +15,14 @@
#ifndef OFPROTO_DPIF_UPCALL_H
#define OFPROTO_DPIF_UPCALL_H
-#define FLOW_MISS_MAX_BATCH 50
+#include <stddef.h>
-#include "dpif.h"
-#include "flow.h"
-#include "hmap.h"
-#include "list.h"
-#include "odp-util.h"
-#include "ofpbuf.h"
-#include "ofproto-dpif-xlate.h"
-
-struct seq;
struct dpif;
struct dpif_backer;
+struct seq;
+struct simap;
-/* udif is responsible for retrieving upcalls from the kernel, processing miss
- * upcalls, and handing more complex ones up to the main ofproto-dpif
- * module. */
+/* udif retrieves upcalls from the kernel and processing them. */
struct udpif *udpif_create(struct dpif_backer *, struct dpif *);
void udpif_set_threads(struct udpif *, size_t n_handlers,
@@ -44,8 +35,6 @@ void udpif_get_memory_usage(struct udpif *, struct simap *usage);
struct seq *udpif_dump_seq(struct udpif *);
-void udpif_get_memory_usage(struct udpif *, struct simap *usage);
-
void udpif_flush(void);
#endif /* ofproto-dpif-upcall.h */
diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index bf2d45d..f5b7428 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -213,7 +213,7 @@ struct dpif_completion {
struct ofoperation *op;
};
-/* Reasons that we might need to revalidate every datapath flow , and
+/* Reasons that we might need to revalidate every datapath flow, and
* corresponding coverage counters.
*
* A value of 0 means that there is no need to revalidate.
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index 28a6c2b..949bbfd 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -126,7 +126,7 @@
<column name="other_config" key="flow-limit"
type='{"type": "integer", "minInteger": 0}'>
<p>
- A number of flows as a nonnegative integer. This sets the maxmimum
+ The maximum
number of flows allowed in the datapath flow table. Internally OVS
will choose a flow limit which will likely be lower than this number,
based on real time network conditions.
More information about the dev
mailing list