[ovs-dev] [ovs-discuss] [PATCH] datapath: Use hash table more tolerant of collisions for flow table.
Ben Pfaff
blp at nicira.com
Tue Sep 1 17:37:49 UTC 2009
Justin Pettit <jpettit at nicira.com> writes:
> On Aug 26, 2009, at 5:00 PM, Ben Pfaff wrote:
>
>> /* Expand table, if necessary, to make room. */
>> - if (dp->n_flows * 4 >= table->n_buckets &&
>> - table->n_buckets < DP_MAX_BUCKETS) {
>> + if (dp->n_flows >= table->n_buckets) {
>> error = dp_table_expand(dp);
>
> I don't see where the table size is bounded to DP_MAX_BUCKETS. The
> check should probably be in that "if" statement, since if it were in
> dp_table_expand(), then the max number of flows would be equal to
> DP_MAX_BUCKETS.
Good catch, thank you. I applied this:
diff --git a/datapath/datapath.c b/datapath/datapath.c
index 9d54d5e..1460215 100644
--- a/datapath/datapath.c
+++ b/datapath/datapath.c
@@ -872,6 +872,10 @@ static int put_flow(struct datapath *dp, struct odp_flow_put __user *ufp)
/* Expand table, if necessary, to make room. */
if (dp->n_flows >= table->n_buckets) {
+ error = -ENOSPC;
+ if (table->n_buckets >= DP_MAX_BUCKETS)
+ goto error;
+
error = dp_table_expand(dp);
if (error)
goto error;
>> +int dp_table_insert(struct dp_table *table, struct sw_flow *target)
>
> Any thoughts on making the new entry first instead of last, so it's
> found faster. I guess the question is: Is a new flow or old flow more
> likely to be more active, since the timeouts are relatively short?
I don't know. Do you have an opinion?
Heuristics could be useful here. For example, we could put the
new entry just after the last existing entry that has a nonzero
packet_count, and otherwise at the beginning. Or we could even
do a full sort in descending order of packet count.
Really I hope that collisions are rare, so that it doesn't
matter.
I'm happy to revisit this later if it seems useful. For now, I
just left it as-is.
> Would it make sense to make the buckets a bit larger, so an insertion
> is less expensive? Would using a kmem_cache of a common size (maybe
> even one entry) make sense? I suppose all of this may complicate
> things, since we're hoping collisions are rare.
I thought about that kind of thing. I decided against it for a
few reasons:
* Laziness and simplicity of code.
* Lack of confidence that it matters.
* I know that I can replace a pointer to a bucket safely
with RCU, but the rules for updating a non-pointer
(n_flows in struct dp_bucket) are not written out
explicitly anywhere (as far as I can tell) and I am not
certain that they would be the same. Probably, I could
just use smp_wmb() between adding a flow and updating
the counter (just like rcu_assign_pointer() does
internally), but I didn't feel like this issue was
worth trying to figure that out for sure.
* Fewer code paths, easier to test.
>> int dp_table_delete(struct dp_table *table, struct sw_flow *target)
>
> Similar to insert, would it make sense to leave a few slots open so
> insertion is less expensive?
Same reasoning here.
With the change above, I pushed this out. Let me know if you
feel strongly enough about any of the issues above and I'll
revisit them.
More information about the dev
mailing list