[ovs-dev] [ovs-discuss] [PATCH] datapath: Use hash table more tolerant of collisions for flow table.

Ben Pfaff blp at nicira.com
Tue Sep 1 17:37:49 UTC 2009


Justin Pettit <jpettit at nicira.com> writes:

> On Aug 26, 2009, at 5:00 PM, Ben Pfaff wrote:
>
>> 		/* Expand table, if necessary, to make room. */
>> -		if (dp->n_flows * 4 >= table->n_buckets &&
>> -		    table->n_buckets < DP_MAX_BUCKETS) {
>> +		if (dp->n_flows >= table->n_buckets) {
>> 			error = dp_table_expand(dp);
>
> I don't see where the table size is bounded to DP_MAX_BUCKETS.  The
> check should probably be in that "if" statement, since if it were in
> dp_table_expand(), then the max number of flows would be equal to
> DP_MAX_BUCKETS.

Good catch, thank you.  I applied this:

diff --git a/datapath/datapath.c b/datapath/datapath.c
index 9d54d5e..1460215 100644
--- a/datapath/datapath.c
+++ b/datapath/datapath.c
@@ -872,6 +872,10 @@ static int put_flow(struct datapath *dp, struct odp_flow_put __user *ufp)
 
 		/* Expand table, if necessary, to make room. */
 		if (dp->n_flows >= table->n_buckets) {
+			error = -ENOSPC;
+			if (table->n_buckets >= DP_MAX_BUCKETS)
+				goto error;
+
 			error = dp_table_expand(dp);
 			if (error)
 				goto error;


>> +int dp_table_insert(struct dp_table *table, struct sw_flow *target)
>
> Any thoughts on making the new entry first instead of last, so it's
> found faster.  I guess the question is: Is a new flow or old flow more
> likely to be more active, since the timeouts are relatively short?

I don't know.  Do you have an opinion?

Heuristics could be useful here.  For example, we could put the
new entry just after the last existing entry that has a nonzero
packet_count, and otherwise at the beginning.  Or we could even
do a full sort in descending order of packet count.

Really I hope that collisions are rare, so that it doesn't
matter.

I'm happy to revisit this later if it seems useful.  For now, I
just left it as-is.

> Would it make sense to make the buckets a bit larger, so an insertion
> is less expensive?  Would using a kmem_cache of a common size (maybe
> even one entry) make sense?  I suppose all of this may complicate
> things, since we're hoping collisions are rare.

I thought about that kind of thing.  I decided against it for a
few reasons:

        * Laziness and simplicity of code.

        * Lack of confidence that it matters.

        * I know that I can replace a pointer to a bucket safely
          with RCU, but the rules for updating a non-pointer
          (n_flows in struct dp_bucket) are not written out
          explicitly anywhere (as far as I can tell) and I am not
          certain that they would be the same.  Probably, I could
          just use smp_wmb() between adding a flow and updating
          the counter (just like rcu_assign_pointer() does
          internally), but I didn't feel like this issue was
          worth trying to figure that out for sure.

        * Fewer code paths, easier to test.

>> int dp_table_delete(struct dp_table *table, struct sw_flow *target)
>
> Similar to insert, would it make sense to leave a few slots open so
> insertion is less expensive?

Same reasoning here.

With the change above, I pushed this out.  Let me know if you
feel strongly enough about any of the issues above and I'll
revisit them.




More information about the dev mailing list