[ovs-dev] [cmap v2 1/2] cmap: New module for cuckoo hash table.

Jarno Rajahalme jrajahalme at nicira.com
Tue May 20 19:40:20 UTC 2014


On May 20, 2014, at 10:08 AM, Ben Pfaff <blp at nicira.com> wrote:

> On Tue, May 20, 2014 at 09:12:23AM -0700, Ben Pfaff wrote:
>> On Fri, May 16, 2014 at 04:38:02PM -0700, Jarno Rajahalme wrote:
>>>> +static bool
>>>> +cmap_insert_dup(struct cmap_node *new_node, uint32_t hash,
>>>> +                struct cmap_bucket *b)
>>>> +{
>>>> +    int i;
>>>> +
>>>> +    for (i = 0; i < CMAP_K; i++) {
>>>> +        struct cmap_node *node = b->nodes[i];
>>>> +        if (b->hashes[i] == hash && node) {
>>>> +            for (;;) {
>>>> +                struct cmap_node *next = cmap_node_next_protected(node);
>>>> +                if (!next) {
>>>> +                    ovsrcu_set(&node->next, new_node);
>>>> +                    return true;
>>>> +                }
>>>> +                node = next;
>>>> +            }
>>> 
>>> It would be faster to add the dup to the beginning of the list instead:
>>> 
>>>    for (i = 0; i < CMAP_K; i++) {
>>>        if (b->hashes[i] == hash) {
>>>            ovsrcu_set(&new_node->next, b->nodes[i]);
>>>            b->nodes[i] = new_node;
>>>            return true;
>>>        }
>>>    }
>>> 
>>> Syncing via the counter should be unnecessary, as the hash value was
>>> already set.
>> 
>> That's a good point about the counter.  That's the reason that I
>> switched to this method in v2.
> 
> OK, here's an incremental I'm folding in.  What do you think?
> 
> diff --git a/lib/cmap.c b/lib/cmap.c
> index a1367bc..30a6e2d 100644
> --- a/lib/cmap.c
> +++ b/lib/cmap.c
> @@ -392,7 +392,7 @@ cmap_set_bucket(struct cmap_bucket *b, int i,
> 
> /* Searches 'b' for a node with the given 'hash'.  If it finds one, adds
>  * 'new_node' to the node's linked list and returns true.  If it does not find
> - * one, returns false.  */
> + * one, returns false. */
> static bool
> cmap_insert_dup(struct cmap_node *new_node, uint32_t hash,
>                 struct cmap_bucket *b)
> @@ -402,14 +402,30 @@ cmap_insert_dup(struct cmap_node *new_node, uint32_t hash,
>     for (i = 0; i < CMAP_K; i++) {
>         struct cmap_node *node = b->nodes[i];
>         if (b->hashes[i] == hash && node) {
> +            struct cmap_node *p;
> +
> +            /* The common case is that 'new_node' is a singleton, with a null
> +             * 'next' pointer, but rehashing can add a longer chain.  Find the
> +             * end of the chain starting at 'new_node', then splice 'node' to
> +             * the end of that chain. */

I think this could be done regardless of the value of the ‘node’. ‘node’ could be NULL, if a preceding remove just NULLed the node pointer but left the hash intact. I.e. it seems to me that when a node is removed, there is no reason to remove the hash value from the bucket, meaning that syncing via the counter would not be necessary when removing nodes from cmap.

  Jarno

> +            p = new_node;
>             for (;;) {
> -                struct cmap_node *next = cmap_node_next_protected(node);
> +                struct cmap_node *next = cmap_node_next_protected(p);
>                 if (!next) {
> -                    ovsrcu_set(&node->next, new_node);
> -                    return true;
> +                    break;
>                 }
> -                node = next;
> +                p = next;
>             }
> +            ovsrcu_set(&p->next, node);
> +
> +            /* Change the bucket to point to 'new_node'.  This is a degenerate
> +             * form of cmap_set_bucket() that doesn't update the counter since
> +             * we're only touching one field and in a way that doesn't change
> +             * the bucket's meaning for readers. */
> +            b->nodes[i] = new_node;
> +            atomic_thread_fence(memory_order_release);
> +
> +            return true;
>         }
>     }
>     return false;
> @@ -579,6 +595,10 @@ cmap_insert_bfs(struct cmap_impl *impl, struct cmap_node *new_node,
>     return false;
> }
> 
> +/* Adds 'node', with the given 'hash', to 'impl'.
> + *
> + * 'node' is ordinarily a single node, with a null 'next' pointer.  When
> + * rehashing, however, it may be a longer chain of nodes. */
> static bool
> cmap_try_insert(struct cmap_impl *impl, struct cmap_node *node, uint32_t hash)
> {




More information about the dev mailing list