[ovs-dev] [cmap v2 1/2] cmap: New module for cuckoo hash table.

Ben Pfaff blp at nicira.com
Tue May 20 23:46:48 UTC 2014


On Tue, May 20, 2014 at 01:25:35PM -0700, Jarno Rajahalme wrote:
> 
> On May 20, 2014, at 1:03 PM, Ben Pfaff <blp at nicira.com> wrote:
> 
> > On Tue, May 20, 2014 at 12:40:20PM -0700, Jarno Rajahalme wrote:
> >> 
> >> On May 20, 2014, at 10:08 AM, Ben Pfaff <blp at nicira.com> wrote:
> >> 
> >>> On Tue, May 20, 2014 at 09:12:23AM -0700, Ben Pfaff wrote:
> >>>> On Fri, May 16, 2014 at 04:38:02PM -0700, Jarno Rajahalme wrote:
> >>>>>> +static bool
> >>>>>> +cmap_insert_dup(struct cmap_node *new_node, uint32_t hash,
> >>>>>> +                struct cmap_bucket *b)
> >>>>>> +{
> >>>>>> +    int i;
> >>>>>> +
> >>>>>> +    for (i = 0; i < CMAP_K; i++) {
> >>>>>> +        struct cmap_node *node = b->nodes[i];
> >>>>>> +        if (b->hashes[i] == hash && node) {
> >>>>>> +            for (;;) {
> >>>>>> +                struct cmap_node *next = cmap_node_next_protected(node);
> >>>>>> +                if (!next) {
> >>>>>> +                    ovsrcu_set(&node->next, new_node);
> >>>>>> +                    return true;
> >>>>>> +                }
> >>>>>> +                node = next;
> >>>>>> +            }
> >>>>> 
> >>>>> It would be faster to add the dup to the beginning of the list instead:
> >>>>> 
> >>>>>   for (i = 0; i < CMAP_K; i++) {
> >>>>>       if (b->hashes[i] == hash) {
> >>>>>           ovsrcu_set(&new_node->next, b->nodes[i]);
> >>>>>           b->nodes[i] = new_node;
> >>>>>           return true;
> >>>>>       }
> >>>>>   }
> >>>>> 
> >>>>> Syncing via the counter should be unnecessary, as the hash value was
> >>>>> already set.
> >>>> 
> >>>> That's a good point about the counter.  That's the reason that I
> >>>> switched to this method in v2.
> >>> 
> >>> OK, here's an incremental I'm folding in.  What do you think?
> >>> 
> >>> diff --git a/lib/cmap.c b/lib/cmap.c
> >>> index a1367bc..30a6e2d 100644
> >>> --- a/lib/cmap.c
> >>> +++ b/lib/cmap.c
> >>> @@ -392,7 +392,7 @@ cmap_set_bucket(struct cmap_bucket *b, int i,
> >>> 
> >>> /* Searches 'b' for a node with the given 'hash'.  If it finds one, adds
> >>> * 'new_node' to the node's linked list and returns true.  If it does not find
> >>> - * one, returns false.  */
> >>> + * one, returns false. */
> >>> static bool
> >>> cmap_insert_dup(struct cmap_node *new_node, uint32_t hash,
> >>>                struct cmap_bucket *b)
> >>> @@ -402,14 +402,30 @@ cmap_insert_dup(struct cmap_node *new_node, uint32_t hash,
> >>>    for (i = 0; i < CMAP_K; i++) {
> >>>        struct cmap_node *node = b->nodes[i];
> >>>        if (b->hashes[i] == hash && node) {
> >>> +            struct cmap_node *p;
> >>> +
> >>> +            /* The common case is that 'new_node' is a singleton, with a null
> >>> +             * 'next' pointer, but rehashing can add a longer chain.  Find the
> >>> +             * end of the chain starting at 'new_node', then splice 'node' to
> >>> +             * the end of that chain. */
> >> 
> >> I think this could be done regardless of the value of the
> >> ?node?. ?node? could be NULL, if a preceding remove just NULLed the
> >> node pointer but left the hash intact. I.e. it seems to me that when
> >> a node is removed, there is no reason to remove the hash value from
> >> the bucket, meaning that syncing via the counter would not be
> >> necessary when removing nodes from cmap.
> > 
> > I believe that the incremental that I posted does what you are describing.
> > It does not call cmap_set_bucket() or set ->hashes[].
> 
> There is still this before the added code lines:
> 
> @@ -402,14 +402,30 @@ cmap_insert_dup(struct cmap_node *new_node, uint32_t hash,
>     for (i = 0; i < CMAP_K; i++) {
>         struct cmap_node *node = b->nodes[i];
>         if (b->hashes[i] == hash && node) {
> 
> i.e., ?node? is checked to be non-NULL.

Oh, I completely missed your meaning before.

I changed this code to:

        if (b->hashes[i] == hash) {
            if (node) {
                struct cmap_node *p;

                /* The common case is that 'new_node' is a singleton, with a
                 * null 'next' pointer, but rehashing can add a longer chain.
                 * Find the end of the chain starting at 'new_node', then
                 * splice 'node' to the end of that chain. */
                p = new_node;
                for (;;) {
                    struct cmap_node *next = cmap_node_next_protected(p);
                    if (!next) {
                        break;
                    }
                    p = next;
                }
                ovsrcu_set(&p->next, node);
            } else {
                /* The hash value is there from some previous insertion, but
                 * the associated node has been removed.  We're not really
                 * inserting a duplicate, but we can still reuse the slot.
                 * Carry on. */
            }

            /* Change the bucket to point to 'new_node'.  This is a degenerate
             * form of cmap_set_bucket() that doesn't update the counter since
             * we're only touching one field and in a way that doesn't change
             * the bucket's meaning for readers. */
            b->nodes[i] = new_node;
            atomic_thread_fence(memory_order_release);

            return true;
        }

and will now re-test.



More information about the dev mailing list