[ovs-dev] [PATCH] util: Make xmalloc_cacheline() allocate full cachelines.

Bodireddy, Bhanuprakash bhanuprakash.bodireddy at intel.com
Wed Nov 29 08:02:17 UTC 2017


>
>On Tue, Nov 28, 2017 at 09:06:09PM +0000, Bodireddy, Bhanuprakash wrote:
>> >Until now, xmalloc_cacheline() has provided its caller memory that
>> >does not share a cache line, but when posix_memalign() is not
>> >available it did not provide a full cache line; instead, it returned
>> >memory that was offset 8 bytes into a cache line.  This makes it hard
>> >for clients to design structures to be cache line-aligned.  This
>> >commit changes
>> >xmalloc_cacheline() to always return a full cache line instead of
>> >memory offset into one.
>> >
>> >Signed-off-by: Ben Pfaff <blp at ovn.org>
>> >---
>> > lib/util.c | 60
>> >++++++++++++++++++++++++++++++++---------------------------
>> >-
>> > 1 file changed, 32 insertions(+), 28 deletions(-)
>> >
>> >diff --git a/lib/util.c b/lib/util.c
>> >index 9e6edd27ae4c..137091a3cd4f 100644
>> >--- a/lib/util.c
>> >+++ b/lib/util.c
>> >@@ -196,15 +196,9 @@ x2nrealloc(void *p, size_t *n, size_t s)
>> >     return xrealloc(p, *n * s);
>> > }
>> >
>> >-/* The desired minimum alignment for an allocated block of memory.
>> >*/ - #define MEM_ALIGN MAX(sizeof(void *), 8) -
>> >BUILD_ASSERT_DECL(IS_POW2(MEM_ALIGN));
>> >-BUILD_ASSERT_DECL(CACHE_LINE_SIZE >= MEM_ALIGN);
>> >-
>> >-/* Allocates and returns 'size' bytes of memory in dedicated cache
>> >lines.  That
>> >- * is, the memory block returned will not share a cache line with
>> >other data,
>> >- * avoiding "false sharing".  (The memory returned will not be at
>> >the start of
>> >- * a cache line, though, so don't assume such alignment.)
>> >+/* Allocates and returns 'size' bytes of memory aligned to a cache
>> >+line and in
>> >+ * dedicated cache lines.  That is, the memory block returned will
>> >+not share a
>> >+ * cache line with other data, avoiding "false sharing".
>> >  *
>> >  * Use free_cacheline() to free the returned memory block. */  void
>> >* @@ -
>> >221,28 +215,37 @@ xmalloc_cacheline(size_t size)
>> >     }
>> >     return p;
>> > #else
>> >-    void **payload;
>> >-    void *base;
>> >-
>> >     /* Allocate room for:
>> >      *
>> >-     *     - Up to CACHE_LINE_SIZE - 1 bytes before the payload, so that the
>> >-     *       start of the payload doesn't potentially share a cache line.
>> >+     *     - Header padding: Up to CACHE_LINE_SIZE - 1 bytes, to allow the
>> >+     *       pointer to be aligned exactly sizeof(void *) bytes before the
>> >+     *       beginning of a cache line.
>> >      *
>> >-     *     - A payload consisting of a void *, followed by padding out to
>> >-     *       MEM_ALIGN bytes, followed by 'size' bytes of user data.
>> >+     *     - Pointer: A pointer to the start of the header padding, to allow us
>> >+     *       to free() the block later.
>> >      *
>> >-     *     - Space following the payload up to the end of the cache line, so
>> >-     *       that the end of the payload doesn't potentially share a cache line
>> >-     *       with some following block. */
>> >-    base = xmalloc((CACHE_LINE_SIZE - 1)
>> >-                   + ROUND_UP(MEM_ALIGN + size, CACHE_LINE_SIZE));
>> >-
>> >-    /* Locate the payload and store a pointer to the base at the beginning.
>*/
>> >-    payload = (void **) ROUND_UP((uintptr_t) base, CACHE_LINE_SIZE);
>> >-    *payload = base;
>> >-
>> >-    return (char *) payload + MEM_ALIGN;
>> >+     *     - User data: 'size' bytes.
>> >+     *
>> >+     *     - Trailer padding: Enough to bring the user data up to a cache line
>> >+     *       multiple.
>> >+     *
>> >+     * +---------------+---------+------------------------+---------+
>> >+     * | header        | pointer | user data              | trailer |
>> >+     * +---------------+---------+------------------------+---------+
>> >+     * ^               ^         ^
>> >+     * |               |         |
>> >+     * p               q         r
>> >+     *
>> >+     */
>> >+    void *p = xmalloc((CACHE_LINE_SIZE - 1)
>> >+                      + sizeof(void *)
>> >+                      + ROUND_UP(size, CACHE_LINE_SIZE));
>> >+    bool runt = PAD_SIZE((uintptr_t) p, CACHE_LINE_SIZE) < sizeof(void *);
>> >+    void *r = (void *) ROUND_UP((uintptr_t) p + (runt ? CACHE_LINE_SIZE :
>0),
>> >+                                CACHE_LINE_SIZE);
>> >+    void **q = (void **) r - 1;
>> >+    *q = p;
>> >+    return r;
>> > #endif
>> > }
>> >
>> >@@ -265,7 +268,8 @@ free_cacheline(void *p)
>> >     free(p);
>> > #else
>> >     if (p) {
>> >-        free(*(void **) ((uintptr_t) p - MEM_ALIGN));
>> >+        void **q = (void **) p - 1;
>> >+        free(*q);
>> >     }
>> > #endif
>> > }
>> >--
>>
>> Thanks for the patch.
>> Reviewed and tested this and now it returns 64 byte aligned address.
>>
>> Acked-by: Bhanuprakash Bodireddy <Bhanuprakash.bodireddy at intel.com>
>
>Thank you for the review.
>
>You are likely testing on Linux or another system that has posix_memalign().
>On such a system, the code that I just modified is not used.  Did you add
>"#undef HAVE_POSIX_MEMALIGN" or otherwise make sure that the new
>code was actually used?

I am testing this on Linux and my system do support posix_memalign(). 
So I had to disable HAVE_POSIX_MEMALIGN block so that allocation would go through the else block you implemented here.

Also I applied the patch (https://mail.openvswitch.org/pipermail/ovs-dev/2017-November/341231.html) on top of this
to check if the address returned is 64 byte aligned. I also cross verified with gdb on vswitchd thread and changing pmd-cpu-mask
to see if the code block is getting executed.

- Bhanuprakash.



More information about the dev mailing list