[ovs-dev] [PATCH v3 7/8] lib/ovs-atomic: Native support for x86_64 with GCC.

Ben Pfaff blp at nicira.com
Tue Aug 5 18:35:47 UTC 2014


On Thu, Jul 31, 2014 at 03:21:53PM -0700, Jarno Rajahalme wrote:
> Some supported XenServer build environments lack compiler support for
> atomic operations.  This patch provides native support for x86_64 on
> GCC, which covers possible future 64-bit builds on XenServer.
> 
> Since this implementation is faster than the existing support prior to
> GCC 4.7, especially for cmap inserts, we use this with GCC < 4.7 on
> x86_64.
> 
> Example numbers with "tests/test-cmap benchmark 2000000 8 0.1" on
> quad-core hyperthreaded laptop, built with GCC 4.6 -O2:
> 
> Using ovs-atomic-pthreads on x86_64:
> 
> Benchmarking with n=2000000, 8 threads, 0.10% mutations:
> cmap insert:   4725 ms
> cmap iterate:   329 ms
> cmap search:   5945 ms
> cmap destroy:   911 ms
> 
> Using ovs-atomic-gcc4+ on x86_64:
> 
> Benchmarking with n=2000000, 8 threads, 0.10% mutations:
> cmap insert:    845 ms
> cmap iterate:    58 ms
> cmap search:    308 ms
> cmap destroy:   295 ms
> 
> With the native support provided by this patch:
> 
> Benchmarking with n=2000000, 8 threads, 0.10% mutations:
> cmap insert:    530 ms
> cmap iterate:    59 ms
> cmap search:    305 ms
> cmap destroy:   232 ms
> 
> Signed-off-by: Jarno Rajahalme <jrajahalme at nicira.com>

Your research is far stronger than mine on this.  I have only a few
comments.

This ignores the possibility of misaligned atomic variables
(especially in atomic_is_lock_free()).  All the other existing
implementations ignore that possibility, too, which might be entirely
fair, so I'm pointing it out only for completeness.
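
For illustration, a debug-time alignment check could look roughly like
the sketch below.  The helper name is hypothetical and not part of the
patch, and it assumes SIZE is a power of two.  The reasoning behind it
is that naturally aligned accesses of up to 8 bytes are atomic on
x86_64, whereas an access that straddles a cache line is not
guaranteed to be.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch: returns true if 'p' is
 * aligned on a multiple of 'size' bytes.  'size' must be a power of
 * two, so that 'size - 1' is a mask of the low-order address bits. */
static inline bool
demo_is_naturally_aligned(const void *p, size_t size)
{
    return ((uintptr_t) p & (size - 1)) == 0;
}
```

Something like this could back an assertion in debug builds, leaving
atomic_is_lock_free() itself unchanged.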

The xchg instruction, when used with a memory operand, always implies
a lock prefix, which makes one wonder whether "xchg" is more expensive
than unlocked cmpxchg.  Beats me.

I see a few uses of the ORDER argument that might better have
parentheses, e.g.:
        if (ORDER > memory_order_consume) {
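
To show why the parentheses matter, here is a minimal sketch.  The
enum and macro names are hypothetical stand-ins, not the definitions
in the patch.  If a caller passes a conditional expression as ORDER,
the unparenthesized comparison binds to the last operand only:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the memory orders; not the OVS names. */
enum demo_order {
    DEMO_RELAXED,
    DEMO_CONSUME,
    DEMO_ACQUIRE,
    DEMO_SEQ_CST
};

/* Same test written without and with parentheses around ORDER. */
#define DEMO_NEEDS_FENCE_BARE(ORDER)  (ORDER > DEMO_CONSUME)
#define DEMO_NEEDS_FENCE_PAREN(ORDER) ((ORDER) > DEMO_CONSUME)

static bool
demo_bare(bool weak)
{
    /* Expands to (weak ? DEMO_CONSUME : DEMO_SEQ_CST > DEMO_CONSUME),
     * so for 'weak' the result is DEMO_CONSUME itself, which is
     * nonzero. */
    return DEMO_NEEDS_FENCE_BARE(weak ? DEMO_CONSUME : DEMO_SEQ_CST);
}

static bool
demo_paren(bool weak)
{
    /* Expands to ((weak ? DEMO_CONSUME : DEMO_SEQ_CST) > DEMO_CONSUME),
     * which compares the selected order as intended. */
    return DEMO_NEEDS_FENCE_PAREN(weak ? DEMO_CONSUME : DEMO_SEQ_CST);
}

static void
demo_check(void)
{
    assert(demo_bare(true));     /* Wrong: consume treated as "> consume". */
    assert(!demo_paren(true));   /* Right: consume is not "> consume". */
    assert(demo_paren(false));   /* seq_cst is "> consume" either way. */
    assert(demo_bare(false));
}
```

All current callers may well pass a plain constant, in which case the
two forms behave the same; the parentheses only guard against
expression arguments.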

In the notes (thanks a lot for the notes, by the way), I see:
 * - Stores are not reordered with other stores, except for special
 *   instructions (CLFLUSH, streaming stores, string operations).  However,
 *   these are not emitted by compilers.
which makes me worry slightly because compilers do sometimes use the
"stos" string instruction for initializing data structures.  Do you
know what the deal is with the string instructions?

Acked-by: Ben Pfaff <blp at nicira.com>


