[ovs-dev] [PATCH v3 7/8] lib/ovs-atomic: Native support for x86_64 with GCC.

Jarno Rajahalme jrajahalme at nicira.com
Fri Aug 1 15:53:11 UTC 2014


Figured this out last night:

On Jul 31, 2014, at 3:21 PM, Jarno Rajahalme <jrajahalme at nicira.com> wrote:
(snip)
> +#define atomic_compare_exchange__(DST, EXP, SRC, RES, CLOB)           \
> +    asm volatile("lock; cmpxchg %3,%1 ; "                             \
> +                 "      sete    %0      "                             \
> +                 "# atomic_compare_exchange__"                        \
> +                 : "=q" (RES),           /* 0 */                      \
> +                   "+m" (*DST),          /* 1 */                      \
> +                   "+a" (EXP)            /* 2 */                      \
> +                 : "r" (SRC)             /* 3 */                      \
> +                 : CLOB, "cc")
> +
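For reference, this is what the sequence above computes. The sketch below is only illustrative (and of course not atomic); the function name is made up, and the comments map the C variables to the asm operands and constraints:

/* Illustrative only: the semantics of "lock; cmpxchg %3,%1; sete %0".
 * Operand roles: %0 = RES ("=q"), %1 = *DST ("+m"),
 * %2 = EXP ("+a", i.e. the accumulator RAX/EAX), %3 = SRC ("r").
 * The real thing is a single atomic read-modify-write; this C version
 * is not atomic and exists only to document the behavior. */
static inline bool
cmpxchg_semantics(long *dst, long *exp, long src)
{
    if (*dst == *exp) {
        *dst = src;        /* Equal: SRC is stored into *DST, ZF := 1. */
        return true;       /* sete copies ZF into RES. */
    } else {
        *exp = *dst;       /* Not equal: the old *DST comes back in the
                            * accumulator (modeled here as *exp), ZF := 0. */
        return false;
    }
}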
> +/* All memory models are valid for read-modify-write operations.
> + *
> + * Valid memory_models for the read operation of the current value in

Typo above: “memory_models” should be “memory models”.

> + * the failure case are the same as for atomic read, but can not be
> + * stronger than the success memory model. */
> +#define atomic_compare_exchange_strong_explicit(DST, EXP, SRC, ORDER, ORD_FAIL) \
> +    ({                                                              \
> +        typeof(DST) dst__ = (DST);                                  \
> +        typeof(DST) expp__ = (EXP);                                 \
> +        typeof(*DST) src__ = (SRC);                                 \
> +        typeof(*DST) exp__ = *expp__;                               \
> +        uint8_t res__;                                              \
> +                                                                    \
> +        if (ORDER > memory_order_consume) {                         \
> +            atomic_compare_exchange__(dst__, exp__, src__, res__,   \
> +                                      "memory");                    \
> +        } else {                                                    \
> +            atomic_compare_exchange__(dst__, exp__, src__, res__,   \
> +                                      "cc");                        \
> +        }                                                           \
> +        if (!res__) {                                               \
> +            *expp__ = exp__;                                        \
> +            atomic_compiler_barrier(ORD_FAIL);                      \

This barrier is not needed, as the barrier in atomic_compare_exchange__ is always strong enough. Also, the assignment here is not an atomic operation: the atomic load was already performed by atomic_compare_exchange__, which loads the value of *dst__ into exp__ in the failure case. Here we just move the value from one local variable to another, and the compiler should be free to optimize that any way it likes. (See the usage sketch after the quoted macro below.)

Just to make this clear, this is the breakdown of the different allowed combinations:

ORDER > memory_order_consume: atomic_compare_exchange__ already implements a full CPU & compiler barrier
ORDER == memory_order_acquire: atomic_compare_exchange__ already implements the strongest possible barrier allowed for ORD_FAIL (memory_order_acquire)
ORDER == memory_order_consume: atomic_compare_exchange__ already implements the strongest possible barrier allowed for ORD_FAIL (memory_order_consume)
ORDER == memory_order_relaxed: atomic_compare_exchange__ already implements the strongest possible barrier allowed for ORD_FAIL (memory_order_relaxed)


> +        }                                                           \
> +        (bool)res__;                                                \
> +    })
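
To make the failure path concrete, a typical caller would look roughly like this. This is a minimal sketch assuming the generic ovs-atomic API (ATOMIC(), atomic_read_explicit(), atomic_compare_exchange_strong_explicit()); the function and variable names are made up:

/* Minimal sketch of a CAS loop; illustrative names only. */
static void
counter_add(ATOMIC(int) *counter, int diff)
{
    int old;

    atomic_read_explicit(counter, &old, memory_order_relaxed);
    while (!atomic_compare_exchange_strong_explicit(counter, &old, old + diff,
                                                    memory_order_relaxed,
                                                    memory_order_relaxed)) {
        /* On failure, 'old' already holds the value that was atomically
         * loaded inside atomic_compare_exchange__; the copy back to the
         * caller's variable is an ordinary assignment and needs no
         * additional barrier. */
    }
}

With memory_order_relaxed the "cc"-only clobber variant is selected, and the lock prefix alone still makes the read-modify-write itself atomic.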

  Jarno



