[ovs-dev] [PATCH] datapath-windows: Add Windows thread atomic APIs for x64 binaries.

Paul Boca pboca at cloudbasesolutions.com
Fri Jul 8 15:25:02 UTC 2016


Hi Ben!

I hope I didn't misunderstand the problems raised here.
Please see my comments inline.

Thanks,
Paul

> -----Original Message-----
> From: dev [mailto:dev-bounces at openvswitch.org] On Behalf Of Ben Pfaff
> Sent: Tuesday, March 29, 2016 8:10 PM
> To: Sorin Vinturis
> Cc: dev at openvswitch.org; Guru Shetty
> Subject: Re: [ovs-dev] [PATCH] datapath-windows: Add Windows thread
> atomic APIs for x64 binaries.
> 
> I understand that InterlockedExchange64 works to implement a 64-bit
> store.  I also understand 32-bit builds use it for 64-bit atomic stores:
> because 32-bit code cannot otherwise store to 64-bit objects.  I don't
> understand why 64-bit code would use it for 64-bit atomic stores; it
> does not make any sense for MSVC to convert such a store to two 32-bit
> instructions, and I doubt that it does.
> 
> Here's the code I'm talking about:
> 
> /* MSVC converts 64 bit writes into two instructions. So there is
>  * a possibility that an interrupt can make a 64 bit write non-atomic even
>  * when 8 byte aligned. So use InterlockedExchange64().
[Paul Boca] I think that the description for this function doesn't apply for 64-bit applications.
Indeed on 32-bit application, most of the Interlocked...64 functions use two 32-bit registers
as operands.
On x64 there are CPU instructions that makes the operation atomically using only one instruction.
i.e. InterlockedExchange64 on 64-bit app, uses 'xchg' without any other instructions;
on 32-bit app is used 'cmpxchg8b' in a while loop, checking if the exchanged value
is the right one, this uses pairs of 32-bit registers to hold the 64-bit value.
( http://faydoc.tripod.com/cpu/cmpxchg8b.htm ).

>  *
>  * For atomic stores, 'consume' and 'acquire' semantics are not valid. But we
>  * are using 'Exchange' to get atomic stores here and we only have
>  * InterlockedExchange64(), InterlockedExchangeNoFence64() and
>  * InterlockedExchange64Acquire() available. So we are forced to use
>  * InterlockedExchange64() which uses full memory barrier for everything
>  * greater than 'memory_order_relaxed'. */
> #define atomic_store64(DST, SRC, ORDER)                                    \
>     if (ORDER == memory_order_relaxed) {                                   \
>         InterlockedExchangeNoFence64((int64_t volatile *) (DST),           \
>                                      (int64_t) (SRC));                     \
>     } else {                                                               \
>         InterlockedExchange64((int64_t volatile *) (DST), (int64_t) (SRC));\
>     }
> 
> Similarly for atomic_read64():
> 
> /* MSVC converts 64 bit reads into two instructions. So there is
>  * a possibility that an interrupt can make a 64 bit read non-atomic even
>  * when 8 byte aligned. So use fully memory barrier InterlockedOr64(). */
> #define atomic_read64(SRC, DST, ORDER)                                     \
>     __pragma (warning(push))                                               \
>     __pragma (warning(disable:4047))                                       \
>     *(DST) = InterlockedOr64((int64_t volatile *) (SRC), 0);               \
>     __pragma (warning(pop))
> 
> 
> On Tue, Mar 29, 2016 at 04:20:13PM +0000, Sorin Vinturis wrote:
> > Ben, ovs-atomic-msvc.h contains support for 8,16,32 and 64-bit arithmetic
> and logical operations.
> > Regarding 64-bit operations, like read and write, they are performed using
> specific 64-bit interlocked functions, i.e. InterlockedCompareExchange64
> and, respectivelly, InterlockedOr64. Both functions are performed atomically.
> >
> > All remaining 64-bit operations use, also, 64-bit interlocked functions,
> which are executed atomically, except for logical AND operation. For the
> latter operation, an intrinsic function is used, _InterlockedExchangeAdd64,
> which means that is a faster function built in to the compiler. More details
> about this is here: https://msdn.microsoft.com/ro-
> ro/library/26td21ds(v=vs.80).aspx.
> >
> >
> > -----Original Message-----
> > From: Ben Pfaff [mailto:blp at ovn.org]
> > Sent: Tuesday, 29 March, 2016 00:25
> > To: Sorin Vinturis
> > Cc: Guru Shetty; dev at openvswitch.org
> > Subject: Re: [ovs-dev] [PATCH] datapath-windows: Add Windows thread
> atomic APIs for x64 binaries.
> >
> > On Mon, Mar 28, 2016 at 09:11:24PM +0000, Sorin Vinturis wrote:
> > > Hi Guru,
> > >
> > > I have verified the ovs-atomic-msvc.h header and all the defined
> > > macros and functions have 32-bit and 64-bit correspondent. All 64-bit
> > > macros use InterlockedXXX functions that are atomic:
> > > InterlockedExchangeNoFence64, InterlockedExchange64,
> > > InterlockedCompareExchange64, _InterlockedExchangeAdd64,
> > > InterlockedAnd64, InterlockedOr64, InterlockedXor64. I have ran
> > > test-atomic.c unit tests, on both 32-bit and 64-bit platforms, and all
> > > the tests are verified.
> > >
> > > I also tested the 64-bit build on Hyper-V and it is stable.
> > >
> > > Is there a reason why we should not use ovs-atomic-msvc APIs for 64-bit
> builds?
> >
> > ovs-atomic-msvc.h seems pretty oriented toward a 32-bit build.  At least, I
> really doubt MSVC would break 64-bit reads and writes into two 32-bit reads
> and writes on 64-bit builds, which is what the code in the header currently
> assumes.  You don't want to update it to handle 64-bit builds differently?  I
> guess that these would be optimizations, but they seem straightforward.
[Paul Boca] On 64-bit builds the Interlocked functions are redirected, using macros,
to intrinsic functions. In my opinion the usage of Interlocked functions in this case
cannot be optimized. Please correct me if I'm wrong.

> _______________________________________________
> dev mailing list
> dev at openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev


More information about the dev mailing list