[ovs-dev] [PATCH 02/12] ovs-atomic: Use memory_order_relaxed for ovs_refcount.

Jarno Rajahalme jrajahalme at nicira.com
Fri Jul 4 13:15:27 UTC 2014


Additional data point: I used http://en.cppreference.com/w/cpp/atomic/memory_order as a reference, and it reads:

“Typical use for relaxed memory ordering is updating counters, such as the reference counters of std::shared_ptr, since this only requires atomicity, but not ordering or synchronization.”

I guess the stress should be on the words “typical use”, read on...

Looking deeper into std::shared_ptr reveals that implementations do add synchronization before the memory is recycled, see, e.g., https://gcc.gnu.org/ml/libstdc++/2005-11/msg00136.html. The GCC implementation avoids the additional synchronization if the atomic operation already provides it (see /usr/include/c++/4.7/bits/shared_ptr_base.h).

From this I see that the commentary you provided is spot on. However, in OVS we *typically* protect access to shared objects with locks, mutexes, or RCU, all of which already provide full memory barriers, so no additional barriers are needed in unref. I plan to post a v2 that adds the barriers you mention to the bare ovs_refcount_unref() and introduces an explicit ovs_refcount_unref_relaxed(), which can be used in the situations mentioned above.

  Jarno

On Jul 3, 2014, at 3:35 PM, Jarno Rajahalme <jrajahalme at nicira.com> wrote:

> It seems I was too invested in the combined refcount/RCU case here. I still think that relaxed is the proper memory order when destruction is postponed via RCU. So maybe we should add a relaxed variant of the unref function to be used with RCU-protected objects, and make the normal unref use release ordering to guarantee that writes to the protected object are done before the reference is dropped.
> 
> I do not yet fully understand the need for acquire before the delete. Is the concern that the current thread might immediately recycle the memory, and that its writes to it (e.g., initialization) might be reordered to happen before the atomic subtraction, while other threads are still using the object?
> 
>  Jarno
> 
>> On Jul 3, 2014, at 11:52 PM, Ben Pfaff <blp at nicira.com> wrote:
>> 
>>> On Mon, Jun 30, 2014 at 08:17:18AM -0700, Jarno Rajahalme wrote:
>>> Updating the reference count only requires atomicity, but no memory
>>> ordering with respect to any other loads or stores.  Avoiding the
>>> overhead of the default memory_order_seq_cst can make these more
>>> efficient.
>>> 
>>> Signed-off-by: Jarno Rajahalme <jrajahalme at nicira.com>
>> 
>> This website makes pretty different claims:
>>       http://www.chaoticmind.net/~hcb/projects/boost.atomic/doc/atomic/usage_examples.html
>> 
>> Relevant excerpts:
>> 
>>   #include <boost/intrusive_ptr.hpp>
>>   #include <boost/atomic.hpp>
>> 
>>   class X {
>>   public:
>>     typedef boost::intrusive_ptr<X> pointer;
>>     X() : refcount_(0) {}
>> 
>>   private:
>>     mutable boost::atomic<int> refcount_;
>>     friend void intrusive_ptr_add_ref(const X * x)
>>     {
>>       x->refcount_.fetch_add(1, boost::memory_order_relaxed);
>>     }
>>     friend void intrusive_ptr_release(const X * x)
>>     {
>>       if (x->refcount_.fetch_sub(1, boost::memory_order_release) == 1) {
>>         boost::atomic_thread_fence(boost::memory_order_acquire);
>>         delete x;
>>       }
>>     }
>>   };
>> 
>> ...
>> 
>>   Increasing the reference counter can always be done with
>>   memory_order_relaxed: New references to an object can only be formed
>>   from an existing reference, and passing an existing reference from one
>>   thread to another must already provide any required synchronization.
>> 
>>   It is important to enforce any possible access to the object in one
>>   thread (through an existing reference) to happen before deleting the
>>   object in a different thread. This is achieved by a "release"
>>   operation after dropping a reference (any access to the object through
>>   this reference must obviously have happened before), and an "acquire"
>>   operation before deleting the object.
>> 
>>   It would be possible to use memory_order_acq_rel for the fetch_sub
>>   operation, but this results in unneeded "acquire" operations when the
>>   reference counter does not yet reach zero and may impose a performance
>>   penalty.



