1 /* -*- Mode: C++; tab-width: 8; indent-tabs-mode: nil; c-basic-offset: 2 -*- */
2 /* vim: set ts=8 sts=2 et sw=2 tw=80: */
3 /* This Source Code Form is subject to the terms of the Mozilla Public
4 * License, v. 2.0. If a copy of the MPL was not distributed with this
5 * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
8 * Implements (almost always) lock-free atomic operations. The operations here
9 * are a subset of that which can be found in C++11's <atomic> header, with a
10 * different API to enforce consistent memory ordering constraints.
12 * Anyone caught using |volatile| for inter-thread memory safety needs to be
13 * sent a copy of this header and the C++11 standard.
16 #ifndef mozilla_Atomics_h
17 #define mozilla_Atomics_h
19 #include "mozilla/Assertions.h"
20 #include "mozilla/Attributes.h"
21 #include "mozilla/Compiler.h"
22 #include "mozilla/RecordReplay.h"
23 #include "mozilla/TypeTraits.h"
32 * An enum of memory ordering possibilities for atomics.
34 * Memory ordering is the observable state of distinct values in memory.
35 * (It's a separate concept from atomicity, which concerns whether an
36 * operation can ever be observed in an intermediate state. Don't
37 * conflate the two!) Given a sequence of operations in source code on
38 * memory, it is *not* always the case that, at all times and on all
39 * cores, those operations will appear to have occurred in that exact
40 * sequence. First, the compiler might reorder that sequence, if it
41 * thinks another ordering will be more efficient. Second, the CPU may
42 * not expose so consistent a view of memory. CPUs will often perform
43 * their own instruction reordering, above and beyond that performed by
44 * the compiler. And each core has its own memory caches, and accesses
45 * (reads and writes both) to "memory" may only resolve to out-of-date
46 * cache entries -- not to the "most recently" performed operation in
47 * some global sense. Any access to a value that may be used by
48 * multiple threads, potentially across multiple cores, must therefore
49 * have a memory ordering imposed on it, for all code on all
50 * threads/cores to have a sufficiently coherent worldview.
52 * http://gcc.gnu.org/wiki/Atomic/GCCMM/AtomicSync and
53 * http://en.cppreference.com/w/cpp/atomic/memory_order go into more
54 * detail on all this, including examples of how each mode works.
56 * Note that for simplicity and practicality, not all of the modes in
57 * C++11 are supported. The missing C++11 modes are either subsumed by
58 * the modes we provide below, or not relevant for the CPUs we support
59 * in Gecko. These three modes are confusing enough as it is!
63 * Relaxed ordering is the simplest memory ordering: none at all.
64 * When the result of a write is observed, nothing may be inferred
65 * about other memory. Writes ostensibly performed "before" on the
66 * writing thread may not yet be visible. Writes performed "after" on
67 * the writing thread may already be visible, if the compiler or CPU
68 * reordered them. (The latter can happen if reads and/or writes get
69 * held up in per-processor caches.) Relaxed ordering means
70 * operations can always use cached values (as long as the actual
71 * updates to atomic values actually occur, correctly, eventually), so
72 * it's usually the fastest sort of atomic access. For this reason,
73 * *it's also the most dangerous kind of access*.
75 * Relaxed ordering is good for things like process-wide statistics
76 * counters that don't need to be consistent with anything else, so
77 * long as updates themselves are atomic. (And so long as any
78 * observations of that value can tolerate being out-of-date -- if you
79 * need some sort of up-to-date value, you need some sort of other
80 * synchronizing operation.) It's *not* good for locks, mutexes,
81 * reference counts, etc. that mediate access to other memory, or must
82 * be observably consistent with other memory.
84 * x86 architectures don't take advantage of the optimization
85 * opportunities that relaxed ordering permits. Thus it's possible
86 * that using relaxed ordering will "work" on x86 but fail elsewhere
87 * (ARM, say, which *does* implement non-sequentially-consistent
88 * relaxed ordering semantics). Be extra-careful using relaxed
89 * ordering if you can't easily test non-x86 architectures!
94 * When an atomic value is updated with ReleaseAcquire ordering, and
95 * that new value is observed with ReleaseAcquire ordering, prior
96 * writes (atomic or not) are also observable. What ReleaseAcquire
97 * *doesn't* give you is any observable ordering guarantees for
98 * ReleaseAcquire-ordered operations on different objects. For
99 * example, if there are two cores that each perform ReleaseAcquire
100 * operations on separate objects, each core may or may not observe
101 * the operations made by the other core. The only way the cores can
102 * be synchronized with ReleaseAcquire is if they both
103 * ReleaseAcquire-access the same object. This implies that you can't
104 * necessarily describe some global total ordering of ReleaseAcquire
107 * ReleaseAcquire ordering is good for (as the name implies) atomic
108 * operations on values controlling ownership of things: reference
109 * counts, mutexes, and the like. However, if you are thinking about
110 * using these to implement your own locks or mutexes, you should take
111 * a good, hard look at actual lock or mutex primitives first.
116 * When an atomic value is updated with SequentiallyConsistent
117 * ordering, all writes observable when the update is observed, just
118 * as with ReleaseAcquire ordering. But, furthermore, a global total
119 * ordering of SequentiallyConsistent operations *can* be described.
120 * For example, if two cores perform SequentiallyConsistent operations
121 * on separate objects, one core will observably perform its update
122 * (and all previous operations will have completed), then the other
123 * core will observably perform its update (and all previous
124 * operations will have completed). (Although those previous
125 * operations aren't themselves ordered -- they could be intermixed,
126 * or ordered if they occur on atomic values with ordering
127 * requirements.) SequentiallyConsistent is the *simplest and safest*
128 * ordering of atomic operations -- it's always as if one operation
129 * happens, then another, then another, in some order -- and every
130 * core observes updates to happen in that single order. Because it
131 * has the most synchronization requirements, operations ordered this
132 * way also tend to be slowest.
134 * SequentiallyConsistent ordering can be desirable when multiple
135 * threads observe objects, and they all have to agree on the
136 * observable order of changes to them. People expect
137 * SequentiallyConsistent ordering, even if they shouldn't, when
138 * writing code, atomic or otherwise. SequentiallyConsistent is also
139 * the ordering of choice when designing lockless data structures. If
140 * you don't know what order to use, use this one.
142 SequentiallyConsistent
,
148 * Structure which can be used to preserve the ordering of atomic accesses
149 * when recording or replaying an execution, depending on the Recording enum.
151 * Atomic access ordering is preserved by default when recording/replaying.
152 * This should be overridden for atomics that can be accessed in code that
153 * runs non-deterministically when recording/replaying, such as during GC, the
154 * JS interrupt callback, or code that is affected by JIT compilation or
157 template <recordreplay::Behavior Recording
>
158 struct AutoRecordAtomicAccess
;
161 struct AutoRecordAtomicAccess
<recordreplay::Behavior::DontPreserve
> {
162 explicit AutoRecordAtomicAccess(const void* aValue
) {}
163 ~AutoRecordAtomicAccess() {}
167 struct AutoRecordAtomicAccess
<recordreplay::Behavior::Preserve
> {
168 explicit AutoRecordAtomicAccess(const void* aValue
) {
169 recordreplay::BeginOrderedAtomicAccess(aValue
);
171 ~AutoRecordAtomicAccess() { recordreplay::EndOrderedAtomicAccess(); }
175 * We provide CompareExchangeFailureOrder to work around a bug in some
176 * versions of GCC's <atomic> header. See bug 898491.
178 template <MemoryOrdering Order
>
179 struct AtomicOrderConstraints
;
182 struct AtomicOrderConstraints
<Relaxed
> {
183 static const std::memory_order AtomicRMWOrder
= std::memory_order_relaxed
;
184 static const std::memory_order LoadOrder
= std::memory_order_relaxed
;
185 static const std::memory_order StoreOrder
= std::memory_order_relaxed
;
186 static const std::memory_order CompareExchangeFailureOrder
=
187 std::memory_order_relaxed
;
191 struct AtomicOrderConstraints
<ReleaseAcquire
> {
192 static const std::memory_order AtomicRMWOrder
= std::memory_order_acq_rel
;
193 static const std::memory_order LoadOrder
= std::memory_order_acquire
;
194 static const std::memory_order StoreOrder
= std::memory_order_release
;
195 static const std::memory_order CompareExchangeFailureOrder
=
196 std::memory_order_acquire
;
200 struct AtomicOrderConstraints
<SequentiallyConsistent
> {
201 static const std::memory_order AtomicRMWOrder
= std::memory_order_seq_cst
;
202 static const std::memory_order LoadOrder
= std::memory_order_seq_cst
;
203 static const std::memory_order StoreOrder
= std::memory_order_seq_cst
;
204 static const std::memory_order CompareExchangeFailureOrder
=
205 std::memory_order_seq_cst
;
208 template <typename T
, MemoryOrdering Order
>
209 struct IntrinsicBase
{
210 typedef std::atomic
<T
> ValueType
;
211 typedef AtomicOrderConstraints
<Order
> OrderedOp
;
214 template <typename T
, MemoryOrdering Order
, recordreplay::Behavior Recording
>
215 struct IntrinsicMemoryOps
: public IntrinsicBase
<T
, Order
> {
216 typedef IntrinsicBase
<T
, Order
> Base
;
218 static T
load(const typename
Base::ValueType
& aPtr
) {
219 AutoRecordAtomicAccess
<Recording
> record(&aPtr
);
220 return aPtr
.load(Base::OrderedOp::LoadOrder
);
223 static void store(typename
Base::ValueType
& aPtr
, T aVal
) {
224 AutoRecordAtomicAccess
<Recording
> record(&aPtr
);
225 aPtr
.store(aVal
, Base::OrderedOp::StoreOrder
);
228 static T
exchange(typename
Base::ValueType
& aPtr
, T aVal
) {
229 AutoRecordAtomicAccess
<Recording
> record(&aPtr
);
230 return aPtr
.exchange(aVal
, Base::OrderedOp::AtomicRMWOrder
);
233 static bool compareExchange(typename
Base::ValueType
& aPtr
, T aOldVal
,
235 AutoRecordAtomicAccess
<Recording
> record(&aPtr
);
236 return aPtr
.compare_exchange_strong(
237 aOldVal
, aNewVal
, Base::OrderedOp::AtomicRMWOrder
,
238 Base::OrderedOp::CompareExchangeFailureOrder
);
242 template <typename T
, MemoryOrdering Order
, recordreplay::Behavior Recording
>
243 struct IntrinsicAddSub
: public IntrinsicBase
<T
, Order
> {
244 typedef IntrinsicBase
<T
, Order
> Base
;
246 static T
add(typename
Base::ValueType
& aPtr
, T aVal
) {
247 AutoRecordAtomicAccess
<Recording
> record(&aPtr
);
248 return aPtr
.fetch_add(aVal
, Base::OrderedOp::AtomicRMWOrder
);
251 static T
sub(typename
Base::ValueType
& aPtr
, T aVal
) {
252 AutoRecordAtomicAccess
<Recording
> record(&aPtr
);
253 return aPtr
.fetch_sub(aVal
, Base::OrderedOp::AtomicRMWOrder
);
257 template <typename T
, MemoryOrdering Order
, recordreplay::Behavior Recording
>
258 struct IntrinsicAddSub
<T
*, Order
, Recording
> : public IntrinsicBase
<T
*, Order
> {
259 typedef IntrinsicBase
<T
*, Order
> Base
;
261 static T
* add(typename
Base::ValueType
& aPtr
, ptrdiff_t aVal
) {
262 AutoRecordAtomicAccess
<Recording
> record(&aPtr
);
263 return aPtr
.fetch_add(aVal
, Base::OrderedOp::AtomicRMWOrder
);
266 static T
* sub(typename
Base::ValueType
& aPtr
, ptrdiff_t aVal
) {
267 AutoRecordAtomicAccess
<Recording
> record(&aPtr
);
268 return aPtr
.fetch_sub(aVal
, Base::OrderedOp::AtomicRMWOrder
);
272 template <typename T
, MemoryOrdering Order
, recordreplay::Behavior Recording
>
273 struct IntrinsicIncDec
: public IntrinsicAddSub
<T
, Order
, Recording
> {
274 typedef IntrinsicBase
<T
, Order
> Base
;
276 static T
inc(typename
Base::ValueType
& aPtr
) {
277 return IntrinsicAddSub
<T
, Order
, Recording
>::add(aPtr
, 1);
280 static T
dec(typename
Base::ValueType
& aPtr
) {
281 return IntrinsicAddSub
<T
, Order
, Recording
>::sub(aPtr
, 1);
285 template <typename T
, MemoryOrdering Order
, recordreplay::Behavior Recording
>
286 struct AtomicIntrinsics
: public IntrinsicMemoryOps
<T
, Order
, Recording
>,
287 public IntrinsicIncDec
<T
, Order
, Recording
> {
288 typedef IntrinsicBase
<T
, Order
> Base
;
290 static T
or_(typename
Base::ValueType
& aPtr
, T aVal
) {
291 AutoRecordAtomicAccess
<Recording
> record(&aPtr
);
292 return aPtr
.fetch_or(aVal
, Base::OrderedOp::AtomicRMWOrder
);
295 static T
xor_(typename
Base::ValueType
& aPtr
, T aVal
) {
296 AutoRecordAtomicAccess
<Recording
> record(&aPtr
);
297 return aPtr
.fetch_xor(aVal
, Base::OrderedOp::AtomicRMWOrder
);
300 static T
and_(typename
Base::ValueType
& aPtr
, T aVal
) {
301 AutoRecordAtomicAccess
<Recording
> record(&aPtr
);
302 return aPtr
.fetch_and(aVal
, Base::OrderedOp::AtomicRMWOrder
);
306 template <typename T
, MemoryOrdering Order
, recordreplay::Behavior Recording
>
307 struct AtomicIntrinsics
<T
*, Order
, Recording
>
308 : public IntrinsicMemoryOps
<T
*, Order
, Recording
>,
309 public IntrinsicIncDec
<T
*, Order
, Recording
> {};
311 template <typename T
>
312 struct ToStorageTypeArgument
{
313 static constexpr T
convert(T aT
) { return aT
; }
316 template <typename T
, MemoryOrdering Order
, recordreplay::Behavior Recording
>
318 static_assert(sizeof(T
) == 4 || sizeof(T
) == 8,
319 "mozilla/Atomics.h only supports 32-bit and 64-bit types");
322 typedef typename
detail::AtomicIntrinsics
<T
, Order
, Recording
> Intrinsics
;
323 typedef typename
Intrinsics::ValueType ValueType
;
327 constexpr AtomicBase() : mValue() {}
328 explicit constexpr AtomicBase(T aInit
)
329 : mValue(ToStorageTypeArgument
<T
>::convert(aInit
)) {}
331 // Note: we can't provide operator T() here because Atomic<bool> inherits
332 // from AtomcBase with T=uint32_t and not T=bool. If we implemented
333 // operator T() here, it would cause errors when comparing Atomic<bool> with
336 T
operator=(T aVal
) {
337 Intrinsics::store(mValue
, aVal
);
342 * Performs an atomic swap operation. aVal is stored and the previous
343 * value of this variable is returned.
345 T
exchange(T aVal
) { return Intrinsics::exchange(mValue
, aVal
); }
348 * Performs an atomic compare-and-swap operation and returns true if it
349 * succeeded. This is equivalent to atomically doing
351 * if (mValue == aOldValue) {
352 * mValue = aNewValue;
358 bool compareExchange(T aOldValue
, T aNewValue
) {
359 return Intrinsics::compareExchange(mValue
, aOldValue
, aNewValue
);
363 AtomicBase(const AtomicBase
& aCopy
) = delete;
366 template <typename T
, MemoryOrdering Order
, recordreplay::Behavior Recording
>
367 class AtomicBaseIncDec
: public AtomicBase
<T
, Order
, Recording
> {
368 typedef typename
detail::AtomicBase
<T
, Order
, Recording
> Base
;
371 constexpr AtomicBaseIncDec() : Base() {}
372 explicit constexpr AtomicBaseIncDec(T aInit
) : Base(aInit
) {}
374 using Base::operator=;
376 operator T() const { return Base::Intrinsics::load(Base::mValue
); }
377 T
operator++(int) { return Base::Intrinsics::inc(Base::mValue
); }
378 T
operator--(int) { return Base::Intrinsics::dec(Base::mValue
); }
379 T
operator++() { return Base::Intrinsics::inc(Base::mValue
) + 1; }
380 T
operator--() { return Base::Intrinsics::dec(Base::mValue
) - 1; }
383 AtomicBaseIncDec(const AtomicBaseIncDec
& aCopy
) = delete;
386 } // namespace detail
389 * A wrapper for a type that enforces that all memory accesses are atomic.
391 * In general, where a variable |T foo| exists, |Atomic<T> foo| can be used in
392 * its place. Implementations for integral and pointer types are provided
395 * Atomic accesses are sequentially consistent by default. You should
396 * use the default unless you are tall enough to ride the
397 * memory-ordering roller coaster (if you're not sure, you aren't) and
398 * you have a compelling reason to do otherwise.
400 * There is one exception to the case of atomic memory accesses: providing an
401 * initial value of the atomic value is not guaranteed to be atomic. This is a
402 * deliberate design choice that enables static atomic variables to be declared
403 * without introducing extra static constructors.
405 template <typename T
, MemoryOrdering Order
= SequentiallyConsistent
,
406 recordreplay::Behavior Recording
= recordreplay::Behavior::Preserve
,
407 typename Enable
= void>
411 * Atomic<T> implementation for integral types.
413 * In addition to atomic store and load operations, compound assignment and
414 * increment/decrement operators are implemented which perform the
415 * corresponding read-modify-write operation atomically. Finally, an atomic
416 * swap method is provided.
418 template <typename T
, MemoryOrdering Order
, recordreplay::Behavior Recording
>
421 typename EnableIf
<IsIntegral
<T
>::value
&& !IsSame
<T
, bool>::value
>::Type
>
422 : public detail::AtomicBaseIncDec
<T
, Order
, Recording
> {
423 typedef typename
detail::AtomicBaseIncDec
<T
, Order
, Recording
> Base
;
426 constexpr Atomic() : Base() {}
427 explicit constexpr Atomic(T aInit
) : Base(aInit
) {}
429 using Base::operator=;
431 T
operator+=(T aDelta
) {
432 return Base::Intrinsics::add(Base::mValue
, aDelta
) + aDelta
;
435 T
operator-=(T aDelta
) {
436 return Base::Intrinsics::sub(Base::mValue
, aDelta
) - aDelta
;
439 T
operator|=(T aVal
) {
440 return Base::Intrinsics::or_(Base::mValue
, aVal
) | aVal
;
443 T
operator^=(T aVal
) {
444 return Base::Intrinsics::xor_(Base::mValue
, aVal
) ^ aVal
;
447 T
operator&=(T aVal
) {
448 return Base::Intrinsics::and_(Base::mValue
, aVal
) & aVal
;
452 Atomic(Atomic
& aOther
) = delete;
456 * Atomic<T> implementation for pointer types.
458 * An atomic compare-and-swap primitive for pointer variables is provided, as
459 * are atomic increment and decement operators. Also provided are the compound
460 * assignment operators for addition and subtraction. Atomic swap (via
461 * exchange()) is included as well.
463 template <typename T
, MemoryOrdering Order
, recordreplay::Behavior Recording
>
464 class Atomic
<T
*, Order
, Recording
>
465 : public detail::AtomicBaseIncDec
<T
*, Order
, Recording
> {
466 typedef typename
detail::AtomicBaseIncDec
<T
*, Order
, Recording
> Base
;
469 constexpr Atomic() : Base() {}
470 explicit constexpr Atomic(T
* aInit
) : Base(aInit
) {}
472 using Base::operator=;
474 T
* operator+=(ptrdiff_t aDelta
) {
475 return Base::Intrinsics::add(Base::mValue
, aDelta
) + aDelta
;
478 T
* operator-=(ptrdiff_t aDelta
) {
479 return Base::Intrinsics::sub(Base::mValue
, aDelta
) - aDelta
;
483 Atomic(Atomic
& aOther
) = delete;
487 * Atomic<T> implementation for enum types.
489 * The atomic store and load operations and the atomic swap method is provided.
491 template <typename T
, MemoryOrdering Order
, recordreplay::Behavior Recording
>
492 class Atomic
<T
, Order
, Recording
, typename EnableIf
<IsEnum
<T
>::value
>::Type
>
493 : public detail::AtomicBase
<T
, Order
, Recording
> {
494 typedef typename
detail::AtomicBase
<T
, Order
, Recording
> Base
;
497 constexpr Atomic() : Base() {}
498 explicit constexpr Atomic(T aInit
) : Base(aInit
) {}
500 operator T() const { return T(Base::Intrinsics::load(Base::mValue
)); }
502 using Base::operator=;
505 Atomic(Atomic
& aOther
) = delete;
509 * Atomic<T> implementation for boolean types.
511 * The atomic store and load operations and the atomic swap method is provided.
515 * - sizeof(Atomic<bool>) != sizeof(bool) for some implementations of
516 * bool and/or some implementations of std::atomic. This is allowed in
517 * [atomic.types.generic]p9.
519 * - It's not obvious whether the 8-bit atomic functions on Windows are always
520 * inlined or not. If they are not inlined, the corresponding functions in the
521 * runtime library are not available on Windows XP. This is why we implement
522 * Atomic<bool> with an underlying type of uint32_t.
524 template <MemoryOrdering Order
, recordreplay::Behavior Recording
>
525 class Atomic
<bool, Order
, Recording
>
526 : protected detail::AtomicBase
<uint32_t, Order
, Recording
> {
527 typedef typename
detail::AtomicBase
<uint32_t, Order
, Recording
> Base
;
530 constexpr Atomic() : Base() {}
531 explicit constexpr Atomic(bool aInit
) : Base(aInit
) {}
533 // We provide boolean wrappers for the underlying AtomicBase methods.
534 MOZ_IMPLICIT
operator bool() const {
535 return Base::Intrinsics::load(Base::mValue
);
538 bool operator=(bool aVal
) { return Base::operator=(aVal
); }
540 bool exchange(bool aVal
) { return Base::exchange(aVal
); }
542 bool compareExchange(bool aOldValue
, bool aNewValue
) {
543 return Base::compareExchange(aOldValue
, aNewValue
);
547 Atomic(Atomic
& aOther
) = delete;
550 // If you want to atomically swap two atomic values, use exchange().
551 template <typename T
, MemoryOrdering Order
>
552 void Swap(Atomic
<T
, Order
>&, Atomic
<T
, Order
>&) = delete;
554 } // namespace mozilla
556 #endif /* mozilla_Atomics_h */