AtomicMemoryOps
Purpose
Modern multicore processors provide a various atomic memory operations. These are instructions that work (i.e. are linearizable) even when used concurrently by multiple threads on the same memory address. They include compare-and-swap, atomic read-modify-write operations such as fetch-and-add, and others.
The main motivation for exposing these operations is to implement lock-free data structures in Haskell, and to perform efficient concurrent operations on existing data structures like IORef and Vector.
To get started quickly using these operations in Haskell, grab the atomic-primops package from Hackage. This is a low-level library that provides a thin layer over the basic operations: GHC primops or foreign primops. It does, as of version 0.4, provide some useful operations for end-users, including CAS on IORef
s, and in the future we hope to add more functionality for modifying popular data structures such as Vector
s.
Problems with CAS in Haskell
When you begin using atomic-primops, you will find that operations such as fetch-and-add within a byte-array, or a storeLoadBarrier
, do exactly the same thing as they would in C or Java or LLVM.
Compare-and-swap, on the other hand, is a delicate business when applied to arbitrary Haskell values. Compare-and-swap on boxed values (pointers) depends on a stable notion of pointer equality, which exists in languages like Java, ML, or Scheme, but which doesn't exist in Haskell. In fact, the GHC implementation does provide a primop to test for pointer equality, but its name tells the story: reallyUnsafePtrEquality#.
Haskell preserves referential transparency, and both Haskell programmers and the GHC compiler exercise equational reasoning that requires replacing terms with equal terms, using a notion of equality that does not respect pointer equality. Here are some things that GHC might do which can change the pointer that you think you have:
- Unbox and rebox an
Int
value. - Wrap extra thunks around your values in various places, e.g. for profiling.
- Duplicate immutable objects in the heap during garbage collection.
- Substitute a different variable for the one you thought you used even though at the source level they have different types, and one is produced by an
unsafeCoerce
of the other. (For example, see this bug.)
More details about these problems can be found in a Haskell Implementor's workshop talk from 2012.
One Solution: CAS with abstract tickets
Normally, CAS(ref,old,new)
takes an old and a new value (of any type), and if the mutable reference is still set to the old value, it is modified to point at the new one. The problem with that approach in Haskell is that it can be very hard to provide the old pointer with any reliability, due to the issues cited above.
In Haskell, we want a CAS which operationally does the same thing, but we need a different interface. In particular, a CAS operation exposed by the Data.Atomics module looks like this:
casIORef :: IORef a -> Ticket a -> a -> IO (Bool, Ticket a) casIORef ref old new = ...
The old
argument is of type Ticket a
rather than a
. The Data.Atomics
library takes care of making sure the Ticket refers to something that the GHC compiler cannot interfere with. (In GHC 7.6 it uses the Any
kind.) This involves a delicate balance of type coercions, NOINLINE pragmas, and so on, but the user shouldn't need to worry about it. A Ticket is a proper value that can be stored in data structures, passed around, and be safely relied upon to provide a snapshot of the previous state of the location when it comes time to modify it again. The Ticket returned from one successful CAS operation can be used in the next.
Implementation notes
The original version of atomic-primops implemented all the relevant compiler primops as foreign primops. This also necessitated duplicating some C code from the GHC runtime system.
This is not ideal, and in GHC 7.8 the primops in question were added directly to the compiler. But more may be needed in the future. Also, a remaining drawback is that all atomic primops remain out-of-line rather than inline-primops. This is because the GHC native code generator does not yet know how to produce the relevant instructions on all supported architectures, rather, this is done in C code.
Future work may include adding direct support for these primops in the LLVM backend, which supports all the operations in a straightforward way, and would not have to suffer the overhead of calling into C code.