Lightweight concurrency

From HaskellWiki
Revision as of 12:53, 7 March 2012 by Kc (talk | contribs) (Added section interaction with RTS and subsections)
Jump to navigation Jump to search

This page contains information about the design, implementation, problems and potential solutions for building user-level concurrency primitives in GHC.

Introdution

All of GHC's concurrency primitives are written in C code and is baked in as a part of the RTS. This precludes extensibility as well as making it difficult to maintain. Ideally, the concurrency libraries will be implemented completely in Haskell code, over a small subset of primitive operations provided by the RTS. This will provide a Haskell programmer the ability to build custom schedulers and concurrency libraries. For an earlier attempt at this problem, please look at Peng Li's paper [1].

Substrate primitives

Substrate primitives are the primitives exposed by the RTS, on top of which user-level concurreny libraries are built.

data PTM a -- Primitive transactional memory
instance Monad PTM
unsafeIOToPTM :: IO a -> PTM a
atomically :: PTM a -> IO a

data PVar a -- Primitive transactional variable
newPVar :: a -> PTM (PVar a)
newPVarIO :: a -> IO (PVar a)
readPVar :: PVar a -> PTM a
writePVar :: PVar a -> a -> PTM ()

data SCont -- One-shot continuations
data ThreadStatus = Blocked | Completed -- | Running. Running is set implicitly.
newSCont :: IO () -> IO SCont
switch   :: (SCont -> PTM (SCont, ThreadStatus)) -> IO ()
{- For switch, target thread's status must be Blocked. Otherwise, raises runtime error. 
 - After switching, target thread's status is implicitly set to Running, and current 
 - thread's status is set to ThreadStatus that was passed.
 -}
getSCont :: PTM SCont
switchTo :: SCont -> ThreadStatus -> PTM ()


Concurrency libraries

In order to support the construction of extensible user-level schedulers in GHC, special care has to be taken about blocking concurrency actions. When there is no default scheduler, the user-level scheduler must be made aware of the blocking action, and more interestingly, the blocking action of the user-level scheduler.

Motivation

The interaction between user-level schedulers and blocking actions is motivated through actions on MVars.The semantics of takeMVar is to block the calling thread if the MVar is empty, and eventually unblock and return the value when the MVar becomes full. Internally, when a thread blocks on an MVar, it switches to the next runnable thread. This assumes that the takeMVar has knowledge about the scheduler. In particular, the current implementation of takeMVar knows how to perform the following:

  • Block action: blocking the current thread on a condition, and switching to another runnable thread.
  • Unblock action: placing the unblocked thread back into the scheduler data structure.

Proposal

The new, scheduler agnostic version of takeMVar (say takeMVarPrim), will have the type:

takeMVarPrim :: PTM () -> PTM () -> MVarPrim a -> IO a

where the first and second arguments are the block and unblock actions. If the blocking and unblocking actions are known, takeMVar with its usual type can be obtained simply by partial application:

takeMVar :: MVarPrim a -> IO a
takeMVar = takeMVarPrim blockAct unblockAct

Since the MVar implementation is independent of the schedulers, even threads from different schedulers can perform operations on the same MVar. The knowledge of schedulers is completely embedded in the block and unblock actions. A typical implementation of blockAct and unblockAct for a scheduler might look like

data Scheduler

getBlockUnblockPair :: Scheduler -> (PTM (), PTM ())
getBlockUnblockPair sched = do
  thread <- Substrate.getSCont
  let blockAction = do {
    nextThread <- -- get next thread to run from sched
    switchTo nextThread Substrate.Blocked
  }
  let unblockAction = -- enque thread to sched
  return (blockAction, unblockAction)


Interaction with RTS

Interaction with GC

Bound threads

Safe foreign calls