Lightweight concurrency

Revision as of 15:10, 7 March 2012

This page contains information about the design, implementation, problems and potential solutions for building user-level concurrency primitives in GHC.

Introduction

All of GHC's concurrency primitives are written in C and baked into the RTS. This precludes extensibility and makes the system difficult to maintain. Ideally, the concurrency libraries would be implemented entirely in Haskell, on top of a small set of primitive operations provided by the RTS. This would give Haskell programmers the ability to build custom schedulers and concurrency libraries. For an earlier attempt at this problem, please see Peng Li's paper [1].

Substrate primitives

Substrate primitives are the primitives exposed by the RTS, on top of which user-level concurrency libraries are built.

data PTM a -- Primitive transactional memory
instance Monad PTM
unsafeIOToPTM :: IO a -> PTM a
atomically :: PTM a -> IO a

data PVar a -- Primitive transactional variable
newPVar :: a -> PTM (PVar a)
newPVarIO :: a -> IO (PVar a)
readPVar :: PVar a -> PTM a
writePVar :: PVar a -> a -> PTM ()

data SCont -- One-shot continuations
data ThreadStatus = Blocked | Completed -- Running is not a constructor; a thread is implicitly Running while it executes.
newSCont :: IO () -> IO SCont
switch   :: (SCont -> PTM (SCont, ThreadStatus)) -> IO ()
{- For switch, the target thread's status must be Blocked; otherwise, a runtime
 - error is raised. After switching, the target thread's status is implicitly
 - set to Running, and the current thread's status is set to the ThreadStatus
 - that was passed.
 -}
getSCont :: PTM SCont
switchTo :: SCont -> ThreadStatus -> PTM ()
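
As an illustration, a minimal round-robin scheduler can be sketched on top of these primitives. This is only a sketch against the proposed API (it does not compile against today's GHC), and the RunQueue representation and the fork and yield names are assumptions, not part of the substrate:

```haskell
-- Sketch against the proposed substrate API; RunQueue, fork and yield
-- are illustrative names, not substrate primitives.
newtype RunQueue = RunQueue (PVar [SCont])

newRunQueue :: IO RunQueue
newRunQueue = fmap RunQueue (newPVarIO [])

-- Create a thread and append it to the back of the run queue.
fork :: RunQueue -> IO () -> IO ()
fork (RunQueue q) action = do
  sc <- newSCont action
  atomically $ do
    scs <- readPVar q
    writePVar q (scs ++ [sc])

-- Enqueue the current thread as Blocked (so that a later switch to it is
-- legal) and transfer control to the thread at the front of the queue.
yield :: RunQueue -> IO ()
yield (RunQueue q) = switch $ \self -> do
  scs <- readPVar q
  case scs of
    next:rest -> do writePVar q (rest ++ [self])
                    return (next, Blocked)
    []        -> error "yield: no other runnable thread"
```

Note how the scheduling policy (FIFO here) lives entirely in user code; the substrate only supplies continuation capture and the transactional queue.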


Concurrency libraries

In order to support the construction of extensible user-level schedulers in GHC, special care has to be taken with blocking concurrency actions. When there is no built-in scheduler, the user-level scheduler must be made aware of blocking actions and, more interestingly, blocking actions must be made aware of the user-level scheduler.

Motivation

The interaction between user-level schedulers and blocking actions is motivated here through actions on MVars. The semantics of takeMVar is to block the calling thread if the MVar is empty, and eventually to unblock it and return the value when the MVar becomes full. Internally, when a thread blocks on an MVar, it switches to the next runnable thread. This assumes that takeMVar has knowledge of the scheduler. In particular, the current implementation of takeMVar knows how to perform the following:

  • Block action: blocking the current thread on a condition, and switching to another runnable thread.
  • Unblock action: placing the unblocked thread back into the scheduler data structure.

Proposal

The new, scheduler agnostic version of takeMVar (say takeMVarPrim), will have the type:

takeMVarPrim :: PTM () -> PTM () -> MVarPrim a -> IO a

where the first and second arguments are the block and unblock actions. If the blocking and unblocking actions are known, takeMVar with its usual type can be obtained simply by partial application:

takeMVar :: MVarPrim a -> IO a
takeMVar = takeMVarPrim blockAct unblockAct
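
To make the intended use of the block and unblock actions concrete, here is one possible shape of takeMVarPrim. The MVarPrim representation and the handling of the full case are assumptions (blocked putters are ignored for brevity); this is a sketch against the proposed substrate, not compilable code:

```haskell
-- Assumed representation: a PVar holding either a value or a FIFO of
-- waiting takers, each with a result slot and its scheduler's unblock action.
newtype MVarPrim a = MVarPrim (PVar (MVState a))
data MVState a = Full a | Empty [(PVar (Maybe a), PTM ())]

takeMVarPrim :: PTM () -> PTM () -> MVarPrim a -> IO a
takeMVarPrim blockAct unblockAct (MVarPrim st) = do
  slot <- newPVarIO Nothing
  atomically $ do
    s <- readPVar st
    case s of
      Full a   -> do writePVar st (Empty [])
                     writePVar slot (Just a)
      Empty ws -> do writePVar st (Empty (ws ++ [(slot, unblockAct)]))
                     blockAct   -- switches away; we resume once unblocked
  Just a <- atomically (readPVar slot)  -- filled by the putter before unblocking
  return a
```

A matching putMVarPrim would, on finding a waiting taker, write its result slot and run that taker's unblock action, enqueueing the taker back into its own scheduler.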

Since the MVar implementation is independent of the schedulers, even threads from different schedulers can perform operations on the same MVar. The knowledge of schedulers is completely embedded in the block and unblock actions. A typical implementation of blockAct and unblockAct for a scheduler might look like

data Scheduler

getBlockUnblockPair :: Scheduler -> PTM (PTM (), PTM ())
getBlockUnblockPair sched = do
  thread <- Substrate.getSCont
  let blockAction = do
        nextThread <- ... -- get the next thread to run from sched
        switchTo nextThread Substrate.Blocked
      unblockAction =
        ... -- enqueue thread to sched
  return (blockAction, unblockAction)
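
Filling in the pseudocode above, if the Scheduler is assumed to be a simple FIFO run queue of ready SConts (an assumption; the substrate fixes no representation), the pair of actions could look like:

```haskell
-- Sketch only: the FIFO Scheduler representation is an assumption.
newtype Scheduler = Scheduler (PVar [SCont])

getBlockUnblockPair :: Scheduler -> PTM (PTM (), PTM ())
getBlockUnblockPair (Scheduler q) = do
  thread <- Substrate.getSCont
  let blockAction = do
        scs <- readPVar q
        case scs of
          next:rest -> do writePVar q rest
                          Substrate.switchTo next Substrate.Blocked
          []        -> error "blockAction: no runnable thread"
      unblockAction = do
        scs <- readPVar q
        writePVar q (scs ++ [thread])
  return (blockAction, unblockAction)
```

Note that blockAction runs in the blocking thread's own transaction, while unblockAction runs later in some other thread's transaction (e.g. inside putMVar), appending the captured continuation back onto its home scheduler's queue.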


Interaction with RTS

In the current GHC implementation, the runtime manages the threading system entirely. By moving the threading system to the user level, several subtle interactions between the threads and the RTS have to be handled differently. This section goes into the details of such interactions and lists the issues and potential solutions.

Interaction with GC

In the vanilla GHC implementation, each capability maintains a list of runnable Haskell threads. Each generation in the GC also maintains a list of threads belonging to that generation. At the end of a generational collection, threads that survive are promoted to the next generation. Whenever a new thread is created, it is added to generation 0's thread list. During a GC, threads are classified into three categories:

  • Runnable threads: Threads that are on the runnable queues. These are considered to be GC roots.
  • Reachable threads: Threads that are reachable from runnable threads. These threads might be blocked on MVars, STM actions, etc., or be complete or killed.
  • Unreachable threads: Threads that are unreachable. Unreachable threads might be blocked, complete or killed.

At the end of a GC, all unreachable threads that are blocked are prepared with a BlockedIndefinitely exception and added to their capability's run queue. Note that complete and killed reachable threads survive a collection along with runnable threads, since asynchronous exceptions can still be invoked on them.

In the lightweight concurrency implementation, each capability has just a single runnable thread. Each generation still maintains a list of threads belonging to that generation. During a GC, threads are classified only as reachable or unreachable. The RTS knows whether a thread is blocked or complete, since this is made explicit in the switch primitive.

Problem

In the LWC implementation, since there is no notion of a runnable queue of threads for a capability, how do we raise the BlockedIndefinitely exception?

Bound threads

Bound threads [http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Concurrent.html#g:9] are bound to operating system threads (tasks), and only the task to which the Haskell thread is bound can run it. Under vanilla GHC, every capability has a pool of tasks (workers), only one of which can own the capability at any given time. Ideally, there is a one-to-one mapping between capabilities and cores. If the next Haskell thread to be run is a bound thread, and the bound thread is bound to:

  • the current task, run it.
  • a task belonging to the current capability, add the thread to the front of the run queue, and pass the current capability to the bound thread's task (which is currently suspended). This suspends the current task, and wakes up the target task.
  • a task belonging to a different capability, add the thread to that capability's run queue, and switch to the next task.

Problem

Under the lightweight concurrency implementation, the first two cases can be handled just as in the vanilla implementation. For the last case, since there is no notion of per-capability run queues, we are stuck.

Safe foreign calls