Lightweight concurrency
== Interaction with RTS ==

In the current GHC implementation, the runtime manages the threading system entirely. By moving the threading system to the user level, several subtle interactions between the threads and the RTS have to be handled differently. This section goes into the details of such interactions, and lists the issues and potential solutions.

=== Up-call handlers ===

In order to support interaction between the scheduler and the RTS, every Haskell thread must have the following up-call handlers:

<haskell>
switchToNext  :: SwitchStatus -> IO ()
unblockThread :: SCont -> IO ()
</haskell>

<hask>switchToNext</hask> implements the code necessary to switch to the next thread from the calling thread's scheduler, and suspends the calling thread with the given status. <hask>unblockThread</hask> enqueues the given thread on the current thread's scheduler. <hask>switchToNext</hask> and <hask>unblockThread</hask> are analogous to the block and unblock actions described under [[Lightweight_concurrency#Proposal|concurrency primitives]].

The <hask>unblockThread</hask> up-call handler explicitly takes an SCont as an argument. This might seem strange at first, since every thread has its own <hask>unblockThread</hask> handler. But this signature allows helper threads created by the RTS to inherit a scheduler, so that they have sensible semantics when they block on a blackhole, for example.

The up-call handlers are stored in the StgTSO thread structure so that the RTS can find them. They are traced during a GC as part of tracing the thread. It is the responsibility of schedulers to install the up-call handlers during thread creation. Currently, up-call handlers are installed using the following primitives exposed by the substrate:

<haskell>
setSwitchToNextClosure  :: SCont -> (SwitchStatus -> IO ()) -> IO ()
setUnblockThreadClosure :: SCont -> (SCont -> IO ()) -> IO ()
</haskell>

where the given SCont is the target thread. Ideally, this should be part of the newSCont primitive.
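
The following is a minimal sketch of how a scheduler might install the two handlers at thread creation, and how a helper thread could inherit a scheduler by copying a handler. It is a toy model in plain Haskell, not the substrate itself: the <hask>SCont</hask> record, <hask>newToySCont</hask>, <hask>newScheduledSCont</hask>, <hask>upcallUnblock</hask> and the <hask>Yielded</hask> and <hask>Completed</hask> constructors are all hypothetical, and the run queue is just a list of thread names.

<haskell>
import Data.IORef

-- Toy stand-ins, NOT the real substrate types. Only BlockedOnConcDS is
-- named on this page; the other constructors are hypothetical.
data SwitchStatus = Yielded | BlockedOnConcDS | Completed
  deriving (Eq, Show)

-- A toy SCont: a name plus two mutable slots for its up-call handlers
-- (the real handlers are stored in the StgTSO structure).
data SCont = SCont
  { scontName      :: String
  , switchHandler  :: IORef (SwitchStatus -> IO ())
  , unblockHandler :: IORef (SCont -> IO ())
  }

-- Hypothetical constructor: a thread with no-op handlers.
newToySCont :: String -> IO SCont
newToySCont name =
  SCont name <$> newIORef (\_ -> pure ()) <*> newIORef (\_ -> pure ())

-- Same signatures as the substrate primitives, implemented over the toy type.
setSwitchToNextClosure :: SCont -> (SwitchStatus -> IO ()) -> IO ()
setSwitchToNextClosure sc = writeIORef (switchHandler sc)

setUnblockThreadClosure :: SCont -> (SCont -> IO ()) -> IO ()
setUnblockThreadClosure sc = writeIORef (unblockHandler sc)

-- A scheduler installs both handlers when it creates a thread.
newScheduledSCont :: IORef [String] -> String -> IO SCont
newScheduledSCont runQueue name = do
  sc <- newToySCont name
  -- unblockThread: enqueue the given thread at the back of this scheduler.
  setUnblockThreadClosure sc $ \t -> modifyIORef runQueue (++ [scontName t])
  -- A real switchToNext would also transfer control; the toy only dequeues.
  setSwitchToNextClosure sc $ \_status -> modifyIORef runQueue (drop 1)
  pure sc

-- What the RTS would do: look up a thread's handler and apply it to a target.
upcallUnblock :: SCont -> SCont -> IO ()
upcallUnblock owner target = do
  handler <- readIORef (unblockHandler owner)
  handler target

demo :: IO [String]
demo = do
  q <- newIORef []
  a <- newScheduledSCont q "a"
  helper <- newToySCont "helper"
  -- The helper inherits a's scheduler by copying a's unblock handler.
  readIORef (unblockHandler a) >>= setUnblockThreadClosure helper
  upcallUnblock a a
  upcallUnblock helper helper
  readIORef q
</haskell>

Loading this in GHCi and running <hask>demo</hask> shows both threads ending up on the same run queue, which is the point of passing the SCont explicitly to <hask>unblockThread</hask>.
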
=== Interaction with GC ===

In the vanilla GHC implementation, each capability maintains a '''list of runnable Haskell threads'''. Each generation in the GC also maintains a list of threads belonging to that generation. At the end of a generational collection, threads that survive are promoted to the next generation. Whenever a new thread is created, it is added to generation 0's thread list. During a GC, threads are classified into three categories:

* '''Runnable threads:''' Threads that are on the runnable queues. These are considered to be GC roots.
* '''Reachable threads:''' Threads that are reachable from runnable threads. These threads might be blocked (on MVars, STM actions, etc.), complete or killed.
* '''Unreachable threads:''' Threads that are unreachable. Unreachable threads might be blocked, complete or killed.

At the end of a GC, all unreachable threads that are blocked are prepared with a <hask>BlockedIndefinitely</hask> exception and added to their capability's run queue. Note that complete and killed reachable threads survive a collection along with runnable threads, since asynchronous exceptions can still be invoked on them.

In the lightweight concurrency implementation, each capability has just a '''single runnable thread'''. Each generation still maintains a list of threads belonging to that generation. During a GC, threads are classified into reachable and unreachable. The RTS knows whether a thread is blocked or complete since this is made explicit in the [[Lightweight_concurrency#Substrate_primitives|switch]] primitive.

==== Problem 1 - Indefinitely blocked threads ====

In the LWC implementation, since there is no notion of a runnable queue of threads for a capability, how do we raise the <hask>BlockedIndefinitely</hask> exception? We need to distinguish between a thread blocked on an unreachable concurrent data structure and one blocked on an unreachable scheduler.
The programmer makes this distinction explicit through the thread status argument of the [[Lightweight_concurrency#Substrate_primitives|context switch]] primitives.

===== Blocked on an unreachable concurrent data structure =====

If the MVar is unreachable, the scheduler might still be reachable, and some runnable thread is potentially waiting to pull work off this scheduler. A thread blocked on an unreachable MVar will be blocked with thread status <hask>BlockedOnConcDS</hask>. In this case, we can prepare the blocked thread for raising the asynchronous exception as we do in the vanilla implementation. Subsequently, the RTS need only evaluate the blocked thread's unblock action, which will enqueue the blocked thread on its scheduler. But on which thread do we execute the unblock action? In the LWC implementation, each capability only ever has one thread in its run queue.

The solution proposed here is similar to finalizer invocations. We create an array of IO () actions with the following structure:

<haskell>
[unblock_currentThread, unblock_t0, unblock_t1, ..., unblock_tn, switchToNext_currentThread]
</haskell>

where unblock_t0 to unblock_tn correspond to the <hask>unblockThread</hask> up-calls of threads t0 to tn, which are being resurrected with the <hask>BlockedIndefinitelyOnConcDS</hask> exception. unblock_currentThread and switchToNext_currentThread correspond to the <hask>unblockThread</hask> and <hask>switchToNext</hask> up-calls of the (only) thread currently on this capability. Next, we create a helper thread with the following closure applied to the array constructed previously.

<haskell>
rtsSchedulerBatchIO :: Array (IO ()) -> IO ()
</haskell>

When given an array of IO () actions, <hask>rtsSchedulerBatchIO</hask> performs each IO action one by one. The net effect of executing the new thread is to add the resurrected threads to their corresponding schedulers and to wake up the original thread that was running on this capability.
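
As an illustration only, the batch runner can be modelled in ordinary Haskell. The sketch below assumes <hask>Data.Array</hask> indexed by <hask>Int</hask> in place of whatever array type the RTS would use, and the up-calls are replaced by hypothetical logging actions that record the order in which they run.

<haskell>
import Data.Array (Array, elems, listArray)
import Data.IORef

-- Assumption: a Data.Array indexed by Int stands in for the RTS-level array.
-- Runs every action in index order and discards the results.
rtsSchedulerBatchIO :: Array Int (IO ()) -> IO ()
rtsSchedulerBatchIO = sequence_ . elems

demo :: IO [String]
demo = do
  logRef <- newIORef []
  -- Each hypothetical "up-call" only appends its own name to the log.
  let step label = modifyIORef logRef (++ [label])
      batch =
        [ step "unblock_currentThread"
        , step "unblock_t0"
        , step "unblock_t1"
        , step "switchToNext_currentThread"
        ]
  rtsSchedulerBatchIO (listArray (0, length batch - 1) batch)
  readIORef logRef
</haskell>

Running <hask>demo</hask> in GHCi returns the labels in array order: the current thread is re-enqueued first, then the resurrected threads, and only then does the helper switch away.
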
The newly created thread inherits the scheduler of the thread that was running on this capability. This is done by copying the up-call handlers. This is necessary since the newly created helper thread might also get blocked due to PTM actions, blackholes, etc.

===== Blocked on an unreachable scheduler =====

This case is a bit tricky. If a thread is blocked on an unreachable scheduler, we need to find a scheduler for this thread to execute on. But which scheduler? The RTS does not know about any other user-level schedulers. We might fall back to vanilla GHC's solution here, which is to prepare the blocked thread for an asynchronous exception and add it to the current capability's queue of threads blocked on a scheduler. At the end of the GC, the RTS first raises the BlockedIndefinitelyOnScheduler exception on all the threads blocked on a scheduler, and finally switches to the actual computation (the current thread). This solution is not ideal since we do not eliminate schedulers completely from the RTS.

==== Problem 2 - Detecting deadlock ====

In the vanilla implementation, whenever the RTS finds that there are no runnable threads, a GC is forced, which might release the deadlock. This happens because any indefinitely blocked threads will be woken up with asynchronous exceptions. In the LWC implementation, how would the runtime distinguish between a scheduler that is actively spinning, looking for more work, and a thread that is executing? There might be multiple schedulers running on a capability, and no one scheduler might know that all schedulers on the capability are waiting for work. It might not be a good idea to trigger a GC whenever a scheduler runs out of work.

''' Proposal 1 '''

Every capability keeps a count of SConts spawned as schedulers, and a count of empty schedulers. When these counts become equal, a GC is triggered. If no threads are unblocked by this GC, then we are really deadlocked.
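
The counting rule of Proposal 1 can be sketched as a pure function. The record and function names below are hypothetical, and the extra guard that at least one scheduler exists is an added assumption (without it, a capability with no schedulers would trivially satisfy the equality).

<haskell>
-- Hypothetical per-capability counters for Proposal 1.
data SchedCounts = SchedCounts
  { spawnedSchedulers :: Int  -- SConts spawned as schedulers
  , emptySchedulers   :: Int  -- schedulers currently out of work
  } deriving (Eq, Show)

-- Events a scheduler would report to the RTS.
schedulerSpawned, becameEmpty, foundWork :: SchedCounts -> SchedCounts
schedulerSpawned c = c { spawnedSchedulers = spawnedSchedulers c + 1 }
becameEmpty      c = c { emptySchedulers   = emptySchedulers c + 1 }
foundWork        c = c { emptySchedulers   = emptySchedulers c - 1 }

-- Trigger a GC when every spawned scheduler is empty.
-- The "> 0" guard is an assumption that is not part of the proposal text.
shouldTriggerGC :: SchedCounts -> Bool
shouldTriggerGC c =
  spawnedSchedulers c > 0 && emptySchedulers c == spawnedSchedulers c

-- Replays a sequence of events and reports the decision after each state,
-- starting with the initial state.
demo :: [Bool]
demo =
  map shouldTriggerGC
    (scanl (flip ($)) (SchedCounts 0 0)
       [schedulerSpawned, schedulerSpawned, becameEmpty, becameEmpty, foundWork])
</haskell>

In <hask>demo</hask>, the only state that asks for a GC is the one where both schedulers have reported themselves empty; the next event, <hask>foundWork</hask>, clears it again.
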
There is a possibility of false positives with this scheme, since a scheduler might be slow to let the RTS know that it has in fact found work. How do we deal with such momentary false positives?

''' Proposal 2 '''

Treat the first Haskell thread (proto_thread) created on any capability as a special thread whose only job is to create threads to execute work. It enqueues itself on the scheduler it creates. But none of the threads on the scheduler will switch to this proto_thread unless there is nothing else to switch to. The proto_thread, when resumed, will force a GC. However, this solution assumes there is a single scheduler data structure at the lowest level per capability.

A capability might not really be deadlocked, since work might be generated by other cores. For example, MVar synchronization might release threads that will be enqueued on schedulers on this capability. Should performing a GC on a deadlock be a global property of all capabilities?

=== Bound threads ===

Bound threads [http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Concurrent.html#g:9] are ''bound'' to operating system threads (tasks), and only the task to which a Haskell thread is bound can run it. In the vanilla implementation, bound threads are created using the <hask>forkOS</hask> primitive, which creates a new task to run the bound thread. A bound thread and its bound task maintain the invariant that if the bound thread is running, it is running on its bound task, and when the bound thread is blocked, its bound task is suspended. When a bound thread is resumed, the RTS checks whether the current task is its bound task. If not, the current task is suspended, and the current capability is passed to the bound task, which resumes the bound thread. Thus, bound threads are handled transparently from the programmer's point of view, and the programmer never sees the tasks.

'''LWC implementation'''

We would like to have the same interface in the LWC implementation.
Bound threads can be created with the new substrate primitive:

<haskell>
newBoundSCont :: IO () -> IO SCont
</haskell>

which has the same type signature as <hask>newSCont</hask>. But unlike <hask>newSCont</hask>, a new OS thread (a task) is created in a suspended state and bound to the new thread. During a user-level context switch, the target thread is assigned as the current capability's thread and control is returned to the RTS scheduler loop. The RTS scheduler loop, as it does in the vanilla implementation, takes care of passing capabilities between tasks, creating worker tasks, etc., if control switches to or from a bound thread.

=== Safe foreign calls ===

Unlike an unsafe foreign call, a safe foreign call does not impede the execution of other Haskell threads on the same scheduler if the foreign call blocks. A safe foreign call is typically more expensive than its unsafe counterpart since it ''potentially'' involves switching between Haskell threads. At the very least, a safe foreign call involves releasing and re-acquiring a capability.

==== Anatomy of a safe foreign call ====

Every capability, among other things, has a list of tasks (returning_tasks) that have completed their safe foreign call. The following are the steps involved in invoking a safe foreign call:

* Before the foreign call, release the current capability to another worker task.
* Perform the foreign call.
* Add the current task to the returning_tasks list.
* Reacquire the capability.
* Resume execution of the Haskell thread.

The first action performed by the worker task that acquired the capability is to check whether returning_tasks is non-empty. If so, the worker yields the capability to the first task in the returning_tasks list (fast path). Otherwise, the worker proceeds to run the next task from the run queue (slow path). Thus, in the fast path, the Haskell thread never switches.
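
The worker's decision described above can be written as a small pure function. This is only a sketch of the control flow: tasks and run-queue entries are modelled as names, and the <hask>WorkerAction</hask> type, including the <hask>NothingToRun</hask> case for an empty run queue, is hypothetical.

<haskell>
-- Hypothetical result of the check a worker performs after acquiring
-- a capability.
data WorkerAction
  = YieldCapabilityTo String  -- fast path: hand over to a returning task
  | RunNextFromQueue String   -- slow path: run the next run-queue entry
  | NothingToRun              -- assumption: both lists are empty
  deriving (Eq, Show)

-- First argument: returning_tasks; second argument: the run queue.
workerStep :: [String] -> [String] -> WorkerAction
workerStep (task : _) _          = YieldCapabilityTo task
workerStep []         (next : _) = RunNextFromQueue next
workerStep []         []         = NothingToRun
</haskell>

Note that a non-empty returning_tasks list always wins over the run queue, which is why the Haskell thread never switches on the fast path.
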
==== Problem ====

In the LWC implementation, the worker does not have a reference to the scheduler from which to pick the next task. For the same reason, when the task returns from the foreign call, it needs to know what to do with the Haskell thread: whether to switch to it (fast path) or add it to the scheduler data structure, to which it does not have a reference. Even if the RTS had a reference to the scheduler data structure, it would have to be implemented in such a way that it is operable by both C and Haskell code.

==== Proposal ====

We might build on top of [[Lightweight_concurrency#Up-call_handlers|up-call handlers]].

* Before the foreign call, release the current capability to a worker, along with its switchToNext closure.
* Perform the foreign call.
* Add the current task to the returning_tasks list.
* Reacquire the capability.
* If on the fast path (i.e., the worker did not get the capability), resume execution of the Haskell thread.
* Otherwise (slow path), execute the unblockThread up-call to enqueue the Haskell thread on the scheduler.

At the worker:

* Try to acquire the capability.
* If the returning_tasks list is not empty, yield the capability to the task at the head of the list (fast path).
* Otherwise (slow path), execute the switchToNext closure, which will switch control to the next Haskell thread.
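
A rough sketch of the two sides of this proposal follows, with the up-calls modelled as opaque IO actions that only log their own name. The <hask>Upcalls</hask> record, <hask>workerSide</hask>, <hask>returningSide</hask> and the log are hypothetical names introduced for illustration; nothing here touches real capabilities or tasks.

<haskell>
import Data.IORef

-- Hypothetical bundle of the two up-calls of the thread making the call.
data Upcalls = Upcalls
  { switchToNextUpcall  :: IO ()
  , unblockThreadUpcall :: IO ()
  }

-- Up-calls that only record that they ran.
loggingUpcalls :: IORef [String] -> Upcalls
loggingUpcalls logRef = Upcalls
  { switchToNextUpcall  = modifyIORef logRef (++ ["switchToNext"])
  , unblockThreadUpcall = modifyIORef logRef (++ ["unblockThread"])
  }

-- Worker: yield to a returning task if there is one, else run switchToNext.
workerSide :: IORef [String] -> Upcalls -> [String] -> IO ()
workerSide logRef ups returningTasks = case returningTasks of
  task : _ -> modifyIORef logRef (++ ["yield capability to " ++ task])
  []       -> switchToNextUpcall ups

-- Returning task: resume directly on the fast path, else re-enqueue the
-- Haskell thread through its unblockThread up-call.
returningSide :: IORef [String] -> Upcalls -> Bool -> IO ()
returningSide logRef ups onFastPath
  | onFastPath = modifyIORef logRef (++ ["resume Haskell thread"])
  | otherwise  = unblockThreadUpcall ups

demo :: IO [String]
demo = do
  logRef <- newIORef []
  let ups = loggingUpcalls logRef
  -- Slow path: no task has returned yet, so the worker switches threads,
  -- and the returning task later re-enqueues its Haskell thread.
  workerSide logRef ups []
  returningSide logRef ups False
  -- Fast path: a task is already waiting in returning_tasks.
  workerSide logRef ups ["task1"]
  returningSide logRef ups True
  readIORef logRef
</haskell>

The design point the sketch tries to make visible is that neither side needs a reference to the scheduler data structure: the worker only holds the <hask>switchToNext</hask> closure it was handed, and the returning task only runs its own <hask>unblockThread</hask> up-call.
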