Thread-local storage

From HaskellWiki

No facility for thread-local storage exists yet in any Haskell compiler.

If we override 'fork', then we can implement a thread-local storage facility using 'ThreadId'. The following implementation uses one variable per type:

http://www.cs.helsinki.fi/u/ekarttun/haskell/TLS/TLSVar.hs
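
As a rough illustration of the ThreadId-based idea, here is a minimal sketch assuming a single thread-local variable of type Int. The names counterTable, getCounter, setCounter and forkWithTLS are hypothetical and are not taken from TLSVar.hs. Letting the child inherit the parent's value is likewise an assumption of this sketch.

```haskell
import Control.Concurrent (ThreadId, forkIO, myThreadId)
import Control.Concurrent.MVar
import Control.Exception (finally)
import qualified Data.Map as M
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical table holding one Int per thread (one table per variable).
{-# NOINLINE counterTable #-}
counterTable :: MVar (M.Map ThreadId Int)
counterTable = unsafePerformIO (newMVar M.empty)

-- Read the calling thread's value; threads without an entry see 0.
getCounter :: IO Int
getCounter = do
    tid <- myThreadId
    M.findWithDefault 0 tid <$> readMVar counterTable

-- Set the calling thread's value without affecting other threads.
setCounter :: Int -> IO ()
setCounter v = do
    tid <- myThreadId
    modifyMVar_ counterTable (return . M.insert tid v)

-- Replacement for forkIO: the child starts with the parent's value
-- (an assumption of this sketch) and its entry is removed when it ends.
forkWithTLS :: IO () -> IO ThreadId
forkWithTLS act = do
    v <- getCounter
    forkIO $ (setCounter v >> act) `finally` cleanup
  where
    cleanup = do
        tid <- myThreadId
        modifyMVar_ counterTable (return . M.delete tid)
```

Threads started with the ordinary forkIO never get an entry and so see the default, which is why the approach only works if 'fork' is overridden everywhere.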

However, if many people are going to use thread-local storage then it would be best to have a standard implementation, for library compatibility.

Simon Marlow and Simon Peyton-Jones have both expressed support for some form of thread-local storage in the standard libraries.[1][2]

Proposal 1 ('threadlocal')

Robert Dockins has put forward a proposal[3] that deals specifically with initialization, which is important if the TLS is used for state.

Proposal 2

Frederik Eaton posted an example API to the Haskell mailing list [4]. It depends on two new functions 'withParams' and 'getParams'.

(new version 2006/8/9)

module Fu.IOParam (
    forkIO,
    newIOParam,
    withIOParam,
    getIOParam,
    modifyIOParam,
    _getParams,
    _withParams
) where

import qualified Data.Map as M
import Data.Maybe
import Data.Unique
import Data.IORef
import Data.Typeable
import Data.Dynamic
import qualified Control.Concurrent as CC
import Control.Concurrent hiding (forkIO)
import Control.Concurrent.MVar
import Control.Exception
import Control.Monad
import System.IO.Unsafe

type IOParam a = IORef (Unique, a)

newIOParam :: Typeable a => a -> IO (IOParam a)
newIOParam def = do
    k <- newUnique
    newIORef (k,def)

withIOParam :: Typeable v => IOParam v -> v -> IO a -> IO a
withIOParam p value act = do
    (k,def) <- readIORef p
    m <- _getParams
    _withParams (M.insert k (toDyn value) m) act

getIOParam :: Typeable v => IOParam v -> IO v
getIOParam p = do
    (k,def) <- readIORef p
    m <- _getParams
    let vm = liftM (flip fromDyn $ (error "internal error in IOParam")) $ M.lookup k m
    return $ fromMaybe def vm

-- | convenience function
modifyIOParam :: Typeable v => IOParam v -> (v -> v) -> IO a -> IO a
modifyIOParam p fm act = do
    v <- getIOParam p
    let v' = fm v
    withIOParam p v' act

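-- | Like 'Control.Concurrent.forkIO', but the child thread starts with
-- a copy of the parent's parameter bindings.
forkIO :: IO () -> IO ThreadId
forkIO act = do
  p <- _getParams
  CC.forkIO $ _withParams p act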

----------------------------------------------------------------
-- non-exposed
type ParamsMap = M.Map Unique Dynamic

_withParams :: ParamsMap -> IO a -> IO a
_withParams newmap act = do
    myid <- myThreadId
    t <- readMVar threadMap
    let oldmap = fromMaybe (M.empty) (M.lookup myid t)
    modifyMVar_ threadMap (return . M.insert myid newmap)
    act `finally`
      modifyMVar_ threadMap (\tm ->
        if M.null oldmap then
          return $ M.delete myid tm
        else
          return $ M.insert myid oldmap tm)

_getParams :: IO ParamsMap
_getParams = do
    myid <- myThreadId
    t <- readMVar threadMap
    return $ fromMaybe (M.empty) (M.lookup myid t)

{-# NOINLINE threadMap #-}
threadMap :: MVar (M.Map ThreadId ParamsMap)
threadMap = unsafePerformIO $ do
--    tid <- myThreadId
--    newMVar $ M.singleton tid M.empty
    newMVar $ M.empty
----------------------------------------------------------------

Comments (feel free to delete)

  • Is it possible to set default values for threads where the IOParam was not set?
    I'm not sure what this means. You can set a value to undefined or error .... You can change the default for a group of threads by setting up new values with withIOParam and then forking threads from this context.
  • Is it possible to have a value in TLS that is not Typeable?
    No, but I think everything should be Typeable.
  • Would the idea be to use unsafePerformIO to create the Uniques for top-level keys?
    Please clarify.

Comparison to Proposal 3

From the mailing list: [5]

The main difference between my and your proposals, as I see it, is
that your proposal is based on "keys" which can be used for other
things.

I think that leads to an interface which is less natural. In my
proposal, the IOParam type is quite similar to an IORef - it has a
user-specified initial state, and the internal implementation is
hidden from the user - yours differs in both of these aspects.

...

> *  A key issue is this: when forking a thread, does the new thread
> inherit the current thread's bindings, or does it get a
> freshly-initialised set.  Sometimes you want one, sometimes the other,
> alas.

I think the inheritance semantics are more useful and also more
general: If I wanted a freshly-initialized set of bindings, and I only
had inheritance semantics, then I could start a thread early on when
all the bindings are in their initial state, and have this thread read
actions from a channel and execute them in sub-threads of itself, and
implement a 'fork' variant based on this. More generally, I could do
the same thing from a sub-thread of the main thread - I could start a
thread with any set of bindings, and use it to launch other threads
with those bindings. In this way, the "initial" set of bindings is not
specially privileged over intermediate sets of bindings.
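
The launcher-thread idea in the quoted message can be sketched as follows. This is only an illustration under assumptions: newLauncher is a hypothetical name, and the 'fork' argument stands for whatever inheriting fork is in use (for example the forkIO of Proposal 2). With the plain Control.Concurrent.forkIO there are no bindings to inherit, so the sketch only shows the structure.

```haskell
import Control.Concurrent
import Control.Monad (forever)

-- | Start a launcher thread from the current context and return a
-- fork-like function. Every action passed to that function is forked
-- by the launcher, so under inheritance semantics it would see the
-- bindings that were in effect when the launcher was created.
newLauncher :: (IO () -> IO ThreadId) -> IO (IO () -> IO ())
newLauncher fork = do
    requests <- newChan
    _ <- fork $ forever $ do
        act <- readChan requests   -- wait for the next action
        fork act                   -- run it in a sub-thread of the launcher
    return (writeChan requests)
```

Calling newLauncher early in the program, while all bindings are still in their initial state, gives a 'fork' variant with fresh-bindings semantics. Unlike forkIO, the returned function does not hand back the ThreadId of the new thread.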

Proposal 3

Simon Peyton-Jones gave another proposal:

* The thoughts that Simon and I were considering about thread-local state
are quite close to Robert's proposal.  For myself, I am somewhat
persuaded that some form of implicitly-passed state in the IO monad
(without explicit parameters) is useful.   Examples I often think of are
        - Allocating unique identifiers
        - Making random numbers
        - Where stdin and stdout should go
In all of these cases, a form of dynamic binding is just what we want:
send stdout to the current thread's stdout, use the current thread's
random number seed, etc.

* There's no need to connect it to *state*.  The key top-level thing you
need is to allocate what Adrian Hey calls a "thing with identity".
http://www.haskell.org/hawiki/GlobalMutableState.
I'll call it a key.  For example, rather than a 'threadlocal'
declaration, one might just have:

        newkey foo :: Key Int

where 'newkey' is the keyword; this declares a new key with type (Key Int),
distinct from all other keys.

Now you can imagine that the IO monad could provide operations
        withBinding :: Key a -> a -> IO b -> IO b
        lookupBinding :: Key a -> IO a

very much like the dynamic-binding primitives that have popped up on
this thread.

* If you want *state*, you can have a (Key (IORef Int)).  Now you look
up the binding to get an IORef (or MVar, whatever you like) and you can
mutate that at will.  So this separates a thread-local *environment*
from thread-local *state*.

* Keys may be useful for purposes other than withBinding and
thread-local state.  One would also want to dynamically create new keys:
        newKey :: IO (Key a)

* I agree with Robert that a key issue is initialisation.  Maybe it
should be possible to associate an initialiser with a key.  I have not
thought this out.

*  A key issue is this: when forking a thread, does the new thread
inherit the current thread's bindings, or does it get a
freshly-initialised set.  Sometimes you want one, sometimes the other,
alas.
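
To make the key-based interface more concrete, here is a library-level sketch. It is not the proposal itself, which imagines support inside the IO monad and a 'newkey' declaration. The sketch assumes a Typeable constraint and a global table indexed by ThreadId (both borrowed from Proposal 2), and it assumes that looking up an unbound key raises an IO error, since the proposal leaves initialisation open. The names table, currentBindings and demo are hypothetical.

```haskell
import Control.Concurrent (ThreadId, myThreadId)
import Control.Concurrent.MVar
import Control.Exception (finally)
import Data.Dynamic (Dynamic, fromDynamic, toDyn)
import Data.IORef
import qualified Data.Map as M
import Data.Typeable (Typeable)
import Data.Unique (Unique, newUnique)
import System.IO.Unsafe (unsafePerformIO)

-- A key is a "thing with identity"; the type parameter is phantom.
newtype Key a = Key Unique

newKey :: IO (Key a)
newKey = Key <$> newUnique

type Bindings = M.Map Unique Dynamic

{-# NOINLINE table #-}
table :: MVar (M.Map ThreadId Bindings)
table = unsafePerformIO (newMVar M.empty)

currentBindings :: IO Bindings
currentBindings = do
    tid <- myThreadId
    M.findWithDefault M.empty tid <$> readMVar table

-- Run an action with the key bound, restoring the old bindings afterwards.
withBinding :: Typeable a => Key a -> a -> IO b -> IO b
withBinding (Key u) x act = do
    tid <- myThreadId
    old <- currentBindings
    modifyMVar_ table (return . M.insert tid (M.insert u (toDyn x) old))
    act `finally` modifyMVar_ table (return . M.insert tid old)

-- Assumption of this sketch: an unbound key is reported as an IO error.
lookupBinding :: Typeable a => Key a -> IO a
lookupBinding (Key u) = do
    bs <- currentBindings
    case M.lookup u bs >>= fromDynamic of
        Just x  -> return x
        Nothing -> ioError (userError "lookupBinding: key is unbound")

-- Thread-local *state* layered on the *environment*: bind an IORef.
demo :: IO Int
demo = do
    k <- newKey :: IO (Key (IORef Int))
    ref <- newIORef 0
    withBinding k ref $ do
        r <- lookupBinding k
        modifyIORef r (+ 1)
        readIORef r
```

The demo binds a key to an IORef and mutates it through the binding, which is the separation of environment from state described above. The sketch does not address what a forked thread should inherit.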

Comments

Keys seem like a mechanism that could help with many things, e.g. the implementation of Typeable. It might be wise to require keys to be monomorphically typed to solve the polymorphic references problem.

Change "Key" to "IODynamicRef" and it's my proposal for dynamic binding. Calling it a "Key" is just funny to me. -- Taral 16:09, 8 August 2006 (UTC)

Monomorphism

It would be nice for TLS not to act as unsafeCoerce#. This means that TLS variables should be monomorphic. The current status is:

  • TLSVar code - safe
  • Proposal 1 - unsafe like global polymorphic IORefs
  • Proposal 2 - safe with a runtime check
  • Proposal 3 - same as proposal 1.
If the top-level declaration (newkey) was required to be monomorphic, would that break anything? -- Taral 16:12, 8 August 2006 (UTC)
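
The "runtime check" attributed to Proposal 2 comes from storing values as Dynamic. A small sketch of that check, using the hypothetical names stored, asInt and asBool:

```haskell
import Data.Dynamic (Dynamic, fromDynamic, toDyn)

-- A value stored the way Proposal 2 stores bindings: wrapped in Dynamic.
stored :: Dynamic
stored = toDyn (42 :: Int)

-- Reading it back at the type it was stored at succeeds.
asInt :: Maybe Int
asInt = fromDynamic stored    -- Just 42

-- Reading it back at any other type fails safely instead of coercing.
asBool :: Maybe Bool
asBool = fromDynamic stored   -- Nothing
```

A polymorphic top-level reference without such a check would instead let a value be written at one type and read at another, which is the unsafeCoerce# behaviour the list above is concerned with.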

Cons of Thread local storage

Einar Karttunen expressed concern that extensive use of thread-local storage might cause problems with libraries that run actions in thread pools. He suggested that it would be better to define monads which contain all of the contextual state [6]. Frederik Eaton pointed out that in many reasonable designs, an approach which carries state in custom monads requires code which is quadratic in the number of layers of context [7]. Einar Karttunen also suggested a function which would solve the thread pool problem:

-- | Tie all TLS references in the IO action to the current
-- environment rather than the environment in which it will
-- actually be executed.
tieToCurrentTLS :: IO a -> IO (IO a)
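
One possible reading of this function is sketched below. It is a sketch under assumptions, not Einar Karttunen's implementation: the TLS environment is modelled as a per-thread map of string bindings, and Env, envTable, currentEnv, withEnv and runInWorker are hypothetical names. With the Proposal 2 code, the same idea would amount to capturing _getParams and wrapping the action in _withParams.

```haskell
import Control.Concurrent (ThreadId, forkIO, myThreadId)
import Control.Concurrent.MVar
import Control.Exception (finally)
import qualified Data.Map as M
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical stand-in for a TLS environment: named string bindings.
type Env = M.Map String String

{-# NOINLINE envTable #-}
envTable :: MVar (M.Map ThreadId Env)
envTable = unsafePerformIO (newMVar M.empty)

currentEnv :: IO Env
currentEnv = do
    tid <- myThreadId
    M.findWithDefault M.empty tid <$> readMVar envTable

-- Run an action with the given environment installed for this thread.
withEnv :: Env -> IO a -> IO a
withEnv env act = do
    tid <- myThreadId
    old <- currentEnv
    modifyMVar_ envTable (return . M.insert tid env)
    act `finally` modifyMVar_ envTable (return . M.insert tid old)

-- Capture the caller's environment now; reinstall it wherever the
-- returned action is eventually run.
tieToCurrentTLS :: IO a -> IO (IO a)
tieToCurrentTLS act = do
    env <- currentEnv
    return (withEnv env act)

-- Run an action on a fresh thread, standing in for a thread-pool worker.
runInWorker :: IO a -> IO a
runInWorker act = do
    box <- newEmptyMVar
    _ <- forkIO (act >>= putMVar box)
    takeMVar box
```

An action tied inside withEnv and later passed to runInWorker sees the bindings of the thread that tied it, while an untied action sees the worker's own (here empty) environment.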