Difference between revisions of "Global keys"

From HaskellWiki
Jump to navigation Jump to search
(categorise)
 
(3 intermediate revisions by 2 users not shown)
Line 9: Line 9:
 
As Brian Hulley put it, at the moment, there is a strange unnatural discrepancy between the fixed set of built-in privileged operations such as Data.Unique.newUnique which are "allowed" to make use of global state, and user defined operations which have to rely on a shaky hack in order to preserve natural abstraction barriers between components such as a user-defined Unique, Atom, and anything involving memoisation or device management etc.
 
As Brian Hulley put it, at the moment, there is a strange unnatural discrepancy between the fixed set of built-in privileged operations such as Data.Unique.newUnique which are "allowed" to make use of global state, and user defined operations which have to rely on a shaky hack in order to preserve natural abstraction barriers between components such as a user-defined Unique, Atom, and anything involving memoisation or device management etc.
   
The kind of applications we have in mind are:
+
The kind of applications we have in mind (please add more) are:
 
* A source of random numbers, or of unique numbers. This should be on a per-thread basis.
 
* A source of random numbers, or of unique numbers. This should be on a per-thread basis.
* The value of 'stdin' or 'stdout'. We don't want to mutate this, but we might want to set the value for sub-computations, including any spawned threads.
+
* The value of 'stdin' or 'stdout'. We don't want to mutate this (although note that the handle itself <em>contains</em> mutable state), but we might want to set the value for sub-computations, including any spawned threads.
   
 
== A straw man ==
 
== A straw man ==
   
  +
The basic idea is this:
Here's a straw-man proposal.
 
  +
* Each thread has access (via the IO or STM monad) to a "dictionary" that maps a (typed) key to a value of that type.
* New top-level declaration to allocate a new <strong>key</strong>
 
  +
* The dictionary is not mutable, but the values might themselves be mutable cells (think of 'stdin').
  +
  +
(I don't know how to put code inside a bullet, and carry on the bullet afterwards, so the formatting below is a mess.)
  +
  +
More concretely:
  +
* A new built-in data type, <code>Key a</code>, an instance of Eq, but not Ord. It's a "thing with identity" (TWI).
  +
* A new built-in data type of dictionaries, <code>Dict</code>, which maps a typed key to a typed value:
  +
<haskell>
  +
lookup :: Dict -> Key a -> Maybe a
 
insert :: Dict -> Key a -> a -> Dict
  +
union :: Dict -> Dict -> Dict
  +
etc
  +
</haskell>
  +
Yes, it'll need a massive interface, like Data.Map.
  +
* You can allocate a fresh (globally unique) key in the IO monad
  +
<haskell>
  +
newKey :: IO (Key a)
  +
</haskell>
 
* However, we add a new top-level declaration to allocate new top-level key:
 
<haskell>
 
<haskell>
 
key rng :: Key StdGen
 
key rng :: Key StdGen
 
</haskell>
 
</haskell>
  +
This is useful independent of mutability. For example, GHC has lots of code going
A key is globally unique in a program. It supports equality but not ordering.
 
* You can read and write global state and local state:
 
 
<haskell>
 
<haskell>
  +
thenIdKey = mkPreludeIdKey 3
readLS, readGS :: Key a -> IO a
 
  +
returnIdName = mkPreludeIdKey 4
writeLS, writeGS :: Key a -> a -> IO a
 
  +
</haskell>
</haskell> The distinction between the two is that there is one global state, shared between all threads, but each thread has its own local state.
 
  +
etc, where we hand out unique identifiers by hand. Easy to screw up.
* Keys have equality but not ordering. We probably want to provide efficient finite maps on keys.
 
  +
* Each thread has an implicit, thread-specific dictionary:
  +
<haskell>
  +
getDict :: IO Dict
 
setDict :: Dict -> IO a -> IO a
  +
</haskell>
  +
(Think of myThreadId.) Notice that setDict does not mutate the dictionary;
  +
it just sets the implicit dictionary for a nested sub-computation.
  +
* A forked thread inherits its parent's dictionary.
  +
  +
== Initialisation ==
  +
  +
Initialisation is the big question here.
  +
A library may want to allocate a key, and an initialiser to be run the first time the key is accessed, without the client of the library needing to know about it at all. A second issue that we may sometimes want to implicitly (?) re-initialise (or clear) all or part of the dictionary when forking a thread.
   
  +
[[Category:Idioms]]
Open questions:
 
* When you forkIO a thread, what (if anything) does the forked thread inherit?
 

Latest revision as of 10:02, 17 September 2006

Introduction

This page is meant to contain discussion about global state and "things with identity". It's a companion to Adrian's http://www.haskell.org/hawiki/GlobalMutableState, but that's in the old Wiki, so I thought it better to start a new page.

See also Ian Stark's ACIO message http://www.haskell.org/pipermail/haskell-cafe/2004-November/007664.html.

Examples

As Brian Hulley put it, at the moment, there is a strange unnatural discrepancy between the fixed set of built-in privileged operations such as Data.Unique.newUnique which are "allowed" to make use of global state, and user defined operations which have to rely on a shaky hack in order to preserve natural abstraction barriers between components such as a user-defined Unique, Atom, and anything involving memoisation or device management etc.

The kind of applications we have in mind (please add more) are:

  • A source of random numbers, or of unique numbers. This should be on a per-thread basis.
  • The value of 'stdin' or 'stdout'. We don't want to mutate this (although note that the handle itself contains mutable state), but we might want to set the value for sub-computations, including any spawned threads.

A straw man

The basic idea is this:

  • Each thread has access (via the IO or STM monad) to a "dictionary" that maps a (typed) key to a value of that type.
  • The dictionary is not mutable, but the values might themselves be mutable cells (think of 'stdin').

(I don't know how to put code inside a bullet, and carry on the bullet afterwards, so the formatting below is a mess.)

More concretely:

  • A new built-in data type, Key a, an instance of Eq, but not Ord. It's a "thing with identity" (TWI).
  • A new built-in data type of dictionaries, Dict, which maps a typed key to a typed value:
  lookup :: Dict -> Key a -> Maybe a
  insert :: Dict -> Key a -> a -> Dict
  union  :: Dict -> Dict -> Dict
  etc

Yes, it'll need a massive interface, like Data.Map.

  • You can allocate a fresh (globally unique) key in the IO monad
  newKey :: IO (Key a)
  • However, we add a new top-level declaration to allocate new top-level key:
  key rng :: Key StdGen

This is useful independent of mutability. For example, GHC has lots of code going

  thenIdKey = mkPreludeIdKey 3
  returnIdName = mkPreludeIdKey 4

etc, where we hand out unique identifiers by hand. Easy to screw up.

  • Each thread has an implicit, thread-specific dictionary:
  getDict :: IO Dict
  setDict :: Dict  -> IO a -> IO a

(Think of myThreadId.) Notice that setDict does not mutate the dictionary; it just sets the implicit dictionary for a nested sub-computation.

  • A forked thread inherits its parent's dictionary.

Initialisation

Initialisation is the big question here. A library may want to allocate a key, and an initialiser to be run the first time the key is accessed, without the client of the library needing to know about it at all. A second issue that we may sometimes want to implicitly (?) re-initialise (or clear) all or part of the dictionary when forking a thread.