# Difference between revisions of "IO Semantics"

m (Replaced "to" with "is") |
(Rewritten to remove continuations.) |
||

Line 6: | Line 6: | ||

The idea is to define <hask>IO</hask> as | The idea is to define <hask>IO</hask> as | ||

<haskell> | <haskell> | ||

− | + | data IO a = Done a | |

+ | | PutChar Char (IO a) | ||

+ | | GetChar (Char -> IO a) | ||

</haskell> | </haskell> | ||

− | |||

− | |||

− | |||

− | |||

− | |||

− | |||

− | |||

− | |||

− | |||

− | |||

− | |||

− | |||

− | <hask> | + | For simplicity this an example of <hask>IO</hask> that only gives semantics for teletype IO. |

− | One interprets (or executes) an <hask> | + | |

+ | Think of <hask>IO a</hask> as a tree whose leaves are <hask>Done a</hask> that holds the result of the program. <hask>PutChar</hask> is a node that has one child tree and the node holds one character of data. <hask>GetChar</hask> is a node that has many children; it has one child for every <hask>Char</hask>, but <hask>GetChar</hask> holds no data itself. | ||

+ | |||

+ | This tree contains all the information needed to execute teletype interactions. | ||

+ | One interprets (or executes) an <hask>IO a</hask> by tracing a route from root of the tree to a leaf. | ||

+ | |||

+ | If a <hask>PutChar</hask> node is encountered, the character data contained at that node is output to the terminal and then its subtree is executed. It is at this point that Haskell code evaluated in order to determine what character should be displayed before continuing. If a <hask>GetChar</hask> node is encountered, a character is read from the terminal (blocking if necessary) and the subtree corresponding to the character received is executed. If <hask>Done</hask> is encountered the program ends. <hask>Done</hask> holds the result of the computation, but in the case of <hask>main :: IO ()</hask> the data is of type <hask>()</hask> and thus contains no information and is ignored. | ||

− | + | This execution is not done anywhere in a haskell program, rather it is done by the run-time system. | |

− | |||

− | |||

− | |||

− | |||

− | + | The monadic operations are defined as follows: | |

− | |||

− | |||

− | |||

<haskell> | <haskell> | ||

− | + | return :: a -> IO a | |

+ | return x = Done x | ||

+ | |||

+ | (>>=) :: IO a -> (a -> IO b) -> IO b | ||

+ | Done x >>= f = f x | ||

+ | PutChar c x >>= f = PutChar c (x >>= f) | ||

+ | GetChar g >>= f = GetChar (\c -> g c >>= f) | ||

</haskell> | </haskell> | ||

− | + | As you can see <hask>return</hask> is just another name for <hask>Done</hask>. The bind operation takes a tree <hask>x</hask> and a function <hask>f</hask> and replaces the <hask>Done</hask> nodes (the leaves) of <hask>x</hask> by a new tree produce by applying <hask>f</hask> to the data held in the <hask>Done</hask> nodes. | |

+ | The primitive IO commands are defined using these constructors. | ||

<haskell> | <haskell> | ||

putChar :: Char -> IO () | putChar :: Char -> IO () | ||

− | putChar = | + | putChar x = PutChar x (Done ()) |

getChar :: IO Char | getChar :: IO Char | ||

− | getChar = | + | getChar = GetChar (\c -> Done c) |

</haskell> | </haskell> | ||

+ | The function <hask>putChar</hask> builds a small <hask>IO ()</hask> tree that contains one <hask>PutChar</hask> node holding the character data followed by <hask>Done</hask>. | ||

+ | |||

+ | The function <hask>getChar</hask> builds a short <hask>IO Char</hask> tree that begins with a <hask>GetChar</hask> that holds one <hask>Done</hask> node holding every character. | ||

Other teletype commands can be defined in terms of these primitives | Other teletype commands can be defined in terms of these primitives | ||

Line 55: | Line 54: | ||

putStr = mapM_ putChar | putStr = mapM_ putChar | ||

</haskell> | </haskell> | ||

− | More generally speaking, <hask> | + | |

+ | More generally speaking, <hask>IO a</hask> will represent the desired interaction with the operating system. For every system call there will be a corresponding constructor in <hask>IOTree</hask> of the form | ||

<haskell> | <haskell> | ||

− | | SysCallName p1 p2 ... pn (r -> | + | | SysCallName p1 p2 ... pn (r -> IO a) |

</haskell> | </haskell> | ||

where <hask>p1</hask> ... <hask>pn</hask> are the parameters for the system call, and <hask>r</hask> is the result of the system call. (Thus <hask>PutChar</hask> and <hask>GetChar</hask> will not occur as constructors of <hask>IOTree</hask> if they don't correspond to system calls) | where <hask>p1</hask> ... <hask>pn</hask> are the parameters for the system call, and <hask>r</hask> is the result of the system call. (Thus <hask>PutChar</hask> and <hask>GetChar</hask> will not occur as constructors of <hask>IOTree</hask> if they don't correspond to system calls) | ||

− | |||

− | |||

− | |||

− | |||

− | |||

− | |||

− | |||

− | |||

− | |||

− | |||

− | |||

− |

## Revision as of 21:49, 8 May 2010

## Semantics of IO: A Continuation Approach

The following is inspired by Luke Palmer's post. This only describes one possible semantics of `IO a`

; your actually implementation may vary.

The idea is to define `IO`

as

```
data IO a = Done a
| PutChar Char (IO a)
| GetChar (Char -> IO a)
```

For simplicity this an example of `IO`

that only gives semantics for teletype IO.

Think of `IO a`

as a tree whose leaves are `Done a`

that holds the result of the program. `PutChar`

is a node that has one child tree and the node holds one character of data. `GetChar`

is a node that has many children; it has one child for every `Char`

, but `GetChar`

holds no data itself.

This tree contains all the information needed to execute teletype interactions.
One interprets (or executes) an `IO a`

by tracing a route from root of the tree to a leaf.

If a `PutChar`

node is encountered, the character data contained at that node is output to the terminal and then its subtree is executed. It is at this point that Haskell code evaluated in order to determine what character should be displayed before continuing. If a `GetChar`

node is encountered, a character is read from the terminal (blocking if necessary) and the subtree corresponding to the character received is executed. If `Done`

is encountered the program ends. `Done`

holds the result of the computation, but in the case of `main :: IO ()`

the data is of type `()`

and thus contains no information and is ignored.

This execution is not done anywhere in a haskell program, rather it is done by the run-time system.

The monadic operations are defined as follows:

```
return :: a -> IO a
return x = Done x
(>>=) :: IO a -> (a -> IO b) -> IO b
Done x >>= f = f x
PutChar c x >>= f = PutChar c (x >>= f)
GetChar g >>= f = GetChar (\c -> g c >>= f)
```

As you can see `return`

is just another name for `Done`

. The bind operation takes a tree `x`

and a function `f`

and replaces the `Done`

nodes (the leaves) of `x`

by a new tree produce by applying `f`

to the data held in the `Done`

nodes.

The primitive IO commands are defined using these constructors.

```
putChar :: Char -> IO ()
putChar x = PutChar x (Done ())
getChar :: IO Char
getChar = GetChar (\c -> Done c)
```

The function `putChar`

builds a small `IO ()`

tree that contains one `PutChar`

node holding the character data followed by `Done`

.

The function `getChar`

builds a short `IO Char`

tree that begins with a `GetChar`

that holds one `Done`

node holding every character.

Other teletype commands can be defined in terms of these primitives

```
putStr :: String -> IO ()
putStr = mapM_ putChar
```

More generally speaking, `IO a`

will represent the desired interaction with the operating system. For every system call there will be a corresponding constructor in `IOTree`

of the form

```
| SysCallName p1 p2 ... pn (r -> IO a)
```

where `p1`

... `pn`

are the parameters for the system call, and `r`

is the result of the system call. (Thus `PutChar`

and `GetChar`

will not occur as constructors of `IOTree`

if they don't correspond to system calls)