https://wiki.haskell.org/api.php?action=feedcontributions&user=ThiagoArrais&feedformat=atomHaskellWiki - User contributions [en]2016-10-22T21:56:50ZUser contributionsMediaWiki 1.19.14+dfsg-1https://wiki.haskell.org/Haskell_Quiz/The_Solitaire_Cipher/Solution_Thiago_ArraisHaskell Quiz/The Solitaire Cipher/Solution Thiago Arrais2007-10-30T18:02:38Z<p>ThiagoArrais: </p>
<hr />
<div>[[Category:Haskell Quiz solutions|Solitaire Cipher]]<br />
<br />
<haskell><br />
module Main where<br />
<br />
import Char(chr, isAlpha, ord, toUpper)<br />
import List(intersperse)<br />
import System.Environment(getArgs, getProgName)<br />
<br />
toUpperCase = map toUpper<br />
toLetter n = chr $ ord 'A' + (n - 1) `mod` 26<br />
toNumber l = ord l - ord 'A' + 1<br />
<br />
split5 cs<br />
| length cs > 5 = (take 5 cs) : split5 (drop 5 cs)<br />
| otherwise = [cs]<br />
<br />
fill cs = cs ++ replicate (5 - length cs) 'X'<br />
<br />
-- Filters alpha characters and splits them into groups of five<br />
sanitize:: String -> [String]<br />
sanitize cs = reverse $ (fill.head) rchunks : tail rchunks<br />
where rchunks = reverse.split5.filter isAlpha.toUpperCase $ cs<br />
<br />
unkeyeddeck :: [Int]<br />
unkeyeddeck = [1..54]<br />
<br />
jokerA = 53<br />
jokerB = 54<br />
isJoker = (flip elem) [jokerA, jokerB]<br />
<br />
-- Pushes a card (j) down once<br />
push' j xs = if length right > 0<br />
then left ++ head right : j : tail right<br />
else head left : j : tail left<br />
where (left,_:right) = break (== j) xs<br />
<br />
-- Pushes a card (j) down a given number (n) of times<br />
push j n = (!! n) . iterate (push' j)<br />
<br />
pushJokerA = push jokerA 1<br />
pushJokerB = push jokerB 2<br />
<br />
-- Performs a triplecut around the first two cards that satisfy a predicate (p)<br />
tripleCut p xs = bottom ++ j1 : (middle ++ j2 : top)<br />
where (top,j1:b1) = break p xs<br />
(middle,j2:bottom) = break p b1<br />
<br />
countCut n xs = (reverse.tail $ rbottom) ++ top ++ [head rbottom]<br />
where top = take n xs<br />
rbottom = reverse.drop n $ xs<br />
<br />
-- Performs a count cut by the number written on the bottom card<br />
deckCut xs = countCut (last xs) xs<br />
<br />
valueFor 54 = 53 -- B joker's value is 53<br />
valueFor n = n<br />
<br />
-- Shuffles the deck once<br />
nextDeck = deckCut.tripleCut isJoker.pushJokerB.pushJokerA<br />
<br />
-- Shuffles the deck once and extracts the resulting letter<br />
stepStream :: (String, [Int]) -> (String, [Int])<br />
stepStream (_, oldDeck) = (letter $ number newDeck, newDeck)<br />
where newDeck = nextDeck oldDeck<br />
number deck@(n:_) = deck !! valueFor n<br />
letter n = if isJoker n then "" else toLetter n : []<br />
<br />
-- The keystream generated by an unkeyed deck<br />
keystream = concat [c | (c,_) <- tail $ iterate stepStream ([], unkeyeddeck)]<br />
<br />
join = concat.intersperse " "<br />
<br />
-- Combines an input string (xs) and the default keystream by applying the<br />
-- given operation (f). This is the function that does the encoding/decoding<br />
codeWith f xs = join.sanitize.letterize $<br />
zipWith f (numberize letters) (numberize keyletters)<br />
where keyletters = take (length letters) keystream<br />
numberize = map toNumber<br />
letterize = map toLetter<br />
letters = concat $ sanitize xs<br />
<br />
encode, decode :: String -> String<br />
encode = codeWith (+)<br />
decode = codeWith (-)<br />
<br />
-- An action that applies the coding function (f) to a set of words<br />
-- and prints the resulting code<br />
printCode f = putStrLn . f . join<br />
<br />
main = do args <- getArgs<br />
case args of<br />
("d":ws@(_:_)) -> printCode decode ws<br />
("e":ws@(_:_)) -> printCode encode ws<br />
_ -> getProgName >>=<br />
\n -> putStrLn $ "Usage: " ++ n ++ " <d/e> <phrase>"<br />
</haskell></div>ThiagoArraishttps://wiki.haskell.org/Haskell_Quiz/The_Solitaire_CipherHaskell Quiz/The Solitaire Cipher2007-10-30T18:01:23Z<p>ThiagoArrais: </p>
<hr />
<div>The first puzzle of the rubyquiz series was to implement the Solitaire cipher [http://en.wikipedia.org/wiki/Solitaire_(cipher)] Bruce Schneier made for Neil Stephenson's Cryptonomicon [http://en.wikipedia.org/wiki/Cryptonomicon]. The twist is that it's designed to be done by a Spy in a containment camp with no other tools than a deck of bridge cards.<br />
<br />
When creating a page, be sure to categorise it as code, with a <nowiki>[[Category:Haskell Quiz]]</nowiki> tag.<br />
<br />
==The problem==<br />
<br />
* http://www.rubyquiz.com/quiz1.html<br />
* http://www.schneier.com/solitaire.html<br />
<br />
==Solutions==<br />
<br />
* [[Haskell Quiz/The Solitaire Cipher/Solution Dolio|Dan Doel]]<br />
* [[Haskell Quiz/The Solitaire Cipher/Solution Matthias|Matthias]] (incomplete, but stream generation works)<br />
* [[Haskell Quiz/The Solitaire Cipher/Solution Paul|Paul]] ([http://mult.ifario.us/articles/2006/10/25/solitaire-cipher-in-haskell accompanying narrative])<br />
* [[Haskell Quiz/The Solitaire Cipher/Solution Burton|Jim]]<br />
* [[Haskell Quiz/The Solitaire Cipher/Solution Igloo|Igloo]]<br />
* [[Haskell Quiz/The Solitaire Cipher/Solution JFoutz|JFoutz]]<br />
* [[Haskell Quiz/The Solitaire Cipher/Solution Thiago Arrais|Thiago Arrais]]<br />
<br />
[[Category:Haskell Quiz|Solitaire Cipher]]</div>ThiagoArraishttps://wiki.haskell.org/The_Monadic_WayThe Monadic Way2006-08-26T13:02:52Z<p>ThiagoArrais: </p>
<hr />
<div>==An evaluation of Philip Wadler's "Monads for functional programming"==<br />
<br />
This tutorial is a "translation" of Philip Wadler's "Monads for<br />
functional programming".<br />
(avail. from [http://homepages.inf.ed.ac.uk/wadler/topics/monads.html here])<br />
<br />
I'm a Haskell newbie trying to grasp such a difficult concept as the<br />
ones of Monad and monadic computation.<br />
While [http://www.cs.utah.edu/~hal/htut/ "Yet Another Haskell Tutorial"] <br />
gave me a good understanding of the type system when it<br />
comes to monads I find it almost unreadable.<br />
<br />
But I had also Wadler's paper, and started reading it. Well, just<br />
wonderful! It explains how to ''create'' a monad!<br />
<br />
So I decided to "translate it", in order to clarify to myself the<br />
topic. And I'm now sharing this traslation (not completed yet) with<br />
the hope it will be useful to someone else.<br />
<br />
Moreover, that's a wiki, so please improve it. And, specifically,<br />
correct my poor English. I'm Italian, afterall.<br />
<br />
==A Simple Evaluator==<br />
<br />
Let's start with something simple: suppose we want to implement a new<br />
programming language. We just finished with<br />
[http://swiss.csail.mit.edu/classes/6.001/abelson-sussman-lectures/ Abelson and Sussman's Structure and Interpretation of ComputerPrograms] <br />
and we want to test what we have learned.<br />
<br />
Our programming language will be very simple: it will just compute the<br />
sum operation.<br />
<br />
So we have just one primitive operation (Add) that takes two constants<br />
and calculates their sum<br />
<br />
For instance, something like:<br />
<br />
(Add (Con 5) (Con 6))<br />
<br />
should yeld:<br />
<br />
11<br />
<br />
We will implement our language with the help of a data type<br />
constructor such as:<br />
<br />
<haskell><br />
<br />
module MyMonads where<br />
data Term = Con Int<br />
| Add Term Term<br />
deriving (Show)<br />
<br />
</haskell><br />
<br />
After that we build our interpreter:<br />
<br />
<haskell><br />
<br />
eval :: Term -> Int<br />
eval (Con a) = a<br />
eval (Add a b) = eval a + eval b<br />
<br />
</haskell><br />
<br />
That's it. Just an example:<br />
<br />
*MyMonads> eval (Add (Con 5) (Con 6))<br />
11<br />
*MyMonads><br />
<br />
Very very simple. The evaluator checks if its argument is a Cons: if<br />
it is it just returns it.<br />
<br />
If it's not a Con, but it is a Term, it evaluates the right one and<br />
sums the result with the result of the evaluation of the second term.<br />
<br />
== Some Output, Please!==<br />
<br />
Now, that's fine, but we'd like to add some features, like providing<br />
some output, to show how the computation was carried out.<br />
Well, but Haskell is a pure functional language, with no side effects,<br />
we were told.<br />
<br />
Now we seem to be wanting to create a side effect of the computation,<br />
its output, and be able to stare at it...<br />
If we had some global variable to store the out that would be<br />
simple...<br />
<br />
But we can create the output and carry it along the computation,<br />
concatenating it with the old one, and present it at the end of the<br />
evaluation together with the evaluation of the expression!<br />
<br />
Simple and neat!<br />
<br />
<haskell><br />
<br />
type MOut a = (a, Output)<br />
type Output = String<br />
<br />
formatLine :: Term -> Int -> Output<br />
formatLine t a = "eval (" ++ show t ++ ") <= " ++ show a ++ " - " <br />
<br />
evalO :: Term -> MOut Int<br />
evalO (Con a) = (a, formatLine (Con a) a)<br />
evalO (Add t u) = ((a + b),(x ++ y ++ formatLine (Add t u) (a + b)))<br />
where (a, x) = evalO t<br />
(b, y) = evalO u<br />
<br />
</haskell><br />
<br />
Now we have what we want. But we had to change our evaluator quite a<br />
bit. First we added a function, that takes a Term (of the expression<br />
to be evaluated), an Int (the result of the evaluation) and gives back<br />
an output of type Output (that is a synonymous of String). <br />
<br />
The evaluator changed quite a lot! Now it has a different type: it<br />
takes a Term data type and produces a new type, we called MOut, that<br />
is actually a pair of a variable type a (an Int in our evaluator) and<br />
a type Output, a string.<br />
<br />
So our evaluator, now, will take a Term (the type of the expressions<br />
in our new programming language) and will produce a pair, composed of<br />
the result of the evaluation (an Int) and the Output, a string.<br />
<br />
So far so good. But what's happening inside the evaluator?<br />
<br />
The first part will just return a pair with the number evaluated and<br />
the output formatted by formatLine. <br />
<br />
The second part does something more complicated: it returns a pair<br />
composed by <br />
1. the result of the evaluation of the right Term summed to the result<br />
of the evaluation of the second Term<br />
2. the output: the concatenation of the output produced by the<br />
evaluation of the right Term, the output produced by the evaluation of<br />
the left Term (each this evaluation returns a pair with the number and<br />
the output), and the formatted output of the evaluation.<br />
<br />
Let's try it:<br />
*MyMonads> evalO (Add (Con 5) (Con 6))<br />
(11,"eval (Con 5) <= 5 - eval (Con 6) <= 6 - eval (Add (Con 5) (Con 6)) <= 11 - ")<br />
*MyMonads><br />
<br />
It works! Let's put the output this way:<br />
eval (Con 5) <= 5 - <br />
eval (Con 6) <= 6 - <br />
eval (Add (Con 5) (Con 6)) <= 11 -<br />
<br />
Great! We are able to produce a side effect of our evaluation and<br />
present it at the end of the computation, after all.<br />
<br />
Let's have a closer look at this expression:<br />
<haskell><br />
<br />
evalO (Add t u) = ((a + b),(x ++ y ++ formatLine (Add t u) (a + b)))<br />
where (a, x) = evalO t<br />
(b, y) = evalO u<br />
<br />
</haskell><br />
<br />
Why all that? The problem is that we need "a" and "b" to calculate their<br />
sum, together with the output coming from their calculation (to be<br />
concatenated by the expression x ++ y ++ formatLine ...).<br />
<br />
So we need to separate the pairs produced by "evalO t" and "eval u"<br />
(remember: eval now produces a value of type M Int, i.e. a pair of an<br />
Int and a String!).<br />
<br />
== Let's Go Monadic==<br />
<br />
Is there a more general way of doing so?<br />
<br />
Let's analyze the evaluator from another perspective. From the type<br />
perspective.<br />
<br />
We solved our problem by creating a new type, a pair of an Int (the<br />
result of the evaluation) and a String (the output of the process of<br />
evaluation).<br />
<br />
The first part of the evaluator does nothing else but creating, from<br />
a value of type Int, an object of type M Int (Int,Output). It does so<br />
by creating a pair with that Int and some text.<br />
<br />
The second part evaluates the two Term(s) and "stores" the values thus<br />
produced in some variables to be use later to compute the output.<br />
<br />
Let's focus on the "stores" action. The correct term should be<br />
"binds".<br />
<br />
Take a function:<br />
<haskell><br />
f x = x + x<br />
</haskell><br />
"x" appears on both sides of the expression. We say that on the right<br />
side "x" is bound to the value of x given on the left side.<br />
<br />
So<br />
<haskell><br />
f 3<br />
</haskell><br />
binds x to 3 for the evaluation of the expression "x + x".<br />
<br />
Our evaluator binds "a" and "x" / "b" and "y" with the evaluation of<br />
"eval t" and "eval u" respectively. "a","b","x" and "y" will be then<br />
used in the evaluation of ((a+)(x ++ ...). <br />
<br />
We know that there is an ad hoc operator for binding variables to a<br />
value: lambda, or \.<br />
<br />
Indeed f x = x + x is syntactic sugar for:<br />
<haskell><br />
f = \x -> x + x<br />
</haskell><br />
When we write f 3 we are actually binding "x" to 3 within what's next<br />
"->", that will be used (substituted) for evaluating f 3.<br />
<br />
So we can try to abstract this phenomenon.<br />
<br />
What we need is a function that takes our composed type MOut Int and a<br />
function in order to produce a new MOut Int, concatenating the<br />
output of the computation of the first with the output of the<br />
computation of the second.<br />
<br />
This is what bindM does:<br />
<br />
<haskell><br />
<br />
bindM :: MOut a -> (a -> MOut b) -> MOut b<br />
bindM m f = (b, x ++ y)<br />
where (a, x) = m<br />
(b, y) = f a<br />
<br />
</haskell><br />
<br />
It takes:<br />
* "m": the compound type MOut Int carrying the result of an "eval Term",<br />
* a function "f". This function will take the Int ("a") extracted by the evaluation of "m" ((a,x)=m). This function will produce anew pair: a new Int produced by a new evaluation; some new output.<br />
<br />
bindM will return the new Int in pair with the concatenated outputs<br />
resulting from the evaluation of "m" and "f a".<br />
<br />
So let's write the new version of the evaluator:<br />
<br />
<haskell><br />
<br />
evalM_1 :: Term -> MOut Int<br />
evalM_1 (Con a) = (a, formatLine (Con a) a)<br />
evalM_1 (Add t u) = bindM (evalM_1 t) (\a -> <br />
bindM (evalM_1 u) (\b -> <br />
((a + b), formatLine (Add t u) (a + b))<br />
)<br />
)<br />
<br />
</haskell><br />
<br />
Ugly, isn't it?<br />
<br />
Let's start from the outside:<br />
<br />
<haskell><br />
bindM (evalM_1 u) (\b -> ((a + b), formatLine (Add t u) (a + b)))<br />
</haskell><br />
<br />
bindM takes the result of the evaluation "evalM_1 u", a type Mout Int,<br />
and a function. It will extract the Int from that type and use it to<br />
bind "b".<br />
<br />
So in bindM (evalM_1 u)... "b" will be bound to a value.<br />
<br />
Then the outer part (bindM (evalM_1 t) (\a...) will bind "a" to the<br />
value needed to evaluate "((a+b), formatLine...) and produce our final<br />
MOut Int.<br />
<br />
We can write the evaluator in a more convinient way, now that we know<br />
what it does:<br />
<br />
<haskell><br />
<br />
evalM_2 :: Term -> MOut Int<br />
evalM_2 (Con a) = (a, formatLine (Con a) a)<br />
evalM_2 (Add t u) = evalM_2 t `bindM` \a -><br />
evalM_2 u `bindM` \b -><br />
((a + b), (formatLine (Add t u) (a + b)))<br />
<br />
</haskell><br />
<br />
Now, look at the first part:<br />
<br />
<haskell><br />
evalM_2 (Con a) = (a, formatLine (Con a) a)<br />
</haskell><br />
<br />
We could use a more general way of creating some output.<br />
<br />
First we need a method for creating someting of type M a, starting from<br />
something of type a. This is what <hask>evalM_2 (Con a)</hask> is doing, after all.<br />
Very simply:<br />
<br />
<haskell><br />
<br />
mkM :: a -> MOut a<br />
mkM a = (a, "")<br />
<br />
</haskell><br />
<br />
We then need to "insert" some text (Output) in our type M:<br />
<br />
<haskell><br />
<br />
outPut :: Output -> MOut ()<br />
outPut x = ((), x)<br />
<br />
</haskell><br />
<br />
Very simple: we have a string "x" (Output) and create a pair with a ()<br />
instead of an Int, and the output.<br />
<br />
This way we will be able to define also this firts part in terms of<br />
bindM, that will take care of concatenating outputs.<br />
<br />
So we have now a new evaluator:<br />
<br />
<haskell><br />
<br />
evalM_3 :: Term -> MOut Int<br />
evalM_3 (Con a) = outPut (formatLine (Con a) a) `bindM` \_ -> mkM a<br />
evalM_3 (Add t u) = evalM_3 t `bindM` \a -><br />
evalM_3 u `bindM` \b -><br />
outPut (formatLine (Add t u) (a + b)) `bindM` \_ -> mkM (a + b)<br />
<br />
</haskell><br />
<br />
Well, this is fine, definetly better then before, anyway.<br />
<br />
Still we use `bindM` \_ -> that binds something we do not use (_). We<br />
could write something for this case, when we concatenate computations<br />
without the need of binding variables. Let's call it `combineM`:<br />
<br />
<haskell><br />
<br />
combineM :: MOut a -> MOut b -> MOut b<br />
combineM m f = m `bindM` \_ -> f<br />
<br />
</haskell><br />
<br />
So the new evaluator:<br />
<br />
<haskell><br />
<br />
evalM :: Term -> MOut Int<br />
evalM (Con a) = outPut (formatLine (Con a) a) `combineM` <br />
mkM a<br />
evalM (Add t u) = evalM t `bindM` \a -><br />
evalM u `bindM` \b -><br />
outPut (formatLine (Add t u) (a + b)) `combineM` <br />
mkM (a + b)<br />
<br />
</haskell><br />
<br />
Let's put everything together (and change some names):<br />
<br />
<haskell><br />
<br />
type MO a = (a, Out)<br />
type Out = String<br />
<br />
mkMO :: a -> MO a<br />
mkMO a = (a, "")<br />
<br />
bindMO :: MO a -> (a -> MO b) -> MO b<br />
bindMO m f = (b, x ++ y)<br />
where (a, x) = m<br />
(b, y) = f a<br />
<br />
combineMO :: MO a -> MO b -> MO b<br />
combineMO m f = m `bindM` \_ -> f<br />
<br />
outMO :: Out -> MO ()<br />
outMO x = ((), x)<br />
<br />
evalMO :: Term -> MO Int<br />
evalMO (Con a) = outMO (formatLine (Con a) a) `combineMO`<br />
mkMO a<br />
evalMO (Add t u) = evalMO t `bindMO` \a -><br />
evalMO u `bindMO` \b -><br />
outMO (formatLine (Add t u) (a + b)) `combineMO` <br />
mkMO (a + b)<br />
<br />
</haskell><br />
<br />
== Some Sugar, Please!==<br />
Now our evaluator has been completely transformed into a monadic<br />
evaluator. That's what it is: a monad.<br />
<br />
We have a function that constructs an object of type MO Int, formed by<br />
a pair: the result of the evaluation and the accumulated<br />
(concatenated) output.<br />
<br />
The process of accumulation and the act of parting the MO Int into its<br />
component is buried into bindM, that can also preserve some value for<br />
later uses.<br />
<br />
So we have:<br />
* MO a type constructor for a type carrying a pair composed by an Int and a String;<br />
* bindMO, that gives a direction to the process of evaluation: it concatenates computations and captures some side effects we created.<br />
* mkOM lets us create an object of type MO Int starting from an Int.<br />
<br />
As you see this is all we need to create a monad. In other words<br />
monads arise from the type system. Everything else is just syntactic<br />
sugar.<br />
<br />
So, let's have a look to that sugar: the famous do-notation!<br />
<br />
We will now rewrite our basic evaluator to use it with the<br />
do-notation.<br />
<br />
Now we have to crate a new type: this is necessary in order to use<br />
specific monadic notation and have at our disposal the more practical<br />
do-notation.<br />
<br />
<haskell><br />
<br />
newtype Eval a = Eval a<br />
deriving (Show)<br />
<br />
</haskell><br />
<br />
So, our type will be an instance of the monad class. We will have to<br />
define the methods of this class (>>= and return), but that will be<br />
easy since we already done that while defining bindMO and mkMO.<br />
<br />
<haskell><br />
<br />
instance Monad Eval where<br />
return a = Eval a<br />
Eval m >>= f = f m<br />
<br />
</haskell><br />
<br />
And then we will take the old version of our evaluator and substitute<br />
`bindMO` with >>= and `mkMO` with return:<br />
<br />
<haskell><br />
<br />
evalM_4 :: Term -> Eval Int<br />
evalM_4 (Con a) = return a<br />
evalM_4 (Add t u) = evalM_4 t >>= \a -><br />
evalM_4 u >>= \b -><br />
return (a + b)<br />
<br />
</haskell><br />
<br />
which is, in the do-notation:<br />
<br />
<haskell><br />
<br />
evalM_5 :: Term -> Eval Int<br />
evalM_5 (Con a) = return a<br />
evalM_5 (Add t u) = do a <- evalM_5 t<br />
b <- evalM_5 u<br />
return (a + b)<br />
<br />
</haskell><br />
<br />
Simple: do binds the result of "eval_M5 t" to a, binds the result of<br />
"eval_M5 u" to b and then returns the sum. In a very imperative style.<br />
<br />
We can now have an image of our monad: it is out type (Eval) that is<br />
made up of a pair: during evaluation the first member of the pair (the<br />
Int) will get the results of our computation (i.e.: the procedures to<br />
calculate the final result). The second part, the String called<br />
Output, will get filled up with the concatenated output of the<br />
computation.<br />
<br />
The sequencing done by bindMO (now >>=) will take care of passing to<br />
the next evaluation the needed Int and will do some more side<br />
calculation to produce the output (concatenating outputs resulting<br />
from computation of the new Int, for instance).<br />
<br />
So we can grasp the basic concept of a monad: it is like a label which<br />
we attach to each step of the evaluation (the String attached to the<br />
Int). <br />
This label is persistent within the process of computation and at each<br />
step bindMO can do some manipulation of it.<br />
We are creating side-effects and propagating them within our monads.<br />
<br />
Ok. Let's translate our output-producing evaluator in monadic<br />
notation:<br />
<br />
<haskell><br />
<br />
newtype Eval_IO a = Eval_IO (a, O)<br />
deriving (Show)<br />
type O = String<br />
<br />
instance Monad Eval_IO where<br />
return a = Eval_IO (a, "")<br />
(>>=) m f = Eval_IO (b, x ++ y)<br />
where Eval_IO (a, x) = m<br />
Eval_IO (b, y) = f a<br />
print_IO :: O -> Eval_IO ()<br />
print_IO x = Eval_IO ((), x)<br />
<br />
eval_IO :: Term -> Eval_IO Int<br />
eval_IO (Con a) = do print_IO (formatLine (Con a) a)<br />
return a<br />
eval_IO (Add t u) = do a <- eval_IO t<br />
b <- eval_IO u<br />
print_IO (formatLine (Add t u) (a + b))<br />
return (a + b)<br />
<br />
</haskell><br />
Let's see the evaluator with output in action: <br />
*MyMonads> eval_IO (Add (Con 6) (Add (Con 16) (Add (Con 20) (Con 12)))) <br />
Eval_IO (54,"eval (Con 6) <= 6 - eval (Con 16) <= 16 - eval (Con 20) <= 20 - eval (Con 12) <= 12 - \ <br />
eval (Add (Con 20) (Con 12)) <= 32 - eval (Add (Con 16) (Add (Con 20) (Con 12))) <= 48 - \ <br />
eval (Add (Con 6) (Add (Con 16) (Add (Con 20) (Con 12)))) <= 54 - ") <br />
*MyMonads> <br />
<br />
Let's format the output part:<br />
eval (Con 6) <= 6 <br />
eval (Con 16) <= 16 <br />
eval (Con 20) <= 20 <br />
eval (Con 12) <= 12 <br />
eval (Add (Con 20) (Con 12)) <= 32 <br />
eval (Add (Con 16) (Add (Con 20) (Con 12))) <= 48 <br />
eval (Add (Con 6) (Add (Con 16) (Add (Con 20) (Con 12)))) <= 54 <br />
<br />
That's it. For today...<br />
<br />
(TO BE CONTINUED)<br />
<br />
Andrea Rossato<br />
arossato AT istitutocolli.org</div>ThiagoArraishttps://wiki.haskell.org/GHC/GHCi_debuggerGHC/GHCi debugger2006-08-17T17:27:44Z<p>ThiagoArrais: </p>
<hr />
<div>This page is a dump of the designs, ideas, and results of the GHCi debugger SoC project. Please contribute with your suggestions and comments.<br />
<br />
The project builds on top of Lemmih's work: a breakpoint function that when hit while evaluating something in GHCi, launchs a new interactive environment with the variables in the local scope of the breakpoint, allowing you to interact with them.<br />
<br />
'''NEW:''' There is also on-going work in the documentation for these features in the ghc user guide. Here is a [http://ender4.dsic.upv.es:81/ghcdocs/ghci.html snapshot] of the ghci documentation page extended with this project.<br />
<br />
== Intermediate Closure Viewer ==<br />
The closure viewer is intended to permit working with polymorphic values in breakpoints, as well as to explore intermediate computations without altering the evaluation order.<br />
<br />
This feature is now (more or less) complete. Currently it provides two new commands under ghci, ''':print''' and ''':sprint''', both used in the same way as <tt>:type</tt> or <tt>:info</tt>. The latter prints a semievaluated closure using underscores to represent suspended computations (pretty much as [[Hood]] does). The former one in addition binds these thunks to variable names, so that you can do things with them.<br />
<br />
Example:<br />
<pre><br />
Prelude> let li = map Just [1..5]<br />
Prelude> length li<br />
5<br />
Prelude> :sp li<br />
li - [_,_,_,_,_,]<br />
<br />
Prelude> head li<br />
Just 1<br />
<br />
Prelude> :sp li<br />
li - [Just 1,_,_,_,_]<br />
<br />
Prelude> last li<br />
Just 5<br />
<br />
Prelude> :sp li<br />
li - [Just 1,_,_,_Just 5]<br />
<br />
Prelude> :p li<br />
li - [Just 1, (_t1::Maybe Integer),(_t2::Maybe Integer),(_t3::Maybe Integer),Just 5]<br />
<br />
Prelude> _t1 `seq` ()<br />
<br />
Prelude> :p li<br />
li - [Just 1, Just 2,(_t3::Maybe Integer),(_t4::Maybe Integer),Just 5]<br />
<br />
Prelude> _t2<br />
Just 3<br />
</pre><br />
<br />
Its best feature is that it can work without type information, so you can display polymorphic objects the type of which you don't know. However if there is type information available, it is used. Thanks to this it can work with opaque or coerced types. For instance:<br />
<br />
<haskell><br />
data Opaque = forall a. O a<br />
</haskell><br />
<br />
<pre><br />
*Test2> let li = map Just [1..5]<br />
*Test2> let o = O li<br />
*Test2> head li `seq` ()<br />
*Test2> length li `seq` ()<br />
*Test2> :p o<br />
o - [O Just 1,(_t1::Integer),(_t2::Integer),(_t3::Integer),(_t4::Integer)]<br />
</pre><br />
<br />
In the example above the <tt>li</tt> inside <tt>o</tt> has an opaque existential type. However, the closure viewer makes it possible to recover its type when it gets evaluated.<br />
<br />
Other currently proposed extensions are a <tt>safeCoerce</tt> function (not so useful, it depends on ghc-api) and an <tt>unsafeDeepSeq</tt> (this one is decoupled from ghc-api). There is also a generally useful (for compiler/tool developers) <tt>isFullyEvaluated</tt> query function. The signatures being:<br />
<br />
<pre><br />
isFullyEvaluated :: a -> IO Bool <br />
unsafeDeepSeq :: a -> b -> b <br />
safeCoerce :: GHC.Session -> a -> Maybe b<br />
</pre><br />
<br />
Finally, note that there are some inconveniences with the current implementation, such as <tt>:p</tt> binding the same closure to different names when used twice on the same closure, but they are minor and temporary (hopefully). <br />
<br />
<br />
== Usability ==<br />
There is plenty of work to be done in this area before the debugger can be shipped with ghc. <br />
<br />
If you have tried the patches maybe you want to add your comments here. Please add feature requests here too.<br />
<br />
== Dynamic Breakpoints ==<br />
<br />
See the user details of the current implementation at the GHC User Guide. Here is a [http://ender4.dsic.upv.es:81/ghcdocs/ghci.html snapshot] of the ghci documentation page extended with this project.<br />
<br />
<br />
=== Event sites and events ===<br />
We define 'event sites' as points in the code where you can want to set a breakpoint. Current candidates for sites are:<br />
* On the entrance to a function / lambda abstraction<br />
* <strike>Prior to function applications</strike> ''(this one does not make sense unless it forces the application using <tt>$!</tt>)<br />
* Local bindings in lets and wheres<br />
* Entrance to statements in monadic-do code<br />
<br />
Overlapping or unnecesary events should be coalesced into a single one. <br />
The rationale for what is an event and what is not is trying to find a middle point between the user interests and the overhead introduced:<br />
* We want to keep the overhead manageable, thus we want to keep the number of breakpoints low.<br />
* The user wants to introduce breakpoints at will.<br />
<br />
Credit goes to both A. Tolmach's ML debugger and the OCaml time-travel debugger for providing the inspiration.<br />
<br />
=== Proposals ===<br />
There are currently the following proposals:<br />
<br />
* Instrument the code with a conditional breakpoint at every event site. Sites are numbered, and the condition uses a site-indexed array to check if there is a breakpoint enabled. The array is maintained inside ghci. Hopefully not much magic is required for this one.<br />
<br />
* In the style of the previous one, but no array is maintained. All the breakpoint conditions are set to False, so almost no overhead is incurred. When the user demands a breakpoint, its BCO in the heap is rewritten to enable the breakpoint. Feasibility of this?<br />
<br />
* Don't use instrumentation. Have a new header for BCOs with breakpoints, say BCO_BREAK, and change headers in execution time on user demand (as in the previous proposal). The problem I see with this one is how to extract the local bindings. I don't fully grok the scheme Lemmih uses to do that yet.<br />
<br />
During this project we have explored the first one, under the lemma of ``do the simplest thing that could possibly work``. <br />
I'm sure there are many other designs. Please add your proposal or just throw an idea in.<br />
<br />
== Call Traces ==<br />
We want to have ''strict'' call traces, not the lazy ones. <br />
<br />
=== Proposals ===<br />
<br />
* It has been suggested that stealing ideas from Cost-Centre Stacks may be useful. I need more pointers on this.<br />
<strike><br />
* Based on Tolmach's debugger, we can instrument the source code to build a timeline of events (either lazily or not). The events contain a pointer to its lexical parent event. With that it should be possible to extract a call trace:<br />
# CASE 1: We are in a Function definition (FN):<br />
## Go back one step in the timeline: it necessarily is an application (APP)<br />
## Go back to its 'binding', i.e. its lexical parent. Keep doing this until it is a FN, then start again from case 1.<br />
## Once you reach the top, i.e. the 0 event, you are done. Display all theAPPs you encountered in the way<br />
# CASE 2: We are in a site other than a FN:<br />
## Go back [[lexically]] until you hit a FN and continue with case 1.<br />
<br />
This is just a wild, untested idea. It's possible that it would not work. Also even if it worked, it's possible that the overhead was unadmissible.<br />
</strike> WON'T WORK WITH LAZINESS<br />
<br />
== Integration ==<br />
Allowing other tools to integrate with the debugger is an important goal. It should not be taken lightly though.<br />
<br />
* It has been suggested to create a client/server protocol so that the debugger can be used by other tools.<br />
<br />
* On the other hand, arguably it would be much easier to provide integration to clients of the ghc-api via some form of debugger api.<br />
<br />
* Finally, it should be possible to derive the client/server architecture as an afterthought provided there is a debugger api in the ghc-api.<br />
<br />
== Further pointers ==<br />
<br />
# [http://www.tekno.chalmers.se/~murk/rectus Rectus], '' Oleg M&uuml;rk and Lennart Kolmodin''<br />
# [http://caml.inria.fr/pub/docs/manual-ocaml/manual030.html The Ocaml Debugger], ''The OCaml Team''<br />
# [http://web.cecs.pdx.edu/~apt/jfp95.ps A debugger for Standard ML], ''A.Tolmach, A. Appel''<br />
# [http://www.haskell.org//pipermail/cvs-ghc/2006-April/029040.html The original discussion in the ghc-cvs mailing list]<br />
<br />
== How to get the patches ==<br />
<br />
The patches are available at the SoC ghc.debugger [http://darcs.haskell.org/SoC/ghc.debugger/ darcs repo]:<br />
<pre><br />
darcs get --partial http://darcs.haskell.org/SoC/ghc.debugger<br />
</pre><br />
and build it following the instructions at the [http://hackage.haskell.org/trac/ghc GHC developers wiki].<br />
<br />
Or simply pull them into your local ghc-6.5 repo and rebuild stage2 and base libraries.<br />
<br />
Have fun! (and feel free to spam [mailto:mnislaih@gmail.com me] with bugs, suggestions or requests!)</div>ThiagoArraishttps://wiki.haskell.org/Talk:GHC/GHCi_debuggerTalk:GHC/GHCi debugger2006-07-20T14:46:25Z<p>ThiagoArrais: cannot use debugger for functions with inferred type</p>
<hr />
<div>How do I use the debugger on implicitly typed and/or class typed functions? Here is what I am trying:<br />
<br />
<code><br />
qsort [] = []<br />
qsort (a:as) = breakpoint $ (qsort left) ++ [a] ++ (qsort right)<br />
where left = filter (<=a) as<br />
right = filter (>a) as<br />
</code><br />
<br />
when I execute this into the patched ghci, here is what I get:<br />
<br />
<pre><br />
*Main> qsort [8, 4, 0, 3, 1, 23, 11, 18]<br />
Local bindings in scope:<br />
src/Main.hs:7> left<br />
<br />
<interactive>:1:0: Not in scope: `left'<br />
src/Main.hs:7> right<br />
<br />
<interactive>:1:0: Not in scope: `right'<br />
src/Main.hs:7> a<br />
<br />
<interactive>:1:0: Not in scope: `a'<br />
</pre><br />
<br />
If I try specifying the inferred type (<code>qsort :: Ord a => [a] -> [a]</code>), I still get the same result. But if I specify a concrete type for the qsort function (<code>qsort :: [Int] -> [Int]</code>), the bindings in scope are shown and I can inspect the values.<br />
<br />
[[User:ThiagoArrais|ThiagoArrais]] 14:46, 20 July 2006 (UTC)<br />
<br />
--------------------<br />
<br />
Is there a mailing list or any kind of user group for this?<br />
<br />
[[User:ThiagoArrais|ThiagoArrais]] 14:46, 20 July 2006 (UTC)</div>ThiagoArraishttps://wiki.haskell.org/Applications_and_libraries/Program_developmentApplications and libraries/Program development2006-07-04T02:27:21Z<p>ThiagoArrais: Updated the EclipseFP link</p>
<hr />
<div>{{unknown copyright}}<br />
{{LibrariesPage}}<br />
<br />
== Tools for program development ==<br />
<br />
A list of tools that are helpful when developing Haskell code.<br />
See also the [[Libraries and tools/Compiler tools|compiler tools]] and [[Libraries and tools/Theorem provers|theorem provers]].<br />
<br />
=== Preprocesors ===<br />
<br />
;[http://www.cs.york.ac.uk/fp/cpphs/ cpphs]<br />
:Cpphs is a re-implementation (in Haskell) of the C pre-processor.<br />
<br />
;[http://repetae.net/john/computer/haskell/DrIFT DrIFT]<br />
:DrIFT is a tool which allows derivation of instances for classes that aren't supported by the standard compilers. In addition, instances can be produced in seperate modules to that containing the type declaration. This allows instances to be derived for a type after the original module has been compiled. As a bonus, simple utility functions can also be produced from a type.<br />
<br />
;[http://www.cs.vu.nl/Strafunski/ Strafunski]<br />
:Strafunski is a Haskell bundle that provides support for generic programming in Haskell, based on the concept of a functional strategy. It consists of a combinator library (StrategyLib) and a precompiler (DrIFT-Strafunski).<br />
<br />
;[http://darcs.haskell.org/~lemmih/zerothHead/ Zeroth]<br />
:A program using Template Haskell must link with the TH library even if it contains no references to TH after it has been compiled. Zeroth is a preprocessor which allows modules to use TH without linking with the TH library. To do this, Zeroth evaluates the top level splices from a module and saves the resulting code.<br />
<br />
=== Build systems ===<br />
<br />
;[http://www.haskell.org/cabal Cabal]<br />
:The Haskell Cabal is a Common Architecture for Building Applications and Libraries. It is an API distributed with GHC, NHC98, and Hugs which allows a developer to easily group together a set of modules into a package. It is the standard build system for new Haskell libraries and applications <br />
<br />
;[http://www.cs.york.ac.uk/fp/hmake/ hmake], a Haskell-aware replacement for make<br />
:Automatically keeps track of module dependencies (i.e. no need to write any Makefiles!). Can be used with any of the usual Haskell compilers (ghc, hbc, nhc98).<br />
<br />
=== Tags ===<br />
<br />
;[http://www.cl.cam.ac.uk/users/rje33/software.html HaskTags]<br />
:Hasktags is a simple program that generates TAGS files for Haskell code. Together with a supporting editor (e.g. NEdit, XEmacs, or Vim) TAGS files can be used to quickly find the places where functions, data constructors etc. are defined.<br />
<br />
;[http://www.dtek.chalmers.se/~d99josve/tagsh.tar.gz tagsh]<br />
:A version of the tags program for Haskell. It uses the standardised hssource and posix library, works with GHC 5.02.1. tags file has been checked to work with vim and nedit.<br />
<br />
=== Program Transformation ===<br />
<br />
;[http://www.cs.kent.ac.uk/projects/refactor-fp/hare.html HaRe -- The Haskell Refactorer]<br />
:Mechanical refactoring of Haskell code (across module boundaries). HaRe now supports many refactorings such as renaming identifiers, moving/introducing/inlining definitions, and so on. Those refactorings are not limited to a single module. HaRe can be accessed from either Vim or Emacs<br />
<br />
;[http://www.isi.edu/~hdaume/HAllInOne/ Haskell All-In-One]<br />
:This Haskell utility takes a program implemented in multiple modules and converts it to a single module.<br />
<br />
;[http://wiki.di.uminho.pt/wiki/bin/view/Alcino/DrHylo DrHylo]<br />
:Tool for deriving hylomorphisms from a restricted Haskell syntax. It is based on the algorithm first presented in the paper "Deriving Structural Hylomorphisms From Recursive Definitions" at ICFP'96 by Hu, Iwasaki, and Takeichi.<br />
<br />
=== Integrated Development Environments ===<br />
<br />
;[http://eclipsefp.sourceforge.net/ Haskell support for Eclipse]<br />
:Eclipse is an open, extensible IDE platform for "everything and nothing in particular". It is implemented in Java and runs on several platforms. The Java IDE built on top of it has already become very popular among Java developers. The Haskell tools extend it to support editing (syntax coloring, code assist), compiling, and running Haskell programs from within the IDE. More features like source code navigation, module browsing etc. will be added in the future.<br />
<br />
;[http://www.dtek.chalmers.se/~d99josve/hide/ hIDE]<br />
:hIDE is a GUI-based Haskell IDE written using gtk+hs. It does not include an editor but instead interfaces with NEdit, vim or GNU emacs.<br />
<br />
;[http://haskell.org/hide hIDE-2]<br />
:Through the dark ages many a programmer has longed for the ultimate tool. In response to this most unnerving craving, of which we ourselves have had maybe more than our fair share, the dynamic trio of #Haskellaniacs (dons, dcoutts and Lemmih) hereby announce, to the relief of the community, that a fetus has been conceived: ''hIDE - the Haskell Integrated Development Environment''. So far the unborn integrates source code recognition and a chameleon editor, resenting these in a snappy gtk2 environment. Although no seer has yet predicted the date of birth of our hIDEous creature, we hope that the mere knowledge of its existence will spread peace of mind throughout the community as oil on troubled waters. See also: [[Screenshots of HIDE]] and [[HIDE]]<br />
<br />
;[http://www.students.cs.uu.nl/people/rjchaaft/JCreator JCreator with Haskell support]<br />
:JCreator is a highly customizable Java IDE for Windows. Features include extensive project support, fully customizable toolbars (including the images of user tools) and menus, increase/decrease indent for a selected block of text (tab/shift+tab respectively). The Haskell support module adds syntax highlighting for haskell files and winhugs, hugs, a static checker (if you double click on the error message, JCreator will jump to the right file and line and highlight it yellow) and the Haskell 98 Report as tools. Platforms: Win95, Win98, WinNT and Win2000 (only Win95 not tested yet). Size: 6MB. JCreator is a trademark of Xinox Software; Copyright &copy; 2000 Xinox Software The Haskell support module is made by [http://www.students.cs.uu.nl/people/rjchaaft/ Rijk-Jan van Haaften].<br />
<br />
;[http://www.haskell.org/visualhaskell Visual Haskell]<br />
:Visual Haskell is a complete development environment for Haskell software, based on Microsoft's [http://msdn.microsoft.com/vstudio/productinfo/ Microsoft Visual Studio] platform. Visual Haskell integrates with the Visual Studio editor to provide interactive features to aid Haskell development, and it enables the construction of projects consisting of multiple Haskell modules, using the Cabal building/packaging infrastructure.<br />
<br />
;[http://www.kdevelop.org/ KDevelop]<br />
:This IDE supports many languages. For Haskell it [http://www.kdevelop.org/HEAD/doc/api/html/LangSupportStatus.html currently supports] project management, syntax highlighting, building (with GHC) & executing within the IDE.<br />
<br />
;[http://haste.dyndns.org:8080/ haste - Haskell TurboEdit]<br />
:haste - Haskell TurboEdit - is an IDE for the functional programming language Haskell, written in Haskell.<br />
<br />
;[http://www.cs.kent.ac.uk/projects/vital/ Vital]<br />
:Vital is a visual programming environment. It is particularly intended for supporting the open-ended, incremental style of development often preferred by end users (engineers, scientists, analysts, etc.).<br />
<br />
;[http://www.cs.kent.ac.uk/projects/pivotal/ Pivotal]<br />
:Pivotal 0.025 is an early prototype of a Vital-like environment for Haskell. Unlike Vital, however, Pivotal is implemented entirely in Haskell. The implementation is based on the use of the hs-plugins library to allow dynamic compilation and evaluation of Haskell expressions together with the gtk2hs library for implementing the GUI.<br />
<br />
=== Editor modes for syntax highlighting ===<br />
<br />
====Kate====<br />
<br />
; Syntax highlighting files for KDE's Kate<br />
:<br />
* [http://www.informatik.uni-bonn.de/~ralf/software.html#syntax Files] by Ralf Hinze.<br />
* [hs.xml hs.xml] and [lhs.xml lhs.xml] by Brian Huffman.<br />
<br />
====NEdit====<br />
<br />
;[http://www.nedit.org/ftp/contrib/highlighting/haskell.pats NEdit] syntax highlighting and block comment support.<br />
<br />
====Vim====<br />
<br />
;[http://www.vim.org vim] syntax highlighting<br />
:<br />
* [ftp://ftp.cse.unsw.edu.au/pub/users/dons/vim/ by Don Stewart]: for TeX and cpp style Haskell files. <br />
* [http://urchin.earth.li/~ian/vim/ by Ian Lynagh]: distinguishes different literal Haskell styles.<br />
* by John Williams: Both regular Haskell [haskell.vim .hs] and [lhaskell.vim .lhs] files that uncomment lines using '>' are supported.<br />
<br />
====Textpad====<br />
<br />
;[http://www.haskell.org/libraries/Haskell98.syn Syntax highlighting file] for [http://www.textpad.com textpad]<br />
:by Jeroen van Wolffelaar and Arjan van IJzerdoorn, which inludes all prelude functions, datatype, constructors, etc, all in seperate groups.<br />
<br />
====Emacs====<br />
<br />
;[http://www.haskell.org/haskell-mode/ Haskell Mode for Emacs]<br />
:Supports font locking, declaration scanning, documentation of types, indentation and interaction with Hugs.<br />
<br />
;Alternative [http://www.haskell.org/libraries/hugs-mode.el Hugs Mode for Emacs] by Chris Van Humbeeck<br />
:Provides fontification and cooperation with Hugs. Updated for emacs 20.* by Adam P. Jenkins.<br />
<br />
====Jed====<br />
<br />
;[http://www.astercity.net/~khaliff/haskell/haskellmode.tgz Haskell mode] {{dead link}}<br />
:for [http://www.jedsoft.org/jed/ jed] by Marcin 'Qrczak' Kowalczyk.<br />
<br />
====Subethaedit====<br />
<br />
;[http://www.codingmonkeys.de/subethaedit/modes.html Haskell mode For SubEthaEdit]<br />
: SubEthaEdit is a Mac OS X editor.<br />
<br />
====Other====<br />
<br />
Some other, mostly obsolete, modes are available in [http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/CONTRIB/haskell-modes/ CVS].<br />
<br />
=== Typesetting Haskell ===<br />
<br />
;[http://www.jantar.org/lambdaTeX/ lambdaTeX]<br />
:A TeX package for typesetting literate scripts in TeX. The output looks much like the code from Chris Okasaki's book "Purely Functional Data Structures", doing syntax highlighting and converting ASCII art such as <code>-&gt;</code> or <code>alpha</code> to proper mathematical symbols. It should work with both LaTeX and plain TeX, and it does its magic without any annotations, directly on the source code (lambdaTeX uses an almost-complete Haskell lexical analyzer written entirely in plain TeX). You only have to add <code>\input lambdaTeX</code> at the top of your source file, and manually typeset your literate comments so they look as good as the source code.<br />
<br />
;[http://www.cse.unsw.edu.au/~chak/haskell/haskell-style.html Haskell Style for LaTeX2e]<br />
:by Manuel Chakravarty provides environments and macros that simplify setting Haskell programs in LaTeX.<br />
<br />
;[http://www.iai.uni-bonn.de/~loeh/lhs2tex/ lhs2tex]<br />
:A preprocessor for typesetting Haskell programs that combines some of the good features of pphs and smugweb. It generates LaTeX code from literate Haskell sources.<br />
<br />
;[http://www.cs.uu.nl/wiki/Ehc/Shuffle Shuffle]<br />
:another tool helping literate programming in Haskell. It helps to maintain ''views'' in a literate programming project. For example, it is among the tools used for developing a compiler in an iterative way with manuals didactically reflecting these evolving series of versions deriving from the literal code (see [http://www.cs.uu.nl/wiki/Ehc/WebHome Essential Haskell Compiler] project). Thus, Shuffle gives us the possibility for making didactically the evolution of versions visible in the documentation, when this is needed. More generally, Shuffle gives us tangling and weaving possibilities of literate programming. I think it gives a way to think of literal program development in a more abstract way by supporting the concept of views (maybe a too far analogy: version control management -- e.g. [http://abridgegame.org/darcs/ darcs] -- helps thinking of program development in a more abstract way, too). Shuffle works well together with lhs2tex.<br />
<br />
;[http://www.acooke.org/jara/pancito/haskell.sty haskell.sty]<br />
:A Latex style file by Andrew Cooke that makes literal programming in Haskell simple.<br />
<br />
;[http://web.comlab.ox.ac.uk/oucl/work/ian.lynagh/Haskell2LaTeX/ Haskell2Latex]<br />
:Ian Lynagh's Haskell2LaTeX takes a literate Haskell program, or any LaTeX document with embedded Haskell, and pretty-prints the Haskell sections within it. The most significant difference between Haskell2LaTeX and other programs with similar goals is is that Haskell2LaTeX parses the input rather than merely lexing it.<br />
<br />
;[ftp://ftp.cs.york.ac.uk/pub/haskell/contrib/hscolour-1.1.tar.gz HsColour]<br />
:Colourise Haskell source code in HTML or ANSI terminal screen codes.<br />
<br />
=== Source documentation and browsing ===<br />
<br />
;[[Haddock]] A Haskell Documentation Tool<br />
:A tool for automatically generating documentation from annotated Haskell source code. It is primarily intended for documenting libraries, but it should be useful for any kind of Haskell code. Haddock lets you write documentation annotations next to the definitions of functions and types in the source code, in a syntax that is easy on the eye when writing the source code (no heavyweight mark-up). The documentation generated by Haddock is fully hyperlinked - click on a type name in a type signature to go straight to the definition, and documentation, for that type.<br />
<br />
;[http://www.cse.unsw.edu.au/~chak/haskell/idoc/ IDoc] A No Frills Haskell Interface Documentation System<br />
:IDoc extracts interface documentation and declarations from Haskell modules based on standard Haskell layout rules and a small number of clues that the programmer embeds in interface comments. These clues have been designed to be visually non-imposing when displaying the source in a text editor. Interface documentation is rendered in standard markup languages (currently, only HTML is supported). IDoc has been designed to be simple to use and install.<br />
<br />
;[http://www.fmi.uni-passau.de/~groessli/hdoc/ HDoc]<br />
:HDoc generates documentation in HTML format for Haskell modules. The generated documents are cross linked and include summaries and detailed descriptions for the documented functions, data types, type classes and instance declarations.<br />
<br />
;[http://www.ida.liu.se/~jakax/haskell.html HaskellDoc]<br />
:This program generates an HTML document showing the module interfaces of a Haskell project. Convenient links are placed for easy browsing of the different modules of the project, and for quick access to the source code.<br />
<br />
;[http://home.conceptsfa.nl/~jwit/HaSpell.html HaSpell]<br />
:HaSpell is a spelling and style checker for Haskell programs. It can detect spelling errors in comments in the program text, and optionally in the code itself. There is an option to detect metasyntactic variables (such as 'foo') and 'bad function prefixes' such as 'compute' and 'doThe' - these make the program less readable and generally indicate bad programming style.<br />
<br />
=== Testing ===<br />
<br />
;[http://hunit.sourceforge.net HUnit]<br />
:A unit testing framework for Haskell, similar to JUnit for Java. With HUnit, the programmer can easily create tests, name them, group them into suites, and execute them, with the framework checking the results automatically. Test specification is concise, flexible, and convenient.<br />
<br />
;[http://www.cs.chalmers.se/~rjmh/QuickCheck/ QuickCheck]<br />
:A tool for testing Haskell programs automatically. The programmer provides a specification of the program, in the form of properties which functions should satisfy, and QuickCheck then tests that the properties hold in a large number of randomly generated cases. Specifications are expressed in Haskell, using combinators defined in the QuickCheck library. QuickCheck provides combinators to define properties, observe the distribution of test data, and define test data generators.<br />
<br />
;[http://www.informatik.uni-freiburg.de/~wehr/haskell/ HTF - The Haskell Test Framework]<br />
:The HTF lets you write HUnit tests and QuickCheck properties in an easy and convenient way. Additionally, the HTF provides a facility for testing programs by running them and comparing the actual output with the expected output (so called "file-based tests"). The HTF uses Template Haskell to collect all tests and properties, so you do not need to write boilerplate code for that purpose. Preprocessor macros provide you with file name and line number information for tests and properties that failed.<br />
<br />
=== Tracing &amp; debugging ===<br />
<br />
Tracing gives access to otherwise invisible information about a computation. Conventional debuggers allow the user to step through the program computation, stop at given points and examine variable contents. This tracing method is quite unsuitable for Haskell, because its evaluation order is complex, function arguments are usually unwieldy large unevaluated expressions and generally<br />
computation details do not match the user's high-level view of functions mapping values to values. <br />
<br />
;[http://www.cs.mu.oz.au/~bjpop/buddha/ Buddha]<br />
:Buddha is a declarative debugger for Haskell 98 programs. It presents the evaluation of a Haskell program as a series of function applications. A typical debugging session involves a series of questions and answers. The questions are posed by the debugger, and the answers are provided by the user. The implementation of Buddha is based on program transformation.<br />
<br />
;[http://www.ida.liu.se/~henni Freja]<br />
:A compiler for a subset of Haskell. Running a compiled program creates an evaluation dependency tree as trace, a structure based on the idea of declarative debugging from the logic programming community. A debugging session consists of the user answering a sequence of yes/no questions.<br />
<br />
;[http://www.cs.york.ac.uk/fp/hat Hat]<br />
:A Haskell program is first transformed by hat-trans and then compiled with nhc98 or ghc. At runtime the program writes a trace file. There are tools for viewing the trace in various ways: Hat-stack shows a virtual stack of redexes. Hat-observe shows top-level functions in the style of Hood. Hat-trail enables exploring a computation backwards, starting from (part of) a faulty output or an error message. Hat-detect provides algorithmic debugging in the style of Freja. Hat-explore allows free navigation through a computation similar to traditional debuggers and algorithmic debugging and slicing.<br />
<br />
;[http://www.haskell.org/hood Hood]<br />
:A library that permits to observe data structures at given program points. It can basically be used like print statements in imperative languages, but the lazy evaluation order is not affected and functions can be observed as well.<br />
<br />
;[http://www.cs.ukc.ac.uk/people/staff/cr3/toolbox/haskell/ GHood]<br />
:"Graphical Hood" - a Java-based graphical observation event viewer, building on Hood.<br />
<br />
;[http://www.cs.mu.oz.au/~bjpop/code.html highWaterMark]<br />
:A library for determining the amount of memory allocated at any point by a GHC program.<br />
<br />
;[http://www.cs.mu.oz.au/~bjpop/code.html GHC Internals library]<br />
:A GHC library for polymorphically deconstructing heap objects from within Haskell code.<br />
<br />
;[http://www.cs.mu.oz.au/~bjpop/code.html GHC Heap and Stable Table Printing library]<br />
:Two libraries for GHC. The first is for printing heap objects from within Haskell or C code. The second is for dumping the contents of the Stable Table which is used for Stable Pointers and Stable Names.<br />
<br />
=== Bug tracking ===<br />
<br />
;[http://urchin.earth.li/darcs/ian/bts/ Bark] <br />
:Bark is a bug tracking system written in Haskell<br />
<br />
=== Miscellaneous ===<br />
<br />
;[http://www.haskell.org/hoogle/ Hoogle]<br />
:Hoogle is a Haskell API search engine. It allows you to search for a function in the standard libraries by either name, or by approximate type signature.<br />
<br />
;[[Lambdabot]]<br />
:Once an IRC bot, this has grown to be a large, ad-hoc collection of Haskell development tools available for offline use. In particular, automatic point-free refactoring is available via a vim interface, as well as access to [[Hoogle]], djinn, ghci, and much much more.<br />
<br />
;[http://www.cse.unsw.edu.au/~dons/darcs-graph.html darcs-graph]<br />
:a tool for generating graphs of commit activity for darcs repositories.<br />
<br />
;[http://www.cse.unsw.edu.au/~chak/haskell/VersionTool/ VersionTool]<br />
:a small utility that:<br />
* extracts version information from Cabal files,<br />
* maintains version tags in darcs,<br />
* computes patch levels by querying darcs,<br />
* extracts the current context from darcs, and<br />
* adds all this information to a source file<br />
<br />
;[http://www.haskell.org/pipermail/haskell/2006-June/018043.html Kamiariduki]<br />
:a system to judge your derivative work's purpose and license is valid with Creative Commons License Works.<br />
<br />
=== Formal methods ===<br />
<br />
See<br />
* [[Analysis and design|analysis and design methods]]<br />
* [[Libraries and tools/Theorem provers|theorem provers]].<br />
<br />
=== Dead ===<br />
<br />
;[http://web.archive.org/web/*/http://www.numeric-quest.com/haskell/explorer/browser.html The Haskell Module Browser]<em>(since 10/06/2003 via Internet Archive)</em><br />
:A browser similar to Smaltalk and Eiffel class browsers.</div>ThiagoArraishttps://wiki.haskell.org/User:ThiagoArraisUser:ThiagoArrais2006-06-30T17:14:57Z<p>ThiagoArrais: </p>
<hr />
<div>I am a programmer from Recife, Brazil. My main interests are on developing tools for software development. [http://thiagoarrais.blogspot.com Mergulhando no Caos] is my blog (in Portuguese) about sotware development and technology in general.<br />
<br />
I hereby license all my contributions to this wiki under the simple permissive license on [[HaskellWiki:Copyrights]].<br />
<br />
== Haskell related projects ==<br />
<br />
I am currently developing [[EclipseFP]], a development environment for Haskell based on the [http://www.eclipse.org Eclipse] platform.</div>ThiagoArraishttps://wiki.haskell.org/EclipseFPEclipseFP2006-06-30T17:12:29Z<p>ThiagoArrais: </p>
<hr />
<div>''EclipseFP'' is an integrated environment for development in Haskell. Our goal is to provide a full-featured IDE that mimics the Eclipse JDT (the Java IDE which is flagship of the Eclipse project). Some planned and implemented features are support for code assistance, auto-building, refactoring and structural search.<br />
<br />
The homepage for details and download is [http://eclipsefp.sourceforge.net/ http://eclipsefp.sourceforge.net/].<br />
<br />
We welcome any suggestions from the community. You can add your feature request to the following list, if you'd like.<br />
<br />
* ...</div>ThiagoArraishttps://wiki.haskell.org/EclipseFPEclipseFP2006-06-30T17:11:32Z<p>ThiagoArrais: </p>
<hr />
<div>EclipseFP is an integrated environment for development in Haskell. Our goal is to provide a full-featured IDE that mimics the Eclipse JDT (the Java IDE which is flagship of the Eclipse project). Some planned and implemented features are support for code assistance, auto-building, refactoring and structural search.<br />
<br />
The homepage for details and download is [http://eclipsefp.sourceforge.net/ http://eclipsefp.sourceforge.net/].<br />
<br />
We welcome any suggestions from the community. You can add your feature request to the following list, if you'd like.<br />
<br />
* ...</div>ThiagoArraishttps://wiki.haskell.org/User:ThiagoArraisUser:ThiagoArrais2006-06-30T16:48:39Z<p>ThiagoArrais: </p>
<hr />
<div>I am a programmer from Recife, Brazil. My main interests are on developing tools for software development.<br />
<br />
I hereby license all my contributions to this wiki under the simple permissive license on [[HaskellWiki:Copyrights]].<br />
<br />
== Haskell related projects ==<br />
<br />
I am currently developing [[EclipseFP]], a development environment for Haskell based on the [http://www.eclipse.org Eclipse] platform.</div>ThiagoArraishttps://wiki.haskell.org/IO_insideIO inside2006-06-30T16:36:00Z<p>ThiagoArrais: </p>
<hr />
<div>Haskell I/O always was a source of confusion and surprises for new<br />
Haskellers. While simple I/O code in Haskell looks very similar to<br />
it's equivalents in imperative languages, our attempts to write<br />
somewhat more complex often ended with total head mess. That is<br />
because Haskell I/O is really very different internally. Haskell is a<br />
pure language and even I/O system don't break this law.<br />
<br />
The following text is an attempt to explain details of Haskell I/O<br />
implementation that should help you eventually master all the smart<br />
I/O tricks. Moreover, i added detailed explanation of various traps<br />
you can encounter on this way. After reading this text, you will get a<br />
degree "A Master of Haskell I/O" that is equal to Bachelor in CS and<br />
Mathematics, simultaneously :)<br />
<br />
If you are new to Haskell I/O you may prefer to start with reading [[Introduction to IO]] page<br />
<br />
<br />
== Haskell is pure language ==<br />
<br />
Haskell is pure language, which means that result of any function call<br />
is fully determined by its arguments. Pseudo-functions like rand() or<br />
getchar() in C which returns different results on each call, are just<br />
impossible and prohibited by language rules. Moreover, Haskell<br />
functions can't have side effects, i.e. they cannot make any changes in the "real<br />
world" - change files, write to the screen, print, send data over the network,<br />
and so on. These two restrictions together mean that any function<br />
call can be omitted, repeated, or replaced by the result of a previous call with the<br />
same parameters, and the language _guarantees_ that all<br />
these rearrangements will not change program result!<br />
<br />
Let's compare this to C - compilers for this language just try to guess<br />
that function don't have side effects and it's result don't depends on<br />
some global variables. If this guess is wrong - the whole optimization<br />
becomes incorrect! As a consequence, C optimizers are enough<br />
conservative in their guesses and/or require from programmer to give<br />
them hints about usage (not meaning!) of functions and variables<br />
<br />
Comparing to them, Haskell compiler is a set of pure mathematical<br />
transformations that can't be wrong by definition - they just<br />
translate one abstract data processing algorithm (i.e. some complex<br />
function) to another equivalent algorithm, just with better<br />
performance. This results in much better high-level optimization<br />
facilities comparing to C compilers<br />
<br />
But this purity creates it's own problems. How we can do I/O, work<br />
with stateful algorithms and side effects in pure language? This<br />
question had many different solutions probed in 18 years of Haskell<br />
existence and finally one based on using monads was widely accepted<br />
<br />
<br />
<br />
== What is the monad? ==<br />
<br />
What is the monad? It's something from mathematical category theory, i<br />
don't know anymore :) In order to understand how monads are used to<br />
solve problem of I/O and side effects, you don't need to know it. It's<br />
enough to just know elementary mathematics, like I do :)<br />
<br />
Let's imagine that we want to implement in Haskell well-known<br />
'getchar' function. What the type it should have? Let's try:<br />
<br />
<haskell><br />
getchar :: Char<br />
<br />
get2chars = [getchar,getchar]<br />
</haskell><br />
<br />
What we will got with 'getchar' having just 'Char' type? You can see<br />
all the possible problems in 'get2chars' definition:<br />
<br />
1) because Haskell compiler treats all functions as pure and not<br />
having side effects, it can avoid "excessive" call to 'getchar' and<br />
use one returned value two times<br />
<br />
2) even if it will make two calls, there is no any clue to determine<br />
which call should be performed first. Do you want to return chars in<br />
the order they read, or in opposite order? Nothing in 'get2chars'<br />
definition answers this question.<br />
<br />
<br />
How these problems can be solved, from plain programmer's viewpoint?<br />
Let's introduce fake parameter of 'getchar' to make each call<br />
"different" from the compiler's point of view:<br />
<br />
<haskell><br />
getchar :: Int -> Char<br />
<br />
get2chars = [getchar 1, getchar 2]<br />
</haskell><br />
<br />
This right away solved the first problem mentioned above - now<br />
compiler will make two calls because it sees them as having different<br />
parameters. The whole 'get2chars' function should also had such<br />
fake parameter, otherwise we will have the same problem calling it:<br />
<br />
<haskell><br />
getchar :: Int -> Char<br />
get2chars :: Int -> String<br />
<br />
get2chars _ = [getchar 1, getchar 2]<br />
</haskell><br />
<br />
<br />
Now, we need to give the compiler some clue to determine which function it<br />
should call first. The Haskell language doesn't provide any way to express<br />
order of evaluation... except for data dependencies! How about adding<br />
artificial data dependency which prevents evaluation of second<br />
'getchar' before the first one? In order to achieve this, we will<br />
return from 'getchar' additional fake result that will be used as<br />
parameter for next 'getchar' call:<br />
<br />
<haskell><br />
getchar :: Int -> (Char, Int)<br />
<br />
get2chars _ = [a,b] where (a,i) = getchar 1<br />
(b,_) = getchar i<br />
</haskell><br />
<br />
So bad so good - now we can guarantee that 'a' is read before 'b'<br />
because 'b' reading need value (i) that is returned by 'a' reading!<br />
<br />
We've added fake parameter to 'get2chars' but the problem is that the<br />
Haskell compiler is too smart! It can believe that the external 'getchar'<br />
function is really dependent on it's parameter but for 'get2chars' it<br />
will see that we just cheating and throw it away! Problem? How about<br />
passing this fake parameter to 'getchar' function?! In this case<br />
compiler can't guess that it is really unused :)<br />
<br />
<haskell><br />
get2chars i0 = [a,b] where (a,i1) = getchar i0<br />
(b,i2) = getchar i1<br />
</haskell><br />
<br />
<br />
And more - 'get2chars' has all the same purity problems as 'getchar'<br />
function. If one need to call it two times, he need a way to describe<br />
order of these calls. Look at:<br />
<br />
<haskell><br />
get4chars = [get2chars 1, get2chars 2] -- order of `get2chars` calls isn't defined<br />
</haskell><br />
<br />
We already know how to fight with such problem - 'get2chars' should<br />
also return some fake value that can be used to order calls:<br />
<br />
<haskell><br />
get2chars :: Int -> (String, Int)<br />
<br />
get4chars i0 = (a++b) where (a,i1) = get2chars i0<br />
(b,i2) = get2chars i1<br />
</haskell><br />
<br />
<br />
But what the fake value it would return? If we will use some integer<br />
constant, too smart Haskell compiler will guess we are cheating, again :)<br />
What about returning the value returned by 'getchar'? See:<br />
<br />
<haskell><br />
get2chars :: Int -> (String, Int)<br />
get2chars i0 = ([a,b], i2) where (a,i1) = getchar i0<br />
(b,i2) = getchar i1<br />
</haskell><br />
<br />
Believe you or not, but we just constructed the whole "monadic"<br />
Haskell I/O system.<br />
<br />
<br />
<br />
== Welcome to RealWorld, baby :) ==<br />
<br />
The 'main' Haskell function has the type:<br />
<br />
<haskell><br />
main :: RealWorld -> ((), RealWorld)<br />
</haskell><br />
<br />
where 'RealWorld' is faking type used instead of our Int. It is something<br />
like baton passed in relay-race. When 'main' calls some IO function,<br />
it pass the "RealWorld" it received as parameter. All IO functions have<br />
similar types involving RealWorld as parameter and result. To be<br />
exact, "IO" is a type synonym defined in the following way:<br />
<br />
<haskell><br />
type IO a = RealWorld -> (a, RealWorld)<br />
</haskell><br />
<br />
so. 'main' just has type "IO ()", 'getChar' has type "IO Char" and so<br />
on. Let's look at 'main' calling 'getChar' two times:<br />
<br />
<haskell><br />
getChar :: RealWorld -> (Char, RealWorld)<br />
<br />
main :: RealWorld -> ((), RealWorld)<br />
main world0 = let (a, world1) = getChar world0<br />
(b, world2) = getChar world1<br />
in ((), world2)<br />
</haskell><br />
<br />
<br />
Look at this closely: 'main' passes to first 'getChar' the "world" it<br />
received. This 'getChar' returns some new value of type RealWorld,<br />
that is used in next call. Finally, 'main' returns the "world" it got<br />
from the second 'getChar':<br />
<br />
1) Is it possible here to omit any call of 'getChar' if the char<br />
it read is not used? No, because we should return the "world" that is<br />
result of second 'getChar' and in turn requires "world" from first 'getChar'.<br />
<br />
2) Is it possible to reorder 'getChar' calls? No, second 'getChar'<br />
can't be called before first one because it uses "world" it returns.<br />
<br />
3) Is it possible to duplicate calls? In Haskell semantics - yes, but<br />
real compilers never duplicate work in such simple cases (otherwise,<br />
the programs generated will not have any speed guarantees)<br />
<br />
<br />
As we already said, RealWorld values are used like baton, passing them<br />
between all routines called by 'main' in strict order. Inside each<br />
routine called, RealWorld values used in the same way. In whole, in<br />
order to "compute" world to be returned from 'main', we should perform<br />
each IO procedure that is called from 'main', directly or indirectly.<br />
This means that each procedure inserted in the chain will be performed<br />
just at the moment (relative to other IO actions) when we planned it<br />
to be called. Let consider the following program:<br />
<br />
<haskell><br />
main = do a <- ask "What is your name?"<br />
b <- ask "How old are you?"<br />
return ()<br />
<br />
ask s = do putStr s<br />
readLn<br />
</haskell><br />
<br />
Now you have enough knowledge to rewrite it in low-level way and<br />
check that each operation what should be performed, will be really<br />
performed with arguments it should have and in order we expecting<br />
<br />
<br />
But what about conditional execution? No problem. Let's define the<br />
well-known 'when' operation:<br />
<br />
<haskell><br />
when :: Bool -> IO () -> IO ()<br />
when condition action world =<br />
if condition<br />
then action world<br />
else ((), world)<br />
</haskell><br />
<br />
As you can see, we can easily include or exclude from execution chain<br />
IO procedures (actions) depending on the data values. If 'condition'<br />
will be False on call of 'when', 'action' will never be called because<br />
real Haskell compilers, again, never calls functions whose results<br />
don't required to calculate final result (i.e., here, final "world" value<br />
of 'main')<br />
<br />
Loops and any more complex control structures can be implemented in<br />
the same way. Try it as an exercise!<br />
<br />
<br />
Finally you may want to know how much costs this passing of RealWorld<br />
values all around. It's free! These fake values exist for compiler<br />
only while it analyze and optimize code, but when it goes to assembler<br />
code generation, it "suddenly" realize that this type is like "()", so<br />
all these parameters and result values can be omitted from generated code.<br />
Is it not really beautiful? :)<br />
<br />
<br />
<br />
== '>>=' and 'do' notation ==<br />
<br />
All beginners (including me :) start by thinking that 'do' is some<br />
magic statement that executes IO actions. It's wrong - 'do' is just a<br />
syntax sugar that simplifies writing of IO procedures. 'do' notation<br />
is finally translated to the statements passing "world" values like<br />
we manually written above and need only to simplify gluing of several<br />
IO actions together. You don't require to use 'do' for just one statement:<br />
<br />
<haskell><br />
main = do putStr "Hello!"<br />
</haskell><br />
<br />
is desugared just to:<br />
<br />
<haskell><br />
main = putStr "Hello!"<br />
</haskell><br />
<br />
But nevertheless it's a Good Style to use 'do' even for one statement<br />
because it simplifies adding new statements in the future.<br />
<br />
<br />
Let's examine how desugared 'do' with multiple statements on the<br />
following example: <br />
<br />
<haskell><br />
main = do putStr "What is your name?"<br />
putStr "How old are you?"<br />
putStr "Nice day!"<br />
</haskell><br />
<br />
'do' statement here just joins several IO actions that should be<br />
performed sequentially. It's translated to sequential applications<br />
of so named "binding operator", namely '>>':<br />
<br />
<haskell><br />
main = (putStr "What is your name?")<br />
>> ( (putStr "How old are you?")<br />
>> (putStr "Nice day!")<br />
)<br />
</haskell><br />
<br />
This binding operator just combines two IO actions, executing them<br />
sequentially by passing the "world" between them:<br />
<br />
<haskell><br />
(>>) :: IO a -> IO b -> IO b<br />
(action1 >> action2) world0 =<br />
let (a, world1) = action1 world0<br />
(b, world2) = action2 world1<br />
in (b, world2)<br />
</haskell><br />
<br />
If such way to define operator looks strange for you, read this<br />
definition as the following:<br />
<br />
<haskell><br />
action1 >> action2 = action<br />
where<br />
action world0 = let (a, world1) = action1 world0<br />
(b, world2) = action2 world1<br />
in (b, world2)<br />
</haskell><br />
<br />
Now you can substitute definition of '>>' at the places of it's usage<br />
and check that program constructed by 'do' desugaring is actually the<br />
same as we can write by manually manipulating "world" values.<br />
<br />
<br />
More complex example involves binding of variable using "<-":<br />
<br />
<haskell><br />
main = do a <- readLn<br />
print a<br />
</haskell><br />
<br />
This code is desugared into:<br />
<br />
<haskell><br />
main = readLn<br />
>>= (\a -> print a)<br />
</haskell><br />
<br />
As you should remember, '>>' binding operator silently ignores<br />
value of it's first action and returns as an overall result just<br />
result of second action. On the other side, '>>=' allows to use value<br />
of it's first action - it's passed as additional parameter to the second one!<br />
Look at the definition:<br />
<br />
<haskell><br />
(>>=) :: IO a -> (a->IO b) -> IO b<br />
(action1 >>= action2) world0 =<br />
let (a, world1) = action1 world0<br />
(b, world2) = action2 a world1<br />
in (b, world2)<br />
</haskell><br />
<br />
First, what means type of second action, namely "a->IO b"? By<br />
substituting the "IO" definition, we get "a -> RealWorld -> (b, RealWorld)".<br />
This means that second action actually has two parameters<br />
- of type 'a' actually used inside it, and of type RealWorld used for<br />
sequencing of IO actions. That's a destiny - any IO procedure has one<br />
more parameter comparing to that you see in it's type signature. This<br />
parameter is hidden inside the definition of type alias "IO".<br />
<br />
Second, you can use these '>>' and '>>=' operations to simplify your<br />
program. For example, in the code above we don't need to introduce the<br />
variable, because 'readLn' result can be send directly to 'print':<br />
<br />
<haskell><br />
main = readLn >>= print<br />
</haskell><br />
<br />
<br />
And third - as you see, the notation:<br />
<br />
<haskell><br />
do x <- action1<br />
action2<br />
</haskell><br />
<br />
where 'action1' has type "IO a" and 'action2' has type "IO b",<br />
translated into:<br />
<br />
<haskell><br />
action1 >>= (\x -> action2)<br />
</haskell><br />
<br />
where second argument of '>>=' has the type "a->IO b". It's the way<br />
how the "<-" binding processed - it just becomes parameter of<br />
subsequent operations represented as one large IO action. Look at the<br />
next example: <br />
<br />
<haskell><br />
main = do putStr "What is your name?"<br />
a <- readLn<br />
putStr "How old are you?"<br />
b <- readLn<br />
print (a,b)<br />
</haskell><br />
<br />
This code is desugared into:<br />
<br />
<haskell><br />
main = putStr "What is your name?"<br />
>> readLn<br />
>>= \a -> putStr "How old are you?"<br />
>> readLn<br />
>>= \b -> print (a,b)<br />
</haskell><br />
<br />
I omitted parentheses here, both '>>' and '>>=' operations are<br />
left-associative that leads to that 'a' and 'b' bindings introduced<br />
here is valid for all remaining actions. As an exercise, add the<br />
parentheses yourself and translate this procedure into the low-level<br />
code passing "world" values. I think it should be enough to finally<br />
realize how 'do' translation and binding operators work.<br />
<br />
<br />
Oh, no. I forgot third monadic operator - 'return'. It just<br />
combines it's two parameters - value passed and "world":<br />
<br />
<haskell><br />
return :: a -> IO a<br />
return a world0 = (a, world0)<br />
</haskell><br />
<br />
How about translating some simple example of 'return' usage? Say,<br />
<br />
<haskell><br />
main = do a <- readLn<br />
return (a*2)<br />
</haskell><br />
<br />
<br />
Programmers with imperative languages background often thinks that<br />
'return' in Haskell, like in other languages, immediately returns from<br />
the IO procedure. As you can see in its definition (and even just<br />
type!), such assumption is totally wrong. The only purpose of using<br />
'return' is to "lift" some value (of type 'a') into the result of<br />
whole action (of type "IO a") and therefore it should be used only as<br />
last executed statements of some IO sequence. For example try to<br />
translate the following procedure into the low-level code:<br />
<br />
<haskell><br />
main = do a <- readLn<br />
when (a>=0) $ do<br />
return ()<br />
print "a is negative"<br />
</haskell><br />
<br />
and you will realize that 'print' statement is executed anyway. If you<br />
need to escape from middle of IO procedure, you can use the 'if'<br />
statement:<br />
<br />
<haskell><br />
main = do a <- readLn<br />
if (a>=0)<br />
then return ()<br />
else print "a is negative"<br />
</haskell><br />
<br />
Moreover, Haskell layout rules allow us to use the following layout:<br />
<br />
<haskell><br />
main = do a <- readLn<br />
if (a>=0) then return ()<br />
else do<br />
print "a is negative"<br />
...<br />
</haskell><br />
<br />
that may be very useful for escaping from middle of longish 'do' statement.<br />
<br />
<br />
Last exercise: implement function 'liftM' that lifts operations on<br />
plain values to the operations on monadic ones. It's type signature:<br />
<br />
<haskell><br />
liftM :: (a->b) -> (IO a -> IO b)<br />
</haskell><br />
<br />
If it's too hard for you, start with the following high-level<br />
definition and rewrite it in low-level fashion:<br />
<br />
<haskell><br />
liftM f action = do x <- action<br />
return (f x)<br />
</haskell><br />
<br />
<br />
<br />
== Mutable data (references, arrays, hash tables...) ==<br />
<br />
As you should know, all names in Haskell are bind to one fixed value.<br />
This greatly simplify understanding of algorithms and optimization of<br />
code, but inappropriate for some cases. Yes, there a plenty of<br />
algorithms that is simpler to implement in terms of updatable<br />
variables, arrays and so on. This means that the value associated with<br />
variable, for example, can be different at different execution points,<br />
so reading it's value can't be considered as pure function. Imagine,<br />
for example the following code:<br />
<br />
<haskell><br />
main = do let a0 = readVariable varA<br />
_ = writeVariable varA 1<br />
a1 = readVariable varA<br />
print (a0,a1)<br />
</haskell><br />
<br />
Looks strange? First, two calls to 'readVariable' looks the same, so<br />
compiler can just reuse the value returned by first call. Second,<br />
result of 'writeVariable' call isn't used so compiler can (and will!)<br />
omit this call completely. To finish the picture, these 3 calls may be<br />
rearranged to any order because they looking independent on each<br />
other. What is the solution? You know - using of IO actions! IO<br />
actions guarantees us that:<br />
<br />
# execution order will be retained<br />
# each action will be mandatory executed<br />
# result of the "same" action (such as "readVariable varA") will not be reused<br />
<br />
So, the code above really should be written as:<br />
<br />
<haskell><br />
main = do varA <- newIORef 0 -- Create and initialize new variable<br />
a0 <- readIORef varA<br />
writeIORef varA 1<br />
a1 <- readIORef varA<br />
print (a0,a1)<br />
</haskell><br />
<br />
Here, 'varA' got type "IORef Int" which means "variable (reference) in<br />
IO monad holding value of type Int". newIORef creates new variable<br />
(reference) and returns it, and then read/write actions use this<br />
reference. Value returned by "readIORef varA" action may depend not<br />
only on variable involved but also on the moment of performing this<br />
operation so it can return different values on each call.<br />
<br />
Arrays, hash tables and any other _mutable_ data structures are<br />
defined in the same way - there is operation that creates new "mutable<br />
value" and returns reference to it. Then special read and write<br />
operations in IO monad are used. The following example shows example<br />
of using mutable array:<br />
<br />
<haskell><br />
import Data.Array.IO<br />
main = do arr <- newArray (1,10) 37 :: IO (IOArray Int Int)<br />
a <- readArray arr 1<br />
writeArray arr 1 64<br />
b <- readArray arr 1<br />
print (a,b)<br />
</haskell><br />
<br />
Here, array of 10 elements with 37 as initial values is created. After<br />
reading value of first element to 'a' this element's value is changed<br />
to 64 and then read again, to 'b'. As you can see by executing this<br />
code, 'a' will be set to 37 and 'b' to 64.<br />
<br />
<br />
<br />
Other state-dependent operations are also often implemented as IO<br />
actions. For example, random numbers generator should return different<br />
values on each call. It looks natural to give it IO-involving type:<br />
<br />
<haskell><br />
rand :: IO Int<br />
</haskell><br />
<br />
Moreover, when you import C routines you should be careful - if this<br />
routine is impure, i.e. it's result depends on something in "real<br />
world" (file system, memory contents...), internal state and so on,<br />
you should give it IO-involving type. Otherwise, compiler can<br />
"optimize" repetitive calls of this procedure with the same parameters! :)<br />
<br />
For example:<br />
<br />
<haskell><br />
foreign import ccall<br />
sin :: Double -> Double<br />
</haskell><br />
<br />
because 'sin' result depends only on it's argument, but<br />
<br />
<haskell><br />
foreign import ccall<br />
tell :: Int -> IO Int<br />
</haskell><br />
<br />
If you will declare 'tell' as pure function (without IO) then you may<br />
got the same position on each call! :)<br />
<br />
== IO actions as values ==<br />
<br />
Now you should precisely understand why it's impossible to use IO<br />
actions inside non-IO (pure) procedures. Such procedures just don't<br />
get a "baton", don't know any "world" value to pass to IO action.<br />
RealWorld is abstract datatype, so they also can't construct it's<br />
values by himself, and it's a strict type, so 'undefined' also can't<br />
be used. So, prohibition of using IO actions inside pure procedures is<br />
just a type trick as it is usual in Haskell :)<br />
<br />
But while pure code can't _execute_ IO actions, it can work with them<br />
as with any other functional values - they can be stored in data<br />
structures, passed as parameters and returned as results, collected in<br />
lists, and partially applied. But anyway IO action will remain<br />
functional value because we can't apply it to the last argument - of<br />
type RealWorld.<br />
<br />
In order to _execute_ the IO action we need to apply it to some<br />
RealWorld value that can be done only inside some IO procedure,<br />
in it's "actions chain". And real execution of this action will take<br />
place only when this procedure is called as part of process of<br />
"calculating final value of world" for 'main'. Look at this example:<br />
<br />
<haskell><br />
main = let get2chars = getChar >> getChar<br />
((), world1) = putStr "Press two keys" world0<br />
(answer, world2) = get2chars world1<br />
in ((), world2)<br />
</haskell><br />
<br />
Here we first bind value to 'get2chars' and then write binding<br />
involving 'putStr'. But what is an execution order? It is not defined<br />
by order of writing bindings, it is defined by order of processing<br />
"world" values! You can arbitrarily reorder binding statements - in<br />
any case execution order will be defined by dependence on passing<br />
"world" values. Let's see how this 'main' looks in the 'do' notation:<br />
<br />
<haskell><br />
main = do let get2chars = getChar >> getChar<br />
putStr "Press two keys"<br />
get2chars<br />
return ()<br />
</haskell><br />
<br />
As you can see, the 'let' binding that is not included in IO chain, is<br />
translated just to 'let' statement inside the 'do' sequence. And as<br />
you now should understand, placement of this 'let' don't has any<br />
impact on the evaluation order, which is defined by order of passing<br />
"world" values that is, in turn, defined by order of ordinal (non-let)<br />
statements inside 'do'!<br />
<br />
Moreover, IO actions like this 'get2chars' can't be executed just<br />
because they are functions with RealWorld parameter. To execute them,<br />
we should supply the RealWorld parameter, i.e. insert them in 'main'<br />
chain, placing them in some 'do' sequence executed from 'main'. Until<br />
that is done, they will be keep as any function, in partially<br />
evaluated form. And we can work with IO actions as with any other<br />
functions - bind them to names (like above), save them to data<br />
structures, pass as function parameters and return as results - and<br />
they will not be performed until you give them this magic RealWorld<br />
parameter!<br />
<br />
Let's try. How about defining list of IO actions?<br />
<br />
<haskell><br />
ioActions :: [IO ()]<br />
ioActions = [(print "Hello!"),<br />
(putStr "just kidding"),<br />
(getChar >> return ())<br />
]<br />
</haskell><br />
<br />
I used additional parentheses around each action, although they are<br />
not really required. If you still can't belive that these actions will<br />
not be executed until your command, just uncover this list type:<br />
<br />
<haskell><br />
ioActions :: [RealWorld -> ((), RealWorld)]<br />
</haskell><br />
<br />
Well, now we want to execute some of these actions. No problem, just<br />
insert them into the 'main' chain:<br />
<br />
<haskell><br />
main = do head ioActions<br />
ioActions !! 1<br />
last ioActions<br />
</haskell><br />
<br />
Looks strange, yeah? :) Really, any IO action you write in the 'do'<br />
statement (or use as parameter for '>>'/'>>=') is an expression<br />
returning result of type "IO a". Typically, you use some function that<br />
has type "x -> y -> ... -> IO a" and provide all these x, y and<br />
so on parameters. But you are not limited to this standard scenario -<br />
don't forget that Haskell is functional language and you are free to<br />
compute the functional value required (recall - "IO a" is a function<br />
type) in any possible way. Here we just extracted several functions<br />
from the list - no problem. This functional value can also be<br />
constructed on-the-fly, as we've done in previous example - it's also<br />
ok. Want to see this functional value passed as the parameter - heh,<br />
just look at the 'when' definition. Hey, we can sell, buy and rent<br />
these IO actions as any other functional values! For example, let's<br />
define function that executes all IO actions in the list:<br />
<br />
<haskell><br />
sequence_ :: [IO a] -> IO ()<br />
sequence_ [] = return ()<br />
sequence_ (x:xs) = do x<br />
sequence_ xs<br />
</haskell><br />
<br />
No black magic - we just extracts IO actions from the list and inserts<br />
them into chain of IO operations that should be performed to "compute<br />
final world value" of entire 'sequence_' call.<br />
<br />
With help of 'sequence_', we can rewrite our last 'main' as:<br />
<br />
<haskell><br />
main = sequence ioActions<br />
</haskell><br />
<br />
<br />
Haskell's ability to work with IO actions as with any other<br />
(functional or non-functional) value allows us to define control<br />
structures of any complexity. Try, for example, to define control<br />
structure that repeats the action until it returns the 'False' result:<br />
<br />
<haskell><br />
while :: IO Bool -> IO ()<br />
while action = ???<br />
</haskell><br />
<br />
<br />
How about returning IO action as the function result? Well, we done<br />
this each time we defined IO procedure - they all return IO action<br />
that need RealWorld value to be performed. While we most times just<br />
executed them in chain of higher-level IO procedure, it's also<br />
possible to just collect them without actual execution:<br />
<br />
<haskell><br />
main = do let a = sequence ioActions<br />
b = when True getChar<br />
c = getChar >> getChar<br />
putStr "'let' statements are not executed!"<br />
</haskell><br />
<br />
These assigned IO procedures can be used as parameters to other<br />
procedures, or written to global variables, or processed in some other<br />
way, or just executed later, as we done in example with 'get2chars'.<br />
<br />
But how about returning from IO procedure a parameterized IO action?<br />
Let's define a procedure that returns i'th byte from file represented<br />
as Handle:<br />
<br />
<haskell><br />
readi h i = do hSeek h i AbsoluteSeek<br />
hGetChar h<br />
</haskell><br />
<br />
So bad so good. But how about procedure that returns i'th byte of file<br />
with given name without reopening it each time?<br />
<br />
<haskell><br />
readfilei :: String -> IO (Integer -> IO Char)<br />
readfilei name = do h <- openFile name ReadMode<br />
return (readi h)<br />
</haskell><br />
<br />
As you can see, it's an IO procedure that opens file and returns...<br />
another IO procedure that will read byte specified. But we can go<br />
further and include 'readi' body into 'readfilei':<br />
<br />
<haskell><br />
readfilei name = do h <- openFile name ReadMode<br />
let readi h i = do hSeek h i AbsoluteSeek<br />
hGetChar h<br />
return (readi h)<br />
</haskell><br />
<br />
Good? May be better. Why we add 'h' as 'readi' parameter if it can be<br />
got from the environment where 'readi' now defined? Shorter will be:<br />
<br />
<haskell><br />
readfilei name = do h <- openFile name ReadMode<br />
let readi i = do hSeek h i AbsoluteSeek<br />
hGetChar h<br />
return readi<br />
</haskell><br />
<br />
What we've done here? We've build parameterized IO action involving local<br />
names inside 'readfilei' and returned it as the result. Now it can be<br />
used in following way:<br />
<br />
<haskell><br />
main = do myfile <- readfilei "test"<br />
a <- myfile 0<br />
b <- myfile 1<br />
print (a,b)<br />
</haskell><br />
<br />
<br />
Such usage of IO actions is very typical for Haskell programs - you<br />
just construct one or more (using tuple) IO actions that your need,<br />
with and/or without parameters, involving the parameters that your<br />
"constructor" received, and return them to caller. Then these IO actions<br />
can be used in rest of program without any knowledge about your<br />
internal implementation strategies. Actually, this is used to<br />
partially emulate OOP (to be exact, ADT) programming ideology.<br />
<br />
<br />
For example, one of my program's modules is the memory suballocator. It<br />
receives address and size of large memory block and returns two<br />
procedures - one to allocate subblock of given size and second to<br />
return allocated block back:<br />
<br />
<haskell><br />
memoryAllocator :: Ptr a -> Int -> IO (Int -> IO (Ptr b),<br />
Ptr c -> IO ())<br />
<br />
memoryAllocator buf size = do ......<br />
let alloc size = do ...<br />
...<br />
free ptr = do ...<br />
...<br />
return (alloc, free)<br />
</haskell><br />
<br />
How this is implemented? 'alloc' and 'free' works with references<br />
created inside this procedure. Because creation of these references is<br />
a part of 'memoryAllocator' IO actions chain, new independent set of<br />
references will be created for each memory block for which<br />
'memoryAllocator' is called:<br />
<br />
<haskell><br />
memoryAllocator buf size = do start <- newIORef buf<br />
end <- newIORef (buf `plusPtr` size)<br />
...<br />
</haskell><br />
<br />
These two references (we will implement very simple memory allocator) are<br />
read and written in 'alloc' and 'free' definitions:<br />
<br />
<haskell><br />
let alloc size = do addr <- readIORef start<br />
writeIORef start (addr `plusPtr` size)<br />
return addr<br />
<br />
let free ptr = do writeIORef start ptr<br />
</haskell><br />
<br />
What we've defined here is just a pair of closures that is using state<br />
available on the moment of their definition. As you can see, it's as<br />
easy as in any other functional language, despite the Haskell's lack<br />
of direct support for non-pure functions.<br />
<br />
<br />
<br />
== unsafePerformIO and unsafeInterleaveIO ==<br />
<br />
Programmers with imperative background often still looks for a ways to<br />
execute IO actions inside the pure procedures. But that this means?<br />
Imagine that you try to write procedure that reads contents of file<br />
with given name:<br />
<br />
<haskell><br />
readContents :: Filename -> String<br />
</haskell><br />
<br />
Defining it as pure function will simplify the code that use it, i<br />
agree. But this creates troubles for the compiler:<br />
<br />
- first, this call is not inserted in sequence of "world<br />
transformations", so compiler don't get a hint - at what exact moment<br />
you want to execute this action. For example, if file contents is one<br />
at the program start and another at the end - what contents you want<br />
to see? Moment of "consumption" of this value don't make strong<br />
guarantees for execution order, because Haskell see all the functions<br />
as pure and fell free to reorder their execution as needed.<br />
<br />
- second, attempts to read contents of file with the same name can be<br />
factorized despite the fact that file (or current directory) can be<br />
changed between calls. Again, Haskell looks at all the functions as<br />
pure ones and feel free to omit excessive calls with the same<br />
parameters.<br />
<br />
So, implementing functions that interacts with Real World as pure ones<br />
considered as a Bad Behavior. Good boys never do it ;)<br />
<br />
<br />
Nevertheless, there are (semi-official) ways to use IO actions inside<br />
of pure functions. As you should remember this is prohibited by<br />
requiring "baton" to call IO action. Pure function don't have the baton,<br />
but there is special procedure, that procures this baton from nowhere,<br />
uses it to call IO action and then throws resulting "world" away!<br />
A little low-level magic :) This very special procedure is:<br />
<br />
<haskell><br />
unsafePerformIO :: IO a -> a<br />
</haskell><br />
<br />
Let's look at it's (possible) definition:<br />
<br />
<haskell><br />
unsafePerformIO :: (RealWorld -> (a,RealWorld)) -> a<br />
unsafePerformIO action = let (a,world1) = action createNewWorld<br />
in a<br />
</haskell><br />
<br />
where 'createNewWorld' is internal function producing new value of<br />
RealWorld type.<br />
<br />
Using unsafePerformIO, you can easily write pure functions that does<br />
I/O inside. But don't do this without real need, and remember to<br />
follow this rule: compiler don't know that you are cheating, it still<br />
consider each non-IO function as pure one. Therefore, all the usual<br />
optimization rules can (and will!) be applied to it's execution. So<br />
you must ensure that:<br />
<br />
1) Result of each call depends only on it's arguments<br />
<br />
2) You don't rely on side-effects of this function, which may be not<br />
executed if it's results are not used<br />
<br />
<br />
Let's investigate this problem deeper. Function evaluation in Haskell<br />
are ruled by value's necessity - computed only the values that really<br />
required to calculate final result. But that this means according to<br />
'main' function? To "calculate final world's" value, it's required to<br />
perform all the intermediate IO actions that included in 'main' chain.<br />
By using 'unsafePerformIO' we call IO actions outside of this chain.<br />
What can guarantee that they will be run? Nothing. The only case when<br />
they will be run is if that is required to compute overall function<br />
result (that in turn should be required to perform some action in<br />
'main' chain). Here we return to the Haskell-natural<br />
evaluation-on-value-need. Now you should clearly see the difference:<br />
<br />
- IO action inside IO procedure guaranteed to execute as long as<br />
it is inside 'main' chain - even when it's result is not used.<br />
You directly specify order of action's execution inside IO procedure.<br />
Data dependencies are simulated via "world" values.<br />
<br />
- IO action inside 'unsafePerformIO' will be performed only if<br />
result of this operation is really used. Evaluation order is not<br />
guaranteed and you should not rely on it (except when you sure about<br />
data dependency).<br />
<br />
<br />
I should also say that inside 'unsafePerformIO' call you can organize<br />
small internal chain of IO actions with help of the same binding<br />
operators and/or 'do' sugar:<br />
<br />
<haskell><br />
one = unsafePerformIO $ do var <- newIORef 0<br />
writeIORef var 1<br />
readIORef var<br />
</haskell><br />
<br />
and in this case ALL the operations in this chain will be performed as<br />
long as 'unsafePerformIO' result will be demanded. To ensure this,<br />
the actual 'unsafePerformIO' implementation evaluates "world" returned<br />
by the 'action':<br />
<br />
<haskell><br />
unsafePerformIO action = let (a,world1) = action createNewWorld<br />
in (world1 `seq` a)<br />
</haskell><br />
<br />
('seq' operation strictly evaluates it's first argument before<br />
returning the value of second one)<br />
<br />
<br />
<br />
But there is even more strange operation - 'unsafeInterleaveIO' that<br />
gets "official baton", makes it's piratical copy, and then run's<br />
"illegal" relay-race in parallel with main one! I can't further say<br />
about it's behavior without grief and indignation, it's not surprise<br />
that this operation is widely used in such software-piratical<br />
countries as Russia and China! ;) Don't even ask me - i will say<br />
nothing about this dirty trick i using permanently ;)<br />
<br />
<br />
<br />
== fixIO and 'mdo' ==<br />
<br />
== ST monad ==<br />
<br />
== Q monad ==<br />
<br />
== Welcome to machine: actual [[GHC]] implementation ==<br />
<br />
A little disclaimer: after all, i should say that i don't described<br />
here what is a monad (i even don't know it myself) and what my<br />
explanations shows only the one _possible_ way to implement them in<br />
Haskell. For example, hbc Haskell compiler implements monads via<br />
continuations. I also don't said anything about exception handling<br />
that is natural part of "monad" concept. You can read "All about<br />
monads" guide to learn more on these topics.<br />
<br />
But there are a good news: first, monad understanding you've build<br />
will work with any implementation. You just can't work with RealWorld<br />
values directly.<br />
<br />
Second, IO monad implementation described here is really used in GHC,<br />
Hugs (nhc/jhc, too?) compilers. It is the really real IO definition<br />
from GHC sources:<br />
<br />
<haskell><br />
newtype IO a = IO (State# RealWorld -> (# State# RealWorld, a #))<br />
</haskell><br />
<br />
It uses "State# RealWorld" type instead of our RealWorld, it uses "(# #)"<br />
strict tuple for optimization, and it adds IO data constructor<br />
around the type. Nevertheless, there are no principal changes. Knowing<br />
the principle of "chaining" IO actions via fake "state of world"<br />
values, now you can easily understand and write low-level<br />
implementations of GHC I/O operations.<br />
<br />
<br />
=== The [[Yhc]]/nhc98 implementation ===<br />
<br />
<haskell><br />
data World = World<br />
newtype IO a = IO (World -> Either IOError a)<br />
</haskell><br />
<br />
This implementation makes the "World" disappear somewhat, and returns Either a<br />
result "a", or if an error occurs then "IOError". The lack of the World on the<br />
right hand side of the function can only be done because the compiler knows<br />
special things about the IO type, and will not over optimise it.<br />
<br />
<br />
== Further reading ==<br />
<br />
Look at the [[Books and tutorials#Using_Monads]] page<br />
<br />
Are you have more questions? Ask in the haskell-cafe.</div>ThiagoArraishttps://wiki.haskell.org/IO_insideIO inside2006-06-30T16:24:28Z<p>ThiagoArrais: </p>
<hr />
<div>Haskell I/O always was a source of confusion and surprises for new<br />
Haskellers. While simple I/O code in Haskell looks very similar to<br />
it's equivalents in imperative languages, our attempts to write<br />
somewhat more complex often ended with total head mess. That is<br />
because Haskell I/O is really very different internally. Haskell is a<br />
pure language and even I/O system don't break this law.<br />
<br />
The following text is an attempt to explain details of Haskell I/O<br />
implementation that should help you eventually master all the smart<br />
I/O tricks. Moreover, i added detailed explanation of various traps<br />
you can encounter on this way. After reading this text, you will get a<br />
degree "A Master of Haskell I/O" that is equal to Bachelor in CS and<br />
Mathematics, simultaneously :)<br />
<br />
If you are new to Haskell I/O you may prefer to start with reading [[Introduction to IO]] page<br />
<br />
<br />
== Haskell is pure language ==<br />
<br />
Haskell is pure language, which means that result of any function call<br />
is fully determined by its arguments. Pseudo-functions like rand() or<br />
getchar() in C which returns different results on each call, are just<br />
impossible and prohibited by language rules. Moreover, Haskell<br />
functions can't have side effects, i.e. they cannot make any changes in the "real<br />
world" - change files, write to the screen, print, send data over the network,<br />
and so on. These two restrictions together mean that any function<br />
call can be omitted, repeated, or replaced by the result of a previous call with the<br />
same parameters, and the language _guarantees_ that all<br />
these rearrangements will not change program result!<br />
<br />
Let's compare this to C - compilers for this language just try to guess<br />
that function don't have side effects and it's result don't depends on<br />
some global variables. If this guess is wrong - the whole optimization<br />
becomes incorrect! As a consequence, C optimizers are enough<br />
conservative in their guesses and/or require from programmer to give<br />
them hints about usage (not meaning!) of functions and variables<br />
<br />
Comparing to them, Haskell compiler is a set of pure mathematical<br />
transformations that can't be wrong by definition - they just<br />
translate one abstract data processing algorithm (i.e. some complex<br />
function) to another equivalent algorithm, just with better<br />
performance. This results in much better high-level optimization<br />
facilities comparing to C compilers<br />
<br />
But this purity creates it's own problems. How we can do I/O, work<br />
with stateful algorithms and side effects in pure language? This<br />
question had many different solutions probed in 18 years of Haskell<br />
existence and finally one based on using monads was widely accepted<br />
<br />
<br />
<br />
== What is the monad? ==<br />
<br />
What is the monad? It's something from mathematical category theory, i<br />
don't know anymore :) In order to understand how monads are used to<br />
solve problem of I/O and side effects, you don't need to know it. It's<br />
enough to just know elementary mathematics, like I do :)<br />
<br />
Let's imagine that we want to implement in Haskell well-known<br />
'getchar' function. What the type it should have? Let's try:<br />
<br />
<haskell><br />
getchar :: Char<br />
<br />
get2chars = [getchar,getchar]<br />
</haskell><br />
<br />
What we will got with 'getchar' having just 'Char' type? You can see<br />
all the possible problems in 'get2chars' definition:<br />
<br />
1) because Haskell compiler treats all functions as pure and not<br />
having side effects, it can avoid "excessive" call to 'getchar' and<br />
use one returned value two times<br />
<br />
2) even if it will make two calls, there is no any clue to determine<br />
which call should be performed first. Are you want to return chars in<br />
the order they read, or in opposite order? Nothing in 'get2chars'<br />
definition answers this question.<br />
<br />
<br />
How these problems can be solved, from plain programmer's viewpoint?<br />
Let's introduce fake parameter of 'getchar' to make each call<br />
"different" from compiler's POV:<br />
<br />
<haskell><br />
getchar :: Int -> Char<br />
<br />
get2chars = [getchar 1, getchar 2]<br />
</haskell><br />
<br />
This right away solved the first problem mentioned above - now<br />
compiler will make two calls because it sees them as having different<br />
parameters. The whole 'get2chars' function should also had such<br />
fake parameter, otherwise we will have the same problem calling it:<br />
<br />
<haskell><br />
getchar :: Int -> Char<br />
get2chars :: Int -> String<br />
<br />
get2chars _ = [getchar 1, getchar 2]<br />
</haskell><br />
<br />
<br />
Now, we need to give compiler some clue to determine which function it<br />
should call first. Haskell language don't provide any ways to express<br />
order of evaluation... except for data dependencies! How about adding<br />
artificial data dependency which prevents evaluation of second<br />
'getchar' before the first one? In order to achieve this, we will<br />
return from 'getchar' additional fake result that will be used as<br />
parameter for next 'getchar' call:<br />
<br />
<haskell><br />
getchar :: Int -> (Char, Int)<br />
<br />
get2chars _ = [a,b] where (a,i) = getchar 1<br />
(b,_) = getchar i<br />
</haskell><br />
<br />
So bad so good - now we can guarantee that 'a' is read before 'b'<br />
because 'b' reading need value (i) that is returned by 'a' reading!<br />
<br />
We've added fake parameter to 'get2chars' but the problem is what<br />
Haskell compiler is too smart! It can believe that external 'getchar'<br />
function is really dependent on it's parameter but for 'get2chars' it<br />
will see that we just cheating and throw it away! Problem? How about<br />
passing this fake parameter to 'getchar' function?! In this case<br />
compiler can't guess that it really unused :)<br />
<br />
<haskell><br />
get2chars i0 = [a,b] where (a,i1) = getchar i0<br />
(b,i2) = getchar i1<br />
</haskell><br />
<br />
<br />
And more - 'get2chars' has all the same purity problems as 'getchar'<br />
function. If one need to call it two times, he need a way to describe<br />
order of these calls. Look at:<br />
<br />
<haskell><br />
get4chars = [get2chars 1, get2chars 2] -- order of `get2chars` calls isn't defined<br />
</haskell><br />
<br />
We already know how to fight with such problem - 'get2chars' should<br />
also return some fake value that can be used to order calls:<br />
<br />
<haskell><br />
get2chars :: Int -> (String, Int)<br />
<br />
get4chars i0 = (a++b) where (a,i1) = get2chars i0<br />
(b,i2) = get2chars i1<br />
</haskell><br />
<br />
<br />
But what the fake value it would return? If we will use some integer<br />
constant, too smart Haskell compiler will guess we are cheating, again :)<br />
What about returning the value returned by 'getchar'? See:<br />
<br />
<haskell><br />
get2chars :: Int -> (String, Int)<br />
get2chars i0 = ([a,b], i2) where (a,i1) = getchar i0<br />
(b,i2) = getchar i1<br />
</haskell><br />
<br />
Believe you or not, but we just constructed the whole "monadic"<br />
Haskell I/O system.<br />
<br />
<br />
<br />
== Welcome to RealWorld, baby :) ==<br />
<br />
The 'main' Haskell function has the type:<br />
<br />
<haskell><br />
main :: RealWorld -> ((), RealWorld)<br />
</haskell><br />
<br />
where 'RealWorld' is faking type used instead of our Int. It is something<br />
like baton passed in relay-race. When 'main' calls some IO function,<br />
it pass the "RealWorld" it received as parameter. All IO functions has<br />
similar types involving RealWorld as parameter and result. To be<br />
exact, "IO" is a type synonym defined in the following way:<br />
<br />
<haskell><br />
type IO a = RealWorld -> (a, RealWorld)<br />
</haskell><br />
<br />
so. 'main' just has type "IO ()", 'getChar' has type "IO Char" and so<br />
on. Let's look at 'main' calling 'getChar' two times:<br />
<br />
<haskell><br />
getChar :: RealWorld -> (Char, RealWorld)<br />
<br />
main :: RealWorld -> ((), RealWorld)<br />
main world0 = let (a, world1) = getChar world0<br />
(b, world2) = getChar world1<br />
in ((), world2)<br />
</haskell><br />
<br />
<br />
Look at this closely: 'main' passes to first 'getChar' the "world" it<br />
received. This 'getChar' returns some new value of type RealWorld,<br />
that is used in next call. Finally, 'main' returns the "world" it got<br />
from the second 'getChar':<br />
<br />
1) Is it possible here to omit any call of 'getChar' if the char<br />
it read is not used? No, because we should return the "world" that is<br />
result of second 'getChar' and in turn requires "world" from first 'getChar'.<br />
<br />
2) Is it possible to reorder 'getChar' calls? No, second 'getChar'<br />
can't be called before first one because it uses "world" it returns.<br />
<br />
3) Is it possible to duplicate calls? In Haskell semantics - yes, but<br />
real compilers never duplicate work in such simple cases (otherwise,<br />
the programs generated will not have any speed guarantees)<br />
<br />
<br />
As we already said, RealWorld values used like baton, passing them<br />
between all routines called by 'main' in strict order. Inside each<br />
routine called, RealWorld values used in the same way. In whole, in<br />
order to "compute" world to be returned from 'main', we should perform<br />
each IO procedure that is called from 'main', directly or indirectly.<br />
This means that each procedure inserted in the chain will be performed<br />
just at the moment (relative to other IO actions) when we planned it<br />
to be called. Let consider the following program:<br />
<br />
<haskell><br />
main = do a <- ask "What is your name?"<br />
b <- ask "How old are you?"<br />
return ()<br />
<br />
ask s = do putStr s<br />
readLn<br />
</haskell><br />
<br />
Now you have enough knowledge to rewrite it in low-level way and<br />
check that each operation what should be performed, will be really<br />
performed with arguments it should have and in order we expecting<br />
<br />
<br />
But what about conditional execution? No problem. Let's define the<br />
well-known 'when' operation:<br />
<br />
<haskell><br />
when :: Bool -> IO () -> IO ()<br />
when condition action world =<br />
if condition<br />
then action world<br />
else ((), world)<br />
</haskell><br />
<br />
As you can see, we can easily include or exclude from execution chain<br />
IO procedures (actions) depending on the data values. If 'condition'<br />
will be False on call of 'when', 'action' will never be called because<br />
real Haskell compilers, again, never calls functions whose results<br />
don't required to calculate final result (i.e., here, final "world" value<br />
of 'main')<br />
<br />
Loops and any more complex control structures can be implemented in<br />
the same way. Try it as an exercise!<br />
<br />
<br />
Finally you may want to know how much costs this passing of RealWorld<br />
values all around. It's free! These fake values exist for compiler<br />
only while it analyze and optimize code, but when it goes to assembler<br />
code generation, it "suddenly" realize that this type is like "()", so<br />
all these parameters and result values can be omitted from generated code.<br />
Is it not really beautiful? :)<br />
<br />
<br />
<br />
== '>>=' and 'do' notation ==<br />
<br />
All beginners (including me :) start by thinking that 'do' is some<br />
magic statement that executes IO actions. It's wrong - 'do' is just a<br />
syntax sugar that simplifies writing of IO procedures. 'do' notation<br />
is finally translated to the statements passing "world" values like<br />
we manually written above and need only to simplify gluing of several<br />
IO actions together. You don't require to use 'do' for just one statement:<br />
<br />
<haskell><br />
main = do putStr "Hello!"<br />
</haskell><br />
<br />
is desugared just to:<br />
<br />
<haskell><br />
main = putStr "Hello!"<br />
</haskell><br />
<br />
But nevertheless it's a Good Style to use 'do' even for one statement<br />
because it simplifies adding new statements in the future.<br />
<br />
<br />
Let's examine how desugared 'do' with multiple statements on the<br />
following example: <br />
<br />
<haskell><br />
main = do putStr "What is your name?"<br />
putStr "How old are you?"<br />
putStr "Nice day!"<br />
</haskell><br />
<br />
'do' statement here just joins several IO actions that should be<br />
performed sequentially. It's translated to sequential applications<br />
of so named "binding operator", namely '>>':<br />
<br />
<haskell><br />
main = (putStr "What is your name?")<br />
>> ( (putStr "How old are you?")<br />
>> (putStr "Nice day!")<br />
)<br />
</haskell><br />
<br />
This binding operator just combines two IO actions, executing them<br />
sequentially by passing the "world" between them:<br />
<br />
<haskell><br />
(>>) :: IO a -> IO b -> IO b<br />
(action1 >> action2) world0 =<br />
let (a, world1) = action1 world0<br />
(b, world2) = action2 world1<br />
in (b, world2)<br />
</haskell><br />
<br />
If such way to define operator looks strange for you, read this<br />
definition as the following:<br />
<br />
<haskell><br />
action1 >> action2 = action<br />
where<br />
action world0 = let (a, world1) = action1 world0<br />
(b, world2) = action2 world1<br />
in (b, world2)<br />
</haskell><br />
<br />
Now you can substitute definition of '>>' at the places of it's usage<br />
and check that program constructed by 'do' desugaring is actually the<br />
same as we can write by manually manipulating "world" values.<br />
<br />
<br />
More complex example involves binding of variable using "<-":<br />
<br />
<haskell><br />
main = do a <- readLn<br />
print a<br />
</haskell><br />
<br />
This code is desugared into:<br />
<br />
<haskell><br />
main = readLn<br />
>>= (\a -> print a)<br />
</haskell><br />
<br />
As you should remember, '>>' binding operator silently ignores<br />
value of it's first action and returns as an overall result just<br />
result of second action. On the other side, '>>=' allows to use value<br />
of it's first action - it's passed as additional parameter to the second one!<br />
Look at the definition:<br />
<br />
<haskell><br />
(>>=) :: IO a -> (a->IO b) -> IO b<br />
(action1 >>= action2) world0 =<br />
let (a, world1) = action1 world0<br />
(b, world2) = action2 a world1<br />
in (b, world2)<br />
</haskell><br />
<br />
First, what means type of second action, namely "a->IO b"? By<br />
substituting the "IO" definition, we get "a -> RealWorld -> (b, RealWorld)".<br />
This means that second action actually has two parameters<br />
- of type 'a' actually used inside it, and of type RealWorld used for<br />
sequencing of IO actions. That's a destiny - any IO procedure has one<br />
more parameter comparing to that you see in it's type signature. This<br />
parameter is hidden inside the definition of type alias "IO".<br />
<br />
Second, you can use these '>>' and '>>=' operations to simplify your<br />
program. For example, in the code above we don't need to introduce the<br />
variable, because 'readLn' result can be send directly to 'print':<br />
<br />
<haskell><br />
main = readLn >>= print<br />
</haskell><br />
<br />
<br />
And third - as you see, the notation:<br />
<br />
<haskell><br />
do x <- action1<br />
action2<br />
</haskell><br />
<br />
where 'action1' has type "IO a" and 'action2' has type "IO b",<br />
translated into:<br />
<br />
<haskell><br />
action1 >>= (\x -> action2)<br />
</haskell><br />
<br />
where second argument of '>>=' has the type "a->IO b". It's the way<br />
how the "<-" binding processed - it just becomes parameter of<br />
subsequent operations represented as one large IO action. Look at the<br />
next example: <br />
<br />
<haskell><br />
main = do putStr "What is your name?"<br />
a <- readLn<br />
putStr "How old are you?"<br />
b <- readLn<br />
print (a,b)<br />
</haskell><br />
<br />
This code is desugared into:<br />
<br />
<haskell><br />
main = putStr "What is your name?"<br />
>> readLn<br />
>>= \a -> putStr "How old are you?"<br />
>> readLn<br />
>>= \b -> print (a,b)<br />
</haskell><br />
<br />
I omitted parentheses here, both '>>' and '>>=' operations are<br />
left-associative that leads to that 'a' and 'b' bindings introduced<br />
here is valid for all remaining actions. As an exercise, add the<br />
parentheses yourself and translate this procedure into the low-level<br />
code passing "world" values. I think it should be enough to finally<br />
realize how 'do' translation and binding operators work.<br />
<br />
<br />
Oh, no. I forgot third monadic operator - 'return'. It just<br />
combines it's two parameters - value passed and "world":<br />
<br />
<haskell><br />
return :: a -> IO a<br />
return a world0 = (a, world0)<br />
</haskell><br />
<br />
How about translating some simple example of 'return' usage? Say,<br />
<br />
<haskell><br />
main = do a <- readLn<br />
return (a*2)<br />
</haskell><br />
<br />
<br />
Programmers with imperative languages background often thinks that<br />
'return' in Haskell, like in other languages, immediately returns from<br />
the IO procedure. As you can see in its definition (and even just<br />
type!), such assumption is totally wrong. The only purpose of using<br />
'return' is to "lift" some value (of type 'a') into the result of<br />
whole action (of type "IO a") and therefore it should be used only as<br />
last executed statements of some IO sequence. For example try to<br />
translate the following procedure into the low-level code:<br />
<br />
<haskell><br />
main = do a <- readLn<br />
when (a>=0) $ do<br />
return ()<br />
print "a is negative"<br />
</haskell><br />
<br />
and you will realize that 'print' statement is executed anyway. If you<br />
need to escape from middle of IO procedure, you can use the 'if'<br />
statement:<br />
<br />
<haskell><br />
main = do a <- readLn<br />
if (a>=0)<br />
then return ()<br />
else print "a is negative"<br />
</haskell><br />
<br />
Moreover, Haskell layout rules allow us to use the following layout:<br />
<br />
<haskell><br />
main = do a <- readLn<br />
if (a>=0) then return ()<br />
else do<br />
print "a is negative"<br />
...<br />
</haskell><br />
<br />
that may be very useful for escaping from middle of longish 'do' statement.<br />
<br />
<br />
Last exercise: implement function 'liftM' that lifts operations on<br />
plain values to the operations on monadic ones. It's type signature:<br />
<br />
<haskell><br />
liftM :: (a->b) -> (IO a -> IO b)<br />
</haskell><br />
<br />
If it's too hard for you, start with the following high-level<br />
definition and rewrite it in low-level fashion:<br />
<br />
<haskell><br />
liftM f action = do x <- action<br />
return (f x)<br />
</haskell><br />
<br />
<br />
<br />
== Mutable data (references, arrays, hash tables...) ==<br />
<br />
As you should know, all names in Haskell are bind to one fixed value.<br />
This greatly simplify understanding of algorithms and optimization of<br />
code, but inappropriate for some cases. Yes, there a plenty of<br />
algorithms that is simpler to implement in terms of updatable<br />
variables, arrays and so on. This means that the value associated with<br />
variable, for example, can be different at different execution points,<br />
so reading it's value can't be considered as pure function. Imagine,<br />
for example the following code:<br />
<br />
<haskell><br />
main = do let a0 = readVariable varA<br />
_ = writeVariable varA 1<br />
a1 = readVariable varA<br />
print (a0,a1)<br />
</haskell><br />
<br />
Looks strange? First, two calls to 'readVariable' looks the same, so<br />
compiler can just reuse the value returned by first call. Second,<br />
result of 'writeVariable' call isn't used so compiler can (and will!)<br />
omit this call completely. To finish the picture, these 3 calls may be<br />
rearranged to any order because they looking independent on each<br />
other. What is the solution? You know - using of IO actions! IO<br />
actions guarantees us that:<br />
<br />
# execution order will be retained<br />
# each action will be mandatory executed<br />
# result of the "same" action (such as "readVariable varA") will not be reused<br />
<br />
So, the code above really should be written as:<br />
<br />
<haskell><br />
main = do varA <- newIORef 0 -- Create and initialize new variable<br />
a0 <- readIORef varA<br />
writeIORef varA 1<br />
a1 <- readIORef varA<br />
print (a0,a1)<br />
</haskell><br />
<br />
Here, 'varA' got type "IORef Int" which means "variable (reference) in<br />
IO monad holding value of type Int". newIORef creates new variable<br />
(reference) and returns it, and then read/write actions use this<br />
reference. Value returned by "readIORef varA" action may depend not<br />
only on variable involved but also on the moment of performing this<br />
operation so it can return different values on each call.<br />
<br />
Arrays, hash tables and any other _mutable_ data structures are<br />
defined in the same way - there is operation that creates new "mutable<br />
value" and returns reference to it. Then special read and write<br />
operations in IO monad are used. The following example shows example<br />
of using mutable array:<br />
<br />
<haskell><br />
import Data.Array.IO<br />
main = do arr <- newArray (1,10) 37 :: IO (IOArray Int Int)<br />
a <- readArray arr 1<br />
writeArray arr 1 64<br />
b <- readArray arr 1<br />
print (a,b)<br />
</haskell><br />
<br />
Here, array of 10 elements with 37 as initial values is created. After<br />
reading value of first element to 'a' this element's value is changed<br />
to 64 and then read again, to 'b'. As you can see by executing this<br />
code, 'a' will be set to 37 and 'b' to 64.<br />
<br />
<br />
<br />
Other state-dependent operations are also often implemented as IO<br />
actions. For example, random numbers generator should return different<br />
values on each call. It looks natural to give it IO-involving type:<br />
<br />
<haskell><br />
rand :: IO Int<br />
</haskell><br />
<br />
Moreover, when you import C routines you should be careful - if this<br />
routine is impure, i.e. it's result depends on something in "real<br />
world" (file system, memory contents...), internal state and so on,<br />
you should give it IO-involving type. Otherwise, compiler can<br />
"optimize" repetitive calls of this procedure with the same parameters! :)<br />
<br />
For example:<br />
<br />
<haskell><br />
foreign import ccall<br />
sin :: Double -> Double<br />
</haskell><br />
<br />
because 'sin' result depends only on it's argument, but<br />
<br />
<haskell><br />
foreign import ccall<br />
tell :: Int -> IO Int<br />
</haskell><br />
<br />
If you will declare 'tell' as pure function (without IO) then you may<br />
got the same position on each call! :)<br />
<br />
== IO actions as values ==<br />
<br />
Now you should precisely understand why it's impossible to use IO<br />
actions inside non-IO (pure) procedures. Such procedures just don't<br />
get a "baton", don't know any "world" value to pass to IO action.<br />
RealWorld is abstract datatype, so they also can't construct it's<br />
values by himself, and it's a strict type, so 'undefined' also can't<br />
be used. So, prohibition of using IO actions inside pure procedures is<br />
just a type trick as it is usual in Haskell :)<br />
<br />
But while pure code can't _execute_ IO actions, it can work with them<br />
as with any other functional values - they can be stored in data<br />
structures, passed as parameters and returned as results, collected in<br />
lists, and partially applied. But anyway IO action will remain<br />
functional value because we can't apply it to the last argument - of<br />
type RealWorld.<br />
<br />
In order to _execute_ the IO action we need to apply it to some<br />
RealWorld value that can be done only inside some IO procedure,<br />
in it's "actions chain". And real execution of this action will take<br />
place only when this procedure is called as part of process of<br />
"calculating final value of world" for 'main'. Look at this example:<br />
<br />
<haskell><br />
main = let get2chars = getChar >> getChar<br />
((), world1) = putStr "Press two keys" world0<br />
(answer, world2) = get2chars world1<br />
in ((), world2)<br />
</haskell><br />
<br />
Here we first bind value to 'get2chars' and then write binding<br />
involving 'putStr'. But what is an execution order? It is not defined<br />
by order of writing bindings, it is defined by order of processing<br />
"world" values! You can arbitrarily reorder binding statements - in<br />
any case execution order will be defined by dependence on passing<br />
"world" values. Let's see how this 'main' looks in the 'do' notation:<br />
<br />
<haskell><br />
main = do let get2chars = getChar >> getChar<br />
putStr "Press two keys"<br />
get2chars<br />
return ()<br />
</haskell><br />
<br />
As you can see, the 'let' binding that is not included in IO chain, is<br />
translated just to 'let' statement inside the 'do' sequence. And as<br />
you now should understand, placement of this 'let' don't has any<br />
impact on the evaluation order, which is defined by order of passing<br />
"world" values that is, in turn, defined by order of ordinal (non-let)<br />
statements inside 'do'!<br />
<br />
Moreover, IO actions like this 'get2chars' can't be executed just<br />
because they are functions with RealWorld parameter. To execute them,<br />
we should supply the RealWorld parameter, i.e. insert them in 'main'<br />
chain, placing them in some 'do' sequence executed from 'main'. Until<br />
that is done, they will be keep as any function, in partially<br />
evaluated form. And we can work with IO actions as with any other<br />
functions - bind them to names (like above), save them to data<br />
structures, pass as function parameters and return as results - and<br />
they will not be performed until you give them this magic RealWorld<br />
parameter!<br />
<br />
Let's try. How about defining list of IO actions?<br />
<br />
<haskell><br />
ioActions :: [IO ()]<br />
ioActions = [(print "Hello!"),<br />
(putStr "just kidding"),<br />
(getChar >> return ())<br />
]<br />
</haskell><br />
<br />
I used additional parentheses around each action, although they are<br />
not really required. If you still can't belive that these actions will<br />
not be executed until your command, just uncover this list type:<br />
<br />
<haskell><br />
ioActions :: [RealWorld -> ((), RealWorld)]<br />
</haskell><br />
<br />
Well, now we want to execute some of these actions. No problem, just<br />
insert them into the 'main' chain:<br />
<br />
<haskell><br />
main = do head ioActions<br />
ioActions !! 1<br />
last ioActions<br />
</haskell><br />
<br />
Looks strange, yeah? :) Really, any IO action you write in the 'do'<br />
statement (or use as parameter for '>>'/'>>=') is an expression<br />
returning result of type "IO a". Typically, you use some function that<br />
has type "x -> y -> ... -> IO a" and provide all these x, y and<br />
so on parameters. But you are not limited to this standard scenario -<br />
don't forget that Haskell is functional language and you are free to<br />
compute the functional value required (recall - "IO a" is a function<br />
type) in any possible way. Here we just extracted several functions<br />
from the list - no problem. This functional value can also be<br />
constructed on-the-fly, as we've done in previous example - it's also<br />
ok. Want to see this functional value passed as the parameter - heh,<br />
just look at the 'when' definition. Hey, we can sell, buy and rent<br />
these IO actions as any other functional values! For example, let's<br />
define function that executes all IO actions in the list:<br />
<br />
<haskell><br />
sequence_ :: [IO a] -> IO ()<br />
sequence_ [] = return ()<br />
sequence_ (x:xs) = do x<br />
sequence_ xs<br />
</haskell><br />
<br />
No black magic - we just extracts IO actions from the list and inserts<br />
them into chain of IO operations that should be performed to "compute<br />
final world value" of entire 'sequence_' call.<br />
<br />
With help of 'sequence_', we can rewrite our last 'main' as:<br />
<br />
<haskell><br />
main = sequence ioActions<br />
</haskell><br />
<br />
<br />
Haskell's ability to work with IO actions as with any other<br />
(functional or non-functional) value allows us to define control<br />
structures of any complexity. Try, for example, to define control<br />
structure that repeats the action until it returns the 'False' result:<br />
<br />
<haskell><br />
while :: IO Bool -> IO ()<br />
while action = ???<br />
</haskell><br />
<br />
<br />
How about returning IO action as the function result? Well, we done<br />
this each time we defined IO procedure - they all return IO action<br />
that need RealWorld value to be performed. While we most times just<br />
executed them in chain of higher-level IO procedure, it's also<br />
possible to just collect them without actual execution:<br />
<br />
<haskell><br />
main = do let a = sequence ioActions<br />
b = when True getChar<br />
c = getChar >> getChar<br />
putStr "'let' statements are not executed!"<br />
</haskell><br />
<br />
These assigned IO procedures can be used as parameters to other<br />
procedures, or written to global variables, or processed in some other<br />
way, or just executed later, as we done in example with 'get2chars'.<br />
<br />
But how about returning from IO procedure a parameterized IO action?<br />
Let's define a procedure that returns i'th byte from file represented<br />
as Handle:<br />
<br />
<haskell><br />
readi h i = do hSeek h i AbsoluteSeek<br />
hGetChar h<br />
</haskell><br />
<br />
So bad so good. But how about procedure that returns i'th byte of file<br />
with given name without reopening it each time?<br />
<br />
<haskell><br />
readfilei :: String -> IO (Integer -> IO Char)<br />
readfilei name = do h <- openFile name ReadMode<br />
return (readi h)<br />
</haskell><br />
<br />
As you can see, it's an IO procedure that opens file and returns...<br />
another IO procedure that will read byte specified. But we can go<br />
further and include 'readi' body into 'readfilei':<br />
<br />
<haskell><br />
readfilei name = do h <- openFile name ReadMode<br />
let readi h i = do hSeek h i AbsoluteSeek<br />
hGetChar h<br />
return (readi h)<br />
</haskell><br />
<br />
Good? May be better. Why we add 'h' as 'readi' parameter if it can be<br />
got from the environment where 'readi' now defined? Shorter will be:<br />
<br />
<haskell><br />
readfilei name = do h <- openFile name ReadMode<br />
let readi i = do hSeek h i AbsoluteSeek<br />
hGetChar h<br />
return readi<br />
</haskell><br />
<br />
What we've done here? We've build parameterized IO action involving local<br />
names inside 'readfilei' and returned it as the result. Now it can be<br />
used in following way:<br />
<br />
<haskell><br />
main = do myfile <- readfilei "test"<br />
a <- myfile 0<br />
b <- myfile 1<br />
print (a,b)<br />
</haskell><br />
<br />
<br />
Such usage of IO actions is very typical for Haskell programs - you<br />
just construct one or more (using tuple) IO actions that your need,<br />
with and/or without parameters, involving the parameters that your<br />
"constructor" received, and return them to caller. Then these IO actions<br />
can be used in rest of program without any knowledge about your<br />
internal implementation strategies. Actually, this is used to<br />
partially emulate OOP (to be exact, ADT) programming ideology.<br />
<br />
<br />
For example, one of my program's modules is the memory suballocator. It<br />
receives address and size of large memory block and returns two<br />
procedures - one to allocate subblock of given size and second to<br />
return allocated block back:<br />
<br />
<haskell><br />
memoryAllocator :: Ptr a -> Int -> IO (Int -> IO (Ptr b),<br />
Ptr c -> IO ())<br />
<br />
memoryAllocator buf size = do ......<br />
let alloc size = do ...<br />
...<br />
free ptr = do ...<br />
...<br />
return (alloc, free)<br />
</haskell><br />
<br />
How this is implemented? 'alloc' and 'free' works with references<br />
created inside this procedure. Because creation of these references is<br />
a part of 'memoryAllocator' IO actions chain, new independent set of<br />
references will be created for each memory block for which<br />
'memoryAllocator' is called:<br />
<br />
<haskell><br />
memoryAllocator buf size = do start <- newIORef buf<br />
end <- newIORef (buf `plusPtr` size)<br />
...<br />
</haskell><br />
<br />
These two references (we will implement very simple memory allocator) are<br />
read and written in 'alloc' and 'free' definitions:<br />
<br />
<haskell><br />
let alloc size = do addr <- readIORef start<br />
writeIORef start (addr `plusPtr` size)<br />
return addr<br />
<br />
let free ptr = do writeIORef start ptr<br />
</haskell><br />
<br />
What we've defined here is just a pair of closures that is using state<br />
available on the moment of their definition. As you can see, it's as<br />
easy as in any other functional language, despite the Haskell's lack<br />
of direct support for non-pure functions.<br />
<br />
<br />
<br />
== unsafePerformIO and unsafeInterleaveIO ==<br />
<br />
Programmers with imperative background often still looks for a ways to<br />
execute IO actions inside the pure procedures. But that this means?<br />
Imagine that you try to write procedure that reads contents of file<br />
with given name:<br />
<br />
<haskell><br />
readContents :: Filename -> String<br />
</haskell><br />
<br />
Defining it as pure function will simplify the code that use it, i<br />
agree. But this creates troubles for the compiler:<br />
<br />
- first, this call is not inserted in sequence of "world<br />
transformations", so compiler don't get a hint - at what exact moment<br />
you want to execute this action. For example, if file contents is one<br />
at the program start and another at the end - what contents you want<br />
to see? Moment of "consumption" of this value don't make strong<br />
guarantees for execution order, because Haskell see all the functions<br />
as pure and fell free to reorder their execution as needed.<br />
<br />
- second, attempts to read contents of file with the same name can be<br />
factorized despite the fact that file (or current directory) can be<br />
changed between calls. Again, Haskell looks at all the functions as<br />
pure ones and feel free to omit excessive calls with the same<br />
parameters.<br />
<br />
So, implementing functions that interacts with Real World as pure ones<br />
considered as a Bad Behavior. Good boys never do it ;)<br />
<br />
<br />
Nevertheless, there are (semi-official) ways to use IO actions inside<br />
of pure functions. As you should remember this is prohibited by<br />
requiring "baton" to call IO action. Pure function don't have the baton,<br />
but there is special procedure, that procures this baton from nowhere,<br />
uses it to call IO action and then throws resulting "world" away!<br />
A little low-level magic :) This very special procedure is:<br />
<br />
<haskell><br />
unsafePerformIO :: IO a -> a<br />
</haskell><br />
<br />
Let's look at it's (possible) definition:<br />
<br />
<haskell><br />
unsafePerformIO :: (RealWorld -> (a,RealWorld)) -> a<br />
unsafePerformIO action = let (a,world1) = action createNewWorld<br />
in a<br />
</haskell><br />
<br />
where 'createNewWorld' is internal function producing new value of<br />
RealWorld type.<br />
<br />
Using unsafePerformIO, you can easily write pure functions that does<br />
I/O inside. But don't do this without real need, and remember to<br />
follow this rule: compiler don't know that you are cheating, it still<br />
consider each non-IO function as pure one. Therefore, all the usual<br />
optimization rules can (and will!) be applied to it's execution. So<br />
you must ensure that:<br />
<br />
1) Result of each call depends only on it's arguments<br />
<br />
2) You don't rely on side-effects of this function, which may be not<br />
executed if it's results are not used<br />
<br />
<br />
Let's investigate this problem deeper. Function evaluation in Haskell<br />
are ruled by value's necessity - computed only the values that really<br />
required to calculate final result. But that this means according to<br />
'main' function? To "calculate final world's" value, it's required to<br />
perform all the intermediate IO actions that included in 'main' chain.<br />
By using 'unsafePerformIO' we call IO actions outside of this chain.<br />
What can guarantee that they will be run? Nothing. The only case when<br />
they will be run is if that is required to compute overall function<br />
result (that in turn should be required to perform some action in<br />
'main' chain). Here we return to the Haskell-natural<br />
evaluation-on-value-need. Now you should clearly see the difference:<br />
<br />
- IO action inside IO procedure guaranteed to execute as long as<br />
it is inside 'main' chain - even when it's result is not used.<br />
You directly specify order of action's execution inside IO procedure.<br />
Data dependencies are simulated via "world" values.<br />
<br />
- IO action inside 'unsafePerformIO' will be performed only if<br />
result of this operation is really used. Evaluation order is not<br />
guaranteed and you should not rely on it (except when you sure about<br />
data dependency).<br />
<br />
<br />
I should also say that inside 'unsafePerformIO' call you can organize<br />
small internal chain of IO actions with help of the same binding<br />
operators and/or 'do' sugar:<br />
<br />
<haskell><br />
one = unsafePerformIO $ do var <- newIORef 0<br />
writeIORef var 1<br />
readIORef var<br />
</haskell><br />
<br />
and in this case ALL the operations in this chain will be performed as<br />
long as 'unsafePerformIO' result will be demanded. To ensure this,<br />
the actual 'unsafePerformIO' implementation evaluates "world" returned<br />
by the 'action':<br />
<br />
<haskell><br />
unsafePerformIO action = let (a,world1) = action createNewWorld<br />
in (world1 `seq` a)<br />
</haskell><br />
<br />
('seq' operation strictly evaluates it's first argument before<br />
returning the value of second one)<br />
<br />
<br />
<br />
But there is even more strange operation - 'unsafeInterleaveIO' that<br />
gets "official baton", makes it's piratical copy, and then run's<br />
"illegal" relay-race in parallel with main one! I can't further say<br />
about it's behavior without grief and indignation, it's not surprise<br />
that this operation is widely used in such software-piratical<br />
countries as Russia and China! ;) Don't even ask me - i will say<br />
nothing about this dirty trick i using permanently ;)<br />
<br />
<br />
<br />
== fixIO and 'mdo' ==<br />
<br />
== ST monad ==<br />
<br />
== Q monad ==<br />
<br />
== Welcome to machine: actual [[GHC]] implementation ==<br />
<br />
A little disclaimer: after all, i should say that i don't described<br />
here what is a monad (i even don't know it myself) and what my<br />
explanations shows only the one _possible_ way to implement them in<br />
Haskell. For example, hbc Haskell compiler implements monads via<br />
continuations. I also don't said anything about exception handling<br />
that is natural part of "monad" concept. You can read "All about<br />
monads" guide to learn more on these topics.<br />
<br />
But there are a good news: first, monad understanding you've build<br />
will work with any implementation. You just can't work with RealWorld<br />
values directly.<br />
<br />
Second, IO monad implementation described here is really used in GHC,<br />
Hugs (nhc/jhc, too?) compilers. It is the really real IO definition<br />
from GHC sources:<br />
<br />
<haskell><br />
newtype IO a = IO (State# RealWorld -> (# State# RealWorld, a #))<br />
</haskell><br />
<br />
It uses "State# RealWorld" type instead of our RealWorld, it uses "(# #)"<br />
strict tuple for optimization, and it adds IO data constructor<br />
around the type. Nevertheless, there are no principal changes. Knowing<br />
the principle of "chaining" IO actions via fake "state of world"<br />
values, now you can easily understand and write low-level<br />
implementations of GHC I/O operations.<br />
<br />
<br />
=== The [[Yhc]]/nhc98 implementation ===<br />
<br />
<haskell><br />
data World = World<br />
newtype IO a = IO (World -> Either IOError a)<br />
</haskell><br />
<br />
This implementation makes the "World" disappear somewhat, and returns Either a<br />
result "a", or if an error occurs then "IOError". The lack of the World on the<br />
right hand side of the function can only be done because the compiler knows<br />
special things about the IO type, and will not over optimise it.<br />
<br />
<br />
== Further reading ==<br />
<br />
Look at the [[Books and tutorials#Using_Monads]] page<br />
<br />
Are you have more questions? Ask in the haskell-cafe.</div>ThiagoArrais