Tutorials/Programming Haskell/Introduction
It's about time we got some job done in Haskell, eh? Now, one of my favourite programming books as an undergraduate was the Camel Book, "Programming Perl". It was full of lots of practical examples of Perl code, written well. (And I'm grateful to Larry Wall, Tom Christiansen and Randal Schwartz for writing the book that made programming fun).
So what would it look like if we wrote a Haskell tutorial in this style? Let's have at it!
Getting started
Like some languages Haskell can be both compiled and interpreted. The most widely used implementation of Haskell currently is GHC, which provides both an optimising native code compiler, and an interactive bytecode interpreter. I'll be using GHC (or its interactive front end, GHCi, for all code. So grab a copy of GHC now, from your package system, or the GHC home page.
Start up GHCi:
$ ghci ___ ___ _ / _ \ /\ /\/ __(_) / /_\// /_/ / / | | GHC Interactive, version 6.6, for Haskell 98. / /_\\/ __ / /___| | http://www.haskell.org/ghc/ \____/\/ /_/\____/|_| Type :? for help.
Loading package base ... linking ... done. Prelude>
The interpreter now waits for your code fragments. The "Prelude" prompt indicates which library modules are in scope, and in this case, only the basic language module, known as the Prelude.
Now we can start running Haskell code.
Prelude> "G'day, world!"
"G'day, world!"
Prelude> putStrLn "G'day, world!"
G'day, world!
You can compile this code to a native binary using GHC, by writing in a source file:
main = putStrLn "G'day, world!"
and then compiling the source to native code. Assuming your file is A.hs:
$ ghc A.hs
This produces a new executable, ./a.out (a.out.exe on windows), which you can run like any other program on your system:
$ ./a.out G'day, world!
Variables
We can name arbitrary fragments of Haskell using variables. Like so:
phrase = "G'day, world!"
main = putStrLn phrase
We don't have to define what type phrase is, as Haskell uses type inference to infer at compile time the types of all expressions in the program. As "G'day, world!" is a string, so must phrase be a string. There are a bunch of basic types of values to play with. Here's a small sample:
answer = 42
pi = 3.141592653589793238462643383279502884197169399375105820974944592
avocados = 6.02e23
pet = "Lambda"
sign = "I love my " ++ pet
coat = "It costs $100"
hence = "whence"
thence = hence
moles = 2.5
x = moles * avocados
c = '#'
pair = (2.5, "lambdas")
list = [5,6,4,3,1]
options = Just "done"
failed = Nothing
void = ()
One important thing to remember is that Haskell's variables, like in most functional programming languages, are like variables in mathematics, and are just names for expressions. They're explicitly not mutable boxes, like in most imperative programming languages. As a result, you never need to worry about initialising a Haskell variable, nor do you need to worry about the current value in a variable: it always has the same value, and can always be replaced with its definition. So the following behaves just like it would in maths:
answer = 42
another = answer + 1
more = another + answer
main = print more
That is,
$ ghc A.hs $ ./a.out 85
Now, since variables are just names for program fragments, you can evaluate Haskell on paper by replacing all names with their definition, until you reach a final value, like so:
main = print more
=>
main = print (another + answer)
=>
main = print ((answer + 1) + answer)
=>
main = print ((answer + 1) + 42)
=>
main = print ((42 + 1) + 42)
=>
main = print (43 + 42)
=>
main = print 85
=>
85
Having such a simple system for variables allows for a wide range of interesting optimisations, and makes understanding what a program is doing at any point much easier, since you don't have to worry about what state a variable might currently be in. (Of course, some problems need (threadsafe) mutable boxes, and they're available as a library for when you need that).
Collections
Often you need to collect a bunch of values together into some kind of collection. Haskell has many many collection types, but in particular, it has lists and finite maps, which operate much like arrays and hashes of the imperative world.
Lists
A list is just an ordered, um, list of values. They can be nested, and transformed in all sorts of ways, using functions. Assuming your file, A.hs, contains:
home = ["couch", "chair", "table", "stove"]
We can play around with this stuff like so:
$ ghci A.hs
*Main> home
["couch","chair","table","stove"]
*Main> head home
"couch"
*Main> tail home
["chair","table","stove"]
*Main> last home
"stove"
*Main> home !! 2
"table"
*Main> reverse home
["stove","table","chair","couch"]
*Main> map reverse home
["hcuoc","riahc","elbat","evots"]
Loading in the List library gives us some more functions to use:
*Main> :m + Data.List
*Main Data.List> intersperse "#" home
["couch","#","chair","#","table","#","stove"]
*Main Data.List> concat (intersperse "#" home)
"couch#chair#table#stove"
*Main Data.List> home \\ ["table","stove"]
["couch","chair"]
Finite Maps
Finite maps (or maps) are the lookup tables of purely functional programming. Whenever you'd use some kind of hash in an imperative language, you can replace it with a Map in Haskell.
Like hashes, maps can be seen as a table of pairs of keys and values. You can declare a new map:
import Data.Map
days = fromList
[ ("Sun", "Sunday" )
, ("Mon", "Monday" )
, ("Tue", "Tuesday" )
, ("Wed", "Wednesday" )
, ("Thu", "Thursday" )
, ("Fri", "Friday" )
, ("Sat", "Saturday" ) ]
You can also convert a map to a list, using (well, duh!) toList:
*Main> toList days
[("Fri","Friday"),("Mon","Monday"),("Sat","Saturday")
,("Sun","Sunday"),("Thu","Thursday"),("Tue","Tuesday")
,("Wed","Wednesday")]
Note that they come out unordered, just like in hashes. If you just want the keys of the map:
*Main> keys days
["Fri","Mon","Sat","Sun","Thu","Tue","Wed"]
*Main> elems days
["Friday","Monday","Saturday","Sunday","Thursday","Tuesday","Wednesday"]
Since maps are a good structure for looking up values, you can search them using the lookup function. This function returns the element, if found:
*Main> Data.Map.lookup "Tue" days
"Tuesday"
Since the name 'lookup' is also used by a list function of similar purpose in the Prelude, we use the qualified name here to disambiguate which 'lookup' to use.
On failure
But what happens if the key is not found? (Feel free to skip this section if you don't care about errors yet) lookup will then fail, and how it fails depends on what kind of failure you want. Haskell goes to great lengths to make programming for failure flexible. For example, to fail with an exception:
*Main> Data.Map.lookup "Thor" days
*** Exception: user error (Data.Map.lookup: Key not found)
Which is the same as failing with an IO error. We can specify this specifically with a type annotation, to say "fail with an IO error":
*Main> Data.Map.lookup "Thor" days :: IO String
*** Exception: user error (Data.Map.lookup: Key not found)
Often you might instead prefer that some special value is returned on failure:
*Main> Data.Map.lookup "Thors" days :: Maybe String
Nothing
Maybe you'd just like an empty list:
*Main> Data.Map.lookup "Thor" days :: [String]
[]
Finally, you can always provide an explicit default value:
*Main> findWithDefault "Not found" "Thor" days
"Not found"
Failure is entirely under your control!
Actions
Now, real programs interact with the outside world. They call functions which do IO, as a side effect, and then return some value. In Haskell, functions with side effects are often called actions, to distinguish them from normal Haskell functions (which behave like mathematical functions: they take inputs and return a result, with no side effects). Programming with side effects is carefully handled in Haskell, again to control the possibility of errors, and all functions which have side effects have a special type: the IO type.
For example, the function to print a string has the following type (and you can ask the interpreter for the type interactively):
Prelude> :t putStr
putStr :: String -> IO ()
which tells you that this function takes a String as an argument, does some IO side effect, and returns the null value. It is equivalent to the following C type:
void putStr(char *);
but with a bit of extra information, namely, that the function does some IO. We would print out some element of our map like so:
main = print ("Tue in long form is " ++ findWithDefault "Not found" "Tue" days)
*Main> main
"Tue in long form is Tuesday"
An example
One of the classic programming puzzles for introducing real world problems is the 'class grade' problem. You have a text file containing a list of student names and their grades, and you'd like to extract various information and display it. In deference to The Camel Book, we'll follow this lead, and start with a file "grades", containing something like this:
Alonzo 70 Simon 94 Henk 79 Eugenio 69 Bob 80 Oleg 77 Philip 73 ...
Student's appear multiple times, with entries for each of their subjects Let's read this file, populate a map with the data, and print some statistical information about the results. First thing to do is import some basic libraries:
import Data.Char
import Data.Maybe
import Data.List
import qualified Data.Map hiding (map)
import Text.Printf
And now here's the entire program, to read the grades file, compute all the averages, and print them:
main = do
src <- readFile "grades"
let pairs = map (split.words) (lines src)
let grades = foldr insert Data.Map.empty pairs
mapM_ (draw grades) (sort (Data.Map.keys grades))
where
insert (s, g) = Data.Map.insertWith (++) s [g]
split [name,mark] = (name, read mark)
draw g s = printf "%s\t%s\tAverage: %f\n" s (show marks) avg
where
marks = Data.Map.findWithDefault (error "No such student") s g
avg = sum marks / fromIntegral (length marks) :: Double
Running it
How do we run this program? There's lots of ways:
Compile it to native code
$ ghc -O Grades.hs
$ ./a.out Alonzo [70.0,71.0] Average: 70.5 Bob [80.0,88.0] Average: 84.0 Eugenio [69.0,98.0] Average: 83.5 Henk [79.0,81.0] Average: 80.0 Oleg [77.0,68.0] Average: 72.5 Philip [73.0,71.0] Average: 72.0 Simon [94.0,83.0] Average: 88.5
Run it in the bytecode interpreter
$ runhaskell Grades.hs Alonzo [70.0,71.0] Average: 70.5 Bob [80.0,88.0] Average: 84.0 Eugenio [69.0,98.0] Average: 83.5 Henk [79.0,81.0] Average: 80.0 Oleg [77.0,68.0] Average: 72.5 Philip [73.0,71.0] Average: 72.0 Simon [94.0,83.0] Average: 88.5
Execute it interactively
$ ghci Grades.hs Prelude Main> main Alonzo [70.0,71.0] Average: 70.5 Bob [80.0,88.0] Average: 84.0 Eugenio [69.0,98.0] Average: 83.5 Henk [79.0,81.0] Average: 80.0 Oleg [77.0,68.0] Average: 72.5 Philip [73.0,71.0] Average: 72.0 Simon [94.0,83.0] Average: 88.5
Make the script executable
Under unix, you can use the #! convention to make a script executable. Add the following to the top of the source file:
#!/usr/bin/env runhaskell
And then set the script executable:
$ chmod +x Grades.hs
$ ./Grades.hs Alonzo [70.0,71.0] Average: 70.5 Bob [80.0,88.0] Average: 84.0 Eugenio [69.0,98.0] Average: 83.5 Henk [79.0,81.0] Average: 80.0 Oleg [77.0,68.0] Average: 72.5 Philip [73.0,71.0] Average: 72.0 Simon [94.0,83.0] Average: 88.5
Next week
More IO!