A brief introduction to Haskell
From HaskellWiki
Haskell is:
- A language developed by the programming languages research community.
- Is a lazy, purely functional language (that also has imperative features such as side effects and mutable state, along with strict evaluation)
- Born as an open source vehicle for programming language research
- One of the youngest children of ML and Lisp
- Particularly useful for programs that manipulate data structures (such as compilers and interpreters), and for concurrent/parallel programming
Inspired by the article Introduction to OCaml, and translated from the OCaml by Don Stewart.
Contents |
1 Background
- 1990. Haskell 1.0
- 1991. Haskell 1.1
- 1993. Haskell 1.2
- 1996. Haskell 1.3
- 1997. Haskell 1.4
- 1998. Haskell 98
- 2000-2006. Period of rapid language and community growth
- ~2007. Haskell Prime
2 Haskell features
Has some novel features relative to Java (and C++).
- Immutable variables by default (mutable state programmed via monads)
- Pure by default (side effects are programmed via monads)
- Lazy evaluation: results are only computed if they're required (strictness optional)
- Everything is an expression
- First-class functions: functions can be defined anywhere, passed as arguments, and returned as values.
- Both compiled and interpreted implementations available
- Full type inference -- type declarations optional
- Pattern matching on data structures -- data structures are first class!
- Parametric polymorphism
- Bounded parametric polymorphism
These are all conceptually more advanced ideas.
Compared to similar functional languages, Haskell differs in that it has support for:
- Lazy evaluation
- Pure functions by default
- Monadic side effects
- Type classes
- Syntax based on layout
The GHC Haskell compiler, in particular, provides some interesting extensions:
- Generalised algebraic data types
- Impredicative types system
- Software transactional memory
- Parallel, SMP runtime system
3 The Basics
Read the language definition to supplement these notes. For more depth and examples see the Haskell wiki.
3.1 Interacting with the language
Haskell is both compiled and interpreted. For exploration purposes, we'll consider interacting with Haskell via the GHCi interpreter:
- expressions are entered at the prompt
- newline signals end of input
Here is a GHCi sessoin, starting from a UNIX prompt.
$ ghci ___ ___ _ / _ \ /\ /\/ __(_) / /_\// /_/ / / | | GHC Interactive, version 6.4.2, for Haskell 98. / /_\\/ __ / /___| | http://www.haskell.org/ghc/ \____/\/ /_/\____/|_| Type :? for help.
Loading package base-1.0 ... linking ... done.Here the incredibly simple Haskell program
Prelude> let x = 3 + 4
We can ask the system what type it automaticaly inferred for our
variable.integer values.
Prelude> :t x x :: Integer
A variable evaluates to its value.
Prelude> x 7
expressions.
Prelude> x + 4 11
new binding for a variable with local scope.
Prelude> let x = 4 in x + 3 7
Alternatively, declarations typed in at the top level are like an open-ended let:
Prelude> let x = 4 Prelude> let y = x + 3 Prelude> x * x 16 Prelude> :t x x :: Integer Prelude> :t y y :: Integer Prelude> :t x * x x * x :: Integer
Notice that type inference infers the correct type for all the expressions, without us having to ever specify the type explicitly.
4 Basic types
There is a range of basic types, defined in the language Prelude.
Int -- bounded, word-sized integers Integer -- unbounded integers Double -- floating point values Char -- characters String -- strings () -- the unit type Bool -- booleans [a] -- lists (a,b) -- tuples / product types Either a b -- sum types Maybe a -- optional values
For example:
7 12312412412412321 3.1415 'x' "haskell" () True, False [1,2,3,4,5] ('x', 42) Left True, Right "string" Nothing, Just True
These types have all the usual operations on them, in the standard libraries.
4.1 Libraries
- The Prelude contains the core operations on basic types. It is imported by default into every Haskell module. For example;
+ - div mod && || not < > == /=
Learn the Prelude well. Less basic functions are found in the standard libraries. For data structures such as List, Array and finite maps.
To use functions from these modules you have to import them, or in GHCi, refer to the qualified name, for example to use the toUpper function on Chars:
Prelude> Char.toUpper 'x' 'X' Prelude> :m + Char Prelude Char> toUpper 'y' 'Y'
In a source file, you have to import the module explicitly:
import Char
4.2 Overloading
Haskell uses typeclasses to methodically allow overloading. A typeclass describes a set of functions, and any type which provides those functions can be made an instance of that type. This avoids the syntactic redundancy of languages like OCaml.
For example, the functionPrelude> :t (*) (*) :: (Num a) => a -> a -> a
Which can be read as "* is a polymorphic function, taking two types 'a', and returning a result of the same type, where those types are members of the class Num".
This means that it will operate on any type in theInt, Integer
Prelude> 2.3 * 5.7 13.11
or on integers:
Prelude> 2 * 5 10
used. Haskell also never performs implicit coercions, all coercions must be explicit. For example, if we try to multiply two different types,
then the type check againstfail.
Prelude> (2.3 :: Double) * (5 :: Int) <interactive>:1:19: Couldn't match `Double' against `Int' Expected type: Double Inferred type: Int In the expression: 5 :: Int In the second argument of `(*)', namely `(5 :: Int)'
Prelude> (2.3 :: Double) * fromIntegral (5 :: Int) 11.5
Why bother -- why not allow the system to implicitly coerce types? Implicit type conversions by the system are the source of innumerable hard to find bugs in languages that support them, and makes reasoning about a program harder, since you must apply not just the language's semantics, but an extra set of coercion rules.
Note that If we leave off the type signatures however, Haskell will helpfully infer the most general type:
Prelude> 2.3 * 5 11.5
4.3 Expressions
In Haskell expressions are everything. There are no pure "statements" like in Java/C++. instead, statements are also expressions, they return values.
Prelude> (if 2 == 3 then 5 else 6) + 1 7
Prelude> (if 2 == 3 then 5 else 6.5) + 1 7.5
4.4 Local bindings
In Haskelllet x = 1 + 2 in x + 3
is analogous to:
{ int x = 1 + 2; ... x + 3 ... ; }
in C, but the Haskell variable x is given a value that is immutable (can never change).
4.5 Allocation
When you declare a new variable, Haskell automatically allocates that value for you -- no need to explicitly manage memory. The garbage collector will then collect any ureachable values once they go out of scope.
Advanced users can also manage memory by hand using the foreign function interface.
4.6 Lists
Lists are ... lists of Haskell values. Defining a new list is trivial, easier than in Java.
Prelude> [2, 1+2, 4] [2,3,4] Prelude> :t [2, 1+2, 4] [2, 1+2, 4] :: (Num a) => [a]
This automatically allocates space for the list and puts in the elements. Haskell is garbage-collected like Java so no explicit de-allocation is needed. The type of the list is inferred automatically. All elements of a list must be of the same type.
Prelude> ["e", concat ["f", "g"], "h"] ["e","fg","h"]
parenthesis to delimit the function's arguments. Haskell uses whitespace, and not commas, and:
- You don't need parentheses for function application in Haskell: sin 0.3
- Multiple arguments can be passed in one at a time (curried) which means they can be separated by spaces: .max 3 4
Lists must be uniform in their type ("homogenous").
Prelude> ['x', True] Couldn't match `Char' against `Bool'
Here we tried to build a list containing a Char and a Boolean, but the
list constructor,Prelude> :t [] [] :: [a]
(For those wondering how to build a list of heterogenous values, you would use a sum type):
Prelude> [Left 'x', Right True] [Left 'x',Right True] Prelude> :t [Left 'x', Right True] [Left 'x', Right True] :: [Either Char Bool]
List operations are numerous, as can be seen in the Data.List library.
Prelude> let x = [2,3] Prelude> let y = 1 : x -- 'cons' the value 1 onto the list Prelude> x -- the list is immutable [2,3] Prelude> y [1,2,3] Prelude> x ++ y -- joining lists [2,3,1,2,3] Prelude> head y -- first element of the list is the 'head' 1 Prelude> tail y -- the rest of the list is the 'tail' [2,3]
4.7 Pattern matching
Haskell supports pattern matching on data structures. This is a powerful language feature, making code that manipulates data structures incredibly simple. The core language feature for pattern matching is the
Prelude> case x of h:t -> h 2
pattern to ignore the part of the pattern we don't care about:
Prelude> case x of _:t -> t [3]
4.8 Tuples
Tupes are fixed length structures, whose fields may be of differing types ("heterogenous"). They are known as product types in programming language theory.
Prelude> let x = (2, "hi") Prelude> case x of (y,_) -> y 2
Unlike the ML family of languages, Haskell uses the same syntax for the value level as on the type level. So the type of the above tuple is:
Prelude> :t x x :: (Integer, [Char])
All the data mentioned so far is immutable - it is impossible to change an entry in an existing list, tuple, or record without implicitly copying the data! Also, all variables are immutable. By default Haskell is a pure language. Side effects, such as mutation, are discussed later.
5 Functions
Here is a simple recursive factorial function definition.
Prelude> let fac n = if n == 0 then 1 else n * fac (n-1) Prelude> :t fac fac :: (Num a) => a -> a Prelude> fac 42 1405006117752879898543142606244511569936384000000000
specially tag it as such, as you would in the ML family of languages).
When you apply (or invoke) the fac function, you don't need any special parenthesis around the code. Note that there is no return statement; instead, the value of the whole body-expression is implicitly what gets returned.
Functions of more than one argument may be defined:
Prelude> let max a b = if a > b then a else b Prelude> max 3 7 7
Other important aspects of Haskell functions:
- Functions can be defined anywhere in the code via lambda abstractions:
Prelude> ((\x -> x + 1) 4) + 7 12
Prelude> let f = \x -> x + 1 Prelude> :t f f :: Integer -> Integer
Anonymous functions like this can be very useful. Also, functions can be passed to and returned from functions. For example, the higher order
functionelement of a list (like a for-loop):
Prelude> map (\x -> x ^ 2) [1..10] [1,4,9,16,25,36,49,64,81,100]
In Haskell, we can use section syntax for more concise anonymous functions:
Prelude> map (^ 2) [1..10] [1,4,9,16,25,36,49,64,81,100]
2) :: Integer -> Integer</hask>, and a list of numbers.
5.1 Currying
Currying is a method by which function arguments may be passed one at a time to a function, rather than passing all arguments in one go in a structure:
Prelude> let comb n m = if m == 0 || m == n then 1 else comb (n-1) m + comb (n-1) (m-1) Prelude> comb 10 4 210
another argument of that type!
Indeed, we can give comb only one argument, in which case it returns a function that we can later use:
Prelude> let comb10 = comb 10 Prelude> comb10 4 210 Prelude> comb10 3 120
Mutually recursive functions may defined in the same way as normal functions:
let take [] = [] (x:xs) = x : skip xs skip [] = [] (_:ys) = take ys Prelude> :t take take :: [a] -> [a] Prelude> :t skip skip :: [a] -> [a] Prelude> take [1..10] [1,3,5,7,9] Prelude> skip [1..10] [2,4,6,8,10]
This example also shows a pattern match with multiple cases, either empty list or nonempty list. More on patterns now.
5.2 Patterns
Patterns make function definitions much more succinct, as we just saw.
let rev [] = [] rev (x:xs) = rev xs ++ [x]
are patterns against which the value passed to the function is matched. The match occurs on the structure of the data -- that is, on its constructors.
Lists are defined as:
data [] a = [] | a : [a]
built two ways:
- either the empty list, []
- or an element consed onto a list, such as or1 : [].1 : 2 : 3 : []
- For the special case of lists, Haskell provides the syntax sugar: to build the same data.[1,2,3]
variables x and xs to the head and tail components of the list.
Remember thatperforming pattern matching (pattern matching in let bindings is sugar
formore than one pattern matches:
case [1,2,3] of (x:y) -> True (x:y:z) -> False [] -> True Warning: Pattern match(es) are overlapped In a case alternative: (x : y : z) -> ... True
Warnings will be generated at compile time if patterns don't cover all possibilities, or contain redundant branches.
Prelude> :set -Wall Prelude> case [1,2,3] of (x:_) -> x Warning: Pattern match(es) are non-exhaustive In a case alternative: Patterns not matched: [] 1
An exception will be thrown at runtime if a pattern match fails:
Prelude> let myhead (x:_) = x Warning: Pattern match(es) are non-exhaustive In a case alternative: Patterns not matched: [] Prelude> myhead [] *** Exception: <interactive>:1:16-36: Non-exhaustive patterns in case
As we have seen, patterns may be used in function definitions. For example, this looks like a function of two arguments, but its a function of one argument which matches a pair pattern.
Prelude> let add (x,y) = x + y Prelude> add (2,3) 5
5.3 Immutable declarations
Immutable Declarations
- Important feature of let-defined variable values in Haskell (and some other functional languages): they cannot change their value later.
- Greatly helps in reasoning about programs---we know the variable's value is fixed.
- Smalltalk also forces method arguments to be immutable; C++'s const and Java's final on fields has a similar effect.
let x = 5 in let f y = x + 1 in let x = 7 in f 0 -- f uses the particular x in lexical scope where it is defined 6
Here's the one that will mess with your mind: the same thing as above but with the declarations typed into GHCi. (The GHCi environment conceptually an open-ended series of lets which never close).
Prelude> let x = 5 Prelude> let f y = x + 1 Prelude> f 0 6 Prelude> let x = 7 -- not an assignment, a new declaration Prelude> f 0 6
5.4 Higher order functions
Haskell, like ML, makes wide use of higher-order functions: functions that either take other functions as argument or return functions as results, or both. Higher-order functions are an important component of a programmer's toolkit.
- They allow "pluggable" programming by passing in and out chunks of code.
- Many new programming design patterns are possible.
- It greatly increases the reusability of code.
- Higher-order + Polymorphic = Reusable
The classic example of a function that takes another function as
argument is thea function and applies the function to every element of the list.
map :: (a -> b) -> [a] -> [b] map _ [] = [] map f (x:xs) = f x : map f xs
The lower case variables in the type declaration of map are type variables, meaning that the function is polymorphic in that argument (can take any type).
Prelude> map (*10) [4,2,7] [40,20,70]
Perhaps the simplest higher-order function is the composer, in mathematics expressed as g o f. it takes two functions and returns a new function which is their composition:
(.) :: (b -> c) -> (a -> b) -> a -> c (.) f g x = f (g x)
This function takes three arguments: two functions, f and g, and a value, x. It then applies g to x, before applying f to the result. For example:
Prelude> let plus3times2 = (*2) . (+3) Prelude> plus3times2 10 26
As we have seen before, functions are just expressions so can also be immediately applied after being defined:
Prelude> ((*2) . (+3)) 10 26 Prelude> (.) (*2) (+3) 10 26
Note how Haskell allows the infix function . to be used in prefix form, when wrapped in parenthesis.
5.5 Currying
Currying is an important concept of functional programming; it is named after logician Haskell Curry, after which the languages Haskell and Curry are named! Multi-argument functions as defined thus far are curried, lets look at what is really happening.
Here is a two-argument function defined in our usual manner.
Prelude> let myadd x y = x + y Prelude> myadd 3 4 7
Here is another completely equivalent way to define the same function:
Prelude> let myadd x = \y -> x + y Prelude> :t myadd myadd :: (Num a) => a -> a -> a Prelude> myadd 3 4 7 Prelude> let inc3 = myadd 3 Prelude> inc3 4 7
The main observation is myadd is a function returning a function, so the way we supply two arguments is
- Invoke the function, get a function back
- Then invoke the returned function passing the second argument.
- Our final value is returned, victory.
- is an inlined version of this where the function returned by(myadd 3) 4is not put in any variablemyadd 3
Here is a third equivalent way to define myadd, as an anonymous function returning another anonymous function.
Prelude> let myadd = \x -> \y -> x + y Prelude> :t myadd myadd :: Integer -> Integer -> Integer
With currying, all functions "really" take exactly one argument. Currying also naturally arises when functions return functions, as in the map application above showed. Multiple-argument functions should always be written in curried form; all the library functions are curried.
Note thus far we have curried only two-argument functions; in general, n-argument currying is possible. Functions can also take pairs as arguments to achieve the effect of a two-argument function:
Prelude> let mypairadd (x,y) = x + y Prelude> mypairadd (2,3) 5
So, either we can curry or we can pass a pair. We can also write higher-order functions to switch back and forth between the two forms.
fst :: (a,b) -> a fst (x,_) = x snd :: (a,b) -> b snd (_,y) = y curry :: ((a, b) -> c) -> a -> b -> c curry f x y = f (x, y) uncurry :: (a -> b -> c) -> ((a, b) -> c) uncurry f p = f (fst p) (snd p) Prelude> :t uncurry myadd uncurry myadd :: (Integer, Integer) -> Integer Prelude> :t curry mypairadd curry mypairadd :: Integer -> Integer -> Integer Prelude> :t uncurry map uncurry map :: (a -> b, [a]) -> [b] Prelude> :t curry (uncurry myadd) -- a no-op curry (uncurry myadd) :: Integer -> Integer -> Integer
Look at the types: these mappings in both directions in some sense
"implement" the well-known isomorphism on sets:5.6 A bigger example
Here is a more high-powered example of the use of currying.
foldr f [] y = y foldr f (x:xs) y = f x (foldr f xs y) *Main> :t foldr foldr :: (a -> t -> t) -> [a] -> t -> t *Main> let prod = foldr (\a x -> a * x) *Main> :t prod prod :: [Integer] -> Integer -> Integer *Main> let prod0 = prod [1,2,3,4] *Main> :t prod0 prod0 :: Integer -> Integer *Main> (prod0 1, prod0 2) (24,48)
Here is an analysis of this recursive function, for the arbitrary 2-element list [x1,x2], the call
foldr f [x1, x2] y
reduces to (by inlining the body of fold):
f x1 (foldr f [x2] y)
which in turn reduces to:
f x1 (f x2 (foldr f [] y)))
and then:
f x1 (f x2 y)
From this we can assert that the general result returned from
y)...))))
particular function f, as with prod above.
5.7 Proving program properties by induction
We should in fact be able to prove this property by induction. Its easier if we reverse the numbering of the list.
Lemma.Proof. Proceed by induction on the length of the list
Base Case n=1, i.e. the list is [x1]. The function reduces to
hypothesized.
Induction Step. Assume
foldr f [xn,...,x1] y
reduces to
f xn (f xn-1 f ...(f x1 y)...)))
and show
foldr f [xn+1, xn,...,x1] y
reduces to
f xn+1 (f xn f ...(f x1 y)...))))
Computing
foldr f [x1,x2,...,xn,xn+1] y
it matches the pattern with x being xn+1 and xs being
foldr f [xn,...,x1] y
which by our inductive assumption reduces to
f xn (f xn-1 f ...(f x1 y)...)))
And, given this result for the recursive call, the whole function then returns
f xn+1 (...result of recursive call...)
which is
f xn+1 (f xn (f xn-1 f ...(f x1 y)...)))
which is what we needed to show. QED
The above implementation is inefficient in that f is explicitly passed to every recursive call. Here is a more efficient version with identical functionality.
foldr :: (a -> b -> b) -> b -> [a] -> b foldr k z xs = go xs where go [] = z go (y:ys) = y `k` go ys
This function also illustrates how functions may be defined in a local scope,
usingsince it is the return value of f.
Question: How does the return value 'go' know where to look for k when its called??
Prelude> let summate = foldr (+) Prelude> summate [1,2,3,4] 0 10
Prelude> k <interactive>:1:0: Not in scope: `k'
somewhere: in a closure. At function definition point, the current values of variables not local to the function definition are remembered in a structure called a closure. Function values in Haskell are thus really a pair consisting of the function (pointer) and the local environment, in a closure.
Without making a closure, higher-order functions will do awkward things (such as binding to whatever 'k' happens to be in scope). Java, C++, C can pass and return function (pointers), but all functions are defined at the top level so they have no closures.
5.8 Loading source from a file
You should never type code directly in the top loop! Its impossible to fix errors. Instead, you should edit in a file. There are several reasonable modes to interact with caml:
1. Use any editor, and save each group of interlinked functions in a separate file, for example "myfunctions.ml". Then, from the top loop type
- #use "myfunctions.ml";;
-- this will submit everything in the file to the top loop. Note its #use, not just use. 2. Use any editor, and copy-and-paste code into caml. This is great for smaller functions but eventually you want to use the method above. 3. (best if you are using UNIX; may even work under Windows now:) Use the emacs editor and the caml mode documented on the course emacs web page. Edit in the manner of 1. above, with groups of functions in a file. Then, use the
C-c C-e caml-eval-phrase
C-c C-r caml-eval-region
C-c C-s caml-show-subshell
C-c ` caml-goto-phrase-error
commands of caml mode to submit the current definition (phrase) to caml, the selected region to caml, to show the actual shell window, and to find your errors respectively. When you are playing with little examples, just type a small bit in any emacs buffer and click on it and use the caml-eval-phrase command to send it to caml. caml-eval-phrase at the end of a file submits all the expressions in the file.