From HaskellWiki
< Performance
Revision as of 00:47, 7 May 2007 by Joe (talk | contribs) (fixed typos)
Jump to navigation Jump to search
Haskell Performance Resource

Data Types - Functions
Overloading - FFI - Arrays
Strings - Integers - I/O
Floating point - Concurrency
Modules - Monads

Strictness - Laziness
Avoiding space leaks
Accumulating parameter

GHC - nhc98 - Hugs
Yhc - JHC

Laziness: Procrastinating for Profit

To look at how laziness works in Haskell, and how to make it do efficient work, we'll implement a merge sort function. It will have the type:

 merge_sort :: (Ord a) => [a] -> [a]

We'll also need a function to split the list in two, I'll call this cleaving, and it will look like this:

 cleave :: [a] -> ([a],[a])

Let's start by implementing the cleaving function. The conventional way to split a list in merge sort is to take the first N/2 elements off the front, and the remaining elements after this number. The problem is that finding the length of a list in haskell is expensive. So instead, we'll take pairs of elements off the front. Define two functions:

 evens [] = []
 evens [x] = [x]
 evens (x:_:xs) = x : evens xs
 odds [] = []
 odds [x] = []
 odds (_:x:xs) = x : odds xs

and use them to implement cleave:

 cleave xs = (evens xs, odds xs)

Experience in a strictly evaluation language like SML or Objective CAML may lead you to write alternate versions using an accumulating parameter. Assuming that reversing the order of the elements doesn't matter, you could use this function to split the list into even and odd elements and implement the cleave function as follows:

 cleave = cleave' ([],[]) where
     cleave' (eacc,oacc) [] = (eacc,oacc)
     cleave' (eacc,oacc) [x] = (x:eacc,oacc)
     cleave' (eacc,oacc) (x:x':xs) = cleave' (x:eacc,x':oacc) xs

This appears to be a better implementation. It's tail recursive, and by either strictness analysis or explicitly making the accumulating parameters strict, it won't blow the stack up. Believe it or not, our first implementation was better.

In order to produce the first element of either list, you need to process the entire list. In a non-strict language, we could encounter an infinite list, and we'd like our function to work nicely on them. Consider the effect of:

 head $ fst $ cleave [0..10000000]

With our first definition, we'll get 0 in constant time. With our second, we'll get it in O(N) time, and our calculation will diverge on an infinite list like [0..].

Why is our first version better? Let's look at how evens works and how lists are represented in Haskell. Lists are represented as either an empty list, or a "cons" cell that consists of an element and the remaining list. In pseudo-Haskell, we might write:

 data [a] = [] | a : [a]

In a lazy language, an expression is only evaluated when needed. The machinery used to implement this is called a thunk. It's essentially a value with two possible states: either a computed value, or the process to compute that value. When we assign a value in Haskell, we create a thunk with the instructions to compute the value of the expression we've assigned. When this thunk is forced, these instructions are used to compute a value which is stored in the thunk. The next time the value is required, this computed value is retrieved. Lazyness can be implemented in languages like SML using this method together with mutable references.

So in the recursive case of evens, we produce a thunk that contains a list cons cell. This cons cell contains two thunks, one of the element value, and one of the rest of the list. The thunk for the element is taken from the list the function is operating on, and the thunk for the rest of the list consists of instructions to compute the rest of the list using evens. We'd say that evens and odds are lazy in their input: they consume only enough value to produce the value. As an example of how lazy functions work, consider:

 head ( 5 : undefined )
 tail [undefined, 5]
 evens (5 : undefined : 3 : undefined : 1 : undefined : [])
 take 3 $ evens (5 : undefined : 3 : undefined : 1 : undefined : undefined)

Despite how all the inputs contain partially undefined values, all the values of the function applications are valid values. A lazy function will only diverge when a required value for its computation diverges. If you're wondering why we have two undefineds at the end of the list, recall how evens was implemented. We need two undefined cells to make sure the third case is selected: the one with two elements followed by a remainder list. Having only one undefined means that after the 1 element, the remainder of the list is undefined. Then we can't decide between the 2nd and 3rd cases. Now let's look at what happens with lazy evaluation and diverging values. Consider:

 tail (5 : undefined)
 head [undefined, 5]
 odds (5 : undefined : 3 : undefined : 1 : undefined : [])
 take 4 $ evens ( 5 : undefined : 3 : undefined : 1 : undefined : undefined)

So the application of evens to a non-trivial list results in a thunk being returned immediately. And when we ask for the first element of the list evens produces, we only evaluate the value thunk. This is why we can apply evens or odds, (and cleave for that matter,) to an infinite list. We'll implement merge_sort using cleave:

 merge_sort [] = []
 merge_sort [x] = [x]
 merge_sort lst = let (e,o) = cleave lst in merge (merge_sort e) (merge_sort o) where
     merge :: (Ord a) => [a] -> [a] -> [a]
     merge xs [] = xs
     merge [] ys = ys
     merge xxs@(x:xs) yys@(y:ys) =
         | x <= y = x : merge xs yys
         | otherwise = y : merge xxs ys

You can see that this function isn't lazy. It begins by cleaving the list recursively until it is left with trivial lists, ones with zero or one elements. These are obviously already sorted. It then uses the nested function merge, which combines two ordered lists and preserves their order. The act of partitioning the list into trivial lists before reassembly can begin means the entire list needs to be processed before we can begin merging them and assembling the list. In this case, we can't make a lazier solution, one that would work on an infinite list. If this is surprising, think of this: in sorting a list, the first element out should be the least (or greatest) how are we to find this element without examining the entire list? We'd say that merge_sort is strict in the array to be sorted: if an infinite list is supplied, the computation will diverge in the sense that output will never be provided. There are some operations that cannot be done lazily, for instance, sorting a list.

We've seen the difference between a lazy function and a strict function. Lazy computing has two major appeals. The first is that only enough work is done to compute a value. The second is that we can operate in the presence of infinite and undefined data structures, as long as we don't examine the undefined parts or try to process the infinity of values.