Memory leak

From HaskellWiki

Revision as of 06:07, 24 June 2010

A memory leak means that a program uses more memory than necessary for its execution. Although Haskell implementations use garbage collectors, programmers must still keep memory management in mind. A garbage collector reliably prevents dangling pointers, but it is still easy to produce memory leaks, especially in connection with lazy evaluation. Note that a leak not only consumes more and more memory, it also slows down the garbage collector considerably. This may even be one reason for the widespread opinion that garbage collectors are slow or unsuited for real-time applications.

Types of leaks

Holding a reference for too long

Consider for example:

let xs = [1..1000000::Integer]
in  sum xs * product xs

Most Haskell compilers assume that the programmer used let in order to share xs between the call of sum and the call of product, so the list xs is completely materialized and held in memory. However, the list xs is very cheap to compute, so recomputing it for each call would reduce memory usage considerably. To avoid duplicating code, we can achieve this by turning the list definition into a function that takes the upper bound as an argument.

let makeXs n = [1..n::Integer]
in  sum (makeXs 1000000) * product (makeXs 1000000)
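
When both results are needed, another option is to traverse the list only once, so that no element has to stay alive for a second pass. The following is a minimal sketch and not part of the original example: the name sumAndProduct is hypothetical, and the code assumes GHC's BangPatterns extension to keep both accumulators evaluated.

```haskell
{-# LANGUAGE BangPatterns #-}
import Data.List (foldl')

-- | Sum and product of a list in a single pass.
-- (Hypothetical helper, not from the original article.)
sumAndProduct :: [Integer] -> (Integer, Integer)
sumAndProduct = foldl' step (0, 1)
  where
    -- The bang patterns force the previous running totals at each step,
    -- so no chain of thunks builds up inside the pair. foldl' alone
    -- would only evaluate the pair constructor, not its components.
    step (!s, !p) x = (s + x, p * x)
```

For example, sumAndProduct [1..5] evaluates to (15,120); multiplying the two components gives the same value as sum xs * product xs for that list.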

Building up unevaluated expressions

Another typical cause of memory leaks is the accumulation of unevaluated expressions. The classical example is summing up the numbers of a list (which is what the sum function does).

foldl (+) 0 [1..1000000::Integer]

The problem is that the runtime system does not know whether the intermediate sums will actually be needed later, and thus it leaves them unevaluated. That is, it stores something equivalent to 1+2+3+4 instead of just 10. You may be lucky and the strictness analyzer removes the laziness at compile time, but in general you cannot rely on it. The safe way is to use seq to force evaluation of the intermediate sums. This is what foldl' from Data.List does.

foldl' (+) 0 [1..1000000::Integer]
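
To make the role of seq concrete, here is a sketch of how a strict left fold over lists can be written by hand. The name strictFoldl is hypothetical; the library's own foldl' is more general, and its actual definition may differ.

```haskell
-- | A hand-written strict left fold over lists.
-- (Hypothetical name; shown only to illustrate the use of seq.)
strictFoldl :: (b -> a -> b) -> b -> [a] -> b
strictFoldl _ acc []     = acc
strictFoldl f acc (x:xs) =
  let acc' = f acc x
      -- seq evaluates acc' (to weak head normal form) before the
      -- recursive call, so the accumulator never grows into a chain
      -- of pending applications of f.
  in  acc' `seq` strictFoldl f acc' xs
```

With this definition, strictFoldl (+) 0 [1..1000000::Integer] gives 500000500000, the same result as the foldl' call above.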

Detection of memory leaks

A memory leak can be detected by writing a test that should require only a limited amount of memory and then running the compiled program with a restricted heap size. For example, you can restrict the heap size to 4 MB like this:

$ ./mytest +RTS -M4m -RTS
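
As an illustration, a test program for this kind of check could look like the sketch below. The file name mytest.hs and the function total are assumptions chosen to match the command above. The build line in the comment uses -rtsopts, the GHC flag that recent GHC versions need before the program accepts RTS options such as -M.

```haskell
-- mytest.hs (hypothetical test program, not from the original article)
--
-- Build:  ghc -rtsopts mytest.hs
-- Run:    ./mytest +RTS -M4m -RTS
import Data.List (foldl')

-- | Sum of the numbers 1..n using a strict left fold, so that the
-- running total is evaluated at every step.
total :: Integer -> Integer
total n = foldl' (+) 0 [1..n]

main :: IO ()
main = print (total 1000000)
```

If foldl' is replaced by foldl and the program is built without optimization, the chain of unevaluated additions is expected to exceed the heap limit, and the run-time system then aborts the program with a heap-exhaustion error instead of printing the result. The limit that is tight enough to expose a leak without rejecting a correct program may depend on the GHC version and its default RTS settings.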