Maintaining laziness
Revision as of 20:00, 28 December 2008
One of Haskell's main features is non-strict semantics, which is implemented by lazy evaluation in all popular Haskell compilers. However, many Haskell libraries found on Hackage are implemented just as if Haskell were a strict language. This leads to unnecessary inefficiencies, memory leaks and, we suspect, unintended semantics. In this article we go through some techniques for checking the lazy behaviour of functions, give examples of typical constructs that break laziness without need, and finally link to techniques that may yield the same effect without laziness.
Checking laziness
undefined, cycles
unit tests
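As a minimal sketch of how such checks might look, the following tests feed a function a list whose tail is undefined, and a cyclic list, and inspect only a finite prefix of the result. The function doubleAll and the names of the checks are hypothetical and serve only as illustration. Reading "cycles" as cyclic input lists is an assumption.

```haskell
-- Hypothetical function under test: doubles every element of a list.
doubleAll :: [Int] -> [Int]
doubleAll = map (* 2)

-- Check with undefined: only the first three list cells are defined.
-- If doubleAll inspected more of its input than needed, evaluating
-- this value would hit undefined instead of yielding True.
prefixIsLazy :: Bool
prefixIsLazy = take 3 (doubleAll (1 : 2 : 3 : undefined)) == [2, 4, 6]

-- Check with a cyclic (infinite) list: a function that needs the whole
-- input before producing output would never terminate here.
cycleIsLazy :: Bool
cycleIsLazy = take 4 (doubleAll (cycle [1, 2])) == [2, 4, 2, 4]
```

Both values can be used as assertions in an ordinary unit test. A failing laziness check shows up as an exception or as non-termination, not as False, so a test runner with a timeout is advisable.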
Laziness breakers
Maybe, Either, Exceptions
Wadler's force function
The following looks cumbersome:
let (Just x) = y
in Just x
It looks like a complicated expression for y, with an added danger of failing unrecoverably when y is not Just.
...
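One observable difference between the let expression above and plain y can be sketched as follows. The name forceJust is hypothetical. Because the pattern in a let binding is matched lazily, the outer Just constructor is available before y is evaluated at all.

```haskell
import Data.Maybe (isJust)

-- Hypothetical name; wraps the let expression from the text.
-- The let pattern is lazy, so y is only evaluated when x is demanded.
forceJust :: Maybe a -> Maybe a
forceJust y = let (Just x) = y in Just x

-- The constructor is known without evaluating the argument,
-- so this is True even though the argument is undefined.
constructorKnownEarly :: Bool
constructorKnownEarly = isJust (forceJust undefined)
```

By contrast, isJust undefined fails. The price is the one named in the text: if y turns out to be Nothing, the failure happens later, when x is demanded, and cannot be recovered from.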
parsers - leave Maybe where no Maybe is required
Early decision
Be aware that the following two expressions are not equivalent.
-- less lazy
if b then f x else f y
-- more lazy
f (if b then x else y)
For b = undefined, the expression if undefined then f x else f y is undefined, whereas f (if undefined then x else y) is f undefined, which need not be undefined when f is lazy in its argument. This is a difference in non-strict semantics.
Consider e.g. if b then 'a':x else 'a':y.
A common source of too much strictness is making decisions too early and thus duplicating code in the decision branches. Intuitively speaking, the bad thing about code duplication (stylistic questions aside) is that the run-time system cannot see that some things are equal in both branches and perform them once, before the critical decision. The compiler and run-time system could in principle be "improved" to do so, but in order to keep things predictable, they do not. Moreover, this behaviour is required by theory, since pushing a decision into the interior of an expression changes the semantics of the expression. So we return to the question of what the programmer actually wants.
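The 'a':x example can be made concrete with a small sketch. The names early and late are hypothetical. Both functions agree whenever b is defined, but only late can deliver the leading 'a' before the decision is made.

```haskell
-- Decision made early: the common 'a' is duplicated in both branches,
-- so nothing can be produced before b is evaluated.
early :: Bool -> String -> String -> String
early b x y = if b then 'a' : x else 'a' : y

-- Decision pushed inwards: the common 'a' is produced first,
-- and b is only evaluated when the tail is demanded.
late :: Bool -> String -> String -> String
late b x y = 'a' : (if b then x else y)

-- The first character is available although the condition is undefined.
firstCharOfLate :: String
firstCharOfLate = take 1 (late undefined "x" "y")  -- "a"
```

Evaluating take 1 (early undefined "x" "y") instead fails, because the if must inspect b before either branch can contribute a list cell.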
pattern match on (,) is better than pattern match on (:), because the first one has no alternative constructor
laziness encoded in uncurry
if then else
state monad
reader monad
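The remarks on (,) and uncurry can be illustrated with a short sketch. The names strictPair, lazyPair and viaUncurry are hypothetical. Since the pair type has only one constructor, a lazy pattern on it can never fail, so deferring the match is safe. The last definition relies on uncurry being defined through fst and snd, as in the Haskell Report.

```haskell
-- Ordinary pattern: the argument is evaluated to find the (,)
-- constructor, so strictPair undefined fails.
strictPair :: (a, b) -> Int
strictPair (_, _) = 0

-- Lazy (irrefutable) pattern: the match is deferred until a component
-- is used. With a single-constructor type this cannot go wrong later.
lazyPair :: (a, b) -> Int
lazyPair ~(_, _) = 0

-- uncurry f p = f (fst p) (snd p), so the pair is only inspected when
-- the function actually uses one of its arguments.
viaUncurry :: Int
viaUncurry = uncurry (\_ _ -> 0) undefined
```

Both lazyPair undefined and viaUncurry evaluate to 0. A lazy pattern on (:) would be riskier, because the value might turn out to be [] when a component is finally demanded.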
Strict pattern matching in a recursion
The implementation of the partition function in GHC up to 6.2 failed on infinite lists.
What happened?
The reason was an overly strict pattern match.
Consider the following correct implementation:
partition :: (a -> Bool) -> [a] -> ([a], [a])
partition p =
   foldr
      (\x ~(y,z) ->
         if p x
           then (x : y, z)
           else (y, x : z))
      ([],[])
...
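A short sketch of how the lazy pattern pays off on an infinite list follows. The function is renamed to partitionLazy (a hypothetical name) to avoid confusion with Data.List.partition, and the two example values are illustrative only.

```haskell
-- Same definition as above under a hypothetical name. The ~ makes the
-- match on the recursive result lazy, so each step can emit its pair
-- before the rest of the list has been processed.
partitionLazy :: (a -> Bool) -> [a] -> ([a], [a])
partitionLazy p =
  foldr
    (\x ~(ys, zs) -> if p x then (x : ys, zs) else (ys, x : zs))
    ([], [])

-- Finite prefixes of both result lists of an infinite input.
firstEvens :: [Integer]
firstEvens = take 3 (fst (partitionLazy even [0 ..]))  -- [0,2,4]

firstOdds :: [Integer]
firstOdds = take 3 (snd (partitionLazy even [0 ..]))   -- [1,3,5]
```

Without the ~, the lambda has to evaluate the result of the recursive foldr call to a pair before it can return anything, so on an infinite list no output is ever produced.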
List reversal
Any use of the list function reverse should alert you, since accessing the first element of a reversed list requires all nodes of the input list to be evaluated and stored in memory.
Think twice about whether it is really needed.
The article Infinity and efficiency shows how to avoid list reversal.
Alternatives
From the above issues you can see that laziness is a fragile thing. It takes only one moment of inattention, and a function carefully developed with laziness in mind is no longer lazy when you call it. The type system can hardly help you hunt down laziness breakers, and there is little support from debuggers. Thus detecting laziness breakers often requires understanding a large portion of code, which goes against the idea of modularity. For your case you might prefer a different idiom that achieves the same goals in a safer way. See e.g. the Enumerator and iteratee pattern.