Lazy vs. non-strict

Haskell is often described as a lazy language. However, the language specification simply states that Haskell is non-strict, which is not quite the same thing as lazy.

Direction of evaluation

Non-strictness means that reduction (the mathematical term for evaluation) proceeds from the outside in, so if you have (a+(b*c)) then first you reduce the +, then you reduce the inner (b*c). Strict languages work the other way around, starting with the innermost brackets and working outwards.

This matters to the semantics because if you have an expression that evaluates to bottom (i.e. an error or endless loop) then any language that starts at the inside and works outwards will always find that bottom value, and hence the bottom will propagate outwards. However, if you start from the outside and work in, then some of the sub-expressions are eliminated by the outer reductions, so they never get evaluated and you don't get "bottom".
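As a small illustration of sub-expressions being eliminated by outer reductions, here is a sketch in which each definition contains a bottom (or an infinite structure) that is never reached. The binding names are made up for this example.

```haskell
-- const ignores its second argument, so reducing from the outside in
-- never touches the bottom value passed there.
ignoresBottom :: Int
ignoresBottom = const 42 (undefined :: Int)

-- fst discards the second component of the pair without evaluating it.
firstOnly :: Int
firstOnly = fst (1, error "never evaluated")

-- length walks only the spine of the list; the elements stay unevaluated.
spineOnly :: Int
spineOnly = length [undefined, undefined, undefined :: Int]

-- take 3 reduces only as much of the infinite list as it needs.
finitePrefix :: [Int]
finitePrefix = take 3 (cycle [1, 2])
```

In a strict language the equivalent of each of these would raise an error or fail to terminate, because the arguments would be evaluated before the outer function is applied.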

Lazy evaluation, on the other hand, means only evaluating an expression when its results are needed (note the shift from "reduction" to "evaluation"). So when the evaluation engine sees an expression it builds a thunk data structure containing whatever values are needed to evaluate the expression, plus a pointer to the expression itself. When the result is actually needed the evaluation engine evaluates the expression and then replaces the thunk with the result for future reference.
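The thunk behaviour described above can be observed, roughly, with Debug.Trace. This is only a sketch: the binding names are made up, and how often a trace message appears depends on the implementation. Under GHC one would expect "evaluating x" to be reported once, even though x is used twice, and "evaluating y" not to be reported at all.

```haskell
import Debug.Trace (trace)

-- The let binding creates one thunk for x. The first use forces it and
-- the thunk is replaced by its result, so the second use reuses that value.
sharedSum :: Int
sharedSum =
  let x = trace "evaluating x" (2 + 3 :: Int)
  in x + x

-- y is never needed, so its thunk is never forced.
unusedThunk :: Int
unusedThunk =
  let y = trace "evaluating y" (10 * 10 :: Int)
  in 7
```

The trace messages go to stderr, so they are a debugging aid only and not part of the program's result.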

Obviously there is a strong correspondence between a thunk and a partly-evaluated expression. Hence in most cases the terms "lazy" and "non-strict" are synonyms. But not quite. For instance you could imagine an evaluation engine on highly parallel hardware that fires off sub-expression evaluation eagerly, but then throws away results that are not needed. Such an engine would be non-strict without being lazy.

In practice Haskell is not a purely lazy language. For instance, pattern matching is usually strict: trying a pattern match forces evaluation to happen at least far enough to accept or reject the match. You can prepend a ~ to a pattern in order to make the match lazy. The strictness analyzer also looks for cases where sub-expressions are always required by the outer expression, and converts those into eager evaluation. It can do this because the semantics (in terms of "bottom") don't change. Programmers can also use the seq primitive to force an expression to be evaluated (to weak head normal form) even if its result would otherwise never be used. $! is defined in terms of seq.
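The difference between strict and lazy pattern matches, and the effect of seq and $!, can be sketched as follows. All names here are made up for the example. In particular raisesWhenForced is a hypothetical helper, not a library function: it only detects bottoms that raise an exception and cannot detect endless loops.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- An ordinary constructor pattern forces the argument far enough to
-- see the pair constructor.
strictMatch :: (a, b) -> Int
strictMatch (_, _) = 0

-- The ~ makes the pattern lazy (irrefutable): the argument is not
-- inspected, because none of its components is ever used.
lazyMatch :: (a, b) -> Int
lazyMatch ~(_, _) = 0

-- seq forces its first argument before returning the second.
forcedBySeq :: Int
forcedBySeq = (undefined :: Int) `seq` 1

-- f $! x forces x before applying f, even though const ignores it.
forcedByStrictApply :: Int
forcedByStrictApply = const 1 $! (undefined :: Int)

-- Ordinary application leaves the ignored argument unevaluated.
lazyApply :: Int
lazyApply = const 1 $ (undefined :: Int)

-- Hypothetical helper: True if forcing the value raises an exception.
raisesWhenForced :: a -> IO Bool
raisesWhenForced x = do
  r <- try (evaluate x)
  return $ case r of
    Left e  -> const True (e :: SomeException)
    Right _ -> False
```

With these definitions, raisesWhenForced (strictMatch undefined), raisesWhenForced forcedBySeq and raisesWhenForced forcedByStrictApply should yield True, while raisesWhenForced (lazyMatch undefined) and raisesWhenForced lazyApply should yield False.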

Source:

Paul Johnson in Haskell Cafe: "What is the role of $!?"

WHNF

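WHNF is an abbreviation for weak head normal form. An expression is in WHNF when its outermost part is a data constructor or a lambda abstraction; the arguments of the constructor and the body of the lambda need not be evaluated.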
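Since seq evaluates its first argument only as far as weak head normal form, it can be used to probe what counts as WHNF. The following sketch (binding names are made up) forces several values whose outermost part is a constructor or a lambda but whose insides are bottom. Each definition evaluates to True because the bottoms are never reached.

```haskell
-- A constructor application is already in WHNF, whatever its fields hold.
pairIsWhnf :: Bool
pairIsWhnf = (undefined :: Int, undefined :: Int) `seq` True

justIsWhnf :: Bool
justIsWhnf = Just (undefined :: Int) `seq` True

-- Forcing the first cons cell forces neither the element nor the tail.
consIsWhnf :: Bool
consIsWhnf = ((undefined :: Int) : undefined) `seq` True

-- A lambda abstraction is in WHNF; its body is not evaluated until it
-- is applied.
lambdaIsWhnf :: Bool
lambdaIsWhnf = (\x -> (undefined :: Int) + x) `seq` True
```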

Further references

Haskell’s Non-Strict Semantics – What Exactly does Lazy Evaluation Calculate? – An introductory tutorial on the difference between lazy evaluation and non-strict semantics.

Laziness is simply a common implementation technique for non-strict languages, but it is not the only possible technique. One major drawback with lazy implementations is that they are not generally amenable to parallelisation. This paper states that experiments indicate that little parallelism can be extracted from lazy programs:

"The Impact of Laziness on Parallelism and the Limits of Strictness Analysis" (G. Tremblay, G. R. Gao) http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.36.3806

Lenient, or optimistic, evaluation is an implementation approach that lies somewhere between lazy and strict, and combines eager evaluation with non-strict semantics. This seems to be considered more promising for parallelisation.

This paper implies (section 2.2.1) that lenient evaluation can handle circular data structures and recursive definitions, but cannot express infinite structures without explicit use of delays:

"How Much Non-strictness do Lenient Programs Require?" (Klaus E. Schauser, Seth C. Goldstein) http://www.cs.cmu.edu/~seth/papers/schauser-fplca95.pdf

Some experiments with non-lazy Haskell compilers have been attempted: Research_papers/Runtime_systems#Optimistic_Evaluation