Difference between revisions of "Lazy pattern match"

Latest revision as of 14:51, 6 March 2013

What does "lazy pattern match" mean and what is the meaning of the tilde in pattern matches?

Syntax

These are all lazy pattern matches:

let (a,b) = p

f ~(a,b) = ...

case p of ~(a,b) -> ...

(\ ~(a,b) -> ... )

The let matches the top-most constructor lazily. This seems to be quite arbitrary but this is how it is defined. That is, if you want to match constructors lazily in two levels then you have to write:

let (a, ~(b,c)) = p

f ~(a, ~(b,c)) = ...

case p of ~(a, ~(b,c)) -> ...

(\ ~(a, ~(b,c)) -> ... )

Meaning

What is the meaning of a lazy pattern match and why is it required sometimes?

The lazy pattern match on a pair as in

f ~(a,b) = g a b

can be translated to

f p = g (fst p) (snd p)

Generally, a lazy pattern match is translated to calling corresponding record field accessors. The key difference between strict pattern match

f (a,b) = g a b

and lazy pattern match

f ~(a,b) = g a b

is that the strict pattern match requires to check for the pair constructor before g can be evaluated. In contrast to that, the lazy pattern match allows to defer the pair constructor match to the evaluation of g a b. If the function g can generate something without looking at its arguments then f can generate something as well before matching the pair constructor.

This difference can be essential. For example compare the implementation of splitAt with lazy and strict pattern match. Here is the (correct) lazy implementation:

import Prelude hiding (splitAt)

splitAt :: Int -> [a] -> ([a], [a])
splitAt n xs =
   if n<=0
     then ([], xs)
     else
        case xs of
           [] -> ([], [])
           y:ys ->
              case splitAt (n-1) ys of
                 ~(prefix, suffix) -> (y : prefix, suffix)

Now try

Test> splitAt 1000000 $ repeat 'a'

With lazy pattern match in the last line of the splitAt implementation you see an answer immediately whereas with a strict pattern match the Haskell interpreter requires some time and memory before showing something. The reason is that the strict pattern match forces the interpreter to perform all recursive calls to splitAt in order to check whether they actually generate a pair constructor. This might look silly since the pair type has only one constructor but mind you that a pair value can also be undefined. Furthermore this behavior is consistent with data types with more than one constructor.

If you are uncertain how that splitAt magic works then it might help to translate the tilde into pair element accessors fst and snd.

Implications

The lazy pattern match has some consequences. First of all a lazy pattern matches immediately always. Remember,

f ~(x:xs) = x:xs

is translated to

f ys = head ys : tail ys

and the match on ys always succeeds. That is with the lazy pattern match the order of matches counts, e.g.

f :: [a] -> [a]
f [] = []
f ~(x:xs) = x:xs

is fine but stupid, because the first match already requires the decision whether the list is empty or not. But the reversed order

f :: [a] -> [a]
f ~(x:xs) = x:xs
f [] = []

yields a compiler warning for an unnecessary match on [] because the first case already matches always.

You may have also noted the danger caused by a lazy pattern match. Since the lazy pattern match on a non-empty list is translated to head and tail it is as dangerous as head and tail, because these functions are not total. That is, generally it is good style to avoid lazy pattern matches on types with more than one constructor.

Difference between revisions of "Lazy pattern match"

Latest revision as of 14:51, 6 March 2013

Syntax

Meaning

Implications

Navigation menu

Search

@@ Line 49: / Line 49: @@
 </haskell>
 Generally, a lazy pattern match is translated to calling
-corresponding record field accessors.
+corresponding record field accessors. The key difference between strict pattern match
-The key difference between strict pattern match
 <haskell>
 f (a,b) = g a b
@@ Line 59: / Line 57: @@
 f ~(a,b) = g a b
 </haskell>
+is that the strict pattern match requires to check for the pair constructor before <hask>g</hask> can be evaluated. In contrast to that, the lazy pattern match allows to defer the pair constructor match to the evaluation of <hask>g a b</hask>. If the function <hask>g</hask> can generate something without looking at its arguments then <hask>f</hask> can generate something as well before matching the pair constructor.
-is that the strict pattern match requires to check for the pair constructor
-before <hask>g</hask> can be evaluated.
-In contrast to that, the lazy pattern match allows
-to defer the pair constructor match to the evaluation of <hask>g a b</hask>.
-If the function <hask>g</hask> can generate something
-without looking at its arguments
-then <hask>f</hask> can generate something as well
-before matching the pair constructor.
+This difference can be essential. For example compare the implementation of <hask>splitAt</hask> with lazy and strict pattern match. Here is the (correct) lazy implementation:
-This difference can be essential.
-For example compare the implementation of <hask>splitAt</hask>
-with lazy and strict pattern match.
-Here is the (correct) lazy implementation:
 <haskell>
 import Prelude hiding (splitAt)
@@ Line 90: / Line 78: @@
 Test> splitAt 1000000 $ repeat 'a'
 </haskell>
+With lazy pattern match in the last line of the <hask>splitAt</hask> implementation you see an answer immediately whereas with a strict pattern match the Haskell interpreter requires some time and memory before showing something. The reason is that the strict pattern match forces the interpreter to perform all recursive calls to <hask>splitAt</hask> in order to check whether they actually generate a pair constructor. This might look silly since the pair type has only one constructor but mind you that a pair value can also be <hask>undefined</hask>. Furthermore this behavior is consistent with data types with more than one constructor.
-With lazy pattern match
-in the last line of the <hask>splitAt</hask> implementation
-you see an answer immediately
-whereas with a strict pattern match
-the Haskell interpreter requires some time and memory
-before showing something.
-The reason is that the strict pattern match forces
-the interpreter to perform all recursive calls to <hask>splitAt</hask>
-in order to check whether they actually generate a pair constructor.
-This might look silly since the pair type has only one constructor
-but mind you that a pair value can also be <hask>undefined</hask>.
-Furthermore this behavior is consistent
-with data types with more than one constructor.
-If you are uncertain how that <hask>splitAt</hask> magic works
+If you are uncertain how that <hask>splitAt</hask> magic works then it might help to translate the tilde into pair element accessors <hask>fst</hask> and <hask>snd</hask>.
-then it might help to translate the tilde into pair element accessors
-<hask>fst</hask> and <hask>snd</hask>.
 == Implications ==
-The lazy pattern match has some consequences.
+The lazy pattern match has some consequences. First of all a lazy pattern matches immediately always. Remember,
-First of all a lazy pattern matches immediately always.
-Remember,
 <haskell>
 f ~(x:xs) = x:xs
@@ Line 121: / Line 93: @@
 f ys = head ys : tail ys
 </haskell>
-and the match on <hask>ys</hask> always succeeds.
+and the match on <hask>ys</hask> always succeeds. That is with the lazy pattern match the order of matches counts, e.g.
-That is with the lazy pattern match
-the order of matches counts.
-E.g.
 <haskell>
 f :: [a] -> [a]
@@ Line 130: / Line 99: @@
 f ~(x:xs) = x:xs
 </haskell>
+is fine but stupid, because the first match already requires the decision whether the list is empty or not. But the reversed order
-is fine but stupid,
-because the first match already requires the decision
-whether the list is empty or not.
-But the reversed order
 <haskell>
 f :: [a] -> [a]
@@ Line 139: / Line 105: @@
 f [] = []
 </haskell>
-yields a compiler warning
+yields a compiler warning for an unnecessary match on <hask>[]</hask> because the first case already matches always.
-because the first case already matches always.
+You may have also noted the danger caused by a lazy pattern match. Since the lazy pattern match on a non-empty list is translated to <hask>head</hask> and <hask>tail</hask> it is as dangerous as <hask>head</hask> and <hask>tail</hask>, because these functions are not [[total function|total]]. That is, generally it is good style to avoid lazy pattern matches on types with more than one constructor.
-You may have also noted the danger caused by a lazy pattern match.
-Since the lazy pattern match on a non-empty list
-is translated to <hask>head</hask> and <hask>tail</hask>
-it is as dangerous as <hask>head</hask> and <hask>tail</hask>,
-because these functions are not [[total function|total]].
-That is generally it is a good style to avoid lazy pattern matches
-on types with more than one constructor.