Correctness of short cut fusion
==Short cut fusion==

[[Short cut fusion]] allows elimination of intermediate data structures using rewrite rules that can also be performed automatically during compilation.

The two most popular instances are the <hask>foldr</hask>/<hask>build</hask> rule and the <hask>destroy</hask>/<hask>unfoldr</hask> rule for Haskell lists.

===<hask>foldr</hask>/<hask>build</hask>===

The <hask>foldr</hask>/<hask>build</hask> rule eliminates intermediate lists produced by <hask>build</hask> and consumed by <hask>foldr</hask>, where these functions are defined as follows:

<haskell>
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr c n []     = n
foldr c n (x:xs) = c x (foldr c n xs)

build :: (forall b. (a -> b -> b) -> b -> b) -> [a]
build g = g (:) []
</haskell>

Note the rank-2 polymorphic type of <hask>build</hask>.

The <hask>foldr</hask>/<hask>build</hask> rule now means the following replacement for appropriately typed <hask>g</hask>, <hask>c</hask>, and <hask>n</hask>:
<haskell>

foldr c n (build g) → g c n
</haskell>
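To see the rule at work, here is a small self-contained sketch; the local <hask>build</hask> mirrors the definition above, and the producer <hask>upto</hask> is a hypothetical example of ours. Both sides compute the same result, but the right-hand side never constructs the intermediate list:

<haskell>
{-# LANGUAGE RankNTypes #-}

build :: (forall b. (a -> b -> b) -> b -> b) -> [a]
build g = g (:) []

-- hypothetical producer: the list [1..m] in build form
upto :: Int -> [Int]
upto m = build (\c n -> let go i = if i > m then n else i `c` go (i + 1)
                        in go 1)

-- left-hand side of the rule: foldr consumes the built list
lhs :: Int
lhs = foldr (+) 0 (upto 10)

-- right-hand side after fusion: the argument of build applied directly to (+) and 0
rhs :: Int
rhs = (\c n -> let go i = if i > 10 then n else i `c` go (i + 1) in go 1) (+) 0
</haskell>

Both <hask>lhs</hask> and <hask>rhs</hask> evaluate to <hask>55</hask>.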

===<hask>destroy</hask>/<hask>unfoldr</hask>===

The <hask>destroy</hask>/<hask>unfoldr</hask> rule eliminates intermediate lists produced by <hask>unfoldr</hask> and consumed by <hask>destroy</hask>, where these functions are defined as follows:
<haskell>

destroy :: (forall b. (b -> Maybe (a,b)) -> b -> c) -> [a] -> c

destroy g = g step

step :: [a] -> Maybe (a,[a])
step []     = Nothing
step (x:xs) = Just (x,xs)

unfoldr :: (b -> Maybe (a,b)) -> b -> [a]

unfoldr p e = case p e of Nothing      -> []
                          Just (x,e') -> x:unfoldr p e'
</haskell>

Note the rank-2 polymorphic type of <hask>destroy</hask>.

The <hask>destroy</hask>/<hask>unfoldr</hask> rule now means the following replacement for appropriately typed <hask>g</hask>, <hask>p</hask>, and <hask>e</hask>:
<haskell>

destroy g (unfoldr p e) → g p e
</haskell>
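As a concrete sketch (the consumer <hask>sumG</hask> and the stepper <hask>stepDown</hask> are hypothetical names of ours), a sum written in <hask>destroy</hask> style fuses with an <hask>unfoldr</hask> producer; after rewriting, the producer's stepper is handed directly to the consumer:

<haskell>
{-# LANGUAGE RankNTypes #-}

destroy :: (forall b. (b -> Maybe (a, b)) -> b -> c) -> [a] -> c
destroy g = g step
  where
    step []     = Nothing
    step (x:xs) = Just (x, xs)

unfoldr :: (b -> Maybe (a, b)) -> b -> [a]
unfoldr p e = case p e of
  Nothing      -> []
  Just (x, e') -> x : unfoldr p e'

-- consumer in destroy style: sums whatever the stepper yields
sumG :: (s -> Maybe (Int, s)) -> s -> Int
sumG next = go
  where
    go s = case next s of
      Nothing      -> 0
      Just (x, s') -> x + go s'

-- producer stepper: counts down from the seed to 1
stepDown :: Int -> Maybe (Int, Int)
stepDown i = if i < 1 then Nothing else Just (i, i - 1)

lhs, rhs :: Int
lhs = destroy sumG (unfoldr stepDown 4)  -- via the intermediate list [4,3,2,1]
rhs = sumG stepDown 4                    -- fused: no list is built
</haskell>

Both yield <hask>10</hask>.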

==Correctness==

If the <hask>foldr</hask>/<hask>build</hask> rule and the <hask>destroy</hask>/<hask>unfoldr</hask> rule are to be automatically performed during compilation, as is possible using GHC's RULES pragmas, we clearly want them to be equivalences.
That is, the left- and right-hand sides should be semantically the same for each instance of either rule.
Unfortunately, this is not so in Haskell.
We can distinguish two situations, depending on whether <hask>g</hask> is defined using <hask>seq</hask> or not.

===In the absence of <hask>seq</hask>===
====<hask>foldr</hask>/<hask>build</hask>====

If <hask>g</hask> does not use <hask>seq</hask>, then the <hask>foldr</hask>/<hask>build</hask> rule really is a semantic equivalence, that is, it holds that

<haskell>
foldr c n (build g) = g c n
</haskell>

The two sides are interchangeable in any program without affecting semantics.
====<hask>destroy</hask>/<hask>unfoldr</hask>====

The <hask>destroy</hask>/<hask>unfoldr</hask> rule, however, is not a semantic equivalence.

To see this, consider the following instance:

<haskell>
g = \x y -> case x y of Just z -> 0
p = \x -> if x==0 then Just undefined else Nothing
e = 0
</haskell>

These values have appropriate types for being used in the <hask>destroy</hask>/<hask>unfoldr</hask> rule. But with them, that rule's left-hand side "evaluates" as follows:
<haskell>

destroy g (unfoldr p e) = g step (unfoldr p e)
                        = case step (unfoldr p e) of Just z -> 0
                        = case step (case p e of Nothing      -> []
                                                 Just (x,e') -> x:unfoldr p e') of Just z -> 0
                        = case step (case Just undefined of Nothing      -> []
                                                            Just (x,e') -> x:unfoldr p e') of Just z -> 0
                        = undefined

</haskell>

while its right-hand side "evaluates" as follows:

<haskell>
g p e = case p e of Just z -> 0
      = case Just undefined of Just z -> 0
      = 0
</haskell>

Thus, by applying the <hask>destroy</hask>/<hask>unfoldr</hask> rule, a non-terminating (or otherwise failing) program can be transformed into a safely terminating one.

The obvious questions now are:

* Can the converse also happen, that is, can a safely terminating program be transformed into a failing one?
* Can a safely terminating program be transformed into another safely terminating one that gives a different value as result?

There is no formal proof yet, but strong evidence supporting the conjecture that the answer to both questions is "No!".

The conjecture goes that if <hask>g</hask> does not use <hask>seq</hask>, then the <hask>destroy</hask>/<hask>unfoldr</hask> rule is a semantic approximation from left to right, that is, it holds that
<haskell>

destroy g (unfoldr p e) ⊑ g p e
</haskell>
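The counterexample can be observed in a running program. The following self-contained sketch repeats the definitions locally; the helper <hask>fails</hask> is ours, used to detect ⊥ via an exception. Forcing the left-hand side throws, while the right-hand side yields <hask>0</hask>:

<haskell>
{-# LANGUAGE RankNTypes #-}
import Control.Exception (SomeException, evaluate, try)

destroy :: (forall b. (b -> Maybe (a, b)) -> b -> c) -> [a] -> c
destroy g = g step
  where
    step []     = Nothing
    step (x:xs) = Just (x, xs)

unfoldr :: (b -> Maybe (a, b)) -> b -> [a]
unfoldr p e = case p e of
  Nothing      -> []
  Just (x, e') -> x : unfoldr p e'

g :: (s -> Maybe (Int, s)) -> s -> Int
g = \x y -> case x y of Just z -> 0

p :: Int -> Maybe (Int, Int)
p = \x -> if x == 0 then Just undefined else Nothing

lhs, rhs :: Int
lhs = destroy g (unfoldr p 0)  -- step matches the pair inside Just undefined: ⊥
rhs = g p 0                    -- the pair is never inspected: 0

-- helper (ours): does forcing the argument raise an exception?
fails :: Int -> IO Bool
fails v = do
  r <- try (evaluate v) :: IO (Either SomeException Int)
  return (either (const True) (const False) r)
</haskell>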

What ''is'' known is that semantic equivalence can be recovered here by putting moderate restrictions on <hask>p</hask>.

More precisely, if <hask>g</hask> does not use <hask>seq</hask> and <hask>p</hask> is a strict function that never returns <hask>Just ⊥</hask> (where ⊥ denotes any kind of failure or non-termination), then indeed:

<haskell>
destroy g (unfoldr p e) = g p e
</haskell>

===In the presence of <hask>seq</hask>===

This is the more interesting setting, given that in Haskell there is no way to restrict the use of <hask>seq</hask>, so in any given program we must be prepared for the possibility that the <hask>g</hask> appearing in the <hask>foldr</hask>/<hask>build</hask> rule or the <hask>destroy</hask>/<hask>unfoldr</hask> rule is defined using <hask>seq</hask>.

Unsurprisingly, it is also the setting in which more can go wrong than above.

====<hask>foldr</hask>/<hask>build</hask>====

In the presence of <hask>seq</hask>, the <hask>foldr</hask>/<hask>build</hask> rule is not necessarily a semantic equivalence.
The instance

<haskell>
g = seq
c = undefined
n = 0
</haskell>

shows, via similar "evaluations" as above, that the right-hand side (<hask>g c n</hask>) can be strictly less defined than the left-hand side (<hask>foldr c n (build g)</hask>).
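This instance can be replayed in a self-contained sketch (local <hask>build</hask> as above; the helper <hask>fails</hask> is ours). Since <hask>build seq = seq (:) [] = []</hask>, the left-hand side is <hask>0</hask>, while the right-hand side forces <hask>undefined</hask>:

<haskell>
{-# LANGUAGE RankNTypes #-}
import Control.Exception (SomeException, evaluate, try)

build :: (forall b. (a -> b -> b) -> b -> b) -> [a]
build g = g (:) []

-- left-hand side: build seq is [], and foldr never touches c on []
lhs :: Int
lhs = foldr undefined 0 (build seq :: [Int])

-- right-hand side: g c n = seq undefined 0 = undefined
rhs :: Int
rhs = seq (undefined :: Int -> Int -> Int) 0

-- helper (ours): does forcing the argument raise an exception?
fails :: Int -> IO Bool
fails v = do
  r <- try (evaluate v) :: IO (Either SomeException Int)
  return (either (const True) (const False) r)
</haskell>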

The converse cannot happen, because the following always holds:

<haskell>
foldr c n (build g) ⊒ g c n
</haskell>

Moreover, semantic equivalence can again be recovered by putting restrictions on the involved functions.

On the consumption side, if <hask>(c ⊥ ⊥) ≠ ⊥</hask> and <hask>n ≠ ⊥</hask>, then even in the presence of <hask>seq</hask>:

<haskell>
foldr c n (build g) = g c n
</haskell>

On the production side, <hask>seq</hask> can be used safely as long as it is never used to force anything whose type <hask>build</hask> expects to be polymorphic. In particular, the function passed to <hask>build</hask> must not force either of its arguments, and must not force anything constructed using them. For example, in

<haskell>
f x = build (\c n -> x `seq` (x `c` n))
</haskell>

the usual equivalence holds, regardless of <hask>c</hask> and <hask>n</hask>:

<haskell>
foldr c n (f x) = x `seq` (x `c` n)
</haskell>

For a more interesting example, we can define

<haskell>
hyloList f q c n =
  case f q of
    Nothing      -> n
    Just (x,q') -> x `c` hyloList f q' c n

unfoldr f q = build (hyloList f q)
</haskell>
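A quick check of this definition (the stepper <hask>next</hask> is an example of ours): the <hask>build</hask>-based <hask>unfoldr</hask> produces the expected list, and a <hask>foldr</hask> consumer fuses into a direct call of <hask>hyloList</hask>:

<haskell>
{-# LANGUAGE RankNTypes #-}

build :: (forall b. (a -> b -> b) -> b -> b) -> [a]
build g = g (:) []

hyloList :: (q -> Maybe (a, q)) -> q -> (a -> b -> b) -> b -> b
hyloList f q c n =
  case f q of
    Nothing      -> n
    Just (x, q') -> x `c` hyloList f q' c n

unfoldr :: (q -> Maybe (a, q)) -> q -> [a]
unfoldr f q = build (hyloList f q)

-- example stepper (ours): yields 1, 2, 3
next :: Int -> Maybe (Int, Int)
next i = if i > 3 then Nothing else Just (i, i + 1)

produced :: [Int]
produced = unfoldr next 1      -- [1,2,3]

-- what foldr (+) 0 (unfoldr next 1) becomes after foldr/build fusion
fused :: Int
fused = hyloList next 1 (+) 0  -- 6
</haskell>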

Note that if <hask>f</hask> or <hask>q</hask> uses <hask>seq</hask>, then that will appear in the argument to <hask>build</hask>, but that is still safe because <hask>f</hask> and <hask>q</hask> have no way to get their hands on <hask>c</hask>, <hask>n</hask>, or anything built from them.

====<hask>destroy</hask>/<hask>unfoldr</hask>====

Contrary to the situation without <hask>seq</hask>, now the <hask>destroy</hask>/<hask>unfoldr</hask> rule, too, may decrease the definedness of a program.
This is witnessed by the following instance:

<haskell>
g = \x y -> seq x 0
p = undefined
e = 0
</haskell>

Here the left-hand side of the rule (<hask>destroy g (unfoldr p e)</hask>) yields <hask>0</hask>, while the right-hand side (<hask>g p e</hask>) yields <hask>undefined</hask>.
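This, too, can be replayed directly (definitions repeated locally; the helper <hask>fails</hask> is ours). On the left, <hask>g</hask> only forces the stepper, which is defined, so the broken producer is never consulted; on the right, <hask>g</hask> forces <hask>p = undefined</hask> itself:

<haskell>
{-# LANGUAGE RankNTypes #-}
import Control.Exception (SomeException, evaluate, try)

destroy :: (forall b. (b -> Maybe (a, b)) -> b -> c) -> [a] -> c
destroy g = g step
  where
    step []     = Nothing
    step (x:xs) = Just (x, xs)

unfoldr :: (b -> Maybe (a, b)) -> b -> [a]
unfoldr p e = case p e of
  Nothing      -> []
  Just (x, e') -> x : unfoldr p e'

g :: (s -> Maybe (Int, s)) -> s -> Int
g = \x y -> seq x 0

lhs, rhs :: Int
lhs = destroy g (unfoldr undefined (0 :: Int))  -- seq step 0 = 0
rhs = g undefined (0 :: Int)                    -- seq undefined 0 = ⊥

-- helper (ours): does forcing the argument raise an exception?
fails :: Int -> IO Bool
fails v = do
  r <- try (evaluate v) :: IO (Either SomeException Int)
  return (either (const True) (const False) r)
</haskell>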

Conditions for semantic approximation in either direction can be given as follows.

If <hask>p ≠ ⊥</hask> and <hask>(p ⊥)</hask> ∈ {<hask>⊥</hask>, <hask>Just ⊥</hask>}, then:

<haskell>
destroy g (unfoldr p e) ⊑ g p e
</haskell>

If <hask>p</hask> is strict and total and never returns <hask>Just ⊥</hask>, then:

<haskell>
destroy g (unfoldr p e) ⊒ g p e
</haskell>

Of course, conditions for semantic equivalence can be obtained by combining the two laws above.

==Discussion==

Correctness of short cut fusion is not just an academic issue.
All recent versions of [[GHC]] (at least 6.0 through 6.6) automatically perform transformations like <hask>foldr</hask>/<hask>build</hask> during their optimization pass (also in the guise of more specialized rules such as <hask>head</hask>/<hask>build</hask>). The rules are specified in the GHC.Base and GHC.List modules.
There has been at least one occasion where, as a result, a safely terminating program was turned into a failing one "in the wild", with a less artificial example than the ones given above.

===<hask>foldr</hask>/<hask>build</hask>===

As pointed out above, everything is fine with <hask>foldr</hask>/<hask>build</hask> in the absence of <hask>seq</hask>.
If the producer (<hask>build g</hask>) of the intermediate list may be defined using <hask>seq</hask>, then the conditions <hask>(c ⊥ ⊥) ≠ ⊥</hask> and <hask>n ≠ ⊥</hask> had better be satisfied, lest the compiler transform a perfectly fine program into a failing one.

The mentioned conditions are equivalent to requiring that the consumer (<hask>foldr c n</hask>) is a total function, that is, one that maps non-⊥ lists to a non-⊥ value.
It is thus relatively easy to identify whether a list consumer defined in terms of <hask>foldr</hask> is eligible for <hask>foldr</hask>/<hask>build</hask> fusion in the presence of <hask>seq</hask> or not.
For example, the Prelude functions <hask>head</hask> and <hask>sum</hask> are generally not, while <hask>map</hask> is.

There is, however, currently no way to detect automatically, inside the compiler, whether a particular instance of <hask>foldr</hask>/<hask>build</hask> fusion is safe, i.e., whether the producer avoids <hask>seq</hask> or the consumer is total.

===<hask>destroy</hask>/<hask>unfoldr</hask>===

As above, the compiler cannot figure out automatically whether (and how) a given instance of <hask>destroy</hask>/<hask>unfoldr</hask> fusion will change the semantics of a program.

An easy way to get rid of the condition regarding <hask>p</hask> never returning <hask>Just ⊥</hask> is to slightly change the definitions of the functions involved:

<haskell>
data Step a b = Done | Yield a b

destroy' :: (forall b. (b -> Step a b) -> b -> c) -> [a] -> c
destroy' g = g step'

step' :: [a] -> Step a [a]
step' []     = Done
step' (x:xs) = Yield x xs

unfoldr' :: (b -> Step a b) -> b -> [a]
unfoldr' p e = case p e of Done       -> []
                           Yield x e' -> x:unfoldr' p e'
</haskell>
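A quick sanity check that the primed versions behave like the originals (the consumer <hask>sumG</hask> and the stepper <hask>down</hask> are examples of ours):

<haskell>
{-# LANGUAGE RankNTypes #-}

data Step a b = Done | Yield a b

destroy' :: (forall b. (b -> Step a b) -> b -> c) -> [a] -> c
destroy' g = g step'
  where
    step' []     = Done
    step' (x:xs) = Yield x xs

unfoldr' :: (b -> Step a b) -> b -> [a]
unfoldr' p e = case p e of
  Done       -> []
  Yield x e' -> x : unfoldr' p e'

-- consumer in destroy' style: sums the yielded elements
sumG :: (s -> Step Int s) -> s -> Int
sumG next = go
  where
    go s = case next s of
      Done       -> 0
      Yield x s' -> x + go s'

-- producer stepper: counts down from the seed to 1
down :: Int -> Step Int Int
down i = if i < 1 then Done else Yield i (i - 1)

lhs, rhs :: Int
lhs = destroy' sumG (unfoldr' down 4)  -- via the list [4,3,2,1]
rhs = sumG down 4                      -- fused
</haskell>

Both evaluate to <hask>10</hask>.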

The new type <hask>Step a b</hask> is almost isomorphic to <hask>Maybe (a,b)</hask>, but avoids the "junk value" <hask>Just ⊥</hask>. This change does not affect the expressiveness of <hask>unfoldr</hask> or <hask>unfoldr'</hask> with respect to list producers.
But it allows some of the laws above to be simplified a bit.

We would still have that if <hask>g</hask> does not use <hask>seq</hask>, then:

<haskell>
destroy' g (unfoldr' p e) ⊑ g p e
</haskell>

Moreover, if <hask>g</hask> does not use <hask>seq</hask> and <hask>p</hask> is strict, then even:

<haskell>
destroy' g (unfoldr' p e) = g p e
</haskell>

In the potential presence of <hask>seq</hask>, if <hask>p ≠ ⊥</hask> and <hask>p</hask> is strict, then:

<haskell>
destroy' g (unfoldr' p e) ⊑ g p e
</haskell>

Also without restriction regarding <hask>seq</hask>, if <hask>p</hask> is strict and total, then:

<haskell>
destroy' g (unfoldr' p e) ⊒ g p e
</haskell>

The worst change in program behavior from a compiler user's point of view is when, through application of "optimization" rules, a safely terminating program is transformed into a failing one or one delivering a different result.
This can happen in the presence of <hask>seq</hask>, for example with a producer of the form

<haskell>
repeat x = unfoldr (\y -> Just (x,y)) undefined
</haskell>

or

<haskell>
repeat x = unfoldr' (\y -> Yield x y) undefined
</haskell>

Fortunately, it cannot happen for any producer of a non-empty, spine-total list (i.e., one that contains at least one element and ends with <hask>[]</hask>).
The reason is that for any such producer expressed via <hask>unfoldr</hask> or <hask>unfoldr'</hask>, the conditions imposed on <hask>p</hask> in the left-to-right approximation laws above are necessarily fulfilled.

A left-to-right approximation as in

<haskell>
destroy g (unfoldr p e) ⊑ g p e
</haskell>

under suitable preconditions might be acceptable in practice.
After all, it only means that the transformed program may be "more terminating" than the original one, but not less so.

If one insists on semantic equivalence rather than approximation, then the conditions imposed on the producer of the intermediate list become quite severe, in particular in the potential presence of <hask>seq</hask>.
For example, the following producer has to be outlawed then:

<haskell>
enumFromTo n m = unfoldr (\i -> if i>m then Nothing else Just (i,i+1)) n
</haskell>
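For reference, the producer itself is unremarkable and computes the ordinary enumeration; it is only the severe conditions for semantic equivalence that it runs afoul of. A quick self-contained check (with <hask>unfoldr</hask> defined locally):

<haskell>
import Prelude hiding (enumFromTo)

unfoldr :: (b -> Maybe (a, b)) -> b -> [a]
unfoldr p e = case p e of
  Nothing      -> []
  Just (x, e') -> x : unfoldr p e'

enumFromTo :: Int -> Int -> [Int]
enumFromTo n m = unfoldr (\i -> if i > m then Nothing else Just (i, i + 1)) n
</haskell>

For example, <hask>enumFromTo 2 5</hask> gives <hask>[2,3,4,5]</hask> and <hask>enumFromTo 5 2</hask> gives <hask>[]</hask>.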

==Literature==

Various parts of the above story, and elaborations thereof, are also told in the following papers:

* A. Gill, J. Launchbury, and S.L. Peyton Jones. [http://doi.acm.org/10.1145/165180.165214 A short cut to deforestation]. Functional Programming Languages and Computer Architecture, Proceedings, pages 223-232, ACM Press, 1993.
* J. Svenningsson. [http://doi.acm.org/10.1145/581478.581491 Shortcut fusion for accumulating parameters & zip-like functions]. International Conference on Functional Programming, Proceedings, pages 124-132, ACM Press, 2002.
* P. Johann. [http://dx.doi.org/10.1017/S0960129504004578 On proving the correctness of program transformations based on free theorems for higher-order polymorphic calculi]. Mathematical Structures in Computer Science, 15:201-229, 2005.
* P. Johann and J. Voigtländer. [http://iospress.metapress.com/openurl.asp?genre=article&issn=01692968&volume=69&issue=1&spage=63 The impact of seq on free theorems-based program transformations]. Fundamenta Informaticae, 69:63-102, 2006.
* J. Voigtländer and P. Johann. [http://dx.doi.org/10.1016/j.tcs.2007.09.014 Selective strictness and parametricity in structural operational semantics, inequationally]. Theoretical Computer Science, 388:290-318, 2007.
* J. Voigtländer. [http://doi.acm.org/10.1145/1328408.1328412 Proving Correctness via Free Theorems: The Case of the destroy/build-Rule]. Partial Evaluation and Semantics-Based Program Manipulation, Proceedings, pages 13-20, ACM Press, 2008.
* J. Voigtländer. [http://dx.doi.org/10.1007/9783540789697_13 Semantics and Pragmatics of New Shortcut Fusion Rules]. Functional and Logic Programming, Proceedings, LNCS 4989:163-179, Springer-Verlag, 2008.
* P. Johann and J. Voigtländer. [http://wwwtcs.inf.tudresden.de/~voigt/iandc.pdf A family of syntactic logical relations for the semantics of Haskell-like languages]. Information and Computation, 207:341-368, 2009.

[[Category:Tutorials]]
[[Category:Program transformation]]
''Latest revision as of 01:17, 8 April 2016''