# Correctness of short cut fusion

### From HaskellWiki


## Revision as of 15:02, 6 July 2006


## 1 Short cut fusion

*Short cut fusion* allows elimination of intermediate data structures using rewrite rules that can also be performed automatically during compilation.

### 1.1 foldr/build

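The `foldr`/`build`-rule eliminates intermediate lists produced by `build` and consumed by `foldr`, where these functions are defined as follows:

    foldr :: (a -> b -> b) -> b -> [a] -> b
    foldr c n []     = n
    foldr c n (x:xs) = c x (foldr c n xs)

    build :: (forall b. (a -> b -> b) -> b -> b) -> [a]
    build g = g (:) []

Note the *rank-2 polymorphic* type of `build`. The rule states that the following replacement is valid for appropriately typed `g`, `c`, and `n`:

    foldr c n (build g) → g c n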
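As a rough illustration of what the rule buys, the following sketch applies it by hand to one instance. The producer `countdown` and the names `unfused` and `fused` are hypothetical and chosen only for this example; `foldr` and `build` are the definitions from the text.

```haskell
{-# LANGUAGE RankNTypes #-}

import Prelude hiding (foldr)

-- Definitions as given in the text.
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr c n []     = n
foldr c n (x:xs) = c x (foldr c n xs)

build :: (forall b. (a -> b -> b) -> b -> b) -> [a]
build g = g (:) []

-- Hypothetical producer (not from the text): the list k, k-1, ..., 1,
-- abstracted over its "cons" (c) and "nil" (n).
countdown :: Int -> (Int -> b -> b) -> b -> b
countdown k c n = go k
  where
    go i | i <= 0    = n
         | otherwise = c i (go (i - 1))

-- Left-hand side of the rule: builds the list, then consumes it.
unfused :: Int -> Int
unfused k = foldr (+) 0 (build (countdown k))

-- Right-hand side of the rule: no intermediate list is constructed.
fused :: Int -> Int
fused k = countdown k (+) 0

main :: IO ()
main = print (unfused 10, fused 10)
```

Both components of the printed pair are 55. Since `countdown` does not use `seq`, this instance falls under the case in which the rule is a semantic equivalence.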

### 1.2 destroy/unfoldr

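The `destroy`/`unfoldr`-rule eliminates intermediate lists produced by `unfoldr` and consumed by `destroy`, where these functions are defined as follows:

    destroy :: (forall b. (b -> Maybe (a,b)) -> b -> c) -> [a] -> c
    destroy g = g listpsi

    listpsi :: [a] -> Maybe (a,[a])
    listpsi []     = Nothing
    listpsi (x:xs) = Just (x,xs)

    unfoldr :: (b -> Maybe (a,b)) -> b -> [a]
    unfoldr p e = case p e of
                    Nothing     -> []
                    Just (x,e') -> x:unfoldr p e'

Note the *rank-2 polymorphic* type of `destroy`. The rule states that the following replacement is valid for appropriately typed `g`, `p`, and `e`:

    destroy g (unfoldr p e) → g p e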
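A similar hand-applied sketch for this rule follows. The consumer `sumVia` and the step function `upTo` are hypothetical names introduced only for illustration; `destroy`, `listpsi`, and `unfoldr` are the definitions from the text.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Definitions as given in the text.
destroy :: (forall b. (b -> Maybe (a,b)) -> b -> c) -> [a] -> c
destroy g = g listpsi

listpsi :: [a] -> Maybe (a,[a])
listpsi []     = Nothing
listpsi (x:xs) = Just (x,xs)

unfoldr :: (b -> Maybe (a,b)) -> b -> [a]
unfoldr p e = case p e of
  Nothing      -> []
  Just (x, e') -> x : unfoldr p e'

-- Hypothetical consumer (not from the text): sums the elements
-- produced by repeatedly applying a step function to a seed.
sumVia :: (s -> Maybe (Int, s)) -> s -> Int
sumVia step = go
  where
    go s = case step s of
      Nothing      -> 0
      Just (x, s') -> x + go s'

-- Hypothetical step function: yields i, i+1, ..., k and then stops.
upTo :: Int -> Int -> Maybe (Int, Int)
upTo k i | i > k     = Nothing
         | otherwise = Just (i, i + 1)

-- Left-hand side of the rule: unfoldr materialises a list,
-- which destroy then takes apart again via listpsi.
unfused :: Int -> Int
unfused k = destroy sumVia (unfoldr (upTo k) 1)

-- Right-hand side of the rule: the consumer runs on the seed directly.
fused :: Int -> Int
fused k = sumVia (upTo k) 1

main :: IO ()
main = print (unfused 10, fused 10)
```

Both components of the printed pair are 55.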

## 2 Correctness

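If the `foldr`/`build`- and the `destroy`/`unfoldr`-rule are to be applied automatically during compilation, as is possible using GHC's *RULES pragmas*, we clearly want them to be equivalences.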

That is, the left- and right-hand sides should be semantically the same for each instance of either rule. Unfortunately, this is not so in Haskell.

We can distinguish two situations, depending on whether `g` is defined using `seq` or not.

### 2.1 In the absence of seq

#### 2.1.1 foldr/build

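If `g` does not use `seq`, then the `foldr`/`build`-rule really is a semantic equivalence, that is, it holds that

    foldr c n (build g) = g c n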

The two sides are semantically interchangeable.

#### 2.1.2 destroy/unfoldr

The `destroy`/`unfoldr`-rule, however, is not a semantic equivalence. To see this, consider the following instance:

    g = \x y -> case x y of Just z -> 0
    p = \x -> if x==0 then Just undefined else Nothing
    e = 0

With these definitions, the rule's left-hand side "evaluates" as follows:

    destroy g (unfoldr p e)
      = g listpsi (unfoldr p e)
      = case listpsi (unfoldr p e) of Just z -> 0
      = case listpsi (case p e of Nothing     -> []
                                  Just (x,e') -> x:unfoldr p e') of Just z -> 0
      = case listpsi (case Just undefined of Nothing     -> []
                                             Just (x,e') -> x:unfoldr p e') of Just z -> 0
      = undefined

while its right-hand side "evaluates" as follows:

    g p e
      = case p e of Just z -> 0
      = case Just undefined of Just z -> 0
      = 0
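The two "evaluations" can also be observed by running the instance. The sketch below assumes GHC and uses `try` and `evaluate` from `Control.Exception` to turn a failing evaluation into a printable result; the helper `observe`, the names `lhs` and `rhs`, and the type annotations on `g`, `p`, and `e` are additions for this example.

```haskell
{-# LANGUAGE RankNTypes #-}

import Control.Exception (SomeException, evaluate, try)

-- Definitions as given in the text.
destroy :: (forall b. (b -> Maybe (a,b)) -> b -> c) -> [a] -> c
destroy g = g listpsi

listpsi :: [a] -> Maybe (a,[a])
listpsi []     = Nothing
listpsi (x:xs) = Just (x,xs)

unfoldr :: (b -> Maybe (a,b)) -> b -> [a]
unfoldr p e = case p e of
  Nothing      -> []
  Just (x, e') -> x : unfoldr p e'

-- The instance from the text; the type signatures are assumptions
-- added so that the example is self-contained.
g :: (b -> Maybe (a, b)) -> b -> Int
g = \x y -> case x y of Just z -> 0

p :: Int -> Maybe (Int, Int)
p = \x -> if x == 0 then Just undefined else Nothing

e :: Int
e = 0

lhs, rhs :: Int
lhs = destroy g (unfoldr p e)  -- left-hand side of the rule
rhs = g p e                    -- right-hand side of the rule

-- Hypothetical helper: forces a value to weak head normal form and
-- reports either the value or the fact that evaluation failed.
observe :: Int -> IO String
observe v = do
  r <- try (evaluate v) :: IO (Either SomeException Int)
  return (either (const "undefined") show r)

main :: IO ()
main = do
  observe lhs >>= putStrLn . ("lhs: " ++)
  observe rhs >>= putStrLn . ("rhs: " ++)
```

Run without any rewrite rules, this reports `undefined` for the left-hand side and `0` for the right-hand side. The failure on the left comes from `unfoldr` matching `Just undefined` against the pattern `Just (x,e')`, which forces the undefined pair.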

The obvious questions now are:

- Can the converse also happen, that is, can a safely terminating program be transformed into a failing one?
- Can a safely terminating program be transformed into another safely terminating one that gives a different value as result?

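There is no formal proof yet, but there is strong evidence supporting the conjecture that the answer to both questions is "**No!**". That is, if `g` does not use `seq`, the left-hand side is conjectured to be at most as defined as the right-hand side:

    destroy g (unfoldr p e) ⊑ g p e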

What *is* known is that semantic equivalence can be recovered here by putting moderate restrictions on `p`. Under such restrictions, the rule holds as an equation:

    destroy g (unfoldr p e) = g p e

### 2.2 In the presence of seq

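This is the more interesting setting, given that in Haskell there is no way to restrict the use of `seq`, so in any given program we must be prepared for the possibility that the `g` appearing in the `foldr`/`build`- or the `destroy`/`unfoldr`-rule is defined using `seq`. Unsurprisingly, it is also the setting in which more can go wrong than above.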

#### 2.2.1 foldr/build

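In the presence of `seq`, the `foldr`/`build`-rule is no longer a semantic equivalence. The instance

    g = seq
    c = undefined
    n = 0

shows, via similar "evaluations" as above, that the right-hand side (`g c n`) can be strictly less defined than the left-hand side (`foldr c n (build g)`).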
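This instance can be run as well. The sketch below assumes GHC and uses `try` and `evaluate` from `Control.Exception` to observe a failing evaluation; the helper `observe`, the names `lhs` and `rhs`, and the type annotations on `g`, `c`, and `n` are additions for this example.

```haskell
{-# LANGUAGE RankNTypes #-}

import Prelude hiding (foldr)
import Control.Exception (SomeException, evaluate, try)

-- Definitions as given in the text.
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr c n []     = n
foldr c n (x:xs) = c x (foldr c n xs)

build :: (forall b. (a -> b -> b) -> b -> b) -> [a]
build g = g (:) []

-- The instance from the text; the type signatures are assumptions
-- added so that the example is self-contained.
g :: (Int -> b -> b) -> b -> b
g = seq

c :: Int -> Int -> Int
c = undefined

n :: Int
n = 0

lhs, rhs :: Int
lhs = foldr c n (build g)  -- build g = seq (:) [] = [], so foldr yields n
rhs = g c n                -- seq undefined 0 forces the undefined c

-- Hypothetical helper: forces a value to weak head normal form and
-- reports either the value or the fact that evaluation failed.
observe :: Int -> IO String
observe v = do
  r <- try (evaluate v) :: IO (Either SomeException Int)
  return (either (const "undefined") show r)

main :: IO ()
main = do
  observe lhs >>= putStrLn . ("lhs: " ++)
  observe rhs >>= putStrLn . ("rhs: " ++)
```

Run without any rewrite rules, this reports `0` for the left-hand side and `undefined` for the right-hand side, so applying the rule to this instance would turn a terminating program into a failing one.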

The converse cannot happen, because the following always holds:

    foldr c n (build g) ⊒ g c n

Moreover, semantic equivalence can again be recovered by putting restrictions on the involved functions.

More precisely, if `c ⊥ ⊥ ≠ ⊥` and `n ≠ ⊥`, then even in the presence of `seq`:

    foldr c n (build g) = g c n