Jump to content
Main menu
Main menu
move to sidebar
hide
Navigation
Haskell
Wiki community
Recent changes
Random page
HaskellWiki
Search
Search
Create account
Log in
Personal tools
Create account
Log in
Pages for logged out editors
learn more
Contributions
Talk
Editing
Correctness of short cut fusion
(section)
Page
Discussion
English
Read
Edit
View history
Tools
Tools
move to sidebar
hide
Actions
Read
Edit
View history
General
What links here
Related changes
Special pages
Page information
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
==Discussion== Correctness of short cut fusion is not just an academic issue. All recent versions of [[GHC]] (at least 6.0 - 6.6) automatically perform transformations like <hask>foldr</hask>/<hask>build</hask> during their optimization pass (also in the disguise of more specialized rules such as <hask>head</hask>/<hask>build</hask>). The rules are specified in the GHC.Base and GHC.List modules. There has been at least one occasion where, as a result, a safely terminating program was turned into a failing one "in the wild", with a less artificial example than the ones given above. ===<hask>foldr</hask>/<hask>build</hask>=== As pointed out above, everything is fine with <hask>foldr</hask>/<hask>build</hask> in the absence of <hask>seq</hask>. If the producer (<hask>build g</hask>) of the intermediate list may be defined using <hask>seq</hask>, then the conditions <hask>(c β₯ β₯) β β₯</hask> and <hask>n β β₯</hask> better be satisified, lest the compiler transform a perfectly fine program into a failing one. The mentioned conditions are equivalent to requiring that the consumer (<hask>foldr c n</hask>) is a total function, that is, maps non-β₯ lists to a non-β₯ value. It is thus relatively easy to identify whether a list consumer defined in terms of <hask>foldr</hask> is eligible for <hask>foldr</hask>/<hask>build</hask>-fusion in the presence of <hask>seq</hask> or not. For example, the Prelude functions <hask>head</hask> and <hask>sum</hask> are generally not, while <hask>map</hask> is. There is, however, currently no way to detect automatically, inside the compiler, whether a particular instance of <hask>foldr</hask>/<hask>build</hask>-fusion is safe, i.e., whether the producer avoids <hask>seq</hask> or the consumer is total. ===<hask>destroy</hask>/<hask>unfoldr</hask>=== As above, the compiler cannot figure out automatically whether (and how) a given instance of <hask>destroy</hask>/<hask>unfoldr</hask>-fusion will change the semantics of a program. An easy way to get rid of the condition regarding <hask>p</hask> never returning <hask>Just β₯</hask> is to slightly change the definitions of the functions involved: <haskell> data Step a b = Done | Yield a b destroy' :: (forall b. (b -> Step a b) -> b -> c) -> [a] -> c destroy' g = g step' step' :: [a] -> Step a [a] step' [] = Done step' (x:xs) = Yield x xs unfoldr' :: (b -> Step a b) -> b -> [a] unfoldr' p e = case p e of Done -> [] Yield x e' -> x:unfoldr' p e' </haskell> The new type <hask>Step a b</hask> is almost isomorphic to <hask>Maybe (a,b)</hask>, but avoids the "junk value" <hask>Just β₯</hask>. This change does not affect the expressiveness of <hask>unfoldr</hask> or <hask>unfoldr'</hask> with respect to list producers. But it allows some of the laws above to be simplified a bit. We would still have that if <hask>g</hask> does not use <hask>seq</hask>, then: <haskell> destroy g' (unfoldr' p e) β g p e </haskell> Moreover, if <hask>g</hask> does not use <hask>seq</hask> and <hask>p</hask> is strict, then even: <haskell> destroy' g (unfoldr' p e) = g p e </haskell> In the potential presence of <hask>seq</hask>, if <hask>p β β₯</hask> and <hask>p</hask> is strict, then: <haskell> destroy' g (unfoldr' p e) β g p e </haskell> Also without restriction regarding <hask>seq</hask>, if <hask>p</hask> is strict and total, then: <haskell> destroy' g (unfoldr' p e) β g p e </haskell> The worst change in program behavior from a complier user's point of view is when, through application of "optimization" rules, a safely terminating program is transformed into a failing one or one delivering a different result. This can happen in the presence of <hask>seq</hask>, for example with a producer of the form <haskell> repeat x = unfoldr (\y -> Just (x,y)) undefined </haskell> or <haskell> repeat x = unfoldr' (\y -> Yield x y) undefined </haskell> Fortunately, it cannot happen for any producer of a nonempty, spine-total list (i.e., one that contains at least one element and ends with <hask>[]</hask>). The reason is that for any such producer expressed via <hask>unfoldr</hask> or <hask>unfoldr'</hask> the conditions imposed on <hask>p</hask> in the left-to-right approximation laws above are necessarily fulfilled. A left-to-right approximation as in <haskell> destroy g (unfoldr p e) β g p e </haskell> under suitable preconditions might be acceptable in practice. After all, it only means that the transformed program may be "more terminating" than the original one, but not less so. If one insists on semantic equivalence rather than approximation, then the conditions imposed on the producer of the intermediate list become quite severe, in particular in the potential presence of <hask>seq</hask>. For example, the following producer has to be outlawed then: <haskell> enumFromTo n m = unfoldr (\i -> if i>m then Nothing else Just (i,i+1)) n </haskell>
Summary:
Please note that all contributions to HaskellWiki are considered to be released under simple permissive license (see
HaskellWiki:Copyrights
for details). If you don't want your writing to be edited mercilessly and redistributed at will, then don't submit it here.
You are also promising us that you wrote this yourself, or copied it from a public domain or similar free resource.
DO NOT SUBMIT COPYRIGHTED WORK WITHOUT PERMISSION!
Cancel
Editing help
(opens in new window)
Toggle limited content width