Difference between revisions of "MonadFail Proposal"

From HaskellWiki
Jump to navigation Jump to search
(Initial translation from my Github-based Markdown article)
 
(replace old content with summary)
Tag: Replaced
(15 intermediate revisions by 6 users not shown)
Line 1: Line 1:
  +
'''Note:''' this proposal page has been moved to [https://gitlab.haskell.org/haskell/prime/-/wikis/libraries/proposals/monad-fail the Haskell Prime Wiki].
== <code>MonadFail</code> proposal (MFP) ==
 
   
  +
A quick summary for migration purposes, here are the versions of GHC/base with major developments in the proposal:
A couple of years ago, we proposed to make <code>Applicative</code> a superclass of <code>Monad</code> which successfully killed the single most ugly thing in Haskell as of GHC 7.10.
 
   
  +
; 8.0.1 (May 2016) / base 4.9.0.0: [https://hackage.haskell.org/package/base-4.9.0.0/docs/Control-Monad-Fail.html MonadFail in Control.Monad.Fail added]
Now, it's time to tackle the other major issue with <code>Monad</code> <code>fail</code> being a part of it.
 
  +
; 8.6.1-8.6.4 (all 8.6 have same base) (Sep 2018) / base 4.12.0.0: [https://hackage.haskell.org/package/base-4.12.0.0/docs/Control-Monad.html#v:-62--62--61- Last base to still have Monad(fail)]
 
  +
; 8.8.1 (Jul 2019) / base 4.13.0.0: [https://hackage.haskell.org/package/base-4.13.0.0/docs/Control-Monad.html#v:-62--62--61- Migration completed, Monad(fail) gone]
You can contact me as usual via IRC/Freenode as ''quchen'', or by email to ''dluposchainsky at the email service of Google''. This file was posted on the ghc-devs@ and [http://permalink.gmane.org/gmane.comp.lang.haskell.libraries/24910 libraries@ mailing lists], as well as on [http://www.reddit.com/r/haskell/comments/397k1a/monadfail_proposal_mfp_moving_fail_out_of_monad/ Reddit].
 
 
This proposal was first posted on [https://github.com/quchen/articles/blob/master/monad_fail.md quchen's articles Github repo].
 
 
=== Overview ===
 
 
* '''The problem''' - reason for the proposal
 
* '''MonadFail class''' - the solution
 
* '''Discussion''' - explaining our design choices
 
* '''Adapting old code''' - how to prepare current code to transition smoothly
 
* '''Estimating the breakage''' - how much stuff we will break
 
* '''Transitional strategy''' - how to break as little as possible while transitioning
 
* '''Current status'''
 
 
=== The problem ===
 
 
Currently, the <code><-</code> symbol is unconditionally desugared as follows:
 
 
<syntaxhighlight lang="haskell">
 
do pat <- computation >>> let f pat = more
 
more >>> f _ = fail "..."
 
>>> in computation >>= f
 
</syntaxhighlight>
 
 
The problem with this is that <code>fail</code> cannot (!) be sensibly implemented for many monads, for example <code>State</code> <code>IO</code> <code>Reader</code> In those cases it defaults to <code>error</code> As a consequence, in current Haskell, you can not use <code>Monad</code> polymorphic code safely, because although it claims to work for all <code>Monad</code> , it might just crash on you. This kind of implicit non-totality baked into the class is ''terrible''.
 
 
The goal of this proposal is adding the <code>fail</code> only when necessary and reflecting that in the type signature of the <code>do</code> block, so that it can be used safely, and more importantly, is guaranteed not to be used if the type signature does not say so.
 
 
=== <code>MonadFail</code> class ===
 
 
To fix this, introduce a new typeclass:
 
 
<syntaxhighlight lang="haskell">
 
class Monad m => MonadFail m where
 
fail :: String -> m a
 
</syntaxhighlight>
 
 
Desugaring can now be changed to produce this constraint when necessary. For this, we have to decide when a pattern match can not fail; if this is the case, we can omit inserting the <code>fail</code> call.
 
 
The most trivial examples of unfailable patterns are of course those that match anywhere unconditionally,
 
 
<syntaxhighlight lang="haskell">
 
do x <- action >>> let f x = more
 
more >>> in action >>= f
 
</syntaxhighlight>
 
 
In particular, the programmer can assert any pattern be unfailable by making it irrefutable using a prefix tilde:
 
 
<syntaxhighlight lang="haskell">
 
do ~pat <- action >>> let f ~pat = more
 
more >>> in action >>= f
 
</syntaxhighlight>
 
 
A class of patterns that are conditionally failable are <code>newtype</code> , and single constructor <code>data</code> types, which are unfailable by themselves, but may fail if matching on their fields is done with failable patterns.
 
 
<syntaxhighlight lang="haskell">
 
data Newtype a = Newtype a
 
 
-- "x" cannot fail
 
do Newtype x <- action >>> let f (Newtype x) = more
 
more >>> in action >>= f
 
 
-- "Just x" can fail
 
do Newtype (Just x) <- action >>> let f (Newtype (Just x)) = more
 
more >>> f _ = fail "..."
 
>>> in action >>= f
 
</syntaxhighlight>
 
 
<code>ViewPatterns</code> are as failable as the pattern the view is matched against. Patterns like <code>(Just -> Just x)</code> should generate a <code>MonadFail</code> constraint even when it's "obvious" from the view's implementation that the pattern will always match. From an implementor's perspective, this means that only types (and their constructors) have to be looked at, not arbitrary values (like functions), which is impossible to do statically in general.
 
 
<syntaxhighlight lang="haskell">
 
do (view -> pat) <- action >>> let f (view -> pat) = more
 
more >>> f _ = fail "..."
 
>>> in action >>= f
 
 
do (view -> ~pat) <- action >>> let f (view -> ~pat) = more
 
more >>> in action >>= f
 
</syntaxhighlight>
 
 
A similar issue arises for <code>PatternSynonyms</code> which we cannot inspect during compilation sufficiently. A pattern synonym will therefore always be considered failable.
 
 
<syntaxhighlight lang="haskell">
 
do PatternSynonym x <- action >>> let f PatternSynonym x = more
 
more >>> f _ = fail "..."
 
>>> in action >>= f
 
</syntaxhighlight>
 
 
=== Discussion ===
 
 
* Although for many <code>MonadPlus</code> <code>fail _ = mzero</code> a separate <code>MonadFail</code> class should be created instead of just using that.
 
** A parser might fail with an error message involving positional information. Some libraries, like <code>Binary</code> provide <code>fail</code> as their only interface to fail a decoding step.
 
** Although <code>STM</code> is <code>MonadPlus</code> it uses the default <code>fail = error</code> It will therefore not get a <code>MonadFail</code> instance.
 
* What laws should <code>fail</code> follow? '''Left zero''', <syntaxhighlight lang="haskell"> ∀ s f. fail s >>= f ≡ fail s</syntaxhighlight> A call to <code>fail</code> should abort the computation. In this sense, <code>fail</code> would become a close relative of <code>mzero</code> It would work well with the common definition <code>fail _ = mzero</code> and give a simple guideline to the intended usage and effect of the <code>MonadFail</code> class.
 
* Rename <code>fail</code> '''No.''' Old code might use <code>fail</code> explicitly and we might avoid breaking it, the Report talks about <code>fail</code> and we have a solid migration strategy that does not require a renaming.
 
* Remove the <code>String</code> argument? '''No.''' The <code>String</code> might help error reporting and debugging. <code>String</code> may be ugly, but it's the de facto standard for simple text in GHC. No high performance string operations are to be expected with <code>fail</code> so this breaking change would in no way be justified. Also note that explicit <code>fail</code> calls would break if we removed the argument.
 
* How sensitive would existing code be to subtle changes in the strictness behaviour of <code>do</code> notation pattern matching? '''It doesn't.''' The implementation does not affect strictness at all, only the desugaring step. Care must be taken when fixing warnings by making patterns irrefutable using <code>~</code> as that ''does'' affect strictness. (Cf. difference between lazy/strict State)
 
* The <code>Monad</code> constraint for <code>MonadFail</code> seems unnecessary. Should we drop or relax it? What other things should be considered?
 
** Applicative <code>do</code> notation is coming sooner or later, <code>fail</code> might be useful in this more general scenario. Due to the AMP, it is trivial to change the <code>MonadFail</code> superclass to <code>Applicative</code> later. (The name will be a bit misleading, but it's a very small price to pay.)
 
** The class might be misused for a strange pointed type if left without any constraint. This is not the intended use at all. I think we should keep the <code>Monad</code> superclass for three main reasons:
 
** We don't want to see <code>(Monad m, MonadFail m) =></code> all over the place.
 
** The primary intended use of <code>fail</code> is for desugaring do-notation anyway.
 
** Retroactively removing superclasses is easy, but adding them is hard (see AMP).
 
 
=== Adapting old code ===
 
 
<ul>
 
<li>Help! My code is broken because of a missing <code>MonadFail</code> instance! ''Here are your options:''
 
<ol><li>Write a <code>MonadFail</code> instance (and bring it into scope)
 
<syntaxhighlight lang="haskell">
 
#if !MIN_VERSION_base(4,11,0)
 
-- Control.Monad.Fail import will become redundant in GHC 7.16+
 
import qualified Control.Monad.Fail as Fail
 
#endif
 
import Control.Monad
 
 
instance Monad Foo where
 
(>>=) = <...bind impl...>
 
-- NB: <code>return</code> defaults to <code>pure</code>
 
#if !MIN_VERSION_base(4,11,0)
 
-- Monad(fail) will be removed in GHC 7.16+
 
fail = Fail.fail
 
#endif
 
 
instance MonadFail Foo where
 
fail = <...fail implementation...>
 
</syntaxhighlight></li>
 
<li>Change your pattern to be irrefutable</li>
 
<li>Emulate the old behaviour by desugaring the pattern match by hand:
 
<syntaxhighlight lang="haskell">
 
do Left e <- foobar
 
stuff
 
</syntaxhighlight> becomes <syntaxhighlight lang="haskell">
 
do x <- foobar
 
e <- case x of
 
Left e' -> e'
 
Right r -> error "Pattern match failed" -- Boooo
 
stuff
 
</syntaxhighlight>
 
The point is you'll have to do your dirty laundry yourself now if you have a value that ''you'' know will always match, and if you don't handle the other patterns you'll get incompleteness warnings, and the compiler won't silently eat those for you.</li>
 
</ol></li>
 
<li>Help! My code is broken because you removed <code>fail</code> from <code>Monad</code> but my class defines it! ''Delete that part of the instance definition.''</li></ul>
 
 
=== Esimating the breakage ===
 
 
Using our initial implementation, I compiled stackage-nightly, and grepped the logs for the warnings. Assuming my implementation is correct, the number of "missing <code>MonadFail</code> warnings generated is 487. Note that I filtered out <code>[]</code> <code>Maybe</code> and <code>ReadPrec</code> since those can be given a <code>MonadFail</code> instance from within GHC, and no breakage is expected from them.
 
The build logs can be found [https://www.dropbox.com/s/knz0i979skam4zs/stackage-build.tar.xz?dl=0 here]. Search for "failable pattern" to find your way to the still pretty raw warnings.
 
 
Here are some commands you might find interesting for exploring the logs:
 
 
<syntaxhighlight lang="bash">
 
# List all packages generating warnings (57 of them)
 
grep "is used in the context" '' | \
 
grep -v '(‘\[|Maybe|ReadPrec)' | \
 
perl -pe 's#^(.'')\.log.''$#\1#' | \
 
uniq -u
 
 
# Histogram of the breaking contexts (mostly IO and parsers)
 
grep "is used in the context" '' | \
 
grep -v '(‘\[|Maybe|ReadPrec)' | \
 
perl -pe 's#^.''in the context ‘([^ ]+).''$#\1#' | \
 
sort | \
 
uniq -c | \
 
sort -rg
 
</syntaxhighlight>
 
 
=== Transitional strategy ===
 
 
The roadmap is similar to the [https://github.com/quchen/articles/blob/master/applicative_monad.md AMP], the main difference being that since <code>MonadFail</code> does not exist yet, we have to introduce new functionality and then switch to it.
 
 
<ol>
 
<li>GHC 8.0 / base-4.9
 
<ul>
 
<li>Add module <code>Control.Monad.Fail</code> with new class <code>MonadFail(fail)</code> so people can start writing instances for it. <code>Control.Monad</code> only re-exports the class <code>MonadFail</code> but not its <code>fail</code> method. NB: At this point, <code>Control.Monad.Fail.fail</code> clashes with <code>Prelude.fail</code> and <code>Control.Monad.fail</code>.</li>
 
<li>Add a language extension <code>-XMonadFailDesugaring</code> that changes desugaring to use <code>MonadFail(fail)</code> instead of <code>Monad(fail)</code> This has the effect that typechecking will infer a <code>MonadFail</code> constraint for <code>do</code> blocks with failable patterns, just as it is planned to do when the entire thing is done.</li>
 
<li>Warn when a <code>do</code> block that contains a failable pattern is desugared, but there is no <code>MonadFail</code> instance in scope: "Please add the instance or change your pattern matching." Add a flag to control whether this warning appears.</li>
 
<li>Warn when an instance implements the <code>fail</code> function (or when <code>fail</code> is imported as a method of <code>Monad</code> , as it will be removed from the <code>Monad</code> class in the future. (See also [https://ghc.haskell.org/trac/ghc/ticket/10071 GHC #10071])</li>
 
</ul>
 
</li>
 
<li>GHC 8.2
 
<ul>
 
<li>Switch <code>-XMonadFailDesugaring</code> on by default.</li>
 
<li>Warnings are still issued if the desugaring extension has been explicitly disabled.</li>
 
</ul>
 
</li>
 
<li>GHC 8.4
 
<ul>
 
<li>Remove <code>-XMonadFail</code> leaving its effects on at all times.</li>
 
<li>Remove <code>fail</code> from <code>Monad</code></li>
 
<li>Instead, re-export <code>Control.Monad.Fail.fail</code> as <code>Prelude.fail</code> and <code>Control.Monad.fail</code></li>
 
<li><code>Control.Monad.Fail</code> is now a redundant module that can be considered deprecated.</li>
 
</ul>
 
</li>
 
</ol>
 
 
=== Current status ===
 
 
* [https://wiki.haskell.org/ZuriHac2015 ZuriHac 2015 (29.5. - 31.5.)]: Franz Thoma (@fmthoma) and me (David Luposchainsky aka @quchen) started implementing the MFP in GHC.
 
** Desugaring to the new <code>fail</code> can be controlled via a new language extension, <code>MonadFailDesugaring</code>
 
** If the language extension is turned off, a warning will be emitted for code that would break if it was enabled.
 
** Warnings are emitted for types that ''have'' a ''MonadFail'' instance. This still needs to be fixed.
 
** The error messages are readable, but should be more so. We're still on this.
 
* 2015-06-09: Estimated breakage by compiling Stackage. Smaller than expected.
 
* 2015-06-09 (late): Published. People seem to like the idea, with a couple of pain points remaining.
 
* 2015-06-16: [https://github.com/quchen/articles/blob/master/monad_fail_update1.md Update 1 posted.]
 
* 2015-09-18: [https://phabricator.haskell.org/D1248 Patch nearly finished. Some nontrivial tests still fail.]
 

Revision as of 16:02, 3 June 2022

Note: this proposal page has been moved to the Haskell Prime Wiki.

A quick summary for migration purposes, here are the versions of GHC/base with major developments in the proposal:

8.0.1 (May 2016) / base 4.9.0.0
MonadFail in Control.Monad.Fail added
8.6.1-8.6.4 (all 8.6 have same base) (Sep 2018) / base 4.12.0.0
Last base to still have Monad(fail)
8.8.1 (Jul 2019) / base 4.13.0.0
Migration completed, Monad(fail) gone