Do notation considered harmful
Haskell's do notation is popular and ubiquitous. However, we should not ignore that it has several problems. Here we would like to shed some light on aspects you may not have thought about so far.
The do notation hides functional details.
This is intended, in order to simplify writing imperative-style code fragments.
The downsides are
- that, since do notation is used almost everywhere IO takes place, newcomers quickly come to believe that do notation is necessary for doing IO,
- that newcomers think that IO is somehow special and non-functional, in contrast to the advertisement of Haskell as purely functional,
- and that newcomers think that the order of statements determines the order of execution.
These misunderstandings lead people to write clumsy code like

    do putStrLn "text"

instead of

    putStrLn "text"

or

    do text <- getLine
       return text

instead of

    getLine

or

    do text <- readFile "foo"
       writeFile "bar" text

instead of

    readFile "foo" >>= writeFile "bar"
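To make the equivalence concrete, here is a small sketch that writes the same computation once with do and once with the underlying (>>=) combinator. The names pairDo, pairBind, copyDo and copyBind are made up for this illustration.

```haskell
-- A do block in the Maybe monad ...
pairDo :: Maybe (Int, Int)
pairDo = do
  x <- Just 1
  y <- Just 2
  return (x, y)

-- ... and the same computation written with (>>=) and lambdas.
pairBind :: Maybe (Int, Int)
pairBind =
  Just 1 >>= \x ->
  Just 2 >>= \y ->
  return (x, y)

-- The file copy from the text, with do notation ...
copyDo :: FilePath -> FilePath -> IO ()
copyDo from to = do
  text <- readFile from
  writeFile to text

-- ... and as a single application of (>>=).
copyBind :: FilePath -> FilePath -> IO ()
copyBind from to = readFile from >>= writeFile to
```

Both Maybe versions denote the same value, and nothing about the IO versions requires do.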
The order of statements is also not the criterion for the evaluation order. Here, too, only the data dependencies count. See for instance

    do x <- Just (3+5)
       y <- Just (5*7)
       return (x-y)

where 3+5 and 5*7 can be evaluated in any order, even in parallel. Or consider

    do x <- Just (3+5)
       y <- Nothing
       return (x-y)

where 3+5 is probably not evaluated at all, because its result is not needed to find out that the entire do block describes a Nothing.
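One way to observe this is to bind a variable to a value that would crash if it were ever evaluated. The following is a minimal sketch in the Maybe monad; shortCircuit and unusedBinding are names chosen for this example only.

```haskell
-- x is bound to an expression that would raise an error if forced.
-- The Nothing in the second statement decides the result of the
-- whole block, so x is never looked at.
shortCircuit :: Maybe Int
shortCircuit = do
  x <- Just (error "x was evaluated")
  y <- Nothing
  return (x - y)

-- Here every statement succeeds, but the result depends only on y,
-- so x again stays unevaluated.
unusedBinding :: Maybe Int
unusedBinding = do
  x <- Just (error "x was evaluated")
  y <- Just (5 * 7)
  return (const y (x :: Int))
```

Neither definition raises the error when its result is inspected, although the statement binding x comes first in both blocks.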
The do notation is so popular that people write more things with monads than necessary.
See for instance the Binary package.
It contains the Put monad, which in principle has nothing to do with a monad.
All "put" operations have the monadic result ().
In fact it is a Writer monad using the Builder type, and all you need is just the Builder monoid.
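As a rough sketch of the monoid-only approach, assume a toy Builder that is just a difference list of bytes. This is not the Binary package's actual API; Builder, byte, char8 and toBytes are made-up names for illustration.

```haskell
import Data.Char (ord)
import Data.Word (Word8)

-- Toy stand-in for a Builder type: a difference list of bytes.
newtype Builder = Builder ([Word8] -> [Word8])

instance Semigroup Builder where
  Builder f <> Builder g = Builder (f . g)

instance Monoid Builder where
  mempty = Builder id

-- Emit a single byte.
byte :: Word8 -> Builder
byte w = Builder (w :)

-- Emit a character, keeping only its low 8 bits.
char8 :: Char -> Builder
char8 = byte . fromIntegral . ord

-- Run the builder.
toBytes :: Builder -> [Word8]
toBytes (Builder f) = f []

-- Output is sequenced with (<>) and mconcat; no monad is involved.
header :: Builder
header = char8 'H' <> byte 1 <> mconcat (map byte [2, 3])
```

Since every "put" operation would return () anyway, the monoid operations carry all the information there is.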
Even more unfortunately, applicative functors were introduced to Haskell's standard libraries only after monads and arrows, thus many types are instances of the Monad and Arrow classes, but not as many are instances of Applicative.
There is no special syntax for applicative functors because it is hardly necessary.
You just write

    data Header = Header Char Int Bool

    readHeader :: Get Header
    readHeader = liftA3 Header get get get

or

    readHeader = Header <$> get <*> get <*> get
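To try the applicative style without the Binary package, here is a self-contained sketch with a tiny hand-written parser that has Functor and Applicative instances but no Monad instance. Parser, anyChar, digit and bool are hypothetical names and merely stand in for Get and get.

```haskell
import Control.Applicative (liftA3)
import Data.Char (digitToInt, isDigit)

-- A minimal parser: consume a prefix of the input or fail.
newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s ->
    fmap (\(a, rest) -> (f a, rest)) (p s)

instance Applicative Parser where
  pure a = Parser $ \s -> Just (a, s)
  Parser pf <*> Parser pa = Parser $ \s ->
    case pf s of
      Nothing        -> Nothing
      Just (f, rest) -> fmap (\(a, rest') -> (f a, rest')) (pa rest)

-- Any single character.
anyChar :: Parser Char
anyChar = Parser $ \s -> case s of
  (c:cs) -> Just (c, cs)
  []     -> Nothing

-- A single decimal digit.
digit :: Parser Int
digit = Parser $ \s -> case s of
  (c:cs) | isDigit c -> Just (digitToInt c, cs)
  _                  -> Nothing

-- 'T' or 'F'.
bool :: Parser Bool
bool = Parser $ \s -> case s of
  ('T':cs) -> Just (True, cs)
  ('F':cs) -> Just (False, cs)
  _        -> Nothing

data Header = Header Char Int Bool deriving (Eq, Show)

-- Both spellings from the text.
readHeaderA3, readHeaderOps :: Parser Header
readHeaderA3  = liftA3 Header anyChar digit bool
readHeaderOps = Header <$> anyChar <*> digit <*> bool
```

The three fields are independent of each other, which is exactly the situation where Applicative suffices.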
Not using monads, and thus do notation, can have advantages.
Consider a generator of unique identifiers.
First you might think of a State monad which increments a counter each time an identifier is requested.

    -- State is assumed to come from Control.Monad.State (mtl package)
    import Control.Monad.State (State, evalState, get, modify)

    run :: State Int a -> a
    run m = evalState m 0

    newId :: State Int Int
    newId =
       do n <- get
          modify succ
          return n

    example :: (Int -> Int -> a) -> a
    example f =
       run $
          do x <- newId
             y <- newId
             return (f x y)
If you are confident that you will not need the counter state at the end, and that you will not combine blocks of code using the counter (where the second block needs the state at the end of the first block), you can enforce a stricter scheme of usage. The following is like a Reader monad, where we call local on an incremented counter for each generated identifier.
Alternatively you can view it as a continuation monad.

    newtype T a = T (Int -> a)

    run :: T a -> a
    run (T f) = f 0

    newId :: (Int -> T a) -> T a
    newId f = T $ \i -> case f i of T g -> g (succ i)

    example :: (Int -> Int -> T a) -> a
    example f = run $ newId $ \a -> newId $ \b -> f a b
This way users cannot accidentally place a return somewhere in a do block where it has no effect.
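The contrast can be sketched as follows. The definitions of T, run and newId are repeated from the text so that the block is self-contained; done, twoIds and strayReturn are hypothetical names added for this example.

```haskell
-- Continuation-style generator, as defined in the text.
newtype T a = T (Int -> a)

run :: T a -> a
run (T f) = f 0

newId :: (Int -> T a) -> T a
newId f = T $ \i -> case f i of T g -> g (succ i)

-- Hypothetical helper: finish a computation with a plain value.
done :: a -> T a
done x = T (const x)

-- Two fresh identifiers. There is no statement position in which
-- a stray return could be placed.
twoIds :: (Int, Int)
twoIds = run $ newId $ \a -> newId $ \b -> done (a, b)

-- In a do block, by contrast, a return in the middle type-checks
-- and is silently passed over.
strayReturn :: Maybe Int
strayReturn = do
  return (0 :: Int)
  x <- Just 2
  return (x * 10)
```

The first return in strayReturn neither ends the block nor contributes to its result.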
With do notation we have kept alive a dark side of the C programming language: the silent neglect of return values of functions.
In an imperative language it is common to return an error code and do the real work by side effects.
In Haskell this cannot happen, because functions have no side effects.
If you ignore the result of a Haskell function, the function will not even be evaluated.
The situation is different for IO:
while the IO action is executed, you can still ignore the value it returns.
You can write

    do getLine
       putStrLn "text"

and thus silently ignore the result of getLine.
The same applies to

    do System.Cmd.system "echo foo >bar"

where you ignore the ExitCode.
Is this behaviour wanted?
In safety-oriented languages there are ways to explicitly ignore return values (e.g. EVAL in Modula-3).
Haskell does not need this, because you can already write

    do _ <- System.Cmd.system "echo foo >bar"
       return ()

Writing _ <- should always make you consider whether ignoring the result is the right thing to do.
The possibility of silently ignoring monadic return values is not entirely the fault of the do notation.
It would suffice to restrict the type of the (>>) combinator to

    (>>) :: m () -> m a -> m a

This way, you can omit _ <- only if the monadic return value has type ().
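The restricted combinator can be tried out under a different name. In this sketch (>>.) is a hypothetical operator with the proposed type; void from Data.Functor plays the role of the explicit discard.

```haskell
import Data.Functor (void)

-- Hypothetical restricted variant of (>>): the left action must
-- return (), so no result can be dropped by accident.
infixl 1 >>.
(>>.) :: Monad m => m () -> m a -> m a
(>>.) = (>>)

-- Discarding a result has to be spelled out with void.
accepted :: Maybe Int
accepted = void (Just 'x') >>. Just 1

-- Without void the expression is rejected by the type checker,
-- because Maybe Char does not match m ():
--
-- rejected :: Maybe Int
-- rejected = Just 'x' >>. Just 1
```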
Happy with less sugar
Using the infix combinators for writing functions simplifies the addition of new combinators.
Consider for instance a monad for random distributions.
This monad cannot be an instance of MonadPlus,
because there is no mzero (it would be an empty list of events, but then the probabilities do not sum up to 1)
and mplus is not associative, because we have to normalize the sum of probabilities to 1.
Thus we cannot use the standard guard for this monad.
However we would like to write the following:
    do f <- family
       guard (existsBoy f)
       return f
Given a custom combinator which performs a filtering with subsequent normalization called
(>>=?) :: Distribution a -> (a -> Bool) -> Distribution a
we can rewrite this easily:
family >>=? existsBoy
Note that the
(>>=?) combinator introduces the risk of returning an invalid distribution (empty list of events),
but it seems that we have to live with that problem.
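A possible implementation of such a combinator is sketched below. The representation of Distribution as a list of weighted events, the Child type, and the definitions of uniform, family and existsBoy are assumptions made for this example; the text only fixes the type of (>>=?).

```haskell
import Data.Ratio ((%))

-- Assumed representation: events paired with their probabilities.
newtype Distribution a = Distribution { events :: [(a, Rational)] }
  deriving (Eq, Show)

data Child = Boy | Girl deriving (Eq, Show)

-- Every listed event gets the same probability.
uniform :: [a] -> Distribution a
uniform xs = Distribution [ (x, 1 % fromIntegral (length xs)) | x <- xs ]

-- Keep the events satisfying the predicate, then renormalize so that
-- the remaining probabilities sum up to 1. If no event survives, the
-- result is the invalid empty distribution mentioned in the text.
infixl 1 >>=?
(>>=?) :: Distribution a -> (a -> Bool) -> Distribution a
Distribution es >>=? p = Distribution [ (x, w / total) | (x, w) <- kept ]
  where
    kept  = filter (p . fst) es
    total = sum (map snd kept)

-- All families with two children, equally likely.
family :: Distribution [Child]
family = uniform [ [a, b] | a <- [Boy, Girl], b <- [Boy, Girl] ]

existsBoy :: [Child] -> Bool
existsBoy = elem Boy
```

With these definitions, family >>=? existsBoy keeps three of the four families, each with probability 1/3.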
If you are used to writing monadic functions using the infix combinators (>>) and (>>=),
you can easily switch to a different set of combinators.
This is useful when there is a monadic structure that does not fit into the current Monad type constructor class, where the monadic result type cannot be constrained.
An example is the Set data type, where the element type must have a total order.
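For Set such a replacement set of combinators might look as follows. This is only a sketch: returnS and (>>=*) are made-up names, and Data.Set is assumed to come from the containers package.

```haskell
import qualified Data.Set as Set
import Data.Set (Set)

-- Counterpart of return.
returnS :: a -> Set a
returnS = Set.singleton

-- Counterpart of (>>=). The Ord constraint on the result type is
-- what the Monad class does not allow.
infixl 1 >>=*
(>>=*) :: Ord b => Set a -> (a -> Set b) -> Set b
s >>=* f = Set.unions (map f (Set.toList s))

-- Written with infix combinators, the code keeps its monadic shape.
sums :: Set Int
sums =
  Set.fromList [1, 2, 3] >>=* \x ->
  Set.fromList [10, 20]  >>=* \y ->
  returnS (x + y)
```

Code that already uses (>>=) and lambdas only needs the operator names changed, whereas a do block would have to be rewritten first.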
It should be mentioned that the do notation sometimes takes from you the burden of writing boring things.
E.g. in

    getRight :: Either a b -> Maybe b
    getRight y =
       do Right x <- return y
          return x

a case analysis on y is included, which calls fail if y is not a Right (i.e. it is a Left), and thus returns Nothing in this case.
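Roughly, the do block with a refutable pattern stands for an explicit case analysis like the one sketched here. The names getRightDo and getRightCase are chosen for this example, and the message passed to fail is a placeholder; the Maybe instance ignores it.

```haskell
-- With do notation the failing branch is generated for you.
getRightDo :: Either a b -> Maybe b
getRightDo y = do
  Right x <- return y
  return x

-- Written out by hand: match on the value and call fail in the
-- non-matching case. For Maybe, fail yields Nothing.
getRightCase :: Either a b -> Maybe b
getRightCase y =
  return y >>= \v -> case v of
    Right x -> return x
    _       -> fail "pattern match failure"
```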
Also the mdo notation proves useful, since it maintains a set of variables for you in a safe manner.
Compare

    mdo x <- f x y z
        y <- g x y z
        z <- h x y z
        return (x+y+z)

and

    -- fmap snd drops the tuple that only carries the recursive variables
    fmap snd $
    mfix
       (\ ~( ~(x,y,z), _) ->
          do x <- f x y z
             y <- g x y z
             z <- h x y z
             return ((x,y,z),x+y+z))
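A small runnable instance of the same comparison, assuming GHC with the RecursiveDo extension and using the Maybe monad; onesMdo and onesMfix are names invented for this sketch.

```haskell
{-# LANGUAGE RecursiveDo #-}
import Control.Monad.Fix (mfix)

-- xs is used on the right-hand side of its own binding.
onesMdo :: Maybe [Int]
onesMdo = mdo
  xs <- Just (1 : xs)
  return (take 3 xs)

-- The same recursion with an explicit mfix.
onesMfix :: Maybe [Int]
onesMfix = fmap (take 3) (mfix (\xs -> Just (1 : xs)))
```

With a single recursive variable the mfix version is still readable; the bookkeeping grows with every additional variable, as the tuple in the example above shows.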
- Paul Hudak in Haskell-Cafe: A regressive view of support for imperative programming in Haskell
- Data.Syntaxfree on Wordpress: Do-notation considered harmful
- Things to avoid, section "do notation"