Monads as computation

From HaskellWiki
Revision as of 19:51, 1 August 2007 by CaleGibbard (talk | contribs)
Jump to navigation Jump to search

Motivation

Programmers in general, and functional programmers in particular, are usually not so content to solve a problem in a fragile way by coding a solution directly. Quite often the best way to solve a problem is to design a domain-specific language in which the solution to one's problem is easily expressed. Doing this generally ensures that a wide class of similar problems can be attacked using the same code.

Better still, we'd like to embed those domain specific languages into the language which we wrote them in, so that they can be used together, and so we get benefits from the language we're working in without so much extra work. So we write combinator libraries which are essentially libraries of code whose API's are sufficiently powerful that using the library is like programming in a small language embedded within the existing one.

Such a library will have some representation of primitive computations, and some ways to glue those computations together into more complex ones. A parsing library might define primitive parsers for parsing single characters, and then combining functions for concatenating parsers or selecting between them. A drawing library might define some basic drawing operations, and then various means of combining drawings into larger ones (on top, beside, above, etc.).

In this manner, the user of the combinator library builds up the computation they want, piecing together smaller parts into larger ones.

As far as programming is concerned, a monad is just a particular style of combinator library. That is, one which supports a few basic means of combination.

The reason for making this abstraction is so that all the libraries which make use of those means of combination can then share a library of combining functions built up from the primitive ones they are required to support.

Specifically, by defining an instance of Monad for your library when appropriate, you automatically get the benefit of the functions in the Control.Monad library (as well as a few others, like Data.Traversable). This includes things like for-each loops (forM/mapM), ways to turn pure functions into combiners (liftM2, etc.), as well as other control structures which you get for free just for making your library an instance of Monad.

The parts of a monad

There are of course, other kinds of combinator library, but monads arise fairly naturally from a few basic premises.

  • Monadic computations have results. This is reflected in the types. Given a monad M, a value of type M t is a computation resulting in a value of type t. It's important to realise that this is typically just some data structure.
  • For any value, there is a computation which "does nothing", and produces that result. This is given by defining the function return for the given monad.
    return :: (Monad m) => a -> m a
    
  • Given a pair of computations x and y, one can form the computation x >> y, which intuitively "runs" the computation x, throws away its result, then runs y returning its result.
    (>>) :: (Monad m) => m a -> m b -> m b
    
  • Further, we're allowed to use the result of the first computation to decide "what to do next", rather than just throwing it away. This idea is embodied by the operation (>>=), called 'bind'. If x is a computation, and f is a function from potential results of that computation to further computations to be performed, then x >>= f is a computation which runs x, then applies f to its result, getting a computation which it then runs. The result of this latter computation is the result of the combined one.
    (>>=) :: (Monad m) => m a -> (a -> m b) -> m b
    

In fact, once we have bind, we can always define (>>) as:

x >> y = x >>= (\k -> y)

It's important to realise that both what it means to "run" a computation, and what "then" means in the above are both up to the monad in question (subject to a few simple constraints to be discussed later). This point will become clearer as one sees more and more examples.

A few examples

On top of return and (>>=), any given monad will typically define a bunch of primitive computations to get the user of the library started. The IO monad, for instance, has a large number of I/O operations such as getLine :: IO String and putStrLn :: String -> IO (). The program:
main :: IO ()
main = getLine >>= putStrLn
gets a line of text from the user, and then prints it back out. For a slightly more complicated example, the program:
main :: IO ()
main = putStrLn "Enter a line of text:"
         >> getLine >>= \x -> putStrLn (reverse x)
prompts the user for a line of text, gets the line of text from the user, and then prints it back out in reverse. A parsing monad might define char :: Char -> Parser Char, for constructing a parser which succeeds if the input string matches the given character. As a very simple example, the parser:
cat = char 'c' >> char 'a' >> char 't' >> return "It's a cat."
would try to match the string "cat", and if the parse succeeded, would return the string "It's a cat.".

Do notation

Because computations are typically going to be built up from long chains of (>>) and (>>=), in Haskell, we have some syntax-sugar, called do-notation.

The do-notation allows us to write our second IO program above as:

main = do putStrLn "Enter a line of text:"
          x <- getLine
          putStrLn (reverse x)

The basic mechanical translation for the do-notation is as follows:

do { x } = x

do { x ; <stmts> }
  = x >> do { <stmts> }

do { v <- x ; <stmts> }
  = x >>= \v -> do { <stmts> }

do { let <decls> ; <stmts> }
  = let <decls> in do { <stmts> }

This gives monadic computations a bit of an imperative feel to them, but it's important to remember that the monad in question gets to decide what the combination means, and so some unusual forms of control flow might actually occur. In some monads (like parsers, or the list monad), "backtracking" may occur, and in others, even more exotic forms of control might show up.

The monad laws

However, in order to maintain some semblance of sanity, we agree to make the monads we define follow some basic rules. I'll show the three rules both in terms of return and (>>=) and do-notation, and try to give some feel of what they really mean.

1. return v >>= f == f v

2. x >>= return == x

3. (x >>= f) >>= g == x >>= (\v -> f v >>= g)

Rules 1 and 2 basically give one the sense that return v "does nothing" and results in v.

Rule 3 puts a bit of a constraint on what "then" is supposed to mean. It is perhaps easier at first to look at what it means for (>>):

(x >> y) >> z == x >> (y >> z)

This corresponds nicely with our usual reading of (>>) as "then":

putting on your tie, then (putting on your socks then putting on your shoes)

is the same thing as

(putting on your tie then putting on your socks) then putting on your shoes.

To get a bit of a different perspective on what the laws mean, let's have a look at what they look like in do-notation:

1. do { w <- return v; f w }
== do { f v }

2. do { v <- x; return v }
== do { x }

These two are again consistent with the idea that return produces a computation that has no "side-effects", and just returns its parameter.

3. do v <- do w <- x
              f w
      g v

== do w <- x
      v <- f w
      g v

This is more interesting. It's telling us that asking for the result of a compound computation in the midst of a do-block will result in exactly the same thing as if that compound computation had been spliced in directly, and gives us a valid way to refactor code written in any monad. We're allowed to abstract a chunk of code out from the middle of a do-block and give it a name without worrying about whether we've changed the meaning of the code.

The whole point

This is all very good, but apart from defining a pretty syntax for a certain kind of combinator library, the stuff we've done so far is fairly inessential. What's the point of recognising something as a monad?

The point, as I alluded to in the introduction, is that we can then write code which works for all monads, and have a whole library of code which is made available to us just for recognising that the library we're writing happens to be a monad. Since we have only return and bind to work with, this sort of code will serve to chain computations together in some methodical way. That is to say, it will consist of control structures.