GHC/Typed holes

From HaskellWiki
< GHC
Revision as of 20:19, 31 August 2014 by Henk-Jan van Tuyl (talk | contribs) (→‎Introduction: Made text up-to-date)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigation Jump to search

Typed holes in GHC.

Typed holes are a powerful feature in GHC inspired by Agda. But what are typed holes, and how do they help us write code?

(This page was taken from a 'tutorial' I wrote of this feature on Reddit.)

Introduction

Sometimes, when writing programs, we tend to use lots of polymorphic types. That's good! We can reuse those pieces, thanks to the types. But it can be difficult sometimes when writing a program to see how the polymorphic types should exactly fit together, given the things in scope.

Would it not be nice if we could have the compiler tell us the types of everything in scope? It'd be much easier to see how we can 'fit' them together like puzzle pieces.

This is the purpose of a hole: it has similar semantics to undefined, in that evaluating it is an error (you can replace all holes with undefined, and vice versa, and nothing has changed: you may even say undefined is just a really crappy, useless hole!) But it's special in that when GHC encounters a hole during compilation it will tell you what type needs to be there in place of the hole, for the type-checker to be OK with the definition. It will also show you the types of all the bindings in scope, to assist you in figuring out the problem.

This feature was implemented by Simon Peyton Jones, Sean Leather and Thijs Alkemade, and is introduced with GHC 7.8.1. It is enabled by default.

Motivating example

Let us consider the definition of a Free monad:

data Free f a
  = Pure a
  | Free (f (Free f a))

A free monad coupled with an underlying Functor f. We would like to write the Monad instance for Free f.

We begin by writing the trivial cases:

instance Functor f => Monad (Free f) where
  return a     = Pure a
  Pure a >>= f = f a
  Free f >>= g = ... -- I'm confused!

... but we're a little stumped on how to write the final case. Let's have the compiler help us do this.

Writing 'instance Monad (Free f)'

Let's go ahead and shove all of this in a file. For the final case of the Monad (Free f) instance, we will place a hole in the definition, which is designated by the same syntax for wildcard patterns, _

module FreeMonad where

data Free f a
  = Pure a
  | Free (f (Free f a))

instance Functor f => Monad (Free f) where
  return a     = Pure a
  Pure a >>= f = f a
  Free f >>= g = Free _

Now let's load that into GHCi. We should see something that looks like this:

$ ghci FreeMonad.hs
[1 of 1] Compiling Main             ( holes.hs, interpreted )

FreeMonad.hs:11:23: Warning:
    Found hole `_' with type f (Free f b)
    Where: `f' is a rigid type variable bound by
               the instance declaration at holes.hs:26:10
           `b' is a rigid type variable bound by
               the type signature for
                 >>= :: Free f a -> (a -> Free f b) -> Free f b
               at FreeMonad.hs:10:10
    Relevant bindings include
      >>= :: Free f a -> (a -> Free f b) -> Free f b
        (bound at FreeMonad.hs:11:3)
      f :: f (Free f a) (bound at FreeMonad.hs:11:8)
      g :: a -> Free f b (bound at FreeMonad.hs:11:14)
    In the first argument of `Free', namely `_'
    In the expression: Free (_)
    In an equation for `>>=': (Free f) >>= g = Free (_)
Ok, modules loaded: Main.
*Main> 

Let's look at the error, and what the hole should look like (first line of the error):

Found hole `_' with type f (Free f b)

OK, so I need some value of type f (Free f b) to put there. I don't know how to get that. But let's look at what's in scope:

Relevant bindings include
  >>= :: Free f a -> (a -> Free f b) -> Free f b
    (bound at holes.hs:28:3)
  f :: f (Free f a) (bound at holes.hs:29:8)
  g :: a -> Free f b (bound at holes.hs:29:14)
In the first argument of `Free', namely `_'

OK, so I have 3 values directly surrounding me at my disposal: f, g, and (>>=). I have a f :: f (Free f a) in scope, and that's almost what I need. I just need to turn the inner Free f a into a Free f b. And that's what fmap does for functors! And I don't know what the functor f is anyway, so clearly I need to fmap the value before anything else. Let's do that and use a hole to tell us where to go next:

instance Monad ... where
  ...
  Free f >>= g = Free (fmap _ f)

Now we get from the compiler:

Found hole `_' with type Free f a -> Free f b

OK, so now we just need some function with type Free f a -> Free f b. Let's look at what's in scope. We can quickly see that (>>=) has the type we want if we partially apply it. And g has just the type we need for that partial application! Equationally:

(>>=) :: Free f a -> (a -> Free f b) -> Free f b
g     ::              a -> Free f b

Ergo:

(\x -> x >>= g) :: Free f a -> Free f b

So the resulting instance is:

instance Functor f => Monad (Free f) where
  return a     = Pure a
  Pure a >>= f = f a
  Free f >>= g = Free (fmap (>>= g) f)

Notes on parametricity, free theorems and Agda

These examples hint at what people mean when they say they use the type system to drive development: it can help write your code, too!

This is of course very powerful for general, polymorphic types like above, because a polymorphic function may only be written so many ways. In fact, the above definition is the only legitimate Monad instance for Free!

A polymorphic type variable says what you cannot do to it: touch! You can't break the parametric nature of it by scrutinizing the variable in any way. A useful implementation of a polymorphic function has a small 'design space', naturally. It is not so, when defining monomorphic functions. A function like f :: Int -> Double could have many possible definitions, but we can always be sure for example, that f :: a -> a only has one valid definition, or that if f :: [a] -> [a], then 'trivially' we can see that f [] = [] must hold as a law.

This notion of "what information may be calculated based on the type" like we're doing here can be formalized to what we call 'Free Theorems', pioneered by Philip Wadler in his paper 'Theorems for Free!'

Free theorems fall out of the parametricity theorem (also called Reynolds' abstraction theorem,) which essentially states how polymorphic types relate to each other. Free theorems just state that, given a polymorphic type, it's possible to derive useful theorems and laws, based only on that type alone (so if f :: [a] -> [a] as we said earlier, then we get the free theorem f [] = [] by definition - try writing that case another way! You can't!)

At this point, you're getting into much deeper territory, but I say all this because you have probably intuitively thought some of these things, when writing polymorphic functions yourself - that you must write them in such a way that the terms are all properly abstracted over correctly. This kind of design 'forces' only a few definitions to be possible.

Holes for GHC were inspired by the equivalent feature in its (twice removed) sister language, Agda. Holes are considerably more powerful in Agda, mostly due to the fact it is total, the compiler infers less, and thus Agda programs are rich with evidence, giving witness to many facts the compiler can use. As a result, it can derive more code for you. In particular, Agda can literally write your code based on holes. It's able to do things like write case-analysis' for you, based only on type structure, and what's in scope, and discharge all the 'trivial' cases. You do this in emacs, in one or two keystrokes. It's essentially a semi-automated feedback loop for development, and it makes sense because the tool is really working with a meaningful model of your code, only in sensible ways that type-check. It's great.

There is not necessarily a reason why this could not exist for e.g. the Emacs mode for Haskell. Just because Haskell is not total, doesn't mean we couldn't calculate useful code based on the type - it is just much harder, especially considering the multitudes of features and extensions GHC supports.

  • When I say 'legitimate' or 'useful', I mean, 'modulo divergent definitions.' Because Haskell is not total, it is possible to write divergent terms like let x = x in x, and the type of this expression is forall a. a, meaning I can prove anything with it. In the domain of Haskell semantics, this expression is in fact equivalent to undefined as well, because all bottoms are equal. I can prove that Blue is Green this way, or that Freedom is Slavery - I can inhabit any type, using these nonsensical terms. Logistically, you could say Haskell terms are complete, but not coherent/consistent. But divergent definitions here are not very legitimate, interesting, or useful, obviously, so ignore them (cue, "Fast and Loose reasoning is morally correct.")