Difference between revisions of "Library/CC-delcont"

Revision as of 18:25, 17 July 2007

Introduction

This page is intended as a brief overview of delimited continuations and related constructs, and how they can be used in Haskell. It uses the library CC-delcont as a vehicle for doing so, but the examples should be general enough so that if you have another implementation, they should be relatively straight forward to port (whenever possible, I have endeavored not to use the operators on abstract prompt and sub-continuation types from CC-delcont, instead using the more typical, functional operators).

The basics

Undelimited continuations

If you've taken university courses in computer science, or done much investigation of language design, you've probably encountered continuations before. The author first recalls learning about them in a class on said subject, where they were covered very briefly, and it was mentioned (without proof; and no proof will be provided here) that they could be used as a basis upon which all control flow operators could be built. At the time, they seemed rather abstract and unwieldy. Perhaps they could be used to implement any more common control flow pattern, but why bother, when, as far as language implementation concerns go, it's easier to implement (and understand) most common control flow directly than it is to implement continuations?

As far as usage goes, continuations are probably most closely associated with Scheme, and its call-with-current-continuation function (abbreviated to Haskell's version, callCC from now on), although many other languages have them (undelimited continuations for Haskell are provided by the Cont monad and ContT transformer). They're often regarded as being difficult to understand, as their use can cause very complex control flow patterns (much like GOTO, although more sophisticated), though reduced to their basics, they aren't that hard to understand.

A continuation of an expression is, in a loose sense, 'the stuff that happens after the expression.' An example to refer to may help:

m >>= f >>= g >>= h

Here we have an ordinary monadic pipeline. A computation m is run, and its result is fed into f, and so on. We might ask what the continuation of 'm' is, the portion of the program that executes after m, and it looks something like:

\x -> f x >>= g >>= h

The continuation takes the value produced by m, and feeds it into 'the rest of the program.' But, the fact that we can represent this using functions as above should be a clue that continuations can be built up using them, and indeed, this is the case. There is a standard way to transform a program written normally (or in a monadic style, as above) into a program in which continuations, represented as functions, are passed around explicitly (known as the CPS transform), and this is what Cont/ContT does.

However, such a transform would be of little use if the passed continuations were inaccessible (as with any monad), and callCC is just the operator for the job. It will call a function with the implicitly passed continuation, so in:

callCC (\k -> e) >>= f >>= g >>= h

'k' will be set to a function that is something like the above '\x -> f x >>= g >>= h'. However, in some sense, it is not an ordinary function, as it will never return to the point where it is invoked. Instead, calling 'k' should be viewed as execution jumping to the point where callCC was invoked, with the entire 'callCC (..)' expression replaced with the value passed to 'k'. So k is not merely a normal function, but a way of feeding a value into into an execution context (and this is reflected in its monadic type: a -> Cont b).

So, what is all this good for? Well, a standard example is that one can use continuations to capture a method of escaping from loops (particularly nested ones), and if you ponder for a while, you might be able to imagine implementing some sort of exception mechanism with them. A simple example is computing the product of a list of numbers:

prod l = callCC (\k -> loop k l)
 where
 loop _ []     = return 1
 loop k (0:_)  = k 0
 loop k (x:xs) = do n <- loop k xs ; return (n*x)

Under normal circumstances, the loop will simply multiply all the numbers. However, if a 0 is detected, there is no need to multiply anything, the answer will always be 0. So, the continuation is invoked, and 0 is returned immediately, without performing any multiplications.

Delimited continuations

So, continuations (hopefully) seem pretty clear, and at least theoretically useful. Where do delimited continuations come into the picture.

The story (according to the hearsay the author has come across) goes back again to Scheme. As was mentioned earlier, callCC is often associated with it. Another thing closely associated with Scheme (and Lisp in general) is interactive environments in which code can be defined and run (much like our own Hugs and GHCi). Naturally, it would be nice if such environments could themselves be written in Scheme.

However, continuations in Scheme are not implemented as they are in Haskell. In Haskell, continuation using code is tagged with a monadic type, and one must use runCont(T) to run such computations, and the effects can't escape it. In Scheme, continuations are native, and all code can capture them, and capturing them captures not 'the rest of the Cont(T) computation,' but 'the rest of the program.' And if the interactive loop is written in Scheme, this includes the loop itself, so programs run within the session can affect the session itself.

Now, this might be a minor nit, but it is a nit nonetheless, and luckily for us, it led to the idea of delimited continuations. The idea was, of course, to tag a point at which the interactive loop invoked some sub-program, and then control flow operators such as callCC would only be able to capture a portion of the program up to the marker. To the sub-program, this is all that's of interest anyhow. Such a setup would solve the issue nicely.

However, once one has the ability to create such markers, why not put them in the hands of the programmer? Then, instead of them being able to capture 'the rest of the program's execution,' they would be able to delimit, capture and manipulate arbitrary portions of their programs. And indeed, such operations can be useful.

Samples

Iterators

So, what are delimited continuations good for? Well, suppose we have a binary tree data type like so:

data Tree a = Leaf | Branch a (Tree a) (Tree a)

empty = Leaf
singleton a = Branch a Leaf Leaf

insert b Leaf = Branch b Leaf Leaf
insert b (Branch a l r)
    | b < a = Branch a (insert b l) r
    | otherwise = Branch a l (insert b r)

fold :: (a -> b -> b -> b) -> b -> Tree a -> b
fold f z Leaf = z
fold f z (Branch a l r) = f a (fold f z l) (fold f z r)

for :: Monad m => Tree a -> (a -> m b) -> m ()
for t f = fold (\a l r -> l >> f a >> r) (return ()) t

Now, we have a fold over our data type, and as shown, we can therefore write a monadic iteration function 'for' over it (this is actually done for arbitrary data types in Data.Foldable). The fold is a fine method of traversing the data structure to operate on elements in most cases. However, what if we wanted something more like an iterator object, which somehow captured the traversal of the tree, remembering what element we're currently at, and which come next?

Well, it turns out one can build just such an object using continuations. It is indeed possible to build it using undelimited continuations, but it's rather complex to do so (I'll not include code that does, as I don't feel like figuring out all the details). However, it turns out it's easy using delimited continuations:

data Iterator m a = Done | Cur a (m (Iterator m a))

begin :: MonadDelimitedCont p s m => Tree a -> m (Iterator m a)
begin t = reset $ \p ->
            for t (\a ->
                shift p (\k -> return (Cur a (k $ return ())))) >> return Done

current :: Iterator m a -> Maybe a
current Done      = Nothing
current (Cur a _) = Just a

next :: Monad m => Iterator m a -> m (Iterator m a)
next Done      = return Done
next (Cur _ i) = i

finished :: Iterator m a -> Bool
finished Done = True
finished _    = False

So, clearly, Iterator is the type of iterators over a tree. current, next and done are some utility functions for operating on them. The interesting work is done in the begin function.

There are two delimited control operators in play here. First is reset, which is a way to place a delimiter around a computation. The term 'p' is simply a way to reference that delimiter; the library I'm working with allows for many named delimiters to exist, and for control operators to specify which delimiters they're working with (so a control operator may capture the continuation up to p, even if it runs into a delimiter q sooner, provided p /= q).

The other operator is shift, which is used to capture the delimited continuation. In many ways, it's like callCC, but with an important difference: it aborts the captured continuation. When callCC is called on a function f, if f returns normally, execution will pick up from just after the callCC. However, when shift is called, the continuation between the call and the enclosing prompt is packaged up (into 'k' here), and passed to the function, and a normal return will return to the place where the delimiter was set, not where shift was called.

With this in mind, we can begin to analyze the 'begin' function. First, it delimits a computation with the delimiter 'p'. Next, it begins to loop over the tree. For each element, we use 'shift' to capture "the rest of the loop", calling it 'k'. We then package that, and the current tree element, into an Iterator object, and return it. Since the shift has aborted the rest of the loop (for the time being), it returns to where 'reset' was called, and the function returns the iterator object (wrapped in a monad, of course).

The main remaining piece of interest is when next goes to get the next element of the traversal. When this happens, 'k $ return ()' is executed, which invokes the captured continuation (with the value (), because the loop doesn't take the return value of the traversal function into account anyway). This, essentially, re-enters the loop. If there is a next element, then the traversal function is called with it, shift will once again capture 'the rest of the loop' (from a later point that before, though), and return an iterator object with the new current element and continuation. If there are no new elements, then control will pass out of the loop to the following computation, which is, in this case, 'return Done', so in either case, an Iterator object is the result, and the types work out.

We can test our iterator like so:

main :: IO ()
main = runCCT $ do t <- randomTree 10
                   i <- begin t
                   doStuff i
 where
 doStuff i
    | finished i = return ()
    | otherwise  = do i'  <- next i
                      i'' <- next i
                      liftIO $ print (fromJust $ current i :: Int)
                      doStuff i'

randomTree n = rt empty n
 where
 rt t 0 = return t
 rt t n = do r <- liftIO randomIO
             rt (insert r t) (n - 1)

The output of which might go something like:

  -1937814587
  -1171184756
  -1068642732
  -741588272
  -553872051
  -499564662
  -421862876
  -59900888
  315891595
  1868487875

The example shows one possibly interesting property: one can re-use old iterators without affecting new ones. In this case, we call 'next' on the same iterator twice, but it doesn't advance the iterator twice. Our iterators behave like an ordinary functional data structure, even though they're built out of somewhat out-of-the-ordinary components.

CC-delcont

Installation

Packages are available on Hackage:

 http://hackage.haskell.org/cgi-bin/hackage-scripts/package/CC-delcont

The library is cabalized, so installation should be as simple as:

 runhaskell Setup.lhs configure
 runhaskell Setup.lhs build
 sudo runhaskell Setup.lhs install

(to install to the default directory, /usr/local/lib on Unix)

More information

A google search for delimited continuations will likely yield plenty of interesting resources on the subject. However, the following resources proved especially useful to the author when he was investigating them:

Delimited continuations in operating systems -- This paper provides excellent insight into how delimited continuations can arise as a natural solution/model for real problems, specifically in the context of implementing an operating system.

A Monadic Framework for Delimited Continuations -- This is the paper from which the implementation of the above library was derived. It's quite thorough in its explanation of the motivations for the interface, and also has several possible implementations thereof (though CC-delcont uses only one).

Delimited Dynamic Binding -- This paper, and related code, served as the basis for the dynamically scoped variable portion of the CC-delcont library. It explains the rationale for having dynamic scoping and delimited control interact in the way they do in the library, and goes through the implementation of dynamic variables in terms of delimited continuations.

Shift to control -- This paper explores four different sets of delimited control operators (all of which are implemented in CC-delcont), and their implementation. Though it's not directly relevant to this particular library, it provides some good insight into delimited continuations and their implementation in general.

Oleg Kiselyov's continuation page -- Contains plenty of excellent information on delimited continuations and the like (including some of the above papers), including examples of their use in Haskell.

@@ Line 7: / Line 7: @@
 This page is intended as a brief overview of delimited continuations and related constructs, and how they can be used in Haskell. It uses the library CC-delcont as a vehicle for doing so, but the examples should be general enough so that if you have another implementation, they should be relatively straight forward to port (whenever possible, I have endeavored not to use the operators on abstract prompt and sub-continuation types from CC-delcont, instead using the more typical, functional operators).
-== The Basics ==
+== The basics ==
-=== Undelimited Continuations ===
+=== Undelimited continuations ===
 If you've taken university courses in computer science, or done much investigation of language design, you've probably encountered continuations before. The author first recalls learning about them in a class on said subject, where they were covered very briefly, and it was mentioned (without proof; and no proof will be provided here) that they could be used as a basis upon which all control flow operators could be built. At the time, they seemed rather abstract and unwieldy. Perhaps they could be used to implement any more common control flow pattern, but why bother, when, as far as language implementation concerns go, it's easier to implement (and understand) most common control flow directly than it is to implement continuations?
@@ Line 49: / Line 49: @@
 Under normal circumstances, the loop will simply multiply all the numbers. However, if a 0 is detected, there is no need to multiply anything, the answer will always be 0. So, the continuation is invoked, and 0 is returned immediately, without performing any multiplications.
-=== Delimited Continuations ===
+=== Delimited continuations ===
 So, continuations (hopefully) seem pretty clear, and at least theoretically useful. Where do delimited continuations come into the picture.
@@ Line 174: / Line 174: @@
 (to install to the default directory, /usr/local/lib on Unix)
-== More Information ==
+== More information ==
 A google search for delimited continuations will likely yield plenty of interesting resources on the subject. However, the following resources proved especially useful to the author when he was investigating them: