This page is a dump of the designs, ideas, and results of the GHCi debugger SoC project. Please contribute with your suggestions and comments.
The project builds on top of Lemmih's work: a breakpoint function that, when hit while evaluating an expression in GHCi, launches a new interactive environment with the variables in the local scope of the breakpoint, allowing you to interact with them.
NEW: There is also ongoing work on the documentation for these features in the GHC user guide. Here is a snapshot of the GHCi documentation page extended with this project.
1 Intermediate Closure Viewer
The closure viewer is intended to permit working with polymorphic values in breakpoints, as well as to explore intermediate computations without altering the evaluation order.
This feature is now (more or less) complete. It currently provides two new GHCi commands, :print and :sprint, both used in the same way as :type or :info. :sprint prints a partially evaluated closure, using underscores to represent suspended computations (much as Hood does). :print additionally binds these thunks to variable names, so that you can do things with them.
Prelude> let li = map Just [1..5]
Prelude> length li
5
Prelude> :sp li
li - [_,_,_,_,_]
Prelude> head li
Just 1
Prelude> :sp li
li - [Just 1,_,_,_,_]
Prelude> last li
Just 5
Prelude> :sp li
li - [Just 1,_,_,_,Just 5]
Prelude> :p li
li - [Just 1, (_t1::Maybe Integer),(_t2::Maybe Integer),(_t3::Maybe Integer),Just 5]
Prelude> _t1 `seq` ()
Prelude> :p li
li - [Just 1, Just 2,(_t3::Maybe Integer),(_t4::Maybe Integer),Just 5]
Prelude> _t2
Just 3
Its best feature is that it can work without type information, so you can display polymorphic objects whose type you don't know. However, if type information is available, it is used. Thanks to this, it can work with opaque or coerced types. For instance:
data Opaque = forall a. O a
*Test2> let li = map Just [1..5]
*Test2> let o = O li
*Test2> head li `seq` ()
*Test2> length li `seq` ()
*Test2> :p o
o - [O Just 1,(_t1::Integer),(_t2::Integer),(_t3::Integer),(_t4::Integer)]
In the example above the li inside o has an opaque existential type. However, the closure viewer makes it possible to recover its type when it gets evaluated.
Other currently proposed extensions are a safeCoerce function (not so useful, as it depends on ghc-api) and an unsafeDeepSeq (this one is decoupled from ghc-api). There is also an isFullyEvaluated query function, which is generally useful for compiler and tool developers. Their signatures are:
isFullyEvaluated :: a -> IO Bool
unsafeDeepSeq :: a -> b -> b
safeCoerce :: GHC.Session -> a -> Maybe b
Finally, note that there are some inconveniences with the current implementation, such as :p binding the same closure to different names when used twice on the same closure, but they are minor and temporary (hopefully).
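To make the underscore notation concrete, here is a minimal sketch that models a partially evaluated value as a tree and renders it the way :sprint displays it. The Closure type and the functions sprint and fullyEvaluated are hypothetical names invented for this illustration. The real viewer inspects the heap through GHC internals, and fullyEvaluated is only a pure stand-in for the idea behind isFullyEvaluated.

```haskell
import Data.List (intercalate)

-- Hypothetical model of a heap value (not the real GHC representation).
data Closure
  = Thunk                 -- a suspended computation
  | Con String [Closure]  -- a constructor (or literal) applied to its fields
  | List [Closure]        -- a list whose spine has been evaluated

-- Render in the style of :sprint, with "_" for each thunk.
sprint :: Closure -> String
sprint Thunk         = "_"
sprint (List xs)     = "[" ++ intercalate "," (map sprint xs) ++ "]"
sprint (Con name []) = name
sprint (Con name fs) = unwords (name : map field fs)
  where
    -- Parenthesise nested constructor applications.
    field c@(Con _ (_ : _)) = "(" ++ sprint c ++ ")"
    field c                 = sprint c

-- Pure analogue of the isFullyEvaluated query: True when no thunk remains.
fullyEvaluated :: Closure -> Bool
fullyEvaluated Thunk      = False
fullyEvaluated (Con _ fs) = all fullyEvaluated fs
fullyEvaluated (List xs)  = all fullyEvaluated xs

-- The state of li after `head li` and `last li` in the session above.
exampleLi :: Closure
exampleLi = List [just "1", Thunk, Thunk, Thunk, just "5"]
  where just n = Con "Just" [Con n []]
```

Loaded into GHCi, sprint exampleLi gives [Just 1,_,_,_,Just 5] and fullyEvaluated exampleLi is False.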
There is plenty of work to be done in this area before the debugger can be shipped with ghc.
If you have tried the patches maybe you want to add your comments here. Please add feature requests here too.
3 Dynamic Breakpoints
User-level details of the current implementation are in the GHC User Guide. Here is a snapshot of the GHCi documentation page extended with this project.
3.1 Event sites and events
We define 'event sites' as points in the code where you might want to set a breakpoint. Current candidates for sites are:
- On the entrance to a function / lambda abstraction
- Prior to function applications (this one does not make sense unless it forces the application using $!)
- Local bindings in lets and wheres
- Entrance to statements in monadic-do code
Overlapping or unnecessary events should be coalesced into a single one. The rationale for what counts as an event is to find a middle ground between the user's interests and the overhead introduced:
- We want to keep the overhead manageable, thus we want to keep the number of breakpoints low.
- The user wants to introduce breakpoints at will.
Credit goes to both A. Tolmach's ML debugger and the OCaml time-travel debugger for providing the inspiration.
There are currently the following proposals:
- Instrument the code with a conditional breakpoint at every event site. Sites are numbered, and the condition uses a site-indexed array to check if there is a breakpoint enabled. The array is maintained inside ghci. Hopefully not much magic is required for this one.
- In the style of the previous one, but no array is maintained. All the breakpoint conditions are set to False, so almost no overhead is incurred. When the user demands a breakpoint, its BCO in the heap is rewritten to enable the breakpoint. Feasibility of this?
- Don't use instrumentation. Have a new header for BCOs with breakpoints, say BCO_BREAK, and change headers in execution time on user demand (as in the previous proposal). The problem I see with this one is how to extract the local bindings. I don't fully grok the scheme Lemmih uses to do that yet.
During this project we have explored the first one, under the motto of "do the simplest thing that could possibly work". I'm sure there are many other designs. Please add your proposal or just throw an idea in.
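The first proposal can be sketched as follows. This is only an illustration under stated assumptions, not the code in the patches: the names BreakTable, setBreak, site and instrumentedSum are hypothetical, the table is a mutable array from Data.Array.IO (array package), and the action run on a hit is a plain callback standing in for launching the interactive environment.

```haskell
import Control.Monad (when)
import Data.Array.IO (IOUArray, newArray, readArray, writeArray)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

type SiteId = Int

-- Site-indexed table: True means a breakpoint is enabled at that site.
type BreakTable = IOUArray SiteId Bool

-- All sites start disabled.
newBreakTable :: Int -> IO BreakTable
newBreakTable n = newArray (0, n - 1) False

-- What the user interface would call to enable or disable a site.
setBreak :: BreakTable -> SiteId -> Bool -> IO ()
setBreak = writeArray

-- The conditional breakpoint inserted at each event site.
site :: BreakTable -> SiteId -> IO () -> IO ()
site table n onHit = do
  enabled <- readArray table n
  when enabled onHit

-- A hand-instrumented list sum with two sites:
-- site 0 is the function entrance, site 1 precedes the recursive application.
-- Hits are recorded in an IORef instead of stopping the program.
instrumentedSum :: BreakTable -> IORef [SiteId] -> [Int] -> IO Int
instrumentedSum table hits = go 0
  where
    go acc xs = do
      site table 0 (modifyIORef' hits (0 :))
      case xs of
        []       -> pure acc
        (y : ys) -> do
          site table 1 (modifyIORef' hits (1 :))
          go (acc + y) ys
```

With only site 1 enabled, summing [1,2,3] returns 6 and records three hits at site 1. The cost paid at a disabled site is one array read and a branch, which is the overhead this proposal trades for simplicity.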
4 Call Traces
We want to have strict call traces, not the lazy ones.
- It has been suggested that stealing ideas from Cost-Centre Stacks may be useful. I need more pointers on this.
- Based on Tolmach's debugger, we can instrument the source code to build a timeline of events (either lazily or not). Each event contains a pointer to its lexical parent event. With that it should be possible to extract a call trace:
- CASE 1: We are in a Function definition (FN):
- Go back one step in the timeline: it necessarily is an application (APP)
- Go back to its 'binding', i.e. its lexical parent. Keep doing this until it is a FN, then start again from case 1.
- Once you reach the top, i.e. the 0 event, you are done. Display all the APPs you encountered along the way
- CASE 2: We are in a site other than a FN:
- Go back lexically until you hit a FN and continue with case 1.
This is just a wild, untested idea, and it may well not work. Even if it did, the overhead might be unacceptable. Note in particular that it won't work with laziness.
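As a rough illustration of the two cases, the walk can be written over a toy timeline. Everything here is an assumption made for the sketch: the Event record, the Kind tags, the representation of the timeline as a list indexed by event number, strict evaluation order, and the requirement that every lexical parent has a smaller index than its child (otherwise the walk would not terminate).

```haskell
-- Hypothetical event kinds: the top-level event, function entries,
-- applications, and any other site (let bindings, do statements, ...).
data Kind = Top | FN | APP | Other
  deriving (Eq, Show)

data Event = Event
  { kind   :: Kind
  , label  :: String  -- text shown in the trace
  , parent :: Int     -- index of the lexical parent event
  }

-- Call trace for the event at index i, innermost call first.
callTrace :: [Event] -> Int -> [String]
callTrace timeline = fromSite
  where
    ev i = timeline !! i

    -- Case 2 (and the lexical climb of case 1): follow lexical parents
    -- until a FN is found or the 0 event is reached.
    fromSite i
      | i <= 0            = []
      | kind (ev i) == FN = fromFn i
      | otherwise         = fromSite (parent (ev i))

    -- Case 1: the event just before a FN in the timeline is taken to be
    -- the APP that called it; record it and continue from its lexical parent.
    fromFn i
      | i <= 0 || kind (ev (i - 1)) /= APP = []
      | otherwise = label (ev (i - 1)) : fromSite (parent (ev (i - 1)))

-- Timeline for: main = f 1;  f x = g x;  g y = y + 1
exampleTimeline :: [Event]
exampleTimeline =
  [ Event Top   "main"  0  -- 0
  , Event APP   "f 1"   0  -- 1: application written in main
  , Event FN    "f"     0  -- 2: entrance to f
  , Event APP   "g x"   2  -- 3: application written inside f
  , Event FN    "g"     0  -- 4: entrance to g
  , Event Other "y + 1" 4  -- 5: a site inside g
  ]
```

Starting from event 5, callTrace exampleTimeline 5 yields ["g x","f 1"]. The sketch relies on each FN being immediately preceded by its APP, which is exactly the property that lazy evaluation breaks.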
Allowing other tools to integrate with the debugger is an important goal. It should not be taken lightly though.
- It has been suggested to create a client/server protocol so that the debugger can be used by other tools.
- On the other hand, arguably it would be much easier to provide integration to clients of the ghc-api via some form of debugger api.
- Finally, it should be possible to derive the client/server architecture as an afterthought provided there is a debugger api in the ghc-api.
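The last point can be illustrated with a small sketch in which a line-based protocol is a thin layer over an API value. Nothing here exists in ghc-api: DebuggerApi, its fields, newInMemoryApi, handleLine and the command words "break", "delete" and "list" are all hypothetical, chosen only to show the layering.

```haskell
import Data.IORef (modifyIORef', newIORef, readIORef)
import Text.Read (readMaybe)

-- Hypothetical debugger API, as a record of operations.
data DebuggerApi = DebuggerApi
  { enableSite  :: Int -> IO ()
  , disableSite :: Int -> IO ()
  , listSites   :: IO [Int]
  }

-- Toy in-memory implementation that only tracks enabled sites.
newInMemoryApi :: IO DebuggerApi
newInMemoryApi = do
  ref <- newIORef []
  pure DebuggerApi
    { enableSite  = \n -> modifyIORef' ref (\xs -> if n `elem` xs then xs else n : xs)
    , disableSite = \n -> modifyIORef' ref (filter (/= n))
    , listSites   = readIORef ref
    }

-- Protocol layer: one textual command in, one textual reply out.
-- A server would call this for each line read from a client.
handleLine :: DebuggerApi -> String -> IO String
handleLine api line = case words line of
  ["break", n]  | Just i <- readMaybe n -> enableSite api i >> pure "ok"
  ["delete", n] | Just i <- readMaybe n -> disableSite api i >> pure "ok"
  ["list"]                              -> unwords . map show <$> listSites api
  _                                     -> pure "error: unknown command"
```

Because handleLine only uses the record fields, tools linked against the API and tools talking over a socket would share the same underlying operations.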
6 Further pointers
- Rectus, Oleg Mürk and Lennart Kolmodin
- The OCaml Debugger, The OCaml Team
- A debugger for Standard ML, A. Tolmach, A. Appel
- The original discussion in the ghc-cvs mailing list
7 How to get the patches
The patches are available at the SoC ghc.debugger darcs repo:
darcs get --partial http://darcs.haskell.org/SoC/ghc.debugger
and build it following the instructions at the GHC developers wiki.
Or simply pull them into your local ghc-6.5 repo and rebuild stage2. Note that if you go this route, you will also need to pull a few patches for the libraries/base repo from http://darcs.haskell.org/SoC/ghc.debugger/libraries/base.
Have fun! (and feel free to spam me with bugs, suggestions or requests!)