GHC/GHCi debugger
This page is a dump of the designs, ideas, and results of the GHCi debugger SoC project. Please contribute with your suggestions and comments.
The project builds on top of Lemmih's work: a breakpoint function that, when hit while evaluating something in GHCi, launches a new interactive environment with the variables in the local scope of the breakpoint, allowing you to interact with them.
Intermediate Closure Viewer
The closure viewer is intended to permit working with polymorphic values in breakpoints, as well as to explore intermediate computations without altering the evaluation order.
This feature is now (more or less) complete. Currently it provides two new GHCi commands, :print and :sprint, both used in the same way as :type or :info. :sprint prints a partially evaluated closure, using underscores to represent suspended computations (much as Hood does). :print additionally binds these thunks to variable names, so that you can work with them.
Example:
Prelude> let li = map Just [1..5]
Prelude> length li
5
Prelude> :sp li
li - _:_:_:_:_:[]
Prelude> head li
Just 1
Prelude> :sp li
li - Just 1:_:_:_:_:[]
Prelude> last li
Just 5
Prelude> :sp li
li - Just 1:_:_:_:Just 5:[]
Prelude> :p li
li - Just 1 : (_t987::Maybe Integer) : (_t988::Maybe Integer) : (_t989::Maybe Integer) : [Just 5]
Prelude> _t987 `seq` ()
Prelude> :p li
li - Just 1 : Just 2 : (_t457::Maybe Integer) : (_t458::Maybe Integer) : [Just 5]
Prelude> _t988
Just 3
Its best feature is that it can work without type information, so you can display polymorphic objects whose type you don't know. If type information is available, however, it is used. The viewer could be made totally independent of type information, so that it could work with opaque or coerced (wrong) types. For instance:
{-# LANGUAGE ExistentialQuantification #-}
data Opaque = forall a. O a
*Test2> let li = map Just [1..5]
*Test2> let o = O li
*Test2> head li `seq` ()
*Test2> length li `seq` ()
*Test2> :p o
o - O Just 1 : (_t126::a) : (_t125::a) : (_t124::a) : (_t123::a) : []
In the example above the li inside o is not typed, so the bindings aren't either. However, it would be possible to extend the closure viewer so that it recovers its types.
Other currently proposed extensions are a safeCoerce function (not so useful, since it depends on ghc-api) and an unsafeDeepSeq (this one is decoupled from ghc-api). There is also an isFullyEvaluated query function, which is generally useful for compiler and tool developers. Their signatures are:
isFullyEvaluated :: a -> IO Bool
unsafeDeepSeq :: a -> b -> b
safeCoerce :: GHC.Session -> a -> Maybe b
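To make the underscore notation and the idea behind isFullyEvaluated more concrete, here is a toy model in plain Haskell. It is only a sketch: the Closure type and the functions sprintModel, isFullyEvaluatedModel and listOf are hypothetical names invented for illustration, and they operate on an explicit tree rather than on real heap closures as the actual viewer does.

```haskell
-- A toy picture of a partially evaluated value.
-- Assumption: this is NOT GHC's heap representation, just a model of it.
data Closure
  = Thunk                 -- a suspended computation, shown as "_"
  | Con String [Closure]  -- an evaluated constructor and its fields
  deriving (Eq, Show)

-- Build a list-shaped closure out of element closures.
listOf :: [Closure] -> Closure
listOf = foldr (\x xs -> Con ":" [x, xs]) (Con "[]" [])

-- Model of the isFullyEvaluated query: true when no thunk remains.
isFullyEvaluatedModel :: Closure -> Bool
isFullyEvaluatedModel Thunk        = False
isFullyEvaluatedModel (Con _ args) = all isFullyEvaluatedModel args

-- Model of :sprint: render the tree, printing thunks as underscores.
sprintModel :: Closure -> String
sprintModel Thunk             = "_"
sprintModel (Con ":" [x, xs]) = sprintModel x ++ ":" ++ sprintModel xs
sprintModel (Con name [])     = name
sprintModel (Con name args)   = unwords (name : map atom args)
  where
    -- Parenthesise arguments that are themselves applications.
    atom c@(Con _ (_ : _)) = "(" ++ sprintModel c ++ ")"
    atom c                 = sprintModel c
```

In this model, the state of li after evaluating head li is listOf [Con "Just" [Con "1" []], Thunk, Thunk, Thunk, Thunk], which renders in the same shape as the :sp output in the example.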
Finally, note that there are some inconveniences with the current implementation, such as :p binding the same closure to different names when used twice on the same closure, but they are minor and temporary (hopefully).
Usability
There is plenty of work to be done in this area before the debugger can be shipped with ghc.
If you have tried the patches maybe you want to add your comments here. Please add feature requests here too.
Dynamic Breakpoints
Event sites and events
We define 'event sites' as points in the code where you might want to set a breakpoint. Current candidates for sites are:
- On the entrance to a function / lambda abstraction
- Prior to function applications (this one does not make sense unless it forces the application using $!)
- On the entrance to case alternatives
- Local bindings in lets and wheres
- Entrance to statements in monadic-do code
- In the two branches of an if expression
Overlapping or unnecessary events can be coalesced into a single one, for instance when a function application occurs immediately at the entrance to a function.
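As an illustration of the candidate sites listed above, the following small function marks in comments where each kind of event site would fall. The function classify is a made-up example, and the placement of the comments is an informal reading of the list, not the output of any instrumentation pass.

```haskell
-- Hypothetical example function; the comments mark candidate event sites.
classify :: Int -> [Int] -> IO String
classify limit xs = do                   -- site: entrance to the function
  let total  = sum xs                    -- site: local binding (let)
      scaled = double $! total           -- site: application, forced with $!
  putStrLn ("scaled = " ++ show scaled)  -- site: statement in do code
  case xs of
    [] -> return "empty"                 -- site: case alternative
    _  -> if scaled > limit
            then return "big"            -- site: 'then' branch of an if
            else return "small"          -- site: 'else' branch of an if
  where
    double n = 2 * n                     -- site: local binding (where), function entrance
```

Here the entrance to classify and its first do statement would be natural candidates for coalescing into one event.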
Credit goes to both A. Tolmach's ML debugger and the OCaml time-travel debugger for providing the inspiration.
Proposals
There are currently the following proposals:
- Instrument the code with a conditional breakpoint at every event site. Sites are numbered, and the condition uses a site-indexed array to check if there is a breakpoint enabled. The array is maintained inside ghci. Hopefully not much magic is required for this one.
- In the style of the previous one, but no array is maintained. All the breakpoint conditions are set to False, so almost no overhead is incurred. When the user demands a breakpoint, its BCO in the heap is rewritten to enable the breakpoint. Feasibility of this?
- Don't use instrumentation. Have a new header for BCOs with breakpoints, say BCO_BREAK, and change headers at run time on user demand (as in the previous proposal). The problem I see with this one is how to extract the local bindings. I don't fully grok the scheme Lemmih uses to do that yet.
We are about to explore the first one, following the motto ``do the simplest thing that could possibly work``. I'm sure there are many other designs. Please add your proposal or just throw in an idea.
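A minimal sketch of the first proposal, assuming sites are numbered from 0 and the table is a mutable array of flags. The names BreakTable, newBreakTable, setBreak, isEnabled and atSite are hypothetical and chosen for illustration only; how GHCi would actually store the table or enter the interactive environment is not specified here.

```haskell
import Control.Monad (when)
import Data.Array.IO (IOUArray, newArray, readArray, writeArray)

type SiteId = Int

-- Site-indexed table of flags: True means a breakpoint is enabled there.
newtype BreakTable = BreakTable (IOUArray SiteId Bool)

-- Create a table for n sites, all breakpoints disabled.
newBreakTable :: Int -> IO BreakTable
newBreakTable n = BreakTable <$> newArray (0, n - 1) False

-- Enable or disable the breakpoint at a site (what a user command would do).
setBreak :: BreakTable -> SiteId -> Bool -> IO ()
setBreak (BreakTable arr) site on = writeArray arr site on

isEnabled :: BreakTable -> SiteId -> IO Bool
isEnabled (BreakTable arr) = readArray arr

-- The conditional breakpoint that instrumented code would run at each site:
-- the action (e.g. entering the debugger) runs only if the flag is set.
atSite :: BreakTable -> SiteId -> IO () -> IO ()
atSite table site onHit = do
  enabled <- isEnabled table site
  when enabled onHit
```

The cost at a disabled site is one array read and a branch, which is the overhead the second proposal tries to avoid.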
Call Traces
We want to have strict call traces, not the lazy ones.
Proposals
- It has been suggested that stealing ideas from Cost-Centre Stacks may be useful. I need more pointers on this.
- Based on Tolmach's debugger, we can instrument the source code to build a timeline of events (either lazily or not). Each event contains a pointer to its lexical parent event. With that it should be possible to extract a call trace:
  - CASE 1: We are in a function definition (FN):
    - Go back one step in the timeline: it is necessarily an application (APP).
    - Go back to its 'binding', i.e. its lexical parent. Keep doing this until it is a FN, then start again from case 1.
    - Once you reach the top, i.e. the 0 event, you are done. Display all the APPs you encountered along the way.
  - CASE 2: We are in a site other than a FN:
    - Go back lexically until you hit a FN and continue with case 1.
This is just a wild, untested idea. It may not work at all, and even if it does, the overhead may turn out to be unacceptable.
WON'T WORK WITH LAZINESS
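The two cases above can be sketched as a walk over a recorded timeline. This is only an illustration under several assumptions: the timeline is a plain list, event 0 is the top-level root, every event's lexical parent has a smaller index than the event itself, and the event just before a FN is its APP (which is exactly what laziness can break). The names Kind, Event, Timeline and callTrace are hypothetical.

```haskell
-- Kinds of timeline events; only FN and APP matter for the trace.
data Kind = FN | APP | Other
  deriving (Eq, Show)

data Event = Event
  { evKind   :: Kind
  , evLabel  :: String  -- description shown in the trace
  , evParent :: Int     -- timeline index of the lexical parent (assumed smaller)
  } deriving Show

type Timeline = [Event]

-- Collect the APPs met while walking back from event i to event 0.
callTrace :: Timeline -> Int -> [String]
callTrace timeline = go
  where
    event i = timeline !! i

    go i
      | i <= 0 = []                       -- reached the top: done
      | evKind (event i) == FN =          -- case 1: previous step is the APP
          let app = i - 1
          in if app <= 0
               then []
               else evLabel (event app) : go (enclosingFn app)
      | otherwise = go (enclosingFn i)    -- case 2: find the enclosing FN first

    -- Follow lexical parents until a FN (or the root) is found.
    enclosingFn i
      | i <= 0 = 0
      | p <= 0 || evKind (event p) == FN = p
      | otherwise = enclosingFn p
      where
        p = evParent (event i)
```

The result lists the innermost application first, so reversing it gives the trace in call order.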
Integration
Allowing other tools to integrate with the debugger is an important goal. It should not be taken lightly though.
- It has been suggested to create a client/server protocol so that the debugger can be used by other tools.
- On the other hand, arguably it would be much easier to provide integration to clients of the ghc-api via some form of debugger api.
- Finally, it should be possible to derive the client/server architecture as an afterthought provided there is a debugger api in the ghc-api.
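To illustrate the last point, here is a sketch of how a client/server layer could be derived from a debugger API. Everything in it is hypothetical: DebuggerApi, Request, Response, mockApi, serve and serveLine are invented names, the mock implementation just keeps a list of site numbers, and none of this reflects an existing ghc-api interface.

```haskell
import Data.IORef (modifyIORef', newIORef, readIORef)
import Data.List (delete, nub, sort)

-- Hypothetical debugger API, as a record of operations.
data DebuggerApi = DebuggerApi
  { setBreakpoint   :: Int -> IO ()
  , clearBreakpoint :: Int -> IO ()
  , listBreakpoints :: IO [Int]
  }

-- Hypothetical wire messages, one per API operation.
data Request  = SetBreak Int | ClearBreak Int | ListBreaks
  deriving (Show, Read)
data Response = Ok | Breaks [Int]
  deriving (Eq, Show, Read)

-- Stand-in implementation that only tracks enabled site numbers.
mockApi :: IO DebuggerApi
mockApi = do
  ref <- newIORef []
  pure DebuggerApi
    { setBreakpoint   = \s -> modifyIORef' ref (nub . (s :))
    , clearBreakpoint = \s -> modifyIORef' ref (delete s)
    , listBreakpoints = sort <$> readIORef ref
    }

-- The server side is a thin dispatcher from requests to API calls.
serve :: DebuggerApi -> Request -> IO Response
serve api (SetBreak s)   = Ok <$ setBreakpoint api s
serve api (ClearBreak s) = Ok <$ clearBreakpoint api s
serve api ListBreaks     = Breaks <$> listBreakpoints api

-- Text-line protocol: parse a request, run it, print the response.
serveLine :: DebuggerApi -> String -> IO String
serveLine api line = case reads line of
  [(req, rest)] | all (== ' ') rest -> show <$> serve api req
  _                                 -> pure "Error: unparsable request"
```

In this shape the protocol adds no debugger logic of its own, which is the sense in which it could be an afterthought once the API exists.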
Further pointers
- Rectus, Oleg Murk and Lennart Kolmodin
- The OCaml Debugger, The OCaml Team
- A debugger for Standard ML, A. Tolmach, A. Appel
- The original discussion in the ghc-cvs mailing list