GHC/As a library (up to 6.8): Difference between revisions
TimChevalier (talk | contribs) (add info on new compileToCoreModule API call) |
No edit summary |
||
Line 307: | Line 307: | ||
3 | 3 | ||
</pre> | </pre> | ||
== Binary size == | |||
Using the GHC API in your | |||
applications results in large executables (e.g. > 15Mb). You can mitigate this by an enormous amount using the tools <tt>strip</tt> and <tt>gzexe</tt>; this may reduce the executable to 15-30% of its previous size. ([http://austin.youareinferior.net/?q=node/29 Example.]) | |||
== Links to other info about the GHC API == | |||
* Austin Seipp has an [http://austin.youareinferior.net/?q=node/29 extended example in his blog] |
Revision as of 10:44, 31 October 2007
Using GHC as a library
In GHC 6.5 and subsequently you can import GHC as a Haskell library, which lets you write a Haskell program that has access to all of GHC.
This page is a place for everyone to add
- Notes about how to get it working
- Comments about the API
- Suggestions for improvement
and so on.
More documentation is available on the GHC wiki: http://cvs.haskell.org/trac/ghc/wiki/Commentary/Compiler/API
Getting started
You'll need a version of GHC (at least 6.5) that supports the GHC API. The GHC download page offers stable releases and development versions; you can also use CVS (instructions) or darcs (e.g., darcs get --partial http://darcs.haskell.org/ghc).
To use the GHC API you say
import GHC
Doing this imports the module GHC from the package ghc. This module exports the "GHC API", which is still in a state of flux. Currently it's not even Haddock-documented. You can see the source code (somewhat documented). There are also other modules of interest as you do more special things.
Here's an example main program that does it Media:Main.hs (good for GHC 6.6). You need to manually change the value of myGhcRoot to point to your GHC directory.
To compile Media:Main.hs, you have to turn on the flag "-package ghc", e.g.
ghc -package ghc Main.hs
Common use cases and functions
Assumes GHC 6.6.
Default exception handling
If you don't handle exceptions yourself, you are recommended to wrap all code inside the wrapper:
defaultErrorHandler :: DynFlags -> IO a -> IO a
DynFlags.defaultDynFlags :: DynFlags
This catches exceptions and prints out exception details and exits your program with exit code 1.
Example:
import GHC
import DynFlags(defaultDynFlags)
main = defaultErrorHandler defaultDynFlags $ do
{-
stuff in the following subsections
-}
You do not have to use defaultDynFlags
, but it's the easiest starting point.
Initialization
First create a session:
newSession :: GhcMode -- BatchCompile | Interactive | MkDepend | ...
-> Maybe FilePath -- GHC installation directory
-> IO Session -- your session; you will need it
The path to your GHC installation directory (e.g., /usr/local/lib/ghc-6.6) is in practice mandatory, even though in theory marked as optional.
The session is configurable by dynamic flags (GHC dynamic flags plus session state; think -O2, -fvia-C, -fglasgow-exts, -package). This can be done with:
getSessionDynFlags :: Session -> IO DynFlags
setSessionDynFlags :: Session
-> DynFlags
-> IO [PackageId] -- important iff dynamic-linking
parseDynamicFlags :: DynFlags -- old flags
-> [String] -- e.g., all or part of getArgs
-> IO (DynFlags, [String]) -- new flags, unknown args
The DynFlags
record has a gazillion fields; ask ghci to show all of them. You can change them by hand, or use the parser (which implements the GHC command line format and does the Right Thing). But there is one you must note:
data DynFlags = DynFlags { ...,
hscTarget :: HscTarget } -- HscC | HscAsm | HscInterpreted | ...
This corresponds to -fvia-C, -fasm, or interpreting. When the session needs to re-compile a module, this field controls how. The default is HscAsm
, even in the interactive mode, meaning the interactive mode may produce .hi and .o files too. If you want to follow GHCi in not doing that, you must set this field to HscInterpreted
yourself. (On the other hand, it is fun to contemplate an interactive session that generates machine code upon your command.)
setSessionDynFlags
also sets up your session's awareness of the package database (without which you can't even use the Prelude), so even if you like the defaults, you should still call it. (Older code called PackageConfig.initPackages
for this.)
Examples:
- vanilla compiler, use all defaults (rare but good start)
session <- newSession BatchCompile (Just "/usr/local/lib/ghc-6.6")
getSessionDynFlags session >>= setSessionDynFlags session
- compiler with custom flags, easy with parser
session <- newSession BatchCompile (Just "/usr/local/lib/ghc-6.6")
f0 <- getSessionDynFlags session
(f1,b) <- parseDynamicFlags f0 ["-fglasgow-exts", "-O", "-package", "ghc", "-package Cabal",
"foo", "-v", "bar"]
-- b = ["foo", "bar"]; the other args are recognized
-- in GHC 6.6 "-O" implies "-fvia-C", that kind of thing is automatic here too
setSessionDynFlags session f1
- interactive session with interpreter
session <- newSession Interactive (Just "/usr/local/lib/ghc-6.6")
f0 <- getSessionDynFlags session
setSessionDynFlags session f0{hscTarget = HscInterpreted}
Load or compile modules
To compile code or load modules, first set one or more targets, then call the load
function.
guessTarget :: String -- "filename.hs" or "filename.lhs" or "MyModule"
-> Maybe Phase -- if not Nothing, specifies starting phase
-> IO Target
addTarget :: Session -> Target -> IO ()
setTargets :: Session -> [Target] -> IO ()
getTargets :: Session -> IO [Target]
removeTarget :: Session -> TargetId -> IO ()
load :: Session -> LoadHowMuch -> IO SuccessFlag
data LoadHowMuch
= LoadAllTargets
| LoadUpTo ModuleName
| LoadDependenciesOf ModuleName
Loading or compiling produces temp directories and files, which can only be correctly cleaned up or kept (depending on temp file flags in DynFlags
) with the wrapper:
defaultCleanupHandler :: DynFlags -> IO a -> IO a
Two factors constrain how much code should be wrapped. At the minimal, function calls such as load
and depanal
that potentially unlit or compile or link should be wrapped. At the maximal, the flags passed to defaultCleanupHandler
should be identical to those set to the session, so practically the wrapping should be after setting up seesion flags.
Example:
t <- guessTarget "Main.hs" Nothing
addTarget session t -- setTargets session [t] is also good
f <- getSessionDynFlags session
sf <- defaultCleanupHandler f (load session LoadAllTargets)
case sf of Succeeded -> ...
Failed -> ...
Dependencies (both modules and packages) are processed automatically, and an executable is produced if appropriate, precisely like --make.
Modules are compiled as per the hscTarget
flag (-fasm, -fvia-C, or interpreter) in DynFlags
, independent of GHC mode.
Compiling to Core
To compile a file to Core (the intermediate language used in GHC's simplifier), call compileToCore:
compileToCore :: Session -> FilePath -> IO (Maybe [CoreBind])
compileToCore takes a session and a filename and, if successful, returns a list of Core bindings corresponding to the given module. It is not necessary to set a target first. For documentation of the Core data types, see the CoreSyn module in GHC.
If you need access to the type declarations contained within a module as well, use compileToCoreModule instead:
compileToCoreModule :: Session -> FilePath -> IO (Maybe CoreModule)
compileToCoreModule is similar to compileToCore, except that it returns a CoreModule, which has the following definition:
data CoreModule
= CoreModule {
-- Module name
cm_module :: !Module,
-- Type environment for types declared in this module
cm_types :: !TypeEnv,
-- Declarations
cm_binds :: [CoreBind]
}
The type TypeEnv is defined in HscTypes.lhs.
Interactive evaluation
Interactive evaluation ala GHCi is done by runStmt
. But first, this is always done under a current context, i.e., which modules are in scope. Most probably you want to have at least the Prelude and those you loaded in the previous section. How to manipulate the context:
setContext :: Session
-> [Module] -- their top levels will be visible
-> [Module] -- their exports will be visible
-> IO ()
getContext :: Session -> IO ([Module], [Module])
findModule :: Session -> ModuleName -> Maybe PackageId -> IO Module
mkModule :: PackageId -> ModuleName -> Module
mkModuleName :: String -> ModuleName
PackageConfig.stringToPackageId :: String -> PackageId
Every module given to setContext
must be either in a package known to the session or has been loaded as per the previous subsection. Example:
-- equivalent to GHCi's :m Prelude Control.Monad *Main
prelude <- findModule session (mkModuleName "Prelude") Nothing
monad <- findModule session (mkModuleName "Control.Monad") Nothing
usermod <- findModule session (mkModuleName "Main") Nothing -- we have loaded this
setContext session [usermod] [prelude,monad]
You can also be specific about packages. You can also use mkModule
instead of findModule
, or even some module query functions in the next subsection.
Having set a useful context, we're now ready to evaluate.
runStmt :: Session -> String -> IO RunResult
data RunResult
= RunOk [Name] -- names bound by the expression
| RunFailed
| RunException GHC.IOBase.Exception -- that's Control.Exception.Exception
Example:
runStmt session "let n = 2 + 2" -- n is bound
runStmt session "n" -- 4 is printed (note "it" is bound)
(Interactive evaluation works in BatchCompile mode too! There are still other subtle differences, so this is not recommended.)
Type checking
What if I want the type info from a module?
Once the modules are loaded in the session, they are already type-checked.
The type information of a loaded module are stored in a data-structure called
ModuleInfo
. To access the type information, we need to apply function getModuleInfo
to the target module.
ModuleInfo
is defined as follows,
data ModuleInfo = ModuleInfo {
minf_type_env :: TypeEnv,
minf_exports :: NameSet, -- ToDo, [AvailInfo] like ModDetails?
minf_rdr_env :: Maybe GlobalRdrEnv, -- Nothing for a compiled/package mod
minf_instances :: [Instance]
#ifdef GHCI
,minf_modBreaks :: ModBreaks
#endif
-- ToDo: this should really contain the ModIface too
}
The field minf_type_env
is holding the type environment, of type TypeEnv
, which is defined as,
type TypeEnv = [TyThing]
where TyThing can be an identifier, a class, a type constructor or a data constructor.
data TyThing = AnId Id
| ADataCon DataCon
| ATyCon TyCon
| AClass Class
Recalling the running example in the previous subsection, note that the variable usermod
captures the user-defined module "Main".
We retrieve the module information of "Main" module and unfold the type environment out of it.
mb_userModInfo <- getModuleInfo session usermod
case mb_userModInfo of
Just userModInfo ->
let userTyThings = modInfoTyThings userModInfo -- access the type environments
userTys = [ (i, idType i) | (AnId i) <- userTyThings ] -- we are only interested in the declared ids and their (inferred) types.
in ... -- do something with userTys
Nothing -> return ()
Queries
-- Get module dependency graph
getModuleGraph :: Session -> IO ModuleGraph -- ModuleGraph = [ModSummary]
-- Get bindings
getBindings :: Session -> IO [TyThing]
Messages
Compiler messages (including progress, warnings, errors) are controlled by verbosity and routed through a callback mechanism. These are fields in DynFlags
:
data DynFlags = DynFlags { ...,
verbosity :: Int,
log_action :: Severity -> SrcLoc.SrcSpan -> Outputable.PprStyle -> ErrUtils.Message -> IO () }
The verbosity
field corresponds to the command line -v flag; parseDynamicFlags
is applicable. A low verbosity means log_action
is seldom called.
You can set the callback to your logger, like
f <- getSessionDynFlags session
setSessionDynFlags session f{log_action = my_log_action}
This sets the session's logger, but it will not see exceptions.
If you call defaultErrorHandler
at the outermost, clearly its message logger has to be set separately:
main = defaultErrorHandler defaultDynFlags{log_action = my_log_action} $ do ...
This logger will see messages produced by defaultErrorHandler
upon exceptions.
Interactive mode example
The file Media:Interactive.hs (also requires Media:MyPrelude.hs) serves as an example for using GHC as a library in interactive mode. It also shows how to replace some of the standard prelude functions with modified versions. See the comments in the code for further information.
Using the GHC library from inside GHCi
This works, to some extent. However, beware about loading object code, because there is only a single linker symbol table in the runtime, so GHCi will be sharing the symbol table with the new GHC session.
$ ghci -package ghc Prelude> :m + GHC PackageConfig Prelude GHC> session <- newSession Interactive (Just "/usr/local/lib/ghc-6.6") Prelude GHC> setSessionDynFlags session =<< getSessionDynFlags session Prelude GHC> setContext session [] [mkModule (stringToPackageId "base") (mkModuleName "Prelude")] Prelude GHC> runStmt session "let add1 x = x + 1" Prelude GHC> runStmt session "add1 2" 3
Binary size
Using the GHC API in your applications results in large executables (e.g. > 15Mb). You can mitigate this by an enormous amount using the tools strip and gzexe; this may reduce the executable to 15-30% of its previous size. (Example.)
Links to other info about the GHC API
- Austin Seipp has an extended example in his blog