These are the technical specifications for Hoogle 4, and are intended as a working blueprint. Please sign all notes, unless you are me. --Neil Mitchell 10:36, 28 June 2006 (UTC)
Hoogle Text File
The Hoogle Text File format will remain largely unchanged from Hoogle 3.
An example is:
module MyModule
keyword where
keyword ->
Just :: a -> Maybe a
data Maybe a
fromJust :: Maybe a -> a
(++) :: [a] -> [a] -> [a]
instance Monad Maybe
class Monad m
return :: Monad a => b -> a b
other "haddock" "Lambdabot says http://haskell.org/haddock"
Hoogle Data File
The Hoogle Data File format will be binary, so that it can support more features and run a lot faster. It needs better indexing, type collapsing etc. The main purpose is to improve speed, improve accuracy and allow multiple data files to be searched.
Most things in the file format will be int offsets into the file, limiting the file to 2 GB, which is perfectly reasonable I think :)
Every item will have a record containing a string, which is the text to output, with hints for things like argument positions. All items will be scored before their values are fetched, to make the process slightly more efficient.
The main operation is searching by name, so names need the following structure:
The file will start with a name table, in which each name is split by letter:
@0: a -> 1, b -> 3, c -> ....
@1: p -> 2 ...
@2: ... [S <ap>, E <map>, M
ish, I have a better structure at home. Must add it...
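The letter-by-letter table sketched above behaves much like a trie. As a rough illustration only, here is an in-memory version built on `Data.Map`. The names `Trie`, `insertName` and `lookupPrefix` are made up for this note, and the binary layout with file offsets is not modelled at all.

```haskell
import qualified Data.Map as Map
import Data.Char (toLower)

-- Hypothetical in-memory name table; one node per letter.
data Trie = Trie
  { ends     :: [Int]              -- ids of items whose name ends at this node
  , children :: Map.Map Char Trie  -- next letter -> sub-table
  }

emptyTrie :: Trie
emptyTrie = Trie [] Map.empty

-- Insert a name (lower-cased) together with the id of its item.
insertName :: String -> Int -> Trie -> Trie
insertName []       ident t = t { ends = ident : ends t }
insertName (c : cs) ident t =
  let key   = toLower c
      child = Map.findWithDefault emptyTrie key (children t)
  in  t { children = Map.insert key (insertName cs ident child) (children t) }

-- Every id stored at or below a node.
allIds :: Trie -> [Int]
allIds t = ends t ++ concatMap allIds (Map.elems (children t))

-- Ids of all names that start with the given prefix.
lookupPrefix :: String -> Trie -> [Int]
lookupPrefix []       t = allIds t
lookupPrefix (c : cs) t =
  case Map.lookup (toLower c) (children t) of
    Nothing    -> []
    Just child -> lookupPrefix cs child
```

With `map` and `mapM` inserted, `lookupPrefix "map"` finds both items, while a prefix that leads nowhere returns the empty list.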
At the top of each file is a list of types and their arities, for example: Maybe=1, []=1, Bool=0
This will automatically catch errors like searching for "Just a" or "a -> Maybe".
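To make the arity check concrete, here is a minimal sketch. The `Type` representation and the `checkArity` function are assumptions made up for this note and are far simpler than real Haskell types. The arity table is imagined as already read from the top of a data file.

```haskell
import qualified Data.Map as Map

-- Hypothetical, simplified representation of a searched type.
data Type
  = TVar String         -- type variable, e.g. a
  | TCon String [Type]  -- type constructor applied to arguments, e.g. Maybe a
  | TFun Type Type      -- function arrow
  deriving (Show, Eq)

-- Type name -> arity, e.g. Maybe=1, []=1, Bool=0.
type Arities = Map.Map String Int

-- One message per type that is unknown or applied to the wrong number of arguments.
checkArity :: Arities -> Type -> [String]
checkArity _  (TVar _)         = []
checkArity ar (TFun a b)       = checkArity ar a ++ checkArity ar b
checkArity ar (TCon name args) = here ++ concatMap (checkArity ar) args
  where
    here = case Map.lookup name ar of
      Nothing -> ["unknown type: " ++ name]
      Just n
        | n == length args -> []
        | otherwise ->
            [name ++ " expects " ++ show n ++ " argument(s), got " ++ show (length args)]
```

Under this sketch "Just a" is rejected because Just is not in the table of types, and "a -> Maybe" is rejected because Maybe is used with no argument.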
Each package will have a list of instances, which will be loaded with it. When multiple packages are loaded, all their instances will be combined to make the process more efficient.
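Combining instances could look roughly like the following. This is only a sketch: it assumes each package's instances are held as a map from class name to the set of types with an instance, which is a representation invented here, not something the data file format prescribes.

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set

-- Hypothetical instance table: class name -> types that have an instance.
type Instances = Map.Map String (Set.Set String)

-- Merge the instance tables of all loaded packages into one global table.
combineInstances :: [Instances] -> Instances
combineInstances = Map.unionsWith Set.union

-- Is there an instance of the class for the type in the given table?
hasInstance :: Instances -> String -> String -> Bool
hasInstance table cls ty = maybe False (Set.member ty) (Map.lookup cls table)
```

After combining, a single lookup answers the question for every loaded package at once, rather than consulting each package in turn.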
Each package will have a list of type aliases, which will also be expanded at check time on a global basis.
Interpreting a Search
This has to be changed to take account of multi-part module names, e.g. Data.Map etc.
Which Modules to Search
Use +name and -name to include and remove a package; no space is allowed between the +/- and the name.
For name allow one of:
- Package name, base, hoogle, yhc, ghc, haxml etc.
- Module name, Data.Map, Prelude - refers to exactly that module
- Module name with star, Data.*, Win32.* - refers to all modules hanging off that
The options + and - are interpreted in order. Those referring to packages will build up the set of data files to search. If a module is specified with + and no modules have been specified before it with -, then -all is implied. The module filter is built up, then applied to the results after the search has been conducted.
This allows things like:
map -base -- find map which doesn't occur in the base package
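The +/- handling can be sketched as below. This is an illustration under assumptions, not the planned implementation: `Flag`, `parseQuery`, `matches` and `allowed` are names invented here, every option is treated as a module pattern, and the selection of data files by package name is not modelled. An option only counts if a letter follows the sign directly, so that -> and ++ stay part of the search.

```haskell
import Data.Char (isAlpha)
import Data.List (isPrefixOf, isSuffixOf)

-- Hypothetical representation of one +name / -name option.
data Flag = Include String | Exclude String
  deriving (Show, Eq)

-- Split a query into its options (in order) and the remaining search words.
parseQuery :: String -> ([Flag], [String])
parseQuery = foldr step ([], []) . words
  where
    step ('+' : name@(c : _)) (fs, ws) | isAlpha c = (Include name : fs, ws)
    step ('-' : name@(c : _)) (fs, ws) | isAlpha c = (Exclude name : fs, ws)
    step w (fs, ws) = (fs, w : ws)

-- Does a pattern such as Data.Map or Data.* match a module name?
matches :: String -> String -> Bool
matches pat modu
  | ".*" `isSuffixOf` pat = take (length pat - 1) pat `isPrefixOf` modu
  | otherwise             = pat == modu

-- Apply the options in order; a leading + starts from an implicit -all.
allowed :: [Flag] -> String -> Bool
allowed flags modu = foldl apply start flags
  where
    start = case flags of
      (Include _ : _) -> False
      _               -> True
    apply ok (Include p) = ok || matches p modu
    apply ok (Exclude p) = ok && not (matches p modu)
```

Because `allowed` works on a module name alone, it can be run over the results after the search has finished, as described above.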
Searching for :info style information
To get :info style information you must say accurately what you want to search for, e.g. ?Data.Map will give details on the module Data.Map, such as what functions are in it, and ?Data.Map.Map will give full details of the data structure. This includes all functions which return a Data.Map.Map and which take a Data.Map.Map, what module it's defined in, number of constructors, classes it's a member of etc. Where a name is ambiguous, e.g. Data.Map.Map is both a data structure and a constructor, :data or :ctor will be required to disambiguate.
This will be a new feature of Hoogle 4.
Deciding what to search for
To search for a function, you need brackets enclosing something (this includes [] and () around a term, but not on their own) or a -> arrow. Otherwise it's a name search.
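The rule above can be approximated with a very small classifier. This is a sketch under assumptions: `SearchKind`, `classify` and `hasEnclosing` are hypothetical names, and "brackets enclosing something" is read naively as an opening bracket that is not immediately closed, so a bracketed symbol such as (++) would need extra handling.

```haskell
import Data.List (isInfixOf)

data SearchKind = TypeSearch | NameSearch
  deriving (Show, Eq)

-- An arrow or a bracket around a term means a function (type) search.
classify :: String -> SearchKind
classify q
  | "->" `isInfixOf` q = TypeSearch
  | hasEnclosing q     = TypeSearch
  | otherwise          = NameSearch

-- True when an opening bracket is followed by anything other than its own
-- closing bracket, so (IO ()) and [a] count but a bare () or [] does not.
hasEnclosing :: String -> Bool
hasEnclosing s = or (zipWith opensTerm s (drop 1 s))
  where
    opensTerm '(' c = c /= ')'
    opensTerm '[' c = c /= ']'
    opensTerm _   _ = False
```

Under this reading `classify "map"` is a name search, while `classify "(IO ())"` and `classify "a -> b"` are function searches.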
Searching for names
All searches will be interpreted as name searches, unless they have a -> in the type signature. When searching with multiple names, it will be easier to get a hit. For example, the data structure Data.Map.Map will get a hit from:
- map as it's the name
- data (part of the module name)
- data (because it's a data structure)
- Data.Map, the module name
If you want to search within a namespace, e.g. functions, then use :data, :class or :module; prefixes of all of these will also work as long as they are unique. Symbols, e.g. ++, will also be interpreted this way.
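Unique-prefix matching for the namespace keywords might work along these lines. It is a sketch only: the list of namespaces is assumed from the keywords mentioned on this page (:data, :class, :module, :ctor) and may not be complete, and `resolveNamespace` is a name invented for this note.

```haskell
import Data.List (isPrefixOf)

-- Assumed set of namespace keywords, taken from those mentioned on this page.
namespaces :: [String]
namespaces = ["data", "class", "module", "ctor"]

-- Resolve ":d", ":mod" etc. to a full namespace name if the prefix is unique.
resolveNamespace :: String -> Either String String
resolveNamespace (':' : p@(_ : _)) =
  case filter (p `isPrefixOf`) namespaces of
    [n] -> Right n
    []  -> Left ("unknown namespace: " ++ p)
    ns  -> Left ("ambiguous namespace: " ++ p ++ " could be " ++ unwords ns)
resolveNamespace other = Left ("not a namespace: " ++ other)
```

With this list, :d and :mod resolve on their own, whereas :c is ambiguous between class and ctor and needs at least one more letter.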
Searching for functions
These will remain mainly unchanged, but be harder to invoke accidentally, because -> will be needed to search for functions. To search for a zero-arity function, wrap it in brackets, e.g. (IO ()), and this will search for IO () as a function.