From HaskellWiki
Revision as of 14:39, 2 July 2006 by NeilMitchell (talk | contribs) (Hoogle Data File)

These are the technical specifications for Hoogle 4, and are intended as a working blueprint. Please sign all notes, unless you are me. --Neil Mitchell 10:36, 28 June 2006 (UTC)

Hoogle Text File

The Hoogle Text File format will remain largely unchanged from Hoogle 3.

An example is:

module MyModule
keyword where
keyword ->
Just :: a -> Maybe a
data Maybe a
fromJust :: Maybe a -> a
(++) :: [a] -> [a] -> [a]
instance Monad Maybe
class Monad m
return :: Monad a => b -> a b
other "haddock" "Lambdabot says http://haskell.org/haddock"
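As a rough illustration of how the lines in the example above might be read, here is a minimal sketch of a line classifier. The `Item` type, its constructor names and `parseLine` are hypothetical names chosen for this sketch; they are not taken from the Hoogle source, and the real parser may well differ.

```haskell
-- Hypothetical item kinds, one per line form seen in the example file.
data Item
  = Module String
  | Keyword String
  | DataDecl String
  | ClassDecl String
  | InstanceDecl String
  | Other String
  | Signature String String   -- name, then the type as plain text
  deriving (Eq, Show)

-- Classify one line of a Hoogle Text File.
-- Blank or unrecognised lines give Nothing.
parseLine :: String -> Maybe Item
parseLine raw = case words raw of
  []                   -> Nothing
  (name : "::" : ty)
    | not (null ty)    -> Just (Signature name (unwords ty))
  ("module" : rest)    -> Just (Module (unwords rest))
  ("keyword" : rest)   -> Just (Keyword (unwords rest))
  ("data" : rest)      -> Just (DataDecl (unwords rest))
  ("class" : rest)     -> Just (ClassDecl (unwords rest))
  ("instance" : rest)  -> Just (InstanceDecl (unwords rest))
  ("other" : rest)     -> Just (Other (unwords rest))
  _                    -> Nothing
```

Signatures are checked first so that a function whose name happens to equal one of the leading words is still read as a signature. The type is kept as an unparsed string here; a real reader would parse it.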

Hoogle Data File

The Hoogle Data File format will be binary, so that more features can be supported while searching much faster than over the text format. It needs better indexing, type collapsing, etc. The main goals are to improve speed, improve accuracy, and allow multiple data files to be searched.


Most things in the file format will be int offsets into the file, limiting a file to 2 GB, which is perfectly reasonable I think :)

Every item will have a record containing a string, which is the text to output, with hints for things like argument positions. Items will be scored before their records are read, to make the process slightly more efficient.

Name Searches

The main operation is searching by name, so names need the following structure:

The start of the file will be a name table, in which each name is split by letter:

  a -> 1, b -> 3, c -> .... []
  p -> 2 ...
  ... [S <ap>, E <map>, M ]

(This is only approximate. I have a better structure at home, and must add it.)
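Since the table above is only an outline, the following is one possible reading of "split by letter": a trie in which each node is keyed by the next letter of the name. This is an assumption made for illustration, not the structure the data file will actually use. `Trie`, `insertName` and `lookupPrefix` are hypothetical names, item ids stand in for file offsets, and `Data.Map` from the containers package is used for the child table.

```haskell
import qualified Data.Map as Map
import Data.Char (toLower)

-- A trie keyed by letter. Each node lists the ids of the items whose
-- lower-cased name ends exactly at that node.
data Trie = Trie
  { here     :: [Int]
  , children :: Map.Map Char Trie
  }

emptyTrie :: Trie
emptyTrie = Trie [] Map.empty

-- Add a name with its item id, one letter per level.
insertName :: String -> Int -> Trie -> Trie
insertName []       i (Trie ids kids) = Trie (i : ids) kids
insertName (c : cs) i (Trie ids kids) =
  Trie ids (Map.insert k (insertName cs i child) kids)
  where
    k     = toLower c
    child = Map.findWithDefault emptyTrie k kids

-- All item ids stored at or below a node.
allIds :: Trie -> [Int]
allIds (Trie ids kids) = ids ++ concatMap allIds (Map.elems kids)

-- Ids of every item whose name starts with the given prefix.
lookupPrefix :: String -> Trie -> [Int]
lookupPrefix []       t = allIds t
lookupPrefix (c : cs) t =
  maybe [] (lookupPrefix cs) (Map.lookup (toLower c) (children t))
```

This sketch only answers prefix queries. Matching in the middle or at the end of a name, which the outline above seems to hint at, would need a different or additional index.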

Type Validation

At the top of each file is a list of types and their arities, for example: Maybe=1, []=1, Bool=0

This will automatically catch errors like searching for "Just a" or "a -> Maybe"
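The arity check can be sketched as follows. The `Type` representation and the `validate` function are hypothetical, and the sketch assumes the query has already been parsed; only the arity table (Maybe=1, []=1, Bool=0) comes from the text above.

```haskell
import qualified Data.Map as Map

-- A minimal type syntax: variables, constructors applied to arguments,
-- and function arrows.
data Type
  = TVar String
  | TCon String [Type]
  | TFun Type Type
  deriving (Eq, Show)

type Arities = Map.Map String Int

-- The example table from the text.
arities :: Arities
arities = Map.fromList [("Maybe", 1), ("[]", 1), ("Bool", 0)]

-- Check that every constructor is known and fully applied.
validate :: Arities -> Type -> Either String ()
validate _   (TVar _)         = Right ()
validate env (TFun a b)       = validate env a >> validate env b
validate env (TCon name args) =
  case Map.lookup name env of
    Nothing -> Left ("unknown type: " ++ name)
    Just n
      | n /= length args ->
          Left (name ++ " expects " ++ show n ++ " argument(s), given "
                ++ show (length args))
      | otherwise -> mapM_ (validate env) args
```

With this table, "Just a" is rejected because Just is not a known type, and "a -> Maybe" is rejected because Maybe is given no argument.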


Each package will have a list of instances, which will be loaded with it. When multiple packages are loaded, all their instances will be combined, to make the process more efficient.


Each package will have a list of type aliases, which will also be expanded at check time on a global basis.

Interpreting a Search

This has to be changed to take account of multi-part module names, e.g. Data.Map.

Which Modules to Search

Use +name and -name to include and remove a package or module. No space is allowed between the +/- and the name.

For name allow one of:

  • Package name, e.g. base, hoogle, yhc, ghc, haxml.
  • Module name, e.g. Data.Map, Prelude. This refers to exactly that module.
  • Module name with star, e.g. Data.*, Win32.*. This refers to all modules hanging off that name.

The options + and - are interpreted in order. Those referring to packages will build up the set of data files to search. If a module is specified with + and no module has been specified before it with -, then there is an implicit -all. The module filter is built up from the options and applied to the results after the search has been conducted.

This allows things like:

map -base -- find map which doesn't occur in the base package
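A minimal sketch of the module side of this, under some assumptions: `Flag`, `splitQuery`, `matches` and `allowed` are hypothetical names, a flag name must start with a letter (so that a bare -> is not mistaken for a flag), and package flags, which select data files, are not modelled at all.

```haskell
import Data.Char (isAlpha)
import Data.Either (partitionEithers)
import Data.List (isPrefixOf, isSuffixOf)

data Flag = Include String | Exclude String
  deriving (Eq, Show)

-- Split a query into its +name/-name flags and the remaining terms.
splitQuery :: String -> ([Flag], [String])
splitQuery = partitionEithers . map classify . words
  where
    classify ('+' : name@(c : _)) | isAlpha c = Left (Include name)
    classify ('-' : name@(c : _)) | isAlpha c = Left (Exclude name)
    classify w = Right w

-- Does a flag name match a module? "Data.*" matches every module
-- hanging off Data, anything else must match exactly.
matches :: String -> String -> Bool
matches pat modu
  | ".*" `isSuffixOf` pat = take (length pat - 1) pat `isPrefixOf` modu
  | otherwise             = pat == modu

-- Apply the flags in order; the last matching flag decides.
-- If the first flag is an Include, everything starts out excluded
-- (the implicit -all).
allowed :: [Flag] -> String -> Bool
allowed flags modu = foldl step start flags
  where
    start = case flags of
      (Include _ : _) -> False
      _               -> True
    step acc (Include p) = if matches p modu then True else acc
    step acc (Exclude p) = if matches p modu then False else acc
```

Here `allowed` would be used as a filter over the modules of the results, after the search itself has run, as the text describes.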

Searching for :info style information

To get :info style information you must say precisely what you want to look up. For example, ?Data.Map will give details of the module Data.Map, such as which functions are in it, and ?Data.Map.Map will give full details of the data structure. This includes all functions which return a Data.Map.Map or take a Data.Map.Map, which module it's defined in, its number of constructors, the classes it's a member of, etc. Where a name is ambiguous, e.g. Data.Map.Map is both a data structure and a constructor, :data or :ctor will be required to disambiguate.

This will be a new feature of Hoogle 4.

Deciding what to search for

To search for a function, you need brackets enclosing something (this includes [] and () around a term, but not on their own) or a -> arrow. Otherwise it's a name search.
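The rule above could be approximated like this. It is a crude character-level sketch, not a parser: `QueryKind`, `hasFilledBrackets` and `classify` are hypothetical names, and the bracket test does not check that brackets balance.

```haskell
import Data.Char (isSpace)
import Data.List (isInfixOf)

data QueryKind = TypeSearch | NameSearch
  deriving (Eq, Show)

-- True when an opening bracket is followed by something other than
-- its closing bracket, e.g. "[a]" or "(IO ())", but not "[]" or "()".
hasFilledBrackets :: String -> Bool
hasFilledBrackets query = any filled (zip s (drop 1 s))
  where
    s = filter (not . isSpace) query
    filled (open, next) = open `elem` "([" && next `notElem` ")]"

-- A query is a function (type) search if it has an arrow or filled
-- brackets; otherwise it is a name search.
classify :: String -> QueryKind
classify q
  | "->" `isInfixOf` q || hasFilledBrackets q = TypeSearch
  | otherwise                                 = NameSearch
```

Note that under this literal reading a bracketed operator such as (++) counts as a type search, so symbol queries would need separate handling.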

Searching for names

All searches will be interpreted as name searches, unless they have a -> in the type signature. An item can be matched through several different names, which makes it easier to get a hit. For example, the data structure Data.Map.Map will get a hit from:

  • map, as it's the name
  • data (part of the module name)
  • data (because it's a data structure)
  • Data.Map, the module name

If you want to search within one namespace rather than across everything, use :data, :class or :module. Prefixes of all of these will also work, as long as they are unique. Symbols, e.g. ++, will also be interpreted as name searches.
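Unique-prefix matching of the modifiers can be sketched as below. `resolveModifier` and the `modifiers` list are hypothetical; the list combines :data, :class and :module with the :ctor modifier mentioned for :info searches, which is a choice made for this sketch only.

```haskell
import Data.List (isPrefixOf)

-- Modifier names without the leading colon (an assumed set).
modifiers :: [String]
modifiers = ["data", "class", "module", "ctor"]

-- Resolve a possibly abbreviated modifier such as ":d" or ":mod"
-- against the known names. Only a unique match is accepted.
resolveModifier :: [String] -> String -> Either String String
resolveModifier known (':' : abbrev)
  | null abbrev = Left "empty modifier"
  | otherwise   = case filter (abbrev `isPrefixOf`) known of
      [name] -> Right name
      []     -> Left ("unknown modifier: " ++ abbrev)
      names  -> Left ("ambiguous modifier: " ++ abbrev
                      ++ " could be " ++ unwords names)
resolveModifier _ other = Left ("not a modifier: " ++ other)
```

With this set, :d and :mod resolve to data and module, while :c is rejected as ambiguous between class and ctor.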

Searching for functions

These will remain mainly unchanged, but will be harder to invoke accidentally, because -> will be needed to search for functions. To search for a zero-arity function, wrap it in brackets: (IO ()) will search for IO () as a function.

User Interface

There needs to be a standard and an advanced section. Standard is just a textbox. Advanced both parses the search and allows you to type in the individual items without modifiers. This should make it easier for people to get to grips with the advanced features. The user interface will be written entirely in client-side JavaScript.