HaskellImplementorsWorkshop/2011/Tibell
Faster persistent data structures through hashing
Johan Tibell
The most commonly used map (dictionary) data types in Haskell are implemented using some kind of binary tree, typically a size-balanced tree or a Patricia tree. While binary trees provide good asymptotic performance, their real-world performance is not stellar, especially when used with keys that are expensive to compare, such as strings.
In this talk I will describe a new map data type that uses a recently developed data structure, the hash array mapped trie, to achieve better real-world performance. I will cover the design and implementation of this data structure, the changes made to GHC to speed it up, and benchmark results, and conclude with a discussion of compiler improvements that could make this data type faster still.
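To make the idea concrete, the following is a minimal sketch of a hash array mapped trie, not the implementation presented in the talk. It assumes String keys, a toy 32-bit FNV-1a hash, 5 hash bits consumed per level, and plain lists in place of the packed arrays a real implementation would use. All names (`HAMT`, `hashString`, `insertH`, `lookupH`) are hypothetical and chosen for illustration only.

```haskell
import Data.Bits (bit, popCount, shiftR, testBit, xor, (.&.), (.|.))
import Data.Char (ord)
import Data.List (foldl')
import Data.Word (Word32)

-- Toy string hash (32-bit FNV-1a). An assumption for this sketch only.
hashString :: String -> Word32
hashString = foldl' step 2166136261
  where step h c = (h `xor` fromIntegral (ord c)) * 16777619

-- A node stores a 32-bit bitmap plus only the children that exist,
-- packed in bitmap order. Lists stand in for arrays here.
data HAMT v
  = Empty
  | Leaf !Word32 [(String, v)]  -- full hash and the entries sharing it
  | Node !Word32 [HAMT v]       -- bitmap and packed children

bitsPerLevel :: Int
bitsPerLevel = 5

-- The 5-bit slice of the hash used at the level starting at bit s.
subIndex :: Word32 -> Int -> Int
subIndex h s = fromIntegral ((h `shiftR` s) .&. 31)

-- Position of logical slot i in the packed child list:
-- the number of bitmap bits set below bit i.
packedPos :: Word32 -> Int -> Int
packedPos bm i = popCount (bm .&. (bit i - 1))

lookupH :: String -> HAMT v -> Maybe v
lookupH k = go 0
  where
    h = hashString k
    go _ Empty = Nothing
    go _ (Leaf h' kvs)
      | h == h'   = lookup k kvs  -- keys are compared only on equal hashes
      | otherwise = Nothing
    go s (Node bm cs)
      | testBit bm i = go (s + bitsPerLevel) (cs !! packedPos bm i)
      | otherwise    = Nothing
      where i = subIndex h s

insertH :: String -> v -> HAMT v -> HAMT v
insertH k v = go 0
  where
    h = hashString k
    go _ Empty = Leaf h [(k, v)]
    go s t@(Leaf h' kvs)
      | h == h'   = Leaf h ((k, v) : filter ((/= k) . fst) kvs)
      -- Different hash: push the old leaf one level down, then retry.
      | otherwise = go s (Node (bit (subIndex h' s)) [t])
    go s (Node bm cs) =
      case splitAt (packedPos bm i) cs of
        (before, child : after) | testBit bm i ->
          Node bm (before ++ go (s + bitsPerLevel) child : after)
        (before, rest) ->
          Node (bm .|. bit i) (before ++ Leaf h [(k, v)] : rest)
      where i = subIndex h s

main :: IO ()
main = do
  let m = foldr (\(k, v) -> insertH k v) Empty
            [("one", 1), ("two", 2), ("three", 3 :: Int)]
  print (lookupH "two" m)
  print (lookupH "four" m)
```

In this sketch a lookup hashes the key once and then follows 5-bit slices of the hash down the trie, so the full string comparison happens only at a leaf whose hash matches. That is the property that makes hashing attractive for keys that are expensive to compare.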