Difference between revisions of "Applications and libraries/Data structures"
From HaskellWiki
(added AltBinary and more info about Streams and ArrayRef) 
m (fix gmane links using archive.fo) 

(40 intermediate revisions by 16 users not shown)  
Line 2:  Line 2:  
{{LibrariesPage}} 
{{LibrariesPage}} 

−  == Haskell Libraries and Tools for Data Structures == 

+  ==General libraries and tools for data structures== 

−  ;[http://www. 
+  ;[http://www.cs.princeton.edu/~rdockins/edison/home/ Edison] 
:This is the latest version of the Edison library of efficient data structures. There are also [http://www.haskell.org/ghc/docs/edison/ earlier version of Edison] by Chris Okasaki. It provides sequences, finite maps, priority queues, and sets/bags. ([http://www.eecs.usma.edu/Personnel/okasaki/pubs.html#hw00 overview paper]). 
:This is the latest version of the Edison library of efficient data structures. There are also [http://www.haskell.org/ghc/docs/edison/ earlier version of Edison] by Chris Okasaki. It provides sequences, finite maps, priority queues, and sets/bags. ([http://www.eecs.usma.edu/Personnel/okasaki/pubs.html#hw00 overview paper]). 

−  ;[http://homepages.nildram.co.uk/~ahey/HLibs/Data.Tree.AVL/ Data.Tree.AVL] 

+  ;[[Library/New collections]] 

−  :An implementation of AVL trees and related utilities. 

+  :This is a package of many useful collections types. Advantages include: 

+  :*It's easy to migrate from standard Lists/Sets/Maps to the new package. The package is an evolution (rather than a revolution) of the collections currently in the base package. 

+  :*Each collection type fits in a consistent framework (thanks to classes) 

+  :*An extensive QuickCheck test suite also serves as detailed specification for the collections. 

−  ;[http://homepages.nildram.co.uk/~ahey/HLibs/Data.StringMap/ Data.StringMap] 

+  :This package includes and supercedes the following libraries: 

−  :A library providing maps from String keys to values, based on Tries. 

+  
+  :;[http://homepages.nildram.co.uk/~ahey/HLibs/Data.Tree.AVL/ Data.Tree.AVL] 

+  ::An implementation of AVL trees and related utilities. 

+  
+  :;[http://homepages.nildram.co.uk/~ahey/HLibs/Data.StringMap/ Data.StringMap] 

+  ::A library providing maps from String keys to values, based on Tries. 

+  
+  :This package includes the following libraries, but they are maintained separately: 

+  
+  :;[http://sourceforge.net/projects/rangedsets/ Ranged Sets] 

+  ::A ranged set is a list of nonoverlapping ranges. The ranges have upper and lower boundaries, and a boundary divides the base type into values above and below. No value can ever sit on a boundary. So you can have the set {2.0 < x <= 3.0, 5.3 < x < 6} 

;[http://www.cs.vu.nl/Strafunski/ Strafunski] 
;[http://www.cs.vu.nl/Strafunski/ Strafunski] 

Line 20:  Line 33:  
;[http://web.engr.oregonstate.edu/~erwig/diet/ Discrete Interval Encoding Trees] 
;[http://web.engr.oregonstate.edu/~erwig/diet/ Discrete Interval Encoding Trees] 

−  :The discrete interval encoding tree is a structure for storing subsets of types having a total order and a predecessor and a successor function. 
+  :The discrete interval encoding tree (DIET) is a structure for storing subsets of types having a total order and a predecessor and a successor function. 
−  
−  ;[http://sourceforge.net/projects/rangedsets/ Ranged Sets] 

−  :A ranged set is a list of nonoverlapping ranges. The ranges have upper and lower boundaries, and a boundary divides the base type into values above and below. No value can ever sit on a boundary. So you can have the set {2.0 < x <= 3.0, 5.3 < x < 6} 

;[http://www.cwi.nl/~ralf/HList/ HList] 
;[http://www.cwi.nl/~ralf/HList/ HList] 

Line 35:  Line 48:  
;[http://repetae.net/john/recent/out/GenUtil.html GenUtil] 
;[http://repetae.net/john/recent/out/GenUtil.html GenUtil] 

−  :A collection of random useful utility functions written in pure Haskell 98. In general, it 
+  :A collection of random useful utility functions written in pure Haskell 98. In general, it tries to conform to the naming scheme put forth in the Haskell prelude and fill in the obvious omissions, as well as providing useful routines in general. 
;[http://www.cs.uu.nl/~afie/pd09122004.zip PersistentDocument] ''The link is dead, somebody please either update it or remove it.'' 
;[http://www.cs.uu.nl/~afie/pd09122004.zip PersistentDocument] ''The link is dead, somebody please either update it or remove it.'' 

Line 43:  Line 56:  
:A generic monad for navigating around arbitrary data structures 
:A generic monad for navigating around arbitrary data structures 

−  == 
+  ==Graphs== 
;[http://web.engr.oregonstate.edu/~erwig/fgl/haskell/ FGL  A Functional Graph Library] 
;[http://web.engr.oregonstate.edu/~erwig/fgl/haskell/ FGL  A Functional Graph Library] 

Line 51:  Line 64:  
:Part of the UMinho Haskell libraries, this library provides a representation and operations on relations. A special case of relations are graphs. The operations include graph chopping and slicing, strong connected component analysis, graphs metrics, and more. 
:Part of the UMinho Haskell libraries, this library provides a representation and operations on relations. A special case of relations are graphs. The operations include graph chopping and slicing, strong connected component analysis, graphs metrics, and more. 

−  ;[http:// 
+  ;[http://archive.fo/XdRKm Haskell Graph Automorphism Library] 
−  :JeanPhilippe Bernardy's implementation of Brendan McKay's algorithm for graph 
+  :JeanPhilippe Bernardy's implementation of Brendan McKay's algorithm for graph canonical labeling and automorphism group (Nauty). 
−  == 
+  ==IO== 
−  ;[ 
+  ;[[Library/Streams  Streams]] 
−  :Streams is a featurerich, flexible, extensible, backwardcompatible and fast 
+  :Streams is a featurerich, flexible, extensible, backwardcompatible and fast I/O library. It supports various stream types: files and legacy Handle type, string and memory buffers, pipes. There is also common functionality available for any stream: buffering, Char encoding, locking. 
−  I/O library. It supports various stream types: plain and memorymapped 

−  files, string and memory buffers, pipes. There is also 

−  common functionality, available for any stream: buffering, Char 

−  encoding, locking. 

+  ;[[Binary IO]] 

−  == 
+  ==Mutable data== 
;[http://www.isi.edu/~hdaume/STPP/ The Haskell STate Preprocessor] 
;[http://www.isi.edu/~hdaume/STPP/ The Haskell STate Preprocessor] 

:This is a short preprocessor for stateful Haskell programs. It aleviates the pain of performing single array lookup/write/update functions with some syntax to support it. It also supports hash table operations based on the HashTable implementation available from the author. Finally, it supports monadic if and monadic case constructions. It is lightweight in the sense that it performs no syntactic analysis and is essentially a character transformer. 
:This is a short preprocessor for stateful Haskell programs. It aleviates the pain of performing single array lookup/write/update functions with some syntax to support it. It also supports hash table operations based on the HashTable implementation available from the author. Finally, it supports monadic if and monadic case constructions. It is lightweight in the sense that it performs no syntactic analysis and is essentially a character transformer. 

−  ;[ 
+  ;[[Library/ArrayRef  Arrays & References Library]] 
:Featuring: 
:Featuring: 

* Unboxed references in IO and ST monads 
* Unboxed references in IO and ST monads 

Line 71:  Line 85:  
* Syntax sugar to make using of mutable objects easier (=:, +=, =,..) 
* Syntax sugar to make using of mutable objects easier (=:, +=, =,..) 

* Reimplemented Arrays library with the following improvements: 
* Reimplemented Arrays library with the following improvements: 

−  +  * Unboxed arrays now can be used in polymorphic functions 

−  +  * The MArray class now supports arrays with dynamic bounds 

−  +  * Implementation of dynamic (resizable) arrays 

−  === Strings === 

+  ;[http://code.haskell.org/storablevector/ Data.StorableVector.ST] 

+  :StorableVector also support mutation in ST monad. 

+  
+  See also [[Modern array libraries]] 

+  
+  ==Lists== 

+  
+  ;[http://code.google.com/p/slist/ SList] ''(broken link)'' 

+  :Sized lists for Haskell 

+  
+  ;[http://www.cse.unsw.edu.au/~dons/dlist.html dlist] 

+  :Difference lists (supporting O(1) append and snoc) 

+  
+  ==Strings== 

;[http://www.cse.unsw.edu.au/~dons/fps.html Data.ByteString] 
;[http://www.cse.unsw.edu.au/~dons/fps.html Data.ByteString] 

−  :The FPS library provides mmapped and malloc'd packed strings (byte arrays held by a ForeignPtr), along with a list interface to these strings. It lets you do extremely fast 
+  :The FPS library provides mmapped and malloc'd packed strings (byte arrays held by a ForeignPtr), along with a list interface to these strings. It lets you do extremely fast I/O in Haskell; in some cases, even faster than typical C implementations, as well as conserving space. 
+  
+  ;[http://hackage.haskell.org/package/DataRope Data.Rope] 

+  :A wellknown tree data structure for manipulating strings without (too many) memory copies, contrarily to what Data.ByteString does. Also, contrarily to the functions in Data.ByteString.Lazy, most operations on ropes are in O(log n). 

+  
+  ;[http://code.haskell.org/storablevector/ Data.StorableVector] 

+  :This is very much the same like Data.ByteString extended to any Storable element type. There is a data structure for monolithic blocks and chunky sequences. Mutable access is also possible in ST monad. 

−  ;[http:// 
+  ;[http://software.complete.org/missingh/ MissingH] 
:MissingH is a library of pureHaskell utility functions relating to strings, logging, and I/O. 
:MissingH is a library of pureHaskell utility functions relating to strings, logging, and I/O. 

Line 89:  Line 116:  
:A library for variable expansion inside strings 
:A library for variable expansion inside strings 

−  === Serialising data === 

+  ;[http://haskelli18n.cvs.sourceforge.net/haskelli18n/ i18n strings] 

+  :At sourceforge 

+  
+  ==Regular expressions== 

+  
+  There are many libraries for regular expressions available. By default 

+  GHC comes with: 

+  
+  ;[http://darcs.haskell.org/packages/regexbase/ regexbase] 

+  :This defines the type classes for the new API 

+  
+  ;[http://darcs.haskell.org/packages/regexcompat/ regexcompat] 

+  :This defines the old Text.Regex.Posix API, using regexposix 

+  
+  ;[http://darcs.haskell.org/packages/regexposix/ regexposix] 

+  :This is the interface to the old (and slow) posix backend (C's regex.h) 

+  
+  All backends provide String and ByteString interfaces. 

+  
+  Additional backend libraries are available for (hopefully) more efficient regular expression 

+  implementations: 

+  
+  ;[http://darcs.haskell.org/packages/regexpcre/ regexpcre] 

+  :PCREbased regexes. This requires [http://www.pcre.org/ libpcre] (currently works against version 6.6.0 as of January 2007). This Haskell library and libpcre are both BSD. This is very fast but has left branch semantics instead of POSIX leftmost longest semantics. 

+  
+  ;[http://darcs.haskell.org/packages/regextre/ regextre] 

+  :TREbased regexes. This requires [http://laurikari.net/tre/ libtre] (currently works against version 0.7.5 as of January 2007). Libtre aims to be POSIX compatible and is a much faster implementation that used by regexposix. Note that libtre does have an [http://laurikari.net/pipermail/tregeneral/2007January/000083.html outstanding bug] in correctly interpreting some regular expressions. This Haskell package is BSD3 licensed and libtre is LGPL. 

+  
+  ;[http://darcs.haskell.org/packages/regexparsec/ regexparsec] 

+  :Pure Haskell regexes implemented via parsec. The leftmost longest subexpression capture semantics of this library differ from the POSIX standard. This also has an option to prefer the left branch semantics that libpcre uses. 

+  
+  ;[http://darcs.haskell.org/packages/regexdfa/ regexdfa] 

+  :Pure Haskell DFA for regexes. This is licensed under LPGL (the above packages are all under BSD3) due to derived code. This is faster than regexparsec but does not return captured subexpressions. It also is being updated to handle repeated patterns that can match empty strings (the older version hangs, the new version rewrites the pattern internally). 

+  
+  ;There is a regextdfa packages in development (not released) that will be a pure Haskell and BSD3 licensed implementation inspired by libtre. It aims to be a replacement for regexposix. 

+  
+  ;Development versions (possible unstable or broken) of the above packages are also available under [http://darcs.haskell.org/packages/regexunstable/ regexunstable]. For support, please use the [[Mailing_lists  haskellcafe]] mailing list. 

+  
+  ==Spatial indices== 

+  There are several libraries containing generic code for maps indexed by 2D, 3D or Ndimensional points that support efficient nearestneighbour or range operations: 

+  
+  * 2D: 

+  :;[http://hackage.haskell.org/package/spacepart spacepart] 

+  ::Contains quadtree implementation. 

+  
+  * 3D: 

+  :;[http://hackage.haskell.org/package/Octree octree] 

+  :: Simple unbalanced octrees. 

+  
+  * Of arbitrary dimensionality: 

+  
+  :;[http://hackage.haskell.org/package/pktree pktree] 

+  ::PKtrees support both radius and rectangular range search. 

+  
+  :;[http://hackage.haskell.org/package/KdTree kd tree] 

+  ::kd trees also support efficient range and nearest neighbour search 

+  
+  ==Serialising data== 

+  
+  ;[http://www.cse.unsw.edu.au/~dons/binary/DataBinary.html Data.Binary] ([http://darcs.haskell.org/binary/ darcs repository]) 

+  ;A library for serialising binary values to and from lazy ByteStrings. 

;[http://www.nheptane.com/nhlab/repos/NewBinary NewBinary] 
;[http://www.nheptane.com/nhlab/repos/NewBinary NewBinary] 

Line 97:  Line 184:  
:SerTH is a binary serialization library for Haskell. It supports serializing cyclic datatypes in a fast binary format. SerTH uses template haskell for deriving the serializing interface for new datatypes. 
:SerTH is a binary serialization library for Haskell. It supports serializing cyclic datatypes in a fast binary format. SerTH uses template haskell for deriving the serializing interface for new datatypes. 

−  ;[ 
+  ;[[Library/AltBinary  AltBinary]] 
: AltBinary is an exhaustive library that support binary I/O and serialization. It's part of [http://haskell.org/haskellwiki/Library/Streams Streams] library, so serialization is possible to any I/O source, from String to memorymapped file. It's also backward compatible with [http://www.nheptane.com/nhlab/repos/NewBinary NewBinary] library what makes translation of old code trivial. Very fast, very featurerich, Hugs/GHC compatible, etc, etc... 
: AltBinary is an exhaustive library that support binary I/O and serialization. It's part of [http://haskell.org/haskellwiki/Library/Streams Streams] library, so serialization is possible to any I/O source, from String to memorymapped file. It's also backward compatible with [http://www.nheptane.com/nhlab/repos/NewBinary NewBinary] library what makes translation of old code trivial. Very fast, very featurerich, Hugs/GHC compatible, etc, etc... 

−  === Compressing data === 

+  ;[http://svn.openfoundry.org/pugs/thirdparty/HsSyck/ HsSyck] 

+  :YAML is a straightforward machine parsable data serialization format designed for human readability and interaction with dynamic languages. It is optimized for data serialization, configuration settings, log files, Internet messaging and filtering. Syck is an extension, written in C, for reading and writing YAML swiftly in popular scripting languages. It is part of core Ruby, and also has bindings for Perl 5, Python, Lua, Cocoa, and Perl 6. HsSyck provides Data.Yaml.Syck as an interface to YAML structures, using Data.ByteString for efficient textual data representation. Additionally, we provide a set of DrIFT rules to dump and load arbitrary Haskell data types in the YAML format. 

−  ;[http://freearc.narod.ru/ Compression2005] 

+  ;[[GenericSerialize]] 

−  :Features of the Compression2005 Library: 

+  :GenericSerialize is a library which serializes data using the "Scrap Your Boilerplate" infrastructure. This means that while it cannot easily support datastructurespecific serialization, it can support many different data formats cheaply. 

+  
+  ==Compressing data== 

+  
+  ;[[Library/Compression  Compressions2005]] 

+  :Features of the Compression2005 Library ([http://freearc.narod.ru/ homepage]) 

* easy and uniform access to most competitive free compression algorithms as of April'05: LZMA, PPMd and GRZip 
* easy and uniform access to most competitive free compression algorithms as of April'05: LZMA, PPMd and GRZip 

* all input/output performed via usersupplied functions (callbacks), so you can compress data in memory, files, pipes, sockets and anything else 
* all input/output performed via usersupplied functions (callbacks), so you can compress data in memory, files, pipes, sockets and anything else 

−  * all parameters of compression algorithm are defined with a single string, for example "lzma:8mb:fast:hc4:fb32". 
+  * all parameters of compression algorithm are defined with a single string, for example "lzma:8mb:fast:hc4:fb32". 
−  Using this library, you can write 
+  * Using this library, you can write a bziplike utility in one line, with better compression results than bzip2 itself. 
−  === Benchmarking data structures === 

+  ;[http://hackage.haskell.org/cgibin/hackagescripts/package/zlib Zlib] 

+  :Zlib bindings for ByteStrings. darcs get http://code.haskell.org/zlib/ ([http://hackage.haskell.org/cgibin/hackagescripts/package/zlib docs]) 

+  
+  ;[http://hackage.haskell.org/cgibin/hackagescripts/package/bzlib BZip2] 

+  :BZip2 bindings for ByteStrings. darcs get http://code.haskell.org/bzlib/ ([http://hackage.haskell.org/cgibin/hackagescripts/package/bzlib docs]) 

+  
+  ==Benchmarking data structures== 

;[http://www.cs.york.ac.uk/fp/auburn/ Auburn] 
;[http://www.cs.york.ac.uk/fp/auburn/ Auburn] 

Line 116:  Line 215:  
;[http://www.cse.unsw.edu.au/~dons/code/fps/tests/Bench.hs Bench] 
;[http://www.cse.unsw.edu.au/~dons/code/fps/tests/Bench.hs Bench] 

:Simple time and space benchmarking for various listlike data structures. Easily adapted to arbitrary structures 
:Simple time and space benchmarking for various listlike data structures. Easily adapted to arbitrary structures 

+  
+  ==Generic traversals== 

+  
+  ;[http://eecs.oregonstate.edu/~erwig/reclib/ RecLib A Recursion and Traversal Library for Haskell] 

+  :The Recursion Library for Haskell provides a rich set of generic traversal strategies to facilitate the flexible specification of generic term traversals. The underlying mechanism is the Scrap Your Boilerplate (SYB) approach. Most of the strategies that are used to implement recursion operators are taken from Stratego. 

+  
+  ==Typesafe variants of <hask>IntSet</hask> and <hask>IntMap</hask> for enumerations== 

+  
+  * ChrisKuklewicz wrote a bit of boilerplate to reuse IntSet and IntMap with any Enum type: [[EnumSet EnumMap]] 

+  * an efficient implementation of EnumSet for sets which fit into a machine word can be found in [http://www.eecs.tufts.edu/~rdocki01/docs/edison/DataEdisonCollEnumSet.html Edison] (see above) 

+  
+  ==Hackage== 

+  
+  * [http://hackage.haskell.org/packages/archive/pkglist.html#cat:Data%20Structures Data structures on Hackage] 

+  * [http://hackage.haskell.org/packages/archive/pkglist.html#cat:Data More data structures on Hackage] 

+  * [http://hackage.haskell.org/packages/archive/pkglist.html#cat:Generics Generics on Hackage] 
Latest revision as of 19:28, 15 August 2019
 The copyright status of this work is not known. Please help resolve this on the talk page.
This page contains a list of libraries and tools in a certain category. For a comprehensive list of such pages, see Applications and libraries.
Contents
General libraries and tools for data structures
 Edison
 This is the latest version of the Edison library of efficient data structures. There are also earlier version of Edison by Chris Okasaki. It provides sequences, finite maps, priority queues, and sets/bags. (overview paper).
 Library/New collections
 This is a package of many useful collections types. Advantages include:
 It's easy to migrate from standard Lists/Sets/Maps to the new package. The package is an evolution (rather than a revolution) of the collections currently in the base package.
 Each collection type fits in a consistent framework (thanks to classes)
 An extensive QuickCheck test suite also serves as detailed specification for the collections.
 This package includes and supercedes the following libraries:
 Data.Tree.AVL
 An implementation of AVL trees and related utilities.
 Data.StringMap
 A library providing maps from String keys to values, based on Tries.
 This package includes the following libraries, but they are maintained separately:
 Ranged Sets
 A ranged set is a list of nonoverlapping ranges. The ranges have upper and lower boundaries, and a boundary divides the base type into values above and below. No value can ever sit on a boundary. So you can have the set {2.0 < x <= 3.0, 5.3 < x < 6}
 Strafunski
 A bundle for generic programming. It provides programming support for generic traversal as useful in the implementation of program transformations.
 Partial v0.1
 The Partial library provides a partial order class. It also provides routines for generating a Hasse diagram from a set and a partial order. Renderers are provided for the abstract Hasse diagram representation into LaTeX (via Xypic) and into dot, the format for AT&T's Graphviz tools. Since no horizontal sorting is done, the Xypic output is rather poor at present; dot does a much better job with its layout optimisation algorithm.
 Discrete Interval Encoding Trees
 The discrete interval encoding tree (DIET) is a structure for storing subsets of types having a total order and a predecessor and a successor function.
 HList
 A heterogeneous collection is a datatype that is capable of storing data of different types, while providing operations for lookup, update, iteration, and others. There are various kinds of heterogeneous collections, differing in representation, invariants, and access operations.
 monadLib
 Iavor Diatchki's library of monad transformers for Haskell. It enables the quick construction of monads  abstract data types that capture common programming idioms, such as throwing and catching exceptions or continuations. In many programming languages such features are built into the language (if they're provided at all), but in Haskell they are userprogrammable.
 Pointless Haskell
 Pointless Haskell is library for pointfree programming with recursion patterns defined as hylomorphisms. It also allows the visualization of the intermediate data structure of the hylomorphisms with GHood. This feature together with the DrHylo tool allows us to easily visualize recursion trees of Haskell functions.
 rhaskell : Reactive Objects
 Stefan Wehr's reactive objects library. Reactive objects are a convenient abstraction for writing programs which have to interact with a concurrent environment. A reactive object has two characteristics: the abandonment of all blocking operations and the unification of the concepts state and process. The former allows a reactive object to accept input from multiple sources without imposing any ordering on the input events. The latter prevents race conditions because the state of an object is only manipulated from the process belonging to the object.
 GenUtil
 A collection of random useful utility functions written in pure Haskell 98. In general, it tries to conform to the naming scheme put forth in the Haskell prelude and fill in the obvious omissions, as well as providing useful routines in general.
 PersistentDocument The link is dead, somebody please either update it or remove it.
 The persistent document abstraction takes care of dealing with a document you want to open from and save to disk and that supports undo. This functionality can be used by editors of arbitrary documents and saves you a lot of quite subtle coding.
 Zipper monad
 A generic monad for navigating around arbitrary data structures
Graphs
 FGL  A Functional Graph Library
 The functional graph library provides a collection of graph operations.
 Data.Relation
 Part of the UMinho Haskell libraries, this library provides a representation and operations on relations. A special case of relations are graphs. The operations include graph chopping and slicing, strong connected component analysis, graphs metrics, and more.
 Haskell Graph Automorphism Library
 JeanPhilippe Bernardy's implementation of Brendan McKay's algorithm for graph canonical labeling and automorphism group (Nauty).
IO
 Streams
 Streams is a featurerich, flexible, extensible, backwardcompatible and fast I/O library. It supports various stream types: files and legacy Handle type, string and memory buffers, pipes. There is also common functionality available for any stream: buffering, Char encoding, locking.
Mutable data
 The Haskell STate Preprocessor
 This is a short preprocessor for stateful Haskell programs. It aleviates the pain of performing single array lookup/write/update functions with some syntax to support it. It also supports hash table operations based on the HashTable implementation available from the author. Finally, it supports monadic if and monadic case constructions. It is lightweight in the sense that it performs no syntactic analysis and is essentially a character transformer.
 Arrays & References Library
 Featuring:
 Unboxed references in IO and ST monads
 Monadindependent interfaces to boxed and unboxed references
 Syntax sugar to make using of mutable objects easier (=:, +=, =,..)
 Reimplemented Arrays library with the following improvements:
 Unboxed arrays now can be used in polymorphic functions
 The MArray class now supports arrays with dynamic bounds
 Implementation of dynamic (resizable) arrays
 Data.StorableVector.ST
 StorableVector also support mutation in ST monad.
See also Modern array libraries
Lists
 SList (broken link)
 Sized lists for Haskell
 dlist
 Difference lists (supporting O(1) append and snoc)
Strings
 Data.ByteString
 The FPS library provides mmapped and malloc'd packed strings (byte arrays held by a ForeignPtr), along with a list interface to these strings. It lets you do extremely fast I/O in Haskell; in some cases, even faster than typical C implementations, as well as conserving space.
 Data.Rope
 A wellknown tree data structure for manipulating strings without (too many) memory copies, contrarily to what Data.ByteString does. Also, contrarily to the functions in Data.ByteString.Lazy, most operations on ropes are in O(log n).
 Data.StorableVector
 This is very much the same like Data.ByteString extended to any Storable element type. There is a data structure for monolithic blocks and chunky sequences. Mutable access is also possible in ST monad.
 MissingH
 MissingH is a library of pureHaskell utility functions relating to strings, logging, and I/O.
 HsLocale
 A localeaware replacement for the standard IO routines, and support for wide strings
 VariableExpansion
 A library for variable expansion inside strings
 i18n strings
 At sourceforge
Regular expressions
There are many libraries for regular expressions available. By default GHC comes with:
 regexbase
 This defines the type classes for the new API
 regexcompat
 This defines the old Text.Regex.Posix API, using regexposix
 regexposix
 This is the interface to the old (and slow) posix backend (C's regex.h)
All backends provide String and ByteString interfaces.
Additional backend libraries are available for (hopefully) more efficient regular expression implementations:
 regexpcre
 PCREbased regexes. This requires libpcre (currently works against version 6.6.0 as of January 2007). This Haskell library and libpcre are both BSD. This is very fast but has left branch semantics instead of POSIX leftmost longest semantics.
 regextre
 TREbased regexes. This requires libtre (currently works against version 0.7.5 as of January 2007). Libtre aims to be POSIX compatible and is a much faster implementation that used by regexposix. Note that libtre does have an outstanding bug in correctly interpreting some regular expressions. This Haskell package is BSD3 licensed and libtre is LGPL.
 regexparsec
 Pure Haskell regexes implemented via parsec. The leftmost longest subexpression capture semantics of this library differ from the POSIX standard. This also has an option to prefer the left branch semantics that libpcre uses.
 regexdfa
 Pure Haskell DFA for regexes. This is licensed under LPGL (the above packages are all under BSD3) due to derived code. This is faster than regexparsec but does not return captured subexpressions. It also is being updated to handle repeated patterns that can match empty strings (the older version hangs, the new version rewrites the pattern internally).
 There is a regextdfa packages in development (not released) that will be a pure Haskell and BSD3 licensed implementation inspired by libtre. It aims to be a replacement for regexposix.
 Development versions (possible unstable or broken) of the above packages are also available under regexunstable. For support, please use the haskellcafe mailing list.
Spatial indices
There are several libraries containing generic code for maps indexed by 2D, 3D or Ndimensional points that support efficient nearestneighbour or range operations:
 2D:
 spacepart
 Contains quadtree implementation.
 3D:
 octree
 Simple unbalanced octrees.
 Of arbitrary dimensionality:
 pktree
 PKtrees support both radius and rectangular range search.
 kd tree
 kd trees also support efficient range and nearest neighbour search
Serialising data
 Data.Binary (darcs repository)
 A library for serialising binary values to and from lazy ByteStrings.
 NewBinary
 A port of Malcolm Wallace's Binary library from NHC, offering facilities for heap compression and binary I/O. The defacto standard for binary I/O in Haskell
 SerTH
 SerTH is a binary serialization library for Haskell. It supports serializing cyclic datatypes in a fast binary format. SerTH uses template haskell for deriving the serializing interface for new datatypes.
 AltBinary
 AltBinary is an exhaustive library that support binary I/O and serialization. It's part of Streams library, so serialization is possible to any I/O source, from String to memorymapped file. It's also backward compatible with NewBinary library what makes translation of old code trivial. Very fast, very featurerich, Hugs/GHC compatible, etc, etc...
 HsSyck
 YAML is a straightforward machine parsable data serialization format designed for human readability and interaction with dynamic languages. It is optimized for data serialization, configuration settings, log files, Internet messaging and filtering. Syck is an extension, written in C, for reading and writing YAML swiftly in popular scripting languages. It is part of core Ruby, and also has bindings for Perl 5, Python, Lua, Cocoa, and Perl 6. HsSyck provides Data.Yaml.Syck as an interface to YAML structures, using Data.ByteString for efficient textual data representation. Additionally, we provide a set of DrIFT rules to dump and load arbitrary Haskell data types in the YAML format.
 GenericSerialize
 GenericSerialize is a library which serializes data using the "Scrap Your Boilerplate" infrastructure. This means that while it cannot easily support datastructurespecific serialization, it can support many different data formats cheaply.
Compressing data
 Compressions2005
 Features of the Compression2005 Library (homepage)
 easy and uniform access to most competitive free compression algorithms as of April'05: LZMA, PPMd and GRZip
 all input/output performed via usersupplied functions (callbacks), so you can compress data in memory, files, pipes, sockets and anything else
 all parameters of compression algorithm are defined with a single string, for example "lzma:8mb:fast:hc4:fb32".
 Using this library, you can write a bziplike utility in one line, with better compression results than bzip2 itself.
 Zlib
 Zlib bindings for ByteStrings. darcs get http://code.haskell.org/zlib/ (docs)
 BZip2
 BZip2 bindings for ByteStrings. darcs get http://code.haskell.org/bzlib/ (docs)
Benchmarking data structures
 Auburn
 Auburn is Graeme Moss's kit for benchmarking implementations of lazy data structures. Give it several implementations of an ADT (abstract data type) and it will tell you which one is best for your particular application.
 Bench
 Simple time and space benchmarking for various listlike data structures. Easily adapted to arbitrary structures
Generic traversals
 RecLib A Recursion and Traversal Library for Haskell
 The Recursion Library for Haskell provides a rich set of generic traversal strategies to facilitate the flexible specification of generic term traversals. The underlying mechanism is the Scrap Your Boilerplate (SYB) approach. Most of the strategies that are used to implement recursion operators are taken from Stratego.
Typesafe variants of IntSet
and IntMap
for enumerations
 ChrisKuklewicz wrote a bit of boilerplate to reuse IntSet and IntMap with any Enum type: EnumSet EnumMap
 an efficient implementation of EnumSet for sets which fit into a machine word can be found in Edison (see above)