Generics

From HaskellWiki
Revision as of 02:44, 13 February 2021 by Ysangkok (talk | contribs) (explain why i mention java at all)

Datatype-generic programming, often just called generic programming or generics in Haskell, is a form of abstraction that allows you to define functions that operate on a large class of datatypes. This page summarises a number of popular approaches to generic programming that are commonly used with GHC. For a more in-depth introduction to generic programming in general, have a look at Gibbons' Datatype-Generic Programming, or the Libraries for Generic Programming paper.

Note that the GHC.Generics module mentioned below is arguably closer to Java's "reflection" than to what Java confusingly calls "generics", which is a form of polymorphism. Haskell's polymorphism can often be erased at compile time, much like Java's "type erasure", where the polymorphism is not present in compiled code. The word "generic" is therefore somewhat overloaded, and it may be preferable to say "polymorphic" when that is what is meant.

What is generic programming?

Haskell is a polymorphic language. This means that you can have a single datatype for lists:

data List a = Nil | Cons a (List a)

These lists can contain any type of information, such as integers, Booleans, or even other lists. Since the length of a list does not depend on the type of its elements, there is also a single definition for list length:

length :: List a -> Int
length Nil        = 0
length (Cons _ t) = 1 + length t

The length function can also be defined on datatypes other than lists. Consider a datatype for trees:

data Tree a = Leaf | Bin a (Tree a) (Tree a)

You can also compute the length of a tree (or its size, if you prefer) by recursively traversing the tree and counting its elements. Generic programming allows you to define a single length function that can operate on lists, trees, and many other datatypes. This reduces code duplication and makes code more robust to change, because you can modify your datatypes without having to adapt the generic functions that operate on them.
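
To make the duplication concrete, here is a sketch of what a hand-written version for trees might look like. The name treeLength is chosen here purely for illustration; it mirrors the list definition above, and a function of this shape is what a generic length would make unnecessary.

```haskell
-- The tree datatype from the text.
data Tree a = Leaf | Bin a (Tree a) (Tree a)

-- Hand-written counterpart of the list length (illustrative name):
-- count one for every element stored in a Bin node.
treeLength :: Tree a -> Int
treeLength Leaf        = 0
treeLength (Bin _ l r) = 1 + treeLength l + treeLength r
```

Every new container type would need another function of this shape.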

We now look at some approaches to generic programming in Haskell.

GHC.Generics

The GHC.Generics module, available with GHC since version 7.2, lets you define classes whose instances need no hand-written method implementations, much as instances of Show can be derived automatically. It is described in a separate wiki page.
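
As a rough illustration, the length function from the introduction could be written once with GHC.Generics along the following lines. This is only a sketch: the class names Size and GSize are invented for this example and are not part of the module, and only the representation types needed for List and Tree are handled (there is, for instance, no instance for composition, :.:).

```haskell
{-# LANGUAGE DefaultSignatures #-}
{-# LANGUAGE DeriveGeneric     #-}
{-# LANGUAGE FlexibleContexts  #-}
{-# LANGUAGE TypeOperators     #-}

import GHC.Generics

-- Hypothetical user-facing class: counts the elements of type a.
-- The default method works for any type with a Generic1 instance.
class Size t where
  size :: t a -> Int
  default size :: (Generic1 t, GSize (Rep1 t)) => t a -> Int
  size = gsize . from1

-- Hypothetical worker class over the representation types.
class GSize f where
  gsize :: f a -> Int

instance GSize V1       where gsize _ = 0  -- datatype without constructors
instance GSize U1       where gsize _ = 0  -- constructor without fields
instance GSize Par1     where gsize _ = 1  -- a field holding one element
instance GSize (K1 i c) where gsize _ = 0  -- a field not mentioning a

-- A recursive occurrence of a container: fall back to its Size instance.
instance Size t => GSize (Rec1 t) where
  gsize (Rec1 x) = size x

-- Metadata wrappers carry no elements of their own.
instance GSize f => GSize (M1 i c f) where
  gsize (M1 x) = gsize x

-- Choice between constructors.
instance (GSize f, GSize g) => GSize (f :+: g) where
  gsize (L1 x) = gsize x
  gsize (R1 x) = gsize x

-- Several fields in one constructor.
instance (GSize f, GSize g) => GSize (f :*: g) where
  gsize (x :*: y) = gsize x + gsize y

data List a = Nil | Cons a (List a)          deriving Generic1
data Tree a = Leaf | Bin a (Tree a) (Tree a) deriving Generic1

-- Empty instance bodies: the default method does all the work.
instance Size List
instance Size Tree
```

With these instances, size (Cons 'a' (Cons 'b' Nil)) evaluates to 2 and size (Bin 'x' Leaf Leaf) evaluates to 1, without any traversal code written specifically for List or Tree.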

SYB

Scrap Your Boilerplate (SYB), available with GHC since version 6.0, is an earlier approach to generic programming, particularly well suited for traversals and transformations over large trees. It has its own wiki page.
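
The following sketch gives a flavour of an SYB traversal and an SYB query. It assumes the syb package is installed (it provides the Data.Generics module); the Expr datatype and the functions simplify and sumLits are made up for this example.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}

import Data.Generics (Data, Typeable, everything, everywhere, mkQ, mkT)

-- A small expression type, invented for illustration.
data Expr
  = Lit Int
  | Add Expr Expr
  | Neg Expr
  deriving (Show, Eq, Data, Typeable)

-- Transformation: remove double negations wherever they occur.
-- everywhere applies the rewrite bottom-up to every subterm;
-- mkT turns a function on Expr into one that leaves other types alone.
simplify :: Expr -> Expr
simplify = everywhere (mkT step)
  where
    step :: Expr -> Expr
    step (Neg (Neg e)) = e
    step e             = e

-- Query: add up all literals. mkQ yields 0 for every subterm
-- that is not an Expr, and everything combines the results with (+).
sumLits :: Expr -> Int
sumLits = everything (+) (mkQ 0 lit)
  where
    lit :: Expr -> Int
    lit (Lit n) = n
    lit _       = 0
```

Neither function mentions Add, and neither would need to change if further constructors were added to Expr, which is why the approach suits large syntax trees.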

Uniplate

Uniplate is available as a library on Hackage. It is similar in nature to SYB, but uses simpler types. For more information see its webpage.
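
For comparison, here is a sketch of a similar traversal and query written with Uniplate. It assumes the uniplate package is installed and uses its Data.Generics.Uniplate.Data module, which works with any type that has a Data instance; the Expr datatype and the functions literals and simplify are made up for this example.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}

import Data.Data (Data, Typeable)
import Data.Generics.Uniplate.Data (transform, universe)

-- A small expression type, invented for illustration.
data Expr
  = Lit Int
  | Add Expr Expr
  | Neg Expr
  deriving (Show, Eq, Data, Typeable)

-- Query: universe lists the expression and all of its
-- subexpressions, so a list comprehension can pick out the literals.
literals :: Expr -> [Int]
literals e = [n | Lit n <- universe e]

-- Transformation: transform applies the rewrite bottom-up
-- to every subexpression of the same type.
simplify :: Expr -> Expr
simplify = transform step
  where
    step (Neg (Neg e)) = e
    step e             = e
```

Both functions have ordinary first-order types, with no rank-2 types or wrappers such as mkT, which is the sense in which Uniplate's types are simpler.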

Multirec

Multirec is a library for generic programming with fixed points, supporting mutually recursive families of datatypes, and allowing functionality such as folds or the zipper. For more information, see its webpage.

RepLib

RepLib is a library for generic programming based on representation types (and GADTs). It is available through Hackage and Google Code. It is one of the more expressive libraries and supports SYB-style traversals as well as other styles of generic programming. The Unbound library uses RepLib to automatically derive operations for name binding, including alpha-equivalence, free variable calculation and substitution.