# Difference between revisions of "GHC.Generics"

(New page: GHC 7.2 includes a new ''generic deriving mechanism''. This means you can have more classes that you can <hask>derive</hask>, like <hask>Show</hask> or <hask>Functor</hask>. This is accomp...) |
|||

Line 83: | Line 83: | ||

-- Node, with information about the constructor |
-- Node, with information about the constructor |
||

:+: M1 C Con_Node ( |
:+: M1 C Con_Node ( |
||

− | -- Constructor argument, which could have information |
+ | -- Constructor argument, which could have information |

+ | -- about a record selector label |
||

M1 S NoSelector ( |
M1 S NoSelector ( |
||

-- Argument, tagged with P because it is a parameter |
-- Argument, tagged with P because it is a parameter |
||

− | K1 P a) |
+ | K1 P a) |

− | -- Another argument, tagged with R because it is |
+ | -- Another argument, tagged with R because it is |

+ | -- a recursive occurrence of a type |
||

:*: M1 S NoSelector (K1 R (UserTree a)) |
:*: M1 S NoSelector (K1 R (UserTree a)) |
||

-- Idem |
-- Idem |
||

:*: M1 S NoSelector (K1 R (UserTree a)) |
:*: M1 S NoSelector (K1 R (UserTree a)) |
||

+ | )) |
||

</haskell> |
</haskell> |
||

+ | |||

+ | A bit more complicated, but essentially the same. Datatypes like <hask>Data_UserTree</hask> are empty datatypes used only for providing meta-information in the representation; you don't have to worry much about them for now. All of this is explained in much more detail in Section 2.1. of [http://dreixel.net/research/pdf/gdmh.pdf the original paper describing the new generic deriving mechanism]. |
||

+ | |||

+ | ==== A generic function ===== |

## Revision as of 12:18, 4 May 2011

GHC 7.2 includes a new *generic deriving mechanism*. This means you can have more classes that you can `derive`

, like `Show`

or `Functor`

. This is accomplished through two new features, enabled with two new flags: `DeriveRepresentable` and `DefaultSignatures`. We'll show how this all works in a detailed example.

## Contents

## Serialization

Suppose you are writing a class for serialization of data. You have a type `Bit`

representing bits, and a class `Serialize`

:

```
data Bit = O | I
class Serialize a where
put :: a -> [Bit]
```

You might have written some instances already:

```
instance Serialize Int where
put i = serializeInt i
instance Serialize a => Serialize [a] where
put [] = []
put (h:t) = put h ++ put t
```

A user of your library, however, will have their his datatypes, like:

```
data UserTree a = Node a (UserTree a) (UserTree a) | Leaf
```

He will have to specify an `instance Serialize (UserTree a) where ...`

himself. This, however, is tedious, especially because most instances will probably be rather trivial, and should be derived automatically.

It is here that generic programming can help you. If you are familiar with SYB you could use it at this stage, but now we'll see how to do this with the new generic deriving mechanism.

### Generic serialization

First you have to tell the compiler how to serialize any datatype, in general. Since Haskell datatypes have a regular structure, this means you can just explain how to serialize a few basic datatypes.

#### Representation types

We can represent most Haskell datatypes using only the following primitive types:

```
-- | Void: used for datatypes without constructors
data V1 p
-- | Unit: used for constructors without arguments
data U1 p = U1
-- | Constants, additional parameters and recursion of kind *
newtype K1 i c p = K1 { unK1 :: c }
-- | Meta-information (constructor names, etc.)
newtype M1 i c f p = M1 { unM1 :: f p }
-- | Sums: encode choice between constructors
infixr 5 :+:
data (:+:) f g p = L1 (f p) | R1 (g p)
-- | Products: encode multiple arguments to constructors
infixr 6 :*:
data (:*:) f g p = f p :*: g p
```

For starters, try to ignore the `p` parameter in all types; it's there just for future compatibility. The easiest way to understand how you can use these types to represent others is to see an example. Let's represent the `UserTree`

type shown before:

```
type RepUserTree a =
-- A UserTree is either a Leaf, which has no arguments
U1
-- ... or it is a Node, which has three arguments that we put in a product
:+: a :*: UserTree a :*: UserTree a
```

Simple, right? Different constructors become alternatives of a sum, and multiple arguments become products. In fact, we want to have some more information in the representation, like datatype and constructor names, and to know if a product argument is a parameter or a type. We use the other primitives for this, and the representation looks more like:

```
type RealRepUserTree a =
-- Information about the datatype
M1 D Data_UserTree (
-- Leaf, with information about the constructor
M1 C Con_Leaf U1
-- Node, with information about the constructor
:+: M1 C Con_Node (
-- Constructor argument, which could have information
-- about a record selector label
M1 S NoSelector (
-- Argument, tagged with P because it is a parameter
K1 P a)
-- Another argument, tagged with R because it is
-- a recursive occurrence of a type
:*: M1 S NoSelector (K1 R (UserTree a))
-- Idem
:*: M1 S NoSelector (K1 R (UserTree a))
))
```

A bit more complicated, but essentially the same. Datatypes like `Data_UserTree`

are empty datatypes used only for providing meta-information in the representation; you don't have to worry much about them for now. All of this is explained in much more detail in Section 2.1. of the original paper describing the new generic deriving mechanism.