(→Type families in GHC)
(→What do I need to use type families?)
Revision as of 09:10, 4 October 2008
1 Type families in GHC
Indexed type families are a new GHC extension to facilitate type-level programming. Type families are a generalisation of associated data types and associated type synonyms, and are described in a recent ICFP paper. They essentially provide type-indexed data types and named functions on types, which are useful for generic programming and highly parameterised library interfaces as well as interfaces with enhanced static information, much like dependent types. They might also be regarded as an alternative to functional dependencies, but provide a more functional style of type-level programming than the relational style of functional dependencies.
2 What do I need to use type families?
The first stable release of GHC that properly supports type families is 6.10.1. (An early partial implementation was part of the 6.8 release, but its use is deprecated.) Please report bugs via the GHC bug tracker, ideally accompanied by a small example program that demonstrates the problem. Use the GHC mailing list for questions or for a discussion of this language extension and its description on this wiki page.
3 What are type families?
Indexed type families, or type families for short, are type constructors that represent sets of types. Set members are denoted by supplying the type family constructor with type parameters, which are called type indices. The difference between vanilla parametrised type constructors and family constructors is much like between parametrically polymorphic functions and (ad-hoc polymorphic) methods of type classes. Parametric polymorphic functions behave the same at all type instances, whereas class methods can change their behaviour in dependence on the class type parameters. Similarly, vanilla type constructors imply the same data representation for all type instances, but family constructors can have varying representation types for varying type indices.
Indexed type families come in two flavours: data families and type synonym families. They are the indexed family variants of algebraic data types and type synonyms, respectively. The instances of data families can be data types and newtypes.
4 An associated data type example
As an example, consider Ralf Hinze's generalised tries, a form of generic finite maps.
4.1 The class declaration
We define a type class whose instances are the types that we can use as keys in our generic maps:
class GMapKey k where data GMap k :: * -> * empty :: GMap k v lookup :: k -> GMap k v -> Maybe v insert :: k -> v -> GMap k v -> GMap k v
4.2 An Int instanceTo use Ints as keys into generic maps, we declare an instance that simply uses
instance GMapKey Int where data GMap Int v = GMapInt (Data.Map.Map Int v) empty = GMapInt Data.Map.empty lookup k (GMapInt m) = Data.Map.lookup k m insert k v (GMapInt m) = GMapInt (Data.Map.insert k v m)
NB: At the moment, GADT syntax is not allowed for associated data types (or other indexed types). This is not a fundamental limitation, but just a shortcoming of the current implementation, which we expect to lift in the future.As an exercise, implement an instance for
4.3 A unit instanceGeneric definitions, apart from elementary types, typically cover units, products, and sums. We start here with the unit instance for
instance GMapKey () where data GMap () v = GMapUnit (Maybe v) empty = GMapUnit Nothing lookup () (GMapUnit v) = v insert () v (GMapUnit _) = GMapUnit $ Just v
4.4 Product and sum instancesNext, let us define the instances for pairs and sums (i.e.,
instance (GMapKey a, GMapKey b) => GMapKey (a, b) where data GMap (a, b) v = GMapPair (GMap a (GMap b v)) empty = GMapPair empty lookup (a, b) (GMapPair gm) = lookup a gm >>= lookup b insert (a, b) v (GMapPair gm) = GMapPair $ case lookup a gm of Nothing -> insert a (insert b v empty) gm Just gm2 -> insert a (insert b v gm2 ) gm instance (GMapKey a, GMapKey b) => GMapKey (Either a b) where data GMap (Either a b) v = GMapEither (GMap a v) (GMap b v) empty = GMapEither empty empty lookup (Left a) (GMapEither gm1 _gm2) = lookup a gm1 lookup (Right b) (GMapEither _gm1 gm2 ) = lookup b gm2 insert (Left a) v (GMapEither gm1 gm2) = GMapEither (insert a v gm1) gm2 insert (Right a) v (GMapEither gm1 gm2) = GMapEither gm1 (insert a v gm2)
4.5 Using a generic map
Finally, some code building and querying a generic map:
myGMap :: GMap (Int, Either Char ()) String myGMap = insert (5, Left 'c') "(5, Left 'c')" $ insert (4, Right ()) "(4, Right ())" $ insert (5, Right ()) "This is the one!" $ insert (5, Right ()) "This is the two!" $ insert (6, Right ()) "(6, Right ())" $ insert (5, Left 'a') "(5, Left 'a')" $ empty main = putStrLn $ maybe "Couldn't find key!" id $ lookup (5, Right ()) myGMap
4.6 Download the codeIf you want to play with this example without copying it off the wiki, just download the source code for <div class="inline-code">
5 Detailed definition of data families
Data families appear in two flavours: (1) they can be defined on the toplevel or (2) they can appear inside type classes (in which case they are known as associated types). The former is the more general variant, as it lacks the requirement for the type-indexes to coincide with the class parameters. However, the latter can lead to more clearly structured code and compiler warnings if some type instances were - possibly accidentally - omitted. In the following, we always discuss the general toplevel form first and then cover the additional constraints placed on associated types.
5.1 Family declarations
Indexed data families are introduced by a signature, such as
data family GMap k :: * -> *
data family Array e
data family Array :: * -> *
5.1.1 Associated family declarationsWhen a data family is declared as part of a type class, we drop the
class GMapKey k where data GMap k :: * -> * ...
In contrast to toplevel declarations, named arguments must be used for all type parameters that are to be used as type-indexes. Moreover, the argument names must be class parameters. Each class parameter may only be used at most once per associated type, but some may be omitted and they may be in an order other than in the class head. Hence, the following contrived example is admissible:
class C a b c where data T c a :: *
5.2 Instance declarationsInstance declarations of data and newtype families are very similar to standard data and newtype declarations. The only two differences are that the keyword
data instance GMap (Either a b) v = GMapEither (GMap a v) (GMap b v)
In this example, the declaration has only one variant. In general, it can be any number.Data and newtype instance declarations are only legit when an appropriate family declaration is in scope - just like class instances require the class declaration to be visible. Moreover, each instance declaration has to conform to the kind determined by its family declaration. This implies that the number of parameters of an instance declaration matches the arity determined by the kind of the family. Although, all data families are declared with the
Even if type families are defined as toplevel declarations, functions that perform different computations for different family instances still need to be defined as methods of type classes. In particular, the following is not possible:
data family T a data instance T Int = A data instance T Char = B nonsence :: T a -> Int nonsence A = 1 -- WRONG: These two equations together... nonsence B = 2 -- ...will produce a type error.
Given the functionality provided by GADTs (Generalised Algebraic Data Types), it might seem as if a definition, such as the above, should be feasible. However, type families are - in contrast to GADTs - are open; i.e., new instances can always be added, possibly in other modules. Supporting pattern matching across different data instances would require a form of extensible case construct.
5.2.1 Associated type instancesWhen an associated family instance is declared within a type class instance, we drop the
instance (GMapKey a, GMapKey b) => GMapKey (Either a b) where data GMap (Either a b) v = GMapEither (GMap a v) (GMap b v ...
5.2.2 Scoping of class parameters
In the case of multi-parameter type classes, the visibility of class parameters in the right-hand side of associated family instances depends solely on the parameters of the data family. As an example, consider the simple class declaration
class C a b where data T a
Only one of the two class parameters is a parameter to the data family. Hence, the following instance declaration is invalid:
instance C [c] d where data T [c] = MkT (c, d) -- WRONG!! 'd' is not in scope
5.2.3 Type class instances of family instancesType class instances of instances of data families can be defined as usual, and in particular data instance declarations can have
data GMap () v = GMapUnit (Maybe v) deriving Show
which implicitly defines an instance of the form
instance Show v => Show (GMap () v) where ...
Note that class instances are always for particular instances of a data family and never for an entire family as a whole. This for essentially the same reasons that we cannot define a toplevel function that performs pattern matching on the data constructors of different instances of a single type family. It would require a form of extensible case construct.
The instance declarations of a data family used in a single program may not overlap at all, independent of whether they are associated or not. In contrast to type class instances, this is not only a matter of consistency, but one of type safety.
5.3 Import and exportThe association of data constructors with type families is more dynamic than that is the case with standard data and newtype declarations. In the standard case, the notation
5.3.1 Associated familiesAs expected, an import or export item of the form
5.3.2 ExamplesAssuming our running
- : Exports just the class name.module GMap (GMapKey) where...
- : Exports the class, the associated typemodule GMap (GMapKey(..)) where...and the member functionsGMap,empty, andlookup. None of the data constructors is exported.insert
- : As before, but also exports all the data constructorsmodule GMap (GMapKey(..), GMap(..)) where...,GMapInt,GMapChar,GMapUnit, andGMapPair.GMapUnit
- : As before.module GMap (GMapKey(empty, lookup, insert), GMap(..)) where...
- : As before.module GMap (GMapKey, empty, lookup, insert, GMap(..)) where...
Family instances are implicitly exported, just like class instances. However, this applies only to the heads of instances, not to the data constructors an instance defines.
6 An associated type synonym example
Type synonym families are an alternative to functional dependencies, which makes functional dependency examples well suited to introduce type synonym families. In fact, type families are a more functional way to express the same as functional dependencies (despite the name!), as they replace the relational notation of functional dependencies by an expression-oriented notation; i.e., functions on types are really represented by functions and not relations.
6.1 The class declaration
Here's an example from Mark Jones' seminal paper on functional dependencies:
class Collects e ce | ce -> e where empty :: ce insert :: e -> ce -> ce member :: e -> ce -> Bool toList :: ce -> [e]
With associated type synonyms we can write this as
class Collects ce where type Elem ce empty :: ce insert :: Elem ce -> ce -> ce member :: Elem ce -> ce -> Bool toList :: ce -> [Elem ce]
6.2 An instance
Instances change in correspondingly. An instance of the two-parameter class
instance Eq e => Collects e [e] where empty =  insert e l = (e:l) member e  = False member e (x:xs) | e == x = True | otherwise = member e xs toList l = l
become an instance of a single-parameter class, where the dependant type parameter turns into an associated type instance declaration:
instance Eq e => Collects [e] where type Elem [e] = e empty =  insert e l = (e:l) member e  = False member e (x:xs) | e == x = True | otherwise = member e xs toList l = l
6.3 Using generic collections
With Functional Dependencies the code would be:
sumCollects :: (Collects e c1, Collects e c2) => c1 -> c2 -> c2 sumCollects c1 c2 = foldr insert c2 (toList c1)
In contrast, with associated type synonyms, we get:
sumCollects :: (Collects c1, Collects c2, Elem c1 ~ Elem c2) => c1 -> c2 -> c2 sumCollects c1 c2 = foldr insert c2 (toList c1)
7 Detailed definition of type synonym families
Type families appear in two flavours: (1) they can be defined on the toplevel or (2) they can appear inside type classes (in which case they are known as associated type synonyms). The former is the more general variant, as it lacks the requirement for the type-indexes to coincide with the class parameters. However, the latter can lead to more clearly structured code and compiler warnings if some type instances were - possibly accidentally - omitted. In the following, we always discuss the general toplevel form first and then cover the additional constraints placed on associated types.
7.1 Family declarations
Indexed type families are introduced by a signature, such as
type family Elem c :: *
type family Elem c
Parameters can also be given explicit kind signatures if needed. We call the number of parameters in a type family declaration, the family's arity, and all applications of a type family must be fully saturated w.r.t. to that arity. This requirement is unlike ordinary type synonyms and it implies that the kind of a type family is not sufficient to determine a family's arity, and hence in general, also insufficient to determine whether a type family application is well formed. As an example, consider the following declaration:
type family F a b :: * -> * -- F's arity is 2, -- although it's overall kind is * -> * -> * -> *
Given this declaration the following are examples of well-formed and malformed types:
F Char [Int] -- OK! Kind: * -> * F Char [Int] Bool -- OK! Kind: * F IO Bool -- WRONG: kind mismatch in the first argument F Bool -- WRONG: unsaturated application
7.1.1 Associated family declarationsWhen a type family is declared as part of a type class, we drop the
class Collects ce where type Elem ce :: * ...
The argument names of the type family must be class parameters. Each class parameter may only be used at most once per associated type, but some may be omitted and they may be in an order other than in the class head. Hence, the following contrived example is admissible:
class C a b c where type T c a :: *
These rules are exactly as for associated data families.
7.2 Instance declarationsInstance declarations of type families are very similar to standard type synonym declarations. The only two differences are that the keyword
type instance Elem [e] = e
Type family instance declarations are only legitimate when an appropriate family declaration is in scope - just like class instances require the class declaration to be visible. Moreover, each instance declaration has to conform to the kind determined by its family declaration, and the number of type parameters in an instance declaration must match the number of type parameters in the family declaration. Finally, the right-hand side of a type instance must be a monotype; i.e., it may not include explicit foralls. Here are some examples of admissible and illegal type instances:
type family F a :: * type instance F [Int] = Int -- OK! type instance F String = Char -- OK! type instance F (F a) = a -- WRONG: type parameter mentions a type family type instance F (forall a. (a, b)) = b -- WRONG: a forall type appears in a type parameter type instance F Float = forall a.a -- WRONG: right-hand side may not be a forall type type family G a b :: * -> * type instance G Int = (,) -- WRONG: must be two type parameters type instance G Int Char Float = Double -- WRONG: must be two type parameters
7.2.1 Associated type instancesWhen an associated family instance is declared within a type class instance, we drop the
instance (Eq (Elem [e])) => Collects ([e]) where type Elem [e] = e ...
The instance declarations of a type family used in a single program may only overlap if the right-hand sides of the overlapping instances coincide for the overlapping types. More formally, two instance declarations overlap if there is a substitution that makes the left-hand sides of the instances syntactically the same. Whenever that is the case, the right-hand sides of the instances must also be syntactically equal under the same substitution. This condition is independent of whether the type family is associated or not, and it is not only a matter of consistency, but one of type safety.
Here are two example to illustrate the condition under which overlap is permitted.
type instance F (a, Int) = [a] type instance F (Int, b) = [b] -- overlap permitted type instance G (a, Int) = [a] type instance G (Char, a) = [a] -- ILLEGAL overlap, as [Char] /= [Int]
In order to guarantee that type inference in the presence of type families decidable, we need to place a number of additional restrictions on the formation of type instance declarations. Instance declarations have the general form
type instance F t1 .. tn = t
- the total number of symbols (type constructors and type variables) in is strictly smaller than ins1 .. sm,t1 .. tn
- for every type variable ,aoccurs inaat most as often as ins1 .. sm, andt1 .. tn
- do not contain any type family constructors.s1 .. sm
These restrictions are a conservative approximation of the exact conditions for ensuring decidability. They have been chosen such that a compiler can check them and that a programmer can easily decide whether a program meets the restrictions. Consequently, they do exclude perfectly fine programs, just as this the case with the restrictions on type class instances. A recent draft paper describes the reasons for these restrictions in more detail. If the option -fallow-undecidable-instances is passed to the compiler, the discussed restrictions do not apply, and it is the programmers responsibility to ensure that type inferences stay decidable.
7.3 Equality constraintsType context can include equality constraints of the form
sumCollects :: (Collects c1, Collects c2, Elem c1 ~ Elem c2) => c1 -> c2 -> c2
Equality constraints can also appear in class and instance contexts. The former enable a simple translation of programs using functional dependencies into programs using family synonyms instead. The general idea is to rewrite a class declaration of the form
class C a b | a -> b
class (F a ~ b) => C a b where type F a
- Associated Types with Class. Manuel M. T. Chakravarty, Gabriele Keller, Simon Peyton Jones, and Simon Marlow. In Proceedings of The 32nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL'05), pages 1-13, ACM Press, 2005.
- Associated Type Synonyms. Manuel M. T. Chakravarty, Gabriele Keller, and Simon Peyton Jones. In Proceedings of The Tenth ACM SIGPLAN International Conference on Functional Programming, ACM Press, pages 241-253, 2005.
- System F with Type Equality Coercions. Martin Sulzmann, Manuel M. T. Chakravarty, Simon Peyton Jones, and Kevin Donnelly. In Proceedings of The Third ACM SIGPLAN Workshop on Types in Language Design and Implementation, ACM Press, 2007.
- Towards Open Type Functions for Haskell. Tom Schrijvers, Martin Sulzmann, Simon Peyton-Jones, and Manuel M. T. Chakravarty. Unpublished manuscript, 2007.