Difference between revisions of "GADTs for dummies"
(slightly less paranoidsounding rephrase) 

(21 intermediate revisions by 13 users not shown)  
Line 1:  Line 1:  
−  long time i don't understood what GADT is and how they can be used. it 

+  [[Category:Tutorials]] 

−  was sort of conspiracy of silence  people who understand GADTs think 

+  [[Category:Language extensions]] 

−  that this is obvious thing and don't need any explanation but i still 

−  can't understood. 

−  now i got an idea and think that it was really obvious :) and i want 

+  For a long time, I didn't understand what [[Generalised algebraic datatype]]s were or how they could be used. It almost seemed a conspiracy of silence — people who understood GADTs thought 

−  to share my understanding  may be my way to realize GADTs can help 

+  that everything was obvious and didn't need further explanation, but I still 

−  someone else. so 

+  couldn't understand them. 

+  
+  Now that I have an idea of how it works, I think that it was really obvious. :) So, I want to share my understanding of GADTs. Maybe the way I realized how GADTs work could help 

+  someone else. 

+  
+  There are plenty of GADTrelated papers, but the best one for beginners 

+  is [http://www.cs.ox.ac.uk/ralf.hinze/publications/With.pdf "Fun with phantom types"]. 

+  Phantom types is another name of GADT. You should also know that this 

+  paper uses old GADT syntax. This paper is a mustread because it 

+  contains several examples of practical GADT usage  a theme completely 

+  omitted from this article. For example, the paper outlines how to do binary compression 

+  on datatypes without using typeclasses, 

+  and suggests how to make a kind of printf function. 

== Type functions == 
== Type functions == 

−  "data" declaration is a way to declare both type constructor and data 
+  A "data" declaration is a way to declare both type constructor and data 
constructors. For example, 
constructors. For example, 

+  <haskell> 

data Either a b = Left a  Right b 
data Either a b = Left a  Right b 

+  </haskell> 

declares type constructor "Either" and two data constructors "Left" 
declares type constructor "Either" and two data constructors "Left" 

−  and "Right". Ordinary Haskell functions 
+  and "Right". Ordinary Haskell functions work with data constructors: 
+  <haskell> 

isLeft (Left a) = True 
isLeft (Left a) = True 

isLeft (Right b) = False 
isLeft (Right b) = False 

+  </haskell> 

−  but there is also 
+  but there is also an analogous way to work with type constructors! 
+  <haskell> 

type X a = Either a a 
type X a = Either a a 

+  </haskell> 

−  declares TYPE FUNCTION named "X". 
+  declares a TYPE FUNCTION named "X". Its parameter "a" must be some type 
−  and it returns some type as result. We can't use "X" on data values, 
+  and it returns some type as its result. We can't use "X" on data values, 
but we can use it on type values. Type constructors declared with 
but we can use it on type values. Type constructors declared with 

"data" statements and type functions declared with "type" statements 
"data" statements and type functions declared with "type" statements 

−  used together to build arbitrarily complex types. In such 
+  can be used together to build arbitrarily complex types. In such 
"computations" type constructors serves as basic "values" and type 
"computations" type constructors serves as basic "values" and type 

functions as a way to process them. 
functions as a way to process them. 

−  Indeed, type functions in Haskell 
+  Indeed, type functions in Haskell are very limited compared to 
−  ordinary functions  they don't support pattern matching 
+  ordinary functions  they don't support pattern matching, 
−  nor multiple statements, nor recursion 
+  nor multiple statements, nor recursion. 
−  
−  
== Hypothetical Haskell extension  Fullfeatured type functions == 
== Hypothetical Haskell extension  Fullfeatured type functions == 

−  Let's build hypothetical Haskell extension 
+  Let's build a hypothetical Haskell extension that mimics, for type 
−  functions the wellknown ways to define ordinary functions, including 
+  functions, the wellknown ways to define ordinary functions, including 
pattern matching: 
pattern matching: 

+  <haskell> 

type F [a] = Set a 
type F [a] = Set a 

+  </haskell> 

−  multiple statements ( 
+  multiple statements (this is meaningful only in the presence of pattern matching): 
+  <haskell> 

type F Bool = Char 
type F Bool = Char 

F String = Int 
F String = Int 

+  </haskell> 

−  and recursion ( 
+  and recursion (which again needs pattern matching and multiple statements): 
+  <haskell> 

type F [a] = F a 
type F [a] = F a 

F (Map a b) = F b 
F (Map a b) = F b 

F (Set a) = F a 
F (Set a) = F a 

F a = a 
F a = a 

+  </haskell> 

−  +  As you may already have guessed, this last definition calculates a simple base type of arbitrarilynested collections, e.g.: 

−  of arbitrarilynested collection: F [[[Set Int]]] = Int 

+  <haskell> 

+  F [[[Set Int]]] = 

+  F [[Set Int]] = 

+  F [Set Int] = 

+  F (Set Int) = 

+  F Int = 

+  Int 

+  </haskell> 

−  Let's 
+  Let's not forget about statement guards: 
+  <haskell> 

type F a  IsSimple a == TrueType = a 
type F a  IsSimple a == TrueType = a 

+  </haskell> 

−  Here we define type function F only for simple datatypes by using 
+  Here we define type function F only for simple datatypes by using a 
guard type function "IsSimple": 
guard type function "IsSimple": 

+  <haskell> 

type IsSimple Bool = TrueType 
type IsSimple Bool = TrueType 

IsSimple Int = TrueType 
IsSimple Int = TrueType 

Line 75:  Line 108:  
data TrueType = T 
data TrueType = T 

data FalseType = F 
data FalseType = F 

−  
+  </haskell> 

These definitions seem a bit odd, and while we are in imaginary land, 
These definitions seem a bit odd, and while we are in imaginary land, 

let's consider a way to write this shorter: 
let's consider a way to write this shorter: 

+  <haskell> 

type F a  IsSimple a = a 
type F a  IsSimple a = a 

Line 86:  Line 120:  
.... 
.... 

IsSimple Double 
IsSimple Double 

+  </haskell> 

−  Here, we 
+  Here, we defined a list of simple types. The implied result of all 
−  written statements for "IsSimple" is True 
+  written statements for "IsSimple" is True, and False for 
−  +  everything else. Essentially, "IsSimple" is a TYPE PREDICATE! 

−  I really love it! :) How about constructing predicate that 
+  I really love it! :) How about constructing a predicate that traverses a 
complex type trying to decide whether it contains "Int" anywhere? 
complex type trying to decide whether it contains "Int" anywhere? 

+  <haskell> 

type HasInt Int 
type HasInt Int 

HasInt [a] = HasInt a 
HasInt [a] = HasInt a 

Line 99:  Line 135:  
HasInt (Map a b)  HasInt a 
HasInt (Map a b)  HasInt a 

HasInt (Map a b)  HasInt b 
HasInt (Map a b)  HasInt b 

+  </haskell> 

−  or type function that substitutes one type with another inside 
+  or a type function that substitutes one type with another inside 
−  +  arbitrarilydeep types: 

+  <haskell> 

type Replace t a b  t==a = b 
type Replace t a b  t==a = b 

Replace [t] a b = [Replace t a b] 
Replace [t] a b = [Replace t a b] 

Line 108:  Line 146:  
Replace (Map t1 t2) a b = Map (Replace t1 a b) (Replace t2 a b) 
Replace (Map t1 t2) a b = Map (Replace t1 a b) (Replace t2 a b) 

Replace t a b = t 
Replace t a b = t 

−  
+  </haskell> 

−  
== One more hypothetical extension  multivalue type functions == 
== One more hypothetical extension  multivalue type functions == 

Line 116:  Line 153:  
extension  type functions that may have MULTIPLE VALUES. Say, 
extension  type functions that may have MULTIPLE VALUES. Say, 

+  <haskell> 

type Collection a = [a] 
type Collection a = [a] 

Collection a = Set a 
Collection a = Set a 

Collection a = Map b a 
Collection a = Map b a 

+  </haskell> 

So, "Collection Int" has "[Int]", "Set Int" and "Map String Int" as 
So, "Collection Int" has "[Int]", "Set Int" and "Map String Int" as 

Line 124:  Line 163:  
"Int". 
"Int". 

−  Pay attention to 
+  Pay attention to the last statement of the "Collection" definition, where 
−  we 
+  we used the type variable "b" that was not mentioned on the left side, 
−  nor defined in any other way. 
+  nor defined in any other way. Since it's perfectly possible for the 
−  "Collection" function 
+  "Collection" function to have multiple values, using some free variable on 
−  +  the right side that can be replaced with any type is not a problem 

−  at all 
+  at all. "Map Bool Int", "Map [Int] Int" and "Map Int Int" all are 
possible values of "Collection Int" along with "[Int]" and "Set Int". 
possible values of "Collection Int" along with "[Int]" and "Set Int". 

−  +  At first glance, it seems that multiplevalue functions are meaningless  they 

−  +  can't be used to define datatypes, because we need concrete types here. But 

−  +  if we take another look, they can be useful to define type constraints and 

−  +  type families. 

−  We can also represent multiplevalue function as predicate: 
+  We can also represent a multiplevalue function as a predicate: 
+  <haskell> 

type Collection a [a] 
type Collection a [a] 

Collection a (Set a) 
Collection a (Set a) 

Collection a (Map b a) 
Collection a (Map b a) 

+  </haskell> 

−  If you 
+  If you're familiar with Prolog, you should know that a predicate, in contrast to 
−  function, is multi 
+  a function, is a multidirectional thing  it can be used to deduce any 
−  parameter from other ones. For example, in this hypothetical definition: 
+  parameter from the other ones. For example, in this hypothetical definition: 
+  <haskell> 

head  Collection Int a :: a > Int 
head  Collection Int a :: a > Int 

+  </haskell> 

−  we define 'head' function for any Collection containing Ints. 
+  we define a 'head' function for any Collection containing Ints. 
And in this, again, hypothetical definition: 
And in this, again, hypothetical definition: 

+  <haskell> 

data Safe c  Collection c a = Safe c a 
data Safe c  Collection c a = Safe c a 

+  </haskell> 

we deduced element type 'a' from collection type 'c' passed as the 
we deduced element type 'a' from collection type 'c' passed as the 

Line 162:  Line 207:  
== Back to real Haskell  type classes == 
== Back to real Haskell  type classes == 

−  +  After reading about all of these glorious examples, you may be wondering 

−  +  "Why doesn't Haskell support fullfeatured type functions?" Hold your breath... 

−  Haskell already contains them and 
+  Haskell already contains them, and GHC has implemented all of the 
−  mentioned 
+  capabilities mentioned above for more than 10 years! They were just named... 
−  TYPE CLASSES! Let's translate all our examples to their language: 
+  TYPE CLASSES! Let's translate all of our examples to their language: 
+  <haskell> 

class IsSimple a 
class IsSimple a 

instance IsSimple Bool 
instance IsSimple Bool 

Line 173:  Line 219:  
.... 
.... 

instance IsSimple Double 
instance IsSimple Double 

+  </haskell> 

−  Haskell'98 standard supports type classes with only one parameter 
+  The Haskell'98 standard supports type classes with only one parameter. 
−  limits us to 
+  That limits us to only defining type predicates like this one. But GHC and 
−  Hugs 
+  Hugs support multiparameter type classes that allow us to define 
arbitrarilycomplex type functions 
arbitrarilycomplex type functions 

+  <haskell> 

class Collection a c 
class Collection a c 

instance Collection a [a] 
instance Collection a [a] 

instance Collection a (Set a) 
instance Collection a (Set a) 

instance Collection a (Map b a) 
instance Collection a (Map b a) 

+  </haskell> 

−  All the "hypothetical" Haskell extensions we investigated earlier 
+  All of the "hypothetical" Haskell extensions we investigated earlier are 
actually implemented at the type class level! 
actually implemented at the type class level! 

Pattern matching: 
Pattern matching: 

+  <haskell> 

instance Collection a [a] 
instance Collection a [a] 

+  </haskell> 

Multiple statements: 
Multiple statements: 

+  <haskell> 

instance Collection a [a] 
instance Collection a [a] 

instance Collection a (Set a) 
instance Collection a (Set a) 

+  </haskell> 

Recursion: 
Recursion: 

+  <haskell> 

instance (Collection a c) => Collection a [c] 
instance (Collection a c) => Collection a [c] 

+  </haskell> 

Pattern guards: 
Pattern guards: 

+  <haskell> 

instance (IsSimple a) => Collection a (UArray a) 
instance (IsSimple a) => Collection a (UArray a) 

+  </haskell> 

−  Let's define type class which contains any collection which uses Int 
+  Let's define a type class which contains any collection which uses Int in 
its elements or indexes: 
its elements or indexes: 

+  <haskell> 

class HasInt a 
class HasInt a 

instance HasInt Int 
instance HasInt Int 

Line 214:  Line 272:  
instance (HasInt a) => HasInt (Map a b) 
instance (HasInt a) => HasInt (Map a b) 

instance (HasInt b) => HasInt (Map a b) 
instance (HasInt b) => HasInt (Map a b) 

+  </haskell> 

−  +  Another example is a class that replaces all occurrences of 'a' with 

−  'b' in type 't' and return result as 'res': 
+  'b' in type 't' and return the result as 'res': 
+  <haskell> 

class Replace t a b res 
class Replace t a b res 

instance Replace t a a t 
instance Replace t a a t 

−  instance Replace 
+  instance (Replace t a b res) 
+  => Replace [t] a b [res] 

instance (Replace t a b res) 
instance (Replace t a b res) 

=> Replace (Set t) a b (Set res) 
=> Replace (Set t) a b (Set res) 

Line 227:  Line 287:  
=> Replace (Map t1 t2) a b (Map res1 res2) 
=> Replace (Map t1 t2) a b (Map res1 res2) 

instance Replace t a b t 
instance Replace t a b t 

+  </haskell> 

−  You can compare it to the hypothetical definition we 
+  You can compare it to the hypothetical definition we gave earlier. 
−  It's important to note that type class instances, as 
+  It's important to note that type class instances, as opposed to 
−  function statements, are not checked in order. Instead, most 
+  function statements, are not checked in order. Instead, the most 
−  _specific_ instance automatically selected. So, in Replace case, the 
+  _specific_ instance is automatically selected. So, in the Replace case, the 
−  last instance 
+  last instance, which is the most general instance, will be selected only if all the others 
−  +  fail to match, which is what we want. 

In many other cases this automatic selection is not powerful enough 
In many other cases this automatic selection is not powerful enough 

Line 239:  Line 300:  
language developers. The two most wellknown language extensions 
language developers. The two most wellknown language extensions 

proposed to solve such problems are instance priorities, which allow 
proposed to solve such problems are instance priorities, which allow 

−  to explicitly specify instance selection order, and '/=' constraints, 
+  us to explicitly specify instance selection order, and '/=' constraints, 
which can be used to explicitly prohibit unwanted matches: 
which can be used to explicitly prohibit unwanted matches: 

+  <haskell> 

instance Replace t a a t 
instance Replace t a a t 

instance (a/=b) => Replace [t] a b [Replace t a b] 
instance (a/=b) => Replace [t] a b [Replace t a b] 

instance (a/=b, t/=[_]) => Replace t a b t 
instance (a/=b, t/=[_]) => Replace t a b t 

+  </haskell> 

−  You can check that these instances 
+  You can check that these instances no longer overlap. 
−  +  In practice, typelevel arithmetic by itself is not very useful. It becomes a 

−  +  fantastic tool when combined with another feature that type classes provide  

−  +  member functions. For example: 

+  <haskell> 

class Collection a c where 
class Collection a c where 

foldr1 :: (a > a > a) > c > a 
foldr1 :: (a > a > a) > c > a 

Line 261:  Line 325:  
sum :: (Num a, Collection a c) => c > a 
sum :: (Num a, Collection a c) => c > a 

sum = foldr1 (+) 
sum = foldr1 (+) 

+  </haskell> 

−  I'll 
+  I'll also be glad to see the possibility of using type classes in data 
−  declarations like this: 
+  declarations, like this: 
+  <haskell> 

data Safe c = (Collection c a) => Safe c a 
data Safe c = (Collection c a) => Safe c a 

+  </haskell> 

−  but 
+  but as far as I know, this is not yet implemented. 
UNIFICATION 
UNIFICATION 

−  
+  ... 

−  
== Back to GADTs == 
== Back to GADTs == 

−  If you are 
+  If you are wondering how all of these interesting type manipulations relate to 
−  +  GADTs, here is the answer. As you know, Haskell contains highly 

−  +  developed ways to express datatodata functions. We also know that 

−  +  Haskell contains rich facilities to write typetotype functions in the form of 

−  type 
+  "type" statements and type classes. But how do "data" statements fit into this 
−  +  infrastructure? 

−  My answer: they just 
+  My answer: they just define a typetodata constructor translation. Moreover, 
−  +  this translation may give multiple results. Say, the following definition: 

−  following definition: 

+  <haskell> 

data Maybe a = Just a  Nothing 
data Maybe a = Just a  Nothing 

+  </haskell> 

−  defines typetodata constructors function "Maybe" that has parameter 
+  defines typetodata constructors function "Maybe" that has a parameter 
"a" and for each "a" has two possible results  "Just a" and 
"a" and for each "a" has two possible results  "Just a" and 

"Nothing". We can rewrite it in the same hypothetical syntax that was 
"Nothing". We can rewrite it in the same hypothetical syntax that was 

used above for multivalue type functions: 
used above for multivalue type functions: 

+  <haskell> 

data Maybe a = Just a 
data Maybe a = Just a 

Maybe a = Nothing 
Maybe a = Nothing 

+  </haskell> 

Or how about this: 
Or how about this: 

+  <haskell> 

data List a = Cons a (List a) 
data List a = Cons a (List a) 

List a = Nil 
List a = Nil 

+  </haskell> 

and this: 
and this: 

+  <haskell> 

data Either a b = Left a 
data Either a b = Left a 

Either a b = Right b 
Either a b = Right b 

+  </haskell> 

−  But how 
+  But how flexible are "data" definitions? As you should remember, "type" 
−  +  definitions were very limited in their features, while type classes, 

−  +  on the other hand, were more developed than ordinary Haskell functions 

−  +  facilities. What about features of "data" definitions examined as sort of functions? 

−  examined as sort of functions? 

−  On the one 
+  On the one hand, they supports multiple statements and multiple results and 
−  +  can be recursive, like the "List" definition above. On the other, that's all  

−  +  no pattern matching or even type constants on the left side and no guards. 

−  the left side and no guards. 

−  Lack of pattern matching means that left side can contain only free 
+  Lack of pattern matching means that the left side can contain only free type 
−  +  variables. That in turn means that the left sides of all "data" statements for a 

−  +  type will be essentially the same. Therefore, repeated left sides in 

−  +  multistatement "data" definitions are omitted and instead of 

−  and instead of 

+  <haskell> 

data Either a b = Left a 
data Either a b = Left a 

Either a b = Right b 
Either a b = Right b 

+  </haskell> 

we write just 
we write just 

+  <haskell> 

data Either a b = Left a 
data Either a b = Left a 

 Right b 
 Right b 

+  </haskell> 

−  And here finally 
+  And here we finally come to GADTs! It's just a way to define data types using 
−  +  pattern matching and constants on the left side of "data" statements! 

−  +  Let's say we want to do this: 

+  <haskell> 

data T String = D1 Int 
data T String = D1 Int 

T Bool = D2 
T Bool = D2 

T [a] = D3 (a,a) 
T [a] = D3 (a,a) 

+  </haskell> 

−  Amazed? After all, GADTs seems really very simple and obvious 

−  extension to data type definition facilities. 

+  We cannot do this using a standard data definition. So, now we must use a GADT definition: 

+  <haskell> 

+  data T a where 

+  D1 :: Int > T String 

+  D2 :: T Bool 

+  D3 :: (a,a) > T [a] 

+  </haskell> 

+  Amazed? After all, GADTs seem to be a really simple and obvious extension to 

+  data type definition facilities. 

−  The idea is to allow a data constructor's return type to 

−  be specified directly: 

+  
+  
+  The idea here is to allow a data constructor's return type to be specified 

+  directly: 

+  
+  <haskell> 

data Term a where 
data Term a where 

Lit :: Int > Term Int 
Lit :: Int > Term Int 

Pair :: Term a > Term b > Term (a,b) 
Pair :: Term a > Term b > Term (a,b) 

... 
... 

+  </haskell> 

−  In a function that performs pattern matching on Term, the pattern 
+  In a function that performs pattern matching on Term, the pattern match gives 
−  +  type as well as value information. For example, consider this function: 

−  consider this function: 

+  <haskell> 

eval :: Term a > a 
eval :: Term a > a 

eval (Lit i) = i 
eval (Lit i) = i 

eval (Pair a b) = (eval a, eval b) 
eval (Pair a b) = (eval a, eval b) 

... 
... 

+  </haskell> 

−  If the argument matches Lit, it must have been built with a 
+  If the argument matches Lit, it must have been built with a Lit constructor, 
−  +  so type 'a' must be Int, and hence we can return 'i' (an Int) in the right 

−  +  hand side. The same thing applies to the Pair constructor. 

−  constructor. 

Line 374:  Line 468:  
== Further reading == 
== Further reading == 

−  The best paper on type level arithmetic using type classes 
+  The best paper on type level arithmetic using type classes I've seen 
is "Faking it: simulating dependent types in Haskell" 
is "Faking it: simulating dependent types in Haskell" 

−  ( http:// 
+  ( http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.22.2636 ). Most of 
−  +  this article comes from his work. 

−  +  A great demonstration of typelevel arithmetic is in the TypeNats package, 

which "defines typelevel natural numbers and arithmetic operations on 
which "defines typelevel natural numbers and arithmetic operations on 

them including addition, subtraction, multiplication, division and GCD" 
them including addition, subtraction, multiplication, division and GCD" 

( darcs get partial tag '0.1' http://www.eecs.tufts.edu/~rdocki01/typenats/ ) 
( darcs get partial tag '0.1' http://www.eecs.tufts.edu/~rdocki01/typenats/ ) 

−  I should also mention here Oleg Kiselyov page on typelevel 
+  I should also mention here [http://okmij.org/ftp/Haskell/types.html Oleg Kiselyov's page on typelevel programming in Haskell]. 
−  programming in Haskell: http://okmij.org/ftp/Haskell/types.html 

−  +  Other GADTrelated papers: 

−  remains the "Fun with phantom types" 

−  (http://www.informatik.unibonn.de/~ralf/publications/With.pdf). 

−  Phantom types is another name of GADT. You should also know that this 

−  paper uses old GADT syntax. This paper is mustread because it 

−  contains numerous examples of practical GADT usage  theme completely 

−  omitted from my article. 

−  
−  
−  Other GADTrelated papers i know: 

−  
−  "Dynamic Optimization for Functional Reactive Programming using 

−  Generalized Algebraic Data Types" 

−  http://www.cs.nott.ac.uk/~nhn/Publications/icfp2005.pdf 

−  
−  "Phantom types" (actually more scientific version of "Fun with phantom types") 

−  http://citeseer.ist.psu.edu/rd/0,606209,1,0.25,Download/http:qSqqSqwww.informatik.unibonn.deqSq~ralfqSqpublicationsqSqPhantom.ps.gz 

−  
−  "Phantom types and subtyping" http://arxiv.org/ps/cs.PL/0403034 

−  
−  "Existentially quantified type classes" by Stuckey, Sulzmann and Wazny (URL?) 

−  
−  
−  
−  
−  
−  
−  
−  
−  == Random rubbish from previous versions of article == 

−  
−  
−  data family Map k :: * > * 

−  
−  data instance Map () v = MapUnit (Maybe v) 

−  data instance Map (a, b) v = MapPair (Map a (Map b v)) 

−  
−  
−  
−  
−  
−  
−  let's consider wellknown 'data' declarations: 

−  
−  data T a = D a a Int 

−  
−  it can be seen as function 'T' from type 'a' to some data constructor. 

−  'T Bool', for example, gives result 'D Bool Bool Int', while 

−  'T [Int]' gives result 'D [Int] [Int] Int'. 

−  
−  'data' declaration can also have several "results", say 

−  
−  data Either a b = Left a  Right b 

−  
−  and "result" of 'Either Int String' can be either "Left Int" or "Right 

−  String" 

−  
−  
−  
−  
−  
−  
−  
−  Well, to give compiler confidence 

−  that 'a' can be deduced in just one way from 'c', we can add some 

−  form of hint: 

−  
−  type Collection :: a c  c>a 

−  Collection a [a] 

−  Collection a (Set a) 

−  Collection a (Map b a) 

−  
−  The first line i added tell the compiler that Collection predicate has 

−  two parameters and the second parameter determines the first. Based on 

−  this restriction, compiler can detect and prohibit attempts to define 

−  different element types for the same collection: 

−  
−  type Collection :: a c  c>a 

−  Collection a (Map b a) 

−  Collection b (Map b a)  error! prohibits functional dependency 

−  
−  Of course, Collection is just a function from 'c' to 'a', but if we 

−  will define it directly as a function: 

−  type Collection [a] = a 

+  * [http://www.cs.nott.ac.uk/~nhn/Publications/icfp2005.pdf "Dynamic Optimization for Functional Reactive Programming using Generalized Algebraic Data Types"] 

−  Collection (Set a) = a 

−  Collection (Map b a) = a 

−   it can't be used in 'head' definition above. Moreover, using 

+  * [http://citeseer.ist.psu.edu/rd/0,606209,1,0.25,Download/http:qSqqSqwww.informatik.unibonn.deqSq~ralfqSqpublicationsqSqPhantom.ps.gz "Phantom Types"] (actually a more scientific version of "Fun with phantom types") 

−  functional dependencies we can define bidirectional functions: 

−  type TwoTimesBigger :: a b  a>b, b>a 

+  * [http://arxiv.org/ps/cs.PL/0403034 "Phantom types and subtyping"] 

−  TwoTimesBigger Int8 Int16 

−  TwoTimesBigger Int16 Int32 

−  TwoTimesBigger Int32 Int64 

−  or predicates with 3, 4 or more parameters with any relations between 

+  * "Existentially quantified type classes" by Stuckey, Sulzmann and Wazny (URL?) 

−  them. It's a great power! 
Latest revision as of 10:49, 10 June 2021
For a long time, I didn't understand what Generalised algebraic datatypes were or how they could be used. It almost seemed a conspiracy of silence — people who understood GADTs thought
that everything was obvious and didn't need further explanation, but I still
couldn't understand them.
Now that I have an idea of how it works, I think that it was really obvious. :) So, I want to share my understanding of GADTs. Maybe the way I realized how GADTs work could help someone else.
There are plenty of GADTrelated papers, but the best one for beginners is "Fun with phantom types". Phantom types is another name of GADT. You should also know that this paper uses old GADT syntax. This paper is a mustread because it contains several examples of practical GADT usage  a theme completely omitted from this article. For example, the paper outlines how to do binary compression on datatypes without using typeclasses, and suggests how to make a kind of printf function.
Contents
Type functions
A "data" declaration is a way to declare both type constructor and data constructors. For example,
data Either a b = Left a  Right b
declares type constructor "Either" and two data constructors "Left" and "Right". Ordinary Haskell functions work with data constructors:
isLeft (Left a) = True
isLeft (Right b) = False
but there is also an analogous way to work with type constructors!
type X a = Either a a
declares a TYPE FUNCTION named "X". Its parameter "a" must be some type and it returns some type as its result. We can't use "X" on data values, but we can use it on type values. Type constructors declared with "data" statements and type functions declared with "type" statements can be used together to build arbitrarily complex types. In such "computations" type constructors serves as basic "values" and type functions as a way to process them.
Indeed, type functions in Haskell are very limited compared to ordinary functions  they don't support pattern matching, nor multiple statements, nor recursion.
Hypothetical Haskell extension  Fullfeatured type functions
Let's build a hypothetical Haskell extension that mimics, for type functions, the wellknown ways to define ordinary functions, including pattern matching:
type F [a] = Set a
multiple statements (this is meaningful only in the presence of pattern matching):
type F Bool = Char
F String = Int
and recursion (which again needs pattern matching and multiple statements):
type F [a] = F a
F (Map a b) = F b
F (Set a) = F a
F a = a
As you may already have guessed, this last definition calculates a simple base type of arbitrarilynested collections, e.g.:
F [[[Set Int]]] =
F [[Set Int]] =
F [Set Int] =
F (Set Int) =
F Int =
Int
Let's not forget about statement guards:
type F a  IsSimple a == TrueType = a
Here we define type function F only for simple datatypes by using a guard type function "IsSimple":
type IsSimple Bool = TrueType
IsSimple Int = TrueType
....
IsSimple Double = TrueType
IsSimple a = FalseType
data TrueType = T
data FalseType = F
These definitions seem a bit odd, and while we are in imaginary land, let's consider a way to write this shorter:
type F a  IsSimple a = a
type IsSimple Bool
IsSimple Int
....
IsSimple Double
Here, we defined a list of simple types. The implied result of all written statements for "IsSimple" is True, and False for everything else. Essentially, "IsSimple" is a TYPE PREDICATE!
I really love it! :) How about constructing a predicate that traverses a complex type trying to decide whether it contains "Int" anywhere?
type HasInt Int
HasInt [a] = HasInt a
HasInt (Set a) = HasInt a
HasInt (Map a b)  HasInt a
HasInt (Map a b)  HasInt b
or a type function that substitutes one type with another inside arbitrarilydeep types:
type Replace t a b  t==a = b
Replace [t] a b = [Replace t a b]
Replace (Set t) a b = Set (Replace t a b)
Replace (Map t1 t2) a b = Map (Replace t1 a b) (Replace t2 a b)
Replace t a b = t
One more hypothetical extension  multivalue type functions
Let's add more fun! We will introduce one more hypothetical Haskell extension  type functions that may have MULTIPLE VALUES. Say,
type Collection a = [a]
Collection a = Set a
Collection a = Map b a
So, "Collection Int" has "[Int]", "Set Int" and "Map String Int" as its values, i.e. different collection types with elements of type "Int".
Pay attention to the last statement of the "Collection" definition, where we used the type variable "b" that was not mentioned on the left side, nor defined in any other way. Since it's perfectly possible for the "Collection" function to have multiple values, using some free variable on the right side that can be replaced with any type is not a problem at all. "Map Bool Int", "Map [Int] Int" and "Map Int Int" all are possible values of "Collection Int" along with "[Int]" and "Set Int".
At first glance, it seems that multiplevalue functions are meaningless  they can't be used to define datatypes, because we need concrete types here. But if we take another look, they can be useful to define type constraints and type families.
We can also represent a multiplevalue function as a predicate:
type Collection a [a]
Collection a (Set a)
Collection a (Map b a)
If you're familiar with Prolog, you should know that a predicate, in contrast to a function, is a multidirectional thing  it can be used to deduce any parameter from the other ones. For example, in this hypothetical definition:
head  Collection Int a :: a > Int
we define a 'head' function for any Collection containing Ints.
And in this, again, hypothetical definition:
data Safe c  Collection c a = Safe c a
we deduced element type 'a' from collection type 'c' passed as the parameter to the type constructor.
Back to real Haskell  type classes
After reading about all of these glorious examples, you may be wondering "Why doesn't Haskell support fullfeatured type functions?" Hold your breath... Haskell already contains them, and GHC has implemented all of the capabilities mentioned above for more than 10 years! They were just named... TYPE CLASSES! Let's translate all of our examples to their language:
class IsSimple a
instance IsSimple Bool
instance IsSimple Int
....
instance IsSimple Double
The Haskell'98 standard supports type classes with only one parameter. That limits us to only defining type predicates like this one. But GHC and Hugs support multiparameter type classes that allow us to define arbitrarilycomplex type functions
class Collection a c
instance Collection a [a]
instance Collection a (Set a)
instance Collection a (Map b a)
All of the "hypothetical" Haskell extensions we investigated earlier are actually implemented at the type class level!
Pattern matching:
instance Collection a [a]
Multiple statements:
instance Collection a [a]
instance Collection a (Set a)
Recursion:
instance (Collection a c) => Collection a [c]
Pattern guards:
instance (IsSimple a) => Collection a (UArray a)
Let's define a type class which contains any collection which uses Int in its elements or indexes:
class HasInt a
instance HasInt Int
instance (HasInt a) => HasInt [a]
instance (HasInt a) => HasInt (Map a b)
instance (HasInt b) => HasInt (Map a b)
Another example is a class that replaces all occurrences of 'a' with
'b' in type 't' and return the result as 'res':
class Replace t a b res
instance Replace t a a t
instance (Replace t a b res)
=> Replace [t] a b [res]
instance (Replace t a b res)
=> Replace (Set t) a b (Set res)
instance (Replace t1 a b res1, Replace t2 a b res2)
=> Replace (Map t1 t2) a b (Map res1 res2)
instance Replace t a b t
You can compare it to the hypothetical definition we gave earlier. It's important to note that type class instances, as opposed to function statements, are not checked in order. Instead, the most _specific_ instance is automatically selected. So, in the Replace case, the last instance, which is the most general instance, will be selected only if all the others fail to match, which is what we want.
In many other cases this automatic selection is not powerful enough and we are forced to use some artificial tricks or complain to the language developers. The two most wellknown language extensions proposed to solve such problems are instance priorities, which allow us to explicitly specify instance selection order, and '/=' constraints, which can be used to explicitly prohibit unwanted matches:
instance Replace t a a t
instance (a/=b) => Replace [t] a b [Replace t a b]
instance (a/=b, t/=[_]) => Replace t a b t
You can check that these instances no longer overlap.
In practice, typelevel arithmetic by itself is not very useful. It becomes a
fantastic tool when combined with another feature that type classes provide 
member functions. For example:
class Collection a c where
foldr1 :: (a > a > a) > c > a
class Num a where
(+) :: a > a > a
sum :: (Num a, Collection a c) => c > a
sum = foldr1 (+)
I'll also be glad to see the possibility of using type classes in data
declarations, like this:
data Safe c = (Collection c a) => Safe c a
but as far as I know, this is not yet implemented.
UNIFICATION
...
Back to GADTs
If you are wondering how all of these interesting type manipulations relate to GADTs, here is the answer. As you know, Haskell contains highly developed ways to express datatodata functions. We also know that Haskell contains rich facilities to write typetotype functions in the form of "type" statements and type classes. But how do "data" statements fit into this infrastructure?
My answer: they just define a typetodata constructor translation. Moreover, this translation may give multiple results. Say, the following definition:
data Maybe a = Just a  Nothing
defines typetodata constructors function "Maybe" that has a parameter "a" and for each "a" has two possible results  "Just a" and "Nothing". We can rewrite it in the same hypothetical syntax that was used above for multivalue type functions:
data Maybe a = Just a
Maybe a = Nothing
Or how about this:
data List a = Cons a (List a)
List a = Nil
and this:
data Either a b = Left a
Either a b = Right b
But how flexible are "data" definitions? As you should remember, "type" definitions were very limited in their features, while type classes, on the other hand, were more developed than ordinary Haskell functions facilities. What about features of "data" definitions examined as sort of functions?
On the one hand, they supports multiple statements and multiple results and can be recursive, like the "List" definition above. On the other, that's all  no pattern matching or even type constants on the left side and no guards.
Lack of pattern matching means that the left side can contain only free type variables. That in turn means that the left sides of all "data" statements for a type will be essentially the same. Therefore, repeated left sides in multistatement "data" definitions are omitted and instead of
data Either a b = Left a
Either a b = Right b
we write just
data Either a b = Left a
 Right b
And here we finally come to GADTs! It's just a way to define data types using
pattern matching and constants on the left side of "data" statements!
Let's say we want to do this:
data T String = D1 Int
T Bool = D2
T [a] = D3 (a,a)
We cannot do this using a standard data definition. So, now we must use a GADT definition:
data T a where
D1 :: Int > T String
D2 :: T Bool
D3 :: (a,a) > T [a]
Amazed? After all, GADTs seem to be a really simple and obvious extension to data type definition facilities.
The idea here is to allow a data constructor's return type to be specified directly:
data Term a where
Lit :: Int > Term Int
Pair :: Term a > Term b > Term (a,b)
...
In a function that performs pattern matching on Term, the pattern match gives type as well as value information. For example, consider this function:
eval :: Term a > a
eval (Lit i) = i
eval (Pair a b) = (eval a, eval b)
...
If the argument matches Lit, it must have been built with a Lit constructor, so type 'a' must be Int, and hence we can return 'i' (an Int) in the right hand side. The same thing applies to the Pair constructor.
Further reading
The best paper on type level arithmetic using type classes I've seen is "Faking it: simulating dependent types in Haskell" ( http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.22.2636 ). Most of this article comes from his work.
A great demonstration of typelevel arithmetic is in the TypeNats package, which "defines typelevel natural numbers and arithmetic operations on them including addition, subtraction, multiplication, division and GCD" ( darcs get partial tag '0.1' http://www.eecs.tufts.edu/~rdocki01/typenats/ )
I should also mention here Oleg Kiselyov's page on typelevel programming in Haskell.
Other GADTrelated papers:
 "Phantom Types" (actually a more scientific version of "Fun with phantom types")
 "Existentially quantified type classes" by Stuckey, Sulzmann and Wazny (URL?)