https://wiki.haskell.org/api.php?action=feedcontributions&user=Dr0b3rts&feedformat=atomHaskellWiki - User contributions [en]2020-02-24T03:48:48ZUser contributionsMediaWiki 1.27.4https://wiki.haskell.org/index.php?title=GADTs_for_dummies&diff=42532GADTs for dummies2011-10-25T18:51:42Z<p>Dr0b3rts: Added GADT translation of example data definition</p>
<hr />
<div>[[Category:Tutorials]]<br />
[[Category:Language extensions]]<br />
<br />
For a long time, I didn't understand what GADTs were or how they could be used. It was sort of a conspiracy of silence - people who understood GADTs thought<br />
that everything was obvious and didn't need further explanation, but I still<br />
couldn't understand them.<br />
<br />
Now that I have an idea how it works, I think that it was really obvious. :) So, I want to share my understanding GADTs. Maybe the way I realized how GADTs work could help<br />
someone else. See also [[Generalised algebraic datatype]]<br />
<br />
== Type functions ==<br />
<br />
A "data" declaration is a way to declare both type constructor and data<br />
constructors. For example,<br />
<br />
<haskell><br />
data Either a b = Left a | Right b<br />
</haskell><br />
<br />
declares type constructor "Either" and two data constructors "Left"<br />
and "Right". Ordinary Haskell functions works with data constructors:<br />
<br />
<haskell><br />
isLeft (Left a) = True<br />
isLeft (Right b) = False<br />
</haskell><br />
<br />
but there is also an analogous way to work with type constructors!<br />
<br />
<haskell><br />
type X a = Either a a<br />
</haskell><br />
<br />
declares a TYPE FUNCTION named "X". Its parameter "a" must be some type<br />
and it returns some type as its result. We can't use "X" on data values,<br />
but we can use it on type values. Type constructors declared with<br />
"data" statements and type functions declared with "type" statements<br />
can be used together to build arbitrarily complex types. In such<br />
"computations" type constructors serves as basic "values" and type<br />
functions as a way to process them.<br />
<br />
Indeed, type functions in Haskell are very limited compared to<br />
ordinary functions - they don't support pattern matching,<br />
nor multiple statements, nor recursion.<br />
<br />
== Hypothetical Haskell extension - Full-featured type functions ==<br />
<br />
Let's build a hypothetical Haskell extension that mimics, for type<br />
functions, the well-known ways to define ordinary functions, including<br />
pattern matching:<br />
<br />
<haskell><br />
type F [a] = Set a<br />
</haskell><br />
<br />
multiple statements (this is meaningful only in the presence of pattern matching):<br />
<br />
<haskell><br />
type F Bool = Char<br />
F String = Int<br />
</haskell><br />
<br />
and recursion (which again needs pattern matching and multiple statements):<br />
<br />
<haskell><br />
type F [a] = F a<br />
F (Map a b) = F b<br />
F (Set a) = F a<br />
F a = a<br />
</haskell><br />
<br />
As you may already have guessed, this last definition calculates a simple base type of arbitrarily-nested collections, e.g.: <br />
<br />
<haskell><br />
F [[[Set Int]]] = <br />
F [[Set Int]] =<br />
F [Set Int] = <br />
F (Set Int) =<br />
F Int =<br />
Int<br />
</haskell><br />
<br />
Let's not forget about statement guards:<br />
<br />
<haskell><br />
type F a | IsSimple a == TrueType = a<br />
</haskell><br />
<br />
Here we define type function F only for simple datatypes by using a<br />
guard type function "IsSimple":<br />
<br />
<haskell><br />
type IsSimple Bool = TrueType<br />
IsSimple Int = TrueType<br />
....<br />
IsSimple Double = TrueType<br />
IsSimple a = FalseType<br />
<br />
data TrueType = T<br />
data FalseType = F<br />
</haskell><br />
<br />
These definitions seem a bit odd, and while we are in imaginary land,<br />
let's consider a way to write this shorter:<br />
<br />
<haskell><br />
type F a | IsSimple a = a<br />
<br />
type IsSimple Bool<br />
IsSimple Int<br />
....<br />
IsSimple Double<br />
</haskell><br />
<br />
Here, we defined a list of simple types. The implied result of all<br />
written statements for "IsSimple" is True, and False for<br />
everything else. Essentially, "IsSimple" is a TYPE PREDICATE!<br />
<br />
I really love it! :) How about constructing a predicate that traverses a<br />
complex type trying to decide whether it contains "Int" anywhere?<br />
<br />
<haskell><br />
type HasInt Int<br />
HasInt [a] = HasInt a<br />
HasInt (Set a) = HasInt a<br />
HasInt (Map a b) | HasInt a<br />
HasInt (Map a b) | HasInt b<br />
</haskell><br />
<br />
or a type function that substitutes one type with another inside<br />
arbitrarily-deep types:<br />
<br />
<haskell><br />
type Replace t a b | t==a = b<br />
Replace [t] a b = [Replace t a b]<br />
Replace (Set t) a b = Set (Replace t a b)<br />
Replace (Map t1 t2) a b = Map (Replace t1 a b) (Replace t2 a b)<br />
Replace t a b = t<br />
</haskell><br />
<br />
== One more hypothetical extension - multi-value type functions ==<br />
<br />
Let's add more fun! We will introduce one more hypothetical Haskell<br />
extension - type functions that may have MULTIPLE VALUES. Say,<br />
<br />
<haskell><br />
type Collection a = [a]<br />
Collection a = Set a<br />
Collection a = Map b a<br />
</haskell><br />
<br />
So, "Collection Int" has "[Int]", "Set Int" and "Map String Int" as<br />
its values, i.e. different collection types with elements of type<br />
"Int".<br />
<br />
Pay attention to the last statement of the "Collection" definition, where<br />
we used the type variable "b" that was not mentioned on the left side,<br />
nor defined in any other way. Since it's perfectly possible for the<br />
"Collection" function to have multiple values, using some free variable on<br />
the right side that can be replaced with any type is not a problem<br />
at all. "Map Bool Int", "Map [Int] Int" and "Map Int Int" all are<br />
possible values of "Collection Int" along with "[Int]" and "Set Int".<br />
<br />
At first glance, it seems that multiple-value functions are meaningless - they<br />
can't be used to define datatypes, because we need concrete types here. But<br />
if we take another look, they can be useful to define type constraints and<br />
type families.<br />
<br />
We can also represent a multiple-value function as a predicate:<br />
<br />
<haskell><br />
type Collection a [a]<br />
Collection a (Set a)<br />
Collection a (Map b a)<br />
</haskell><br />
<br />
If you're familiar with Prolog, you should know that a predicate, in contrast to<br />
a function, is a multi-directional thing - it can be used to deduce any<br />
parameter from the other ones. For example, in this hypothetical definition:<br />
<br />
<haskell><br />
head | Collection Int a :: a -> Int<br />
</haskell><br />
<br />
we define a 'head' function for any Collection containing Ints.<br />
<br />
And in this, again, hypothetical definition:<br />
<br />
<haskell><br />
data Safe c | Collection c a = Safe c a<br />
</haskell><br />
<br />
we deduced element type 'a' from collection type 'c' passed as the<br />
parameter to the type constructor.<br />
<br />
<br />
<br />
== Back to real Haskell - type classes ==<br />
<br />
After reading about all of these glorious examples, you may be wondering<br />
"Why doesn't Haskell support full-featured type functions?" Hold your breath...<br />
Haskell already contains them, and GHC has implemented all of the<br />
capabilities mentioned above for more than 10 years! They were just named...<br />
TYPE CLASSES! Let's translate all of our examples to their language:<br />
<br />
<haskell><br />
class IsSimple a<br />
instance IsSimple Bool<br />
instance IsSimple Int<br />
....<br />
instance IsSimple Double<br />
</haskell><br />
<br />
The Haskell'98 standard supports type classes with only one parameter.<br />
That limits us to only defining type predicates like this one. But GHC and<br />
Hugs support multi-parameter type classes that allow us to define<br />
arbitrarily-complex type functions<br />
<br />
<haskell><br />
class Collection a c<br />
instance Collection a [a]<br />
instance Collection a (Set a)<br />
instance Collection a (Map b a)<br />
</haskell><br />
<br />
All of the "hypothetical" Haskell extensions we investigated earlier are<br />
actually implemented at the type class level!<br />
<br />
Pattern matching:<br />
<br />
<haskell><br />
instance Collection a [a]<br />
</haskell><br />
<br />
Multiple statements:<br />
<br />
<haskell><br />
instance Collection a [a]<br />
instance Collection a (Set a)<br />
</haskell><br />
<br />
Recursion:<br />
<br />
<haskell><br />
instance (Collection a c) => Collection a [c]<br />
</haskell><br />
<br />
Pattern guards:<br />
<br />
<haskell><br />
instance (IsSimple a) => Collection a (UArray a)<br />
</haskell><br />
<br />
<br />
<br />
Let's define a type class which contains any collection which uses Int in<br />
its elements or indexes:<br />
<br />
<haskell><br />
class HasInt a<br />
instance HasInt Int<br />
instance (HasInt a) => HasInt [a]<br />
instance (HasInt a) => HasInt (Map a b)<br />
instance (HasInt b) => HasInt (Map a b)<br />
</haskell><br />
<br />
<br />
Another example is a class that replaces all occurrences of 'a' with<br />
'b' in type 't' and return the result as 'res':<br />
<br />
<haskell><br />
class Replace t a b res<br />
instance Replace t a a t<br />
instance Replace [t] a b [Replace t a b]<br />
instance (Replace t a b res)<br />
=> Replace (Set t) a b (Set res)<br />
instance (Replace t1 a b res1, Replace t2 a b res2)<br />
=> Replace (Map t1 t2) a b (Map res1 res2)<br />
instance Replace t a b t<br />
</haskell><br />
<br />
You can compare it to the hypothetical definition we gave earlier.<br />
It's important to note that type class instances, as opposed to<br />
function statements, are not checked in order. Instead, the most<br />
_specific_ instance is automatically selected. So, in the Replace case, the<br />
last instance, which is the most general instance, will be selected only if all the others<br />
fail to match, which is what we want.<br />
<br />
In many other cases this automatic selection is not powerful enough<br />
and we are forced to use some artificial tricks or complain to the<br />
language developers. The two most well-known language extensions<br />
proposed to solve such problems are instance priorities, which allow<br />
us to explicitly specify instance selection order, and '/=' constraints,<br />
which can be used to explicitly prohibit unwanted matches:<br />
<br />
<haskell><br />
instance Replace t a a t<br />
instance (a/=b) => Replace [t] a b [Replace t a b]<br />
instance (a/=b, t/=[_]) => Replace t a b t<br />
</haskell><br />
<br />
You can check that these instances no longer overlap.<br />
<br />
<br />
In practice, type-level arithmetic by itself is not very useful. It becomes a<br />
fantastic tool when combined with another feature that type classes provide -<br />
member functions. For example:<br />
<br />
<haskell><br />
class Collection a c where<br />
foldr1 :: (a -> a -> a) -> c -> a<br />
<br />
class Num a where<br />
(+) :: a -> a -> a<br />
<br />
sum :: (Num a, Collection a c) => c -> a<br />
sum = foldr1 (+)<br />
</haskell><br />
<br />
<br />
I'll also be glad to see the possibility of using type classes in data<br />
declarations, like this:<br />
<br />
<haskell><br />
data Safe c = (Collection c a) => Safe c a<br />
</haskell><br />
<br />
but as far as I know, this is not yet implemented.<br />
<br />
<br />
UNIFICATION<br />
...<br />
<br />
<br />
<br />
== Back to GADTs ==<br />
<br />
If you are wondering how all of these interesting type manipulations relate to<br />
GADTs, here is the answer. As you know, Haskell contains highly<br />
developed ways to express data-to-data functions. We also know that<br />
Haskell contains rich facilities to write type-to-type functions in the form of<br />
"type" statements and type classes. But how do "data" statements fit into this<br />
infrastructure?<br />
<br />
My answer: they just define a type-to-data constructor translation. Moreover,<br />
this translation may give multiple results. Say, the following definition:<br />
<br />
<haskell><br />
data Maybe a = Just a | Nothing<br />
</haskell><br />
<br />
defines type-to-data constructors function "Maybe" that has a parameter<br />
"a" and for each "a" has two possible results - "Just a" and<br />
"Nothing". We can rewrite it in the same hypothetical syntax that was<br />
used above for multi-value type functions:<br />
<br />
<haskell><br />
data Maybe a = Just a<br />
Maybe a = Nothing<br />
</haskell><br />
<br />
Or how about this:<br />
<br />
<haskell><br />
data List a = Cons a (List a)<br />
List a = Nil<br />
</haskell><br />
<br />
and this:<br />
<br />
<haskell><br />
data Either a b = Left a<br />
Either a b = Right b<br />
</haskell><br />
<br />
But how flexible are "data" definitions? As you should remember, "type"<br />
definitions were very limited in their features, while type classes,<br />
on the other hand, were more developed than ordinary Haskell functions<br />
facilities. What about features of "data" definitions examined as sort of functions?<br />
<br />
On the one hand, they supports multiple statements and multiple results and<br />
can be recursive, like the "List" definition above. On the other, that's all -<br />
no pattern matching or even type constants on the left side and no guards.<br />
<br />
Lack of pattern matching means that the left side can contain only free type<br />
variables. That in turn means that the left sides of all "data" statements for a<br />
type will be essentially the same. Therefore, repeated left sides in<br />
multi-statement "data" definitions are omitted and instead of<br />
<br />
<haskell><br />
data Either a b = Left a<br />
Either a b = Right b<br />
</haskell><br />
<br />
we write just<br />
<br />
<haskell><br />
data Either a b = Left a<br />
| Right b<br />
</haskell><br />
<br />
<br />
And here we finally come to GADTs! It's just a way to define data types using<br />
pattern matching and constants on the left side of "data" statements!<br />
Let's say we want to do this:<br />
<br />
<haskell><br />
data T String = D1 Int<br />
T Bool = D2<br />
T [a] = D3 (a,a)<br />
</haskell><br />
<br />
<br />
We cannot do this using a standard data definition. So, now we must use a GADT definition:<br />
<br />
<haskell><br />
data T a where<br />
D1 :: Int -> T String<br />
D2 :: T Bool<br />
D3 :: (a,a) -> T [a]<br />
</haskell><br />
<br />
Amazed? After all, GADTs seem to be a really simple and obvious extension to<br />
data type definition facilities.<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
The idea here is to allow a data constructor's return type to be specified<br />
directly:<br />
<br />
<haskell><br />
data Term a where<br />
Lit :: Int -> Term Int<br />
Pair :: Term a -> Term b -> Term (a,b)<br />
...<br />
</haskell><br />
<br />
In a function that performs pattern matching on Term, the pattern match gives<br />
type as well as value information. For example, consider this function:<br />
<br />
<haskell><br />
eval :: Term a -> a<br />
eval (Lit i) = i<br />
eval (Pair a b) = (eval a, eval b)<br />
...<br />
</haskell><br />
<br />
If the argument matches Lit, it must have been built with a Lit constructor,<br />
so type 'a' must be Int, and hence we can return 'i' (an Int) in the right<br />
hand side. The same thing applies to the Pair constructor.<br />
<br />
<br />
<br />
<br />
<br />
<br />
== Further reading ==<br />
<br />
The best paper on type level arithmetic using type classes I've seen<br />
is "Faking it: simulating dependent types in Haskell"<br />
( http://www.cs.nott.ac.uk/~ctm/faking.ps.gz ). Most of<br />
this article comes from his work.<br />
<br />
A great demonstration of type-level arithmetic is in the TypeNats package,<br />
which "defines type-level natural numbers and arithmetic operations on<br />
them including addition, subtraction, multiplication, division and GCD"<br />
( darcs get --partial --tag '0.1' http://www.eecs.tufts.edu/~rdocki01/typenats/ )<br />
<br />
I should also mention here Oleg Kiselyov's page on type-level<br />
programming in Haskell: http://okmij.org/ftp/Haskell/types.html<br />
<br />
<br />
There are plenty of GADT-related papers, but the best one for beginners<br />
is "Fun with phantom types"<br />
(http://www.informatik.uni-bonn.de/~ralf/publications/With.pdf).<br />
Phantom types is another name of GADT. You should also know that this<br />
paper uses old GADT syntax. This paper is a must-read because it<br />
contains numerous examples of practical GADT usage - a theme completely<br />
omitted from my article.<br />
<br />
<br />
Other GADT-related papers:<br />
<br />
"Dynamic Optimization for Functional Reactive Programming using<br />
Generalized Algebraic Data Types"<br />
http://www.cs.nott.ac.uk/~nhn/Publications/icfp2005.pdf<br />
<br />
"Phantom types" (actually more scientific version of "Fun with phantom types")<br />
http://citeseer.ist.psu.edu/rd/0,606209,1,0.25,Download/http:qSqqSqwww.informatik.uni-bonn.deqSq~ralfqSqpublicationsqSqPhantom.ps.gz<br />
<br />
"Phantom types and subtyping" http://arxiv.org/ps/cs.PL/0403034<br />
<br />
"Existentially quantified type classes" by Stuckey, Sulzmann and Wazny (URL?)<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
== Random rubbish from previous versions of article ==<br />
<br />
<br />
<haskell><br />
data family Map k :: * -> *<br />
<br />
data instance Map () v = MapUnit (Maybe v)<br />
data instance Map (a, b) v = MapPair (Map a (Map b v))<br />
</haskell><br />
<br />
<br />
<br />
<br />
<br />
<br />
let's consider well-known 'data' declarations:<br />
<br />
<haskell><br />
data T a = D a a Int<br />
</haskell><br />
<br />
it can be seen as function 'T' from type 'a' to some data constructor.<br />
<br />
'T Bool', for example, gives result 'D Bool Bool Int', while<br />
<br />
'T [Int]' gives result 'D [Int] [Int] Int'.<br />
<br />
'data' declaration can also have several "results", say<br />
<br />
<haskell><br />
data Either a b = Left a | Right b<br />
</haskell><br />
<br />
and "result" of 'Either Int String' can be either "Left Int" or "Right<br />
String"<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
Well, to give compiler confidence<br />
that 'a' can be deduced in just one way from 'c', we can add some<br />
form of hint:<br />
<br />
<haskell><br />
type Collection :: a c | c->a<br />
Collection a [a]<br />
Collection a (Set a)<br />
Collection a (Map b a)<br />
</haskell><br />
<br />
The first line i added tell the compiler that Collection predicate has<br />
two parameters and the second parameter determines the first. Based on<br />
this restriction, compiler can detect and prohibit attempts to define<br />
different element types for the same collection:<br />
<br />
<haskell><br />
type Collection :: a c | c->a<br />
Collection a (Map b a)<br />
Collection b (Map b a) -- error! prohibits functional dependency<br />
</haskell><br />
<br />
Of course, Collection is just a function from 'c' to 'a', but if we<br />
will define it directly as a function:<br />
<br />
<haskell><br />
type Collection [a] = a<br />
Collection (Set a) = a<br />
Collection (Map b a) = a<br />
</haskell><br />
<br />
- it can't be used in 'head' definition above. Moreover, using<br />
functional dependencies we can define bi-directional functions:<br />
<br />
<haskell><br />
type TwoTimesBigger :: a b | a->b, b->a<br />
TwoTimesBigger Int8 Int16<br />
TwoTimesBigger Int16 Int32<br />
TwoTimesBigger Int32 Int64<br />
</haskell><br />
<br />
or predicates with 3, 4 or more parameters with any relations between<br />
them. It's a great power!</div>Dr0b3rtshttps://wiki.haskell.org/index.php?title=GADTs_for_dummies&diff=42531GADTs for dummies2011-10-25T18:25:30Z<p>Dr0b3rts: Many grammatical revisions</p>
<hr />
<div>[[Category:Tutorials]]<br />
[[Category:Language extensions]]<br />
<br />
For a long time, I didn't understand what GADTs were or how they could be used. It was sort of a conspiracy of silence - people who understood GADTs thought<br />
that everything was obvious and didn't need further explanation, but I still<br />
couldn't understand them.<br />
<br />
Now that I have an idea how it works, I think that it was really obvious. :) So, I want to share my understanding GADTs. Maybe the way I realized how GADTs work could help<br />
someone else. See also [[Generalised algebraic datatype]]<br />
<br />
== Type functions ==<br />
<br />
A "data" declaration is a way to declare both type constructor and data<br />
constructors. For example,<br />
<br />
<haskell><br />
data Either a b = Left a | Right b<br />
</haskell><br />
<br />
declares type constructor "Either" and two data constructors "Left"<br />
and "Right". Ordinary Haskell functions works with data constructors:<br />
<br />
<haskell><br />
isLeft (Left a) = True<br />
isLeft (Right b) = False<br />
</haskell><br />
<br />
but there is also an analogous way to work with type constructors!<br />
<br />
<haskell><br />
type X a = Either a a<br />
</haskell><br />
<br />
declares a TYPE FUNCTION named "X". Its parameter "a" must be some type<br />
and it returns some type as its result. We can't use "X" on data values,<br />
but we can use it on type values. Type constructors declared with<br />
"data" statements and type functions declared with "type" statements<br />
can be used together to build arbitrarily complex types. In such<br />
"computations" type constructors serves as basic "values" and type<br />
functions as a way to process them.<br />
<br />
Indeed, type functions in Haskell are very limited compared to<br />
ordinary functions - they don't support pattern matching,<br />
nor multiple statements, nor recursion.<br />
<br />
== Hypothetical Haskell extension - Full-featured type functions ==<br />
<br />
Let's build a hypothetical Haskell extension that mimics, for type<br />
functions, the well-known ways to define ordinary functions, including<br />
pattern matching:<br />
<br />
<haskell><br />
type F [a] = Set a<br />
</haskell><br />
<br />
multiple statements (this is meaningful only in the presence of pattern matching):<br />
<br />
<haskell><br />
type F Bool = Char<br />
F String = Int<br />
</haskell><br />
<br />
and recursion (which again needs pattern matching and multiple statements):<br />
<br />
<haskell><br />
type F [a] = F a<br />
F (Map a b) = F b<br />
F (Set a) = F a<br />
F a = a<br />
</haskell><br />
<br />
As you may already have guessed, this last definition calculates a simple base type of arbitrarily-nested collections, e.g.: <br />
<br />
<haskell><br />
F [[[Set Int]]] = <br />
F [[Set Int]] =<br />
F [Set Int] = <br />
F (Set Int) =<br />
F Int =<br />
Int<br />
</haskell><br />
<br />
Let's not forget about statement guards:<br />
<br />
<haskell><br />
type F a | IsSimple a == TrueType = a<br />
</haskell><br />
<br />
Here we define type function F only for simple datatypes by using a<br />
guard type function "IsSimple":<br />
<br />
<haskell><br />
type IsSimple Bool = TrueType<br />
IsSimple Int = TrueType<br />
....<br />
IsSimple Double = TrueType<br />
IsSimple a = FalseType<br />
<br />
data TrueType = T<br />
data FalseType = F<br />
</haskell><br />
<br />
These definitions seem a bit odd, and while we are in imaginary land,<br />
let's consider a way to write this shorter:<br />
<br />
<haskell><br />
type F a | IsSimple a = a<br />
<br />
type IsSimple Bool<br />
IsSimple Int<br />
....<br />
IsSimple Double<br />
</haskell><br />
<br />
Here, we defined a list of simple types. The implied result of all<br />
written statements for "IsSimple" is True, and False for<br />
everything else. Essentially, "IsSimple" is a TYPE PREDICATE!<br />
<br />
I really love it! :) How about constructing a predicate that traverses a<br />
complex type trying to decide whether it contains "Int" anywhere?<br />
<br />
<haskell><br />
type HasInt Int<br />
HasInt [a] = HasInt a<br />
HasInt (Set a) = HasInt a<br />
HasInt (Map a b) | HasInt a<br />
HasInt (Map a b) | HasInt b<br />
</haskell><br />
<br />
or a type function that substitutes one type with another inside<br />
arbitrarily-deep types:<br />
<br />
<haskell><br />
type Replace t a b | t==a = b<br />
Replace [t] a b = [Replace t a b]<br />
Replace (Set t) a b = Set (Replace t a b)<br />
Replace (Map t1 t2) a b = Map (Replace t1 a b) (Replace t2 a b)<br />
Replace t a b = t<br />
</haskell><br />
<br />
== One more hypothetical extension - multi-value type functions ==<br />
<br />
Let's add more fun! We will introduce one more hypothetical Haskell<br />
extension - type functions that may have MULTIPLE VALUES. Say,<br />
<br />
<haskell><br />
type Collection a = [a]<br />
Collection a = Set a<br />
Collection a = Map b a<br />
</haskell><br />
<br />
So, "Collection Int" has "[Int]", "Set Int" and "Map String Int" as<br />
its values, i.e. different collection types with elements of type<br />
"Int".<br />
<br />
Pay attention to the last statement of the "Collection" definition, where<br />
we used the type variable "b" that was not mentioned on the left side,<br />
nor defined in any other way. Since it's perfectly possible for the<br />
"Collection" function to have multiple values, using some free variable on<br />
the right side that can be replaced with any type is not a problem<br />
at all. "Map Bool Int", "Map [Int] Int" and "Map Int Int" all are<br />
possible values of "Collection Int" along with "[Int]" and "Set Int".<br />
<br />
At first glance, it seems that multiple-value functions are meaningless - they<br />
can't be used to define datatypes, because we need concrete types here. But<br />
if we take another look, they can be useful to define type constraints and<br />
type families.<br />
<br />
We can also represent a multiple-value function as a predicate:<br />
<br />
<haskell><br />
type Collection a [a]<br />
Collection a (Set a)<br />
Collection a (Map b a)<br />
</haskell><br />
<br />
If you're familiar with Prolog, you should know that a predicate, in contrast to<br />
a function, is a multi-directional thing - it can be used to deduce any<br />
parameter from the other ones. For example, in this hypothetical definition:<br />
<br />
<haskell><br />
head | Collection Int a :: a -> Int<br />
</haskell><br />
<br />
we define a 'head' function for any Collection containing Ints.<br />
<br />
And in this, again, hypothetical definition:<br />
<br />
<haskell><br />
data Safe c | Collection c a = Safe c a<br />
</haskell><br />
<br />
we deduced element type 'a' from collection type 'c' passed as the<br />
parameter to the type constructor.<br />
<br />
<br />
<br />
== Back to real Haskell - type classes ==<br />
<br />
After reading about all of these glorious examples, you may be wondering<br />
"Why doesn't Haskell support full-featured type functions?" Hold your breath...<br />
Haskell already contains them, and GHC has implemented all of the<br />
capabilities mentioned above for more than 10 years! They were just named...<br />
TYPE CLASSES! Let's translate all of our examples to their language:<br />
<br />
<haskell><br />
class IsSimple a<br />
instance IsSimple Bool<br />
instance IsSimple Int<br />
....<br />
instance IsSimple Double<br />
</haskell><br />
<br />
The Haskell'98 standard supports type classes with only one parameter.<br />
That limits us to only defining type predicates like this one. But GHC and<br />
Hugs support multi-parameter type classes that allow us to define<br />
arbitrarily-complex type functions<br />
<br />
<haskell><br />
class Collection a c<br />
instance Collection a [a]<br />
instance Collection a (Set a)<br />
instance Collection a (Map b a)<br />
</haskell><br />
<br />
All of the "hypothetical" Haskell extensions we investigated earlier are<br />
actually implemented at the type class level!<br />
<br />
Pattern matching:<br />
<br />
<haskell><br />
instance Collection a [a]<br />
</haskell><br />
<br />
Multiple statements:<br />
<br />
<haskell><br />
instance Collection a [a]<br />
instance Collection a (Set a)<br />
</haskell><br />
<br />
Recursion:<br />
<br />
<haskell><br />
instance (Collection a c) => Collection a [c]<br />
</haskell><br />
<br />
Pattern guards:<br />
<br />
<haskell><br />
instance (IsSimple a) => Collection a (UArray a)<br />
</haskell><br />
<br />
<br />
<br />
Let's define a type class which contains any collection which uses Int in<br />
its elements or indexes:<br />
<br />
<haskell><br />
class HasInt a<br />
instance HasInt Int<br />
instance (HasInt a) => HasInt [a]<br />
instance (HasInt a) => HasInt (Map a b)<br />
instance (HasInt b) => HasInt (Map a b)<br />
</haskell><br />
<br />
<br />
Another example is a class that replaces all occurrences of 'a' with<br />
'b' in type 't' and return the result as 'res':<br />
<br />
<haskell><br />
class Replace t a b res<br />
instance Replace t a a t<br />
instance Replace [t] a b [Replace t a b]<br />
instance (Replace t a b res)<br />
=> Replace (Set t) a b (Set res)<br />
instance (Replace t1 a b res1, Replace t2 a b res2)<br />
=> Replace (Map t1 t2) a b (Map res1 res2)<br />
instance Replace t a b t<br />
</haskell><br />
<br />
You can compare it to the hypothetical definition we gave earlier.<br />
It's important to note that type class instances, as opposed to<br />
function statements, are not checked in order. Instead, the most<br />
_specific_ instance is automatically selected. So, in the Replace case, the<br />
last instance, which is the most general instance, will be selected only if all the others<br />
fail to match, which is what we want.<br />
<br />
In many other cases this automatic selection is not powerful enough<br />
and we are forced to use some artificial tricks or complain to the<br />
language developers. The two most well-known language extensions<br />
proposed to solve such problems are instance priorities, which allow<br />
us to explicitly specify instance selection order, and '/=' constraints,<br />
which can be used to explicitly prohibit unwanted matches:<br />
<br />
<haskell><br />
instance Replace t a a t<br />
instance (a/=b) => Replace [t] a b [Replace t a b]<br />
instance (a/=b, t/=[_]) => Replace t a b t<br />
</haskell><br />
<br />
You can check that these instances no longer overlap.<br />
<br />
<br />
In practice, type-level arithmetic by itself is not very useful. It becomes a<br />
fantastic tool when combined with another feature that type classes provide -<br />
member functions. For example:<br />
<br />
<haskell><br />
class Collection a c where<br />
foldr1 :: (a -> a -> a) -> c -> a<br />
<br />
class Num a where<br />
(+) :: a -> a -> a<br />
<br />
sum :: (Num a, Collection a c) => c -> a<br />
sum = foldr1 (+)<br />
</haskell><br />
<br />
<br />
I'll also be glad to see the possibility of using type classes in data<br />
declarations, like this:<br />
<br />
<haskell><br />
data Safe c = (Collection c a) => Safe c a<br />
</haskell><br />
<br />
but as far as I know, this is not yet implemented.<br />
<br />
<br />
UNIFICATION<br />
...<br />
<br />
<br />
<br />
== Back to GADTs ==<br />
<br />
If you are wondering how all of these interesting type manipulations relate to<br />
GADTs, here is the answer. As you know, Haskell contains highly<br />
developed ways to express data-to-data functions. We also know that<br />
Haskell contains rich facilities to write type-to-type functions in the form of<br />
"type" statements and type classes. But how do "data" statements fit into this<br />
infrastructure?<br />
<br />
My answer: they just define a type-to-data constructor translation. Moreover,<br />
this translation may give multiple results. Say, the following definition:<br />
<br />
<haskell><br />
data Maybe a = Just a | Nothing<br />
</haskell><br />
<br />
defines type-to-data constructors function "Maybe" that has a parameter<br />
"a" and for each "a" has two possible results - "Just a" and<br />
"Nothing". We can rewrite it in the same hypothetical syntax that was<br />
used above for multi-value type functions:<br />
<br />
<haskell><br />
data Maybe a = Just a<br />
Maybe a = Nothing<br />
</haskell><br />
<br />
Or how about this:<br />
<br />
<haskell><br />
data List a = Cons a (List a)<br />
List a = Nil<br />
</haskell><br />
<br />
and this:<br />
<br />
<haskell><br />
data Either a b = Left a<br />
Either a b = Right b<br />
</haskell><br />
<br />
But how flexible are "data" definitions? As you should remember, "type"<br />
definitions were very limited in their features, while type classes,<br />
on the other hand, were more developed than ordinary Haskell functions<br />
facilities. What about features of "data" definitions examined as sort of functions?<br />
<br />
On the one hand, they supports multiple statements and multiple results and<br />
can be recursive, like the "List" definition above. On the other, that's all -<br />
no pattern matching or even type constants on the left side and no guards.<br />
<br />
Lack of pattern matching means that the left side can contain only free type<br />
variables. That in turn means that the left sides of all "data" statements for a<br />
type will be essentially the same. Therefore, repeated left sides in<br />
multi-statement "data" definitions are omitted and instead of<br />
<br />
<haskell><br />
data Either a b = Left a<br />
Either a b = Right b<br />
</haskell><br />
<br />
we write just<br />
<br />
<haskell><br />
data Either a b = Left a<br />
| Right b<br />
</haskell><br />
<br />
<br />
And here we finally come to GADTs! It's just a way to define data types using<br />
pattern matching and constants on the left side of "data" statements! How<br />
about this:<br />
<br />
<haskell><br />
data T String = D1 Int<br />
T Bool = D2<br />
T [a] = D3 (a,a)<br />
</haskell><br />
<br />
Amazed? After all, GADTs seem to be a really simple and obvious extension to<br />
data type definition facilities.<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
The idea is to allow a data constructor's return type to be specified<br />
directly:<br />
<br />
<haskell><br />
data Term a where<br />
Lit :: Int -> Term Int<br />
Pair :: Term a -> Term b -> Term (a,b)<br />
...<br />
</haskell><br />
<br />
In a function that performs pattern matching on Term, the pattern match gives<br />
type as well as value information. For example, consider this function:<br />
<br />
<haskell><br />
eval :: Term a -> a<br />
eval (Lit i) = i<br />
eval (Pair a b) = (eval a, eval b)<br />
...<br />
</haskell><br />
<br />
If the argument matches Lit, it must have been built with a Lit constructor,<br />
so type 'a' must be Int, and hence we can return 'i' (an Int) in the right<br />
hand side. The same thing applies to the Pair constructor.<br />
<br />
<br />
<br />
<br />
<br />
<br />
== Further reading ==<br />
<br />
The best paper on type level arithmetic using type classes I've seen<br />
is "Faking it: simulating dependent types in Haskell"<br />
( http://www.cs.nott.ac.uk/~ctm/faking.ps.gz ). Most of<br />
this article comes from his work.<br />
<br />
A great demonstration of type-level arithmetic is in the TypeNats package,<br />
which "defines type-level natural numbers and arithmetic operations on<br />
them including addition, subtraction, multiplication, division and GCD"<br />
( darcs get --partial --tag '0.1' http://www.eecs.tufts.edu/~rdocki01/typenats/ )<br />
<br />
I should also mention here Oleg Kiselyov's page on type-level<br />
programming in Haskell: http://okmij.org/ftp/Haskell/types.html<br />
<br />
<br />
There are plenty of GADT-related papers, but the best one for beginners<br />
is "Fun with phantom types"<br />
(http://www.informatik.uni-bonn.de/~ralf/publications/With.pdf).<br />
Phantom types is another name of GADT. You should also know that this<br />
paper uses old GADT syntax. This paper is a must-read because it<br />
contains numerous examples of practical GADT usage - a theme completely<br />
omitted from my article.<br />
<br />
<br />
Other GADT-related papers:<br />
<br />
"Dynamic Optimization for Functional Reactive Programming using<br />
Generalized Algebraic Data Types"<br />
http://www.cs.nott.ac.uk/~nhn/Publications/icfp2005.pdf<br />
<br />
"Phantom types" (actually more scientific version of "Fun with phantom types")<br />
http://citeseer.ist.psu.edu/rd/0,606209,1,0.25,Download/http:qSqqSqwww.informatik.uni-bonn.deqSq~ralfqSqpublicationsqSqPhantom.ps.gz<br />
<br />
"Phantom types and subtyping" http://arxiv.org/ps/cs.PL/0403034<br />
<br />
"Existentially quantified type classes" by Stuckey, Sulzmann and Wazny (URL?)<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
== Random rubbish from previous versions of article ==<br />
<br />
<br />
<haskell><br />
data family Map k :: * -> *<br />
<br />
data instance Map () v = MapUnit (Maybe v)<br />
data instance Map (a, b) v = MapPair (Map a (Map b v))<br />
</haskell><br />
<br />
<br />
<br />
<br />
<br />
<br />
let's consider well-known 'data' declarations:<br />
<br />
<haskell><br />
data T a = D a a Int<br />
</haskell><br />
<br />
it can be seen as function 'T' from type 'a' to some data constructor.<br />
<br />
'T Bool', for example, gives result 'D Bool Bool Int', while<br />
<br />
'T [Int]' gives result 'D [Int] [Int] Int'.<br />
<br />
'data' declaration can also have several "results", say<br />
<br />
<haskell><br />
data Either a b = Left a | Right b<br />
</haskell><br />
<br />
and "result" of 'Either Int String' can be either "Left Int" or "Right<br />
String"<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
Well, to give compiler confidence<br />
that 'a' can be deduced in just one way from 'c', we can add some<br />
form of hint:<br />
<br />
<haskell><br />
type Collection :: a c | c->a<br />
Collection a [a]<br />
Collection a (Set a)<br />
Collection a (Map b a)<br />
</haskell><br />
<br />
The first line i added tell the compiler that Collection predicate has<br />
two parameters and the second parameter determines the first. Based on<br />
this restriction, compiler can detect and prohibit attempts to define<br />
different element types for the same collection:<br />
<br />
<haskell><br />
type Collection :: a c | c->a<br />
Collection a (Map b a)<br />
Collection b (Map b a) -- error! prohibits functional dependency<br />
</haskell><br />
<br />
Of course, Collection is just a function from 'c' to 'a', but if we<br />
will define it directly as a function:<br />
<br />
<haskell><br />
type Collection [a] = a<br />
Collection (Set a) = a<br />
Collection (Map b a) = a<br />
</haskell><br />
<br />
- it can't be used in 'head' definition above. Moreover, using<br />
functional dependencies we can define bi-directional functions:<br />
<br />
<haskell><br />
type TwoTimesBigger :: a b | a->b, b->a<br />
TwoTimesBigger Int8 Int16<br />
TwoTimesBigger Int16 Int32<br />
TwoTimesBigger Int32 Int64<br />
</haskell><br />
<br />
or predicates with 3, 4 or more parameters with any relations between<br />
them. It's a great power!</div>Dr0b3rtshttps://wiki.haskell.org/index.php?title=Dynamic_programming_example&diff=42265Dynamic programming example2011-10-01T19:44:51Z<p>Dr0b3rts: Corrected bitmask of last example (0x8120 should be 0x80120)</p>
<hr />
<div>[[Category:Tutorials]] [[Category:Code]]<br />
<br />
Dynamic programming refers to translating a problem to be solved into a recurrence formula, and crunching this formula with the help of an array (or any suitable collection) to save useful intermediates and avoid redundant work.<br />
<br />
Computationally, dynamic programming boils down to write once, share and read many times. This is exactly what lazy functional programming is for.<br />
<br />
== Sample problems and solutions ==<br />
<br />
=== Available in 6-packs, 9-packs, 20-packs ===<br />
<br />
A fast food place sells a finger food in only boxes of 6 pieces, boxes of 9 pieces, or boxes of 20 pieces. You can only buy zero or more such boxes. Therefore it is impossible to buy exactly 5 pieces, or exactly 7 pieces, etc. Can you buy exactly N pieces?<br />
<br />
If I can buy i-6 pieces, or i-9 pieces, or i-20 pieces (provided these are not negative numbers), I can then buy i pieces (by adding a box of 6 or 9 or 20). Below, I set up the array <hask>r</hask> for exactly that, with <hask>r!0</hask> forced to <hask>True</hask> to bootstrap the whole thing.<br />
<br />
<haskell><br />
import Data.Array<br />
<br />
buyable n = r!n<br />
where r = listArray (0,n) (True : map f [1..n])<br />
f i = i >= 6 && r!(i-6) || i >= 9 && r!(i-9) || i >= 20 && r!(i-20)<br />
</haskell><br />
<br />
You certainly want to know how to buy N pieces, in addition to knowing whether it can be done. I now use the array to hold both kinds of information: <hask>r!i</hask> is <hask>Nothing</hask> if i pieces cannot be bought, or <hask>Just (x,y,z)</hask> if i pieces can be bought, and moreover it can be done by x boxes of 6, y boxes of 9, and z boxes of 20. Below the code for <hask>buy</hask> is more tedious (understandably) but is just a natural extension of the logic behind the code of <hask>buyable</hask>.<br />
<br />
<haskell><br />
import Data.Array<br />
<br />
buy n = r!n<br />
where r = listArray (0,n) (Just (0,0,0) : map f [1..n])<br />
f i = case attempt (i-6)<br />
of Just(x,y,z) -> Just(x+1,y,z)<br />
_ -> case attempt (i-9)<br />
of Just(x,y,z) -> Just(x,y+1,z)<br />
_ -> case attempt (i-20)<br />
of Just(x,y,z) -> Just(x,y,z+1)<br />
_ -> Nothing<br />
attempt x = if x>=0 then r!x else Nothing<br />
</haskell><br />
<br />
Optional: If you know monads and that <hask>Maybe</hask> is a monad, you can write it in a more regular way:<br />
<br />
<haskell><br />
import Data.Array<br />
import Control.Monad(guard,mplus)<br />
<br />
buy n = r!n<br />
where r = listArray (0,n) (Just (0,0,0) : map f [1..n])<br />
f i = do (x,y,z) <- attempt (i-6)<br />
return (x+1,y,z)<br />
`mplus`<br />
do (x,y,z) <- attempt (i-9)<br />
return (x,y+1,z)<br />
`mplus`<br />
do (x,y,z) <- attempt (i-20)<br />
return (x,y,z+1)<br />
attempt x = guard (x>=0) >> r!x<br />
</haskell><br />
This more regular code can be further generalized.<br />
<br />
== Optimization ==<br />
<br />
Simple dynamic programing is usually fast enough (and as always,<br />
profile before optimizing!) However, when you need more speed, it is<br />
usually fairly easy to shave an order of magnitude off the space usage<br />
of dynamic programming problems (with concomitant speedups due to<br />
cache effects.) The trick is to manually schedule the computation in<br />
order to discard temporary results as soon as possible.<br />
<br />
Notice that if we compute results in sequential order from 0 to the<br />
needed count, (in the example above) we will always have computed<br />
subproblems before the problems. Also, if we do it in this order we<br />
need not keep any value for longer than twenty values. So we can use<br />
the old fibonacci trick:<br />
<br />
<haskell><br />
buyable n = iter n (True : replicate 19 False)<br />
where iter 0 lst = lst !! 0<br />
iter n lst = iter (n-1) ((lst !! 5 || lst !! 8 || lst !! 19) : take 19 lst)<br />
</haskell><br />
<br />
At each call of iter, the n parameter contains (total - cur) and the<br />
lst parameter stores buyable for (cur-1, cur-2, cur-3, ...). Also<br />
note that the indexes change meaning through the cons, so we need to<br />
offset the !! indexes by 1. <br />
<br />
We can improve this more by packing the bit array:<br />
<br />
<haskell><br />
import Data.Bits<br />
<br />
buyable n = iter n 1<br />
where iter :: Int -> Int -> Bool<br />
iter 0 lst = odd lst<br />
iter n lst = iter (n-1) ((lst `shiftL` 1) .|.<br />
if lst .&. 0x80120 /= 0 then 1 else 0)<br />
</haskell><br />
<br />
This final version is compiled into a single allocation-free loop.</div>Dr0b3rts