User:Michiexile/MATH198/Lecture 4

From HaskellWiki



1 Product

Recall the construction of a cartesian product of two sets: A\times B=\{(a,b) : a\in A, b\in B\}. We have functions p_A:A\times B\to A and p_B:A\times B\to B extracting the two components from the product, and we can take any two functions f:A\to A' and g:B\to B' and combine them into a function f\times g:A\times B\to A'\times B'.

Similarly, we can form the type of pairs of Haskell types:
type Pair s t = (s,t)
For the pair type, we have canonical functions
fst :: (s,t) -> s
snd :: (s,t) -> t
extracting the components. And given two functions
f :: s -> s'
g :: t -> t'
there is a function (using the combinator (***) from Control.Arrow)
f *** g :: (s,t) -> (s',t')
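
As a minimal sketch of the pieces above, the following uses fst and snd from the Prelude and (***) from Control.Arrow. The names example and mapped, and the concrete types, are chosen only for illustration.

```haskell
import Control.Arrow ((***))

-- A type synonym for pairs, as in the text.
type Pair s t = (s, t)

-- An illustrative pair (hypothetical example value).
example :: Pair Int String
example = (3, "abc")

-- f *** g applies f to the first component and g to the second.
-- Here f = (+ 1) and g = length, so mapped is (4, 3).
mapped :: Pair Int Int
mapped = ((+ 1) *** length) example
```

The projections fst example and snd example give back 3 and "abc" respectively.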

An element of the product is completely determined by the two components it contains. Hence, if we have a pair of generalized elements q_1:V\to A and q_2:V\to B, we can find a unique generalized element q:V\to A\times B such that composing q with the projection maps gives back the original elements.

This argument indicates to us a possible definition that avoids talking about elements in sets in the first place, and we are led to the

Definition A product of two objects A,B in a category C is an object A\times B equipped with maps A \leftarrow^{p_1} A\times B\rightarrow^{p_2} B such that for any other object V with maps A \leftarrow^{q_1} V \rightarrow^{q_2} B, there is a unique map q:V\to A\times B such that p_1 q = q_1 and p_2 q = q_2, that is, such that the diagram formed by these maps commutes.

The diagram A \leftarrow^{p_1} A\times B\rightarrow^{p_2} B is called a product cone if it is a diagram of a product with the projection maps from its definition.

In the category of sets, the unique map is given by q(v) = (q_1(v),q_2(v)). In the Haskell category, it is given by the combinator (from Control.Arrow, with its type specialized here to functions)
(&&&) :: (a -> b) -> (a -> c) -> a -> (b,c)
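
A small sketch of the product property in Haskell follows. The maps q1 and q2 and the choice V = Int are hypothetical, and the check is only pointwise on whatever inputs it is given, so it illustrates the commuting diagram rather than proving uniqueness.

```haskell
import Control.Arrow ((&&&))

-- Two hypothetical maps out of a common source V = Int.
q1 :: Int -> Bool
q1 = even

q2 :: Int -> String
q2 = show

-- The mediating map into the product: q v = (q1 v, q2 v).
q :: Int -> (Bool, String)
q = q1 &&& q2

-- Composing q with the projections recovers the original maps
-- (checked at a single point v).
recoversQ1, recoversQ2 :: Int -> Bool
recoversQ1 v = (fst . q) v == q1 v
recoversQ2 v = (snd . q) v == q2 v
```

For instance, all recoversQ1 [-3 .. 3] and all recoversQ2 [-3 .. 3] both hold.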

We tend to talk about the product. The justification for this lies in the first interesting

Proposition If P and P' are both products for A,B, then they are isomorphic.

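Proof Consider the diagram consisting of the two product cones over A and B, one with apex P and one with apex P', together with vertical arrows P\to P' and P'\to P between the apexes.

Both vertical arrows are given by the product property of the two product cones involved. Their compositions are endo-arrows of P and P', such that in each case we get a diagram like the one in the definition of the product,

with V=A\times B=P (or P'), and q_1 = p_1, q_2 = p_2. By the product property, there is only one endo-arrow that makes the diagram commute. But both the composition of the two arrows and the identity arrow make the diagram commute. Therefore, the composition has to be the identity, and the two vertical arrows are mutually inverse isomorphisms. QED.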

2 Coproduct

The other thing you can do in a Haskell data type declaration looks like this:

data Coproduct a b = A a | B b
and the corresponding library type is
data Either a b = Left a | Right b

This type provides us with functions

A :: a -> Coproduct a b
B :: b -> Coproduct a b

and hence looks quite like a dual to the product construction, in that the functions the type guarantees point in the reverse direction from the product projection arrows.

So, maybe what we want to do is to simply dualize the entire definition?

Definition Let C be a category. The coproduct of two objects A,B is an object A + B equipped with maps i_1:A\to A+B and i_2:B\to A+B such that any other object V with maps A\rightarrow_{v_1} V \leftarrow_{v_2} B has a unique map v:A+B\to V such that v_1 = v i_1 and v_2 = v i_2.

In the Haskell case, the maps i_1,i_2 are the data constructors A,B. And indeed, this Coproduct, the union type construction, is the type which guarantees inclusion of the source types, but with minimal additional assumptions on the type.
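
To illustrate the coproduct property, here is a minimal sketch using the library type Either, with Left and Right playing the roles of A and B. The maps v1 and v2 and the target type String are hypothetical; either is the Prelude function that builds the mediating map.

```haskell
-- Two hypothetical maps into a common target V = String.
v1 :: Int -> String
v1 n = "number " ++ show n

v2 :: Bool -> String
v2 b = "flag " ++ show b

-- The mediating map out of the coproduct.
-- It satisfies v . Left = v1 and v . Right = v2.
v :: Either Int Bool -> String
v = either v1 v2
```

Note how this mirrors (&&&) for products: there two maps out of V are combined into one map into the product, here two maps into V are combined into one map out of the coproduct.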

In the category of sets, the coproduct construction is one where we can embed both sets into the coproduct, faithfully, and the result has no additional structure beyond that. Thus, the coproduct in Set is the disjoint union of the included sets: both sets are included without any identifications made, and no extra elements are introduced.

Proposition If C,C' are both coproducts for some A,B, then they are isomorphic.

The proof is almost exactly the same as the proof for the product case.

  • Diagram definition
  • Disjoint union in Set
  • Coproduct of categories construction
  • Union types

3 Algebra of datatypes

Recall from Lecture 3 that we can consider endofunctors as container datatypes. Some of the more obvious such container datatypes include:

data 1 a = Empty
data T a = T a

These are, respectively, the data type that has only one single element and the data type that contains exactly one value. (The name 1 is notation here; it is not a legal Haskell type name.)

Using these, we can generate a whole slew of further datatypes. First off, we can generate a data type with any finite number of elements by n = 1 + 1 + \dots + 1 (n times). Remember that the coproduct construction for data types allows us to know which summand of the coproduct a given part is in, so the single elements in all the 1s in the definition of n here are all distinguishable, thus giving the final type the required number of elements. Of note among these is the data type Bool = 2, the Boolean data type, characterized by having exactly two elements.
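
As a sketch of the case 2 = 1 + 1, one can encode the one-element type as the unit type () and the sum as Either. This encoding, and the names Two, toBool and fromBool, are assumptions made for illustration.

```haskell
-- 2 = 1 + 1, with () standing in for the one-element type.
type Two = Either () ()

-- The two summands stay distinguishable, so Two has exactly two elements.
toBool :: Two -> Bool
toBool (Left ())  = False
toBool (Right ()) = True

fromBool :: Bool -> Two
fromBool False = Left ()
fromBool True  = Right ()
```

The two functions are mutually inverse, which is what the equation Bool = 2 expresses.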

Furthermore, we can note that 1\times T = T, with the isomorphism given by the maps

f (Empty, T x) = T x
g (T x) = (Empty, T x)
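
Since 1 cannot be used as a type name in real Haskell, a runnable version of this isomorphism needs a stand-in. The sketch below uses the hypothetical name One for it and keeps everything else as in the text.

```haskell
-- Valid Haskell stand-ins for the types written 1 and T in the text.
-- The name One is a choice made for this sketch.
data One a = Empty deriving (Eq, Show)
data T a = T a deriving (Eq, Show)

-- 1 × T -> T: forget the trivial component.
f :: (One a, T a) -> T a
f (Empty, T x) = T x

-- T -> 1 × T: pair the value with the only element of 1.
g :: T a -> (One a, T a)
g (T x) = (Empty, T x)
```

Both f . g and g . f act as the identity, for example f (g (T 'x')) is T 'x'.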

Thus we have the capacity to add and multiply types with each other. We can verify, for any types A,B,C, that A\times(B+C) = A\times B + A\times C, where equality is again understood as isomorphism of types.
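
The distributive law can be checked by writing down the two directions of the isomorphism, with pairs as products and Either as sums. The function names distribute and factor are made up for this sketch.

```haskell
-- A × (B + C) -> A × B + A × C
distribute :: (a, Either b c) -> Either (a, b) (a, c)
distribute (x, Left y)  = Left (x, y)
distribute (x, Right z) = Right (x, z)

-- A × B + A × C -> A × (B + C)
factor :: Either (a, b) (a, c) -> (a, Either b c)
factor (Left (x, y))  = (x, Left y)
factor (Right (x, z)) = (x, Right z)
```

Each function simply moves the tag Left or Right inside or outside the pair, so composing them in either order gives the identity.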

We can thus make sense of types like T^3 + 2T^2 (either a triple of single values, or one out of two tagged pairs of single values).

This allows us to start working out a calculus of data types with versatile expressive power. We can produce recursive data type definitions by using equations to define data types, which then allow a direct translation back into Haskell data type definitions, such as:

List = 1 + T\times List
BinaryTree = T\times (1+BinaryTree\times BinaryTree)
TernaryTree = T\times (1+TernaryTree\times TernaryTree\times TernaryTree)
GenericTree = T\times (1+List\circ GenericTree)
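
As a sketch of the translation for the first equation only, List = 1 + T\times List can be written as follows. The constructor names Nil and Cons and the conversion functions are choices made here; the tree equations translate along the same lines.

```haskell
-- List = 1 + T × List: either nothing (the summand 1),
-- or a value together with another list (the summand T × List).
data List a = Nil | Cons a (List a) deriving (Eq, Show)

-- Conversions to and from the built-in list type,
-- showing that the two describe the same data.
toBuiltin :: List a -> [a]
toBuiltin Nil         = []
toBuiltin (Cons x xs) = x : toBuiltin xs

fromBuiltin :: [a] -> List a
fromBuiltin = foldr Cons Nil
```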

The real power of this way of rewriting types comes in the recognition that we can use algebraic methods to reason about our data types. For instance:

List = 1 + T * List 
     = 1 + T * (1 + T * List) 
     = 1 + T * 1 + T * T* List 
     = 1 + T + T * T * List

so a list is either empty, contains one element, or contains at least two elements. Using ideas from the theory of power series, or from continued fractions, we can go further and analyze data types through intermediate steps that seem completely bizarre, yet arrive at important properties. Again, an easy example for illustration:

List = 1 + T * List    -- and thus
List - T * List = 1    -- even though (-) doesn't make sense for data types
(1 - T) * List = 1     -- still ignoring that (-)...
List = 1 / (1 - T)     -- even though (/) doesn't make sense for data types
     = 1 + T + T*T + T*T*T + ...  -- by the geometric series identity

and hence we can conclude, using purely formal algebraic steps in between, that a list by the given definition consists of either an empty list, a single value, a pair of values, three values, etc.
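
The finite expansion List = 1 + T + T * T * List derived earlier can be made concrete as a pair of mutually inverse functions on the built-in list type. The encoding of the right-hand side with Either, () and a triple, and the names Unrolled, unroll and roll, are assumptions of this sketch.

```haskell
-- 1 + T + T * T * List, encoded with Either, () and a triple.
type Unrolled a = Either () (Either a (a, a, [a]))

-- Split a list into its three possible shapes:
-- empty, exactly one element, or at least two elements.
unroll :: [a] -> Unrolled a
unroll []           = Left ()
unroll [x]          = Right (Left x)
unroll (x : y : zs) = Right (Right (x, y, zs))

-- Reassemble the list from its unrolled form.
roll :: Unrolled a -> [a]
roll (Left ())                  = []
roll (Right (Left x))           = [x]
roll (Right (Right (x, y, zs))) = x : y : zs
```

Repeating the unrolling step on the remaining tail is what the infinite sum 1 + T + T*T + T*T*T + ... describes: a list is classified by its length.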

At this point, I'd recommend that anyone interested in more perspectives on this approach to data types, and on things one may do with them, read the following references:

3.1 Blog posts

3.2 Research papers

  • d for data types
  • 7 trees into 1

4 Homework

  1. What are the products in the category C(P) of a poset P? What are the coproducts?
  2. Prove that any two coproducts are isomorphic.
  3. Write down the type declaration for at least two of the example data types from the section of the algebra of datatypes, and write a
    implementation for each.