Free structure
From HaskellWiki
Latest revision as of 17:49, 3 March 2010
1 Introduction
This article attempts to give a relatively informal understanding of "free" structures from algebra/category theory, with pointers to some of the formal material for those who desire it. The later sections make use of some notions from category theory, so some familiarity with its basics will be useful.
2 Algebra
2.1 What sort of structures are we talking about?
The distinction between free structures and other, non-free structures originates in abstract algebra, so that provides a good place to start. Some common structures considered in algebra are:
 Monoids
  consisting of
   A set <math>M</math>
   An identity <math>e \in M</math>
   A binary operation <math>* : M \times M \to M</math>
  and satisfying the equations
   <math>x * (y * z) = (x * y) * z</math>
   <math>e * x = x = x * e</math>
 Groups
  consisting of
   A monoid <math>(M, e, *)</math>
   An additional unary operation <math>\,^{-1} : M \to M</math>
  satisfying
   <math>x * x^{-1} = e = x^{-1} * x</math>
 Rings
  consisting of
   A set <math>R</math>
   A unary operation <math>- : R \to R</math>
   Two binary operations <math>+, * : R \times R \to R</math>
   Distinguished elements <math>0, 1 \in R</math>
  such that
   <math>(R, 0, +, -)</math> is a group
   <math>(R, 1, *)</math> is a monoid
   <math>x + y = y + x</math>
   <math>(x + y) * z = x * z + y * z</math>
   <math>x * (y + z) = x * y + x * z</math>
So, for algebraic structures, we have sets equipped with operations that are expected to satisfy equational laws.
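As a rough illustration of "a set equipped with operations that satisfy equational laws", the following sketch presents a monoid as explicit data and checks the laws on a finite sample. The names MonoidOn, unitOf, op and satisfiesMonoidLaws are made up for this example and do not come from any library; checking on a sample is of course only evidence, not a proof.

```haskell
-- A monoid presented as explicit data: a unit and a binary operation
-- on a carrier type. (All names here are illustrative.)
data MonoidOn a = MonoidOn { unitOf :: a, op :: a -> a -> a }

-- Check associativity and both unit laws on a finite list of samples.
satisfiesMonoidLaws :: Eq a => MonoidOn a -> [a] -> Bool
satisfiesMonoidLaws (MonoidOn e (.*.)) xs =
  and [ x .*. (y .*. z) == (x .*. y) .*. z | x <- xs, y <- xs, z <- xs ]
    && and [ e .*. x == x && x .*. e == x | x <- xs ]

-- Integers modulo 3 under addition form a monoid.
addMod3 :: MonoidOn Int
addMod3 = MonoidOn 0 (\x y -> (x + y) `mod` 3)

-- Subtraction with 0 as "unit" does not: 0 - x is not x in general.
subInt :: MonoidOn Int
subInt = MonoidOn 0 (-)
```

Under these definitions, the check succeeds for addMod3 on the samples [0, 1, 2] and fails for subInt on the same samples.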
2.2 Free algebraic structures
Now, given such a description, we can talk about the free structure over a particular set S (or, possibly over some other underlying structure; but we'll stick with sets now). What this means is that given S, we want to find some set M, together with appropriate operations to make M the structure in question, along with the following two criteria:
 There is an embedding <math>i : S \to M</math>
 The structure generated is as 'simple' as possible.
  <math>M</math> should contain only elements that are required to exist by <math>i</math> and the operations of the structure.
  The only equational laws that should hold for the generated structure are those that are required to hold by the equational laws for structures of that type.
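So, in the case of a free monoid (from here on out, we'll assume that the structure in question is a monoid, since it's simplest), the equation <math>x * y = y * x</math> should not hold in general: it holds only where the monoid laws force it, for instance when <math>x = y</math>, <math>x = e</math> or <math>y = e</math> (or, more generally, when <math>x</math> and <math>y</math> are powers of a common element). Further <math>i x \in M</math>, for all <math>x \in S</math>, and <math>e \in M</math>, and <math>\forall x, y \in M.\,\, x * y \in M</math> (and these should all be distinct, except as required by the monoid laws), but there should be no 'extra' elements of <math>M</math> in addition to those.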
For monoids, the free structure over a set is given by the monoid of lists of elements of that set, with concatenation as multiplication. It should be easy to convince yourself of the following (in pseudo-Haskell):
<haskell>
M = [S]
e = []
* = (++)

i : S -> [S]
i x = [x] -- i x = x : []

[] ++ xs = xs = xs ++ []
xs ++ (ys ++ zs) = (xs ++ ys) ++ zs

xs ++ ys = ys ++ xs iff xs == ys || xs == [] || ys == [] -- etc.
</haskell>
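The freeness criteria can also be poked at in runnable Haskell. The sketch below is only an illustration on sample values: rebuild and commute are hypothetical helper names, and i is the embedding described above.

```haskell
-- The embedding of generators into the free monoid of lists.
i :: s -> [s]
i x = [x]

-- No 'extra' elements: any list can be rebuilt from the unit [],
-- the operation (++), and embedded generators alone.
rebuild :: [s] -> [s]
rebuild = foldr (\x acc -> i x ++ acc) []

-- Do two elements of the free monoid commute?
commute :: Eq s => [s] -> [s] -> Bool
commute xs ys = xs ++ ys == ys ++ xs
```

With these definitions, rebuild returns its argument unchanged, distinct generators such as i 'a' and i 'b' do not commute, and the unit commutes with everything. Words that are powers of a common word, such as "ab" and "abab", also commute, but only because associativity forces it.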
3 The category connection
3.1 Free structure functors
One possible objection to the above description (even a more formal version thereof) is that the characterization of "simple" is somewhat vague. Category theory gives a somewhat better solution. Generally, structures-over-a-set will form a category, with arrows being structure-preserving homomorphisms. "Simplest" (in the sense we want) structures in that category will then either be initial or terminal, [1] and thus, freeness can be defined in terms of such universal constructions.
In its full categorical generality, freeness isn't necessarily characterized by underlying set structure, either. Instead, one looks at "forgetful" functors [2] from the category of structures to some other category. For our free monoids above, it'd be:
 <math>U : Mon \to Set</math>
The functor taking monoids <math>(M, e, *)</math> to their underlying set <math>M</math>. Then, the relevant universal property is given by finding an adjoint functor:
 <math>F : Set \to Mon</math>, <math>F \dashv U</math>
F being the functor taking sets to the free monoids over those sets. So, free structure functors are left adjoint to forgetful functors. It turns out this categorical presentation also has a dual: cofree structure functors are right adjoint to forgetful functors.
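For lists, the adjunction amounts to a one-to-one correspondence between monoid homomorphisms out of [s] and plain functions out of s. The following is a minimal sketch of the two directions; extend and restrict are names chosen for this example (extend is just foldMap specialised to lists), and wordLength is an arbitrary sample function.

```haskell
import Data.Monoid (Sum (..))

-- From a function on generators to a monoid homomorphism on lists.
extend :: Monoid m => (s -> m) -> ([s] -> m)
extend f = mconcat . map f

-- From a homomorphism on lists back to a function on generators,
-- by precomposing with the embedding x |-> [x].
restrict :: ([s] -> m) -> (s -> m)
restrict h = h . (: [])

-- A sample function into the monoid of integers under addition.
wordLength :: String -> Sum Int
wordLength = Sum . length
```

On sample inputs one can observe that restrict (extend f) agrees with f, that extend f sends [] to mempty and (++) to (<>), and that extend (restrict h) agrees with h when h is itself a homomorphism (for instance h = Sum . length).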
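3.2 Algebraic constructions in a category
Category theory also provides a way to extend specifications of algebraic structures to more general categories, which can allow us to extend the above informal understanding to new contexts. For instance, one can talk about monoid objects in an arbitrary monoidal category. Such categories have a tensor product <math>\otimes</math> of objects, with a unit object <math>I</math> (both of which satisfy various laws).
A monoid object in a monoidal category is then:
 An object <math>M</math>
 A unit 'element' <math>\eta : I \to M</math>
 A multiplication <math>\mu : M \otimes M \to M</math>
such that:
 <math>\mu \circ (id_M \otimes \eta) = \rho</math>
 <math>\mu \circ (\eta \otimes id_M) = \lambda</math>
 <math>\mu \circ (id_M \otimes \mu) = \mu \circ (\mu \otimes id_M) \circ \alpha</math>
Where:
 <math>\lambda : I \otimes M \to M</math> and <math>\rho : M \otimes I \to M</math> are the identity isomorphisms for the monoidal category, and
 <math>\alpha : M \otimes (M \otimes M) \to (M \otimes M) \otimes M</math> is part of the associativity isomorphism of the category.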
So, hopefully the connection is clear: we've generalized the carrier set to a carrier object, generalized the operations to morphisms in a category, and equational laws are promoted to being equations about composition of morphisms.
3.3 Monads
One example of a class of monoid objects happens to be monads. Given a base category <math>C</math>, we have the monoidal category <math>C^C</math>:
 Objects are endofunctors <math>F : C \to C</math>
 Morphisms are natural transformations [3] between the functors
 The tensor product is composition: <math>F \otimes G = F \circ G</math>
 The identity object is the identity functor, <math>I</math>, taking objects and morphisms to themselves
If we then specialize the definition of a monoid object to this situation, we get:
 An endofunctor <math>M : C \to C</math>
 A natural transformation <math>\eta : I \to M</math>
 A natural transformation <math>\mu : M \circ M \to M</math>
which satisfy laws that turn out to be the standard monad laws. So, monads turn out to be monoid objects in the category of endofunctors.
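In Haskell terms, the unit is return and the multiplication is join. The sketch below spells out the monoid-object laws for a Haskell monad and checks them pointwise at sample values; eta, mu, leftUnit, rightUnit and assoc are names chosen for this example.

```haskell
import Control.Monad (join)

-- The unit, a natural transformation from the identity functor to m.
eta :: Monad m => a -> m a
eta = return

-- The multiplication, a natural transformation from m composed with m to m.
mu :: Monad m => m (m a) -> m a
mu = join

-- Unit laws: wrapping with eta on either side and multiplying is the identity.
leftUnit, rightUnit :: (Monad m, Eq (m a)) => m a -> Bool
leftUnit  x = mu (eta x) == x
rightUnit x = mu (fmap eta x) == x

-- Associativity: the two ways of flattening three layers agree.
assoc :: (Monad m, Eq (m a)) => m (m (m a)) -> Bool
assoc x = mu (mu x) == mu (fmap mu x)
```

These checks hold for any lawful monad; trying them on lists and Maybe values is a quick sanity test, not a proof.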
3.4 Free Monads
But, what about our intuitive understanding of free monoids above? We wanted to promote an underlying set, but we have switched from sets to functors. So, presumably, a free monad is generated by an underlying (endo)functor, <math>F : C \to C</math>. We then expect there to be a natural transformation <math>i : F \to M</math>, 'injecting' the functor into the monad.
In Haskell, we can write the type of free monads over Haskell endofunctors as follows:
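<haskell>
data Free f a = Return a | Roll (f (Free f a))

-- Since GHC 7.10, a Monad instance also requires Functor and Applicative instances.
instance Functor f => Functor (Free f) where
  fmap g (Return a) = Return (g a)
  fmap g (Roll ffa) = Roll (fmap (fmap g) ffa)

instance Functor f => Applicative (Free f) where
  pure = Return
  Return g <*> x = fmap g x
  Roll fg  <*> x = Roll (fmap (<*> x) fg)

instance Functor f => Monad (Free f) where
  Return a >>= f = f a
  Roll ffa >>= f = Roll $ fmap (>>= f) ffa

  -- join (Return fa) = fa
  -- join (Roll ffa)  = Roll (fmap join ffa)

inj :: Functor f => f a -> Free f a
inj fa = Roll $ fmap Return fa
</haskell>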
This should bear some resemblance to free monoids over lists. Return is analogous to [], and Roll is analogous to (:). Lists let us create arbitrary length strings of elements from some set, while Free f lets us create structures involving f composed with itself an arbitrary number of times (recall, functor composition was the tensor product of our category). Return gives our type a way to handle the 0-ary composition of f (as [] is the 0-length string), while Roll is the way to extend the nesting level by one (just as (:) lets us create (n+1)-length strings out of n-length ones). Finally, both injections are built in a similar way:
<haskell>
inj_list x  = (:) x []
inj_free fx = Roll (fmap Return fx)
</haskell>
This, of course, is not completely rigorous, but it is a nice extension of the informal reasoning we started with.
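The analogy can be pushed one step further: just as a function on generators extends to a monoid homomorphism out of lists, a natural transformation from f into a monad m extends to a map out of Free f. The sketch below repeats the Free definition so that it stands alone; interpret, Say, sayToLog and program are names invented for this example, and the target monad is the pair (writer-like) monad from base.

```haskell
{-# LANGUAGE RankNTypes #-}

data Free f a = Return a | Roll (f (Free f a))

inj :: Functor f => f a -> Free f a
inj fa = Roll (fmap Return fa)

-- Extend a natural transformation (forall x. f x -> m x) to all of Free f.
interpret :: Monad m => (forall x. f x -> m x) -> Free f a -> m a
interpret _   (Return a) = return a
interpret phi (Roll ffa) = phi ffa >>= interpret phi

-- A toy generating functor: emit a message, then continue.
data Say x = Say String x

instance Functor Say where
  fmap g (Say s x) = Say s (g x)

-- Interpret one layer into the monad ((,) [String]), logging the message.
sayToLog :: Say x -> ([String], x)
sayToLog (Say s x) = ([s], x)

-- Two nested layers of Say around a final result.
program :: Free Say Int
program = Roll (Say "hello" (Roll (Say "world" (Return 42))))
```

Here interpret sayToLog program collects both messages in order alongside the result 42, and interpreting inj fx gives the same answer as applying sayToLog to fx directly, mirroring the free monoid case.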
4 Further reading
For those looking for an introduction to the necessary category theory used above, Steve Awodey's Category Theory is a popular, freely available reference.
5 Notes
5.1 Universal constructions
Initial (final) objects are those that have a single unique arrow from (to) the object to (from) every other object in the category. For instance, the empty set is initial in the category of sets, and any one-element set is final. Initial objects play an important role in the semantics of algebraic datatypes. For a datatype like:
<haskell>
data T = C1 A B C | C2 D E T
</haskell>
we consider the following:
 A functor <math>F : Set \to Set</math>, <math>F X = A \times B \times C + D \times E \times X</math>
 F-algebras which are:
  An object <math>X \in Set</math>
  An action <math>a : F X \to X</math>
 Algebra homomorphisms <math>(X, a) \to (Y, b)</math>
  These are given by <math>h : X \to Y</math> such that <math>h \circ a = b \circ F h</math>
The datatype T is then given by an initial F-algebra. This works out nicely because the unique algebra homomorphism whose existence is guaranteed by initiality is the fold or 'catamorphism' for the datatype.
Intuitively, though, the fact that T is an F-algebra means that it is in some sense closed under forming terms of shape F. Suppose we took the simpler signature <math>F X = 1 + X</math> of the natural numbers; then both Z = inl () and S x = inr x can be incorporated into Nat. However, there are potentially many algebras; for instance, the naturals modulo some finite number, with successor modulo that number, are an algebra for the natural signature.
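A small sketch of this setup in Haskell, assuming the usual fixed-point encoding: NatF is the signature functor, Fix NatF plays the role of the initial algebra, and cata is the unique homomorphism into any other algebra. The names NatF, Fix, cata, modAlg and fromInt are chosen for this example.

```haskell
-- The signature functor F X = 1 + X.
data NatF x = ZeroF | SuccF x

instance Functor NatF where
  fmap _ ZeroF     = ZeroF
  fmap g (SuccF x) = SuccF (g x)

-- The fixed point of a functor; Fix NatF represents the naturals.
newtype Fix f = In (f (Fix f))

type Nat = Fix NatF

-- The fold (catamorphism): the homomorphism out of the initial
-- algebra determined by a target algebra.
cata :: Functor f => (f a -> a) -> Fix f -> a
cata alg (In t) = alg (fmap (cata alg) t)

-- The naturals modulo n as an algebra for the same signature.
modAlg :: Int -> NatF Int -> Int
modAlg _ ZeroF     = 0
modAlg n (SuccF k) = (k + 1) `mod` n

-- Build the numeral S (S ... Z) with k successors (Z for k <= 0).
fromInt :: Int -> Nat
fromInt k
  | k <= 0    = In ZeroF
  | otherwise = In (SuccF (fromInt (k - 1)))
```

Folding the same numeral with modAlg 2 and with modAlg 3 sends it to its residue modulo 2 and modulo 3 respectively, which illustrates that the initial algebra maps into each of the modular algebras.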
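However, initiality constrains what Nat can be. Consider, for instance, the naturals modulo 2 and modulo 3, written 2 and 3 below. There can be no homomorphism <math>h : 2 \to 3</math>. Since <math>h</math> must preserve Z, we need <math>h(0) = 0</math>, and then every possible value of <math>h(1)</math> fails to commute with S:
 <math>h(1) = 0</math>: <math>S(h(0)) = S 0 = 1</math>, but <math>h(S 0) = h(1) = 0</math>
 <math>h(1) = 1</math>: <math>S(h(1)) = S 1 = 2</math>, but <math>h(S 1) = h(0) = 0</math>
 <math>h(1) = 2</math>: <math>S(h(0)) = S 0 = 1</math>, but <math>h(S 0) = h(1) = 2</math>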
This is caused by these algebras identifying elements in incompatible ways (2 makes SSZ = Z, but 3 doesn't, and 3 makes SSSZ = Z, but 2 doesn't). So, the values of an initial algebra must be compatible with any such identification scheme, and this is accomplished by identifying none of the terms in the initial algebra (so that h is free to send each term to an appropriate value in the target, according to the identifications there). A similar phenomenon occurs in the main section of this article, except that the structures in question have additional equational laws that terms must satisfy, so the initial structure is allowed to identify those, but no more than those.
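Because the modular algebras are finite, the absence of a homomorphism from the naturals modulo 2 to the naturals modulo 3 can be confirmed by brute force. In the sketch below a candidate function is represented by its table of values; functions, isHom and homs are names chosen for this example.

```haskell
-- All functions from {0..m-1} to {0..n-1}, each given as a table of values.
functions :: Int -> Int -> [[Int]]
functions m n = mapM (const [0 .. n - 1]) [0 .. m - 1]

-- Does the table h preserve Z (send 0 to 0) and commute with S,
-- where S is successor modulo m in the source and modulo n in the target?
isHom :: Int -> Int -> [Int] -> Bool
isHom m n h =
  h !! 0 == 0
    && and [ h !! ((x + 1) `mod` m) == (h !! x + 1) `mod` n | x <- [0 .. m - 1] ]

-- All algebra homomorphisms from the naturals mod m to the naturals mod n.
homs :: Int -> Int -> [[Int]]
homs m n = filter (isHom m n) (functions m n)
```

With these definitions homs 2 3 and homs 3 2 are both empty, while homs 6 3 contains exactly one table, the one sending x to x modulo 3. The last case works because every identification made modulo 6 is also made modulo 3.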
By the same argument, we can determine that 3 is not a final algebra. Nor are the naturals (for any modular set M, <math>S(h(M - 1)) = S(M - 1) = M</math>, but <math>h(S(M - 1)) = h(0) = 0</math>). The final algebra is the set {0}, with <math>S 0 = 0</math> and <math>Z = 0</math>, with unique homomorphism <math>h x = 0</math>. This can be seen as identifying as many elements as possible, rather than as few. Naturally, final algebras don't receive that much interest. However, finality is an important property of coalgebras.
5.2 Forgetful functors
The term "forgetful functor" has no formal specification; only an intuitive one. The idea is that one starts in some category of structures, and then defines a functor by forgetting part or all of what defines those structures. For instance:
 <math>U : Str \to Set</math>, where <math>Str</math> is any category of algebraic structures, and U simply forgets about all of the n-ary operations and equational laws, and takes structures to their underlying sets, and homomorphisms to functions over those sets.
 <math>U : Grp \to Mon</math>, which takes a group and forgets about the inverse operation to give a monoid. This functor would then be related to "free groups over a monoid".
5.3 Natural transformations
The wiki article gives a formal definition of natural transformations, but a Haskell programmer can think of a natural transformation between functors F and G as:
<haskell>
trans :: forall a. F a -> G a
</haskell>
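As a concrete instance, here is a sketch with F taken to be the list functor and G taken to be Maybe. The function safeHead and the helper naturalAt are names chosen for this example; naturalAt checks the naturality condition at one function and one list.

```haskell
-- A natural transformation from the list functor to Maybe.
safeHead :: [a] -> Maybe a
safeHead []      = Nothing
safeHead (x : _) = Just x

-- Naturality at g and xs: mapping g before or after safeHead agrees.
naturalAt :: Eq b => (a -> b) -> [a] -> Bool
naturalAt g xs = fmap g (safeHead xs) == safeHead (fmap g xs)
```

For polymorphic functions of this type the naturality condition holds automatically by parametricity, so the check is only there to make the condition visible.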