Category Theory and Functional Programming @ Stanford: Lecture 1




This page now includes additional information based on the notes taken in class. Hopefully this will make the notes reasonably complete for everybody.


Welcome

The original author is Mikael Vejdemo-Johansson. He can be found in his office, 383-BB, during office hours, or reached by email at mik@math.stanford.edu.

Introduction

Why this course?

Introductions to Haskell usually include hints that Category Theory is a useful tool, though they rarely do much more than mention it. The goal of this course is to bridge that gap, and to provide an introduction to Category Theory that ties in with Haskell and functional programming as a source of examples and applications.

What does it include?

The definition of categories, special objects and morphisms, functors, natural transformations, (co-)limits and specific cases of these, adjunctions, freeness and presentations as categorical constructions, monads and Kleisli arrows, recursion with categorical constructions. Maybe, just maybe, if we have enough time, we'll finish with looking at the definition of a topos, and how this encodes logic internal to a category. Applications to fuzzy sets.

What do we need?

The examples will be drawn from discrete mathematics, logic, Haskell programming and linear algebra. I expect the following concepts to be at least vaguely familiar to anyone taking this course:

  • Sets
  • Functions
  • Permutations
  • Groups
  • Partially ordered sets
  • Vector spaces
  • Linear maps
  • Matrices
  • Homomorphisms

Good references

In the mathematics/CS library (Stanford) there are:

  • Mac Lane: Categories for the working mathematician
  • Awodey: Category Theory

Monoids

To fix notation and make sure everybody has seen the definition:

Definition A monoid is a set M equipped with an associative binary operation * (in Haskell: mappend) and an identity element ∅ (in Haskell: mempty).

A semigroup is a monoid without the requirement of an identity element.


A function f: M → N is a monoid homomorphism if the following conditions hold:

  • f(∅) = ∅
  • f(m*m') = f(m)*f(m')
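
A quick Haskell illustration (just a sketch; the names listLengthHom and checkHom are made up for this example): the length function, suitably packaged, is a monoid homomorphism from the list monoid ([a], ++, []) to the additive monoid of integers.

  import Data.Monoid (Sum(..))

  -- length, viewed as a map between the two monoids
  listLengthHom :: [a] -> Sum Int
  listLengthHom = Sum . length

  -- the two homomorphism conditions, as checkable properties
  checkHom :: [Int] -> [Int] -> Bool
  checkHom xs ys =
       listLengthHom (mempty :: [Int]) == mempty
    && listLengthHom (xs <> ys) == listLengthHom xs <> listLengthHom ys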

Category

Graphs

Let us recall the definition of a (directed) graph. A graph G is a collection of edges (arrows) and vertices (nodes). Each edge is assigned a source node and a target node.

source → target

Given a graph G, we denote the collection of nodes of G by G0 and the collection of arrows by G1. These two collections are connected, and the graph given its structure, by two functions: the source function s: G1 → G0 and the target function t: G1 → G0.

We shall not, in general, require either of these collections to be a set, but will happily accept larger collections, dealing with the set-theoretic paradoxes as and when we have to. A graph where both nodes and arrows form sets is called small. A graph where either collection is a proper class is called large.

If both G0 and G1 are finite, the graph is called finite too.

The empty graph has G0 = G1 = ∅.

A discrete graph has G1 = ∅.

A complete graph has G1 = {(v, w) | v, w ∈ G0}.

A simple graph has at most one arrow between each pair of nodes. Any relation on a set can be interpreted as a simple graph.

  • Show some examples.

A homomorphism f: G → H of graphs is a pair of functions f0: G0 → H0 and f1: G1 → H1 such that sources map to sources and targets map to targets, or in other words:

  • s(f1(e))=f0(s(e))
  • t(f1(e))=f0(t(e))

By a path in a graph G from the node x to the node y of length k, we mean a sequence of edges (f1, f2, …, fk) such that:

  • s(f1)=x
  • t(fk)=y
  • s(fi)=t(fi-1) for all other i.

Paths with start and end point identical are called closed. For any node x, there is a unique closed path () starting and ending in x of length 0.

For any edge f, there is a unique path from s(f) to t(f) of length 1: (f).

We denote by Gk the set of paths in G of length k.
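
For readers who prefer code, here is one possible Haskell rendering of finite graphs and the path condition above (purely illustrative; Graph and isPath are not library names):

  -- a small, finite directed graph, with explicit source and target maps
  data Graph node edge = Graph
    { nodes  :: [node]
    , edges  :: [edge]
    , source :: edge -> node
    , target :: edge -> node
    }

  -- a list of edges is a path precisely when consecutive edges line up:
  -- the target of each edge is the source of the next one
  isPath :: Eq node => Graph node edge -> [edge] -> Bool
  isPath _ [] = True      -- the empty path () of length 0
  isPath g es = and (zipWith matches es (tail es))
    where matches e e' = target g e == source g e'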

Categories

We are now ready to define a category. A category is a graph C equipped with an associative composition operation ∘: G2 → G1, and an identity element for composition 1x for each node x of the graph.

Note that G2 can be viewed as a subset of G1×G1, the set of all pairs of arrows. It is intentional that we define the composition operator on only a subset of the set of all pairs of arrows - the composable pairs. Whenever you'd want to compose two arrows that don't line up to a path, you'll get nonsense, and so any statement about the composition operator has an implicit "whenever defined" attached to it.

The definition is not quite done yet - this composition operator and the identity arrows both have a few rules to fulfill, and before I state these rules, there is some notation we need to cover.

Backwards!

If we have a path given by the arrows (f, g) in G2, we expect f: A → B and g: B → C to compose to something that goes A → C. The origin of all these ideas lies in geometry and algebra, and so the abstract arrows in a category are supposed to behave like functions under function composition, even though we don't say it explicitly.

Now, we are used to writing function application as f(x) - and possibly, from Haskell, as f x. This way, the composition of two functions would read g(f(x)).

On the other hand, the way we write our paths, we'd read f then g. This juxtaposition makes one of the two ways we write things seem backwards. We can resolve it either by making our paths in the category go backwards, or by reversing how we write function application.

In the latter case, we'd write x.f, say, for the application of f to x, and then write x.f.g for the composition. It all ends up looking a lot like Reverse Polish Notation, and has its strengths, but feels unnatural to most. It does, however, have the benefit that we can write out function composition as (f, g) ↦ f.g and have everything still make sense in all notations.

In the former case, which is the most common in the field, we accept that paths as we read along the arrows and compositions look backwards, and so, if f: A → B and g: B → C, we write g∘f: A → C, remembering that elements are introduced from the right, and the functions have to consume the elements in the right order.
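
Haskell happens to support both conventions: (.) from the Prelude composes in the g∘f order, while (>>>) from Control.Category reads along the path (f, g). A small sketch (double, addOne and the two composites are made-up names):

  import Control.Category ((>>>))

  double, addOne :: Int -> Int
  double = (* 2)
  addOne = (+ 1)

  -- "addOne, then double", written in both directions
  viaCompose, viaThen :: Int -> Int
  viaCompose = double . addOne     -- g . f: elements come in from the right
  viaThen    = addOne >>> double   -- f >>> g: read along the path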


The existence of the identity map can be captured in a function language as well: it is the existence of a function u: G0 → G1.

Now for the remaining rules for composition. Whenever defined, we expect associativity - so that h∘(g∘f) = (h∘g)∘f. Furthermore, we expect the following (a Haskell rendering of these axioms appears after the list):

  1. Composition respects sources and targets, so that:
    • s(g∘f)=s(f)
    • t(g∘f)=t(g)
  2. s(u(x))=t(u(x))=x
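
For Haskell programmers, these axioms are precisely the (unenforced) laws of the Category class from Control.Category, restricted to categories whose objects are Haskell types. A minimal sketch, with the function category as the one instance shown:

  import Prelude hiding (id, (.))

  -- essentially the Category class from Control.Category
  class Category cat where
    id  :: cat a a                        -- the identity arrow 1x at each object
    (.) :: cat b c -> cat a b -> cat a c  -- composition, written g . f

  -- Haskell types and functions form a category
  instance Category (->) where
    id x      = x
    (g . f) x = g (f x)

  -- laws, checked by the programmer rather than the compiler:
  --   f . id == f,   id . f == f,   h . (g . f) == (h . g) . f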

In a category, arrows are also called morphisms, and nodes are also called objects. This ties in with the algebraic roots of the field.

We denote by HomC(A,B), or if C is obvious from context, just Hom(A,B), the set of all arrows from A to B. This is the hom-set or set of morphisms, and may also be denoted C(A,B).

If a category is large or small or finite as a graph, it is called a large/small/finite category.

A category whose objects form a collection of sets, and whose morphisms are a selection from all possible functions between those sets - such that the identity function of each object is among the morphisms, and composition in the category is just composition of functions - is called concrete. Concrete categories form a very rich source of examples, though far from all categories are concrete.

New Categories from old

As with most other algebraic objects, one essential part of our tool box is to take known objects and form new examples from them. This allows us to generate a wealth of examples from the ones that shape our intuition.

Typical things to do here would be to talk about subobjects, products and coproducts, sometimes obvious variations on the structure, and what a typical object looks like. Remember from linear algebra how subspaces, cartesian products (which for finite-dimensional vector spaces cover both products and coproducts) and dual spaces show up early, as well as the theorems giving dimension as a complete descriptor of a vector space.

We'll go through the same sequence here, with some small but significant variations.

A category D is a subcategory of the category C if:

  • D0 ⊆ C0
  • D1 ⊆ C1
  • D1 contains 1X for all X ∈ D0
  • sources and targets of all the arrows in D1 are all in D0
  • the composition in D is the restriction of the composition in C.

Written this way, it does look somewhat obnoxious. It does become easier though, with the realization - studied closer in homework exercise 2 - that the really important part of a category is the collection of arrows. Thus, a subcategory is a subcollection of the collection of arrows - with identities for all objects present, and with at least all objects that the existing arrows imply.

A subcategory D ⊆ C is full if D(A,B)=C(A,B) for all objects A,B of D. In other words, a full subcategory is completely determined by the selection of objects in the subcategory.

A subcategory D ⊆ C is wide if the collection of objects is the same in both categories. Hence, a wide subcategory picks out a subcollection of the morphisms.

The dual of a category is to a large extent inspired by vector space duals. In the dual C* of a category C, we have the same objects, and the morphisms are given by the equality C*(A,B)=C(B,A) - every morphism from C is present, but it goes in the wrong direction. Dualizing has a tendency to add the prefix co- when it happens, so for instance coproducts are the dual notion to products. We'll return to this construction many times in the course.
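
As a Haskell aside, the dual construction simply flips every arrow. A sketch (this Op is a local definition for illustration, not the Op from Data.Functor.Contravariant):

  import Prelude hiding (id, (.))
  import Control.Category (Category(..))

  -- the opposite category: same objects, arrows reversed
  newtype Op cat a b = Op { runOp :: cat b a }

  instance Category cat => Category (Op cat) where
    id          = Op id
    Op g . Op f = Op (f . g)   -- composition flips around as well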

Given two categories C,D, we can combine them in several ways:

  1. We can form the category that has as objects the disjoint union of all the objects of C and D, and that sets Hom(A,B) = ∅ whenever A,B come from different original categories. If A,B come from the same original category, we simply take over the hom-set from that category. This yields a categorical coproduct, and we denote the result by C+D. Composition is inherited from the original categories.
  2. We can also form the category with objects (A,B) for every pair of objects A ∈ C, B ∈ D. A morphism in Hom((A,B),(A',B')) is simply a pair (f: A → A', g: B → B'). Composition is defined componentwise. This category is the categorical correspondent to the cartesian product, and we denote it by C×D.

In these three constructions - the dual, the product and the coproduct - the arrows in the categories are formal constructions, not functions; even if the original category was given by functions, the result is no longer given by a function.

Given a category C and an object A of that category, we can form the slice category C/A. Objects in the slice category are arrows B → A for some object B in C, and an arrow ϕ: f → g is an arrow s(f) → s(g) such that f = g∘ϕ. Composites of arrows are just the composites in the base category.

Notice that the same arrow ϕ in the base category C represents potentially many different arrows in C/A: it represents one arrow for each choice of source and target compatible with it.

There is a dual notion: the coslice category A\C, where the objects are paired with maps A → B.

Slice categories can be used, among other things, to specify the idea of parametrization. The slice category C/A gives a sense to the idea of objects from C labeled by elements of A.

We get this characterization by interpreting the arrow representing an object as representing its source and a type function. Hence, in a way, the Typeable type class in Haskell builds a slice category on an appropriate subcategory of the category of datatypes.

Alternatively, we can phrase the importance of the arrow in slice categories of, say, Set, by looking at preimages of the slice functions. That way, an object f: B → A gives us a family of (disjoint) subsets of B indexed by the elements of A.
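
A tiny Haskell illustration of that last point, with lists standing in for sets (fiber is an improvised name): an arrow f: B → A determines, for each element a of A, the subset of B sitting over it.

  -- the preimage ("fiber") over a single label a, under the labelling function f
  fiber :: Eq a => (b -> a) -> [b] -> a -> [b]
  fiber f bs a = filter (\b -> f b == a) bs

  -- for instance, fiber even [1..6] True == [2,4,6]: the part of B labelled True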

Finally, any graph yields a category by just filling in the arrows that are missing. The result is called the free category generated by the graph, and is a concept we will return to in some depth. Free objects have a strict categorical definition, and they serve to give a model of thought for the things they are free objects for. Thus, categories are essentially graphs, possibly with restrictions or relations imposed; and monoids are essentially strings in some alphabet, with restrictions or relations.
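
The free category has a pleasant Haskell rendering (a sketch; Path is an improvised name, and the graph is represented by a type g of edges indexed by their endpoints): arrows are paths, identities are the empty paths, and composition is concatenation of paths.

  {-# LANGUAGE GADTs #-}
  import Prelude hiding (id, (.))
  import Control.Category (Category(..))

  -- g a b is an edge from a to b; Path g a b is a (possibly empty) path from a to b
  data Path g a b where
    Nil  :: Path g a a                     -- the empty path: the identity arrow
    Cons :: g a b -> Path g b c -> Path g a c

  instance Category (Path g) where
    id           = Nil
    q . Nil      = q
    q . Cons e p = Cons e (q . p)          -- first the path (e followed by p), then q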

Examples

  • The empty category.
    • No objects, no morphisms.
  • The one object/one arrow category 1.
    • A single object and its identity arrow.
  • The categories 2 and 1+1.
    • Two objects, A, B, with identity arrows; in 2 there is also a unique arrow A → B, whereas 1+1 has no arrows between the two objects.
  • The category Set of sets.
    • Sets for objects, functions for arrows.
  • The category FSet of finite sets.
    • Finite sets for objects, functions for arrows.
  • The category PFn of sets and partial functions.
    • Sets for objects. Arrows are pairs (S' ⊆ S, f: S' → T) ∈ PFn(S,T).
    • PFn(A,B) is a partially ordered set. (Sf, f) ≤ (Sg, g) precisely if Sf ⊆ Sg and f = g|Sf.
  • There is an alternative way to define a category of partial functions: For objects, we take sets, and for morphisms S → T, we take subsets F ⊆ S × T such that each element in S occurs in at most one pair in the subset. Composition is by an interpretation of these subsets corresponding to the previous description. We'll call this category PFn.
  • Every partial order is a category. Each hom-set has at most one element.
    • Objects are the elements of the poset. Arrows are unique, with A → B precisely if A ≤ B.
  • Every monoid is a category. Only one object.
      • Kleene closure. Free monoids.
  • The category of Sets and injective functions.
  • The category of Sets and surjective functions.
  • The category of k-vector spaces and linear maps.
  • The category with objects the natural numbers and Hom(m,n) the set of m×n-matrices.
  • The category of Data Types with Computable Functions.
    • Our ideal programming language has:
      • Primitive data types.
      • Constants of each primitive type.
      • Operations, given as functions between types.
      • Constructors, producing elements from data types, and producing derived data types and operations.
    • We will assume that the language is equipped with
      • A do-nothing operation for each data type. Haskell has id.
      • A unit type 1, with the property that each type has exactly one function to this type. Haskell has (). We will use this to define the constants of type t as functions 1 → t. Thus, constants end up being 0-ary functions.
      • A composition constructor, taking an operator f: A → B and another operator g: B → C and producing an operator g∘f: A → C. Haskell has (.).
    • This allows us to model a functional programming language with a category.
  • The category with objects logical propositions and arrows proofs.
  • The category Rel has objects finite sets and morphisms A → B being subsets of A×B. Composition is by (a,c) ∈ g∘f if there is some b ∈ B such that (a,b) ∈ f and (b,c) ∈ g. The identity morphism is the diagonal {(a,a) : a ∈ A}.
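
Since Rel is not a category of functions, it makes a nice exercise in writing composition by hand. A rough Haskell sketch using Data.Set (idRel and compRel are improvised names; note that the identity needs the underlying set as an argument, which is why this does not fit the Category class directly):

  import qualified Data.Set as Set
  import Data.Set (Set)

  -- a morphism A -> B in Rel is a subset of A x B
  type Rel a b = Set (a, b)

  -- the identity on an object (a finite set) is its diagonal
  idRel :: Ord a => Set a -> Rel a a
  idRel = Set.map (\a -> (a, a))

  -- relational composition: (a, c) is in g ∘ f iff some b links them
  compRel :: (Ord a, Eq b, Ord c) => Rel b c -> Rel a b -> Rel a c
  compRel g f = Set.fromList
    [ (a, c) | (a, b) <- Set.toList f, (b', c) <- Set.toList g, b == b' ]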



Homework

For a passing mark, a written, acceptable solution to at least 3 of the 6 questions should be given no later than midnight before the next lecture.

For each lecture, there will be a few exercises marked with the symbol *. These will be more difficult than the other exercises given, will require significant time and independent study, and will aim to complement the course with material not covered in lectures, but nevertheless interesting for the general philosophy of the lecture course.

  1. Prove the general associative law: that for any path, and any bracketing of that path, the same composition may be found.
  2. Which of the following form categories? Proof and disproof for each:
    • Objects are finite sets, morphisms are functions such that |f⁻¹(b)| ≤ 2 for all morphisms f, objects B and elements b.
    • Objects are finite sets, morphisms are functions such that |f⁻¹(b)| ≥ 2 for all morphisms f, objects B and elements b.
    • Objects are finite sets, morphisms are functions such that |f⁻¹(b)| < ∞ for all morphisms f, objects B and elements b.
  3. Suppose u: A → A in some category C.
    1. If g∘u = g for all g: A → B in the category, then u = 1A.
    2. If u∘h = h for all h: B → A in the category, then u = 1A.
    3. These two results characterize the objects in a category by the properties of their corresponding identity arrows completely.
  4. For as many of the examples given as you can, prove that they really do form a category. Passing mark is at least 60% of the given examples.
    • Which of the categories are subcategories of which other categories? Which of these are wide? Which are full?
  5. For this question, all parts are required:
    1. For which sets is the free monoid on that set commutative?
    2. Prove that for any category C, the set Hom(A,A) is a monoid under composition for every object A.
  6. * Read up on ω-complete partial orders. Suppose S is some set and P is the set of partial functions S → S - in other words, an element of P is some pair (S0, f: S0 → S) with S0 ⊆ S. We give this set a poset structure by (S0, f) ≤ (S1, g) precisely if S0 ⊆ S1 and f(s) = g(s) for all s ∈ S0.
    • Show that P is a strict ω-CPO.
    • An element x of S is a fixpoint of f: S → S if f(x) = x. Let N be the ω-CPO of partially defined functions on the natural numbers. We define a function ϕ: N → N by sending some h: N → N to a function k defined by
      1. k(0)=1
      2. k(n) is defined only if h(n-1) is defined, and then by k(n) = n * h(n-1).
Describe ϕ(n ↦ n²) and ϕ(n ↦ n³). Show that ϕ is continuous. Find a fixpoint (S0, f) of ϕ such that any other fixpoint of the same function is less than this one.
Find a continuous endofunction on some ω-CPO that has the fibonacci function F(0)=0, F(1)=1, F(n)=F(n-1)+F(n-2) as the least fixed point.
Implement a Haskell function that finds fixed points in an ω-CPO. Implement the two fixed points above as Haskell functions - using the ω-CPO fixed point approach in the implementation. It may well be worth looking at Data.Map to provide a Haskell context for a partial function for this part of the task.