From HaskellWiki
< Diagrams‎ | Dev
Revision as of 15:56, 23 July 2012 by Byorgey (talk | contribs) (Linear and affine transformations)

Jump to: navigation, search

Linear and affine transformations

A linear transformation on a vector space V is a function f : V \to V satisfying

  • f(kv) = k f(v)
  • f(u + v) = f(u) + f(v)

where k is an arbitrary scalar and u,v are arbitrary vectors. Linear transformations always preserve the origin, send lines to lines, and preserve the relative distances of points along any given line. The image of parallel lines under a linear transformation is again parallel lines. However, linear transformations do not necessarily preserve angles between lines. Examples of linear transformations include

  • rotation
  • reflection
  • scaling
  • shear

Linear transformations are closed under composition and form a monoid with the identity function as the neutral element.

An affine transformation is a function of the form a(v) = f(v) + u where f is a linear transformation and u is some vector. Affine transformations preserve all the same things as linear transformations, except that they do not necessarily send the origin to itself. Affine transformations include all the above examples, along with

  • translation

Affine transformations are also closed under composition, and are used as the basis for all transformations in diagrams. Note that using homogeneous coordinates, affine transformations can be seen as a certain subset of linear transformations in a vector space one dimension higher. (diagrams used to actually use this approach for representing affine transformations, but no longer does.)

Inverse and transpose transformations

Any linear transformation in an n-dimensional space can be represented as an n-by-n matrix: in particular, the matrix whose nth column is the result of applying the transformation to the unit vector with a 1 as its nth entry and 0's everywhere else. For example,

[4 5] [6 2]

is the linear transformation that sends (1,0) to (4,6) and sends (0,1) to (5,2). It's easy to see that multiplying a matrix by such a unit vector picks out a column of the matrix, as I claimed. It's also not too hard to see that the matrix's action on any other vector is completely determined by linearity. Finally, of course, one can verify that matrix multiplication is, in fact, linear.

By the inverse of a linear map (note I use "linear transformation" and "linear map" interchangeably), we mean its inverse under composition: a transformation T^-1 such that TT^-1 = T^-1 T = id. We can also think about this in terms of matrices: the inverse of a matrix M is a matrix M^-1 such that MM^-1 = M^-1 M = I, the identity matrix (with ones along the diagonal and zeros everywhere else). One can check that (1) the identity matrix corresponds to the identity linear map, and (2) matrix multiplication corresponds to composition of linear maps. So these are really two different views of the same thing.

The *transpose* of a matrix (denoted M^T) is the matrix you get when you reflect all the entries about the main diagonal. For example, the transpose of

[4 5] [6 2]


[4 6] [5 2]

This is (usually) not the same as the inverse, as can be easily checked. Now, how can we interpret the transpose operation on matrices as an operation on linear maps? Of course, we can just say "the transpose of a linear map is defined to be the linear map corresponding to the transpose of its matrix representation". But I actually don't have any intuitive sense for what the transpose of a linear map IS in any more fundamental sense.

What we really want to think about is the *inverse transpose*, that is, the inverse of the transpose of a matrix, denoted M^-T. Suppose we have a vector v, and another vector p which is perpendicular to v. Now suppose we apply a linear map (represented by a matrix M) to the vector v, giving us a new vector v' = Mv. In general, Mp may NOT be perpendicular to Mv! Some linear maps such as rotations and uniform scales do preserve perpendicularity (if that's a word), but not all linear maps do: for example, the 2D linear map that scales all x-coordinates by 2 but leaves y-coordinates unchanged. But what if we want to obtain another vector perpendicular to v' ? For some reason that I don't really understand (although I should probably go try to understand it better), the thing to do is to multiply p not by M, but by the *inverse transpose* of M! That is: if v and p are perpendicular vectors, then so are Mv and (M^-T)p.

Now, if one has a matrix M it is easy to compute its transpose, and not too difficult to compute its inverse. However, in diagrams, given a transformation we *don't* have its matrix representation. Linear maps are represented more directly (and efficiently) as functions. We could generate a matrix by applying the transformation to basis vectors, but this is ugly, and converting a matrix back to a functional linear map is even uglier, and probably loses a lot of efficiency as well. What to do? We need to carry around enough extra information with each transformation so that we can extract the inverse transpose. We just have to make sure that this extra information is compositional, i.e. we know how to compose two transformations while preserving the extra information.

It's easy to see that (M^-1)^-1 = M = (M^T)^T (i.e. the inverse and transpose operations are involutions). It's not quite as obvious, but not too hard to show, that the inverse of the transpose is the same as the transpose of the inverse. Combining these facts, we see that for a given matrix M there are really only four matrices of interest: M, M^-1, M^T, and M^-T. We can arrange them conceptually in a square, thus:

M M^-1 M^T M^-T

Then taking the inverse corresponds to a L-R reflection, and taking the transpose corresponds to a U-D reflection. Given two such sets of four linear maps, it's not hard to work out how to compose them: just compose the elements pairwise (i.e. compose the inverse with the inverse, the transpose with the transpose, etc.), keeping in mind that the inverse and transpose compose in the reverse order whereas the regular transformation and the inverse transpose compose in the normal order.

So each transformation is actually a set of *four* linear maps. To take the inverse or transpose of a transformation just involves shuffling the linear maps around. Composition works as described above. We just have to specify these four linear maps by hand for each primitive transformation the library provides, which can be done by appealing to the matrix representation of the primitive transformation in question.


Transformations are defined in Graphics.Rendering.Diagrams.Transform (from the diagrams-core package). There are also more comments etc. there which may be helpful. First, we define *invertible* linear transformations:

data (:-:) u v = (u :-* v) :-: (v :-* u)

That is, a linear transformation paired with its inverse. We always carry around an inverse transformation along with every transformation. All the functions for constructing primitive transformations also construct an inverse, and when two transformations are composed their inverses are also composed (in the reverse order). Note that the :-* type is from the vector-space package. Don't worry about how it is implemented (actually I don't even know).

Then an (affine) Transformation is defined as follows:

data Transformation v = Transformation (v :-: v) (v :-: v) v

There are three parts here. The first (v :-: v) is the normal transformation (the linear part of it, i.e. without translation) paired with its inverse. The second (v :-: v) is the *transpose* and *inverse transpose*. The transpose of a linear transformation is just the transpose of its matrix.