Diagrams/Dev/Transformations
From HaskellWiki
(Dump a bunch of text here explaining representation of transformations in diagrams and the rationale behind it. Still needs a lot of cleanup) |
(Notes on an intensional representation for angle-preserving transformations) |
||
(4 intermediate revisions by one user not shown) | |||
Line 1: | Line 1: | ||
= Linear and affine transformations = | = Linear and affine transformations = | ||
− | + | A ''linear transformation'' on a vector space <math>V</math> is a function <math>f : V \to V</math> satisfying | |
− | = | + | * <math>f(kv) = k f(v)</math> |
+ | * <math>f(u + v) = f(u) + f(v)</math> | ||
− | + | where <math>k</math> is an arbitrary scalar and <math>u,v</math> are arbitrary vectors. Linear transformations always preserve the origin, send lines to lines, and preserve the relative distances of points along any given line. The image of parallel lines under a linear transformation is again parallel lines. However, linear transformations do ''not'' necessarily preserve angles between lines. Examples of linear transformations include | |
− | + | * rotation | |
− | + | * reflection | |
+ | * scaling | ||
+ | * shear | ||
− | + | Linear transformations are closed under composition and form a monoid with the identity function as the neutral element. | |
− | + | An ''affine transformation'' is a function of the form <math>a(v) = f(v) + u</math> where <math>f</math> is a linear transformation and <math>u</math> is some vector. Affine transformations preserve all the same things as linear transformations, ''except'' that they do not necessarily send the origin to itself. Affine transformations include all the above examples, along with | |
− | + | * translation | |
− | + | Affine transformations are also closed under composition, and are used as the basis for all transformations in <code>diagrams</code>. Note that using homogeneous coordinates, affine transformations can be seen as a certain subset of linear transformations in a vector space one dimension higher. (<code>diagrams</code> used to actually use this approach for representing affine transformations, but no longer does.) | |
− | + | ||
− | + | = Matrices, inverse and transpose transformations = | |
− | + | Any linear transformation in an <math>n</math>-dimensional space can be represented as an <math>n \times n</math> matrix: in particular, the matrix whose <math>n</math>th column is the result of applying the transformation to the unit vector with a <math>1</math> as its <math>n</math>th entry and <math>0</math>s everywhere else. For example, | |
− | + | ||
+ | <math> | ||
+ | \begin{bmatrix} | ||
+ | 4 & 5 \\ | ||
+ | 6 & 2 | ||
+ | \end{bmatrix} | ||
+ | </math> | ||
+ | |||
+ | is the linear transformation that sends <math>(1,0)</math> to <math>(4,6)</math> and sends <math>(0,1)</math> to <math>(5,2)</math>. It's easy to see that multiplying a matrix by such a unit vector picks out a column of the matrix, as I claimed. It's also not too hard to see that the matrix's action on any other vector is completely determined by linearity. Finally, of course, one can verify that matrix multiplication is, in fact, linear. | ||
+ | |||
+ | By the inverse of a linear map (note I use "linear transformation" and "linear map" interchangeably), we mean its inverse under composition: a transformation <math>T^{-1}</math> such that <math>T \circ T^{-1} = T^{-1} \circ T = id</math>. We can also think about this in terms of matrices: the inverse of a matrix <math>M</math> is a matrix <math>M^{-1}</math> such that <math>MM^{-1} = M^{-1} M = I</math>, the identity matrix (with ones along the diagonal and zeros everywhere else). One can check that (1) the identity matrix corresponds to the identity linear map, and (2) matrix multiplication corresponds to composition of linear maps. So these are really two different views of the same thing. | ||
+ | |||
+ | The ''transpose'' of a matrix (denoted <math>M^\top</math>) is the matrix you get when you reflect all the entries about the main diagonal. For example, | ||
+ | |||
+ | <math> | ||
+ | \begin{bmatrix} | ||
+ | 4 & 5 \\ | ||
+ | 6 & 2 | ||
+ | \end{bmatrix}^\top | ||
+ | = | ||
+ | \begin{bmatrix} | ||
+ | 4 & 6 \\ | ||
+ | 5 & 2 | ||
+ | \end{bmatrix} | ||
+ | </math> | ||
This is (usually) not the same as the inverse, as can be easily checked. Now, how can we interpret the transpose operation on matrices as an operation on linear maps? Of course, we can just say "the transpose of a linear map is defined to be the linear map corresponding to the transpose of its matrix representation". But I actually don't have any intuitive sense for what the transpose of a linear map IS in any more fundamental sense. | This is (usually) not the same as the inverse, as can be easily checked. Now, how can we interpret the transpose operation on matrices as an operation on linear maps? Of course, we can just say "the transpose of a linear map is defined to be the linear map corresponding to the transpose of its matrix representation". But I actually don't have any intuitive sense for what the transpose of a linear map IS in any more fundamental sense. | ||
− | What we really want to think about is the | + | What we really want to think about is the ''inverse transpose'', that is, the inverse of the transpose of a matrix, denoted <math>M^{-\top}</math>. Suppose we have a vector <math>v</math>, and another vector <math>p</math> which is perpendicular to <math>v</math>. Now suppose we apply a linear map (represented by a matrix <math>M</math>) to the vector <math>v</math>, giving us a new vector <math>v' = Mv</math>. In general, <math>Mp</math> may not be perpendicular to <math>Mv</math>, since linear transformations do not preserve angles. Some linear maps such as rotations and uniform scales do preserve perpendicularity (if that's a word), but not all linear maps do: for example, the 2D linear map that scales all <math>x</math>-coordinates by 2 but leaves <math>y</math>-coordinates unchanged. But what if we want to obtain another vector perpendicular to <math>v'</math>? In fact, the thing to do is to multiply <math>p</math> not by <math>M</math>, but by the ''inverse transpose'' of <math>M</math>. That is: if <math>v</math> and <math>p</math> are perpendicular vectors, then so are <math>Mv</math> and <math>(M^{-\top})p</math>. This can be proved by noting that the dot product of two vectors <math>v_1 \bullet v_2</math> can equivalently be written <math>v_1^\top v_2</math> (treating vectors as <math>n \times 1</math> column matrices), and that two vectors are perpendicular iff their dot product is zero. Hence |
− | + | <math>(Mv) \bullet (M^{-\top}p) = (Mv)^\top(M^{-\top}p) = v^\top M^\top M^{-\top}p = v^\top p = 0</math> | |
− | + | Now, if one has a matrix <math>M</math> it is easy to compute its transpose, and not too difficult to compute its inverse. However, in diagrams, given a transformation we ''don't'' have its matrix representation. Linear maps are represented more directly (and efficiently) as functions. We could generate a matrix by applying the transformation to basis vectors, but this is ugly, and converting a matrix back to a functional linear map is even uglier, and probably loses a lot of efficiency as well. What to do? We need to carry around enough extra information with each transformation so that we can extract the inverse transpose. We just have to make sure that this extra information is compositional, i.e. we know how to compose two transformations while preserving the extra information. | |
− | + | It's easy to see that <math>(M^{-1})^{-1} = M = (M^\top)^\top</math> (i.e. the inverse and transpose operations are involutions). It's not quite as obvious, but not too hard to show, that the inverse of the transpose is the same as the transpose of the inverse. Combining these facts, we see that for a given matrix <math>M</math> there are really only four matrices of interest: <math>M</math>, <math>M^{-1}</math>, <math>M^\top</math>, and <math>M^{-\top}</math>. We can arrange them conceptually in a square, thus: | |
− | M^ | + | |
− | + | <math> | |
+ | \begin{matrix} | ||
+ | M & M^{-1} \\ | ||
+ | M^\top & M^{-\top} | ||
+ | \end{matrix} | ||
+ | </math> | ||
− | So each transformation is actually a set of | + | Then taking the inverse corresponds to a L-R reflection, and taking the transpose corresponds to a U-D reflection. Given two such sets of four linear maps, it's not hard to work out how to compose them: just compose the elements pairwise (''i.e.'' compose the inverse with the inverse, the transpose with the transpose, ''etc.''), keeping in mind that the inverse and transpose compose in the reverse order whereas the regular transformation and the inverse transpose compose in the normal order. |
+ | |||
+ | So each transformation is actually a set of ''four'' linear maps. To take the inverse or transpose of a transformation just involves shuffling the linear maps around. Composition works as described above. We just have to specify these four linear maps by hand for each primitive transformation the library provides, which can be done by appealing to the matrix representation of the primitive transformation in question. | ||
+ | |||
+ | Now, so far we have only represented ''linear'' transformations: affine transformations can be recovered by storing just a single extra vector along with the four linear transformations. | ||
= Representation = | = Representation = | ||
− | Transformations are defined in Graphics.Rendering.Diagrams.Transform (from the diagrams-core package). There are | + | Transformations are defined in [http://hackage.haskell.org/packages/archive/diagrams-core/0.5.0.1/doc/html/Graphics-Rendering-Diagrams-Transform.html <code>Graphics.Rendering.Diagrams.Transform</code>] (from the <code>diagrams-core</code> package). There are additional comments there which may be helpful. First, we define ''invertible'' linear transformations: |
+ | <code> | ||
data (:-:) u v = (u :-* v) :-: (v :-* u) | data (:-:) u v = (u :-* v) :-: (v :-* u) | ||
+ | </code> | ||
− | That is, a linear transformation paired with its inverse. We always carry around an inverse transformation along with every transformation. All the functions for constructing primitive transformations also construct an inverse, and when two transformations are composed their inverses are also composed (in the reverse order). Note that the :-* type is from the vector-space package. Don't worry about how it is implemented (actually I don't even know). | + | That is, a linear transformation paired with its inverse. We always carry around an inverse transformation along with every transformation. All the functions for constructing primitive transformations also construct an inverse, and when two transformations are composed their inverses are also composed (in the reverse order). Note that the <code>:-*</code> type constructor is from the [http://hackage.haskell.org/package/vector%2Dspace <code>vector-space</code>] package. Don't worry about how it is implemented (actually I don't even know). |
− | Then an (affine) Transformation is defined as follows: | + | Then an (affine) <code>Transformation</code> is defined as follows: |
+ | <code> | ||
data Transformation v = Transformation (v :-: v) (v :-: v) v | data Transformation v = Transformation (v :-: v) (v :-: v) v | ||
+ | </code> | ||
+ | |||
+ | There are three parts here. The first <code>(v :-: v)</code> is the normal transformation (the linear part of it, ''i.e.'' without translation) paired with its inverse. The second <code>(v :-: v)</code> is the transpose and inverse transpose. The final <code>v</code> is the translational component of the transformation. | ||
+ | |||
+ | For examples of creating <code>Transformation</code>s, see the remainder of the <code>Graphics.Rendering.Diagrams.Transform</code> module, and also the [http://hackage.haskell.org/packages/archive/diagrams-lib/0.5/doc/html/Diagrams-TwoD-Transform.html <code>Diagrams.TwoD.Transform</code>] module from the <code>diagrams-lib</code> package. | ||
+ | |||
+ | = Specifying transformations as matrices = | ||
+ | |||
+ | To turn an arbitrary matrix into a <code>Transformation</code> would require the following: | ||
+ | |||
+ | * Assuming the matrix is represented using homogeneous coordinates, it must be checked that the matrix actually represents an affine transformation (as opposed to a general projective transformation). This corresponds to the bottom row of the matrix consisting of all zeros and a final 1. | ||
+ | * Extract the translational component from the final column of the matrix (except for the 1 in the bottom right corner). | ||
+ | * Extract the linear transformation component as the submatrix obtained by deleting the last row and column. | ||
+ | * Turn the matrix into a function by inlining the definition of matrix-vector multiplication. ''e.g.'' the matrix | ||
+ | |||
+ | <math> | ||
+ | \begin{bmatrix} | ||
+ | 1 & 2 \\ | ||
+ | 3 & 4 | ||
+ | \end{bmatrix} | ||
+ | </math> | ||
+ | |||
+ | corresponds to the function <math>\lambda (x,y) \to (x + 2y, 3x + 4y)</math>. | ||
+ | * Compute the transpose, inverse, and inverse transpose of the matrix and turn the results into functions as well. | ||
+ | |||
+ | = Angle-preserving transformations = | ||
+ | |||
+ | An angle-preserving transformation is one which preserves angles between all pairs of lines. It is possible to prove the following: | ||
+ | |||
+ | * A linear transformation <math>L</math> is angle-preserving iff there exists a constant <math>\lambda</math> such that <math>\|L(v)\| = \lambda \|v\|</math> for all vectors <math>v</math>. (See http://www.cs.bsu.edu/homepages/fischer/math445/angles.pdf.) | ||
+ | * Intuitively, the previous result means that any angle-preserving linear transformation (other than the constantly zero transformation) can be uniquely decomposed as a uniform scale followed by a rotation. | ||
+ | * An affine transformation is angle-preserving iff it can be expressed as an angle-preserving linear transformation followed by a translation. | ||
+ | * Angle-preserving affine transformations are closed under composition. | ||
+ | |||
+ | The last two points in particular imply that we can uniquely represent angle-preserving affine transformations by a triple <math>A = (\theta, s, w)</math> where <math>A(v) = R_\theta(sv) + w</math> (<math>R_\theta</math> indicates a rotation by <math>\theta</math>), and given this representation we may compute | ||
+ | |||
+ | <math>(\theta_1, s_1, w_1) \circ (\theta_2, s_2, w_2) = (\theta_1 + \theta_2, s_1 s_2, R_{\theta_1}(s_1 w_2) + w_1)</math> | ||
+ | |||
+ | It is also easy to compute the inverse | ||
+ | |||
+ | <math>(\theta, s, w)^{-1} = (-\theta, 1/s, -R_{-\theta}(w/s))</math> | ||
+ | |||
+ | and the transpose of the linear part: | ||
+ | |||
+ | <math>(\theta, s)^T = (-\theta, s)</math> | ||
+ | |||
+ | (The above calculations are simplified by the observation that rotations and uniform scales commute, though of course transformations do not commute in general.) | ||
+ | |||
+ | As a concrete proposal, we can change <code>T2</code> from a synonym for <code>Transformation R2</code> to a sum type | ||
+ | |||
+ | data T2 | ||
+ | = Ortho Angle Double R2 | ||
+ | | GenT2 (Transformation R2) | ||
+ | |||
+ | and change things like <code>rotate</code> to construct an <code>Ortho</code> instead of a <code>Transformation R2</code>. There is a bit of trickiness to be worked out with the function <code>scaling</code>, which currently has type <code>Scalar v -> Transformation v</code>, i.e. it is supposed to work over any vector space, but we want it to do something special for <code>R2</code>. I am not sure what the right solution is. | ||
+ | |||
+ | In any case, some benefits of such a scheme would be: | ||
+ | |||
+ | * Angle-preserving transformations are precisely those which preserve circular arcs. If we add a new primitive type of arc segment, we can look at the final transformation applied to the segment to determine whether we can simply issue a primitive arc command (if the transformation is an <code>Ortho</code>) or if we must issue an arc command wrapped in a transform (or approximate the arc with Beziers). | ||
− | + | * For typical applications, <code>Ortho</code> transformations make up the vast majority of transformations actually used. Intuitively, it seems this should lead to a performance boost, since <code>Ortho</code> values consist of simple first-order data, making it easier for GHC to optimize e.g. chains of repeated transformations. General <code>Transformation</code> values, on the other hand, are functions, which can be tricky for GHC to optimize e.g. when deeply nested. |
Revision as of 02:54, 1 December 2013
Contents |
1 Linear and affine transformations
A linear transformation on a vector space V is a function satisfying
- f(kv) = kf(v)
- f(u + v) = f(u) + f(v)
where k is an arbitrary scalar and u,v are arbitrary vectors. Linear transformations always preserve the origin, send lines to lines, and preserve the relative distances of points along any given line. The image of parallel lines under a linear transformation is again parallel lines. However, linear transformations do not necessarily preserve angles between lines. Examples of linear transformations include
- rotation
- reflection
- scaling
- shear
Linear transformations are closed under composition and form a monoid with the identity function as the neutral element.
An affine transformation is a function of the form a(v) = f(v) + u where f is a linear transformation and u is some vector. Affine transformations preserve all the same things as linear transformations, except that they do not necessarily send the origin to itself. Affine transformations include all the above examples, along with
- translation
Affine transformations are also closed under composition, and are used as the basis for all transformations in diagrams
. Note that using homogeneous coordinates, affine transformations can be seen as a certain subset of linear transformations in a vector space one dimension higher. (diagrams
used to actually use this approach for representing affine transformations, but no longer does.)
2 Matrices, inverse and transpose transformations
Any linear transformation in an n-dimensional space can be represented as an matrix: in particular, the matrix whose nth column is the result of applying the transformation to the unit vector with a 1 as its nth entry and 0s everywhere else. For example,
is the linear transformation that sends (1,0) to (4,6) and sends (0,1) to (5,2). It's easy to see that multiplying a matrix by such a unit vector picks out a column of the matrix, as I claimed. It's also not too hard to see that the matrix's action on any other vector is completely determined by linearity. Finally, of course, one can verify that matrix multiplication is, in fact, linear.
By the inverse of a linear map (note I use "linear transformation" and "linear map" interchangeably), we mean its inverse under composition: a transformation T^{ − 1} such that . We can also think about this in terms of matrices: the inverse of a matrix M is a matrix M^{ − 1} such that MM^{ − 1} = M^{ − 1}M = I, the identity matrix (with ones along the diagonal and zeros everywhere else). One can check that (1) the identity matrix corresponds to the identity linear map, and (2) matrix multiplication corresponds to composition of linear maps. So these are really two different views of the same thing.
The transpose of a matrix (denoted ) is the matrix you get when you reflect all the entries about the main diagonal. For example,
This is (usually) not the same as the inverse, as can be easily checked. Now, how can we interpret the transpose operation on matrices as an operation on linear maps? Of course, we can just say "the transpose of a linear map is defined to be the linear map corresponding to the transpose of its matrix representation". But I actually don't have any intuitive sense for what the transpose of a linear map IS in any more fundamental sense.
What we really want to think about is the inverse transpose, that is, the inverse of the transpose of a matrix, denoted . Suppose we have a vector v, and another vector p which is perpendicular to v. Now suppose we apply a linear map (represented by a matrix M) to the vector v, giving us a new vector v' = Mv. In general, Mp may not be perpendicular to Mv, since linear transformations do not preserve angles. Some linear maps such as rotations and uniform scales do preserve perpendicularity (if that's a word), but not all linear maps do: for example, the 2D linear map that scales all x-coordinates by 2 but leaves y-coordinates unchanged. But what if we want to obtain another vector perpendicular to v'? In fact, the thing to do is to multiply p not by M, but by the inverse transpose of M. That is: if v and p are perpendicular vectors, then so are Mv and . This can be proved by noting that the dot product of two vectors can equivalently be written (treating vectors as column matrices), and that two vectors are perpendicular iff their dot product is zero. Hence
Now, if one has a matrix M it is easy to compute its transpose, and not too difficult to compute its inverse. However, in diagrams, given a transformation we don't have its matrix representation. Linear maps are represented more directly (and efficiently) as functions. We could generate a matrix by applying the transformation to basis vectors, but this is ugly, and converting a matrix back to a functional linear map is even uglier, and probably loses a lot of efficiency as well. What to do? We need to carry around enough extra information with each transformation so that we can extract the inverse transpose. We just have to make sure that this extra information is compositional, i.e. we know how to compose two transformations while preserving the extra information.
It's easy to see that (i.e. the inverse and transpose operations are involutions). It's not quite as obvious, but not too hard to show, that the inverse of the transpose is the same as the transpose of the inverse. Combining these facts, we see that for a given matrix M there are really only four matrices of interest: M, M^{ − 1}, , and . We can arrange them conceptually in a square, thus:
Then taking the inverse corresponds to a L-R reflection, and taking the transpose corresponds to a U-D reflection. Given two such sets of four linear maps, it's not hard to work out how to compose them: just compose the elements pairwise (i.e. compose the inverse with the inverse, the transpose with the transpose, etc.), keeping in mind that the inverse and transpose compose in the reverse order whereas the regular transformation and the inverse transpose compose in the normal order.
So each transformation is actually a set of four linear maps. To take the inverse or transpose of a transformation just involves shuffling the linear maps around. Composition works as described above. We just have to specify these four linear maps by hand for each primitive transformation the library provides, which can be done by appealing to the matrix representation of the primitive transformation in question.
Now, so far we have only represented linear transformations: affine transformations can be recovered by storing just a single extra vector along with the four linear transformations.
3 Representation
Transformations are defined in Graphics.Rendering.Diagrams.Transform
(from the diagrams-core
package). There are additional comments there which may be helpful. First, we define invertible linear transformations:
data (:-:) u v = (u :-* v) :-: (v :-* u)
That is, a linear transformation paired with its inverse. We always carry around an inverse transformation along with every transformation. All the functions for constructing primitive transformations also construct an inverse, and when two transformations are composed their inverses are also composed (in the reverse order). Note that the :-*
type constructor is from the vector-space
package. Don't worry about how it is implemented (actually I don't even know).
Then an (affine) Transformation
is defined as follows:
data Transformation v = Transformation (v :-: v) (v :-: v) v
There are three parts here. The first (v :-: v)
is the normal transformation (the linear part of it, i.e. without translation) paired with its inverse. The second (v :-: v)
is the transpose and inverse transpose. The final v
is the translational component of the transformation.
For examples of creating Transformation
s, see the remainder of the Graphics.Rendering.Diagrams.Transform
module, and also the Diagrams.TwoD.Transform
module from the diagrams-lib
package.
4 Specifying transformations as matrices
To turn an arbitrary matrix into a Transformation
would require the following:
- Assuming the matrix is represented using homogeneous coordinates, it must be checked that the matrix actually represents an affine transformation (as opposed to a general projective transformation). This corresponds to the bottom row of the matrix consisting of all zeros and a final 1.
- Extract the translational component from the final column of the matrix (except for the 1 in the bottom right corner).
- Extract the linear transformation component as the submatrix obtained by deleting the last row and column.
- Turn the matrix into a function by inlining the definition of matrix-vector multiplication. e.g. the matrix
corresponds to the function .
- Compute the transpose, inverse, and inverse transpose of the matrix and turn the results into functions as well.
5 Angle-preserving transformations
An angle-preserving transformation is one which preserves angles between all pairs of lines. It is possible to prove the following:
- A linear transformation L is angle-preserving iff there exists a constant λ such that for all vectors v. (See http://www.cs.bsu.edu/homepages/fischer/math445/angles.pdf.)
- Intuitively, the previous result means that any angle-preserving linear transformation (other than the constantly zero transformation) can be uniquely decomposed as a uniform scale followed by a rotation.
- An affine transformation is angle-preserving iff it can be expressed as an angle-preserving linear transformation followed by a translation.
- Angle-preserving affine transformations are closed under composition.
The last two points in particular imply that we can uniquely represent angle-preserving affine transformations by a triple A = (θ,s,w) where A(v) = R_{θ}(sv) + w (R_{θ} indicates a rotation by θ), and given this representation we may compute
It is also easy to compute the inverse
(θ,s,w)^{ − 1} = ( − θ,1 / s, − R_{ − θ}(w / s))
and the transpose of the linear part:
(θ,s)^{T} = ( − θ,s)
(The above calculations are simplified by the observation that rotations and uniform scales commute, though of course transformations do not commute in general.)
As a concrete proposal, we can change T2
from a synonym for Transformation R2
to a sum type
data T2 = Ortho Angle Double R2 | GenT2 (Transformation R2)
and change things like rotate
to construct an Ortho
instead of a Transformation R2
. There is a bit of trickiness to be worked out with the function scaling
, which currently has type Scalar v -> Transformation v
, i.e. it is supposed to work over any vector space, but we want it to do something special for R2
. I am not sure what the right solution is.
In any case, some benefits of such a scheme would be:
- Angle-preserving transformations are precisely those which preserve circular arcs. If we add a new primitive type of arc segment, we can look at the final transformation applied to the segment to determine whether we can simply issue a primitive arc command (if the transformation is an
Ortho
) or if we must issue an arc command wrapped in a transform (or approximate the arc with Beziers).
- For typical applications,
Ortho
transformations make up the vast majority of transformations actually used. Intuitively, it seems this should lead to a performance boost, sinceOrtho
values consist of simple first-order data, making it easier for GHC to optimize e.g. chains of repeated transformations. GeneralTransformation
values, on the other hand, are functions, which can be tricky for GHC to optimize e.g. when deeply nested.