# Numeric Haskell: A Repa Tutorial

### From HaskellWiki

DonStewart (Talk | contribs) m (→Changing the shape of an array) |
DonStewart (Talk | contribs) m |
||

Line 1: | Line 1: | ||

− | [http://hackage.haskell.org/package/repa Repa] is a Haskell library for | + | [http://hackage.haskell.org/package/repa Repa] is a Haskell library for high performance, regular, multi-dimensional parallel arrays. All numeric data is stored unboxed. Functions written with the Repa combinators are automatically parallel provided you supply "+RTS -N" on the command line when running the program. |

− | high performance, regular, multi-dimensional parallel arrays. All | + | |

− | numeric data is stored unboxed. Functions written with the Repa | + | |

− | combinators are automatically parallel provided you supply "+RTS -N" on | + | |

− | the command line when running the program. | + | |

− | This document provides a tutorial on array programming in Haskell using | + | This document provides a tutorial on array programming in Haskell using the repa package. |

− | the repa package. | + | |

''Note:'' a companion tutorial to this is provided in [http://www.haskell.org/haskellwiki/Numeric_Haskell:_A_Vector_Tutorial vector tutorial]. | ''Note:'' a companion tutorial to this is provided in [http://www.haskell.org/haskellwiki/Numeric_Haskell:_A_Vector_Tutorial vector tutorial]. | ||

Line 12: | Line 7: | ||

= Quick Tour = | = Quick Tour = | ||

− | Repa (REgular PArallel arrays) is an advanced, multi-dimensional parallel | + | Repa (REgular PArallel arrays) is an advanced, multi-dimensional parallel arrays library for Haskell, with a number of distinct capabilities: |

− | arrays library for Haskell, with a number of distinct capabilities: | + | |

* The arrays are "regular" (i.e. dense and rectangular); and | * The arrays are "regular" (i.e. dense and rectangular); and | ||

Line 52: | Line 46: | ||

== Index types and shapes == | == Index types and shapes == | ||

− | Before we can get started manipulating arrays, we need a grasp of repa's | + | Before we can get started manipulating arrays, we need a grasp of repa's notion of array shape. Much like the classic 'array' library in Haskell, repa-based arrays are parameterized via a type which determines the dimension of the array, and the type of its index. However, while classic arrays take tuples to represent multiple dimensions, Repa arrays use a [http://hackage.haskell.org/packages/archive/repa/2.0.0.3/doc/html/Data-Array-Repa-Shape.html#t:Shape richer type language] for array indices and shapes. |

− | notion of array shape. Much like the classic 'array' library in Haskell, repa-based arrays are | + | |

− | parameterized via a type which determines the dimension of the array, and the type of its index. However, while classic arrays take tuples to represent multiple dimensions, Repa | + | |

− | arrays use a [http://hackage.haskell.org/packages/archive/repa/2.0.0.3/doc/html/Data-Array-Repa-Shape.html#t:Shape richer type language] for array indices and shapes. | + | |

Index types consist of two parts: | Index types consist of two parts: | ||

Line 81: | Line 72: | ||

is a zero-dimension object (i.e. a point) holding a Double. | is a zero-dimension object (i.e. a point) holding a Double. | ||

− | Many operations over arrays are polymorphic in the shape / dimension component. | + | Many operations over arrays are polymorphic in the shape / dimension |

− | Others require operating on the shape itself, rather than the array. A | + | component. Others require operating on the shape itself, rather than |

− | typeclass, < | + | the array. A typeclass, <code>Shape</code>, lets us operate uniformally |

− | with different shape. | + | over arrays with different shape. |

== Shapes == | == Shapes == | ||

Line 114: | Line 105: | ||

is the shape of 1D arrays of length 10, indexed via Ints. | is the shape of 1D arrays of length 10, indexed via Ints. | ||

− | Given an array, you can always find its shape by calling | + | Given an array, you can always find its shape by calling <code>extent</code>. |

− | < | + | |

== Generating arrays == | == Generating arrays == | ||

Line 223: | Line 213: | ||

== Operations on arrays == | == Operations on arrays == | ||

− | Besides indexing, there are many regular, list-like operations on | + | Besides indexing, there are many regular, list-like operations on arrays. |

− | arrays. | + | |

=== Maps, zips, filters and folds === | === Maps, zips, filters and folds === | ||

Line 234: | Line 223: | ||

[1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0] | [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0] | ||

− | since `map` conflicts with the definition in the Prelude, we have to use | + | since `map` conflicts with the definition in the Prelude, we have to use it qualified: |

− | it qualified: | + | |

> Data.Array.Repa.map (^2) x | > Data.Array.Repa.map (^2) x | ||

Line 257: | Line 245: | ||

[6.0,15.0,24.0] | [6.0,15.0,24.0] | ||

− | Two arrays may be combined via < | + | Two arrays may be combined via <code>zipWith</code>: |

zipWith :: (Shape sh, Elt b, Elt c, Elt a) => | zipWith :: (Shape sh, Elt b, Elt c, Elt a) => | ||

Line 269: | Line 257: | ||

=== Numeric operations: negation, addition, subtraction, multiplication === | === Numeric operations: negation, addition, subtraction, multiplication === | ||

− | Repa arrays are instances of the <code>Num</code>. This means that | + | Repa arrays are instances of the <code>Num</code>. This means that |

− | on numerical elements are lifted automagically onto arrays of such elements | + | operations on numerical elements are lifted automagically onto arrays of |

− | + | such elements. For example, <code>(+)</code> on two double values corresponds to | |

− | For example, < | + | element-wise addition, <code>(+)</code>, of the two arrays of doubles: |

− | < | + | |

> let x = fromList (Z :. (10::Int)) [1..10] | > let x = fromList (Z :. (10::Int)) [1..10] | ||

Line 312: | Line 299: | ||

To do this, we extract the current shape of the array, and write a function | To do this, we extract the current shape of the array, and write a function | ||

that maps the index space from the old array to the new array. That index space function | that maps the index space from the old array to the new array. That index space function | ||

− | is then passed to | + | is then passed to backpermute which actually constructs the new |

array from the old one. | array from the old one. | ||

## Revision as of 01:00, 10 May 2011

Repa is a Haskell library for high performance, regular, multi-dimensional parallel arrays. All numeric data is stored unboxed. Functions written with the Repa combinators are automatically parallel provided you supply "+RTS -N" on the command line when running the program.

This document provides a tutorial on array programming in Haskell using the repa package.

*Note:* a companion tutorial to this is provided in vector tutorial.

## Contents |

# 1 Quick Tour

Repa (REgular PArallel arrays) is an advanced, multi-dimensional parallel arrays library for Haskell, with a number of distinct capabilities:

- The arrays are "regular" (i.e. dense and rectangular); and
- Functions may be written that are polymorphic in the shape of the array;
- Many operations on arrays are accomplished by changing only the shape of the array (without copying elements);
- The library will automatically parallelize operations over arrays.

This is a quick start guide for the package. For further information, consult:

- The Haddock Documentation
- Regular, Shape-polymorphic, Parallel Arrays in Haskell.
- Efﬁcient Parallel Stencil Convolution in Haskell

## 1.1 Importing the library

Download the `repa` package:

$ cabal install repa

and import it qualified:

import qualified Data.Array.Repa as R

The library needs to be imported qualified as it shares the same function names as list operations in the Prelude.

Note: Operations that involve writing new index types for Repa arrays will require the '-XTypeOperators' language extension.

For non-core functionality, a number of related packages are available:

* repa-bytestring * repa-io * repa-algorithms

and example algorithms in:

* repa-examples

## 1.2 Index types and shapes

Before we can get started manipulating arrays, we need a grasp of repa's notion of array shape. Much like the classic 'array' library in Haskell, repa-based arrays are parameterized via a type which determines the dimension of the array, and the type of its index. However, while classic arrays take tuples to represent multiple dimensions, Repa arrays use a richer type language for array indices and shapes.

Index types consist of two parts:

- a dimension component; and
- an index type

The most common dimensions are given by the shorthand names:

type DIM0 = Z type DIM1 = DIM0 :. Int type DIM2 = DIM1 :. Int type DIM3 = DIM2 :. Int type DIM4 = DIM3 :. Int type DIM5 = DIM4 :. Int

thus,

Array DIM2 Double

is a two-dimensional array of doubles, indexed via `Int` keys, while

Array Z Double

is a zero-dimension object (i.e. a point) holding a Double.

Many operations over arrays are polymorphic in the shape / dimension
component. Others require operating on the shape itself, rather than
the array. A typeclass, `Shape`

, lets us operate uniformally
over arrays with different shape.

## 1.3 Shapes

To build values of `shape` type, we can use the `Z` and `:.` constructors:

> Z -- the zero-dimension Z

For arrays of non-zero dimension, we must give a size. A common error is to leave off the type of the size,

> :t Z :. 10 Z :. 10 :: Num head => Z :. head

For arrays of non-zero dimension, we must give a size. A common error is to leave off the type of the size,

> :t Z :. 10 Z :. 10 :: Num head => Z :. head

leading to annoying type errors about unresolved instances, such as:

No instance for (Shape (Z :. head0))

To select the correct instance, you will need to annotate the size literals with their concrete type:

> :t Z :. (10 :: Int) Z :. (10 :: Int) :: Z :. Int

is the shape of 1D arrays of length 10, indexed via Ints.

Given an array, you can always find its shape by calling `extent`

.

## 1.4 Generating arrays

New repa arrays ("arrays" from here on) can be generated in many ways:

$ ghci GHCi, version 7.0.3: http://www.haskell.org/ghc/ :? for help Loading package ghc-prim ... linking ... done. Loading package integer-gmp ... linking ... done. Loading package base ... linking ... done. Loading package ffi-1.0 ... linking ... done. Prelude> :m + Data.Array.Repa

They may be constructed from lists:

A one dimensional array of length 10, here, given the shape `(Z :. 10)`:

> let x = fromList (Z :. (10::Int)) [1..10] > x [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0]

The type of `x` is inferred as:

> :t x x :: Array (Z :. Int) Double

which we can read as "an array of dimension 1, indexed via Int keys, holding elements of type Double"

We could also have written the type as:

x :: Array DIM1 Double

The same data may also be treated as a two dimensional array:

> let x = fromList (Z :. (5::Int) :. (2::Int)) [1..10] > x [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0]

which would have the type:

x :: Array ((Z :. Int) :. Int) Double

or

x :: Array DIM2 Double

### 1.4.1 Building arrays from vectors

It is also possible to build arrays from unboxed vectors:

fromVector :: Shape sh => sh -> Vector a -> Array sh a

by applying a shape to a vector.

import Data.Vector.Unboxed

> let x = fromVector (Z :. (10::Int)) (enumFromN 0 10) [0.0,1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0]

is a one-dimensional array of doubles, but we can also impose other shapes:

> let x = fromVector (Z :. (3::Int) :. (3::Int)) (enumFromN 0 9) > x [0.0,1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0] > :t x x :: Array ((Z :. Int) :. Int) Double

## 1.5 Indexing arrays

To access elements in repa arrays, you provide an array and a shape, to access the element:

(!) :: (Shape sh, Elt a) => Array sh a -> sh -> a

So:

> let x = fromList (Z :. (10::Int)) [1..10] > x ! (Z :. 2) 3.0

Note that we can't, even for one-dimensional arrays, give just a bare literal as the shape:

> x ! 2

No instance for (Num (Z :. Int)) arising from the literal `2'

as the Z type isn't in the Num class.

What if the index is out of bounds, though?

> x ! (Z :. 11) *** Exception: ./Data/Vector/Generic.hs:222 ((!)): index out of bounds (11,10)

an exception is thrown. An altnerative is to indexing functions that return a Maybe:

(!?) :: (Shape sh, Elt a) => Array sh a -> sh -> Maybe a

An example:

> x !? (Z :. 9) Just 10.0

> x !? (Z :. 11) Nothing

## 1.6 Operations on arrays

Besides indexing, there are many regular, list-like operations on arrays.

### 1.6.1 Maps, zips, filters and folds

We can map over multi-dimensional arrays:

> let x = fromList (Z :. (3::Int) :. (3::Int)) [1..9] > x [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0]

since `map` conflicts with the definition in the Prelude, we have to use it qualified:

> Data.Array.Repa.map (^2) x [1.0,4.0,9.0,16.0,25.0,36.0,49.0,64.0,81.0]

Maps leave the dimension unchanged.

Folding reduces the inner dimension of the array.

fold :: (Shape sh, Elt a) => (a -> a -> a) -> a -> Array (sh :. Int) a -> Array sh a

So if 'x' is a 3D array:

> let x = fromList (Z :. (3::Int) :. (3::Int)) [1..9] > x [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0]

We can sum each row, to yield a 2D array:

> fold (+) 0 x [6.0,15.0,24.0]

Two arrays may be combined via `zipWith`

:

zipWith :: (Shape sh, Elt b, Elt c, Elt a) => (a -> b -> c) -> Array sh a -> Array sh b -> Array sh c

an example:

> zipWith (*) x x [1.0,4.0,9.0,16.0,25.0,36.0,49.0,64.0,81.0]

### 1.6.2 Numeric operations: negation, addition, subtraction, multiplication

Repa arrays are instances of the `Num`

. This means that
operations on numerical elements are lifted automagically onto arrays of
such elements. For example, `(+)`

on two double values corresponds to
element-wise addition, `(+)`

, of the two arrays of doubles:

> let x = fromList (Z :. (10::Int)) [1..10] > x + x [2.0,4.0,6.0,8.0,10.0,12.0,14.0,16.0,18.0,20.0]

Other operations from the Num class work just as well:

> -x [-1.0,-2.0,-3.0,-4.0,-5.0,-6.0,-7.0,-8.0,-9.0,-10.0]

> x ^ 3 [1.0,8.0,27.0,64.0,125.0,216.0,343.0,512.0,729.0,1000.0]

> x * x [1.0,4.0,9.0,16.0,25.0,36.0,49.0,64.0,81.0,100.0]

## 1.7 Changing the shape of an array

One of the main advantages of repa-style arrays over other arrays in Haskell is the ability to reshape data without copying. This is achieved via *index-space transformations*.

An example: transposing a 2D array (this example taken from the repa paper). First, the type of the transformation:

transpose2D :: Elt e => Array DIM2 e -> Array DIM2 e

Note that this transform will work on DIM2 arrays holding any elements. Now, to swap rows and columns, we have to modify the shape:

transpose2D a = backpermute (swap e) swap a where e = extent a swap (Z :. i :. j) = Z :. j :. i

The swap function reorders the index space of the array. To do this, we extract the current shape of the array, and write a function that maps the index space from the old array to the new array. That index space function is then passed to backpermute which actually constructs the new array from the old one.

backpermute generates a new array from an old, when given the new shape, and a function that translates between the index space of each array (i.e. a shape transformer).

backpermute :: (Shape sh, Shape sh', Elt a) => sh' -> (sh' -> sh) -> Array sh a -> Array sh' a

Note that the array created is not actually evaluated (we only modified the index space of the array).

Transposition is such a common operation that it is provided by the library:

transpose :: (Shape sh, Elt a) => Array ((sh :. Int) :. Int) a -> Array ((sh :. Int) :. Int)

the type indicate that it works on the lowest two dimensions of the array.

Other operations on index spaces include taking slices and joining arrays into larger ones.