# Difference between revisions of "Numeric Haskell: A Repa Tutorial"

Jump to: navigation, search

Repa is a Haskell library for high performance, regular, multi-dimensional parallel arrays. All numeric data is stored unboxed. Functions written with the Repa combinators are automatically parallel provided you supply "+RTS -N" on the command line when running the program.

This document provides a tutorial on array programming in Haskell using the repa package.

Note: a companion tutorial to this is provided in vector tutorial.

# Quick Tour

Repa (REgular PArallel arrays) is an advanced, multi-dimensional parallel arrays library for Haskell, with a number of distinct capabilities:

• The arrays are "regular" (i.e. dense and rectangular); and
• Functions may be written that are polymorphic in the shape of the array;
• Many operations on arrays are accomplished by changing only the shape of the array (without copying elements);
• The library will automatically parallelize operations over arrays.

This is a quick start guide for the package. For further information, consult:

## Importing the library

Download the `repa` package:

```  \$ cabal install repa
```

and import it qualified:

```  import qualified Data.Array.Repa as R
```

The library needs to be imported qualified as it shares the same function names as list operations in the Prelude.

Note: Operations that involve writing new index types for Repa arrays will require the '-XTypeOperators' language extension.

For non-core functionality, a number of related packages are available:

```* repa-bytestring
* repa-io
* repa-algorithms
```

and example algorithms in:

```* repa-examples
```

## Index types and shapes

Before we can get started manipulating arrays, we need a grasp of repa's notion of array shape. Much like the classic 'array' library in Haskell, repa-based arrays are parameterized via a type which determines the dimension of the array, and the type of its index. However, while classic arrays take tuples to represent multiple dimensions, Repa arrays use a richer type language for array indices and shapes.

Index types consist of two parts:

• a dimension component; and
• an index type

The most common dimensions are given by the shorthand names:

```   type DIM0 = Z
type DIM1 = DIM0 :. Int
type DIM2 = DIM1 :. Int
type DIM3 = DIM2 :. Int
type DIM4 = DIM3 :. Int
type DIM5 = DIM4 :. Int
```

thus,

```   Array DIM2 Double
```

is a two-dimensional array of doubles, indexed via `Int` keys, while

```   Array Z Double
```

is a zero-dimension object (i.e. a point) holding a Double.

Many operations over arrays are polymorphic in the shape / dimension component. Others require operating on the shape itself, rather than the array. A typeclass, `Shape`, lets us operate uniformally over arrays with different shape.

## Shapes

To build values of `shape` type, we can use the `Z` and `:.` constructors:

```   > Z         -- the zero-dimension
Z
```

For arrays of non-zero dimension, we must give a size. A common error is to leave off the type of the size,

```   > :t Z :. 10
Z :. 10 :: Num head => Z :. head
```

For arrays of non-zero dimension, we must give a size. A common error is to leave off the type of the size,

```   > :t Z :. 10
Z :. 10 :: Num head => Z :. head
```

leading to annoying type errors about unresolved instances, such as:

```   No instance for (Shape (Z :. head0))
```

To select the correct instance, you will need to annotate the size literals with their concrete type:

```   > :t Z :. (10 :: Int)
Z :. (10 :: Int) :: Z :. Int
```

is the shape of 1D arrays of length 10, indexed via Ints.

Given an array, you can always find its shape by calling `extent`.

## Generating arrays

New repa arrays ("arrays" from here on) can be generated in many ways:

```   \$ ghci
GHCi, version 7.0.3: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Loading package ffi-1.0 ... linking ... done.
Prelude> :m + Data.Array.Repa
```

They may be constructed from lists:

A one dimensional array of length 10, here, given the shape `(Z :. 10)`:

```   > let x = fromList (Z :. (10::Int)) [1..10]
> x
[1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0]
```

The type of `x` is inferred as:

```   > :t x
x :: Array (Z :. Int) Double
```

which we can read as "an array of dimension 1, indexed via Int keys, holding elements of type Double"

We could also have written the type as:

```   x :: Array DIM1 Double
```

The same data may also be treated as a two dimensional array:

```   > let x = fromList (Z :. (5::Int) :. (2::Int)) [1..10]
> x
[1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0]
```

which would have the type:

```   x :: Array ((Z :. Int) :. Int) Double
```

or

```   x :: Array DIM2 Double
```

### Building arrays from vectors

It is also possible to build arrays from unboxed vectors:

```   fromVector :: Shape sh => sh -> Vector a -> Array sh a
```

by applying a shape to a vector.

```   import Data.Vector.Unboxed
```
```   > let x = fromVector (Z :. (10::Int)) (enumFromN 0 10)
[0.0,1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0]
```

is a one-dimensional array of doubles, but we can also impose other shapes:

```   > let x = fromVector (Z :. (3::Int) :. (3::Int)) (enumFromN 0 9)
> x
[0.0,1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0]
> :t x
x :: Array ((Z :. Int) :. Int) Double
```

## Indexing arrays

To access elements in repa arrays, you provide an array and a shape, to access the element:

```   (!) :: (Shape sh, Elt a) => Array sh a -> sh -> a
```

So:

```   > let x = fromList (Z :. (10::Int)) [1..10]
> x ! (Z :. 2)
3.0
```

Note that we can't, even for one-dimensional arrays, give just a bare literal as the shape:

```   > x ! 2
```
```       No instance for (Num (Z :. Int))
arising from the literal `2'
```

as the Z type isn't in the Num class.

What if the index is out of bounds, though?

```   > x ! (Z :. 11)
*** Exception: ./Data/Vector/Generic.hs:222 ((!)): index out of bounds (11,10)
```

an exception is thrown. An altnerative is to indexing functions that return a Maybe:

```   (!?) :: (Shape sh, Elt a) => Array sh a -> sh -> Maybe a
```

An example:

```   > x !? (Z :. 9)
Just 10.0
```
```   > x !? (Z :. 11)
Nothing
```

## Operations on arrays

Besides indexing, there are many regular, list-like operations on arrays.

### Maps, zips, filters and folds

We can map over multi-dimensional arrays:

```   > let x = fromList (Z :. (3::Int) :. (3::Int)) [1..9]
> x
[1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0]
```

since `map` conflicts with the definition in the Prelude, we have to use it qualified:

```   > Data.Array.Repa.map (^2) x
[1.0,4.0,9.0,16.0,25.0,36.0,49.0,64.0,81.0]
```

Maps leave the dimension unchanged.

Folding reduces the inner dimension of the array.

```   fold :: (Shape sh, Elt a) => (a -> a -> a) -> a -> Array (sh :. Int) a -> Array sh a
```

So if 'x' is a 3D array:

```   > let x = fromList (Z :. (3::Int) :. (3::Int)) [1..9]
> x
[1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0]
```

We can sum each row, to yield a 2D array:

```   > fold (+) 0 x
[6.0,15.0,24.0]
```

Two arrays may be combined via `zipWith`

```   zipWith :: (Shape sh, Elt b, Elt c, Elt a) =>
(a -> b -> c) -> Array sh a -> Array sh b -> Array sh c
```

an example:

```   > zipWith (*) x x
[1.0,4.0,9.0,16.0,25.0,36.0,49.0,64.0,81.0]
```

### Numeric operations: negation, addition, subtraction, multiplication

Repa arrays are instances of the `Num`. This means that operations on numerical elements are lifted automagically onto arrays of such elements:

For example, `(+)` on two double values corresponds to element-wise addition, `(+)`, of the two arrays of doubles:

```   > let x = fromList (Z :. (10::Int)) [1..10]
> x + x
[2.0,4.0,6.0,8.0,10.0,12.0,14.0,16.0,18.0,20.0]
```

Other operations from the Num class work just as well:

```   > -x
[-1.0,-2.0,-3.0,-4.0,-5.0,-6.0,-7.0,-8.0,-9.0,-10.0]
```
```   > x ^ 3
[1.0,8.0,27.0,64.0,125.0,216.0,343.0,512.0,729.0,1000.0]
```
```   > x * x
[1.0,4.0,9.0,16.0,25.0,36.0,49.0,64.0,81.0,100.0]
```

## Changing the shape of an array

One of the main advantages of repa-style arrays over other arrays in Haskell is the ability to reshape data without copying. This is achieved via *index-space transformations*.

An example: transposing a 2D array (this example taken from the repa paper). First, the type of the transformation:

```   transpose2D :: Elt e => Array DIM2 e -> Array DIM2 e
```

Note that this transform will work on DIM2 arrays holding any elements. Now, to swap rows and columns, we have to modify the shape:

```   transpose2D a = backpermute (swap e) swap a
where
e = extent a
swap (Z :. i :. j) = Z :. j :. i
```

The swap function reorders the index space of the array. To do this, we extract the current shape of the array, and write a function that maps the index space from the old array to the new array. That index space function is then passed to `backpermute` which actually constructs the new array from the old one.

backpermute generates a new array from an old, when given the new shape, and a function that translates between the index space of each array (i.e. a shape transformer).

```   backpermute
:: (Shape sh, Shape sh', Elt a)
=> sh'
-> (sh' -> sh)
-> Array sh a
-> Array sh' a
```

Note that the array created is not actually evaluated (we only modified the index space of the array).

Transposition is such a common operation that it is provided by the library:

```   transpose :: (Shape sh, Elt a)
=> Array ((sh :. Int) :. Int) a -> Array ((sh :. Int) :. Int)
```

the type indicate that it works on the lowest two dimensions of the array.

Other operations on index spaces include taking slices and joining arrays into larger ones.