Numeric Haskell: A Repa Tutorial
Repa is a Haskell library for high performance, regular, multi-dimensional parallel arrays. All numeric data is stored unboxed. Functions written with the Repa combinators are automatically parallel provided you supply "+RTS -N" on the command line when running the program.
This document provides a tutorial on array programming in Haskell using the repa package.
Note: a companion tutorial to this is provided in vector tutorial.
Quick Tour
Repa (REgular PArallel arrays) is an advanced, multi-dimensional parallel arrays library for Haskell, with a number of distinct capabilities:
- The arrays are "regular" (i.e. dense and rectangular); and
- Functions may be written that are polymorphic in the shape of the array;
- Many operations on arrays are accomplished by changing only the shape of the array (without copying elements);
- The library will automatically parallelize operations over arrays.
This is a quick start guide for the package.
Importing the library
Download the `repa` package:
$ cabal install repa
and import it qualified:
import qualified Data.Array.Repa as R
The library needs to be imported qualified as it shares the same function names as list operations in the Prelude.
Note: Operations that involve writing new index types for Repa arrays will require the '-XTypeOperators' language extension.
For non-core functionality, a number of related packages are available:
* repa-bytestring * repa-io * repa-algorithms
and example algorithms in:
* repa-examples
Index types and shapes
Before we can get started manipulating arrays, we need a grasp of repa's notion of array shape. Much like the classic 'array' library in Haskell, repa-based arrays are parameterized via a type which determines the dimension of the array, and the type of its index. However, while classic arrays take tuples to represent multiple dimensions, Repa arrays use a richer type language for array indices and shapes.
Index types consist of two parts:
- a dimension component; and
- an index type
The most common dimensions are given by the shorthand names:
type DIM0 = Z type DIM1 = DIM0 :. Int type DIM2 = DIM1 :. Int type DIM3 = DIM2 :. Int type DIM4 = DIM3 :. Int type DIM5 = DIM4 :. Int
thus,
Array DIM2 Double
is a two-dimensional array of doubles, indexed via `Int` keys, while
Array Z Double
is a zero-dimension object (i.e. a point) holding a Double.
Many operations over arrays are polymorphic in the shape / dimension component.
Others require operating on the shape itself, rather than the array. A
typeclass, Shape
, lets us operate uniformally over arrays
with different shape.
Shapes
To build values of `shape` type, we can use the `Z` and `:.` constructors:
> Z -- the zero-dimension Z
For arrays of non-zero dimension, we must give a size. A common error is to leave off the type of the size,
> :t Z :. 10 Z :. 10 :: Num head => Z :. head
For arrays of non-zero dimension, we must give a size. A common error is to leave off the type of the size,
> :t Z :. 10 Z :. 10 :: Num head => Z :. head
leading to annoying type errors about unresolved instances, such as:
No instance for (Shape (Z :. head0))
To select the correct instance, you will need to annotate the size literals with their concrete type:
> :t Z :. (10 :: Int) Z :. (10 :: Int) :: Z :. Int
is the shape of 1D arrays of length 10, indexed via Ints.
Generating arrays
New repa arrays ("arrays" from here on) can be generated in many ways:
$ ghci GHCi, version 7.0.3: http://www.haskell.org/ghc/ :? for help Loading package ghc-prim ... linking ... done. Loading package integer-gmp ... linking ... done. Loading package base ... linking ... done. Loading package ffi-1.0 ... linking ... done. Prelude> :m + Data.Array.Repa
They may be constructed from lists:
A one dimensional array of length 10, here, given the shape `(Z :. 10)`:
> let x = fromList (Z :. (10::Int)) [1..10] > x [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0]
The type of `x` is inferred as:
> :t x x :: Array (Z :. Int) Double
which we can read as "an array of dimension 1, indexed via Int keys, holding elements of type Double"
We could also have written the type as:
x :: Array DIM1 Double
The same data may also be treated as a two dimensional array:
> let x = fromList (Z :. (5::Int) :. (2::Int)) [1..10] > x [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0]
which would have the type:
x :: Array ((Z :. Int) :. Int) Double
or
x :: Array DIM2 Double
Building arrays from vectors
It is also possible to build arrays from unboxed vectors:
fromVector :: Shape sh => sh -> Vector a -> Array sh a
by applying a shape to a vector.
import Data.Vector.Unboxed
> let x = fromVector (Z :. (10::Int)) (enumFromN 0 10) [0.0,1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0]
is a one-dimensional array of doubles, but we can also impose other shapes:
> let x = fromVector (Z :. (3::Int) :. (3::Int)) (enumFromN 0 9) > x [0.0,1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0] > :t x x :: Array ((Z :. Int) :. Int) Double
Indexing arrays
To access elements in repa arrays, you provide an array and a shape, to access the element:
(!) :: (Shape sh, Elt a) => Array sh a -> sh -> a
So:
> let x = fromList (Z :. (10::Int)) [1..10] > x ! (Z :. 2) 3.0
Note that we can't, even for one-dimensional arrays, give just a bare literal as the shape:
> x ! 2
No instance for (Num (Z :. Int)) arising from the literal `2'
as the Z type isn't in the Num class.
What if the index is out of bounds, though?
> x ! (Z :. 11) *** Exception: ./Data/Vector/Generic.hs:222 ((!)): index out of bounds (11,10)
an exception is thrown. An altnerative is to indexing functions that return a Maybe:
(!?) :: (Shape sh, Elt a) => Array sh a -> sh -> Maybe a
An example:
> x !? (Z :. 9) Just 10.0
> x !? (Z :. 11) Nothing
Operations on arrays
Besides indexing, there are many regular, list-like operations on arrays.
Maps, zips, filters and folds
We can map over multi-dimensional arrays:
> let x = fromList (Z :. (3::Int) :. (3::Int)) [1..9] > x [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0]
since `map` conflicts with the definition in the Prelude, we have to use it qualified:
> Data.Array.Repa.map (^2) x [1.0,4.0,9.0,16.0,25.0,36.0,49.0,64.0,81.0]
Maps leave the dimension unchanged.
Folding reduces the inner dimension of the array.
fold :: (Shape sh, Elt a) => (a -> a -> a) -> a -> Array (sh :. Int) a -> Array sh a
So if 'x' is a 3D array:
> let x = fromList (Z :. (3::Int) :. (3::Int)) [1..9] > x [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0]
We can sum each row, to yield a 2D array:
> fold (+) 0 x [6.0,15.0,24.0]
Two arrays may be combined via zipWith
zipWith :: (Shape sh, Elt b, Elt c, Elt a) => (a -> b -> c) -> Array sh a -> Array sh b -> Array sh c
an example:
> zipWith (*) x x [1.0,4.0,9.0,16.0,25.0,36.0,49.0,64.0,81.0]
Numeric operations: negation, addition, subtraction, multiplication
Repa arrays are instances of the Num
. This means that operations
on numerical elements are lifted automagically onto arrays of such elements:
For example, (+)
on two double values corresponds to element-wise addition,
(+)
, of the two arrays of doubles:
> let x = fromList (Z :. (10::Int)) [1..10] > x + x [2.0,4.0,6.0,8.0,10.0,12.0,14.0,16.0,18.0,20.0]
Other operations from the Num class work just as well:
> -x [-1.0,-2.0,-3.0,-4.0,-5.0,-6.0,-7.0,-8.0,-9.0,-10.0]
> x ^ 3 [1.0,8.0,27.0,64.0,125.0,216.0,343.0,512.0,729.0,1000.0]
> x * x [1.0,4.0,9.0,16.0,25.0,36.0,49.0,64.0,81.0,100.0]