Difference between revisions of "Numeric Haskell: A Repa Tutorial"
DonStewart (talk  contribs) (→Index Types) 
DonStewart (talk  contribs) 

Line 1:  Line 1:  
−  [http://hackage.haskell.org/package/repa Repa] is a Haskell library for high performance, regular, multidimensional parallel arrays. All numeric data is stored unboxed. Functions written with the Repa combinators are automatically parallel provided you supply +RTS Nwhatever on the command line when running the program. 

+  [http://hackage.haskell.org/package/repa Repa] is a Haskell library for 

+  high performance, regular, multidimensional parallel arrays. All 

+  numeric data is stored unboxed. Functions written with the Repa 

+  combinators are automatically parallel provided you supply "+RTS N" on 

+  the command line when running the program. 

−  See also the [http://www.haskell.org/haskellwiki/Numeric_Haskell:_A_Vector_Tutorial vector tutorial]. 

+  This document provides a tutorial on array programming in Haskell using 

+  the repa package. 

+  
+  ''Note:'' a companion tutorial to this is provided in [http://www.haskell.org/haskellwiki/Numeric_Haskell:_A_Vector_Tutorial vector tutorial]. 

= Quick Tour = 
= Quick Tour = 

Line 39:  Line 46:  
* [http://hackage.haskell.org/package/repaexamples repaexamples] 
* [http://hackage.haskell.org/package/repaexamples repaexamples] 

−  == Index 
+  == Index types and shapes == 
−  Much like the classic 'array' library in Haskell, repabased arrays are 

+  Before we can get started manipulating arrays, we need a grasp of repa's 

+  notion of array shape. Much like the classic 'array' library in Haskell, repabased arrays are 

parameterized via a type which determines the dimension of the array, and the type of its index. However, while classic arrays take tuples to represent multiple dimensions, Repa 
parameterized via a type which determines the dimension of the array, and the type of its index. However, while classic arrays take tuples to represent multiple dimensions, Repa 

arrays use a [http://hackage.haskell.org/packages/archive/repa/2.0.0.3/doc/html/DataArrayRepaShape.html#t:Shape richer type language] for array indices and shapes. 
arrays use a [http://hackage.haskell.org/packages/archive/repa/2.0.0.3/doc/html/DataArrayRepaShape.html#t:Shape richer type language] for array indices and shapes. 

Line 63:  Line 71:  
Array DIM2 Double 
Array DIM2 Double 

−  is a twodimensional array of doubles, indexed via `Int` keys 
+  is a twodimensional array of doubles, indexed via `Int` keys, while 
+  
+  Array Z Double 

+  
+  is a zerodimension object (i.e. a point) holding a Double. 

Many operations over arrays are polymorphic in the shape / dimension component. 
Many operations over arrays are polymorphic in the shape / dimension component. 

−  Others require operating on the shape itself, rather than the array. 
+  Others require operating on the shape itself, rather than the array. A 
+  typeclass, <hask>Shape</hask>, lets us operate uniformally over arrays 

+  with different shape. 

+  
+  == Shapes == 

To build values of `shape` type, we can use the `Z` and `:.` constructors: 
To build values of `shape` type, we can use the `Z` and `:.` constructors: 

Line 115:  Line 123:  
The type of `x` is inferred as: 
The type of `x` is inferred as: 

−  
> :t x 
> :t x 

Line 143:  Line 150:  
To access elements in repa arrays, you provide an array and a shape, to access the element: 
To access elements in repa arrays, you provide an array and a shape, to access the element: 

−  +  (!) :: (Shape sh, Elt a) => Array sh a > sh > a 

−  == Modifying arrays == 

+  So: 

−  == Generating arrays == 

+  > let x = fromList (Z :. (10::Int)) [1..10] 

+  > x ! (Z :. 2) 

+  3.0 

+  Note that we can't, even for onedimensional arrays, give just a bare literal as the shape: 

+  > x ! 2 

−  == Modifying arrays == 

+  No instance for (Num (Z :. Int)) 

+  arising from the literal `2' 

−  == Indexing arrays == 

+  as the Z type isn't in the Num class. 

−  To access elements in repa arrays, you provide an array and a shape, to access the element: 

+  What if the index is out of bounds, though? 

+  
+  > x ! (Z :. 11) 

+  *** Exception: ./Data/Vector/Generic.hs:222 ((!)): index out of bounds (11,10) 

+  
+  an exception is thrown. An altnerative is to indexing functions that return a Maybe: 

+  
+  (!?) :: (Shape sh, Elt a) => Array sh a > sh > Maybe a 

+  
+  An example: 

+  
+  > x !? (Z :. 9) 

+  Just 10.0 

+  
+  > x !? (Z :. 11) 

+  Nothing 

+  
+  == Operations on arrays == 

+  
+  === Numeric operations: negation, addition, subtraction, multiplication === 

+  
+  Repa arrays are instances of the <code>Num</code>. This means that operations 

+  on numerical elements are lifted automagically onto arrays of such elements: 

+  
+  For example, <hask>(+)</hask> on two double values corresponds to elementwise addition, 

+  <hask>(+)</hask>, of the two arrays of doubles: 

+  
+  > let x = fromList (Z :. (10::Int)) [1..10] 

+  > x + x 

+  [2.0,4.0,6.0,8.0,10.0,12.0,14.0,16.0,18.0,20.0] 

−  (!) :: (Shape sh, Elt a) => Array sh a > sh > a 

+  Other operations from the Num class work just as well: 

−  = Syntax = 

+  > x 

+  [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0] 

−  Repa arrays are instances of `Num`. This means that operations on numerical elements are lifted automagically onto arrays of such elements: 

+  > x ^ 3 

+  [1.0,8.0,27.0,64.0,125.0,216.0,343.0,512.0,729.0,1000.0] 

−  For example, `(+)` on two double values corresponds to zipwise `(+)` on two arrays of doubles. 

+  > x * x 

+  [1.0,4.0,9.0,16.0,25.0,36.0,49.0,64.0,81.0,100.0] 
Revision as of 21:21, 9 May 2011
Repa is a Haskell library for high performance, regular, multidimensional parallel arrays. All numeric data is stored unboxed. Functions written with the Repa combinators are automatically parallel provided you supply "+RTS N" on the command line when running the program.
This document provides a tutorial on array programming in Haskell using the repa package.
Note: a companion tutorial to this is provided in vector tutorial.
Contents
Quick Tour
Repa (REgular PArallel arrays) is an advanced, multidimensional parallel arrays library for Haskell, with a number of distinct capabilities:
 The arrays are "regular" (i.e. dense and rectangular); and
 Functions may be written that are polymorphic in the shape of the array;
 Many operations on arrays are accomplished by changing only the shape of the array (without copying elements);
 The library will automatically parallelize operations over arrays.
This is a quick start guide for the package.
Importing the library
Download the `repa` package:
$ cabal install repa
and import it qualified:
import qualified Data.Array.Repa as R
The library needs to be imported qualified as it shares the same function names as list operations in the Prelude.
Note: Operations that involve writing new index types for Repa arrays will require the 'XTypeOperators' language extension.
For noncore functionality, a number of related packages are available:
* repabytestring * repaio * repaalgorithms
and example algorithms in:
* repaexamples
Index types and shapes
Before we can get started manipulating arrays, we need a grasp of repa's notion of array shape. Much like the classic 'array' library in Haskell, repabased arrays are parameterized via a type which determines the dimension of the array, and the type of its index. However, while classic arrays take tuples to represent multiple dimensions, Repa arrays use a richer type language for array indices and shapes.
Index types consist of two parts:
 a dimension component; and
 an index type
The most common dimensions are given by the shorthand names:
type DIM0 = Z type DIM1 = DIM0 :. Int type DIM2 = DIM1 :. Int type DIM3 = DIM2 :. Int type DIM4 = DIM3 :. Int type DIM5 = DIM4 :. Int
thus,
Array DIM2 Double
is a twodimensional array of doubles, indexed via `Int` keys, while
Array Z Double
is a zerodimension object (i.e. a point) holding a Double.
Many operations over arrays are polymorphic in the shape / dimension component.
Others require operating on the shape itself, rather than the array. A
typeclass, Shape
, lets us operate uniformally over arrays
with different shape.
Shapes
To build values of `shape` type, we can use the `Z` and `:.` constructors:
> Z  the zerodimension Z
For arrays of nonzero dimension, we must give a size. A common error is to leave off the type of the size,
> :t Z :. 10 Z :. 10 :: Num head => Z :. head
For arrays of nonzero dimension, we must give a size. A common error is to leave off the type of the size,
> :t Z :. 10 Z :. 10 :: Num head => Z :. head
leading to annoying type errors about unresolved instances, such as:
No instance for (Shape (Z :. head0))
To select the correct instance, you will need to annotate the size literals with their concrete type:
> :t Z :. (10 :: Int) Z :. (10 :: Int) :: Z :. Int
is the shape of 1D arrays of length 10, indexed via Ints.
Generating arrays
New repa arrays ("arrays" from here on) can be generated in many ways:
$ ghci GHCi, version 7.0.3: http://www.haskell.org/ghc/ :? for help Loading package ghcprim ... linking ... done. Loading package integergmp ... linking ... done. Loading package base ... linking ... done. Loading package ffi1.0 ... linking ... done. Prelude> :m + Data.Array.Repa
They may be constructed from lists:
A one dimensional array of length 10, here, given the shape `(Z :. 10)`:
> let x = fromList (Z :. (10::Int)) [1..10] > x [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0]
The type of `x` is inferred as:
> :t x x :: Array (Z :. Int) Double
which we can read as "an array of dimension 1, indexed via Int keys, holding elements of type Double"
We could also have written the type as:
x :: Array DIM1 Double
The same data may also be treated as a two dimensional array:
> let x = fromList (Z :. (5::Int) :. (2::Int)) [1..10] > x [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0]
which would have the type:
x :: Array ((Z :. Int) :. Int) Double
or
x :: Array DIM2 Double
Indexing arrays
To access elements in repa arrays, you provide an array and a shape, to access the element:
(!) :: (Shape sh, Elt a) => Array sh a > sh > a
So:
> let x = fromList (Z :. (10::Int)) [1..10] > x ! (Z :. 2) 3.0
Note that we can't, even for onedimensional arrays, give just a bare literal as the shape:
> x ! 2
No instance for (Num (Z :. Int)) arising from the literal `2'
as the Z type isn't in the Num class.
What if the index is out of bounds, though?
> x ! (Z :. 11) *** Exception: ./Data/Vector/Generic.hs:222 ((!)): index out of bounds (11,10)
an exception is thrown. An altnerative is to indexing functions that return a Maybe:
(!?) :: (Shape sh, Elt a) => Array sh a > sh > Maybe a
An example:
> x !? (Z :. 9) Just 10.0
> x !? (Z :. 11) Nothing
Operations on arrays
Numeric operations: negation, addition, subtraction, multiplication
Repa arrays are instances of the Num
. This means that operations
on numerical elements are lifted automagically onto arrays of such elements:
For example, (+)
on two double values corresponds to elementwise addition,
(+)
, of the two arrays of doubles:
> let x = fromList (Z :. (10::Int)) [1..10] > x + x [2.0,4.0,6.0,8.0,10.0,12.0,14.0,16.0,18.0,20.0]
Other operations from the Num class work just as well:
> x [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0]
> x ^ 3 [1.0,8.0,27.0,64.0,125.0,216.0,343.0,512.0,729.0,1000.0]
> x * x [1.0,4.0,9.0,16.0,25.0,36.0,49.0,64.0,81.0,100.0]