Difference between revisions of "Numeric Haskell: A Repa Tutorial"
DonStewart (talk  contribs) m 
(→Type indexes) 

(118 intermediate revisions by 12 users not shown)  
Line 1:  Line 1:  
−  [http://hackage.haskell.org/package/repa Repa] is a Haskell library for high performance, regular, multidimensional parallel arrays. All numeric data is stored unboxed. Functions written with the Repa combinators are automatically parallel provided you supply "+RTS N" on the command line when running the program. 

+  [http://hackage.haskell.org/package/repa Repa] is a Haskell library for 

+  high performance, regular, multidimensional parallel arrays. All 

+  numeric data is stored unboxed and functions written with the Repa 

+  combinators are automatically parallel (provided you supply "+RTS N" on 

+  the command line when running the program). 

−  This document provides a tutorial on array programming in Haskell using the repa package. 

+  [[Image:Grid.pngright]] 

−  ''Note:'' a companion tutorial to this is provided in [http://www.haskell.org/haskellwiki/Numeric_Haskell:_A_Vector_Tutorial vector tutorial]. 

+  This document provides a tutorial on array programming in Haskell using the Repa package and was based on version 3.2 of the library. Repa versions earlier than 3.0 use different API and are not compatible with this tutorial. 

+  
+  ''Note: a companion tutorial to this is provided as the [http://www.haskell.org/haskellwiki/Numeric_Haskell:_A_Vector_Tutorial vector tutorial], and is based on the [http://www.scipy.org/Tentative_NumPy_Tutorial NumPy tutorial]''. 

+  
+  ''Authors: [http://donsbot.wordpress.com Don Stewart] (original version), [http://ics.p.lodz.pl/~stolarek Jan Stolarek] (update from Repa 2 to Repa 3).'' 

= Quick Tour = 
= Quick Tour = 

Line 9:  Line 17:  
Repa (REgular PArallel arrays) is an advanced, multidimensional parallel arrays library for Haskell, with a number of distinct capabilities: 
Repa (REgular PArallel arrays) is an advanced, multidimensional parallel arrays library for Haskell, with a number of distinct capabilities: 

−  * The arrays are "regular" (i.e. dense and 
+  * The arrays are "regular" (i.e. dense, rectangular and store elements all of the same type); and 
* Functions may be written that are polymorphic in the shape of the array; 
* Functions may be written that are polymorphic in the shape of the array; 

* Many operations on arrays are accomplished by changing only the shape of the array (without copying elements); 
* Many operations on arrays are accomplished by changing only the shape of the array (without copying elements); 

Line 17:  Line 25:  
* [http://hackage.haskell.org/packages/archive/repa/2.0.0.3/doc/html/DataArrayRepa.html The Haddock Documentation] 
* [http://hackage.haskell.org/packages/archive/repa/2.0.0.3/doc/html/DataArrayRepa.html The Haddock Documentation] 

+  * [http://research.microsoft.com/enus/um/people/simonpj/papers/ndp/haskell12lippmeier.pdf Guiding Parallel Array Fusion with Indexed Types]  describes architecture of Repa 3 

* [http://research.microsoft.com/enus/um/people/simonpj/papers/ndp/rarrays.pdf Regular, Shapepolymorphic, Parallel Arrays in Haskell]. 
* [http://research.microsoft.com/enus/um/people/simonpj/papers/ndp/rarrays.pdf Regular, Shapepolymorphic, Parallel Arrays in Haskell]. 

* [http://www.cse.unsw.edu.au/~benl/papers/stencil/stencilicfp2011sub.pdf Efﬁcient Parallel Stencil Convolution in Haskell] 
* [http://www.cse.unsw.edu.au/~benl/papers/stencil/stencilicfp2011sub.pdf Efﬁcient Parallel Stencil Convolution in Haskell] 

+  * [http://repa.ouroborus.net/ Repa project homepage] 

−  +  = Importing the library = 

Download the `repa` package: 
Download the `repa` package: 

−  +  $ cabal install repa 

and import it qualified: 
and import it qualified: 

−  +  import qualified Data.Array.Repa as R 

The library needs to be imported qualified as it shares the same function names as list operations in the Prelude. 
The library needs to be imported qualified as it shares the same function names as list operations in the Prelude. 

−  Note: Operations that involve writing new index types for Repa arrays will require the 'XTypeOperators' language extension. 
+  ''Note: Operations that involve writing new index types for Repa arrays will require the 'XTypeOperators' language extension.'' 
For noncore functionality, a number of related packages are available: 
For noncore functionality, a number of related packages are available: 

−  +  * [http://hackage.haskell.org/package/repaio repaio] 

−  +  * [http://hackage.haskell.org/package/repaalgorithms repaalgorithms] 

−  +  * [http://hackage.haskell.org/package/repadevil repadevil] (image loading) 

and example algorithms in: 
and example algorithms in: 

−  +  * [http://hackage.haskell.org/package/repaexamples repaexamples] 

−  == Index types and shapes == 

+  = Type indexes = 

+  Repa 3 introduced a notion of ''type index'': 

−  Before we can get started manipulating arrays, we need a grasp of repa's notion of array shape. Much like the classic 'array' library in Haskell, repabased arrays are parameterized via a type which determines the dimension of the array, and the type of its index. However, while classic arrays take tuples to represent multiple dimensions, Repa arrays use a [http://hackage.haskell.org/packages/archive/repa/2.0.0.3/doc/html/DataArrayRepaShape.html#t:Shape richer type language] for array indices and shapes. 

+  data family Array rep sh e 

−  Index types consist of two parts: 

+  In the above definition <code>rep</code> is the type index, which is represented with a data type name and defines the type of underlying array representation. This allows to explicitly specify representation of an array on type system level. Repa distinguishes ten different types of representation, which fall into three categories: manifest, delayed and meta. Manifest arrays are represented by concrete values stored in memory using e.g. unboxed vectors (represented with data type U) or strict bytestrings (B). Delayed array (D) is a function from index to a value. Every time an element is requested from a delayed array it is calculated anew, which means that delayed arrays are inefficient when the data is needed multiple times. On the other hand delaying an array allows to perform optimisation by fusion. Meta arrays provide additional information about underlying array data, which can be used for load balancing or speeding up calculation of border effects. See [http://research.microsoft.com/enus/um/people/simonpj/papers/ndp/haskell12lippmeier.pdf Guiding Parallel Array Fusion with Indexed Types] paper for a detailed explanation of each representation. 

−  * a dimension component; and 

+  = Array shapes = 

−  * an index type 

+  
+  Before we can get started manipulating arrays, we need a grasp of repa's notion of array shape. Much like the classic 'array' library in Haskell, repabased arrays are parameterized via a type which determines the dimension of the array, and the type of its index. However, while classic arrays take tuples to represent multiple dimensions, Repa arrays use a [http://hackage.haskell.org/packages/archive/repa/latest/doc/html/DataArrayRepaShape.html#t:Shape richer type language] for describing multidimensional array indices and shapes (technically, a ''heterogeneous snoc list''). 

+  
+  Shape types are built somewhat like lists. The constructor <code>Z</code> corresponds 

+  to a rank zero shape, and is used to mark the end of the list. The <code>:.</code> constructor adds additional dimensions to the shape. So, for example, the shape: 

+  
+  (Z :. 3 :. 2 :. 3) 

+  
+  is the shape of a small 3D array, with shape type 

+  
+  (Z :. Int :. Int :. Int) 

The most common dimensions are given by the shorthand names: 
The most common dimensions are given by the shorthand names: 

−  type DIM0 = Z 

+  <haskell> 

−  +  type DIM0 = Z 

−  +  type DIM1 = DIM0 :. Int 

−  +  type DIM2 = DIM1 :. Int 

−  +  type DIM3 = DIM2 :. Int 

−  +  type DIM4 = DIM3 :. Int 

+  type DIM5 = DIM4 :. Int 

+  </haskell> 

thus, 
thus, 

−  Array DIM2 Double 

+  <haskell> 

+  Array U DIM2 Double 

+  </haskell> 

−  is a twodimensional array of doubles, indexed via 
+  is the type of a twodimensional array of unboxed doubles, indexed via <code>Int</code> keys, while 
−  Array Z Double 

+  <haskell> 

+  Array U Z Double 

+  </haskell> 

−  is a zerodimension object (i.e. a point) holding 
+  is a zerodimension object (i.e. a point) holding an unboxed Double. 
Many operations over arrays are polymorphic in the shape / dimension 
Many operations over arrays are polymorphic in the shape / dimension 

component. Others require operating on the shape itself, rather than 
component. Others require operating on the shape itself, rather than 

−  the array. A typeclass, <code>Shape</code>, lets us operate 
+  the array. A typeclass, <code>Shape</code>, lets us operate uniformly 
over arrays with different shape. 
over arrays with different shape. 

−  == 
+  == Building shapes == 
−  To build values of 
+  To build values of shape type, we can use the <code>Z</code> and <code>:.</code> constructors. Open the ghci and import Repa: 
−  > Z  the zerodimension 

+  <haskell> 

−  Z 

+  ghci> :m +Data.Array.Repa 

+  ghci> Z  the zerodimension 

+  Z 

+  </haskell> 

−  For arrays of nonzero dimension, we must give a size. 
+  For arrays of nonzero dimension, we must give a size. ''Note:'' a common error is to leave off the ''type'' of the size. 
−  > :t Z :. 10 

+  <haskell> 

−  Z :. 10 :: Num head => Z :. head 

+  ghci> :t Z :. 10 

−  
+  Z :. 10 :: Num head => Z :. head 

−  For arrays of nonzero dimension, we must give a size. A common error is to leave off the type of the size, 

+  </haskell> 

−  
−  > :t Z :. 10 

−  Z :. 10 :: Num head => Z :. head 

leading to annoying type errors about unresolved instances, such as: 
leading to annoying type errors about unresolved instances, such as: 

Line 100:  Line 125:  
To select the correct instance, you will need to annotate the size literals with their concrete type: 
To select the correct instance, you will need to annotate the size literals with their concrete type: 

−  > :t Z :. (10 :: Int) 

+  <haskell> 

−  +  ghci> :t Z :. (10 :: Int) 

+  Z :. (10 :: Int) :: Z :. Int 

+  </haskell> 

is the shape of 1D arrays of length 10, indexed via Ints. 
is the shape of 1D arrays of length 10, indexed via Ints. 

Line 107:  Line 132:  
Given an array, you can always find its shape by calling <code>extent</code>. 
Given an array, you can always find its shape by calling <code>extent</code>. 

−  == Generating arrays == 

+  Additional convenience types for selecting particular parts of a shape are also provided (<code>All, Any, Slice</code> etc.) which are covered later in the tutorial. 

−  New repa arrays ("arrays" from here on) can be generated in many ways: 

+  == Working with shapes == 

−  $ ghci 

+  That one key operation, <code>extent</code>, gives us many attributes of an array: 

−  GHCi, version 7.0.3: http://www.haskell.org/ghc/ :? for help 

−  Loading package ghcprim ... linking ... done. 

−  Loading package integergmp ... linking ... done. 

−  Loading package base ... linking ... done. 

−  Loading package ffi1.0 ... linking ... done. 

−  Prelude> :m + Data.Array.Repa 

−  They may be constructed from lists: 

+  <haskell> 

+   Extract the shape of the array 

+  extent :: (Shape sh, Source r e) => Array r sh e > sh 

+  </haskell> 

−  A one dimensional array of length 10, here, given the shape `(Z :. 10)`: 

+  So, given a 3x3x3 array, of type <code>Array U DIM3 Int</code>, we can: 

−  > let x = fromList (Z :. (10::Int)) [1..10] 

+  <haskell> 

−  > x 

+   build an array 

−  [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0] 

+  ghci> let x :: Array U DIM3 Int; x = fromListUnboxed (Z :. (3::Int) :. (3::Int) :. (3::Int)) [1..27] 

+  ghci> :t x 

+  x :: Array U DIM3 Int 

−  The type of `x` is inferred as: 

+   query the extent 

+  ghci> extent x 

+  ((Z :. 3) :. 3) :. 3 

−  > :t x 

+   compute the rank (number of dimensions) 

−  x :: Array (Z :. Int) Double 

+  ghci> let sh = extent x 

+  ghci> rank sh 

+  3 

−  which we can read as "an array of dimension 1, indexed via Int keys, holding elements of type Double" 

+   compute the size (total number of elements) 

+  > size sh 

+  27 

+  
+   extract the elements of the array as a flat vector 

+  ghci> toUnboxed x 

+  fromList [1,2,3,4,5,6,7,8,9,10 

+  ,11,12,13,14,15,16,17,18,19 

+  ,20,21,22,23,24,25,26,27] :: Data.Vector.Unboxed.Base.Vector Int 

+  </haskell> 

+  
+  = Generating arrays = 

+  
+  New repa arrays ("arrays" from here on) can be generated in many ways, and we always begin by importing the <code>Data.Array.Repa</code> module: 

+  
+  <code> 

+  $ ghci 

+  GHCi, version 7.4.1: http://www.haskell.org/ghc/ :? for help 

+  Loading package ghcprim ... linking ... done. 

+  Loading package integergmp ... linking ... done. 

+  Loading package base ... linking ... done. 

+  ghci> :m + Data.Array.Repa 

+  </code> 

+  They may be constructed from lists, for example. Here is a one dimensional array of length 10, here, given the shape <code>(Z :. 10)</code>: 

+  
+  <haskell> 

+  ghci> let inputs = [1..10] :: [Double] 

+  ghci> let x = fromListUnboxed (Z :. (10::Int)) inputs 

+  ghci> x 

+  AUnboxed (Z :. 10) (fromList [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0]) 

+  </haskell> 

+  
+  The type of <code>x</code> is inferred as: 

+  
+  <haskell> 

+  ghci> :t x 

+  x :: Array U (Z :. Int) Double 

+  </haskell> 

+  
+  which we can read as "an array of dimension 1, indexed via <code>Int</code> keys, holding elements of type <code>Double</code> stored using unboxed vectors" 

We could also have written the type as: 
We could also have written the type as: 

−  x :: Array DIM1 Double 

+  <haskell> 

+  ghci> let x' = fromListUnboxed (Z :. 10 :: DIM1) inputs 

+  ghci> :t x' 

+  x' :: Array U DIM1 Double 

+  </haskell> 

−  The same data may also be treated as a two dimensional array: 
+  The same data may also be treated as a two dimensional array, by changing the shape parameter: 
−  > let x = fromList (Z :. (5::Int) :. (2::Int)) [1..10] 

+  <haskell> 

−  > x 

+  ghci> let x2 = fromListUnboxed (Z :. (5::Int) :. (2::Int)) inputs 

−  [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0] 

+  ghci> x2 

+  AUnboxed ((Z :. 5) :. 2) (fromList [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0]) 

+  </haskell> 

−  which 
+  which has the type: 
−  x :: Array ((Z :. Int) :. Int) Double 

+  <haskell> 

+  ghci> :t x2 

+  x2 :: Array U ((Z :. Int) :. Int) Double 

+  </haskell> 

−  or 

+  or, as above, if we define it with the type synonym for 2 dimensional <code>Int</code> indexed arrays, <code>DIM2</code>: 

−  x :: Array DIM2 Double 

−  === Building arrays from vectors === 

+  <haskell> 

+  ghci> let x2' = fromListUnboxed (Z :. 5 :. 2 :: DIM2) inputs 

+  ghci> x2' 

+  AUnboxed ((Z :. 5) :. 2) (fromList [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0]) 

+  ghci> :t x2' 

+  x2' :: Array U DIM2 Double 

−  It is also possible to build arrays from unboxed vectors: 

+  </haskell> 

−  fromVector :: Shape sh => sh > Vector a > Array sh a 

+  == Building arrays from vectors == 

−  by applying a shape to a vector. 

+  It is also possible to build arrays from unboxed vectors, from the 'vector' package: 

−  import Data.Vector.Unboxed 

+  <haskell> 

+  fromUnboxed :: (Shape sh, Unbox e) => sh > Vector e > Array U sh e 

+  </haskell> 

−  > let x = fromVector (Z :. (10::Int)) (enumFromN 0 10) 

+  New arrays are built by applying a shape to the vector. For example: 

−  [0.0,1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0] 

−  is a onedimensional array of doubles, but we can also impose other 

+  <haskell> 

+  ghci> :m + Data.Vector.Unboxed 

+  ghci> let x = fromUnboxed (Z :. (10::Int)) (enumFromN 0 10) 

+  ghci> x 

+  AUnboxed (Z :. 10) (fromList [0.0,1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0]) 

+  </haskell> 

+  
+  is a onedimensional array of doubles. As usual, we can also impose other 

shapes: 
shapes: 

−  > let x = fromVector (Z :. (3::Int) :. (3::Int)) (enumFromN 0 9) 

+  <haskell> 

−  > x 

+  ghci> let x = fromUnboxed (Z :. (3::Int) :. (3::Int)) (enumFromN 0 9) 

−  [0.0,1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0] 

+  ghci> x 

−  > :t x 

+  AUnboxed ((Z :. 3) :. 3) (fromList [0.0,1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0]) 

−  x :: Array ((Z :. Int) :. Int) Double 

+  ghci> :t x 

+  x :: Array U ((Z :. Int) :. Int) Double 

+  </haskell> 

+  
+  to create a 3x3 array. 

+  
+  == Generating random arrays == 

+  
+  The [http://hackage.haskell.org/package/repaalgorithms repaalgorithms] package lets us generate new arrays with random data: 

+  
+  <haskell> 

+   3d array of Ints, bounded between 0 and 255. 

+  ghci> randomishIntArray (Z :. (3::Int) :. (3::Int) :. (3::Int)) 0 255 1 

+  AUnboxed (((Z :. 3) :. 3) :. 3) (fromList [217,42,130,200,216,254,67,77,152, 

+  85,140,226,179,71,23,17,152,84,47,17,45,5,88,245,107,214,136]) 

+  </haskell> 

+  
+  == Reading arrays from files == 

+  
+  Using the [http://hackage.haskell.org/package/repaio repaio] package, arrays may be written and read from files in a number of formats: 

+  
+  * as BMP files; and 

+  * in a number of text formats. 

+  
+  with other formats rapidly appearing. An example: to write an 2D array to an ascii file: 

+  
+  <haskell> 

+  ghci> :m +Data.Array.Repa.IO.Matrix 

+  ghci> let x = fromList (Z :. 5 :. 2 :: DIM2) [1..10] 

+  ghci> writeMatrixToTextFile "test.dat" x 

+  </haskell> 

+  
+  This will result in a file containing: 

+  
+  MATRIX 

+  5 2 

+  1.0 

+  2.0 

+  3.0 

+  4.0 

+  5.0 

+  6.0 

+  7.0 

+  8.0 

+  9.0 

+  10.0 

+  
+  In turn, this file may be read back in via <code>readMatrixFromTextFile</code>. 

+  
+  <haskell> 

+  ghci> xx < readMatrixFromTextFile "test.dat" 

+  ghci> xx 

+  AUnboxed ((Z :. 5) :. 2) (fromList [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0]) 

+  ghci> :t xx 

+  xx :: Array U DIM2 Double 

+  </haskell> 

+  
+  To process [http://en.wikipedia.org/wiki/BMP_file_format .bmp files], use [http://hackage.haskell.org/packages/archive/repaio/2.0.0.3/doc/html/DataArrayRepaIOBMP.html Data.Array.Repa.IO.BMP], as follows (currently reading only works for 24 bit .bmp): 

+  
+  <haskell> 

+  Data.Array.Repa.IO.BMP> x < readImageFromBMP "/tmp/test24.bmp" 

+  </haskell> 

+  
+  Reads this .bmp image: 

+  
+  http://i.imgur.com/T4ttx.png 

+  
+  as a 3D array of <code>Word8</code>, which can be further processed. 

+  
+  For image IO in many, many formats, use the [http://hackage.haskell.org/package/repadevil repadevil] library. 

+  
+  == Copying arrays from pointers == 

+  
+  You can also generate new repa arrays by copying data from a pointer. Here is an example, using <code>fromForeignPtr</code>: 

+  
+  <haskell> 

+  import Data.Word 

+  import Foreign.Ptr 

+  import qualified Foreign.ForeignPtr.Safe as FPS 

+  
+  import qualified Data.Vector.Storable as V 

+  import qualified Data.Array.Repa as R 

+  import qualified Data.Array.Repa.Repr.ForeignPtr as RFP 

+  import Data.Array.Repa 

+  
+  import Data.Array.Repa.IO.DevIL 

+  
+  i, j, k :: Int 

+  (i, j, k) = (255, 255, 4)  RGBA 

+  
+   1D vector, filled with pretty colors 

+  v :: V.Vector Word8 

+  v = V.fromList . take (i * j * k) . cycle $ concat 

+  [ [ r, g, b, 255 ] 

+   r < [0 .. 255] 

+  , g < [0 .. 255] 

+  , b < [0 .. 255] 

+  ] 

+  
+  ptr2repa :: Ptr Word8 > IO (R.Array RFP.F R.DIM3 Word8) 

+  ptr2repa p = do 

+  fp < FPS.newForeignPtr_ p 

+  return $ RFP.fromForeignPtr (Z :. i :. j :. k) fp 

+  
+  main = do 

+   copy our 1d vector to a repa 3d array, via a pointer 

+  r < V.unsafeWith v ptr2repa 

+  runIL $ writeImage "test.png" (RGBA r) 

+  return () 

+  </haskell> 

+  
+  The <code>v</code> function creates a vector filled with pixel data (RGBA values); <code>ptr2repa</code> takes a <code>Ptr</code>, converts it to <code>ForeignPtr</code> which is then used to copy data into a 3D REPA array (first two dimensions correspond to pixel location, the third dimension corresponds to colour channel). In the main function we use <code>unsafeWith</code> function operating on storable vectors. It passes <code>Ptr</code> to vector's internal data to a function that operates on that <code>Ptr</code> (it may not modify the data). In our case this is the <code>ptr2repa</code> array, which returns image data wrapped in an array. That array is then passed as data to <code>writeImage</code> function. <code>RGBA</code> is a data constructor for <code>Image</code> type. This allows <code>writeImage</code> to interpret array data correctly. <code>runIL</code> is a wrapper around IO monad that guarantees that IL library will be properly initialized. The result is written to disk as this image: 

+  
+  http://i.imgur.com/o0Cv2.png 

−  +  = Indexing arrays = 

To access elements in repa arrays, you provide an array and a shape, to access the element: 
To access elements in repa arrays, you provide an array and a shape, to access the element: 

−  (!) :: (Shape sh, Elt a) => Array sh a > sh > a 

+  <haskell> 

+  (!) :: (Shape sh, Source r e) => Array r sh e > sh > e 

+  </haskell> 

−  So: 

+  Indices start with 0. So: 

−  > let x = fromList (Z :. (10::Int)) [1..10] 

+  <haskell> 

−  > x ! (Z :. 2) 

+  ghci> let x = fromListUnboxed (Z :. (10::Int)) [1..10] 

−  +  ghci> x ! (Z :. 2) 

+  3.0 

+  </haskell> 

−  Note that we can't 
+  Note that we can't give just a bare literal as the shape, even for onedimensional arrays: 
−  > x ! 2 

+  <haskell> 

+  > x ! 2 

−  +  No instance for (Num (Z :. Int)) 

−  +  arising from the literal `2' 

+  </haskell> 

−  as the Z type isn't in the Num class. 
+  as the <code>Z</code> type isn't in the <code>Num</code> class, and Haskell's numeric literals are overloaded. 
What if the index is out of bounds, though? 
What if the index is out of bounds, though? 

−  > x ! (Z :. 11) 

+  <haskell> 

−  *** Exception: ./Data/Vector/Generic.hs:222 ((!)): index out of bounds (11,10) 

+  > x ! (Z :. 11) 

+  *** Exception: ./Data/Vector/Generic.hs:244 ((!)): index out of bounds (11,10) 

+  </haskell> 

−  +  An exception is thrown. Older versions of repa offered safe indexing operator <code>(!?)</code> that returned <code>Maybe</code>, but it is no longer available in repa 3.2. 

−  (!?) :: (Shape sh, Elt a) => Array sh a > sh > Maybe a 

+  = Operations on arrays = 

−  An example: 

+  Besides indexing, there are many regular, listlike operations on arrays. Since many of the names parallel those in the Prelude, we import Repa qualified: 

−  > x !? (Z :. 9) 

+  ghci> import qualified Data.Array.Repa as Repa 

−  Just 10.0 

−  > x !? (Z :. 11) 

+  == Maps, zips, filters and folds == 

−  Nothing 

−  == Operations on arrays == 

+  Let us define an unboxed array: 

−  Besides indexing, there are many regular, listlike operations on arrays. 

+  <haskell> 

+  ghci> let x = fromListUnboxed (Z :. (3::Int) :. (3::Int)) [1..9] 

+  ghci> x 

+  AUnboxed ((Z :. 3) :. 3) (fromList [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0]) 

+  </haskell> 

−  === Maps, zips, filters and folds === 

+  We can map over multidimensional arrays with <code>map</code>, but since it conflicts with the definition in the Prelude, we have to use it with the qualifier we requested: 

−  We can map over multidimensional arrays: 

+  <haskell> 

+  ghci> let y = Repa.map (^2) x 

+  </haskell> 

−  > let x = fromList (Z :. (3::Int) :. (3::Int)) [1..9] 

+  The type of <code>Repa.map</code> is: 

−  > x 

−  [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0] 

−  since `map` conflicts with the definition in the Prelude, we have to use it qualified: 

+  <haskell> 

+  map :: (Shape sh, Source r a) => 

+  (a > b) > Array r sh a > Array D sh b 

+  </haskell> 

−  > Data.Array.Repa.map (^2) x 

+  This means that result is a delayed array (represented by letter D in the return type). We can index into delayed array (remember that indices are 0based): 

−  [1.0,4.0,9.0,16.0,25.0,36.0,49.0,64.0,81.0] 

−  Maps leave the dimension unchanged. 

+  <haskell> 

+  ghci> y Repa.! (Z :. 0 :. 2) 

+  9.0 

+  ghci> y Repa.! (Z :. 2 :. 0) 

+  49.0 

+  </haskell> 

−  Folding reduces the inner dimension of the array. 

+  Passing an incorrect index will NOT cause an exception: 

−  fold :: (Shape sh, Elt a) => (a > a > a) > a > Array (sh :. Int) a > Array sh a 

+  <haskell> 

+  ghci> y Repa.! (Z :. 0 :. 3) 

+  16.0 

+  ghci> y Repa.! (Z :. 0 :. 4) 

+  0.0 

+  ghci> y Repa.! (Z :. 10 :. 10) 

+  0.0 

+  </haskell> 

−  So if 'x' is a 3D array: 

+  Since delayed arrays are not instances of Show type class, we cannot display the whole array directly: 

−  > let x = fromList (Z :. (3::Int) :. (3::Int)) [1..9] 

+  <haskell> 

−  +  ghci> y 

−  +  No instance for (Show (Array D ((Z :. Int) :. Int) Double)) 

+  arising from a use of `print' 

+  </haskell> 

−  We can sum each row, to yield a 2D array: 

+  To display <code>y</code> array it must be turned into manifest array by computing all its values. This can be done either with computeP, which evaluates array in parallel, or computeS, which evaluates values sequentially. Looking at the type of computeP: 

−  > fold (+) 0 x 

+  <haskell> 

−  [6.0,15.0,24.0] 

+  ghci> :t computeP 

+  computeP 

+  :: (Monad m, Source r2 e, Load r1 sh e, Target r2 e) => 

+  Array r1 sh e > m (Array r2 sh e) 

+  </haskell> 

+  
+  we can see that the result is enclosed within a monad. The reason for this is that monads give a well defined notion of sequence and thus <code>computeP</code> enforces completion of parallel evaluation in a particular point of monadic computations. This prevents nested data parallelism, which is not supported by Repa. We don't really care here about monadic effect, so we can take the manifest array out of a monad after <code>computeP</code> is done: 

+  
+  <haskell> 

+  ghci> computeP y :: IO (Array U DIM2 Double) 

+  AUnboxed ((Z :. 3) :. 3) (fromList [1.0,4.0,9.0,16.0,25.0,36.0,49.0,64.0,81.0]) 

+  ghci> z < computeP y :: IO (Array U DIM2 Double) 

+  ghci> z 

+  AUnboxed ((Z :. 3) :. 3) (fromList [1.0,4.0,9.0,16.0,25.0,36.0,49.0,64.0,81.0]) 

+  ghci> let [w] = computeP y :: [Array U DIM2 Double] 

+  ghci> w 

+  AUnboxed ((Z :. 3) :. 3) (fromList [1.0,4.0,9.0,16.0,25.0,36.0,49.0,64.0,81.0]) 

+  </haskell> 

+  
+  Repa's <code>map</code> leaves the dimension unchanged: 

+  
+  <haskell> 

+  ghci> extent x 

+  (Z :. 3) :. 3 

+  ghci> extent y 

+  (Z :. 3) :. 3 

+  </haskell> 

+  
+  There are four fold operations in Repa library: <code>foldP</code>, <code>foldS</code>, <code>foldAllP</code> and <code>foldAllS</code>. First two reduce the the inner dimension of the array, the last two reduce whole array to one value. The <code>P</code> and <code>S</code> suffixes stand for parallel and sequential evaluation. First parameter of parallel folds must be an associative sequential operator and the starting element must be a neutral element of that operator. This is required in order to ensure correctness of parallel evaluation. 

+  
+  Type of <code>foldP</code> is given as: 

+  
+  <haskell> 

+  foldP :: (Monad m, Shape sh, Source r a, Unbox a, Elt a) => 

+  (a > a > a) > a > Array r (sh :. Int) a > m (Array U sh a) 

+  </haskell> 

+  
+  The <code>x</code> defined above was a 2D array: 

+  
+  <haskell> 

+  ghci> extent x 

+  (Z :. 3) :. 3 

+  </haskell> 

+  
+  but if we sum each row: 

+  
+  <haskell> 

+  ghci> Repa.foldP (+) 0 x 

+  AUnboxed (Z :. 3) (fromList [6.0,15.0,24.0]) 

+  </haskell> 

+  
+  we get a 1D array instead (<code>liftM</code> is used to run <code>extent</code> function within a monad returned by <code>foldP</code>): 

+  
+  <haskell> 

+  ghci> liftM extent (Repa.foldP (+) 0 x) 

+  Z :. 3 

+  </haskell> 

+  
+  Similarly, if <code>y</code> is a (3 x 3 x 2) 3D array: 

+  
+  <haskell> 

+  ghci> let y = fromListUnboxed ((Z :. 3 :. 3 :. 2) :: DIM3) [1..18] 

+  ghci> y 

+  AUnboxed (((Z :. 3) :. 3) :. 2) (fromList [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0,15.0,16.0,17.0,18.0]) 

+  </haskell> 

+  
+  we can fold over the inner dimension: 

+  
+  <haskell> 

+  ghci> Repa.foldP (+) 0 y 

+  AUnboxed ((Z :. 3) :. 3) (fromList [3.0,7.0,11.0,15.0,19.0,23.0,27.0,31.0,35.0]) 

+  </haskell> 

+  
+  yielding a 2D (3 x 3) array in place of our 3D (3 x 3 x 3) array. 

Two arrays may be combined via <code>zipWith</code>: 
Two arrays may be combined via <code>zipWith</code>: 

−  zipWith :: (Shape sh, Elt b, Elt c, Elt a) => 

+  <haskell> 

−  (a > b > c) > Array sh a > Array sh b > Array sh c 

+  zipWith :: (Shape sh, Source r2 b, Source r1 a) => 

+  (a > b > c) > Array r1 sh a > Array r2 sh b > Array D sh c 

+  </haskell> 

an example: 
an example: 

−  > zipWith (*) x x 

+  <haskell> 

−  [1.0,4.0,9.0,16.0,25.0,36.0,49.0,64.0,81.0] 

+  ghci> let y = Repa.zipWith (*) x x 

+  ghci> computeP y :: IO (Array U DIM2 Double) 

+  AUnboxed ((Z :. 3) :. 3) (fromList [1.0,4.0,9.0,16.0,25.0,36.0,49.0,64.0,81.0]) 

+  ghci> extent y 

+  (Z :. 3) :. 3 

+  </haskell> 

+  
+  If the extent of the two array arguments differ, then the resulting array's extent is their intersection. 

+  
+  == Mapping, with indices == 

+  
+  A very powerful operator is <code>traverse</code>, an array transformation which also supplies the current index: 

+  
+  <haskell> 

+  traverse :: (Shape sh', Shape sh, Source r a) 

+  => Array r sh a  Source array 

+  > (sh > sh')  Function to produce the extent of the result. 

+  > ((sh > a) > sh' > b)  Function to produce elements of the result. 

+   It is passed a lookup function to 

+   get elements of the source. 

+  > Array D sh' b 

+  </haskell> 

+  
+  This is quite a complicated type, because it is very general. Let's take it apart. The first argument is the source array, which is obvious. The second argument is a function that transforms the shape of the input array to yield the output array. So if the arrays are the same size, this function is <code>id</code>. It might grow or resize the shape in other ways. 

+  
+  Finally, the 3rd argument is where the magic is. It is a function (often anonymous one) that takes two parameters: a function that looks up elements of an original array and a location of the currently processed element in the new array. Based on these two parameters a new value is determined for the resulting array. Note that <code>traverse</code> goes through all elements of the resulting array and it is up to you how to get values of original array based on an index of the new array. 

+  
+  <code>traverse</code> returns a delayed array, which means we will use <code>computeP</code> to evaluate its values and display them on the screen. 

+  
+  So we see this generalizes map to support indexes, and optional inspection of the current element. Let's try some examples: 

+  
+  <haskell> 

+  ghci> let x = fromListUnboxed (Z :. (3::Int) :. (3::Int) :. (3::Int)) [1..27] :: Array U DIM3 Int 

+  ghci> x 

+  AUnboxed (((Z :. 3) :. 3) :. 3) (fromList [1,2,3,4,5,6,7,8,9, 

+  10,11,12,13,14,15,16,17,18, 

+  19,20,21,22,23,24,25,26,27]) 

−  === Numeric operations: negation, addition, subtraction, multiplication === 

+   Keeping the shape the same, and just overwriting elements 

+   Use `traverse` to set all elements to their `x` axis: 

+  ghci> computeP $ traverse x id (\_ (Z :. i :. j :. k) > i) :: IO (Array U DIM3 Int) 

+  AUnboxed (((Z :. 3) :. 3) :. 3) (fromList [0,0,0,0,0,0,0,0,0, 

+  1,1,1,1,1,1,1,1,1, 

+  2,2,2,2,2,2,2,2,2]) 

−  Repa arrays are instances of the <code>Num</code>. This means that 

+   Shuffle elements around, based on their index. 

−  operations on numerical elements are lifted automagically onto arrays of 

+   Rotate elements by swapping elements from rotated locations: 

−  such elements. For example, <code>(+)</code> on two double values corresponds to 

+  ghci> computeP $ traverse x id (\f (Z :. i :. j :. k) > f (Z :. j :. k :. i)) :: IO (Array U DIM3 Int) 

−  elementwise addition, <code>(+)</code>, of the two arrays of doubles: 

+  AUnboxed (((Z :. 3) :. 3) :. 3) (fromList [1,4,7,10,13,16,19,22,25, 

+  2,5,8,11,14,17,20,23,26, 

+  3,6,9,12,15,18,21,24,27]) 

−  > let x = fromList (Z :. (10::Int)) [1..10] 

+   Take a slice of one dimension. The resulting shape of the array changes 

−  > x + x 

+  ghci> computeP $ traverse x (\(e :. _) > e) (\f (Z :. i :. j) > f (Z :. i :. j :. 0)) :: IO (Array U DIM2 Int) 

−  [2.0,4.0,6.0,8.0,10.0,12.0,14.0,16.0,18.0,20.0] 

+  AUnboxed ((Z :. 3) :. 3) (fromList [1,4,7,10,13,16,19,22,25]) 

+  </haskell> 

−  Other operations from the Num class work just as well: 

+  The documentation on [http://hackage.haskell.org/packages/archive/repa/latest/doc/html/DataArrayRepa.html#v:traverse traverse] provides further information. 

−  > x 

+  == Numeric operations: negation, addition, subtraction, multiplication == 

−  [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0] 

−  > x ^ 3 

+  To perform basic numeric operations on Repa arrays you must use specialized operators: <code>(+^)</code> (addition), <code>(^)</code> (subtraction), <code>(*^)</code> (multiplication) and <code>(/^)</code> (division). All these operators work elementwise and produce delayed arrays (they have to be computed if you want to have results as unboxed arrays). 

−  [1.0,8.0,27.0,64.0,125.0,216.0,343.0,512.0,729.0,1000.0] 

−  > x * x 

+  <haskell> 

−  [1.0,4.0,9.0,16.0,25.0,36.0,49.0,64.0,81.0,100.0] 

+  ghci> let x = fromListUnboxed (Z :. (10 :: Int)) [1..10] 

+  ghci> computeP $ x +^ x :: IO (Array U DIM1 Double) 

+  AUnboxed (Z :. 10) (fromList [2.0,4.0,6.0,8.0,10.0,12.0,14.0,16.0,18.0,20.0]) 

+  ghci> computeP $ x ^ x :: IO (Array U DIM1 Double) 

+  AUnboxed (Z :. 10) (fromList [0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]) 

+  ghci> computeP $ x *^ x :: IO (Array U DIM1 Double) 

+  AUnboxed (Z :. 10) (fromList [1.0,4.0,9.0,16.0,25.0,36.0,49.0,64.0,81.0,100.0]) 

+  ghci> computeP $ x /^ x :: IO (Array U DIM1 Double) 

+  AUnboxed (Z :. 10) (fromList [1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0]) 

+  </haskell> 

−  +  = Changing the shape of an array = 

One of the main advantages of repastyle arrays over other arrays in 
One of the main advantages of repastyle arrays over other arrays in 

Line 283:  Line 631:  
via *indexspace transformations*. 
via *indexspace transformations*. 

−  An example: transposing a 2D array 
+  An example: transposing a 2D array. First, the type of the transformation: 
−  paper). First, the type of the transformation: 

−  transpose2D :: Elt e => Array DIM2 e > Array DIM2 e 

+  <haskell> 

+  transpose2D :: (Source r e) => Array r DIM2 e > Array D DIM2 e 

+  </haskell> 

Note that this transform will work on DIM2 arrays holding any elements. 
Note that this transform will work on DIM2 arrays holding any elements. 

Now, to swap rows and columns, we have to modify the shape: 
Now, to swap rows and columns, we have to modify the shape: 

−  transpose2D a = backpermute (swap e) swap a 

+  <haskell> 

−  where 

+  transpose2D a = backpermute (swap e) swap a 

−  e = extent a 

+  where 

−  swap (Z :. i :. j) = Z :. j :. i 

+  e = extent a 

+  swap (Z :. i :. j) = Z :. j :. i 

+  </haskell> 

The swap function reorders the index space of the array. 
The swap function reorders the index space of the array. 

To do this, we extract the current shape of the array, and write a function 
To do this, we extract the current shape of the array, and write a function 

−  that maps the index space from the old array to the new array. That index space function 
+  that maps the index space from the old array to the new array. That index space function is then passed to backpermute which actually constructs the new array from the old one. 
−  is then passed to backpermute which actually constructs the new 

−  array from the old one. 

−  backpermute generates a new array from an old, when given the new shape, and a 
+  backpermute generates a new array from an old, when given the new shape, and a function that translates between the index space of each array (i.e. a shape transformer). 
−  function that translates between the index space of each array (i.e. a shape 

−  transformer). 

−  backpermute 

+  <haskell> 

−  +  backpermute :: (Shape sh2, Shape sh1, Source r e) => 

−  +  sh2 > (sh2 > sh1) > Array r sh1 e > Array D sh2 e 

−  +  </haskell> 

−  > Array sh a 

−  > Array sh' a 

Note that the array created is not actually evaluated (we only modified the index space of the array). 
Note that the array created is not actually evaluated (we only modified the index space of the array). 

Line 311:  Line 663:  
library: 
library: 

−  transpose :: (Shape sh, Elt a) 

+  <haskell> 

−  => Array ((sh :. Int) :. Int) a > Array ((sh :. Int) :. Int) 

+  transpose :: (Shape sh, Source r e) => 

+  Array r ((sh :. Int) :. Int) e > Array D ((sh :. Int) :. Int) e 

+  </haskell> 

−  +  The types indicate that it works on the lowest two dimensions of the 

array. 
array. 

−  Other operations on index spaces include taking slices and joining 

+  Arrays shape can be changed without altering the underlying data by using <code>reshape</code>: 

−  arrays into larger ones. 

+  
+  <haskell> 

+  ghci> let x = fromListUnboxed (Z :. (5 :: Int) :. (2 :: Int)) [1..10] 

+  ghci> x 

+  AUnboxed ((Z :. 5) :. 2) (fromList [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0]) 

+  ghci> computeP $ reshape (Z :. (2::Int) :. (5::Int)) x :: IO (Array U DIM2 Double) 

+  AUnboxed ((Z :. 2) :. 5) (fromList [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0]) 

+  </haskell> 

+  
+  = Examples = 

+  
+  Following are some examples of useful functions that exercise the API. 

+  
+  == Example: Rotating an image with backpermute == 

+  
+  Flip an image upside down: 

+  
+  <haskell> 

+  import Foreign.Ptr 

+  import System.Environment 

+  import Data.Word 

+  import Data.Array.Repa hiding ((++)) 

+  import Data.Array.Repa.IO.DevIL 

+  import Data.Array.Repa.Repr.ForeignPtr 

+  
+  main :: IO () 

+  main = do 

+  [f] < getArgs 

+  (RGB v) < runIL $ readImage f 

+  rotated < (computeP $ rot180 v) :: IO (Array F DIM3 Word8) 

+  runIL $ writeImage ("flip"++f) (RGB rotated) 

+  
+  rot180 :: (Source r e) => Array r DIM3 e > Array D DIM3 e 

+  rot180 g = backpermute e flop g 

+  where 

+  e@(Z :. x :. y :. _) = extent g 

+  flop (Z :. i :. j :. k) = 

+  (Z :. x  i  1 :. y  j  1 :. k) 

+  </haskell> 

+  
+  Running this: 

+  
+  $ ghc O2 make threaded rtsopts repa.hs 

+  $ ./repa "haskell.jpg" +RTS N2 

+  
+  Results in: 

+  
+  http://i.imgur.com/YsGA8.jpg 

+  
+  == Example: matrixmatrix multiplication == 

+  
+  A more advanced example from the Repa paper is matrixmatrix multiplication. The result of matrix multiplication is a matrix whose elements are found by multiplying the elements of each row from the first matrix by the associated elements of the same column from the second matrix and summing the result. 

+  
+  if <math>A=\begin{bmatrix}a&b\\c&d\end{bmatrix}</math> and <math>B=\begin{bmatrix}e&f\\g&h\end{bmatrix}</math> 

+  
+  then 

+  
+  <math>AB=\begin{bmatrix}a&b\\c&d\end{bmatrix}\begin{bmatrix}e&f\\g&h\end{bmatrix}=\begin{bmatrix}ae+bg&af+bh\\ce+dg&cf+dh\end{bmatrix}</math> 

+  
+  So we take two, 2D arrays and generate a new array, using our transpose 

+  function from earlier: 

+  
+  <haskell> 

+  mmMult :: Monad m 

+  => Array U DIM2 Double 

+  > Array U DIM2 Double 

+  > m (Array U DIM2 Double) 

+  mmMult a b = sumP (Repa.zipWith (*) aRepl bRepl) 

+  where 

+  t = transpose2D b 

+  aRepl = extend (Z :.All :.colsB :.All) a 

+  bRepl = extend (Z :.rowsA :.All :.All) t 

+  (Z :.colsA :.rowsA) = extent a 

+  (Z :.colsB :.rowsB) = extent b 

+  </haskell> 

+  
+  The idea is to expand both 2D argument arrays into 3D arrays by 

+  replicating them across a new axis using <code>extend</code> function. The front face of the cuboid that results represents the array <code>a</code>, which we replicate as often as <code>b</code> has columns <code>(colsB)</code>, producing <code>aRepl</code>. 

+  
+  The top face represents <code>t</code> (the transposed b), which we 

+  replicate as often as a has rows <code>(rowsA)</code>, producing 

+  <code>bRepl,</code>. The two replicated arrays have the same extent, 

+  which corresponds to the index space of matrix multiplication. 

+  
+  Optimized implementations of this function are available in the 

+  [http://hackage.haskell.org/package/repaalgorithms repaalgorithms] package. 

+  
+  == Example: parallel image desaturation == 

+  
+  To convert an image from color to greyscale, we can use the luminosity method to average RGB pixels into a common grey value, where the average is weighted for human perception of green. 

+  
+  The formula for luminosity is 0.21 R + 0.71 G + 0.07 B. 

+  
+  We can write a parallel image desaturation tool using repa and the [http://hackage.haskell.org/package/repadevil repadevil image library]: 

+  
+  <haskell> 

+  import System.Environment 

+  import Data.Word 

+  import Data.Array.Repa hiding ((++)) 

+  import Data.Array.Repa.IO.DevIL 

+  import Data.Array.Repa.Repr.ForeignPtr 

+  
+   

+   Read an image, desaturate, write out with new name. 

+   

+  main = do 

+  [f] < getArgs 

+  runIL $ do 

+  (RGB a) < readImage f 

+  b < (computeP $ traverse a id luminosity) :: IL (Array F DIM3 Word8) 

+  writeImage ("grey" ++ f) (RGB b) 

+  </haskell> 

+  
+  And now the luminosity transform itself, which averages the 3 RGB colors based on perceived weight: 

+  
+  <haskell> 

+   

+   (Parallel) desaturation of an image via the luminosity method. 

+   

+  luminosity :: (DIM3 > Word8) > DIM3 > Word8 

+  luminosity _ (Z :. _ :. _ :. 3) = 255  alpha channel 

+  luminosity f (Z :. i :. j :. _) = ceiling $ 0.21 * r + 0.71 * g + 0.07 * b 

+  where 

+  r = fromIntegral $ f (Z :. i :. j :. 0) 

+  g = fromIntegral $ f (Z :. i :. j :. 1) 

+  b = fromIntegral $ f (Z :. i :. j :. 2) 

+  </haskell> 

+  
+  And that's it! The result is a parallel image desaturator, when compiled with 

+  
+  $ ghc O threaded rtsopts make repa.hs 

+  
+  which we can run, to use two cores: 

+  
+  $ time ./repa sunflower.png +RTS N2 H 

+  
+  real 0m0.048s 

+  user 0m0.052s 

+  sys 0m0.016s 

+  
+  We can also see that program takes a bit longer when run on a single core: 

+  
+  $ time ./repa sunflower.png +RTS N1 H 

+  
+  real 0m0.064s 

+  user 0m0.060s 

+  sys 0m0.008s 

+  
+  Given an image like this: 

+  
+  http://i.imgur.com/iu4gV.png 

+  
+  The desaturated result from Haskell: 

+  
+  http://i.imgur.com/REhA5.png 

+  
+  == Example: Partial Differential equations, idiomatic option pricing and risk == 

+  
+  http://idontgetoutmuch.wordpress.com/2013/01/05/optionpricingusinghaskellparallelarrays/ 

+  http://stackoverflow.com/questions/14082158/idiomaticoptionpricingandriskusingrepaparallelarrays 

+  http://idontgetoutmuch.wordpress.com/2013/02/10/parallelisingpathdependentoptionsinhaskell2/ 

+  
+  = Optimising Repa programs = 

+  
+  '''Note''': This section is based on older version of Repa (2.x). It seems that it is not entirely accurate for Repa 3. For example there is no <code>force</code> function and array type (manifest or delayed) is now explicitly defined using type index. I do not feel sufficiently competent to update this section so I am leaving it "as is" for the time being. [[User:Killy9999killy9999]] 09:12, 11 October 2012 (UTC) 

+  
+  == Fusion, and why you need it == 

+  Repa depends critically on array fusion to achieve fast code. Fusion is a fancy name for the combination of inlining and code transformations performed by GHC when it compiles your program. The fusion process merges the array filling loops defined in the Repa library, with the "worker" functions that you write in your own module. If the fusion process fails, then the resulting program will be much slower than it needs to be, often 10x slower an equivalent program using plain Haskell lists. On the other hand, provided fusion works, the resulting code will run as fast as an equivalent cleanly written C program. Making fusion work is not hard once you understand what's going on. 

+  
+  == The <code>force</code> function has the loops == 

+  
+  Suppose we have the following binding: 

+  
+  arr' = R.force $ R.map (\x > x + 1) arr 

+  
+  The right of this binding will compile down to code that first allocates the result array <code>arr'</code>, then iterates over the source array <code>arr</code>, reading each element in turn and adding one to it, then writing to the corresponding element in the result. 

+  
+  Importantly, the code that does the allocation, iteration and update is defined as part of the <code>force</code> function. This forcing code has been written to break up the result into several chunks, and evaluate each chunk with a different thread. This is what makes your code run in parallel. If you do ''not'' use <code>force</code> then your code will be slow and ''not'' run in parallel. 

+  
+  == Delayed and Manifest arrays == 

+  In the example from the previous section, think of the <code>R.map (\x > x + 1) arr</code> expression as a ''specification'' for a new array. In the library, this specification is referred to as a ''delayed'' array. A delayed array is represented as a function that takes an array index, and computes the value of the element at that index. 

+  
+  Applying <code>force</code> to a delayed array causes all elements to be computed in parallel. The result of a <code>force</code> is referred to as a ''manifest'' array. A manifest array is a "real" array represented as a flat chunk of memory containing array elements. 

+  
+  All Repa array operators will accept both delayed and manifest arrays. However, if you index into a delayed array without forcing it first, then each indexing operation costs a function call. It also ''recomputes'' the value of the array element at that index. 

+  
+  == Shells and Springs == 

+  Here is another way to think about Repa's approach to array fusion. Suppose we write the following binding: 

+  
+  arr' = R.force $ R.map (\x > x * 2) $ R.map (\x > x + 1) arr 

+  
+  Remember from the previous section, that the result of each of the applications of <code>R.map</code> is a delayed array. A delayed array is not a "real", manifest array, it's just a shell that contains a function to compute each element. In this example, the two worker functions correspond to the lambda expressions applied to <code>R.map</code>. 

+  
+  When GHC compiles this example, the two worker functions are fused into a fresh unfolding of the parallel loop defined in the code for <code>R.force</code>. Imagine holding <code>R.force</code> in your left hand, and squashing the calls to <code>R.map</code> into it, like a spring. Doing this breaks all the shells, and you end up with the worker functions fused into an unfolding of <code>R.force</code>. 

+  
+  == INLINE worker functions == 

+  Consider the following example: 

+  
+  f x = x + 1 

+  arr' = R.force $ R.zipWith (*) (R.map f arr1) (R.map f arr2) 

+  
+  During compilation, we need GHC to fuse our worker functions into a fresh unfolding of <code>R.force</code>. In this example, fusion includes inlining the definition of <code>f</code>. If <code>f</code> is ''not'' inlined, then the performance of the compiled code will be atrocious. It will perform a function call for each application of <code>f</code>, where it really only needs a single machine instruction to increment the <code>x</code> value. 

+  
+  Now, in general, GHC tries to avoid producing binaries that are "too big". Part of this is a heuristic that controls exactly what functions are inlined. The heuristic says that a function may be inlined only if it is used once, or if its definition is less than some particular size. If neither of these apply, then the function won't be inlined, killing performance. 

+  
+  For Repa programs, as fusion and inlining has such a dramatic effect on performance, we should ''absolutely not'' rely on heuristics to control whether or not this inlining takes place. If we rely on a heuristic, then even if our program runs fast today, if this heuristic is ever altered then some functions that used to be inlined may no longer be. 

+  
+  The moral of the story is to attach INLINE pragmas to ''all'' of your client functions that compute array values. This ensures that these critical functions will be inlined now, and forever. 

+  
+  {# INLINE f #} 

+  f x = x + 1 

+  
+  arr' = R.force $ R.zipWith (*) (R.map f arr1) (R.map f arr2) 

+  
+  
+  [[Category:Libraries]] 

+  [[Category:Packages]] 

+  [[Category:Performance]] 

+  [[Category:Tutorials]] 

+  [[Category:Parallel]] 
Latest revision as of 12:48, 11 July 2016
Repa is a Haskell library for high performance, regular, multidimensional parallel arrays. All numeric data is stored unboxed and functions written with the Repa combinators are automatically parallel (provided you supply "+RTS N" on the command line when running the program).
This document provides a tutorial on array programming in Haskell using the Repa package and was based on version 3.2 of the library. Repa versions earlier than 3.0 use different API and are not compatible with this tutorial.
Note: a companion tutorial to this is provided as the vector tutorial, and is based on the NumPy tutorial.
Authors: Don Stewart (original version), Jan Stolarek (update from Repa 2 to Repa 3).
Contents
Quick Tour
Repa (REgular PArallel arrays) is an advanced, multidimensional parallel arrays library for Haskell, with a number of distinct capabilities:
 The arrays are "regular" (i.e. dense, rectangular and store elements all of the same type); and
 Functions may be written that are polymorphic in the shape of the array;
 Many operations on arrays are accomplished by changing only the shape of the array (without copying elements);
 The library will automatically parallelize operations over arrays.
This is a quick start guide for the package. For further information, consult:
 The Haddock Documentation
 Guiding Parallel Array Fusion with Indexed Types  describes architecture of Repa 3
 Regular, Shapepolymorphic, Parallel Arrays in Haskell.
 Efﬁcient Parallel Stencil Convolution in Haskell
 Repa project homepage
Importing the library
Download the `repa` package:
$ cabal install repa
and import it qualified:
import qualified Data.Array.Repa as R
The library needs to be imported qualified as it shares the same function names as list operations in the Prelude.
Note: Operations that involve writing new index types for Repa arrays will require the 'XTypeOperators' language extension.
For noncore functionality, a number of related packages are available:
 repaio
 repaalgorithms
 repadevil (image loading)
and example algorithms in:
Type indexes
Repa 3 introduced a notion of type index:
data family Array rep sh e
In the above definition rep
is the type index, which is represented with a data type name and defines the type of underlying array representation. This allows to explicitly specify representation of an array on type system level. Repa distinguishes ten different types of representation, which fall into three categories: manifest, delayed and meta. Manifest arrays are represented by concrete values stored in memory using e.g. unboxed vectors (represented with data type U) or strict bytestrings (B). Delayed array (D) is a function from index to a value. Every time an element is requested from a delayed array it is calculated anew, which means that delayed arrays are inefficient when the data is needed multiple times. On the other hand delaying an array allows to perform optimisation by fusion. Meta arrays provide additional information about underlying array data, which can be used for load balancing or speeding up calculation of border effects. See Guiding Parallel Array Fusion with Indexed Types paper for a detailed explanation of each representation.
Array shapes
Before we can get started manipulating arrays, we need a grasp of repa's notion of array shape. Much like the classic 'array' library in Haskell, repabased arrays are parameterized via a type which determines the dimension of the array, and the type of its index. However, while classic arrays take tuples to represent multiple dimensions, Repa arrays use a richer type language for describing multidimensional array indices and shapes (technically, a heterogeneous snoc list).
Shape types are built somewhat like lists. The constructor Z
corresponds
to a rank zero shape, and is used to mark the end of the list. The :.
constructor adds additional dimensions to the shape. So, for example, the shape:
(Z :. 3 :. 2 :. 3)
is the shape of a small 3D array, with shape type
(Z :. Int :. Int :. Int)
The most common dimensions are given by the shorthand names:
type DIM0 = Z
type DIM1 = DIM0 :. Int
type DIM2 = DIM1 :. Int
type DIM3 = DIM2 :. Int
type DIM4 = DIM3 :. Int
type DIM5 = DIM4 :. Int
thus,
Array U DIM2 Double
is the type of a twodimensional array of unboxed doubles, indexed via Int
keys, while
Array U Z Double
is a zerodimension object (i.e. a point) holding an unboxed Double.
Many operations over arrays are polymorphic in the shape / dimension
component. Others require operating on the shape itself, rather than
the array. A typeclass, Shape
, lets us operate uniformly
over arrays with different shape.
Building shapes
To build values of shape type, we can use the Z
and :.
constructors. Open the ghci and import Repa:
ghci> :m +Data.Array.Repa
ghci> Z  the zerodimension
Z
For arrays of nonzero dimension, we must give a size. Note: a common error is to leave off the type of the size.
ghci> :t Z :. 10
Z :. 10 :: Num head => Z :. head
leading to annoying type errors about unresolved instances, such as:
No instance for (Shape (Z :. head0))
To select the correct instance, you will need to annotate the size literals with their concrete type:
ghci> :t Z :. (10 :: Int)
Z :. (10 :: Int) :: Z :. Int
is the shape of 1D arrays of length 10, indexed via Ints.
Given an array, you can always find its shape by calling extent
.
Additional convenience types for selecting particular parts of a shape are also provided (All, Any, Slice
etc.) which are covered later in the tutorial.
Working with shapes
That one key operation, extent
, gives us many attributes of an array:
 Extract the shape of the array
extent :: (Shape sh, Source r e) => Array r sh e > sh
So, given a 3x3x3 array, of type Array U DIM3 Int
, we can:
 build an array
ghci> let x :: Array U DIM3 Int; x = fromListUnboxed (Z :. (3::Int) :. (3::Int) :. (3::Int)) [1..27]
ghci> :t x
x :: Array U DIM3 Int
 query the extent
ghci> extent x
((Z :. 3) :. 3) :. 3
 compute the rank (number of dimensions)
ghci> let sh = extent x
ghci> rank sh
3
 compute the size (total number of elements)
> size sh
27
 extract the elements of the array as a flat vector
ghci> toUnboxed x
fromList [1,2,3,4,5,6,7,8,9,10
,11,12,13,14,15,16,17,18,19
,20,21,22,23,24,25,26,27] :: Data.Vector.Unboxed.Base.Vector Int
Generating arrays
New repa arrays ("arrays" from here on) can be generated in many ways, and we always begin by importing the Data.Array.Repa
module:
$ ghci
GHCi, version 7.4.1: http://www.haskell.org/ghc/ :? for help
Loading package ghcprim ... linking ... done.
Loading package integergmp ... linking ... done.
Loading package base ... linking ... done.
ghci> :m + Data.Array.Repa
They may be constructed from lists, for example. Here is a one dimensional array of length 10, here, given the shape
(Z :. 10)
:
ghci> let inputs = [1..10] :: [Double]
ghci> let x = fromListUnboxed (Z :. (10::Int)) inputs
ghci> x
AUnboxed (Z :. 10) (fromList [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0])
The type of x
is inferred as:
ghci> :t x
x :: Array U (Z :. Int) Double
which we can read as "an array of dimension 1, indexed via Int
keys, holding elements of type Double
stored using unboxed vectors"
We could also have written the type as:
ghci> let x' = fromListUnboxed (Z :. 10 :: DIM1) inputs
ghci> :t x'
x' :: Array U DIM1 Double
The same data may also be treated as a two dimensional array, by changing the shape parameter:
ghci> let x2 = fromListUnboxed (Z :. (5::Int) :. (2::Int)) inputs
ghci> x2
AUnboxed ((Z :. 5) :. 2) (fromList [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0])
which has the type:
ghci> :t x2
x2 :: Array U ((Z :. Int) :. Int) Double
or, as above, if we define it with the type synonym for 2 dimensional Int
 indexed arrays, DIM2
:
ghci> let x2' = fromListUnboxed (Z :. 5 :. 2 :: DIM2) inputs
ghci> x2'
AUnboxed ((Z :. 5) :. 2) (fromList [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0])
ghci> :t x2'
x2' :: Array U DIM2 Double
Building arrays from vectors
It is also possible to build arrays from unboxed vectors, from the 'vector' package:
fromUnboxed :: (Shape sh, Unbox e) => sh > Vector e > Array U sh e
New arrays are built by applying a shape to the vector. For example:
ghci> :m + Data.Vector.Unboxed
ghci> let x = fromUnboxed (Z :. (10::Int)) (enumFromN 0 10)
ghci> x
AUnboxed (Z :. 10) (fromList [0.0,1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0])
is a onedimensional array of doubles. As usual, we can also impose other shapes:
ghci> let x = fromUnboxed (Z :. (3::Int) :. (3::Int)) (enumFromN 0 9)
ghci> x
AUnboxed ((Z :. 3) :. 3) (fromList [0.0,1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0])
ghci> :t x
x :: Array U ((Z :. Int) :. Int) Double
to create a 3x3 array.
Generating random arrays
The repaalgorithms package lets us generate new arrays with random data:
 3d array of Ints, bounded between 0 and 255.
ghci> randomishIntArray (Z :. (3::Int) :. (3::Int) :. (3::Int)) 0 255 1
AUnboxed (((Z :. 3) :. 3) :. 3) (fromList [217,42,130,200,216,254,67,77,152,
85,140,226,179,71,23,17,152,84,47,17,45,5,88,245,107,214,136])
Reading arrays from files
Using the repaio package, arrays may be written and read from files in a number of formats:
 as BMP files; and
 in a number of text formats.
with other formats rapidly appearing. An example: to write an 2D array to an ascii file:
ghci> :m +Data.Array.Repa.IO.Matrix
ghci> let x = fromList (Z :. 5 :. 2 :: DIM2) [1..10]
ghci> writeMatrixToTextFile "test.dat" x
This will result in a file containing:
MATRIX 5 2 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0
In turn, this file may be read back in via readMatrixFromTextFile
.
ghci> xx < readMatrixFromTextFile "test.dat"
ghci> xx
AUnboxed ((Z :. 5) :. 2) (fromList [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0])
ghci> :t xx
xx :: Array U DIM2 Double
To process .bmp files, use Data.Array.Repa.IO.BMP, as follows (currently reading only works for 24 bit .bmp):
Data.Array.Repa.IO.BMP> x < readImageFromBMP "/tmp/test24.bmp"
Reads this .bmp image:
as a 3D array of Word8
, which can be further processed.
For image IO in many, many formats, use the repadevil library.
Copying arrays from pointers
You can also generate new repa arrays by copying data from a pointer. Here is an example, using fromForeignPtr
:
import Data.Word
import Foreign.Ptr
import qualified Foreign.ForeignPtr.Safe as FPS
import qualified Data.Vector.Storable as V
import qualified Data.Array.Repa as R
import qualified Data.Array.Repa.Repr.ForeignPtr as RFP
import Data.Array.Repa
import Data.Array.Repa.IO.DevIL
i, j, k :: Int
(i, j, k) = (255, 255, 4)  RGBA
 1D vector, filled with pretty colors
v :: V.Vector Word8
v = V.fromList . take (i * j * k) . cycle $ concat
[ [ r, g, b, 255 ]
 r < [0 .. 255]
, g < [0 .. 255]
, b < [0 .. 255]
]
ptr2repa :: Ptr Word8 > IO (R.Array RFP.F R.DIM3 Word8)
ptr2repa p = do
fp < FPS.newForeignPtr_ p
return $ RFP.fromForeignPtr (Z :. i :. j :. k) fp
main = do
 copy our 1d vector to a repa 3d array, via a pointer
r < V.unsafeWith v ptr2repa
runIL $ writeImage "test.png" (RGBA r)
return ()
The v
function creates a vector filled with pixel data (RGBA values); ptr2repa
takes a Ptr
, converts it to ForeignPtr
which is then used to copy data into a 3D REPA array (first two dimensions correspond to pixel location, the third dimension corresponds to colour channel). In the main function we use unsafeWith
function operating on storable vectors. It passes Ptr
to vector's internal data to a function that operates on that Ptr
(it may not modify the data). In our case this is the ptr2repa
array, which returns image data wrapped in an array. That array is then passed as data to writeImage
function. RGBA
is a data constructor for Image
type. This allows writeImage
to interpret array data correctly. runIL
is a wrapper around IO monad that guarantees that IL library will be properly initialized. The result is written to disk as this image:
Indexing arrays
To access elements in repa arrays, you provide an array and a shape, to access the element:
(!) :: (Shape sh, Source r e) => Array r sh e > sh > e
Indices start with 0. So:
ghci> let x = fromListUnboxed (Z :. (10::Int)) [1..10]
ghci> x ! (Z :. 2)
3.0
Note that we can't give just a bare literal as the shape, even for onedimensional arrays:
> x ! 2
No instance for (Num (Z :. Int))
arising from the literal `2'
as the Z
type isn't in the Num
class, and Haskell's numeric literals are overloaded.
What if the index is out of bounds, though?
> x ! (Z :. 11)
*** Exception: ./Data/Vector/Generic.hs:244 ((!)): index out of bounds (11,10)
An exception is thrown. Older versions of repa offered safe indexing operator (!?)
that returned Maybe
, but it is no longer available in repa 3.2.
Operations on arrays
Besides indexing, there are many regular, listlike operations on arrays. Since many of the names parallel those in the Prelude, we import Repa qualified:
ghci> import qualified Data.Array.Repa as Repa
Maps, zips, filters and folds
Let us define an unboxed array:
ghci> let x = fromListUnboxed (Z :. (3::Int) :. (3::Int)) [1..9]
ghci> x
AUnboxed ((Z :. 3) :. 3) (fromList [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0])
We can map over multidimensional arrays with map
, but since it conflicts with the definition in the Prelude, we have to use it with the qualifier we requested:
ghci> let y = Repa.map (^2) x
The type of Repa.map
is:
map :: (Shape sh, Source r a) =>
(a > b) > Array r sh a > Array D sh b
This means that result is a delayed array (represented by letter D in the return type). We can index into delayed array (remember that indices are 0based):
ghci> y Repa.! (Z :. 0 :. 2)
9.0
ghci> y Repa.! (Z :. 2 :. 0)
49.0
Passing an incorrect index will NOT cause an exception:
ghci> y Repa.! (Z :. 0 :. 3)
16.0
ghci> y Repa.! (Z :. 0 :. 4)
0.0
ghci> y Repa.! (Z :. 10 :. 10)
0.0
Since delayed arrays are not instances of Show type class, we cannot display the whole array directly:
ghci> y
No instance for (Show (Array D ((Z :. Int) :. Int) Double))
arising from a use of `print'
To display y
array it must be turned into manifest array by computing all its values. This can be done either with computeP, which evaluates array in parallel, or computeS, which evaluates values sequentially. Looking at the type of computeP:
ghci> :t computeP
computeP
:: (Monad m, Source r2 e, Load r1 sh e, Target r2 e) =>
Array r1 sh e > m (Array r2 sh e)
we can see that the result is enclosed within a monad. The reason for this is that monads give a well defined notion of sequence and thus computeP
enforces completion of parallel evaluation in a particular point of monadic computations. This prevents nested data parallelism, which is not supported by Repa. We don't really care here about monadic effect, so we can take the manifest array out of a monad after computeP
is done:
ghci> computeP y :: IO (Array U DIM2 Double)
AUnboxed ((Z :. 3) :. 3) (fromList [1.0,4.0,9.0,16.0,25.0,36.0,49.0,64.0,81.0])
ghci> z < computeP y :: IO (Array U DIM2 Double)
ghci> z
AUnboxed ((Z :. 3) :. 3) (fromList [1.0,4.0,9.0,16.0,25.0,36.0,49.0,64.0,81.0])
ghci> let [w] = computeP y :: [Array U DIM2 Double]
ghci> w
AUnboxed ((Z :. 3) :. 3) (fromList [1.0,4.0,9.0,16.0,25.0,36.0,49.0,64.0,81.0])
Repa's map
leaves the dimension unchanged:
ghci> extent x
(Z :. 3) :. 3
ghci> extent y
(Z :. 3) :. 3
There are four fold operations in Repa library: foldP
, foldS
, foldAllP
and foldAllS
. First two reduce the the inner dimension of the array, the last two reduce whole array to one value. The P
and S
suffixes stand for parallel and sequential evaluation. First parameter of parallel folds must be an associative sequential operator and the starting element must be a neutral element of that operator. This is required in order to ensure correctness of parallel evaluation.
Type of foldP
is given as:
foldP :: (Monad m, Shape sh, Source r a, Unbox a, Elt a) =>
(a > a > a) > a > Array r (sh :. Int) a > m (Array U sh a)
The x
defined above was a 2D array:
ghci> extent x
(Z :. 3) :. 3
but if we sum each row:
ghci> Repa.foldP (+) 0 x
AUnboxed (Z :. 3) (fromList [6.0,15.0,24.0])
we get a 1D array instead (liftM
is used to run extent
function within a monad returned by foldP
):
ghci> liftM extent (Repa.foldP (+) 0 x)
Z :. 3
Similarly, if y
is a (3 x 3 x 2) 3D array:
ghci> let y = fromListUnboxed ((Z :. 3 :. 3 :. 2) :: DIM3) [1..18]
ghci> y
AUnboxed (((Z :. 3) :. 3) :. 2) (fromList [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0,15.0,16.0,17.0,18.0])
we can fold over the inner dimension:
ghci> Repa.foldP (+) 0 y
AUnboxed ((Z :. 3) :. 3) (fromList [3.0,7.0,11.0,15.0,19.0,23.0,27.0,31.0,35.0])
yielding a 2D (3 x 3) array in place of our 3D (3 x 3 x 3) array.
Two arrays may be combined via zipWith
:
zipWith :: (Shape sh, Source r2 b, Source r1 a) =>
(a > b > c) > Array r1 sh a > Array r2 sh b > Array D sh c
an example:
ghci> let y = Repa.zipWith (*) x x
ghci> computeP y :: IO (Array U DIM2 Double)
AUnboxed ((Z :. 3) :. 3) (fromList [1.0,4.0,9.0,16.0,25.0,36.0,49.0,64.0,81.0])
ghci> extent y
(Z :. 3) :. 3
If the extent of the two array arguments differ, then the resulting array's extent is their intersection.
Mapping, with indices
A very powerful operator is traverse
, an array transformation which also supplies the current index:
traverse :: (Shape sh', Shape sh, Source r a)
=> Array r sh a  Source array
> (sh > sh')  Function to produce the extent of the result.
> ((sh > a) > sh' > b)  Function to produce elements of the result.
 It is passed a lookup function to
 get elements of the source.
> Array D sh' b
This is quite a complicated type, because it is very general. Let's take it apart. The first argument is the source array, which is obvious. The second argument is a function that transforms the shape of the input array to yield the output array. So if the arrays are the same size, this function is id
. It might grow or resize the shape in other ways.
Finally, the 3rd argument is where the magic is. It is a function (often anonymous one) that takes two parameters: a function that looks up elements of an original array and a location of the currently processed element in the new array. Based on these two parameters a new value is determined for the resulting array. Note that traverse
goes through all elements of the resulting array and it is up to you how to get values of original array based on an index of the new array.
traverse
returns a delayed array, which means we will use computeP
to evaluate its values and display them on the screen.
So we see this generalizes map to support indexes, and optional inspection of the current element. Let's try some examples:
ghci> let x = fromListUnboxed (Z :. (3::Int) :. (3::Int) :. (3::Int)) [1..27] :: Array U DIM3 Int
ghci> x
AUnboxed (((Z :. 3) :. 3) :. 3) (fromList [1,2,3,4,5,6,7,8,9,
10,11,12,13,14,15,16,17,18,
19,20,21,22,23,24,25,26,27])
 Keeping the shape the same, and just overwriting elements
 Use `traverse` to set all elements to their `x` axis:
ghci> computeP $ traverse x id (\_ (Z :. i :. j :. k) > i) :: IO (Array U DIM3 Int)
AUnboxed (((Z :. 3) :. 3) :. 3) (fromList [0,0,0,0,0,0,0,0,0,
1,1,1,1,1,1,1,1,1,
2,2,2,2,2,2,2,2,2])
 Shuffle elements around, based on their index.
 Rotate elements by swapping elements from rotated locations:
ghci> computeP $ traverse x id (\f (Z :. i :. j :. k) > f (Z :. j :. k :. i)) :: IO (Array U DIM3 Int)
AUnboxed (((Z :. 3) :. 3) :. 3) (fromList [1,4,7,10,13,16,19,22,25,
2,5,8,11,14,17,20,23,26,
3,6,9,12,15,18,21,24,27])
 Take a slice of one dimension. The resulting shape of the array changes
ghci> computeP $ traverse x (\(e :. _) > e) (\f (Z :. i :. j) > f (Z :. i :. j :. 0)) :: IO (Array U DIM2 Int)
AUnboxed ((Z :. 3) :. 3) (fromList [1,4,7,10,13,16,19,22,25])
The documentation on traverse provides further information.
Numeric operations: negation, addition, subtraction, multiplication
To perform basic numeric operations on Repa arrays you must use specialized operators: (+^)
(addition), (^)
(subtraction), (*^)
(multiplication) and (/^)
(division). All these operators work elementwise and produce delayed arrays (they have to be computed if you want to have results as unboxed arrays).
ghci> let x = fromListUnboxed (Z :. (10 :: Int)) [1..10]
ghci> computeP $ x +^ x :: IO (Array U DIM1 Double)
AUnboxed (Z :. 10) (fromList [2.0,4.0,6.0,8.0,10.0,12.0,14.0,16.0,18.0,20.0])
ghci> computeP $ x ^ x :: IO (Array U DIM1 Double)
AUnboxed (Z :. 10) (fromList [0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0])
ghci> computeP $ x *^ x :: IO (Array U DIM1 Double)
AUnboxed (Z :. 10) (fromList [1.0,4.0,9.0,16.0,25.0,36.0,49.0,64.0,81.0,100.0])
ghci> computeP $ x /^ x :: IO (Array U DIM1 Double)
AUnboxed (Z :. 10) (fromList [1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0])
Changing the shape of an array
One of the main advantages of repastyle arrays over other arrays in Haskell is the ability to reshape data without copying. This is achieved via *indexspace transformations*.
An example: transposing a 2D array. First, the type of the transformation:
transpose2D :: (Source r e) => Array r DIM2 e > Array D DIM2 e
Note that this transform will work on DIM2 arrays holding any elements. Now, to swap rows and columns, we have to modify the shape:
transpose2D a = backpermute (swap e) swap a
where
e = extent a
swap (Z :. i :. j) = Z :. j :. i
The swap function reorders the index space of the array. To do this, we extract the current shape of the array, and write a function that maps the index space from the old array to the new array. That index space function is then passed to backpermute which actually constructs the new array from the old one.
backpermute generates a new array from an old, when given the new shape, and a function that translates between the index space of each array (i.e. a shape transformer).
backpermute :: (Shape sh2, Shape sh1, Source r e) =>
sh2 > (sh2 > sh1) > Array r sh1 e > Array D sh2 e
Note that the array created is not actually evaluated (we only modified the index space of the array).
Transposition is such a common operation that it is provided by the library:
transpose :: (Shape sh, Source r e) =>
Array r ((sh :. Int) :. Int) e > Array D ((sh :. Int) :. Int) e
The types indicate that it works on the lowest two dimensions of the array.
Arrays shape can be changed without altering the underlying data by using reshape
:
ghci> let x = fromListUnboxed (Z :. (5 :: Int) :. (2 :: Int)) [1..10]
ghci> x
AUnboxed ((Z :. 5) :. 2) (fromList [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0])
ghci> computeP $ reshape (Z :. (2::Int) :. (5::Int)) x :: IO (Array U DIM2 Double)
AUnboxed ((Z :. 2) :. 5) (fromList [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0])
Examples
Following are some examples of useful functions that exercise the API.
Example: Rotating an image with backpermute
Flip an image upside down:
import Foreign.Ptr
import System.Environment
import Data.Word
import Data.Array.Repa hiding ((++))
import Data.Array.Repa.IO.DevIL
import Data.Array.Repa.Repr.ForeignPtr
main :: IO ()
main = do
[f] < getArgs
(RGB v) < runIL $ readImage f
rotated < (computeP $ rot180 v) :: IO (Array F DIM3 Word8)
runIL $ writeImage ("flip"++f) (RGB rotated)
rot180 :: (Source r e) => Array r DIM3 e > Array D DIM3 e
rot180 g = backpermute e flop g
where
e@(Z :. x :. y :. _) = extent g
flop (Z :. i :. j :. k) =
(Z :. x  i  1 :. y  j  1 :. k)
Running this:
$ ghc O2 make threaded rtsopts repa.hs $ ./repa "haskell.jpg" +RTS N2
Results in:
Example: matrixmatrix multiplication
A more advanced example from the Repa paper is matrixmatrix multiplication. The result of matrix multiplication is a matrix whose elements are found by multiplying the elements of each row from the first matrix by the associated elements of the same column from the second matrix and summing the result.
if and
then
So we take two, 2D arrays and generate a new array, using our transpose function from earlier:
mmMult :: Monad m
=> Array U DIM2 Double
> Array U DIM2 Double
> m (Array U DIM2 Double)
mmMult a b = sumP (Repa.zipWith (*) aRepl bRepl)
where
t = transpose2D b
aRepl = extend (Z :.All :.colsB :.All) a
bRepl = extend (Z :.rowsA :.All :.All) t
(Z :.colsA :.rowsA) = extent a
(Z :.colsB :.rowsB) = extent b
The idea is to expand both 2D argument arrays into 3D arrays by
replicating them across a new axis using extend
function. The front face of the cuboid that results represents the array a
, which we replicate as often as b
has columns (colsB)
, producing aRepl
.
The top face represents t
(the transposed b), which we
replicate as often as a has rows (rowsA)
, producing
bRepl,
. The two replicated arrays have the same extent,
which corresponds to the index space of matrix multiplication.
Optimized implementations of this function are available in the repaalgorithms package.
Example: parallel image desaturation
To convert an image from color to greyscale, we can use the luminosity method to average RGB pixels into a common grey value, where the average is weighted for human perception of green.
The formula for luminosity is 0.21 R + 0.71 G + 0.07 B.
We can write a parallel image desaturation tool using repa and the repadevil image library:
import System.Environment
import Data.Word
import Data.Array.Repa hiding ((++))
import Data.Array.Repa.IO.DevIL
import Data.Array.Repa.Repr.ForeignPtr

 Read an image, desaturate, write out with new name.

main = do
[f] < getArgs
runIL $ do
(RGB a) < readImage f
b < (computeP $ traverse a id luminosity) :: IL (Array F DIM3 Word8)
writeImage ("grey" ++ f) (RGB b)
And now the luminosity transform itself, which averages the 3 RGB colors based on perceived weight:

 (Parallel) desaturation of an image via the luminosity method.

luminosity :: (DIM3 > Word8) > DIM3 > Word8
luminosity _ (Z :. _ :. _ :. 3) = 255  alpha channel
luminosity f (Z :. i :. j :. _) = ceiling $ 0.21 * r + 0.71 * g + 0.07 * b
where
r = fromIntegral $ f (Z :. i :. j :. 0)
g = fromIntegral $ f (Z :. i :. j :. 1)
b = fromIntegral $ f (Z :. i :. j :. 2)
And that's it! The result is a parallel image desaturator, when compiled with
$ ghc O threaded rtsopts make repa.hs
which we can run, to use two cores:
$ time ./repa sunflower.png +RTS N2 H real 0m0.048s user 0m0.052s sys 0m0.016s
We can also see that program takes a bit longer when run on a single core:
$ time ./repa sunflower.png +RTS N1 H real 0m0.064s user 0m0.060s sys 0m0.008s
Given an image like this:
The desaturated result from Haskell:
Example: Partial Differential equations, idiomatic option pricing and risk
http://idontgetoutmuch.wordpress.com/2013/01/05/optionpricingusinghaskellparallelarrays/ http://stackoverflow.com/questions/14082158/idiomaticoptionpricingandriskusingrepaparallelarrays http://idontgetoutmuch.wordpress.com/2013/02/10/parallelisingpathdependentoptionsinhaskell2/
Optimising Repa programs
Note: This section is based on older version of Repa (2.x). It seems that it is not entirely accurate for Repa 3. For example there is no force
function and array type (manifest or delayed) is now explicitly defined using type index. I do not feel sufficiently competent to update this section so I am leaving it "as is" for the time being. killy9999 09:12, 11 October 2012 (UTC)
Fusion, and why you need it
Repa depends critically on array fusion to achieve fast code. Fusion is a fancy name for the combination of inlining and code transformations performed by GHC when it compiles your program. The fusion process merges the array filling loops defined in the Repa library, with the "worker" functions that you write in your own module. If the fusion process fails, then the resulting program will be much slower than it needs to be, often 10x slower an equivalent program using plain Haskell lists. On the other hand, provided fusion works, the resulting code will run as fast as an equivalent cleanly written C program. Making fusion work is not hard once you understand what's going on.
The force
function has the loops
Suppose we have the following binding:
arr' = R.force $ R.map (\x > x + 1) arr
The right of this binding will compile down to code that first allocates the result array arr'
, then iterates over the source array arr
, reading each element in turn and adding one to it, then writing to the corresponding element in the result.
Importantly, the code that does the allocation, iteration and update is defined as part of the force
function. This forcing code has been written to break up the result into several chunks, and evaluate each chunk with a different thread. This is what makes your code run in parallel. If you do not use force
then your code will be slow and not run in parallel.
Delayed and Manifest arrays
In the example from the previous section, think of the R.map (\x > x + 1) arr
expression as a specification for a new array. In the library, this specification is referred to as a delayed array. A delayed array is represented as a function that takes an array index, and computes the value of the element at that index.
Applying force
to a delayed array causes all elements to be computed in parallel. The result of a force
is referred to as a manifest array. A manifest array is a "real" array represented as a flat chunk of memory containing array elements.
All Repa array operators will accept both delayed and manifest arrays. However, if you index into a delayed array without forcing it first, then each indexing operation costs a function call. It also recomputes the value of the array element at that index.
Shells and Springs
Here is another way to think about Repa's approach to array fusion. Suppose we write the following binding:
arr' = R.force $ R.map (\x > x * 2) $ R.map (\x > x + 1) arr
Remember from the previous section, that the result of each of the applications of R.map
is a delayed array. A delayed array is not a "real", manifest array, it's just a shell that contains a function to compute each element. In this example, the two worker functions correspond to the lambda expressions applied to R.map
.
When GHC compiles this example, the two worker functions are fused into a fresh unfolding of the parallel loop defined in the code for R.force
. Imagine holding R.force
in your left hand, and squashing the calls to R.map
into it, like a spring. Doing this breaks all the shells, and you end up with the worker functions fused into an unfolding of R.force
.
INLINE worker functions
Consider the following example:
f x = x + 1 arr' = R.force $ R.zipWith (*) (R.map f arr1) (R.map f arr2)
During compilation, we need GHC to fuse our worker functions into a fresh unfolding of R.force
. In this example, fusion includes inlining the definition of f
. If f
is not inlined, then the performance of the compiled code will be atrocious. It will perform a function call for each application of f
, where it really only needs a single machine instruction to increment the x
value.
Now, in general, GHC tries to avoid producing binaries that are "too big". Part of this is a heuristic that controls exactly what functions are inlined. The heuristic says that a function may be inlined only if it is used once, or if its definition is less than some particular size. If neither of these apply, then the function won't be inlined, killing performance.
For Repa programs, as fusion and inlining has such a dramatic effect on performance, we should absolutely not rely on heuristics to control whether or not this inlining takes place. If we rely on a heuristic, then even if our program runs fast today, if this heuristic is ever altered then some functions that used to be inlined may no longer be.
The moral of the story is to attach INLINE pragmas to all of your client functions that compute array values. This ensures that these critical functions will be inlined now, and forever.
{# INLINE f #} f x = x + 1
arr' = R.force $ R.zipWith (*) (R.map f arr1) (R.map f arr2)