https://wiki.haskell.org/api.php?action=feedcontributions&user=Sacheie&feedformat=atomHaskellWiki - User contributions [en]2024-03-28T15:23:57ZUser contributionsMediaWiki 1.35.5https://wiki.haskell.org/index.php?title=Numeric_Haskell:_A_Repa_Tutorial&diff=45880Numeric Haskell: A Repa Tutorial2012-06-04T23:49:17Z<p>Sacheie: Added a link to Ben Lippmeier's new paper on Repa 3.</p>
<hr />
<div>Note: This tutorial is for an old version of Repa. The current version (Repa 3.1) has a slightly different API. You can read more about Repa 3 in [http://www.cse.unsw.edu.au/~benl/papers/guiding/guiding-Haskell2012-sub.pdf this paper.]<br />
<br />
[http://hackage.haskell.org/package/repa Repa] is a Haskell library for<br />
high performance, regular, multi-dimensional parallel arrays. All<br />
numeric data is stored unboxed and functions written with the Repa<br />
combinators are automatically parallel (provided you supply "+RTS -N" on<br />
the command line when running the program).<br />
<br />
[[Image:Grid.png|right]]<br />
<br />
This document provides a tutorial on array programming in Haskell using the repa package.<br />
<br />
''Note: a companion tutorial to this is provided as the [http://www.haskell.org/haskellwiki/Numeric_Haskell:_A_Vector_Tutorial vector tutorial], and is based on the [http://www.scipy.org/Tentative_NumPy_Tutorial NumPy tutorial]''.<br />
<br />
''Authors: [http://donsbot.wordpress.com Don Stewart].''<br />
<br />
= Quick Tour =<br />
<br />
Repa (REgular PArallel arrays) is an advanced, multi-dimensional parallel arrays library for Haskell, with a number of distinct capabilities:<br />
<br />
* The arrays are "regular" (i.e. dense, rectangular and store elements all of the same type); and<br />
* Functions may be written that are polymorphic in the shape of the array;<br />
* Many operations on arrays are accomplished by changing only the shape of the array (without copying elements);<br />
* The library will automatically parallelize operations over arrays.<br />
<br />
This is a quick start guide for the package. For further information, consult:<br />
<br />
* [http://hackage.haskell.org/packages/archive/repa/2.0.0.3/doc/html/Data-Array-Repa.html The Haddock Documentation]<br />
* [http://research.microsoft.com/en-us/um/people/simonpj/papers/ndp/rarrays.pdf Regular, Shape-polymorphic, Parallel Arrays in Haskell].<br />
* [http://www.cse.unsw.edu.au/~benl/papers/stencil/stencil-icfp2011-sub.pdf Efficient Parallel Stencil Convolution in Haskell]<br />
<br />
== Importing the library ==<br />
<br />
Download the `repa` package:<br />
<br />
$ cabal install repa<br />
<br />
and import it qualified:<br />
<br />
import qualified Data.Array.Repa as R<br />
<br />
The library needs to be imported qualified as it shares the same function names as list operations in the Prelude.<br />
<br />
''Note: Operations that involve writing new index types for Repa arrays will require the '-XTypeOperators' language extension.''<br />
<br />
For non-core functionality, a number of related packages are available:<br />
<br />
* [http://hackage.haskell.org/package/repa-bytestring repa-bytestring]<br />
* [http://hackage.haskell.org/package/repa-io repa-io]<br />
* [http://hackage.haskell.org/package/repa-algorithms repa-algorithms]<br />
* [http://hackage.haskell.org/package/repa-devil repa-devil] (image loading)<br />
<br />
and example algorithms in:<br />
<br />
* [http://hackage.haskell.org/package/repa-examples repa-examples]<br />
<br />
== Index types and shapes ==<br />
<br />
Before we can get started manipulating arrays, we need a grasp of repa's notion of array shape. Much like the classic 'array' library in Haskell, repa-based arrays are parameterized via a type which determines the dimension of the array, and the type of its index. However, while classic arrays take tuples to represent multiple dimensions, Repa arrays use a [http://hackage.haskell.org/packages/archive/repa/2.0.0.3/doc/html/Data-Array-Repa-Shape.html#t:Shape richer type language] for describing multi-dimensional array indices and shapes (technically, a ''heterogeneous snoc list'').<br />
<br />
Shape types are built somewhat like lists. The constructor <code>Z</code> corresponds<br />
to a rank zero shape, and is used to mark the end of the list. The <code>:.</code> constructor adds additional dimensions to the shape. So, for example, the shape:<br />
<br />
(Z :. 3 :. 2 :. 3)<br />
<br />
is the shape of a small 3D array, with shape type<br />
<br />
(Z :. Int :. Int :. Int)<br />
<br />
The most common dimensions are given by the shorthand names:<br />
<br />
<haskell><br />
type DIM0 = Z<br />
type DIM1 = DIM0 :. Int<br />
type DIM2 = DIM1 :. Int<br />
type DIM3 = DIM2 :. Int<br />
type DIM4 = DIM3 :. Int<br />
type DIM5 = DIM4 :. Int<br />
</haskell><br />
<br />
thus,<br />
<br />
<haskell><br />
Array DIM2 Double<br />
</haskell><br />
<br />
is the type of a two-dimensional array of doubles, indexed via <code>Int</code> keys, while<br />
<br />
<haskell><br />
Array Z Double<br />
</haskell><br />
<br />
is a zero-dimension object (i.e. a point) holding a Double.<br />
<br />
Many operations over arrays are polymorphic in the shape / dimension<br />
component. Others require operating on the shape itself, rather than<br />
the array. A typeclass, <code>Shape</code>, lets us operate uniformly<br />
over arrays with different shape.<br />
<br />
=== Building shapes ===<br />
<br />
To build values of shape type, we can use the <code>Z</code> and <code>:.</code> constructors. Open the ghci and import Repa:<br />
<br />
<haskell><br />
Prelude> :m +Data.Array.Repa<br />
Repa> Z -- the zero-dimension<br />
Z<br />
</haskell><br />
<br />
For arrays of non-zero dimension, we must give a size. ''Note:'' a common error is to leave off the ''type'' of the size.<br />
<br />
<haskell><br />
Repa> :t Z :. 10<br />
Z :. 10 :: Num head => Z :. head<br />
</haskell><br />
<br />
leading to annoying type errors about unresolved instances, such as:<br />
<br />
No instance for (Shape (Z :. head0))<br />
<br />
To select the correct instance, you will need to annotate the size literals with their concrete type:<br />
<br />
<haskell><br />
Repa> :t Z :. (10 :: Int)<br />
Z :. (10 :: Int) :: Z :. Int<br />
</haskell><br />
<br />
is the shape of 1D arrays of length 10, indexed via Ints.<br />
<br />
Given an array, you can always find its shape by calling <code>extent</code>.<br />
<br />
Additional convenience types for selecting particular parts of a shape are also provided (<code>All, Any, Slice</code> etc.) which are covered later in the tutorial.<br />
<br />
=== Working with shapes ===<br />
<br />
That one key operation, <code>extent</code>, gives us many attributes of an array:<br />
<br />
<haskell><br />
-- Extract the shape of the array<br />
extent :: Array sh a -> sh<br />
</haskell><br />
<br />
So, given a 3x3x3 array, of type <code>Array DIM3 Int</code>, we can:<br />
<br />
<haskell><br />
-- build an array<br />
Repa> let x :: Array DIM3 Int; x = fromList (Z :. (3::Int) :. (3::Int) :. (3::Int)) [1..27]<br />
Repa> :t x<br />
x :: Array DIM3 Int<br />
<br />
-- query the extent<br />
Repa> extent x<br />
((Z :. 3) :. 3) :. 3<br />
<br />
-- compute the rank (number of dimensions)<br />
Repa> let sh = extent x<br />
Repa> rank sh<br />
3<br />
<br />
-- compute the size (total number of elements)<br />
> size sh<br />
27<br />
<br />
-- extract the elements of the array as a flat vector<br />
Repa> toVector x<br />
fromList [1,2,3,4,5,6,7,8,9,10<br />
,11,12,13,14,15,16,17,18,19<br />
,20,21,22,23,24,25,26,27] :: Data.Vector.Unboxed.Vector<br />
</haskell><br />
<br />
== Generating arrays ==<br />
<br />
New repa arrays ("arrays" from here on) can be generated in many ways, and we always begin by importing the <code>Data.Array.Repa</code> module:<br />
<br />
<br />
$ ghci<br />
GHCi, version 7.0.3: http://www.haskell.org/ghc/ :? for help<br />
Loading package ghc-prim ... linking ... done.<br />
Loading package integer-gmp ... linking ... done.<br />
Loading package base ... linking ... done.<br />
Loading package ffi-1.0 ... linking ... done.<br />
Prelude> :m + Data.Array.Repa<br />
<br />
<br />
They may be constructed from lists, for example. Here is a one dimensional array of length 10, here, given the shape <code>(Z :. 10)</code>:<br />
<br />
<haskell><br />
Repa> let inputs = [1..10] :: [Double]<br />
Repa> let x = fromList (Z :. (10::Int)) inputs<br />
Repa> x<br />
[1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0]<br />
</haskell><br />
<br />
The type of <code>x</code> is inferred as:<br />
<br />
<haskell><br />
Repa> :t x<br />
x :: Array (Z :. Int) Double<br />
</haskell><br />
<br />
which we can read as "an array of dimension 1, indexed via <code>Int</code> keys, holding elements of type <code>Double</code>"<br />
<br />
We could also have written the type as:<br />
<br />
<haskell><br />
Repa> let x' = fromList (Z :. 10 :: DIM1) inputs<br />
Repa> :t x'<br />
x' :: Array DIM1 Double<br />
</haskell><br />
<br />
The same data may also be treated as a two dimensional array, by changing the shape parameter:<br />
<br />
<haskell><br />
Repa> let x2 = fromList (Z :. (5::Int) :. (2::Int)) inputs<br />
Repa> x2<br />
[1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0]<br />
</haskell><br />
<br />
which has the type:<br />
<br />
<haskell><br />
Repa> :t x2<br />
x2 :: Array ((Z :. Int) :. Int) Double<br />
</haskell><br />
<br />
or, as above, if we define it with the type synonym for 2 dimensional <code>Int</code>- indexed arrays, <code>DIM2</code>:<br />
<br />
<haskell><br />
Repa> let x2' = fromList (Z :. 5 :. 2 :: DIM2) inputs<br />
Repa> x2'<br />
[1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0]<br />
Repa> :t x2'<br />
x2' :: Array DIM2 Double<br />
</haskell><br />
<br />
=== Building arrays from vectors ===<br />
<br />
It is also possible to build arrays from unboxed vectors, from the 'vector' package:<br />
<br />
<haskell><br />
fromVector :: Shape sh => sh -> Vector a -> Array sh a<br />
</haskell><br />
<br />
New arrays are built by applying a shape to the vector. For example:<br />
<br />
<haskell><br />
Repa> import Data.Vector.Unboxed --recent ghci's support this import syntax<br />
Repa Unboxed> let x = fromVector (Z :. (10::Int)) (enumFromN 0 10)<br />
[0.0,1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0]<br />
</haskell><br />
<br />
is a one-dimensional array of doubles. As usual, we can also impose other<br />
shapes:<br />
<br />
<haskell><br />
Repa Unboxed> let x = fromVector (Z :. (3::Int) :. (3::Int)) (enumFromN 0 9)<br />
Repa Unboxed> x<br />
[0.0,1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0]<br />
Repa Unboxed> :t x<br />
x :: Array ((Z :. Int) :. Int) Double<br />
</haskell><br />
<br />
to create a 3x3 array.<br />
<br />
=== Generating random arrays ===<br />
<br />
The [http://hackage.haskell.org/package/repa-algorithms repa-algorithms] package lets us generate new arrays with random data:<br />
<br />
<haskell><br />
-- 3d array of Ints, bounded between 0 and 255.<br />
Repa> :m +Data.Array.Repa.Algorithms.Randomish<br />
Repa Randomish> randomishIntArray (Z :. (3::Int) :. (3::Int) :. (3::Int)) 0 255 1<br />
[217,42,130,200,216,254<br />
,67,77,152,85,140,226,179<br />
,71,23,17,152,84,47,17,45<br />
,5,88,245,107,214,136]<br />
</haskell><br />
<br />
=== Reading arrays from files ===<br />
<br />
Using the [http://hackage.haskell.org/package/repa-io repa-io] package, arrays may be written and read from files in a number of formats:<br />
<br />
* as BMP files; and<br />
* in a number of text formats.<br />
<br />
with other formats rapidly appearing. For the special case of arrays of Word8 values, the [http://hackage.haskell.org/package/repa-bytestring repa-bytestring] library supports generating bytestrings in memory.<br />
<br />
An example: to write an 2D array to an ascii file:<br />
<br />
<haskell><br />
Repa> :m +Data.Array.Repa.IO.Matrix<br />
Repa Matrix> let x = fromList (Z :. 5 :. 2 :: DIM2) [1..10]<br />
Repa Matrix> writeMatrixToTextFile "test.dat" x<br />
</haskell><br />
<br />
This will result in a file containing:<br />
<br />
MATRIX<br />
2 5<br />
1.0<br />
2.0<br />
3.0<br />
4.0<br />
5.0<br />
6.0<br />
7.0<br />
8.0<br />
9.0<br />
10.0<br />
<br />
In turn, this file may be read back in via <code>readMatrixFromTextFile</code>.<br />
<br />
<haskell><br />
Repa Matrix> xx <- readMatrixFromTextFile "test.dat" <br />
Repa Matrix> xx<br />
[1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0]<br />
Repa Matrix> :t xx<br />
it :: Array DIM2 Double<br />
</haskell><br />
<br />
<br />
To process [http://en.wikipedia.org/wiki/BMP_file_format .bmp files], use [http://hackage.haskell.org/packages/archive/repa-io/2.0.0.3/doc/html/Data-Array-Repa-IO-BMP.html Data.Array.Repa.IO.BMP], as follows (currently reading only works for 24 bit .bmp):<br />
<br />
<haskell><br />
Data.Array.Repa.IO.BMP> x <- readImageFromBMP "/tmp/test24.bmp"<br />
</haskell><br />
<br />
Reads this .bmp image:<br />
<br />
http://i.imgur.com/T4ttx.png<br />
<br />
as a 3D array of <code>Word8</code>, which can be further processed.<br />
<br />
''Note'': at the time of writing, there are no binary instances for repa arrays.<br />
<br />
For image IO in many, many formats, use the [http://hackage.haskell.org/package/repa-devil repa-devil] library.<br />
<br />
=== Copying arrays from pointers ===<br />
<br />
You can also generate new repa arrays by copying data from a pointer, using the [http://hackage.haskell.org/package/repa-bytestring repa-bytestring] package. Here is an example, using <code>copyFromPtrWord8</code>:<br />
<br />
<haskell><br />
import Data.Word<br />
import Foreign.Ptr<br />
<br />
import qualified Data.Vector.Storable as V<br />
import qualified Data.Array.Repa as R<br />
import Data.Array.Repa<br />
import qualified Data.Array.Repa.ByteString as R<br />
<br />
import Data.Array.Repa.IO.DevIL<br />
<br />
i, j, k :: Int <br />
(i, j, k) = (255, 255, 4 {-RGBA-})<br />
<br />
-- 1d vector, filled with pretty colors<br />
v :: V.Vector Word8<br />
v = V.fromList . take (i * j * k) . cycle $ concat<br />
[ [ r, g, b, 255 ]<br />
| r <- [0 .. 255]<br />
, g <- [0 .. 255]<br />
, b <- [0 .. 255]<br />
] <br />
<br />
ptr2repa :: Ptr Word8 -> IO (R.Array R.DIM3 Word8)<br />
ptr2repa p = R.copyFromPtrWord8 (Z :. i :. j :. k) p<br />
<br />
main = do<br />
-- copy our 1d vector to a repa 3d array, via a pointer<br />
r <- V.unsafeWith v ptr2repa<br />
runIL $ writeImage "test.png" r <br />
return ()<br />
</haskell><br />
<br />
This fills a vector, converts it to a pointer, then copies that pointer to a 3d array, before writing the result out as this image:<br />
<br />
http://i.imgur.com/o0Cv2.png<br />
<br />
== Indexing arrays ==<br />
<br />
To access elements in repa arrays, you provide an array and a shape, to access the element:<br />
<br />
<haskell><br />
(!) :: (Shape sh, Elt a) => Array sh a -> sh -> a<br />
</haskell><br />
<br />
Indices start with 0. So:<br />
<br />
<haskell><br />
> let x = fromList (Z :. (10::Int)) [1..10]<br />
> x ! (Z :. 2)<br />
3.0<br />
</haskell><br />
<br />
Note that we can't give just a bare literal as the shape, even for one-dimensional arrays, :<br />
<br />
<haskell><br />
> x ! 2<br />
<br />
No instance for (Num (Z :. Int))<br />
arising from the literal `2'<br />
</haskell><br />
<br />
as the <code>Z</code> type isn't in the <code>Num</code> class, and Haskell's numeric literals are overloaded.<br />
<br />
What if the index is out of bounds, though?<br />
<br />
<haskell><br />
> x ! (Z :. 11)<br />
*** Exception: ./Data/Vector/Generic.hs:222 ((!)): index out of bounds (11,10)<br />
</haskell><br />
<br />
an exception is thrown. An alternative is to use indexing functions that return a <code>Maybe</code>:<br />
<br />
<haskell><br />
(!?) :: (Shape sh, Elt a) => Array sh a -> sh -> Maybe a<br />
</haskell><br />
<br />
An example:<br />
<br />
<haskell><br />
> x !? (Z :. 9)<br />
Just 10.0<br />
<br />
> x !? (Z :. 11)<br />
Nothing<br />
</haskell><br />
<br />
== Operations on arrays ==<br />
<br />
Besides indexing, there are many regular, list-like operations on arrays. Since many of the names parallel those in the Prelude, we import Repa qualified:<br />
<br />
Repa> import qualified Data.Array.Repa as Repa<br />
<br />
=== Maps, zips, filters and folds ===<br />
<br />
We can map over multi-dimensional arrays:<br />
<br />
Repa> let x = fromList (Z :. (3::Int) :. (3::Int)) [1..9]<br />
Repa> x<br />
[1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0]<br />
<br />
since this <code>map</code> conflicts with the definition in the Prelude, we have to use it with the qualifier we requested:<br />
<br />
Repa> Repa.map (^2) x<br />
[1.0,4.0,9.0,16.0,25.0,36.0,49.0,64.0,81.0]<br />
<br />
Repa's <code>map</code> leaves the dimension unchanged:<br />
<br />
Repa> extent x<br />
(Z :. 3) :. 3<br />
Repa> extent (Repa.map (^2) x)<br />
(Z :. 3) :. 3<br />
<br />
A <code>fold</code> reduces the inner dimension of the array:<br />
<br />
fold :: (Shape sh, Elt a) <br />
=> (a -> a -> a) -> a -> Array (sh :. Int) a -> Array sh a<br />
<br />
The <code>x</code> defined above was a 2D array:<br />
<br />
Repa> extent x<br />
(Z :. 3) :. 3<br />
<br />
but if we sum each row:<br />
<br />
Repa> Repa.fold (+) 0 x<br />
[6.0,15.0,24.0]<br />
<br />
we get a 1D array instead:<br />
<br />
Repa> extent (Repa.fold (+) 0 x)<br />
Z :. 3<br />
<br />
Similarly, if <code>y</code> is a (3 x 3 x 3) 3D array:<br />
<br />
Repa> let y = fromList ((Z :. 3 :. 3 :. 3) :: DIM3) [1..27]<br />
Repa> y<br />
[1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0,15.0,16.0,17.0,18.0,19.0,20.0,21.0,22.0,23.0,24.0,25.0,26.0,27.0]<br />
<br />
we can fold over the inner dimension: <br />
<br />
Repa> Repa.fold (+) 0 y<br />
[6.0,15.0,24.0,33.0,42.0,51.0,60.0,69.0,78.0]<br />
<br />
yielding a 2D (3 x 3) array in place of our 3D (3 x 3 x 3) array:<br />
<br />
Repa> extent y<br />
((Z :. 3) :. 3) :. 3<br />
Repa> extent (Repa.fold (+) 0 y)<br />
(Z :. 3) :. 3<br />
<br />
Two arrays may be combined via <code>zipWith</code>:<br />
<br />
zipWith :: (Shape sh, Elt b, Elt c, Elt a) =><br />
(a -> b -> c) -> Array sh a -> Array sh b -> Array sh c<br />
<br />
an example:<br />
<br />
Repa> Repa.zipWith (*) x x<br />
[1.0,4.0,9.0,16.0,25.0,36.0,49.0,64.0,81.0]<br />
Repa> extent it<br />
(Z :. 3) :. 3<br />
<br />
Repa> Repa.zipWith (*) y y <br />
[1.0,4.0,9.0,16.0,25.0,36.0,49.0,64.0,81.0,<br />
100.0,121.0,144.0,169.0,196.0,225.0,256.0,289.0,324.0,<br />
361.0,400.0,441.0,484.0,529.0,576.0,625.0,676.0,729.0] --reformatted<br />
Repa> extent it<br />
((Z :. 3) :. 3) :. 3<br />
<br />
=== Mapping, with indices ===<br />
<br />
A very powerful operator is <code>traverse</code>, a parallel array traversal which also supplies the current index:<br />
<br />
<haskell><br />
traverse :: (Shape sh, Shape sh', Elt a) <br />
=> Array sh a <br />
-- Source array<br />
-> (sh -> sh') <br />
-- Function to produce the extent of the result.<br />
-> ((sh -> a) -> sh' -> b) <br />
-- Function to produce elements of the result.<br />
-- It is passed a lookup function to<br />
-- get elements of the source.<br />
-> Array sh' b<br />
</haskell><br />
<br />
This is quite a compicated type, because it is very general. Let's take it apart. The first argument is the source array, which is obvious. The second argument is a function that transforms the shape of the input array to yield the output array. So if the arrays are the same size, this function is <code>id</code>. It might grow or resize the shape in other ways.<br />
<br />
Finally, the 3rd argument is where the magic is. Given an index, return a new element, and you also get a lookup function which when applied yields the current element.<br />
<br />
So we see this generalizes map to support indexes, and optional inspection of the current element. Let's try some examples:<br />
<br />
<haskell><br />
$ ghci <br />
GHCi, version 7.0.3: http://www.haskell.org/ghc/ :? for help<br />
*> :m + Data.Array.Repa<br />
*> :m + Data.Array.Repa.Algorithms.Randomish<br />
*> let a :: Array DIM3 Int;<br />
a = fromList (Z :. (3::Int) :. (3::Int) :. (3::Int)) [1..27]<br />
*> a<br />
[1,2,3,4,5,6,7,8,9<br />
,10,11,12,13,14,15,16,17,18<br />
,19,20,21,22,23,24,25,26,27]<br />
<br />
-- Keeping the shape the same, and just overwriting elements<br />
-- Use `traverse` to set all elements to their `x` axis:<br />
*> traverse a id (\_ (Z :. i :. j :. k) -> i) <br />
[0,0,0,0,0,0,0,0,0<br />
,1,1,1,1,1,1,1,1,1<br />
,2,2,2,2,2,2,2,2,2]<br />
<br />
-- Shuffle elements around, based on their index.<br />
-- Rotate elements by swapping elements from rotated locations:<br />
> traverse a id (\f (Z :. i :. j :. k) -> f (Z :. j :. k :. i)) <br />
[1,4,7,10,13,16,19,22,25<br />
,2,5,8,11,14,17,20,23,26<br />
,3,6,9,12,15,18,21,24,27]<br />
</haskell><br />
<br />
The documentation on [http://hackage.haskell.org/packages/archive/repa/2.0.2.1/doc/html/Data-Array-Repa.html#g:7 traverse] provides further information.<br />
<br />
=== Numeric operations: negation, addition, subtraction, multiplication ===<br />
<br />
Repa arrays are instances of the <code>Num</code>. This means that<br />
operations on numerical elements are lifted automagically onto arrays of<br />
such elements. For example, <code>(+)</code> on two double values corresponds to<br />
element-wise addition, <code>(+)</code>, of the two arrays of doubles:<br />
<br />
> let x = fromList (Z :. (10::Int)) [1..10]<br />
> x + x<br />
[2.0,4.0,6.0,8.0,10.0,12.0,14.0,16.0,18.0,20.0]<br />
<br />
Other operations from the Num class work just as well:<br />
<br />
> -x<br />
[-1.0,-2.0,-3.0,-4.0,-5.0,-6.0,-7.0,-8.0,-9.0,-10.0]<br />
<br />
> x ^ 3<br />
[1.0,8.0,27.0,64.0,125.0,216.0,343.0,512.0,729.0,1000.0]<br />
<br />
> x * x<br />
[1.0,4.0,9.0,16.0,25.0,36.0,49.0,64.0,81.0,100.0]<br />
<br />
== Changing the shape of an array ==<br />
<br />
One of the main advantages of repa-style arrays over other arrays in<br />
Haskell is the ability to reshape data without copying. This is achieved<br />
via *index-space transformations*. <br />
<br />
An example: transposing a 2D array (this example taken from the repa<br />
paper). First, the type of the transformation:<br />
<br />
transpose2D :: Elt e => Array DIM2 e -> Array DIM2 e <br />
<br />
Note that this transform will work on DIM2 arrays holding any elements.<br />
Now, to swap rows and columns, we have to modify the shape:<br />
<br />
transpose2D a = backpermute (swap e) swap a<br />
where<br />
e = extent a<br />
swap (Z :. i :. j) = Z :. j :. i<br />
<br />
The swap function reorders the index space of the array.<br />
To do this, we extract the current shape of the array, and write a function<br />
that maps the index space from the old array to the new array. That index space function<br />
is then passed to backpermute which actually constructs the new<br />
array from the old one.<br />
<br />
backpermute generates a new array from an old, when given the new shape, and a<br />
function that translates between the index space of each array (i.e. a shape<br />
transformer).<br />
<br />
backpermute<br />
:: (Shape sh, Shape sh', Elt a) <br />
=> sh' <br />
-> (sh' -> sh) <br />
-> Array sh a <br />
-> Array sh' a<br />
<br />
Note that the array created is not actually evaluated (we only modified the index space of the array).<br />
<br />
Transposition is such a common operation that it is provided by the<br />
library:<br />
<br />
transpose :: (Shape sh, Elt a)<br />
=> Array ((sh :. Int) :. Int) a -> Array ((sh :. Int) :. Int)<br />
<br />
the type indicate that it works on the lowest two dimensions of the<br />
array.<br />
<br />
Other operations on index spaces include taking slices and joining<br />
arrays into larger ones.<br />
<br />
== Examples ==<br />
<br />
Following are some examples of useful functions that exercise the API.<br />
<br />
=== Example: Rotating an image with backpermute ===<br />
<br />
Flip an image upside down:<br />
<br />
<haskell><br />
import System.Environment<br />
import Data.Word<br />
import Data.Array.Repa hiding ((++))<br />
import Data.Array.Repa.IO.DevIL<br />
<br />
main = do<br />
[f] <- getArgs<br />
runIL $ do<br />
v <- readImage f<br />
writeImage ("flip-"++f) (rot180 v)<br />
<br />
rot180 :: Array DIM3 Word8 -> Array DIM3 Word8<br />
rot180 g = backpermute e flop g<br />
where<br />
e@(Z :. x :. y :. _) = extent g<br />
<br />
flop (Z :. i :. j :. k) =<br />
(Z :. x - i - 1 :. y - j - 1 :. k)<br />
</haskell><br />
<br />
Running this:<br />
<br />
$ ghc -O2 --make A.hs<br />
$ ./A haskell.jpg <br />
<br />
Results in:<br />
<br />
http://i.imgur.com/YsGA8.jpg<br />
<br />
=== Example: matrix-matrix multiplication ===<br />
<br />
A more advanced example from the Repa paper: matrix-matrix multiplication: the result of<br />
matrix multiplication is a matrix whose elements are found by<br />
multiplying the elements of each row from the first matrix by the<br />
associated elements of the same column from the second matrix and<br />
summing the result.<br />
<br />
if <math>A=\begin{bmatrix}a&b\\c&d\end{bmatrix}</math> and <math>B=\begin{bmatrix}e&f\\g&h\end{bmatrix}</math><br />
<br />
then <br />
<br />
<math>AB=\begin{bmatrix}a&b\\c&d\end{bmatrix}\begin{bmatrix}e&f\\g&h\end{bmatrix}=\begin{bmatrix}ae+bg&af+bh\\ce+dg&cf+dh\end{bmatrix}</math><br />
<br />
So we take two, 2D arrays and generate a new array, using our transpose<br />
function from earlier:<br />
<br />
<haskell><br />
mmMult :: (Num e, Elt e)<br />
=> Array DIM2 e<br />
-> Array DIM2 e<br />
-> Array DIM2 e<br />
<br />
mmMult a b = sum (zipWith (*) aRepl bRepl)<br />
where<br />
t = transpose2D b<br />
aRepl = extend (Z :.All :.colsB :.All) a<br />
bRepl = extend (Z :.rowsA :.All :.All) t<br />
(Z :.colsA :.rowsA) = extent a<br />
(Z :.colsB :.rowsB) = extent b<br />
</haskell><br />
<br />
The idea is to expand both 2D argument arrays into 3D arrays by<br />
replicating them across a new axis. The front face of the cuboid that<br />
results represents the array <code>a</code>, which we replicate as often<br />
as <code>b</code> has columns <code>(colsB)</code>, producing<br />
<code>aRepl</code>.<br />
<br />
The top face represents <code>t</code> (the transposed b), which we<br />
replicate as often as a has rows <code>(rowsA)</code>, producing<br />
<code>bRepl,</code>. The two replicated arrays have the same extent,<br />
which corresponds to the index space of matrix multiplication<br />
<br />
Optimized implementations of this function are available in the<br />
[http://hackage.haskell.org/package/repa-algorithms repa-algorithms] package.<br />
<br />
=== Example: parallel image desaturation ===<br />
<br />
To convert an image from color to greyscale, we can use the luminosity method to average RGB pixels into a common grey value, where the average is weighted for human perception of green.<br />
<br />
The formula for luminosity is 0.21 R + 0.71 G + 0.07 B.<br />
<br />
We can write a parallel image desaturation tool using repa and the [http://hackage.haskell.org/package/repa-devil repa-devil image library]:<br />
<br />
<haskell><br />
import Data.Array.Repa.IO.DevIL<br />
import Data.Array.Repa hiding ((++))<br />
import Data.Word<br />
import System.Environment<br />
<br />
--<br />
-- Read an image, desaturate, write out with new name.<br />
--<br />
main = do <br />
[f] <- getArgs<br />
runIL $ do<br />
a <- readImage f<br />
let b = traverse a id luminosity <br />
writeImage ("grey-" ++ f) b<br />
</haskell><br />
<br />
And now the luminosity transform itself, which averages the 3 RGB colors based on perceived weight:<br />
<br />
<haskell><br />
--<br />
-- (Parallel) desaturation of an image via the luminosity method.<br />
--<br />
luminosity :: (DIM3 -> Word8) -> DIM3 -> Word8<br />
luminosity _ (Z :. _ :. _ :. 3) = 255 -- alpha channel<br />
luminosity f (Z :. i :. j :. _) = ceiling $ 0.21 * r + 0.71 * g + 0.07 * b<br />
where<br />
r = fromIntegral $ f (Z :. i :. j :. 0)<br />
g = fromIntegral $ f (Z :. i :. j :. 1)<br />
b = fromIntegral $ f (Z :. i :. j :. 2)<br />
</haskell><br />
<br />
And that's it! The result is a parallel image desaturator, when compiled with <br />
<br />
$ ghc -O -threaded -rtsopts --make A.hs -fforce-recomp<br />
<br />
which we can run, to use two cores:<br />
<br />
$ time ./A sunflower.png +RTS -N2 -H <br />
./A sunflower.png +RTS -N2 -H 0.19s user 0.03s system 135% cpu 0.165 total<br />
<br />
Given an image like this:<br />
<br />
http://i.imgur.com/iu4gV.png<br />
<br />
The desaturated result from Haskell:<br />
<br />
http://i.imgur.com/REhA5.png<br />
<br />
== Optimising Repa programs ==<br />
<br />
=== Fusion, and why you need it ===<br />
Repa depends critically on array fusion to achieve fast code. Fusion is a fancy name for the combination of inlining and code transformations performed by GHC when it compiles your program. The fusion process merges the array filling loops defined in the Repa library, with the "worker" functions that you write in your own module. If the fusion process fails, then the resulting program will be much slower than it needs to be, often 10x slower an equivalent program using plain Haskell lists. On the other hand, provided fusion works, the resulting code will run as fast as an equivalent cleanly written C program. Making fusion work is not hard once you understand what's going on.<br />
<br />
=== The <code>force</code> function has the loops ===<br />
<br />
Suppose we have the following binding: <br />
<br />
arr' = R.force $ R.map (\x -> x + 1) arr<br />
<br />
The right of this binding will compile down to code that first allocates the result array <code>arr'</code>, then iterates over the source array <code>arr</code>, reading each element in turn and adding one to it, then writing to the corresponding element in the result.<br />
<br />
Importantly, the code that does the allocation, iteration and update is defined as part of the <code>force</code> function. This forcing code has been written to break up the result into several chunks, and evaluate each chunk with a different thread. This is what makes your code run in parallel. If you do ''not'' use <code>force</code> then your code will be slow and ''not'' run in parallel.<br />
<br />
=== Delayed and Manifest arrays ===<br />
In the example from the previous section, think of the <code>R.map (\x -> x + 1) arr</code> expression as a ''specification'' for a new array. In the library, this specification is referred to as a ''delayed'' array. A delayed array is represented as a function that takes an array index, and computes the value of the element at that index.<br />
<br />
Applying <code>force</code> to a delayed array causes all elements to be computed in parallel. The result of a <code>force</code> is referred to as a ''manifest'' array. A manifest array is a "real" array represented as a flat chunk of memory containing array elements.<br />
<br />
All Repa array operators will accept both delayed and manifest arrays. However, if you index into a delayed array without forcing it first, then each indexing operation costs a function call. It also ''recomputes'' the value of the array element at that index.<br />
<br />
=== Shells and Springs ===<br />
Here is another way to think about Repa's approach to array fusion. Suppose we write the following binding:<br />
<br />
arr' = R.force $ R.map (\x -> x * 2) $ R.map (\x -> x + 1) arr<br />
<br />
Remember from the previous section, that the result of each of the applications of <code>R.map</code> is a delayed array. A delayed array is not a "real", manifest array, it's just a shell that contains a function to compute each element. In this example, the two worker functions correspond to the lambda expressions applied to <code>R.map</code>.<br />
<br />
When GHC compiles this example, the two worker functions are fused into a fresh unfolding of the parallel loop defined in the code for <code>R.force</code>. Imagine holding <code>R.force</code> in your left hand, and squashing the calls to <code>R.map</code> into it, like a spring. Doing this breaks all the shells, and you end up with the worker functions fused into an unfolding of <code>R.force</code>.<br />
<br />
=== INLINE worker functions ===<br />
Consider the following example:<br />
<br />
f x = x + 1<br />
arr' = R.force $ R.zipWith (*) (R.map f arr1) (R.map f arr2)<br />
<br />
During compilation, we need GHC to fuse our worker functions into a fresh unfolding of <code>R.force</code>. In this example, fusion includes inlining the definition of <code>f</code>. If <code>f</code> is ''not'' inlined, then the performance of the compiled code will be atrocious. It will perform a function call for each application of <code>f</code>, where it really only needs a single machine instruction to increment the <code>x</code> value.<br />
<br />
Now, in general, GHC tries to avoid producing binaries that are "too big". Part of this is a heuristic that controls exactly what functions are inlined. The heuristic says that a function may be inlined only if it is used once, or if its definition is less than some particular size. If neither of these apply, then the function won't be inlined, killing performance. <br />
<br />
For Repa programs, as fusion and inlining has such a dramatic effect on performance, we should ''absolutely not'' rely on heuristics to control whether or not this inlining takes place. If we rely on a heuristic, then even if our program runs fast today, if this heuristic is ever altered then some functions that used to be inlined may no longer be.<br />
<br />
The moral of the story is to attach INLINE pragmas to ''all'' of your client functions that compute array values. This ensures that these critical functions will be inlined now, and forever.<br />
<br />
{-# INLINE f #-}<br />
f x = x + 1<br />
<br />
arr' = R.force $ R.zipWith (*) (R.map f arr1) (R.map f arr2)<br />
<br />
== Advanced techniques ==<br />
<br />
=== Repa's parallel programming model ===<br />
<br />
''Discussion about the gang threads and hooks to help''<br />
<br />
=== Programming with stencils ===<br />
<br />
''Discuss the stencil types model''<br />
<br />
<br />
[[Category:Libraries]]<br />
[[Category:Packages]]<br />
[[Category:Performance]]<br />
[[Category:Tutorials]]<br />
[[Category:Parallel]]</div>Sacheie