GHC/Data Parallel Haskell/Package NDP

From HaskellWiki
Jump to navigation Jump to search

Speed with less convenience: package ndp

The following explains how to install and use the concurrent, high-performance library of strict, segmented arrays called package ndp. Here, strict means that when one element of an array is evaluated, all of them are - we also call this a parallel evaluation semantics. Moreover, segmented means that all operations come in two flavours: one for plain, flat arrays and one for segmented arrays, which are an optimised representation of arrays with one level of nesting. For example, a sum of a segmented array of numbers computes one sum for each segment. Package ndp has a purely functional interface, but internally uses monadic low-level array operations, array fusion, and a many standard GHC optimisations to produce highly optimised code. The library uses GHC's SMP concurrency to transparently parallelise array processing on hardware with multiple cores and/or CPUs.

Installation

Package ndp is only available in source form. The simplest way to build it is to first get and build a source distribution of GHC (preferably the current development version) - see the docu on how to get the sources and how to build them. Then, in the source tree, do the following

% cd libraries
% darcs get http://darcs.haskell.org/packages/ndp/
% cd ndp
% make boot
% make

If you don't have darcs, you can alternatively download the ndp-0.1 tar ball. Now, the option -package ndp is available for use with the inplace compiler (i.e., compiler/ghc-inplace). Alternatively, you can install it by invoking make install on the GHC source root and within libraries/ndp/. Then, the option -package ndp can be used in the installed compiler.

A small example

The following module implements the dot product of two vectors with package ndp:

module DotP_ndp (dotp)
where

import Data.Array.Parallel.Unlifted

dotp :: (Num a, UA a) => UArr a -> UArr a -> a
dotp xs ys = sumU (zipWithU (*) xs ys)

We can also use that in an interactive GHCi session:

Prelude> :set -package ndp
package flags have changed, ressetting and loading new packages...
Loading package ndp-1.0 ... linking ... done.
Prelude> :l /home/chak/code/haskell/DotP_ndp
[1 of 1] Compiling DotP_ndp         ( /home/chak/code/haskell/DotP_ndp.hs, interpreted )
Ok, modules loaded: DotP_ndp.
*DotP_ndp> dotp (toU [1..3]) (toU [4..6])
Loading package haskell98 ... linking ... done.
32.0
*DotP_ndp>

Most of the functions under Data.Array.Parallel.Unlifted are still purely sequential, albeit much more efficient than GHC.PArr. In addition, the (currently only few) functions from Data.Array.Parallel.Unlifted.Parallel transparently use multiple processing elements if GHC was compiled with SMP multiprocessor support.

A number of examples of using package ndp are in the examples directory.