Difference between revisions of "GHC/Data Parallel Haskell/Package NDP"

 

Revision as of 07:13, 20 March 2007

Speed with less convenience: package ndp

The following explains how to install and use package ndp, a concurrent, high-performance library of strict, segmented arrays. Here, strict means that when one element of an array is evaluated, all of them are; we also call this a parallel evaluation semantics. Segmented means that all operations come in two flavours: one for plain, flat arrays and one for segmented arrays, which are an optimised representation of arrays with one level of nesting. For example, a sum of a segmented array of numbers computes one sum for each segment. Package ndp has a purely functional interface, but internally uses monadic low-level array operations, array fusion, and many standard GHC optimisations to produce highly optimised code. The library uses GHC's SMP concurrency to transparently parallelise array processing on hardware with multiple cores and/or CPUs.
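
To make the flat versus segmented distinction concrete, here is a minimal sketch that models a segmented array with ordinary Haskell lists. It is an illustration only: the type Segmented and the functions fromNested, sumFlat and sumSegmented are hypothetical names made up for this example, not part of package ndp, and the sketch says nothing about how the library represents segments internally.

```haskell
-- Illustrative model only: plain lists, NOT the package ndp API.
-- A segmented array is modelled as segment lengths plus the flat data.
data Segmented a = Segmented
  { segLengths :: [Int]  -- length of each segment, in order
  , flatData   :: [a]    -- all elements, stored contiguously
  }

-- Build the segmented form from a list of lists (one level of nesting).
fromNested :: [[a]] -> Segmented a
fromNested xss = Segmented (map length xss) (concat xss)

-- Flat flavour: one result for the whole array.
sumFlat :: Num a => [a] -> a
sumFlat = sum

-- Segmented flavour: one result per segment.
sumSegmented :: Num a => Segmented a -> [a]
sumSegmented (Segmented lens xs) = go lens xs
  where
    go []       _  = []
    go (n : ns) ys = let (seg, rest) = splitAt n ys
                     in sum seg : go ns rest
```

Loaded into GHCi, sumSegmented (fromNested [[1,2,3],[],[4,5]]) evaluates to [6,0,9], one sum per segment, whereas sumFlat [1,2,3,4,5] evaluates to the single value 15.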

Installation

Package ndp is only available in source form. The simplest way to build it is to first get and build a source distribution of GHC (preferably the current development version); see the documentation on how to get the sources and how to build them. Then, in the source tree, do the following:

% cd libraries
% darcs get http://darcs.haskell.org/packages/ndp/
% cd ndp
% make boot
% make

If you don't have darcs, you can alternatively download the ndp-0.1 tarball. Now, the option -package ndp is available for use with the inplace compiler (i.e., compiler/ghc-inplace). Alternatively, you can install the package by invoking make install in the GHC source root and within libraries/ndp/. Then, the option -package ndp can be used with the installed compiler.

A small example

The following module implements the dot product of two vectors with package ndp:

module DotP_ndp (dotp)
where

import Data.Array.Parallel.Unlifted

dotp :: (Num a, UA a) => UArr a -> UArr a -> a
dotp xs ys = sumU (zipWithU (*) xs ys)

We can also use this module in an interactive GHCi session:

Prelude> :set -package ndp
package flags have changed, ressetting and loading new packages...
Loading package ndp-1.0 ... linking ... done.
Prelude> :l /home/chak/code/haskell/DotP_ndp
[1 of 1] Compiling DotP_ndp         ( /home/chak/code/haskell/DotP_ndp.hs, interpreted )
Ok, modules loaded: DotP_ndp.
*DotP_ndp> dotp (toU [1..3]) (toU [4..6])
Loading package haskell98 ... linking ... done.
32.0
*DotP_ndp>
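
As a point of comparison, the same computation can be sketched on ordinary Haskell lists, which is a convenient way to check results such as the 32.0 above by hand. The name dotpList is hypothetical and introduced only for this illustration; the sketch uses nothing from package ndp and gives none of its performance or parallelism.

```haskell
-- Reference sketch on ordinary lists; dotpList is a hypothetical name,
-- not part of package ndp.
-- Multiply the vectors elementwise, then add up the products.
-- Note: zipWith stops at the end of the shorter list.
dotpList :: Num a => [a] -> [a] -> a
dotpList xs ys = sum (zipWith (*) xs ys)
```

For the inputs used in the session, dotpList [1..3] [4..6] computes 1*4 + 2*5 + 3*6 = 32.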

Most of the functions under Data.Array.Parallel.Unlifted are still purely sequential, albeit much more efficient than GHC.PArr. In addition, the (currently few) functions from Data.Array.Parallel.Unlifted.Parallel transparently use multiple processing elements if GHC was compiled with SMP multiprocessor support.

A number of examples of using package ndp are in the examples directory.