# Difference between revisions of "GHC/Data Parallel Haskell"

(→Data Parallel Haskell) |
(→Data Parallel Haskell) |
||

Line 2: | Line 2: | ||

== Data Parallel Haskell == |
== Data Parallel Haskell == |
||

− | ''Data Parallel Haskell'' is the codename for an extension to the Glasgow Haskell Compiler and its libraries to support [http://www.cs.cmu.edu/~scandal/cacm/cacm2.html nested data parallelism] with a focus to utilise |
+ | ''Data Parallel Haskell'' is the codename for an extension to the Glasgow Haskell Compiler and its libraries to support [http://www.cs.cmu.edu/~scandal/cacm/cacm2.html nested data parallelism] with a focus to utilise multicore CPUs. Nested data parallelism extends the programming model of flat data parallelism, as known from parallel Fortran dialects, to irregular parallel computations (such as divide-and-conquer algorithms) and irregular data structures (such as sparse matrices and tree structures). An introduction to nested data parallelism in Haskell, including some examples, can be found in the paper [http://www.cse.unsw.edu.au/~chak/papers/papers.html#ndp-haskell Nepal – Nested Data-Parallelism in Haskell]. |

=== Project status === |
=== Project status === |
||

A first ''technology preview'' of Data Parallel Haskell (DPH) is included in the 6.10.1 release of GHC. All major components of DPH are implemented, including code vectorisation and parallel execution on multicore systems. However, the implementation has many limitations and probably also many bugs. Major limitations include the inability to mix vectorised and non-vectorised code in a single Haskell module, the need to use a feature-deprived, special-purpose Prelude in vectorised code, and a lack of optimisations (leading to poor performance). |
A first ''technology preview'' of Data Parallel Haskell (DPH) is included in the 6.10.1 release of GHC. All major components of DPH are implemented, including code vectorisation and parallel execution on multicore systems. However, the implementation has many limitations and probably also many bugs. Major limitations include the inability to mix vectorised and non-vectorised code in a single Haskell module, the need to use a feature-deprived, special-purpose Prelude in vectorised code, and a lack of optimisations (leading to poor performance). |
||

+ | |||

+ | The purpose of this technology preview is twofold. Firstly, it gives interested early adopters the opportunity to see where the project is headed and enables them to experiment with simple DPH programs. Secondly, we hope to get user feedback that helps us to guide the project and prioritise those features that our users are most interested in. |
||

+ | |||

⚫ | |||

+ | |||

+ | === Where to get it === |
||

+ | |||

+ | Currently, we recommend to use the implementation in GHC 6.10.1. It is [http://haskell.org/ghc/download_ghc_6_10_1.html available in source and binary form] for many architectures. (Please use the version in the HEAD repository of GHC only if you are a GHC developer or a very experienced GHC user and if you know the current status of the DPH code – intermediate versions may well be broken while we implement major changes.) |
||

=== Overview === |
=== Overview === |
||

⚫ | The |
||

+ | From a user's point of view, Data Parallel Haskell adds a new data type to Haskell –namely, ''parallel arrays''– as well as operations on parallel arrays. Syntactically, parallel arrays are like lists, only that instead of square brackets <hask>[</hask> and <hask>]</hask>, parallel arrays use square brackets with a colon <hask>[:</hask> and <hask>:]</hask>. In particular, <hask>[:e:]</hask> is the type of parallel arrays with elements of type <hask>e</hask>; the expression <hask>[:x, y, z:]</hask> denotes a three element parallel array with elements <hask>x</hask>, <hask>y</hask>, and <hask>z</hask>; and <hask>[:x + 1 | x <- xs:]</hask> represents a simple array comprehension. More sophisticated array comprehensions (including the equivalent of [http://www.haskell.org/ghc/docs/latest/html/users_guide/syntax-extns.html#parallel-list-comprehensions parallel list comprehensions]) as well as enumerations and pattern matching proceed in an analog manner. Moreover, the array library of DPH defines analogs of most list operations from the Haskell prelude and the standard <hask>List</hask> library (e.g., we have <hask>lengthP</hask>, <hask>sumP</hask>, <hask>mapP</hask>, and so on). |
||

+ | |||

+ | The two main differences between lists and parallel arrays are that (1) parallel arrays are a strict data structure and (2) that they are not inductively defined. Parallel arrays are strict in that by demanding a single element, all elements of an array are demanded. Hence, all elements of a parallel array might be evaluated in parallel. To facilitate such parallel evaluation, operations on parallel arrays should treat arrays as aggregate structures that are manipulated in their entirety (instead of the inductive, element-wise processing that is the foundation of all Haskell list functions.) |
||

+ | |||

+ | As a consequence, parallel arrays are always finite, and standard functions that yield infinite lists, such as <hask>enumFrom</hask> and <hask>repeat</hask>, have no corresponding array operation. Moreover, parallel arrays only have an undirected fold function <hask>foldP</hask> that requires an associative function as an argument – such a fold function has a parallel step complexity of O(log ''n'') for arrays of length ''n''. Parallel arrays also come with some aggregate operations that absent from the standard list library, such as <hask>permuteP</hask>. |
||

+ | |||

⚫ | |||

− | Data Parallel Haskell essentially extends GHC in two ways: |
||

− | # by a concurrent, high-performance library of strict, segmented arrays, which we call '''package ndp''' and |
||

− | # by syntactic sugar for nested parallel arrays and array comprehensions, including a program transformation, called '''vectorisation''', which maps nested array parallelism to the API provided by package ndp. |
||

− | Both package ndp and the syntactic sugar are already usable, but as we have yet to implement vectorisation, they currently cannot be used together. To get an idea of code using array comprehensions, consider the following definition of the dot product of two vectors: |
||

<haskell> |
<haskell> |
||

dotp :: Num a => [:a:] -> [:a:] -> a |
dotp :: Num a => [:a:] -> [:a:] -> a |
||

Line 22: | Line 32: | ||

This code uses an array variant of [http://www.haskell.org/ghc/docs/latest/html/users_guide/syntax-extns.html#parallel-list-comprehensions parallel list comprehensions], but should otherwise be self-explanatory to any Haskell programmer. |
This code uses an array variant of [http://www.haskell.org/ghc/docs/latest/html/users_guide/syntax-extns.html#parallel-list-comprehensions parallel list comprehensions], but should otherwise be self-explanatory to any Haskell programmer. |
||

− | In contrast, package ndp requires us to manually desugar the function definition to |
||

+ | === Designing parallel programs === |
||

− | <haskell> |
||

− | dotp :: (Num a, UA a) => UArr a -> UArr a -> a |
||

− | dotp xs ys = sumU (zipWithU (*) xs ys) |
||

− | </haskell> |
||

− | This is still pretty straight forward, combinator-based code. However, desugaring becomes more burdensome when algorithms use nested arrays. |
||

− | Until we have added vectorisation to GHC, we recommend to prototype parallel algorithms using nested arrays and array comprehensions, and then, to manually desugar them to use package ndp for performance. For more examples of programs using nested arrays and array comprehensions, refer to [[GHC/Data Parallel Haskell/GHC.PArr|convenience without the speed]], and for more examples of programs using package ndp, refer to [[GHC/Data Parallel Haskell/Package NDP|speed without the convenience]]. |
||

+ | === Some background === |
||

⚫ | |||

⚫ | DPH has two major components: (1) the ''vectorisation transformation'' and (2) a ''generic library for flat parallel arrays''. The vectorisation transformation turns nested into flat data-parallelism and is described in detail in the paper [http://www.cse.unsw.edu.au/~chak/papers/PLKC08.html Harnessing the Multicores: Nested Data Parallelism in Haskell.] The generic array library maps flat data-parallelism to GHC's multi-threaded multicore support and is described in the paper [http://www.cse.unsw.edu.au/~chak/papers/CLPKM06.html Data Parallel Haskell: a status report]. The same topics are also covered in the slides for the two talks [http://research.microsoft.com/~simonpj/papers/ndp/NdpSlides.pdf Nested data parallelism in Haskell] and [http://dataparallel.googlegroups.com/web/UNSW%20CGO%20DP%202007.pdf Compiling nested data parallelism by program transformation]. |
||

− | |||

− | '''Important Note:''' Data Parallel Haskell is currently being completely restructured. Consequently, it may or may not properly build in the current development version, and it most certainly will not generate efficient code. We are working on getting it into a reasonable shape again for the GHC 6.10 release (at which point we will update the documentation to explain the current feature set). |
||

− | |||

− | As we are still missing some parts in the implementation of Data Parallel Haskell (most importantly the vectorisation transformation), you need to choose between the following two options: |
||

− | |||

− | ; '''Convenience without the speed''' : Nice surface syntax for parallel arrays and array comprehensions (very similar to that for Haskell lists) is available in all GHC releases of the 6.8 series. To use parallel arrays, specify the option <code>-XPArr</code> and import <code>GHC.PArr</code>. This option is nice to implement and test parallel algorithms, but the code will be executed purely sequentially with limited performance. Further details including example programs are at [[Data_Parallel_Haskell/GHC.PArr|convenience without the speed]]. |
||

− | ; '''Speed without the convenience''' : Parallel high-performance arrays that use aggressive [[fusion]] and transparently utilise multicore and SMP architectures are available in the library package <code>ndp</code>. In contrast to the previous option, there is no special syntactic sugar for array types and array comprehensions, and only flat and segmented arrays of basic types and pairs are available. The library is only provided in source form and needs to be compiled in a GHC build tree. However, this is pretty easy to do. The details including example programs are at [[Data_Parallel_Haskell/PackageNDP|speed without the convenience]]. |
||

− | |||

− | We are currently working to integrate these two components to finally provide convenience and speed in one. |
||

− | |||

⚫ | |||

=== Further information === |
=== Further information === |
||

− | For further reading, refer to this [[GHC/Data Parallel Haskell/References|collection of background papers, and pointers to other people's work]]. If you are really curious and like to know implementation details and the internals of the Data Parallel Haskell project, it is all on the GHC developer wiki on the pages covering [http://hackage.haskell.org/trac/ghc/wiki/DataParallel data parallelism] and [http://hackage.haskell.org/trac/ghc/wiki/TypeFunctions |
+ | For further reading, refer to this [[GHC/Data Parallel Haskell/References|collection of background papers, and pointers to other people's work]]. If you are really curious and like to know implementation details and the internals of the Data Parallel Haskell project, it is all on the GHC developer wiki on the pages covering [http://hackage.haskell.org/trac/ghc/wiki/DataParallel data parallelism] and [http://hackage.haskell.org/trac/ghc/wiki/TypeFunctions type families]. |

## Revision as of 05:21, 2 December 2008

## Contents

## Data Parallel Haskell

*Data Parallel Haskell* is the codename for an extension to the Glasgow Haskell Compiler and its libraries to support nested data parallelism with a focus to utilise multicore CPUs. Nested data parallelism extends the programming model of flat data parallelism, as known from parallel Fortran dialects, to irregular parallel computations (such as divide-and-conquer algorithms) and irregular data structures (such as sparse matrices and tree structures). An introduction to nested data parallelism in Haskell, including some examples, can be found in the paper Nepal – Nested Data-Parallelism in Haskell.

### Project status

A first *technology preview* of Data Parallel Haskell (DPH) is included in the 6.10.1 release of GHC. All major components of DPH are implemented, including code vectorisation and parallel execution on multicore systems. However, the implementation has many limitations and probably also many bugs. Major limitations include the inability to mix vectorised and non-vectorised code in a single Haskell module, the need to use a feature-deprived, special-purpose Prelude in vectorised code, and a lack of optimisations (leading to poor performance).

The purpose of this technology preview is twofold. Firstly, it gives interested early adopters the opportunity to see where the project is headed and enables them to experiment with simple DPH programs. Secondly, we hope to get user feedback that helps us to guide the project and prioritise those features that our users are most interested in.

**Disclaimer:** Data Parallel Haskell is very much **work in progress.** Some components are already usable, and we explain here how to use them. However, please be aware that APIs are still in flux and functionality may change during development.

### Where to get it

Currently, we recommend to use the implementation in GHC 6.10.1. It is available in source and binary form for many architectures. (Please use the version in the HEAD repository of GHC only if you are a GHC developer or a very experienced GHC user and if you know the current status of the DPH code – intermediate versions may well be broken while we implement major changes.)

### Overview

From a user's point of view, Data Parallel Haskell adds a new data type to Haskell –namely, *parallel arrays*– as well as operations on parallel arrays. Syntactically, parallel arrays are like lists, only that instead of square brackets `[`

and `]`

, parallel arrays use square brackets with a colon `[:`

and `:]`

. In particular, `[:e:]`

is the type of parallel arrays with elements of type `e`

; the expression `[:x, y, z:]`

denotes a three element parallel array with elements `x`

, `y`

, and `z`

; and `[:x + 1 | x <- xs:]`

represents a simple array comprehension. More sophisticated array comprehensions (including the equivalent of parallel list comprehensions) as well as enumerations and pattern matching proceed in an analog manner. Moreover, the array library of DPH defines analogs of most list operations from the Haskell prelude and the standard `List`

library (e.g., we have `lengthP`

, `sumP`

, `mapP`

, and so on).

The two main differences between lists and parallel arrays are that (1) parallel arrays are a strict data structure and (2) that they are not inductively defined. Parallel arrays are strict in that by demanding a single element, all elements of an array are demanded. Hence, all elements of a parallel array might be evaluated in parallel. To facilitate such parallel evaluation, operations on parallel arrays should treat arrays as aggregate structures that are manipulated in their entirety (instead of the inductive, element-wise processing that is the foundation of all Haskell list functions.)

As a consequence, parallel arrays are always finite, and standard functions that yield infinite lists, such as `enumFrom`

and `repeat`

, have no corresponding array operation. Moreover, parallel arrays only have an undirected fold function `foldP`

that requires an associative function as an argument – such a fold function has a parallel step complexity of O(log *n*) for arrays of length *n*. Parallel arrays also come with some aggregate operations that absent from the standard list library, such as `permuteP`

.

### An example

```
dotp :: Num a => [:a:] -> [:a:] -> a
dotp xs ys = sumP [:x * y | x <- xs | y <- ys:]
```

This code uses an array variant of parallel list comprehensions, but should otherwise be self-explanatory to any Haskell programmer.

### Designing parallel programs

### Some background

DPH has two major components: (1) the *vectorisation transformation* and (2) a *generic library for flat parallel arrays*. The vectorisation transformation turns nested into flat data-parallelism and is described in detail in the paper Harnessing the Multicores: Nested Data Parallelism in Haskell. The generic array library maps flat data-parallelism to GHC's multi-threaded multicore support and is described in the paper Data Parallel Haskell: a status report. The same topics are also covered in the slides for the two talks Nested data parallelism in Haskell and Compiling nested data parallelism by program transformation.

### Further information

For further reading, refer to this collection of background papers, and pointers to other people's work. If you are really curious and like to know implementation details and the internals of the Data Parallel Haskell project, it is all on the GHC developer wiki on the pages covering data parallelism and type families.