Parallelism: Difference between revisions

From HaskellWiki
m (reformat Simon Marlow's text)
Revision as of 13:23, 20 April 2011

Parallel Programming in Haskell

Parallelism is about speeding up a program by using multiple processors.

In Haskell we provide two ways to achieve parallelism:

  • Concurrency, which can be used for parallelising IO.
  • Pure parallelism, which can be used to speed up pure (non-IO) parts of the program.

Concurrency (Control.Concurrent): Multiple threads of control that execute "at the same time".

  • Threads are in the IO monad
  • IO operations from multiple threads are interleaved non-deterministically
  • Communication between threads must be explicitly programmed
  • Threads may execute on multiple processors simultaneously
  • Dangers: race conditions and deadlocks
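
The points above can be illustrated with a minimal sketch using forkIO and MVar from the base library. The helper name sumConcurrently and the input lists are made up for this example; it is only one way of wiring up explicit communication between threads.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- | Hypothetical helper: sum two lists in separate threads and
-- add the two results. Each thread reports back through its own
-- MVar, so the communication is programmed explicitly.
sumConcurrently :: [Int] -> [Int] -> IO Int
sumConcurrently xs ys = do
  boxA <- newEmptyMVar
  boxB <- newEmptyMVar
  -- ($!) forces each sum inside the forked thread, so the work
  -- is not handed back to the main thread as an unevaluated thunk.
  _ <- forkIO $ putMVar boxA $! sum xs
  _ <- forkIO $ putMVar boxB $! sum ys
  -- takeMVar blocks until the corresponding thread has finished.
  a <- takeMVar boxA
  b <- takeMVar boxB
  return (a + b)

main :: IO ()
main = do
  total <- sumConcurrently [1 .. 1000] [1001 .. 2000]
  print total
```

Both threads live in the IO monad, and the order in which they finish is not fixed; the result is still well defined here only because each thread writes to its own MVar.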

Pure Parallelism (Control.Parallel): Speeding up a pure computation using multiple processors. Pure parallelism has these advantages:

  • Guaranteed deterministic (same result every time)
  • No race conditions or deadlocks

Rule of thumb: use Pure Parallelism if you can, Concurrency otherwise.

Starting points

  • Control.Parallel. The place to start with parallel programming in Haskell is par/pseq from the parallel library.
  • If you need more control, try Strategies or perhaps the Par monad.
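
As a rough sketch of these starting points, the following shows par/pseq next to a Strategies-based variant. It assumes the parallel package is installed; the names fib, parSum and parFibs are invented for this example, and the naive fib merely stands in for an expensive pure computation.

```haskell
import Control.Parallel (par, pseq)
import Control.Parallel.Strategies (parMap, rseq)

-- | Deliberately naive Fibonacci, used only as a pure workload.
fib :: Int -> Integer
fib n
  | n < 2     = toInteger n
  | otherwise = fib (n - 1) + fib (n - 2)

-- | par/pseq: spark the evaluation of x so it may run on another
-- core, evaluate y on the current thread, then combine the two.
parSum :: Int -> Int -> Integer
parSum m n = x `par` (y `pseq` (x + y))
  where
    x = fib m
    y = fib n

-- | Strategies: map fib over a list, sparking each element and
-- evaluating it to weak head normal form (rseq).
parFibs :: [Int] -> [Integer]
parFibs = parMap rseq fib

main :: IO ()
main = do
  print (parSum 20 21)
  print (parFibs [10, 15, 20])
```

Because both functions are pure, they return the same values whether or not the sparks are actually run in parallel; par is only a hint to the runtime system.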

Multicore GHC

Since 2004, GHC has supported running programs in parallel on an SMP or multi-core machine. How to do it:

  • Compile your program using the -threaded switch.
  • Run the program with +RTS -N2 to use two threads, for example (RTS stands for runtime system; see the GHC users' guide). Set -N to the number of CPU cores on your machine (not counting Hyper-threading cores). As of GHC 6.12, you can omit the number and all available cores will be used; you still need to pass the flag itself, like so: +RTS -N.
  • Concurrent threads (forkIO) will run in parallel, and you can also use the par combinator and Strategies from the Control.Parallel.Strategies module to create parallelism.
  • Use +RTS -sstderr for timing stats.
  • To debug parallel program performance, use ThreadScope.
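
To check that the flags above took effect, a small program can ask the runtime how many capabilities (cores available to Haskell threads) it was started with. This is only a sketch: the file name Caps.hs is made up, and the -rtsopts flag in the build line is an addition not mentioned above, included on the assumption that a newer GHC (7.0 or later) needs it before it accepts -N on the command line.

```haskell
import Control.Concurrent (getNumCapabilities)

-- Build (hypothetical file name):  ghc -threaded -rtsopts Caps.hs
-- Run on two cores:                ./Caps +RTS -N2
-- Run on all available cores:      ./Caps +RTS -N
main :: IO ()
main = do
  -- Reports the value given with -N, or 1 if -N was not passed.
  caps <- getNumCapabilities
  putStrLn ("Running on " ++ show caps ++ " capabilities")
```

If the program reports 1 capability even though -N was passed, it was most likely built without -threaded.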

Alternative approaches

  • Nested data parallelism: a parallel programming model based on bulk data parallelism, in the form of the DPH and Repa libraries for transparently parallel arrays.
  • Intel Concurrent Collections for Haskell: a graph-oriented parallel programming model.

Related work