Difference between revisions of "Parallelism"

From HaskellWiki
m (reformat Simon Marlow's text)

Revision as of 13:21, 20 April 2011

Parallel Programming in Haskell

Parallelism is about speeding up a program by using multiple processors.

In Haskell we provide two ways to achieve parallelism:

  • Concurrency, which can be used for parallelising IO.
  • Pure parallelism, which can be used to speed up pure (non-IO) parts of the program.

Concurrency (Control.Concurrent): Multiple threads of control that execute "at the same time".

  • Threads are in the IO monad
  • IO operations from multiple threads are interleaved non-deterministically
  • Communication between threads must be explicitly programmed
  • Threads may execute on multiple processors simultaneously
  • Dangers: race conditions and deadlocks
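
The points above can be illustrated with a minimal sketch, assuming forkIO from Control.Concurrent and MVar as the explicit communication channel. The names worker and sumInTwoThreads are hypothetical and chosen only for this example.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)

-- Hypothetical worker: sums a range and sends the result back explicitly.
-- ($!) forces the sum inside the worker thread, not in the receiver.
worker :: MVar Int -> Int -> Int -> IO ()
worker box lo hi = putMVar box $! sum [lo .. hi]

-- Fork two threads in the IO monad and collect both results.
sumInTwoThreads :: Int -> IO Int
sumInTwoThreads n = do
  let mid = n `div` 2
  boxA <- newEmptyMVar
  boxB <- newEmptyMVar
  _ <- forkIO (worker boxA 1 mid)
  _ <- forkIO (worker boxB (mid + 1) n)
  a <- takeMVar boxA   -- blocks until the first worker has delivered
  b <- takeMVar boxB   -- blocks until the second worker has delivered
  return (a + b)

main :: IO ()
main = sumInTwoThreads 1000 >>= print   -- prints 500500
```

The order in which the two workers finish is not fixed, but because each result travels through its own MVar, the final sum does not depend on that order. Sharing a single mutable variable between the workers instead is the kind of design where race conditions become possible.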

Pure Parallelism (Control.Parallel): Speeding up a pure computation using multiple processors. Pure parallelism has these advantages:

  • Guaranteed deterministic (same result every time)
  • No race conditions or deadlocks

Rule of thumb: use Pure Parallelism if you can, Concurrency otherwise.

Starting points

  • Control.Parallel. A good first step in parallel programming in Haskell is to use par and pseq from the parallel library.
  • Nested Data Parallelism. For an approach to exploiting the implicit parallelism in array programs for multiprocessors, see Data Parallel Haskell (work in progress).
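
As a rough illustration of the par/pseq starting point, here is a minimal sketch that assumes the parallel package is installed. The functions fib and parFib and the cutoff of 20 are hypothetical choices for this example, not part of the library.

```haskell
import Control.Parallel (par, pseq)

-- Naive Fibonacci, used only as a deliberately expensive pure function.
fib :: Int -> Integer
fib n
  | n < 2     = toInteger n
  | otherwise = fib (n - 1) + fib (n - 2)

-- Spark the first recursive call, evaluate the second in the current
-- thread, then combine the two results.
parFib :: Int -> Integer
parFib n
  | n < 20    = fib n   -- below this (arbitrary) cutoff, stay sequential
  | otherwise = x `par` (y `pseq` (x + y))
  where
    x = parFib (n - 1)
    y = parFib (n - 2)

main :: IO ()
main = print (parFib 30)   -- prints 832040
```

The result is the same whether or not the spark for x is actually picked up by another processor; par only hints that x may be evaluated in parallel. The cutoff keeps very small subproblems from being sparked, since a spark that does less work than it costs to create gives no speedup.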

Multicore GHC

GHC has supported running programs in parallel on SMP and multi-core machines since 2004. How to do it:

  • Compile your program using the -threaded switch.
  • Run the program with +RTS -N2 to use 2 threads, for example (RTS stands for runtime system; see the GHC users' guide). Use a -N value equal to the number of CPU cores on your machine (not counting Hyper-threading cores). As of GHC v6.12, you can omit the number and all available cores will be used; you still need to pass the flag itself, like so: +RTS -N.
  • Concurrent threads (forkIO) will run in parallel, and you can also use the par combinator and Strategies from the Control.Parallel.Strategies module to create parallelism.
  • Use +RTS -sstderr for timing stats.
  • To debug parallel program performance, use ThreadScope.
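
The steps above can be sketched with a small program that uses Strategies. This is only an illustration, assuming the parallel package is installed; countPrimes, countAll, and the file name Main.hs are hypothetical.

```haskell
import Control.Parallel.Strategies (parMap, rdeepseq)

-- Hypothetical expensive pure function: count the primes below n
-- by trial division.
countPrimes :: Int -> Int
countPrimes n = length [p | p <- [2 .. n - 1], isPrime p]
  where
    isPrime p = all (\d -> p `mod` d /= 0) (takeWhile (\d -> d * d <= p) [2 ..])

-- Apply countPrimes to every element, creating one spark per element
-- and evaluating each result fully.
countAll :: [Int] -> [Int]
countAll = parMap rdeepseq countPrimes

main :: IO ()
main = print (countAll [20000, 40000, 60000, 80000])

-- Build and run (Main.hs is an assumed file name):
--   ghc -O2 -threaded Main.hs
--   ./Main +RTS -N -sstderr
-- Assumption: newer GHC releases may also need -rtsopts at compile time
-- before they accept -N on the command line.
```

Comparing the elapsed time reported by -sstderr for +RTS -N1 and +RTS -N is a quick way to check whether the extra cores are being used before turning to ThreadScope.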

Alternative approaches

  • Nested data parallelism: a parallel programming model based on bulk data parallelism, in the form of the DPH and Repa libraries for transparently parallel arrays.
  • Intel Concurrent Collections for Haskell: a graph-oriented parallel programming model.

Related work