''From HaskellWiki; latest revision as of 21:38, 3 May 2024''

'''Parallelism''' is about speeding up a program by using multiple processors.

In Haskell we provide two ways to achieve parallelism:
* [[Pure]] parallelism, which can be used to speed up non-IO parts of the program.
* [[Concurrency]], which can be used for parallelising IO.
   
Pure parallelism ([https://hackage.haskell.org/package/parallel/docs/Control-Parallel.html <code>Control.Parallel</code>]): Speeding up a pure computation using multiple processors. Pure parallelism has these advantages:
* Guaranteed deterministic (same result every time)
* No [http://en.wikipedia.org/wiki/Race_condition race conditions] or [http://en.wikipedia.org/wiki/Deadlock deadlocks]
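
These advantages can be seen in a minimal sketch (the <code>fib</code> and <code>parSum</code> names are illustrative, and the [https://hackage.haskell.org/package/parallel <code>parallel</code>] package is assumed to be installed): <code>par</code> sparks its first argument for evaluation on another core, <code>pseq</code> forces the second argument first so the spark gets a chance to run, and the result is the same however many cores execute it.

```haskell
import Control.Parallel (par, pseq)

-- A deliberately expensive pure function (illustrative; any pure
-- computation works here).
fib :: Integer -> Integer
fib n | n < 2     = n
      | otherwise = fib (n - 1) + fib (n - 2)

-- 'par' sparks evaluation of 'a' in parallel; 'pseq' evaluates 'b'
-- first on the current core, then the sum combines both results.
parSum :: Integer
parSum = a `par` (b `pseq` (a + b))
  where
    a = fib 24
    b = fib 25

main :: IO ()
main = print parSum   -- prints 121393, deterministically
```

Because the computation is pure, no race condition can change the answer; at worst an unconverted spark means the work happens sequentially.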
   
Concurrency ([https://hackage.haskell.org/package/base/docs/Control-Concurrent.html <code>Control.Concurrent</code>]): Multiple threads of control that execute "at the same time".
* Threads are in the IO monad
* IO operations from multiple threads are interleaved non-deterministically
* Communication between threads must be explicitly programmed
* Threads may execute on multiple processors simultaneously
* Dangers: [http://en.wikipedia.org/wiki/Race_condition race conditions] and [http://en.wikipedia.org/wiki/Deadlock deadlocks]
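
A minimal sketch of explicit communication between threads (the <code>talk</code> name is illustrative): one thread is forked with <code>forkIO</code> and hands a value back through an <code>MVar</code>, which is exactly the "explicitly programmed" communication the list above refers to.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- The child thread hands a message to the parent through an MVar;
-- nothing is shared implicitly, the hand-off is programmed explicitly.
talk :: IO String
talk = do
  box <- newEmptyMVar                               -- empty shared cell
  _ <- forkIO (putMVar box "hello from the child thread")
  takeMVar box                                      -- blocks until the child writes

main :: IO ()
main = talk >>= putStrLn
```

Here the <code>takeMVar</code> provides the synchronisation; with several threads writing to the same <code>MVar</code>, the order of the writes would be non-deterministic, which is where race conditions can creep in.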
   
Rule of thumb: use pure parallelism if you can, concurrency otherwise.
   
 
== Starting points ==

* [https://hackage.haskell.org/package/parallel/docs/Control-Parallel.html <code>Control.Parallel</code>]: The first thing to start with parallel programming in Haskell is the use of <code>par</code> and <code>pseq</code> from the [https://hackage.haskell.org/package/parallel <code>parallel</code>] library. Try the Real World Haskell [http://book.realworldhaskell.org/read/concurrent-and-multicore-programming.html chapter on parallelism and concurrency]. The parallelism-specific parts are in the second half of the chapter.
* If you need more control, try [https://hackage.haskell.org/package/parallel/docs/Control-Parallel-Strategies.html <code>Strategies</code>] or perhaps the monadic type [https://hackage.haskell.org/package/monad-par/docs/Control-Monad-Par.html <code>Par</code>].
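
As a small taste of the Strategies interface (assuming the <code>parallel</code> package; the <code>squares</code> name is illustrative), <code>parMap</code> applies a function to each list element in parallel, with <code>rdeepseq</code> forcing each result fully:

```haskell
import Control.Parallel.Strategies (parMap, rdeepseq)

-- Each element of the list may be squared on a different core;
-- rdeepseq is the strategy that fully evaluates each result.
squares :: [Int]
squares = parMap rdeepseq (\x -> x * x) [1 .. 10]

main :: IO ()
main = print squares   -- prints [1,4,9,16,25,36,49,64,81,100]
```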
   
 
== Multicore GHC ==
 
Since 2004, GHC supports running programs in parallel on an SMP or multi-core machine. How to do it:

* Compile your program using the <code>-threaded</code> switch.
* Run the program with <code>+RTS -N2</code> to use 2 threads, for example (RTS stands for runtime system; see the GHC users' guide). You should use a <code>-N</code> value equal to the number of CPU cores on your machine (not including Hyper-threading cores). As of GHC 6.12, you can leave off the number of cores and all available cores will be used (you still need to pass <code>-N</code> however, like so: <code>+RTS -N</code>).
* Concurrent threads (<code>forkIO</code>) will run in parallel, and you can also use the <code>par</code> combinator and Strategies from the [https://hackage.haskell.org/package/parallel/docs/Control-Parallel-Strategies.html <code>Control.Parallel.Strategies</code>] module to create parallelism.
* Use <code>+RTS -sstderr</code> for timing stats.
* To debug parallel program performance, use [[ThreadScope]].
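
A sketch of a complete program to try these flags on (the file name <code>Main.hs</code> and the <code>fib</code> arguments are illustrative); the compile and run commands are in the leading comments:

```haskell
-- Assuming GHC is installed, compile and run with:
--   ghc -O2 -threaded -rtsopts Main.hs
--   ./Main +RTS -N2 -s
import Control.Parallel (par, pseq)

fib :: Integer -> Integer
fib n | n < 2     = n
      | otherwise = fib (n - 1) + fib (n - 2)

main :: IO ()
main = do
  let a = fib 30
      b = fib 31
  -- With +RTS -N2 the spark for 'a' can run on the second core while
  -- the main thread evaluates 'b'; the +RTS -s summary will show the
  -- spark counts and elapsed vs. CPU time.
  print (a `par` (b `pseq` a + b))
```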

== Alternative approaches ==

* Nested data parallelism: a parallel programming model based on bulk data parallelism, in the form of the DPH and [https://hackage.haskell.org/package/repa Repa] libraries for transparently parallel arrays.
* [https://hackage.haskell.org/package/monad-par <code>monad-par</code>] and LVish provide <code>Par</code> monads that can structure parallel computations over "monotonic" data structures, which in turn can be used from within purely functional programs.
* [OLD] Intel Concurrent Collections for Haskell: a graph-oriented parallel programming model.
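
A tiny sketch of the <code>monad-par</code> style (assuming the <code>monad-par</code> package is installed; the <code>twoSquares</code> name is illustrative): <code>spawnP</code> forks pure computations into <code>IVar</code>s, <code>get</code> waits for them, and <code>runPar</code> returns a deterministic result.

```haskell
import Control.Monad.Par (runPar, spawnP, get)

-- 'runPar' is deterministic: the same value comes back regardless of
-- how the scheduler interleaves the two spawned computations.
twoSquares :: (Int, Int)
twoSquares = runPar $ do
  a <- spawnP (3 * 3)        -- fork a pure computation into an IVar
  b <- spawnP (4 * 4)
  (,) <$> get a <*> get b    -- wait for both results

main :: IO ()
main = print twoSquares      -- prints (9,16)
```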

== See also ==