Parallel GHC Project

From HaskellWiki

Revision as of 15:39, 9 September 2011 by EricKow


1 Overview

The Parallel GHC Project is an MSR-funded project to push the real-world use of parallel Haskell. The aim is to demonstrate that parallel Haskell can be employed successfully in industrial projects.

In the last few years GHC has gained impressive support for parallel programming on commodity multi-core systems. In addition to traditional threads and shared variables, it supports pure parallelism, software transactional memory (STM), and data parallelism. With much of this research and development complete, the next stage is to get the technology into more widespread use.
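As a rough illustration of three of these styles (threads with shared variables, pure parallelism, and STM), here is a minimal sketch. It imports everything from `base`, using `GHC.Conc` so that no extra packages are needed; real code would more commonly import `Control.Parallel` from the parallel package and `Control.Concurrent.STM` from the stm package. The names `parSum`, `transfer` and `runTransfers` are hypothetical and not part of the project.

```haskell
module Main where

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (replicateM_)
import GHC.Conc
  ( TVar, atomically, newTVarIO, readTVar, readTVarIO, writeTVar, par, pseq )

-- Pure parallelism: spark the sum of the first half while this thread
-- evaluates the sum of the second half, then combine the two results.
parSum :: [Int] -> Int
parSum xs = left `par` (right `pseq` (left + right))
  where
    (as, bs) = splitAt (length xs `div` 2) xs
    left     = sum as
    right    = sum bs

-- STM: move an amount between two (hypothetical) balances atomically.
transfer :: TVar Int -> TVar Int -> Int -> IO ()
transfer from to amount = atomically $ do
  a <- readTVar from
  b <- readTVar to
  writeTVar from (a - amount)
  writeTVar to   (b + amount)

-- Threads and shared variables: fork n threads that each transfer one unit,
-- wait for all of them via an MVar, then read the final balances.
runTransfers :: Int -> IO (Int, Int)
runTransfers n = do
  from <- newTVarIO 1000
  to   <- newTVarIO 0
  done <- newEmptyMVar
  replicateM_ n $ forkIO (transfer from to 1 >> putMVar done ())
  replicateM_ n (takeMVar done)
  (,) <$> readTVarIO from <*> readTVarIO to

main :: IO ()
main = do
  print (parSum [1 .. 100000])
  runTransfers 100 >>= print
```

To use more than one core, a program like this is compiled with `-threaded` and run with `+RTS -N`; without those options it still runs correctly, just on a single core.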

This project aims to do the engineering work to solve whatever remaining practical problems are blocking organisations from making serious use of parallelism with GHC. The driving force is the applications rather than the technology.

The project involves a partnership with four groups from commercial and scientific organisations. Over the course of two years these groups are applying parallel Haskell in their specific domains. They are being supported by GHC HQ and Well-Typed, who are providing advice on Haskell tools and techniques and applying engineering effort to resolve any issues that are hindering these groups' progress.

The project is being coordinated by Well-Typed and they are providing the bulk of the support and engineering effort. The project started in the summer of 2010.

2 Project News

The Parallel GHC project is expanding! Our recent search for a new project partner has been successful. In fact, we will be welcoming two new partners to the project: the research and development group of Spanish telecoms company Telefónica, and VETT, a UK-based payment processing company. We are excited to be working with the teams at Telefónica I+D and VETT.

Meanwhile, we have completed a pure Haskell implementation of the "Modified Additive Lagged Fibonacci" random number generator. This generator is attractive for use in Monte Carlo simulations because it is splittable and has good statistical quality, while providing high performance. The LFG implementation has undergone statistical randomness tests to verify quality equivalent to the LFG implementation from the SPRNG random number generator collection. It will soon be available on Hackage.
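To show why splittability matters for Monte Carlo work, here is a minimal sketch of a splittable generator feeding independent streams to separate tasks. It is not the Modified Additive Lagged Fibonacci generator described above: it is a toy SplitMix-style generator written only for this illustration, with no claim about its statistical quality. The names `Gen`, `mkGen`, `nextWord`, `splitGen`, `streams` and `estimatePi` are all hypothetical.

```haskell
module Main where

import Data.Bits (shiftR, xor, (.|.))
import Data.Word (Word64)

-- Toy generator state: a counter and an odd stride (assumption: SplitMix-style).
data Gen = Gen Word64 Word64

-- Bit mixer used to scramble the counter into an output word.
mix64 :: Word64 -> Word64
mix64 z0 = z2 `xor` (z2 `shiftR` 33)
  where
    z1 = (z0 `xor` (z0 `shiftR` 33)) * 0xff51afd7ed558ccd
    z2 = (z1 `xor` (z1 `shiftR` 33)) * 0xc4ceb9fe1a85ec53

mkGen :: Word64 -> Gen
mkGen seed = Gen (mix64 seed) 0x9e3779b97f4a7c15

-- Advance the counter by the stride and return one mixed 64-bit word.
nextWord :: Gen -> (Word64, Gen)
nextWord (Gen s g) = (mix64 s', Gen s' g)
  where s' = s + g

-- Split into two generators; the second gets a fresh state and an odd stride.
splitGen :: Gen -> (Gen, Gen)
splitGen (Gen s g) = (Gen s'' g, Gen (mix64 s') (mix64 s'' .|. 1))
  where
    s'  = s + g
    s'' = s' + g

-- An unbounded list of generators obtained by repeated splitting.
streams :: Gen -> [Gen]
streams g = let (a, b) = splitGen g in a : streams b

-- A Double in [0, 1) built from the top 53 bits of one word.
uniform01 :: Gen -> (Double, Gen)
uniform01 g = (fromIntegral (w `shiftR` 11) / 9007199254740992, g')
  where (w, g') = nextWord g

-- Count how many of n random points in the unit square land in the quarter circle.
hits :: Int -> Gen -> Int
hits n = go n 0
  where
    go 0 acc _ = acc
    go k acc g =
      let (x, g1) = uniform01 g
          (y, g2) = uniform01 g1
          acc'    = if x * x + y * y <= 1 then acc + 1 else acc
      in acc' `seq` go (k - 1) acc' g2

-- Each task gets its own generator, so tasks need no shared random state.
estimatePi :: Int -> Int -> Word64 -> Double
estimatePi tasks perTask seed =
  4 * fromIntegral total / fromIntegral (tasks * perTask)
  where
    gens  = take tasks (streams (mkGen seed))
    total = sum (map (hits perTask) gens)

main :: IO ()
main = print (estimatePi 8 100000 2011)
```

Because each task owns its generator, the `map` over `gens` could be evaluated in parallel without changing the result for a given seed, which is the property that makes splittable generators convenient for reproducible parallel simulations.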

We have recently released

  • the gtk2hs library (0.12.1, pre-release), adding GHC 7.2 compatibility,
  • the ghc-events library, adding support for custom events,
  • and ThreadScope (0.2.0), with a new spark profiling visualisation, bookmarks, and other enhancements

Also coming up are a couple of Monad Reader articles featuring work from the Parallel GHC project.

Kazu Yamamoto has been writing an article for the upcoming Monad Reader special edition on parallelism and concurrency. He'll be revisiting Mighttpd (pronounced "Mighty"), a high-performance web server written in Haskell in late 2009. Since Mighttpd came out, the Haskell web programming landscape has evolved considerably, with the release of GHC 7 and the development of several web application frameworks. The new Mighttpd takes advantage of GHC 7's new IO manager as well as the Yesod Web Application Interface (WAI) and the HTTP engine Warp.

Bernie Pope and Dmitry Astapov have been working on an article about the Haskell MPI binding developed in the context of this project. You can find out more about the binding on its Hackage and GitHub pages.

For the longer term, work is underway to extend ThreadScope and the GHC EventLog. The aim is to support profiling of multi-process or distributed Haskell systems such as client/server or MPI programs. This involves incorporating some changes made in related projects (Eden, EdenTV). This work may have some benefits even for profiling single-process programs. It should also allow comparative profiling, where multiple runs of the same program (e.g. with different inputs or slightly different code) are viewed on the same timeline.

3 Project artefacts

In addition to helping the participating organisations, the project will, whenever possible, make improvements to libraries and tools that are useful to Haskell users more generally.

Project | Description | Status
--- | --- | ---
multiprocess Threadscope | profiling of multi-process or distributed Haskell systems such as client/server or MPI programs. | in progress
SPRNG port | Haskell implementation of some pseudo random number generators from the SPRNG library | in progress
Parallel Haskell Portal | one-stop resource oriented for users of parallelism and concurrency in Haskell | in progress
SPRNG binding | Haskell wrapper around SPRNG | in progress
Haskell-MPI | Haskell bindings to C MPI library | version 1.0 released 2010-12-09
GHC RTS improvements | #4449 - GHC 7 can't do IO when daemonized | fixed in 7.0.x branch
 | #4504 - "awaitSignal Nothing" does not block thread with -threaded | fixed in 7.0.2
 | #4512 - EventLog does not play well with forkProcess | fixed in 7.0.x branch
 | #4514 - IO manager can deadlock if a file descriptor is closed behind its back | fixed in 7.0.x branch
 | #4854 - Validating on a PPC Mac OS X: Fix miscellaneous errors and warnings | fixed in 7.0.x branch
c2hs improvements | marshalling functions now can have arguments supplied to them. | version 0.16.3 released 2011-03-24

4 Getting involved

Progress reports will be posted to the parallel Haskell mailing list and to the Well-Typed blog.

The best starting point to get involved is to join the mailing list. Note that the list is for parallel Haskell generally, not just the Parallel GHC Project.

5 Participating organisations

Cloudy Bayes: Hierarchical Bayesian modeling in Haskell
The Cloudy Bayes project aims to develop a fast Bayesian model fitter that takes advantage of modern multiprocessor machines. It will support model descriptions in the BUGS model description language (WinBUGS, OpenBUGS, and JAGS). It will be implemented as an embedded domain specific language (EDSL) within Haskell. A wide range of hierarchical Bayesian model structures will be possible, including many of the models used in medical, ecological, and biological sciences.
Cloudy Bayes will provide an easy-to-use interface for describing models, running Markov chain Monte Carlo (MCMC) fitters, diagnosing performance and convergence criteria as it runs, and collecting output for post-processing. Haskell's strong type system will be used to ensure that model descriptions make sense, providing a fast, safe development cycle.
IIJ Innovation Institute Inc.
Haskell is suitable for many kinds of domain, and GHC's support for lightweight threads makes it attractive for concurrent applications. An exception has been network server programming, because GHC 6.12 and earlier have an IO manager that is limited to 1024 network sockets. GHC 7 has a new IO manager implementation that removes this limitation.
This project will implement several network servers to demonstrate that Haskell is suitable for network servers that handle a massive number of concurrent connections.
Los Alamos National Laboratory
This project will use parallel Haskell to implement high-performance Monte Carlo algorithms, a class of algorithms which use randomness to sample large or otherwise intractable solution spaces. The initial goal is a particle-based MC algorithm suitable for modeling the flow of radiation, with application to problems in astrophysics. From this, the project is expected to move to identification of suitable abstractions for expressing a wider variety of Monte Carlo algorithms, and using models for different physical phenomena.
Willow Garage Inc.
Distributed Rigid Body Dynamics in ROS
Willow Garage seeks a high-level representation for a distributed rigid body dynamics simulation, capable of excellent parallel speedup on current and foreseeable hardware, yet linking to existing optimized libraries for low-level message passing and matrix math.
This project will drive API, performance, and profiling tool requirements for Haskell's interface to the Message Passing Interface (MPI) specification, an industry standard in High Performance Computing (HPC), as used on clusters of many nodes.
Competing internal initiatives use C++/MPI and CUDA directly.
Willow Garage aims to lay the groundwork for personal robotics applications in everyday life. ROS (Robot Operating System) is an open source, meta-operating system for your robot.
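
The IIJ Innovation Institute entry above relies on GHC's lightweight threads, where a server typically forks one thread per connection. The following minimal sketch shows only that threading pattern and opens no sockets: `handleClient` is a hypothetical stand-in for real connection handling, and `serveAll` is likewise an illustrative name, not code from any of the participating projects.

```haskell
module Main where

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
  ( MVar, modifyMVar_, newEmptyMVar, newMVar, putMVar, readMVar, takeMVar )
import Control.Monad (forM_, replicateM_)

-- Hypothetical stand-in for serving one client: bump a shared counter.
handleClient :: MVar Int -> Int -> IO ()
handleClient served _clientId = modifyMVar_ served (\k -> pure $! k + 1)

-- Fork one lightweight thread per simulated client, wait for all of them,
-- and return how many clients were served.
serveAll :: Int -> IO Int
serveAll n = do
  served <- newMVar 0
  done   <- newEmptyMVar
  forM_ [1 .. n] $ \i ->
    forkIO (handleClient served i >> putMVar done ())
  replicateM_ n (takeMVar done)
  readMVar served

main :: IO ()
main = serveAll 10000 >>= print
```

Forking ten thousand threads this way is cheap because they are scheduled by the GHC runtime rather than mapped one-to-one onto operating system threads; a real server would replace `handleClient` with code that reads from and writes to an accepted socket.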