HaskellImplementorsWorkshop/2014

The Haskell implementors' workshop is a forum for those involved in implementing Haskell systems, infrastructure, libraries and tools, generally for people involved in implementing Haskell technology, to share their work and discuss future directions and collaborations with others.

In 2014, the Haskell Implementors Workshop will be co-located with ICFP 2014 in Gothenburg.

The workshop does not have proceedings. Talks and/or demos are proposed by submitting an abstract, and selected by a small program committee. The workshop will be informal and interactive, with a flexible timetable and plenty of room for ad-hoc discussion, demos, and impromptu short talks.

Traditionally, HIW is an open forum for people writing compilers, tools, or libraries, people with cool ideas for directions in which we should take the platform, proposals for new features to be implemented, and half-baked crazy ideas. For more details see the Call for Contributions in full.

HIW 2014 accepted 13 talk proposals. To make this possible, some of the talks had to be accepted as "short talks" of 15 minutes. Other talks have 20 minutes. All talks are followed by 5 minutes for questions, discussion, and changeovers.

The workshop will also give opportunity for lightning talks organised on the workshop day. Speakers sign up for lightning talks during the workshop. Lightning talk slots will be scheduled first-come-first-served, for ~5 minutes in the afternoon sessions.

Links

Important Dates

Monday, 14 July, 2014: Talk Proposal Deadline (anywhere on earth)
Monday, 21 July, 2014: Notification
(a couple of days after 21 July): Notification (sorry, we are delayed)
Saturday, 6 September: Workshop

Programme

The videos of HIW 2014 are available on Youtube

09:00	Welcome to HIW 2014 (slides)
09:15 - 10:00	Haskell in a different flavour
09:15	CLaSH: Compiling circuit descriptions (slides)
09:40	The Past, Present and Future of the Programmer-friendly Helium Compiler (slides)
10:00 - 10:30	Coffee break
10:30 - 11:20	Memory / runtime management
10:30	GUMSMP: a multilevel parallel Haskell implementation (slides)
10:55	Managing a Haskell heap in Javascript
11:20 - 11:40	Break
11:40 - 12:30	GHC
11:40	GHC status update
12:05	GHC's developer tools ecosystem (Contributing to GHC) (slides)
12:30 - 14:00	Lunch
14:00 - 14:45	Distributed Haskell
14:00	The implementation of the HdpH DSLs: Details and Difficulties
14:25	A GHC language extension for static values (short talk)
14:45 - 15:00	Lightning talks I
	Marc Lentczner: Where are we?
	Andrew Farmer: Verifying rules with Hermit
	Adam Gundry : Type-checker plugins: What, Why?
15:00 - 15:10	Break
15:10 - 16:00	Advanced types
15:10	Dependent Haskell
15:35	Partial Type Signatures
16:00 - 16:30	Tea break
16:30 - 17:10	Haskell modules
16:30	Extending Cabal with Plugins, Preprocessors and Multi-target Compilers (slides)
16:50	Implementing Backpack in GHC (short talk)
17:10 - 18:00	Lightning talks II
	Alejandro Serraro: Interactive features in ghc-mod
	Simon Marlow: Death by dynamic linking
	Tom Ellis: Compiling to SQL
	Lennart Augustsson: Better type-error messages
	Jonas Duregard: Black-box mutation testing
	Gershom Bazerman: New www.haskell.org
	Michael Adams: Optimizing SYB

Abstracts of Accepted Talks

CLaSH: Compiling circuit descriptions

Christiaan Baaij, U. Twente

The CLaSH compiler sees Haskell programs as digital circuit descriptions, in particular, it sees them as structural circuit descriptions. We take a structural view so that the implicit parallelism in the description is mapped to (sub-)circuits actually running in parallel. As a result, annotation-free, "normal" Haskell code gets mapped to a "maximally" parallel circuit. Probably unsurprisingly there is catch: you have specify sequential circuits manually. The output of the CLaSH compiler is low-level, synthesizable, VHDL code.

In this talk I will explain the general flow of the CLaSH compiler, the design decisions behind its internal rewrite system, and its system for overloadable primitive operations. I will also compare the structural synthesis approach taken by CLaSH, to the embedded DSL approach (such as Lava), and the behavioural synthesis approach (such as Verity). This comparison will be both about the advantages and disadvantages for the language user, and the language implementer. I will conclude with some of the limitations of the current CLaSH compiler and directions for future work.

Website: http://christiaanb.github.io/clash2

The Past, Present and Future of the Programmer-friendly Helium Compiler

Jurriaan Hage, U. Utrecht (slides)

Helium was developed as a programmer-friendly compiler for Haskell, in particular to improve upon the type error diagnosis as reported by Haskell compilers in a class room setting. Helium has been dormant for some time, although it is still in use in various places around the world. Over the last few years, we have, off and on, worked on making Helium available through Hackage, and we want to use the HIW to introduce it back the world, including a whole new generation of Haskell programmers and researchers who may only have remotely heard of it. In the talk, I will discuss the history of Helium, describing its unique features, the limitations to what it supports, advertise its availability on Hackage (a task that will be taken care over the summer holidays), and importantly, what our future plans are for this compiler. In particular, to discuss how it relates to that other compiler, the UHC, that the Software Technology group at Utrecht develops.

Of particular interest is the recently started DOMSTED project that was funded by the Dutch government on providing domain-specific type error diagnosis for embedded domain-specific languages within the full Haskell language, supporting also many of the type system exten- sions that everyday Haskell programmers like to employ. Helium was the first compiler, back in 2003, to provide such support, but this never went beyond (approximately) Haskell 98. This may not be such a limitation when teaching students how to program in Haskell, but it certainly is a limitation if we want the professional Haskell programmer to enjoy the same benefits in the presence of type families, GADTs, multi-parameter type classes, existentials and higher- ranked types. community interested in this approach to verifying type class laws and other properties.

GUMSMP: a Multilevel Parallel Haskell Implementation

Malak Aljabri, Hans-Wolfgang Loidl and Phil Trinder, U.Glasgow / Heriot-Watt U.

GUMSMP is a new parallel Haskell runtime implementation, which is designed for hierarchical platforms, such as clusters of multi-cores. It combines distributed memory parallelism, using a virtual shared heap over a cluster as developed for the GHCGUM system, with low-overhead shared memory parallelism on the multi-cores as developed for the GHC-SMP system. The combination of both systems results in an high-performance execution engine that can flexibly use the computational power of modern clusters of multi-cores. In particular, the GUMSMP design features effective latency hiding, and mostly passive load distribution, to exploit high- latency clusters. We present improvements to the load balancing system that accounts for the presence of multi-cores and uses pre-fetching to increase overall utilisation. Specifically, we studied the impact of several load balancing policies, including work pre- fetching and hierarchical work distribution, resulting in significant improvements of perfor- mance. We present novel performance results for this implementation on a cluster of 8-core machines, demonstrating good scalability of a set of 9 benchmarks, including a newly added blackscholes implementation, on up to 96 cores with speedups ranging from 20 to 87, and performance gains of up to 20

The GUMSMP system also proves to be beneficial for executing parallel Haskell code on state-of-the-art NUMA shared memory machines, that suffer from varying latencies accessing memory in different NUMA regions. Our recent performance results show that our hybrid system, GUMSMP, consistently outperforms the shared memory GHC-SMP implementation on seven benchmarks. The best results are achieved using one shared heap within a single NUMA region, but using distributed heaps across the regions. The combination of both shared and distributed heap in GUMSMP gives us the flexibility to choose these heap settings.

Managing a Haskell heap in JavaScript

Luite Stegeman, (unaffiliated)

The GHC runtime system uses a garbage collector to clean up unused memory and several additional tasks such as resetting unreferenced CAFs, running finalizers for weak references, removing indirections for updated thunks and selector thunks and checking for "indefinitely blocked" situations. GHCJS relies on the JavaScript garbage collector to do the basic cleanup task. Unfortunately the JavaScript runtime provides no information that can help the other tasks. Even the WeakMap proposed for ECMAScript 6 does not allow us to observe when a reference is dead. Currently, GHCJS uses a fairly simple heap scanning technique to fill the gaps. In this talk we explore variations on the basic technique and changes in the GHCJS runtime system to improve performance and reduce the amount of work.

GHC status update

Simon Peyton-Jones

Contributing to GHC

Joachim Breitner (slides)

The core component of the Haskell ecosystem, the Glasgow Haskell Compiler (GHC) is not only open source, it is also a proper open source project relying on the work of volunteers. Despite its age and its apparent complexity, new contributors are not needed but actually useful to the project.

Recently, the project has seen some changes that make it even easier for you to start hacking on it, more convenient to get your changes reviewed and harder to break anything: Our repositories have a less custom setup; a tool called Phabricator is used for efficient and meme-ridden code review; various quality assurances services detect breakage and performance regressions early. This extends our existing tools (trac, the mailing lists) and practices (notes, an extensive test suite) that keep working on GHC manageable.

In this talk we give an overview of old and new practices and tools, especially aiming at interested newcomers, lowering the entry barrier to contributing to GHC.

Extending Cabal with Plugins, Preprocessors and Multi-target Compilers

Tibor Bremer / Atze Dijkstra, University Utrecht (slides)

I propose ToolCabal, a redesign of Cabal. It improves upon Cabal in the following ways: 1. It improves on the dealing with preprocessors 2. It supports a native plugin system, meaning no recompilation is needed to use plugins. 3. It allows multiple backends of a compiler to be build simultaneous

The core of the redesign is using the Haskell class system to deal uniformly with compilers and preprocessors, and using the shake build system to force dependency tracking of all source files, be it sources for the preprocessors or the compiler. Shake is particularly useful here as it allows dynamic dependencies, meaning it can correctly track dependencies of the generated files.

It steps away from the custom builds and build hook structure of Cabal by using a full plugin system that can be used at both the preprocessor as well as the compiler level. This is com- pletely native and the build system deals uniformly with plugin and non plugin preprocessors and compilers.

Another major improvement of ToolCabal over Cabal is the support for multiple targets of a compiler, e.g. the bytecode and profiling backends of GHC, and the bytecode and JavaScript backends of UHC.

Due to the generic type classes the tool is completely independent of both the source and target. It is not Haskell specific anymore but can be used with any other source language and compilers.

Source code can be found at: https://github.com/TiborIntelSoft/ToolCabal

Implementing Backpack in GHC

Edward Yang, Stanford U.

I am spending this summer at GHC HQ working on an implementation of Backpack, a new module system for Haskell. Why should you care? Along the road to Backpack, one may: (1) solve Cabal hell, by supporting multiple instances of a package linked against different ver- sions and allowing us to move away from version number ranges as the primary mechanism of specifying package dependency, (2) support module-level pluggability, so you never have to use a silly type class to make your library support both String and Text, (3) improve the state of the art for large complicated libraries such as the ghc-api and their users, making it far easier for users to say, "Here is the subset of the library I’m depending on." This work is very much in progress, and I’m hoping to report as much as I’ve accomplished by the time ICFP rolls around. This would also be a good opportunity to talk about Haskell type classes and their differences from modular type classes from the ML tradition, and what the future of Haskell type classes might be.

Dependent Haskell

Richard A. Eisenberg, U.Pennsylvania

This talk will be a preview of Dependent Haskell, with its implementation currently under way in a fork of GHC. Dependent Haskell is a dependently typed variant of Haskell. It is intended to be a conservative (that is, backward compatible) extension to GHC’s Haskell, preserving type erasure wherever possible. Dependent Haskell does not require all functions to terminate. Thus, programs in Dependent Haskell are not proofs; instead, properties expressed in types hold only when the program compiles and runs in finite time (partial correctness). This seems reasonable in practice, as any divergence of a running program is caused by a loop written by the programmer – compilation will not introduce new loops to overcome programmer mistakes. I hope to merge this into the main branch of GHC when it is mature enough. One of my goals in this talk is to spur useful design discussion about how best to serve the needs of users.

Partial Type Signatures

Thomas Winant, Dominique Devriese, Frank Piessens and Tom Schrijvers, KU Leuven / U. Ghent

In Haskell, programmers have a binary choice between omitting the type signature (and relying on type inference) or explicitly providing the type entirely; there are no intermediate options. Partial type signatures bridge the gap between the two extremes.

In a partial type signature, annotated types can be mixed with inferred types. A type signature is written like before, but can now contain wildcards, written as underscores. Also, placing a wildcard in the constraints part of a type signature will allow the type checker to infer an arbi- trary number of constraints. The type checker will verify that the inferred type matches the form of the partial type signature, while at the same time it infers the types of the wildcards. E.g., a partial type signature _ => a -> _ could lead to the type Show a => a -> String . Partial type signatures give the programmer complete control over the amount of type infor- mation he wants to annotate.

Partial type signatures can be useful in the development phase; the programmer can annotate the parts of the type signature he knows and replace the unknown parts with underscores. They can also be used to hide complex parts of a type signature or to emphasise the relevant annotated parts.

We are currently working on integrating partial type signatures in GHC as a variant of Type- dHoles. Whereas TypedHoles allow holes in programs or expressions, PartialTypeSignatures allow holes in types. Depending on the extensions that are enabled, partial type signatures are considered as either temporary placeholders that need to be replaced with full signatures for the code to be successfully compiled or a permanent way to leave out uninteresting parts of types.

In this presentation, I will give an overview of partial type signatures, and give some insights in the implementation.

The Implementation of the HdpH DSLs: Details and Difficulties

Patrick Maier, Robert Stewart and Phil Trinder, U.Glasgow / Heriot-Watt U.

HdpH and HdpH-RS are a pair of Haskell DSLs for irregular task- parallel computation on distributed-memory platforms. These DSLs were specifically developed to address the issues of non-uniform communication topologies and fault tolerance arising on large-scale distributed platforms. They were also specifically designed to rely only on standard GHC - no runtime system modifications, no third party libraries (apart from standard Hackage packages). The design and semantics of HdpH and HdpH-RS will be presented at the Haskell Sympo- sium; this talk will focus on details of the implementation of the DSLs and on barriers that limit further improvements.

Implementation details include:

Serialisable Closures, and the differences to CloudHaskell. HdpH Closures are fully polymorphic and composable like applicative functors. Overheads on composition are low, making "computing with Closures" a natural paradigm that is used extensively in the implementation of polymorphic skeletons.
Networking backends. The HdpH DSLs implement distributed work stealing protocols, placing a number of requirements on the networking backend. We compare experiences with two networking backends: the TCP-based network-transport package, and a hand-crafted MPI-backend.

Barriers include:

Distributed work stealing relies on low-latency messaging. HdpH currently achieves this by dedicating one core per node exclusively to message handling. IO threads with priorities would be a better solution.
Parallel DSLs often compute on large data structures in normal form. A compact in-memory representation of normal forms (similar to unboxed vectors) would be beneficial for cache performance and might reduce garbage collection and serialisation overheads.

A GHC language extension for static values

Mathieu Boespflug and Facundo Domínguez, Tweag I/O

The Haskell package distributed-process implements a framework for programming distributed systems. One of the cornerstones of the approach is the ability to send closures for evalu- ation in remote peers. So far this has been achieved by pervasive use of Template Haskell, which although effective it is still demanding on the programmer who must manage tables of functions which may be used by peers.

This pitfall has been noticed since the conception of the framework. The suggested remedy was to extend the Haskell language with a notion of static values, used as references to Haskell values, which are valid in all processes of a given distributed application. As far as we are aware, such language extension remained unimplemented and has not been defined with precision.

In this talk we will discuss an implementation of such an extension for the GHC compiler, its potential impact in the distributed-process framework and its limitations. A working implemen- tation is publicly available, which is being prepared to be submitted for inclusion in GHC.

Programme Committee

Jost Berthold - co-chair (University of Copenhagen)
Kevin Hammond (University of St.Andrews)
Gabriele Keller (University of New South Wales)
Paul Liu (Intel Labs)
Rita Loogen (Philipps-Universitat Marburg)
Geoffrey Mainland - co-chair (Drexel University, Philadelphia)
Carter Schonwald (WellPosed Ltd)