HaskellImplementorsWorkshop/2016

The Haskell implementors' workshop is a forum for those involved in implementing Haskell systems, infrastructure, libraries and tools, generally for people involved in implementing Haskell technology, to share their work and discuss future directions and collaborations with others.

In 2016, the Haskell Implementors Workshop will be co-located with ICFP 2016 in Nara.

The workshop does not have proceedings. Talks and/or demos are proposed by submitting an abstract, and selected by a small program committee. The workshop will be informal and interactive, with a flexible timetable and plenty of room for ad-hoc discussion, demos, and impromptu short talks.

Traditionally, HIW is an open forum for people writing compilers, tools, or libraries, people with cool ideas for directions in which we should take the platform, proposals for new features to be implemented, and half-baked crazy ideas.

Links

Program Committee

Joachim Breitner (Karlsruhe Institut für Technologie)
Duncan Coutts (Well Typed)
Michael Snoyman (FP Complete)
Luite Stegeman (ghcjs)
Niki Vazou (UCSD)
Stephanie Weirich (University of Pennsylvania)
Edward Z. Yang - chair (Stanford University)

Schedule

Session 9:15-10:15 "The State of GHC"

9:15 State of GHC (Simon Peyton Jones)
9:45 Contributing to GHC (Ben Gamari)

Session 10:35-11:25 "Pluggability and Modularity"

10:35 More powerful GHC Plugins (Moritz Angermann)
11:00 Backpack to Work: Towards Backpack in Practice (Edward Z. Yang)

Session 11:45-12:35 "Types and Effects"

11:45 A Dependent Haskell Triptych (Richard A. Eisenberg)
12:10 Automatically Escaping Monads (Ben Lippmeier)

Session 14:00-14:50 "The Engineering of GHC"

14:00 Pita: Tools for making GHC fast again (Ben Gamari)
14:25 GHC Determinism (Bartosz Nitka)

Session 15:20-16:10 "Backends for GHC"

15:20 Remote GHCi (Simon Marlow)
15:45 GHCVM - A JVM Backend for GHC (Rahul Muttineni)

Session 16:40-18:00 "Trees and Lightning Talks"

16:40 Trees That Grow (Shayan Najd, Simon Peyton Jones, Jacques Carette)
17:05 Lightning Talks
- Pre-accepted (10min):
  - Functional Lattice Crypto (Eric Crockett, Chris Peikert)
  - Inductive Tuples (Lennart Augustsson)
  - LLVM Bitcode (Moritz Angermann)

Program

Contributing to GHC (Ben Gamari)

GHC has seen a remarkable amount of change in the past five years. This can be seen both in the features that have been implemented, as well as structure of the community which implemented them. While our technical infrastructure has been adapting with the adoption of Phabricator, better automation, and improved continuous integration, the collaborative and social mechanisms which have served our developer community well in the past may be showing their age.

To remain sustainable, we need to ensure that the scale of GHC's developer community keeps up with growth in both the compiler itself and its user community. In this session we will discuss some of the sticking points in GHC's development process, especially with respect to on-boarding of new contributors and treatment of external proposals. We'll open with a short discussion of some of the measures that GHC HQ has been recently taking and then opening up the floor for general discussion of ways we can better support GHC developers.

More powerful GHC Plugins (Moritz Angermann)

GHC provides a plugin interface for writing type checker plugins and for some time also frontend plugins. The plugin interface, however could be much more powerful.

With the addition of a few new hooks, more powerful plugins can be build that can control the compilation pipeline and other parts of ghc.

This talk will outline what is currently possible, and what new hooks the author has in development, as well as issues arising from using the plugin interface.

Backpack to Work: Towards Backpack in Practice (Edward Z. Yang)

The universal organizing principle for large software systems in programming languages today is the package, the unit by which reusable code may be versioned and distributed. However, most package systems provide only a weak form of modularity, where packages depend directly on other packages. Backpack breaks new ground by arguing mixin packages can be a good fit for package-level modularity. Unfortunately, Backpack as was described in POPL'14 cannot be easily implemented for most existing languages today (including Haskell), because it tightly couples the compiler with the package manager. In this talk, I describe an evolution of the Backpack design which respects the division between package manager and compiler. This is not a paper design: it is principally motivated by our (ongoing) efforts to implement Backpack in GHC and the Cabal package system, which we hope to land for GHC 8.2.

A Dependent Haskell Triptych (Richard A. Eisenberg)

This talk will include three examples, compilable today, of dependently typed programming in Haskell:

A finely-typed database access library whereby the requirements placed on the schema of the database can be inferred from the client code. This means that the database and the access code can be updated independently of one another, as long as the schema meets the inferred requirements.

A datatype design pattern allowing for both row and column extensibility. This example makes critical use of both an injective type family and a higher-rank kind.

A translation of Idris's algebraic effects library to Haskell, allowing for client code to use several effects (such as state, failure, or I/O) together in a composable fashion without monad transformers or manual lifting.

The goal of this talk is not to explain all of the details of any of these examples, nor is it to advertise any release-ready software expected to be used in your next project. Instead, this talk is intended to showcase the kinds of problems dependent types in Haskell might solve and a sneak peek at some solutions.

Automatically Escaping Monads (Ben Lippmeier)

New Haskell programmers often find themselves "trapped in a monad". Given a computation of type 'IO a', there is no way to get the 'a' value "out" of the computation. One can use the monadic 'bind' operator to create a new computation based on the old one, but this can lead to an awkward style of programming. Haskell is really two distinct sub languages, where the pure sub language is used to compose computations in the monadic sub-language. When effects need to be added to previously pure code we also need to undertake a big refactoring effort, converting 'let' to 'do', and 'map' to 'mapM' and so on.

In this talk I'll present a practical implementation of a coeffect system that lets us get the 'a' out. Given a suspended computation M of type 'IO a', 'run M' evaluates it, releasing its effect into the context, and producing the value of type 'a'. Conversely, given a term N of type 'a' which would perform an IO effect when evaluated, 'box N' suspends its evaluation, returning a value of type 'IO a'. This idea has been described previously, by both Filinski [1] who named the operators 'reflect' and 'reify', as well as Pfenning and Davies [2] in the context of lax modal logic.

Although the theory works out, manually inserting 'run' and 'box' casts into a program is even more annoying than converting 'let'-expressions to 'do'-expressions when effects need to be added to pure code. Luckily, there is a way to automatically insert these casts during type checking, so we can write code in a style that assumes implicit effects (like in ML), but with types as nice as in Haskell. This system has recently been implemented in DDC [3] as an extension to Dunfield and Krishnaswami's bidirectional type checking algorithm [4]. Automatically inserting run and box casts works surprisingly well, and also supports DDC's approach to type safe freezing, where mutable objects can be converted to immutable ones, without unsafe hacks like 'unsafeFreezeArray'.

[1] Representing Monads Andrzej Filinski Principles of Programming Languages, 2004.

[2] A Judgmental Reconstruction of Modal Logic Frank Pfenning and Rowan Davies Mathematical Structures in Computer Science, 2000.

[3] The Disciplined Disciple Compiler https://github.com/DDCSF/ddc

[4] Complete and Easy Bidirectional Typechecking for Higher-Ranked Polymorphism. Joshua Dunfield and Neelakantan Krishnaswami International Conference on Functional Programming, 2013.

Pita: Tools for making GHC fast again (Ben Gamari)

▼Abstract The accumulation of new language additions, standard library interfaces, and compiler features had led to noticably longer compile times in GHC in recent years. Joachim Breitner's Gipeda is a great diagnostic tool for analysing particular commits. In our effort to make GHC fast again, we were in need of diagnostic tools to analyze compound compile times across periods spanning from weeks to years.

With the Performance Indicator Tracking Application, Pita, we introduce a diagnostic tool for visualizing changes in GHC's performance characteristics over the course of development. Based on the same dataset used by Gipeda, Pita offers side-by-side comparisons across time and multiple tests. Using Pita, we are able to quickly identify correlations and patterns that trigger compile time speed bumps, allowing us to narrow down years of development to a small set of regressing commits.

This talk provides an introduction to Pita. Using hands-on examples we show how to chase down speed bumps in past commits. Finally, we discuss the state of our effort in improving GHC's performance, discussing some of the common themes that we have found in the course of the work and sharing some best practices for GHC contributors to minimize the impact of their changes on compiler performance.

Using Pita we have already reduced compiler allocations by 8% on average across nofib. We are hoping that with further work we will ultimately be able to bring compilation times back to pre-7.6 magnitudes in the coming year. We invite the community to help us realize this goal.

GHC Determinism (Bartosz Nitka)

Compilation with GHC was non-deterministic prior to version 8.0.2. Compiling with the same flags, sources, and environment could produce incompatible binary objects.

Non-deterministic compilation causes several problems. Non-deterministic interface files result in slower incremental builds because of unnecessary recompilation. Distributed build systems, such as Buck or Bazel, cannot reliably cache non-deterministic object files, negating the primary benefit of distributed compilation. Non-determinism results in unstable symbol names, making code hot-swapping difficult. Non-deterministic output forces packagers of binary distributions such as Debian to waste resources recompiling after a trivial change and also makes it impossible to verify if a binary was built from the given code.

During the past couple of months I tested GHC for non-determinism on itself and Stackage, developed a non-determinism test suite, and systematically audited GHC's internals to ensure that compilation is deterministic in GHC 8.0.2. In this talk I will discuss the causes of non-determinism, how I addressed them, and how we can keep GHC deterministic going forward.

Remote GHCi (Simon Marlow)

Haxl users at Facebook do a lot of development and testing inside GHCi. In fact, we’ve built a customized version of GHCi that runs code in our Haxl monad by default instead of the IO monad, and has a handful of extra commands to support common workflows needed by our developers. This is a pretty smooth setup: right inside GHCi we can test the production code against real data, and interact with all of the services that our production systems talk to, while having a nice interactive edit/compile/test cycle.

However, one thing is missed by many developers, especially those coming from other languages: easy access to a stack trace when debugging.

This talk will tell the story of how we implemented always-on stack traces in GHCi, which are now available in GHC 8.0.1. The implementation involves splitting out the interpreter from GHCi into a separate process, so that the interpreted code can run in profiling mode (where stack traces are available), while the compiler continues to run at full speed on the unprofiled runtime. GHC communicates with the interpreter using binary messages over a pipe.

Remote GHCi has a number of other benefits, including fixing the long-standing annoyance that to compile TH code with -prof you had to build it twice. It will also be useful for using TH when cross-compiling, and has a lot of overlap with a similar mechanism already implemented in GHCJS; indeed we hope to converge on a single implementation in the future.

(more background for the talk in this blog post: http://simonmar.github.io/posts/2016-02-12-Stack-traces-in-GHCi.html)

GHCVM - A JVM Backend for GHC (Rahul Muttineni)

GHCVM is a compiler that translates GHC 7.10.3's Haskell to Java bytecodes, using a slightly modified version of GHC [1]. It aims to reduce the entry barrier for companies who wish to deploy Haskell but are bound to the JVM. The code generator and runtime system were carefully modeled after GHC, translating the primitive concepts to Java equivalents.

I will discuss the progress so far and highlight the key design decisions that were made to preserve GHC's semantics on the JVM. Additionally, I will discuss future directions of GHCVM including an intermediate representation for simplifying bytecode compilation, support for an external interpreter and Template Haskell, and utilizing the recent developments in the JVM for performance boosts.

[1] The GHC API was used initially, but the rigidity of the API forced me to inline the entire GHC frontend into the GHCVM codebase.

Trees That Grow (Shayan Najd, Simon Peyton Jones, Jacques Carette)

Algebraic datatypes and pattern matching in Haskell lay a fertile ground to conveniently define and process abstract syntax trees (ASTs). However, in Haskell, trees often cannot grow: once a datatype is defined and compiled, it cannot be extended. Extensions to a datatype mainly appear as new fields to its existing data constructors, and/or new data constructors.

At the center of any metaprogramming system stand tall trees representing the abstract syntax of object terms. Metaprograms processing these trees often do so by decorating nodes with additional information. This additional information may appear as new fields to the existing data constructors, and/or new data constructors. Common practice is either post hoc, to define a new separate datatype representing the output decorated trees; or pre hoc, to use the same large datatype to represent both the non-decorated input and the decorated output trees. Both methods are often ad hoc; the former leads to duplication, and the latter forces the input trees to carry an unnecessary set of information making them inconvenient to work with.

In this talk, I introduce an encoding of datatypes that allows them to be extended in a post hoc manner, yet still argueably keeping them convenient to work with. It is done as a part of the the Summer of Haskell project titled "Native Metaprogramming in Haskell", where we considered unifying the two most popular representations of Haskell's syntax: the small and convenient AST in the popular library Haskell-Src-Exts (HSE), and the large decorated AST used inside GHC's front-end (HsSyn).

Important Dates

June: Call for Talks
Monday, 8 August, 2016: Talk Proposal Deadline
Monday, 22 August, 2016 Notification
Saturday, September 24, 2016: Workshop