Difference between revisions of "Diagrams/GSoC"

Revision as of 14:14, 9 March 2013

Are you a potential Google Summer of Code student searching for a project to propose? Consider contributing to diagrams! It's an active project with a small, friendly, and knowledgeable developer community. Contributions to diagrams directly improve people's ability to communicate ideas effectively, and raise the profile of the Haskell programming language. Most of all, it's fun---you get to tangibly experience your contributions in the form of beautiful or useful images.

This page collects some suggested project ideas. We're happy to discuss any of the below ideas, or your own ideas, to help you come up with a solid proposal for a project you're excited about---send email to the mailing list.

More info:

Project ideas

GTK application for creating diagrams interactively

A tight feedback loop between editing code and seeing the resulting changes in a diagram is important. Right now some of the backends have a "looped" compilation mode, but it's somewhat clunky and still much slower than it could be, probably because of compilation and linking overhead.

The idea would be to develop a GTK application allowing the user to edit diagrams code (either with an in-application editing pane or in their own editor, perhaps using fsnotify to watch for changes) and see the updated diagram immediately. Additional potential features include:

the ability to "zoom in" on a selected subcomponent to display, instead of always displaying everything in the entire file
using sliders, input boxes, etc. to interactively display parameterized diagrams, perhaps in a type-directed way (see craftwerk-gtk for inspiration)
interactive editing of diagrams, e.g. dragging a displayed component and having an appropriate translation call automatically added to the code, or some other sort of support for interactively generating points, vectors, scaling factors, etc. using mouse input
support for developing animations (e.g. with a slider for moving back and forth in time)
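
As a rough illustration of the "dragging adds a translation call" idea, here is a minimal sketch that turns a mouse drag into a source snippet. The function name `dragToTranslate` and the `pxPerUnit` zoom parameter are hypothetical; only `translate` and `r2` are existing diagrams functions. It assumes screen coordinates grow downwards while diagram coordinates grow upwards, so the vertical component is flipped.

```haskell
import Numeric (showFFloat)

-- | Hypothetical helper: convert a drag from one screen position to another
-- into the text of a diagrams translation call.  'pxPerUnit' is the assumed
-- current zoom level (screen pixels per diagram unit).
dragToTranslate :: Double -> (Double, Double) -> (Double, Double) -> String
dragToTranslate pxPerUnit (x0, y0) (x1, y1) =
  "translate (r2 (" ++ fmt dx ++ ", " ++ fmt dy ++ "))"
  where
    dx = (x1 - x0) / pxPerUnit
    -- Screen y grows downwards, diagram y grows upwards, so flip the sign.
    dy = (y0 - y1) / pxPerUnit
    -- Two decimals keep the generated code readable.
    fmt v = showFFloat (Just 2) v ""
```

For example, at 100 pixels per unit, a drag from (10, 20) to (160, 70) yields the text `translate (r2 (1.50, -0.50))`, which an editor could splice into the user's code.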

Path operations

It would be nice if diagrams could support various operations on paths such as intersection and union, curve fitting, and path simplification. See also Diagrams/Dev/Paths, which has quite a bit of information on current efforts to implement path offsets and other path-related things.

A student taking this on would probably already need some experience in computational geometry and paths in particular; implementing path algorithms properly is notoriously tricky (though having an incomplete and buggy implementation that nonetheless works "most of the time" would still be better than nothing!).

Taking advantage of diagram tree structure

Diagrams are stored using a fancy tree data structure, but currently diagrams backends cannot take advantage of this information: diagrams are simply compiled into a list of primitives with attributes, and these are handed off to the backend. This has some important implications:

Sometimes it leads to inefficiency. For example, the diagrams code `fc blue (hcat $ replicate 1000 (circle 1))` results in backends setting the fill color 1000 times (once for each circle), when instead the fill color ought to be set just once.

There are additional features which could be implemented if backends were able to observe the tree structure, such as grouping for transparency.
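
The inefficiency described above can be illustrated with a toy model. This is only a sketch: the `Tree` type below is a hypothetical, much-simplified stand-in and not the actual diagrams data structure or backend interface. It compares how many attribute settings a backend performs when it sees the flattened list of primitives versus when it can walk the tree.

```haskell
-- | Hypothetical toy diagram tree (not the real diagrams types).
data Tree
  = Prim String         -- a primitive, e.g. "circle"
  | Styled String Tree  -- an attribute such as "fill blue" applied to a subtree
  | Group [Tree]        -- several subdiagrams side by side

-- | What a backend sees today: every primitive paired with all the
-- attributes that apply to it.
flatten :: Tree -> [([String], String)]
flatten = go []
  where
    go attrs (Prim p)     = [(reverse attrs, p)]
    go attrs (Styled a t) = go (a : attrs) t
    go attrs (Group ts)   = concatMap (go attrs) ts

-- | Attribute settings performed when rendering the flattened list:
-- each primitive sets all of its attributes again.
settingsFlat :: Tree -> Int
settingsFlat = sum . map (length . fst) . flatten

-- | Attribute settings performed when walking the tree:
-- each attribute is set once, at the node that introduces it.
settingsTree :: Tree -> Int
settingsTree (Prim _)     = 0
settingsTree (Styled _ t) = 1 + settingsTree t
settingsTree (Group ts)   = sum (map settingsTree ts)

-- | Toy analogue of  fc blue (hcat $ replicate 1000 (circle 1)).
example :: Tree
example = Styled "fill blue" (Group (replicate 1000 (Prim "circle")))
```

Here `settingsFlat example` is 1000 while `settingsTree example` is 1, which is the gap a tree-aware backend interface could close.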

The project would consist in first figuring out how best to change the backend interface to allow observing the tree structure, and then implementing new features and improvements to backends based on this new ability.

The devil's in the details: working with the diagram trees can be tricky. This is not a project for the faint of heart, but if you like getting down into tricky details, understanding them, and coming up with creative and elegant ways to achieve a goal given a number of constraints, this could be a fun project with a big impact.

Constraint Based Diagrams

Generate diagrams that meet some declarative constraint specification---perhaps something along the lines of http://wadler.blogspot.com/2011/06/combinator-library-for-design-of.html . The idea is to allow users to specify constraints on their diagram layout (e.g. "A should be no further left than B", "C and D should be at least 2 and at most 8 units apart"), probably using simple linear inequalities, and then solve them to generate an appropriate layout.

A large part of the project would be in simply coming up with a good design for the user API and how to collect constraints; the rest would consist in figuring out how to solve the constraints (either directly, or by hooking up to some other library to e.g. solve systems of linear constraints).
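
To make the "simple linear inequalities" concrete, here is a small sketch for the one-dimensional case. It is an assumption of this example, not a proposed design, that every constraint is a difference constraint of the form x_hi - x_lo <= c; such systems can be solved with Bellman-Ford-style relaxation. All names (`Diff`, `noFurtherLeftThan`, `rightOfBy`, `solve`) are hypothetical, and "at least 2 and at most 8 units apart" is read here as "D lies 2 to 8 units to the right of C".

```haskell
import qualified Data.Map.Strict as M

-- | @Diff hi lo c@ encodes the inequality  x_hi - x_lo <= c.
data Diff = Diff String String Double

-- | "a should be no further left than b":  x_b - x_a <= 0.
noFurtherLeftThan :: String -> String -> [Diff]
noFurtherLeftThan a b = [Diff b a 0]

-- | "b lies between lo and hi units to the right of a".
rightOfBy :: String -> String -> (Double, Double) -> [Diff]
rightOfBy b a (lo, hi) = [Diff b a hi, Diff a b (negate lo)]

-- | Find x-coordinates satisfying all constraints, or Nothing if the
-- constraints contradict each other.  Every variable starts at 0 and is
-- repeatedly lowered until no constraint is violated.
solve :: [Diff] -> Maybe (M.Map String Double)
solve cs = go (M.size start) start
  where
    start = M.fromList [(v, 0) | Diff hi lo _ <- cs, v <- [hi, lo]]
    relax m (Diff hi lo c)
      | cand < m M.! hi = M.insert hi cand m
      | otherwise       = m
      where cand = m M.! lo + c
    go n m
      | m' == m   = Just m   -- fixed point: all constraints hold
      | n <= 0    = Nothing  -- still changing after |V| + 1 passes: infeasible
      | otherwise = go (n - 1) m'
      where m' = foldl relax m cs

-- | Check a single constraint against a candidate layout.
satisfies :: M.Map String Double -> Diff -> Bool
satisfies m (Diff hi lo c) = m M.! hi - m M.! lo <= c

-- | The two example constraints from the text.
example :: [Diff]
example = ("A" `noFurtherLeftThan` "B") ++ rightOfBy "D" "C" (2, 8)
```

A real design would need two dimensions, more general inequalities, and probably an objective to pick a "nice" solution among the feasible ones, which is where hooking up to an existing linear programming library might come in.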

3D diagrams

Diagrams notionally supports arbitrary vector spaces, but no one has yet done the necessary work to make three-dimensional diagrams a reality. This project would have two main components:

Add three-dimensional primitives and functions to diagrams-lib.

Work on one (or more) backends that can render 3D diagrams in some way. Options include developing the stub diagrams-povray backend, or developing an OpenGL backend.

Make Plotting As Easy As Doing It in R

From diagrams-discuss:

The above code produces four plots: a scatterplot (something you would often see in statistics), a plot of a function and, well, two empty grids. As a statistician, I usually work a lot with R together with a more or less sophisticated plotting package. The currently best plotting system for R is probably ggplot. Now, I started using Bryan O'Sullivan's statistics package for some of my calculations. Once in Haskell mode, you obviously don't want to switch back and forth between languages. So, I was wondering if it is possible to produce professional looking plots with diagrams' DSL, and how difficult it could be to put together a DSL for (statistical) plotting.

I was thinking of something similar to ggplot's functionality. Making it easy to overlay plots, producing and combining legends, etc. Creating scatterplots and histograms and boxplots. Overlaying them with error regions and density estimates respectively. Then do the same for different subsets of the original data. Doing this with diagrams DSL could prove to be extremely powerful. Each "dot" in a plot could potentially be any diagram you want, dots, circles, stars, numbers or characters -- and if plots are nothing but diagrams, you could even plot plots into a plot. A real pain for most plotting systems is to combine multiple plots into one and to generate a common legend for all of them. This, for example, should be trivial to do within diagrams DSL.

I would be more than happy to help in such a project. As the code above probably suggests, I am not the strongest Haskell hacker around. In fact, I am a statistician/mathematician who happens to use Haskell for some of his projects. That's it. Would anyone be interested in picking up such a project? As I said, I would be happy to help and get involved. Because I think there is a real need for something like this, and it would be very powerful to have an eDSL for statistical plotting within Haskell.
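
One way to read the "each dot could be any diagram" idea is that a scatterplot is just a list of positioned marks. The sketch below uses hypothetical names (`rescale`, `scatterLayout`) and is not an existing plotting API. It is polymorphic in the mark type, so a mark could be a character, a small diagram, or even another plot. With diagrams, the resulting list is the kind of input one would hand to a function such as `position` after converting the coordinate pairs to points.

```haskell
-- | Linearly map a value from the data range (lo, hi) onto [0, size].
-- A degenerate range is centred.
rescale :: (Double, Double) -> Double -> Double -> Double
rescale (lo, hi) size v
  | hi == lo  = size / 2
  | otherwise = (v - lo) / (hi - lo) * size

-- | Place one mark per data point inside a plot area of the given width
-- and height.  The mark for each point is produced by a user-supplied
-- function, so it may depend on the data.
scatterLayout
  :: Double -> Double
  -> ((Double, Double) -> mark)
  -> [(Double, Double)]
  -> [((Double, Double), mark)]
scatterLayout w h mkMark pts =
  [ ((rescale xr w x, rescale yr h y), mkMark (x, y)) | (x, y) <- pts ]
  where
    xs = map fst pts
    ys = map snd pts
    xr = (minimum xs, maximum xs)
    yr = (minimum ys, maximum ys)
```

Legends, axes, and overlays would then be further diagrams combined with the marks, rather than special cases inside a plotting engine.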

External Rendering

The idea here would be to allow for special external rendering of some primitive that Diagrams does not support. For instance, it would be nice to be able to express LaTeX expressions and, when the backend renders, offload the work externally and then incorporate the result into the output. There are several dimensions to supporting this well and making it as backend-agnostic as possible. Somewhat related is the idea of external layout, such as asking GraphViz to lay out some structure and then doing the rendering based on those positions. At the simplest this is just turning some new primitive into an `Image` primitive on the fly in the `Renderable` instance.

Variable Precision

It would be nice to be able to trade off precision of the vector output of some backend with the size of that output. For instance the factorization diagrams are rather large when rendered to SVG, but their size could be cut in half by emitting doubles formatted to two significant digits. There is a nice balance that could be struck at a high level where we ensure that we are always within some fraction of what will likely be a pixel in the final output. Then at the level of the backend we would only need to choose the representation that is the smallest for any particular number.

This could be aided by generalized R2.
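
As a minimal sketch of the backend-level step (choosing the smallest representation of a number once a precision has been fixed), the following uses only `showEFloat` and `showFFloat` from the standard `Numeric` module. The names `roundSig` and `compact` are hypothetical, and the sketch assumes the output format accepts both plain decimal and exponent notation.

```haskell
import Data.List (dropWhileEnd, minimumBy)
import Data.Ord (comparing)
import Numeric (showEFloat, showFFloat)

-- | Round to n significant digits by going through scientific notation.
-- Assumes n >= 1.
roundSig :: Int -> Double -> Double
roundSig n x = read (showEFloat (Just (n - 1)) x "")

-- | The shorter of the fixed-point and scientific spellings of the value
-- rounded to n significant digits.
compact :: Int -> Double -> String
compact n x = minimumBy (comparing length) [fixed, sci]
  where
    fixed = trim (showFFloat Nothing (roundSig n x) "")
    sci   = showEFloat (Just (n - 1)) x ""
    -- Drop trailing zeros after the decimal point, then a dangling point.
    trim s
      | '.' `elem` s = dropWhileEnd (== '.') (dropWhileEnd (== '0') s)
      | otherwise    = s
```

With two significant digits, `compact 2 123.456` gives `"120"` and `compact 2 0.000123456` gives `"1.2e-4"`. Deciding how many digits are actually needed, based on the expected pixel size of the final output, is the higher-level part of the project and is not addressed here.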

Sources for more ideas

http://www.cgal.org/Manual/latest/doc_html/cgal_manual/packages.html