STG in Javascript

From HaskellWiki
Revision as of 03:25, 16 February 2007 by BrettGiles (talk | contribs) (fix a link)
Jump to navigation Jump to search

Disclaimer: Here are my working notes related to an experiment to execute Haskell programs in a web browser. You may find them bizzarre, and even non-sensual. Don't hesitate to discuss them (please use the Talk:STG in Javascript page). Chances are, at some point a working implementation will be produced.

The Javascript Shell is of great help for this experiment.

Aug 22, 2006

Several people expressed interest in the matter, e. g.: [1], [2].

A Wiki page Hajax has been recently created, which summarizes the achievements in the related fields. By these experiments, I am trying to address the problem of Javascript generation out of a Haskell source.

To achieve this, an existing Haskell compiler, namely nhc98, is being patched to add a Javascript generation facility out of a STG tree: the original compiler generates bytecodes from the same source.

After (unsuccessful) trying several approaches (e. g. Javascript closures (see [3]), it has been decided to implement a STG machine (as described in [4]) in Javascript.

The abovereferenced paper describes how to implemement a STG machine in assembly language (or C). Javascript implementation uses the same ideas, but takes advantage of automatic memory management provided by the Javascript runtime, and also built-in handling of values more complex than just numbers and arrays of bytes.

To describe a thunk, a Javascript object of the following structure may be used:

thunk = {
  _c:function(){ ... },                 // code to evaluate a thunk
  _1:...,                               // argument 1
  _N:...                                // argument n

So, similarly to what is described in the STG paper, the c method is used to evaluate a thunk. This method may also do self-update of the thunk, replacing itself (i. e. this.c) with something else, returning a result as it becomes known (i. e. in the very end of thunk evaluation).

Some interesting things may be done by manipulating prototypes of Javascript built-in classes.

Consider this (Javascript shell log pasted below):

Number.prototype.c=function(){return this};
function(){return this}

Thus, simple numeric values are given thunk behavior: by calling the c method on them, their value is returned as if a thunk were evaluated, and in the same time they may be used in a regular way, when passed to Javascript functions outside Haskell runtime (e. g. DOM manipulation functions).

Similar trick can be done on Strings and Arrays: for these, the c method will return a head value (i. e. String.charAt(0)) CONS'ed with the remainder of a String/Array.

Aug 23, 2006

First thing to do is to learn how to call primitives. In Javascript, primitives mostly cover built-in arithmetics and interface to the Math object. Primitives need all their arguments evaluated before they are called, and usually return strict values. So there is no need to build a thunk each time a primitive is called.

At the moment, the following Haskell code:

f :: Int -> Int -> Int

f a b = (a + b) * (a - b)

g = f 1 2

compiles into (part of the Javascript below was inserted manually):

var HMain = {m:"HMain"};

Number.prototype._c=function(){return this;};

// Compiled code starts

HMain.f_T=function(v164,v165){return {_c:HMain.f_C,
return ((((this._1)._c())+((this._2)._c()))._c())*

HMain.g_T=function(){return {_c:HMain.g_C,_w:"11:1-11:9"};};
return HMain.f_T(1,2); // NB should be HMain.f_T(1,2)._c()

// Compiler code ends



When running, the script produces:


So, for each Haskell function, two Javascript functions are created: one creates a thunk when called with arguments (so it is good for saturated calls), another is the thunk's evaluation function. The latter will be passed around when dealing with partial applications (which will likely involve special sort of thunks, but we haven't got down to this as of yet).

Note that the _c() method is applied twice to the output from HMain.g_T: the function calls f_T which returns an unevaluated thunk, but this result is not used, so we need to force the evaluation to get the final result.

NB: indeed, the thunk evaluation function for HMain.g should evaluate the thunk created by HMain.f_T. Laziness will not be lost because HMain.g_C will not be executed until needed.

Sep 12, 2006

To simplify handling of partial function applications, format of thunk has been changed so that instead of _1, _2, etc. for function argument, an array named _a is used. This array always has at least one element which is undefined. Arguments start with array element indexed at 1, so to access an argument n, the following needs to be used: this._a[n].

For Haskell programs executing in a web browser environment, analogous to FFI is calling external Javascript functions. Imagine this Javascript function which prints its argument on the window status line:

// Output an integer value into the window status line

putStatus = function (i) {window.status = i; return i;};

To import such a function is a Haskell program, the following FFI declaration is to be used:

foreign import ccall "putStatus" putStatus :: Int -> Int

Note the type signature: of course it should be somewhat monadic, but for the moment, nothing has been done to support monads, so this signature is only good for testing purposes.

The current NHC98-based implementation compiles the above FFI declaration into this:

Test2.putStatus_T=function(_1){return {_c:Test2.putStatus_C, _w:"7:1-7:56", 
                                       _a:[undefined, _1]};};
return (putStatus)((this._a[1])._c());

Note that like a primitive, a foreign function evaluates all its arguments before it starts executing.

A test page illustrating this can be found at:

When this page is loaded, the window status line should display "456" while the rest of the page remains blank. The Haskell source for this test page is:

Sep 19, 2006

Initially, functions compiled from Haskell to Javascript were prepresented as members of objects (one object per Haskell module). Anticipating some complications with multilevel module hierarchy, and also with functions whose names contain special characters, it has been decided to pass every function identifier through the fixStr function: in nhc98 it replaces non-alphanumeric characters with their numeric code prefixed with an underscore. So a typical function definition looks like:

p3 :: Int -> Int -> Int -> Int
p3 a b c = (a + b) * c;

compiles into:

var Test3_46p3_T=function(v210, v211, v212){return {_c:Test3_46p3_C, 
                                                        v210, v211, v212]};};
var Test3_46p3_C=function(){
return (((((this._a[1])._c())+((this._a[2])._c()))._c())*

Note the function name: Test3_46p3_T; in previous examples it would have been something like Test3.p3_T.

Partial function applications need a different thunk format. This kind of thunk holds the function to be applied to its arguments when the application will be saturated (number of arguments becomes equal to function arity), number of remaining arguments, and an array of arguments so far.

Thus, for a function:

w = p3 1

resulting Javascript is:

var Test3_46w_T=function(){return {_c:Test3_46w_C, _w:"17:1-17:8", 
var Test3_46w_C=function(){
return ({_c:function(){return this;}, _s:Test3_46p3_T, _x:2, _a:[1]})._c();

Such a thunk always evaluates to itself (_c()); it holds the function name in its _s member, number of remaining arguments in its _x member, and available arguments in its _a member, only in this case the array does not have undefined as its zeroth element.

An application of such a function (w) to additional arguments:

z = w 2 3

compiles into:

var Test3_46z_T=function(){return {_c:Test3_46z_C, _w:"23:1-23:9", 
var Test3_46z_C=function(){
return (HSRuntime_46doApply((Test3_46w_T())._c(), [2, 3]))._c();

So, when such an expression is being computed, a special Runtime support function is called, which obtains the partial application thunk via evaluation of its first argument (Test3_46w_T())._c()), and adds the arguments provided ([2, 3]) to the list of arguments available so far. If number of arguments becomes equal to the target function arity, normal function application thunk is returned, otherwise another partial application thunk is returned. The Runtime support function looks like this:

var HSRuntime_46doApply = function (thunk, targs){
  thunk._a = thunk._a.concat (targs);
  thunk._x = thunk._x - targs.length;
  if (thunk._x > 0) {
    return thunk;
  } else {
    return thunk._s.apply (null, thunk._a);

Note the use of the apply method. It may be used also with functions that are not methods of some object. The first argument (this_arg) may be null or undefined as it will not be used by the function applied to the arguments.

NHC98 acts differently when a partial application is not defined as a separate function, but is part of another expression.

First, some Haskell definitions:

z :: Int -> Int

z = (3 +)

p :: Int -> Int -> Int

p = (+)

compile into:

var Test4_46z_T=function(){return {_c:Test4_46z_C, _w:"9:1-9:8", 
var Test4_46z_C=function(){
return ({_c:function(){return this;}, _s:LAMBDA181_T, _x:1, _a:[]})._c();

var LAMBDA181_T=function(v178){return {_c:LAMBDA181_C, _w:"9:8", 
                                       _a:[undefined, v178]};};
var LAMBDA181_C=function(){
return (((3)._c())+((this._a[1])._c()))._c();

var Test4_46p_T=function(){return {_c:Test4_46p_C, _w:"13:1-13:6", 
var Test4_46p_C=function(){
return ({_c:function(){return this;}, _s:LAMBDA182_T, _x:2, _a:[]})._c();

var LAMBDA182_T=function(v179, v180){return {_c:LAMBDA182_C, 
                                             _a:[undefined, v179, v180]};};
var LAMBDA182_C=function(){
return (((this._a[1])._c())+((this._a[2])._c()))._c();

Now, when these functions (p, z) are used:

t4main = putStatus (z (p 6 8)) -- see above for putStatus

the generated Javascript is:

var Test4_46t4main_T=function(){return {_c:Test4_46t4main_C, 
var Test4_46t4main_C=function(){
return (Test4_46putStatus_T(
              Test4_46p_T(), 6, 8)

For each application of p and z, an internal function NHC_46Internal_46_95applyN_T is called where N depends on the target function arity. In Javascript implementation, all these functions are indeed one function (because in Javascript it is possible to determine the number of arguments a function was called with, so no need in separate functions for each arity). The internal function extracts its first argument and evaluates it (by calling the _c() method), getting a partial application thunk. Then, the Runtime support function HSRuntime_46doApply is called with the thunk and arguments array:

var NHC_46Internal_46_95apply1_T = function() {return __apply__(arguments);};
var NHC_46Internal_46_95apply2_T = function() {return __apply__(arguments);};
var __apply__ = function (args) {
  var i, targs = new Array();
  var thunk = args[0]._c();
  for (i = 1; i < args.length; i++) {
    targs [i - 1] = args [i];
  return HSRuntime_46doApply (thunk, targs);