Yhc/RTS/Machine

From HaskellWiki
Part of Yhc

(Download)

The Yhc runtime is a stack machine based on the spineless G-Machine.

The stack machine has several major components:

  • the heap which contains heap nodes (see Yhc/RTS/Heap)
  • several machine 'registers' which represent the current state of the machine
  • the stack which is used for calculation and for function calls.

Registers[edit]

The stack machine uses 'registers' (i.e. C variables) to store important information about the running state of the machine (see src/runtime/BCKernel/mutator.c).

The stack machine works by evaluating application nodes. Application nodes are calls to a given function with a given set of arguments. At any one time some application node (and thus some function) is 'currently under evaluation'. When a function is called it executes the bytecode instructions in that function which might cause the further evaluation of more application nodes. Eventually an application will produce a result and evaluation of an earlier application will continue.

The registers used in this process are:

  • The instruction pointer (ip) points to current instruction being executed.
  • The stack pointer (sp) points to the item on the top of the stack.
  • The frame pointer (fp) points to the top frame on the stack.
  • The heap pointer (hp) points to the start of free space in the heap.
  • The vector application pointer (vapptr) points to the application node in the heap that is currently being evaluated.
  • The const table pointer (constptr) points to the constant table of the current application node (vapptr).

Heap[edit]

The layout of heap nodes is described in Yhc/RTS/Heap and garbage collection in Yhc/RTS/GC.

The heap used to be stored in memory with the heap growing one way and the stack growing the other. However with the introduction of concurrency to yhc each process needs its own stack.

Process stacks are thus allocated as ProcStackNodes (see Yhc/RTS/Heap) in the heap and are garbage collected like normal nodes.

Mark Table[edit]

Another part of the heap is the mark table, which is an area of memory 1 bit for each word in the heap. This is used in garbage collecting. The more traditional method of including mark bits, having a bit as a field of a node header wasn't appropriate for Yhc, mainly due to it not having enough room to fit it in!

Having a seperate table does allow some useful optimisations in the garbage collector, however, see Yhc/RTS/GC for details.

Stack[edit]

The program stack is used to store previous values of the machine registers in stack frames. This is used to allow nested function calls. The program stack is also used to during evaluation to perform calculation and as temporary store. The layout of the stack might thus be something like

===================   <- G_spBase
 |   frame 0     |
 +---------------+
 :  stack data   :
 +---------------+
 |   frame 1     |
 +---------------+ 
 |   frame 2     |     <- fp
 +---------------+     
 :  stack data   :
 :...............:
 :  stack data   :     <- sp
 :---------------:

The structure of a single stack frame is:

 struct Frame {
   Frame*     fp;     /* saved value of the fp register */
   CodePtr    ip;     /* saved value of the ip register */
   Node*      vapptr; /* saved value of the vapptr register */
 };

When a new node is evaluated a new frame is pushed on top of the stack. The current values of the registers are saved in the frame and the registers get new values:

  • fp is set the address of the newly created Frame structure on the stack.
  • ip is set to the beginning of the code for the new function.
  • vapptr is set to the new node under evaluation
  • constptr is set to the constant table of the new function

When the function returns:

  1. the value on the top of the stack is saved temporarily
  2. ip and vapptr are set to the value saved in the current frame
  3. fp is set the value saved in the current frame
  4. constptr is set to the constant table of vapptr
  5. sp is set to the new value of fp
  6. the value saved earlier is pushed back onto the top of the stack

Each process has its own stack and stacks are stored as normal heap nodes. When a process outgrows its stack it's simply given a larger one and the old one is left to be garbage collected.

Global registers[edit]

There are some additional registers that are used (see src/runtime/BCKernel/heap.h):

  • G_markTableSize records the size of the mark table in words (see ["Yhc/RTS/GC"]).
  • G_markTable is a pointer to the start of the mark table (see ["Yhc/RTS/GC"]).
  • G_hpSize is the size of the heap in words
  • G_hpStart is a pointer to the start of the heap area
  • G_hpEnd is a pointer to the end of the heap area (including stack)
  • G_spBase is the start of the stack
  • G_spLimit is the further the stack can currently extend.
  • G_gcStats are statistics about garbage collection
  • G_heapGlobals are a list of variables that the C runtime is holding that refer to haskell heap objects (see ["Yhc/RTS/Primitive"].

There are also global registers that act as 'caches'. Which is to say they are globally visible copies of mutator registers (it's necessary to copy them back and from when moving in/out of the mutator).

  • G_sp is a global mirror for sp (it's copied here from sp if we need to leave the mutator).
  • G_fp performs the same function for fp
  • G_hp the same for hp