Hp2any

From HaskellWiki
Revision as of 16:33, 14 April 2011 by GergelyPatai (talk | contribs)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)

Overview[edit]

The name hp2any covers a set of tools and libraries to deal with heap profiles of Haskell programs. At the present moment, the project consists of three packages:

  • hp2any-core: a library offering functions to read heap profiles during and after run, and to perform queries on them.
  • hp2any-graph: an OpenGL-based live grapher that can show the memory usage of local and remote processes (the latter using a relay server included in the package), and a library exposing the graphing functionality to other applications.
  • hp2any-manager: a GTK application that can display graphs of several heap profiles from earlier runs.

The project is maintained by Patai Gergely. There is a Google Code source repository where you can track changes. If you are interested in contributing, you can reach me at <patai@iit.bme.hu>.

hp2any-core[edit]

The package comes with detailed Haddock commentary. The modules exposed are the following in decreasing order of importance:

  • Profiling.Heap.Types: the commonly used data structures and basic types of the framework.
  • Profiling.Heap.Read: the functions to load .hp files and to initiate live profiling.
  • Profiling.Heap.Stats: a data structure to efficiently query statistics over time intervals.
  • Profiling.Heap.Process: a helper module to construct CreateProcess structures for the profiling functions.
  • Profiling.Heap.Network: a messaging module providing serialisation for profiling related information (used by the remote profiler).

A simple example to get the peak of the memory usage from an old run:

import Profiling.Heap.Read
import Profiling.Heap.Types

main = do
  mprof <- readProfile "myprog.hp"
  case mprof of
    Nothing -> putStrLn "Couldn't read heap profile!"
    Just prof -> putStrLn $ "Maximum residency: " ++ show (maxCostTotal prof) ++ " bytes."

If you want to make several queries like that, using the Stats module can pay off. It offers the ProfileWithStats data structure, which can be queried using the same ProfileQuery interface as the much simpler Profile structure, which is optimised for loading and incremental accumulation.

Performing live profiling is similarly simple. If you want to create a simple profile logger, you can do something like the following:

import Control.Concurrent
import Profiling.Heap.Process
import Profiling.Heap.Read
import Profiling.Heap.Types

main = do
  let proc = processToProfile "./myprog" Nothing ["-param1","-param2"] [PPBreakdown BCostCentreStack,PPInterval 0.1]
  pcres <- profileCallback (Local proc) print
  case pcres of
    Nothing -> putStrLn "Something bad happened."
    Just (stopReading,Local hdl) -> do
      putStrLn "Streaming heap profile for 10 seconds:"
      threadDelay 10000000
      stopReading

This starts up a slave process and kills the profile reading thread after 10 seconds. However, it doesn’t kill the process itself; you can do that given its handle. Note the "./" in front of the program name. The library will not take care of finding out whether you want to run a program in the current directory, since you might need that distinction. To make this cross-platform, it is advisable to pass this string to System.FilePath.canonicalizePath, since Windows might have problems with prepended "./".

Note that the callback is invoked in a different thread, so if you need to be in the main thread (e.g. if using various widget toolkits), you’ll have to use communication primitives like MVar to get the information to the place where you want it.

hp2any-graph[edit]

The package includes two programs: hp2any-graph and hp2any-relay. The former can be used to start up a process to profile and display its heap usage on the fly using OpenGL. Alternatively, it can also connect to a hp2any-relay instance running on a remote host. The relay server is used essentially the same way as the grapher, but instead of opening a window, it opens a server socket and starts streaming the heap profile of its associated process to whomever connects to it.

The following sections show you how to use these tools. First of all, let’s create this simple test application in heaptest.hs:

import Text.Printf

main :: IO ()
main = exercise 500000

exercise :: Double -> IO ()
exercise d = do
  printf "Input: %f\n" d
  printf "Result: %f\n" (mean [1..d])
  exercise (d*1.5)

mean :: [Double] -> Double
mean xs = sum xs / fromIntegral (length xs)

Calculating the mean this way forces the whole list to be in the memory at the same time, since we are calculating its sum and length separately. Since the lists are getting longer with each iteration, we’ll eventually run into a stack overflow.

Compilation[edit]

In order to get a heap profile out of a program, we have to compile it with some profiling options:

ghc --make heaptest -O2 -prof -auto-all

In short, the -prof option enables profiling, while -auto-all instruments the executable by putting a cost centre (a named point of measurement) at every top-level declaration. If you want more fine-grained heap profiles, you can put SCC pragmas at any expression within the program. Consult the documentation of GHC for further details.

Local profiling[edit]

Local profiling is simple: just invoke the program through hp2any-graph and pass it the necessary parameters:

hp2any-graph -e heaptest -- +RTS -hc -K100M -i0.02

Everything after the -- is passed to the slave process as command line parameter. In this case, we are passing profiling related parameters to the runtime (+RTS). In particular, we ask for a heap profile by cost centre stack (-hc), set a big stack so the program can keep running for a little while (-K), and set a profile sample rate that’s higher than the default (-i).

Don’t be surprised if the animation is not smooth, as the heap profile might be aggressively buffered by the operating system. When a program is more complex and there are more active cost centres, one can see the samples almost as they are produced. That’s also the reason to increase the sampling rate for this small example.

Remote profiling[edit]

In order to access the heap profile of a remote process, we need a server that relays the information to the grapher. This is essentially the same as above, except we use hp2any-relay and also specify a port number to listen on:

hp2any-relay -p 5678 -e heaptest -- +RTS -hc -K100M -i0.02

We can connect to such a relay by starting the grapher in remote mode, where we only give it a server address:

hp2any-graph -s localhost:5678

It is possible to attach several viewers to the same relay at the same time. Each grapher sees only the samples produced after it was attached.

Viewing later[edit]

The grapher is not capable of viewing heap profiles of processes that already finished. The history manager (hp2any-manager, in a separate package) takes care of that duty, providing a much more comfortable interface.

hp2any-manager[edit]

The history manager is a simple application that can display heap profiles of Haskell programs. Graphs are arranged like in a tiling window manager. Here’s a screenshot to give you an idea (click for higher resolution):

The hp2any interface.

The main window is divided into columns, and each column can hold several graphs. New columns can be added with the + button on the right hand side, while each column can be closed with its respective Close button at the top.

Heap profiles can be loaded by clicking the Open button at the bottom of the column we want them to appear in, and they can also be specified as command line arguments. The open dialog has multi-selection enabled. If more than one .hp file is selected, all of them will be loaded in the same column.

Each graph has a header with some buttons. The first button brings up a menu with some viewing options, the second and third can be used to move the graph to neighbouring columns, and the last one closes the file. Graphs can be zoomed in and out using the mouse wheel, and navigated using the scroll bar below them. The view is automatically zoomed to fit the highest point of the graph section shown. The actual coordinates are shown on a tooltip.

Besides the graph, every profile window shows a list of cost centres. The list can be reordered according to the total cost by clicking on the header of the second column. As the mouse moves over the graph, the corresponding item is highlighted in the list. The colour black does not correspond to a single cost centre in the list. It is used for (the sum of) costs below a certain value (256?).

Future work[edit]

The project also aims at replacing hp2ps by reimplementing it in Haskell and possibly adding new output formats. The manager application shall be extended to display and compare the graphs in more ways, to export them in other formats and also to support live profiling right away instead of delegating that task to hp2any-graph.