Xmonad/Guided tour of the xmonad source

Introduction

Do you know a little Haskell and want to see how it can profitably be applied in a real-world situation? Would you like to quickly get up to speed on the xmonad source code so you can contribute modules and patches? Do you aspire to be as cool of a hacker as the xmonad authors? If so, this might be for you. Specifically, this document aims to:

Provide a readable overview of the xmonad source code for Haskell non-experts interested in contributing extensions or modifications to xmonad, or who are just curious.

Highlight some of the uniquenesses of xmonad and the things that make functional languages in general, and Haskell in particular, so ideally suited to this domain.

This is not a Haskell tutorial. I assume that you already know some basic Haskell: defining functions and data; the type system; standard functions, types, and type classes from the Standard Prelude; and at least a basic familiarity with monads. With that said, however, I do take frequent detours to highlight and explain more advanced topics and features of Haskell as they arise.

First things first

You'll want to have your own version of the xmonad source code to refer to as you read through the guided tour. In particular, you'll want the latest darcs version, which you can easily download by issuing the command:

darcs get http://code.haskell.org/xmonad

You may also want to refer to the Haddock-generated documentation (it's all in the source code, of course, but may be nicer to read this way). You can build the documentation by going into the root of the xmonad source directory, and issuing the command:

runhaskell Setup haddock

which will generate HTML documentation in dist/doc/html/xmonad/.

Without further ado, let's begin!

StackSet.hs

StackSet.hs is the pure, functional heart of xmonad. Far removed from corrupting pollutants such as the IO monad and the X server, it is a beatiful, limpid pool of pure code which defines most of the basic data structures used to store the state of xmonad. It is heavily validated by QuickCheck tests; the combination of good use of types and QuickCheck validation means that we can be very confident of the correctness of the code in StackSet.hs.

`StackSet`

The StackSet data type is the mother-type which stores (almost) all of xmonad's state. Let's take a look at the definition of the StackSet data type itself:

data StackSet i l a sid sd =
    StackSet { current  :: !(Screen i l a sid sd)    -- ^ currently focused workspace
             , visible  :: [Screen i l a sid sd]     -- ^ non-focused workspaces, visible in xinerama
             , hidden   :: [Workspace i l a]         -- ^ workspaces not visible anywhere
             , floating :: M.Map a RationalRect      -- ^ floating windows
             } deriving (Show, Read, Eq)

First of all, what's up with i l a sid sd? These are type parameters to StackSet---five types which must be provided to form a concrete instance of StackSet. It's not obvious just from this definition what they represent, so let's talk about them first, so we have a better idea of what's going on when they keep coming up later.

The first type parameter, here represented by i, is the type of workspace tags. Each workspace has a tag which uniquely identifies it (and which is shown in your status bar if you use the DynamicLog extension). At the moment, these tags are simply Strings---but, as you can see, the definition of StackSet doesn't depend on knowing exactly what they are. If, in the future, the xmonad developers decided that Complex Doubles would make better workspace tags, no changes would be required to any of the code in StackSet.hs!

The second type parameter l is somewhat mysterious---there isn't much code in StackSet.hs that does much of anything with it. For now, it's enough to know that the type l has something to do with layouts; StackSet is completely independent of particular window layouts, so there's not much to see here.

The third type parameter, a, is the type of a single window.

sid is a screen id, which identifies a physical screen; as we'll see later, it is (essentially) Int.

sd, the last type parameter to StackSet, represents details about a physical screen.

Although it's helpful to know what these types represent, it's important to understand that as far as StackSet is concerned, the particular types don't matter. A StackSet simply organizes data with these types in particular ways, so it has no need to know the actual types.

The StackSet data type has four members: current stores the currently focused workspace; visible stores a list of those workspaces which are not focused but are still visible on other physical screens; hidden stores those workspaces which are, well, hidden; and floating stores any windows which are in the floating layer.

A few comments are in order:

visible is only needed to support multiple physical screens with Xinerama; in a non-Xinerama setup, visible will always be the empty list.

Notice that current and visible store Screens, whereas hidden stores Workspaces. This might seem confusing until you realize that a Screen is really just a glorified Workspace, with a little extra information to keep track of which physical screen it is currently being displayed on:

data Screen i l a sid sd = Screen { workspace :: !(Workspace i l a)
                                  , screen :: !sid
                                  , screenDetail :: !sd }
    deriving (Show, Read, Eq)

A note about those exclamation points, as in workspace :: !(Workspace i l a): they are strictness annotations which specify that the fields in question should never contain thunks (unevaluated expressions). This helps ensure that we don't get huge memory blowups with fields whose values aren't needed for a while and lazily accumulate large unevaluated expressions. Such fields could also potentially cause sudden slowdowns, freezing, etc. when their values are finally needed, so the strictness annotations also help ensure that xmonad runs smoothly by spreading out the work.

The floating field stores a Map from windows (type a,remember?) to RationalRects, which simply store x position, y position, width, and height. Note that floating windows are still stored in a Workspace in addition to being a key of floating, which means that floating/sinking a window is a simple matter of inserting/deleting it from floating, without having to mess with any Workspace data.

`StackSet` functions

StackSet.hs also provides a few functions for dealing directly with StackSet values: new, view, and greedyView. For example, here's new:

new :: (Integral s) => l -> [i] -> [sd] -> StackSet i l a s sd
new l wids m | not (null wids) && length m <= length wids = StackSet cur visi unseen M.empty
  where (seen,unseen) = L.splitAt (length m) $ map (\i -> Workspace i l Nothing) wids
        (cur:visi)    = [ Screen i s sd |  (i, s, sd) <- zip3 seen [0..] m ]
                -- now zip up visibles with their screen id
new _ _ _ = abort "non-positive argument to StackSet.new"

If you're new (haha) to Haskell, this might seem dauntingly complex, but it isn't actually all that bad. In general, if you just take things slowly and break them down piece by piece, you'll probably be surprised how much you understand after all.

new takes a layout thingy (l), a list of workspace tags ([i]), and a list of screen descriptors ([sd]), and produces a new StackSet. First, there's a guard, which requires wids to be nonempty (there must be at least one workspace), and length m to be at most length wids (there can't be more screens than workspaces). If those conditions are met, it constructs a StackSet by creating a list of empty Workspaces, splitting them into seen and unseen workspaces (depending on the number of physical screens), combining the seen workspaces with screen information, and finally picking the first screen to be current. If the conditions on the guard are not met, it aborts with an error. Since this function will only ever be called internally, the call to abort isn't a problem: it's there just so we can test to make sure it's never called! If this were a function which might be called by users from their xmonad.hs configuration file, aborting would be a huge no-no: by design, xmonad should never crash for any reason (even user stupidity!).

Now take a look at view and greedyView. view takes a workspace tag and a StackSet, and returns a new StackSet in which the given workspace has been made current. greedyView only differs in the way it treats Xinerama screens: greedyView will always swap the requested workspace so it is now on the current screen even if it was already visible, whereas calling view on a visible workspace will just switch the focus to whatever screen it happens to be on. For single-head setups, of course, there isn't any difference in behavior between view and greedyView.

Note that view/greedyView do not modify a StackSet, but simply return a new one computed from the old one. This is a common purely functional paradigm: functions which would modify a data structure in an imperative/non-pure paradigm are recast as functions which take an old version of a data structure as input and produce a new version. This might seem horribly inefficient to someone used to a non-pure paradigm, but it actually isn't, for (at least) two reasons. First, a lot of work has gone into memory allocation and garbage collection, so that in a modern functional language such as Haskell, these processes are quite efficient. Second, and more importantly, the fact that Haskell is pure (modifying values is not allowed) means that when a new structure is constructed out of an old one with only a small change, usually the new structure can actually share most of the old one, with new memory being allocated only for the part that changed. In an impure language, this kind of sharing would be a big no-no, since modifying the old value later would suddenly cause the new value to change as well.

`Workspace`

The Workspace type is quite simple. It stores a tag, a layout, and possibly a Stack:

data Workspace i l a = Workspace  { tag :: !i, layout :: l, stack :: Maybe (Stack a) }
    deriving (Show, Read, Eq)

If there are no windows in a given workspace, stack will be Nothing; if there are windows, it will be Just s, where s is a non-empty Stack of windows.

There's not much else to say about it, which makes this a perfect chance to talk about record syntax. The basic way to define the Workspace type would be:

data Workspace i l a = Workspace i l (Maybe (Stack a))

This simply specifies a single constructor for the Workspace type (perhaps somewhat confusingly, also called Workspace, although these are two different things) which has three components, of types i, l, and Maybe (Stack a), respectively. The record syntax in the actual code wraps the components in curly braces, and adds a name associated with each component. These names automatically turn into accessor functions which allow us to extract the corresponding component from a value of type Workspace i l a. For example, tag becomes a function of type

tag :: Workspace i l a -> i

Hence, we have two ways to get at the internals of any value whose type is defined using record syntax: pattern-matching, or accessor functions.

`Stack`

The Stack type stores a list of the actual windows on a given workspace, along with a notion of the "current" window. Now, the "obvious" way to do this in an imperative language would be to store an array of windows along with an index into the array. However, this approach has several disadvantages:

Creating a new window or deleting the current one would be O(n) operations, as all the windows to the right of the current location would have to be shifted by one in the array.
In Haskell, indexing into a list is O(n) anyway, and using an array library would be unwieldy here.
Much work must go into maintaining guarantees such as always having the current index be a valid index into the array, maintaining the ordering of the windows when shifting them around in the array, and so on.

Instead, a Stack uses an ingenious structure known as a list zipper:

data Stack a = Stack { focus  :: !a        -- focused thing in this set
                     , up     :: [a]       -- clowns to the left
                     , down   :: [a] }     -- jokers to the right
    deriving (Show, Read, Eq)

Instead of using a single list with some sort of index, the list is broken into three pieces: a current window (focus), the windows before that, in reverse order (up), and the windows after it (down). This has several nice properties:

A Stack a cannot be empty, since it must always contain a current element. Remember, the possibility of an empty workspace is handled by the type of Workspace's stack field, Maybe (Stack a).
Shifting focus, adding a new window next to the current one, and reversing the window list are all simple O(1) operations.
There is not even the possibility of any sort of index-out-of-bounds errors while keeping track of the current window.

For more information on zippers, the Zipper page on the Haskell wiki and the chapter on zippers in the Haskell wikibook are good starting places.

Other functions

At this point you should spend some time studying the rest of the functions in StackSet.hs, which provide various operations on Stacks and StackSets. There are quite a few, but they are, for the most part, quite straightforward. Some general notes and commentary:

The functions with, modify, and modify' are great examples of higher-order functions, functions which take other functions as input. Haskell (and most functional languages) make such a thing easy and natural. For example, with applies a function to the current workspace's stack; modify essentially transforms a function on Stacks to a function on StackSets, with some Maybe types thrown in to handle empty cases.

The names of the functions integrate and differentiate may strike you as odd unless you know that there is an astonishing connection between derivatives (yes, from calculus) and zipper types. In short, finding the zipper of a given data type corresponds to finding a derivative. For more information, see the Haskell wikibook entry on zippers, or the paper by Conor McBride, The Derivative of a Regular Type is its Type of One-Hole Contexts.

Note how the implementation of functions such as focusUp/Down, swapUp/Down, and reverseStack are quite simple, thanks to higher-order functions and the zipper structure of Stacks.

I sort of lied when I said that moving focus is O(1) with the zipper structure: in the one case that focus wraps around the end of the list, it is O(n). But it is still takes O(1) amortized time.

QuickCheck properties

Now take a look at tests/Properties.hs, which contains a large collection of QuickCheck properties for testing StackSet.hs. Here's the way it works: a QuickCheck property is essentially a function of type a -> Bool which should always evaluate to True for any value of type a. (Actually, it's a little more complex than this: using a special return type Property instead of Bool, a property can also specify only a subset of values for which the property should hold.) The type a must be an instance of the type class Arbitrary, which defines a way to generate random values of that type. Running QuickCheck on a property generates a large amount of random test data of the appropriate type, and ensures that the property always evaluates to true; if not, it displays the input for which the property failed. Of course, this is not proof that the property is always true, but a set of well-defined properties is surprisingly good at catching bugs and forgotten corner cases.

These tests are run automatically with every recorded change to the xmonad source repository -- indeed, they must all pass in order for a change to be recorded.

Core.hs

The next source file to examine is Core.hs. It defines several core data types and some of the core functionality of xmonad. If StackSet.hs is the heart of xmonad, Core.hs is its guts.

`XState`, `XConf`, and `XConfig`

These three record types make up the core of xmonad's state and configuration:

A value of type XState stores xmonad's mutable runtime state, consisting of the list of workspaces, a set of mapped windows, something to do with keeping track of pending UnmapEvents, and something to do with dragging.

data XState = XState
    { windowset    :: !WindowSet           -- ^ workspace list
    , mapped       :: !(S.Set Window)      -- ^ the Set of mapped windows
    , waitingUnmap :: !(M.Map Window Int)  -- ^ the number of expected UnmapEvents
    , dragging     :: !(Maybe (Position -> Position -> X (), X ())) }

Note that the WindowSet type is just a StackSet with various concrete types substituted for its type parameters:

type WindowSet   = StackSet  WorkspaceId (Layout Window) Window ScreenId ScreenDetail
type WindowSpace = Workspace WorkspaceId (Layout Window) Window

-- | Virtual workspace indicies
type WorkspaceId = String

-- | Physical screen indicies
newtype ScreenId    = S Int deriving (Eq,Ord,Show,Read,Enum,Num,Integral,Real)

-- | The 'Rectangle' with screen dimensions and the list of gaps
data ScreenDetail   = SD { screenRect :: !Rectangle
                         , statusGap  :: !(Int,Int,Int,Int) -- ^ width of status bar on the screen
                         } deriving (Eq,Show, Read)

The type of workspace tags, WorkspaceId, is just String; a ScreenDetail stores information about the screen dimension as well as any gaps which should be left at the edges of the screen for status bars and other such things. Now, you may wonder why we have

newtype ScreenId = S Int deriving (...)

rather than just type ScreenId = Int? The reason is that if type ScreenId = Int, it would be possible to accidentally do arithmetic with ScreenIds mixed with other Int values, which clearly doesn't make sense. Making ScreenId a separate type means that the type system will enforce non-mixing of ScreenIds with other types, while using newtype with automatic instance deriving means none of the convenience is lost -- we can write code just as if ScreenIds are normal Int values, but be sure that we can't accidentally get mixed up and do something silly like add a ScreenId to the width of a window. This is the power of a rich static type system like Haskell's -- we can encode certain invariants and constraints in the type system, and have them automatically checked at compile time.

An XConf record stores xmonad's (immutable) configuration data, such as window border colors, the keymap, information about the X11 display and root window, and other user-specified configuration information. The reason this record is separated from XState is that, as we'll see later, xmonad's code provides a static guarantee that the data stored in this record is truly read-only, and cannot be changed while xmonad is running.

XConfig provides a way for the user to customize xmonad's configuration, by defining an XConfig record in their xmonad.hs file. You're probably already familiar with this record type.

The `X` monad

And now, what you've all been waiting for: the X monad!

newtype X a = X (ReaderT XConf (StateT XState IO) a)
    deriving (Functor, Monad, MonadIO, MonadState XState, MonadReader XConf)

The X monad represents a common pattern for building custom monad instances: using monad transformers, one can simply 'layer' the capabilities and effects of several monads into one, and then use GHC's newtype deriving capabilities to automatically derive instances of the relevant type classes. In this case, the base monad out of which the X monad is built is IO; this is necessary since communicating with the X server involves IO operations. On top of that is StateT XState, which automatically threads a mutable XState record through computations in the X monad; finally there is a ReaderT XConf which also threads a read-only XConf record through. As noted in the comments in the source, the XState record can be accessed with any functions in the MonadState type class, such as get, put, gets, and modify; the XConf record can be accessed with MonadReader functions, such as ask.

For more information on monad transformers in general, I recommend reading Martin Grabmüller's excellent tutorial paper, Monad Transformers Step-by-Step; for more information on this particular style of composing monad transformers and using automatic newtype deriving, read Cale Gibbard's tutorial, How To Use Monad Transformers.

Along with the X monad, several utility functions are provided, such as runX, which turns an action in the X monad into an IO action, and catchX, which provides error handling for X actions. There are also several higher-order functions provided for convenience, including withDisplay (apply a function producing an X action to the current display) and withWindowSet (apply a function to the current window set).

Xmonad/Guided tour of the xmonad source

Contents

Introduction

First things first

StackSet.hs

`StackSet`

`StackSet` functions

`Workspace`

`Stack`

Other functions

QuickCheck properties

Core.hs

`XState`, `XConf`, and `XConfig`

The `X` monad

`ManageHook`

Layout and Messages

Utility functions

On-the-fly recompilation

Navigation menu

Search

Xmonad/Guided tour of the xmonad source

Introduction

First things first

StackSet.hs

StackSet

StackSet functions

Workspace

Stack

Other functions

QuickCheck properties

Core.hs

XState, XConf, and XConfig

The X monad

ManageHook

Layout and Messages

Utility functions

On-the-fly recompilation

Navigation menu

Search

`StackSet`

`StackSet` functions

`Workspace`

`Stack`

`XState`, `XConf`, and `XConfig`

The `X` monad

`ManageHook`