# CTRex

### From HaskellWiki


# 1 Introduction

This page describes the design, usage and motivation for CTRex.

CTRex is a library for Haskell which implements extensible records using closed type families, datakinds and type literals. It does **not** use overlapping instances.

Features:

- Row-polymorphism

- Support for scoped labels (i.e. duplicate labels) **and** non-scoped labels (i.e. the lacks predicate on rows).

- The value level interface and the type level interface correspond to each other.

- The order of labels (except for duplicate labels) does not matter, i.e. {x = 0, y = 0} and {y = 0, x = 0} have the **same type**.

- Syntactic sugar on the value level as well as type level.

- If all values in a record satisfy a constraint such as Show, then we are able to do operations on all fields in a record, if that operation only requires that the constraint is satisfied. In this way we can create instances such as Forall r Show => Show (Rec r). This is available to the application programmer as well.

- Fast extend, lookup and restriction (all O(log n)) using HashMaps.

The haddock documentation is available here.

# 2 What the hell are extensible records?

## 2.1 Basic extensible records

Records are values that contain other values, which are indexed by name (label). Examples of records are structs in C. In Haskell, we can currently declare record types as follows:

data HRec = HRec { x :: Int, y :: Bool, z :: String }

data HRec2 = HRec2 { p :: Bool, q :: Char }

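Extensible records are records to which we can add new fields after they have been created. Given a record r, we can for example extend it with a field z:

` extend z "Bla" r`

The labels of a record, together with the types of the values stored under them, are called its **row**.

Standard Haskell records are typed **nominally**, which means that we see if two record types are the same by checking the **names** of the records. For example, to check if the type HRec is the same as the type HRec2, we only compare the names HRec and HRec2.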

In an extensible record system the record types are **structural**: two records have the same type if they carry the same fields with the same types, i.e. if they have the same row. This also means we do not have to declare the type of the record before using it. For example (in CTRex):

x := 0 .| y := False .| empty

has type

Rec ("x" ::= Int :| "y" ::= Bool :| Empty)

Because the associated type for a label is in the row of the record, labels can be used in different records for different types. This is currently not possible with standard records:

data X = X { x :: Int, y :: Bool }
data Y = Y { x :: Bool }

This gives a Multiple declarations of `x' error.

To summarize, extensible records have the following advantages:

- Labels can be used in different records for different types
- Records do not have to be declared before use.
- Structural typing eliminates the need for explicit conversion functions.

## 2.2 Row polymorphism

Some extensible record systems, such as CTRex, support **row polymorphism**. This concept is best explained by an example. Consider the following function (in CTRex):

f :: Rec ("x" ::= Double :| "y" ::= Double :| Empty) -> Rec ("x" ::= Double :| "y" ::= Double :| "norm" ::= Double :| Empty)
f r = norm := sqrt ((r.!x * r.!x) + (r.!y * r.!y)) .| r

This function only accepts records that have exactly the fields x and y. With row polymorphism we can instead make it **polymorphic** in the rest of the row as follows:

f :: ((r :! "x") ~ Double, (r :! "y") ~ Double) => Rec r -> Rec ("norm" ::= Double :| r)
f r = norm := sqrt ((r.!x * r.!x) + (r.!y * r.!y)) .| r

The type that is inferred for f is even more general:

f :: (Floating t, (r :! "y") ~ t, (r :! "x") ~ t) => Rec r -> Rec (Extend "norm" (r :! "x") r)
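For readers who want to experiment without CTRex, the requirement "the record has a field x of type Double, and we do not care about the rest" can be approximated in plain GHC with HasField constraints from GHC.Records. This is only an analogy and not CTRex code: Point2, Point3 and norm2 are made-up names, and unlike f above, norm2 only selects fields and does not extend the record.

```haskell
{-# LANGUAGE DataKinds, DuplicateRecordFields, FlexibleContexts, TypeApplications #-}

import GHC.Records (HasField (getField))

-- Two hypothetical record types that share the fields x and y.
data Point2 = Point2 { x :: Double, y :: Double }
data Point3 = Point3 { x :: Double, y :: Double, z :: Double }

-- Works for any record type r that has Double fields x and y,
-- whatever other fields r may carry.
norm2 :: (HasField "x" r Double, HasField "y" r Double) => r -> Double
norm2 r = sqrt (getField @"x" r * getField @"x" r + getField @"y" r * getField @"y" r)

main :: IO ()
main = do
  print (norm2 (Point2 { x = 3, y = 4 }))
  print (norm2 (Point3 { x = 3, y = 4, z = 12 }))
```

The constraint HasField "x" r Double plays a role similar to (r :! "x") ~ Double in the signature of f: it demands one field and leaves the rest of the record open.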

## 2.3 Difference between heterogeneous maps and extensible records

A question that may arise after the previous section is: What is the difference between extensible records and heterogeneous maps? A heterogeneous map is a map that can store values of different types, see for example the HMap package (blatant plug, my package).

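In heterogeneous maps the type associated with a key is present **in the type of the key**. For example, in HMap the type of a key includes the type of the value that is stored under that key.

In extensible records the type associated with a label is instead present **in the type of the record**, i.e. in its row. The row states, for example, that a given label has a given type. If we have a record with such a row, we can be **sure** that this record has a value of that type for that label.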

# 3 Programmer interface

## 3.1 Labels

Labels (such as x,y and z) in CTRex are type level symbols (i.e. type level strings). We can point to a label by using the label type:

data Label (s :: Symbol) = Label

For example, we can declare shorthands for pointing at the type level symbol "x", "y" and "z" as follows.

x = Label :: Label "x"
y = Label :: Label "y"
z = Label :: Label "z"
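The following self-contained sketch shows that such a label carries its name only in its type, and how the name can be recovered at run time through KnownSymbol. The Label declaration is copied from above; labelName is a hypothetical helper introduced here for illustration and is not claimed to be part of CTRex.

```haskell
{-# LANGUAGE DataKinds, KindSignatures #-}

import GHC.TypeLits (KnownSymbol, Symbol, symbolVal)

-- A label carries its name only in its type; the value is a plain token.
data Label (s :: Symbol) = Label

-- Hypothetical helper (an assumption, not a CTRex function):
-- recover the name of a label as a run time String.
labelName :: KnownSymbol s => Label s -> String
labelName = symbolVal

x :: Label "x"
x = Label

y :: Label "y"
y = Label

main :: IO ()
main = mapM_ putStrLn [labelName x, labelName y]
```

A conversion of this kind is what allows an implementation to use the label name as a key in an ordinary map.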

## 3.2 Rows and records

A record has type Rec r, where r :: Row * is the row of the record. The constructors of Rec and Row are not exported. Hence, we can only manipulate records and rows by the value and type level operations given in the CTRex module.

## 3.3 Operations

For all operations available on records, the value level interface and the type level interface correspond to each other.

For example, the value level operation for extending a record (adding a field) has type

extend :: KnownSymbol l => Label l -> a -> Rec r -> Rec (Extend l a r)

whereas the type level operation for adding a field has type

Extend :: Symbol -> * -> Row * -> Row *

In this way each value level operation (that changes the type) has a corresponding type level operation with a similar name. If the value level operation is not infix, the type level operation is named the same, but starting with a capital. If the value level operation is an operator, it starts with a '.' and the type level operation starts with a ':'.

The following operations are available:

- Empty record:
  - Value level: empty :: Rec Empty
  - Type level: Empty :: Row *
- Extension:
  - Value level: extend :: KnownSymbol l => Label l -> a -> Rec r -> Rec (Extend l a r)
  - Type level: Extend :: Symbol -> * -> Row * -> Row *
- Selection:
  - Value level: (.!) :: KnownSymbol l => Rec r -> Label l -> r :! l
  - Type level: (:!) :: Row * -> Symbol -> *
- Restriction:
  - Value level: (.-) :: KnownSymbol l => Rec r -> Label l -> Rec (r :- l)
  - Type level: (:-) :: Row * -> Symbol -> Row *
- Record merge:
  - Value level: (.++) :: Rec l -> Rec r -> Rec (l :++ r)
  - Type level: (:++) :: Row * -> Row * -> Row *
- Rename (This operation can also be expressed using the above operations, but this looks nicer):
  - Value level: rename :: (KnownSymbol l, KnownSymbol l') => Label l -> Label l' -> Rec r -> Rec (Rename l l' r)
  - Type level: Rename :: Symbol -> Symbol -> Row * -> Row *

## 3.4 Syntactic Sugar

We provide some handy declarations which allow us to chain operations with nicer syntax. For example we can write:

p :<-| z .| y :<- 'b' .| z :!= False .| x := 2 .| y := 'a' .| empty

instead of

rename z p $ update y 'b' $ extendUnique z False $ extend x 2 $ extend y 'a' empty

For this we have a GADT datatype RecOp which takes two arguments:

- c, the type of the constraint that should hold on the input row.
- rop, the row operation (see below).

This datatype has the following constructors, all of which are sugar for record operations.

- Record update. Sugar for update: (:<-) :: Label l -> a -> RecOp (HasType l a) RUp
- Record extension. Sugar for extend: (:=) :: KnownSymbol l => Label l -> a -> RecOp NoConstr (l ::= a)
- Record extension, without shadowing. Sugar for extendUnique (see the section on duplicate labels): (:!=) :: KnownSymbol l => Label l -> a -> RecOp (Lacks l) (l ::= a)
- Record label renaming. Sugar for rename: (:<-|) :: (KnownSymbol l, KnownSymbol l') => Label l' -> Label l -> RecOp NoConstr (l' ::<-| l)
- Record label renaming. Sugar for renameUnique (see the section on duplicate labels): (:<-!) :: (KnownSymbol l, KnownSymbol l', r :\ l') => Label l' -> Label l -> RecOp (Lacks l') (l' ::<-| l)

On the type level the same pattern arises again: we have a datakind (RowOp *) with the following constructors:

- RUp :: RowOp *. Identity row operation. Type level operation for (:<-).
- (::=) :: Symbol -> * -> RowOp *. Row extension operation. Sugar for Extend. Type level operation for (:=) and (:!=).
- (::<-|). Row renaming. Sugar for Rename. Type level operation for (:<-|) and (:<-!).

We then have a type level operation to perform a row operation:

(:|) :: RowOp * -> Row * -> Row *

And a value level operation to perform a record operation:

(.|) :: c r => RecOp c ro -> Rec r -> Rec (ro :| r)

Notice that the constraint from the record operation is placed on the input row.

Also notice that this means that this sugar is also available when writing types:

Rec ("p" ::<-| "z" :| RUp :| "z" ::= Bool :| "x" ::= Double :| "y" ::= Char :| Empty)

is the type exactly corresponding to:

p :<-| z .| y :<- 'b' .| z :!= False .| x := 2 .| y := 'a' .| empty

and equivalent to

Rec (Rename "z" "p" (Extend "z" Bool (Extend "x" Double (Extend "y" Char Empty))))

and of course equivalent to:

Rec ("p" ::= Bool :| "x" ::= Double :| "y" ::= Char :| Empty)

## 3.5 Duplicate labels, and lacks

Rows and records can contain duplicate labels as described in the paper Extensible records with scoped labels by Daan Leijen.

Hence we can write:

z = x := 10 .| x := "bla" .| empty :: Rec ("x" ::= Int :| "x" ::= String :| Empty)

We can recover the information on the second instance of x by removing x:

z .- x :: Rec ("x" ::= String :| Empty)

The motivation for this is as follows: Suppose we have a function

f :: Rec ("x" ::= Int :| r) -> Rec ("x" ::= Bool :| r)

and we want to write the following function:

g :: Rec r -> Rec ("p" ::= String :| r)
g r = let r' = f (x := 10 .| r)
          (c, r'') = (r' .! x, r' .- x)
          v = if c then "Yes" else "Nope"
      in p := v .| r''

If it were not possible for records and rows to contain duplicate labels, the type of g would be:

g :: r :\ "x" => Rec r -> Rec ("p" ::= String :| r)

As another use case for duplicate labels: consider implementing an interpreter for some embedded DSL, where you want to carry the state of the variables in an extensible record. Declaring a new variable in the embedded language then causes us to extend the record. Since the embedded language allows shadowing (as most languages do), we can simply extend the record; we do not have to jump through hoops to make sure there are no duplicate labels. Once the variable goes out of scope, we restrict the record with the label to bring the old "variable" back into scope.

However, in other situations, duplicate labels may be undesired, for instance because we want to be sure that we do not hide previous information. For this reason we also provide the already introduced `lacks` constraint.

We also provide a handy record extension function that has this constraint, so that you do not have to add types yourself:

extendUnique :: (KnownSymbol l, r :\ l) => Label l -> a -> Rec r -> Rec (Extend l a r)

The same thing for renaming:

renameUnique :: (KnownSymbol l, KnownSymbol l', r :\ l') => Label l -> Label l' -> Rec r -> Rec (Rename l l' r)

We also provide a constraint to test that two Rows are **disjoint**. Corresponding to this we also provide a function to merge with this constraint:

(.+) :: Disjoint l r => Rec l -> Rec r -> Rec (l :++ r)

Notice that .+ is commutative, while .++ is not.
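The following minimal sketch illustrates why a merge that allows duplicate labels depends on the order of its operands, while a merge of disjoint operands does not. It is a monomorphic stand-in and not CTRex code: Scope, declare and merge are hypothetical names, every value is an Int, and Data.Map from the containers package takes the place of the record.

```haskell
import qualified Data.Map.Strict as M

-- Stand-in for a record with scoped labels:
-- the head of each list is the binding currently in scope.
type Scope = M.Map String [Int]

-- Analogue of extend: push a new binding in front of older ones.
declare :: String -> Int -> Scope -> Scope
declare l v = M.insertWith (++) l [v]

-- Analogue of .++ : on a shared label the bindings of the left operand come first.
merge :: Scope -> Scope -> Scope
merge = M.unionWith (++)

p, q, s :: Scope
p = declare "x" 1 M.empty
q = declare "x" 2 M.empty
s = declare "y" 3 M.empty

main :: IO ()
main = do
  print (merge p q == merge q p)  -- False: the shared label x ends up in a different order
  print (merge p s == merge s p)  -- True: disjoint operands commute
```

With the Disjoint constraint the first situation cannot occur, which is why .+ can be commutative.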

## 3.6 Constrained record operations

If some constraint c holds for all types in the row of a record, then the methods in the following type class are available:

class Forall (r :: Row *) (c :: * -> Constraint) where
  rinit :: CWit c -> (forall a. c a => a) -> Rec r
  erase :: CWit c -> (forall a. c a => a -> b) -> Rec r -> [(String,b)]
  eraseZip :: CWit c -> (forall a. c a => a -> a -> b) -> Rec r -> Rec r -> [(String,b)]

Here CWit is a **witness** of a constraint. Its definition is as follows:

data CWit (c :: * -> Constraint) = CWit

We can use this to specify the constraint which should hold on the row, since this may be ambiguous. We can use this type class to implement, for example, some standard type classes on records:

instance (Forall r Show) => Show (Rec r) where
  show r = "{ " ++ meat ++ " }"
    where meat = intercalate ", " binds
          binds = map (\(x,y) -> x ++ "=" ++ y) vs
          vs = erase (CWit :: CWit Show) show r

instance (Forall r Eq) => Eq (Rec r) where
  r == r' = and $ map snd $ eraseZip (CWit :: CWit Eq) (==) r r'

instance (Forall r Bounded) => Bounded (Rec r) where
  minBound = rinit (CWit :: CWit Bounded) minBound
  maxBound = rinit (CWit :: CWit Bounded) maxBound

We could make an interface to do even more general stuff on (pairs of) constrained records, but I have yet to find a use case for this.
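To see the witness pattern in isolation, here is a self-contained sketch that uses a plain heterogeneous list instead of a record. It is an assumption-laden illustration, not the CTRex implementation: HList, All and eraseAll are hypothetical names, there are no labels, and only an erase-like method is shown.

```haskell
{-# LANGUAGE ConstraintKinds, DataKinds, FlexibleContexts, FlexibleInstances,
             GADTs, KindSignatures, MultiParamTypeClasses, RankNTypes,
             TypeOperators, UndecidableInstances #-}

import Data.Kind (Constraint, Type)

-- Witness that names a constraint, mirroring CWit from the text.
data CWit (c :: Type -> Constraint) = CWit

-- Stand-in for a record: a heterogeneous list without labels.
data HList (ts :: [Type]) where
  HNil :: HList '[]
  (:&) :: t -> HList ts -> HList (t ': ts)
infixr 5 :&

-- "c holds for every element type", with an erase-like method.
class All (c :: Type -> Constraint) (ts :: [Type]) where
  eraseAll :: CWit c -> (forall a. c a => a -> b) -> HList ts -> [b]

instance All c '[] where
  eraseAll _ _ HNil = []

instance (c t, All c ts) => All c (t ': ts) where
  eraseAll w f (v :& vs) = f v : eraseAll w f vs

main :: IO ()
main = print (eraseAll (CWit :: CWit Show) show (True :& 'a' :& (3 :: Int) :& HNil))
-- prints ["True","'a'","3"]
```

The witness argument is what tells the type checker which constraint to use for the rank-2 function, since show alone does not determine it.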

## 3.7 Type errors

Here we list which type errors are reported when using CTRex:

- Record does not have field

typerr1 = (x := 1 .| empty) .! y + 1

No instance for (Num (Records.NoSuchField "y")) arising from a use of ‛+’
In the expression: (x := 1 .| empty) .! y + 1

Somewhat unsatisfactorily, the expression:

typerr1 = (x := 1 .| empty) .! y

does not immediately give a type error. Instead its type is:

typerr1 :: Records.NoSuchField "y"

- Record does not lack field

x :!= 1 .| x := 1 .| empty

Gives the error

Error: Couldn't match type ‛'Records.LabelNotUnique "x"’ with ‛'Records.LabelUnique "x"’
In the first argument of ‛(.|)’, namely ‛x :!= 1’

- Records not disjoint

notdisjoint = let p = x := 2 .| empty
                  q = x := 2 .| empty
              in p .+ q

Gives the error:

Couldn't match type ‛'Records.Duplicate "x"’ with ‛'Records.IsDisjoint’
Expected type: 'Records.IsDisjoint
Actual type: Records.DisjointR ('Records.R '["x" 'Records.:-> a4]) ('Records.R '["x" 'Records.:-> a])
In the expression: p .+ q
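The first error above is delayed because a failed selection still reduces to an ordinary type. The following sketch reproduces that behaviour on a plain type level list. It is a guess at the mechanism and not the CTRex source: Field and Lookup are hypothetical names, and NoSuchField is modelled on the name that appears in the error message.

```haskell
{-# LANGUAGE DataKinds, KindSignatures, TypeFamilies, TypeOperators #-}

import Data.Kind (Type)
import Data.Type.Equality ((:~:) (..))
import GHC.TypeLits (Symbol)

-- A label-type pair, used only at the type level.
data Field a = Symbol :-> a

-- Uninhabited marker type that a failed lookup reduces to (assumed definition).
data NoSuchField (l :: Symbol)

-- Closed type family: equations are tried from top to bottom.
type family Lookup (l :: Symbol) (fs :: [Field Type]) :: Type where
  Lookup l '[]                = NoSuchField l
  Lookup l ((l ':-> t) ': fs) = t
  Lookup l (f ': fs)          = Lookup l fs

-- A successful lookup reduces to the field type.
found :: Lookup "x" '["x" ':-> Int] :~: Int
found = Refl

-- A failed lookup is not a type error: it reduces to the marker type.
missing :: Lookup "y" '["x" ':-> Int] :~: NoSuchField "y"
missing = Refl

main :: IO ()
main = case (found, missing) of
  (Refl, Refl) -> putStrLn "both lookups reduce as expected"
```

An error only shows up once the result is used at a type for which NoSuchField "y" has no instance, as with + in the example. Newer versions of GHC provide TypeError in GHC.TypeLits, which could be used in the first equation to report the problem earlier.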

# 4 Implementation

The basis of the implementation is a label-type pair, which is represented by the following (unexported) datakind:

data LT a = Symbol :-> a

## 4.1 Rows

A row is then simply a list of such label-type pairs:

newtype Row a = R [LT a] -- constructor not exported

Notice that the constructor is not exported, so if we ask for the type of

origin2 = y := 0 .| x := 0 .| empty

we get:

origin2 :: Rec ('Records.R '["x" 'Records.:-> Double, "y" 'Records.:-> Double])

Here the implementation of Row leaks a bit. The user cannot write down this type, since Records.R is not exported, and neither is :->.

Instead the user should not worry about the implementation and write the type in terms of operations (or let the type be inferred), i.e.:

origin2 :: Rec ("x" ::= Double :| "y" ::= Double :| Empty)

Operations on rows are implemented using closed type families. The list of label-type pairs in the row is always sorted. Each operation defined on rows maintains this invariant. We are sure that no operations which violate this invariant can be created by the user, since the constructor is not exported.

To keep the list of label-type pairs sorted, we use the built-in closed type family:

(<=.?) :: Symbol -> Symbol -> Bool

which compares two symbols at compile time and gives a Bool datakind telling us whether the left is less than or equal to the right.

For instance, row extension is implemented as follows:

type family Extend (l :: Symbol) (a :: *) (r :: Row *) :: Row * where
  Extend l a (R x) = R (Inject (l :-> a) x)

type family Inject (p :: LT *) (r :: [LT *]) :: [LT *] where
  Inject (l :-> t) '[] = (l :-> t ': '[])
  Inject (l :-> t) (l' :-> t' ': x) = Ifte (l <=.? l') (l :-> t ': l' :-> t' ': x) (l' :-> t' ': Inject (l :-> t) x)
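The sorted insertion can be tried out with the following self-contained sketch, which checks at compile time that two insertion orders give the same list. It is a variation under stated assumptions, not the code above: it compares labels with CmpSymbol from GHC.TypeLits instead of <=.?, works on bare lists without the R wrapper, and the names Field and InjectBy are hypothetical.

```haskell
{-# LANGUAGE DataKinds, TypeFamilies, TypeOperators, UndecidableInstances #-}

import Data.Kind (Type)
import Data.Type.Equality ((:~:) (..))
import GHC.TypeLits (CmpSymbol, Symbol)

-- A label-type pair, used only at the type level.
data Field a = Symbol :-> a

-- Insert a pair into a list that is sorted by label.
type family Inject (f :: Field Type) (fs :: [Field Type]) :: [Field Type] where
  Inject f '[] = '[f]
  Inject (l ':-> t) ((l' ':-> t') ': fs) =
    InjectBy (CmpSymbol l l') (l ':-> t) (l' ':-> t') fs

-- Decide on the result of the comparison whether to stop or to recurse.
type family InjectBy (o :: Ordering) (new :: Field Type) (old :: Field Type)
                     (fs :: [Field Type]) :: [Field Type] where
  InjectBy 'GT new old fs = old ': Inject new fs
  InjectBy o   new old fs = new ': old ': fs

type R1 = Inject ("x" ':-> Int) (Inject ("y" ':-> Bool) '[])
type R2 = Inject ("y" ':-> Bool) (Inject ("x" ':-> Int) '[])

-- This only type checks if both insertion orders normalise to the same list.
sameRow :: R1 :~: R2
sameRow = Refl

main :: IO ()
main = case sameRow of
  Refl -> putStrLn "both insertion orders give the same row"
```

Because a new pair is placed in front of an existing pair with an equal label, the most recently added duplicate comes first.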

## 4.2 Records

To implement the records we introduce the following datatype which can contain anything:

data HideType where
  HideType :: a -> HideType

A record is then defined as follows:

-- | A record with row r.
data Rec (r :: Row *) where
  OR :: HashMap String (Seq HideType) -> Rec r

Here we see that a record is actually just a map from string to the sequence of values. Notice that it is a sequence of values and not a single value, because the record may contain duplicate labels.

Extension is then rather simple: it prepends the value to the sequence of values associated with the label:

extend :: KnownSymbol l => Label l -> a -> Rec r -> Rec (Extend l a r)
extend (show -> l) a (OR m) = OR $ M.insert l v m
  where v = HideType a <| M.lookupDefault S.empty l m

To safely convert back from HideType, we maintain the following invariant: The i-th value in the sequence associated with "x" has the i-th type associated with "x" in the row. This invariant is maintained by all operations on rows and records. Since the constructors of record and row are not exported, we know that it is impossible to declare new operations on records and rows that violate this invariant. A similar trick is used in HMap.

Since we know that the actual type of the value in HideType is given in the row, we can safely convert it back:

(.!) :: KnownSymbol l => Rec r -> Label l -> r :! l
(OR m) .! (show -> a) = x'
  where x S.:< t = S.viewl $ m M.! a
        -- notice that this is safe because of invariant
        x' = case x of HideType p -> unsafeCoerce p

Record merge concatenates, for each label, the sequences of values of both records:

(.++) :: Rec l -> Rec r -> Rec (l :++ r)
(OR l) .++ (OR r) = OR $ M.unionWith (><) l r
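The value level representation can be explored with the untyped sketch below. It is a simplification under stated assumptions and not the CTRex code: RawRec, emptyRaw, extendRaw, restrictRaw and selectRaw are hypothetical names, Data.Map and plain lists replace HashMap and Seq, and there is no row, so the caller has to assert the result type of a selection.

```haskell
{-# LANGUAGE GADTs #-}

import qualified Data.Map.Strict as M
import Unsafe.Coerce (unsafeCoerce)

-- Existential wrapper that forgets the type of a value.
data HideType where
  HideType :: a -> HideType

-- Untyped stand-in for Rec: each label maps to a stack of hidden values.
newtype RawRec = RawRec (M.Map String [HideType])

emptyRaw :: RawRec
emptyRaw = RawRec M.empty

-- Extension pushes onto the stack, shadowing older values.
extendRaw :: String -> a -> RawRec -> RawRec
extendRaw l a (RawRec m) = RawRec (M.insertWith (++) l [HideType a] m)

-- Restriction pops the newest value, uncovering the older one.
restrictRaw :: String -> RawRec -> RawRec
restrictRaw l (RawRec m) = RawRec (M.update pop l m)
  where
    pop (_ : rest@(_ : _)) = Just rest
    pop _                  = Nothing

-- Unsafe selection: the caller asserts the result type.
-- In CTRex this assertion is justified by the row instead.
selectRaw :: String -> RawRec -> Maybe a
selectRaw l (RawRec m) = case M.lookup l m of
  Just (HideType v : _) -> Just (unsafeCoerce v)
  _                     -> Nothing

main :: IO ()
main = do
  let r = extendRaw "x" (10 :: Int) (extendRaw "x" "bla" emptyRaw)
  print (selectRaw "x" r :: Maybe Int)                       -- Just 10
  print (selectRaw "x" (restrictRaw "x" r) :: Maybe String)  -- Just "bla"
```

Asking selectRaw for the wrong type is undefined behaviour in this sketch, which is exactly what the invariant between row and record rules out.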

# 5 Comparison with other approaches

Here we compare with other approaches.

## 5.1 HList records

The HList package provides extensible records built on heterogeneous lists. The differences with CTRex are as follows:

- Rows are not sorted. This has the following downsides:
  - Compile time operations, such as reordering the labels and checking for duplicate labels, are costly at compile time (link).
  - { x = 0, y = 0 } and { y = 0, x = 0 } do not have the same type. Hence, if we give a type declaration for the first and want to put in the latter, we need to call a manual conversion function.

- Constrained row operations are less convenient, since
  - the Record newtype needs to be unwrapped, and
  - there is no definition provided to create instances of applyAB that work on LVPair l a from instances that work on just a.

  However, there are similarities between the two:

CTRex function | HList function(s) |
---|---|
rinit | hReplicate |
erase | hMapOut |
eraseZip | hZip with hMapOut |

- Currently, the run time complexity of extend, restriction and lookup is O(n), versus O(log n) for CTRex. Proposals have been made to make this faster (paper).
- Duplicate labels are disallowed in HList.
- HList has nicer support for writing labels

## 5.2 Trex (Hugs)

The Haskell interpreter Hugs supports a type system extension that allows extensible records. The differences are as follows:

- Constrained row operations are not available(?)
- Duplicate labels are disallowed
- Record merge not available (?)
- Can pattern match on records(?)
- Nicer syntax
- More (?)

## 5.3 More

More to come! Please add stuff!

# 6 Wishlist

- Nicer syntax for Label :: Label "x"
- Nice syntax for records, i.e. { x = 0, y = 0 }.
- Type level Show thing, so that we can pretty-print types, and the implementation of things such as row does not leak.
- The ability to pattern match on records.
- Type level error, so that we can generate clearer error messages earlier.