GHC/SuperClass

From HaskellWiki
< GHC
Revision as of 17:12, 4 February 2023 by Ysangkok (talk | contribs) (fix links)
Jump to navigation Jump to search

This page documents several techniques and issues:

  • using SuperClass Constraints. That is, Constraints declared in the context of a class;
  • specifically using SuperClass Type Equality Constraints to simulate Functional Dependencies;
  • using repeated type variables in Instances to test for type equality; and
  • The interaction between Functional Dependencies and Instance Overlaps,
    a type-level programming technique pioneered by Hlist 2004.


Expressive Power of SuperClass Constraints

Superclass constraints are not subject to the Paterson conditions. That is, superclass constraints can be written that are not permitted as instance constraints.

  • (Superclass constraints are required to be non-cyclic, which ensures they're terminating.)
  • (Instance constraints can be more liberal than the Paterson and Coverage Conditions with UndecidableInstances; but then might not be terminating.)

These constraints are valid SuperClass, but not per (decidable) Instance:

class (F a b ~ b) => C a b ...
-- equivalently
class (D a b b) => C a b ...

Sometimes even though you have a type function, you can use knowledge of the result to 'improve' the parameters -- that is, a limited form of injectivity.

  • The classic case is adding type-level Naturals.
    • If you know the result and one param, you can infer the other param.
  • Maybe even type-level Boolean And:
    • if the result is True, so must be the params.
    • if the result is False, and you know one param is True, the other must be False.
    • but that can't be a function, because if the result is False,
      and you know one param is False, that tells nothing about the other param.
--ghc 7.10, using EmptyDataDecls as Type Booleans, rather than using DataKinds

{-# LANGUAGE   MultiParamTypeClasses, TypeFamilies, FunctionalDependencies, 
               EmptyDataDecls, FlexibleInstances   #-}

data TTrue; data TFalse
class (F1 a b ~ c, F1 b a ~ c, F3 a b c ~ b, F3 b a c ~ a)
   => TAnd a b c  where
  tAnd :: a -> b -> c
  tAnd _ _ = undefined

{-  those F3 constraints are accepted,
    even though Type family application no smaller than class Head
-}

instance (F1 a b ~ c, F1 b a ~ c, F3 a b c ~ b, F3 b a c ~ a)
      => TAnd a b c  where     -- (need to repeat the constraints on the instance)
     tAnd _ _ = undefined

type family F1 a b
type instance F1 TTrue b = b
type instance F1 TFalse b = TFalse
type family F3 a b c
type instance F3 a b TTrue = TTrue
type instance F3 TTrue b TFalse = TFalse
type instance F3 TFalse b TFalse = b         -- can't improve b


Functional dependency-like behaviour through superclass constraints

There's an idiom that's briefly mentioned in the User Guide under Equality constraints, but is more general and more powerful.

You get the power from a combination of extensions:

  • -XMultiParamTypeClasses [1]
  • Superclass constraints, which can also be MultiParam [2]
  • -XFunctionalDependencies [3]
  • and/or superclass Equality constraints with Type families, having the effect of FunDeps [4]
  • The overall effect can satisfy the FunDep Coverage conditions, which means you might not need -XUndecidableInstances [5]

The User Guide says:

Equality constraints can also appear in class and instance contexts. The former enable a simple translation of programs using functional dependencies into programs using family synonyms instead.

That's not wrong; but by talking about "class and instance contexts" in the same breath does fail to emphasise that:

  • whereas Instance constraints don't apply until after the instance has been selected
    (And it's a common newbie mistake to think they restrict the selection.)
  • superclass constraints apply during type improvement before selecting instances
    as described here "superclass context is critical ..." [thanks @DFeuer]
  • This behaviour means that even if you don't explicitly put a FunDep | ... -> ... on a class:
    • If you put a superclass equality constraint on a class, that induces a FunDep as explained;
      but just as much
    • If you put a superclass class constraint on a class, and that superclass has a FunDep, that also induces a FunDep.
    • and so on for a superclass constraint of a superclass constraint of a superclass constraint ....


Formal Description

Constraints in the class context can achieve the effect of Functional dependencies even without the explicit vertical bar and dependency arrow syntax in the class declaration. Either:

  • Using an Equality constraint to a Type family; or
  • A superclass constraint that does have an explicit Functional dependency.
  • Or a superclass constraint which is itself constrained in one of those ways.

A class declaration of the form

class C a b | a -> b

can equivalently be given as

class (F a ~ b) => C a b where
  type F a

That is, we represent every functional dependency (FD) a1 .. an -> b by an FD type family F a1 .. an and a superclass context equality F a1 .. an ~ b, essentially giving a name to the functional dependency. [See; using (~) needs -XTypeFamilies or -XGADTs.] In class instances, we define the type instances of FD families in accordance with the class head.

Or class C can equivalently be given as

class (D a b) => C a b ...

class D a b | a -> b
-- or
class (G a ~ b) => D a b where
  type G a

The class instances for D will closely follow those for C. So there is not much gain in expressivity. Consider also:

class (D a c) => E a b c ...

-- class D as above

for which the D instances will be more compact than E.

Note that D might not declare any methods, but merely be used for type improvement.

Method signatures for class C are not affected by either of these ways for achieving dependencies.


Here's a fancier example of a pseudo-FunDep reference this thread.

{-# LANGUAGE   MultiParamTypeClasses, TypeFamilies,
                             FlexibleInstances #-}

type family F a

class (F a ~ (b, c) ) => C a b c   where       -- (b, c) !!
  f1 :: a -> b
  f2 :: a -> c

-- Uses of `f1` happily improve the type for `b`.
-- Uses of `f2` happily improve the type for `c`.

Is this power made clear in the theoretical literature? It's kinda implicit in the TypeFamilies 2008 paper; but note that was written concurrently with Type Families development; and published before the work was completed. Specifically, supporting Type Families and (~) in superclass constraints was delivered a long time after the rest of the work.

Repeating type vars in Instance heads

Under the Haskell 2010 language standard, each type var in the Instance head must be unique Type Classes and Overloading, section 4.3.2 "the ui must all be distinct."

The GHC team back in 1997 discussed (briefly) lifting this restriction, section 4.6/Decision 5. "This decision is really part of Decision 4 ..." -- which is to do with relaxing the forms of the type parameters in Instance heads, to allow (for example) bare type variables or Type Synonyms. Thus repeated type variables are allowed with the FlexibleInstances flag.

One the one hand: yes of course

The key example introducing Functional Dependencies relies on repeating type vars:

class Collects e ce | ce -> e where ...
-- ce is a collection of elements e, with the ce determining e
instance Collects e (Set e) where ...
instance Collects e [e] where ...
-- a List of e taken as a Collection of e

It must be possible to repeat e in both the collection type and element. So that the 'result side' of the Functional Dependency is improved from the 'argument side'.

(At least, that was the case in 1997. Nowadays with (~) Equality constraints, those Instances could be:

instance (e ~ e') => Collects e (Set e') where ...
instance (e ~ e') => Collects e [e'] where ...

This has the downside that GHC can't see how the element type e is obtained from the collection type (despite the (~) constraint), so needs UndecidableInstances.)

On the other hand: here be dragons

The 1997 discussion also considered Overlapping Instances, Section 4.4/Decision 3; which is full of caveats and awkward use cases -- that's without detailing the complications from overlapping instances declared in separate modules Warning at the end of section.

But as soon as there's an Instance head with repeated type vars, there might be another head with distinct type vars in those positions. And if those are on the 'argument side' of a Functional Dependency (or there is no Functional Dependency), it has a different significance vs. improving a type:

  • in order to select an instance with repeated type variable the compiler must first infer each type in full, then compare them; to distinguish
  • any instance with distinct type variables, which is therefore overlapping.

This gives in effect a type-level test for equality.

Consider this (unrealistic) instance for Collects:

instance Collects e (e, e, e) where ...

This says a three-element tuple can hold a collection, providing the elements are all the same type; and in that case the compiler can infer the result type as bare e.

The 1997 paper explains

Permitting repeated type variables in the instance type of an instance declaration slightly complicates the process of matching a candidate instance declaration against a constraint requiring full matching (i.e. one-way unification, a well understood algorithm). For example when matching the instance head Foo (α, α) against a constraint Foo (τ1, τ2) one must first bind α to τ1 and then check for equality between the now-bound α and τ2.

So type inference is blocked until that equality check can be resolved.

UndecidableInstances considered suspect

Opinions vary as to whether the UndecidableInstances flag (whose purpose is to lift the Paterson and Coverage conditions) can lead to semantic incoherence. See Trac #8634, especially SPJ's detailed case analysis at Comment 30 for a long discussion of implications.

As GHC stands at 8.0, UndecidableInstances are unavoidable even for a simple type-level type equality test. In fact, note how many extensions are needed:

--ghc 7.10

{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies,
             FlexibleInstances, UndecidableInstances, 
             TypeFamilies     #-}

data TTrue
data TFalse

class TEq a b | a -> b
instance TEq (a, a) TTrue  -- repeated typevars needs FlexibleInstances

{- For the non-equal case, we'd like to write:

instance TEq (a, a') TFalse

-- but a) that doesn't overlap the instance for equal
--        (not more general, because TFalse is 'apart' from TTrue.)
--     b) and/so GHC complains "instances incompatible with FunDeps"

-- instead we must put:
-}

instance (f ~ TFalse) => TEq (a, a') f 

      -- using the (~) constraint needs TypeFamilies (or GADTs)

The main problem now for type coherence is that bare typevar f does not contain type vars from the Instance head, so GHC complains "the coverage condition fails":

  • This complaint is despite the (~) constraint
    • You might hope GHC could see the ( ~ TFalse), and figure out that inference for f is terminating.
    • But no, so you need UndecidableInstances.
    • And that is possibly more liberal coverage than you really wanted.