OOP vs type classes

From HaskellWiki
Revision as of 10:36, 18 August 2006 by Bulatz (talk | contribs) (Added John's letter)
Jump to navigation Jump to search
The printable version is no longer supported and may have rendering errors. Please update your browser bookmarks and please use the default browser print function instead.

(this is just a sketch now. feel free to edit/comment it. i will include information you provided into the final version of this tutorial)


I'm almost not used type classes in my application programs, but when I'd gone to implement general-purpose libraries and tried to maintain as much flexibility as possible, it was natural to start build large and complex class hierarchies. I tried to use my C++ experience when doing this but I was many times bitten by the type classes restrictions. Now I think that I have better feeling and mind model for type classes and want to share it with other Haskellers, especially ones having OOP backgrounds.


Type classes is a sort of templates, not classes

At this moment C++/C#/Java languages has classes and templates/generics. What is a difference? With a class, type information carried with object itself while with templates it's outside of object and is part of the whole operation.

For example, if == operation is defined in a class, the actual procedure called for a==b may depend on run-time type of 'a' but if it is defined in template, actual procedure depends only on template instantiated (and determined at compile time).

Haskell's objects don't carry run-time type information. Instead, class constraint for polymorphic operation passed in form of "dictionary" implementing all operations of the class (there are also other implementation techniques, but this don't matter). For example,

eqList :: (Eq a) => [a] -> [a] -> Bool

translated into:

type EqDictionary a = (a->a->Bool, a->a->Bool)
eqList :: EqDictionary a -> [a] -> [a] -> Bool

where first parameter is "dictionary" containing implementation of "==" and "/=" operations for objects of type 'a'. If there are several class constraints, dictionary for each is passed.

If class has base class(es), the dictionary tuple also includes tuples of base classes dictionaries:

class Eq a => Cmp a where
  cmp :: a -> a -> Ordering

cmpList :: (Cmp a) => [a] -> [a] -> Ordering

turns into:

type CmpDictionary a = (eqDictionary a, a -> a -> Ordering)
cmpList :: CmpDictionary a -> [a] -> [a] -> Bool


Comparing to C++, this is like the templates, not classes! As with templates, typing information is part of operation, not object! But while C++ templates are really form of macro-processing (like Template Haskell) and at last end generates non-polymorphic code, Haskell's using of dictionaries allows run-time polymorphism (explanation of run-time polymorphism?).

Moreover, Haskell type classes supports inheritance. Run-time polymorphism together with inheritance are often seen as OOP distinctive points, so during long time I considered type classes as a form of OOP implementation. But that's wrong! Haskell type classes build on different basis, so they are like C++ templates with added inheritance and run-time polymorphism! And this means that usage of type classes is different from using classes, with its own strong and weak points.


Type classes vs classes

Here is a brief listing of differences between OOP classes and Haskell type classes

Type classes is like interfaces/abstract classes, not classes itself

There is no data fields inheritance and data fields itself (so type classes more like to interfaces than to classes itself)....


For those more familiar with Java/C# rather than C++, type classes resemble interfaces more than the classes. In fact, the generics in those languages capture the notion of parametric polymorphism (but Haskell is a language that takes parametric polymorphism quite seriously, so you can expect a fair amount of type gymnastics when dealing with Haskell), so more precisely, type classes are like generic interfaces.

Why interface, and not class? Mostly because type classes do not implement the methods themselves, they just guarantee that the actual types that instantiate the type class will implement specific methods. So the types are like classes in Java/C#.

One added twist: type classes can decide to provide default implementation of some methods (using other methods). You would say, then they are sort of like abstract classes. Right. But at the same time, you cannot extend (inherit) multiple abstract classes, can you?

So a type class is sort of like a contract: "any type that instantiates this type class will have the following functions defined on them..." but with the added advantage that you have type parameters built-in, so:

eqList :: (Eq a) => [a] -> [a] -> Bool

is, in a Java-like language:

interface Eq<A> {
   boolean `==` (A that);
   boolean '/=' (A that) { return !(`==(that)) } // default, can be overriden
}

<T> boolean eqList(T this, T that) where T extends Eq<T> {
  // so that inside the method you can be sure that this.`==`(that) or this.`/=`(that) makes perfect sense
}

And the "instance TypeClass ParticularInstance where ..." definition means "ParticularInstance implements TypeClass { ... }", now, multiple parameter type classes, of course, cannot be interpreted this way.


Type can appear at any place in function signature

Type can appear at any place in function signature: be any parameter, inside parameter, in a list (possibly empty), or in a result

class C a where
    f :: a -> Int
    g :: Int -> a -> Int
    h :: Int -> (Int,a) -> Int
    i :: [a] -> Int
    j :: Int -> a
    new :: a

It's even possible to define instance-specific constants (look at 'new').

If function value is instance-specific, OOP programmer will use "static" method while with type classes you need to use fake parameter:

class FixedSize a where
  sizeof :: a -> Int
instance FixedSize Int8 where
  sizeof _ = 1
instance FixedSize Int16 where
  sizeof _ = 2

main = do print (sizeof (undefined::Int8))
          print (sizeof (undefined::Int16))


Inheritance between interfaces

Inheritance between interfaces (in "class" declaration) means inclusion of base class dictionaries in dictionary of subclass:

class (Show s, Monad m s) => Stream m s where
    sClose :: s -> m ()

means

type StreamDictionary m s = (ShowDictionary s, MonadDictionary m s, s->m())

There is upcasting mechanism, it just extracts dictionary of a base class from a dictionary tuple, so you can run function that requires base class from a function that requires subclass:

f :: (Stream m s) =>  s -> m String
show ::  (Show s) =>  s -> String
f s = return (show s)

But downcasting is absolutely impossible - there is no way to get subclass dictionary from a superclass one


Inheritance between instances

Inheritance between instances (in "instance" declaration) means that operations of some class can be executed via operations of other class, i.e. such declaration describe a way to compute dictionary of inherited class via functions from dictionary of base class:

class Eq a where
  (==) :: a -> a -> Bool
class Cmp a where
  cmp :: a -> a -> Ordering
instance (Cmp a) => Eq a where
  a==b  =  cmp a b == EQ

creates the following function:

cmpDict2EqDict :: CmpDictionary a -> EqDictionary a
cmpDict2EqDict (cmp) = (\a b -> cmp a b == EQ)

This results in that any function that receives dictionary for Cmp class can call functions that require dictionary of Eq class


Downcasting is a mission impossible

Selection between instances are done at compile-time, based only on information present at this moment. So don't expect that more concrete instance will be selected just because you passed this concrete datatype to the function which accepts some general class:

class Foo a where
  foo :: a -> String

instance (Num a) => Foo a where
  foo _ = "Num"

instance Foo Int where
  foo _ = "int"

f :: (Num a) =>  a -> String
f = foo

main = do print (foo (1::Int))
          print (f (1::Int))

Here, the first call will return "int", but second - only "Num". this can be easily justified by using dictionary-based translation as described above. After you've passed data to polymorphic procedure it's type is completely lost, there is only dictionary information, so instance for Int can't be applied. The only way to construct Foo dictionary is by calculating it from Num dictionary using the first instance.


There is only one dictionary per function call

For "eqList :: (Eq a) => [a] -> [a] -> Bool" types of all elements in list must be the same, and types of both arguments must be the same too - there is only one dictionary and it know how to handle variables of only one concrete type!

Existential variables is more like OOP objects

Existential variables pack dictionary together with variable (looks very like the object concept!) so it's possible to create polymorphic containers (i.e. holding variables of different types). But downcasting is still impossible. Also, existentials still don't allow to mix variables of different types in a call to some polymorhic operation (their personal dictionaries still built for variables of one concrete type):

data HasCmp = forall a. Cmp a => HasCmp a

sorted :: [HasCmp] -> Ordering

sorted []  = True
sorted [_] = True
sorted (HasCmp a : HasCmp b : xs)  =  a<=b && sorted (b:xs)

This code will not work - a<=b can use nor 'a' nor 'b' dictionary. Even if orderings for apples and penguins are defined, we still don't have a method to compare penguin with apple!


John Meacham letter

> Roughly Haskell type classes correspond to parameterized abstract
> classes in C++ (i.e. class templates with virtual functions 
> representing the operations).  Instance declarations correspond to
> derivation and implementations of those parameterized classes.

There is a major difference though, in C++ (or java, or sather, or c#, etc..) the dictionary is always attached to the value, the actual class data type you pass around. in haskell, the dictionary is passed separately and the appropriae one is infered by the type system. C++ doesn't infer, it just assumes everything will be carying around its dictionary with it.

this makes haskell classes signifigantly more powerful in many ways.

class Num a where
   (+) :: a -> a -> a

is imposible to express in OO classes, since both arguments to + necessarily carry their dictionaries with them, there is no way to statically guarentee they have the same one. Haskell will pass a single dictionary that is shared by both types so it can handle this just fine.

in haskell you can do

class Monoid a where

       mempty :: a

in OOP, this cannot be done because where does the dicionary come from? since dictionaries are always attached to a concrete class, every method must take at least one argument of the class type (in fact, exactly one, as I'll show below). In haskell again, this is not a problem since the dictionary is passed in by the consumer of 'mempty', mempty need not conjure one out of thin air.


In fact, OO classes can only express single parameter type classes where the type argument appears exactly once in strictly covariant position. in particular, it is pretty much always the first argument and often (but not always) named 'self' or 'this'.


class HasSize a where
        getSize :: a -> Int

can be expressed in OO, 'a' appears only once, as its first argument.


Now, another thing OO classes can do is they give you the ability to create existential collections (?) of objects. as in, you can have a list of things that have a size. In haskell, the ability to do this is independent of the class (which is why haskell classes can be more powerful) and is appropriately named existential types.

data Sized = exists a . HasSize a => Sized a

what does this give you? you can now create a list of things that have a size [Sized] yay!

and you can declare an instance for sized, so you can use all your methods on it.

instance HasSize Sized where
        getSize (Sized a) = a


an exisential, like Sized, is a value that is passed around with its dictionary in tow, as in, it is an OO class! I think this is where people get confused when comparing OO classes to haskell classes. _there is no way to do so without bringing existentials into play_. OO classes are inherently existential in nature.

so, an OO abstract class declaration declares the equivalent of 3 things in haskell: a class to establish the mehods, an existential type to carry the values about, and an instance of the class for the exisential type.

an OO concrete class declares all of the above plus a data declaration for some concrete representation.


OO classes can be perfectly (even down to the runtime representation!) emulated in Haskell, but not vice versa. since OO languages tie class declarations to existentials, they are limited to only the intersection of their capabilities, because haskell has separate concepts for them, each is independently much much more powerful.

data CanApply = exists a b . CanApply (a -> b) a (b -> a)

is an example of something that cannot be expressed in OO, existentials are limited to having exactly a single value since they are tied to a single dictionary


class Num a where
   (+) :: a -> a -> a
   zero :: a
   negate :: a -> a

cannot be expressed in OO, because there is no way to pass in the same dicionary for two elements, or for a returning value to conjure up a dictionary out of thin air. (if you are not convinced, try writing a 'Number' existential and making it an instance of Num and it will be clear why it is not possible)

negate is an interesting one, there is no technical reason it cannot be implemented in OO languages, but none seem to actually support it.


so, when comparing, remember an OO class always cooresponds to a haskell class + a related haskell existential.


incidentally, an extension I am working on is to allow

data Sized = exists a . HasSize a => Sized a 
        deriving(HasSize)

which would have the obvious interpretation, obviously it would only work under the same limitations as OO classes have, but it would be a simple way for haskell programs to declare OO style classes if they so choose.

(actually, it is still signifigantly more powerful than OO classes since you can derive many instances, and even declare your own for classes that don't meet the OO consraints, also, your single class argument need not appear as the first one. it can appear in any strictly covarient position, and it can occur as often as you want in contravariant ones!)


Type class system extensions

Brief list of extensions, their abbeviated names and compatibility level

  • Constructor classes (Haskell'98)
  • MPTC: multi-parameter type classes (Hugs/GHC extension)
  • FD: functional dependencies (Hugs/GHC extension)
  • AT: associated types (GHC 6.6 only)
  • Overlapped, undecidable and incoherent instances (Hugs/GHC extension)

Literature

The paper that at first time introduced type classes and their implementation using dictionaries was Philip Wadler and Stephen Blott "How to make ad-hoc polymorphism less ad-hoc" (http://homepages.inf.ed.ac.uk/wadler/papers/class/class.ps.gz)

You can find more papers on the Type classes page.

I thanks Ralf Lammel and Klaus Ostermann for their paper "Software Extension and Integration with Type Classes" (http://homepages.cwi.nl/~ralf/gpce06/) which prompts me to start thinking about differences between OOP and type classes instead of their similarities