<p>OOP vs type classes (HaskellWiki, revision of 2014-11-18T01:47:50Z by Orion: /* Haskell emulation of OOP inheritance with record extension */ Fixed grammar.)</p>
<hr />
<div>(This is just a sketch for now. Feel free to edit or comment on it. I will incorporate the information you provide into the final version of this tutorial.)<br />
<br />
<br />
I had generally not used type classes in my application programs, but when<br />
I started implementing general-purpose libraries and tried to keep them<br />
as flexible as possible, it was natural to start building large<br />
and complex class hierarchies. I tried to use my C++ experience when<br />
doing this, but I was bitten many times by the restrictions of type classes. After this experience, I think I now have a better feel and mental model<br />
for type classes, and I want to share it with other Haskellers,<br />
especially those with OOP backgrounds.<br />
<br />
Brian Hulley provided us with a program that emulates OOP in Haskell. As<br />
you can see, it is much larger than the equivalent C++ program. An equivalent translation from Haskell to C++ would probably be even longer :)<br />
<br />
<br />
== Everything is an object? ==<br />
<br />
Most software developers are familiar with the OOP motto "everything is an object." People accustomed to C++ classes often find the Haskell concept of type classes difficult to grasp. Why is it so different?<br />
<br />
C++ classes pack functions together with data, which makes it convenient to<br />
represent and consume data. The use of interfaces (abstract classes) allows classes to interact by contract instead of directly manipulating the data in another class. C++ offers alternative ways to accomplish this (function pointers, discriminated unions), yet these techniques are not as handy<br />
as classes. Classes are also the primary way of hiding implementation<br />
details. Moreover, classes are a handy way to group related<br />
functionality together. It's extremely useful to browse the structure of a large C++ project in terms of classes instead of individual functions.<br />
<br />
Haskell provides other solutions for these problems.<br />
<br />
=== Type with several representations: use algebraic data type (ADT) ===<br />
<br />
For the types with different representations, algebraic data types<br />
(ADT) - an analog of discriminated unions - are supported:<br />
<br />
<haskell><br />
data Point = FloatPoint Float Float<br />
           | IntPoint Int Int<br />
</haskell><br />
<br />
Haskell provides a very easy way to build/analyze them:<br />
<br />
<haskell><br />
coord :: Point -> (Float, Float)<br />
coord (FloatPoint x y) = (x,y)<br />
coord (IntPoint x y) = (realToFrac x, realToFrac y)<br />
<br />
main = do print (coord (FloatPoint 1 2))<br />
          print (coord (IntPoint 1 2))<br />
</haskell><br />
<br />
So ADTs in general are preferred in Haskell over the class-based<br />
solution of the same problem:<br />
<br />
<haskell><br />
class Point a where<br />
    coord :: a -> (Float, Float)<br />
<br />
data FloatPoint = FloatPoint Float Float<br />
instance Point FloatPoint where<br />
    coord (FloatPoint x y) = (x,y)<br />
<br />
data IntPoint = IntPoint Int Int<br />
instance Point IntPoint where<br />
    coord (IntPoint x y) = (realToFrac x, realToFrac y)<br />
</haskell><br />
<br />
<br />
The equivalent C++ implementation using inheritance requires much more machinery than our five-line, ADT-based solution. This also illustrates a benefit of Haskell: it's much easier to define types and functions. Perhaps objects are not as great as you thought before. :D<br />
<br />
<br />
<pre><br />
#include <iostream><br />
#include <utility>   // pair, make_pair<br />
using namespace std;<br />
<br />
ostream & operator<<(ostream & lhs, pair<float, float> const& rhs) {<br />
    return lhs << "(" << rhs.first << "," << rhs.second << ")";<br />
}<br />
<br />
struct Point {<br />
    virtual pair<float,float> coord() = 0;<br />
};<br />
<br />
struct FloatPoint : Point {<br />
    float x, y;<br />
    FloatPoint (float _x, float _y) : x(_x), y(_y) {}<br />
    pair<float,float> coord() { return make_pair(x,y); }<br />
};<br />
<br />
struct IntPoint : Point {<br />
    int x, y;<br />
    IntPoint (int _x, int _y) : x(_x), y(_y) {}<br />
    pair<float,float> coord() { return make_pair(x,y); }<br />
};<br />
<br />
int main () {<br />
    cout << FloatPoint(1,2).coord();<br />
    cout << IntPoint(1,2).coord();<br />
    return 0;<br />
}<br />
</pre><br />
<br />
As you see, ADTs together with type inference make Haskell programs<br />
about 2 times smaller than their C++ equivalent.<br />
<br />
<br />
=== Packing data & functions together: use (records of) closures ===<br />
<br />
Another typical use case for classes is to pack data together with one<br />
or more processing functions and pass this bundle to some<br />
function. That function can then call the bundled functions to implement<br />
some functionality without caring how they are implemented internally.<br />
Fortunately, Haskell provides a better way: you can pass any functions as parameters to other functions directly.<br />
Moreover, such functions can be constructed on the fly, capturing free variables from their context and thereby creating so-called closures. In this way, you construct something like an object on demand and don't even need a type class:<br />
<br />
<haskell><br />
do x <- newIORef 0<br />
   proc (modifyIORef x (+1), readIORef x)<br />
</haskell><br />
<br />
Here, we applied proc to two actions: one incrementing the value of a<br />
counter and another reading its current value. Another call to proc,<br />
one that uses a counter with locking, might look like this:<br />
<br />
<haskell><br />
do x <- newMVar 0<br />
   proc (modifyMVar_ x (return . (+1)), readMVar x)<br />
</haskell><br />
<br />
Here, proc may be defined as:<br />
<br />
<haskell><br />
proc :: (IO (), IO Int) -> IO ()<br />
proc (inc, read) = do { inc; inc; inc; read >>= print }<br />
</haskell><br />
<br />
<br />
i.e. it receives two abstract operations whose implementation may vary<br />
between calls to proc, and it calls them without any knowledge of their<br />
implementation details. The equivalent C++ code could look like this:<br />
<br />
<pre><br />
class Counter {<br />
public:<br />
    virtual void inc() = 0;<br />
    virtual int read() = 0;<br />
};<br />
<br />
class SimpleCounter : public Counter {<br />
public:<br />
    SimpleCounter() { n = 0; }<br />
    void inc() { n++; }<br />
    int read() { return n; }<br />
private:<br />
    int n;<br />
};<br />
<br />
void proc (Counter &c) {<br />
    c.inc(); c.inc(); c.inc(); cout << c.read();<br />
}<br />
</pre><br />
<br />
And again, the Haskell code is much simpler and more straightforward: we<br />
don't need to declare classes, operations, or their types. We just pass<br />
proc the implementations of the operations it needs. Look at<br />
[[IO inside#Example: returning an IO action as a result]]<br />
and the following sections to find more examples of using closures instead<br />
of OOP classes.<br />
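<br />
The section title mentions records of closures, while the snippets above pass a plain tuple. As a minimal sketch of the record variant, the following bundles the two operations into a record whose fields are closures over hidden state. The names Counter, increment, currentValue, newSimpleCounter, newLockedCounter and bumpThrice are hypothetical and chosen only for this illustration.<br />

```haskell
import Control.Concurrent.MVar (modifyMVar_, newMVar, readMVar)
import Data.IORef (modifyIORef', newIORef, readIORef)

-- A hypothetical "object": a record whose fields are closures.
data Counter = Counter
  { increment    :: IO ()   -- bump the hidden state
  , currentValue :: IO Int  -- read the hidden state
  }

-- Counter backed by an IORef (no locking).
newSimpleCounter :: IO Counter
newSimpleCounter = do
  ref <- newIORef 0
  return Counter { increment    = modifyIORef' ref (+ 1)
                 , currentValue = readIORef ref }

-- Counter backed by an MVar, so each update holds the lock.
newLockedCounter :: IO Counter
newLockedCounter = do
  var <- newMVar 0
  return Counter { increment    = modifyMVar_ var (return . (+ 1))
                 , currentValue = readMVar var }

-- Client code sees only the record, never the representation.
bumpThrice :: Counter -> IO Int
bumpThrice c = do
  increment c
  increment c
  increment c
  currentValue c
```

For example, newSimpleCounter >>= bumpThrice and newLockedCounter >>= bumpThrice run the same client code against two different representations, much as proc does with its tuple argument.<br />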
<br />
=== Hiding implementation details: use module export list ===<br />
<br />
One more use of OOP classes is to hide implementation details, making<br />
internal data/functions inaccessible to class clients. Unfortunately, this<br />
functionality is not part of the type class facilities. Instead, you<br />
should use the sole Haskell method of encapsulation, the module<br />
export list:<br />
<br />
<haskell><br />
module Stack (Stack, empty, push, pop, top, isEmpty) where<br />
<br />
newtype Stack a = Stk [a]<br />
<br />
empty = Stk []<br />
push x (Stk xs) = Stk (x:xs)<br />
pop (Stk (x:xs)) = Stk xs<br />
top (Stk (x:xs)) = x<br />
isEmpty (Stk xs) = null xs<br />
</haskell><br />
<br />
Since the constructor for the data type Stack is hidden (the export<br />
list would say Stack(Stk) if it were exposed), outside of this module a stack can only be built from operations empty, push and pop, and<br />
examined with top and isEmpty.<br />
<br />
<br />
=== Grouping related functionality: use module hierarchy and Haddock markup ===<br />
<br />
Dividing a whole program into classes and using their hierarchy to<br />
represent the entire program structure is a great instrument in OO languages. Unfortunately, this is again impossible in Haskell. Instead,<br />
the structure of a program is typically rendered in the module hierarchy and, inside<br />
a module, in its export list. Although Haskell doesn't provide<br />
facilities to describe a hierarchical structure inside a module, we have<br />
another tool for that: Haddock, the de facto standard documentation tool.<br />
<br />
<haskell><br />
module System.Stream.Instance (<br />
<br />
-- * File is a file stream<br />
File,<br />
-- ** Functions that open files<br />
openFile, -- open file in text mode<br />
openBinaryFile, -- open file in binary mode<br />
-- ** Standard file handles<br />
stdin,<br />
stdout,<br />
stderr,<br />
<br />
-- * MemBuf is a memory buffer stream<br />
MemBuf,<br />
-- ** Functions that open MemBuf<br />
createMemBuf, -- create new MemBuf<br />
createContiguousMemBuf, -- create new contiguous MemBuf<br />
openMemBuf, -- use memory area as MemBuf<br />
<br />
) where<br />
...<br />
</haskell><br />
<br />
Here, Haddock will build documentation for the module using its export list. The export list will be divided into sections (whose<br />
headers are given with "-- *") and subsections (given with "-- **"). As<br />
a result, the module documentation reflects its structure without using<br />
classes for this purpose.<br />
<br />
== Type classes are a sort of template, not classes ==<br />
<br />
At this moment, C++ has classes and<br />
templates. What is the difference? With a class, type<br />
information is carried with the object itself while with templates it's<br />
outside of the object and is part of the whole operation.<br />
<br />
For example, if the == operation is defined as a virtual method in a class, the actual<br />
procedure called for a==b may depend on the run-time type of 'a', but if <br />
the operation is defined in template, the actual procedure depends only on the instantiated template (which is determined at compile time).<br />
<br />
Haskell's objects don't carry run-time type information. Instead,<br />
the class constraint for a polymorphic operation is passed in as a <br />
"dictionary" implementing all operations of the class (there are also<br />
other implementation techniques, but this doesn't matter). For example,<br />
<br />
<haskell><br />
eqList :: (Eq a) => [a] -> [a] -> Bool<br />
</haskell><br />
<br />
is translated into:<br />
<br />
<haskell><br />
type EqDictionary a = (a->a->Bool, a->a->Bool)<br />
eqList :: EqDictionary a -> [a] -> [a] -> Bool<br />
</haskell><br />
<br />
where the first parameter is a "dictionary" containing the implementation of<br />
"==" and "/=" operations for objects of type 'a'. If there are several<br />
class constraints, a dictionary for each is passed.<br />
<br />
If the class has base class(es), the dictionary tuple includes the base class dictionaries, so<br />
<br />
<haskell><br />
class Eq a => Cmp a where<br />
  cmp :: a -> a -> Ordering<br />
<br />
cmpList :: (Cmp a) => [a] -> [a] -> Ordering<br />
</haskell><br />
<br />
turns into:<br />
<br />
<haskell><br />
type CmpDictionary a = (EqDictionary a, a -> a -> Ordering)<br />
cmpList :: CmpDictionary a -> [a] -> [a] -> Ordering<br />
</haskell><br />
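<br />
To make the translation above concrete, here is a small hand-written sketch that uses records instead of tuples for the dictionaries. It is only an illustration of the idea, not what any particular compiler emits, and the names EqDict, CmpDict, eqInt, cmpInt and eqListViaCmp are hypothetical.<br />

```haskell
-- Hand-written dictionaries; all names here are hypothetical.
data EqDict a = EqDict
  { eq  :: a -> a -> Bool
  , neq :: a -> a -> Bool
  }

-- A subclass dictionary embeds its superclass dictionary.
data CmpDict a = CmpDict
  { cmpSuper :: EqDict a
  , cmpFn    :: a -> a -> Ordering
  }

-- Plays the role of "instance Eq Int".
eqInt :: EqDict Int
eqInt = EqDict { eq = (==), neq = (/=) }

-- Plays the role of "instance Cmp Int".
cmpInt :: CmpDict Int
cmpInt = CmpDict { cmpSuper = eqInt, cmpFn = compare }

-- eqList :: Eq a => [a] -> [a] -> Bool, with the constraint made explicit.
eqList :: EqDict a -> [a] -> [a] -> Bool
eqList _ []     []     = True
eqList d (x:xs) (y:ys) = eq d x y && eqList d xs ys
eqList _ _      _      = False

-- A function that holds a Cmp dictionary can call one that needs only Eq:
-- it simply selects the embedded superclass dictionary.
eqListViaCmp :: CmpDict a -> [a] -> [a] -> Bool
eqListViaCmp d = eqList (cmpSuper d)
```

Note that a single dictionary serves the whole call, so both lists must have the same element type.<br />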
<br />
<br />
Compared to C++, this is more like templates, not classes! As with<br />
templates, type information is part of the operation, not the object! But<br />
while C++ templates are really a form of macro processing (like<br />
Template Haskell) and in the end generate non-polymorphic code,<br />
Haskell's use of dictionaries allows run-time polymorphism<br />
(explanation of run-time polymorphism? what is this? a form of dynamic dispatch?).<br />
<br />
Moreover, Haskell type classes support inheritance. Run-time<br />
polymorphism together with inheritance is often seen as the<br />
distinctive feature of OOP, so for a long time I considered type classes a<br />
form of OOP implementation. But that's wrong! Haskell type classes<br />
are built on a different basis, so they are like C++ templates with added<br />
inheritance and run-time polymorphism! This means that using<br />
type classes is different from using classes, with its own strong and<br />
weak points.<br />
<br />
== Type classes vs classes ==<br />
<br />
Here is a brief list of the differences between OOP classes and Haskell type classes.<br />
<br />
=== Type classes are like interfaces/abstract classes, not classes themselves ===<br />
<br />
There are no data fields and no inheritance of them<br />
(so type classes are more like interfaces than classes).<br />
<br />
<br />
For those more familiar with Java/C# than with C++, type classes resemble interfaces more than classes. In fact, the generics in those languages capture the notion of parametric polymorphism (though Haskell takes parametric polymorphism quite seriously, so you can expect a fair amount of type gymnastics when dealing with it). So, more precisely, type classes are like generic interfaces.<br />
<br />
Why interface, and not class? Mostly because type classes do not implement the methods themselves, they just guarantee that the actual types that instantiate the type class will implement specific methods. So the types are like classes in Java/C#.<br />
<br />
One added twist: type classes can decide to provide default implementation of some methods (using other methods). You would say, then they are sort of like abstract classes. Right. But at the same time, you cannot extend (inherit) multiple abstract classes, can you?<br />
<br />
So a type class is sort of like a contract: "any type that instantiates this type class will have the following functions defined on them..." but with the added advantage that you have type parameters built-in, so:<br />
<br />
<haskell><br />
class Eq a where<br />
  (==) :: a -> a -> Bool<br />
  (/=) :: a -> a -> Bool<br />
  -- let's just implement one function in terms of the other<br />
  x /= y = not (x == y)<br />
</haskell><br />
<br />
is, in a Java-like language:<br />
<br />
 '''interface''' Eq<A> {<br />
     '''boolean''' equal(A that);<br />
     '''boolean''' notEqual(A that) { <br />
         ''// default, can be overridden''<br />
         '''return''' !equal(that); <br />
     } <br />
 }<br />
<br />
And the "instance TypeClass ParticularInstance where ..." definition means "ParticularInstance implements TypeClass { ... }". Multi-parameter type classes, of course, cannot be interpreted this way.<br />
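<br />
As a small sketch of the "implements" reading, the following defines a stand-in for Eq with a default method and one type that instantiates it. The names MyEq, equal, notEqual and Color are hypothetical; the class is renamed only to avoid clashing with the Prelude's Eq.<br />

```haskell
-- A hypothetical stand-in for Eq, renamed to avoid clashing with the Prelude.
class MyEq a where
  equal    :: a -> a -> Bool
  notEqual :: a -> a -> Bool
  -- default implementation, can be overridden per instance
  notEqual x y = not (equal x y)

data Color = Red | Green | Blue

-- Roughly "Color implements MyEq": only equal is supplied,
-- so notEqual falls back to the default above.
instance MyEq Color where
  equal Red   Red   = True
  equal Green Green = True
  equal Blue  Blue  = True
  equal _     _     = False
```

Unlike a Java interface method, equal takes both values as ordinary arguments; there is no privileged receiver.<br />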
<br />
<br />
=== Type can appear at any place in function signature ===<br />
The class type variable can appear at any place in a method's signature: as any<br />
parameter, inside a parameter, in a list (possibly empty), or in the result:<br />
<br />
<haskell><br />
class C a where<br />
    f :: a -> Int<br />
    g :: Int -> a -> Int<br />
    h :: Int -> (Int,a) -> Int<br />
    i :: [a] -> Int<br />
    j :: Int -> a<br />
    new :: a<br />
</haskell><br />
<br />
It's even possible to define instance-specific constants (look at 'new').<br />
<br />
If a function's value depends only on the instance, an OOP programmer would use a<br />
"static" method, while with type classes you need to pass a fake<br />
parameter:<br />
<br />
<haskell><br />
import Data.Int (Int8, Int16)<br />
<br />
class FixedSize a where<br />
  sizeof :: a -> Int<br />
instance FixedSize Int8 where<br />
  sizeof _ = 1<br />
instance FixedSize Int16 where<br />
  sizeof _ = 2<br />
<br />
main = do print (sizeof (undefined::Int8))<br />
          print (sizeof (undefined::Int16))<br />
</haskell><br />
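<br />
One way to avoid passing undefined as the fake parameter is to pass a Proxy value from Data.Proxy (available in recent versions of base), which carries the type but no data. The sketch below is a variant of the example above under that assumption; the method name sizeOfP and the list sizes are hypothetical.<br />

```haskell
import Data.Int (Int8, Int16)
import Data.Proxy (Proxy (..))

-- Same idea as the fake parameter, but the argument is a Proxy,
-- so no undefined value is ever constructed.
class FixedSize a where
  sizeOfP :: Proxy a -> Int

instance FixedSize Int8 where
  sizeOfP _ = 1

instance FixedSize Int16 where
  sizeOfP _ = 2

-- The type annotation on Proxy selects the instance.
sizes :: [Int]
sizes = [sizeOfP (Proxy :: Proxy Int8), sizeOfP (Proxy :: Proxy Int16)]
```

The dictionary is still chosen from the type alone; the Proxy argument exists only to mention that type.<br />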
<br />
<br />
=== Inheritance between interfaces ===<br />
Inheritance between interfaces (in a "class" declaration) means<br />
that the base class dictionaries are included in the dictionary of the subclass:<br />
<br />
<haskell><br />
class (Show s, Monad m) => Stream m s where<br />
  sClose :: s -> m ()<br />
</haskell><br />
<br />
means<br />
<br />
<haskell><br />
type StreamDictionary m s = (ShowDictionary s, MonadDictionary m, s->m())<br />
</haskell><br />
<br />
There is an upcasting mechanism: it just extracts the dictionary of a base<br />
class from the dictionary tuple, so you can call a function that requires the<br />
base class from a function that requires the subclass:<br />
<br />
<haskell><br />
-- show :: (Show s) => s -> String   (from the Prelude)<br />
f :: (Stream m s) => s -> m String<br />
f s = return (show s)<br />
</haskell><br />
<br />
But downcasting is absolutely impossible: there is no way to get a<br />
subclass dictionary from a superclass one.<br />
<br />
<br />
<br />
=== Inheritance between instances ===<br />
Inheritance between instances (in an "instance" declaration) means<br />
that the operations of one class can be implemented via the operations of another<br />
class, i.e. such a declaration describes a way to compute the dictionary of the<br />
inherited class from the functions in the dictionary of the base class:<br />
<br />
<haskell><br />
class Eq a where<br />
  (==) :: a -> a -> Bool<br />
class Cmp a where<br />
  cmp :: a -> a -> Ordering<br />
instance (Cmp a) => Eq a where<br />
  a==b  =  cmp a b == EQ<br />
</haskell><br />
<br />
creates the following function:<br />
<br />
<haskell><br />
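cmpDict2EqDict :: CmpDictionary a -> EqDictionary a<br />
-- here CmpDictionary a is just the cmp function; the result holds (==) and (/=)<br />
cmpDict2EqDict cmp = (\a b -> cmp a b == EQ, \a b -> cmp a b /= EQ)<br />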
</haskell><br />
<br />
As a result, any function that receives a dictionary for the Cmp class<br />
can call functions that require a dictionary for the Eq class.<br />
<br />
<br />
=== Downcasting is a mission impossible ===<br />
<br />
Selection between instances is done at compile time, based only on the<br />
information present at that moment. So don't expect that a more specific<br />
instance will be selected just because you passed a value of that concrete<br />
datatype to a function that accepts some general class:<br />
<br />
<haskell><br />
class Foo a where<br />
  foo :: a -> String<br />
<br />
instance (Num a) => Foo a where<br />
  foo _ = "Num"<br />
<br />
instance Foo Int where<br />
  foo _ = "int"<br />
<br />
f :: (Num a) => a -> String<br />
f = foo<br />
<br />
main = do print (foo (1::Int))<br />
          print (f (1::Int))<br />
</haskell><br />
<br />
Here, the first call will return "int", but the second only "Num".<br />
This is easily explained by the dictionary-based translation<br />
described above. After you've passed data to a polymorphic procedure,<br />
its type is completely lost; there is only dictionary information, so the<br />
instance for Int can't be applied. The only way to construct the Foo<br />
dictionary is to calculate it from the Num dictionary using the first<br />
instance.<br />
<br />
:<i>Remark: This isn't even a legal program unless you use the <hask>IncoherentInstances</hask> language extension. The error message:</i><br />
<br />
<haskell><br />
Overlapping instances for Foo a<br />
arising from a use of `foo' at /tmp/I.hs:17:4-6<br />
Matching instances:<br />
instance [overlap ok] (Num a) => Foo a<br />
-- Defined at /tmp/I.hs:10:9-24<br />
instance [overlap ok] Foo Int -- Defined at /tmp/I.hs:13:9-15<br />
(The choice depends on the instantiation of `a'<br />
To pick the first instance above, use -XIncoherentInstances <br />
when compiling the other instance declarations)<br />
</haskell><br />
:<i>Details: [http://haskell.org/ghc/docs/latest/html/users_guide/type-class-extensions.html#instance-overlap GHC User's Guide]</i><br />
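<br />
The dictionary-based explanation can be spelled out by hand. The sketch below writes the Foo example with explicit dictionaries, so it needs no language extensions. It is an illustration of the reasoning only, and the names FooDict, NumDict, fooInt, fooFromNum, numInt, direct and viaF are hypothetical.<br />

```haskell
-- Hypothetical hand-written dictionaries for the Foo example.
newtype FooDict a = FooDict { fooFn :: a -> String }

-- Stand-in for a Num dictionary; one field is enough for this sketch.
newtype NumDict a = NumDict { fromIntegerFn :: Integer -> a }

-- Plays the role of "instance Foo Int".
fooInt :: FooDict Int
fooInt = FooDict (\_ -> "int")

-- Plays the role of "instance Num a => Foo a":
-- it builds a Foo dictionary from a Num dictionary.
fooFromNum :: NumDict a -> FooDict a
fooFromNum _ = FooDict (\_ -> "Num")

-- Plays the role of "instance Num Int".
numInt :: NumDict Int
numInt = NumDict fromInteger

-- f :: Num a => a -> String; f = foo
-- Inside f only a Num dictionary is in scope, so fooFromNum is the only option.
f :: NumDict a -> a -> String
f d = fooFn (fooFromNum d)

direct, viaF :: String
direct = fooFn fooInt (1 :: Int)  -- the concrete type is known, so fooInt is used
viaF   = f numInt (1 :: Int)      -- only the Num dictionary reaches f
```

Here direct evaluates to "int" and viaF to "Num", matching the behaviour described above.<br />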
<br />
=== There is only one dictionary per function call ===<br />
For "eqList :: (Eq a) => [a] -> [a] -> Bool", the types of all elements<br />
in a list must be the same, and the types of both arguments must be the same<br />
too: there is only one dictionary, and it knows how to handle values<br />
of only one concrete type!<br />
<br />
=== Existential variables are more like OOP objects ===<br />
Existential variables pack a dictionary together with a value (which looks<br />
very much like the object concept!), so it's possible to create polymorphic<br />
containers (i.e. holding values of different types). But<br />
downcasting is still impossible. Also, existentials still don't allow you<br />
to mix values of different types in a call to some polymorphic operation<br />
(each personal dictionary is still built for values of one concrete type):<br />
<br />
<haskell><br />
data HasCmp = forall a. Cmp a => HasCmp a<br />
<br />
sorted :: [HasCmp] -> Bool<br />
<br />
sorted []  = True<br />
sorted [_] = True<br />
sorted (HasCmp a : HasCmp b : xs)  =  cmp a b /= GT  &&  sorted (HasCmp b : xs)<br />
</haskell><br />
<br />
This code will not work: cmp a b can use neither the dictionary of 'a' nor that of 'b', because the two values may have different types.<br />
Even if orderings for apples and penguins are defined, we still don't have<br />
a method to compare penguins to apples!<br />
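<br />
What does work with existentials is any operation that touches one packed value at a time. A minimal sketch, assuming GHC with the ExistentialQuantification extension and using the Prelude's Show class; the names Showable, render, mixed and rendered are hypothetical:<br />

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- Each Showable carries a value together with its own Show dictionary.
data Showable = forall a. Show a => Showable a

-- Uses only the dictionary packed with this one value.
render :: Showable -> String
render (Showable x) = show x

-- A container holding values of different types.
mixed :: [Showable]
mixed = [Showable (1 :: Int), Showable "two", Showable True]

rendered :: [String]
rendered = map render mixed
```

Each call to render unpacks one value and uses the dictionary stored next to it, so no two differently typed values ever meet in the same operation.<br />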
<br />
<br />
== Other opinions ==<br />
<br />
=== OO class always corresponds to a Haskell class + a related Haskell existential (John Meacham) ===<br />
<br />
> Roughly Haskell type classes correspond to parameterized abstract<br />
> classes in C++ (i.e. class templates with virtual functions <br />
> representing the operations). Instance declarations correspond to<br />
> derivation and implementations of those parameterized classes.<br />
<br />
There is a major difference though, in C++ (or java, or sather, or c#,<br />
etc.) the dictionary is always attached to the value, the actual class<br />
data type you pass around. In Haskell, the dictionary is passed<br />
separately and the appropriate one is inferred by the type system. C++<br />
doesn't infer, it just assumes everything will be carrying around its<br />
dictionary with it.<br />
<br />
This makes Haskell classes significantly more powerful in many ways.<br />
<br />
<haskell><br />
class Num a where<br />
  (+) :: a -> a -> a<br />
</haskell><br />
<br />
is impossible to express in OO classes: since both arguments to +<br />
necessarily carry their dictionaries with them, there is no way to<br />
statically guarantee they have the same one. Haskell will pass a single<br />
dictionary that is shared by both types so it can handle this just fine.<br />
<br />
In Haskell you can do<br />
<br />
<haskell><br />
class Monoid a where<br />
  mempty :: a<br />
</haskell><br />
<br />
In OOP, this cannot be done because where does the dictionary come from?<br />
Since dictionaries are always attached to a concrete class, every method<br />
must take at least one argument of the class type (in fact, exactly one,<br />
as I'll show below). In Haskell again, this is not a problem since the<br />
dictionary is passed in by the consumer of 'mempty' - mempty need not<br />
conjure one out of thin air.<br />
<br />
In fact, OO classes can only express single parameter type classes where<br />
the type argument appears exactly once in strictly covariant position.<br />
In particular, it is pretty much always the first argument and often<br />
(but not always) named 'self' or 'this'.<br />
<br />
<haskell><br />
class HasSize a where<br />
  getSize :: a -> Int<br />
</haskell><br />
<br />
can be expressed in OO, 'a' appears only once, as its first argument.<br />
<br />
Now, another thing OO classes can do is they give you the ability to<br />
create existential collections (?) of objects. As in, you can have a<br />
list of things that have a size. In Haskell, the ability to do this is<br />
independent of the class (which is why Haskell classes can be more<br />
powerful) and is appropriately named existential types.<br />
<br />
<haskell><br />
data Sized = exists a . HasSize a => Sized a <br />
</haskell><br />
<br />
What does this give you? You can now create a list of things that have a<br />
size [Sized] yay!<br />
<br />
And you can declare an instance for Sized, so you can use all your<br />
methods on it.<br />
<br />
<haskell><br />
instance HasSize Sized where<br />
  getSize (Sized a) = getSize a<br />
</haskell><br />
<br />
An existential, like Sized, is a value that is passed around with its<br />
dictionary in tow, as in, it is an OO class! I think this is where<br />
people get confused when comparing OO classes to Haskell classes. _There<br />
is no way to do so without bringing existentials into play_. OO classes<br />
are inherently existential in nature.<br />
<br />
So, an OO abstract class declaration declares the equivalent of 3 things<br />
in Haskell: a class to establish the methods, an existential type to<br />
carry the values about, and an instance of the class for the existential<br />
type.<br />
<br />
An OO concrete class declares all of the above plus a data declaration<br />
for some concrete representation.<br />
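<br />
The exists keyword used in the snippets of this section is not accepted by GHC, which writes existential types with forall on the constructor instead. Under that assumption (GHC with ExistentialQuantification), the pieces listed above might be sketched as follows. Blob, Pixel and totalSize are hypothetical names added only for this illustration.<br />

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- 1. A class to establish the methods.
class HasSize a where
  getSize :: a -> Int

-- 2. An existential type to carry the values about (GHC spelling of "exists").
data Sized = forall a. HasSize a => Sized a

-- 3. An instance of the class for the existential type.
instance HasSize Sized where
  getSize (Sized a) = getSize a

-- 4. Concrete representations, as an OO concrete class would add.
newtype Blob = Blob [Int]
data Pixel = Pixel

instance HasSize Blob where
  getSize (Blob xs) = length xs

instance HasSize Pixel where
  getSize _ = 1

-- A list of "things that have a size".
totalSize :: [Sized] -> Int
totalSize = sum . map getSize
```

For example, totalSize [Sized (Blob [1,2,3]), Sized Pixel] adds up the sizes of two values with different concrete types.<br />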
<br />
OO classes can be perfectly (even down to the runtime representation!)<br />
emulated in Haskell, but not vice versa. Since OO languages tie class<br />
declarations to existentials, they are limited to only the intersection<br />
of their capabilities, because Haskell has separate concepts for them;<br />
each is independently much much more powerful.<br />
<br />
<haskell><br />
data CanApply = exists a b . CanApply (a -> b) a (b -> a)<br />
</haskell><br />
<br />
is an example of something that cannot be expressed in OO, existentials<br />
are limited to having exactly a single value since they are tied to a<br />
single dictionary.<br />
<br />
<haskell><br />
class Num a where<br />
  (+) :: a -> a -> a<br />
  zero :: a<br />
  negate :: a -> a<br />
</haskell><br />
<br />
cannot be expressed in OO, because there is no way to pass in the same<br />
dictionary for two elements, or for a returning value to conjure up a<br />
dictionary out of thin air. (If you are not convinced, try writing a<br />
'Number' existential and making it an instance of Num and it will be<br />
clear why it is not possible.)<br />
<br />
negate is an interesting one - there is no technical reason it cannot be<br />
implemented in OO languages, but none seem to actually support it.<br />
<br />
So, when comparing, remember an OO class always corresponds to a Haskell<br />
class + a related Haskell existential.<br />
<br />
Incidentally, an extension I am working on is to allow<br />
<br />
<haskell><br />
data Sized = exists a . HasSize a => Sized a <br />
deriving(HasSize)<br />
</haskell><br />
<br />
which would have the obvious interpretation. Obviously it would only work<br />
under the same limitations as OO classes have, but it would be a simple<br />
way for Haskell programs to declare OO style classes if they so choose.<br />
<br />
(Actually, it is still significantly more powerful than OO classes since<br />
you can derive many instances, and even declare your own for classes<br />
that don't meet the OO constraints. Also, your single class argument need<br />
not appear as the first one. It can appear in any strictly covariant<br />
position, and it can occur as often as you want in contravariant ones!)<br />
<br />
=== Type classes correspond to parameterized abstract classes (Gabriel Dos Reis) ===<br />
<br />
| > Roughly Haskell type classes correspond to parameterized abstract<br />
| > classes in C++ (i.e. class templates with virtual functions <br />
| > representing the operations). Instance declarations correspond to<br />
| > derivation and implementations of those parameterized classes.<br />
| <br />
| There is a major difference though, in C++ (or java, or sather, or c#,<br />
| etc..) the dictionary is always attached to the value, the actual class<br />
| data type you pass around.<br />
<br />
I suspect that most of the confusion comes from the fact that people<br />
believe just because virtual functions are attached to objects, <br />
they cannot attach them to operations outside classes. That, to my<br />
surprise, hints at a deeper misappreciation of both type classes and<br />
so-called "OO" technology. Type classes are more OO than one might<br />
realize. <br />
<br />
The dictionary can be attached to the operations (not just to the values) by<br />
using objects local to functions (which sort of materialize the<br />
dictionary). Consider<br />
<br />
 // Abstract class for a collection of classes that implement<br />
 // the "Num" mathematical structure<br />
 template<typename T><br />
 struct Num {<br />
     virtual T add(T, T) const = 0;<br />
 };<br />
 <br />
 // Every type must specialize this class template to assert<br />
 // membership to the "Num" structure. <br />
 template<typename T> struct Num_instance;<br />
 <br />
 // The operation "+" is defined for any type that belongs to "Num".<br />
 // Notice, membership is asserted by specializing Num_instance<>.<br />
 template<typename T><br />
 T operator+(T lhs, T rhs)<br />
 {<br />
     const Num_instance<T> instance; <br />
     return instance.add(lhs, rhs);<br />
 }<br />
 <br />
 // "Foo" is in "Num"<br />
 template<><br />
 struct Num_instance<Foo> : Num<Foo> {<br />
     Foo add(Foo a, Foo b) const { ... }<br />
 };<br />
<br />
<br />
The key here is in the definition of operator+ which is just a formal<br />
name for the real operation done by instance.add().<br />
<br />
I appreciate that inferring and building the dictionary (represented<br />
here by the "instance" local to operator+<T>) is done automatically by<br />
the Haskell type system.<br />
That is one of the reasons why the type class notation is a nice sugar.<br />
However, that should not distract from its deeper OO semantics.<br />
<br />
<br />
[...]<br />
<br />
| in haskell you can do<br />
| <br />
| class Monoid a where<br />
| mempty :: a<br />
| <br />
| in OOP, this cannot be done because where does the dictionary come from?<br />
<br />
See above. I believe a key in my suggestion was "parameterized<br />
abstract classes", not just "abstract classes".<br />
<br />
<br />
<br />
== Haskell emulation of OOP inheritance with record extension ==<br />
<br />
Brian Hulley provided us with code that shows how OOP inheritance can be<br />
emulated in Haskell. His translation method supports inheritance of data<br />
fields, although it doesn't support downcasting.<br />
<br />
> although i mentioned not only pluses but also drawbacks of type<br />
> classes: lack of record extension mechanisms (such as that implemented<br />
> in O'Haskell) and therefore inability to reuse operation<br />
> implementation in a derived data type...<br />
<br />
You can reuse ops in a derived data type but it involves a tremendous amount <br />
of boilerplate. Essentially, you just use the type classes to simulate <br />
extendable records by having a method in each class that accesses the <br />
fixed-length record corresponding to that particular C++ class.<br />
<br />
Here is an example (apologies for the length!) which shows a super class <br />
function being overridden in a derived class and a derived class method <br />
(B::Extra) making use of something implemented in the super class:<br />
<br />
<haskell><br />
{-# LANGUAGE ExistentialQuantification #-}<br />
module Main where<br />
<br />
{- Haskell translation of the following C++<br />
<br />
    class A {<br />
    public:<br />
        String s;<br />
        Int i;<br />
<br />
        A(String s, Int i) : s(s), i(i){}<br />
<br />
        virtual void Display(){<br />
            printf("A %s %d\n", s.c_str(), i);<br />
        }<br />
<br />
        virtual Int Reuse(){<br />
            return i * 100;<br />
        }<br />
    };<br />
<br />
<br />
    class B: public A{<br />
    public:<br />
        Char c;<br />
<br />
        B(String s, Int i, Char c) : A(s, i), c(c){}<br />
<br />
        virtual void Display(){<br />
            printf("B %s %d %c", s.c_str(), i, c);<br />
        }<br />
<br />
        virtual void Extra(){<br />
            printf("B Extra %d\n", Reuse());<br />
        }<br />
<br />
    };<br />
<br />
-}<br />
<br />
-- The fixed-length record holding the fields of C++ class A<br />
data A = A<br />
    { _A_s :: String<br />
    , _A_i :: Int<br />
    }<br />
<br />
-- This could do arg checking etc<br />
constructA :: String -> Int -> A<br />
constructA = A<br />
<br />
<br />
-- Methods of A; getA gives access to the A fields of any derived type<br />
class ClassA a where<br />
    getA :: a -> A<br />
<br />
    display :: a -> IO ()<br />
    display a = do<br />
        let<br />
            A{_A_s = s, _A_i = i} = getA a<br />
        putStrLn $ "A " ++ s ++ show i<br />
<br />
    reuse :: a -> Int<br />
    reuse a = _A_i (getA a) * 100<br />
<br />
<br />
-- Existential wrapper: any value whose type is an instance of ClassA<br />
data WrapA = forall a. ClassA a => WrapA a<br />
<br />
instance ClassA WrapA where<br />
    getA (WrapA a) = getA a<br />
    display (WrapA a) = display a<br />
    reuse (WrapA a) = reuse a<br />
<br />
instance ClassA A where<br />
    getA = id<br />
<br />
<br />
-- B embeds the record of its base class A<br />
data B = B { _B_A :: A, _B_c :: Char }<br />
<br />
<br />
constructB :: String -> Int -> Char -> B<br />
constructB s i c = B {_B_A = constructA s i, _B_c = c}<br />
<br />
class ClassA b => ClassB b where<br />
    getB :: b -> B<br />
<br />
    extra :: b -> IO ()<br />
    extra b = do<br />
        putStrLn $ "B Extra " ++ show (reuse b)<br />
<br />
data WrapB = forall b. ClassB b => WrapB b<br />
<br />
instance ClassB WrapB where<br />
    getB (WrapB b) = getB b<br />
    extra (WrapB b) = extra b<br />
<br />
instance ClassA WrapB where<br />
    getA (WrapB b) = getA b<br />
    display (WrapB b) = display b<br />
    reuse (WrapB b) = reuse b<br />
<br />
instance ClassB B where<br />
    getB = id<br />
<br />
instance ClassA B where<br />
    getA = _B_A<br />
<br />
    -- override the base class version<br />
    display b = putStrLn $<br />
        "B " ++ _A_s (getA b)<br />
             ++ show (_A_i (getA b))<br />
             ++ [_B_c (getB b)]<br />
<br />
<br />
main :: IO ()<br />
main = do<br />
    let<br />
        a = constructA "a" 0<br />
        b = constructB "b" 1 '*'<br />
<br />
        col = [WrapA a, WrapA b]<br />
<br />
    mapM_ display col<br />
    putStrLn ""<br />
    mapM_ (putStrLn . show . reuse) col<br />
    putStrLn ""<br />
    extra b<br />
<br />
{- Output:<br />
<br />
   > ghc --make Main<br />
   > main<br />
   A a0<br />
   B b1*<br />
<br />
   0<br />
   100<br />
<br />
   B Extra 100<br />
<br />
   ><br />
-}<br />
</haskell><br />
<br />
(If the "caseless underscore" Haskell' ticket is accepted, the leading <br />
underscores would have to be replaced by something like "_f", i.e. _A_s ---> <br />
_fA_s etc.)<br />
<br />
== Type class system extensions ==<br />
<br />
A brief list of extensions, their abbreviated names and their compatibility level:<br />
<br />
* Constructor classes (Haskell'98)<br />
* MPTC: multi-parameter type classes (Hugs/GHC extension)<br />
* FD: functional dependencies (Hugs/GHC extension)<br />
* AT: associated types (GHC only, from version 6.6)<br />
* Overlapping, undecidable and incoherent instances (Hugs/GHC extension)<br />
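<br />
As a rough illustration of how two of these extensions combine, here is a minimal sketch of a multi-parameter type class with a functional dependency. The class <hask>Container</hask> and its methods are hypothetical names invented for this example and are not taken from any library; <hask>FlexibleInstances</hask> is switched on only because the instance head used here needs it.<br />
<br />
<haskell><br />
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances #-}<br />
module Main where<br />
<br />
-- Hypothetical class: 'c' is the container type, 'e' the element type.<br />
-- The dependency "c -> e" states that the container determines its element type.<br />
class Container c e | c -> e where<br />
    empty  :: c<br />
    insert :: e -> c -> c<br />
    toL    :: c -> [e]<br />
<br />
-- Lists of 'a' are containers with elements of type 'a'.<br />
instance Container [a] a where<br />
    empty  = []<br />
    insert = (:)<br />
    toL    = id<br />
<br />
main :: IO ()<br />
main = print (toL (insert 1 (insert 2 (empty :: [Int]))))<br />
</haskell><br />
<br />
Because of the dependency, the single annotation on <hask>empty</hask> is enough for the type checker to conclude that the literals <hask>1</hask> and <hask>2</hask> are of type <hask>Int</hask>.<br />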
<br />
<br />
== Literature ==<br />
<br />
The paper that first introduced type classes and their implementation<br />
using dictionaries was Philip Wadler and Stephen Blott's "How to make ad-hoc polymorphism less ad-hoc" (http://homepages.inf.ed.ac.uk/wadler/papers/class/class.ps.gz).<br />
<br />
You can find more papers on the [http://haskell.org/haskellwiki/Research_papers/Type_systems#Type_classes Type classes] page.<br />
<br />
I thank Ralf Lammel and Klaus Ostermann for their paper<br />
"Software Extension and Integration with Type Classes" (http://homepages.cwi.nl/~ralf/gpce06/), which prompted me to start thinking about the differences between OOP and type classes instead of their similarities.<br />
<br />
[[Category:Tutorials]]</div>Orionhttps://wiki.haskell.org/index.php?title=Memory_leak&diff=58645Memory leak2014-08-11T21:41:43Z<p>Orion: /* Keeping not needed references alive */ -- Modified grammar</p>
<hr />
<div>A memory leak means that a program allocates more memory than necessary for its execution.<br />
Although Haskell implementations use [[garbage collector]]s, programmers must still keep memory management in mind.<br />
A garbage collector can reliably prevent [http://en.wikipedia.org/Dangling_pointers dangling pointers],<br />
but it is easy to produce memory leaks, especially in connection with [[lazy evaluation]].<br />
Note that a leak will not only consume more and more memory but will also slow down the [[garbage collector]] considerably!<br />
This may even be the reason for the widespread opinion<br />
that garbage collectors are slow or not suited for [[realtime]] applications.<br />
<br />
== Types of leaks ==<br />
<br />
=== Holding a reference for a too long time ===<br />
<br />
Consider for example:<br />
<haskell><br />
let xs = [1..1000000::Integer]<br />
in sum xs * product xs<br />
</haskell><br />
Since most Haskell compilers assume that the programmer used <hask>let</hask> in order to share <hask>xs</hask> between the call to <hask>sum</hask> and the call to <hask>product</hask>,<br />
the list <hask>xs</hask> is completely materialized and held in memory: while one function traverses it, the other still holds a reference to its beginning.<br />
However, the list <hask>xs</hask> is very cheap to compute, and thus memory usage can be reduced considerably by recomputing <hask>xs</hask> for each call.<br />
<br />
To achieve this while avoiding code duplication, we can turn the list definition into a function that takes the upper bound as its argument.<br />
<haskell><br />
let makeXs n = [1..n::Integer]<br />
in sum (makeXs 1000000) * product (makeXs 1000000)<br />
</haskell><br />
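<br />
The same idea as a complete program, given only as a sketch: the names <hask>makeXs</hask> and <hask>meanOfRange</hask> are made up for this example, and <hask>length</hask> stands in for <hask>product</hask> so that the program finishes quickly. Keep in mind that an optimizing compiler may merge the two identical calls again (common subexpression elimination), so whether the list is really rebuilt can depend on the optimization flags.<br />
<br />
<haskell><br />
module Main where<br />
<br />
-- Hypothetical helper: builds a fresh list on every call instead of sharing one.<br />
makeXs :: Integer -> [Integer]<br />
makeXs n = [1..n]<br />
<br />
-- Two independent traversals; each list can be collected while it is consumed.<br />
meanOfRange :: Integer -> Integer<br />
meanOfRange n = sum (makeXs n) `div` fromIntegral (length (makeXs n))<br />
<br />
main :: IO ()<br />
main = print (meanOfRange 1000000)  -- prints 500000<br />
</haskell><br />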
<br />
=== Building up unevaluated expressions ===<br />
<br />
Another typical cause of memory leaks is unevaluated expressions,<br />
the classic example being summing up the numbers of a list (the <hask>sum</hask> function).<br />
<haskell><br />
foldl (+) 0 [1..1000000::Integer]<br />
</haskell><br />
The problem is that the runtime system does not know whether the intermediate sums are actually needed at a later point,<br />
and thus it leaves them unevaluated.<br />
That is, it stores something equivalent to <hask>1+2+3+4</hask> instead of just <hask>10</hask>.<br />
If you are lucky, the [[strictness analyzer]] removes the laziness at compile time,<br />
but in general you cannot rely on it.<br />
The safe way is to use [[seq]] to force evaluation of the intermediate sums.<br />
This is what <hask>foldl'</hask> from <hask>Data.List</hask> does.<br />
<haskell><br />
foldl' (+) 0 [1..1000000::Integer]<br />
</haskell><br />
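<br />
To make the role of <hask>seq</hask> explicit, here is a hand-written sketch of a strict left fold. The name <hask>strictFoldl</hask> is hypothetical and serves only as an illustration; in real code one would simply use <hask>foldl'</hask> from <hask>Data.List</hask>.<br />
<br />
<haskell><br />
module Main where<br />
<br />
-- Hypothetical strict left fold: the new accumulator is forced with 'seq'<br />
-- before recursing, so no chain of unevaluated additions can build up.<br />
strictFoldl :: (b -> a -> b) -> b -> [a] -> b<br />
strictFoldl _ acc []     = acc<br />
strictFoldl f acc (x:xs) =<br />
    let acc' = f acc x<br />
    in  acc' `seq` strictFoldl f acc' xs<br />
<br />
main :: IO ()<br />
main = print (strictFoldl (+) 0 [1..1000000 :: Integer])  -- prints 500000500000<br />
</haskell><br />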
<br />
=== Keeping not needed references alive ===<br />
<br />
Consider the following definition:<br />
<br />
<haskell><br />
x = fst (a, b)<br />
</haskell><br />
<br />
Until we evaluate <hask>x</hask>, both references <hask>a</hask> and <hask>b</hask> are kept alive.<br />
<br />
After evaluating <hask>x</hask>, <hask>b</hask> can be garbage collected if <hask>b</hask> is not referenced elsewhere.<br />
<br />
== Detection of memory leaks ==<br />
<br />
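A memory leak can be detected by writing a test that should require only a limited amount of memory<br />
and then running the compiled program with a restricted heap size.<br />
For example, you can restrict the heap size to 4 MB as shown here (with recent GHC versions the program has to be linked with <code>-rtsopts</code> for the <code>-M</code> option to be accepted):<br />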
<br />
<code><br />
$ ./mytest +RTS -M4m -RTS<br />
</code><br />
<br />
== A note on GHCi ==<br />
<br />
If you notice a space leak while running your code within GHCi, please note that interpreted code behaves differently from compiled code, even when using <hask>seq</hask>.<br />
<br />
Consider starting ghci as follows:<br />
<br />
<code><br />
$ ghci -fobject-code<br />
</code><br />
<br />
For details, see [http://www.haskell.org/ghc/docs/6.12.2/html/users_guide/ghci-obj.html Compiling to object code inside GHCi].<br />
<br />
== See also ==<br />
<br />
* [http://neilmitchell.blogspot.nl/2013/02/chasing-space-leak-in-shake.html Chasing a Space Leak in Shake] on Neil Mitchell's Haskell Blog (2013-02-25)<br />
* Blog post [http://blog.ezyang.com/2011/05/anatomy-of-a-thunk-leak/ "Anatomy of a thunk leak"]<br />
* Blog post [http://blog.ezyang.com/2011/05/space-leak-zoo/ "Space leak zoo"]<br />
* Haskell Cafe on a space leak caused by the garbage collector that [http://www.haskell.org/pipermail/haskell-cafe/2010-June/079444.html did not recognize a selector-like function call]<br />
* Haskell libraries on [http://www.haskell.org/pipermail/libraries/2010-September/014420.html Make lines stricter to fix a space leak]<br />
<br />
[[Category:Glossary]]</div>Orion