Difference between revisions of "The Monad.Reader/Issue3/Notes on Learning Haskell"

From HaskellWiki
Jump to: navigation, search
 
(added missing parenthesis)
 
(8 intermediate revisions by one other user not shown)
Line 1: Line 1:
'''This article needs reformatting! Please help tidy it up.'''--[[User:WouterSwierstra|WouterSwierstra]] 14:16, 9 May 2008 (UTC)
 
  +
=Learning Haskell Notes=
 
  +
:By ''Graham Klyne''
##acl -All:read Default
 
#pragma section-numbers 2
 
 
= Learning Haskell Notes =
 
 
''Graham Klyne''
 
   
 
This page contains some personal notes accumulated over several months of learning to program in Haskell, developing a moderately-sized application.
 
This page contains some personal notes accumulated over several months of learning to program in Haskell, developing a moderately-sized application.
   
  +
__TOC__
   
= Contents =
 
  +
==Learning Materials==
 
[[TableOfContents(2)]]
 
 
 
 
[[Anchor(LearningMaterials)]]
 
== Learning Materials ==
 
 
 
I found there was plenty of good tutorial material for introducing a new programmer to Haskell, and the language report was a very precise and concise reference for language specialists, but there seems to be little in between for a 'jobbing programmer'. The report could sometimes be unhelpful when trying to learn details of some subtle new feature of the language not covered by the tutorials.
 
I found there was plenty of good tutorial material for introducing a new programmer to Haskell, and the language report was a very precise and concise reference for language specialists, but there seems to be little in between for a 'jobbing programmer'. The report could sometimes be unhelpful when trying to learn details of some subtle new feature of the language not covered by the tutorials.
   
A book in the style of, say, the ''The Annotated C++ Reference Manual'' (Ellis and Stroustrup) would be useful: when I was learning and using C++, this book was invaluable for me, as it contained just the right mix of formal definition and in-depth pedagogical material.
+
A book in the style of, say, the ''The Annotated C++ Reference Manual'' (Ellis and Stroustrup) would be useful: when I was learning and using C++, this book was invaluable for me, as it contained just the right mix of formal definition and in-depth pedagogical material.
   
I find the Haskell report tends to concentrate largely on definition, and sometimes skimps on explanation. This would be fine, but the other, more tutorial, sources I consulted don't seem to have complete coverage of the language.
+
I find the Haskell report tends to concentrate largely on definition, and sometimes skimps on explanation. This would be fine, but the other, more tutorial, sources I consulted don't seem to have complete coverage of the language.
   
[[Anchor(ErrorMessages)]]
 
  +
==Error messages from compilers==
== Error messages from compilers ==
 
  +
Error messages can sometimes be very obscure, misleading even. Especially in the realm of types and classes.
   
Error messages can sometimes be very obscure, misleading even. Especially in the realm of types and classes.
 
  +
The type system is very subtle. Maybe this is unavoidable. Better diagnostics would help; [[GHC]] and [[GHCi]] seem to be better in this respect than [[Hugs]], in part because they give more information about the difference between what was expected and what was found.
 
The type system is very subtle. Maybe this is unavoidable.
 
Better diagnostics would help; GHC and GHCi seem to be better in this respect than Hugs, in part because they give more information about the difference between what was expected and what was found.
 
   
 
I think that because so many concepts are separated, it's easy to focus on the wrong aspects of one's type declarations.
 
I think that because so many concepts are separated, it's easy to focus on the wrong aspects of one's type declarations.
   
A common error is to mis-judge operator precendences and end up with a sub-expression being inferred to have a very different type
+
A common error is to mis-judge operator precedences and end up with a sub-expression being inferred to have a very different type
 
from what was intended; usually a function type.
 
from what was intended; usually a function type.
(Hint: function application binds more tightly than any operator; function composition has very low precedence.)
+
(Hint: function application binds more tightly than any operator; function composition has very low precedence.)
 
Here is a simple example:
 
Here is a simple example:
{{{
 
  +
<haskell>f1 :: (Char->Float) -> (Float->Int) -> Char -> Int
f1 :: (Char->Float) -> (Float->Int) -> Char -> Int
 
  +
f1 g h c = h . g c</haskell>
f1 g h c = h . g c
 
}}}
 
 
results in the HUGS error message
 
results in the HUGS error message
{{{
 
  +
... - Type error in application
+
... - Type error in application
*** Expression : h . g c
+
*** Expression : h . g c
*** Term : g c
+
*** Term : g c
*** Type : Float
+
*** Type : Float
*** Does not match : a -> b
+
*** Does not match : a -> b
}}}
+
 
A valid form would be:
 
A valid form would be:
{{{
 
  +
<haskell>
t :: (Char->Float) -> (Float->Int) -> Char -> Int
+
f1 :: (Char->Float) -> (Float->Int) -> Char -> Int
t f g c = g . f c
+
f1 g h c = (h . g) c
}}}
+
</haskell>
In this case, it's fairly easy to see the mistake, but in more complex expressions the type errors can be rather tricky to spot, especially when inferred intermediate types are being used. In the following example, the same error is less easy to determine from the offered diagnostic message "Term `f5`, type `Char -> a -> b` does not match `Char -> Int`":
+
In this case, it's fairly easy to see the mistake, but in more complex expressions the type errors can be rather tricky to spot, especially when inferred intermediate types are being used. In the following example, the same error is less easy to determine from the offered diagnostic message <tt>Term `f5`, type `Char -> a -> b` does not match `Char -> Int`</tt>:
{{{
+
<haskell>
 
f2 g h c = h . g c
 
f2 g h c = h . g c
 
f3 'a' = 1.0
 
f3 'a' = 1.0
Line 64: Line 46:
 
f5 :: Char -> Int
 
f5 :: Char -> Int
 
f5 c = f2 f3 f4 'a'
 
f5 c = f2 f3 f4 'a'
}}}
 
  +
</haskell>
   
 
It can sometimes be difficult to distinguish between simple typos and deep abuse of the type system.
 
It can sometimes be difficult to distinguish between simple typos and deep abuse of the type system.
 
In this example:
 
In this example:
{{{
 
  +
<haskell>
class (Eq k, Show k) => LookupEntryClass a k v where
+
class (Eq k, Show k) => LookupEntryClass a k v where
entryKey :: a -> k
+
entryKey :: a -> k
entryVal :: a -> v
+
entryVal :: a -> v
newEntry :: (k,v) -> a
+
newEntry :: (k,v) -> a
   
data (LookupEntryClass a k v) => LookupMap a k v = LookupMap [a k v]
+
data (LookupEntryClass a k v) => LookupMap a k v = LookupMap [a k v]
}}}
+
</haskell>
the error message reported (by HUGS) was of an invalid type constructor in the 'data' declaration,
+
the error message reported (by HUGS) was of an invalid type constructor in the 'data' declaration, when the real problem was missing type parameters in the class method declarations; i.e. should have been:
when the real problem was missing type parameters in the class method declarations; i.e. should have been:
+
<haskell>
{{{
+
class (Eq k, Show k) => LookupEntryClass a k v where
class (Eq k, Show k) => LookupEntryClass a k v where
+
entryKey :: a k v -> k
entryKey :: a k v -> k
+
entryVal :: a k v -> v
entryVal :: a k v -> v
+
newEntry :: (k,v) -> a k v
newEntry :: (k,v) -> a k v
 
   
data (LookupEntryClass a k v) => LookupMap a k v = LookupMap [a k v]
+
data (LookupEntryClass a k v) => LookupMap a k v = LookupMap [a k v]
}}}
+
</haskell>
 
[[Anchor(MultiParameterClasses)]]
 
== Multi-parameter type classes ==
 
   
  +
==Multi-parameter type classes==
 
Although not part of the official Haskell 98 language, multi-parameter classes seem to be pretty much essential
 
Although not part of the official Haskell 98 language, multi-parameter classes seem to be pretty much essential
for many sizeable tasks or re-targetable software components. I think clearer signposting of this widely-implemented
+
for many sizeable tasks or re-targetable software components. I think clearer signposting of this widely-implemented
feature would help a new programmer. (But see also point [#type-class-misuse below] about using type classes,)
+
feature would help a new programmer. (But see also point [#type-class-misuse below] about using type classes,)
I note others, too, have observed the importance of multi-parameter classes.
+
I note others, too, have observed the importance of multi-parameter classes.
   
 
Also, more accessible documentation in this area would help.
 
Also, more accessible documentation in this area would help.
Line 97: Line 80:
 
It's not clear why I made this mistake -- I think the seeds of my confusion may have been laid by the form of
 
It's not clear why I made this mistake -- I think the seeds of my confusion may have been laid by the form of
 
subclass declarations:
 
subclass declarations:
{{{
 
  +
<haskell>
(C a) => (D a)
+
(C a) => (D a)
}}}
+
</haskell>
 
which means `D` is a subclass of `C`, or `a` stands for an instance of `C` as well as `D`.
 
which means `D` is a subclass of `C`, or `a` stands for an instance of `C` as well as `D`.
 
In OO programming, classes are types and instances are values -- Haskell separates these ideas more carefully.
 
In OO programming, classes are types and instances are values -- Haskell separates these ideas more carefully.
Line 107: Line 90:
 
without being lead by previous experience.
 
without being lead by previous experience.
   
[[Anchor(ConfusingMonads)]]
 
  +
==Confusing introduction to monads==
== Confusing introduction to monads ==
 
 
 
Monads can be a source of confusion.
 
Monads can be a source of confusion.
 
The basic ideas are easy enough to get hold of, and using existing monad-based libraries is quite simple,
 
The basic ideas are easy enough to get hold of, and using existing monad-based libraries is quite simple,
Line 125: Line 106:
 
how monads work had become entrenched.
 
how monads work had become entrenched.
   
(I had some ideas for writing a short ProgrammersViewOfMonads - if I can remember them!
+
(I had some ideas for writing a short [[Programmers' View Of Monads]] - if I can remember them!
If I recall correctly, my idea was to focus on concrete monads, describing the monadic
+
If I recall correctly, my idea was to focus on concrete monads, describing the monadic
behaviour of Maybe, and then moving up to lists, state, and eventually IO.)
+
behavior of Maybe, and then moving up to lists, state, and eventually IO.)
 
[[Anchor(programming-idioms)]]
 
== Programming idioms ==
 
   
  +
==Programming idioms==
 
I suspect that even though I am now able to write Haskell programs of moderate complexity,
 
I suspect that even though I am now able to write Haskell programs of moderate complexity,
 
I am not really using the functional idioms as effectively as I might.
 
I am not really using the functional idioms as effectively as I might.
Line 154: Line 136:
 
short of this ideal.)
 
short of this ideal.)
   
  +
===Separate structure traversal logic from data specifics===
  +
A particular example of the preceding exhortation comes in separating structure traversal from the specifics of processing the structure members.
   
[[Anchor(StructureTraversal)]]
 
  +
Haskell's polymorphic type system allows very generic structure handling routines to be written. The list functions in the Haskell prelude and library are examples of this. The handling of structure-related issues can be kept separate from data-specific issues by using function parameters to handle data-specific functions. The Haskell list libraries have <haskell>processBy</haskell> functions that abstract element matching tests in this way.
=== Separate structure traversal logic from data specifics ===
 
 
A particular example of the preceding exhortation comes in separating structure traversal
 
from the specifics of processing the structure members.
 
 
Haskell's polymorphic type system allows very generic structure handling routines to be written.
 
The list functions in the Haskell prelude and library are examples of this.
 
The handling of structure-related issues can be kept separate from data-specific
 
issues by using function parameters to handle data-specific functions.
 
The Haskell list libraries have ``processBy`` functions that abstract
 
element matching tests in this way.
 
   
 
I have found that failing to separate traversal logic in this way tends to lead
 
I have found that failing to separate traversal logic in this way tends to lead
to over-complicated code.
 
  +
to over-complicated code. Separating traversal logic means that it can be tested on simpler data structures, and of course means that it can be re-used with different data. But even when the traversal function is so arbitrary as to defy re-use, the code simplification is a big win.
Separating traversal logic means that it can be tested on simpler data structures,
 
and of course means that it can be re-used with different data.
 
But even when the traversal function is so arbitrary as to defy re-use,
 
the code simplification is a big win.
 
   
 
(This advice is carried a good deal further by
 
(This advice is carried a good deal further by
Line 180: Line 149:
 
See [http://www.ninebynine.org/Links.html#Programming-Haskell] for links).
 
See [http://www.ninebynine.org/Links.html#Programming-Haskell] for links).
   
 
  +
==Functors==
[[Anchor(Functors)]]
 
== Functors ==
 
 
 
In my early days of Haskell programming, I kept on bumping into this idea of a 'functor'
 
In my early days of Haskell programming, I kept on bumping into this idea of a 'functor'
 
without any real motivation or sense of what it was used for.
 
without any real motivation or sense of what it was used for.
 
Then suddenly it seemed to click...
 
Then suddenly it seemed to click...
   
A 'functor' is a structured collection (or container) type with a method (fmap) that accepts
+
A 'functor' is a structured collection (or container) type with a method (<haskell>fmap</haskell>) that accepts
 
a method and applies that method to the members of the collection yielding an isomorphic
 
a method and applies that method to the members of the collection yielding an isomorphic
 
collection of values of a (possibly) new type.
 
collection of values of a (possibly) new type.
Line 197: Line 163:
 
It has been pointed out to me that this characterization of Functor doesn't obviously embrace
 
It has been pointed out to me that this characterization of Functor doesn't obviously embrace
 
many useful Functors which are rather not collections.
 
many useful Functors which are rather not collections.
For example, every monad (like IO) can be a Functor if you take {{{fmap = Monad.liftM}}}.
+
For example, every monad (like IO) can be a Functor if you take <haskell>fmap = Monad.liftM</haskell>.
 
(For common monadic functors like `[]` and `Maybe` this would give the same operation as their normal definitions.)
 
(For common monadic functors like `[]` and `Maybe` this would give the same operation as their normal definitions.)
 
For myself, if I squint, I can see most monads as a kind of container or carrier or arrangement of other data values,
 
For myself, if I squint, I can see most monads as a kind of container or carrier or arrangement of other data values,
 
so while I accept the comment it doesn't completely invalidate my early intuition about functors.
 
so while I accept the comment it doesn't completely invalidate my early intuition about functors.
   
If the true nature of a functor has been intuitively and clearly explained,
+
If the true nature of a functor has been intuitively and clearly explained,
 
it doesn't seem to be anywhere that I've come across yet.
 
it doesn't seem to be anywhere that I've come across yet.
   
=== Functors and Monads ===
+
===Functors and Monads===
 
 
(This sub-section deals with some matters arising in discussion of the original
 
(This sub-section deals with some matters arising in discussion of the original
''Haskell Learning Notes'', and probably does not contain any useful information
+
''Haskell Learning Notes'', and probably does not contain any useful information
 
for new students of Haskell.)
 
for new students of Haskell.)
   
Let's look more closely at the comment thyat every Monad can be a Functor:
+
Let's look more closely at the comment that every Monad can be a Functor:
{{{
+
<haskell>
 
liftM :: Monad m => (a->b) -> m a -> m b
 
liftM :: Monad m => (a->b) -> m a -> m b
 
fmap :: Functor f => (a->b) -> f a -> f b
 
fmap :: Functor f => (a->b) -> f a -> f b
}}}
 
  +
</haskell>
The function signatures are clearly comparable. Consider `Maybe`:
+
The function signatures are clearly comparable. Consider `Maybe`:
{{{
+
<haskell>
 
liftM f a = do { a' <- a ; return f a' }
 
liftM f a = do { a' <- a ; return f a' }
= a >>= \a' -> return f a'
+
= a >>= \a' -> return f a'
= case a of
+
= case a of
Nothing -> Nothing
+
Nothing -> Nothing
Just a' -> Just (f a')
+
Just a' -> Just (f a')
fmap f Nothing = Nothing
+
fmap f Nothing = Nothing
 
fmap f (Just a') = Just (f a')
 
fmap f (Just a') = Just (f a')
}}}
 
  +
</haskell>
 
so it's easy to see these are equivalent for `Maybe`.
 
so it's easy to see these are equivalent for `Maybe`.
   
Line 235: Line 201:
 
Similarly, every functor is a type constructor but not vice versa;
 
Similarly, every functor is a type constructor but not vice versa;
 
a functor is a type constructor PLUS a function fmap satisfying some laws.
 
a functor is a type constructor PLUS a function fmap satisfying some laws.
(''So what are the Functor laws?'' I can think of maybe `fmap id = id` when applied to a Functor)
+
(''So what are the Functor laws?'' I can think of maybe `fmap id = id` when applied to a Functor)
   
 
The claim that every monad is a functor is not without controversy.
 
The claim that every monad is a functor is not without controversy.
I think this is because there are other ways of defining `fmap` than {{{fmap = Monad.liftM}}}
+
I think this is because there are other ways of defining `fmap` than <haskell>fmap = Monad.liftM</haskell>
 
which result in equally valid functors.
 
which result in equally valid functors.
(It has been said: "The fact Haskell does not make Functor a superclass of Monad is an infelicity of the Standard.
+
(It has been said: "The fact Haskell does not make Functor a superclass of Monad is an infelicity of the Standard. In Haskell 1.4 (I think), it was in fact a superclass.")
In Haskell 1.4 (I think), it was in fact a superclass.")
 
   
 
Strictly speaking monads need not support fail functions.
 
Strictly speaking monads need not support fail functions.
Line 246: Line 212:
   
 
Rather than using `liftM`, `fmap` can be defined via the monad operations:
 
Rather than using `liftM`, `fmap` can be defined via the monad operations:
{{{
 
  +
<haskell>
 
fmap f m = do { x <- m; return (f x) }
 
fmap f m = do { x <- m; return (f x) }
}}}
 
  +
</haskell>
 
or, more succinctly:
 
or, more succinctly:
{{{
 
  +
<haskell>
 
fmap f = (>>= return . f)
 
fmap f = (>>= return . f)
}}}
 
  +
</haskell>
 
But you cannot define `>>=` and `return` via `fmap` alone.
 
But you cannot define `>>=` and `return` via `fmap` alone.
 
So, in this sense, a monad is a special kind of functor.
 
So, in this sense, a monad is a special kind of functor.
Line 259: Line 225:
 
The term 'monad' is used above rather loosely.
 
The term 'monad' is used above rather loosely.
 
'monad' is a term from category theory (whose precise definition is quite abstrusely mathematical),
 
'monad' is a term from category theory (whose precise definition is quite abstrusely mathematical),
from which the Haskell Monad type class is derived. In Haskell, a Monad is a ''type constructor'':
+
from which the Haskell Monad type class is derived. In Haskell, a Monad is a ''type constructor'':
it accepts a type value and returns a new type. That is, a Monad has ''kind'' `* -> *`.
+
it accepts a type value and returns a new type. That is, a Monad has ''kind'' `* -> *`.
 
If `m` is a Monad and `a` is a type, then `m a` is a ''monadic type''.
 
If `m` is a Monad and `a` is a type, then `m a` is a ''monadic type''.
 
An instance of a monadic type is a ''monadic value''.
 
An instance of a monadic type is a ''monadic value''.
   
 
  +
==Using and composing monads==
[[Anchor(using-monads)]]
 
== Using and composing monads ==
 
 
 
Reading the available material, I found this really confusing.
 
Reading the available material, I found this really confusing.
 
Not because it's difficult, but because the patterns used are so different from what I have been used to
 
Not because it's difficult, but because the patterns used are so different from what I have been used to
 
in conventional programming.
 
in conventional programming.
 
The particular area that I was having problems getting my head around was the understanding how a monadic value
 
The particular area that I was having problems getting my head around was the understanding how a monadic value
(i.e. an instance of a monadic type) is created, and how final values are extracted and used.
+
(i.e. an instance of a monadic type) is created, and how final values are extracted and used.
 
Many of the examples I found appeared to gloss over these issues,
 
Many of the examples I found appeared to gloss over these issues,
expecially if some result was to be used in a non-monadic expression.
+
especially if some result was to be used in a non-monadic expression.
 
I think a cookbook approach to use of composed monads,
 
I think a cookbook approach to use of composed monads,
 
with emphasis on actually how to actually execute the monads and extract a final result,
 
with emphasis on actually how to actually execute the monads and extract a final result,
 
might be helpful.
 
might be helpful.
   
Eventually, I found Mark Carroll's example at [http://www.haskell.org/hawiki/MonadState] to be really useful.
+
Eventually, I found Mark Carroll's example at <!-- Broken: [http://www.haskell.org/hawiki/MonadState] --> [http://web.archive.org/web/20070706123846/http://www.haskell.org/hawiki/MonadState] to be really useful. I spent a little while adding comments to this example, and everything became much clearer.
I spent a little while adding comments to this example, and everything became much clearer.
 
 
 
[[Anchor(type-classes-and-data)]]
 
== Type classes, class constraints and data declarations ==
 
   
  +
==Type classes, class constraints and data declarations==
 
Try to avoid putting class constraints on data declarations.
 
Try to avoid putting class constraints on data declarations.
   
Line 300: Line 264:
 
the class membership, e.g. by using functions associated with the class signature.
 
the class membership, e.g. by using functions associated with the class signature.
   
 
  +
==Common errors in do sequences==
[[Anchor(common-do-error)]]
 
== Common errors in do sequences ==
 
 
 
An error I frequently make, which leads to really obscure error messages, is:
 
An error I frequently make, which leads to really obscure error messages, is:
{{{
 
  +
<haskell>
 
return f a
 
return f a
}}}
 
  +
</haskell>
 
in a do-sequence, instead of:
 
in a do-sequence, instead of:
{{{
 
  +
<haskell>
 
return $ f a
 
return $ f a
}}}
 
  +
</haskell>
 
or:
 
or:
{{{
 
  +
<haskell>
 
return (f a)
 
return (f a)
}}}
 
  +
</haskell>
   
 
I feel that compilers might helpfully notice this and offer a hint.
 
I feel that compilers might helpfully notice this and offer a hint.
 
I can't offhand think of any situation when the former case is actually sensible
 
I can't offhand think of any situation when the former case is actually sensible
(though I;m told there are a few).
+
(though I'm told there are a few).
   
 
I've also noticed that, because a String is a list, and a list is a Monad,
 
I've also noticed that, because a String is a list, and a list is a Monad,
Line 328: Line 289:
 
which in turn leads to very confusing secondary type error messages.
 
which in turn leads to very confusing secondary type error messages.
   
 
  +
==Type classes are not as useful as they first seem==
[[Anchor(type-class-misuse)]]
 
== Type classes are not as useful as they first seem ==
 
 
 
Or, maybe more correctly, there are many better ways to achieve the same things that an
 
Or, maybe more correctly, there are many better ways to achieve the same things that an
 
OO programmer would do using classes (or, in Java, interfaces).
 
OO programmer would do using classes (or, in Java, interfaces).
   
I have enountered situations in which I declare a class, and instances of that class,
+
I have encountered situations in which I declare a class, and instances of that class,
 
and other functions and structures that use the class,
 
and other functions and structures that use the class,
 
and later I find it really doesn't support what I want to do because one cannot
 
and later I find it really doesn't support what I want to do because one cannot
Line 356: Line 314:
 
A class doesn't make sense as a structuring or organizing mechanism.
 
A class doesn't make sense as a structuring or organizing mechanism.
 
For example, it is common to want to format values of very different
 
For example, it is common to want to format values of very different
types, and to write functions that use this generic capability. But it
+
types, and to write functions that use this generic capability. But it
 
doesn't really make sense to organize data types into those that can be
 
doesn't really make sense to organize data types into those that can be
 
formatted and those that cannot -- they may have no other attributes in common.
 
formatted and those that cannot -- they may have no other attributes in common.
Line 365: Line 323:
   
 
A weakness of over-use of the type class system has been pointed out by Frank Atanassow in the context
 
A weakness of over-use of the type class system has been pointed out by Frank Atanassow in the context
of formatting (the `Show` class):
+
of formatting (the <haskell>Show</haskell> class):
''I do my best to avoid type classes, but I'm in the minority.
+
<blockquote>"I do my best to avoid type classes, but I'm in the minority.
 
In the long run, I think it's better to use existentials,
 
In the long run, I think it's better to use existentials,
 
as they let you define multiple instances for the same type.
 
as they let you define multiple instances for the same type.
 
This usually makes more sense.
 
This usually makes more sense.
 
For example, 'Show': why support only one, global way of printing things?
 
For example, 'Show': why support only one, global way of printing things?
I often need multiple ones.''
+
I often need multiple ones."</blockquote>
  +
 
I too have noticed that -- I often seem to want both a simple conversion to text,
 
I too have noticed that -- I often seem to want both a simple conversion to text,
 
and a variation which formats over several lines with indentation.
 
and a variation which formats over several lines with indentation.
Line 378: Line 336:
 
In general, I think beginners overuse type classes, and use them incorrectly.
 
In general, I think beginners overuse type classes, and use them incorrectly.
 
I think a good approach is to write a first version without any fancy type hacking,
 
I think a good approach is to write a first version without any fancy type hacking,
just using the usual ADT approach, and later come back to it and see how to generalize
+
just using the usual [[ADT]] approach, and later come back to it and see how to generalize
 
it with classes, if and when the need arises.
 
it with classes, if and when the need arises.
People tend to forget that the major difference between ADT's and OO-style classes is really
+
People tend to forget that the major difference between ADTs and OO-style classes is really
 
only that with a class you can have many type instances in the same program simultaneously,
 
only that with a class you can have many type instances in the same program simultaneously,
 
whereas with an ADT you can have only one; but the ADT implementation is still interchangeable.
 
whereas with an ADT you can have only one; but the ADT implementation is still interchangeable.
Line 387: Line 345:
 
are opaque to operations on the ADT itself.
 
are opaque to operations on the ADT itself.
   
[[Anchor(ExistentialTypes)]]
 
 
== Existential types ==
 
   
It seems that 'forall x' introduces x as existential when it appears immediately preceding a
 
  +
==Existential types==
  +
It seems that <haskell>forall x</haskell> introduces x as existential when it appears immediately preceding a
 
datatype constructor declaration.
 
datatype constructor declaration.
 
Elsewhere, it appears to signal a universal quantifier.
 
Elsewhere, it appears to signal a universal quantifier.
Line 405: Line 361:
   
 
The reason is that, in:
 
The reason is that, in:
{{{
 
  +
<haskell>
 
data AppT a = forall x. App (x -> a, x)
 
data AppT a = forall x. App (x -> a, x)
}}}
 
  +
</haskell>
 
by considering the constructor's role as a function, we see that `App` must be an instance of type:
 
by considering the constructor's role as a function, we see that `App` must be an instance of type:
{{{
 
  +
<haskell>
 
(forall x. (x -> a, x)) -> AppT a
 
(forall x. (x -> a, x)) -> AppT a
}}}
 
  +
</haskell>
 
(note the `forall` is under the second arrow) which is ''implied by'':
 
(note the `forall` is under the second arrow) which is ''implied by'':
{{{
 
  +
<haskell>
 
exist x. ((x -> a, x) -> AppT a
 
exist x. ((x -> a, x) -> AppT a
}}}
 
  +
</haskell>
 
where the `exist` is at top-level.
 
where the `exist` is at top-level.
 
With rank-N polymorphism you can construct such things without data declarations.
 
With rank-N polymorphism you can construct such things without data declarations.
 
(A forall induces an existential when it occurs in an `odd' argument position.)
 
(A forall induces an existential when it occurs in an `odd' argument position.)
   
''(The above explanation was a response to my original notes, which I now find I don't entirely understand.
+
''(The above explanation was a response to my original notes, which I now find I don't entirely understand.
 
Hopefully, a more comprehensible explanation will arise in due course.)''
 
Hopefully, a more comprehensible explanation will arise in due course.)''
   
[[Anchor(ClassDependencies)]]
 
  +
==More on type class subtleties; class dependencies==
== More on type class subtleties; class dependencies ==
 
 
 
I think I finally figured out how to get my LookupMap abstraction to work the way I wanted.
 
I think I finally figured out how to get my LookupMap abstraction to work the way I wanted.
 
The idea was that I would be able to use LookupMap to provide a lookup over any datatype
 
The idea was that I would be able to use LookupMap to provide a lookup over any datatype
 
by creating an appropriate LookupEntryClass instance declaration. Originally, I had:
 
by creating an appropriate LookupEntryClass instance declaration. Originally, I had:
{{{
 
  +
<haskell>
 
class (Eq k, Show k) => LookupEntryClass a k v
 
class (Eq k, Show k) => LookupEntryClass a k v
where
+
where
newEntry :: (k,v) -> a k v
+
newEntry :: (k,v) -> a k v
keyVal :: a k v -> (k,v)</pre>
+
keyVal :: a k v -> (k,v)</pre>
}}}
+
</haskell>
 
and
 
and
{{{
 
  +
<haskell>
 
instance (Eq k, Show k) => LookupEntryClass (,) k v where
 
instance (Eq k, Show k) => LookupEntryClass (,) k v where
newEntry = id
+
newEntry = id
keyVal = id
+
keyVal = id
}}}
+
</haskell>
 
and
 
and
{{{
 
  +
<haskell>
 
data LookupMap a k v = LookupMap [a k v]
 
data LookupMap a k v = LookupMap [a k v]
}}}
 
  +
</haskell>
 
(A list is a poor implementation structure for this purpose.
 
(A list is a poor implementation structure for this purpose.
 
The idea of this class is that I can vary the implementation
 
The idea of this class is that I can vary the implementation
Line 454: Line 408:
 
I finally realized, after 6 months or so of using Haskell,
 
I finally realized, after 6 months or so of using Haskell,
 
that I could declare the lookup entry class thus:
 
that I could declare the lookup entry class thus:
{{{
 
  +
<haskell>
 
class (Eq k, Show k) => LookupEntryClass a k v | a -> k, a -> v
 
class (Eq k, Show k) => LookupEntryClass a k v | a -> k, a -> v
where
+
where
newEntry :: (k,v) -> a
+
newEntry :: (k,v) -> a
keyVal :: a -> (k,v)
+
keyVal :: a -> (k,v)
}}}
+
</haskell>
 
and
 
and
{{{
 
  +
<haskell>
 
instance (Eq k, Show k) => LookupEntryClass (k,v) k v where
 
instance (Eq k, Show k) => LookupEntryClass (k,v) k v where
newEntry = id
+
newEntry = id
keyVal = id
+
keyVal = id
}}}
+
</haskell>
 
and
 
and
{{{
 
  +
<haskell>
 
data LookupMap a = LookupMap [a]
 
data LookupMap a = LookupMap [a]
}}}
 
  +
</haskell>
 
That is, the `LookupEntryClass` constructor takes three parameters of kind `*`,
 
That is, the `LookupEntryClass` constructor takes three parameters of kind `*`,
 
rather than the first being of kind `* -> * -> *`.
 
rather than the first being of kind `* -> * -> *`.
Line 479: Line 433:
   
 
When needed, the key and value types are made available through a context declaration, thus:
 
When needed, the key and value types are made available through a context declaration, thus:
{{{
 
  +
<haskell>
keyOrder :: (LookupEntryClass a k v, Ord k) => a -> a -> Ordering
+
keyOrder :: (LookupEntryClass a k v, Ord k) => a -> a -> Ordering
 
keyOrder e1 e2 = compare k1 k2
 
keyOrder e1 e2 = compare k1 k2
where
+
where
(k1,_) = keyVal e1
+
(k1,_) = keyVal e1
(k2,_) = keyVal e2
+
(k2,_) = keyVal e2
}}}
+
</haskell>
   
 
I don't think this overall approach would be possible without using the type class dependency extension to Haskell.
 
I don't think this overall approach would be possible without using the type class dependency extension to Haskell.
   
 
  +
==Using ShowS==
[[Anchor(UsingShowS)]]
 
== Using ShowS ==
 
 
 
`ShowS` is a type that is used to improve the efficiency of creating long strings (character sequences)
 
`ShowS` is a type that is used to improve the efficiency of creating long strings (character sequences)
 
that are for output, or are otherwise used in left-to-right order.
 
that are for output, or are otherwise used in left-to-right order.
Line 499: Line 450:
 
this means that when building a long string by successive concatenation,
 
this means that when building a long string by successive concatenation,
 
the time required to append each character is proportional to the length of the string,
 
the time required to append each character is proportional to the length of the string,
making the overall time requiired to build the string proportional to the square of its length.
+
making the overall time required to build the string proportional to the square of its length.
   
 
Type `ShowS` is defined as:
 
Type `ShowS` is defined as:
{{{
 
  +
<haskell>
 
type ShowS = String -> String
 
type ShowS = String -> String
}}}
 
  +
</haskell>
   
 
At first glance, `ShowS` does not appear to rectify this,
 
At first glance, `ShowS` does not appear to rectify this,
 
since for a string value it is defined simply as `(++)`:
 
since for a string value it is defined simply as `(++)`:
{{{
 
  +
<haskell>
 
showString :: String -> ShowS
 
showString :: String -> ShowS
 
showString = (++)
 
showString = (++)
}}}
 
  +
</haskell>
   
 
`ShowS` uses lazy evaluation to defer evaluation of the `++` operation;
 
`ShowS` uses lazy evaluation to defer evaluation of the `++` operation;
Line 518: Line 469:
   
 
Consider the expression:
 
Consider the expression:
{{{
 
  +
<haskell>
 
a ++ b
 
a ++ b
}}}
 
  +
</haskell>
 
In order to access the first character of the resulting string, it is necessary to evaluate 'a'.
 
In order to access the first character of the resulting string, it is necessary to evaluate 'a'.
 
If 'a' itself is a concatenation of strings:
 
If 'a' itself is a concatenation of strings:
{{{
 
  +
<haskell>
 
(a1 ++ a2) ++ b
 
(a1 ++ a2) ++ b
}}}
 
  +
</haskell>
 
then that concatenation too must be evaluated.
 
then that concatenation too must be evaluated.
   
Line 533: Line 484:
   
 
The above example then would be:
 
The above example then would be:
{{{
 
  +
<haskell>
 
showString a1 (showString a2 (showString b ""))
 
showString a1 (showString a2 (showString b ""))
}}}
 
  +
</haskell>
 
Thus, the introduction of a function allows the left-associated construction of
 
Thus, the introduction of a function allows the left-associated construction of
 
concatenations to be evaluated in a right-associated fashion.
 
concatenations to be evaluated in a right-associated fashion.
Line 542: Line 493:
 
So how should one use `ShowS` values? There are three key operations:
 
So how should one use `ShowS` values? There are three key operations:
   
1. to turn a string into a `ShowS`, use `showString`:
+
=== to turn a string into a `ShowS`, use `showString`===
{{{
+
<haskell>
a = showString "Hello"
+
a = showString "Hello"
}}}
+
</haskell>
or just a section:
+
or just a section:
{{{
+
<haskell>
a = ("Hello"++)
+
a = ("Hello"++)
}}}
+
</haskell>
The Haskell prelude and library provide functions that turn other values into ShowS: the default mechanism is to turn them into a string, then apply showString as above. The default showsPrec is defined as:
+
The Haskell prelude and library provide functions that turn other values into ShowS: the default mechanism is to turn them into a string, then apply showString as above. The default showsPrec is defined as:
{{{
+
<haskell>
showsPrec _ x s = show x ++ s
+
showsPrec _ x s = show x ++ s
}}}
+
</haskell>
which might equally be written as:
+
which might equally be written as:
{{{
+
<haskell>
showsPrec _ x = ((show x)++)
+
showsPrec _ x = ((show x)++)
}}}
+
</haskell>
or:
+
or:
{{{
+
<haskell>
showsPrec _ x = showString (show x)
+
showsPrec _ x = showString (show x)
}}}
+
</haskell>
When implementing complex data structures as instances of Show, consider implementing showsPrec instead of show.
 
 
2. to turn a `ShowS` into a string: simply apply it to an empty string:
 
{{{
 
a = (\x->"Hello "++(" world"++x)) ""
 
= "Hello"++(" world"++"")
 
= "Hello"++(" world")
 
= "Hello world"
 
}}}
 
   
3. to concatenate `ShowS` values, use function composition:
 
  +
* When implementing complex data structures as instances of Show, consider implementing showsPrec instead of show
{{{
 
(a . (b . c)) x = a ((b . c) x)
 
= a (b (c x))
 
((a . b) . c) x = (a . b) (c x)
 
= a (b (c x))
 
}}}
 
So if:
 
{{{
 
a = ("Hello,"++)
 
b = (" cruel"++)
 
c = (" world"++)
 
}}}
 
then:
 
{{{
 
(a . b . c) "" = a (b (c ""))
 
= a (b " world")
 
= a " cruel world"
 
= "Hello, cruel world"
 
}}}
 
   
  +
===to turn a `ShowS` into a string: simply apply it to an empty string===
  +
<haskell>
  +
a = (\x->"Hello "++(" world"++x)) ""
  +
= "Hello"++(" world"++"")
  +
= "Hello"++(" world")
  +
= "Hello world"
  +
</haskell>
   
[[Anchor(UseStandardTypes)]]
 
  +
===to concatenate `ShowS` values, use function composition===
== Use standard types as far as possible ==
 
  +
<haskell>
  +
(a . (b . c)) x = a ((b . c) x)
  +
= a (b (c x))
  +
((a . b) . c) x = (a . b) (c x)
  +
= a (b (c x))
  +
</haskell>
  +
So if:
  +
<haskell>
  +
a = ("Hello,"++)
  +
b = (" cruel"++)
  +
c = (" world"++)
  +
</haskell>
  +
then:
  +
<haskell>
  +
(a . b . c) "" = a (b (c ""))
  +
= a (b " world")
  +
= a " cruel world"
  +
= "Hello, cruel world"
  +
</haskell>
   
  +
==Use standard types as far as possible==
 
There is extensive support for doing useful things with the standard types declared in the prelude.
 
There is extensive support for doing useful things with the standard types declared in the prelude.
   
 
For example, it may be tempting to define a new type to represent a particular value or absence of a value.
 
For example, it may be tempting to define a new type to represent a particular value or absence of a value.
 
I did this for my RDF language tags, defining:
 
I did this for my RDF language tags, defining:
{{{
 
  +
<haskell>
 
data Language = Lang String | NoLanguage
 
data Language = Lang String | NoLanguage
}}}
 
  +
</haskell>
   
 
It would have been far better to use:
 
It would have been far better to use:
{{{
 
  +
<haskell>
 
type Language = Maybe String
 
type Language = Maybe String
}}}
 
  +
</haskell>
   
 
Many of the built-in types are defined as Monads or Functors,
 
Many of the built-in types are defined as Monads or Functors,
Line 608: Line 566:
 
For example, consider `sequence`...
 
For example, consider `sequence`...
   
 
  +
==The power of sequence==
[[Anchor(Sequence)]]
 
== The power of sequence ==
 
 
 
The prelude function `sequence` is surprisingly versatile.
 
The prelude function `sequence` is surprisingly versatile.
   
 
The basic type signature is:
 
The basic type signature is:
{{{
 
  +
<haskell>
 
sequence :: Monad m => [m a] -> m [a]
 
sequence :: Monad m => [m a] -> m [a]
}}}
 
  +
</haskell>
 
Its function varies with the particular Monad (m) with which it is used:
 
Its function varies with the particular Monad (m) with which it is used:
* list: sequence has the effect of computing a cartesian product
+
* list: sequence has the effect of computing a Cartesian product
* `Maybe`: sequence returns Nothing if any of the list members is Nothing, otherwise Just a list of bare values.
+
* `Maybe`: sequence returns Nothing if any of the list members is Nothing, otherwise Just a list of bare values.
* Error monad: returns the first error in the list, or a list of values if there are no errors
+
* Error monad: returns the first error in the list, or a list of values if there are no errors
* State monad: returns the effect of applying each transformation in turn.
+
* State monad: returns the effect of applying each transformation in turn.
* IO: (also function sequence_) returns a new IO value that is the result of executing each of the statements in turn. One particular use for this is to turn a list of IO values, each returning a value of some type, into a single IO value that returns a list of values of that type. (An IO value being an instance of some type constructed using the IO monad.)
+
* IO: (also function sequence_) returns a new IO value that is the result of executing each of the statements in turn. One particular use for this is to turn a list of IO values, each returning a value of some type, into a single IO value that returns a list of values of that type. (An IO value being an instance of some type constructed using the IO monad.)
   
 
Sometimes, sequence can be applied more than once to a value,
 
Sometimes, sequence can be applied more than once to a value,
Line 630: Line 585:
 
There is a generalization of `sequence` that re-arranges a more arbitrary
 
There is a generalization of `sequence` that re-arranges a more arbitrary
 
monadic structure, as in:
 
monadic structure, as in:
{{{
 
  +
<haskell>
 
foo :: (Monad m1, Monad m2) => m1 m2 a -> m2 m1 a
 
foo :: (Monad m1, Monad m2) => m1 m2 a -> m2 m1 a
}}}
 
  +
</haskell>
 
There is much work on composing monads.
 
There is much work on composing monads.
 
A function like this is usually called 'swap'.
 
A function like this is usually called 'swap'.
 
These two papers address this topic:
 
These two papers address this topic:
* http://www.cse.ogi.edu/%7empj/pubs/springschool.html
+
* http://www.cse.ogi.edu/%7empj/pubs/springschool.html
* http://www.cse.ogi.edu/%7empj/pubs/composing.html
+
* http://www.cse.ogi.edu/%7empj/pubs/composing.html
 
(I understand there is another that addresses this topic more directly,
 
(I understand there is another that addresses this topic more directly,
 
but the reference is not currently available.)
 
but the reference is not currently available.)
   
 
  +
==((->) e) is a Monadic type==
[[Anchor(ReaderMonad)]]
 
== ((->) e) is a Monadic type ==
 
 
 
The type `((->) e)` can be defined as the simplest(?) instance of a Reader Monad,
 
The type `((->) e)` can be defined as the simplest(?) instance of a Reader Monad,
 
which can be used to evaluate several functions with the same 'environment' value.
 
which can be used to evaluate several functions with the same 'environment' value.
(Note that `(->)` is the funtion type constructor, having kind `(* -> * -> *)`
+
(Note that `(->)` is the function type constructor, having kind `(* -> * -> *)`
 
-- see Haskell Report section 4.1.2.
 
-- see Haskell Report section 4.1.2.
   
 
  +
==Scanning a list==
[[Anchor(ScanList)]]
 
== Scanning a list ==
 
 
 
A common requirement seems to be to scan through a list of values,
 
A common requirement seems to be to scan through a list of values,
 
returning a computation based on the first member of the list that
 
returning a computation based on the first member of the list that
Line 659: Line 608:
 
In imperative programming, this might be achieved by looping to probe each
 
In imperative programming, this might be achieved by looping to probe each
 
member of the list, and breaking out of the loop when a satisfactory element
 
member of the list, and breaking out of the loop when a satisfactory element
is found. If the entire list is scanned without finding a satisfactory element,
+
is found. If the entire list is scanned without finding a satisfactory element,
 
some different or default result is returned.
 
some different or default result is returned.
   
Line 667: Line 616:
 
yield a desired result.
 
yield a desired result.
 
Then, the following code might be used:
 
Then, the following code might be used:
{{{
 
  +
<haskell>
listToMaybe $ catMaybes (map processItem list)
+
listToMaybe $ catMaybes $ map processItem list
}}}
+
</haskell>
 
This returns `Just` the required value, or `Nothing` if no member of
 
This returns `Just` the required value, or `Nothing` if no member of
`list` yields such a value. Lazy evaluation means that only elements of
+
`list` yields such a value. Lazy evaluation means that only elements of
 
`list` up to that which returns the desired value are actually processed.
 
`list` up to that which returns the desired value are actually processed.
 
Alternatively, to return an empty list if there is no matching value,
 
Alternatively, to return an empty list if there is no matching value,
 
otherwise a singleton, use:
 
otherwise a singleton, use:
{{{
 
  +
<haskell>
take 1 $ catMaybes (map processItem list)
+
take 1 $ catMaybes $ map processItem list
}}}
+
</haskell>
   
 
This idea can be applied to other traversable structures, such as trees, etc.
 
This idea can be applied to other traversable structures, such as trees, etc.
   
----
 
  +
==Acknowledgments==
 
= Acknowledgements =
 
 
 
Many of the lessons recorded here are due to careful and patient explanations by many members of
 
Many of the lessons recorded here are due to careful and patient explanations by many members of
 
the Haskell developer community, and denizens of the mailing lists listed at http://www.haskell.org/mailinglist.html.
 
the Haskell developer community, and denizens of the mailing lists listed at http://www.haskell.org/mailinglist.html.
Line 691: Line 637:
 
Any errors or misunderstandings propagated above are, of course, solely my own.
 
Any errors or misunderstandings propagated above are, of course, solely my own.
   
''-- GrahamKlyne [[DateTime(2005-05-11T15:18:48Z)]]''
+
''-- Graham Klyne (DateTime: 2005-05-11T15:18:48Z)''
----
+
CategoryArticle
+
[[Category:Article]]

Latest revision as of 04:59, 9 March 2011

Learning Haskell Notes

By Graham Klyne

This page contains some personal notes accumulated over several months of learning to program in Haskell, developing a moderately-sized application.

Learning Materials

I found there was plenty of good tutorial material for introducing a new programmer to Haskell, and the language report was a very precise and concise reference for language specialists, but there seems to be little in between for a 'jobbing programmer'. The report could sometimes be unhelpful when trying to learn details of some subtle new feature of the language not covered by the tutorials.

A book in the style of, say, the The Annotated C++ Reference Manual (Ellis and Stroustrup) would be useful: when I was learning and using C++, this book was invaluable for me, as it contained just the right mix of formal definition and in-depth pedagogical material.

I find the Haskell report tends to concentrate largely on definition, and sometimes skimps on explanation. This would be fine, but the other, more tutorial, sources I consulted don't seem to have complete coverage of the language.

Error messages from compilers

Error messages can sometimes be very obscure, misleading even. Especially in the realm of types and classes.

The type system is very subtle. Maybe this is unavoidable. Better diagnostics would help; GHC and GHCi seem to be better in this respect than Hugs, in part because they give more information about the difference between what was expected and what was found.

I think that because so many concepts are separated, it's easy to focus on the wrong aspects of one's type declarations.

A common error is to mis-judge operator precedences and end up with a sub-expression being inferred to have a very different type from what was intended; usually a function type. (Hint: function application binds more tightly than any operator; function composition has very low precedence.) Here is a simple example:

f1 :: (Char->Float) -> (Float->Int) -> Char -> Int
f1 g h c = h . g c

results in the HUGS error message

... - Type error in application
*** Expression : h . g c
*** Term : g c
*** Type : Float
*** Does not match : a -> b

A valid form would be:

f1 :: (Char->Float) -> (Float->Int) -> Char -> Int
f1 g h c = (h . g) c

In this case, it's fairly easy to see the mistake, but in more complex expressions the type errors can be rather tricky to spot, especially when inferred intermediate types are being used. In the following example, the same error is less easy to determine from the offered diagnostic message Term `f5`, type `Char -> a -> b` does not match `Char -> Int`:

f2 g h c = h . g c
f3 'a' = 1.0
f4 1.0 = 1
f5 :: Char -> Int
f5 c = f2 f3 f4 'a'

It can sometimes be difficult to distinguish between simple typos and deep abuse of the type system. In this example:

 class (Eq k, Show k) => LookupEntryClass a k v where
 entryKey :: a -> k
 entryVal :: a -> v
 newEntry :: (k,v) -> a

 data (LookupEntryClass a k v) => LookupMap a k v = LookupMap [a k v]

the error message reported (by HUGS) was of an invalid type constructor in the 'data' declaration, when the real problem was missing type parameters in the class method declarations; i.e. should have been:

 class (Eq k, Show k) => LookupEntryClass a k v where
 entryKey :: a k v -> k
 entryVal :: a k v -> v
 newEntry :: (k,v) -> a k v

 data (LookupEntryClass a k v) => LookupMap a k v = LookupMap [a k v]

Multi-parameter type classes

Although not part of the official Haskell 98 language, multi-parameter classes seem to be pretty much essential for many sizeable tasks or re-targetable software components. I think clearer signposting of this widely-implemented feature would help a new programmer. (But see also point [#type-class-misuse below] about using type classes,) I note others, too, have observed the importance of multi-parameter classes.

Also, more accessible documentation in this area would help. I think it could be made clearer that multiparameter classes are general relational constraints. I got locked into an incorrect reading that 'C a s' meant that 'a s' was an instance of 'C', rather than C is a relation on types, of which (a,s) is a member. It's not clear why I made this mistake -- I think the seeds of my confusion may have been laid by the form of subclass declarations:

 (C a) => (D a)

which means `D` is a subclass of `C`, or `a` stands for an instance of `C` as well as `D`. In OO programming, classes are types and instances are values -- Haskell separates these ideas more carefully.

I don't know if this was just a personal confusion, or more widely experienced. I recognize that it's all explained, but it's difficult to take on all the new material at once without being lead by previous experience.

Confusing introduction to monads

Monads can be a source of confusion. The basic ideas are easy enough to get hold of, and using existing monad-based libraries is quite simple, but the deeper issues of when and when not to use monads are less well covered. I found little help given on how to design new monads. Maybe this is covered in the common language specification and/or tutorial materials, but if so it's not clear where to find the appropriate descriptions.

I found the paper State in Haskell by Simon Peyton-Jones and John Launchbury to be one of the more helpful sources in this respect. (The introductory approach of introducing IO first, then generalizing to state -- as in, say, The Craft of Functional Programming -- I found to be less helpful, as it didn't help me to understand how monadic values are created and activated. This is covered in later material, but by then some incorrect/incomplete views about how monads work had become entrenched.

(I had some ideas for writing a short Programmers' View Of Monads - if I can remember them! If I recall correctly, my idea was to focus on concrete monads, describing the monadic behavior of Maybe, and then moving up to lists, state, and eventually IO.)

Programming idioms

I suspect that even though I am now able to write Haskell programs of moderate complexity, I am not really using the functional idioms as effectively as I might. E.g. I have read claims that functional programming allows for better modularization of code, but I've found my own attempts to modularize seem to run into difficulties. I'm probably trying to apply conventional techniques inappropriately.

I could use a good source of information about effective programming style and idioms in Haskell.

Later, I read John Hughes' paper Why Functional Programming Matters, and Scrap Your Boilerplate by Ralf Laemmer and Simon Peyton-Jones, which have helped me to some insights about how one might strive to use functional programming more effectively.

Later still, it seems to me that a key to effective use of Haskell is designing things so that each function is concerned with just one aspect of a computation: data structure traversal, simple data value manipulations, computational strategy, etc., with all other aspects of a computation dealt with by separate function parameters. Thus, a complex computation can be expressed as a combination of primitive single-purpose functions, each of which defines some single part, and can be re-used in a different context. One way of seeing this might be to consider Haskell as a pattern language: I think many, if not most, design patterns commonly found in design pattern catalogs can be expressed in some way has higher order functions.

(All this is easier said than practised, and I find most of my own code falls well short of this ideal.)

Separate structure traversal logic from data specifics

A particular example of the preceding exhortation comes in separating structure traversal from the specifics of processing the structure members.

Haskell's polymorphic type system allows very generic structure handling routines to be written. The list functions in the Haskell prelude and library are examples of this. The handling of structure-related issues can be kept separate from data-specific issues by using function parameters to handle data-specific functions. The Haskell list libraries have
processBy
functions that abstract element matching tests in this way.

I have found that failing to separate traversal logic in this way tends to lead to over-complicated code. Separating traversal logic means that it can be tested on simpler data structures, and of course means that it can be re-used with different data. But even when the traversal function is so arbitrary as to defy re-use, the code simplification is a big win.

(This advice is carried a good deal further by the functional strategic programming project (Strafunski); see also the 'Scrap Your Boilerplate' paper . See [1] for links).

Functors

In my early days of Haskell programming, I kept on bumping into this idea of a 'functor' without any real motivation or sense of what it was used for. Then suddenly it seemed to click...

A 'functor' is a structured collection (or container) type with a method (
fmap
) that accepts

a method and applies that method to the members of the collection yielding an isomorphic collection of values of a (possibly) new type. This separates data structure traversal from operations on members of the structure, which is a simple example of the separation of concerns mentioned in the section [#programming-idioms above] on programming idioms.

It has been pointed out to me that this characterization of Functor doesn't obviously embrace many useful Functors which are rather not collections.

For example, every monad (like IO) can be a Functor if you take
fmap = Monad.liftM
.

(For common monadic functors like `[]` and `Maybe` this would give the same operation as their normal definitions.) For myself, if I squint, I can see most monads as a kind of container or carrier or arrangement of other data values, so while I accept the comment it doesn't completely invalidate my early intuition about functors.

If the true nature of a functor has been intuitively and clearly explained, it doesn't seem to be anywhere that I've come across yet.

Functors and Monads

(This sub-section deals with some matters arising in discussion of the original Haskell Learning Notes, and probably does not contain any useful information for new students of Haskell.)

Let's look more closely at the comment that every Monad can be a Functor:

liftM :: Monad m => (a->b) -> m a -> m b
fmap :: Functor f => (a->b) -> f a -> f b

The function signatures are clearly comparable. Consider `Maybe`:

liftM f a = do { a' <- a ; return f a' }
 = a >>= \a' -> return f a'
 = case a of
 Nothing -> Nothing
 Just a' -> Just (f a')
fmap f Nothing = Nothing
fmap f (Just a') = Just (f a')

so it's easy to see these are equivalent for `Maybe`.

Looking at the pattern here, it seems that a Monad is a extension of a Functor, which also deals with the 'no value' case (Nothing, error, empty list, or whatever is returned by the monad's fail function).

It has been claimed that every monad is a functor, but not the other way around; a monad is a functor PLUS functions >>= and return satisfying some laws. Similarly, every functor is a type constructor but not vice versa; a functor is a type constructor PLUS a function fmap satisfying some laws. (So what are the Functor laws? I can think of maybe `fmap id = id` when applied to a Functor)

The claim that every monad is a functor is not without controversy.

I think this is because there are other ways of defining `fmap` than
fmap = Monad.liftM

which result in equally valid functors. (It has been said: "The fact Haskell does not make Functor a superclass of Monad is an infelicity of the Standard. In Haskell 1.4 (I think), it was in fact a superclass.")

Strictly speaking monads need not support fail functions. The Id monad has no 'no value' case, for example.

Rather than using `liftM`, `fmap` can be defined via the monad operations:

fmap f m = do { x <- m; return (f x) }

or, more succinctly:

fmap f = (>>= return . f)

But you cannot define `>>=` and `return` via `fmap` alone. So, in this sense, a monad is a special kind of functor. (Note that this definition of `fmap` is the same as the standard monadic definition of `liftM`.)

The term 'monad' is used above rather loosely. 'monad' is a term from category theory (whose precise definition is quite abstrusely mathematical), from which the Haskell Monad type class is derived. In Haskell, a Monad is a type constructor: it accepts a type value and returns a new type. That is, a Monad has kind `* -> *`. If `m` is a Monad and `a` is a type, then `m a` is a monadic type. An instance of a monadic type is a monadic value.

Using and composing monads

Reading the available material, I found this really confusing. Not because it's difficult, but because the patterns used are so different from what I have been used to in conventional programming. The particular area that I was having problems getting my head around was the understanding how a monadic value (i.e. an instance of a monadic type) is created, and how final values are extracted and used. Many of the examples I found appeared to gloss over these issues, especially if some result was to be used in a non-monadic expression. I think a cookbook approach to use of composed monads, with emphasis on actually how to actually execute the monads and extract a final result, might be helpful.

Eventually, I found Mark Carroll's example at [2] to be really useful. I spent a little while adding comments to this example, and everything became much clearer.

Type classes, class constraints and data declarations

Try to avoid putting class constraints on data declarations.

I found myself doing this when declaring a datatype intended to be an instance of a particular class, and then finding the type system would not allow me to make it a member of some other class, such as Functor, because the constraint was not justifiable by the method signature for that class.

I think the desire to do this is an artefact caused by OO-style thinking, along the lines of 'an instance of this class must be subject to the class constraints'.

The data type used as an instance of a class need not always be subject to the class constraints (and, it seems, usually should not). The data type may be used additionally for non-class purposes, and this is assumed when it is also used as an instance of some other class, like Functor. It's only when the specific class methods are used that the constraints come into play, and need to be declared.

Thus, rather than adding class constraints to a data type, they should be part of the type signature of any function or class that depends on the class membership, e.g. by using functions associated with the class signature.

Common errors in do sequences

An error I frequently make, which leads to really obscure error messages, is:

return f a

in a do-sequence, instead of:

return $ f a

or:

return (f a)

I feel that compilers might helpfully notice this and offer a hint. I can't offhand think of any situation when the former case is actually sensible (though I'm told there are a few).

I've also noticed that, because a String is a list, and a list is a Monad, it is sometimes possible to completely omit the 'return' statement when returning a string value without this being an immediately invalid do sequence. Instead, the return type of the do sequence is an unexpected value (a list monad containing a Char, or something like that), which in turn leads to very confusing secondary type error messages.

Type classes are not as useful as they first seem

Or, maybe more correctly, there are many better ways to achieve the same things that an OO programmer would do using classes (or, in Java, interfaces).

I have encountered situations in which I declare a class, and instances of that class, and other functions and structures that use the class, and later I find it really doesn't support what I want to do because one cannot mix different instances of the class in situations where a single common datatype is needed. E.g., a list of class instances must have values of the same instance type in every position. In this kind of situation, storing instances of a class is often the wrong thing to do, unless it is genuinely desired that every value has the same constrained but unspecified data type.

Commonly, a more effective way is to use polymorphic function types: if the putative class has just one interface method, consider declaring a function type instead, and using that in place of a class-constrained type. If there are several methods, declare an algebraic data type whose members are polymorphic functions. (This latter amounts to almost exactly how late-binding class methods are implemented for many OO programming languages.)

The time to use a class seems to be when there is a genuine desire to instantiate a particular interface over different and specific underlying types of data. A class doesn't make sense as a structuring or organizing mechanism. For example, it is common to want to format values of very different types, and to write functions that use this generic capability. But it doesn't really make sense to organize data types into those that can be formatted and those that cannot -- they may have no other attributes in common.

Some tutorial material on using algebraic data types for 'OO-style' programming tasks might be helpful. (There's a useful message from John Meacham here: [3].)

A weakness of over-use of the type class system has been pointed out by Frank Atanassow in the context

of formatting (the
Show
class):
"I do my best to avoid type classes, but I'm in the minority.

In the long run, I think it's better to use existentials, as they let you define multiple instances for the same type. This usually makes more sense. For example, 'Show': why support only one, global way of printing things?

I often need multiple ones."

I too have noticed that -- I often seem to want both a simple conversion to text, and a variation which formats over several lines with indentation. Still, classes like Show that provide common methods over a range of types are often useful.

In general, I think beginners overuse type classes, and use them incorrectly. I think a good approach is to write a first version without any fancy type hacking, just using the usual ADT approach, and later come back to it and see how to generalize it with classes, if and when the need arises. People tend to forget that the major difference between ADTs and OO-style classes is really only that with a class you can have many type instances in the same program simultaneously, whereas with an ADT you can have only one; but the ADT implementation is still interchangeable. Usually, that is all you need. Also, the single instance of an ADT can still be polymorphic, as long as the variable types are opaque to operations on the ADT itself.


Existential types

It seems that
forall x
introduces x as existential when it appears immediately preceding a

datatype constructor declaration. Elsewhere, it appears to signal a universal quantifier.

Existential types are a kind of information hiding mechanism: some datatype is used in an implementation, but its details are hidden from view, and the particular type used may vary from instance-to-instance in the same surrounding type context (e.g. between members of a list). Having the existential type being an instance of a class may provide means for some functions to do useful things with the type.

See also: discussion of issues on Haskell-cafe mailing list, about October 2003, starting with [4].

The reason is that, in:

data AppT a = forall x. App (x -> a, x)

by considering the constructor's role as a function, we see that `App` must be an instance of type:

(forall x. (x -> a, x)) -> AppT a

(note the `forall` is under the second arrow) which is implied by:

exist x. ((x -> a, x) -> AppT a

where the `exist` is at top-level. With rank-N polymorphism you can construct such things without data declarations. (A forall induces an existential when it occurs in an `odd' argument position.)

(The above explanation was a response to my original notes, which I now find I don't entirely understand. Hopefully, a more comprehensible explanation will arise in due course.)

More on type class subtleties; class dependencies

I think I finally figured out how to get my LookupMap abstraction to work the way I wanted. The idea was that I would be able to use LookupMap to provide a lookup over any datatype by creating an appropriate LookupEntryClass instance declaration. Originally, I had:

class (Eq k, Show k) => LookupEntryClass a k v
 where
 newEntry :: (k,v) -> a k v
 keyVal :: a k v -> (k,v)</pre>

and

instance (Eq k, Show k) => LookupEntryClass (,) k v where
 newEntry = id
 keyVal = id

and

data LookupMap a k v = LookupMap [a k v]

(A list is a poor implementation structure for this purpose. The idea of this class is that I can vary the implementation while maintaining a consistent interface for my applications.)

The trouble with this is that I couldn't use an existing type as a lookup entry unless it had a type constructor that was of the prescribed form; i.e. (typename keytype valuetype) in this case. I finally realized, after 6 months or so of using Haskell, that I could declare the lookup entry class thus:

class (Eq k, Show k) => LookupEntryClass a k v | a -> k, a -> v
 where
 newEntry :: (k,v) -> a
 keyVal :: a -> (k,v)

and

instance (Eq k, Show k) => LookupEntryClass (k,v) k v where
 newEntry = id
 keyVal = id

and

data LookupMap a = LookupMap [a]

That is, the `LookupEntryClass` constructor takes three parameters of kind `*`, rather than the first being of kind `* -> * -> *`. In this way, the relationship between the component types and the lookup entry type is hidden. The downside of this is that the system can no longer infer that if, say, `a k v` is a `LookupEntryClass` then so is `a v k`. There is also some other type information that was inferred automatically, but which now must be stated explicitly, though, in practice, not so much.

When needed, the key and value types are made available through a context declaration, thus:

keyOrder :: (LookupEntryClass a k v, Ord k) => a -> a -> Ordering
keyOrder e1 e2 = compare k1 k2
 where
 (k1,_) = keyVal e1
 (k2,_) = keyVal e2

I don't think this overall approach would be possible without using the type class dependency extension to Haskell.

Using ShowS

`ShowS` is a type that is used to improve the efficiency of creating long strings (character sequences) that are for output, or are otherwise used in left-to-right order.

The problem addressed by `ShowS` is that string concatenation is O(n) in the length of its first string argument: this means that when building a long string by successive concatenation, the time required to append each character is proportional to the length of the string, making the overall time required to build the string proportional to the square of its length.

Type `ShowS` is defined as:

type ShowS = String -> String

At first glance, `ShowS` does not appear to rectify this, since for a string value it is defined simply as `(++)`:

showString :: String -> ShowS
showString = (++)

`ShowS` uses lazy evaluation to defer evaluation of the `++` operation; if the string is accessed strictly sequentially, the `++` never needs to be explicitly evaluated.

Consider the expression:

a ++ b

In order to access the first character of the resulting string, it is necessary to evaluate 'a'. If 'a' itself is a concatenation of strings:

(a1 ++ a2) ++ b

then that concatenation too must be evaluated.

By representing an unterminated string as a function, the evaluation of concatenation can never be required until the string is closed by applying the `ShowS` function to en empty string (or any string).

The above example then would be:

showString a1 (showString a2 (showString b ""))

Thus, the introduction of a function allows the left-associated construction of concatenations to be evaluated in a right-associated fashion. To get at the start of a sequence, it if not necessary to evaluate anything that comes after it.

So how should one use `ShowS` values? There are three key operations:

to turn a string into a `ShowS`, use `showString`

 a = showString "Hello"

or just a section:

 a = ("Hello"++)

The Haskell prelude and library provide functions that turn other values into ShowS: the default mechanism is to turn them into a string, then apply showString as above. The default showsPrec is defined as:

 showsPrec _ x s = show x ++ s

which might equally be written as:

 showsPrec _ x = ((show x)++)

or:

 showsPrec _ x = showString (show x)
  • When implementing complex data structures as instances of Show, consider implementing showsPrec instead of show

to turn a `ShowS` into a string: simply apply it to an empty string

 a = (\x->"Hello "++(" world"++x)) ""
 = "Hello"++(" world"++"")
 = "Hello"++(" world")
 = "Hello world"

to concatenate `ShowS` values, use function composition

 (a . (b . c)) x = a ((b . c) x)
 = a (b (c x))
 ((a . b) . c) x = (a . b) (c x)
 = a (b (c x))
So if:
 a = ("Hello,"++)
 b = (" cruel"++)
 c = (" world"++)

then:

 (a . b . c) "" = a (b (c ""))
 = a (b " world")
 = a " cruel world"
 = "Hello, cruel world"

Use standard types as far as possible

There is extensive support for doing useful things with the standard types declared in the prelude.

For example, it may be tempting to define a new type to represent a particular value or absence of a value. I did this for my RDF language tags, defining:

data Language = Lang String | NoLanguage

It would have been far better to use:

type Language = Maybe String

Many of the built-in types are defined as Monads or Functors, either in the prelude or in supporting libraries. While not obvious to someone new to Haskell, these additional definitions provide a rich seam of added functionality, ready to be accessed as the techniques associated with higher order functions become appreciated. For example, consider `sequence`...

The power of sequence

The prelude function `sequence` is surprisingly versatile.

The basic type signature is:

sequence :: Monad m => [m a] -> m [a]

Its function varies with the particular Monad (m) with which it is used:

  • list: sequence has the effect of computing a Cartesian product
  • `Maybe`: sequence returns Nothing if any of the list members is Nothing, otherwise Just a list of bare values.
  • Error monad: returns the first error in the list, or a list of values if there are no errors
  • State monad: returns the effect of applying each transformation in turn.
  • IO: (also function sequence_) returns a new IO value that is the result of executing each of the statements in turn. One particular use for this is to turn a list of IO values, each returning a value of some type, into a single IO value that returns a list of values of that type. (An IO value being an instance of some type constructed using the IO monad.)

Sometimes, sequence can be applied more than once to a value, with each application having quite different effects as inner monadic values are exposed.

There is a generalization of `sequence` that re-arranges a more arbitrary monadic structure, as in:

foo :: (Monad m1, Monad m2) => m1 m2 a -> m2 m1 a

There is much work on composing monads. A function like this is usually called 'swap'. These two papers address this topic:

(I understand there is another that addresses this topic more directly, but the reference is not currently available.)

((->) e) is a Monadic type

The type `((->) e)` can be defined as the simplest(?) instance of a Reader Monad, which can be used to evaluate several functions with the same 'environment' value. (Note that `(->)` is the function type constructor, having kind `(* -> * -> *)` -- see Haskell Report section 4.1.2.

Scanning a list

A common requirement seems to be to scan through a list of values, returning a computation based on the first member of the list that yields a value for that computation. In imperative programming, this might be achieved by looping to probe each member of the list, and breaking out of the loop when a satisfactory element is found. If the entire list is scanned without finding a satisfactory element, some different or default result is returned.

In Haskell, a pattern for achieving this relies upon lazy evaluation: suppose the required computation is `processItem`, returning `Just` the desired result, or `Nothing` if the item does not yield a desired result. Then, the following code might be used:

listToMaybe $ catMaybes $ map processItem list

This returns `Just` the required value, or `Nothing` if no member of `list` yields such a value. Lazy evaluation means that only elements of `list` up to that which returns the desired value are actually processed. Alternatively, to return an empty list if there is no matching value, otherwise a singleton, use:

take 1 $ catMaybes $ map processItem list

This idea can be applied to other traversable structures, such as trees, etc.

Acknowledgments

Many of the lessons recorded here are due to careful and patient explanations by many members of the Haskell developer community, and denizens of the mailing lists listed at http://www.haskell.org/mailinglist.html. My thanks are given to all, too many to enumerate, who have helped me along the way. Included are specific contributions from Frank Atanassow and Andrew Pimlott. Any errors or misunderstandings propagated above are, of course, solely my own.

-- Graham Klyne (DateTime: 2005-05-11T15:18:48Z)