https://wiki.haskell.org/api.php?action=feedcontributions&user=Dibblego&feedformat=atomHaskellWiki - User contributions [en]2021-09-20T09:29:54ZUser contributionsMediaWiki 1.27.4https://wiki.haskell.org/index.php?title=HXT&diff=57006HXT2013-10-16T12:22:18Z<p>Dibblego: fix typo</p>
<hr />
<div>[[Category:Web]]<br />
[[Category:XML]]<br />
[[Category:Tools]]<br />
[[Category:Tutorials]]<br />
[[Category:Libraries]]<br />
<br />
== A gentle introduction to the Haskell XML Toolbox ==<br />
<br />
The [http://www.fh-wedel.de/~si/HXmlToolbox/index.html Haskell XML Toolbox (HXT)] is a collection of tools for processing XML with Haskell. The core component of the Haskell XML Toolbox is a domain-specific language consisting of a set of combinators for processing XML trees in a simple and elegant way. The combinator library is based on the concept of arrows. The main component is a validating and namespace-aware XML parser that supports the XML 1.0 standard almost completely. Extensions are a validator for RelaxNG and an XPath evaluator.<br />
<br />
__TOC__<br />
<br />
== Background ==<br />
<br />
The Haskell XML Toolbox is based on the ideas of [http://www.cs.york.ac.uk/fp/HaXml/ HaXml] and [http://www.flightlab.com/~joe/hxml/ HXML], but introduces a more general approach for processing XML with Haskell. HXT uses a generic data model for representing XML documents, including the DTD subset, entity references, CData parts and processing instructions. This data model makes it possible to use tree transformation functions as a uniform design of XML processing steps from parsing, DTD processing, entity processing, validation, namespace propagation, content processing and output.<br />
<br />
HXT has grown over the years. Components for XPath, XSLT, validation<br />
with RelaxNG, picklers for conversion from/to native Haskell data,<br />
lazy parsing with tagsoup, input via curl and native Haskell HTTP<br />
and others have been added. This has led to a rather large package<br />
with a lot of dependencies.<br />
<br />
To make the toolbox more modular and to reduce the dependencies on<br />
other packages, hxt has been split<br />
into various smaller packages since version 9.0.0.<br />
<br />
== Resources ==<br />
<br />
=== Home Page and Repository ===<br />
<br />
;[http://www.fh-wedel.de/~si/HXmlToolbox/index.html HXT]: The project home for HXT<br />
;[http://github.com/UweSchmidt/hxt HXT on GitHub]: The Git source repository on GitHub for all HXT packages<br />
<br />
=== Packages ===<br />
All packages are available on Hackage.<br />
<br />
;[http://hackage.haskell.org/package/hxt hxt]:The package [http://hackage.haskell.org/package/hxt hxt] forms the core of the toolbox. It contains a validating XML parser, an HTML parser that tries to read any text as HTML, a DSL for processing, transforming and generating XML/HTML, and so-called picklers for conversion between XML and native Haskell data.<br />
;[http://hackage.haskell.org/package/HandsomeSoup HandsomeSoup]: HandsomeSoup adds CSS selectors to HXT.<br />
;[http://hackage.haskell.org/package/hxt-http hxt-http]: Native HTTP support is contained in [http://hackage.haskell.org/package/hxt-http hxt-http] and depends on package [http://hackage.haskell.org/package/HTTP HTTP].<br />
;[http://hackage.haskell.org/package/hxt-curl hxt-curl]:HTTP support via libCurl and package [http://hackage.haskell.org/package/curl curl] is in [http://hackage.haskell.org/package/hxt-curl hxt-curl].<br />
;[http://hackage.haskell.org/package/hxt-tagsoup hxt-tagsoup]:The lazy tagsoup parser can be found in package [http://hackage.haskell.org/package/hxt-tagsoup hxt-tagsoup], only this package depends on Neil Mitchell's [http://hackage.haskell.org/package/tagsoup tagsoup].<br />
;[http://hackage.haskell.org/package/hxt-xpath hxt-xpath]:<br />
;[http://hackage.haskell.org/package/hxt-xslt hxt-xslt]:<br />
;[http://hackage.haskell.org/package/hxt-relaxng hxt-relaxng]: The XPath-, XSLT- and RelaxNG-extensions are separated into [http://hackage.haskell.org/package/hxt-xpath hxt-xpath], [http://hackage.haskell.org/package/hxt-xslt hxt-xslt] and [http://hackage.haskell.org/package/hxt-relaxng hxt-relaxng].<br />
;Basic packages:Some basic functionality is not only of interest for HXT, but can also be useful for other projects that are not related to XML/HTML. It has been separated out as well.<br />
;[http://hackage.haskell.org/package/hxt-charproperties hxt-charproperties]: defines XML and Unicode character class properties.<br />
;[http://hackage.haskell.org/package/hxt-unicode hxt-unicode]:contains decoding functions from various encoding schemes to Unicode. The difference between these functions and most of those available on Hackage is that these functions are lazy even in the case of encoding errors (thanks to Henning Thielemann).<br />
;[http://hackage.haskell.org/package/hxt-regex-xmlschema hxt-regex-xmlschema]: contains a lightweight and efficient regex library. There is full Unicode support, the standard syntax defined in the XML Schema specification is supported, and there are extensions available for intersection, difference and exclusive OR. The package is self-contained; no other regex library is required. The wiki page [[Regular expressions for XML Schema]] describes the theory behind this regex library and the extensions and gives some usage examples.<br />
;[http://hackage.haskell.org/package/hxt-cache hxt-cache]: A cache for storing parsed XML/HTML pages in binary form. This is used in the Holumbus search engine framework and the Hayoo! API search for speeding up the repeated indexing of pages.<br />
<br />
=== Installation ===<br />
When installing hxt with cabal, one does not have to deal with all the<br />
basic packages. Just a<br />
<br />
<code>cabal install hxt</code><br />
<br />
does the work for the core toolbox. When HTTP access is required, install at least one of<br />
the packages hxt-curl or hxt-http. All other packages can be installed<br />
on demand any time later.<br />
<br />
=== Upgrade from HXT versions < 9.0 ===<br />
<br />
HXT-9 is not backward compatible. The splitting into smaller<br />
packages required some internal reorganisation and changes to some type<br />
declarations.<br />
To use the main features of the core package, add an<br />
<br />
<haskell><br />
import Text.XML.HXT.Core<br />
</haskell><br />
<br />
to your sources, instead of <hask>Text.XML.HXT.Arrow</hask>.<br />
<br />
The second major change concerns configuration and option handling.<br />
Previously this was done with lists of key-value pairs implemented as strings.<br />
The growing number of options and the untyped option values led to<br />
unreliable code. With HXT-9, options are represented by functions with<br />
type-safe argument types instead of strings. Code that sets options has to be<br />
adapted when switching to the new version.<br />
<br />
Examples [[#copyXML|copyXML]] and<br />
[[#Pattern for a main program|Pattern for a main program]] show<br />
the new form of options.<br />
<br />
== The basic concepts ==<br />
<br />
=== The basic data structures ===<br />
<br />
Processing XML is a task of processing tree structures. This can be done in Haskell in a very elegant way by defining an appropriate tree data type, a Haskell DOM (document object model) structure. The tree structure in HXT is a rose tree with a special XNode data type for storing the XML node information.<br />
<br />
The generally useful tree structure (NTree) is separated from the node type (XNode). This allows for reusing the tree structure and the tree traversal and manipulation functions in other applications.<br />
<br />
<haskell><br />
data NTree a  = NTree a [NTree a]    -- rose tree<br />
<br />
data XNode    = XText String         -- plain text node<br />
              | ...<br />
              | XTag QName XmlTrees  -- element name and list of attributes<br />
              | XAttr QName          -- attribute name<br />
              | ...<br />
<br />
type QName    = ...                  -- qualified name<br />
<br />
type XmlTree  = NTree XNode<br />
<br />
type XmlTrees = [XmlTree]<br />
</haskell><br />
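<br />
The separation of tree structure and node type can be tried out without HXT. The following is a minimal sketch, not the HXT definition: <hask>SimpleNode</hask>, <hask>helloDoc</hask>, <hask>countNodes</hask> and <hask>allText</hask> are hypothetical names, element names are plain strings, and attributes are left out.<br />
<br />
```haskell
module Main where

-- Rose tree, as in the text: a node label plus a list of subtrees.
data NTree a = NTree a [NTree a]
  deriving (Show, Eq)

-- Simplified stand-in for HXT's XNode (assumption: names are plain
-- Strings and attributes are omitted).
data SimpleNode
  = SText String   -- plain text node
  | STag String    -- element node, identified by its name
  deriving (Show, Eq)

type SimpleTree = NTree SimpleNode

-- The document <hello><haskell>world</haskell></hello> as a tree.
helloDoc :: SimpleTree
helloDoc =
  NTree (STag "hello")
    [ NTree (STag "haskell")
        [ NTree (SText "world") [] ]
    ]

-- Generic traversal, independent of the node type: count all nodes.
countNodes :: NTree a -> Int
countNodes (NTree _ cs) = 1 + sum (map countNodes cs)

-- Node-specific traversal: collect the text nodes in document order.
allText :: SimpleTree -> [String]
allText (NTree (SText s) _)  = [s]
allText (NTree _         cs) = concatMap allText cs

main :: IO ()
main = do
  print (countNodes helloDoc)  -- 3
  print (allText helloDoc)     -- ["world"]
```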
<br />
=== The concept of filters ===<br />
<br />
Selecting, transforming and generating trees often requires routines, which compute not only a single result tree, but a (possibly empty) list of (sub-)trees. This leads to the idea of XML filters like in HaXml. Filters are functions, which take an XML tree as input and compute a list of result trees.<br />
<br />
<haskell><br />
type XmlFilter = XmlTree -> [XmlTree]<br />
</haskell><br />
<br />
More generally we can define a filter as<br />
<br />
<haskell><br />
type Filter a b = a -> [b]<br />
</haskell><br />
<br />
We will do this abstraction later, when introducing arrows. Many of the functions in the following motivating examples can be generalised this way. But for getting the idea, the <hask>XmlFilter</hask> is sufficient.<br />
<br />
The filter functions are used so frequently, that the idea of defining a domain specific language with filters as the basic processing units comes up. In such a DSL the basic filters are predicates, selectors, constructors and transformers, all working on the HXT DOM tree structure. For a DSL it becomes necessary to define an appropriate set of combinators for building more complex functions from simpler ones. Of course filter composition, like <hask>(.)</hask>, becomes one of the most frequently used combinators. There are more complex filters for traversal of a whole tree and selection or transformation of several nodes. We will see a few first examples in the following part.<br />
<br />
The first task is to build filters from pure functions, to define a lift operator. Pure functions are lifted to filters in the following way:<br />
<br />
Predicates are lifted by mapping False to the empty list and True to the single element list, containing the input tree.<br />
<br />
<haskell><br />
p :: XmlTree -> Bool -- pure function<br />
p t = ...<br />
<br />
pf :: XmlTree -> [XmlTree] -- or XmlFilter<br />
pf t<br />
    | p t       = [t]<br />
    | otherwise = []<br />
</haskell><br />
<br />
The combinator for this type of lifting is called <hask>isA</hask>, it works on any type and is defined as<br />
<br />
<haskell><br />
isA :: (a -> Bool) -> (a -> [a])<br />
isA p x<br />
    | p x       = [x]<br />
    | otherwise = []<br />
</haskell><br />
<br />
A predicate for filtering text nodes looks like this<br />
<br />
<haskell><br />
isXText :: XmlFilter -- XmlTree -> [XmlTree]<br />
isXText t@(NTree (XText _) _) = [t]<br />
isXText _ = []<br />
</haskell><br />
<br />
Transformers -- functions that map a tree into another tree -- are lifted in a trivial way:<br />
<br />
<haskell><br />
f :: XmlTree -> XmlTree<br />
f t = expr(t) -- expr(t): some expression computing a new tree from t<br />
<br />
ff :: XmlTree -> [XmlTree]<br />
ff t = [expr(t)]<br />
</haskell><br />
<br />
This lifting function is called <hask>arr</hask>; it comes from the Control.Arrow module in the base library package of GHC.<br />
<br />
Partial functions, functions that can't always compute a result, are usually lifted to totally defined filters:<br />
<br />
<haskell><br />
f :: XmlTree -> XmlTree<br />
f t<br />
    | p t       = expr(t)<br />
    | otherwise = error "f not defined"<br />
<br />
ff :: XmlFilter<br />
ff t<br />
    | p t       = [expr(t)]<br />
    | otherwise = []<br />
</haskell><br />
<br />
This is a rather comfortable situation: with these filters we don't have to deal with illegal argument errors. Illegal arguments are just mapped to the empty list.<br />
<br />
When processing trees, it is often the case that no result, exactly one result, or more than one result is possible. Functions that return a set of results are often, a bit imprecisely, called ''nondeterministic'' functions. These functions, e.g. selecting all children of a node or all grandchildren, are exactly our filters. In this context lists instead of sets of values are the appropriate result type, because the ordering in XML is important and duplicates are possible.<br />
<br />
Working with filters is rather similar to working with binary relations, and working with relations is rather natural and comfortable, database people know this very well.<br />
<br />
Two first examples for working with ''nondeterministic'' functions are selecting the children and the grandchildren of an XmlTree which can be implemented by<br />
<br />
<haskell><br />
getChildren :: XmlFilter<br />
getChildren (NTree n cs)<br />
    = cs<br />
<br />
getGrandChildren :: XmlFilter<br />
getGrandChildren (NTree n cs)<br />
    = concat [ getChildren c | c <- cs ]<br />
</haskell><br />
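<br />
To see these filters produce result lists, here is a small self-contained sketch. It is an illustration under simplifying assumptions, not HXT code: node labels are plain strings, and <hask>TreeFilter</hask>, <hask>label</hask>, <hask>leaf</hask> and <hask>sample</hask> are hypothetical helper names.<br />
<br />
```haskell
module Main where

data NTree a = NTree a [NTree a]
  deriving (Show, Eq)

type Tree       = NTree String   -- assumption: a node is just a String label
type TreeFilter = Tree -> [Tree]

-- Lift a predicate to a filter: True keeps the input, False drops it.
isA :: (a -> Bool) -> (a -> [a])
isA p x
  | p x       = [x]
  | otherwise = []

getChildren :: TreeFilter
getChildren (NTree _ cs) = cs

getGrandChildren :: TreeFilter
getGrandChildren (NTree _ cs) = concat [ getChildren c | c <- cs ]

label :: Tree -> String
label (NTree n _) = n

leaf :: String -> Tree
leaf n = NTree n []

-- a
--   b
--     d
--     e
--   c
sample :: Tree
sample =
  NTree "a"
    [ NTree "b" [leaf "d", leaf "e"]
    , leaf "c"
    ]

main :: IO ()
main = do
  print (map label (getChildren sample))            -- ["b","c"]
  print (map label (getGrandChildren sample))       -- ["d","e"]
  print (map label (getGrandChildren (leaf "x")))   -- []
  print (isA even (4 :: Int), isA even (5 :: Int))  -- ([4],[])
```
<br />
A leaf has no grandchildren, so the filter returns the empty list instead of raising an error.<br />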
<br />
=== Filter combinators ===<br />
<br />
Composition of filters (like function composition) is the most important combinator. We will use the infix operator <hask>(>>>)</hask> for filter composition and reverse the arguments, so we can read composition sequences from left to right, like with pipes in Unix. Composition is defined as follows:<br />
<br />
<haskell><br />
(>>>) :: XmlFilter -> XmlFilter -> XmlFilter<br />
<br />
(f >>> g) t = concat [g t' | t' <- f t]<br />
</haskell><br />
<br />
This definition corresponds 1-1 to the composition of binary relations. With help of the <hask>(>>>)</hask> operator the definition of <hask>getGrandChildren</hask> becomes rather simple:<br />
<br />
<haskell><br />
getGrandChildren :: XmlFilter<br />
getGrandChildren = getChildren >>> getChildren<br />
</haskell><br />
<br />
Selecting all text nodes of the children of an element can also be formulated very easily with the help of <hask>(>>>)</hask><br />
<br />
<haskell><br />
getTextChildren :: XmlFilter<br />
getTextChildren = getChildren >>> isXText<br />
</haskell><br />
<br />
When used to combine predicate filters, the <hask>(>>>)</hask> serves as a logical "and" operator or, from the relational view, as an intersection operator: <hask>isA p1 >>> isA p2</hask> selects all values for which p1 and p2 both hold.<br />
<br />
The dual operator to <hask>(>>>)</hask> is the logical or (thinking in sets: the union operator). For this we define a sum operator <hask>(<+>)</hask>. The sum of two filters is defined as follows:<br />
<br />
<haskell><br />
(<+>) :: XmlFilter -> XmlFilter -> XmlFilter<br />
<br />
(f <+> g) t = f t ++ g t<br />
</haskell><br />
<br />
Example: <hask>isA p1 <+> isA p2</hask> is the logical or for filters.<br />
<br />
Combining elementary filters with (>>>) and (<+>) leads to more complex functionality. For example, selecting all text nodes within two levels of depth (in left to right order) can be formulated with:<br />
<br />
<haskell><br />
getTextChildren2 :: XmlFilter<br />
getTextChildren2 = getChildren >>> ( isXText <+> ( getChildren >>> isXText ) )<br />
</haskell><br />
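<br />
The "and" and "or" readings of the two combinators can be checked on plain integers. This is a sketch using the general <hask>Filter a b</hask> type from above; the operators are defined locally rather than imported, and <hask>bothHold</hask> and <hask>eitherHolds</hask> are hypothetical names.<br />
<br />
```haskell
module Main where

type Filter a b = a -> [b]

infixr 1 >>>
infixr 5 <+>

-- Sequential composition: feed every result of f into g.
(>>>) :: Filter a b -> Filter b c -> Filter a c
(f >>> g) x = concat [ g y | y <- f x ]

-- Sum: collect the results of both filters, f's results first.
(<+>) :: Filter a b -> Filter a b -> Filter a b
(f <+> g) x = f x ++ g x

isA :: (a -> Bool) -> Filter a a
isA p x
  | p x       = [x]
  | otherwise = []

bothHold, eitherHolds :: Filter Int Int
bothHold    = isA even >>> isA (> 2)   -- logical "and"
eitherHolds = isA even <+> isA (> 2)   -- logical "or"

main :: IO ()
main = do
  print (concatMap bothHold    [1 .. 5])  -- [4]
  print (concatMap eitherHolds [1 .. 5])  -- [2,3,4,4,5]
```
<br />
The value 4 appears twice in the second result because both predicates hold for it: the sum works on lists, not on sets, so duplicates are kept.<br />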
<br />
'''Exercise:''' Are these filters equivalent or what's the difference between the two filters?<br />
<br />
<haskell><br />
getChildren >>> ( isXText <+> ( getChildren >>> isXText ) )<br />
<br />
( getChildren >>> isXText ) <+> ( getChildren >>> getChildren >>> isXText )<br />
</haskell><br />
<br />
Of course we need choice combinators. The first idea is an if-then-else filter, <br />
built up from three simpler filters. But often it's easier and more elegant to work with simpler binary combinators for choice. So we will introduce the simpler ones first.<br />
<br />
One of these choice combinators is called <hask>orElse</hask> and is defined as<br />
follows:<br />
<br />
<haskell><br />
orElse :: XmlFilter -> XmlFilter -> XmlFilter<br />
orElse f g t<br />
    | null res1 = g t<br />
    | otherwise = res1<br />
    where<br />
    res1 = f t<br />
</haskell><br />
<br />
The meaning is the following: If f computes a non-empty list as result, f succeeds and this list is the result, else g is applied to the input and this yields the result. There are two other simple choice combinators usually written in infix notation, <hask> g `guards` f</hask> and <hask>f `when` g</hask>:<br />
<br />
<haskell><br />
guards :: XmlFilter -> XmlFilter -> XmlFilter<br />
guards g f t<br />
    | null (g t) = []<br />
    | otherwise  = f t<br />
<br />
when :: XmlFilter -> XmlFilter -> XmlFilter<br />
when f g t<br />
    | null (g t) = [t]<br />
    | otherwise  = f t<br />
</haskell><br />
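<br />
The difference between the three choice combinators is easiest to see side by side. The sketch below restates them for a generic filter type and applies them to integers; <hask>double</hask> is a hypothetical transformer filter used only for illustration.<br />
<br />
```haskell
module Main where

type Filter a = a -> [a]

isA :: (a -> Bool) -> Filter a
isA p x
  | p x       = [x]
  | otherwise = []

-- Use f's results if there are any, else fall back to g.
orElse :: Filter a -> Filter a -> Filter a
orElse f g x
  | null res1 = g x
  | otherwise = res1
  where
    res1 = f x

-- Apply f only if g succeeds, else produce nothing.
guards :: Filter a -> Filter a -> Filter a
guards g f x
  | null (g x) = []
  | otherwise  = f x

-- Apply f only if g succeeds, else pass the input through unchanged.
when :: Filter a -> Filter a -> Filter a
when f g x
  | null (g x) = [x]
  | otherwise  = f x

-- Hypothetical transformer filter.
double :: Filter Int
double x = [2 * x]

main :: IO ()
main = do
  print (concatMap (isA even `orElse` double) [1, 2, 3])  -- [2,2,6]
  print (concatMap (isA even `guards` double) [1, 2, 3])  -- [4]
  print (concatMap (double `when` isA even)   [1, 2, 3])  -- [1,4,3]
```
<br />
<hask>guards</hask> drops the inputs for which the condition fails, while <hask>when</hask> keeps them unchanged. This is why <hask>when</hask> is the usual choice for editing selected nodes of a tree.<br />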
<br />
These choice operators become useful when transforming and manipulating trees.<br />
<br />
=== Tree traversal filter ===<br />
<br />
A very basic operation on tree structures is the traversal of all nodes and the selection and/or transformation of nodes. These traversal filters serve as control structures for processing whole trees. They correspond to the map and fold combinators for lists.<br />
<br />
The simplest traversal filter does a top down search of all nodes with a special feature. This filter, called <hask>deep</hask>, is defined as follows:<br />
<br />
<haskell><br />
deep :: XmlFilter -> XmlFilter<br />
deep f = f `orElse` (getChildren >>> deep f)<br />
</haskell><br />
<br />
When <hask>deep</hask> is applied to a predicate filter, a top-down search is done and all subtrees satisfying the predicate are collected. The descent into the tree stops when such a subtree is found, because of the use of <hask>orElse</hask>.<br />
<br />
'''Example:''' Selecting all plain text nodes of a document can be formulated with:<br />
<br />
<haskell><br />
deep isXText<br />
</haskell><br />
<br />
'''Example:''' Selecting all "top level" tables in an HTML document looks like<br />
this:<br />
<br />
<haskell><br />
deep (isElem >>> hasName "table")<br />
</haskell><br />
<br />
A variant of <hask>deep</hask>, called <hask>multi</hask>, performs a complete search, where the tree traversal does not stop when a node is found.<br />
<br />
<haskell><br />
multi :: XmlFilter -> XmlFilter<br />
multi f = f <+> (getChildren >>> multi f)<br />
</haskell><br />
<br />
'''Example:''' To select all tables in an HTML document, even nested ones, <hask>multi</hask> has to be used instead of <hask>deep</hask>:<br />
<br />
<hask>multi (isElem >>> hasName "table")</hask><br />
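<br />
The following self-contained sketch contrasts the two traversals on a page with a nested table. It models the filters with plain functions, not the HXT arrows: a node is just its element name, the local <hask>hasName</hask> is a simplified stand-in for the HXT filter of the same name, and <hask>size</hask> and <hask>page</hask> are hypothetical helpers.<br />
<br />
```haskell
module Main where

data NTree a = NTree a [NTree a]

type Tree   = NTree String   -- assumption: a node is just its element name
type Filter = Tree -> [Tree]

infixr 1 >>>
infixr 5 <+>

(>>>) :: Filter -> Filter -> Filter
(f >>> g) t = concat [ g t' | t' <- f t ]

(<+>) :: Filter -> Filter -> Filter
(f <+> g) t = f t ++ g t

orElse :: Filter -> Filter -> Filter
orElse f g t
  | null res1 = g t
  | otherwise = res1
  where
    res1 = f t

getChildren :: Filter
getChildren (NTree _ cs) = cs

-- Simplified stand-in for HXT's hasName.
hasName :: String -> Filter
hasName n t@(NTree l _)
  | l == n    = [t]
  | otherwise = []

deep, multi :: Filter -> Filter
deep  f = f `orElse` (getChildren >>> deep f)
multi f = f <+> (getChildren >>> multi f)

-- Number of nodes in a subtree, used only to tell the matches apart.
size :: Tree -> Int
size (NTree _ cs) = 1 + sum (map size cs)

-- body
--   table          (outer table, 4 nodes)
--     tr
--       td
--         table    (nested table, 1 node)
--   p
--   table          (second top-level table, 1 node)
page :: Tree
page =
  NTree "body"
    [ NTree "table" [ NTree "tr" [ NTree "td" [ NTree "table" [] ] ] ]
    , NTree "p" []
    , NTree "table" []
    ]

main :: IO ()
main = do
  print (map size (deep  (hasName "table") page))  -- [4,1]
  print (map size (multi (hasName "table") page))  -- [4,1,1]
```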
<br />
=== Arrows ===<br />
<br />
We've already seen that filters of type <hask>a -> [b]</hask> are a very powerful and sometimes more elegant way to process XML than pure functions. This is the good news. The bad news is that filters are not general enough. Of course we sometimes want to do some I/O and still stay at the filter level. So we need something like<br />
<br />
<haskell><br />
type XmlIOFilter = XmlTree -> IO [XmlTree]<br />
</haskell><br />
<br />
for working in the IO monad.<br />
<br />
Sometimes it's appropriate to thread some state through the computation like in state monads. This leads to a type like<br />
<br />
<haskell><br />
type XmlStateFilter state = state -> XmlTree -> (state, [XmlTree])<br />
</haskell><br />
<br />
And in real world applications we need both extensions at the same time. Of course I/O is necessary but usually there are also some global options and variables for controlling the computations. In HXT, for instance there are variables for controlling trace output, options for setting the default encoding scheme for input data and a base URI for accessing documents, which are addressed in a content or in a DTD part by relative URIs. So we need something like<br />
<br />
<haskell><br />
type XmlIOStateFilter state = state -> XmlTree -> IO (state, [XmlTree])<br />
</haskell><br />
<br />
We want to work with all four filter variants, and in the future perhaps with even more general filters, but of course not with four sets of filter names, e.g. <hask>deep, deepST, deepIO, deepIOST</hask>.<br />
<br />
This is the point where <hask>newtype</hask>s and <hask>class</hask>es come in. Classes are needed for overloading names and <hask>newtype</hask>s are needed to declare instances. Furthermore, the restriction to <hask>XmlTree</hask> as argument and result type is not necessary and hinders reuse in many cases.<br />
<br />
A filter discussed above has all features of an arrow. Arrows are introduced for generalising the concept of functions and function combination to more general kinds of computation than pure functions.<br />
<br />
A basic set of combinators for arrows is defined in the classes in the <hask>Control.Arrow</hask> module, containing the above mentioned <hask>(>>>), (<+>), arr</hask>.<br />
<br />
In HXT the additional classes for filters working with lists as result type are defined in <hask>Control.Arrow.ArrowList</hask>. The choice operators are in <hask>Control.Arrow.ArrowIf</hask>, tree filters like <hask>getChildren, deep, multi, ...</hask> in <hask>Control.Arrow.ArrowTree</hask>, and the elementary XML-specific filters in <hask>Text.XML.HXT.Arrow.XmlArrow</hask>.<br />
<br />
In HXT there are four types instantiated with these classes for pure list arrows, list arrows with a state, list arrows with IO and list arrows with a state and IO.<br />
<br />
<haskell><br />
newtype LA a b = LA { runLA :: (a -> [b]) }<br />
<br />
newtype SLA s a b = SLA { runSLA :: (s -> a -> (s, [b])) }<br />
<br />
newtype IOLA a b = IOLA { runIOLA :: (a -> IO [b]) }<br />
<br />
newtype IOSLA s a b = IOSLA { runIOSLA :: (s -> a -> IO (s, [b])) }<br />
</haskell><br />
<br />
The first one and the last one are those used most frequently in the toolbox, and of course there are lifting functions for converting the simpler arrows into the more general ones.<br />
<br />
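The benefit of the class-based design is that one arrow definition runs in several arrow types. A minimal sketch, assuming the hxt package is installed: <hask>allText</hask> is a hypothetical name, <hask>xread</hask> parses a string as XML content, and the same arrow is run once as a pure <hask>LA</hask> arrow and once with <hask>runX</hask> in the IO and state arrow.<br />
<br />
```haskell
module Main where

import Text.XML.HXT.Core

-- A polymorphic arrow: usable with every arrow type that is an
-- instance of ArrowXml.
allText :: ArrowXml a => a XmlTree String
allText = deep isText >>> getText

main :: IO ()
main = do
  -- Pure list arrow: runLA unwraps the function a -> [b].
  print (runLA (xread >>> allText) "<a>x<b>y</b></a>")  -- ["x","y"]
  -- Arrow with IO and state: constA supplies the input string,
  -- runX executes the arrow in the IO monad.
  res <- runX (constA "<a>x<b>y</b></a>" >>> xread >>> allText)
  print res
```
<br />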
Don't worry about all these conceptual details. Let's have a look into some ''Hello world'' examples.<br />
<br />
== Getting started: Hello world examples ==<br />
<br />
=== copyXML ===<br />
<br />
The first complete example is a program for copying an XML document<br />
<br />
<haskell><br />
module Main<br />
where<br />
<br />
import Text.XML.HXT.Core<br />
import Text.XML.HXT.Curl -- use libcurl for HTTP access<br />
-- only necessary when reading http://...<br />
<br />
import System.Environment<br />
<br />
main :: IO ()<br />
main<br />
    = do<br />
      [src, dst] <- getArgs<br />
      runX ( readDocument [withValidate no<br />
                          ,withCurl []<br />
                          ] src<br />
             >>><br />
             writeDocument [withIndent yes<br />
                           ,withOutputEncoding isoLatin1<br />
                           ] dst<br />
           )<br />
      return ()<br />
</haskell><br />
<br />
The interesting part of this example is the call of <hask>runX</hask>. <hask>runX</hask> executes an arrow. This arrow is one of the more powerful list arrows with IO and an HXT system state.<br />
<br />
The arrow itself is a composition of <hask>readDocument</hask> and <hask>writeDocument</hask>.<br />
<br />
<hask>readDocument</hask> is an arrow for reading, DTD processing and validation of documents. Its behaviour can be controlled by a list of system options. Here we turn off the validation step. The <hask>src</hask>, a file name or a URI, is read and parsed and a document tree is built.<br />
<br />
The input option <hask>withCurl []</hask> enables reading via HTTP. For using this option, the extra package hxt-curl must be installed, and <hask>withCurl</hask> must be imported by <hask>import Text.XML.HXT.Curl</hask>.<br />
If only file access is necessary, this option and the import can be dropped. In that case the program does not depend on the libCurl binding.<br />
<br />
The tree read in is ''piped'' into the output arrow. This one again is controlled by a set of system options. The <hask>withIndent</hask> option controls the output formatting (here indentation is switched on), and <hask>withOutputEncoding</hask> sets the output encoding to ISO Latin1.<br />
<br />
<hask>writeDocument</hask> converts the tree into a string and writes it to the <hask>dst</hask>.<br />
<br />
We've omitted here the boring stuff of option parsing and error handling.<br />
<br />
Compilation and a test run look like this:<br />
<br />
<pre><br />
hobel > ghc --make -o copyXml CopyXML.hs<br />
hobel > cat hello.xml<br />
<hello><haskell>world</haskell></hello><br />
hobel > copyXml hello.xml -<br />
<?xml version="1.0" encoding="ISO-8859-1"?><br />
<hello><br />
  <haskell>world</haskell><br />
</hello><br />
hobel ><br />
</pre><br />
<br />
The mini XML document in file <tt>hello.xml</tt> is read and a document tree is built. Then this tree is converted into a string and written to standard output (filename: <tt>-</tt>). It is decorated with an XML declaration containing the version and the output encoding.<br />
<br />
For processing HTML documents there is an HTML parser, which tries to parse and interpret almost anything as HTML. The HTML parser can be selected by calling<br />
<br />
<hask>readDocument [withParseHTML yes, ...]</hask><br />
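<br />
As a small illustration of this option, the following sketch parses an incomplete HTML fragment from a string and collects its text. It assumes the hxt package is installed; <hask>htmlTexts</hask> is a hypothetical helper, <hask>readString</hask> is used instead of <hask>readDocument</hask> so that no file is needed, and <hask>withWarnings no</hask> suppresses the parser warnings about the unclosed tags.<br />
<br />
```haskell
module Main where

import Text.XML.HXT.Core

-- Hypothetical helper: parse a string as HTML and return the
-- content of all text nodes.
htmlTexts :: String -> IO [String]
htmlTexts src =
  runX ( readString [ withParseHTML yes   -- select the HTML parser
                    , withWarnings  no    -- no warnings for sloppy HTML
                    ] src
         >>>
         deep isText
         >>>
         getText
       )

main :: IO ()
main = do
  texts <- htmlTexts "<p>Hello <b>world"
  mapM_ print texts
```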
<br />
The available read and write options can be found in the hxt module <hask>Text.XML.HXT.Arrow.XmlState.SystemConfig</hask><br />
<br />
=== Pattern for a main program ===<br />
<br />
A more realistic pattern for a simple Unix-filter-like program has<br />
the following structure:<br />
<br />
<haskell><br />
module Main<br />
where<br />
<br />
import Text.XML.HXT.Core<br />
import Text.XML.HXT.... -- further HXT packages<br />
<br />
import System.IO<br />
import System.Environment<br />
import System.Console.GetOpt<br />
import System.Exit<br />
<br />
main :: IO ()<br />
main<br />
    = do<br />
      argv <- getArgs<br />
      (al, src, dst) <- cmdlineOpts argv<br />
      [rc] <- runX (application al src dst)<br />
      if rc >= c_err<br />
         then exitWith (ExitFailure (0-1))<br />
         else exitWith ExitSuccess<br />
<br />
-- | the dummy for the boring stuff of option evaluation,<br />
-- usually done with 'System.Console.GetOpt'<br />
<br />
cmdlineOpts :: [String] -> IO (SysConfigList, String, String)<br />
cmdlineOpts argv<br />
    = return ([withValidate no], argv!!0, argv!!1)<br />
<br />
-- | the main arrow<br />
<br />
application :: SysConfigList -> String -> String -> IOSArrow b Int<br />
application cfg src dst<br />
    = configSysVars cfg -- (0)<br />
      >>><br />
      readDocument [] src<br />
      >>><br />
      processChildren (processDocumentRootElement `when` isElem) -- (1)<br />
      >>><br />
      writeDocument [] dst -- (3)<br />
      >>><br />
      getErrStatus<br />
<br />
<br />
-- | the dummy for the real processing: the identity filter<br />
<br />
processDocumentRootElement :: IOSArrow XmlTree XmlTree<br />
processDocumentRootElement<br />
    = this -- substitute this by the real application<br />
</haskell><br />
<br />
This program has the same functionality as our first example,<br />
but it separates the arrow from the boring option evaluation and<br />
return code computation.<br />
<br />
In line (0) the system is configured with the list of options.<br />
These options are then used as defaults for all read and write operations.<br />
The options can be overridden for single read/write calls<br />
by putting config options into the parameter list of the<br />
read/write function calls.<br />
<br />
The interesting line is (1).<br />
<hask>readDocument</hask> generates a tree structure with a so-called extra<br />
root node. This root node is a node above the XML document root<br />
element. The node above the XML document root element is necessary<br />
because of possible other nodes on the same tree level as the XML<br />
root, for instance comments, processing instructions or whitespace.<br />
<br />
Furthermore the artificial root node serves for storing meta<br />
information about the document in the attribute list, like the<br />
document name, the encoding scheme, the HTTP transfer headers and<br />
other information.<br />
<br />
To process the real XML root element, we have to take the children of<br />
the root node, select the XML root element and process it, but<br />
leave all other children unchanged. This is done with<br />
<hask>processChildren</hask> and the <hask>when</hask> choice<br />
operator. <hask>processChildren</hask> applies a filter elementwise to<br />
all children of a node. The results of processing the list of children form<br />
the children of the result node.<br />
<br />
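The effect of <hask>processChildren</hask> combined with <hask>when</hask> can be tried in the pure <hask>LA</hask> arrow without reading a file. This is a sketch under assumptions: it needs the hxt package, <hask>markElement</hask> is a hypothetical stand-in for <hask>processDocumentRootElement</hask>, and <hask>doc</hask> builds a document tree by hand, with a root node that has a comment and the document element as children.<br />
<br />
```haskell
module Main where

import Text.XML.HXT.Core

-- Hypothetical stand-in for processDocumentRootElement:
-- mark the element with an attribute.
markElement :: ArrowXml a => a XmlTree XmlTree
markElement = addAttr "processed" "yes"

-- A hand-built document tree: the extra root node with a comment and
-- the real document element as its children.
doc :: ArrowXml a => a b XmlTree
doc = root [] [ cmt " a comment ", selem "hello" [ txt "world" ] ]

main :: IO ()
main =
  mapM_ putStrLn $
    runLA ( doc
            >>>
            processChildren (markElement `when` isElem)
            >>>
            xshow getChildren   -- serialise the children of the root node
          ) ()
```
<br />
Only the <tt>hello</tt> element receives the new attribute. The comment is not an element, so <hask>when</hask> passes it through unchanged.<br />
<br />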
The structure of the internal document tree can be made visible<br />
e.g. by adding the option <hask>withShowTree yes</hask> to the<br />
<hask>writeDocument</hask> arrow in (3).<br />
This will emit the tree in a readable<br />
text representation instead of the real document.<br />
<br />
In the next section we will give examples for the<br />
<hask>processDocumentRootElement</hask> arrow.<br />
<br />
=== Tracing ===<br />
<br />
There are tracing facilities to observe the actions performed<br />
and to show intermediate results:<br />
<br />
<haskell><br />
application :: SysConfigList -> String -> String -> IOSArrow b Int<br />
application cfg src dst<br />
    = configSysVars (withTrace 1 : cfg) -- (0)<br />
      >>><br />
      traceMsg 1 "start reading document" -- (1)<br />
      >>><br />
      readDocument [] src<br />
      >>><br />
      traceMsg 1 "document read, start processing" -- (2)<br />
      >>><br />
      processChildren (processDocumentRootElement `when` isElem)<br />
      >>><br />
      traceMsg 1 "document processed" -- (3)<br />
      >>><br />
      writeDocument [] dst<br />
      >>><br />
      getErrStatus<br />
</haskell><br />
<br />
In (0) the system trace level is set to 1; at the default level 0<br />
all trace messages are suppressed. The three trace messages (1)-(3)<br />
will be issued, but readDocument and writeDocument will also<br />
log their activities.<br />
<br />
How a whole document and the internal tree structure can be traced<br />
is shown in the following example:<br />
<br />
<haskell><br />
...<br />
>>><br />
processChildren (processDocumentRootElement `when` isElem)<br />
>>><br />
withTraceLevel 4 (traceDoc "resulting document") -- (1)<br />
>>><br />
... <br />
</haskell><br />
<br />
In (1) the trace level is locally set to the highest level 4.<br />
traceDoc will then issue the trace message, the document formatted<br />
as XML, and the internal DOM tree of the document.<br />
<br />
== Selection examples ==<br />
<br />
=== Selecting text from an HTML document ===<br />
<br />
Selecting all the plain text of an XML/HTML document<br />
can be formulated with<br />
<br />
<haskell><br />
selectAllText :: ArrowXml a => a XmlTree XmlTree<br />
selectAllText<br />
    = deep isText<br />
</haskell><br />
<br />
<hask>deep</hask> traverses the whole tree, stops the traversal when<br />
a node is a text node (<hask>isText</hask>) and returns all the text nodes.<br />
There are two other traversal operators, <hask>deepest</hask> and <hask>multi</hask>.<br />
In this case, where the selected nodes are all leaves, these would give the same result.<br />
<br />
=== Selecting text and ALT attribute values ===<br />
<br />
Let's take a slightly more complex task: we want to select all text, but also the values of the <tt>alt</tt> attributes<br />
of image tags.<br />
<br />
<haskell><br />
selectAllTextAndAltValues :: ArrowXml a => a XmlTree XmlTree<br />
selectAllTextAndAltValues<br />
    = deep<br />
      ( isText -- (1)<br />
        <+><br />
        ( isElem >>> hasName "img" -- (2)<br />
          >>><br />
          getAttrValue "alt" -- (3)<br />
          >>><br />
          mkText -- (4)<br />
        )<br />
      )<br />
</haskell><br />
<br />
The whole tree is searched for text nodes (1) and for image elements (2). From the image elements<br />
the alt attribute values are selected as plain text (3), and this text is transformed into a text node (4).<br />
<br />
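A quick way to try such a selection arrow is to run it as a pure <hask>LA</hask> arrow on a string. The sketch below assumes the hxt package is installed; it repeats the arrow so that the block is self-contained, parses a made-up fragment with <hask>xread</hask>, and extracts the resulting text with <hask>getText</hask>.<br />
<br />
```haskell
module Main where

import Text.XML.HXT.Core

selectAllTextAndAltValues :: ArrowXml a => a XmlTree XmlTree
selectAllTextAndAltValues
  = deep
    ( isText
      <+>
      ( isElem >>> hasName "img"
        >>>
        getAttrValue "alt"
        >>>
        mkText
      )
    )

-- Made-up input fragment for the test run.
fragment :: String
fragment = "<p>A <img alt=\"logo\" src=\"logo.png\"/> here</p>"

main :: IO ()
main =
  print (runLA (xread >>> selectAllTextAndAltValues >>> getText) fragment)
  -- ["A ","logo"," here"]
```
<br />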
=== Selecting text and ALT attribute values (2) ===<br />
<br />
Let's refine the above filter one step further. The text from the alt attributes shall be marked in the output<br />
by surrounding double square brackets. Empty alt values shall be ignored.<br />
<br />
<haskell><br />
selectAllTextAndRealAltValues :: ArrowXml a => a XmlTree XmlTree<br />
selectAllTextAndRealAltValues<br />
    = deep<br />
      ( isText<br />
        <+><br />
        ( isElem >>> hasName "img"<br />
          >>><br />
          getAttrValue "alt"<br />
          >>><br />
          isA significant -- (1)<br />
          >>><br />
          arr addBrackets -- (2)<br />
          >>><br />
          mkText<br />
        )<br />
      )<br />
    where<br />
    significant :: String -> Bool<br />
    significant = not . all (`elem` " \n\r\t")<br />
<br />
    addBrackets :: String -> String<br />
    addBrackets s<br />
        = " [[ " ++ s ++ " ]] "<br />
</haskell><br />
<br />
This example shows two combinators for building arrows from pure functions.<br />
The first one <hask>isA</hask> removes all empty or whitespace values from alt attributes (1),<br />
the other <hask>arr</hask> lifts the editing function to the arrow level (2).<br />
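<br />
The behaviour of the two combinators can be tried out without HXT. The following is only a sketch under a simplifying assumption: list arrows are modelled by <hask>Kleisli []</hask> from <hask>Control.Arrow</hask>, and <hask>isAModel</hask> is a hypothetical stand-in for HXT's <hask>isA</hask>, not the library function itself.<br />
<br />
<haskell><br />
import Control.Arrow (Kleisli (..), arr, (>>>))<br />
<br />
-- Assumption: a list arrow is modelled as a function returning a list of results.<br />
type ListArrow = Kleisli []<br />
<br />
-- Hypothetical stand-in for HXT's isA: keep the input if the predicate holds,<br />
-- otherwise deliver no result at all.<br />
isAModel :: (a -> Bool) -> ListArrow a a<br />
isAModel p = Kleisli (\x -> [x | p x])<br />
<br />
significant :: String -> Bool<br />
significant = not . all (`elem` " \n\r\t")<br />
<br />
addBrackets :: String -> String<br />
addBrackets s = " [[ " ++ s ++ " ]] "<br />
<br />
-- Filter first, then lift the pure editing function with arr.<br />
markAlt :: ListArrow String String<br />
markAlt = isAModel significant >>> arr addBrackets<br />
<br />
main :: IO ()<br />
main = do<br />
  print (runKleisli markAlt "Haskell logo")  -- [" [[ Haskell logo ]] "]<br />
  print (runKleisli markAlt "  \n")          -- []<br />
</haskell><br />
<br />
A whitespace-only value yields the empty result list, so nothing reaches <hask>addBrackets</hask>.<br />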
<br />
== Document construction examples ==<br />
<br />
=== The ''Hello World'' document ===<br />
<br />
The first document, of course, is a ''Hello World'' document:<br />
<br />
<haskell><br />
helloWorld :: ArrowXml a => a XmlTree XmlTree<br />
helloWorld<br />
    = mkelem "html" []                     -- (1)<br />
      [ mkelem "head" []<br />
        [ mkelem "title" []<br />
          [ txt "Hello World" ]            -- (2)<br />
        ]<br />
      , mkelem "body"<br />
        [ sattr "class" "haskell" ]        -- (3)<br />
        [ mkelem "h1" []<br />
          [ txt "Hello World" ]            -- (4)<br />
        ]<br />
      ]<br />
</haskell><br />
<br />
The main arrows for document construction are <hask>mkelem</hask><br />
and its variants (<hask>selem, aelem, eelem</hask>) for element creation, <hask>attr</hask> and <hask>sattr</hask> for attributes and <hask>mkText</hask><br />
and <hask>txt</hask> for text nodes. <hask>mkelem</hask> takes three arguments: the element name (or tag name), a list of arrows for the construction of attributes (non-empty in (3)), and a list of arrows for the contents. Text content is generated in (2) and (4).<br />
<br />
To write this document to a file, use the following arrow:<br />
<br />
<haskell><br />
root [] [helloWorld] -- (1)<br />
>>><br />
writeDocument [withIndent yes] "hello.xml" -- (2)<br />
</haskell><br />
<br />
When this arrow is executed, the <hask>helloWorld</hask><br />
document is wrapped into a so-called root node (1). This complete<br />
document is written to "hello.xml" (2).<br />
<hask>writeDocument</hask> and its variants always expect<br />
a whole document tree with such a root node. Before writing, the document is<br />
indented (<hask>withIndent yes</hask>) by inserting extra whitespace<br />
text nodes, and an XML declaration with version and encoding is added. Without the indent option the whole document would appear on a single line; with it the output looks like this:<br />
<br />
<pre><br />
<?xml version="1.0" encoding="UTF-8"?><br />
<html><br />
  <head><br />
    <title>Hello World</title><br />
  </head><br />
  <body class="haskell"><br />
    <h1>Hello World</h1><br />
  </body><br />
</html><br />
</pre><br />
<br />
The code can be shortened a bit by using some of the<br />
convenience functions:<br />
<br />
<haskell><br />
helloWorld2 :: ArrowXml a => a XmlTree XmlTree<br />
helloWorld2<br />
    = selem "html"<br />
      [ selem "head"<br />
        [ selem "title"<br />
          [ txt "Hello World" ]<br />
        ]<br />
      , mkelem "body"<br />
        [ sattr "class" "haskell" ]<br />
        [ selem "h1"<br />
          [ txt "Hello World" ]<br />
        ]<br />
      ]<br />
</haskell><br />
<br />
In the above two examples the arrow input is totally ignored, because<br />
of the use of the constant arrow <hask>txt "..."</hask>.<br />
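<br />
To try the construction arrows, they have to be embedded in a complete program. The following is a minimal sketch, assuming the <tt>hxt</tt> package (9.x) is installed and that <hask>Text.XML.HXT.Core</hask> exports everything used; <hask>helloDoc</hask> is a hypothetical name for a shortened variant of the document above.<br />
<br />
<haskell><br />
module Main where<br />
<br />
import Text.XML.HXT.Core<br />
<br />
-- Hypothetical, shortened variant of the helloWorld2 document.<br />
helloDoc :: ArrowXml a => a XmlTree XmlTree<br />
helloDoc<br />
    = selem "html"<br />
      [ mkelem "body"<br />
        [ sattr "class" "haskell" ]<br />
        [ selem "h1" [ txt "Hello World" ] ]<br />
      ]<br />
<br />
main :: IO ()<br />
main<br />
    = do<br />
      -- runX executes the arrow in the IO monad; the result list is not needed here.<br />
      _ <- runX ( root [] [helloDoc]<br />
                  >>><br />
                  writeDocument [withIndent yes] "hello.xml"<br />
                )<br />
      return ()<br />
</haskell><br />
<br />
Running the program should write <tt>hello.xml</tt> into the current directory.<br />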
<br />
=== A page about all images within a HTML page ===<br />
<br />
A more interesting task is the construction of a page<br />
containing a table of all images within a page, including image URLs, geometry and ALT attributes.<br />
<br />
The program for this has a frame similar to the <hask>helloWorld</hask> program,<br />
but the rows of the table must be filled in from the input document.<br />
In the first step we will generate a table with a single column containing<br />
the URL of the image.<br />
<br />
<haskell><br />
imageTable :: ArrowXml a => a XmlTree XmlTree<br />
imageTable<br />
    = selem "html"<br />
      [ selem "head"<br />
        [ selem "title"<br />
          [ txt "Images in Page" ]<br />
        ]<br />
      , selem "body"<br />
        [ selem "h1"<br />
          [ txt "Images in Page" ]<br />
        , selem "table"<br />
          [ collectImages          -- (1)<br />
            >>><br />
            genTableRows           -- (2)<br />
          ]<br />
        ]<br />
      ]<br />
    where<br />
    collectImages                  -- (1)<br />
        = deep ( isElem<br />
                 >>><br />
                 hasName "img"<br />
               )<br />
    genTableRows                   -- (2)<br />
        = selem "tr"<br />
          [ selem "td"<br />
            [ getAttrValue "src" >>> mkText ]<br />
          ]<br />
</haskell><br />
<br />
With (1) the image elements are collected, and with (2)<br />
a table row is built for each image element.<br />
<br />
Applied to <tt>http://www.haskell.org/</tt> we get the following result<br />
(at the time of writing this page):<br />
<br />
<pre><br />
<html><br />
  <head><br />
    <title>Images in Page</title><br />
  </head><br />
  <body><br />
    <h1>Images in Page</h1><br />
    <table><br />
      <tr><br />
        <td>/haskellwiki_logo.png</td><br />
      </tr><br />
      <tr><br />
        <td>/sitewiki/images/1/10/Haskelllogo-small.jpg</td><br />
      </tr><br />
      <tr><br />
        <td>/haskellwiki_logo_small.png</td><br />
      </tr><br />
    </table><br />
  </body><br />
</html><br />
</pre><br />
<br />
When generating HTML, there are often constant parts within the page,<br />
in this example the page header. It's possible to write these<br />
parts as a string containing plain HTML and then read this with<br />
a simple XML contents parser called <hask>xread</hask>.<br />
<br />
The example above could then be rewritten as<br />
<br />
<haskell><br />
imageTable<br />
    = selem "html"<br />
      [ pageHeader<br />
      , ...<br />
      ]<br />
    where<br />
    pageHeader<br />
        = constA "<head><title>Images in Page</title></head>"<br />
          >>><br />
          xread<br />
    ...<br />
</haskell><br />
<br />
<hask>xread</hask> is a very primitive arrow. It does not run in the<br />
IO monad, so it can be used in any context, but as a consequence its error handling<br />
is very limited. <hask>xread</hask> parses the content of an XML element.<br />
<br />
=== A page about all images within a HTML page: 1. Refinement ===<br />
<br />
The next refinement step is the extension of the table such that<br />
it contains four columns: one for the image itself, one for the URL,<br />
one for the geometry and one for the ALT text. The extended <hask>genTableRows</hask><br />
has the following form:<br />
<br />
<haskell><br />
genTableRows<br />
    = selem "tr"<br />
      [ selem "td"                         -- (1)<br />
        [ this                             -- (1.1)<br />
        ]<br />
      , selem "td"                         -- (2)<br />
        [ getAttrValue "src"<br />
          >>><br />
          mkText<br />
          >>><br />
          mkelem "a"                       -- (2.1)<br />
          [ attr "href" this ]<br />
          [ this ]<br />
        ]<br />
      , selem "td"                         -- (3)<br />
        [ ( getAttrValue "width"<br />
            &&&                            -- (3.1)<br />
            getAttrValue "height"<br />
          )<br />
          >>><br />
          arr2 geometry                    -- (3.2)<br />
          >>><br />
          mkText<br />
        ]<br />
      , selem "td"                         -- (4)<br />
        [ getAttrValue "alt"<br />
          >>><br />
          mkText<br />
        ]<br />
      ]<br />
    where<br />
    geometry :: String -> String -> String<br />
    geometry "" ""<br />
        = ""<br />
    geometry w h<br />
        = w ++ "x" ++ h<br />
</haskell><br />
<br />
In (1) the identity arrow <hask>this</hask> is used for<br />
inserting the whole image element (the <hask>this</hask> value) into the first column.<br />
(2) is the column from the previous example, but the URL has been made active<br />
by embedding it in an A-element (2.1). In (3) there are two<br />
new combinators. <hask>(&&&)</hask> (3.1) applies two<br />
arrows to the same input and combines the results into a pair. <hask>arr2</hask> (3.2)<br />
works like <hask>arr</hask>, but it lifts a binary function into an arrow<br />
accepting a pair of values. <hask>arr2 f</hask> is a shortcut for<br />
<hask>arr (uncurry f)</hask>. So width and height are combined into an X11-like<br />
geometry spec. (4) adds the ALT text.<br />
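<br />
The interplay of <hask>(&&&)</hask> and <hask>arr2</hask> can be illustrated with ordinary functions, which are arrows too. This is a sketch using only <hask>Control.Arrow</hask> from base; the <hask>Img</hask> record and <hask>arr2Model</hask> are assumptions made for the illustration, they are not part of HXT.<br />
<br />
<haskell><br />
import Control.Arrow (Arrow, arr, (&&&), (>>>))<br />
<br />
-- Hypothetical stand-in for an image element with two attributes.<br />
data Img = Img { width :: String, height :: String }<br />
<br />
-- Stand-in for HXT's arr2, defined exactly as the shortcut described above.<br />
arr2Model :: Arrow a => (b -> c -> d) -> a (b, c) d<br />
arr2Model f = arr (uncurry f)<br />
<br />
geometry :: String -> String -> String<br />
geometry "" "" = ""<br />
geometry w  h  = w ++ "x" ++ h<br />
<br />
-- Both selectors see the same input; their results are paired and combined.<br />
imgGeometry :: Img -> String<br />
imgGeometry = (width &&& height) >>> arr2Model geometry<br />
<br />
main :: IO ()<br />
main = do<br />
  putStrLn (imgGeometry (Img "640" "480"))  -- 640x480<br />
  print (imgGeometry (Img "" ""))           -- ""<br />
</haskell><br />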
<br />
=== A page about all images within a HTML page: 2. Refinement ===<br />
<br />
The generated HTML page is not yet very useful, because it usually<br />
contains relative HREFs to the images, so the links do not work.<br />
We have to transform the SRC attribute values into absolute URLs.<br />
This can be done with the following code:<br />
<br />
<haskell><br />
imageTable2 :: IOStateArrow s XmlTree XmlTree<br />
imageTable2<br />
    = ...<br />
      ...<br />
      , selem "table"<br />
        [ collectImages<br />
          >>><br />
          mkAbsImageRef                    -- (1)<br />
          >>><br />
          genTableRows<br />
        ]<br />
      ...<br />
<br />
mkAbsImageRef :: IOStateArrow s XmlTree XmlTree   -- (1)<br />
mkAbsImageRef<br />
    = processAttrl ( mkAbsRef              -- (2)<br />
                     `when`<br />
                     hasName "src"         -- (3)<br />
                   )<br />
    where<br />
    mkAbsRef                               -- (4)<br />
        = replaceChildren<br />
          ( xshow getChildren              -- (5)<br />
            >>><br />
            ( mkAbsURI `orElse` this )     -- (6)<br />
            >>><br />
            mkText                         -- (7)<br />
          )<br />
</haskell><br />
<br />
The <hask>imageTable2</hask> is extended by an arrow <hask>mkAbsImageRef</hask><br />
(1). This arrow uses the global system state of HXT, in which the base URL<br />
of a document is stored. For editing the SRC attribute value, the attribute list<br />
of the image elements is processed with <hask>processAttrl</hask> (2).<br />
With <hask>`when` hasName "src"</hask> only SRC attributes are manipulated (3). The real work is done in (4): the URL, a text node, is selected with <hask>getChildren</hask> and converted into a string with <hask>xshow</hask> (5), then it is transformed into an absolute URL<br />
with <hask>mkAbsURI</hask> (6). This arrow may fail, e.g. in case of illegal<br />
URLs. In this case the URL remains unchanged (<hask>`orElse` this</hask>).<br />
The resulting String value is converted into a text node forming the new<br />
attribute value node (7).<br />
<br />
Because of the use of the global HXT state in <hask>mkAbsURI</hask>,<br />
<hask>mkAbsImageRef</hask> and <hask>imageTable2</hask> need to have the more specialized signature <hask>IOStateArrow s XmlTree XmlTree</hask>.<br />
<br />
== Transformation examples ==<br />
<br />
=== Decorating external references of an HTML document ===<br />
<br />
In the following examples, we want to decorate the external references<br />
in an HTML page with a small icon, as is done in many wikis.<br />
For this task the document tree has to be traversed; all parts<br />
except the interesting A-elements remain unchanged. At the end of the list of children of an A-element we add an image element.<br />
<br />
Here is the first version:<br />
<br />
<haskell><br />
addRefIcon :: ArrowXml a => a XmlTree XmlTree<br />
addRefIcon<br />
    = processTopDown                       -- (1)<br />
      ( addImg                             -- (2)<br />
        `when`<br />
        isExternalRef                      -- (3)<br />
      )<br />
    where<br />
    isExternalRef                          -- (4)<br />
        = isElem<br />
          >>><br />
          hasName "a"<br />
          >>><br />
          hasAttr "href"<br />
          >>><br />
          getAttrValue "href"<br />
          >>><br />
          isA isExtRef<br />
        where<br />
        isExtRef                           -- (4.1)<br />
            = isPrefixOf "http:"           -- or something more precise<br />
<br />
    addImg<br />
        = replaceChildren                  -- (5)<br />
          ( getChildren                    -- (6)<br />
            <+><br />
            imgElement                     -- (7)<br />
          )<br />
<br />
    imgElement<br />
        = mkelem "img"                     -- (8)<br />
          [ sattr "src" "/icons/ref.png"   -- (9)<br />
          , sattr "alt" "external ref"<br />
          ] []                             -- (10)<br />
</haskell><br />
<br />
The traversal is done with <hask>processTopDown</hask> (1).<br />
This arrow applies an arrow to all nodes of the whole document tree.<br />
The transformation arrow applies <hask>addImg</hask> (2) to<br />
all A-elements with an external reference (3), (4). It uses a somewhat simplified test (4.1)<br />
for external URLs.<br />
<hask>addImg</hask> manipulates all children (5) of the A-elements by<br />
selecting the current children (6) and adding an image element (7).<br />
The image element is constructed with <hask>mkelem</hask> (8). This takes<br />
an element name, a list of arrows for computing the attributes and a<br />
list of arrows for computing the contents. The content of the image element is<br />
empty (10). The attributes are constructed with <hask>sattr</hask> (9).<br />
<hask>sattr</hask> ignores the arrow input and builds an attribute from<br />
the name-value pair of arguments.<br />
<br />
=== Transform external references into absolute references ===<br />
<br />
In the following example we will develop a program for<br />
editing a HTML page such that all references to external documents<br />
(images, hypertext refs, style refs, ...) become absolute references.<br />
We will see some new, but very useful combinators in the solution.<br />
<br />
The task seems to be rather trivial: in a tree traversal<br />
all references are edited with respect to the document base.<br />
But in HTML there is a BASE element, allowed in the content of HEAD,<br />
with a HREF attribute that defines the document base. Again, this<br />
href can be a relative URL.<br />
<br />
We start the development with the editing arrow. This gets<br />
the real document base as argument.<br />
<br />
<haskell><br />
mkAbsHRefs :: ArrowXml a => String -> a XmlTree XmlTree<br />
mkAbsHRefs base<br />
    = processTopDown editHRef              -- (1)<br />
    where<br />
    editHRef<br />
        = processAttrl                     -- (3)<br />
          ( changeAttrValue (absHRef base) -- (5)<br />
            `when`<br />
            hasName "href"                 -- (4)<br />
          )<br />
          `when`<br />
          ( isElem >>> hasName "a" )       -- (2)<br />
        where<br />
<br />
        absHRef :: String -> String -> String   -- (5)<br />
        absHRef base url<br />
            = fromMaybe url . expandURIString url $ base<br />
</haskell><br />
<br />
The tree is traversed (1), and for every A element (2) the attribute<br />
list is processed (3). All HREF attribute values (4) are manipulated<br />
by <hask>changeAttrValue</hask>, called with a string function (5).<br />
<hask>expandURIString</hask> is a pure function defined in HXT for computing<br />
an absolute URI.<br />
In this first step we only edit A-HREF attribute values. We will refine this<br />
later.<br />
<br />
The second step is the complete computation of the base URL.<br />
<br />
<haskell><br />
computeBaseRef :: IOStateArrow s XmlTree String<br />
computeBaseRef<br />
    = ( ( ( isElem >>> hasName "html"      -- (0)<br />
            >>><br />
            getChildren                    -- (1)<br />
            >>><br />
            isElem >>> hasName "head"      -- (2)<br />
            >>><br />
            getChildren                    -- (3)<br />
            >>><br />
            isElem >>> hasName "base"      -- (4)<br />
            >>><br />
            getAttrValue "href"            -- (5)<br />
          )<br />
          &&&<br />
          getBaseURI                       -- (6)<br />
        )<br />
        >>> expandURI                      -- (7)<br />
      )<br />
      `orElse` getBaseURI                  -- (8)<br />
</haskell><br />
<br />
Input to this arrow is the HTML element. (0) to (5) is the arrow for selecting<br />
the BASE element's HREF value. Parallel to this, the system base URL is read<br />
with <hask>getBaseURI</hask> (6), as in the examples above. The resulting<br />
pair of strings is piped into <hask>expandURI</hask> (7), the arrow version of<br />
<hask>expandURIString</hask>. This arrow ((0) to (7)) fails in the absence<br />
of a BASE element. In this case we take the plain document base (8).<br />
The selection of the BASE element is not yet very handy. We will define<br />
a more general and elegant function later, allowing an element path as selection argument.<br />
<br />
In the third step, we will combine the two arrows. For this we will use<br />
a new combinator, <hask>($<)</hask>. The need for this new combinator<br />
arises as follows: we need the arrow input (the document) twice,<br />
once for computing the document base and once for editing the<br />
whole document, and we want to compute the extra string parameter<br />
for editing with the arrow defined above.<br />
<br />
The combined arrow, our main arrow, looks like this<br />
<br />
<haskell><br />
toAbsRefs :: IOStateArrow s XmlTree XmlTree<br />
toAbsRefs<br />
    = mkAbsHRefs $< computeBaseRef         -- (1)<br />
</haskell><br />
<br />
In (1) first the arrow input is piped into <hask>computeBaseRef</hask>,<br />
this result is used in <hask>mkAbsHRefs</hask> as extra string parameter<br />
when processing the document. Internally the <hask>($<)</hask> combinator<br />
is defined by the basic combinators <hask>(&&&), (>>>)</hask> and <hask>app</hask>. In slightly more complex computations<br />
this pattern occurs rather frequently, so <hask>($<)</hask> becomes very useful.<br />
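<br />
How such a combinator can be built from <hask>(&&&)</hask>, <hask>(>>>)</hask> and <hask>app</hask> is sketched below for any <hask>ArrowApply</hask> instance from base. <hask>withParam</hask> is a hypothetical name chosen for this illustration; the sketch shows the idea only and is not claimed to be the definition used inside HXT.<br />
<br />
<haskell><br />
import Control.Arrow (ArrowApply, app, arr, returnA, (&&&), (>>>))<br />
<br />
-- Hypothetical analogue of ($<): compute a parameter from the input with g,<br />
-- build an arrow from that parameter with f, and apply it to the same input.<br />
withParam :: ArrowApply a => (c -> a b d) -> a b c -> a b d<br />
withParam f g<br />
    = (g &&& returnA)                   -- pair the parameter with the original input<br />
      >>><br />
      arr (\ (c, b) -> (f c, b))        -- turn the parameter into an arrow<br />
      >>><br />
      app                               -- run that arrow on the original input<br />
<br />
-- Example with plain functions: the length of the input is the extra parameter.<br />
prefixWithLength :: String -> String<br />
prefixWithLength = withParam (\ n s -> show n ++ ":" ++ s) length<br />
<br />
main :: IO ()<br />
main = putStrLn (prefixWithLength "hello")  -- 5:hello<br />
</haskell><br />
<br />
The input is used twice here, once by <hask>length</hask> and once by the editing function, which is the situation described above.<br />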
<br />
Programming with arrows is one style of point-free programming. Point-free<br />
programming often becomes unwieldy when values are used more than once.<br />
One solution is the special arrow syntax supported by GHC and others, similar to the do notation for monads. But for many simple cases the <hask>($<)</hask> combinator and its variants <hask>($<<), ($<<<), ($<<<<), ($<$)</hask><br />
are sufficient.<br />
<br />
To complete the development of the example, a last step is necessary:<br />
the removal of the redundant BASE element.<br />
<br />
<haskell><br />
toAbsRefs :: IOStateArrow s XmlTree XmlTree<br />
toAbsRefs<br />
    = ( mkAbsHRefs $< computeBaseRef )<br />
      >>><br />
      removeBaseElement<br />
<br />
removeBaseElement :: ArrowXml a => a XmlTree XmlTree<br />
removeBaseElement<br />
    = processChildren<br />
      ( processChildren<br />
        ( none                             -- (1)<br />
          `when`<br />
          ( isElem >>> hasName "base" )<br />
        )<br />
        `when`<br />
        ( isElem >>> hasName "head" )<br />
      )<br />
</haskell><br />
<br />
In this function the children of the HEAD element are searched for<br />
a BASE element. This is removed by applying the null arrow <hask>none</hask> (1)<br />
to the input, which always returns the empty list.<br />
<hask>none `when` ...</hask> is the pattern for deleting nodes from a tree.<br />
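<br />
The deletion pattern can be made concrete with a toy model of filters as functions from a node to a list of nodes. Everything in this sketch is an assumption made for illustration: the <hask>Node</hask> type and the functions ending in <hask>M</hask> only imitate the behaviour described in this text, they are not HXT's implementations.<br />
<br />
<haskell><br />
-- Toy model; names ending in M are stand-ins, not HXT functions.<br />
data Node = Node { name :: String, children :: [Node] }<br />
  deriving (Eq, Show)<br />
<br />
type Filter = Node -> [Node]<br />
<br />
-- The null filter: never delivers a result.<br />
noneM :: Filter<br />
noneM _ = []<br />
<br />
hasNameM :: String -> Filter<br />
hasNameM n t = [t | name t == n]<br />
<br />
-- f `whenM` g: apply f where g succeeds, otherwise keep the node unchanged.<br />
whenM :: Filter -> Filter -> Filter<br />
whenM f g t<br />
  | null (g t) = [t]<br />
  | otherwise  = f t<br />
<br />
-- Replace the children of a node by the results of applying f to each child.<br />
processChildrenM :: Filter -> Filter<br />
processChildrenM f (Node n cs) = [Node n (concatMap f cs)]<br />
<br />
removeBaseM :: Filter<br />
removeBaseM =<br />
  processChildrenM<br />
    ( processChildrenM (noneM `whenM` hasNameM "base")<br />
      `whenM` hasNameM "head"<br />
    )<br />
<br />
main :: IO ()<br />
main = do<br />
  let doc      = Node "html" [Node "head" [Node "base" [], Node "title" []], Node "body" []]<br />
      expected = Node "html" [Node "head" [Node "title" []], Node "body" []]<br />
  print (removeBaseM doc == [expected])  -- True<br />
</haskell><br />
<br />
Because the filter applied to the BASE node returns no result, the node simply disappears from the rebuilt list of children.<br />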
<br />
The <hask>computeBaseRef</hask> function defined above contains an arrow pattern<br />
for selecting the right subtree that is rather common in HXT applications:<br />
<br />
<haskell><br />
isElem >>> hasName n1<br />
>>><br />
getChildren<br />
>>><br />
isElem >>> hasName n2<br />
...<br />
>>><br />
getChildren<br />
>>><br />
isElem >>> hasName nm<br />
</haskell><br />
<br />
For this pattern we will define a convenience function creating the<br />
arrow for selection:<br />
<br />
<haskell><br />
getDescendents :: ArrowXml a => [String] -> a XmlTree XmlTree<br />
getDescendents<br />
    = foldl1 (\ x y -> x >>> getChildren >>> y)    -- (1)<br />
      .<br />
      map (\ n -> isElem >>> hasName n)            -- (2)<br />
</haskell><br />
<br />
The name list is mapped to the element checking arrows (2), and<br />
the resulting list of arrows is folded with <hask>getChildren</hask> (1)<br />
into a single arrow. <hask>computeBaseRef</hask> can then be simplified<br />
and becomes more readable:<br />
<br />
<haskell><br />
computeBaseRef :: IOStateArrow s XmlTree String<br />
computeBaseRef<br />
    = ( ( ( getDescendents ["html","head","base"]  -- (1)<br />
            >>><br />
            getAttrValue "href"                    -- (2)<br />
          )<br />
          ...<br />
      ...<br />
</haskell><br />
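<br />
The fold used in <hask>getDescendents</hask> can be tried out on a toy tree without HXT. This is a sketch under stated assumptions: list arrows are modelled by <hask>Kleisli []</hask> from base, and <hask>Node</hask>, <hask>hasNameM</hask>, <hask>getChildrenM</hask> and <hask>getDescendentsM</hask> are hypothetical stand-ins for the HXT names used above.<br />
<br />
<haskell><br />
import Control.Arrow (Kleisli (..), (>>>))<br />
<br />
-- Toy element tree; not HXT's XmlTree.<br />
data Node = Node { name :: String, children :: [Node] }<br />
<br />
type Filter = Kleisli [] Node Node<br />
<br />
hasNameM :: String -> Filter<br />
hasNameM n = Kleisli (\ t -> [t | name t == n])<br />
<br />
getChildrenM :: Filter<br />
getChildrenM = Kleisli children<br />
<br />
-- Same shape as getDescendents: check a name, step down, check the next name.<br />
getDescendentsM :: [String] -> Filter<br />
getDescendentsM<br />
    = foldl1 (\ x y -> x >>> getChildrenM >>> y)<br />
      .<br />
      map hasNameM<br />
<br />
doc :: Node<br />
doc = Node "html"<br />
        [ Node "head" [ Node "base" [], Node "title" [] ]<br />
        , Node "body" []<br />
        ]<br />
<br />
main :: IO ()<br />
main = print (map name (runKleisli (getDescendentsM ["html", "head", "base"]) doc))<br />
-- ["base"]<br />
</haskell><br />
<br />
Note that <hask>foldl1</hask> is itself partial: an empty name list raises an error, in the model as well as in the original definition.<br />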
<br />
An even more general and flexible technique is the use of XPath expressions,<br />
available for selection of document parts and defined in the module<br />
<hask>Text.XML.HXT.Arrow.XmlNodeSet</hask>.<br />
<br />
With XPath <hask>computeBaseRef</hask> can be simplified to<br />
<br />
<haskell><br />
computeBaseRef<br />
    = ( ( ( getXPathTrees "/html/head/base"        -- (1)<br />
            >>><br />
            getAttrValue "href"                    -- (2)<br />
          )<br />
          ...<br />
</haskell><br />
<br />
Even the attribute selection can be expressed by XPath,<br />
so (1) and (2) can be combined into<br />
<br />
<haskell><br />
computeBaseRef<br />
    = ( ( xshow (getXPathTrees "/html/head/base/@href")<br />
          ...<br />
</haskell><br />
<br />
The extra <hask>xshow</hask> is here required to convert the<br />
XPath result, an XmlTree, into a string.<br />
<br />
XPath defines a<br />
full language for selecting parts of an XML document.<br />
Sometimes it's rather comfortable to make selections of this<br />
type, but XPath evaluation in general is more expensive<br />
in time and space than a simple combination of arrows, as we've<br />
seen in <hask>getDescendents</hask>.<br />
<br />
=== Transform external references into absolute references: Refinement ===<br />
<br />
In the above example only A-HREF URLs are edited. Now we extend this<br />
to other element-attribute combinations.<br />
<br />
<haskell><br />
mkAbsRefs :: ArrowXml a => String -> a XmlTree XmlTree<br />
mkAbsRefs base<br />
    = processTopDown ( editRef "a" "href"          -- (2)<br />
                       >>><br />
                       editRef "img" "src"         -- (3)<br />
                       >>><br />
                       editRef "link" "href"       -- (4)<br />
                       >>><br />
                       editRef "script" "src"      -- (5)<br />
                     )<br />
    where<br />
    editRef en an                                  -- (1)<br />
        = processAttrl ( changeAttrValue (absHRef base)<br />
                         `when`<br />
                         hasName an<br />
                       )<br />
          `when`<br />
          ( isElem >>> hasName en )<br />
        where<br />
        absHRef :: String -> String -> String<br />
        absHRef base url<br />
            = fromMaybe url . expandURIString url $ base<br />
</haskell><br />
<br />
<hask>editRef</hask> is parameterized by the element and attribute names.<br />
The arrow applied to every element is extended to a sequence of<br />
<hask>editRef</hask>'s ((2)-(5)). Notice that the document is still traversed only once.<br />
To process all possible HTML elements,<br />
this sequence should be extended by further element-attribute pairs.<br />
<br />
This can further be simplified into<br />
<br />
<haskell><br />
mkAbsRefs :: ArrowXml a => String -> a XmlTree XmlTree<br />
mkAbsRefs base<br />
    = processTopDown editRefs<br />
    where<br />
    editRefs<br />
        = foldl (>>>) this<br />
          .<br />
          map (\ (en, an) -> editRef en an)<br />
          $<br />
          [ ("a", "href")<br />
          , ("img", "src")<br />
          , ("link", "href")<br />
          , ("script", "src")      -- and more<br />
          ]<br />
    editRef<br />
        = ...<br />
</haskell><br />
<br />
The <hask>foldl (>>>) this</hask> is defined in HXT as <hask>seqA</hask>,<br />
so the above code can be simplified to<br />
<br />
<haskell><br />
mkAbsRefs :: ArrowXml a => String -> a XmlTree XmlTree<br />
mkAbsRefs base<br />
    = processTopDown editRefs<br />
    where<br />
    editRefs<br />
        = seqA . map (uncurry editRef)<br />
          $<br />
          ...<br />
</haskell><br />
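<br />
The same folding idea works for any list of edit steps. The sketch below uses ordinary functions as arrows and only base; <hask>seqModel</hask>, <hask>replaceChar</hask> and <hask>sanitize</hask> are hypothetical names for this illustration, with <hask>seqModel</hask> playing the role that <hask>seqA</hask> plays above and <hask>id</hask> standing in for <hask>this</hask>.<br />
<br />
<haskell><br />
import Control.Arrow ((>>>))<br />
<br />
-- Stand-in for seqA on plain functions: run all edits from left to right.<br />
seqModel :: [a -> a] -> a -> a<br />
seqModel = foldl (>>>) id<br />
<br />
-- A parameterized edit step, analogous to editRef taking two names.<br />
replaceChar :: Char -> Char -> String -> String<br />
replaceChar from to = map (\ c -> if c == from then to else c)<br />
<br />
sanitize :: String -> String<br />
sanitize<br />
    = seqModel . map (uncurry replaceChar)<br />
      $<br />
      [ (' ', '_')<br />
      , ('/', '-')<br />
      ]<br />
<br />
main :: IO ()<br />
main = putStrLn (sanitize "a b/c")  -- a_b-c<br />
</haskell><br />
<br />
Adding another pair to the list extends the processing without touching the traversal code.<br />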
<br />
== More complex examples ==<br />
<br />
=== Serialization and deserialisation to/from XML ===<br />
<br />
Examples can be found in [[HXT/Conversion of Haskell data from/to XML]]<br />
<br />
=== Practical examples of HXT ===<br />
<br />
More complex and complete examples of HXT in action<br />
can be found in [[HXT/Practical]]<br />
<br />
=== The Complete Guide To Working With HTML ===<br />
<br />
Tutorial and Walkthrough: http://adit.io/posts/2012-04-14-working_with_HTML_in_haskell.html</div>Dibblegohttps://wiki.haskell.org/index.php?title=List_of_partial_functions&diff=54315List of partial functions2012-10-12T11:26:48Z<p>Dibblego: /* List functions */</p>
<hr />
<div>==Partial functions in Prelude==<br />
<br />
===List functions===<br />
<br />
* maximum<br />
* minimum<br />
* head<br />
* tail<br />
* init<br />
* last<br />
* foldl<br />
* foldl'<br />
* foldl1<br />
* foldl1'<br />
* foldr1<br />
* scanl1<br />
* scanr1<br />
* cycle<br />
* !!<br />
* genericIndex<br />
* genericLength<br />
* length<br />
* sum<br />
* product<br />
* reverse<br />
* ... (todo)<br />
<br />
===Maybe functions===<br />
<br />
* fromJust<br />
<br />
===Other===<br />
<br />
* read<br />
* quot<br />
* rem<br />
* quotRem<br />
* div<br />
* mod<br />
* divMod<br />
* succ<br />
* pred<br />
* toEnum<br />
* (^)<br />
* fail<br />
* ... (todo)<br />
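<br />
For a few of the functions listed above, a total counterpart can be obtained by returning a <hask>Maybe</hask> value. The following is only a sketch: <hask>safeHead</hask> and <hask>safeDiv</hask> are hypothetical helper names, not Prelude functions, while <hask>readMaybe</hask> is taken from <hask>Text.Read</hask> in base.<br />
<br />
<haskell><br />
import Text.Read (readMaybe)<br />
<br />
-- Hypothetical total variant of head.<br />
safeHead :: [a] -> Maybe a<br />
safeHead []      = Nothing<br />
safeHead (x : _) = Just x<br />
<br />
-- Hypothetical variant of div that rejects a zero divisor.<br />
safeDiv :: Int -> Int -> Maybe Int<br />
safeDiv _ 0 = Nothing<br />
safeDiv x y = Just (x `div` y)<br />
<br />
main :: IO ()<br />
main = do<br />
  print (safeHead ([] :: [Int]))        -- Nothing<br />
  print (safeHead [1, 2, 3 :: Int])     -- Just 1<br />
  print (safeDiv 7 0)                   -- Nothing<br />
  print (safeDiv 7 2)                   -- Just 3<br />
  print (readMaybe "42" :: Maybe Int)   -- Just 42<br />
  print (readMaybe "x" :: Maybe Int)    -- Nothing<br />
</haskell><br />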
<br />
==Partial functions in other base libraries==<br />
<br />
===Data.Map===<br />
<br />
* (!)<br />
<br />
==Partial functions in other Haskell Platform packages==<br />
<br />
... (todo)</div>Dibblegohttps://wiki.haskell.org/index.php?title=List_of_partial_functions&diff=54310List of partial functions2012-10-12T02:39:28Z<p>Dibblego: /* Other */</p>
<hr />
<div>==Partial functions in Prelude==<br />
<br />
===List functions===<br />
<br />
* maximum<br />
* minimum<br />
* head<br />
* tail<br />
* init<br />
* last<br />
* foldl<br />
* foldl'<br />
* foldl1<br />
* foldl1'<br />
* foldr1<br />
* scanl1<br />
* scanr1<br />
* cycle<br />
* !!<br />
* genericIndex<br />
* genericLength<br />
* length<br />
* sum<br />
* reverse<br />
* ... (todo)<br />
<br />
===Maybe functions===<br />
<br />
* fromJust<br />
<br />
===Other===<br />
<br />
* read<br />
* quot<br />
* rem<br />
* quotRem<br />
* div<br />
* mod<br />
* divMod<br />
* succ<br />
* pred<br />
* toEnum<br />
* (^)<br />
* fail<br />
* ... (todo)<br />
<br />
==Partial functions in other base libraries==<br />
<br />
... (todo)<br />
<br />
==Partial functions in other Haskell Platform packages==<br />
<br />
... (todo)</div>Dibblegohttps://wiki.haskell.org/index.php?title=List_of_partial_functions&diff=54309List of partial functions2012-10-12T02:38:47Z<p>Dibblego: /* Other */</p>
<hr />
<div>==Partial functions in Prelude==<br />
<br />
===List functions===<br />
<br />
* maximum<br />
* minimum<br />
* head<br />
* tail<br />
* init<br />
* last<br />
* foldl<br />
* foldl'<br />
* foldl1<br />
* foldl1'<br />
* foldr1<br />
* scanl1<br />
* scanr1<br />
* cycle<br />
* !!<br />
* genericIndex<br />
* genericLength<br />
* length<br />
* sum<br />
* reverse<br />
* ... (todo)<br />
<br />
===Maybe functions===<br />
<br />
* fromJust<br />
<br />
===Other===<br />
<br />
* read<br />
* quot<br />
* rem<br />
* quotRem<br />
* div<br />
* mod<br />
* divMod<br />
* succ<br />
* pred<br />
* toEnum<br />
* (^)<br />
* ... (todo)<br />
<br />
==Partial functions in other base libraries==<br />
<br />
... (todo)<br />
<br />
==Partial functions in other Haskell Platform packages==<br />
<br />
... (todo)</div>Dibblegohttps://wiki.haskell.org/index.php?title=List_of_partial_functions&diff=54308List of partial functions2012-10-12T02:36:19Z<p>Dibblego: /* Other */</p>
<hr />
<div>==Partial functions in Prelude==<br />
<br />
===List functions===<br />
<br />
* maximum<br />
* minimum<br />
* head<br />
* tail<br />
* init<br />
* last<br />
* foldl<br />
* foldl'<br />
* foldl1<br />
* foldl1'<br />
* foldr1<br />
* scanl1<br />
* scanr1<br />
* cycle<br />
* !!<br />
* genericIndex<br />
* genericLength<br />
* length<br />
* sum<br />
* reverse<br />
* ... (todo)<br />
<br />
===Maybe functions===<br />
<br />
* fromJust<br />
<br />
===Other===<br />
<br />
* read<br />
* quot<br />
* rem<br />
* quotRem<br />
* div<br />
* mod<br />
* divMod<br />
* succ<br />
* pred<br />
* toEnum<br />
* ... (todo)<br />
<br />
==Partial functions in other base libraries==<br />
<br />
... (todo)<br />
<br />
==Partial functions in other Haskell Platform packages==<br />
<br />
... (todo)</div>Dibblegohttps://wiki.haskell.org/index.php?title=List_of_partial_functions&diff=54307List of partial functions2012-10-12T02:35:49Z<p>Dibblego: /* Other */</p>
<hr />
<div>==Partial functions in Prelude==<br />
<br />
===List functions===<br />
<br />
* maximum<br />
* minimum<br />
* head<br />
* tail<br />
* init<br />
* last<br />
* foldl<br />
* foldl'<br />
* foldl1<br />
* foldl1'<br />
* foldr1<br />
* scanl1<br />
* scanr1<br />
* cycle<br />
* !!<br />
* genericIndex<br />
* genericLength<br />
* length<br />
* sum<br />
* reverse<br />
* ... (todo)<br />
<br />
===Maybe functions===<br />
<br />
* fromJust<br />
<br />
===Other===<br />
<br />
* read<br />
* quot<br />
* rem<br />
* quotRem<br />
* div<br />
* mod<br />
* divMod<br />
* succ<br />
* pred<br />
* ... (todo)<br />
<br />
==Partial functions in other base libraries==<br />
<br />
... (todo)<br />
<br />
==Partial functions in other Haskell Platform packages==<br />
<br />
... (todo)</div>Dibblegohttps://wiki.haskell.org/index.php?title=List_of_partial_functions&diff=54305List of partial functions2012-10-12T02:33:56Z<p>Dibblego: /* List functions */</p>
<hr />
<div>==Partial functions in Prelude==<br />
<br />
===List functions===<br />
<br />
* maximum<br />
* minimum<br />
* head<br />
* tail<br />
* init<br />
* last<br />
* foldl<br />
* foldl'<br />
* foldl1<br />
* foldl1'<br />
* foldr1<br />
* scanl1<br />
* scanr1<br />
* cycle<br />
* !!<br />
* genericIndex<br />
* genericLength<br />
* length<br />
* sum<br />
* reverse<br />
* ... (todo)<br />
<br />
===Maybe functions===<br />
<br />
* fromJust<br />
<br />
===Other===<br />
<br />
* read<br />
* quot<br />
* rem<br />
* quotRem<br />
* div<br />
* mod<br />
* divMod<br />
* ... (todo)<br />
<br />
==Partial functions in other base libraries==<br />
<br />
... (todo)<br />
<br />
==Partial functions in other Haskell Platform packages==<br />
<br />
... (todo)</div>Dibblegohttps://wiki.haskell.org/index.php?title=List_of_partial_functions&diff=54304List of partial functions2012-10-12T02:29:32Z<p>Dibblego: </p>
<hr />
<div>==Partial functions in Prelude==<br />
<br />
===List functions===<br />
<br />
* maximum<br />
* minimum<br />
* head<br />
* tail<br />
* init<br />
* last<br />
* foldl<br />
* foldl'<br />
* foldl1<br />
* foldl1'<br />
* foldr1<br />
* cycle<br />
* !!<br />
* genericIndex<br />
* length<br />
* sum<br />
* reverse<br />
* ... (todo)<br />
<br />
===Maybe functions===<br />
<br />
* fromJust<br />
<br />
===Other===<br />
<br />
* read<br />
* quot<br />
* rem<br />
* quotRem<br />
* div<br />
* mod<br />
* divMod<br />
* ... (todo)<br />
<br />
==Partial functions in other base libraries==<br />
<br />
... (todo)<br />
<br />
==Partial functions in other Haskell Platform packages==<br />
<br />
... (todo)</div>Dibblegohttps://wiki.haskell.org/index.php?title=List_of_partial_functions&diff=54303List of partial functions2012-10-12T02:29:13Z<p>Dibblego: </p>
<hr />
<div>==Partial functions in Prelude==<br />
<br />
===List functions===<br />
<br />
* maximum<br />
* minimum<br />
* head<br />
* tail<br />
* init<br />
* last<br />
* foldl<br />
* foldl'<br />
* foldl1<br />
* foldl1'<br />
* foldr1<br />
* cycle<br />
* !!<br />
* genericIndex<br />
* length<br />
* sum<br />
* reverse<br />
* ... (todo)<br />
<br />
===Maybe functions===<br />
<br />
* fromJust<br />
<br />
===Other===<br />
<br />
* read<br />
* quot<br />
* rem<br />
* quotRem<br />
* div<br />
* mod<br />
* divMod<br />
* ... (todo)<br />
<br />
==Partial functions in other base libraries==<br />
<br />
... (todo)<br />
<br />
===Data.Maybe===<br />
<br />
* fromJust<br />
<br />
==Partial functions in other Haskell Platform packages==<br />
<br />
... (todo)</div>Dibblegohttps://wiki.haskell.org/index.php?title=List_of_partial_functions&diff=54302List of partial functions2012-10-12T02:28:50Z<p>Dibblego: </p>
<hr />
<div>==Partial functions in Prelude==<br />
<br />
===List functions===<br />
<br />
* maximum<br />
* minimum<br />
* head<br />
* tail<br />
* init<br />
* last<br />
* foldl<br />
* foldl'<br />
* foldl1<br />
* foldl1'<br />
* foldr1<br />
* cycle<br />
* !!<br />
* genericIndex<br />
* length<br />
* sum<br />
* ... (todo)<br />
<br />
===Maybe functions===<br />
<br />
* fromJust<br />
<br />
===Other===<br />
<br />
* read<br />
* quot<br />
* rem<br />
* quotRem<br />
* div<br />
* mod<br />
* divMod<br />
* ... (todo)<br />
<br />
==Partial functions in other base libraries==<br />
<br />
... (todo)<br />
<br />
===Data.Maybe===<br />
<br />
* fromJust<br />
<br />
==Partial functions in other Haskell Platform packages==<br />
<br />
... (todo)</div>Dibblegohttps://wiki.haskell.org/index.php?title=List_of_partial_functions&diff=54301List of partial functions2012-10-12T02:27:32Z<p>Dibblego: </p>
<hr />
<div>==Partial functions in Prelude==<br />
<br />
===List functions===<br />
<br />
* maximum<br />
* minimum<br />
* head<br />
* tail<br />
* init<br />
* last<br />
* foldl1<br />
* foldl1'<br />
* foldl1<br />
* foldr1<br />
* cycle<br />
* !!<br />
* genericIndex<br />
* foldl<br />
* foldl'<br />
* foldl1<br />
* foldr1<br />
* length<br />
* sum<br />
* ... (todo)<br />
<br />
===Maybe functions===<br />
<br />
* fromJust<br />
<br />
===Other===<br />
<br />
* read<br />
* quot<br />
* rem<br />
* quotRem<br />
* div<br />
* mod<br />
* divMod<br />
* ... (todo)<br />
<br />
==Partial functions in other base libraries==<br />
<br />
... (todo)<br />
<br />
===Data.Maybe===<br />
<br />
* fromJust<br />
<br />
==Partial functions in other Haskell Platform packages==<br />
<br />
... (todo)</div>Dibblegohttps://wiki.haskell.org/index.php?title=List_of_partial_functions&diff=54300List of partial functions2012-10-12T02:25:05Z<p>Dibblego: /* List functions */</p>
<hr />
<div>==Partial functions in Prelude==<br />
<br />
===List functions===<br />
<br />
* maximum<br />
* minimum<br />
* head<br />
* tail<br />
* init<br />
* last<br />
* foldl1<br />
* foldl1'<br />
* foldr1<br />
* cycle<br />
* !!<br />
* genericIndex<br />
* foldl<br />
* foldl'<br />
* foldl1<br />
* foldr1<br />
* length<br />
* sum<br />
* ... (todo)<br />
<br />
===Other===<br />
<br />
* read<br />
* quot<br />
* rem<br />
* quotRem<br />
* div<br />
* mod<br />
* divMod<br />
* ... (todo)<br />
<br />
==Partial functions in other base libraries==<br />
<br />
... (todo)<br />
<br />
===Data.Maybe===<br />
<br />
* fromJust<br />
<br />
==Partial functions in other Haskell Platform packages==<br />
<br />
... (todo)</div>Dibblegohttps://wiki.haskell.org/index.php?title=Monad/ST&diff=47050Monad/ST2012-07-23T09:13:58Z<p>Dibblego: Add reference to SPJ paper and the base library</p>
<hr />
<div>{{Standard class|ST|module=Control.Monad.ST|module-doc=Control-Monad-ST|package=base}}<br />
<br />
The ST monad provides support for ''strict'' state threads.<br />
<br />
<br />
==A discussion on the Haskell IRC channel ==<br />
From #haskell (see 13:05:37 in the [http://tunes.org/~nef/logs/haskell/07.02.07 log] ):<br />
<br />
* TuringTest: ST lets you implement algorithms that are much more efficient with mutable memory used internally. But the whole "thread" of computation cannot exchange mutable state with the outside world, it can only exchange immutable state.<br />
<br />
* TuringTest: chessguy: You pass in normal Haskell values and then use ST to allocate mutable memory, then you initialize and play with it, then you put it away and return a normal Haskell value.<br />
<br />
* sjanssen: a monad that has mutable references and arrays, but has a "run" function that is referentially transparent<br />
<br />
* DapperDan2: it strikes me that ST is like a lexical scope, where all the variables/state disappear when the function returns.<br />
[[Category:Standard classes]] [[Category:Monad]]<br />
<br />
<br />
==An explanation in Haskell-Cafe==<br />
<br />
The ST monad lets you use update-in-place, but is escapable (unlike IO). <br />
ST actions have the form:<br />
<br />
<haskell><br />
ST s α<br />
</haskell><br />
<br />
Meaning that they return a value of type α, and execute in "thread" s.<br />
All reference types are tagged with the thread, so that actions can only<br />
affect references in their own "thread".<br />
<br />
Now, the type of the function used to escape ST is:<br />
<br />
<haskell><br />
runST :: forall α. (forall s. ST s α) -> α<br />
</haskell><br />
<br />
The action you pass must be universal in s, so inside your action you cannot know which thread you are in; thus you cannot access references belonging to any other thread, and thus <hask>runST</hask> is pure. This is very useful, since it allows you to implement externally pure things like in-place quicksort, and present them as pure functions ∀ e. Ord e ⇒ Array e → Array e, without using any unsafe functions.<br />
<br />
But that type of <hask>runST</hask> is illegal in Haskell 98, because it needs a universal quantifier *inside* the function-arrow! In the jargon, that type has rank 2; Haskell 98 types may have rank at most 1.<br />
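
To make the rank-2 point concrete, here is a small sketch that assumes GHC with the <tt>RankNTypes</tt> extension. The names <hask>runTwice</hask> and <hask>counter</hask> are invented for this illustration. A function that takes an ST action universal in s as an argument needs the extension, and an attempt to let a reference escape is rejected by the type checker.

```haskell
{-# LANGUAGE RankNTypes #-}
import Control.Monad.ST (ST, runST)
import Data.STRef (newSTRef, modifySTRef', readSTRef)

-- Hypothetical helper with a rank-2 type: the argument must work for
-- every thread s, so it can safely be run more than once.
runTwice :: (forall s. ST s a) -> (a, a)
runTwice action = (runST action, runST action)

-- Counts to n by mutating a reference that never leaves the action.
counter :: Int -> ST s Int
counter n = do
    ref <- newSTRef 0
    mapM_ (\_ -> modifySTRef' ref (+ 1)) [1 .. n]
    readSTRef ref

-- The following would NOT type-check, because the reference's type
-- mentions s, which is not allowed to escape from runST:
--
-- leak = runST (newSTRef (0 :: Int))
```

Here <hask>runTwice (counter 3)</hask> evaluates to <hask>(3, 3)</hask>: each run allocates its own reference, so the two runs cannot interfere.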
<br />
See http://www.haskell.org/pipermail/haskell-cafe/2007-July/028233.html<br />
<br />
Could we *please* see an example.<br />
<br />
Sure thing...<br />
<br />
== A few simple examples ==<br />
<br />
In this example, we define a version of the function sum, but do it in a way which is more like how it would be done in imperative languages, where a variable is updated, rather than a new value being formed and passed to the next iteration of the function. While in-place modifications of the STRef n are occurring, something that would usually be considered a side effect, it is all done in a safe way which is deterministic. The result is that we get the benefits of being able to modify memory in place, while still producing a pure function with the use of runST.<br />
<br />
<haskell><br />
import Control.Monad.ST<br />
import Data.STRef<br />
import Control.Monad<br />
<br />
<br />
sumST :: Num a => [a] -> a<br />
sumST xs = runST $ do -- runST takes out stateful code and makes it pure again.<br />
<br />
    n <- newSTRef 0             -- Create an STRef (place in memory to store values)<br />
<br />
    forM_ xs $ \x -> do         -- For each element of xs ..<br />
        modifySTRef n (+x)      -- add it to what we have in n.<br />
<br />
    readSTRef n                 -- read the value of n, and return it.<br />
<br />
<br />
</haskell><br />
<br />
An implementation of foldl using the ST monad (a lot like sum, and in fact sum can be defined in terms of foldlST):<br />
<br />
<haskell><br />
foldlST :: (a -> b -> a) -> a -> [b] -> a<br />
foldlST f acc xs = runST $ do<br />
    acc' <- newSTRef acc            -- Create a variable for the accumulator<br />
<br />
    forM_ xs $ \x -> do             -- For each x in xs...<br />
<br />
        a <- readSTRef acc'         -- read the accumulator<br />
        writeSTRef acc' (f a x)     -- apply f to the accumulator and x<br />
<br />
    readSTRef acc'                  -- and finally read the result<br />
</haskell><br />
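
As a sketch of the claim that sum can be defined in terms of foldlST, the block below repeats the definition so that it stands alone and then specialises it. The name <hask>sumViaFoldlST</hask> is made up for this example.

```haskell
import Control.Monad (forM_)
import Control.Monad.ST (runST)
import Data.STRef (newSTRef, readSTRef, writeSTRef)

-- Same definition as above, repeated so the block compiles on its own.
foldlST :: (a -> b -> a) -> a -> [b] -> a
foldlST f acc xs = runST $ do
    ref <- newSTRef acc             -- mutable accumulator
    forM_ xs $ \x -> do
        a <- readSTRef ref
        writeSTRef ref (f a x)      -- combine accumulator with x
    readSTRef ref

-- sum is a left fold with (+), starting from 0.
sumViaFoldlST :: Num a => [a] -> a
sumViaFoldlST = foldlST (+) 0
```

For example, <hask>sumViaFoldlST [1 .. 10]</hask> gives 55 and <hask>sumViaFoldlST []</hask> gives 0.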
<br />
An example of the Fibonacci function running in constant¹ space:<br />
<br />
<haskell><br />
fibST :: Integer -> Integer<br />
fibST n = <br />
    if n < 2<br />
    then n<br />
    else runST $ do<br />
        x <- newSTRef 0<br />
        y <- newSTRef 1<br />
        fibST' n x y<br />
<br />
    where fibST' 0 x _ = readSTRef x<br />
          fibST' n x y = do<br />
              x' <- readSTRef x<br />
              y' <- readSTRef y<br />
              writeSTRef x y'<br />
              writeSTRef y $! x'+y'<br />
              fibST' (n-1) x y<br />
</haskell><br />
<br />
[1] Since we're using Integers, technically it's not constant space, as they grow in size when they get bigger, but we can ignore this.<br />
<br />
== References ==<br />
* [http://research.microsoft.com/en-us/um/people/simonpj/papers/lazy-functional-state-threads.ps.Z Lazy Functional State Threads, John Launchbury and Simon Peyton Jones]<br />
* [http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Monad-ST.html Control.Monad.ST in the base libraries]</div>Dibblegohttps://wiki.haskell.org/index.php?title=GHC/GHCi&diff=37522GHC/GHCi2010-11-22T01:12:02Z<p>Dibblego: Typo</p>
<hr />
<div>[[Category:GHC|GHCi]]<br />
== Introduction ==<br />
<br />
GHCi is GHC's interactive environment, in which Haskell expressions can be interactively evaluated and programs can be interpreted. The [http://www.haskell.org/ghc/docs/latest/html/users_guide/index.html GHC User's Guide] contains [http://www.haskell.org/ghc/docs/latest/html/users_guide/ghci-debugger.html more fundamental and detailed information about GHCi.]<br />
<br />
This page is a place to collect advice about how to use GHCi beyond what the User's Guide covers. Please add to it! <br />
<br />
== Advanced customization ==<br />
<br />
=== Using <tt>.ghci</tt>, a mini-tutorial ===<br />
<br />
There is a lot more one can do to customize and extend GHCi. Some extended examples can be found in an email posted to <tt>haskell-cafe</tt>, titled<br />
[http://www.haskell.org/pipermail/haskell-cafe/2007-September/032260.html getting more out of ghci]. The message dates from September 2007 and uses GHC 6.6.1, and some of the GHCi tickets mentioned in it have since been fixed, but it should still serve as a useful introduction to writing your own <tt>.ghci</tt> files. It also provides several useful commands you might want to copy into your own file!-) Newer GHCis support the multiline commands mentioned in the message, allowing for more readable <tt>.ghci</tt> files (at the time, definitions had to be squashed into single lines, so you have to read the message to understand the <tt>.ghci</tt> file). For those still using older GHCis, a variant file for 6.4.1 is available, too:<br />
<br />
* [http://www.haskell.org/pipermail/haskell-cafe/2007-September/032260.html "getting more out of ghci", the mini-tutorial]<br />
* [http://www.cs.kent.ac.uk/people/staff/cr3/toolbox/haskell/dot-squashed.ghci squashed .ghci, for 6.6.1 or later]<br />
* [http://www.cs.kent.ac.uk/people/staff/cr3/toolbox/haskell/dot-squashed.ghci641 squashed .ghci, for 6.4.1]<br />
<br />
(See the user guide for [http://www.haskell.org/ghc/docs/6.12.2/html/users_guide/ghci-dot-files.html where to locate the <tt>.ghci</tt> file]).<br />
<br />
=== Customized GHCi interactive environments ===<br />
<br />
You can create shell commands that start up GHCi and<br />
initialize it for use as a specialized interactive<br />
computing environment for any purpose that you can<br />
imagine.<br />
<br />
The idea is that if you put the following lines in your<br />
<tt>.ghci</tt> file, GHCi will load commands at startup<br />
from whatever file whose path you specify in<br />
the <tt>GHCIRC</tt> environment variable. You can then easily write<br />
shell scripts that exploit this to initialize GHCi in<br />
any manner you please.<br />
<br />
<haskell><br />
-- Read GHCI commands from the file whose name is<br />
-- in the GHCIRC environment variable<br />
:def _load const(System.Environment.getEnvironment>>=maybe(return"")readFile.lookup"GHCIRC")<br />
:_load<br />
:undef _load<br />
</haskell><br />
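
The squashed <tt>:def _load</tt> line above is easier to follow when written out as ordinary Haskell. The sketch below is not the original definition: it uses <hask>lookupEnv</hask> (available in newer versions of base) instead of <hask>getEnvironment</hask> with <hask>lookup</hask>, and the names <hask>commandsFrom</hask> and <hask>loadGhcirc</hask> are invented here.

```haskell
import System.Environment (lookupEnv)

-- Given an optional file path, return the GHCi commands to run:
-- nothing at all if no path was supplied, else the file's contents.
commandsFrom :: Maybe FilePath -> IO String
commandsFrom = maybe (return "") readFile

-- Shape expected by :def, which takes a function String -> IO String.
-- The argument typed after the command name is ignored, like const above.
loadGhcirc :: String -> IO String
loadGhcirc _ = lookupEnv "GHCIRC" >>= commandsFrom
```

If <tt>GHCIRC</tt> is unset, the command expands to the empty string, so GHCi starts up as usual.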
<br />
=== External tool integration ===<br />
==== Hoogle ====<br />
External command-line tools like [[Hoogle]] can be integrated in GHCi by adding a line to .ghci similar to<br />
<haskell><br />
:def hoogle \str -> return $ ":! hoogle --count=15 \"" ++ str ++ "\""<br />
</haskell><br />
<br />
Make sure that the directory containing the executable is in your PATH environment variable or modify the line to point directly to the executable. Invoke the executable with commands like<br />
<haskell><br />
:hoogle map<br />
</haskell><br />
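
A <tt>:def</tt> macro is just a function of type <hask>String -> IO String</hask> whose result is fed back to GHCi as a command. The sketch below separates the pure string construction from the IO wrapper; <hask>hoogleCommand</hask> and <hask>hoogleMacro</hask> are illustrative names, and using <hask>show</hask> for quoting is an assumption that is adequate for simple queries only.

```haskell
-- Build the GHCi shell-escape line for a Hoogle query.
-- show adds the surrounding double quotes and escapes embedded ones.
hoogleCommand :: String -> String
hoogleCommand query = ":! hoogle --count=15 " ++ show query

-- The function one would hand to :def.
hoogleMacro :: String -> IO String
hoogleMacro = return . hoogleCommand
```

With this in scope, <hask>hoogleCommand "map"</hask> produces the same text as the one-line definition above.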
<br />
==== Hlint ====<br />
Hlint can be similarly integrated. It is much more complex, however, as one must acquire the filename to run hlint on:<br />
<br />
<haskell><br />
-- <http://www.cs.kent.ac.uk/people/staff/cr3/toolbox/haskell/dot-squashed.ghci641><br />
let { redir varcmd = case break Data.Char.isSpace varcmd of { (var,_:cmd) -> return $ unlines [":set -fno-print-bind-result","tmp <- System.Directory.getTemporaryDirectory","(f,h) <- System.IO.openTempFile tmp \"ghci\"","sto <- GHC.Handle.hDuplicate System.IO.stdout","GHC.Handle.hDuplicateTo h System.IO.stdout","System.IO.hClose h",cmd,"GHC.Handle.hDuplicateTo sto System.IO.stdout","let readFileNow f = readFile f >>= \\t->length t `seq` return t",var++" <- readFileNow f","System.Directory.removeFile f"]; _ -> return "putStrLn \"usage: :redir <var> <cmd>\"" } }<br />
:def redir redir<br />
-- End copied material<br />
<br />
-- Integration with the hlint code style tool<br />
let hlint _ = return $ unlines [":set -w", ":redir hlintvar1 :show modules", ":cmd return (\":! hlint \" ++ (concat $ Data.List.intersperse \" \" (map (fst . break (==',') . drop 2 . snd . break (== '(')) $ lines hlintvar1)))", ":set -Wall"]<br />
:def hlint hlint<br />
</haskell><br />
<br />
(There may be a more up-to-date version in the hlint darcs repo.)<br />
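
The dense part of the <tt>hlint</tt> macro is the expression that pulls file names out of the captured <tt>:show modules</tt> output. Written as standalone functions it might look as follows. This is a sketch with invented names, and it relies on the same assumption as the original, namely that each output line has the form <tt>Module ( path, interpreted )</tt>.

```haskell
-- Extract the file path from one line of ":show modules" output:
-- skip to the opening parenthesis, drop "( ", stop at the comma.
modulePath :: String -> FilePath
modulePath = takeWhile (/= ',') . drop 2 . dropWhile (/= '(')

-- One path per line of output.
modulePaths :: String -> [FilePath]
modulePaths = map modulePath . lines

-- The shell escape that runs hlint on every loaded module.
hlintCommand :: String -> String
hlintCommand shown = ":! hlint " ++ unwords (modulePaths shown)
```
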
<br />
=== Package and documentation lookup ===<br />
<br />
Ever tried to find the users guide for the version of GHCi you are currently running? Or information about the packages installed for it? The new <tt>[http://hackage.haskell.org/cgi-bin/hackage-scripts/package/ghc-paths ghc-paths]</tt> package makes such tasks easier by exporting a <tt>GHC.Paths</tt> module:<br />
<haskell><br />
Prelude> :browse GHC.Paths<br />
docdir :: FilePath<br />
ghc :: FilePath<br />
ghc_pkg :: FilePath<br />
libdir :: FilePath<br />
</haskell><br />
We can define some auxiliary commands to make this more comfortable:<br />
<haskell><br />
:ghc_pkg cmds -- run ghc-pkg commands<br />
:browser url -- start browser with url<br />
:doc [relative] -- open docs, with optional relative path<br />
:users_guide [relative] -- open users guide, with optional relative path<br />
</haskell><br />
So, <haskell>:ghc_pkg list</haskell> will list the packages for the current GHCi instance, <haskell>:ghc_pkg find-module Text.Regex</haskell> will tell us what package that module is in, etc. <haskell>:doc</haskell> will open a browser window on the documentation for this GHCi version, <haskell>:doc /Cabal/index.html</haskell> takes us to the Cabal docs, and <haskell>:users_guide /flag-reference.html</haskell> takes us to the flag reference, all matching the version of GHCi we're in, provided that the docs and <tt>ghc-paths</tt> are installed.<br />
<br />
Here are the definitions - adapt to your preferences (note that the construction of the documentation path from <tt>libdir</tt> and <tt>docdir</tt> is slightly dodgy):<br />
<haskell><br />
:def ghc_pkg (\l->return $ ":!"++GHC.Paths.ghc_pkg++" "++l)<br />
<br />
:def browser (\l->return $ ":!c:/Progra~1/Opera/Opera.exe "++l)<br />
<br />
let doc p = return $ ":browser "++GHC.Paths.libdir++dropWhile (/='/')GHC.Paths.docdir++relative where { relative | p=="" = "/index.html" | otherwise = p }<br />
:def doc doc<br />
<br />
let users_guide p = doc ("/users_guide"++if null p then "/index.html" else p)<br />
:def users_guide users_guide<br />
</haskell><br />
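
The path construction inside <tt>doc</tt> can be read more easily as a pure function. In this sketch <hask>docPath</hask> is a made-up name, and its first two parameters stand in for <hask>GHC.Paths.libdir</hask> and <hask>GHC.Paths.docdir</hask>, so it can be tried without <tt>ghc-paths</tt> installed.

```haskell
-- Join libdir with the part of docdir from its first '/' onwards,
-- then append the requested page (default: /index.html).
docPath :: FilePath -> FilePath -> String -> FilePath
docPath libdir docdir p = libdir ++ dropWhile (/= '/') docdir ++ relative
  where
    relative
      | null p    = "/index.html"
      | otherwise = p
```

Everything in <tt>docdir</tt> before its first slash is discarded, which is the "slightly dodgy" step mentioned above: it only gives a sensible result when the remainder of <tt>docdir</tt> is meaningful relative to <tt>libdir</tt>.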
<br />
=== GHCi on Acid ===<br />
<br />
GHCi on Acid is an extension to GHCi (Interactive GHC) for adding useful lambdabot features. It does pretty much anything lambdabot does, just nicely embedded inside your GHCi.<br />
<br />
==== Features ====<br />
<br />
Here are some examples of the commands that can be used.<br />
<br />
The :instances command shows all the instances of a class:<br />
<br />
GOA> :instances Monad<br />
((->) r), ArrowMonad a, Cont r, ContT r m, Either e, ErrorT e m, IO, Maybe, RWS r w s, RWST r w s m, Reader r, ReaderT r m, ST s, State s, StateT s m, Writer w, WriterT w m, []<br />
GOA> :instances Arrow<br />
(->), Kleisli m<br />
GOA> :instances Num<br />
Double, Float, Int, Integer<br />
<br />
Here we have the :hoogle command, for querying the Hoogle database. Great for looking for functions of a specific type:<br />
<br />
GOA> :hoogle Arrow<br />
Control.Arrow :: module<br />
Control.Arrow.Arrow :: class Arrow a<br />
Control.Arrow.ArrowZero :: class Arrow a => ArrowZero a<br />
GOA> :hoogle b -> (a -> b) -> Maybe a -> b<br />
Prelude.maybe :: b -> (a -> b) -> Maybe a -> b<br />
Data.Maybe.maybe :: b -> (a -> b) -> Maybe a -> b<br />
<br />
The :source command gives a link to the source code of a module (sometimes you are curious):<br />
<br />
GOA> :source Data.Maybe<br />
http://darcs.haskell.org/packages/base/Data/Maybe.hs<br />
<br />
Similarly, :docs gives a link to the documentation of a module.<br />
<br />
GOA> :docs Data.Maybe<br />
http://haskell.org/ghc/docs/latest/html/libraries/base/Data-Maybe.html<br />
<br />
The :index command is a nice way to search modules.<br />
<br />
GOA> :index Monad<br />
Control.Monad, Prelude, Control.Monad.Reader, Control.Monad.Writer, Control.Monad.State, Control.Monad.RWS, Control.Monad.Identity, Control.Monad.Cont, Control.Monad.Error, Control.Monad.List<br />
<br />
Then we have :pl, which shows the pointless (or: point-free) way of writing a function, which is very useful for learning and sometimes for fun:<br />
<br />
GOA> :pl (\x -> x * x)<br />
join (*)<br />
GOA> :pl (\x y -> (x * 5) + (y * 5))<br />
(. (5 *)) . (+) . (5 *)<br />
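
The two point-free results above can be checked in plain Haskell. The sketch below gives them names (<hask>square</hask> and <hask>scaledSum</hask>, both invented here) and fixes the type to <hask>Int</hask> for simplicity. <hask>join</hask> on functions applies a two-argument function to the same argument twice.

```haskell
import Control.Monad (join)

-- join (*) is the point-free form of \x -> x * x.
square :: Int -> Int
square = join (*)

-- Point-free form of \x y -> (x * 5) + (y * 5), as printed by :pl.
scaledSum :: Int -> Int -> Int
scaledSum = (. (5 *)) . (+) . (5 *)
```

For instance, <hask>square 7</hask> is 49 and <hask>scaledSum 2 3</hask> is 25.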
<br />
==== How to install ====<br />
<br />
$ cabal install lambdabot<br />
$ cabal install goa<br />
<br />
Then edit your .ghci to look like the following:<br />
<br />
:m - Prelude<br />
:m + GOA<br />
setLambdabotHome "/home/chris/.cabal/bin"<br />
:def bs lambdabot "botsnack"<br />
:def pl lambdabot "pl"<br />
:def unpl lambdabot "unpl"<br />
:def redo lambdabot "redo"<br />
:def undo lambdabot "undo"<br />
:def index lambdabot "index"<br />
:def docs lambdabot "docs"<br />
:def instances lambdabot "instances"<br />
:def hoogle lambdabot "hoogle"<br />
:def source lambdabot "fptools"<br />
:def where lambdabot "where"<br />
:def version lambdabot "version"<br />
:def src lambdabot "src"<br />
<br />
And you should be able to do this:<br />
<br />
chris@chrisamilo:~$ ghci<br />
GHCi, version 6.10.4: http://www.haskell.org/ghc/ :? for help<br />
Loading package ghc-prim ... linking ... done.<br />
Loading package integer ... linking ... done.<br />
Loading package base ... linking ... done.<br />
Loading package syb ... linking ... done.<br />
Loading package base-3.0.3.1 ... linking ... done.<br />
Loading package old-locale-1.0.0.1 ... linking ... done.<br />
Loading package old-time-1.0.0.2 ... linking ... done.<br />
Loading package filepath-1.1.0.2 ... linking ... done.<br />
Loading package unix-2.3.2.0 ... linking ... done.<br />
Loading package directory-1.0.0.3 ... linking ... done.<br />
Loading package process-1.0.1.1 ... linking ... done.<br />
Loading package goa-3.0.2 ... linking ... done.<br />
GOA> :src foldr<br />
src foldr<br />
foldr f z [] = z<br />
foldr f z (x:xs) = f x (foldr f z xs)<br />
GOA> <br />
<br />
Tip: if you accidentally unload the GOA module, use :m + GOA to load it.<br />
<br />
== Frequently Asked Questions (FAQ) ==<br />
<br />
=== How do I stop GHCi from printing the result of a bind statement? ===<br />
<br />
Sometimes you want to perform an IO action at the prompt that will produce a lot of data (e.g. reading a large file). When you try to do this, GHCi will helpfully spew this data all over your terminal, making the console temporarily unavailable.<br />
<br />
To prevent this, use <tt>:set -fno-print-bind-result</tt>. If you want this option to be permanently set, add it to your <tt>.ghci</tt> file.<br />
<br />
== Additional command advice ==<br />
<br />
=== The :def command ===<br />
<br />
The [http://www.haskell.org/ghc/docs/latest/html/users_guide/ghci-commands.html#id2902944 :def command], documented here, allows GHCi's commands to be extended in quite a powerful way.<br />
<br />
Here is one example.<br />
<br />
Prelude> let loop = do { l <- getLine; if l == "\^D" then return () else do appendFile "foo.hs" (l++"\n"); loop }<br />
Prelude> :def pasteCode (\_ -> loop >> return ":load foo.hs")<br />
<br />
This defines a new command :pasteCode, which allows you to paste Haskell code directly into GHCi. You type the command :pasteCode, followed by the code you want, followed by ^D, followed (unfortunately) by enter, and your code is executed. Thus:<br />
<br />
Prelude> :pasteCode<br />
x = 42<br />
^D<br />
Compiling Main ( foo.hs, interpreted )<br />
Ok, modules loaded: Main.<br />
*Main> x<br />
42<br />
*Main><br />
<br />
== Compatibility/shell/platform integration ==<br />
<br />
=== Readline/editline ===<br />
<br />
There are some tricks to getting readline/editline to work as expected with GHCi.<br />
<br />
==== A readline-aware GHCi on Windows ====<br />
<br />
Mauricio reports: I've just uploaded a package (<tt>rlwrap</tt>) to Cygwin that I like to use with <tt>ghci</tt>. You can use it like this:<br />
<pre><br />
rlwrap ghcii.sh<br />
</pre><br />
and then you will use <tt>ghci</tt> as if it were readline aware (i.e., you can <br />
press up arrow to get last typed lines etc.). <tt>rlwrap</tt> is very stable <br />
and I never had unexpected results while using it.<br />
<br />
Since the issue of <tt>ghci</tt> integration with terminals has been raised <br />
here sometimes, I thought some guys here would be interested (actually, <br />
I found rlwrap looking for a better way to use ghci).<br />
<br />
==== rlwrap (for GHCi compiled without readline/editline) ====<br />
<br />
GHCi has support for session-specific command-line completion, but only if it was built with the readline or editline package, and some versions of GHCi aren't. In such cases, you can try [http://utopia.knoware.nl/~hlub/uck/rlwrap/ rlwrap (readline wrapper)] to attach readline "from the outside", which isn't as specific, but gives basic completion support. In particular, there's an rlwrap package for Cygwin.<br />
<br />
For starters,<br />
<br />
<code><br />
rlwrap -cr ghci<br />
</code><br />
<br />
gives you filename completion, and completion with respect to previous input/output in your GHCi session (so if a GHCi error message suggests to set <hask>AnnoyinglyLongVerySpecificOption</hask>, that will be available for completion;-).<br />
<br />
If you want to get more specific, you need to supply files with possible completions - flags and modules spring to mind, but where to get those?<br />
<br />
1. extracting a list of options from the flag-reference in the users guide:<br />
<br />
<pre><br />
cat /cygdrive/c/ghc/ghc-6.9.20080514/doc/users_guide/flag-reference.html \<br />
  | sed 's/</\n</g' \<br />
  | sed '/<code class="option">/!d;s/<code class="option">\(.*\)$/\1/' \<br />
  > options.txt<br />
</pre><br />
<br />
actually, we only want the dynamic or :setable options, and no duplicates, so:<br />
<br />
<pre><br />
cat /cygdrive/c/ghc/ghc-6.9.20080514/doc/users_guide/flag-reference.html \<br />
  | sed 's/<tr/\n<tr/g' \<br />
  | grep '<code class="option".*>\(dynamic\|:set\)<' \<br />
  | sed 's/^.*<code class="option">\([^<]*\)<.*$/\1/' \<br />
  | uniq \<br />
  > options.txt<br />
</pre><br />
<br />
2. extracting a list of modules from <tt>ghc-pkg</tt>:<br />
<br />
<code><br />
ghc-pkg field '*' exposed-modules | sed 's/exposed-modules: //; s/^\s\+//g' >modules.txt<br />
</code><br />
<br />
And now,<br />
<br />
<code><br />
rlwrap -cr -f modules.txt -f options.txt ghcii.sh<br />
</code><br />
<br />
will give you completion wrt filenames, options, module names, and previous session contents, as well as the usual readline goodies, like history search and editing. The main drawback is that the completion is neither session nor context-specific, so it will suggest filenames where module names are expected, it will suggest module names that may not be exposed, etc.</div>Dibblegohttps://wiki.haskell.org/index.php?title=Leksah_FAQ&diff=36356Leksah FAQ2010-07-27T10:53:23Z<p>Dibblego: </p>
<hr />
<div>This is FAQ for the [[Leksah]] IDE.<br />
<br />
=== The cross [x] buttons on tabs are almost invisible (don't fit in tabs) ===<br />
<br />
With version 0.6 and later for a pleasant visual appearance, you have to copy<br />
or append the .gtkrc-2.0 file from the Leksah data folder or<br />
from the data folder in Leksah sources to your home folder. (Manual 7)<br />
<br />
That's not a global change, but a Leksah-specific parameter that can't be<br />
set in any other way, as far as I know.<br />
<br />
To install leksah's gtkrc in Linux for the current user:<br />
$ cd ~<br />
$ wget http://code.haskell.org/leksah-head/leksah/data/.gtkrc-2.0 -O .gtkrc-2.0-leksah<br />
$ echo -e '\ninclude ".gtkrc-2.0-leksah"' >> .gtkrc-2.0<br />
<br />
=== The [X] button on the toolbar behaves in a counterintuitive way (I expected it to close source editor tabs not modules/debugging/... tabs). Is it really needed? ===<br />
<br />
<br />
Well, the concept behind Leksah is that you are basically free to<br />
rearrange every part of the Leksah window, so you are also free to close every<br />
pane.<br />
<br />
When I work this is my rescue:<br />
<br />
In Leksah there may be an active pane. The name of this<br />
pane is displayed in the second compartment from the left<br />
side in the status bar. Some actions like moving, splitting,<br />
closing panes or finding or replacing items in a text buffer<br />
act on the current pane, so check the display in the status<br />
bar to see if the pane you want to act on, is really the active<br />
one. (Manual 33)<br />
<br />
=== Name completion popup sometimes goes beyond the bottom edge of the window/screen ===<br />
<br />
True.<br />
<br />
=== Debugging doesn't work if ~/.cabal/bin is not in the path - "Setup: Cannot find the program 'ghc' at 'leksahecho' or on the path" (does it mean that it can't find leksahecho?) ===<br />
<br />
True. Not sure what it is really searching, but it needs to find both programs.<br />
<br />
=== Does importing some module need its package to be specified in dependencies first? Would it be possible to automatically add package to dependencies if I wanted to use something from it? ===<br />
<br />
The import helper just looks in imported packages, so if you<br />
miss a package import, you have to fix it manually. It would be an<br />
interesting feature for the future.<br />
<br />
=== Adding dependencies from packages installed by cabal doesn't seem to work: "Setup: At least the following dependencies are missing: some_package", while doing "~/.cabal/bin/cabal configure" in the package directory runs normally. I have some_package installed in my home directory by cabal-install ===<br />
<br />
This is because cabal install uses the per user database of haskell<br />
packages, while the default is the machine database. So you have to add<br />
--user to the ConfigFlags in Package / Package Flags.<br />
<br />
Note: "--user" was made the default in the releases > 0.6.1.<br />
<br />
=== After opening the leksah.cabal package, the Package/Edit Package menu item doesn't work ===<br />
<br />
As of version 0.6 "Cabal file with configurations can't be edited with the current version of the editor" (this is printed to stdout in the background).<br />
<br />
=== What font to use with Leksah? ===<br />
<br />
[http://damieng.com/blog/2008/05/26/envy-code-r-preview-7-coding-font-released Envy code R]<br />
and<br />
[http://www.is-vn.bg/hamster/ Terminus]<br />
seem to be a good match with Leksah.<br />
<br />
=== Is it possible to change the editor color scheme / background color? ===<br />
<br />
Yes. Leksah is based on GtkSourceView and supports the same color schemes as for example gedit. The schemes can be set in preferences, editor -> editor style.<br />
<br />
=== Where is the list of default keybindings? ===<br />
<br />
The keybindings can be configured by editing the .keymap file (the name of the file is specified in preferences -> GUI options -> name of the keymap).<br />
<br />
The default keymap file is<br />
[http://code.haskell.org/leksah/data/Default.keymap here]. You could make your own by copying the default one to your ~/.leksah under a different name.<br />
<br />
=== There is "Goto definition" in the modules pane. Is there also goto definition in the editor window? ===<br />
<br />
AFAIK, there is no 'direct' one ATM, but you could select the identifier text, use "Search (metadata)" from the right-click context menu and then right-click on one of the results to go to its definition.<br />
<br />
Also, holding down ctrl and double clicking on identifier in the code editor finds it in the module pane (if the results are unambiguous) and falls back on standard search otherwise (from Hamish).<br />
<br />
=== I'm having troubles linking Leksah against gtksourceview. I've installed gtk2hs but am not sure whether/where to specify the library search path for ld ===<br />
<br />
Here is the output I get:<br />
Linking dist\build\leksah\leksah.exe ...<br />
C:\app\Haskell_Platform\2009.2.0.1\gcc-lib\ld.exe: cannot find -lgtksourceview-2.0<br />
<br />
solution from Tobias Bexelius (with minor edits):<br />
<br />
I downloaded the latest development version binaries for Windows from http://ftp.gnome.org/pub/gnome/binaries/win32/gtksourceview/2.6/gtksourceview-dev-2.6.2.zip (the official download page is at http://projects.gnome.org/gtksourceview/download.html) and put the libgtksourceview-2.0.dll.a file in a lib directory that the GCC compiler knows about (I use MinGW so I put it in the library folder found there). I was then able to compile Leksah (after changing the regex dependencies as reported earlier on this mailing list).<br />
<br />
=== I have a problem with autocompletion. I can see StartComplete in status bar, but dialog *never* appears. ===<br />
<br />
It probably depends on what you are trying to autocomplete.<br />
AFAIK, for Leksah to know names from some module, you have to add its package to dependencies, configure package and import the module in your code.</div>Dibblegohttps://wiki.haskell.org/index.php?title=Applications_and_libraries/Program_development&diff=34525Applications and libraries/Program development2010-04-11T23:15:17Z<p>Dibblego: </p>
<hr />
<div>{{LibrariesPage}}<br />
<br />
A list of tools and libraries that are helpful when developing Haskell code.<br />
See also the [[Libraries and tools/Compiler tools|compiler tools]] and [[Libraries and tools/Theorem provers|theorem provers]].<br />
<br />
== Applications ==<br />
<br />
=== Preprocessors ===<br />
<br />
;[http://www.cs.york.ac.uk/fp/cpphs/ cpphs]<br />
:Cpphs is a re-implementation (in Haskell) of the C pre-processor.<br />
<br />
;[http://repetae.net/john/computer/haskell/DrIFT DrIFT]<br />
:DrIFT is a tool which allows derivation of instances for classes that aren't supported by the standard compilers. In addition, instances can be produced in separate modules to that containing the type declaration. This allows instances to be derived for a type after the original module has been compiled. As a bonus, simple utility functions can also be produced from a type.<br />
<br />
;[http://www.cs.vu.nl/Strafunski/ Strafunski]<br />
:Strafunski is a Haskell bundle that provides support for generic programming in Haskell, based on the concept of a functional strategy. It consists of a combinator library (StrategyLib) and a precompiler (DrIFT-Strafunski).<br />
<br />
;[http://hackage.haskell.org/cgi-bin/hackage-scripts/package/zeroth Zeroth]<br />
:A program using Template Haskell must link with the TH library even if it contains no references to TH after it has been compiled. Zeroth is a preprocessor which allows modules to use TH without linking with the TH library. To do this, Zeroth evaluates the top level splices from a module and saves the resulting code.<br />
<br />
;[http://www.cs.utah.edu/~hal/HAllInOne/index.html Haskell All-In-One]<br />
:Haskell All-In-One is a Haskell utility which will take a program implemented in multiple modules and convert it to a single module, for optimisation purposes.<br />
<br />
=== Build systems ===<br />
<br />
;[http://www.haskell.org/cabal Cabal]<br />
:The Haskell Cabal is a Common Architecture for Building Applications and Libraries. It is an API distributed with GHC, NHC98, and Hugs which allows a developer to easily group together a set of modules into a package. It is the standard build system for new Haskell libraries and applications.<br />
<br />
;[http://www.cs.york.ac.uk/fp/hmake/ hmake], a Haskell-aware replacement for make<br />
:Automatically keeps track of module dependencies (i.e. no need to write any Makefiles!). Can be used with any of the usual Haskell compilers (ghc, hbc, nhc98).<br />
<br />
=== Source tags ===<br />
<br />
;[http://www.cl.cam.ac.uk/users/rje33/software.html HaskTags]<br />
:Hasktags is a simple program that generates TAGS files for Haskell code. Together with a supporting editor (e.g. NEdit, XEmacs, or Vim) TAGS files can be used to quickly find the places where functions, data constructors etc. are defined.<br />
<br />
;[http://hackage.haskell.org/package/hothasktags HotHaskTags]<br />
:HotHaskTags is a reimplementation of HaskTags that is more aware of the structure of Haskell source. If you have multiple functions of the same name in a project and jump from a file, it will analyze the imports of that file (including qualified imports correctly) and jump to the one that is being referred to. Extended ctags format only (read: Vim only).<br />
<br />
;[http://www.dtek.chalmers.se/~d99josve/tagsh.tar.gz tagsh]<br />
:A version of the tags program for Haskell. It uses the standardised hssource and posix libraries and works with GHC 5.02.1. The generated tags file has been checked to work with vim and nedit.<br />
<br />
=== Program Transformation ===<br />
<br />
;[http://www.cs.kent.ac.uk/projects/refactor-fp/hare.html HaRe -- The Haskell Refactorer]<br />
:Mechanical refactoring of Haskell code (across module boundaries). HaRe now supports many refactorings such as renaming identifiers, moving/introducing/inlining definitions, and so on. Those refactorings are not limited to a single module. HaRe can be accessed from either Vim or Emacs.<br />
<br />
;[http://www.cs.utah.edu/~hal/HAllInOne/ Haskell All-In-One]<br />
:This Haskell utility takes a program implemented in multiple modules and converts it to a single module. This way you get whole-program optimization for compilers that do not support it. Programs compiled this way will probably be faster.<br />
<br />
;[http://wiki.di.uminho.pt/wiki/bin/view/Alcino/DrHylo DrHylo]<br />
:Tool for deriving hylomorphisms from a restricted Haskell syntax. It is based on the algorithm first presented in the paper "Deriving Structural Hylomorphisms From Recursive Definitions" at ICFP'96 by Hu, Iwasaki, and Takeichi.<br />
<br />
;[http://community.haskell.org/~ndm/hlint/ HLint]<br />
:Tool for automatically suggesting program improvements - library functions you may have missed, instances of map or foldr etc.<br />
<br />
=== Documentation and browsing ===<br />
<br />
;[http://www.haskell.org/hoogle/ Hoogle]<br />
:Hoogle is a Haskell API search engine. It allows you to search for a function in the standard libraries by either name, or by approximate type signature.<br />
<br />
;[[Haddock]] A Haskell Documentation Tool<br />
:A tool for automatically generating documentation from annotated Haskell source code. It is primarily intended for documenting libraries, but it should be useful for any kind of Haskell code. Haddock lets you write documentation annotations next to the definitions of functions and types in the source code, in a syntax that is easy on the eye when writing the source code (no heavyweight mark-up). The documentation generated by Haddock is fully hyperlinked - click on a type name in a type signature to go straight to the definition, and documentation, for that type.<br />
<br />
;[http://www.cse.unsw.edu.au/~chak/haskell/idoc/ IDoc] A No Frills Haskell Interface Documentation System<br />
:IDoc extracts interface documentation and declarations from Haskell modules based on standard Haskell layout rules and a small number of clues that the programmer embeds in interface comments. These clues have been designed to be visually non-imposing when displaying the source in a text editor. Interface documentation is rendered in standard markup languages (currently, only HTML is supported). IDoc has been designed to be simple to use and install.<br />
<br />
;[http://www.fmi.uni-passau.de/~groessli/hdoc/ HDoc]<br />
:HDoc generates documentation in HTML format for Haskell modules. The generated documents are cross linked and include summaries and detailed descriptions for the documented functions, data types, type classes and instance declarations.<br />
<br />
;[http://www.ida.liu.se/~jakax/haskell.html HaskellDoc]<br />
:This program generates an HTML document showing the module interfaces of a Haskell project. Convenient links are placed for easy browsing of the different modules of the project, and for quick access to the source code.<br />
<br />
;[http://home.conceptsfa.nl/~jwit/HaSpell.html HaSpell]<br />
:HaSpell is a spelling and style checker for Haskell programs. It can detect spelling errors in comments in the program text, and optionally in the code itself. There is an option to detect metasyntactic variables (such as 'foo') and 'bad function prefixes' such as 'compute' and 'doThe' - these make the program less readable and generally indicate bad programming style.<br />
<br />
;[[Lambdabot]]<br />
:Lambdabot is a large, ad-hoc collection of Haskell development tools available for offline use. In particular, automatic point-free refactoring is available via a vim interface, as well as access to [[Hoogle]], djinn, ghci, and much much more.<br />
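<br />
The annotation style described in the Haddock entry above can be sketched with a small module. This is only an illustrative sketch: the module name <code>Shapes</code> and its contents are made up for this example, and only the <code>-- |</code> and <code>-- ^</code> comment markers are Haddock's own syntax.<br />

```haskell
-- | A tiny module illustrating Haddock-style annotations.
-- The module and everything in it are hypothetical examples.
module Shapes
  ( Shape(..)
  , area
  ) where

-- | A two-dimensional shape.
data Shape
  = Circle Double            -- ^ A circle, given by its radius.
  | Rectangle Double Double  -- ^ A rectangle, given by its width and height.

-- | Compute the area of a 'Shape'.
area :: Shape -> Double
area (Circle r)      = pi * r * r
area (Rectangle w h) = w * h
```

The comments sit next to the definitions they document, and a name in single quotes (here <code>'Shape'</code>) is what Haddock turns into a hyperlink in the generated output.<br />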
<br />
=== Tracing &amp; debugging ===<br />
<br />
Tracing gives access to otherwise invisible information about a computation. Conventional debuggers allow the user to step through the program computation, stop at given points and examine variable contents. This tracing method is quite unsuitable for Haskell, because its evaluation order is complex, function arguments are usually large, unwieldy unevaluated expressions, and computation details generally
do not match the user's high-level view of functions mapping values to values.<br />
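<br />
The point about evaluation order can be illustrated with a minimal sketch. It assumes GHC and uses <code>Debug.Trace</code> from the base library, which is not one of the tools listed below; the function <code>noisyDouble</code> is a hypothetical name chosen for this example.<br />

```haskell
import Debug.Trace (trace)

-- | Doubles its argument and writes a message to stderr at the moment
-- the result is actually evaluated (hypothetical helper for illustration).
noisyDouble :: Int -> Int
noisyDouble x = trace ("evaluating noisyDouble " ++ show x) (x * 2)

main :: IO ()
main = do
  let xs = map noisyDouble [1, 2, 3]
  -- length only walks the list spine, so no element needs to be evaluated here.
  print (length xs)
  -- Only the last element is demanded, so only that element is evaluated.
  print (last xs)
```

The trace messages appear when a value is demanded, not where the call is written in the source, which is why stepping through a Haskell program line by line rarely matches the programmer's mental model.<br />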
<br />
;[http://www.cs.mu.oz.au/~bjpop/buddha/ Buddha]<br />
:Buddha is a declarative debugger for Haskell 98 programs. It presents the evaluation of a Haskell program as a series of function applications. A typical debugging session involves a series of questions and answers. The questions are posed by the debugger, and the answers are provided by the user. The implementation of Buddha is based on program transformation.<br />
<br />
;[http://www.ida.liu.se/~henni Freja]<br />
:A compiler for a subset of Haskell. Running a compiled program creates an evaluation dependency tree as trace, a structure based on the idea of declarative debugging from the logic programming community. A debugging session consists of the user answering a sequence of yes/no questions.<br />
<br />
;[http://www.cs.york.ac.uk/fp/hat Hat]<br />
:A Haskell program is first transformed by hat-trans and then compiled with nhc98 or ghc. At runtime the program writes a trace file. There are tools for viewing the trace in various ways: Hat-stack shows a virtual stack of redexes. Hat-observe shows top-level functions in the style of Hood. Hat-trail enables exploring a computation backwards, starting from (part of) a faulty output or an error message. Hat-detect provides algorithmic debugging in the style of Freja. Hat-explore allows free navigation through a computation similar to traditional debuggers and algorithmic debugging and slicing.<br />
<br />
;[http://www.haskell.org/hood Hood]<br />
:A library that lets you observe data structures at given program points. It can basically be used like print statements in imperative languages, but the lazy evaluation order is not affected and functions can be observed as well.<br />
<br />
;[http://www.cs.ukc.ac.uk/people/staff/cr3/toolbox/haskell/ GHood]<br />
:"Graphical Hood" - a Java-based graphical observation event viewer, building on Hood.<br />
<br />
=== Revision control ===<br />
<br />
;[http://darcs.net Darcs]<br />
: Darcs is a cutting edge revision control system written in Haskell<br />
<br />
;[http://www.cse.unsw.edu.au/~dons/darcs-graph.html darcs-graph]<br />
:a tool for generating graphs of commit activity for darcs repositories.<br />
<br />
;[http://www.cse.unsw.edu.au/~chak/haskell/VersionTool/ VersionTool]<br />
:a small utility that:<br />
* extracts version information from Cabal files,<br />
* maintains version tags in darcs,<br />
* computes patch levels by querying darcs,<br />
* extracts the current context from darcs, and<br />
* adds all this information to a source file<br />
<br />
=== Licensing ===<br />
<br />
;[http://www.haskell.org/pipermail/haskell/2006-June/018043.html Kamiariduki]<br />
:A system for judging whether the purpose and license of your derivative work are valid with respect to Creative Commons licensed works.<br />
<br />
=== Bug tracking ===<br />
<br />
;[http://urchin.earth.li/darcs/ian/bts/ Bark] <br />
:Bark is a bug tracking system written in Haskell<br />
<br />
=== Typesetting Haskell ===<br />
<br />
;[http://www.cs.york.ac.uk/fp/darcs/hscolour/ HsColour]<br />
:Colourise Haskell source code in HTML or ANSI terminal screen codes.<br />
<br />
;[http://www.iai.uni-bonn.de/~loeh/lhs2tex/ lhs2tex]<br />
:A preprocessor for typesetting Haskell programs that combines some of the good features of pphs and smugweb. It generates LaTeX code from literate Haskell sources.<br />
<br />
;[http://www.cs.uu.nl/wiki/Ehc/Shuffle Shuffle]<br />
:Another tool supporting literate programming in Haskell. It helps to maintain ''views'' in a literate programming project. For example, it is among the tools used for developing a compiler iteratively, with manuals that reflect the evolving series of versions derived from the literate code (see the [http://www.cs.uu.nl/wiki/Ehc/WebHome Essential Haskell Compiler] project). Shuffle thus makes it possible to show the evolution of versions in the documentation when this is needed. More generally, Shuffle provides the tangling and weaving facilities of literate programming. By supporting the concept of views, it offers a more abstract way to think about literate program development (a perhaps far-fetched analogy: version control management, e.g. [http://abridgegame.org/darcs/ darcs], helps in thinking about program development in a more abstract way, too). Shuffle works well together with lhs2tex.<br />
<br />
;[http://web.comlab.ox.ac.uk/oucl/work/ian.lynagh/Haskell2LaTeX/ Haskell2Latex]<br />
:Ian Lynagh's Haskell2LaTeX takes a literate Haskell program, or any LaTeX document with embedded Haskell, and pretty-prints the Haskell sections within it. The most significant difference between Haskell2LaTeX and other programs with similar goals is that Haskell2LaTeX parses the input rather than merely lexing it.<br />
<br />
==== TeX ====<br />
<br />
;[http://www.acooke.org/jara/pancito/haskell.sty haskell.sty]<br />
:A LaTeX style file by Andrew Cooke that makes literate programming in Haskell simple.<br />
<br />
;[http://www.jantar.org/lambdaTeX/ lambdaTeX]<br />
:A TeX package for typesetting literate scripts in TeX. The output looks much like the code from Chris Okasaki's book "Purely Functional Data Structures", doing syntax highlighting and converting ASCII art such as <code>-&gt;</code> or <code>alpha</code> to proper mathematical symbols. It should work with both LaTeX and plain TeX, and it does its magic without any annotations, directly on the source code (lambdaTeX uses an almost-complete Haskell lexical analyzer written entirely in plain TeX). You only have to add <code>\input lambdaTeX</code> at the top of your source file, and manually typeset your literate comments so they look as good as the source code.<br />
<br />
;[http://www.cse.unsw.edu.au/~chak/haskell/haskell-style.html Haskell Style for LaTeX2e]<br />
:by Manuel Chakravarty provides environments and macros that simplify setting Haskell programs in LaTeX.<br />
<br />
== Editor support ==<br />
<br />
=== Integrated Development Environments ===<br />
<br />
;[http://www.haskell.org/visualhaskell Visual Haskell]<br />
:Visual Haskell is a complete development environment for Haskell software, based on the [http://msdn.microsoft.com/vstudio/productinfo/ Microsoft Visual Studio] platform. Visual Haskell integrates with the Visual Studio editor to provide interactive features to aid Haskell development, and it enables the construction of projects consisting of multiple Haskell modules, using the Cabal building/packaging infrastructure.<br />
<br />
;[http://eclipsefp.sourceforge.net/ Haskell support for Eclipse]<br />
:Eclipse is an open, extensible IDE platform for "everything and nothing in particular". It is implemented in Java and runs on several platforms. The Java IDE built on top of it has already become very popular among Java developers. The Haskell tools extend it to support editing (syntax coloring, code assist), compiling, and running Haskell programs from within the IDE. More features like source code navigation, module browsing etc. will be added in the future.<br />
<br />
;[[Leksah]]<br />
:Leksah is an IDE for Haskell written in Haskell. Leksah is intended as a practical tool to support the Haskell development process. It is in a pre-release phase with bugs and open ends, but it is actively developed and moving quickly. Hopefully, Leksah will already be interesting, useful and fun. Leksah uses GTK+ as GUI Toolkit with the gtk2hs binding. It is platform independent and should run on any platform where GTK+, gtk2hs and GHC can be installed. I have tested it on Windows and Linux. It only supports GHC.<br />
<br />
;[http://www.dtek.chalmers.se/~d99josve/hide/ hIDE]<br />
:hIDE is a GUI-based Haskell IDE written using gtk+hs. It does not include an editor but instead interfaces with NEdit, vim or GNU emacs.<br />
<br />
;[http://haskell.org/hide hIDE-2]<br />
:Through the dark ages many a programmer has longed for the ultimate tool. In response to this most unnerving craving, of which we ourselves have had maybe more than our fair share, the dynamic trio of #Haskellaniacs (dons, dcoutts and Lemmih) hereby announce, to the relief of the community, that a fetus has been conceived: ''hIDE - the Haskell Integrated Development Environment''. So far the unborn integrates source code recognition and a chameleon editor, presenting these in a snappy gtk2 environment. Although no seer has yet predicted the date of birth of our hIDEous creature, we hope that the mere knowledge of its existence will spread peace of mind throughout the community as oil on troubled waters. See also: [[HIDE/Screenshots of HIDE]] and [[HIDE]]<br />
<br />
;[http://www.students.cs.uu.nl/people/rjchaaft/JCreator JCreator with Haskell support]<br />
:JCreator is a highly customizable Java IDE for Windows. Features include extensive project support, fully customizable toolbars (including the images of user tools) and menus, increase/decrease indent for a selected block of text (tab/shift+tab respectively). The Haskell support module adds syntax highlighting for Haskell files and winhugs, hugs, a static checker (if you double click on the error message, JCreator will jump to the right file and line and highlight it yellow) and the Haskell 98 Report as tools. Platforms: Win95, Win98, WinNT and Win2000 (only Win95 not tested yet). Size: 6MB. JCreator is a trademark of Xinox Software; Copyright &copy; 2000 Xinox Software. The Haskell support module is made by [http://www.students.cs.uu.nl/people/rjchaaft/ Rijk-Jan van Haaften].<br />
<br />
;[http://www.kdevelop.org/ KDevelop]<br />
:This IDE supports many languages. For Haskell it [http://www.kdevelop.org/HEAD/doc/api/html/LangSupportStatus.html currently supports] project management, syntax highlighting, building (with GHC) & executing within the IDE.<br />
<br />
;[[haste]] - Haskell TurboEdit<br />
:haste - Haskell TurboEdit - was an IDE for the functional programming language Haskell, written in Haskell.<br />
<br />
;[http://www.cs.kent.ac.uk/projects/vital/ Vital]<br />
:Vital is a visual programming environment. It is particularly intended for supporting the open-ended, incremental style of development often preferred by end users (engineers, scientists, analysts, etc.).<br />
<br />
;[http://www.cs.kent.ac.uk/projects/pivotal/ Pivotal]<br />
:Pivotal 0.025 is an early prototype of a Vital-like environment for Haskell. Unlike Vital, however, Pivotal is implemented entirely in Haskell. The implementation is based on the use of the hs-plugins library to allow dynamic compilation and evaluation of Haskell expressions together with the gtk2hs library for implementing the GUI.<br />
<br />
=== Editor modes ===<br />
<br />
====Crimson Editor====<br />
<br />
;[http://www.crimsoneditor.com/ Crimson Editor] Free source code editor for MS Windows. Haskell support files are included in the standard installation, but must be added to the editor via the options dialog.<br />
<br />
====Kate====<br />
<br />
; Syntax highlighting files for KDE's Kate<br />
:<br />
* [http://www.informatik.uni-bonn.de/~ralf/software.html#syntax Files] by Ralf Hinze.<br />
* [hs.xml hs.xml] and [lhs.xml lhs.xml] by Brian Huffman.<br />
<br />
====NEdit====<br />
<br />
;[http://www.nedit.org/ftp/contrib/highlighting/haskell.pats NEdit] syntax highlighting and block comment support.<br />
<br />
====Vim====<br />
<br />
;[http://www.vim.org Vim] syntax highlighting<br />
:<br />
* [ftp://ftp.cse.unsw.edu.au/pub/users/dons/vim/ by Don Stewart]: for TeX and cpp style Haskell files. <br />
* [http://urchin.earth.li/~ian/vim/ by Ian Lynagh]: distinguishes different literate Haskell styles.<br />
* by John Williams: Both regular Haskell [haskell.vim .hs] and [lhaskell.vim .lhs] files that uncomment lines using '>' are supported. This is included in Vim 6.2.<br />
* Included in Vim 7.0 and newer are syntax files based on the previous three, superseding all three.<br />
* There's a [[Literate programming/Vim|copy of lhaskell.vim]] on the Wiki.<br />
<br />
;Further useful stuff<br />
:<br />
* [http://mawercer.de/marcweber/vimlib_haskell.html vim script based function/module completion, cabal support, tagging by one command, context completion ( w<tab> -> where ), module outline, etc]<br />
* [http://tokyoenvious.xrea.jp/vim/indent/haskell.vim Vim indenting mode for Haskell]<br />
* [http://projects.haskell.org/haskellmode-vim Claus Reinke's vim tools for Haskell]<br />
<br />
====Textpad====<br />
<br />
;[http://www.haskell.org/libraries/Haskell98.syn Syntax highlighting file] for [http://www.textpad.com textpad]<br />
:by Jeroen van Wolffelaar and Arjan van IJzerdoorn, which includes all prelude functions, datatypes, constructors, etc., all in separate groups.<br />
<br />
====Emacs====<br />
<br />
;[http://www.haskell.org/haskell-mode/ Haskell Mode for Emacs]<br />
:Supports font locking, declaration scanning, documentation of types, indentation and interaction with Hugs/GHCi. See the [[Haskell mode for Emacs]] wiki page for setup and usage tips, including how to get the CVS version. XEmacs users should also check out the minor bugs noted in that page. <br />
<br />
;Alternative [http://www.haskell.org/libraries/hugs-mode.el Hugs Mode for Emacs] by Chris Van Humbeeck<br />
:Provides fontification and cooperation with Hugs. Updated for emacs 20.* by Adam P. Jenkins.<br />
<br />
;[http://daniel.franke.name/latex-lhs-mode.el latex-lhs-mode.el]<br />
:Built on top of haskell-mode (above), it provides simultaneous use of all of literate-haskell-mode and most of emacs' built-in latex-mode.<br />
<br />
;[http://mapcar.org/haskell/cabal-mode/ cabal-mode.el]<br />
:A small (and developing) major mode for editing Cabal files.<br />
<br />
====Jed====<br />
;[http://haskell.org/sitewiki/images/7/75/Haskellmode-jed.tgz Haskell mode]<br />
:for [http://www.jedsoft.org/jed/ jed] by Marcin 'Qrczak' Kowalczyk.<br />
<br />
====Subethaedit====<br />
<br />
;[http://www.codingmonkeys.de/subethaedit/modes.html Haskell mode For SubEthaEdit]<br />
: SubEthaEdit is a Mac OS X editor.<br />
<br />
====Xcode====<br />
<br />
;[http://www.hoovy.org/HaskellXcodePlugin/ Xcode plugin]<br />
:a plugin for Xcode enabling syntax highlighting, Xcode projects compiling and linking, and a couple missing features, for Haskell<br />
<br />
====Other====<br />
<br />
Some other, mostly obsolete, modes are available in [http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/CONTRIB/haskell-modes/ CVS].<br />
<br />
== Libraries ==<br />
<br />
=== Testing ===<br />
<br />
;[http://hunit.sourceforge.net HUnit]<br />
:A unit testing framework for Haskell, similar to JUnit for Java. With HUnit, the programmer can easily create tests, name them, group them into suites, and execute them, with the framework checking the results automatically. Test specification is concise, flexible, and convenient.<br />
<br />
;[http://www.cs.chalmers.se/~rjmh/QuickCheck/ QuickCheck]<br />
:A tool for testing Haskell programs automatically. The programmer provides a specification of the program, in the form of properties which functions should satisfy, and QuickCheck then tests that the properties hold in a large number of randomly generated cases. Specifications are expressed in Haskell, using combinators defined in the QuickCheck library. QuickCheck provides combinators to define properties, observe the distribution of test data, and define test data generators.<br />
<br />
;[http://www.informatik.uni-freiburg.de/~wehr/haskell/ HTF - The Haskell Test Framework]<br />
:The HTF lets you write HUnit tests and QuickCheck properties in an easy and convenient way. Additionally, the HTF provides a facility for testing programs by running them and comparing the actual output with the expected output (so called "file-based tests"). The HTF uses Template Haskell to collect all tests and properties, so you do not need to write boilerplate code for that purpose. Preprocessor macros provide you with file name and line number information for tests and properties that failed.<br />
<br />
;[http://projects.unsafeperformio.com/hpc/ Hpc: Haskell Program Coverage]<br />
:Hpc is a tool-kit to record and display [[Haskell program coverage]]. Hpc includes tools that instrument Haskell programs to record program coverage, run instrumented programs, and display the coverage information obtained.<br />
<br />
;[[Haskell Equational Reasoning Assistant]]<br />
:Functional programmers often appeal to equational reasoning to justify various decisions made in both design and implementation. This page introduces the Haskell Equational Reasoning Assistant (HERA), an architecture that provides both a GUI level and a batch level Haskell rewrite engine inside a single tool.<br />
<br />
;[http://www.cse.unsw.edu.au/~dons/pqc.html pQC: parallel QuickCheck]<br />
:pqc provides Test.QuickCheck.Parallel, a QuickCheck driver that runs jobs in parallel, and will utilise as many cores as you wish, with the SMP parallel GHC 6.6 runtime. It is a simple, scalable replacement for Test.QuickCheck.Batch.<br />
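<br />
The property-based style described in the QuickCheck entry above can be sketched as follows. This is a minimal sketch, assuming the QuickCheck package is installed and exposes <code>quickCheck</code> from <code>Test.QuickCheck</code>; the two property names are made up for this example.<br />

```haskell
import Test.QuickCheck (quickCheck)

-- | Property: reversing a list twice gives back the original list.
prop_reverseTwice :: [Int] -> Bool
prop_reverseTwice xs = reverse (reverse xs) == xs

-- | Property: the length of a concatenation is the sum of the lengths.
prop_appendLength :: [Int] -> [Int] -> Bool
prop_appendLength xs ys = length (xs ++ ys) == length xs + length ys

main :: IO ()
main = do
  -- quickCheck generates random inputs and reports whether each property held.
  quickCheck prop_reverseTwice
  quickCheck prop_appendLength
```

The properties are ordinary Haskell functions returning <code>Bool</code>, so they can also be applied by hand to specific inputs when a failure needs to be investigated.<br />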
<br />
=== Debugging ===<br />
<br />
;[http://www.cs.mu.oz.au/~bjpop/code.html highWaterMark]<br />
:A library for determining the amount of memory allocated at any point by a GHC program.<br />
<br />
;[http://www.cs.mu.oz.au/~bjpop/code.html GHC Internals library]<br />
:A GHC library for polymorphically deconstructing heap objects from within Haskell code.<br />
<br />
;[http://www.cs.mu.oz.au/~bjpop/code.html GHC Heap and Stable Table Printing library]<br />
:Two libraries for GHC. The first is for printing heap objects from within Haskell or C code. The second is for dumping the contents of the Stable Table which is used for Stable Pointers and Stable Names.<br />
<br />
;[http://www.cse.unsw.edu.au/~dons/loch.html LocH]<br />
:Located errors, tracing and exceptions in Haskell.<br />
<br />
=== Formal methods ===<br />
<br />
;[http://aprove.informatik.rwth-aachen.de Haskell98 termination analyzer]<br />
:Checks termination of given start terms w.r.t. a Haskell program:<br />
* [[Analysis and design|analysis and design methods]]<br />
* [[Libraries and tools/Theorem provers|theorem provers]].</div>Dibblegohttps://wiki.haskell.org/index.php?title=AusHac2010&diff=34117AusHac20102010-03-17T11:42:23Z<p>Dibblego: </p>
<hr />
<div>If you've found this page, you use Haskell, ''and'' live in Australia (or are at the very least able and willing to travel here), then you're in the right place! We're looking into organising a Haskell [[Hackathon]] some time during the middle of 2010, and this is where it shall be organised.<br />
<br />
If you're interested in coming, '''please''' put your name down on the list below, along with your IRC nickname if you're on #haskell, and possibly your email (We'll use this to let you know of any progress we've made, but it's not mandatory). Also, if you've got something to discuss, feel free to add it to the bottom of the page in the Discussion section (just to keep the rest of the page clean and helpful).<br />
<br />
== What we've got so far ==<br />
<br />
===Why===<br />
<br />
Because we miss out on all the fun they have up north, and we've got something to offer. It's also a great chance to meet all these people you talk to on IRC, or read their blogs, and just have a good time, while getting some (potentially) useful work done!<br />
<br />
===When===<br />
<br />
A few dates have been discussed, mainly taking into account when the university holidays are for various universities:<br />
<br />
* ANU: 7 June -> 18 July<br />
* UNSW: 29 June -> 18 July<br />
<br />
So far we need a weekend between the 28th of June and the 18th of July.<br />
<br />
We're looking at organising it over a weekend, and I (Axman6) would quite like to have it start on a Friday, ending on Sunday. This does not at all mean that those who can't make the Friday will miss out, the more people we have, the better. But I think that having more time will mean that we can get more done (which is the point right?).<br />
<br />
===Where===<br />
<br />
Manuel Chakravarty and Ben Lippmeier have said there should be no problem finding a room at UNSW, with the only possible problem being Internet access for everyone, but hopefully something can be arranged by that time.<br />
<br />
===Who===<br />
<br />
If you're interested in coming, please show your interest by adding your details to the list below (if you don't have an account, please email me (Axman6) your details and I'll add you).<br />
<br />
<table border="1px"><br />
<tr><br />
<td>Name</td><br />
<td>IRC Nickname</td><br />
<td>Email</td><br />
<td>Availability</td><br />
<td>Preferred date</td><br />
<td>Comment</td><br />
</tr><br />
<br />
<tr><br />
<td>Alex Mason</td><br />
<td>Axman6</td><br />
<td>axman6@gmail.com</td><br />
<td>Probably any weekend during the ANU holidays</td><br />
<td>-</td><br />
<td>Organiser... sort of</td><br />
</tr><br />
<br />
<br />
<tr><br />
<td>[[:User:ivanm|Ivan Miljenovic]]</td><br />
<td>ivanm</td><br />
<td>Ivan <dot> Miljenovic <at> gmail <dot> com</td><br />
<td>*shrug* lazy PhD student, so whenever</td><br />
<td>&nbsp;&nbsp; <=== </td><br />
<td>ditto</td><br />
</tr><br />
<br />
<tr><br />
<td>Tony Morris</td><br />
<td>dibblego</td><br />
<td>code@tmorris.net</td><br />
<td>Nothing specific</td><br />
<td>-</td><br />
<td>Tentative, depending on health</td><br />
</tr><br />
</table><br />
<br />
== Discussion ==<br />
<br />
=== Possible Projects ===<br />
<br />
====Generic graph class====<br />
'''What:''' I (Ivan) last year floated the idea of replacing the current default array-based Graph data type with an extensible set of classes with default instances. There's various interest about this around and I've done some work on it, but if there's anyone else coming it'd be better to bounce ideas together about how to define such classes.<br />
<br />
'''Who:''' Ivan M<br />
<br />
====Gloss-based plots====<br />
'''What:''' Either an alternative graphing back end to Criterion that only relies on OpenGL (through the use of Gloss), or a library for plotting. At the moment Gloss looks like it may only be suitable for bar type graphs, but we'll see. (We may look into writing some other library that's better suited than Gloss, as Gloss is aimed at students learning Haskell who just want to get something drawn.)<br />
<br />
'''Who:''' Ivan M, Alex M<br />
<br />
====GHC LLVM backend====<br />
'''What:''' The recent work done by David Terei on an LLVM backend for GHC has shown some fantastic results, and getting it to a point where it could become the default GHC backend is something a lot of people would really like to see.<br />
<br />
'''Who:''' Alex M<br />
<br />
=== Dates ===<br />
<br />
<br />
== Related Links ==<br />
<br />
* [[OzHaskell]]</div>Dibblegohttps://wiki.haskell.org/index.php?title=User_groups&diff=28312User groups2009-05-25T12:20:56Z<p>Dibblego: </p>
<hr />
<div>[[Category:Community]]<br />
<br />
A range of Haskell User Groups are springing up all over. Also see a [http://www.frappr.com/haskellers Map of Haskellers].<br />
<br />
== User groups ==<br />
<br />
Regular meetings in a particular geographical area. Great if you want to see and meet other Haskellers.<br />
<br />
===North America===<br />
<br />
====West Coast ====<br />
<br />
;[http://socalfp.blogspot.com/ SoCal FP Group]<br />
<br />
;[http://bayfp.org/ The Bay Area Functional Programmers group]<br />
:Meeting monthly in the San Francisco Bay area. See [http://bayfp.org/blog their blog] for more details and news of upcoming meetings.<br />
<br />
;[http://groups.google.com/group/pdxfunc PDXfunc: Portland FP Group]<br />
:Monthly meetings of the Portland, Oregon functional programming group. Meetings occur on the second Monday of each month at 7 pm. They are held at [http://www.cubespacepdx.com CubeSpace] on 6th & Grant ([http://www.cubespacepdx.com/directions directions]), which generously provides the space free of charge. <br />
<br />
;[http://www.haskell.org/pipermail/haskell-cafe/2008-February/038991.html Seattle: Northwest Functional Programming Interest Group]<br />
:a Northwest Functional Programming Interest Group in Seattle.<br />
<br />
;[http://groups.google.com/group/hugvan Vancouver, Canada: Haskell Programmers Group]<br />
:Regular informal meetings for local Haskell programmers/devotees.<br />
<br />
====East Coast====<br />
<br />
;[http://article.gmane.org/gmane.comp.lang.haskell.cafe/21856 New York Functional Programmers Network]<br />
:Come and meet like-minded functional programmers in the New York area. The next meeting is at 6:30pm on February 26th, at Credit Suisse's offices. Paul Hudak from Yale will be giving a talk on real-time sound synthesis using Haskell. Please RSVP at the [http://lisp.meetup.com/59/ NYFPN Meetup Page].<br />
<br />
;[http://www.lisperati.com/fringedc.html FringeDC Washington]<br />
:Meetings about functional programming languages in Washington DC.<br />
<br />
====Central====<br />
<br />
;[http://leibnizdream.wordpress.com/2007/12/22/new-austin-functional-programmers-group-in-2008/ Austin Functional Programmers Group]<br />
:See the [http://groups.google.com/group/austin-fp discussion group] for more.<br />
<br />
;[http://groups.google.com/group/real-world-haskell-book-club/browse_thread/thread/3e8e59768c8c50a9 Colorado Area Haskell Study Group]<br />
<br />
===Australia===<br />
<br />
;[http://groups.google.com/group/fp-syd FP-SYD, the Sydney (Australia) Functional Programming group]<br />
:FP hackers in Sydney.<br />
<br />
;[http://sites.google.com/site/fpunion/ (FPU) Melbourne Functional Programming Union]<br />
:The FPU is a collective of functional programming language enthusiasts, which has been in operation since 1998. We are based at the University of Melbourne, in the Department of Computer Science and Software Engineering, but we are open to all members of the community. We meet on a regular basis for lively discussions on topics broadly associated with the declarative programming paradigm.<br />
<br />
;[http://www.meetup.com/Brisbane-Functional-Programming-Group-BFG/ Brisbane Functional Programming Group]<br />
:A group for Functional Programming with Haskell, Scala and other languages.<br />
<br />
===Europe===<br />
<br />
;[http://www.londonhug.net/ London Haskell User Group]<br />
:The first meeting of the London Haskell User Group took place on 23rd May 2007, at City University in central London<br />
<br />
;Haskell in Leipzig<br />
:Hal, they have videos [http://iba-cg.de/haskell.html online].<br />
<br />
;[http://users.ecs.soton.ac.uk/pocm06r/fpsig/ Southampton University FPSIG]<br />
:The Functional Programming Special Interest Group of the University of Southampton is a meeting for people interested in FP and Haskell. It meets weekly on Tuesdays at 11.30, in the Access Grid Room of Bldg. 32.<br />
<br />
;[http://oasis.yi.org/oasis/HUGZ Haskell User Group Zurich]<br />
:A user group for Haskell users residing in Zurich and surroundings. It's new and still being formed.<br />
<br />
;[[IsraelHaskell]] User Group<br />
:[http://article.gmane.org/gmane.comp.lang.haskell.cafe/28877 Are getting organised].<br />
<br />
;[http://spbhug.folding-maps.org Saint-Petersburg Haskell User Group]<br />
:The next meeting will be held in April, 2008.<br />
<br />
;[[ItaloHaskell]]<br />
:We had a first meeting in August 2008 and we are planning a second one sometime during the 2008/2009 Autumn/Winter season.<br />
<br />
;[[Reykjavik Haskell User Group]] Iceland<br />
:[http://groups.google.com/group/haskell-is Currently recruiting members]<br />
<br />
;[http://groups.google.com/group/core-haskell?lnk=srg Turkey Haskell Programmer's Group]<br />
:Formed by Turkish Functional Programmers, the group began to communicate via an e-mail list opened by core.gen.tr. The first contribution is hlibev project by Aycan iRiCAN. <br />
<br />
;[[Dutch HUG]]<br />
:The Dutch HUG meets monthly in an informal setting.<br />
<br />
===South America===<br />
<br />
;[http://groups.google.com/group/hug-br HUG-BR]<br />
:Haskell Users' Group for Brasil<br />
<br />
===Asia===<br />
<br />
;[http://www.starling-software.com/en/tsac.html Tokyo Society for the Application of Currying]<br />
<br />
== Workshops/meet ups ==<br />
<br />
These are less regular and move around. They usually have a few talks from invited speakers.<br />
<br />
;[[AngloHaskell]]<br />
:AngloHaskell is a Haskell meeting held in England once a year.<br />
<br />
;[[OzHaskell]]<br />
:Australian Haskell Programmer's Group<br />
<br />
;[[AmeroHaskell]]<br />
:USAsian Haskell Programmer's Group<br />
<br />
;[http://taichi.ddns.comp.nus.edu.sg/taichiwiki/SingHaskell2007 SingHaskell]<br />
:Sing(apore)Haskell is a Haskell (and related languages) meeting in Singapore<br />
* [http://www.comp.nus.edu.sg/~sulzmann/singhaskell07/index.html slides]<br />
<br />
;[http://www.comp.mq.edu.au/~asloane/pmwiki.php/SAPLING/HomePage Sydney Area Programming Languages Interest Group]<br />
:10am-4pm, June 12, 2007. Room T5, Building E7B, Macquarie University<br />
<br />
;[http://www.cs.uu.nl/~johanj/FPDag2008/ Utrecht Functioneel Programmeren dag 2008]<br />
:11 January 2008<br />
<br />
== Hackathons ==<br />
<br />
Getting together to squash bugs and write new stuff. For a more complete list, see [[Hackathon]].<br />
<br />
;[http://haskell.org/haskellwiki/Hac_2007 Hackathons]<br />
:Hac 07 was held January 10-12, 2007, Oxford University Computing Laboratory<br />
<br />
;[http://haskell.org/haskellwiki/HaL3 HaL3 Hackathon]<br />
:HaL3 was held Apr 19-20, 2008, Leipzig<br />
<br />
;[[Hac5]]<br />
:Hac5 was held 17-19 April 2009 in Utrecht.<br />
<br />
== Conferences ==<br />
<br />
See the [[Conferences]] page for academic workshops and conferences<br />
focusing on Haskell and related technology</div>Dibblegohttps://wiki.haskell.org/index.php?title=IRC_channel/Phase_2&diff=26238IRC channel/Phase 22009-02-01T22:54:08Z<p>Dibblego: </p>
<hr />
<div>=Managing traffic in #haskell=<br />
<br />
'''Motivation'''<br />
<br />
The #haskell channel is such a [http://donsbot.wordpress.com/2009/02/01/haskell-is-a-busy-place/ busy place] that we're moving away from our [http://www.haskell.org/haskellwiki/IRC_channel#Principles core principles] of being newbie friendly.<br />
<br />
It is now not uncommon to hear or see newbie questions go unanswered.<br />
<br />
'''Goal'''<br />
<br />
We want to reduce non-newbie question related traffic, so that we again achieve maximum newbie friendliness.<br />
<br />
'''Strategy'''<br />
<br />
* Create a 'deep' channel for in-depth hard topics<br />
** this channel shall be a subset of #haskell users (old hands mostly, and theory heads)<br />
** those people should also stay in #haskell to answer questions<br />
<br />
* Deep/hard topics should move into this channel to make way for newbie traffic in #haskell<br />
* lambdabot and other noisy services will be monitored and have non-newbie related content reduced.<br />
<br />
'''Evaluation'''<br />
<br />
We monitor success by how much we reduce traffic without reducing the number of newbies.<br />
<br />
We must also be careful to ensure there's a clear path for new users into the -advanced channel.</div>Dibblegohttps://wiki.haskell.org/index.php?title=IRC_channel/Phase_2&diff=26236IRC channel/Phase 22009-02-01T22:53:06Z<p>Dibblego: </p>
<hr />
<div>=Managing traffic in #haskell=<br />
<br />
'''Motivation'''<br />
<br />
The #haskell channel is such a [http://donsbot.wordpress.com/2009/02/01/haskell-is-a-busy-place/ busy place] that we're moving away from our [http://www.haskell.org/haskellwiki/IRC_channel#Principles core principles] of being newbie friendly.<br />
<br />
It is now not uncommon to hear or see newbie questions go unanswered.<br />
<br />
'''Goal'''<br />
<br />
We want to reduce non-newbie question related traffic, so that we again achieve maximum newbie friendliness.<br />
<br />
'''Strategy'''<br />
<br />
* Create a 'deep' channel for in-depth hard topics<br />
** this channel shall be a subset of #haskell users (old hands mostly, and theory heads)<br />
** those people should also stay in #haskell to answer questions<br />
<br />
* Deep/hard topics should move into this channel to make way for newbie traffic in #haskell<br />
* lambdabot and other noisy services will be monitored and have non-newbie related content reduced.<br />
<br />
'''Evaluation'''<br />
<br />
We monitor success by how much we reduce traffic, without reducing newbies<br />
<br />
We must also be careful to ensure there's a clear path for new users into the -advanced chan.</div>Dibblegohttps://wiki.haskell.org/index.php?title=IRC_channel/Management&diff=25136IRC channel/Management2008-12-22T07:02:49Z<p>Dibblego: </p>
<hr />
<div>== Overview ==<br />
<br />
===Redirect bans===<br />
<br />
A redirect ban to the op channel is good for wide-reaching bans (e.g.<br />
tor or *ass*!*@*), if we expect the odd false positive.<br />
<br />
This can also be done for people with problematic connections.<br />
<br />
===Controlling offtopic conversations===<br />
<br />
This has never happened in #haskell, but it is possible to manage an off-topic<br />
flamefest with:<br />
<br />
+mz <br />
<br />
All of the offtopic conversations die off because no one can see any<br />
responses to them.<br />
<br />
===Kicks and parts===<br />
<br />
You can use "remove", which forces a PART instead of a KICK. The difference<br />
is that many clients don't auto-rejoin on PART.<br />
<br />
===Silencing===<br />
<br />
+q ____ and +b %____ are identical: both silence the offender but don't prevent joins.<br />
<br />
===Real name bans===<br />
<br />
Some persistent trolls will attempt to rejoin using all available means.<br />
They can often be stopped with realname bans, using +d.<br />
<br />
==Policies==<br />
<br />
===Bans===<br />
<br />
I default to *!*@hostname bans, especially when I expect it to be a<br />
temporary ban or when banning an unregistered nick. Hostname specific<br />
bans against a dynamically assigned hostname should be cleared<br />
periodically.<br />
<br />
For example:<br />
<br />
/ban *!*@foo.bar.com<br />
<br />
I am much less lenient with someone that join/spams than with someone<br />
that has a history of productive behaviour who slips up. I have no<br />
problem kick/temp-banning a join/spammer while I'm likely to warn and<br />
chat with a regular user who violates the policy.<br />
<br />
Many users aren't aware of what acceptable #haskell behavior is. We keep<br />
the channel noise level low to encourage productive, on-topic<br />
discussion. Private messages to a problem user explaining why a<br />
behaviour is not acceptable are often successful at neutralizing a<br />
situation before it escalates. Of course other users are intentionally<br />
disruptive, but even these are eligible to be saved. I will often ban<br />
the user in the channel without kicking (which mutes them) and<br />
immediately send a private message explaining the situation.<br />
<br />
==Chaos Control==<br />
<br />
This situation hasn't really happened in #haskell before, but since I've<br />
dealt with it in other channels I'll document it here:<br />
<br />
Sometimes the channel can become wildly off-topic with too many people<br />
to blame to point individual fingers. The most effective way I've found<br />
to deal with this problem is to +o a few of the channel moderators and<br />
to set the channel to +mz. This configures the channel such that only +o<br />
users can read the messages and respond to them. This off-topic<br />
conversation will die out, and once someone asks a productive, on-topic<br />
question you can set the mode to -mz and return to normal. -- glguy<br />
<br />
==Dealing with gateway abuse==<br />
<br />
Some people think that IRC gateways like Tor and Mibbit grant them<br />
enough anonymity that they can hassle IRC channels unchecked. If someone<br />
is reconnecting through such a gateway and proving difficult to ban, do<br />
the following (example using tor):<br />
<br />
/ban *!*@gateway/tor/*!#haskell-ops<br />
<br />
This will redirect all Tor users to #haskell-ops. If a legitimate user<br />
gets caught in this wide-reaching ban, you can add an exception for that<br />
specific user:<br />
<br />
/mode #haskell +e nick!user@host</div>Dibblegohttps://wiki.haskell.org/index.php?title=Fibonacci_primes_in_parallel&diff=21100Fibonacci primes in parallel2008-05-27T01:32:48Z<p>Dibblego: </p>
<hr />
<div>The problem is to find all primes in the sequence of rapidly growing Fibonacci numbers.<br />
<br />
The [http://en.wikipedia.org/wiki/Miller-Rabin_primality_test Miller-Rabin] probabilistic primality test is used to check numbers in the sequence. In addition, only the elements with prime indices are considered: if m divides n then F(m) divides F(n), so apart from F(4) = 3 a Fibonacci number can only be prime when its index is prime.<br />
<br />
The [http://www.haskell.org/ghc/dist/current/docs/users_guide/lang-parallel.html#id437410 parallel capabilities] of [http://haskell.org/ghc GHC] give almost twofold performance speedup if running the program on a dual core system.<br />
<br />
<haskell><br />
module Main where<br />
<br />
import Data.Time.Clock.POSIX<br />
import Data.Maybe<br />
import Numeric<br />
import Data.Char (intToDigit)<br />
import Control.Parallel.Strategies<br />
<br />
-- binary representation of an integer<br />
binary :: Integer -> String <br />
binary = flip (showIntAtBase 2 intToDigit) []<br />
<br />
-- returns True if (a) is a witness that odd n is composite<br />
witness :: Integer -> Integer -> Bool<br />
witness n a = pow (tail $ binary $ n-1) a<br />
  where<br />
    pow _ 1 = False<br />
    pow [] _ = True<br />
    pow xs d = pow' xs d $ (d*d) `mod` n<br />
      where<br />
        pow' _ d d2 | d2 == 1 && d /= (n-1) = True<br />
        pow' ('0':xs) d d2 = pow xs d2<br />
        pow' ('1':xs) d d2 = pow xs $ (d2*a) `mod` n<br />
<br />
-- is (n) a prime with probability 4^(-k)<br />
miller_rabin k n<br />
    | n `mod` 2 == 0 = n == 2<br />
    | otherwise = not $ any (witness n) $ takeWhile (< n) $ take k primes<br />
<br />
primes :: [Integer]<br />
primes = 2:3:5: [ x | x <- candidates 7 11, isPrime x ]<br />
  where<br />
    candidates a1 a2 = a1 : a2 : candidates (a1+6) (a2+6)<br />
<br />
-- simple primality test applied for indices only, not for Fibonacci numbers <br />
isPrime x = all ((0 /=) . (x `mod`)) $ takeWhile ((x>=).(^2)) primes<br />
<br />
fib = 1 : 1 : [ a+b | (a,b) <- zip fib (tail fib) ]<br />
<br />
-- indexed Fibonacci numbers<br />
numfib = zip [1..] fib<br />
<br />
isPrimeOr4 x = x /= 1 && x /=2 && (x == 4 || isPrime x)<br />
<br />
-- Fibonacci numbers with primal indices<br />
numfib' = filter (isPrimeOr4 . fst) numfib<br />
<br />
isProbablyPrime = miller_rabin 10<br />
<br />
maybeFibprimes = [if isProbablyPrime f then Just (n,f) else Nothing | (n,f) <- numfib' ]<br />
<br />
-- probably you need to increase 10 if running the program<br />
-- in more than two threads<br />
fibprimes = catMaybes $ parBuffer 10 rnf maybeFibprimes<br />
<br />
main = do<br />
    start <- getPOSIXTime<br />
    printEach fibprimes start 1<br />
  where printEach (x:xs) start n = do<br />
          t <- getPOSIXTime<br />
          print (t - start, n, fst x)<br />
          printEach xs start (n+1)<br />
</haskell><br />
<br />
In order to gain parallel benefits compile the program using<br />
<haskell><br />
ghc --make -threaded parfibs.hs<br />
</haskell><br />
then run by<br />
<haskell><br />
parfibs +RTS -N2<br />
</haskell><br />
<br />
For each found Fibonacci prime the program prints the time spent from the start. The following output was received on dual-core Athlon64 x2 4600+ microprocessor.<br />
<haskell><br />
(0s,1,3)<br />
(0s,2,4)<br />
(0s,3,5)<br />
(0s,4,7)<br />
(0s,5,11)<br />
(0s,6,13)<br />
(0s,7,17)<br />
(0s,8,23)<br />
(0s,9,29)<br />
(0s,10,43)<br />
(0s,11,47)<br />
(0s,12,83)<br />
(0.015625s,13,131)<br />
(0.015625s,14,137)<br />
(0.03125s,15,359)<br />
(0.046875s,16,431)<br />
(0.046875s,17,433)<br />
(0.046875s,18,449)<br />
(0.0625s,19,509)<br />
(0.078125s,20,569)<br />
(0.078125s,21,571)<br />
(2.546875s,22,2971)<br />
(10.890625s,23,4723)<br />
(16.859375s,24,5387)<br />
(104.390625s,25,9311)<br />
(120.546875s,26,9677)<br />
(464.359375s,27,14431)<br />
(3368.578125s,28,25561)<br />
(6501.296875s,29,30757)<br />
(11361.453125s,30,35999)<br />
(13182.71875s,31,37511)<br />
(37853.765625s,32,50833)<br />
</haskell><br />
<br />
In about 10 hours the program found 32 Fibonacci primes from about 40 [http://mathworld.wolfram.com/FibonacciPrime.html known so far]. Of course, obtaining each next number takes more time than all preceding ones. <br />
<br />
== Optimising further ==<br />
<br />
We can improve the performance somewhat by throwing more cores at the problem and turning up the optimisation level. After replacing lazy pairs with strict ones, we compile with optimisations on for a 4 core machine:<br />
<br />
$ ghc -O2 -optc-O2 -fvia-C -threaded A.hs -no-recomp --make<br />
<br />
We get the result slightly sooner, and with better cpu utilisation,<br />
<br />
$ time ./A +RTS -N4<br />
....<br />
(69.869631s, 25, 9311)<br />
(88.357936s, 26, 9677)<br />
(297.578843s, 27, 14431)<br />
<br />
Note that this program runs at around 300% cpu utilisation, despite having 4 cores, as we're contesting for cpu time on this machine. <br />
<br />
== Improving the algorithm ==<br />
<br />
Are there other [http://home.att.net/~blair.kelly/mathematics/fibonacci/factthms.html factorization theorems] for Fibonacci primes we can exploit?<br />
<br />
[[Category:Code]]<br />
[[Category:Mathematics]]</div>Dibblegohttps://wiki.haskell.org/index.php?title=OzHaskell&diff=16225OzHaskell2007-10-21T22:29:27Z<p>Dibblego: </p>
<hr />
<div>'''There is AngloHaskell and now AmeroHaskell. Doesn't that call for OzHaskell?'''<br />
<br />
Who would be interested to have a Haskell event in Australia, possibly in Sydney? This is just a wild idea without any concrete date or format yet. Jot down any suggestions on this page.<br />
<br />
Interested Haskellers:<br />
<br />
* [[:User:Chak|Manuel Chakravarty]] (Sydney)<br />
* [[:User:TonyMorris|Tony Morris]] (Brisbane)<br />
* [[:User:Brecknell|Matthew Brecknell]] (Brisbane)<br />
* [[:User:Mark_Wassell|Mark Wassell]] - Prefer Jan/Feb option.<br />
* [[:User:Rl|Roman Leshchinskiy]]<br />
* [[:User:cbrad|Brad Clow]]<br />
* [[:User:nornagon|Jeremy Apthorp]]<br />
* [[:User:AndrewA|Andrew Appleyard]] (Sydney)<br />
* [[:User:bjpop|Bernie Pope]] (Melbourne)<br />
* [[:User:benl23|Ben Lippmeier]]<br />
* [[:User:RohanDrape|Rohan Drape]]<br />
* [[:User:ivanm|Ivan Miljenovic]] (Brisbane)<br />
* [[:User:EricWilligers|Eric Willigers]]<br />
* [[:User:TonySloane|Tony Sloane]] (Sydney)<br />
* [[:User:Bens|Ben Sinclair]] (Sydney)<br />
* [[:User:andrep|Andre Pang]]<br />
* [[:User:AndrewBromage|Andrew Bromage]] (Melbourne)<br />
* [[:User:Droberts|Dale Roberts]] (Sydney)<br />
* [[:User:GeoffWilson|Geoff Wilson]] (Melbourne)<br />
* [[:User:Saulzar|Oliver Batchelor]] (Brisbane)<br />
* [[:User:Nick|Nick Seow]] (Sydney)<br />
* [[:User:sseefried|Sean Seefried]] (Sydney)<br />
* [[:User:green_tea|Alexis Hazell]] (Melbourne)<br />
* [[:User:PhilipDerrin|Philip Derrin]] (Sydney)<br />
(Add your name!)<br />
<br />
== Possible dates ==<br />
<br />
Shall we try to organise something for sometime over the summer? Avoiding the summer holidays, either of the following two periods seem attractive:<br />
<br />
* last week of November/first week of December or<br />
* last week of January/first week of February.<br />
<br />
(Add any additional periods that you would find attractive and/or comment on suitability.)<br />
<br />
Events to avoid clashing with:<br />
<br />
* The 2007 federal election (weekend of 24-25 November).<br />
* linux.conf.au (28 Jan - 2 Feb 2008, unless it's held in conjunction).<br />
* Inevitable family Christmas parties/holiday travel rush (some weekends in December, different for everyone I suspect).<br />
<br />
== Format ==<br />
<br />
How about the following?<br />
<br />
* One day meeting with informal talks and demos (preferably on a Friday)<br />
* There could be a second, even less formal day, for those who want to hang out some more and maybe do some hacking<br />
* Run it at the University of New South Wales, Sydney<br />
<br />
(Add your thoughts to the above.)<br />
<br />
== Talks and demos ==<br />
<br />
Do you have anything you'd like to talk about or a system you'd like to demo? '''This is just a tentative list - you commit to nothing.'''<br />
<br />
=== Talk proposals ===<br />
<br />
* Manuel Chakravarty: ''Type-level Programming with Type Families''<br />
::GHC recently gained support for data families and type synonym families (which are a generalisation of our earlier proposal for associated types). In this talk, I'd give an overview over this new language feature, illustrate what it is good for, and discuss why I believe it fits Haskell better than functional dependencies.<br />
* Bernie Pope: ''The GHCi debugger''<br />
:: A new breakpoint debugger has been added to GHCi. In this talk, I'd demonstrate how to use the debugger, and also go into some detail about how it works. I might even discuss the relative advantages and disadvantages of this debugger over tools such as Hat.<br />
<br />
=== Demo proposals ===</div>Dibblegohttps://wiki.haskell.org/index.php?title=OzHaskell&diff=15615OzHaskell2007-09-17T03:33:29Z<p>Dibblego: </p>
<hr />
<div>'''There is AngloHaskell and now AmeroHaskell. Doesn't that call for OzHaskell?'''<br />
<br />
Who would be interested to have a Haskell event in Australia, possibly in Sydney? This is just a wild idea without any concrete date or format yet. Jot down any suggestions on this page.<br />
<br />
Interested Haskellers:<br />
<br />
* [[:User:Chak|Manuel Chakravarty]]<br />
* [[:User:TonyMorris|Tony Morris]]</div>Dibblegohttps://wiki.haskell.org/index.php?title=User:TonyMorris&diff=15614User:TonyMorris2007-09-17T03:32:06Z<p>Dibblego: </p>
<hr />
<div>I live in Brisbane, Australia. I use the nick 'dibblego' on IRC.<br />
<br />
[http://blog.tmorris.net/ Here is my ranting web page.]</div>Dibblegohttps://wiki.haskell.org/index.php?title=Tutorials/Programming_Haskell/String_IO&diff=9780Tutorials/Programming Haskell/String IO2006-12-27T22:47:27Z<p>Dibblego: </p>
<hr />
<div>This is part two in a series of tutorials on programming Haskell. You<br />
can get up to speed by reading [http://cgi.cse.unsw.edu.au/~dons/blog/2006/12/16#programming-haskell-intro yesterday's introductory article]. <br />
<br />
Today we'll look more into the basic tools at our disposal in the<br />
[[Haskell]] language, in particular, operations for doing IO and playing<br />
with files and strings.<br />
<br />
==Administrivia==<br />
<br />
Before we get started, I should clarify a small point raised by <br />
[http://cgi.cse.unsw.edu.au/~dons/blog/2006/12/16#programming-haskell-intro yesterday's article]. <br />
One issue I forgot to mention was that there are slight differences<br />
between running Haskell in ghci, the bytecode interpreter, and compiling<br />
it to native code with GHC.<br />
<br />
Haskell programs are executed by evaluating the special 'main' function.<br />
<br />
<haskell><br />
import Data.List<br />
<br />
mylength = foldr (const (+1)) 0<br />
main = print (mylength "haskell")<br />
</haskell><br />
<br />
To compile this to native code, we would feed the source file to the compiler:<br />
<br />
$ ghc A.hs<br />
$ ./a.out<br />
7<br />
<br />
For a faster turnaround, we can run the code directly through<br />
the bytecode interpreter, GHCi, using the 'runhaskell' program:<br />
<br />
$ runhaskell A.hs<br />
7<br />
<br />
GHCi, the interactive Haskell environment, is a little bit different.<br />
As it is an interactive system, GHCi must execute your code<br />
sequentially, as you define each line. This is different to normal<br />
Haskell, where the order of definition is irrelevant. GHCi effectively<br />
executes your code inside a <i>do</i>-block. Therefore you can use the<br />
<i>do</i>-notation at the GHCi prompt to define new functions:<br />
<br />
$ ghci<br />
Prelude> :m + Data.List<br />
<br />
Prelude> let mylength = foldr (const (+1)) 0<br />
<br />
Prelude> :t mylength<br />
mylength :: [a] -> Integer<br />
<br />
Prelude> mylength "haskell"<br />
7<br />
<br />
For this tutorial I will be developing code in a source file, and either<br />
compiling it as above, or loading the source file into GHCi for testing.<br />
To load a source file into GHCi, we do:<br />
<br />
$ ghci<br />
Prelude> :load A.hs<br />
<br />
*Main> :t main<br />
main :: IO ()<br />
<br />
*Main> :t mylength<br />
mylength :: [a] -> Integer<br />
<br />
*Main> mylength "foo"<br />
3<br />
<br />
*Main> main<br />
7<br />
<br />
Now, let's get into the code!<br />
<br />
==IO==<br />
<br />
As the Camel Book says:<br />
<br />
<blockquote><br />
Unless you're using artificial intelligence to model a solipsistic<br />
philosopher, your program needs some way to communicate with the<br />
outside world.<br />
</blockquote><br />
<br />
<br />
In yesterday's tutorial, I briefly introduced 'readFile', for reading a<br />
String from a file on disk. Let's consider now IO in more detail.<br />
The most common IO operations are defined in the [http://haskell.org/ghc/docs/latest/html/libraries/base/System-IO.html System.IO] library.<br />
<br />
For the most basic stdin/stdout Unix-style programs in Haskell, we can<br />
use the 'interact' function:<br />
<br />
<haskell><br />
interact :: (String -> String) -> IO ()<br />
</haskell><br />
<br />
This <i>higher order</i> function takes, as an argument, some function for<br />
processing a string (of type String -> String). It runs this function<br />
over the standard input stream, printing the result to standard output.<br />
A surprisingly large number of useful programs can be written this way.<br />
For example, we can write the 'cat' unix program as:<br />
<br />
<haskell><br />
main = interact id<br />
</haskell><br />
<br />
Yes, that's it! Let's compile and run this program:<br />
<br />
$ ghc -O A.hs<br />
<br />
$ cat A.hs | ./a.out<br />
main = interact id<br />
<br />
How does this work? Firstly, 'interact' is defined as:<br />
<br />
<haskell><br />
interact f = do s <- getContents<br />
                putStr (f s)<br />
</haskell><br />
<br />
So it reads a string from standard input, and writes to standard output<br />
the result of applying its argument function to that string. The 'id'<br />
function itself has the type:<br />
<br />
<haskell><br />
id :: a -> a<br />
</haskell><br />
<br />
'id' is a function of one argument, of any type (the lowercase 'a' in<br />
the type means any type can be used in that position, i.e. it is a<br />
polymorphic function (also called a generic function in some<br />
languages)). 'id' takes a value of some type 'a', and returns a value of<br />
the same type. There is only one (non-trivial) function of this type:<br />
<br />
<haskell><br />
id a = a<br />
</haskell><br />
<br />
So 'interact id' will print the input string to standard output<br />
unmodified. <br />
<br />
Let's now write the 'wc' program:<br />
<br />
<haskell><br />
main = interact count<br />
count s = show (length s) ++ "\n"<br />
</haskell><br />
<br />
This will print the length of the input string, that is, the number of<br />
chars:<br />
<br />
$ runhaskell A.hs < A.hs<br />
57<br />
<br />
==Line oriented IO==<br />
<br />
Only a small number of programs operate on unstructured input streams.<br />
It is far more common to treat an input stream as a list of lines. So<br />
let's do that. To break a string up into lines, we'll use the ...<br />
'lines' function, defined in the [http://haskell.org/ghc/docs/latest/html/libraries/base/Data-List.html Data.List] library:<br />
<br />
<haskell><br />
lines :: String -> [String]<br />
</haskell><br />
<br />
The type, once again, tells the story. 'lines' takes a String, and<br />
breaks it up into a list of strings, splitting on newlines.<br />
To join a list of strings back into a single string, inserting newlines,<br />
we'd use the ... 'unlines' function:<br />
<br />
<haskell><br />
unlines :: [String] -> String<br />
</haskell><br />
<br />
There are also similar functions for splitting on words, namely 'words'<br />
and 'unwords'. Now, an example. To count the number of lines in a file:<br />
<br />
<haskell><br />
main = interact (count . lines)<br />
</haskell><br />
<br />
We can run this as:<br />
<br />
$ ghc -O A.hs<br />
<br />
$ ./a.out < A.hs<br />
3<br />
<br />
Here we reuse the 'count' function from above, by <i>composing</i> it<br />
with the lines function. <br />
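<br />
Counting words works the same way, with 'words' in place of 'lines'. A minimal sketch, reusing 'count' from above (the type signatures are additions for clarity, not part of the original program):<br />
<br />
<haskell><br />
-- Count the words on standard input.<br />
main :: IO ()<br />
main = interact (count . words)<br />
<br />
-- 'count' as defined earlier; it does not care what the list elements are.<br />
count :: [a] -> String<br />
count s = show (length s) ++ "\n"<br />
</haskell><br />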
<br />
===On composition===<br />
<br />
<br />
This nice code reuse via composition is achieved using the (.) function,<br />
pronounced 'compose'. Let's look at how that works. (Feel free to skip<br />
this section, if you want to just get things done).<br />
<br />
<br />
The (.) function is just a normal everyday Haskell function, defined as:<br />
<br />
<haskell><br />
(.) f g x = f (g x)<br />
</haskell><br />
<br />
This looks a little like magic (or line noise), but it's pretty easy. The<br />
(.) function simply takes <i>two functions</i> as arguments, along with<br />
another value. It applies the 'g' function to the value 'x', and then<br />
applies 'f' to the result, returning this final value. The functions may<br />
be of any type. The type of (.) is actually:<br />
<br />
<haskell><br />
(.) :: (b -> c) -> (a -> b) -> a -> c<br />
</haskell><br />
<br />
which might look a bit hairy, but it essentially specifies what types of<br />
arguments make sense to compose. That is, only those where:<br />
<br />
<haskell><br />
f :: b -> c<br />
g :: a -> b<br />
x :: a<br />
</haskell><br />
<br />
can be composed, yielding a new function of type: <br />
<br />
<haskell><br />
(f . g) :: a -> c<br />
</haskell><br />
<br />
The nice thing is that this composition makes sense (and works) <i>for all types a, b and<br />
c</i>.<br />
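<br />
To see the types line up in a concrete case, here is a small sketch. The names 'countLines' and 'report' are made up for this illustration:<br />
<br />
<haskell><br />
-- lines  :: String -> [String]    (so a = String, b = [String])<br />
-- length :: [String] -> Int       (so b = [String], c = Int)<br />
countLines :: String -> Int<br />
countLines = length . lines<br />
<br />
-- Compositions chain: 'show' turns the Int into a String for printing.<br />
report :: String -> String<br />
report = show . countLines<br />
<br />
main :: IO ()<br />
main = putStrLn (report "one\ntwo\nthree\n")<br />
</haskell><br />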
<br />
<br />
How does this relate to code reuse? Well, since our 'count' function is<br />
<i>polymorphic</i>, it works equally well counting the length of a<br />
string, or the length of a list of strings. Our little 'wc' program is<br />
the epitome of the phrase: <i>"higher order + polymorphic =<br />
reusable"</i>. That is, functions which take other functions as<br />
arguments, when combined with functions that work over any type, produce<br />
great reusable 'glue'. You only need vary the argument function to gain<br />
terrific code reuse (and the strong type checking ensures you can only<br />
reuse code in ways that work).<br />
<br />
===More on lines===<br />
<br />
<br />
Another little example, let's reverse each line of a file (like the unix<br />
'rev' command):<br />
<br />
<haskell><br />
main = interact (unlines . map reverse . lines)<br />
</haskell><br />
<br />
Which when run, reverses the input lines:<br />
<br />
$ ./a.out < B.hs<br />
rahC.ataD tropmi<br />
ebyaM.ataD tropmi<br />
tsiL.ataD tropmi<br />
<br />
So we take the input string, split it into lines, and then loop over that<br />
list of lines, reversing each of them, using the 'map' function.<br />
Finally, once we've reversed each line, we join them back into a single<br />
string with unlines, and print it out.<br />
<br />
<br />
The 'map' function is a fundamental control structure of functional<br />
programming, similar to the 'foreach' keyword in a number of imperative<br />
languages. 'map' however is just a function on lists, not built in<br />
syntax, and has the type:<br />
<br />
<haskell><br />
map :: (a -> b) -> [a] -> [b]<br />
</haskell><br />
<br />
That is, it takes some function, and a list, and applies that function<br />
to each element of the list, returning a new list as a result. Since<br />
loops are so common in programming, we'll be using 'map' a lot.<br />
Just for reference, 'map' is implemented as:<br />
<br />
<haskell><br />
map _ [] = []<br />
map f (x:xs) = f x : map f xs<br />
</haskell><br />
<br />
==File IO==<br />
<br />
Operating on stdin/stdout is good for scripts (and this is how tools<br />
like sed or perl -p work), but for 'real' programs we'll at least need<br />
to do some file IO. The basic operations on files are:<br />
<br />
<haskell><br />
readFile :: FilePath -> IO String<br />
writeFile :: FilePath -> String -> IO ()<br />
</haskell><br />
<br />
'readFile' takes a file name as an argument, does some IO, and returns the<br />
file's contents as a string. 'writeFile' takes a file name, a string,<br />
and does some IO (writing that string to the file), before returning the<br />
void (or unit) value, ().<br />
<br />
<br />
We could implement a 'cp' program on files, as:<br />
<br />
<haskell><br />
import System.Environment<br />
<br />
main = do<br />
    [f,g] <- getArgs<br />
    s <- readFile f<br />
    writeFile g s<br />
</haskell><br />
<br />
Running this program:<br />
<br />
$ ghc -O A.hs<br />
<br />
$ ./a.out A.hs Z.hs<br />
<br />
$ cat Z.hs<br />
import System.Environment<br />
<br />
main = do<br />
    [f,g] <- getArgs<br />
    s <- readFile f<br />
    writeFile g s<br />
<br />
Since we're doing IO (the type of readFile and writeFile enforce this),<br />
the code runs inside a do-block, using the IO <i>monad</i>. "Using the<br />
IO monad" just means that we wish to use an imperative, sequential order<br />
of evaluation. (As an aside, a wide range of other monads exist, for<br />
programming different program evaluation strategies, such as<br />
Prolog-style backtracking, or continuation-based evaluation. All of<br />
imperative programming is just one subset of possible evaluation<br />
strategies you can use in Haskell, via monads).<br />
<br />
<br />
In <i>do</i>-notation, whenever you wish to run an action, for its side<br />
effect, and save the result to a variable, you would write:<br />
<br />
<haskell><br />
v <- action<br />
</haskell><br />
<br />
For example, to run the 'readFile' action, which has the side effect of<br />
reading a file from disk, we say:<br />
<br />
<haskell><br />
s <- readFile f<br />
</haskell><br />
<br />
Finally, we can use the 'appendFile' function to append to an existing<br />
file.<br />
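<br />
As a rough sketch of 'appendFile' in use, the following program adds each of its remaining arguments as a new line at the end of the file named by the first argument. The helper name 'format' and the usage message are invented for this example:<br />
<br />
<haskell><br />
import System.Environment (getArgs)<br />
<br />
-- Turn each note into a line of its own.<br />
format :: [String] -> String<br />
format = unlines<br />
<br />
main :: IO ()<br />
main = do<br />
    args <- getArgs<br />
    case args of<br />
        (file:notes) -> appendFile file (format notes)<br />
        []           -> putStrLn "usage: note FILE [TEXT...]"<br />
</haskell><br />
<br />
Unlike 'writeFile', running this twice keeps the earlier contents of the file and adds to them.<br />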
<br />
<br />
==File Handles==<br />
<br />
<br />
The most generic interface to files is provided via Handles. Sometimes<br />
we need to keep a file open, for multiple reads or writes. To do this we<br />
use Handles, an abstraction much like the underlying system's file<br />
handles.<br />
<br />
<br />
To open up a file, and get its Handle, we use:<br />
<br />
<haskell><br />
openFile :: FilePath -> IOMode -> IO Handle<br />
</haskell><br />
<br />
So to open a file for reading only, in GHCi:<br />
<br />
<haskell><br />
Prelude System.IO> h <- openFile "A.hs" ReadMode<br />
{handle: A.hs}<br />
</haskell><br />
<br />
Which returns a Handle onto the file "A.hs". We can read a line from this handle:<br />
<br />
<haskell><br />
Prelude System.IO> hGetLine h<br />
"main = do"<br />
</haskell><br />
<br />
To close a Handle, and flush the buffer:<br />
<br />
<haskell><br />
hClose :: Handle -> IO ()<br />
</haskell><br />
<br />
Once a Handle is closed, we can no longer read from it:<br />
<br />
<haskell><br />
Prelude System.IO> hClose h<br />
Prelude System.IO> hGetLine h<br />
*** Exception: A.hs: hGetLine: illegal operation (handle is closed)<br />
</haskell><br />
<br />
We can also flush explicitly with:<br />
<br />
<haskell><br />
hFlush :: Handle -> IO ()<br />
</haskell><br />
<br />
Other useful operations for reading from Handles:<br />
<br />
<haskell><br />
hGetChar :: Handle -> IO Char<br />
hGetLine :: Handle -> IO [Char]<br />
hGetContents :: Handle -> IO [Char]<br />
</haskell><br />
<br />
To write to Handles:<br />
<br />
<haskell><br />
hPutChar :: Handle -> Char -> IO ()<br />
hPutStr :: Handle -> [Char] -> IO ()<br />
hPutStrLn :: Handle -> [Char] -> IO ()<br />
hPrint :: Show a => Handle -> a -> IO ()<br />
</haskell><br />
<br />
Some other useful actions:<br />
<br />
<haskell><br />
hSeek :: Handle -> SeekMode -> Integer -> IO ()<br />
hTell :: Handle -> IO Integer<br />
hFileSize :: Handle -> IO Integer<br />
hIsEOF :: Handle -> IO Bool<br />
</haskell><br />
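<br />
To give a feel for how these fit together, here is a small sketch that jumps to the end of a file and reads its last character. It assumes "A.hs" exists, is non-empty, and contains plain ASCII text:<br />
<br />
<haskell><br />
import System.IO<br />
<br />
-- Report the size of a file, then read its final character.<br />
-- Assumption: "A.hs" exists, is non-empty and is plain ASCII.<br />
main :: IO ()<br />
main = do<br />
    h    <- openFile "A.hs" ReadMode<br />
    size <- hFileSize h<br />
    hSeek h AbsoluteSeek (size - 1)   -- jump to the last byte<br />
    pos  <- hTell h                   -- where are we now?<br />
    c    <- hGetChar h<br />
    eof  <- hIsEOF h                  -- nothing left to read after the last byte<br />
    hClose h<br />
    print (size, pos, c, eof)<br />
</haskell><br />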
<br />
==An example: spell checking==<br />
<br />
<br />
Here is a small example of combining the [http://haskell.org/ghc/docs/latest/html/libraries/base/Data-Set.html Data.Set]<br />
and List data structures from yesterday's tutorial, with more IO<br />
operations. We'll implement a little spell checker, building the<br />
dictionary in a Set data type. First, some libraries to import:<br />
<br />
<haskell><br />
import System.Environment<br />
import Control.Monad<br />
import Data.Set<br />
</haskell><br />
<br />
And the complete program:<br />
<br />
<haskell><br />
main = do<br />
    [s] <- getArgs<br />
    f   <- readFile "/usr/share/dict/words"<br />
    g   <- readFile s<br />
    let dict = fromList (lines f)<br />
    mapM_ (spell dict) (words g)<br />
<br />
spell d w = when (w `notMember` d) (putStrLn w)<br />
</haskell><br />
<br />
Running this program on a Haskell source file, it reports that the following<br />
words are not found in the dictionary:<br />
<br />
$ ghc -O Spell.hs -o spell<br />
<br />
$ ./spell A.hs<br />
Data.Char<br />
=<br />
<-<br />
(map<br />
toUpper<br />
n)<br />
=<br />
<-<br />
getLine<br />
1<br />
<br />
<br />
===Writing the results out===<br />
<br />
<br />
If we want to write the results out to a temporary file, we can do<br />
so. Let's import a few other modules:<br />
<br />
<haskell><br />
import Data.Set<br />
import Data.Maybe<br />
import Text.Printf<br />
import System.IO<br />
import System.Environment<br />
import System.Posix.Temp<br />
</haskell><br />
<br />
Refactoring the main code to separate out the reading and writing phases<br />
into their own functions, we end up with the core code:<br />
<br />
<haskell><br />
main = do<br />
    (f, g) <- readFiles<br />
    let dict = fromList (lines f)<br />
        errs = mapMaybe (spell dict) (words g)<br />
    write errs<br />
<br />
spell d w | w `notMember` d = Just w<br />
          | otherwise       = Nothing<br />
</haskell><br />
<br />
Where reading is now its own function:<br />
<br />
<haskell><br />
readFiles = do<br />
    [s] <- getArgs<br />
    f   <- readFile "/usr/share/dict/words"<br />
    g   <- readFile s<br />
    return (f,g)<br />
</haskell><br />
<br />
And writing errors out to their own file:<br />
<br />
<haskell><br />
write errs = do<br />
    (t,h) <- mkstemp "/tmp/spell.XXXXXX"<br />
    mapM_ (hPutStrLn h) errs<br />
    hClose h<br />
    printf "%d spelling errors written to '%s'\n" (length errs) t<br />
</haskell><br />
<br />
Pretty simple! Running this program:<br />
<br />
$ ghc --make -O Spell.hs -o myspell<br />
[1 of 1] Compiling Main ( Spell.hs, Spell.o )<br />
Linking myspell ...<br />
<br />
$ ./myspell Spell.hs<br />
67 spelling errors written to '/tmp/spell.ia8256'<br />
<br />
==Extension: using SMP parallelism==<br />
<br />
<br />
Finally, just for some bonus fun ... and hold on to your hat 'cause I'm<br />
going to go fast ... we'll add some parallelism to the mix.<br />
<br />
<br />
Haskell was designed from the start to support easy parallelisation, and<br />
since GHC 6.6, multithreaded code will run transparently on multicore<br />
systems using as many cores as you specify. Let's look at how we'd<br />
parallelise our little program to exploit multiple cores. We'll use an<br />
explicit threading model, via [http://haskell.org/ghc/docs/latest/html/libraries/base/Control-Concurrent.html Control.Concurrent]. You can also make your code implicitly<br />
parallel, using [http://haskell.org/ghc/docs/latest/html/libraries/base/Control-Parallel-Strategies.html Control.Parallel.Strategies],<br />
but we'll leave that for another tutorial.<br />
<br />
<br />
Here's the source, for you to ponder. First some imports:<br />
<br />
<haskell><br />
import Data.Set hiding (map)<br />
import Data.Maybe<br />
import Data.Char<br />
import Text.Printf<br />
import System.IO<br />
import System.Environment<br />
import Control.Concurrent<br />
import Control.Monad<br />
</haskell><br />
<br />
The entry point is modified to break the word list into chunks, and then<br />
dispatch a chunk to each thread:<br />
<br />
<haskell><br />
main = do<br />
    (f, g, n) <- readFiles<br />
    let dict = fromList (lines f)<br />
        work = chunk n (words g)<br />
    run n dict work<br />
</haskell><br />
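<br />
The 'chunk' function isn't shown in this tutorial. One possible definition, purely as a sketch of my own, splits the list into exactly 'n' pieces of roughly equal size, padding with empty pieces when there are fewer words than threads (so that every thread still gets something to report on):<br />
<br />
<haskell><br />
-- Split a list into exactly n pieces of roughly equal size.<br />
-- NOTE: this definition is an assumption, not the tutorial's own code.<br />
chunk :: Int -> [a] -> [[a]]<br />
chunk n xs = take n (go xs ++ repeat [])<br />
  where<br />
    size  = max 1 ((length xs + n - 1) `div` n)   -- ceiling of length / n<br />
    go [] = []<br />
    go ys = let (a, b) = splitAt size ys in a : go b<br />
</haskell><br />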
<br />
The 'run' function sets up a channel between the main thread and all<br />
child threads ('errs'), and prints spelling errors as they arrive on<br />
the channel from the children. It forks off 'n' child threads, one for<br />
each piece of the work list:<br />
<br />
<haskell><br />
run n dict work = do<br />
    chan <- newChan<br />
    errs <- getChanContents chan    -- errors returned back to main thread<br />
    mapM_ (forkIO . thread chan dict) (zip [1..n] work)<br />
    wait n errs 0<br />
</haskell><br />
<br />
<br />
The main thread then just waits on all the threads to finish, printing<br />
any spelling errors they pass up:<br />
<br />
<haskell><br />
wait n xs i = when (i < n) $ case xs of<br />
    Nothing : ys -> wait n ys $! i+1<br />
    Just s  : ys -> putStrLn s >> wait n ys i<br />
</haskell><br />
<br />
Each thread spell checks its own piece of the work list. If it finds a<br />
spelling error, it passes the offending word back over the channel to<br />
the main thread.<br />
<br />
<haskell><br />
thread chan dict (me, xs) = do<br />
    mapM_ spellit xs<br />
    writeChan chan Nothing<br />
<br />
  where<br />
    spellit w = when (spell dict w) $<br />
        writeChan chan . Just $ printf "Thread %d: %-25s" (me::Int) w<br />
</haskell><br />
<br />
The 'spell' function is simplified a little:<br />
<br />
<haskell><br />
spell d w = w `notMember` d<br />
</haskell><br />
<br />
which we could also write as:<br />
<br />
<haskell><br />
spell = flip notMember<br />
</haskell><br />
<br />
We modify the readFiles phase to take an additional numeric command line<br />
argument, specifying the number of threads to run:<br />
<br />
<haskell><br />
readFiles = do<br />
    [s,n] <- getArgs<br />
    f     <- readFile "/usr/share/dict/words"<br />
    g     <- readFile s<br />
    return (f,g, read n)<br />
</haskell><br />
<br />
We compile this with the GHC SMP parallel runtime system:<br />
<br />
$ ghc -O --make -threaded Spell.hs -o spell<br />
<br />
Now, we can run 'n' worker threads (lightweight Haskell threads), mapped<br />
onto 'm' OS threads. Since I'm using a 4 core linux server, we'll play<br />
around with 4 OS threads. First, running everything in a single thread:<br />
<br />
$ time ./spell test.txt 1 +RTS -N1<br />
...<br />
Thread 1: week:<br />
Thread 1: IO!<br />
./spell test.txt 1 +RTS -N1 99% cpu 2.533 total<br />
<br />
Ok, now we change the command line flag to run it with 4 OS threads, to<br />
try to utilise all 4 cores:<br />
<br />
$ time ./spell test.txt 4 +RTS -N4<br />
...<br />
Thread 2: week:<br />
Thread 3: IO!<br />
./spell test.txt 4 +RTS -N4 111% cpu 2.335 total<br />
<br />
Ok. Good... A little bit faster, uses a little bit more cpu. It turns<br />
out however the program is bound currently by the time spent in the main<br />
thread building the initial dictionary. Actual searching time is only<br />
some 10% of the running time. Nonetheless, it was fairly painless to<br />
break up the initial simple program into a parallel version.<br />
<br />
<br />
If the program's running time were extended (as is the case for a server), the<br />
parallelism would be a win. Additionally, should we buy more cores for<br />
the server, all we need to do is change the +RTS -N argument to the<br />
program, to start utilising these extra cores.<br />
<br />
==Next week==<br />
<br />
<br />
In upcoming tutorials we'll look more into implicitly parallel programs,<br />
and the use of the new high performance ByteString data type for string<br />
processing.</div>Dibblegohttps://wiki.haskell.org/index.php?title=Tutorials/Programming_Haskell/String_IO&diff=9779Tutorials/Programming Haskell/String IO2006-12-27T22:36:06Z<p>Dibblego: </p>
<hr />
<div>This is part two in a series of tutorials on programming Haskell. You<br />
can get up to speed by reading [http://cgi.cse.unsw.edu.au/~dons/blog/2006/12/16#programming-haskell-intro yesterday's introductory article]. <br />
<br />
Today we'll look more into the basic tools at our disposal in the<br />
[[Haskell]] language, in particular, operations for doing IO and playing<br />
with files and strings.<br />
<br />
==Administrivia==<br />
<br />
Before we get started, I should clarify a small point raised by <br />
[http://cgi.cse.unsw.edu.au/~dons/blog/2006/12/16#programming-haskell-intro yesterday's article]. <br />
One issue I forgot to mention was that there are slight differences<br />
between running Haskell in ghci, the bytecode interpreter, and compiling<br />
it to native code with GHC.<br />
<br />
Haskell programs are executed by evaluating the special 'main' function.<br />
<br />
<haskell><br />
import Data.List<br />
<br />
mylength = foldr (const (+1)) 0<br />
main = print (mylength "haskell")<br />
</haskell><br />
<br />
To compile this to native code, we would feed the source file to the compiler:<br />
<br />
$ ghc A.hs<br />
$ ./a.out<br />
7<br />
<br />
For a faster turnaround, we can run the code directly through<br />
the bytecode interpreter, GHCi, using the 'runhaskell' program:<br />
<br />
$ runhaskell A.hs<br />
7<br />
<br />
GHCi, the interactive Haskell environment, is a little bit different.<br />
As it is an interactive system, GHCi must execute your code<br />
sequentially, as you define each line. This is different to normal<br />
Haskell, where the order of definition is irrelevant. GHCi effectively<br />
executes your code inside a <i>do</i>-block. Therefore you can use the<br />
<i>do</i>-notation at the GHCi prompt to define new functions:<br />
<br />
$ ghci<br />
Prelude> :m + Data.List<br />
<br />
Prelude> let mylength = foldr (const (+1)) 0<br />
<br />
Prelude> :t mylength<br />
mylength :: [a] -> Integer<br />
<br />
Prelude> mylength "haskell"<br />
7<br />
<br />
For this tutorial I will be developing code in a source file, and either<br />
compiling it as above, or loading the source file into GHCi for testing.<br />
To load a source file into GHCi, we do:<br />
<br />
$ ghci<br />
Prelude> :load A.hs<br />
<br />
*Main> :t main<br />
main :: IO ()<br />
<br />
*Main> :t mylength<br />
mylength :: [a] -> Integer<br />
<br />
*Main> mylength "foo"<br />
3<br />
<br />
*Main> main<br />
7<br />
<br />
Now, let's get into the code!<br />
<br />
==IO==<br />
<br />
As the Camel Book says:<br />
<br />
<blockquote><br />
Unless you're using artificial intelligence to model a solipsistic<br />
philosopher, your program needs some way to communicate with the<br />
outside world.<br />
</blockquote><br />
<br />
<br />
In yesterday's tutorial, I briefly introduced 'readFile', for reading a<br />
String from a file on disk. Let's consider now IO in more detail.<br />
The most common IO operations are defined in the [http://haskell.org/ghc/docs/latest/html/libraries/base/System-IO.html System.IO] library.<br />
<br />
For the most basic stdin/stdout Unix-style programs in Haskell, we can<br />
use the 'interact' function:<br />
<br />
<haskell><br />
interact :: (String -> String) -> IO ()<br />
</haskell><br />
<br />
This <i>higher order</i> function takes, as an argument, some function for<br />
processing a string (of type String -> String). It runs this function<br />
over the standard input stream, printing the result to standard output.<br />
A surprisingly large number of useful programs can be written this way.<br />
For example, we can write the 'cat' unix program as:<br />
<br />
<haskell><br />
main = interact id<br />
</haskell><br />
<br />
Yes, that's it! Let's compile and run this program:<br />
<br />
$ ghc -O A.hs<br />
<br />
$ cat A.hs | ./a.out<br />
main = interact id<br />
<br />
How does this work? Firstly, 'interact' is defined as:<br />
<br />
<haskell><br />
interact f = do s <- getContents<br />
                putStr (f s)<br />
</haskell><br />
<br />
So it reads a string from standard input, and writes to standard output<br />
the result of applying its argument function to that string. The 'id'<br />
function itself has the type:<br />
<br />
<haskell><br />
id :: a -> a<br />
</haskell><br />
<br />
'id' is a function of one argument, of any type (the lowercase 'a' in<br />
the type means any type can be used in that position, i.e. it is a<br />
polymorphic function (also called a generic function in some<br />
languages)). 'id' takes a value of some type 'a', and returns a value of<br />
the same type. There is only one (non-trivial) function of this type:<br />
<br />
<haskell><br />
id a = a<br />
</haskell><br />
<br />
So 'interact id' will print the input string to standard output<br />
unmodified. <br />
<br />
Let's now write the 'wc' program:<br />
<br />
<haskell><br />
main = interact count<br />
count s = show (length s) ++ "\n"<br />
</haskell><br />
<br />
This will print the length of the input string, that is, the number of<br />
chars:<br />
<br />
$ runhaskell A.hs < A.hs<br />
57<br />
<br />
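Any other function of type String -> String slots into 'interact' in just the same way. As a quick sketch (the name 'shout' is my own invention), here is a filter that upper-cases its input:<br />
<br />
<haskell><br />
import Data.Char (toUpper)<br />
<br />
-- Upper-case every character of the input.<br />
-- 'shout' is a name made up for this example.<br />
shout :: String -> String<br />
shout = map toUpper<br />
<br />
main :: IO ()<br />
main = interact shout<br />
</haskell><br />
<br />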
==Line oriented IO==<br />
<br />
Only a small number of programs operate on unstructured input streams.<br />
It is far more common to treat an input stream as a list of lines. So<br />
let's do that. To break a string up into lines, we'll use the ...<br />
'lines' function, defined in the [http://haskell.org/ghc/docs/latest/html/libraries/base/Data-List.html Data.List] library:<br />
<br />
<haskell><br />
lines :: String -> [String]<br />
</haskell><br />
<br />
The type, once again, tells the story. 'lines' takes a String, and<br />
breaks it up into a list of strings, splitting on newlines.<br />
To join a list of strings back into a single string, inserting newlines,<br />
we'd use the ... 'unlines' function:<br />
<br />
<haskell><br />
unlines :: [String] -> String<br />
</haskell><br />
<br />
There are also similar functions for splitting on words, namely 'words'<br />
and 'unwords'. Now, an example. To count the number of lines in a file:<br />
<br />
<haskell><br />
main = interact (count . lines)<br />
</haskell><br />
<br />
We can run this as:<br />
<br />
$ ghc -O A.hs<br />
<br />
$ ./a.out < A.hs<br />
3<br />
<br />
Here we reuse the 'count' function from above, by <i>composing</i> it<br />
with the lines function. <br />
<br />
===On composition===<br />
<br />
<br />
This nice code reuse via composition is achieved using the (.) function,<br />
pronounced 'compose'. Let's look at how that works. (Feel free to skip<br />
this section, if you want to just get things done).<br />
<br />
<br />
The (.) function is just a normal everyday Haskell function, defined as:<br />
<br />
<haskell><br />
(.) f g x = f (g x)<br />
</haskell><br />
<br />
This looks a little like magic (or line noise), but it's pretty easy. The<br />
(.) function simply takes <i>two functions</i> as arguments, along with<br />
another value. It applies the 'g' function to the value 'x', and then<br />
applies 'f' to the result, returning this final value. The functions may<br />
be of any type. The type of (.) is actually:<br />
<br />
<haskell><br />
(.) :: (b -> c) -> (a -> b) -> a -> c<br />
</haskell><br />
<br />
which might look a bit hairy, but it essentially specifies what types of<br />
arguments make sense to compose. That is, only those where:<br />
<br />
<haskell><br />
f :: b -> c<br />
g :: a -> b<br />
x :: a<br />
</haskell><br />
<br />
can be composed, yielding a new function of type: <br />
<br />
<haskell><br />
(f . g) :: a -> c<br />
</haskell><br />
<br />
The nice thing is that this composition makes sense (and works) <i>for all types a, b and<br />
c</i>.<br />
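<br />
To see the types line up, here is a small sketch reusing 'count' from above, with a type signature added and a helper name ('countLines') that I've made up for illustration:<br />
<br />
<haskell><br />
count :: [a] -> String<br />
count s = show (length s) ++ "\n"<br />
<br />
-- (count . lines) is the same function as \s -> count (lines s)<br />
countLines :: String -> String<br />
countLines = count . lines<br />
<br />
main :: IO ()<br />
main = do<br />
    putStr (count "one\ntwo\n")        -- counts characters: prints 8<br />
    putStr (countLines "one\ntwo\n")   -- counts lines: prints 2<br />
</haskell><br />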
<br />
<br />
How does this relate to code reuse? Well, since our 'count' function is<br />
<i>polymorphic</i>, it works equally well counting the length of a<br />
string, or the length of a list of strings. Our little 'wc' program is<br />
the epitome of the phrase: <i>"higher order + polymorphic =<br />
reusable"</i>. That is, functions which take other functions as<br />
arguments, when combined with functions that work over any type, produce<br />
great reusable 'glue'. You only need vary the argument function to gain<br />
terrific code reuse (and the strong type checking ensures you can only<br />
reuse code in ways that work).<br />
<br />
===More on lines===<br />
<br />
<br />
Another little example, let's reverse each line of a file (like the unix<br />
'rev' command):<br />
<br />
<haskell><br />
main = interact (unlines . map reverse . lines)<br />
</haskell><br />
<br />
Which when run, reverses the input lines:<br />
<br />
$ ./a.out < B.hs<br />
rahC.ataD tropmi<br />
ebyaM.ataD tropmi<br />
tsiL.ataD tropmi<br />
<br />
So we take the input string, split it into lines, and then loop over that<br />
list of lines, reversing each of them, using the 'map' function.<br />
Finally, once we've reversed each line, we join them back into a single<br />
string with unlines, and print it out.<br />
<br />
<br />
The 'map' function is a fundamental control structure of functional<br />
programming, similar to the 'foreach' keyword in a number of imperative<br />
languages. 'map' however is just a function on lists, not built in<br />
syntax, and has the type:<br />
<br />
<haskell><br />
map :: (a -> b) -> [a] -> [b]<br />
</haskell><br />
<br />
That is, it takes some function, and a list, and applies that function<br />
to each element of the list, returning a new list as a result. Since<br />
loops are so common in programming, we'll be using 'map' a lot.<br />
Just for reference, 'map' is implemented as:<br />
<br />
<haskell><br />
map _ [] = []<br />
map f (x:xs) = f x : map f xs<br />
</haskell><br />
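<br />
A few quick uses of 'map', as a sketch you can paste into a file and run:<br />
<br />
<haskell><br />
import Data.Char (toUpper)<br />
<br />
main :: IO ()<br />
main = do<br />
    print (map (*2) [1, 2, 3 :: Int])     -- [2,4,6]<br />
    print (map reverse ["abc", "de"])     -- ["cba","ed"]<br />
    -- map inside map: upper-case each character of each line<br />
    putStr (unlines (map (map toUpper) ["one", "two"]))<br />
</haskell><br />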
<br />
==File IO==<br />
<br />
Operating on stdin/stdout is good for scripts (and this is how tools<br />
like sed or perl -p work), but for 'real' programs we'll at least need<br />
to do some file IO. The basic operations on files are:<br />
<br />
<haskell><br />
readFile :: FilePath -> IO String<br />
writeFile :: FilePath -> String -> IO ()<br />
</haskell><br />
<br />
'readFile' takes a file name as an argument, does some IO, and returns the<br />
file's contents as a string. 'writeFile' takes a file name, a string,<br />
and does some IO (writing that string to the file), before returning the<br />
void (or unit) value, ().<br />
<br />
<br />
We could implement a 'cp' program on files, as:<br />
<br />
<haskell><br />
import System.Environment<br />
<br />
main = do<br />
    [f,g] <- getArgs<br />
    s     <- readFile f<br />
    writeFile g s<br />
</haskell><br />
<br />
Running this program:<br />
<br />
$ ghc -O A.hs<br />
<br />
$ ./a.out A.hs Z.hs<br />
<br />
$ cat Z.hs<br />
import System.Environment<br />
<br />
main = do<br />
    [f,g] <- getArgs<br />
    s     <- readFile f<br />
    writeFile g s<br />
<br />
Since we're doing IO (the types of readFile and writeFile enforce this),<br />
the code runs inside a do-block, using the IO <i>monad</i>. "Using the<br />
IO monad" just means that we wish to use an imperative, sequential order<br />
of evaluation. (As an aside, a wide range of other monads exist, for<br />
programming different program evaluation strategies, such as<br />
Prolog-style backtracking, or continuation-based evaluation. All of<br />
imperative programming is just one subset of possible evaluation<br />
strategies you can use in Haskell, via monads).<br />
<br />
<br />
In <i>do</i>-notation, whenever you wish to run an action, for its side<br />
effect, and save the result to a variable, you would write:<br />
<br />
<haskell><br />
v <- action<br />
</haskell><br />
<br />
For example, to run the 'readFile' action, which has the side effect of<br />
reading a file from disk, we say:<br />
<br />
<haskell><br />
s <- readFile f<br />
</haskell><br />
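<br />
The <i>do</i>-notation is just a convenient way of writing calls to the (>>=) operator, which feeds the result of one action into the next. As a sketch, here is the 'cp' program again with the sequencing spelled out. The usage message is my own addition, since the original simply fails on a wrong argument count:<br />
<br />
<haskell><br />
import System.Environment (getArgs)<br />
<br />
-- The 'cp' program with the sequencing written out using (>>=).<br />
-- The fallback case for a wrong number of arguments is an addition<br />
-- for this sketch.<br />
main :: IO ()<br />
main =<br />
    getArgs >>= \args -><br />
    case args of<br />
        [f, g] -> readFile f >>= \s -> writeFile g s<br />
        _      -> putStrLn "usage: cp SOURCE DEST"<br />
</haskell><br />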
<br />
Finally, we can use the 'appendFile' function to append to an existing<br />
file.<br />
<br />
<br />
==File Handles==<br />
<br />
<br />
The most generic interface to files is provided via Handles. Sometimes<br />
we need to keep a file open, for multiple reads or writes. To do this we<br />
use Handles, an abstraction much like the underlying system's file<br />
handles.<br />
<br />
<br />
To open up a file, and get its Handle, we use:<br />
<br />
<haskell><br />
openFile :: FilePath -> IOMode -> IO Handle<br />
</haskell><br />
<br />
So to open a file for reading only, in GHCi:<br />
<br />
<haskell><br />
Prelude System.IO> h <- openFile "A.hs" ReadMode<br />
{handle: A.hs}<br />
</haskell><br />
<br />
Which returns a Handle onto the file "A.hs". We can read a line from this handle:<br />
<br />
<haskell><br />
Prelude System.IO> hGetLine h<br />
"main = do"<br />
</haskell><br />
<br />
To close a Handle, and flush the buffer:<br />
<br />
<haskell><br />
hClose :: Handle -> IO ()<br />
</haskell><br />
<br />
Once a Handle is closed, we can no longer read from it:<br />
<br />
<haskell><br />
Prelude System.IO> hClose h<br />
Prelude System.IO> hGetLine h<br />
*** Exception: A.hs: hGetLine: illegal operation (handle is closed)<br />
</haskell><br />
<br />
We can also flush explicitly with:<br />
<br />
<haskell><br />
hFlush :: Handle -> IO ()<br />
</haskell><br />
<br />
Other useful operations for reading from Handles:<br />
<br />
<haskell><br />
hGetChar :: Handle -> IO Char<br />
hGetLine :: Handle -> IO [Char]<br />
hGetContents :: Handle -> IO [Char]<br />
</haskell><br />
<br />
To write to Handles:<br />
<br />
<haskell><br />
hPutChar :: Handle -> Char -> IO ()<br />
hPutStr :: Handle -> [Char] -> IO ()<br />
hPutStrLn :: Handle -> [Char] -> IO ()<br />
hPrint :: Show a => Handle -> a -> IO ()<br />
</haskell><br />
<br />
Some other useful actions:<br />
<br />
<haskell><br />
hSeek :: Handle -> SeekMode -> Integer -> IO ()<br />
hTell :: Handle -> IO Integer<br />
hFileSize :: Handle -> IO Integer<br />
hIsEOF :: Handle -> IO Bool<br />
</haskell><br />
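<br />
Putting a few of these together, here is a sketch that copies a file, prefixing each line with its line number. The file names and the 'number' helper are placeholders of my own:<br />
<br />
<haskell><br />
import System.IO<br />
<br />
-- Prefix each line with its line number.<br />
-- 'number' is a helper invented for this example.<br />
number :: [String] -> [String]<br />
number = zipWith (\n l -> show n ++ ": " ++ l) [1 :: Int ..]<br />
<br />
-- "A.hs" and "A.numbered" are placeholder file names.<br />
main :: IO ()<br />
main = do<br />
    inh  <- openFile "A.hs" ReadMode<br />
    outh <- openFile "A.numbered" WriteMode<br />
    s    <- hGetContents inh                -- read lazily from the input handle<br />
    mapM_ (hPutStrLn outh) (number (lines s))<br />
    hClose outh                             -- flushes any buffered output<br />
    hClose inh<br />
</haskell><br />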
<br />
==An example: spell checking==<br />
<br />
<br />
Here is a small example of combining the [http://haskell.org/ghc/docs/latest/html/libraries/base/Data-Set.html Data.Set]<br />
and List data structures from yesterday's tutorial, with more IO<br />
operations. We'll implement a little spell checker, building the<br />
dictionary in a Set data type. First, some libraries to import:<br />
<br />
<haskell><br />
import System.Environment<br />
import Control.Monad<br />
import Data.Set<br />
</haskell><br />
<br />
And the complete program:<br />
<br />
<haskell><br />
main = do<br />
    [s] <- getArgs<br />
    f   <- readFile "/usr/share/dict/words"<br />
    g   <- readFile s<br />
    let dict = fromList (lines f)<br />
    mapM_ (spell dict) (words g)<br />
<br />
spell d w = when (w `notMember` d) (putStrLn w)<br />
</haskell><br />
<br />
Running this program on a Haskell source file, it reports that the following<br />
words are not found in the dictionary:<br />
<br />
$ ghc -O Spell.hs -o spell<br />
<br />
$ ./spell A.hs<br />
Data.Char<br />
=<br />
<-<br />
(map<br />
toUpper<br />
n)<br />
=<br />
<-<br />
getLine<br />
1<br />
<br />
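Notice that punctuation stuck to a word, as in "(map", causes a miss. One possible refinement, sketched here under my own assumptions (the 'normalise' and 'misspelt' helpers are invented names, and the tiny dictionary is only for demonstration), is to strip non-letters and lower-case each word before the lookup. Importing Data.Set qualified also avoids name clashes with the Prelude:<br />
<br />
<haskell><br />
import Data.Char (isAlpha, toLower)<br />
import qualified Data.Set as Set<br />
<br />
-- Keep only letters and lower-case them, so "(map" becomes "map".<br />
normalise :: String -> String<br />
normalise = map toLower . filter isAlpha<br />
<br />
-- A word is misspelt if, once normalised, it is non-empty<br />
-- and not in the dictionary.<br />
misspelt :: Set.Set String -> String -> Bool<br />
misspelt dict w = not (null w') && w' `Set.notMember` dict<br />
  where w' = normalise w<br />
<br />
main :: IO ()<br />
main = do<br />
    -- a toy dictionary, standing in for /usr/share/dict/words<br />
    let dict = Set.fromList (map normalise ["main", "map", "do"])<br />
    mapM_ putStrLn (filter (misspelt dict) (words "main = do (map toUpper n)"))<br />
</haskell><br />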
<br />
===Writing the results out===<br />
<br />
<br />
If we want to write the results out to a temporary file, we can do<br />
so. Let's import a few other modules:<br />
<br />
<haskell><br />
import Data.Set<br />
import Data.Maybe<br />
import Text.Printf<br />
import System.IO<br />
import System.Environment<br />
import System.Posix.Temp<br />
</haskell><br />
<br />
Refactoring the main code to separate out the reading and writing phases<br />
into their own functions, we end up with the core code:<br />
<br />
<haskell><br />
main = do<br />
    (f, g) <- readFiles<br />
    let dict = fromList (lines f)<br />
        errs = mapMaybe (spell dict) (words g)<br />
    write errs<br />
<br />
spell d w | w `notMember` d = Just w<br />
          | otherwise       = Nothing<br />
</haskell><br />
<br />
Where reading is now its own function:<br />
<br />
<haskell><br />
readFiles = do<br />
    [s] <- getArgs<br />
    f   <- readFile "/usr/share/dict/words"<br />
    g   <- readFile s<br />
    return (f,g)<br />
</haskell><br />
<br />
And writing errors out to their own file:<br />
<br />
<haskell><br />
write errs = do<br />
    (t,h) <- mkstemp "/tmp/spell.XXXXXX"<br />
    mapM_ (hPutStrLn h) errs<br />
    hClose h<br />
    printf "%d spelling errors written to '%s'\n" (length errs) t<br />
</haskell><br />
<br />
Pretty simple! Running this program:<br />
<br />
$ ghc --make -O Spell.hs -o myspell<br />
[1 of 1] Compiling Main ( Spell.hs, Spell.o )<br />
Linking myspell ...<br />
<br />
$ ./myspell Spell.hs<br />
67 spelling errors written to '/tmp/spell.ia8256'<br />
<br />
==Extension: using SMP parallelism==<br />
<br />
<br />
Finally, just for some bonus fun ... and hold on to your hat 'cause I'm<br />
going to go fast ... we'll add some parallelism to the mix.<br />
<br />
<br />
Haskell was designed from the start to support easy parallelisation, and<br />
since GHC 6.6, multithreaded code will run transparently on multicore<br />
systems using as many cores as you specify. Let's look at how we'd<br />
parallelise our little program to exploit multiple cores. We'll use an<br />
explicit threading model, via [http://haskell.org/ghc/docs/latest/html/libraries/base/Control-Concurrent.html Control.Concurrent]. You can also make your code implicitly<br />
parallel, using [http://haskell.org/ghc/docs/latest/html/libraries/base/Control-Parallel-Strategies.html Control.Parallel.Strategies],<br />
but we'll leave that for another tutorial.<br />
<br />
<br />
Here's the source, for you to ponder. First some imports:<br />
<br />
<haskell><br />
import Data.Set hiding (map)<br />
import Data.Maybe<br />
import Data.Char<br />
import Text.Printf<br />
import System.IO<br />
import System.Environment<br />
import Control.Concurrent<br />
import Control.Monad<br />
</haskell><br />
<br />
The entry point is modified to break the word list into chunks, and then<br />
dispatch a chunk to each thread:<br />
<br />
<haskell><br />
main = do<br />
    (f, g, n) <- readFiles<br />
    let dict = fromList (lines f)<br />
        work = chunk n (words g)<br />
    run n dict work<br />
</haskell><br />
<br />
The 'run' function sets up a channel between the main thread and all<br />
child threads ('errs'), and prints spelling errors as they arrive on<br />
the channel from the children. It forks off 'n' child threads, one for<br />
each piece of the work list:<br />
<br />
<haskell><br />
run n dict work = do<br />
    chan <- newChan<br />
    errs <- getChanContents chan    -- errors returned back to main thread<br />
    mapM_ (forkIO . thread chan dict) (zip [1..n] work)<br />
    wait n errs 0<br />
</haskell><br />
<br />
<br />
The main thread then just waits on all the threads to finish, printing<br />
any spelling errors they pass up:<br />
<br />
<haskell><br />
wait n xs i = when (i < n) $ case xs of<br />
    Nothing : ys -> wait n ys $! i+1<br />
    Just s  : ys -> putStrLn s >> wait n ys i<br />
</haskell><br />
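<br />
The protocol here is that each thread sends one Nothing when it has finished, so the main thread stops after seeing 'n' of them. Stripped of the IO, the same logic can be sketched as a pure function over the message list ('collect' is a name I've made up for this illustration):<br />
<br />
<haskell><br />
-- Gather messages until n end-of-work markers (Nothing) have been seen.<br />
-- 'collect' is a pure model of 'wait', invented for this sketch.<br />
collect :: Int -> [Maybe a] -> [a]<br />
collect n = go 0<br />
  where<br />
    go i _ | i >= n     = []            -- every thread has reported in<br />
    go i (Nothing : ys) = go (i + 1) ys -- one more thread finished<br />
    go i (Just x  : ys) = x : go i ys   -- pass the message along<br />
    go _ []             = []<br />
</haskell><br />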
<br />
Each thread spell checks its own piece of the work list. If it finds a<br />
spelling error, it passes the offending word back over the channel to<br />
the main thread.<br />
<br />
<haskell><br />
thread chan dict (me, xs) = do<br />
    mapM_ spellit xs<br />
    writeChan chan Nothing<br />
<br />
  where<br />
    spellit w = when (spell dict w) $<br />
        writeChan chan . Just $ printf "Thread %d: %-25s" (me::Int) w<br />
</haskell><br />
<br />
The 'spell' function is simplified a little:<br />
<br />
<haskell><br />
spell d w = w `notMember` d<br />
</haskell><br />
<br />
which we could also write as:<br />
<br />
<haskell><br />
spell = flip notMember<br />
</haskell><br />
<br />
We modify the readFiles phase to take an additional numeric command line<br />
argument, specifying the number of threads to run:<br />
<br />
<haskell><br />
readFiles = do<br />
    [s,n] <- getArgs<br />
    f     <- readFile "/usr/share/dict/words"<br />
    g     <- readFile s<br />
    return (f,g, read n)<br />
</haskell><br />
<br />
We compile this with the GHC SMP parallel runtime system:<br />
<br />
$ ghc -O --make -threaded Spell.hs -o spell<br />
<br />
Now, we can run 'n' worker threads (lightweight Haskell threads), mapped<br />
onto 'm' OS threads. Since I'm using a 4 core linux server, we'll play<br />
around with 4 OS threads. First, running everything in a single thread:<br />
<br />
$ time ./spell test.txt 1 +RTS -N1<br />
...<br />
Thread 1: week:<br />
Thread 1: IO!<br />
./spell test.txt 1 +RTS -N1 99% cpu 2.533 total<br />
<br />
Ok, now we change the command line flag to run it with 4 OS threads, to<br />
try to utilise all 4 cores:<br />
<br />
$ time ./spell test.txt 4 +RTS -N4<br />
...<br />
Thread 2: week:<br />
Thread 3: IO!<br />
./spell test.txt 4 +RTS -N4 111% cpu 2.335 total<br />
<br />
Ok. Good... A little bit faster, uses a little bit more cpu. It turns<br />
out however the program is bound currently by the time spent in the main<br />
thread building the initial dictionary. Actual searching time is only<br />
some 10% of the running time. Nonetheless, it was fairly painless to<br />
break up the initial simple program into a parallel version.<br />
<br />
<br />
If the program's running time were extended (as is the case for a server), the<br />
parallelism would be a win. Additionally, should we buy more cores for<br />
the server, all we need to do is change the +RTS -N argument to the<br />
program, to start utilising these extra cores.<br />
<br />
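If you'd like to check what the runtime was actually given, later versions of the base library (newer than the GHC 6.6 used here, so treat this as an assumption about your setup) export 'getNumCapabilities' from Control.Concurrent. A minimal sketch:<br />
<br />
<haskell><br />
import Control.Concurrent (getNumCapabilities)<br />
<br />
-- Report how many capabilities (OS threads running Haskell code)<br />
-- the runtime was started with. Compile with -threaded and run with,<br />
-- for example, +RTS -N4 to see the value change.<br />
main :: IO ()<br />
main = do<br />
    n <- getNumCapabilities<br />
    putStrLn ("Running on " ++ show n ++ " capabilities")<br />
</haskell><br />
<br />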
==Next week==<br />
<br />
<br />
In upcoming tutorials we'll look more into implicitly parallel programs,<br />
and the use of the new high performance ByteString data type for string<br />
processing.</div>Dibblegohttps://wiki.haskell.org/index.php?title=How_to_write_a_Haskell_program&diff=8443How to write a Haskell program2006-11-19T04:59:55Z<p>Dibblego: </p>
<hr />
<div>A guide to the best practice for creating a new Haskell project or<br />
program.<br />
<br />
== Structure ==<br />
<br />
The basic structure of a new Haskell project can be adopted from<br />
[http://semantic.org/hnop/ HNop], the minimal Haskell project. It<br />
consists of the following files, for the mythical project "haq":<br />
<br />
* Haq.hs -- the main haskell source file<br />
* haq.cabal -- the cabal build description<br />
* Setup.hs -- build script itself<br />
* _darcs -- revision control<br />
* README -- info<br />
* LICENSE -- license<br />
<br />
You can of course elaborate on this, with subdirectories and multiple<br />
modules.<br />
<br />
Here is a transcript showing how you'd create a minimal darcs-managed, cabalised<br />
Haskell project for the cool new Haskell program "haq", then build it,<br />
install it and release it.<br />
<br />
=== Create a directory ===<br />
<br />
Create somewhere for the source:<br />
<br />
<haskell><br />
$ mkdir haq<br />
$ cd haq<br />
</haskell><br />
<br />
=== Write some Haskell source ===<br />
<br />
Write your program:<br />
<br />
<haskell><br />
$ cat > Haq.hs<br />
--<br />
-- Copyright (c) 2006 Don Stewart - http://www.cse.unsw.edu.au/~dons<br />
-- GPL version 2 or later (see http://www.gnu.org/copyleft/gpl.html)<br />
--<br />
import System.Environment<br />
<br />
-- | 'main' runs the main program<br />
main :: IO ()<br />
main = getArgs >>= print . haqify . head<br />
<br />
haqify s = "Haq! " ++ s<br />
</haskell><br />
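<br />
Note that 'head' will crash if the program is run with no arguments. A slightly more defensive variant, sketched here with a usage message of my own invention, pattern matches on the argument list instead:<br />
<br />
<haskell><br />
import System.Environment (getArgs)<br />
<br />
haqify :: String -> String<br />
haqify s = "Haq! " ++ s<br />
<br />
-- Unlike the 'head' version, this does not crash on an empty<br />
-- argument list. The usage text is made up for this sketch.<br />
main :: IO ()<br />
main = do<br />
    args <- getArgs<br />
    case args of<br />
        (x:_) -> print (haqify x)<br />
        []    -> putStrLn "usage: haq STRING"<br />
</haskell><br />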
<br />
=== Stick it in darcs ===<br />
<br />
Place the source under revision control:<br />
<br />
<haskell><br />
$ darcs init<br />
$ darcs add Haq.hs <br />
$ darcs record<br />
addfile ./Haq.hs<br />
Shall I record this change? (1/?) [ynWsfqadjkc], or ? for help: y<br />
hunk ./Haq.hs 1<br />
+--<br />
+-- Copyright (c) 2006 Don Stewart - http://www.cse.unsw.edu.au/~dons<br />
+-- GPL version 2 or later (see http://www.gnu.org/copyleft/gpl.html)<br />
+--<br />
+import System.Environment<br />
+<br />
+-- | 'main' runs the main program<br />
+main :: IO ()<br />
+main = getArgs >>= print . haqify . head<br />
+<br />
+haqify s = "Haq! " ++ s<br />
Shall I record this change? (2/?) [ynWsfqadjkc], or ? for help: y<br />
What is the patch name? Import haq source<br />
Do you want to add a long comment? [yn]n<br />
Finished recording patch 'Import haq source'<br />
</haskell><br />
<br />
And we can see now darcs is running the show:<br />
<br />
<haskell><br />
$ ls<br />
Haq.hs _darcs<br />
</haskell><br />
<br />
=== Add a build system ===<br />
<br />
Create a .cabal file describing how to build your project:<br />
<br />
<haskell><br />
$ cat > haq.cabal<br />
Name: haq<br />
Version: 0.0<br />
Description: Super cool mega lambdas<br />
License: GPL<br />
License-file: LICENSE<br />
Author: Don Stewart<br />
Maintainer: dons@cse.unsw.edu.au<br />
Build-Depends: base<br />
<br />
Executable: haq<br />
Main-is: Haq.hs<br />
ghc-options: -O<br />
</haskell><br />
<br />
Add a Setup.hs that will actually do the building:<br />
<br />
<haskell><br />
$ cat > Setup.hs<br />
#!/usr/bin/env runhaskell<br />
import Distribution.Simple<br />
main = defaultMainWithHooks defaultUserHooks<br />
</haskell><br />
<br />
And record your changes:<br />
<br />
<haskell><br />
$ darcs add haq.cabal Setup.hs<br />
$ darcs record --all<br />
What is the patch name? Add a build system<br />
Do you want to add a long comment? [yn]n<br />
Finished recording patch 'Add a build system'<br />
</haskell><br />
<br />
=== Build your project ===<br />
<br />
Now build it!<br />
<br />
<haskell><br />
$ runhaskell Setup.hs configure --prefix=/home/dons<br />
$ runhaskell Setup.hs build<br />
$ runhaskell Setup.hs install<br />
</haskell><br />
<br />
=== Run it ===<br />
<br />
And now you can run your cool project:<br />
<haskell><br />
$ haq me<br />
"Haq! me"<br />
</haskell><br />
<br />
You can also run it in-place, avoiding the install phase:<br />
<haskell><br />
$ dist/build/haq/haq you<br />
"Haq! you"<br />
</haskell><br />
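<br />
Note that haq, as written, fails with an exception when it is run without arguments, because head is partial. A minimal sketch of a more defensive variant follows; the helper names haqifyArgs and haqMain and the wording of the usage message are invented for this example and are not part of the original program:<br />
<br />
<haskell><br />
import System.Environment (getArgs, getProgName)<br />
<br />
haqify :: String -> String<br />
haqify s = "Haq! " ++ s<br />
<br />
-- | Hypothetical pure helper: decide what to print for the given arguments.<br />
haqifyArgs :: String -> [String] -> String<br />
haqifyArgs _    (s:_) = show (haqify s)   -- same output as 'print . haqify'<br />
haqifyArgs prog []    = "usage: " ++ prog ++ " WORD"<br />
<br />
-- | Possible replacement body for 'main' in Haq.hs (main = haqMain).<br />
haqMain :: IO ()<br />
haqMain = do<br />
    prog <- getProgName<br />
    args <- getArgs<br />
    putStrLn (haqifyArgs prog args)<br />
</haskell><br />
<br />
Keeping the decision in a pure function also makes it easy to cover with a QuickCheck property later.<br />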
<br />
=== Build some haddock documentation ===<br />
<br />
Generate some API documentation into dist/doc/*<br />
<br />
<haskell><br />
$ runhaskell Setup.hs haddock<br />
</haskell><br />
<br />
which generates files in dist/doc/ including:<br />
<br />
<haskell><br />
$ w3m -dump dist/doc/html/haq/Main.html<br />
haq Contents Index<br />
Main<br />
<br />
Synopsis<br />
main :: IO ()<br />
<br />
Documentation<br />
<br />
main :: IO ()<br />
main runs the main program<br />
<br />
Produced by Haddock version 0.7<br />
</haskell><br />
<br />
=== Add some automated testing: QuickCheck ===<br />
<br />
We'll use QuickCheck to specify a simple property of our Haq.hs code. Create a tests module, Tests.hs, with some QuickCheck boilerplate:<br />
<br />
<haskell><br />
$ cat > Tests.hs<br />
import Char<br />
import List<br />
import Test.QuickCheck<br />
import Text.Printf<br />
<br />
main = mapM_ (\(s,a) -> printf "%-25s: " s >> a) tests<br />
<br />
instance Arbitrary Char where<br />
    arbitrary = choose ('\0', '\128')<br />
    coarbitrary c = variant (ord c `rem` 4)<br />
</haskell><br />
<br />
Now let's write a simple property:<br />
<br />
<haskell><br />
$ cat >> Tests.hs <br />
-- reversing a finite list twice is the same as identity<br />
prop_reversereverse s = (reverse . reverse) s == id s<br />
    where _ = s :: [Int]<br />
<br />
-- and add this to the tests list<br />
tests = [("reverse.reverse/id", test prop_reversereverse)]<br />
</haskell><br />
<br />
We can now run this test, and have QuickCheck generate the test data:<br />
<br />
<haskell><br />
$ runhaskell Tests.hs<br />
reverse.reverse/id : OK, passed 100 tests.<br />
</haskell><br />
<br />
Let's add a test for the 'haqify' function:<br />
<br />
<haskell><br />
-- Dropping the "Haq! " string is the same as identity<br />
prop_haq s = drop (length "Haq! ") (haqify s) == id s<br />
    where haqify s = "Haq! " ++ s<br />
<br />
tests = [("reverse.reverse/id", test prop_reversereverse)<br />
        ,("drop.haq/id", test prop_haq)]<br />
</haskell><br />
<br />
and let's test that:<br />
<br />
<haskell><br />
$ runhaskell Tests.hs<br />
reverse.reverse/id : OK, passed 100 tests.<br />
drop.haq/id : OK, passed 100 tests.<br />
</haskell><br />
<br />
Great!<br />
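<br />
The boilerplate above targets the QuickCheck 1 API of the time (the test function, a hand-written Arbitrary Char instance, and the Haskell 98 Char and List modules). Assuming a QuickCheck 2 installation, where quickCheck plays the role of test and Char already has an Arbitrary instance, roughly the same test file could be sketched as follows (runTests is a name chosen here; bind main = runTests to use it as Tests.hs):<br />
<br />
<haskell><br />
import Test.QuickCheck (quickCheck)<br />
import Text.Printf (printf)<br />
<br />
-- copied from Haq.hs so that the test file stands alone<br />
haqify :: String -> String<br />
haqify s = "Haq! " ++ s<br />
<br />
-- reversing a finite list twice is the same as identity<br />
prop_reversereverse :: [Int] -> Bool<br />
prop_reversereverse s = (reverse . reverse) s == s<br />
<br />
-- dropping the "Haq! " prefix is the same as identity<br />
prop_haq :: String -> Bool<br />
prop_haq s = drop (length "Haq! ") (haqify s) == s<br />
<br />
tests :: [(String, IO ())]<br />
tests = [("reverse.reverse/id", quickCheck prop_reversereverse)<br />
        ,("drop.haq/id",        quickCheck prop_haq)]<br />
<br />
-- print each label, then let QuickCheck report the result<br />
runTests :: IO ()<br />
runTests = mapM_ (\(s,a) -> printf "%-25s: " s >> a) tests<br />
</haskell><br />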
<br />
=== Running the test suite from darcs ===<br />
<br />
We can arrange for darcs to run the test suite on every commit:<br />
<br />
<haskell><br />
$ darcs setpref test "runhaskell Tests.hs"<br />
</haskell><br />
<br />
will run the full set of QuickChecks. Let's commit a new patch:<br />
<br />
<haskell><br />
$ darcs setpref test "runhaskell Tests.hs"<br />
Changing value of test from '' to 'runhaskell Tests.hs'<br />
$ darcs add Tests.hs<br />
$ darcs record --all<br />
What is the patch name? Add testsuite<br />
Do you want to add a long comment? [yn]n<br />
Running test...<br />
reverse.reverse/id : OK, passed 100 tests.<br />
drop.haq/id : OK, passed 100 tests.<br />
Test ran successfully.<br />
Looks like a good patch.<br />
Finished recording patch 'Add testsuite'<br />
</haskell><br />
<br />
Excellent, now patches must pass the test suite before they can be<br />
committed.<br />
<br />
=== Tag the stable version, create a tarball, and sell it! ===<br />
<br />
Tag the stable version:<br />
<br />
<haskell><br />
$ darcs tag<br />
What is the version name? 0.0<br />
Finished tagging patch 'TAG 0.0'<br />
</haskell><br />
<br />
Now generate a tarball:<br />
<haskell><br />
$ darcs dist -d haq-0.0<br />
Created dist as haq-0.0.tar.gz<br />
</haskell><br />
<br />
And you're all set up!<br />
<br />
=== Summary ===<br />
<br />
The following files were created:<br />
<br />
$ ls<br />
Haq.hs Tests.hs dist haq.cabal<br />
Setup.hs _darcs haq-0.0.tar.gz<br />
<br />
== Licenses ==<br />
<br />
Code for the common base library package must be BSD licensed. Otherwise, it<br />
is entirely up to you as the author.<br />
Choose a licence (inspired by [http://www.dina.dk/~abraham/rants/license.html this]).<br />
Check the licences of things you use, both other Haskell packages and C<br />
libraries, since these may impose conditions you must follow.<br />
Use the same licence as related projects, where possible. The Haskell community is<br />
split roughly into two camps: those who release everything under BSD, and<br />
(L)GPLers. Some Haskellers recommend avoiding the LGPL, due to cross-module optimisation<br />
issues. Like many licensing questions, this advice is controversial. Several Haskell projects<br />
(wxHaskell, HaXml, etc.) use the LGPL with an extra permissive clause which gets round the<br />
cross-module optimisation issue.<br />
<br />
== Revision control ==<br />
<br />
Use [http://darcs.net Darcs] unless you have a specific reason not to.<br />
Almost all new Haskell projects are released under Darcs, and this<br />
benefits everyone -- a set of common tools increases productivity, and<br />
you're more likely to get patches.<br />
<br />
Advice:<br />
* Tag each release<br />
<br />
== Releases ==<br />
<br />
It's important to release your code as stable, tagged tarballs. Don't<br />
just [http://awayrepl.blogspot.com/2006/11/we-dont-do-releases.html rely on darcs for distribution].<br />
<br />
* '''darcs dist''' generates tarballs directly from a darcs repository<br />
<br />
For example:<br />
<br />
$ cd fps<br />
$ ls <br />
Data LICENSE README Setup.hs TODO _darcs cbits dist fps.cabal tests<br />
$ darcs dist -d fps-0.8<br />
Created dist as fps-0.8.tar.gz<br />
<br />
You can now just post your fps-0.8.tar.gz<br />
<br />
You can also have darcs do the equivalent of 'daily snapshots' for you by using a post-hook.<br />
<br />
Put the following in _darcs/prefs/defaults:<br />
apply posthook darcs dist<br />
apply run-posthook<br />
<br />
== Hosting ==<br />
<br />
A Darcs repository can be published simply by making it available from a<br />
web page. If you don't have an account online, or prefer not to do this<br />
yourself, source can be hosted on darcs.haskell.org (you will need to <br />
email [http://research.microsoft.com/~simonmar/ Simon Marlow] to do this). <br />
haskell.org itself has some user accounts available.<br />
<br />
There are also many free hosting places for open source, such as<br />
* [http://code.google.com/hosting/ Google Project Hosting]<br />
* [http://sourceforge.net/ SourceForge].<br />
<br />
== Web page ==<br />
<br />
Create a web page documenting your project! An easy way to do this is to<br />
add a project specific page to [[Haskell|the Haskell wiki]]<br />
<br />
== Build system ==<br />
<br />
Use [http://haskell.org/cabal Cabal].<br />
<br />
=== Example Setup ===<br />
Create a file called Setup.lhs with these contents:<br />
<haskell><br />
#!/usr/bin/env runhaskell<br />
<br />
> import Distribution.Simple<br />
> main = defaultMain<br />
<br />
</haskell><br />
Writing the setup file this way allows it to be executed directly by unix shells.<br />
<br />
=== Example cabal file for Executables ===<br />
Create the file myproject.cabal following this example:<br />
<pre><br />
Name: MyProject<br />
Version: 0.1<br />
License: BSD3<br />
Author: Your Name<br />
Build-Depends: base<br />
Synopsis: Example Cabal Executable File<br />
<br />
Executable: myproj<br />
Main-Is: Main.hs<br />
Other-Modules: Foo<br />
</pre><br />
<br />
=== Example cabal file for Libraries ===<br />
Create the file myproject.cabal following this example:<br />
<pre><br />
Name: MyProj<br />
Version: 0.1<br />
License: BSD3<br />
Author: Your Name<br />
Build-Depends: base<br />
Synopsis: Example Cabal Library File<br />
Exposed-Modules: MyProject.Foo<br />
</pre><br />
<br />
== Documentation ==<br />
<br />
Use [http://haskell.org/haddock Haddock].<br />
<br />
== Testing ==<br />
<br />
Pure code can be tested using [http://www.md.chalmers.se/~rjmh/QuickCheck/ QuickCheck] or [http://www.mail-archive.com/haskell@haskell.org/msg19215.html SmallCheck]. Impure code can be tested with [http://hunit.sourceforge.net/ HUnit].<br />
<br />
To get started, try [[Introduction to QuickCheck]]. For a slightly more advanced introduction, there is a blog article about creating a testing framework for QuickCheck using some Template Haskell: [http://blog.codersbase.com/2006/09/01/simple-unit-testing-in-haskell/ Simple Unit Testing in Haskell].<br />
<br />
== Program structure ==<br />
<br />
Monad transformers are very useful for programming in the large,<br />
encapsulating state, and controlling side effects. To learn more about this approach, try [http://uebb.cs.tu-berlin.de/~magr/pub/Transformers.en.html Monad Transformers Step by Step].<br />
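<br />
As a small illustration of the idea, here is a sketch that layers a counter state over IO using StateT from the transformers package (assumed to be available; the names App, tick and runApp are made up for this example):<br />
<br />
<haskell><br />
import Control.Monad.Trans.Class (lift)<br />
import Control.Monad.Trans.State (StateT, evalStateT, get, modify)<br />
<br />
-- | An application monad: an Int counter on top of IO.<br />
type App = StateT Int IO<br />
<br />
-- | Bump the counter and log the new value; the state is threaded implicitly.<br />
tick :: String -> App ()<br />
tick msg = do<br />
    modify (+1)<br />
    n <- get<br />
    lift (putStrLn (show n ++ ": " ++ msg))<br />
<br />
-- | Run two steps starting from 0 and return the final counter value.<br />
runApp :: IO Int<br />
runApp = evalStateT (tick "start" >> tick "done" >> get) 0<br />
</haskell><br />
<br />
The code in tick never mentions how the counter is stored, so the state representation can change without touching its callers.<br />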
<br />
== Publicity ==<br />
<br />
The best code in the world is meaningless if nobody knows about it:<br />
<br />
* Firstly, join the community! [http://haskell.org/haskellwiki/Mailing_lists Subscribe] to at least haskell-cafe@ and haskell@ mailing lists.<br />
* Announce your project releases to haskell@haskell.org! This ensures they will make it into the [http://haskell.org/haskellwiki/HWN Haskell Weekly News]. To be doubly sure, you should CC the release announcement to the [[HWN|HWN editor]].<br />
* Blog about it, on [http://planet.haskell.org Planet Haskell]<br />
** Write about it on your blog<br />
** Then email the [http://planet.haskell.org/ Planet Haskell] maintainer (ibid on [[IRC channel|#haskell]]) the RSS feed url for your blog<br />
* Add your library or tool to the [[Libraries and tools]] page, under the relevant category, so people can find it.<br />
<br />
[[Category:Community]]<br />
[[Category:Tutorials]]</div>Dibblegohttps://wiki.haskell.org/index.php?title=Haskell_in_research&diff=7763Haskell in research2006-11-02T00:55:12Z<p>Dibblego: </p>
<hr />
<div>Since its inception, Haskell development has been driven by programming<br />
language researchers. This page collects information about that community.<br />
<br />
==Research groups==<br />
<br />
*[http://www-i2.informatik.rwth-aachen.de/Forschung/FP/ Aachen]<br />
*[http://www-fp.dcs.st-andrews.ac.uk/ St. Andrews]<br />
*[http://www.cs.bris.ac.uk/%7Eian/Functional/ Bristol]<br />
*[http://www.md.chalmers.se/Cs/Research/Functional/ Chalmers]<br />
*[http://www.cs.kent.ac.uk/research/groups/tcs/fp/ Kent]<br />
*[http://www.cs.mu.oz.au/fpu/ Melbourne]<br />
*[http://www.cse.unsw.edu.au/~pls/ New South Wales (Sydney)]<br />
*[http://www.cs.nott.ac.uk/Research/fop/ Nottingham]<br />
*[http://www.cse.ogi.edu/ OGI]<br />
*[http://www.comlab.ox.ac.uk/oucl/research/areas/ap Oxford]<br />
*[http://www.workingmouse.com/research/ Queensland (Brisbane)]<br />
*[http://www.cs.uu.nl/groups/ST Utrecht]<br />
*[http://haskell.cs.yale.edu/yale/ Yale]<br />
*[http://www.cs.york.ac.uk/fp/ York]<br />
<br />
==Researchers==<br />
<br />
*[[Research_papers/Authors|Haskell people]]<br />
<br />
==Research papers==<br />
<br />
* A collection of [[Research papers|Haskell research]] papers.<br />
<br />
==Conferences==<br />
<br />
* A list of [[Conferences|conferences]] relevant to Haskell.<br />
<br />
===Haskell Workshops===<br />
<br />
*[http://haskell.org/haskell-workshop/1995/ The First Haskell Workshop], 1995, La Jolla.<br />
*[http://www.cse.ogi.edu/~jl/ACM/Haskell.html The Second Haskell Workshop], 7 June 1997, Amsterdam, The Netherlands.<br />
*[http://www.haskell.org/HaskellWorkshop.html The Third Haskell Workshop], 1 October 1999, Paris, France.<br />
*[http://www.cs.nott.ac.uk/~gmh/hw00.html The Fourth Haskell Workshop], 17 September 2000, Montreal, Canada.<br />
*[http://www.cs.uu.nl/people/ralf/hw2001.html The Fifth Haskell Workshop], September 2001, Firenze, Italy. <br />
*[http://www.cse.unsw.edu.au/~chak/hw2002/ The Sixth Haskell Workshop], October 2002, Pittsburgh, USA. <br />
*[http://www.cs.uu.nl/~johanj/HaskellWorkshop/cfp03.html The Seventh Haskell Workshop], August 2003, Uppsala, Sweden. <br />
*[http://www.cs.nott.ac.uk/~nhn/HW2004/ The Eighth Haskell Workshop], September 2004, Snowbird, Utah, USA.<br />
*[http://www.haskell.org/haskell-workshop/2005/ The Ninth Haskell Workshop], September 2005, Tallinn, Estonia.<br />
*[http://haskell.org/haskell-workshop/2006/ The Tenth Haskell Workshop], September 2006, Portland, Oregon, USA.<br />
<br />
[[Category:Community]]</div>Dibblegohttps://wiki.haskell.org/index.php?title=OOP_vs_type_classes&diff=6844OOP vs type classes2006-10-10T07:48:45Z<p>Dibblego: </p>
<hr />
<div>(this is just a sketch now. feel free to edit/comment it. I will include information you provided into the final version of this tutorial)<br />
<br />
<br />
I had generally not used type classes in my application programs, but when<br />
I started implementing general-purpose libraries and tried to maintain<br />
as much flexibility as possible, it was natural to start building large<br />
and complex class hierarchies. I tried to use my C++ experience when<br />
doing this, but I was bitten many times by the restrictions of type classes. After this experience, I think that I now have a better feeling and mental model<br />
for type classes, and I want to share it with other Haskellers,<br />
especially those with an OOP background.<br />
<br />
Brian Hulley provided us with a program that emulates OOP in Haskell; as<br />
you can see, it's much larger than the equivalent C++ program. An equivalent translation from<br />
Haskell to C++ should be even longer :)<br />
<br />
<br />
== Everything is object? ==<br />
<br />
You all know this OOP motto: "everything is an object". While I programmed in<br />
C++, it was really hard to do anything without classes. But when I - the same<br />
John de Mac-Lee programmer - use Haskell, it's hard to find open<br />
vacancies for type classes. Why is it so different?<br />
<br />
C++ classes pack functions together with data and make it possible to<br />
use different data representations via the same interface. As long as<br />
you need any of these facilities, you are forced to use classes.<br />
Although C++ supports alternative ways to implement such functionality<br />
(function pointers, discriminated unions), these are not as handy<br />
as classes. Classes are also the sole method to hide implementation<br />
details. Moreover, classes represent a handy way to group related<br />
functionality together, so I found myself sometimes developing classes that<br />
contain only static functions just to group them into some<br />
'package'. It's extremely useful to browse the structure of a large C++ project<br />
in terms of classes instead of individual functions.<br />
<br />
Haskell provides other solutions for these problems.<br />
<br />
=== Type with several representations: use algebraic data type (ADT) ===<br />
<br />
For the types with different representations, algebraic data types<br />
(ADT) - an analog of discriminated unions - are supported:<br />
<br />
<haskell><br />
data Point = FloatPoint Float Float<br />
           | IntPoint Int Int<br />
</haskell><br />
<br />
Haskell provides a very easy way to build/analyze them:<br />
<br />
<haskell><br />
coord :: Point -> (Float, Float)<br />
coord (FloatPoint x y) = (x,y)<br />
coord (IntPoint x y) = (realToFrac x, realToFrac y)<br />
<br />
main = do print (coord (FloatPoint 1 2))<br />
          print (coord (IntPoint 1 2))<br />
</haskell><br />
<br />
So ADTs in general are preferred in Haskell over the class-based<br />
solution of the same problem:<br />
<br />
<haskell><br />
class Point a where<br />
    coord :: a -> (Float, Float)<br />
<br />
data FloatPoint = FloatPoint Float Float<br />
instance Point FloatPoint where<br />
    coord (FloatPoint x y) = (x,y)<br />
<br />
data IntPoint = IntPoint Int Int<br />
instance Point IntPoint where<br />
    coord (IntPoint x y) = (realToFrac x, realToFrac y)<br />
</haskell><br />
<br />
<br />
You can also imagine all the C++ machinery that is required to implement the<br />
analog of our 5-line, ADT-based solution, and conclude that objects are not<br />
as great as you thought before :D Of course, C++ classes are usually<br />
much larger, but that is again a benefit of Haskell: it's so easy to define<br />
types/functions that you may use a much finer granularity.<br />
<br />
<pre><br />
#include <iostream><br />
#include <utility><br />
using namespace std;<br />
<br />
struct Point {<br />
    virtual ~Point() {}<br />
    virtual pair<float,float> coord() const = 0;<br />
};<br />
<br />
struct FloatPoint : Point {<br />
    float x, y;<br />
    FloatPoint (float _x, float _y) { x=_x; y=_y; }<br />
    pair<float,float> coord() const { return make_pair(x,y); }<br />
};<br />
<br />
struct IntPoint : Point {<br />
    int x, y;<br />
    IntPoint (int _x, int _y) { x=_x; y=_y; }<br />
    pair<float,float> coord() const { return make_pair(float(x), float(y)); }<br />
};<br />
<br />
// print a coordinate pair; streams have no built-in operator<< for std::pair<br />
void show (const Point& p) {<br />
    pair<float,float> c = p.coord();<br />
    cout << "(" << c.first << "," << c.second << ")" << endl;<br />
}<br />
<br />
int main () {<br />
    show(FloatPoint(1,2));<br />
    show(IntPoint(1,2));<br />
}<br />
</pre><br />
<br />
As you see, ADTs together with type inference make Haskell programs<br />
about 2 times smaller than their C++ equivalent.<br />
<br />
<br />
=== Packing data & functions together: use (records of) closures ===<br />
<br />
Another very typical use of classes is to pack data together with one<br />
or more functions that process them and pass this bundle to some<br />
function. Then this function can call these functions to implement<br />
some functionality without caring how it is internally implemented.<br />
Fortunately, Haskell provides a better way to implement this: you can<br />
directly pass any functions as parameters to other functions.<br />
Moreover, you can construct the passed functions on the fly and capture in<br />
these definitions the values of identifiers available at the call site,<br />
creating so-called closures. In this way, you construct something like<br />
an object on the fly and don't even need a type class:<br />
<br />
<haskell><br />
do x <- newIORef 0<br />
   proc (modifyIORef x (+1), readIORef x)<br />
</haskell><br />
<br />
Here, we passed two routines to proc: one that increments the value of the<br />
counter and another that reads its current value. Another call to proc<br />
that uses a counter with locking might look like this:<br />
<br />
<haskell><br />
do x <- newMVar 0<br />
   proc (modifyMVar_ x (return . (+1)), readMVar x)<br />
</haskell><br />
<br />
Here, proc may be defined as:<br />
<br />
<haskell><br />
proc :: (IO (), IO Int) -> IO ()<br />
proc (inc, read) = do { inc; inc; inc; read >>= print }<br />
</haskell><br />
<br />
<br />
i.e. it receives two abstract operations whose implementation may vary<br />
between different calls to proc, and calls them without any knowledge of<br />
implementation details. The equivalent C++ code looks like this:<br />
<br />
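<pre><br />
#include <iostream><br />
using namespace std;<br />
<br />
struct Counter {<br />
    virtual ~Counter() {}<br />
    virtual void inc() = 0;<br />
    virtual int read() = 0;<br />
};<br />
<br />
struct SimpleCounter : Counter {<br />
    int n;<br />
    SimpleCounter () { n=0; }<br />
    void inc() { n++; }<br />
    int read() { return n; }<br />
};<br />
<br />
// take the counter by reference so that the overridden methods are called<br />
void proc (Counter& c) {<br />
    c.inc(); c.inc(); c.inc(); cout << c.read();<br />
}<br />
</pre><br />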
<br />
And again, the Haskell code is much simpler and more straightforward: we<br />
don't need to declare classes, operations or their types; we just pass<br />
the implementations of the operations it needs to proc. Look at<br />
http://haskell.org/haskellwiki/IO_inside#Example:_returning_an_IO_action_as_a_result<br />
and the following sections to find more examples of using closures instead<br />
of OOP classes.<br />
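<br />
Putting the Haskell fragments of this section together, a self-contained sketch might look like this. The names ioRefCounter, mvarCounter and demo are invented here, and modifyMVar_ is used for the increment because it takes an IO action that returns the new value:<br />
<br />
<haskell><br />
import Control.Concurrent.MVar (newMVar, modifyMVar_, readMVar)<br />
import Data.IORef (newIORef, modifyIORef, readIORef)<br />
<br />
-- | Build an unsynchronised counter as a pair of closures over an IORef.<br />
ioRefCounter :: IO (IO (), IO Int)<br />
ioRefCounter = do<br />
    x <- newIORef 0<br />
    return (modifyIORef x (+1), readIORef x)<br />
<br />
-- | Build a counter with locking as a pair of closures over an MVar.<br />
mvarCounter :: IO (IO (), IO Int)<br />
mvarCounter = do<br />
    x <- newMVar 0<br />
    return (modifyMVar_ x (return . (+1)), readMVar x)<br />
<br />
-- | Increment three times and print the value, knowing nothing about the representation.<br />
proc :: (IO (), IO Int) -> IO ()<br />
proc (inc, readCounter) = do { inc; inc; inc; readCounter >>= print }<br />
<br />
-- | Run proc against both representations.<br />
demo :: IO ()<br />
demo = do<br />
    ioRefCounter >>= proc<br />
    mvarCounter  >>= proc<br />
</haskell><br />
<br />
Both counters have the same type, a pair of IO actions, which is all that proc ever sees.<br />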
<br />
<br />
=== Hiding implementation details: use module export list ===<br />
<br />
One more use of OOP classes is to hide implementation details, making<br />
internal data/functions inaccessible to class clients. Unfortunately, this<br />
functionality is not part of the type class facilities. Instead, you<br />
should use the sole Haskell method of implementation hiding, the module<br />
export list:<br />
<br />
<haskell><br />
module Stack (Stack, empty, push, pop, top, isEmpty) where<br />
<br />
newtype Stack a = Stk [a]<br />
<br />
empty = Stk []<br />
push x (Stk xs) = Stk (x:xs)<br />
pop (Stk (x:xs)) = Stk xs<br />
top (Stk (x:xs)) = x<br />
isEmpty (Stk xs) = null xs<br />
</haskell><br />
<br />
Since the constructor for the data type Stack is hidden (the export<br />
list would say Stack(Stk) if it were exposed), outside of this module<br />
a stack can only be built from operations empty, push and pop, and<br />
examined with top and isEmpty.<br />
<br />
<br />
=== Grouping related functionality: use module hierarchy and Haddock markup ===<br />
<br />
Dividing the whole program into classes and using their hierarchy to<br />
represent the entire program structure is a great instrument in OOP<br />
languages. Unfortunately, this is again impossible in Haskell. Instead,<br />
program structure is typically rendered in the module hierarchy and, inside<br />
a module, in its export list. Although the Haskell language doesn't provide<br />
facilities to describe hierarchical structure inside a module, we have<br />
another tool for that: Haddock, a de facto standard documentation tool.<br />
<br />
<haskell><br />
module System.Stream.Instance (<br />
<br />
-- * File is a file stream<br />
File,<br />
-- ** Functions that open files<br />
openFile, -- open file in text mode<br />
openBinaryFile, -- open file in binary mode<br />
-- ** Standard file handles<br />
stdin,<br />
stdout,<br />
stderr,<br />
<br />
-- * MemBuf is a memory buffer stream<br />
MemBuf,<br />
-- ** Functions that open MemBuf<br />
createMemBuf, -- create new MemBuf<br />
createContiguousMemBuf, -- create new contiguous MemBuf<br />
openMemBuf, -- use memory area as MemBuf<br />
<br />
) where<br />
...<br />
</haskell><br />
<br />
Here, Haddock will build the documentation for the module by looking at its<br />
export list. The export list will be divided into sections (whose<br />
headers are given with "-- *") and subsections (given with "-- **"). As<br />
a result, the module documentation reflects its structure without using<br />
classes for this purpose.<br />
<br />
<br />
== Type classes is a sort of templates, not classes ==<br />
<br />
At this moment the C++/C#/Java languages have classes and<br />
templates/generics. What is the difference? With a class, type<br />
information is carried with the object itself, while with templates it is<br />
outside of the object and is part of the whole operation.<br />
<br />
For example, if the == operation is defined in a class, the actual<br />
procedure called for a==b may depend on the run-time type of 'a', but if it<br />
is defined in a template, the actual procedure depends only on the template<br />
instantiation (and is determined at compile time).<br />
<br />
Haskell's objects don't carry run-time type information. Instead, the<br />
class constraint of a polymorphic operation is passed in the form of a<br />
"dictionary" implementing all operations of the class (there are also<br />
other implementation techniques, but this doesn't matter). For example,<br />
<br />
<haskell><br />
eqList :: (Eq a) => [a] -> [a] -> Bool<br />
</haskell><br />
<br />
is translated into:<br />
<br />
<haskell><br />
type EqDictionary a = (a->a->Bool, a->a->Bool)<br />
eqList :: EqDictionary a -> [a] -> [a] -> Bool<br />
</haskell><br />
<br />
where the first parameter is a "dictionary" containing the implementation of the<br />
"==" and "/=" operations for objects of type 'a'. If there are several<br />
class constraints, a dictionary for each is passed.<br />
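<br />
To make the translation concrete, here is a hand-written sketch of dictionary passing. It uses a record instead of a tuple, and the names EqDict, intEqDict and eqListD are invented for illustration; a compiler generates something similar internally rather than exactly this:<br />
<br />
<haskell><br />
-- | Explicit stand-in for the dictionary of the Eq class.<br />
data EqDict a = EqDict<br />
    { eq  :: a -> a -> Bool<br />
    , neq :: a -> a -> Bool<br />
    }<br />
<br />
-- | Roughly what "instance Eq Int" becomes: an ordinary value.<br />
intEqDict :: EqDict Int<br />
intEqDict = EqDict { eq = (==), neq = (/=) }<br />
<br />
-- | eqList with the class constraint turned into an explicit argument.<br />
eqListD :: EqDict a -> [a] -> [a] -> Bool<br />
eqListD _ []     []     = True<br />
eqListD d (x:xs) (y:ys) = eq d x y && eqListD d xs ys<br />
eqListD _ _      _      = False   -- lists of different length<br />
</haskell><br />
<br />
A call such as eqListD intEqDict [1,2] [1,2] corresponds to the compiler choosing the Int instance at the call site.<br />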
<br />
If the class has base class(es), the dictionary tuple also includes the tuples<br />
of the base class dictionaries:<br />
<br />
<haskell><br />
class Eq a => Cmp a where<br />
    cmp :: a -> a -> Ordering<br />
<br />
cmpList :: (Cmp a) => [a] -> [a] -> Ordering<br />
</haskell><br />
<br />
turns into:<br />
<br />
<haskell><br />
type CmpDictionary a = (EqDictionary a, a -> a -> Ordering)<br />
cmpList :: CmpDictionary a -> [a] -> [a] -> Ordering<br />
</haskell><br />
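<br />
The same idea can be sketched by hand for a class with a superclass: the subclass dictionary carries the superclass dictionary as a field, and getting at the superclass methods is just field selection. All names here (EqDict, CmpDict, cmpSuper, elemD, memberD, cmpListD) are assumptions made for the example:<br />
<br />
<haskell><br />
newtype EqDict a = EqDict { eq :: a -> a -> Bool }<br />
<br />
-- | Dictionary for "class Eq a => Cmp a": superclass dictionary plus own method.<br />
data CmpDict a = CmpDict<br />
    { cmpSuper :: EqDict a<br />
    , cmp      :: a -> a -> Ordering<br />
    }<br />
<br />
intCmpDict :: CmpDict Int<br />
intCmpDict = CmpDict { cmpSuper = EqDict (==), cmp = compare }<br />
<br />
-- | A function that only needs the Eq dictionary.<br />
elemD :: EqDict a -> a -> [a] -> Bool<br />
elemD d x = any (eq d x)<br />
<br />
-- | Holds a Cmp dictionary and calls elemD by extracting the Eq part.<br />
memberD :: CmpDict a -> a -> [a] -> Bool<br />
memberD d = elemD (cmpSuper d)<br />
<br />
-- | Lexicographic comparison using only the cmp method.<br />
cmpListD :: CmpDict a -> [a] -> [a] -> Ordering<br />
cmpListD _ []     []     = EQ<br />
cmpListD _ []     _      = LT<br />
cmpListD _ _      []     = GT<br />
cmpListD d (x:xs) (y:ys) = case cmp d x y of<br />
    EQ    -> cmpListD d xs ys<br />
    other -> other<br />
</haskell><br />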
<br />
<br />
Compared to C++, this is like templates, not classes! As with<br />
templates, typing information is part of the operation, not the object! But<br />
while C++ templates are really a form of macro processing (like<br />
Template Haskell) and in the end generate non-polymorphic code,<br />
Haskell's use of dictionaries allows run-time polymorphism<br />
(explanation of run-time polymorphism?).<br />
<br />
Moreover, Haskell type classes support inheritance. Run-time<br />
polymorphism together with inheritance are often seen as the<br />
distinctive features of OOP, so for a long time I considered type classes a<br />
form of OOP implementation. But that's wrong! Haskell type classes are<br />
built on a different basis, so they are like C++ templates with added<br />
inheritance and run-time polymorphism! And this means that using<br />
type classes is different from using classes, with its own strong and<br />
weak points.<br />
<br />
<br />
== Type classes vs classes ==<br />
<br />
Here is a brief listing of differences between OOP classes and Haskell type classes<br />
<br />
=== Type classes is like interfaces/abstract classes, not classes itself ===<br />
<br />
There is no inheritance of data fields, and there are no data fields at all<br />
(so type classes are more like interfaces than classes themselves)....<br />
<br />
<br />
For those more familiar with Java/C# rather than C++, type classes resemble interfaces more than the classes. In fact, the generics in those languages capture the notion of parametric polymorphism (but Haskell is a language that takes parametric polymorphism quite seriously, so you can expect a fair amount of type gymnastics when dealing with Haskell), so more precisely, type classes are like generic interfaces.<br />
<br />
Why interface, and not class? Mostly because type classes do not implement the methods themselves, they just guarantee that the actual types that instantiate the type class will implement specific methods. So the types are like classes in Java/C#.<br />
<br />
One added twist: type classes can decide to provide default implementation of some methods (using other methods). You would say, then they are sort of like abstract classes. Right. But at the same time, you cannot extend (inherit) multiple abstract classes, can you?<br />
<br />
So a type class is sort of like a contract: "any type that instantiates this type class will have the following functions defined on them..." but with the added advantage that you have type parameters built-in, so:<br />
<br />
<haskell><br />
class Eq a where<br />
    (==) :: a -> a -> Bool<br />
    (/=) :: a -> a -> Bool<br />
    -- let's just implement one function in terms of the other<br />
    x /= y = not (x == y)<br />
</haskell><br />
<br />
is, in a Java-like language:<br />
<br />
'''interface''' Eq<A> {<br />
'''boolean''' equal(A that);<br />
'''boolean''' notEqual(A that) { <br />
''// default, can be overridden''<br />
'''return''' !equal(that); <br />
} <br />
}<br />
<br />
And the "instance TypeClass ParticularInstance where ..." definition means "ParticularInstance implements TypeClass { ... }". Multiple parameter type classes, of course, cannot be interpreted this way.<br />
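<br />
For instance, here is a sketch of an instance declaration for a small user-defined type. The class is called MyEq, with methods equal and notEqual, only to avoid clashing with the Prelude's Eq; Color is a made-up type:<br />
<br />
<haskell><br />
class MyEq a where<br />
    equal    :: a -> a -> Bool<br />
    notEqual :: a -> a -> Bool<br />
    -- default implementation in terms of the other method<br />
    notEqual x y = not (equal x y)<br />
<br />
data Color = Red | Green | Blue<br />
<br />
-- | Roughly "Color implements MyEq": only 'equal' is given, 'notEqual' uses the default.<br />
instance MyEq Color where<br />
    equal Red   Red   = True<br />
    equal Green Green = True<br />
    equal Blue  Blue  = True<br />
    equal _     _     = False<br />
</haskell><br />
<br />
Unlike a Java class, Color does not have to name MyEq in its own declaration; the instance can be added later, even in another module.<br />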
<br />
<br />
=== Type can appear at any place in function signature ===<br />
The class type parameter can appear at any place in a function signature: as any<br />
parameter, inside a parameter, in a list (possibly empty), or in the result.<br />
<br />
<haskell><br />
class C a where<br />
    f :: a -> Int<br />
    g :: Int -> a -> Int<br />
    h :: Int -> (Int,a) -> Int<br />
    i :: [a] -> Int<br />
    j :: Int -> a<br />
    new :: a<br />
</haskell><br />
<br />
It's even possible to define instance-specific constants (look at 'new').<br />
<br />
If a function's value is instance-specific, an OOP programmer will use a<br />
"static" method, while with type classes you need to use a fake<br />
parameter:<br />
<br />
<haskell><br />
import Data.Int (Int8, Int16)<br />
<br />
class FixedSize a where<br />
    sizeof :: a -> Int<br />
instance FixedSize Int8 where<br />
    sizeof _ = 1<br />
instance FixedSize Int16 where<br />
    sizeof _ = 2<br />
<br />
main = do print (sizeof (undefined::Int8))<br />
          print (sizeof (undefined::Int16))<br />
</haskell><br />
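<br />
A common alternative to passing undefined is a proxy argument that carries only the type. The following is a sketch, assuming Data.Proxy from a reasonably recent base library; the class is renamed FixedSizeP and the method sizeofP to keep it apart from the version above:<br />
<br />
<haskell><br />
import Data.Int (Int8, Int16)<br />
import Data.Proxy (Proxy (..))<br />
<br />
class FixedSizeP a where<br />
    sizeofP :: Proxy a -> Int<br />
<br />
instance FixedSizeP Int8 where<br />
    sizeofP _ = 1<br />
<br />
instance FixedSizeP Int16 where<br />
    sizeofP _ = 2<br />
<br />
-- | The annotation on the proxy selects the instance; no value of the type is needed.<br />
sizes :: [Int]<br />
sizes = [sizeofP (Proxy :: Proxy Int8), sizeofP (Proxy :: Proxy Int16)]<br />
</haskell><br />
<br />
The effect is the same as with the fake parameter, but there is no undefined value that could be forced by accident.<br />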
<br />
<br />
=== Inheritance between interfaces ===<br />
Inheritance between interfaces (in a "class" declaration) means<br />
inclusion of the base class dictionaries in the dictionary of the subclass:<br />
<br />
<haskell><br />
class (Show s, Monad m) => Stream m s where<br />
    sClose :: s -> m ()<br />
</haskell><br />
<br />
means<br />
<br />
<haskell><br />
type StreamDictionary m s = (ShowDictionary s, MonadDictionary m, s->m())<br />
</haskell><br />
<br />
There is an upcasting mechanism: it just extracts the dictionary of a base<br />
class from the dictionary tuple, so you can call a function that requires the<br />
base class from a function that requires the subclass:<br />
<br />
<haskell><br />
f :: (Stream m s) => s -> m String<br />
show :: (Show s) => s -> String<br />
f s = return (show s)<br />
</haskell><br />
<br />
But downcasting is absolutely impossible - there is no way to get a<br />
subclass dictionary from a superclass one.<br />
<br />
<br />
<br />
=== Inheritance between instances ===<br />
Inheritance between instances (in an "instance" declaration) means<br />
that the operations of some class can be executed via the operations of another<br />
class, i.e. such a declaration describes a way to compute the dictionary of the<br />
inherited class via functions from the dictionary of the base class:<br />
<br />
<haskell><br />
class Eq a where<br />
    (==) :: a -> a -> Bool<br />
class Cmp a where<br />
    cmp :: a -> a -> Ordering<br />
instance (Cmp a) => Eq a where<br />
    a==b = cmp a b == EQ<br />
</haskell><br />
<br />
creates the following function:<br />
<br />
<haskell><br />
cmpDict2EqDict :: CmpDictionary a -> EqDictionary a<br />
cmpDict2EqDict (cmp) = (\a b -> cmp a b == EQ)<br />
</haskell><br />
<br />
As a result, any function that receives a dictionary for the Cmp class<br />
can call functions that require a dictionary of the Eq class.<br />
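<br />
Written out by hand, the conversion could look like the following sketch. The record types EqDict and CmpDict and the helpers intCmpDict, allEqualD and allEqualViaCmp are illustrative names. As far as I know, an instance head such as "Eq a" also needs GHC extensions like FlexibleInstances and UndecidableInstances when written as real Haskell, which is one more reason to show it at the dictionary level:<br />
<br />
<haskell><br />
newtype EqDict a  = EqDict  { eq  :: a -> a -> Bool }<br />
newtype CmpDict a = CmpDict { cmp :: a -> a -> Ordering }<br />
<br />
-- | The function that "instance (Cmp a) => Eq a" stands for.<br />
cmpDict2EqDict :: CmpDict a -> EqDict a<br />
cmpDict2EqDict d = EqDict (\a b -> cmp d a b == EQ)<br />
<br />
intCmpDict :: CmpDict Int<br />
intCmpDict = CmpDict compare<br />
<br />
-- | Needs only an Eq dictionary.<br />
allEqualD :: EqDict a -> [a] -> Bool<br />
allEqualD _ []     = True<br />
allEqualD d (x:xs) = all (eq d x) xs<br />
<br />
-- | Holds only a Cmp dictionary, yet can call allEqualD by converting it.<br />
allEqualViaCmp :: CmpDict a -> [a] -> Bool<br />
allEqualViaCmp d = allEqualD (cmpDict2EqDict d)<br />
</haskell><br />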
<br />
<br />
=== Downcasting is a mission impossible ===<br />
<br />
Selection between instances is done at compile time, based only on<br />
the information available at that moment. So don't expect a more specific<br />
instance to be selected just because you passed a value of that concrete<br />
type to a function that accepts some general class:<br />
<br />
<haskell><br />
class Foo a where<br />
foo :: a -> String<br />
<br />
instance (Num a) => Foo a where<br />
foo _ = "Num"<br />
<br />
instance Foo Int where<br />
foo _ = "int"<br />
<br />
f :: (Num a) => a -> String<br />
f = foo<br />
<br />
main = do print (foo (1::Int))<br />
print (f (1::Int))<br />
</haskell><br />
<br />
Here, the first call will return "int", but the second only "Num".<br />
This is easily explained by the dictionary-based translation<br />
described above. After you've passed data to a polymorphic procedure,<br />
its type is completely lost; there is only dictionary information, so the<br />
instance for Int can't be applied. The only way to construct a Foo<br />
dictionary is by calculating it from the Num dictionary using the first<br />
instance.<br />
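<br />
If a polymorphic function really has to treat one concrete type specially, the type information must be passed in explicitly. One possible approach, sketched here under the assumption that adding a Typeable constraint to the signature is acceptable, uses cast from Data.Typeable; describeNum is a made-up name:<br />
<br />
<haskell><br />
import Data.Typeable (Typeable, cast)<br />
<br />
-- The Typeable dictionary carries the run-time type that Num alone lacks.<br />
describeNum :: (Num a, Typeable a) => a -> String<br />
describeNum x = case (cast x :: Maybe Int) of<br />
    Just _  -> "int"<br />
    Nothing -> "Num"<br />
<br />
main :: IO ()<br />
main = do<br />
    putStrLn (describeNum (1 :: Int))      -- prints "int"<br />
    putStrLn (describeNum (1 :: Double))   -- prints "Num"<br />
</haskell><br />
<br />
The special case is now visible in the type signature, which is the point: nothing is recovered behind the caller's back.<br />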
<br />
<br />
=== There is only one dictionary per function call ===<br />
For "eqList :: (Eq a) => [a] -> [a] -> Bool" the types of all elements<br />
in a list must be the same, and the types of both arguments must be the same<br />
too: there is only one dictionary, and it knows how to handle values<br />
of only one concrete type!<br />
<br />
=== Existential variables is more like OOP objects ===<br />
Existential variables pack a dictionary together with a value (this looks<br />
very much like the object concept!), so it's possible to create polymorphic<br />
containers (i.e. ones holding values of different types). But<br />
downcasting is still impossible. Also, existentials still don't allow you<br />
to mix values of different types in a call to some polymorphic operation<br />
(each personal dictionary is still built for values of one concrete type):<br />
<br />
<haskell><br />
data HasCmp = forall a. Cmp a => HasCmp a<br />
<br />
sorted :: [HasCmp] -> Bool<br />
<br />
sorted [] = True<br />
sorted [_] = True<br />
sorted (HasCmp a : HasCmp b : xs) = a<=b && sorted (b:xs)<br />
</haskell><br />
<br />
This code will not work: a<=b can use neither 'a's nor 'b's dictionary.<br />
Even if orderings for apples and penguins are defined, we still don't have<br />
a method to compare a penguin with an apple!<br />
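<br />
One common way around this, shown only as a sketch, is to have every packed value project itself onto a shared concrete type and to compare the projections. The class HasRank and the types Apple, Penguin and AnyRanked are hypothetical:<br />
<br />
<haskell><br />
{-# LANGUAGE ExistentialQuantification #-}<br />
<br />
-- Every instance maps itself to a common, concrete key type.<br />
class HasRank a where<br />
    rank :: a -> Int<br />
<br />
data Apple   = Apple Int     -- weight<br />
data Penguin = Penguin Int   -- height<br />
<br />
instance HasRank Apple   where rank (Apple w)   = w<br />
instance HasRank Penguin where rank (Penguin h) = h<br />
<br />
data AnyRanked = forall a. HasRank a => AnyRanked a<br />
<br />
-- Each dictionary is used only on its own value; the Ints are compared.<br />
sorted :: [AnyRanked] -> Bool<br />
sorted (AnyRanked a : rest@(AnyRanked b : _)) = rank a <= rank b && sorted rest<br />
sorted _ = True<br />
<br />
main :: IO ()<br />
main = do<br />
    print (sorted [AnyRanked (Apple 1), AnyRanked (Penguin 2), AnyRanked (Apple 3)])  -- True<br />
    print (sorted [AnyRanked (Penguin 5), AnyRanked (Apple 2)])                       -- False<br />
</haskell><br />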
<br />
<br />
== Other opinions ==<br />
<br />
=== OO class always corresponds to a haskell class + a related haskell existential (John Meacham) ===<br />
<br />
> Roughly Haskell type classes correspond to parameterized abstract<br />
> classes in C++ (i.e. class templates with virtual functions <br />
> representing the operations). Instance declarations correspond to<br />
> derivation and implementations of those parameterized classes.<br />
<br />
There is a major difference though, in C++ (or java, or sather, or c#,<br />
etc..) the dictionary is always attached to the value, the actual class<br />
data type you pass around. in haskell, the dictionary is passed<br />
separately and the appropriate one is inferred by the type system. C++<br />
doesn't infer, it just assumes everything will be carrying around its<br />
dictionary with it.<br />
<br />
this makes haskell classes significantly more powerful in many ways.<br />
<br />
<haskell><br />
class Num a where<br />
(+) :: a -> a -> a<br />
</haskell><br />
<br />
is impossible to express in OO classes, since both arguments to +<br />
necessarily carry their dictionaries with them, there is no way to<br />
statically guarantee they have the same one. Haskell will pass a single<br />
dictionary that is shared by both types so it can handle this just fine.<br />
<br />
in haskell you can do<br />
<br />
<haskell><br />
class Monoid a where<br />
    mempty :: a<br />
</haskell><br />
<br />
in OOP, this cannot be done because where does the dictionary come from?<br />
since dictionaries are always attached to a concrete class, every method<br />
must take at least one argument of the class type (in fact, exactly one,<br />
as I'll show below). In haskell again, this is not a problem since the<br />
dictionary is passed in by the consumer of 'mempty', mempty need not<br />
conjure one out of thin air.<br />
<br />
<br />
In fact, OO classes can only express single parameter type classes where<br />
the type argument appears exactly once in strictly covariant position.<br />
in particular, it is pretty much always the first argument and often<br />
(but not always) named 'self' or 'this'.<br />
<br />
<br />
<haskell><br />
class HasSize a where<br />
getSize :: a -> Int<br />
</haskell><br />
<br />
can be expressed in OO, 'a' appears only once, as its first argument.<br />
<br />
<br />
Now, another thing OO classes can do is they give you the ability to<br />
create existential collections (?) of objects. as in, you can have a<br />
list of things that have a size. In haskell, the ability to do this is<br />
independent of the class (which is why haskell classes can be more<br />
powerful) and is appropriately named existential types.<br />
<br />
<haskell><br />
data Sized = exists a . HasSize a => Sized a <br />
</haskell><br />
<br />
what does this give you? you can now create a list of things that have a<br />
size [Sized] yay!<br />
<br />
and you can declare an instance for sized, so you can use all your<br />
methods on it.<br />
<br />
<haskell><br />
instance HasSize Sized where<br />
getSize (Sized a) = getSize a<br />
</haskell><br />
<br />
<br />
an existential, like Sized, is a value that is passed around with its<br />
dictionary in tow, as in, it is an OO class! I think this is where<br />
people get confused when comparing OO classes to haskell classes. _there<br />
is no way to do so without bringing existentials into play_. OO classes<br />
are inherently existential in nature.<br />
<br />
so, an OO abstract class declaration declares the equivalent of 3 things<br />
in haskell: a class to establish the methods, an existential type to<br />
carry the values about, and an instance of the class for the existential<br />
type.<br />
<br />
an OO concrete class declares all of the above plus a data declaration<br />
for some concrete representation.<br />
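<br />
(Editorial sketch, not part of the quoted message.) The "exists" keyword above is not accepted by GHC, which writes the same existential with forall in front of the constructor, given the ExistentialQuantification extension. The three declarations plus two concrete types could look like this; Name and Pair are made-up examples:<br />
<br />
<haskell><br />
{-# LANGUAGE ExistentialQuantification #-}<br />
<br />
-- 1. The class establishes the methods.<br />
class HasSize a where<br />
    getSize :: a -> Int<br />
<br />
-- 2. The existential type carries a value together with its dictionary.<br />
data Sized = forall a. HasSize a => Sized a<br />
<br />
-- 3. The existential is itself an instance of the class.<br />
instance HasSize Sized where<br />
    getSize (Sized a) = getSize a<br />
<br />
-- Concrete representations, i.e. the "concrete classes".<br />
newtype Name = Name String<br />
data Pair = Pair Int Int<br />
<br />
instance HasSize Name where getSize (Name s) = length s<br />
instance HasSize Pair where getSize _        = 2<br />
<br />
main :: IO ()<br />
main = print (map getSize [Sized (Name "abc"), Sized (Pair 1 2)])   -- prints [3,2]<br />
</haskell><br />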
<br />
<br />
OO classes can be perfectly (even down to the runtime representation!)<br />
emulated in Haskell, but not vice versa. since OO languages tie class<br />
declarations to existentials, they are limited to only the intersection<br />
of their capabilities, because haskell has separate concepts for them,<br />
each is independently much much more powerful.<br />
<br />
<haskell><br />
data CanApply = exists a b . CanApply (a -> b) a (b -> a)<br />
</haskell><br />
<br />
is an example of something that cannot be expressed in OO, existentials<br />
are limited to having exactly a single value since they are tied to a<br />
single dictionary<br />
<br />
<br />
<haskell><br />
class Num a where<br />
(+) :: a -> a -> a<br />
zero :: a<br />
negate :: a -> a<br />
</haskell><br />
<br />
cannot be expressed in OO, because there is no way to pass in the same<br />
dictionary for two elements, or for a returning value to conjure up a<br />
dictionary out of thin air. (if you are not convinced, try writing a<br />
'Number' existential and making it an instance of Num and it will be<br />
clear why it is not possible)<br />
<br />
negate is an interesting one, there is no technical reason it cannot be<br />
implemented in OO languages, but none seem to actually support it.<br />
<br />
<br />
so, when comparing, remember an OO class always corresponds to a haskell<br />
class + a related haskell existential.<br />
<br />
<br />
incidentally, an extension I am working on is to allow<br />
<br />
<haskell><br />
data Sized = exists a . HasSize a => Sized a <br />
deriving(HasSize)<br />
</haskell><br />
<br />
which would have the obvious interpretation, obviously it would only work<br />
under the same limitations as OO classes have, but it would be a simple<br />
way for haskell programs to declare OO style classes if they so choose.<br />
<br />
(actually, it is still significantly more powerful than OO classes since<br />
you can derive many instances, and even declare your own for classes<br />
that don't meet the OO constraints, also, your single class argument need<br />
not appear as the first one. it can appear in any strictly covariant<br />
position, and it can occur as often as you want in contravariant ones!)<br />
<br />
<br />
=== Type classes correspond to parameterized abstract classes (Gabriel Dos Reis) ===<br />
<br />
| > Roughly Haskell type classes correspond to parameterized abstract<br />
| > classes in C++ (i.e. class templates with virtual functions <br />
| > representing the operations). Instance declarations correspond to<br />
| > derivation and implementations of those parameterized classes.<br />
| <br />
| There is a major difference though, in C++ (or java, or sather, or c#,<br />
| etc..) the dictionary is always attached to the value, the actual class<br />
| data type you pass around.<br />
<br />
I suspect that most of the confusion comes from the fact that people<br />
believe just because virtual functions are attached to objects, <br />
they cannot attach them to operations outside classes. That, to my<br />
surprise, hints at a deeper misappreciation of both type classes and<br />
so-called "OO" technology. Type classes are more OO than one might<br />
realize. <br />
<br />
The dictionary can be attached to the operations (not just to the values) by<br />
using objects local to functions (which sort of materialize the<br />
dictionary). Consider<br />
<br />
// Abstract class for a collection of classes that implement<br />
// the "Num" mathematical structure<br />
template<typename T><br />
struct Num {<br />
virtual T add(T, T) const = 0;<br />
};<br />
<br />
// Every type must specialize this class template to assert<br />
// membership to the "Num" structure. <br />
template<typename T> struct Num_instance;<br />
<br />
// The operation "+" is defined for any type that belongs to "Num".<br />
// Notice, membership is asserted by specializing Num_instance<>.<br />
template<typename T><br />
T operator+(T lhs, T rhs)<br />
{<br />
const Num_instance<T> instance; <br />
return instance.add(lhs, rhs);<br />
}<br />
<br />
// "Foo" is in "Num"<br />
template<><br />
struct Num_instance<Foo> : Num<Foo> {<br />
Foo add(Foo a, Foo b) const { ... }<br />
};<br />
<br />
<br />
The key here is in the definition of operator+ which is just a formal<br />
name for the real operation done by instance.add().<br />
<br />
I appreciate that inferring and building the dictionary (represented<br />
here by the "instance" local to operator+<T>) is done automatically by<br />
the Haskell type system.<br />
That is one of the reasons why the type class notation is a nice sugar.<br />
However, that should not distract from its deeper OO semantics.<br />
<br />
<br />
[...]<br />
<br />
| in haskell you can do<br />
| <br />
| class Monoid a where<br />
| mempty :: a<br />
| <br />
| in OOP, this cannot be done because where does the dictionary come from?<br />
<br />
See above. I believe a key in my suggestion was "parameterized<br />
abstract classes", not just "abstract classes".<br />
<br />
<br />
<br />
== Haskell emulation of OOP inheritance with record extension ==<br />
<br />
Brian Hulley provided us with code that shows how OOP inheritance can be<br />
emulated in Haskell. His translation method supports inheritance of data<br />
fields, although it doesn't support downcasting.<br />
<br />
> although i mentioned not only pluses but also drawbacks of type<br />
> classes: lack of record extension mechanisms (such as that implemented<br />
> in O'Haskell) and therefore inability to reuse operation<br />
> implementation in a derived data type...<br />
<br />
You can reuse ops in a derived data type but it involves a tremendous amount <br />
of boilerplate. Essentially, you just use the type classes to simulate <br />
extendable records by having a method in each class that accesses the <br />
fixed-length record corresponding to that particular C++ class.<br />
<br />
Here is an example (apologies for the length!) which shows a super class <br />
function being overridden in a derived class and a derived class method <br />
(B::Extra) making use of something implemented in the super class:<br />
<br />
<haskell><br />
{-# LANGUAGE ExistentialQuantification #-}<br />
module Main where<br />
<br />
{- Haskell translation of the following C++<br />
<br />
class A {<br />
public:<br />
String s;<br />
Int i;<br />
<br />
A(String s, Int i) : s(s), i(i){}<br />
<br />
virtual void Display(){<br />
printf("A %s %d\n", s.c_str(), i);<br />
}<br />
<br />
virtual Int Reuse(){<br />
return i * 100;<br />
}<br />
};<br />
<br />
<br />
class B: public A{<br />
public:<br />
Char c;<br />
<br />
B(String s, Int i, Char c) : A(s, i), c(c){}<br />
<br />
virtual void Display(){<br />
printf("B %s %d %c", s.c_str(), i, c);<br />
}<br />
<br />
virtual void Extra(){<br />
printf("B Extra %d\n", Reuse());<br />
}<br />
<br />
};<br />
<br />
-}<br />
<br />
data A = A<br />
{ _A_s :: String<br />
, _A_i :: Int<br />
}<br />
<br />
-- This could do arg checking etc<br />
constructA :: String -> Int -> A<br />
constructA = A<br />
<br />
<br />
class ClassA a where<br />
getA :: a -> A<br />
<br />
display :: a -> IO ()<br />
display a = do<br />
let<br />
A{_A_s = s, _A_i = i} = getA a<br />
putStrLn $ "A " ++ s ++ show i<br />
<br />
reuse :: a -> Int<br />
reuse a = _A_i (getA a) * 100<br />
<br />
<br />
data WrapA = forall a. ClassA a => WrapA a<br />
<br />
instance ClassA WrapA where<br />
getA (WrapA a) = getA a<br />
display (WrapA a) = display a<br />
reuse (WrapA a) = reuse a<br />
<br />
instance ClassA A where<br />
getA = id<br />
<br />
<br />
data B = B { _B_A :: A, _B_c :: Char }<br />
<br />
<br />
constructB :: String -> Int -> Char -> B<br />
constructB s i c = B {_B_A = constructA s i, _B_c = c}<br />
<br />
class ClassA b => ClassB b where<br />
getB :: b -> B<br />
<br />
extra :: b -> IO ()<br />
extra b = do<br />
putStrLn $ "B Extra " ++ show (reuse b)<br />
<br />
data WrapB = forall b. ClassB b => WrapB b<br />
<br />
instance ClassB WrapB where<br />
getB (WrapB b) = getB b<br />
extra (WrapB b) = extra b<br />
<br />
instance ClassA WrapB where<br />
getA (WrapB b) = getA b<br />
display (WrapB b) = display b<br />
reuse (WrapB b) = reuse b<br />
<br />
instance ClassB B where<br />
getB = id<br />
<br />
instance ClassA B where<br />
getA = _B_A<br />
<br />
-- override the base class version<br />
display b = putStrLn $<br />
"B " ++ _A_s (getA b)<br />
++ show (_A_i (getA b))<br />
++ [_B_c (getB b)]<br />
<br />
<br />
main :: IO ()<br />
main = do<br />
let<br />
a = constructA "a" 0<br />
b = constructB "b" 1 '*'<br />
<br />
col = [WrapA a, WrapA b]<br />
<br />
mapM_ display col<br />
putStrLn ""<br />
mapM_ (putStrLn . show . reuse) col<br />
putStrLn ""<br />
extra b<br />
<br />
{- Output:<br />
<br />
> ghc -fglasgow-exts --make Main<br />
> main<br />
A a0<br />
B b1*<br />
<br />
0<br />
100<br />
<br />
B Extra 100<br />
<br />
><br />
-}<br />
</haskell><br />
<br />
(If the "caseless underscore" Haskell' ticket is accepted the leading <br />
underscores would have to be replaced by something like "_f" ie _A_s ---> <br />
_fA_s etc)<br />
<br />
<br />
<br />
== Type class system extensions ==<br />
<br />
Brief list of extensions, their abbreviated names and compatibility level<br />
<br />
* Constructor classes (Haskell'98)<br />
* MPTC: multi-parameter type classes (Hugs/GHC extension)<br />
* FD: functional dependencies (Hugs/GHC extension)<br />
* AT: associated types (GHC 6.6 only)<br />
* Overlapped, undecidable and incoherent instances (Hugs/GHC extension)<br />
<br />
<br />
== Literature ==<br />
<br />
The paper that first introduced type classes and their implementation<br />
using dictionaries was Philip Wadler and Stephen Blott, "How to make ad-hoc polymorphism less ad-hoc" (http://homepages.inf.ed.ac.uk/wadler/papers/class/class.ps.gz).<br />
<br />
You can find more papers on the [http://haskell.org/haskellwiki/Research_papers/Type_systems#Type_classes Type classes] page.<br />
<br />
I thank Ralf Lammel and Klaus Ostermann for their paper<br />
"Software Extension and Integration with Type Classes" (http://homepages.cwi.nl/~ralf/gpce06/), which prompted me to start thinking about the differences between OOP and type classes instead of their similarities.<br />
<br />
[[Category:Tutorials]]</div>Dibblegohttps://wiki.haskell.org/index.php?title=OOP_vs_type_classes&diff=6843OOP vs type classes2006-10-10T07:47:26Z<p>Dibblego: </p>
<hr />
<div>(this is just a sketch now. feel free to edit/comment it. I will include information you provided into the final version of this tutorial)<br />
<br />
<br />
I had generally not used type classes in my application programs, but when<br />
I started implementing general-purpose libraries and tried to maintain<br />
as much flexibility as possible, it was natural to start building large<br />
and complex class hierarchies. I tried to use my C++ experience when<br />
doing this, but I was bitten many times by the restrictions of type classes. After this experience, I think that I now have a better feel and mental model<br />
for type classes, and I want to share it with other Haskellers,<br />
especially those with OOP backgrounds.<br />
<br />
Brian Hulley provided us with a program that emulates OOP in Haskell; as<br />
you can see, it's much larger than the equivalent C++ program. An equivalent translation from<br />
Haskell to C++ should be even longer :)<br />
<br />
<br />
== Everything is object? ==<br />
<br />
You all know the OOP motto: "everything is an object". When I programmed in<br />
C++, it was really hard to do without classes. But when I - the same<br />
John de Mac-Lee programmer - use Haskell, it's hard to find open<br />
vacancies for type classes. Why is it so different?<br />
<br />
C++ classes pack functions together with data and make it possible to<br />
use different data representations via the same interface. As long as<br />
you need any of these facilities, you are forced to use classes.<br />
Although C++ supports alternative ways to implement such functionality<br />
(function pointers, discriminated unions), these are not as handy<br />
as classes. Classes are also the sole method to hide implementation<br />
details. Moreover, classes represent a handy way to group related<br />
functionality together, so I found myself sometimes developing classes that<br />
contain only static functions just to group them into some<br />
'package'. It's extremely useful to browse the structure of a large C++ project<br />
in terms of classes instead of individual functions.<br />
<br />
Haskell provides other solutions for these problems.<br />
<br />
=== Type with several representations: use algebraic data type (ADT) ===<br />
<br />
For the types with different representations, algebraic data types<br />
(ADT) - an analog of discriminated unions - are supported:<br />
<br />
<haskell><br />
data Point = FloatPoint Float Float<br />
| IntPoint Int Int<br />
</haskell><br />
<br />
Haskell provides a very easy way to build/analyze them:<br />
<br />
<haskell><br />
coord :: Point -> (Float, Float)<br />
coord (FloatPoint x y) = (x,y)<br />
coord (IntPoint x y) = (realToFrac x, realToFrac y)<br />
<br />
main = do print (coord (FloatPoint 1 2))<br />
print (coord (IntPoint 1 2))<br />
</haskell><br />
<br />
So ADTs in general are preferred in Haskell over the class-based<br />
solution of the same problem:<br />
<br />
<haskell><br />
class Point a where<br />
coord :: a -> (Float, Float)<br />
<br />
data FloatPoint = FloatPoint Float Float<br />
instance Point FloatPoint where<br />
coord (FloatPoint x y) = (x,y)<br />
<br />
data IntPoint = IntPoint Int Int<br />
instance Point IntPoint where<br />
coord (IntPoint x y) = (realToFrac x, realToFrac y)<br />
</haskell><br />
<br />
<br />
You can also imagine all the C++ machinery required to implement the<br />
analog of our 5-line, ADT-based solution, which suggests that objects are not<br />
as great as you thought before :D Of course, C++ classes are usually<br />
much larger, but that is again a benefit of Haskell: it's so easy to define<br />
types/functions that you may use much finer granularity.<br />
<br />
<pre><br />
#include <utility><br />
#include <iostream><br />
using namespace std;<br />
<br />
struct Point {<br />
  // pure virtual: each representation supplies its own coord()<br />
  virtual pair<float,float> coord() = 0;<br />
};<br />
<br />
struct FloatPoint : Point {<br />
  float x, y;<br />
  FloatPoint (float _x, float _y) { x=_x; y=_y; }<br />
  pair<float,float> coord() { return make_pair(x,y); }<br />
};<br />
<br />
struct IntPoint : Point {<br />
  int x, y;<br />
  IntPoint (int _x, int _y) { x=_x; y=_y; }<br />
  pair<float,float> coord() { return make_pair(x,y); }<br />
};<br />
<br />
int main () {<br />
  // std::pair has no operator<<, so print the components<br />
  pair<float,float> p = FloatPoint(1,2).coord();<br />
  pair<float,float> q = IntPoint(1,2).coord();<br />
  cout << p.first << " " << p.second << endl;<br />
  cout << q.first << " " << q.second << endl;<br />
}<br />
</pre><br />
<br />
As you see, ADTs together with type inference make Haskell programs<br />
about 2 times smaller than their C++ equivalent.<br />
<br />
<br />
=== Packing data & functions together: use (records of) closures ===<br />
<br />
Another very typical use of classes is to pack data together with one<br />
or more functions that process it and pass this bundle to some<br />
function. That function can then call these functions to implement<br />
some functionality without caring how it is implemented internally.<br />
Fortunately, Haskell provides a better way to implement this: you can<br />
directly pass any functions as parameters to other functions.<br />
Moreover, you can construct the passed functions on the fly and capture in<br />
these definitions the values of identifiers available at the call site,<br />
creating so-called closures. In this way, you construct something like an<br />
object on the fly and don't even need a type class:<br />
<br />
<haskell><br />
do x <- newIORef 0<br />
proc (modifyIORef x (+1), readIORef x)<br />
</haskell><br />
<br />
Here, we passed two routines to proc: one that increments the value of the<br />
counter and another that reads its current value. Another call to proc,<br />
one that uses a counter with locking, might look like this:<br />
<br />
<haskell><br />
do x <- newMVar 0<br />
   proc (modifyMVar_ x (return . (+1)), readMVar x)<br />
</haskell><br />
<br />
Here, proc may be defined as:<br />
<br />
<haskell><br />
proc :: (IO (), IO Int) -> IO ()<br />
proc (inc, read) = do { inc; inc; inc; read >>= print }<br />
</haskell><br />
<br />
<br />
i.e. it receives two abstract operations whose implementation may vary<br />
between different calls to proc, and calls them without any knowledge of the<br />
implementation details. The equivalent C++ code looks like this:<br />
<br />
<pre><br />
struct Counter {<br />
  virtual void inc() = 0;<br />
  virtual int read() = 0;<br />
};<br />
<br />
struct SimpleCounter : Counter {<br />
  int n;<br />
  SimpleCounter () { n=0; }<br />
  void inc() { n++; }<br />
  int read() { return n; }<br />
};<br />
<br />
// take the counter by reference so the derived implementation is used<br />
void proc (Counter& c) {<br />
  c.inc(); c.inc(); c.inc(); cout << c.read();<br />
}<br />
</pre><br />
<br />
And again, the Haskell code is much simpler and more straightforward: we<br />
don't need to declare classes, operations, or their types; we just pass<br />
proc the implementations of the operations it needs. Look at<br />
http://haskell.org/haskellwiki/IO_inside#Example:_returning_an_IO_action_as_a_result<br />
and following sections to find more examples of using closures instead<br />
of OOP classes.<br />
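<br />
The counter fragments in this section can be assembled into a complete program. The following is only a sketch: it bundles the two closures into a record (the Counter type and the names newSimpleCounter, newLockedCounter and readC are assumptions made for this example), and proc returns the final value instead of printing it:<br />
<br />
<haskell><br />
import Data.IORef (newIORef, modifyIORef, readIORef)<br />
import Control.Concurrent.MVar (newMVar, modifyMVar_, readMVar)<br />
<br />
-- A record of closures plays the role of the object's interface.<br />
data Counter = Counter<br />
    { inc   :: IO ()<br />
    , readC :: IO Int<br />
    }<br />
<br />
-- Plain counter; the IORef is captured by both closures.<br />
newSimpleCounter :: IO Counter<br />
newSimpleCounter = do<br />
    x <- newIORef 0<br />
    return (Counter (modifyIORef x (+1)) (readIORef x))<br />
<br />
-- Counter protected by an MVar; same interface, different state.<br />
newLockedCounter :: IO Counter<br />
newLockedCounter = do<br />
    x <- newMVar 0<br />
    return (Counter (modifyMVar_ x (return . (+1))) (readMVar x))<br />
<br />
-- Knows nothing about how the counter is implemented.<br />
proc :: Counter -> IO Int<br />
proc c = do { inc c; inc c; inc c; readC c }<br />
<br />
main :: IO ()<br />
main = do<br />
    a <- newSimpleCounter >>= proc<br />
    b <- newLockedCounter >>= proc<br />
    print (a, b)   -- prints (3,3)<br />
</haskell><br />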
<br />
<br />
=== Hiding implementation details: use module export list ===<br />
<br />
One more use of OOP classes is to hide implementation details, making<br />
internal data/functions inaccessible to class clients. Unfortunately, this<br />
functionality is not part of the type class facilities. Instead, you<br />
should use the sole Haskell method of implementation hiding, the module<br />
export list:<br />
<br />
<haskell><br />
module Stack (Stack, empty, push, pop, top, isEmpty) where<br />
<br />
newtype Stack a = Stk [a]<br />
<br />
empty = Stk []<br />
push x (Stk xs) = Stk (x:xs)<br />
pop (Stk (x:xs)) = Stk xs<br />
top (Stk (x:xs)) = x<br />
isEmpty (Stk xs) = null xs<br />
</haskell><br />
<br />
Since the constructor for the data type Stack is hidden (the export<br />
list would say Stack(Stk) if it were exposed), outside of this module<br />
a stack can only be built from operations empty, push and pop, and<br />
examined with top and isEmpty.<br />
<br />
<br />
=== Grouping related functionality: use module hierarchy and Haddock markup ===<br />
<br />
Dividing a whole program into classes and using their hierarchy to<br />
represent the entire program structure is a great instrument in OOP<br />
languages. Unfortunately, this is again impossible in Haskell. Instead,<br />
program structure is typically rendered in the module hierarchy and, inside a<br />
module, in its export list. Although the Haskell language doesn't provide<br />
facilities to describe hierarchical structure inside a module, we have<br />
another tool to do it: Haddock, the de facto standard documentation tool.<br />
<br />
<haskell><br />
module System.Stream.Instance (<br />
<br />
-- * File is a file stream<br />
File,<br />
-- ** Functions that open files<br />
openFile, -- open file in text mode<br />
openBinaryFile, -- open file in binary mode<br />
-- ** Standard file handles<br />
stdin,<br />
stdout,<br />
stderr,<br />
<br />
-- * MemBuf is a memory buffer stream<br />
MemBuf,<br />
-- ** Functions that open MemBuf<br />
createMemBuf, -- create new MemBuf<br />
createContiguousMemBuf, -- create new contiguous MemBuf<br />
openMemBuf, -- use memory area as MemBuf<br />
<br />
) where<br />
...<br />
</haskell><br />
<br />
Here, Haddock will build the documentation for the module by looking at its<br />
export list. The export list will be divided into sections (whose<br />
headers are given with "-- *") and subsections (given with "-- **"). As<br />
a result, the module documentation reflects its structure without using<br />
classes for this purpose.<br />
<br />
<br />
== Type classes is a sort of templates, not classes ==<br />
<br />
At this moment the C++/C#/Java languages have classes and<br />
templates/generics. What is the difference? With a class, type<br />
information is carried with the object itself, while with templates it's<br />
outside of the object and is part of the whole operation.<br />
<br />
For example, if the == operation is defined in a class, the actual<br />
procedure called for a==b may depend on the run-time type of 'a', but if it<br />
is defined in a template, the actual procedure depends only on the template<br />
instantiation (and is determined at compile time).<br />
<br />
Haskell's objects don't carry run-time type information. Instead, the<br />
class constraint for a polymorphic operation is passed in the form of a<br />
"dictionary" implementing all operations of the class (there are also<br />
other implementation techniques, but this doesn't matter). For example,<br />
<br />
<haskell><br />
eqList :: (Eq a) => [a] -> [a] -> Bool<br />
</haskell><br />
<br />
translated into:<br />
<br />
<haskell><br />
type EqDictionary a = (a->a->Bool, a->a->Bool)<br />
eqList :: EqDictionary a -> [a] -> [a] -> Bool<br />
</haskell><br />
<br />
where the first parameter is a "dictionary" containing the implementation of the<br />
"==" and "/=" operations for objects of type 'a'. If there are several<br />
class constraints, a dictionary for each is passed.<br />
<br />
If a class has base class(es), the dictionary tuple also includes the<br />
dictionaries of the base classes:<br />
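<br />
To make the translation concrete, here is a hand-written version in which the dictionary is an ordinary record passed as the first argument. It is a sketch only: the field names eqD and neqD and the value intEq are invented here, and a record is used instead of the tuple shown above purely for readability:<br />
<br />
<haskell><br />
-- Hypothetical record standing in for the compiler's Eq dictionary.<br />
data EqDictionary a = EqDictionary<br />
    { eqD  :: a -> a -> Bool<br />
    , neqD :: a -> a -> Bool<br />
    }<br />
<br />
-- One dictionary serves every element of both lists.<br />
eqList :: EqDictionary a -> [a] -> [a] -> Bool<br />
eqList d (x:xs) (y:ys) = eqD d x y && eqList d xs ys<br />
eqList _ []     []     = True<br />
eqList _ _      _      = False<br />
<br />
-- What "instance Eq Int" would provide.<br />
intEq :: EqDictionary Int<br />
intEq = EqDictionary (==) (/=)<br />
<br />
main :: IO ()<br />
main = do<br />
    print (eqList intEq [1,2,3] [1,2,3])   -- True<br />
    print (eqList intEq [1,2] [1,2,3])     -- False<br />
</haskell><br />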
<br />
<haskell><br />
class Eq a => Cmp a where<br />
cmp :: a -> a -> Ordering<br />
<br />
cmpList :: (Cmp a) => [a] -> [a] -> Ordering<br />
</haskell><br />
<br />
turns into:<br />
<br />
<haskell><br />
type CmpDictionary a = (EqDictionary a, a -> a -> Ordering)<br />
cmpList :: CmpDictionary a -> [a] -> [a] -> Ordering<br />
</haskell><br />
<br />
<br />
Compared to C++, this is like templates, not classes! As with<br />
templates, typing information is part of the operation, not the object! But<br />
while C++ templates are really a form of macro-processing (like<br />
Template Haskell) and in the end generate non-polymorphic code,<br />
Haskell's use of dictionaries allows run-time polymorphism<br />
(explanation of run-time polymorphism?).<br />
<br />
Moreover, Haskell type classes support inheritance. Run-time<br />
polymorphism together with inheritance are often seen as the distinctive<br />
points of OOP, so for a long time I considered type classes a<br />
form of OOP implementation. But that's wrong! Haskell type classes are<br />
built on a different basis, so they are like C++ templates with added<br />
inheritance and run-time polymorphism! And this means that the usage of<br />
type classes is different from the usage of classes, with its own strong and<br />
weak points.<br />
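<br />
One small illustration of the run-time side, offered as a sketch rather than a full explanation, is polymorphic recursion. In the hypothetical function nest below, each recursive call needs a Show dictionary for a more deeply nested list type, and the depth is an ordinary run-time value, so a fixed set of compile-time instantiations would not be enough in general:<br />
<br />
<haskell><br />
-- The type signature is required: the recursive call is at type [a], not a.<br />
-- Assumes n >= 0.<br />
nest :: Show a => Int -> a -> String<br />
nest 0 x = show x<br />
nest n x = nest (n - 1) [x]<br />
<br />
main :: IO ()<br />
main = mapM_ (\n -> putStrLn (nest n (1 :: Int))) [0, 1, 2]<br />
-- prints:<br />
-- 1<br />
-- [1]<br />
-- [[1]]<br />
</haskell><br />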
<br />
<br />
== Type classes vs classes ==<br />
<br />
Here is a brief listing of differences between OOP classes and Haskell type classes<br />
<br />
=== Type classes is like interfaces/abstract classes, not classes itself ===<br />
<br />
There is no inheritance of data fields, and no data fields at all<br />
(so type classes are more like interfaces than like classes themselves)....<br />
<br />
<br />
For those more familiar with Java/C# rather than C++, type classes resemble interfaces more than the classes. In fact, the generics in those languages capture the notion of parametric polymorphism (but Haskell is a language that takes parametric polymorphism quite seriously, so you can expect a fair amount of type gymnastics when dealing with Haskell), so more precisely, type classes are like generic interfaces.<br />
<br />
Why interface, and not class? Mostly because type classes do not implement the methods themselves, they just guarantee that the actual types that instantiate the type class will implement specific methods. So the types are like classes in Java/C#.<br />
<br />
One added twist: type classes can decide to provide default implementation of some methods (using other methods). You would say, then they are sort of like abstract classes. Right. But at the same time, you cannot extend (inherit) multiple abstract classes, can you?<br />
<br />
So a type class is sort of like a contract: "any type that instantiates this type class will have the following functions defined on them..." but with the added advantage that you have type parameters built-in, so:<br />
<br />
<haskell><br />
class Eq a where<br />
(==) :: a -> a -> Bool<br />
(/=) :: a -> a -> Bool<br />
-- let's just implement one function in terms of the other<br />
x /= y = not (x == y) <br />
</haskell><br />
<br />
is, in a Java-like language:<br />
<br />
'''interface''' Eq<A> {<br />
'''boolean''' equal(A that);<br />
'''boolean''' notEqual(A that) { <br />
''// default, can be overridden''<br />
'''return''' !equal(that); <br />
} <br />
}<br />
<br />
The "instance TypeClass ParticularInstance where ..." definition then means "ParticularInstance implements TypeClass { ... }". Multi-parameter type classes, of course, cannot be interpreted this way.<br />
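<br />
A complete, runnable version of this contract might look as follows. To avoid clashing with the Prelude's Eq, the sketch uses the made-up names MyEq, eq and neq, and a hypothetical Color type as the implementing type:<br />
<br />
<haskell><br />
class MyEq a where<br />
    eq  :: a -> a -> Bool<br />
    neq :: a -> a -> Bool<br />
    -- default implementation, can be overridden in an instance<br />
    neq x y = not (eq x y)<br />
<br />
data Color = Red | Green<br />
<br />
-- "Color implements MyEq": only eq is supplied, neq comes from the default.<br />
instance MyEq Color where<br />
    eq Red   Red   = True<br />
    eq Green Green = True<br />
    eq _     _     = False<br />
<br />
main :: IO ()<br />
main = print (neq Red Green, neq Red Red)   -- prints (True,False)<br />
</haskell><br />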
<br />
<br />
=== Type can appear at any place in function signature ===<br />
Type can appear at any place in function signature: be any<br />
parameter, inside parameter, in a list (possibly empty), or in a result<br />
<br />
<haskell><br />
class C a where<br />
f :: a -> Int<br />
g :: Int -> a -> Int<br />
h :: Int -> (Int,a) -> Int<br />
i :: [a] -> Int<br />
j :: Int -> a<br />
new :: a<br />
</haskell><br />
<br />
It's even possible to define instance-specific constants (look at 'new').<br />
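<br />
As a minimal sketch of such a constant, consider a class whose only member takes no arguments at all. The class Default and its member def are hypothetical names; which instance is used is decided purely by the type expected at the use site:<br />
<br />
<haskell><br />
class Default a where<br />
    def :: a<br />
<br />
instance Default Int  where def = 0<br />
instance Default Bool where def = False<br />
<br />
main :: IO ()<br />
main = print (def :: Int, def :: Bool)   -- prints (0,False)<br />
</haskell><br />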
<br />
If a function's value is instance-specific, an OOP programmer will use a<br />
"static" method, while with type classes you need to use a fake<br />
parameter:<br />
<br />
<haskell><br />
import Data.Int (Int8, Int16)<br />
<br />
class FixedSize a where<br />
  sizeof :: a -> Int<br />
instance FixedSize Int8 where<br />
  sizeof _ = 1<br />
instance FixedSize Int16 where<br />
  sizeof _ = 2<br />
<br />
main = do print (sizeof (undefined::Int8))<br />
          print (sizeof (undefined::Int16))<br />
</haskell><br />
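<br />
On newer GHC versions the fake parameter is often replaced by a 'Proxy' value, which carries the type without needing 'undefined'. The following is only a sketch of that alternative and is not part of the original example; it assumes the Data.Proxy module from base, and the method name 'sizeOf' and the list 'sizes' are made up for illustration:<br />
<br />
<haskell><br />
import Data.Int (Int8, Int16)<br />
import Data.Proxy (Proxy (..))<br />
<br />
-- Same idea as FixedSize above, but the argument is a Proxy,<br />
-- so no value of the type is ever needed (assumption: base >= 4.7).<br />
class FixedSize a where<br />
  sizeOf :: Proxy a -> Int<br />
<br />
instance FixedSize Int8 where<br />
  sizeOf _ = 1<br />
<br />
instance FixedSize Int16 where<br />
  sizeOf _ = 2<br />
<br />
-- The type annotation on Proxy selects the instance.<br />
sizes :: [Int]<br />
sizes = [sizeOf (Proxy :: Proxy Int8), sizeOf (Proxy :: Proxy Int16)]<br />
</haskell><br />
<br />
The instance is still chosen purely from the type; the proxy merely gives the compiler something to hang that type on.<br />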
<br />
<br />
=== Inheritance between interfaces ===<br />
Inheritance between interfaces (in a "class" declaration) means<br />
inclusion of the base class dictionaries in the dictionary of the subclass:<br />
<br />
<haskell><br />
class (Show s, Monad m) => Stream m s where<br />
  sClose :: s -> m ()<br />
</haskell><br />
<br />
means<br />
<br />
<haskell><br />
type StreamDictionary m s = (ShowDictionary s, MonadDictionary m, s->m())<br />
</haskell><br />
<br />
There is an upcasting mechanism: it simply extracts the dictionary of a base<br />
class from the dictionary tuple, so you can call a function that requires<br />
the base class from a function that requires the subclass:<br />
<br />
<haskell><br />
f :: (Stream m s) => s -> m String<br />
f s = return (show s)   -- uses show :: (Show s) => s -> String<br />
</haskell><br />
<br />
But downcasting is absolutely impossible: there is no way to get a<br />
subclass dictionary from a superclass one.<br />
<br />
<br />
<br />
=== Inheritance between instances ===<br />
Inheritance between instances (in an "instance" declaration) means<br />
that the operations of one class can be executed via the operations of another<br />
class, i.e. such a declaration describes a way to compute the dictionary of the<br />
inherited class via functions from the dictionary of the base class:<br />
<br />
<haskell><br />
class Eq a where<br />
  (==) :: a -> a -> Bool<br />
class Cmp a where<br />
  cmp :: a -> a -> Ordering<br />
instance (Cmp a) => Eq a where<br />
  a==b  =  cmp a b == EQ<br />
</haskell><br />
<br />
creates the following function:<br />
<br />
<haskell><br />
cmpDict2EqDict :: CmpDictionary a -> EqDictionary a<br />
cmpDict2EqDict (cmp) = (\a b -> cmp a b == EQ)<br />
</haskell><br />
<br />
As a result, any function that receives a dictionary for the Cmp class<br />
can call functions that require a dictionary of the Eq class.<br />
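<br />
The translation can be imitated by hand, with ordinary records playing the role of dictionaries. This is a hand-written sketch, not what a compiler literally generates; the names EqDict, CmpDict, elemBy, member and intCmp are invented for illustration:<br />
<br />
<haskell><br />
-- Hypothetical explicit dictionaries for the two classes.<br />
newtype EqDict a  = EqDict  { eq  :: a -> a -> Bool }<br />
newtype CmpDict a = CmpDict { cmp :: a -> a -> Ordering }<br />
<br />
-- The "instance (Cmp a) => Eq a" declaration as a plain function.<br />
cmpDictToEqDict :: CmpDict a -> EqDict a<br />
cmpDictToEqDict d = EqDict (\x y -> cmp d x y == EQ)<br />
<br />
-- A function that needs only the Eq dictionary.<br />
elemBy :: EqDict a -> a -> [a] -> Bool<br />
elemBy d x = any (eq d x)<br />
<br />
-- A function that holds a Cmp dictionary can still call elemBy<br />
-- by converting its dictionary first.<br />
member :: CmpDict a -> a -> [a] -> Bool<br />
member d = elemBy (cmpDictToEqDict d)<br />
<br />
-- A Cmp dictionary for Int, built from the Prelude's compare.<br />
intCmp :: CmpDict Int<br />
intCmp = CmpDict compare<br />
</haskell><br />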
<br />
<br />
=== Downcasting is a mission impossible ===<br />
<br />
Selection between instances is done at compile time, based only on<br />
information available at that moment. So don't expect that a more specific<br />
instance will be selected just because you passed a concrete<br />
datatype to a function which accepts some general class:<br />
<br />
<haskell><br />
class Foo a where<br />
  foo :: a -> String<br />
<br />
instance (Num a) => Foo a where<br />
  foo _ = "Num"<br />
<br />
instance Foo Int where<br />
  foo _ = "int"<br />
<br />
f :: (Num a) => a -> String<br />
f = foo<br />
<br />
main = do print (foo (1::Int))<br />
          print (f (1::Int))<br />
</haskell><br />
<br />
Here, the first call will return "int", but the second only "Num".<br />
This is easily explained by the dictionary-based translation<br />
described above. Once you've passed data to a polymorphic procedure,<br />
its type is completely lost; there is only dictionary information, so<br />
the instance for Int can't be applied. The only way to construct a Foo<br />
dictionary is by calculating it from the Num dictionary using the first<br />
instance.<br />
<br />
<br />
=== There is only one dictionary per function call ===<br />
For "eqList :: (Eq a) => [a] -> [a] -> Bool", the types of all elements<br />
in each list must be the same, and the types of both arguments must be the same<br />
too: there is only one dictionary, and it knows how to handle values<br />
of only one concrete type!<br />
<br />
=== Existential variables are more like OOP objects ===<br />
Existential variables pack a dictionary together with a variable (which looks<br />
very much like the object concept!), so it's possible to create polymorphic<br />
containers (i.e. containers holding variables of different types). But<br />
downcasting is still impossible. Also, existentials still don't allow<br />
mixing variables of different types in a call to some polymorphic operation<br />
(each personal dictionary is still built for variables of one concrete type):<br />
<br />
<haskell><br />
data HasCmp = forall a. Cmp a => HasCmp a<br />
<br />
sorted :: [HasCmp] -> Bool<br />
<br />
sorted []  = True<br />
sorted [_] = True<br />
sorted (HasCmp a : HasCmp b : xs)  =  a<=b && sorted (HasCmp b : xs)<br />
</haskell><br />
<br />
This code will not work: a<=b can use neither the dictionary of 'a' nor that of 'b'.<br />
Even if orderings for apples and penguins are defined, we still don't have<br />
a method to compare a penguin with an apple!<br />
<br />
<br />
== Other opinions ==<br />
<br />
=== OO class always corresponds to a haskell class + a related haskell existential (John Meacham) ===<br />
<br />
> Roughly Haskell type classes correspond to parameterized abstract<br />
> classes in C++ (i.e. class templates with virtual functions <br />
> representing the operations). Instance declarations correspond to<br />
> derivation and implementations of those parameterized classes.<br />
<br />
There is a major difference though, in C++ (or java, or sather, or c#,<br />
etc..) the dictionary is always attached to the value, the actual class<br />
data type you pass around. in haskell, the dictionary is passed<br />
separately and the appropriate one is inferred by the type system. C++<br />
doesn't infer, it just assumes everything will be carrying around its<br />
dictionary with it.<br />
<br />
this makes haskell classes significantly more powerful in many ways.<br />
<br />
<haskell><br />
class Num a where<br />
  (+) :: a -> a -> a<br />
</haskell><br />
<br />
is impossible to express in OO classes, since both arguments to +<br />
necessarily carry their dictionaries with them, there is no way to<br />
statically guarantee they have the same one. Haskell will pass a single<br />
dictionary that is shared by both types so it can handle this just fine.<br />
<br />
in haskell you can do<br />
<br />
 class Monoid a where<br />
   mempty :: a<br />
<br />
in OOP, this cannot be done because where does the dictionary come from?<br />
since dictionaries are always attached to a concrete class, every method<br />
must take at least one argument of the class type (in fact, exactly one,<br />
as I'll show below). In haskell again, this is not a problem since the<br />
dictionary is passed in by the consumer of 'mempty', mempty need not<br />
conjure one out of thin air.<br />
<br />
<br />
In fact, OO classes can only express single parameter type classes where<br />
the type argument appears exactly once in strictly covariant position.<br />
in particular, it is pretty much always the first argument and often<br />
(but not always) named 'self' or 'this'.<br />
<br />
<br />
<haskell><br />
class HasSize a where<br />
  getSize :: a -> Int<br />
</haskell><br />
<br />
can be expressed in OO, 'a' appears only once, as its first argument.<br />
<br />
<br />
Now, another thing OO classes can do is they give you the ability to<br />
create existential collections (?) of objects. as in, you can have a<br />
list of things that have a size. In haskell, the ability to do this is<br />
independent of the class (which is why haskell classes can be more<br />
powerful) and is appropriately named existential types.<br />
<br />
<haskell><br />
data Sized = exists a . HasSize a => Sized a <br />
</haskell><br />
<br />
what does this give you? you can now create a list of things that have a<br />
size [Sized] yay!<br />
<br />
and you can declare an instance for sized, so you can use all your<br />
methods on it.<br />
<br />
<haskell><br />
instance HasSize Sized where<br />
  getSize (Sized a) = getSize a<br />
</haskell><br />
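<br />
GHC does not accept the 'exists' keyword used above; there the same idea is usually written with 'forall' in front of the constructor, under the ExistentialQuantification extension. A minimal sketch under that assumption follows; the Bool and list instances and the totalSize helper are invented for illustration:<br />
<br />
<haskell><br />
{-# LANGUAGE ExistentialQuantification #-}<br />
<br />
class HasSize a where<br />
  getSize :: a -> Int<br />
<br />
-- Two unrelated instances, made up for this example.<br />
instance HasSize Bool where<br />
  getSize _ = 1<br />
<br />
instance HasSize [b] where<br />
  getSize = length<br />
<br />
-- The existential wrapper: a value packed together with its dictionary.<br />
data Sized = forall a. HasSize a => Sized a<br />
<br />
instance HasSize Sized where<br />
  getSize (Sized a) = getSize a<br />
<br />
-- A heterogeneous list is now possible.<br />
totalSize :: [Sized] -> Int<br />
totalSize = sum . map getSize<br />
</haskell><br />
<br />
For instance, totalSize can be applied to a list such as [Sized True, Sized "abc"], whose elements have different underlying types.<br />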
<br />
<br />
an existential, like Sized, is a value that is passed around with its<br />
dictionary in tow, as in, it is an OO class! I think this is where<br />
people get confused when comparing OO classes to haskell classes. _there<br />
is no way to do so without bringing existentials into play_. OO classes<br />
are inherently existential in nature.<br />
<br />
so, an OO abstract class declaration declares the equivalent of 3 things<br />
in haskell: a class to establish the methods, an existential type to<br />
carry the values about, and an instance of the class for the existential<br />
type.<br />
<br />
an OO concrete class declares all of the above plus a data declaration<br />
for some concrete representation.<br />
<br />
<br />
OO classes can be perfectly (even down to the runtime representation!)<br />
emulated in Haskell, but not vice versa. since OO languages tie class<br />
declarations to existentials, they are limited to only the intersection<br />
of their capabilities, because haskell has separate concepts for them,<br />
each is independently much much more powerful.<br />
<br />
 data CanApply = exists a b . CanApply (a -> b) a (b -> a)<br />
<br />
is an example of something that cannot be expressed in OO, existentials<br />
are limited to having exactly a single value since they are tied to a<br />
single dictionary<br />
<br />
<br />
<haskell><br />
class Num a where<br />
  (+) :: a -> a -> a<br />
  zero :: a<br />
  negate :: a -> a<br />
</haskell><br />
<br />
cannot be expressed in OO, because there is no way to pass in the same<br />
dictionary for two elements, or for a returning value to conjure up a<br />
dictionary out of thin air. (if you are not convinced, try writing a<br />
'Number' existential and making it an instance of Num and it will be<br />
clear why it is not possible)<br />
<br />
negate is an interesting one, there is no technical reason it cannot be<br />
implemented in OO languages, but none seem to actually support it.<br />
<br />
<br />
so, when comparing, remember an OO class always corresponds to a haskell<br />
class + a related haskell existential.<br />
<br />
<br />
incidentally, an extension I am working on is to allow<br />
<br />
<haskell><br />
data Sized = exists a . HasSize a => Sized a <br />
  deriving(HasSize)<br />
</haskell><br />
<br />
which would have the obvious interpretation, obviously it would only work<br />
under the same limitations as OO classes have, but it would be a simple<br />
way for haskell programs to declare OO style classes if they so choose.<br />
<br />
(actually, it is still significantly more powerful than OO classes since<br />
you can derive many instances, and even declare your own for classes<br />
that don't meet the OO constraints, also, your single class argument need<br />
not appear as the first one. it can appear in any strictly covariant<br />
position, and it can occur as often as you want in contravariant ones!)<br />
<br />
<br />
=== Type classes correspond to parameterized abstract classes (Gabriel Dos Reis) ===<br />
<br />
| > Roughly Haskell type classes correspond to parameterized abstract<br />
| > classes in C++ (i.e. class templates with virtual functions <br />
| > representing the operations). Instance declarations correspond to<br />
| > derivation and implementations of those parameterized classes.<br />
| <br />
| There is a major difference though, in C++ (or java, or sather, or c#,<br />
| etc..) the dictionary is always attached to the value, the actual class<br />
| data type you pass around.<br />
<br />
I suspect that most of the confusion comes from the fact that people<br />
believe just because virtual functions are attached to objects, <br />
they cannot attach them to operations outside classes. That, to my<br />
surprise, hints at a deeper misappreciation of both type classes and<br />
so-called "OO" technology. Type classes are more OO than one might<br />
realize. <br />
<br />
The dictionary can be attached to the operations (not just to the values) by<br />
using objects local to functions (which sort of materialize the<br />
dictionary). Consider<br />
<br />
 // Abstract class for a collection of classes that implement<br />
 // the "Num" mathematical structure<br />
 template<typename T><br />
 struct Num {<br />
     virtual T add(T, T) const = 0;<br />
 };<br />
<br />
 // Every type must specialize this class template to assert<br />
 // membership to the "Num" structure.<br />
 template<typename T> struct Num_instance;<br />
<br />
 // The operation "+" is defined for any type that belongs to "Num".<br />
 // Notice, membership is asserted by specializing Num_instance<>.<br />
 template<typename T><br />
 T operator+(T lhs, T rhs)<br />
 {<br />
     const Num_instance<T> instance;<br />
     return instance.add(lhs, rhs);<br />
 }<br />
<br />
 // "Foo" is in "Num"<br />
 template<><br />
 struct Num_instance<Foo> : Num<Foo> {<br />
     Foo add(Foo a, Foo b) const { ... }<br />
 };<br />
<br />
<br />
The key here is in the definition of operator+ which is just a formal<br />
name for the real operation done by instance.add().<br />
<br />
I appreciate that inferring and building the dictionary (represented<br />
here by the "instance" local to operator+<T>) is done automatically by<br />
the Haskell type system.<br />
That is one of the reasons why the type class notation is a nice sugar.<br />
However, that should not distract from its deeper OO semantics.<br />
<br />
<br />
[...]<br />
<br />
| in haskell you can do<br />
| <br />
| class Monoid a where<br />
| mempty :: a<br />
| <br />
| in OOP, this cannot be done because where does the dictionary come from?<br />
<br />
See above. I believe a key in my suggestion was "parameterized<br />
abstract classes", not just "abstract classes".<br />
<br />
<br />
<br />
== Haskell emulation of OOP inheritance with record extension ==<br />
<br />
Brian Hulley provided code that shows how OOP inheritance can be<br />
emulated in Haskell. His translation method supports inheritance of data<br />
fields, although it doesn't support downcasting.<br />
<br />
> although i mentioned not only pluses but also drawbacks of type<br />
> classes: lack of record extension mechanisms (such as that implemented<br />
> in O'Haskell) and therefore inability to reuse operation<br />
> implementation in a derived data type...<br />
<br />
You can reuse ops in a derived data type but it involves a tremendous amount <br />
of boilerplate. Essentially, you just use the type classes to simulate <br />
extendable records by having a method in each class that accesses the <br />
fixed-length record corresponding to that particular C++ class.<br />
<br />
Here is an example (apologies for the length!) which shows a super class <br />
function being overridden in a derived class and a derived class method <br />
(B::Extra) making use of something implemented in the super class:<br />
<br />
<haskell><br />
module Main where<br />
<br />
{- Haskell translation of the following C++<br />
<br />
class A {<br />
public:<br />
    String s;<br />
    Int i;<br />
<br />
    A(String s, Int i) : s(s), i(i){}<br />
<br />
    virtual void Display(){<br />
        printf("A %s %d\n", s.c_str(), i);<br />
    }<br />
<br />
    virtual Int Reuse(){<br />
        return i * 100;<br />
    }<br />
};<br />
<br />
<br />
class B: public A{<br />
public:<br />
    Char c;<br />
<br />
    B(String s, Int i, Char c) : A(s, i), c(c){}<br />
<br />
    virtual void Display(){<br />
        printf("B %s %d %c", s.c_str(), i, c);<br />
    }<br />
<br />
    virtual void Extra(){<br />
        printf("B Extra %d\n", Reuse());<br />
    }<br />
<br />
};<br />
<br />
-}<br />
<br />
data A = A<br />
    { _A_s :: String<br />
    , _A_i :: Int<br />
    }<br />
<br />
-- This could do arg checking etc<br />
constructA :: String -> Int -> A<br />
constructA = A<br />
<br />
<br />
class ClassA a where<br />
    getA :: a -> A<br />
<br />
    display :: a -> IO ()<br />
    display a = do<br />
        let<br />
            A{_A_s = s, _A_i = i} = getA a<br />
        putStrLn $ "A " ++ s ++ show i<br />
<br />
    reuse :: a -> Int<br />
    reuse a = _A_i (getA a) * 100<br />
<br />
<br />
data WrapA = forall a. ClassA a => WrapA a<br />
<br />
instance ClassA WrapA where<br />
    getA (WrapA a) = getA a<br />
    display (WrapA a) = display a<br />
    reuse (WrapA a) = reuse a<br />
<br />
instance ClassA A where<br />
    getA = id<br />
<br />
<br />
data B = B { _B_A :: A, _B_c :: Char }<br />
<br />
<br />
constructB :: String -> Int -> Char -> B<br />
constructB s i c = B {_B_A = constructA s i, _B_c = c}<br />
<br />
class ClassA b => ClassB b where<br />
    getB :: b -> B<br />
<br />
    extra :: b -> IO ()<br />
    extra b = do<br />
        putStrLn $ "B Extra " ++ show (reuse b)<br />
<br />
data WrapB = forall b. ClassB b => WrapB b<br />
<br />
instance ClassB WrapB where<br />
    getB (WrapB b) = getB b<br />
    extra (WrapB b) = extra b<br />
<br />
instance ClassA WrapB where<br />
    getA (WrapB b) = getA b<br />
    display (WrapB b) = display b<br />
    reuse (WrapB b) = reuse b<br />
<br />
instance ClassB B where<br />
    getB = id<br />
<br />
instance ClassA B where<br />
    getA = _B_A<br />
<br />
    -- override the base class version<br />
    display b = putStrLn $<br />
        "B " ++ _A_s (getA b)<br />
        ++ show (_A_i (getA b))<br />
        ++ [_B_c (getB b)]<br />
<br />
<br />
main :: IO ()<br />
main = do<br />
    let<br />
        a = constructA "a" 0<br />
        b = constructB "b" 1 '*'<br />
<br />
        col = [WrapA a, WrapA b]<br />
<br />
    mapM_ display col<br />
    putStrLn ""<br />
    mapM_ (putStrLn . show . reuse) col<br />
    putStrLn ""<br />
    extra b<br />
<br />
{- Output:<br />
<br />
> ghc -fglasgow-exts --make Main<br />
> main<br />
A a0<br />
B b1*<br />
<br />
0<br />
100<br />
<br />
B Extra 100<br />
<br />
><br />
-}<br />
</haskell><br />
<br />
(If the "caseless underscore" Haskell' ticket is accepted the leading <br />
underscores would have to be replaced by something like "_f" ie _A_s ---> <br />
_fA_s etc)<br />
<br />
<br />
<br />
== Type class system extensions ==<br />
<br />
Brief list of extensions, their abbreviated names and compatibility level:<br />
<br />
* Constructor classes (Haskell'98)<br />
* MPTC: multi-parameter type classes (Hugs/GHC extension)<br />
* FD: functional dependencies (Hugs/GHC extension)<br />
* AT: associated types (GHC 6.6 only)<br />
* Overlapping, undecidable and incoherent instances (Hugs/GHC extension)<br />
<br />
<br />
== Literature ==<br />
<br />
The paper that first introduced type classes and their implementation<br />
using dictionaries was Philip Wadler and Stephen Blott, "How to make ad-hoc polymorphism less ad-hoc" (http://homepages.inf.ed.ac.uk/wadler/papers/class/class.ps.gz)<br />
<br />
You can find more papers on the [http://haskell.org/haskellwiki/Research_papers/Type_systems#Type_classes Type classes] page.<br />
<br />
I thank Ralf Lammel and Klaus Ostermann for their paper<br />
"Software Extension and Integration with Type Classes" (http://homepages.cwi.nl/~ralf/gpce06/), which prompted me to start thinking about the differences between OOP and type classes instead of their similarities.<br />
<br />
[[Category:Tutorials]]</div>Dibblegohttps://wiki.haskell.org/index.php?title=Roll_your_own_IRC_bot&diff=6533Roll your own IRC bot2006-10-05T21:44:16Z<p>Dibblego: </p>
<hr />
<div>This tutorial is designed as a practical guide to writing real world<br />
code in [http://haskell.org Haskell] and hopes to intuitively motivate<br />
and introduce some of the advanced features of Haskell to the novice<br />
programmer. Our goal is to write a concise, robust and elegant<br />
[http://haskell.org/haskellwiki/IRC_channel IRC] bot in Haskell.<br />
<br />
== Getting started ==<br />
<br />
You'll need a reasonably recent version of [http://haskell.org/ghc GHC]<br />
or [http://haskell.org/hugs Hugs]. Our first step is to get on the<br />
network. So let's start by importing the Network package, and the<br />
standard IO library and defining a server to connect to.<br />
<br />
<haskell><br />
import Network<br />
import System.IO<br />
<br />
server = "irc.freenode.org"<br />
port = 6667<br />
<br />
main = do<br />
    h <- connectTo server (PortNumber (fromIntegral port))<br />
    hSetBuffering h NoBuffering<br />
    t <- hGetContents h<br />
    print t<br />
</haskell><br />
<br />
The key here is the <hask>main</hask> function. This is the entry point<br />
to a Haskell program. We first connect to the server, then set the<br />
buffering on the socket off. Once we've got a socket, we can then just<br />
read and print any data we receive.<br />
<br />
Put this code in the module <hask>1.hs</hask> and we can then run it.<br />
Use whichever system you like:<br />
<br />
Using runhaskell:<br />
<br />
 $ runhaskell 1.hs<br />
 "NOTICE AUTH :*** Looking up your hostname...\r\nNOTICE AUTH :***<br />
 Checking ident\r\nNOTICE AUTH :*** Found your hostname\r\n ...<br />
<br />
Or we can just compile it to an executable with GHC:<br />
<br />
 $ ghc --make 1.hs -o tutbot<br />
 Chasing modules from: 1.hs<br />
 Compiling Main             ( 1.hs, 1.o )<br />
 Linking ...<br />
 $ ./tutbot<br />
 "NOTICE AUTH :*** Looking up your hostname...\r\nNOTICE AUTH :***<br />
 Checking ident\r\nNOTICE AUTH :*** Found your hostname\r\n ...<br />
<br />
Or using GHCi:<br />
<br />
 $ ghci 1.hs<br />
 *Main> main<br />
 "NOTICE AUTH :*** Looking up your hostname...\r\nNOTICE AUTH :***<br />
 Checking ident\r\nNOTICE AUTH :*** Found your hostname\r\n ...<br />
<br />
Or in Hugs:<br />
<br />
 $ runhugs 1.hs<br />
 "NOTICE AUTH :*** Looking up your hostname...\r\nNOTICE AUTH :***<br />
 Checking ident\r\nNOTICE AUTH :*** Found your hostname\r\n ...<br />
<br />
Great! We're on the network.<br />
<br />
== Talking IRC ==<br />
<br />
Now that we're listening to the server, we'd better start sending some<br />
information back. Three details are important: the nick, the user name,<br />
and a channel to join. So let's send those.<br />
<br />
<haskell><br />
import Network<br />
import System.IO<br />
import Text.Printf<br />
<br />
server = "irc.freenode.org"<br />
port = 6667<br />
chan = "#tutbot-testing"<br />
nick = "tutbot"<br />
<br />
main = do<br />
    h <- connectTo server (PortNumber (fromIntegral port))<br />
    hSetBuffering h NoBuffering<br />
    write h "NICK" nick<br />
    write h "USER" (nick++" 0 * :tutorial bot")<br />
    write h "JOIN" chan<br />
    listen h<br />
<br />
write :: Handle -> String -> String -> IO ()<br />
write h s t = do<br />
    hPrintf h "%s %s\r\n" s t<br />
    printf    "> %s %s\n" s t<br />
<br />
listen :: Handle -> IO ()<br />
listen h = forever $ do<br />
    s <- hGetLine h<br />
    putStrLn s<br />
  where<br />
    forever a = do a; forever a<br />
</haskell><br />
<br />
Now, we've done quite a few things here. Firstly, we import<br />
<hask>Text.Printf</hask>, which will be useful. We also set up a channel<br />
name and bot nickname. The <hask>main</hask> function has been extended<br />
to send messages back to the IRC server using a <hask>write</hask><br />
function. Let's look at that a bit more closely:<br />
<br />
<haskell><br />
write :: Handle -> String -> String -> IO ()<br />
write h s t = do<br />
    hPrintf h "%s %s\r\n" s t<br />
    printf    "> %s %s\n" s t<br />
</haskell><br />
<br />
We've given <hask>write</hask> an explicit type to help document it, and<br />
we'll use explicit type signatures from now on, as they're just good<br />
practice (though of course not required, as Haskell uses type inference<br />
to work out the types anyway).<br />
<br />
The <hask>write</hask> function takes 3 arguments: a handle (our<br />
socket), and then two strings representing an IRC protocol action, and<br />
any arguments it takes. <hask>write</hask> then uses <hask>hPrintf</hask><br />
to build an IRC message and write it over the wire to the server. For<br />
debugging purposes we also print to standard output the message we send.<br />
<br />
Our second function, <hask>listen</hask>, is as follows:<br />
<br />
<haskell><br />
listen :: Handle -> IO ()<br />
listen h = forever $ do<br />
    s <- hGetLine h<br />
    putStrLn s<br />
  where<br />
    forever a = do a; forever a<br />
</haskell><br />
<br />
This function takes a Handle argument, and sits in an infinite loop<br />
reading lines of text from the network and printing them. We take<br />
advantage of two powerful features, lazy evaluation and higher order<br />
functions, to roll our own loop control structure, <hask>forever</hask>,<br />
as a normal function! <hask>forever</hask> takes a chunk of code as an<br />
argument, evaluates it and recurses - an infinite loop function. It<br />
is very common to roll our own control structures in Haskell this way,<br />
using higher order functions. No need to add new syntax to the language, lisp-like macros or meta programming - you just write a normal<br />
function to implement whatever control flow you wish. We can also avoid<br />
<hask>do</hask>-notation, and directly write: <hask>forever a = a >> forever a</hask>.<br />
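<br />
As a further illustration of control structures written as ordinary functions, here is a small sketch of a loop that stops once its body reports that it is finished. The names <hask>untilDone</hask> and <hask>countTo</hask> are hypothetical and not part of the bot; the counter is kept in an <hask>IORef</hask> purely so the example has some state to loop over:<br />
<br />
<haskell><br />
import Data.IORef (modifyIORef, newIORef, readIORef)<br />
<br />
-- Hypothetical control structure: repeat an action until it returns True.<br />
untilDone :: IO Bool -> IO ()<br />
untilDone step = do<br />
    done <- step<br />
    if done then return () else untilDone step<br />
<br />
-- Count upwards in a mutable cell until the target is reached,<br />
-- then return the final value of the counter.<br />
countTo :: Int -> IO Int<br />
countTo n = do<br />
    ref <- newIORef 0<br />
    untilDone $ do<br />
        modifyIORef ref (+ 1)<br />
        k <- readIORef ref<br />
        return (k >= n)<br />
    readIORef ref<br />
</haskell><br />
<br />
Like <hask>forever</hask>, <hask>untilDone</hask> needs no special syntax: the loop body is simply passed in as an argument.<br />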
<br />
Let's run this thing:<br />
<br />
<haskell><br />
$ runhaskell 2.hs<br />
> NICK tutbot<br />
> USER tutbot 0 * :tutorial bot<br />
> JOIN #tutbot-testing<br />
NOTICE AUTH :*** Looking up your hostname...<br />
NOTICE AUTH :*** Found your hostname, welcome back<br />
NOTICE AUTH :*** Checking ident<br />
NOTICE AUTH :*** No identd (auth) response<br />
:orwell.freenode.net 001 tutbot :Welcome to the freenode IRC Network tutbot<br />
:orwell.freenode.net 002 tutbot :Your host is orwell.freenode.net<br />
...<br />
:tutbot!n=tutbot@aa.bb.cc.dd JOIN :#tutbot-testing<br />
:orwell.freenode.net MODE #tutbot-testing +ns<br />
:orwell.freenode.net 353 tutbot @ #tutbot-testing :@tutbot<br />
:orwell.freenode.net 366 tutbot #tutbot-testing :End of /NAMES list.<br />
</haskell><br />
<br />
And we're in business! From an IRC client, we can watch the bot connect:<br />
<br />
 15:02  -- tutbot [n=tutbot@aa.bb.cc.dd] has joined #tutbot-testing<br />
 15:02  dons> hello<br />
<br />
And the bot logs to standard output:<br />
<br />
 :dons!i=dons@my.net PRIVMSG #tutbot-testing :hello<br />
<br />
We can now implement some commands.<br />
<br />
== A simple interpreter ==<br />
<br />
<haskell><br />
listen :: Handle -> IO ()<br />
listen h = forever $ do<br />
    t <- hGetLine h<br />
    let s = init t<br />
    if ping s then pong s else eval h (clean s)<br />
    putStrLn s<br />
  where<br />
    forever a = a >> forever a<br />
<br />
    clean     = drop 1 . dropWhile (/= ':') . drop 1<br />
<br />
    ping x    = "PING :" `isPrefixOf` x<br />
    pong x    = write h "PONG" (':' : drop 6 x)<br />
</haskell><br />
<br />
We add 3 features to the bot here by modifying <hask>listen</hask>.<br />
Firstly, it responds to <hask>PING</hask> messages: <hask>if ping s then pong s ... </hask>.<br />
This is useful for servers that require pings to keep clients connected.<br />
Before we can process a command, remember the IRC protocol generates<br />
input lines of the form:<br />
<haskell><br />
:dons!i=dons@my.net PRIVMSG #tutbot-testing :!id foo<br />
</haskell><br />
so we need a <hask>clean</hask> function to simply drop the leading ':'<br />
character, and then everything up to the next ':', leaving just the<br />
actual command content. We then pass this cleaned up string to<br />
<hask>eval</hask>, which then dispatches bot commands.<br />
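<br />
The string handling can be tried in isolation, without a network connection. The sketch below lifts <hask>clean</hask> and the ping test out of <hask>listen</hask> as pure top-level functions; the names <hask>isPing</hask> and <hask>pongArg</hask> are hypothetical, chosen only for this example:<br />
<br />
<haskell><br />
import Data.List (isPrefixOf)<br />
<br />
-- Drop the leading ':' and then everything up to and including the next ':'.<br />
clean :: String -> String<br />
clean = drop 1 . dropWhile (/= ':') . drop 1<br />
<br />
-- Does this line look like a server PING?<br />
isPing :: String -> Bool<br />
isPing x = "PING :" `isPrefixOf` x<br />
<br />
-- Build the argument of the PONG reply: drop the six characters of "PING :"<br />
-- and put the ':' back in front of what remains.<br />
pongArg :: String -> String<br />
pongArg x = ':' : drop 6 x<br />
</haskell><br />
<br />
Applied to the sample line above, <hask>clean</hask> leaves just <hask>"!id foo"</hask>.<br />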
<br />
<haskell><br />
eval :: Handle -> String -> IO ()<br />
eval h "!quit" = write h "QUIT" ":Exiting" >> exitWith ExitSuccess<br />
eval h x | "!id " `isPrefixOf` x = privmsg h (drop 4 x)<br />
eval _ _ = return () -- ignore everything else<br />
</haskell><br />
<br />
So, if the single string "!quit" is received, we inform the server and<br />
exit the program. If a string beginning with "!id " appears, we echo any argument<br />
string back to the server (<hask>id</hask> is the Haskell identity<br />
function, which just returns its argument). Finally, if no other matches<br />
occur, we do nothing.<br />
<br />
We add the <hask>privmsg</hask> function - a useful wrapper over<br />
<hask>write</hask> for sending <hask>PRIVMSG</hask> lines to the server.<br />
<br />
Here's a transcript from our minimal bot running in channel:<br />
<br />
 15:12  -- tutbot [n=tutbot@aa.bb.cc.dd] has joined #tutbot-testing<br />
 15:13  dons> !id hello, world!<br />
 15:13  tutbot> hello, world!<br />
 15:13  dons> !id very pleased to meet you.<br />
 15:13  tutbot> very pleased to meet you.<br />
 15:13  dons> !quit<br />
 15:13  -- tutbot [n=tutbot@aa.bb.cc.dd] has quit [Client Quit]<br />
<br />
Now, before we go further, let's refactor the code a bit.<br />
<br />
== Roll your own monad ==<br />
<br />
A small annoyance so far has been that we've had to thread around our<br />
socket to every function that needs to talk to the network. The socket<br />
is essentially <em>immutable state</em>, that could be treated as a<br />
global read only value in other languages. In Haskell, we can implement<br />
such a structure using a state <em>monad</em>. Monads are a very powerful<br />
abstraction, and we'll only touch on them here. The interested reader is<br />
referred to [http://www.nomaware.com/monads/ All About Monads]. We'll be<br />
using a custom monad specifically to implement a read-only global state<br />
for our bot.<br />
<br />
The key requirement is that we wish to be able to perform IO actions,<br />
as well as thread a small state value transparently through the program.<br />
As this is Haskell, we can take the extra step of partitioning our<br />
stateful code from all other program code, using a new type. <br />
<br />
So let's define a small state monad:<br />
<haskell><br />
data Bot = Bot { socket :: Handle }<br />
<br />
type Net = ReaderT Bot IO<br />
</haskell><br />
<br />
Firstly, we define a data type for the global state. In this case, it is<br />
the <hask>Bot</hask> type, a simple struct storing our network socket.<br />
We then layer this data type over our existing IO code, with a <em>monad<br />
transformer</em>. This isn't as scary as it sounds and the effect is<br />
that we can just treat the socket as a global read-only value anywhere<br />
we need it. We'll call this new io + state structure the<br />
<hask>Net</hask> monad. <hask>ReaderT</hask> is a <em>type<br />
constructor</em>, essentially a type function, that takes 2 types as<br />
arguments, building a result type: the <hask>Net</hask> monad type.<br />
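<br />
To see the mechanics in isolation, here is a small sketch of a <hask>ReaderT</hask> layer over IO with no network involved. The <hask>Config</hask> type, its <hask>greeting</hask> field, and the <hask>greet</hask> and <hask>runApp</hask> functions are hypothetical stand-ins for the bot's own types; the sketch assumes the <hask>Control.Monad.Reader</hask> module from the mtl package:<br />
<br />
<haskell><br />
import Control.Monad.Reader (ReaderT, asks, runReaderT)<br />
<br />
-- Hypothetical read-only environment, playing the role of Bot.<br />
data Config = Config { greeting :: String }<br />
<br />
-- IO plus a read-only Config, analogous to the Net monad.<br />
type App = ReaderT Config IO<br />
<br />
-- Code inside App can ask for a field of the environment whenever needed;<br />
-- nothing has to be passed in as an explicit argument.<br />
greet :: String -> App String<br />
greet name = do<br />
    g <- asks greeting<br />
    return (g ++ ", " ++ name)<br />
<br />
-- Supply the environment once, at the outermost level.<br />
runApp :: Config -> App a -> IO a<br />
runApp cfg app = runReaderT app cfg<br />
</haskell><br />
<br />
The environment is handed over in exactly one place, <hask>runApp</hask>; everything below it reads the value with <hask>asks</hask>.<br />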
<br />
We can now throw out all that socket threading and just grab the socket<br />
when we need it. The key steps are connecting to the server, followed by<br />
the initialisation of our new state monad and then to run the main bot loop<br />
with that state. We add a small function, which takes the initial bot<br />
state and evaluates the bot's <hask>run</hask> loop "in" the Net monad,<br />
using the Reader monad's <hask>runReaderT</hask> function:<br />
<br />
<haskell><br />
loop st = runReaderT run st<br />
</haskell><br />
<br />
where <hask>run</hask> is a small function to register the bot's nick,<br />
join a channel, and start listening for commands.<br />
<br />
While we're here, we can tidy up the main function a little by using<br />
<hask>Control.Exception.bracket</hask> to explicitly delimit the<br />
connection, shutdown and main loop phases of the program - a useful<br />
technique. We can also make the code a bit more robust by wrapping the<br />
main loop in an exception handler using <hask>catch</hask>:<br />
<br />
<haskell><br />
main :: IO ()<br />
main = bracket connect disconnect loop<br />
  where<br />
    disconnect = hClose . socket<br />
    loop st    = catch (runReaderT run st) (const $ return ())<br />
</haskell><br />
<br />
That is, the higher order function <hask>bracket</hask> takes 3<br />
arguments: a function to connect to the server, a function to<br />
disconnect and a main loop to run in between. We can use<br />
<hask>bracket</hask> whenever we wish to run some code before and after<br />
a particular action - like <hask>forever</hask>, this is another<br />
control structure implemented as a normal Haskell function.<br />
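<br />
The ordering that <hask>bracket</hask> guarantees can be observed with a small sketch that records each phase in a log instead of touching the network. The function <hask>traceBracket</hask> and the log messages are invented for illustration:<br />
<br />
<haskell><br />
import Control.Exception (bracket)<br />
import Data.IORef (modifyIORef, newIORef, readIORef)<br />
<br />
-- Record the order in which bracket runs its three arguments.<br />
traceBracket :: IO [String]<br />
traceBracket = do<br />
    logRef <- newIORef []<br />
    let note msg = modifyIORef logRef (msg :)<br />
    bracket (note "connect" >> return "resource")   -- acquire<br />
            (\_ -> note "disconnect")               -- release, always runs<br />
            (\r -> note ("using " ++ r))            -- the work in between<br />
    -- Messages were prepended, so reverse to get chronological order.<br />
    fmap reverse (readIORef logRef)<br />
</haskell><br />
<br />
The release action also runs if the work in between throws an exception, which is what makes <hask>bracket</hask> suitable for closing the socket.<br />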
<br />
Rather than threading the socket around, we can now simply ask for it<br />
when needed. Note that the type of <hask>write</hask> changes - it is in<br />
the Net monad, which tells us that the bot must already be connected to<br />
a server (and thus it is ok to use the socket, as it is initialised).<br />
<br />
<haskell><br />
--<br />
-- Send a message out to the server we're currently connected to<br />
--<br />
write :: String -> String -> Net ()<br />
write s t = do<br />
h <- asks socket<br />
io $ hPrintf h "%s %s\r\n" s t<br />
io $ printf "> %s %s\n" s t<br />
</haskell><br />
<br />
In order to use both state and IO, we use the small <hask>io</hask><br />
function to <em>lift</em> an IO expression into the Net monad making<br />
that IO function available to code in the <hask>Net</hask> monad.<br />
<br />
<haskell><br />
io :: IO a -> Net a<br />
io = liftIO<br />
</haskell><br />
<br />
Similarly, we can combine IO actions with pure functions by lifting<br />
them into the IO monad. We can therefore simplify our <hask>hGetLine</hask><br />
call:<br />
<haskell><br />
do t <- io (hGetLine h)<br />
let s = init t<br />
</haskell><br />
by lifting <hask>init</hask> over IO:<br />
<haskell><br />
do s <- init `fmap` io (hGetLine h)<br />
</haskell><br />
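<br />
The same lifting works for any pure function and any IO action. A small sketch, using a hypothetical <hask>shout</hask> helper that is not part of the bot:<br />
<br />
<haskell><br />
import Data.Char (toUpper)<br />
<br />
-- Apply a pure String transformation to the result of an IO action.<br />
shout :: IO String -> IO String<br />
shout action = map toUpper `fmap` action<br />
<br />
main :: IO ()<br />
main = do<br />
    s <- shout (return "hello")<br />
    putStrLn s<br />
</haskell><br />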
<br />
The monadic, stateful, exception-handling bot in all its glory:<br />
<br />
<haskell><br />
import Data.List<br />
import Network<br />
import System.IO<br />
import System.Exit<br />
import Control.Monad.Reader<br />
import Control.Exception<br />
import Text.Printf<br />
import Prelude hiding (catch)<br />
<br />
server = "irc.freenode.org"<br />
port = 6667<br />
chan = "#tutbot-testing"<br />
nick = "tutbot"<br />
<br />
-- The 'Net' monad, a wrapper over IO, carrying the bot's immutable state.<br />
type Net = ReaderT Bot IO<br />
data Bot = Bot { socket :: Handle }<br />
<br />
-- Set up actions to run on start and end, and run the main loop<br />
main :: IO ()<br />
main = bracket connect disconnect loop<br />
where<br />
disconnect = hClose . socket<br />
loop st = catch (runReaderT run st) (const $ return ())<br />
<br />
-- Connect to the server and return the initial bot state<br />
connect :: IO Bot<br />
connect = notify $ do<br />
h <- connectTo server (PortNumber (fromIntegral port))<br />
hSetBuffering h NoBuffering<br />
return (Bot h)<br />
where<br />
notify a = bracket_<br />
(printf "Connecting to %s ... " server >> hFlush stdout)<br />
(putStrLn "done.")<br />
a<br />
<br />
-- We're in the Net monad now, so we've connected successfully<br />
-- Join a channel, and start processing commands<br />
run :: Net ()<br />
run = do<br />
write "NICK" nick<br />
write "USER" (nick++" 0 * :tutorial bot")<br />
write "JOIN" chan<br />
asks socket >>= listen<br />
<br />
-- Process each line from the server<br />
listen :: Handle -> Net ()<br />
listen h = forever $ do<br />
s <- init `fmap` io (hGetLine h)<br />
io (putStrLn s)<br />
if ping s then pong s else eval (clean s)<br />
where<br />
forever a = a >> forever a<br />
clean = drop 1 . dropWhile (/= ':') . drop 1<br />
ping x = "PING :" `isPrefixOf` x<br />
pong x = write "PONG" (':' : drop 6 x)<br />
<br />
-- Dispatch a command<br />
eval :: String -> Net ()<br />
eval "!quit" = write "QUIT" ":Exiting" >> io (exitWith ExitSuccess)<br />
eval x | "!id " `isPrefixOf` x = privmsg (drop 4 x)<br />
eval _ = return () -- ignore everything else<br />
<br />
-- Send a privmsg to the current chan + server<br />
privmsg :: String -> Net ()<br />
privmsg s = write "PRIVMSG" (chan ++ " :" ++ s)<br />
<br />
-- Send a message out to the server we're currently connected to<br />
write :: String -> String -> Net ()<br />
write s t = do<br />
h <- asks socket<br />
io $ hPrintf h "%s %s\r\n" s t<br />
io $ printf "> %s %s\n" s t<br />
<br />
-- Convenience.<br />
io :: IO a -> Net a<br />
io = liftIO<br />
</haskell><br />
<br />
Note that we threw in a new control structure, <hask>notify</hask>, for<br />
fun. Now we're almost done! Let's run this bot. Using runhaskell:<br />
<br />
$ runhaskell 4.hs<br />
<br />
or using GHC:<br />
<br />
$ ghc --make 4.hs -o tutbot<br />
Chasing modules from: 4.hs<br />
Compiling Main ( 4.hs, 4.o )<br />
Linking ...<br />
$ ./tutbot<br />
<br />
If you're using Hugs, you'll have to use the <hask>-98</hask> flag:<br />
<br />
$ runhugs -98 4.hs<br />
<br />
And from an IRC client we can watch it connect:<br />
<br />
15:26 -- tutbot [n=tutbot@aa.bb.cc.dd] has joined #tutbot-testing<br />
15:28 dons> !id all good?<br />
15:28 tutbot> all good?<br />
15:28 dons> !quit<br />
15:28 -- tutbot [n=tutbot@aa.bb.cc.dd] has quit [Client Quit]<br />
<br />
So we now have a bot with explicit read-only monadic state, error<br />
handling, and some basic IRC operations. If we wished to add read-write<br />
state, we need only change the <hask>ReaderT</hask> transformer to<br />
<hask>StateT</hask>.<br />
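<br />
As a rough sketch of what read-write state looks like, here is a tiny <hask>StateT</hask>-over-IO program, separate from the bot, that counts how often an action has run. The <hask>Counter</hask> type and the <hask>tick</hask> function are assumptions made for illustration, and the example assumes the mtl package is installed.<br />
<br />
<haskell><br />
import Control.Monad.IO.Class (liftIO)<br />
import Control.Monad.State (StateT, evalStateT, get, modify)<br />
<br />
-- IO plus a mutable Int, in the same style as the Net monad.<br />
type Counter = StateT Int IO<br />
<br />
-- Increment the counter, then report its new value.<br />
tick :: Counter ()<br />
tick = do<br />
    modify (+1)<br />
    n <- get<br />
    liftIO (putStrLn ("ticks so far: " ++ show n))<br />
<br />
main :: IO ()<br />
main = evalStateT (tick >> tick) 0<br />
</haskell><br />
<br />
Where the Reader version uses <hask>asks</hask> to look at the state, the State version can also change it with <hask>modify</hask> or <hask>put</hask>.<br />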
<br />
== Extending the bot ==<br />
<br />
Let's implement a basic new command: uptime tracking. Conceptually, we<br />
need to remember the time the bot starts. Then, if a user requests, we<br />
work out the total running time and print it as a string. A nice way to<br />
do this is to extend the bot's state with a start time field:<br />
<br />
<haskell><br />
data Bot = Bot { socket :: Handle, starttime :: ClockTime }<br />
</haskell><br />
<br />
We can then modify the initial <hask>connect</hask> function to also set<br />
the start time.<br />
<br />
<haskell><br />
connect :: IO Bot<br />
connect = notify $ do<br />
t <- getClockTime<br />
h <- connectTo server (PortNumber (fromIntegral port))<br />
hSetBuffering h NoBuffering<br />
return (Bot h t)<br />
</haskell><br />
<br />
We then add a new case to the <hask>eval</hask> function, to handle<br />
uptime requests:<br />
<br />
<haskell><br />
eval "!uptime" = uptime >>= privmsg<br />
</haskell><br />
<br />
This will just run the <hask>uptime</hask> function and send it back to<br />
the server. <hask>uptime</hask> itself is:<br />
<br />
<haskell><br />
uptime :: Net String<br />
uptime = do<br />
now <- io getClockTime<br />
zero <- asks starttime<br />
return . pretty $ diffClockTimes now zero<br />
</haskell><br />
<br />
That is, in the Net monad, find the current time and the start time, and<br />
then calculate the difference, returning that number as a string.<br />
Rather than use the default representation of a time difference, we'll<br />
write our own custom formatter:<br />
<br />
<haskell><br />
--<br />
-- Pretty print the date in '1d 9h 9m 17s' format<br />
--<br />
pretty :: TimeDiff -> String<br />
pretty td = join . intersperse " " . filter (not . null) . map f $<br />
[(years ,"y") ,(months `mod` 12,"m")<br />
,(days `mod` 28,"d") ,(hours `mod` 24,"h")<br />
,(mins `mod` 60,"m") ,(secs `mod` 60,"s")]<br />
where<br />
secs = abs $ tdSec td ; mins = secs `div` 60<br />
hours = mins `div` 60 ; days = hours `div` 24<br />
months = days `div` 28 ; years = months `div` 12<br />
f (i,s) | i == 0 = []<br />
| otherwise = show i ++ s<br />
</haskell><br />
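<br />
The arithmetic behind the formatter can be tried out in isolation. The sketch below is a simplified variant, not the bot's <hask>pretty</hask> function: it takes a plain number of seconds, splits it with <hask>divMod</hask>, and stops at days. The name <hask>prettySecs</hask> is hypothetical.<br />
<br />
<haskell><br />
-- Format a number of seconds as e.g. "1d 9h 9m 17s", skipping zero fields.<br />
prettySecs :: Int -> String<br />
prettySecs total = unwords [ show n ++ u | (n, u) <- parts, n /= 0 ]<br />
  where<br />
    (mins,  s) = total `divMod` 60<br />
    (hours, m) = mins  `divMod` 60<br />
    (d,     h) = hours `divMod` 24<br />
    parts      = [(d, "d"), (h, "h"), (m, "m"), (s, "s")]<br />
<br />
main :: IO ()<br />
main = putStrLn (prettySecs 119357)   -- prints: 1d 9h 9m 17s<br />
</haskell><br />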
<br />
And that's it. Running the bot with this new command:<br />
<br />
16:03 -- tutbot [n=tutbot@aa.bb.cc.dd] has joined #tutbot-testing<br />
16:03 dons> !uptime<br />
16:03 tutbot> 51s<br />
16:03 dons> !uptime<br />
16:03 tutbot> 1m 1s<br />
16:12 dons> !uptime<br />
16:12 tutbot> 9m 46s<br />
<br />
== Where to now? ==<br />
<br />
This is just a flavour of application programming in Haskell, and only<br />
hints at the power of Haskell's lazy evaluation, static typing, monadic<br />
effects and higher order functions. There is much, much more to be said<br />
on these topics. Some places to start:<br />
<br />
* The [[/Source|complete bot source]] (also [http://www.cse.unsw.edu.au/~dons/irc/bot.html mirrored here])<br />
* A [[/Transcript|full transcript]].<br />
* [[Haskell|Haskell.org]]<br />
* [[Example_code|More Haskell code]]<br />
* [[Books_and_tutorials|Learning Haskell]]<br />
* A gallery of [[Libraries_and_tools/Network|network apps]] in Haskell<br />
<br />
Or take the bot home and hack! Some suggestions:<br />
* Use <hask>forkIO</hask> to add a command line interface, and you've got yourself an IRC client with 4 more lines of code.<br />
* Port some commands from [[Lambdabot]].<br />
<br />
Author: [http://www.cse.unsw.edu.au/~dons Don Stewart]<br />
<br />
[[Category:Tutorials]]</div>Dibblegohttps://wiki.haskell.org/index.php?title=Roll_your_own_IRC_bot&diff=6492Roll your own IRC bot2006-10-05T00:04:30Z<p>Dibblego: </p>
<hr />
<div>This tutorial is designed as a practical guide to writing real world<br />
code in [http://haskell.org Haskell] and hopes to intuitively motivate<br />
and introduce some of the advanced features of Haskell to the novice<br />
programmer. Our goal is to write a concise, robust and elegant<br />
[http://haskell.org/haskellwiki/IRC_channel IRC] bot in Haskell.<br />
<br />
== Getting started ==<br />
<br />
You'll need a reasonably recent version of [http://haskell.org/ghc GHC]<br />
or [http://haskell.org/hugs Hugs]. Our first step is to get on the<br />
network. So let's start by importing the Network package, and the<br />
standard IO library and defining a server to connect to.<br />
<br />
<haskell><br />
import Network<br />
import System.IO<br />
<br />
server = "irc.freenode.org"<br />
port = 6667<br />
<br />
main = do<br />
h <- connectTo server (PortNumber (fromIntegral port))<br />
hSetBuffering h NoBuffering<br />
t <- hGetContents h<br />
print t<br />
</haskell><br />
<br />
The key here is the <hask>main</hask> function. This is the entry point<br />
to a Haskell program. We first connect to the server, then set the<br />
buffering on the socket off. Once we've got a socket, we can then just<br />
read and print any data we receive.<br />
<br />
Put this code in the module <hask>1.hs</hask> and we can then run it.<br />
Use whichever system you like:<br />
<br />
Using runhaskell:<br />
<br />
$ runhaskell 1.hs<br />
"NOTICE AUTH :*** Looking up your hostname...\r\nNOTICE AUTH :***<br />
Checking ident\r\nNOTICE AUTH :*** Found your hostname\r\n ...<br />
<br />
Or we can just compile it to an executable with GHC:<br />
<br />
$ ghc --make 1.hs -o tutbot<br />
Chasing modules from: 1.hs<br />
Compiling Main ( 1.hs, 1.o )<br />
Linking ...<br />
$ ./tutbot<br />
"NOTICE AUTH :*** Looking up your hostname...\r\nNOTICE AUTH :***<br />
Checking ident\r\nNOTICE AUTH :*** Found your hostname\r\n ...<br />
<br />
Or using GHCi:<br />
<br />
$ ghci 1.hs<br />
*Main> main<br />
"NOTICE AUTH :*** Looking up your hostname...\r\nNOTICE AUTH :***<br />
Checking ident\r\nNOTICE AUTH :*** Found your hostname\r\n ...<br />
<br />
Or in Hugs:<br />
<br />
$ runhugs 1.hs<br />
"NOTICE AUTH :*** Looking up your hostname...\r\nNOTICE AUTH :***<br />
Checking ident\r\nNOTICE AUTH :*** Found your hostname\r\n ...<br />
<br />
Great! We're on the network.<br />
<br />
== Talking IRC ==<br />
<br />
Now that we're listening to the server, we had better start sending some<br />
information back. Three details are important: the nick, the user name,<br />
and a channel to join. So let's send those.<br />
<br />
<haskell><br />
import Network<br />
import System.IO<br />
import Text.Printf<br />
<br />
server = "irc.freenode.org"<br />
port = 6667<br />
chan = "#tutbot-testing"<br />
nick = "tutbot"<br />
<br />
main = do<br />
h <- connectTo server (PortNumber (fromIntegral port))<br />
hSetBuffering h NoBuffering<br />
write h "NICK" nick<br />
write h "USER" (nick++" 0 * :tutorial bot")<br />
write h "JOIN" chan<br />
listen h<br />
<br />
write :: Handle -> String -> String -> IO ()<br />
write h s t = do<br />
hPrintf h "%s %s\r\n" s t<br />
printf "> %s %s\n" s t<br />
<br />
listen h = forever $ do<br />
s <- hGetLine h<br />
putStrLn s<br />
where<br />
forever a = do a; forever a<br />
</haskell><br />
<br />
Now, we've done quite a few things here. Firstly, we import<br />
<hask>Text.Printf</hask>, which will be useful. We also set up a channel<br />
name and bot nickname. The <hask>main</hask> function has been extended<br />
to send messages back to the IRC server using a <hask>write</hask><br />
function. Let's look at that a bit more closely:<br />
<br />
<haskell><br />
write :: Handle -> String -> String -> IO ()<br />
write h s t = do<br />
hPrintf h "%s %s\r\n" s t<br />
printf "> %s %s\n" s t<br />
</haskell><br />
<br />
We've given <hask>write</hask> an explicit type to help document it, and<br />
we'll use explicit type signatures from now on, as they're just good<br />
practice (though of course not required, as Haskell uses type inference<br />
to work out the types anyway).<br />
<br />
The <hask>write</hask> function takes 3 arguments: a handle (our<br />
socket), and then two strings representing an IRC protocol action, and<br />
any arguments it takes. <hask>write</hask> then uses <hask>hPrintf</hask><br />
to build an IRC message and write it over the wire to the server. For<br />
debugging purposes we also print to standard output the message we send.<br />
<br />
Our second function, <hask>listen</hask>, is as follows:<br />
<br />
<haskell><br />
listen :: Handle -> IO ()<br />
listen h = forever $ do<br />
s <- hGetLine h<br />
putStrLn s<br />
where<br />
forever a = do a; forever a<br />
</haskell><br />
<br />
This function takes a Handle argument, and sits in an infinite loop<br />
reading lines of text from the network and printing them. We take<br />
advantage of two powerful features, lazy evaluation and higher order<br />
functions, to roll our own loop control structure, <hask>forever</hask>,<br />
as a normal function! <hask>forever</hask> takes a chunk of code as an<br />
argument, evaluates it and recurses - an infinite loop function. It<br />
is very common to roll our own control structures in Haskell this way,<br />
using higher order functions. No need to add new syntax to the language, Lisp-like macros or meta programming - you just write a normal<br />
function to implement whatever control flow you wish. We can also avoid<br />
<hask>do</hask>-notation, and directly write: <hask>forever a = a >> forever a</hask>.<br />
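<br />
To illustrate the same idea with a different control structure, here is a sketch of a bounded loop written as an ordinary function. The name <hask>times</hask> is an assumption made for this example and is not used by the bot.<br />
<br />
<haskell><br />
-- Run an action a fixed number of times.<br />
times :: Int -> IO () -> IO ()<br />
times n a<br />
    | n <= 0    = return ()<br />
    | otherwise = a >> times (n - 1) a<br />
<br />
main :: IO ()<br />
main = 3 `times` putStrLn "hello"<br />
</haskell><br />
<br />
The bot only needs <hask>forever</hask>, so that is all we define in <hask>2.hs</hask>.<br />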
<br />
Let's run this thing:<br />
<br />
<haskell><br />
$ runhaskell 2.hs<br />
> NICK tutbot<br />
> USER tutbot 0 * :tutorial bot<br />
> JOIN #tutbot-testing<br />
NOTICE AUTH :*** Looking up your hostname...<br />
NOTICE AUTH :*** Found your hostname, welcome back<br />
NOTICE AUTH :*** Checking ident<br />
NOTICE AUTH :*** No identd (auth) response<br />
:orwell.freenode.net 001 tutbot :Welcome to the freenode IRC Network tutbot<br />
:orwell.freenode.net 002 tutbot :Your host is orwell.freenode.net<br />
...<br />
:tutbot!n=tutbot@aa.bb.cc.dd JOIN :#tutbot-testing<br />
:orwell.freenode.net MODE #tutbot-testing +ns<br />
:orwell.freenode.net 353 tutbot @ #tutbot-testing :@tutbot<br />
:orwell.freenode.net 366 tutbot #tutbot-testing :End of /NAMES list.<br />
</haskell><br />
<br />
And we're in business! From an IRC client, we can watch the bot connect:<br />
<br />
15:02 -- tutbot [n=tutbot@aa.bb.cc.dd] has joined #tutbot-testing<br />
15:02 dons> hello<br />
<br />
And the bot logs to standard output:<br />
<br />
:dons!i=dons@my.net PRIVMSG #tutbot-testing :hello<br />
<br />
We can now implement some commands.<br />
<br />
== A simple interpreter ==<br />
<br />
<haskell><br />
listen :: Handle -> IO ()<br />
listen h = forever $ do<br />
t <- hGetLine h<br />
let s = init t<br />
if ping s then pong s else eval h (clean s)<br />
putStrLn s<br />
where<br />
forever a = a >> forever a<br />
<br />
clean = drop 1 . dropWhile (/= ':') . drop 1<br />
<br />
ping x = "PING :" `isPrefixOf` x<br />
pong x = write h "PONG" (':' : drop 6 x)<br />
</haskell><br />
<br />
We add 3 features to the bot here by modifying <hask>listen</hask>.<br />
Firstly, it responds to <hask>PING</hask> messages: <hask>if ping s then pong s ... </hask>.<br />
This is useful for servers that require pings to keep clients connected.<br />
Before we can process a command, remember the IRC protocol generates<br />
input lines of the form:<br />
<haskell><br />
:dons!i=dons@my.net PRIVMSG #tutbot-testing :!id foo<br />
</haskell><br />
so we need a <hask>clean</hask> function to simply drop the leading ':'<br />
character, and then everything up to the next ':', leaving just the<br />
actual command content. We then pass this cleaned up string to<br />
<hask>eval</hask>, which then dispatches bot commands.<br />
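<br />
To see what <hask>clean</hask> does, we can apply it to the sample line above in a standalone program. This is only a sketch for experimenting; the definition of <hask>clean</hask> is the one from <hask>listen</hask>.<br />
<br />
<haskell><br />
-- Drop the leading ':', skip up to the next ':', then drop that one too.<br />
clean :: String -> String<br />
clean = drop 1 . dropWhile (/= ':') . drop 1<br />
<br />
main :: IO ()<br />
main = putStrLn (clean ":dons!i=dons@my.net PRIVMSG #tutbot-testing :!id foo")<br />
-- prints: !id foo<br />
</haskell><br />
<br />
The string <hask>"!id foo"</hask> is what <hask>eval</hask> receives.<br />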
<br />
<haskell><br />
eval :: Handle -> String -> IO ()<br />
eval h "!quit" = write h "QUIT" ":Exiting" >> exitWith ExitSuccess<br />
eval h x | "!id " `isPrefixOf` x = privmsg h (drop 4 x)<br />
eval _ _ = return () -- ignore everything else<br />
</haskell><br />
<br />
So, if the single string "!quit" is received, we inform the server, and<br />
exit the program. If a string beginning with "!id" appears, we echo any argument<br />
string back to the server (<hask>id</hask> is the Haskell identity<br />
function, which just returns its argument). Finally, if no other matches<br />
occur, we do nothing.<br />
<br />
We add the <hask>privmsg</hask> function - a useful wrapper over<br />
<hask>write</hask> for sending <hask>PRIVMSG</hask> lines to the server.<br />
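<br />
A possible definition at this stage, while the handle is still passed around explicitly, is sketched below. It assumes the <hask>chan</hask> value defined earlier and, to stay self-contained, uses a cut-down <hask>write</hask> without the debugging output. Running it against <hask>stdout</hask> instead of a socket shows the line the server would receive.<br />
<br />
<haskell><br />
import System.IO<br />
import Text.Printf<br />
<br />
chan :: String<br />
chan = "#tutbot-testing"<br />
<br />
-- Cut-down version of the tutorial's write: no echo to standard output.<br />
write :: Handle -> String -> String -> IO ()<br />
write h s t = hPrintf h "%s %s\r\n" s t<br />
<br />
-- Send a PRIVMSG line addressed to the bot's channel.<br />
privmsg :: Handle -> String -> IO ()<br />
privmsg h s = write h "PRIVMSG" (chan ++ " :" ++ s)<br />
<br />
main :: IO ()<br />
main = privmsg stdout "hello, world!"<br />
</haskell><br />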
<br />
Here's a transcript from our minimal bot running in channel:<br />
<br />
15:12 -- tutbot [n=tutbot@aa.bb.cc.dd] has joined #tutbot-testing<br />
15:13 dons> !id hello, world!<br />
15:13 tutbot> hello, world!<br />
15:13 dons> !id very pleased to meet you.<br />
15:13 tutbot> very pleased to meet you.<br />
15:13 dons> !quit<br />
15:13 -- tutbot [n=tutbot@aa.bb.cc.dd] has quit [Client Quit]<br />
<br />
Now, before we go further, let's refactor the code a bit.<br />
<br />
== Roll your own monad ==<br />
<br />
A small annoyance so far has been that we've had to thread around our<br />
socket to every function that needs to talk to the network. The socket<br />
is essentially <em>immutable state</em>, that could be treated as a<br />
global read only value in other languages. In Haskell, we can implement<br />
such a structure using a state <em>monad</em>. Monads are a very powerful<br />
abstraction, and we'll only touch on them here. The interested reader is<br />
referred to [http://www.nomaware.com/monads/ All About Monads]. We'll be<br />
using a custom monad specifically to implement a read-only global state<br />
for our bot.<br />
<br />
The key requirement is that we wish to be able to perform IO actions,<br />
as well as thread a small state value transparently through the program.<br />
As this is Haskell, we can take the extra step of partitioning our<br />
stateful code from all other program code, using a new type. <br />
<br />
So let's define a small state monad:<br />
<haskell><br />
data Bot = Bot { socket :: Handle }<br />
<br />
type Net = ReaderT Bot IO<br />
</haskell><br />
<br />
Firstly, we define a data type for the global state. In this case, it is<br />
the <hask>Bot</hask> type, a simple struct storing our network socket.<br />
We then layer this data type over our existing IO code, with a <em>monad<br />
transformer</em>. This isn't as scary as it sounds and the effect is<br />
that we can just treat the socket as a global read-only value anywhere<br />
we need it. We'll call this new io + state structure the<br />
<hask>Net</hask> monad. <hask>ReaderT</hask> is a <em>type<br />
constructor</em>, essentially a type function, that takes 2 types as<br />
arguments, building a result type: the <hask>Net</hask> monad type.<br />
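<br />
A minimal sketch of the same pattern outside the bot may help. The <hask>Env</hask> record, the <hask>App</hask> type and the <hask>hello</hask> action are hypothetical names chosen for this example, which assumes the mtl package is installed.<br />
<br />
<haskell><br />
import Control.Monad.IO.Class (liftIO)<br />
import Control.Monad.Reader (ReaderT, runReaderT, asks)<br />
<br />
-- The read-only environment, playing the role of Bot.<br />
data Env = Env { greeting :: String }<br />
<br />
-- IO plus a read-only Env, playing the role of Net.<br />
type App = ReaderT Env IO<br />
<br />
-- Ask for a field of the environment, then perform IO with it.<br />
hello :: App ()<br />
hello = do<br />
    g <- asks greeting<br />
    liftIO (putStrLn g)<br />
<br />
main :: IO ()<br />
main = runReaderT (hello >> hello) (Env "hi")<br />
</haskell><br />
<br />
Neither call to <hask>hello</hask> takes the environment as an argument; <hask>runReaderT</hask> supplies it once, at the top.<br />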
<br />
We can now throw out all that socket threading and just grab the socket<br />
when we need it. The key steps, once we've connected to the server, are<br />
to initialise our new state monad and then to run the main bot loop<br />
with that state. We add a small function, which takes the initial bot<br />
state and evaluates the bot's <hask>run</hask> loop "in" the Net monad,<br />
using the Reader monad's <hask>runReaderT</hask> function:<br />
<br />
<haskell><br />
loop st = runReaderT run st<br />
</haskell><br />
<br />
where <hask>run</hask> is a small function to register the bot's nick,<br />
join a channel, and start listening for commands.<br />
<br />
While we're here we can tidy up the main function a little, by using<br />
<hask>Control.Exception.bracket</hask> to explicitly delimit the<br />
connection, shutdown and main loop phases of the program - a useful<br />
technique. We can also make the code a bit more robust, by wrapping the<br />
main loop in an exception handler, using <hask>catch</hask>:<br />
<br />
<haskell><br />
main :: IO ()<br />
main = bracket connect disconnect loop<br />
where<br />
disconnect = hClose . socket<br />
loop st = catch (runReaderT run st) (const $ return ())<br />
</haskell><br />
<br />
That is, the higher order function <hask>bracket</hask> takes 3<br />
arguments: a function to connect to the server, a function to<br />
disconnect and a main loop to run in between. We can use<br />
<hask>bracket</hask> whenever we wish to run some code before and after<br />
a particular action - like <hask>forever</hask>, this is another<br />
control structure implemented as a normal Haskell function.<br />
<br />
Rather than threading the socket around, we can now simply ask for it<br />
when needed. Note that the type of <hask>write</hask> changes: it is in<br />
the Net monad, which tells us that the bot must already be connected to<br />
a server (and thus it is ok to use the socket, as it is initialised).<br />
<br />
<haskell><br />
--<br />
-- Send a message out to the server we're currently connected to<br />
--<br />
write :: String -> String -> Net ()<br />
write s t = do<br />
h <- asks socket<br />
io $ hPrintf h "%s %s\r\n" s t<br />
io $ printf "> %s %s\n" s t<br />
</haskell><br />
<br />
In order to use both state and IO, we use the small <hask>io</hask><br />
function to <em>lift</em> an IO expression into the Net monad, making<br />
that IO function available to code in the <hask>Net</hask> monad.<br />
<br />
<haskell><br />
io :: IO a -> Net a<br />
io = liftIO<br />
</haskell><br />
<br />
Similarly, we can combine IO actions with pure functions by lifting<br />
them into the IO monad. We can therefore simplify our <hask>hGetLine</hask><br />
call:<br />
<haskell><br />
do t <- io (hGetLine h)<br />
let s = init t<br />
</haskell><br />
by lifting <hask>init</hask> over IO:<br />
<haskell><br />
do s <- init `fmap` io (hGetLine h)<br />
</haskell><br />
<br />
The monadic, stateful, exception-handling bot in all its glory:<br />
<br />
<haskell><br />
import Data.List<br />
import Network<br />
import System.IO<br />
import System.Exit<br />
import Control.Monad.Reader<br />
import Control.Exception<br />
import Text.Printf<br />
import Prelude hiding (catch)<br />
<br />
server = "irc.freenode.org"<br />
port = 6667<br />
chan = "#tutbot-testing"<br />
nick = "tutbot"<br />
<br />
-- The 'Net' monad, a wrapper over IO, carrying the bot's immutable state.<br />
type Net = ReaderT Bot IO<br />
data Bot = Bot { socket :: Handle }<br />
<br />
-- Set up actions to run on start and end, and run the main loop<br />
main :: IO ()<br />
main = bracket connect disconnect loop<br />
where<br />
disconnect = hClose . socket<br />
loop st = catch (runReaderT run st) (const $ return ())<br />
<br />
-- Connect to the server and return the initial bot state<br />
connect :: IO Bot<br />
connect = notify $ do<br />
h <- connectTo server (PortNumber (fromIntegral port))<br />
hSetBuffering h NoBuffering<br />
return (Bot h)<br />
where<br />
notify a = bracket_<br />
(printf "Connecting to %s ... " server >> hFlush stdout)<br />
(putStrLn "done.")<br />
a<br />
<br />
-- We're in the Net monad now, so we've connected successfully<br />
-- Join a channel, and start processing commands<br />
run :: Net ()<br />
run = do<br />
write "NICK" nick<br />
write "USER" (nick++" 0 * :tutorial bot")<br />
write "JOIN" chan<br />
asks socket >>= listen<br />
<br />
-- Process each line from the server<br />
listen :: Handle -> Net ()<br />
listen h = forever $ do<br />
s <- init `fmap` io (hGetLine h)<br />
io (putStrLn s)<br />
if ping s then pong s else eval (clean s)<br />
where<br />
forever a = a >> forever a<br />
clean = drop 1 . dropWhile (/= ':') . drop 1<br />
ping x = "PING :" `isPrefixOf` x<br />
pong x = write "PONG" (':' : drop 6 x)<br />
<br />
-- Dispatch a command<br />
eval :: String -> Net ()<br />
eval "!quit" = write "QUIT" ":Exiting" >> io (exitWith ExitSuccess)<br />
eval x | "!id " `isPrefixOf` x = privmsg (drop 4 x)<br />
eval _ = return () -- ignore everything else<br />
<br />
-- Send a privmsg to the current chan + server<br />
privmsg :: String -> Net ()<br />
privmsg s = write "PRIVMSG" (chan ++ " :" ++ s)<br />
<br />
-- Send a message out to the server we're currently connected to<br />
write :: String -> String -> Net ()<br />
write s t = do<br />
h <- asks socket<br />
io $ hPrintf h "%s %s\r\n" s t<br />
io $ printf "> %s %s\n" s t<br />
<br />
-- Convenience.<br />
io :: IO a -> Net a<br />
io = liftIO<br />
</haskell><br />
<br />
Note that we threw in a new control structure, <hask>notify</hask>, for<br />
fun. Now we're almost done! Let's run this bot. Using runhaskell:<br />
<br />
$ runhaskell 4.hs<br />
<br />
or using GHC:<br />
<br />
$ ghc --make 4.hs -o tutbot<br />
Chasing modules from: 4.hs<br />
Compiling Main ( 4.hs, 4.o )<br />
Linking ...<br />
$ ./tutbot<br />
<br />
If you're using Hugs, you'll have to use the <hask>-98</hask> flag:<br />
<br />
$ runhugs -98 4.hs<br />
<br />
And from an IRC client we can watch it connect:<br />
<br />
15:26 -- tutbot [n=tutbot@aa.bb.cc.dd] has joined #tutbot-testing<br />
15:28 dons> !id all good?<br />
15:28 tutbot> all good?<br />
15:28 dons> !quit<br />
15:28 -- tutbot [n=tutbot@aa.bb.cc.dd] has quit [Client Quit]<br />
<br />
So we now have a bot with explicit read-only monadic state, error<br />
handling, and some basic IRC operations. If we wished to add read-write<br />
state, we need only change the <hask>ReaderT</hask> transformer to<br />
<hask>StateT</hask>.<br />
<br />
== Extending the bot ==<br />
<br />
Let's implement a basic new command: uptime tracking. Conceptually, we<br />
need to remember the time the bot starts. Then, if a user requests, we<br />
work out the total running time and print it as a string. A nice way to<br />
do this is to extend the bot's state with a start time field:<br />
<br />
<haskell><br />
data Bot = Bot { socket :: Handle, starttime :: ClockTime }<br />
</haskell><br />
<br />
We can then modify the initial <hask>connect</hask> function to also set<br />
the start time.<br />
<br />
<haskell><br />
connect :: IO Bot<br />
connect = notify $ do<br />
t <- getClockTime<br />
h <- connectTo server (PortNumber (fromIntegral port))<br />
hSetBuffering h NoBuffering<br />
return (Bot h t)<br />
</haskell><br />
<br />
We then add a new case to the <hask>eval</hask> function, to handle<br />
uptime requests:<br />
<br />
<haskell><br />
eval "!uptime" = uptime >>= privmsg<br />
</haskell><br />
<br />
This will just run the <hask>uptime</hask> function and send it back to<br />
the server. <hask>uptime</hask> itself is:<br />
<br />
<haskell><br />
uptime :: Net String<br />
uptime = do<br />
now <- io getClockTime<br />
zero <- asks starttime<br />
return . pretty $ diffClockTimes now zero<br />
</haskell><br />
<br />
That is, in the Net monad, find the current time and the start time, and<br />
then calculate the difference, returning that number as a string.<br />
Rather than use the default representation of a time difference, we'll<br />
write our own custom formatter:<br />
<br />
<haskell><br />
--<br />
-- Pretty print the date in '1d 9h 9m 17s' format<br />
--<br />
pretty :: TimeDiff -> String<br />
pretty td = join . intersperse " " . filter (not . null) . map f $<br />
[(years ,"y") ,(months `mod` 12,"m")<br />
,(days `mod` 28,"d") ,(hours `mod` 24,"h")<br />
,(mins `mod` 60,"m") ,(secs `mod` 60,"s")]<br />
where<br />
secs = abs $ tdSec td ; mins = secs `div` 60<br />
hours = mins `div` 60 ; days = hours `div` 24<br />
months = days `div` 28 ; years = months `div` 12<br />
f (i,s) | i == 0 = []<br />
| otherwise = show i ++ s<br />
</haskell><br />
<br />
And that's it. Running the bot with this new command:<br />
<br />
16:03 -- tutbot [n=tutbot@aa.bb.cc.dd] has joined #tutbot-testing<br />
16:03 dons> !uptime<br />
16:03 tutbot> 51s<br />
16:03 dons> !uptime<br />
16:03 tutbot> 1m 1s<br />
16:12 dons> !uptime<br />
16:12 tutbot> 9m 46s<br />
<br />
== Where to now? ==<br />
<br />
This is just a flavour of application programming in Haskell, and only<br />
hints at the power of Haskell's lazy evaluation, static typing, monadic<br />
effects and higher order functions. There is much, much more to be said<br />
on these topics. Some places to start:<br />
<br />
* The [[/Source|complete bot source]] (also [http://www.cse.unsw.edu.au/~dons/irc/bot.html mirrored here])<br />
* A [[/Transcript|full transcript]].<br />
* [[Haskell|Haskell.org]]<br />
* [[Example_code|More Haskell code]]<br />
* [[Books_and_tutorials|Learning Haskell]]<br />
* A gallery of [[Libraries_and_tools/Network|network apps]] in Haskell<br />
<br />
Or take the bot home and hack! Some suggestions:<br />
* Use <hask>forkIO</hask> to add a command line interface, and you've got yourself an IRC client with 4 more lines of code.<br />
* Port some commands from [[Lambdabot]].<br />
<br />
Author: [http://www.cse.unsw.edu.au/~dons Don Stewart]<br />
<br />
[[Category:Tutorials]]</div>Dibblegohttps://wiki.haskell.org/index.php?title=Introduction_to_QuickCheck1&diff=6201Introduction to QuickCheck12006-09-21T07:38:11Z<p>Dibblego: </p>
<hr />
<div>A quick introduction to QuickCheck, and testing Haskell code.<br />
<br />
== Motivation ==<br />
<br />
In September 2006, Bruno Martínez<br />
[http://www.haskell.org/pipermail/haskell-cafe/2006-September/018302.html asked] <br />
the following question:<br />
<br />
<haskell><br />
-- I've written a function that looks similar to this one<br />
<br />
getList = find 5 where<br />
find 0 = return []<br />
find n = do<br />
ch <- getChar<br />
if ch `elem` ['a'..'e'] then do<br />
tl <- find (n-1)<br />
return (ch : tl) else<br />
find n<br />
<br />
-- I want to test this function, without hitting the filesystem. In C++ I<br />
-- would use a istringstream. I couldn't find a function that returns a<br />
-- Handle from a String. The closer thing that may work that I could find<br />
-- was making a pipe and converting the file descriptor. Can I simplify<br />
-- that function to take it out of the IO monad?<br />
</haskell><br />
<br />
So the problem is: how to effectively test this function in Haskell? The<br />
solution we turn to is refactoring and QuickCheck.<br />
<br />
== Keeping things pure ==<br />
<br />
The reason your getList is hard to test is that the side-effecting monadic code<br />
is mixed in with the pure computation, making it difficult to test<br />
without moving entirely into a "black box" IO-based testing model.<br />
Such a mixture is not good for reasoning about code.<br />
<br />
Let's untangle that, and then test the referentially transparent<br />
parts simply with QuickCheck. First, we can take advantage of lazy IO<br />
to avoid all the unpleasant low-level IO handling.<br />
<br />
So the first step is to factor out the IO part of the function into a<br />
thin "skin" layer:<br />
<br />
<haskell><br />
-- A thin monadic skin layer<br />
getList :: IO [Char]<br />
getList = take5 `fmap` getContents<br />
<br />
-- The actual worker<br />
take5 :: [Char] -> [Char]<br />
take5 = take 5 . filter (`elem` ['a'..'e'])<br />
</haskell><br />
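<br />
Because <hask>take5</hask> is now an ordinary function on strings, we can already try it on hand-picked inputs without any IO plumbing. The sample inputs below are made up for illustration:<br />
<br />
<haskell><br />
-- The pure worker, exactly as defined above.<br />
take5 :: [Char] -> [Char]<br />
take5 = take 5 . filter (`elem` ['a'..'e'])<br />
<br />
main :: IO ()<br />
main = do<br />
    print (take5 "xaybzcqdwe!a")   -- prints: "abcde"<br />
    print (take5 "hello")          -- prints: "e"<br />
</haskell><br />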
<br />
== Testing with QuickCheck ==<br />
<br />
Now we can test the 'guts' of the algorithm, the take5 function, in<br />
isolation. Let's use QuickCheck. First we need an Arbitrary instance for<br />
the Char type -- this takes care of generating random Chars for us to<br />
test with. I'll restrict it to a range of nice chars just for<br />
simplicity:<br />
<br />
<haskell><br />
import Data.Char<br />
import Test.QuickCheck<br />
<br />
instance Arbitrary Char where<br />
    arbitrary     = choose ('\32', '\128')<br />
    coarbitrary c = variant (ord c `rem` 4)<br />
</haskell><br />
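<br />
The instance above is written against the QuickCheck 1 API. In QuickCheck 2.x an Arbitrary Char instance is<br />
already provided by the library, so declaring another one would clash. A minimal sketch of one way to keep the<br />
same restricted character range there is to use an explicit generator with forAll. The names niceChar,<br />
niceString and prop_reverseTwice are assumptions introduced for illustration.<br />
<br />
```haskell
import Test.QuickCheck

-- Hypothetical generator for the same "nice" character range
-- used by the instance in the text.
niceChar :: Gen Char
niceChar = choose ('\32', '\128')

-- Strings built only from those characters.
niceString :: Gen String
niceString = listOf niceChar

-- A property that draws its input from the explicit generator
-- instead of relying on the Arbitrary instance for Char.
prop_reverseTwice :: Property
prop_reverseTwice =
  forAll niceString $ \s -> reverse (reverse s) == s
```
<br />
The design choice here is to leave the library's own instance alone and make the restricted range visible at the<br />
point where the property is stated.<br />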
<br />
Let's fire up GHCi (or Hugs) and try some generic properties (it's nice<br />
that we can use the QuickCheck testing framework directly from the<br />
Haskell prompt). An easy one first, a [Char] is equal to itself:<br />
<br />
<haskell><br />
*A> quickCheck ((\s -> s == s) :: [Char] -> Bool)<br />
OK, passed 100 tests.<br />
</haskell><br />
<br />
What just happened? QuickCheck generated 100 random [Char] values, and<br />
applied our property, checking the result was True for all cases.<br />
QuickCheck ''generated the test sets for us''!<br />
<br />
A more interesting property now: reversing twice is the identity:<br />
<br />
<haskell><br />
*A> quickCheck ((\s -> (reverse.reverse) s == s) :: [Char] -> Bool)<br />
OK, passed 100 tests.<br />
</haskell><br />
<br />
Great!<br />
<br />
== Testing take5 ==<br />
<br />
The first step to testing with QuickCheck is to work out some properties<br />
that are true of the function, for all inputs. That is, we need to find<br />
''invariants''.<br />
<br />
A simple invariant might be:<br />
<math>\forall~s~.~length~(take5~s)~=~5</math><br />
<br />
So let's write that as a QuickCheck property:<br />
<haskell><br />
\s -> length (take5 s) == 5<br />
</haskell><br />
<br />
Which we can then run in QuickCheck as:<br />
<haskell><br />
*A> quickCheck (\s -> length (take5 s) == 5)<br />
Falsifiable, after 0 tests:<br />
""<br />
</haskell><br />
<br />
Ah! QuickCheck caught us out. If the input string contains fewer than 5<br />
filterable characters, the resulting string will be shorter than 5<br />
characters. So let's weaken the property a bit:<br />
<math>\forall~s~.~length~(take5~s)~\le~5</math><br />
<br />
That is, take5 returns a string at most 5 characters long. Let's test<br />
this: <br />
<haskell><br />
*A> quickCheck (\s -> length (take5 s) <= 5)<br />
OK, passed 100 tests.<br />
</haskell><br />
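<br />
The weakened bound holds, but it no longer says when the result is shorter than 5. As a sketch, the failed and the<br />
weakened invariants can be combined into one exact statement: the result length is the smaller of 5 and the number<br />
of filterable characters in the input. The name prop_exactLength is hypothetical, and the code assumes<br />
QuickCheck 2.x with its built-in Arbitrary Char instance.<br />
<br />
```haskell
import Test.QuickCheck

-- The pure worker from the text.
take5 :: String -> String
take5 = take 5 . filter (`elem` ['a'..'e'])

-- Exact length: 5 when enough filterable characters are available,
-- otherwise however many the input contains.
prop_exactLength :: String -> Bool
prop_exactLength s =
  length (take5 s) == min 5 (length (filter (`elem` ['a'..'e']) s))
```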
<br />
Good!<br />
<br />
== Another property ==<br />
<br />
Another thing to check would be that the correct characters are<br />
returned. That is, for all returned characters, those characters are<br />
members of the set ['a','b','c','d','e'].<br />
<br />
We can specify that as:<br />
<math>\forall~s~.~\forall~e~.~e~\in~take5~s~\to~e~\in~\{a,b,c,d,e\} </math><br />
<br />
And in QuickCheck:<br />
<haskell><br />
*A> quickCheck (\s -> all (`elem` ['a'..'e']) (take5 s))<br />
OK, passed 100 tests.<br />
</haskell><br />
<br />
Excellent. So we can have some confidence that the function neither<br />
returns strings that are too long, nor includes invalid characters.<br />
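<br />
When a module collects several properties, it can help to give them names so they can be rerun from the prompt.<br />
The sketch below restates the membership property and adds one more invariant in the same spirit: the result only<br />
uses characters of the input, in their original order. All prop_ names and the helper wanted are hypothetical;<br />
isSubsequenceOf comes from Data.List, and QuickCheck 2.x is assumed.<br />
<br />
```haskell
import Data.List (isSubsequenceOf)
import Test.QuickCheck

-- The pure worker from the text.
take5 :: String -> String
take5 = take 5 . filter wanted

-- Hypothetical helper: the characters take5 is allowed to return.
wanted :: Char -> Bool
wanted = (`elem` ['a'..'e'])

-- Every returned character is one of 'a'..'e'.
prop_onlyWanted :: String -> Bool
prop_onlyWanted = all wanted . take5

-- The result is drawn from the input without reordering.
prop_subsequence :: String -> Bool
prop_subsequence s = take5 s `isSubsequenceOf` s
```
<br />
Each property can then be checked on its own, for example with quickCheck prop_subsequence.<br />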
<br />
== Coverage ==<br />
<br />
One issue with the default QuickCheck configuration, when testing<br />
[Char], is that the standard 100 tests aren't enough for our situation.<br />
In fact, with the supplied Arbitrary instance for Char, the default run<br />
never generates a String that yields a full 5 characters from take5! We can<br />
confirm this:<br />
<br />
<haskell><br />
*A> quickCheck (\s -> length (take5 s) < 5)<br />
OK, passed 100 tests.<br />
</haskell><br />
<br />
QuickCheck wastes its time generating different Chars, when what we<br />
really need is longer strings. One solution to this is to modify<br />
QuickCheck's default configuration to test deeper:<br />
<br />
<haskell><br />
deepCheck p = check (defaultConfig { configMaxTest = 10000}) p<br />
</haskell><br />
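<br />
The check, defaultConfig and configMaxTest names above belong to the QuickCheck 1 API. As a hedged sketch, the<br />
same idea in QuickCheck 2.x can be expressed with quickCheckWith and the maxSuccess field of stdArgs. The name<br />
deepCheck is kept from the text; prop_atMostFive is a hypothetical example property.<br />
<br />
```haskell
import Test.QuickCheck

-- The pure worker from the text.
take5 :: String -> String
take5 = take 5 . filter (`elem` ['a'..'e'])

-- QuickCheck 2.x counterpart of deepCheck: require 10000
-- passing tests instead of the default 100.
deepCheck :: Testable prop => prop -> IO ()
deepCheck = quickCheckWith stdArgs { maxSuccess = 10000 }

-- Example property to run under the deeper configuration.
prop_atMostFive :: String -> Bool
prop_atMostFive s = length (take5 s) <= 5
```
<br />
From GHCi this would be invoked as deepCheck prop_atMostFive.<br />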
<br />
This instructs the system to find at least 10000 test cases before<br />
concluding that all is well. Let's check that it is generating longer<br />
strings:<br />
<br />
<haskell><br />
*A> deepCheck (\s -> length (take5 s) < 5)<br />
Falsifiable, after 125 tests:<br />
";:iD^*NNi~Y\\RegMob\DEL@krsx/=dcf7kub|EQi\DELD*"<br />
</haskell><br />
<br />
We can check the test data QuickCheck is generating using the<br />
'verboseCheck' hook. Here, testing on lists of integers:<br />
<br />
<haskell><br />
*A> verboseCheck ((\s -> length s < 5) :: [Int] -> Bool)<br />
0: []<br />
1: [0]<br />
2: []<br />
3: []<br />
4: []<br />
5: [1,2,1,1]<br />
6: [2]<br />
7: [-2,4,-4,0,0]<br />
Falsifiable, after 7 tests:<br />
[-2,4,-4,0,0]<br />
</haskell><br />
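<br />
Running more tests is one way to reach longer results; another is to change what gets generated and to measure it.<br />
The sketch below assumes QuickCheck 2.x and uses its frequency, listOf and collect combinators. The generator<br />
biasedChar, which produces a character from 'a'..'e' about half of the time, and the property name<br />
prop_lengthDistribution are hypothetical. collect makes QuickCheck report how often each result length occurred,<br />
which is a more compact view of coverage than reading through verboseCheck output.<br />
<br />
```haskell
import Test.QuickCheck

-- The pure worker from the text.
take5 :: String -> String
take5 = take 5 . filter (`elem` ['a'..'e'])

-- Hypothetical generator: roughly half of the characters come from
-- 'a'..'e', the rest from the wider range used earlier in the text.
biasedChar :: Gen Char
biasedChar = frequency
  [ (1, choose ('a', 'e'))
  , (1, choose ('\32', '\128'))
  ]

-- Check the length bound and record the distribution of result
-- lengths that the biased generator produces.
prop_lengthDistribution :: Property
prop_lengthDistribution =
  forAll (listOf biasedChar) $ \s ->
    collect (length (take5 s)) (length (take5 s) <= 5)
```
<br />
Running quickCheck prop_lengthDistribution prints the percentage of test cases for each collected length after the usual summary line.<br />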
<br />
== Going further ==<br />
<br />
QuickCheck is effectively an embedded domain specific language for<br />
testing Haskell code, and allows for much more complex properties than<br />
those you've seen here to be tested. Some sources for further reading<br />
are:<br />
* [http://www.cse.unsw.edu.au/~dons/data/QuickCheck.html The QuickCheck source]<br />
* [http://haskell.org/ghc/docs/latest/html/libraries/QuickCheck/Test-QuickCheck.html Library documentation]<br />
* [http://www.cse.unsw.edu.au/~dons/code/fps/tests/Properties.hs A large testsuite of QuickCheck code]<br />
* Paper [http://www.cs.chalmers.se/~koen/pubs/icfp00-quickcheck.ps QuickCheck: A Lightweight Tool for Random Testing of Haskell Programs], Koen Claessen and John Hughes. In Proc. of International Conference on Functional Programming (ICFP), ACM SIGPLAN, 2000.<br />
* Paper [http://www.math.chalmers.se/~koen/pubs/entry-fop-quickcheck.html Specification Based Testing with QuickCheck], Koen Claessen and John Hughes. In Jeremy Gibbons and Oege de Moor (eds.), The Fun of Programming, Cornerstones of Computing, pp. 17--40, Palgrave, 2003.<br />
* Paper [http://www.math.chalmers.se/~koen/pubs/entry-tt04-quickcheck.html QuickCheck: Specification-based Random Testing], Koen Claessen. Presentation at Summer Institute on Trends in Testing: Theory, Techniques and Tools, August 2004.<br />
* Paper [http://www.cs.chalmers.se/~rjmh/Papers/QuickCheckST.ps Testing Monadic Programs with QuickCheck], Koen Claessen, John Hughes. SIGPLAN Notices 37(12): 47-59 (2002).<br />
* More [http://haskell.org/haskellwiki/Research_papers/Testing_and_correctness research on correctness and testing] in Haskell<br />
* Tutorial: [[QuickCheck as a test set generator]]<br />
* Tutorial: [[QuickCheck / GADT]]<br />
<br />
[[Category:Tutorials]]</div>Dibblego