https://wiki.haskell.org/api.php?action=feedcontributions&user=Iceland+jack&feedformat=atomHaskellWiki - User contributions [en]2021-01-26T03:49:47ZUser contributionsMediaWiki 1.27.4https://wiki.haskell.org/index.php?title=GHC/Kinds&diff=57624GHC/Kinds2014-02-28T16:02:36Z<p>Iceland jack: Gender</p>
<hr />
<div>Kinds (this is not the definitive name) will be a language extension adding a kind level and some kind polymorphic support to Haskell.<br />
<br />
== What does this extension do? ==<br />
<br />
Haskell defines [[Kind|kinds]] as <code>&kappa; ::= * | &kappa; -> &kappa;</code>. Nowadays, programmers use type-level computation more often using [[GADT|GADTs]] and [[GHC/Type families|type families]]. The need of a well-kinded type-level computation has become bigger. This extension provides a simple mechanism called ''promotion'' to populate the kind level.<br />
<br />
Each time the user defines a ''promotable'' data type at the term level they are able to use it at the type level too. For instance, the user can write the following example:<br />
<br />
<haskell><br />
data Nat = Zero | Succ Nat<br />
data Vector :: * -> Nat -> * where<br />
VNil :: Vector a Zero<br />
VCons :: a -> Vector a n -> Vector a (Succ n)<br />
<br />
data List a = Nil | Cons a (List a)<br />
data HList :: List * -> * where<br />
HNil :: HList Nil<br />
HCons :: a -> HList as -> HList (Cons a as)<br />
</haskell><br />
<br />
== How do I use it? ==<br />
<br />
Simply [http://hackage.haskell.org/trac/ghc/wiki/Building build GHC HEAD].<br />
<br />
== Which data types are promotable? ==<br />
<br />
A data type is promotable if its type constructor and data constructors are. A type constructor is promotable if it is of the form <hask>* -> .. * -> *</hask> which is a first-order Haskell98 type constructor. All Haskell98 data constructors of a first-order Haskell98 data type are promotable. More details can be found in [http://dreixel.net/research/pdf/ghp.pdf this paper].<br />
<br />
A simple way to decide if your data type is promotable is to see if you can write without the where-clause like this:<br />
<haskell><br />
data T (a::*) (b::*) (c::*) = A a | B Int b | C (Either a c) [b]<br />
</haskell></div>Iceland jackhttps://wiki.haskell.org/index.php?title=Dealing_with_binary_data&diff=57499Dealing with binary data2014-01-27T03:43:12Z<p>Iceland jack: /* The BitGet monad */</p>
<hr />
<div>[[Category:How to]]<br />
== Handling Binary Data with Haskell ==<br />
<br />
Many programming problems call for the use of binary formats for compactness,<br />
ease-of-use, compatibility or speed. This page quickly covers some common<br />
libraries for handling binary data in Haskell.<br />
<br />
=== Bytestrings ===<br />
<br />
Everything else in this tutorial will be based on bytestrings. Normal Haskell <hask>String</hask> types are linked lists of 32-bit characters.<br />
<br />
This has a number of useful properties like coverage of the Unicode space and laziness, however when it comes to dealing with bytewise data, <hask>String</hask> involves a space-inflation of about 24x and a large reduction in speed.<br />
<br />
Bytestrings are packed arrays of bytes or 8-bit chars. If you have experience in C, their memory representation would be the same as a <code>uint8_t[]</code>—although bytestrings know their length and don't allow overflows, etc.<br />
<br />
There are two major flavours of bytestrings: strict and lazy. Strict bytestrings are exactly what you would expect—a linear array of bytes in memory. Lazy bytestrings are a list of strict bytestrings; often this is called a cord in other languages. When reading a lazy bytestring from a file, the data will be read chunk by chunk and the file can be larger than the size of memory. The default chunk size is currently 32K.<br />
<br />
Within each flavour of bytestring comes the Word8 and Char8 versions. These are mostly an aid to the type system since they are fundamentally the same size of element. The Word8 unpacks as a list of <hask>Word8</hask> elements (bytes), the Char8 unpacks as a list of <hask>Char</hask>, which may be useful if you want to convert them to <hask>Strings</hask>.<br />
<br />
You might want to open the documentation for [http://haskell.org/ghc/docs/latest/html/libraries/bytestring/Data-ByteString.html strict bytestrings] and [http://haskell.org/ghc/docs/latest/html/libraries/bytestring/Data-ByteString-Lazy.html lazy bytestrings] in another tab so that you can follow along.<br />
<br />
==== Simple file IO ====<br />
<br />
Here's a very simple program which copies a file from standard input to<br />
standard output<br />
<br />
<haskell>module Main where<br />
<br />
import qualified Data.ByteString as B<br />
<br />
main :: IO ()<br />
main = do<br />
contents <- B.getContents<br />
B.putStr contents</haskell><br />
<br />
Note that we are using strict bytestrings here. (It's quite common to import the<br />
<code>ByteString</code> module under the names <code>B</code> or <code>BS</code>.)<br />
Since the bytestrings are strict, the code will read the whole of <code>stdin</code> into<br />
memory and then write it out. If the input was too large this would overflow<br />
the available memory and fail.<br />
<br />
Let's see the same program using lazy bytestrings. We are just changing the<br />
imported ByteString module to be the lazy one and calling the exact same<br />
functions from the new module:<br />
<br />
<haskell>module Main where<br />
<br />
import qualified Data.ByteString.Lazy as BL<br />
<br />
main :: IO ()<br />
main = do<br />
contents <- BL.getContents<br />
BL.putStr contents</haskell><br />
<br />
This code, because of the lazy bytestrings, will cope with any sized input and<br />
will start producing output before all the input has been read. You can think<br />
of the code as setting up a pipeline, rather than executing in-order, as you<br />
might expect. As <hask>putStr</hask> needs more data, it will cause the lazy<br />
bytestring <hask>contents</hask> to read more until the end of the input is<br />
found.<br />
<br />
You should review the [http://haskell.org/ghc/docs/latest/html/libraries/bytestring/Data-ByteString.html documentation]<br />
which lists all the functions which operate on ByteStrings. The documentation<br />
for the various types (lazy Word8, strict Char8, ...) are all very similar. You<br />
generally find the same functions in each, with the same names. Remember to<br />
import the modules as <code>qualified</code> and give them different names.<br />
<br />
==== The guts of ByteStrings ====<br />
<br />
I'll just mention in passing that sometimes you need to do something which would<br />
endanger the referential transparency of ByteStrings. Generally you only need<br />
to do this when using the FFI to interface with C libraries. Should such a need<br />
arise, you can have a look at the<br />
[http://haskell.org/ghc/docs/latest/html/libraries/bytestring/Data-ByteString-Internal.html internal functions] and the<br />
[http://haskell.org/ghc/docs/latest/html/libraries/bytestring/Data-ByteString-Unsafe.html unsafe functions].<br />
Remember that the last set of functions are called unsafe for a reason—misuse<br />
can crash your program!<br />
<br />
=== Binary parsing ===<br />
<br />
Once you have your data as a bytestring you'll be wanting to parse something<br />
from it. Here you need to install the<br />
<tt>[http://hackage.haskell.org/cgi-bin/hackage-scripts/package/binary-0.4.1 binary]</tt> package. You should read the instructions on<br />
[http://haskell.org/haskellwiki/Cabal/How_to_install_a_Cabal_package how to install a Cabal package] if you haven't done so already.<br />
<br />
The <tt>binary</tt> package has three major parts: the <code>Get</code> monad,<br />
the <code>Put</code> monad and a general serialisation for Haskell types. The<br />
latter is like the <tt>pickle</tt> module that you may know from Python—it<br />
has its own serialisation format and I won't be covering it any more here.<br />
However, if you just need to persist some Haskell data structures, it might be<br />
exactly what you want: the documentation is<br />
[http://hackage.haskell.org/packages/archive/binary/0.4.1/doc/html/Data-Binary.html here]<br />
<br />
==== The <tt>Get</tt> monad ====<br />
<br />
The <tt>Get</tt> monad is a state monad; it keeps some state and each action<br />
updates that state. The state in this case is an offset into the bytestring<br />
which is getting parsed. <tt>Get</tt> parses lazy bytestrings; this is how<br />
packages like<br />
[http://hackage.haskell.org/cgi-bin/hackage-scripts/package/tar-0.1.1.1 tar]<br />
can parse files several gigabytes long in constant memory: they are using a<br />
pipeline of lazy bytestrings. However, this also has a downside. When parsing a<br />
lazy bytestring a parse failure (such as running off the end of the bytestring)<br />
is signified by an exception. Exceptions can only be caught in the IO monad<br />
and, because of laziness, might not be thrown exactly where you expect. If this<br />
is a problem, you probably want a strict version of <tt>Get</tt>, which is<br />
covered below.<br />
<br />
Here's an example of using the <tt>Get</tt> monad:<br />
<br />
<haskell>import qualified Data.ByteString.Lazy as BL<br />
import Data.Binary.Get<br />
import Data.Word<br />
<br />
deserialiseHeader :: Get (Word32, Word32, Word32)<br />
deserialiseHeader = do<br />
alen <- getWord32be<br />
plen <- getWord32be<br />
chksum <- getWord32be<br />
return (alen, plen, chksum)<br />
<br />
main :: IO ()<br />
main = do<br />
input <- BL.getContents<br />
print $ runGet deserialiseHeader input</haskell><br />
<br />
This code takes three big-endian, 32-bit unsigned numbers from the input string<br />
and returns them as a tuple. Let's try running it:<br />
<br />
<pre>% runhaskell /tmp/example.hs << EOF<br />
heredoc> 123412341235<br />
heredoc> EOF<br />
(825373492,825373492,825373493)</pre><br />
<br />
Makes sense, right? Look what happens if the input is too short:<br />
<br />
<pre>% runhaskell /tmp/example.hs << EOF<br />
tooshort<br />
EOF<br />
(1953460083,1752134260,example.hs: too few bytes. Failed reading at byte position 12</pre><br />
<br />
Here an exception was thrown because we ran out of bytes.<br />
<br />
So the <tt>Get</tt> monad consists of a set of operations like<br />
<hask>getWord32be</hask> which walk over the input and return some type of<br />
data. You can see the full list of those functions in the<br />
[http://hackage.haskell.org/packages/archive/binary/0.4.1/doc/html/Data-Binary-Get.html documentation].<br />
<br />
Here's another example; decoding an EOF-terminated list of<br />
numbers just involves recursion:<br />
<br />
<haskell>listOfWord16 = do<br />
empty <- isEmpty<br />
if empty<br />
then return []<br />
else do v <- getWord64be<br />
rest <- listOfWord16<br />
return (v : rest)</haskell><br />
<br />
==== Strict <tt>Get</tt> monad ====<br />
<br />
If you're parsing small messages then, firstly your input isn't going to be a<br />
lazy bytestring but a strict one. That's not reallly a problem because you can<br />
easilly convert between them. However, if you want to handle parse failures you<br />
either have to write your parser very carefully, or you have to deal with the<br />
fact that you can only catch exceptions in the IO monad.<br />
<br />
If this is your dilemma, then you need a strict version of the <tt>Get</tt><br />
monad. It's almost exactly the same, but a parser of type <hask>Get a</hask><br />
results in <hask>(Either String a, ByteString)</hask> as the result of<br />
<hask>runGet</hask>. That type is a tuple where the first value is ''either'' a<br />
string (an error string from the parse) or the result, and the second value is<br />
the remaining bytestring when the parser finished.<br />
<br />
Let's update the first example with this strict version of <tt>Get</tt>. You'll<br />
have to install the<br />
[http://hackage.haskell.org/cgi-bin/hackage-scripts/package/binary-strict binary-strict]<br />
package for it to work.<br />
<br />
<haskell>import qualified Data.ByteString as B<br />
import Data.Binary.Strict.Get<br />
import Data.Word<br />
<br />
deserialiseHeader :: Get (Word32, Word32, Word32)<br />
deserialiseHeader = do<br />
alen <- getWord32be<br />
plen <- getWord32be<br />
chksum <- getWord32be<br />
return (alen, plen, chksum)<br />
<br />
main :: IO ()<br />
main = do<br />
input <- B.getContents<br />
print $ runGet deserialiseHeader input</haskell><br />
<br />
Note that all we've done is change from lazy bytestrings to strict bytestrings<br />
and change to importing <tt>Data.Binary.Strict.Get</tt>. Now we'll run<br />
it again:<br />
<br />
<pre>% runhaskell /tmp/example.hs << EOF<br />
heredoc> 123412341235<br />
heredoc> EOF<br />
(Right (825373492,825373492,825373493),"\n")</pre><br />
<br />
Now we can see that the parser was successful (we got a <tt>Right</tt>) and we<br />
can see that our shell actually added an extra newline on the input (correctly)<br />
and the parser didn't consume that, so it's also returned to us. Now we try it<br />
with a truncated input:<br />
<br />
<pre>% runhaskell /tmp/example.hs << EOF<br />
heredoc> tooshort<br />
heredoc> EOF<br />
(Left "too few bytes","\n")</pre><br />
<br />
This time we didn't get an exception, but a <tt>Left</tt> value, which can be<br />
handled in pure code. The remaining bytestring is the same because our<br />
truncated input is 9 bytes long, parsing the first two <tt>Word32</tt>'s<br />
consumed 8 bytes and parsing the third failed—at which point we had the last<br />
byte still in the input.<br />
<br />
In your parser, you can also call <hask>fail</hask>, with an error string,<br />
which will result in a <tt>Left</tt> value.<br />
<br />
That's it; it's otherwise the same as the <tt>Get</tt> monad.<br />
<br />
====Incremental parsing====<br />
<br />
If you have to deal with a protocol which isn't length prefixed, or otherwise<br />
chunkable, from the network then you are faced with the problem of knowing when<br />
you have enough data to parse something semantically useful. You could run a<br />
strict <tt>Get</tt> over what you have and catch the truncation result, but<br />
that means that you're parsing the data multiple times etc.<br />
<br />
Instead, you can use an incremental parser. There's an incremental version of<br />
the <tt>Get</tt> monad in <tt>Data.Binary.Strict.IncrementalGet</tt> (you'll<br />
need the <tt>binary-strict</tt> package).<br />
<br />
You use it as normal, but rather than returning an <tt>Either</tt> value, you<br />
get a [http://hackage.haskell.org/packages/archive/binary-strict/0.2.4/doc/html/Data-Binary-Strict-IncrementalGet.html#t%3AResult Result]. You need to go follow that link and look at the documentation for <tt>Result</tt>.<br />
<br />
It reflects the three outcomes of parsing possibly truncated data. Either the<br />
data is invalid as is, or it's complete, or it's truncated. In the truncated<br />
case you are given a function (called a continuation), to which you can pass<br />
more data, when you get it, and continue the parse. The continuation, again,<br />
returns a <tt>Result</tt> depending on the result of parsing the additional<br />
data as well.<br />
<br />
====Bit twiddling====<br />
<br />
Even with all this monadic goodness, sometimes you just need to move some bits<br />
around. That's perfectly possible in Haskell too. Just import<br />
<tt>Data.Bits</tt> and use the following table.<br />
<br />
<table><br />
<tr><th>Name</th><th>C operator</th><th>Haskell</th></tr><br />
<tr><td>AND</td><td><tt>&amp;</tt></td><td><hask>.&.</hask></td></tr><br />
<tr><td>OR</td><td><tt>|</tt></td><td><hask>.|.</hask></td></tr><br />
<tr><td>XOR</td><td><tt>^</tt></td><td><hask>`xor`</hask></td></tr><br />
<tr><td>NOT</td><td><tt>~</tt></td><td><hask>`complement`</hask></td></tr><br />
<tr><td>Left shift</td><td><tt>&lt;&lt;</tt></td><td><hask>`shiftL`</hask></td></tr><br />
<tr><td>Right shift</td><td><tt>&gt;&gt;</tt></td><td><hask>`shiftR`</hask></td></tr><br />
</table><br />
<br />
====The <tt>BitGet</tt> monad====<br />
<br />
As an alternative to bit twiddling, you can also use the <tt>BitGet</tt> monad.<br />
This is another state-like monad, like <tt>Get</tt>, but here the state<br />
includes the current bit-offset in the input. This means that you can easily pull out<br />
unaligned data. Sadly, haddock is currently breaking when trying to generate the<br />
documentation for <tt>BitGet</tt> so I'll start with an example. Again, you'll<br />
need the<br />
[http://hackage.haskell.org/cgi-bin/hackage-scripts/package/binary-strict binary-strict] package installed.<br />
<br />
Here's a description of the header of a DNS packet, direct from RFC 1035:<br />
<br />
<pre> 1 1 1 1 1 1<br />
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5<br />
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+<br />
| ID |<br />
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+<br />
|QR| Opcode |AA|TC|RD|RA| Z | RCODE |<br />
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+<br />
| QDCOUNT |<br />
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+<br />
| ANCOUNT |<br />
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+<br />
| NSCOUNT |<br />
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+<br />
| ARCOUNT |<br />
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+</pre><br />
<br />
The actual fields don't matter, but here's a function for parsing it:<br />
<br />
<haskell><br />
parseHeader :: G.Get Header<br />
parseHeader = do<br />
id <- G.getWord16be<br />
flags <- G.getByteString 2<br />
qdcount <- fmap fromIntegral G.getWord16be<br />
ancount <- fmap fromIntegral G.getWord16be<br />
nscount <- fmap fromIntegral G.getWord16be<br />
arcount <- fmap fromIntegral G.getWord16be<br />
<br />
let r = BG.runBitGet flags (do<br />
isquery <- BG.getBit<br />
opcode <- BG.getAsWord8 4 >>= parseEnum<br />
aa <- BG.getBit<br />
tc <- BG.getBit<br />
rd <- BG.getBit<br />
ra <- BG.getBit<br />
<br />
BG.getAsWord8 3<br />
rcode <- BG.getAsWord8 4 >>= parseEnum<br />
<br />
return $ Header id isquery opcode aa tc rd ra rcode qdcount ancount nscount arcount)<br />
<br />
case r of<br />
Left error -> fail error<br />
Right x -> return x</haskell><br />
<br />
Here you can see that only the second line (from the ASCII-art diagram) is<br />
parsed using <tt>BitGet</tt>. An outer <tt>Get</tt> monad is used for<br />
everything else and the bit fields are pulled out with<br />
<hask>getByteString</hask>. Again, <tt>BitGet</tt> is a strict monad and<br />
returns an <tt>Either</tt>, but it doesn't return the remaining bytestring,<br />
just because there's no obvious way to represent a bytestring of a fractional<br />
number of bytes.<br />
<br />
You can see the list of <tt>BitGet</tt> functions and their comments in the<br />
[http://darcs.imperialviolet.org/darcsweb.cgi?r=binary-strict;a=headblob;f=/src/Data/Binary/Strict/BitGet.hs source code].<br />
<br />
===Binary generation===<br />
<br />
In contrast to parsing binary data, you might want to generate it. This is the<br />
job of the <tt>Put</tt> monad. Follow along with the<br />
[http://hackage.haskell.org/packages/archive/binary/0.4.1/doc/html/Data-Binary-Put.html documentation]<br />
if you like.<br />
<br />
The <tt>Put</tt> monad is another state-like monad, but the state is an offset<br />
into a series of buffers where the generated data is placed. All the buffer<br />
creation and handling is done for you, so you can just forget about it. It<br />
results in a lazy bytestring (so you can generate outputs that are larger than memory).<br />
<br />
Here's the reverse of our simple <tt>Get</tt> example:<br />
<br />
<haskell>import qualified Data.ByteString.Lazy as BL<br />
import Data.Binary.Put<br />
<br />
serialiseSomething :: Put<br />
serialiseSomething = do<br />
putWord32be 1<br />
putWord16be 2<br />
putWord8 3<br />
<br />
main :: IO ()<br />
main = BL.putStr $ runPut serialiseSomething</haskell><br />
<br />
And running it shows that it's generating the correct serialisation:<br />
<br />
<pre>% runhaskell /tmp/example.hs| hexdump -C<br />
00000000 00 00 00 01 00 02 03 |.......|</pre><br />
<br />
If you want the output of <tt>runPut</tt> to be a strict bytestring, you just<br />
need to convert it with <hask>B.concat $ BL.toChunks $ runPut xyz</hask>.<br />
<br />
One limitation of <tt>Put</tt>, due to the nature of the <tt>Builder</tt> monad<br />
which it works with, is that you can't get the current offset into the output.<br />
This can be an issue with some formats which require you to encode byte offsets<br />
into the file. You have to calculate these byte offsets yourself.<br />
<br />
=== Other useful packages ===<br />
<br />
There are other packages which you should know about, but which are mostly<br />
covered by their documentation:<br />
<br />
* [http://hackage.haskell.org/cgi-bin/hackage-scripts/package/network-bytestring-0.1.1 network-bytestring]: for reading and writing bytestring from the network<br />
* [http://hackage.haskell.org/cgi-bin/hackage-scripts/package/zlib-0.4.0.2 zlib] and [http://hackage.haskell.org/cgi-bin/hackage-scripts/package/bzlib-0.4.0.1 bzlib]: for compressed formats<br />
* [http://hackage.haskell.org/cgi-bin/hackage-scripts/package/encoding-0.3 encoding]: for dealing with character encodings<br />
* [http://hackage.haskell.org/cgi-bin/hackage-scripts/package/tar-0.1.1.1 tar]: as an example of lazy parsing/serialisation</div>Iceland jackhttps://wiki.haskell.org/index.php?title=Constructor&diff=57362Constructor2013-12-29T16:57:01Z<p>Iceland jack: /* Deconstructing data constructors */</p>
<hr />
<div>[[Category:Language]]<br />
'''Constructor''' can mean:<br />
* Type constructor<br />
* Data constructor<br />
<br />
A <hask>data</hask> declaration introduces one or more ''type'' constructors and one or more ''value'' constructors for each of the type constructors.<br />
<br />
== Type constructor ==<br />
A '''[[type]] constructor''' may have zero or more arguments, if it has zero arguments it is called a ''nullary'' type constructor (or simply a '''type'''). An example of a nullary type constructor <hask>Bool</hask> with two nullary data constructors <hask>True</hask> and <hask>False</hask><br />
<br />
<haskell><br />
data Bool = True | False<br />
</haskell><br />
<br />
An example of a ''unary'' type constructor <haskell>Tree</haskell><br />
<br />
<haskell><br />
data Tree a = Tip | Node a (Tree a) (Tree a)<br />
</haskell><br />
<br />
illustrates how to define a data type with type constructors (and data constructors at the same time). The type constructor is named <hask>Tree</hask>, but a tree of what? Of any specific type <hask>a</hask>, be it <hask>Integer</hask>, <hask>Maybe String</hask>, or even <hask>Tree b</hask>, in which case it will be a tree of tree of <hask>b</hask>. The data type is polymorphic (and <hask>a</hask> is a type variable that is to be substituted by a specific type). So when used, the values will have types like <hask>Tree Int</hask> or <hask>Tree (Tree Boolean)</hask>.<br />
<br />
== Data constructor ==<br />
A '''data constructor''' (or '''value constructor''') can have zero or more arguments where a data constructor taking zero arguments is called a ''nullary data constructor'' or simply a '''constant'''. They group values together and tag alternatives in an [[algebraic data type]],<br />
<haskell><br />
data Tree a = Tip | Node a (Tree a) (Tree a)<br />
</haskell><br />
where there are two data constructors, <hask>Tip</hask> and <hask>Node</hask>. Any value that belongs to the type <hask>Tree a</hask> (I'm happy leaving the type parameter unspecified) will be a constructed by either <hask>Tip</hask> or <hask>Node</hask>. <hask>Tip</hask> is a constructor alright, but it groups no value whatsoever, that is, it's a nullary constructor. There can be only one value that will have this constructor, also conveniently denoted <hask>Tip</hask>. So nullary constructors contain no data apart from its name! For example, the <hask>Bool</hask> data type is defined to be<br />
<haskell><br />
data Bool = True | False<br />
</haskell><br />
and for all practical purposes you can just think of them as ''constants'' belonging to a type.<br />
On the other hand, <hask>Node</hask> contains other data. The types of those data are its parameters. The first one has type <hask>a</hask>, so it's just a value of the parameter type <hask>a</hask>. This one is the value the tree node holds in it. The remaining two are the branches. Each of them have type <hask>Tree a</hask>, naturally.<br />
<br />
===Data constructors as first class values===<br />
Data constructors are first class values in Haskell and actually have a [[type]]. For instance, the type of the <hask>Left</hask> constructor of the <hask>Either</hask> data type is:<br />
<haskell><br />
Left :: forall b a. a -> Either a b<br />
</haskell><br />
<br />
As first class values, they may be passed to functions, held in a list, be data elements of other algebraic data types and so forth.<br />
<br />
<br />
=== Data constructors are not types===<br />
As discussed above, they denote values. It is illegal to write <hask>Node a (Node a) (Node a)</hask> there, because the type is <hask>Tree</hask>, not <hask>Node</hask>.<br />
<br />
== Deconstructing data constructors ==<br />
All a data constructor does is holding values together. But you want to separate them if you want to use them. This is done via [[pattern matching]],<br />
<br />
<haskell><br />
depth Tip = 0<br />
depth (Node _ l r) = 1 + max (depth l) (depth r)<br />
</haskell><br />
<br />
So, the depth of a tip is zero. The depth of a node depends on its branches, but not its content. See how the constructor in the left hand side names its parts? we don't need the content so we don't name it (using <hask>_</hask>). The left branch is named <hask>l</hask>, the right <hask>r</hask>, allowing us to use these values in the right hand side.<br />
<br />
== Notes and tips==<br />
* You can declare a constructor (for both type and data) to be an infix operator, and this can make your code a lot more readable. However, for alphanumeric names, the names of both the type constructor and the data constructor(s) must start with an uppercase letter. <br />
* [[Tuples]] are a built in feature of the syntax but are plain old algebraic data types! They have only one constructor though. Having the same name as their types (don't freak out, it's just a matter of convenience, as the type constructors and the data constructors have separate namespaces). So, <hask>(4, True)</hask> is really a value of the form <hask>(,) 4 True</hask> having the type <hask>(,) Int Bool</hask>, which, too, is written conveniently as <hask>(Int, Bool)</hask> to make it more readable. Incidentally, the empty tuple type <hask>()</hask> with its only value <hask>()</hask> is used throughout, and is called ''unit''.<br />
* You can, in fact, name the values grouped together, using the [[record]] syntax, <haskell><br />
data Person = Person { name :: String, age :: Int, address :: String }<br />
</haskell> so that for a person <hask>p</hask>, you can say <hask>age p</hask> to select his/her age, without resorting to pattern matching.<br />
* Sometimes you need a little editting or checking when constructing your data. If you do, check [[smart constructors]]<br />
* Sometimes you do not want the user of your library to group arbitrary values together. This is achieved by ''hiding'' your constructor (not mentioning it in the export list of the module), creating an [[abstract data type]] as a result. Along with smart constructors mentioned above, one can achieve encapsulation.<br />
<br />
== References ==<br />
* [http://www.cs.cmu.edu/~rwh/smlbook/book.pdf Programming in Standard ML]: Chapter 10 (Concrete Data Types)</div>Iceland jackhttps://wiki.haskell.org/index.php?title=Constructor&diff=57361Constructor2013-12-29T16:55:57Z<p>Iceland jack: </p>
<hr />
<div>[[Category:Language]]<br />
'''Constructor''' can mean:<br />
* Type constructor<br />
* Data constructor<br />
<br />
A <hask>data</hask> declaration introduces one or more ''type'' constructors and one or more ''value'' constructors for each of the type constructors.<br />
<br />
== Type constructor ==<br />
A '''[[type]] constructor''' may have zero or more arguments, if it has zero arguments it is called a ''nullary'' type constructor (or simply a '''type'''). An example of a nullary type constructor <hask>Bool</hask> with two nullary data constructors <hask>True</hask> and <hask>False</hask><br />
<br />
<haskell><br />
data Bool = True | False<br />
</haskell><br />
<br />
An example of a ''unary'' type constructor <haskell>Tree</haskell><br />
<br />
<haskell><br />
data Tree a = Tip | Node a (Tree a) (Tree a)<br />
</haskell><br />
<br />
illustrates how to define a data type with type constructors (and data constructors at the same time). The type constructor is named <hask>Tree</hask>, but a tree of what? Of any specific type <hask>a</hask>, be it <hask>Integer</hask>, <hask>Maybe String</hask>, or even <hask>Tree b</hask>, in which case it will be a tree of tree of <hask>b</hask>. The data type is polymorphic (and <hask>a</hask> is a type variable that is to be substituted by a specific type). So when used, the values will have types like <hask>Tree Int</hask> or <hask>Tree (Tree Boolean)</hask>.<br />
<br />
== Data constructor ==<br />
A '''data constructor''' (or '''value constructor''') can have zero or more arguments where a data constructor taking zero arguments is called a ''nullary data constructor'' or simply a '''constant'''. They group values together and tag alternatives in an [[algebraic data type]],<br />
<haskell><br />
data Tree a = Tip | Node a (Tree a) (Tree a)<br />
</haskell><br />
where there are two data constructors, <hask>Tip</hask> and <hask>Node</hask>. Any value that belongs to the type <hask>Tree a</hask> (I'm happy leaving the type parameter unspecified) will be a constructed by either <hask>Tip</hask> or <hask>Node</hask>. <hask>Tip</hask> is a constructor alright, but it groups no value whatsoever, that is, it's a nullary constructor. There can be only one value that will have this constructor, also conveniently denoted <hask>Tip</hask>. So nullary constructors contain no data apart from its name! For example, the <hask>Bool</hask> data type is defined to be<br />
<haskell><br />
data Bool = True | False<br />
</haskell><br />
and for all practical purposes you can just think of them as ''constants'' belonging to a type.<br />
On the other hand, <hask>Node</hask> contains other data. The types of those data are its parameters. The first one has type <hask>a</hask>, so it's just a value of the parameter type <hask>a</hask>. This one is the value the tree node holds in it. The remaining two are the branches. Each of them have type <hask>Tree a</hask>, naturally.<br />
<br />
===Data constructors as first class values===<br />
Data constructors are first class values in Haskell and actually have a [[type]]. For instance, the type of the <hask>Left</hask> constructor of the <hask>Either</hask> data type is:<br />
<haskell><br />
Left :: forall b a. a -> Either a b<br />
</haskell><br />
<br />
As first class values, they may be passed to functions, held in a list, be data elements of other algebraic data types and so forth.<br />
<br />
<br />
=== Data constructors are not types===<br />
As discussed above, they denote values. It is illegal to write <hask>Node a (Node a) (Node a)</hask> there, because the type is <hask>Tree</hask>, not <hask>Node</hask>.<br />
<br />
== Deconstructing data constructors ==<br />
All a data constructor does is holding values together. But you want to separate them if you want to use them. This is done via [[pattern matching]],<br />
<haskell><br />
depth Tip = 0<br />
depth (Node _ l r) = 1 + max (depth l) (depth r)<br />
</haskell><br />
So, the depth of a tip is zero. The depth of a node depends on its branches, but not its content. See how the constructor in the left hand side names its parts? we don't need the content so we don't name it (using <hask>_</hask>). The left branch is named <hask>l</hask>, the right <hask>r</hask>, allowing us to use these values in the right hand side.<br />
<br />
== Notes and tips==<br />
* You can declare a constructor (for both type and data) to be an infix operator, and this can make your code a lot more readable. However, for alphanumeric names, the names of both the type constructor and the data constructor(s) must start with an uppercase letter. <br />
* [[Tuples]] are a built in feature of the syntax but are plain old algebraic data types! They have only one constructor though. Having the same name as their types (don't freak out, it's just a matter of convenience, as the type constructors and the data constructors have separate namespaces). So, <hask>(4, True)</hask> is really a value of the form <hask>(,) 4 True</hask> having the type <hask>(,) Int Bool</hask>, which, too, is written conveniently as <hask>(Int, Bool)</hask> to make it more readable. Incidentally, the empty tuple type <hask>()</hask> with its only value <hask>()</hask> is used throughout, and is called ''unit''.<br />
* You can, in fact, name the values grouped together, using the [[record]] syntax, <haskell><br />
data Person = Person { name :: String, age :: Int, address :: String }<br />
</haskell> so that for a person <hask>p</hask>, you can say <hask>age p</hask> to select his/her age, without resorting to pattern matching.<br />
* Sometimes you need a little editting or checking when constructing your data. If you do, check [[smart constructors]]<br />
* Sometimes you do not want the user of your library to group arbitrary values together. This is achieved by ''hiding'' your constructor (not mentioning it in the export list of the module), creating an [[abstract data type]] as a result. Along with smart constructors mentioned above, one can achieve encapsulation.<br />
<br />
== References ==<br />
* [http://www.cs.cmu.edu/~rwh/smlbook/book.pdf Programming in Standard ML]: Chapter 10 (Concrete Data Types)</div>Iceland jackhttps://wiki.haskell.org/index.php?title=Constructor&diff=57360Constructor2013-12-29T16:45:55Z<p>Iceland jack: </p>
<hr />
<div>[[Category:Language]]<br />
'''Constructor''' can mean:<br />
* Type constructor<br />
* Data constructor<br />
<br />
A <hask>data</hask> declaration introduces one or more ''type'' constructors and one or more ''value'' constructors for each of the type constructors.<br />
<br />
== Type constructor ==<br />
A [[type]] constructor is used to construct new types from given ones.<br />
<br />
<haskell><br />
data Tree a = Tip | Node a (Tree a) (Tree a)<br />
</haskell><br />
<br />
illustrates how to define a data type with type constructors (and data constructors at the same time). The type constructor is named <hask>Tree</hask>, but a tree of what? Of any specific type <hask>a</hask>, be it <hask>Integer</hask>, <hask>Maybe String</hask>, or even <hask>Tree b</hask>, in which case it will be a tree of tree of <hask>b</hask>. The data type is polymorphic (and <hask>a</hask> is a type variable that is to be substituted by a specific type). So when used, the values will have types like <hask>Tree Int</hask> or <hask>Tree (Tree Boolean)</hask>.<br />
<br />
== Data constructor ==<br />
A data constructor groups values together and tags alternatives in an [[algebraic data type]],<br />
<haskell><br />
data Tree a = Tip | Node a (Tree a) (Tree a)<br />
</haskell><br />
where there are two data constructors, <hask>Tip</hask> and <hask>Node</hask>. Any value that belongs to the type <hask>Tree a</hask> (I'm happy leaving the type parameter unspecified) will be a constructed by either <hask>Tip</hask> or <hask>Node</hask>. <hask>Tip</hask> is a constructor alright, but it groups no value whatsoever, that is, it's a nullary constructor. There can be only one value that will have this constructor, also conveniently denoted <hask>Tip</hask>. So nullary constructors contain no data apart from its name! For example, the <hask>Bool</hask> data type is defined to be<br />
<haskell><br />
data Bool = True | False<br />
</haskell><br />
and for all practical purposes you can just think of them as ''constants'' belonging to a type.<br />
On the other hand, <hask>Node</hask> contains other data. The types of those data are its parameters. The first one has type <hask>a</hask>, so it's just a value of the parameter type <hask>a</hask>. This one is the value the tree node holds in it. The remaining two are the branches. Each of them have type <hask>Tree a</hask>, naturally.<br />
<br />
===Data constructors as first class values===<br />
Data constructors are first class values in Haskell and actually have a [[type]]. For instance, the type of the <hask>Left</hask> constructor of the <hask>Either</hask> data type is:<br />
<haskell><br />
Left :: forall b a. a -> Either a b<br />
</haskell><br />
<br />
As first class values, they may be passed to functions, held in a list, be data elements of other algebraic data types and so forth.<br />
<br />
<br />
=== Data constructors are not types===<br />
As discussed above, they denote values. It is illegal to write <hask>Node a (Node a) (Node a)</hask> there, because the type is <hask>Tree</hask>, not <hask>Node</hask>.<br />
<br />
== Deconstructing data constructors ==<br />
All a data constructor does is holding values together. But you want to separate them if you want to use them. This is done via [[pattern matching]],<br />
<haskell><br />
depth Tip = 0<br />
depth (Node _ l r) = 1 + max (depth l) (depth r)<br />
</haskell><br />
So, the depth of a tip is zero. The depth of a node depends on its branches, but not its content. See how the constructor in the left hand side names its parts? we don't need the content so we don't name it (using <hask>_</hask>). The left branch is named <hask>l</hask>, the right <hask>r</hask>, allowing us to use these values in the right hand side.<br />
<br />
== Notes and tips==<br />
* You can declare a constructor (for both type and data) to be an infix operator, and this can make your code a lot more readable. However, for alphanumeric names, the names of both the type constructor and the data constructor(s) must start with an uppercase letter. <br />
* [[Tuples]] are a built in feature of the syntax but are plain old algebraic data types! They have only one constructor though. Having the same name as their types (don't freak out, it's just a matter of convenience, as the type constructors and the data constructors have separate namespaces). So, <hask>(4, True)</hask> is really a value of the form <hask>(,) 4 True</hask> having the type <hask>(,) Int Bool</hask>, which, too, is written conveniently as <hask>(Int, Bool)</hask> to make it more readable. Incidentally, the empty tuple type <hask>()</hask> with its only value <hask>()</hask> is used throughout, and is called ''unit''.<br />
* You can, in fact, name the values grouped together, using the [[record]] syntax, <haskell><br />
data Person = Person { name :: String, age :: Int, address :: String }<br />
</haskell> so that for a person <hask>p</hask>, you can say <hask>age p</hask> to select his/her age, without resorting to pattern matching.<br />
* Sometimes you need a little editting or checking when constructing your data. If you do, check [[smart constructors]]<br />
* Sometimes you do not want the user of your library to group arbitrary values together. This is achieved by ''hiding'' your constructor (not mentioning it in the export list of the module), creating an [[abstract data type]] as a result. Along with smart constructors mentioned above, one can achieve encapsulation.<br />
<br />
== References ==<br />
* [http://www.cs.cmu.edu/~rwh/smlbook/book.pdf Programming in Standard ML]: Chapter 10 (Concrete Data Types)</div>Iceland jackhttps://wiki.haskell.org/index.php?title=Constructor&diff=57359Constructor2013-12-29T16:44:25Z<p>Iceland jack: </p>
<hr />
<div>[[Category:Language]]<br />
'''Constructor''' can mean:<br />
* Type constructor<br />
* Data constructor<br />
<br />
A <hask>data</hask> declaration introduces one or more ''type'' constructors and one or more ''value'' constructors for each of the type constructors.<ref name="sml">[http://www.cs.cmu.edu/~rwh/smlbook/book.pdf Programming in Standard ML]: Chapter 10</ref> <br />
<br />
== Type constructor ==<br />
A [[type]] constructor is used to construct new types from given ones.<br />
<br />
<haskell><br />
data Tree a = Tip | Node a (Tree a) (Tree a)<br />
</haskell><br />
<br />
illustrates how to define a data type with type constructors (and data constructors at the same time). The type constructor is named <hask>Tree</hask>, but a tree of what? Of any specific type <hask>a</hask>, be it <hask>Integer</hask>, <hask>Maybe String</hask>, or even <hask>Tree b</hask>, in which case it will be a tree of tree of <hask>b</hask>. The data type is polymorphic (and <hask>a</hask> is a type variable that is to be substituted by a specific type). So when used, the values will have types like <hask>Tree Int</hask> or <hask>Tree (Tree Boolean)</hask>.<br />
<br />
== Data constructor ==<br />
A data constructor groups values together and tags alternatives in an [[algebraic data type]],<br />
<haskell><br />
data Tree a = Tip | Node a (Tree a) (Tree a)<br />
</haskell><br />
where there are two data constructors, <hask>Tip</hask> and <hask>Node</hask>. Any value that belongs to the type <hask>Tree a</hask> (I'm happy leaving the type parameter unspecified) will be a constructed by either <hask>Tip</hask> or <hask>Node</hask>. <hask>Tip</hask> is a constructor alright, but it groups no value whatsoever, that is, it's a nullary constructor. There can be only one value that will have this constructor, also conveniently denoted <hask>Tip</hask>. So nullary constructors contain no data apart from its name! For example, the <hask>Bool</hask> data type is defined to be<br />
<haskell><br />
data Bool = True | False<br />
</haskell><br />
and for all practical purposes you can just think of them as ''constants'' belonging to a type.<br />
On the other hand, <hask>Node</hask> contains other data. The types of those data are its parameters. The first one has type <hask>a</hask>, so it's just a value of the parameter type <hask>a</hask>. This one is the value the tree node holds in it. The remaining two are the branches. Each of them have type <hask>Tree a</hask>, naturally.<br />
<br />
===Data constructors as first class values===<br />
Data constructors are first class values in Haskell and actually have a [[type]]. For instance, the type of the <hask>Left</hask> constructor of the <hask>Either</hask> data type is:<br />
<haskell><br />
Left :: forall b a. a -> Either a b<br />
</haskell><br />
<br />
As first class values, they may be passed to functions, held in a list, be data elements of other algebraic data types and so forth.<br />
<br />
<br />
=== Data constructors are not types===<br />
As discussed above, they denote values. It is illegal to write <hask>Node a (Node a) (Node a)</hask> there, because the type is <hask>Tree</hask>, not <hask>Node</hask>.<br />
<br />
== Deconstructing data constructors ==<br />
All a data constructor does is holding values together. But you want to separate them if you want to use them. This is done via [[pattern matching]],<br />
<haskell><br />
depth Tip = 0<br />
depth (Node _ l r) = 1 + max (depth l) (depth r)<br />
</haskell><br />
So, the depth of a tip is zero. The depth of a node depends on its branches, but not its content. See how the constructor in the left hand side names its parts? we don't need the content so we don't name it (using <hask>_</hask>). The left branch is named <hask>l</hask>, the right <hask>r</hask>, allowing us to use these values in the right hand side.<br />
<br />
== Notes and tips==<br />
* You can declare a constructor (for both type and data) to be an infix operator, and this can make your code a lot more readable. However, for alphanumeric names, the names of both the type constructor and the data constructor(s) must start with an uppercase letter. <br />
* [[Tuples]] are a built in feature of the syntax but are plain old algebraic data types! They have only one constructor though. Having the same name as their types (don't freak out, it's just a matter of convenience, as the type constructors and the data constructors have separate namespaces). So, <hask>(4, True)</hask> is really a value of the form <hask>(,) 4 True</hask> having the type <hask>(,) Int Bool</hask>, which, too, is written conveniently as <hask>(Int, Bool)</hask> to make it more readable. Incidentally, the empty tuple type <hask>()</hask> with its only value <hask>()</hask> is used throughout, and is called ''unit''.<br />
* You can, in fact, name the values grouped together, using the [[record]] syntax, <haskell><br />
data Person = Person { name :: String, age :: Int, address :: String }<br />
</haskell> so that for a person <hask>p</hask>, you can say <hask>age p</hask> to select his/her age, without resorting to pattern matching.<br />
* Sometimes you need a little editting or checking when constructing your data. If you do, check [[smart constructors]]<br />
* Sometimes you do not want the user of your library to group arbitrary values together. This is achieved by ''hiding'' your constructor (not mentioning it in the export list of the module), creating an [[abstract data type]] as a result. Along with smart constructors mentioned above, one can achieve encapsulation.</div>Iceland jack