Difference between revisions of "Binary IO"

From HaskellWiki
(moved)
 
(→‎Data.Binary: Added a link to System.IO)
 
(6 intermediate revisions by 5 users not shown)
Line 1: Line 1:
  +
== Data.Binary ==
   
There are a number of binary I/O libraries available for Haskell.
+
There are a number of binary I/O libraries available for Haskell. The
  +
best to use is the new, semi-standard Data.Binary library:
   
  +
* [http://hackage.haskell.org/cgi-bin/hackage-scripts/package/binary Data.Binary]
* JeremyShaw's update of HalDaume's NewBinary package (Cabalized): http://www.n-heptane.com/nhlab/repos/NewBinary
 
* [http://www.cse.unsw.edu.au/~dons/fps.html Data.ByteString] (Cabalised) also provides byte level operations, and is used in some applications for binary IO
 
* [http://www.cs.helsinki.fi/u/ekarttun/SerTH/ SerTH], the TH version (sort of) of NewBinary.
 
* PeterSimons's BlockIO package (Cabalized): http://cryp.to/blockio/
 
* JohnGoerzen's MissingH package (Cabalized): http://quux.org/devel/missingh
 
* SimonMarlow's experimental NewIO package: http://www.haskell.org/~simonmar/new-io.tar.gz (documentation at http://www.haskell.org/~simonmar/io/)
 
   
  +
It's very simple to use, and provides a highly efficient, pure interface
If you have simple binary IO requirements, then FPS might be easiest -- you get a List-like interface to packed byte arrays (interface documented [http://www.cse.unsw.edu.au/~dons/fps/Data.FastPackedString.html here]). For more complex serialisation, NewBinary would be preferred. [[Lambdabot]] follows this rule: when serialising lists and maps, it uses Data.ByteStrings. For complex algebraic datatypes, NewBinary is used (see [http://www.cse.unsw.edu.au/~dons/code/lambdabot/Plugins/Seen.hs Plugin/Seen.hs] in Lambdabot).
 
  +
to binary serialisation.
   
  +
To simply write binary data to a file, use
NewBinary is based on Binary.hs from GHC, and traces back to NHC's binary library, described in [ftp://ftp.cs.york.ac.uk/pub/malcolm/ismm98.html "The Bits Between The Lambdas"]. It is simple to use: for each value you wish to serialise/unserialise, you write an instance of Binary for its type. This can be automated with the DrIFT tool, which will derive binary for you (as is done in GHC's .hi files). You can rip out your own copy of Binary.hs from NewBinary -- just take Binary.hs and FastMutInt.lhs.
 
  +
* [http://hackage.haskell.org/package/base-4.6.0.1/docs/System-IO.html#g:21 System.IO]
   
  +
It is a part of the base package, so it comes with [[GHC]].
Lambdabot, hmp3 and hs-plugins are applications I've written using NewBinary, which I have found simple and effective. Here, for example, is the binary serialisation code for the [http://www.cse.unsw.edu.au/~dons/code/hmp3.html hmp3] database type, an Array, and a user-defined type, File. Easy!
 
   
  +
A tutorial:
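<haskell>
-- needs: import Control.Monad (replicateM); import Data.Array
instance Binary a => Binary (Array Int a) where
    put_ bh arr = do
        put_ bh (bounds arr)
        mapM_ (put_ bh) (elems arr)
    get bh = do
        (x,y) <- get bh :: IO (Int,Int)
        -- read exactly as many elements as the bounds cover,
        -- which also works when the lower bound is not 0
        els <- replicateM (rangeSize (x,y)) (get bh)
        return $! listArray (x,y) els
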
  +
* [[Serialisation and compression with Data Binary]]
instance Binary File where
    put_ bh (File nm i) = do
        put_ bh nm
        put_ bh i
    get bh = do
        nm <- get bh
        i  <- get bh
        return (File nm i)
</haskell>
 
   
  +
See also [[Dealing with binary data]]
As an aside, SerTH lets you derive instances of a Binary-like class automatically using TH, rather than requiring hand-written instances or DrIFT.
 
   
  +
== Other libraries ==
In summary, there are a number of ways to do binary IO in Haskell efficiently and simply. It's really not that hard at all.
 
  +
 
* JeremyShaw's update of HalDaume's NewBinary package (Cabalized): http://www.n-heptane.com/nhlab/repos/NewBinary
 
* [http://hackage.haskell.org/package/bytestring Data.ByteString] (Cabalised) also provides byte level operations, and is used in some applications for binary IO
 
* [http://web.archive.org/web/20080123105519/http://www.cs.helsinki.fi/u/ekarttun/SerTH/ SerTH], the TH version (sort of) of NewBinary.
 
* PeterSimons's BlockIO package (Cabalized): http://cryp.to/blockio/
 
* JohnGoerzen's MissingH package (Cabalized): http://quux.org/devel/missingh
 
* SimonMarlow's experimental NewIO package: http://www.haskell.org/~simonmar/new-io.tar.gz (documentation at http://www.haskell.org/~simonmar/io/)
  +
  +
For very simple serialisation, use <hask>read</hask> and <hask>show</hask>.
  +
 
If you have simple binary IO requirements, then Data.ByteString might be easiest: you get a list-like interface to packed byte arrays (interface documented [http://www.cse.unsw.edu.au/~dons/fps/Data.FastPackedString.html here]). For more complex serialisation, Data.Binary is preferable.
   
 
[[Category:Tutorials]]
 

Latest revision as of 19:03, 23 October 2013

Data.Binary

There are a number of binary I/O libraries available for Haskell. The best one to use is the semi-standard Data.Binary library:

   * Data.Binary

It's very simple to use, and provides a highly efficient, pure interface to binary serialisation.
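As a rough illustration of that pure interface, here is a minimal sketch using `encode` and `decode` from Data.Binary. The `File` record and the `roundTrip` helper are hypothetical names chosen for this example; only `Binary`, `put`, `get`, `encode` and `decode` come from the library.

```haskell
import Data.Binary (Binary(..), encode, decode)
import qualified Data.ByteString.Lazy as BL

-- Hypothetical record used only for illustration.
data File = File String Int
  deriving (Show, Eq)

-- Serialise the fields in order; deserialise them in the same order.
instance Binary File where
  put (File nm i) = put nm >> put i
  get = File <$> get <*> get

-- Pure round trip: value -> lazy ByteString -> value.
roundTrip :: File -> File
roundTrip = decode . encode

-- Size in bytes of the serialised form.
encodedSize :: File -> Int
encodedSize = fromIntegral . BL.length . encode
```

For file-based use, the same package also exports `encodeFile` and `decodeFile`, which write and read a value directly given a `FilePath`.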

To simply write binary data to a file, use

   * System.IO

It is a part of the base package, so it comes with GHC.
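A minimal sketch of raw binary file IO using only the base package might look like the following. The helper names `writeBytes` and `readBytes` are made up for this example; `withBinaryFile`, `hPutBuf` and `hGetBuf` are the System.IO functions doing the work, and the buffer handling comes from Foreign.Marshal.Array.

```haskell
import System.IO (withBinaryFile, IOMode(..), hPutBuf, hGetBuf)
import Foreign.Marshal.Array (withArrayLen, allocaArray, peekArray)
import Data.Word (Word8)

-- Write a list of bytes to a file, with no newline or encoding translation.
writeBytes :: FilePath -> [Word8] -> IO ()
writeBytes path bytes =
  withBinaryFile path WriteMode $ \h ->
    -- Word8 is one byte wide, so the element count is also the byte count.
    withArrayLen bytes $ \n ptr -> hPutBuf h ptr n

-- Read at most maxLen bytes from the start of a file.
readBytes :: FilePath -> Int -> IO [Word8]
readBytes path maxLen =
  withBinaryFile path ReadMode $ \h ->
    allocaArray maxLen $ \ptr -> do
      n <- hGetBuf h ptr maxLen   -- n is the number of bytes actually read
      peekArray n ptr
```

This works at the level of raw buffers, which is why a higher-level library is usually more convenient for anything beyond a few bytes.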

A tutorial:

   * Serialisation and compression with Data Binary

See also Dealing with binary data

Other libraries

For very simple serialisation, use read and show.
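A small sketch of the read/show approach, assuming a hypothetical `Config` type with derived `Show` and `Read` instances. Note that the result is a text file rather than a binary one, and `saveConfig` and `loadConfig` are illustrative names, not library functions.

```haskell
import Text.Read (readMaybe)

-- Hypothetical type used only for illustration.
data Config = Config { name :: String, retries :: Int }
  deriving (Show, Read, Eq)

-- Serialise with show and write the resulting text to a file.
saveConfig :: FilePath -> Config -> IO ()
saveConfig path = writeFile path . show

-- Read the file back and parse it; Nothing signals malformed input.
loadConfig :: FilePath -> IO (Maybe Config)
loadConfig path = do
  s <- readFile path
  -- Force the whole file so the lazily read handle is closed before returning.
  length s `seq` return (readMaybe s)
```

Using `readMaybe` instead of `read` avoids a runtime exception when the file contents do not parse.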

If you have simple binary IO requirements, then Data.ByteString might be easiest: you get a list-like interface to packed byte arrays (see the Data.ByteString interface documentation). For more complex serialisation, Data.Binary is preferable.
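To give a flavour of that list-like interface, here is a short sketch using the strict Data.ByteString module. The function names `saveBytes`, `loadHeader` and `checksum` are invented for this example; the `B.` functions they call are part of the bytestring package.

```haskell
import qualified Data.ByteString as B
import Data.Word (Word8)

-- Pack a list of bytes and write it to a file in one step.
saveBytes :: FilePath -> [Word8] -> IO ()
saveBytes path = B.writeFile path . B.pack

-- Read a file and return its first n bytes as a list.
loadHeader :: Int -> FilePath -> IO [Word8]
loadHeader n path = B.unpack . B.take n <$> B.readFile path

-- Sum of all bytes, wrapping modulo 256 because the accumulator is a Word8.
checksum :: B.ByteString -> Word8
checksum = B.foldl' (+) 0
```

Functions such as `B.take`, `B.foldl'` and `B.unpack` mirror their list counterparts, so existing list-processing habits carry over directly.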