Difference between revisions of "Library/Data encoding"
(Development has been moved to github) |
GracjanPolak (talk | contribs) (Page moved to readme.md in source code on github) |
||
(One intermediate revision by one other user not shown) | |||
Line 1: | Line 1: | ||
− | [[Category:Libraries]] |
||
− | Data Encodings (dataenc): A collection of data encoding algorithms. |
||
− | |||
− | == Data encodings library == |
||
− | |||
− | The data encodings library strives to provide implementations in Haskell of every major data encoding, and a few minor ones as well. Currently the following encodings are implemented: |
||
− | |||
− | * Base16 (<hask>Codec.Binary.Base16</hask>) |
||
− | * Base32 (<hask>Codec.Binary.Base32</hask>) |
||
− | * Base32Hex (<hask>Codec.Binary.Base32Hex</hask>) |
||
− | * Base64 (<hask>Codec.Binary.Base64</hask>) |
||
− | * Base64Url (<hask>Codec.Binary.Base64Url</hask>) |
||
− | * Base85 (<hask>Codec.Binary.Base85</hask>) |
||
− | * Python string escaping (<hask>Codec.Binary.PythonString</hask>) |
||
− | * Quoted-Printable (<hask>Codec.Binary.QuotedPrintable</hask>) |
||
− | * URL encoding (<hask>Codec.Binary.Url</hask>) |
||
− | * Uuencode (<hask>Codec.Binary.Uu</hask>) |
||
− | * Xxencode (<hask>Codec.Binary.Xx</hask>) |
||
− | * yEncode (<hask>Codec.Binary.Yenc</hask>) |
||
− | |||
− | Some encodings also specify headers and footers for the encoded data. Implementing those is left to the user of the library. |
||
− | |||
− | == The API == |
||
− | |||
− | === Main API === |
||
− | |||
− | The module <hask>Codec.Binary.DataEncoding</hask> provides a type that collects the functions for an individual encoding: |
||
− | |||
− | <haskell> |
||
− | data DataCodec = DataCodec { |
||
− | encode :: [Word8] -> String, |
||
− | decode :: String -> Maybe [Word8], |
||
− | decode' :: String -> [Maybe Word8], |
||
− | chop :: Int -> String -> [String], |
||
− | unchop :: [String] -> String |
||
− | } |
||
− | </haskell> |
||
− | |||
− | It also exposes instances of this type for each encoding: |
||
− | |||
− | <haskell> |
||
− | base16 :: DataCodec |
||
− | base32 :: DataCodec |
||
− | base32Hex :: DataCodec |
||
− | base64 :: DataCodec |
||
− | base64Url :: DataCodec |
||
− | uu :: DataCodec |
||
− | </haskell> |
||
− | |||
− | <b>NB</b> There is no instance for yEncoding since the functions in that module have slightly different type signatures. |
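Because a <hask>DataCodec</hask> is an ordinary record of functions, code can be written once and reused with any encoding. The following is only a sketch: it declares a local, trimmed stand-in for the record (two of the five fields) so that it compiles without the library, and both <hask>decimal</hask> and <hask>roundTrips</hask> are hypothetical names invented for illustration, not part of dataenc.

```haskell
import Data.Word (Word8)

-- Local stand-in for the DataCodec record, trimmed to two fields so the
-- sketch compiles without the library installed.
data DataCodec = DataCodec
  { encode :: [Word8] -> String
  , decode :: String -> Maybe [Word8]
  }

-- Hypothetical toy codec (NOT part of the library): each octet is written
-- as a decimal number, with octets separated by single spaces.
decimal :: DataCodec
decimal = DataCodec
  { encode = unwords . map show
  , decode = mapM readOctet . words
  }
  where
    -- Accept only complete decimal numbers in the octet range 0..255.
    readOctet :: String -> Maybe Word8
    readOctet s = case reads s :: [(Int, String)] of
      [(n, "")] | n >= 0 && n <= 255 -> Just (fromIntegral n)
      _                              -> Nothing

-- Hypothetical helper: it only uses the record fields, so it works
-- unchanged with any value of type DataCodec.
roundTrips :: DataCodec -> [Word8] -> Bool
roundTrips codec octets = decode codec (encode codec octets) == Just octets
```

With the library installed, the same pattern should presumably apply to the exposed values, for example <hask>encode base64</hask>, after importing <hask>Codec.Binary.DataEncoding</hask> instead of declaring the record locally.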
||
− | |||
− | === Secondary API === |
||
− | |||
− | Each individual encoding module is also exposed and offers five functions: |
||
− | |||
− | <haskell> |
||
− | encode :: [Word8] -> String |
||
− | decode :: String -> Maybe [Word8] |
||
− | decode' :: String -> [Maybe Word8] |
||
− | chop :: Int -> String -> [String] |
||
− | unchop :: [String] -> String |
||
− | </haskell> |
||
− | |||
− | == Description of the encodings == |
||
− | |||
− | === Base16 === |
||
− | |||
− | Implemented as it's defined in [http://tools.ietf.org/html/rfc4648 RFC 4648]. |
||
− | |||
− | Each four-bit nibble of an octet is encoded as one character from the set 0-9, A-F. |
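The nibble-to-character mapping can be sketched as follows. This is an independent illustration of the RFC 4648 scheme, not the code in <hask>Codec.Binary.Base16</hask>; the names <hask>hexAlphabet</hask> and <hask>encodeBase16</hask> are made up for the example.

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Word (Word8)

-- The 16-character alphabet from RFC 4648, indexed by nibble value.
hexAlphabet :: String
hexAlphabet = "0123456789ABCDEF"

-- Illustrative encoder: every octet yields two characters, the high
-- nibble first and the low nibble second.
encodeBase16 :: [Word8] -> String
encodeBase16 = concatMap encodeOctet
  where
    encodeOctet o = [digit (o `shiftR` 4), digit (o .&. 0x0F)]
    -- A nibble is always in 0..15, so the index is in range.
    digit n = hexAlphabet !! fromIntegral n
```

Applied to the octets of the ASCII string "foo" (0x66, 0x6F, 0x6F), this yields "666F6F", which matches the test vector in RFC 4648.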
||
− | |||
− | === Base32 === |
||
− | |||
− | Implemented as it's defined in [http://tools.ietf.org/html/rfc4648 RFC 4648]. |
||
− | |||
− | Five octets (40 bits) are expanded into eight values so that only the five least significant bits of each are used. Each value is then encoded as one character of a 32-character encoding alphabet. |
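The regrouping of five octets into eight 5-bit values can be sketched like this. The sketch handles one full group only and ignores padding of shorter input; <hask>base32Alphabet</hask> and <hask>encodeGroup32</hask> are illustrative names, not the library's API.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word8, Word64)

-- The Base32 alphabet from RFC 4648, indexed by 5-bit value.
base32Alphabet :: String
base32Alphabet = ['A'..'Z'] ++ ['2'..'7']

-- Illustrative encoder for exactly one group of five octets.
-- Any other input length gives Nothing, since padding is not handled here.
encodeGroup32 :: [Word8] -> Maybe String
encodeGroup32 octets
  | length octets /= 5 = Nothing
  | otherwise =
      Just [ base32Alphabet !! fromIntegral ((bits `shiftR` s) .&. 0x1F)
           | s <- [35, 30 .. 0] ]
  where
    -- Pack the five octets into one 40-bit quantity, first octet on top.
    bits :: Word64
    bits = foldl (\acc o -> (acc `shiftL` 8) .|. fromIntegral o) 0 octets
```

For the octets of the ASCII string "fooba" this produces "MZXW6YTB", in line with the RFC 4648 test vectors.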
||
− | |||
− | === Base32Hex === |
||
− | |||
− | Implemented as it's defined in [http://tools.ietf.org/html/rfc4648 RFC 4648]. |
||
− | |||
− | Just like Base32 but with a different encoding alphabet. Unlike Base64 and Base32, data encoded with Base32Hex maintains its sort order when the encoded data is compared bitwise. |
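One way to see why the sort order is preserved is to check whether the alphabet itself is in ascending character order: if larger 5-bit values always map to larger characters, comparing encoded text agrees with comparing the underlying values. A small sketch, with all names chosen for this example only:

```haskell
-- The two alphabets from RFC 4648, each indexed by 5-bit value.
base32Alphabet, base32HexAlphabet :: String
base32Alphabet    = ['A'..'Z'] ++ ['2'..'7']
base32HexAlphabet = ['0'..'9'] ++ ['A'..'V']

-- True when every character is strictly greater than its predecessor,
-- i.e. when a larger value always maps to a larger character.
isAscending :: String -> Bool
isAscending cs = and (zipWith (<) cs (drop 1 cs))
```

Here <hask>isAscending base32HexAlphabet</hask> holds, while <hask>isAscending base32Alphabet</hask> does not, because the digits 2-7 stand for the highest values yet sort before the letters in ASCII.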
||
− | |||
− | === Base64 === |
||
− | |||
− | Implemented as it's defined in [http://tools.ietf.org/html/rfc4648 RFC 4648]. |
||
− | |||
− | Three octets (24 bits) are expanded into four values so that only the six least significant bits of each are used. Each value is then encoded as one character of a 64-character encoding alphabet. |
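The regrouping of three octets into four 6-bit values can be sketched as below. As with any such sketch, this is not the library's implementation: it covers one full group only, leaves out padding, and uses the invented names <hask>base64Alphabet</hask> and <hask>encodeGroup64</hask>.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word8, Word32)

-- The Base64 alphabet from RFC 4648, indexed by 6-bit value.
base64Alphabet :: String
base64Alphabet = ['A'..'Z'] ++ ['a'..'z'] ++ ['0'..'9'] ++ "+/"

-- Illustrative encoder for exactly one group of three octets.
-- Any other input length gives Nothing, since padding is not handled here.
encodeGroup64 :: [Word8] -> Maybe String
encodeGroup64 octets
  | length octets /= 3 = Nothing
  | otherwise =
      Just [ base64Alphabet !! fromIntegral ((bits `shiftR` s) .&. 0x3F)
           | s <- [18, 12, 6, 0] ]
  where
    -- Pack the three octets into one 24-bit quantity, first octet on top.
    bits :: Word32
    bits = foldl (\acc o -> (acc `shiftL` 8) .|. fromIntegral o) 0 octets
```

For the octets of the ASCII string "foo" this produces "Zm9v", the value listed in the RFC 4648 test vectors.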
||
− | |||
− | === Base64Url === |
||
− | |||
− | Implemented as it's defined in [http://tools.ietf.org/html/rfc4648 RFC 4648]. |
||
− | |||
− | Just like Base64 but with a different encoding alphabet. The encoding alphabet is made URL and filename safe by replacing <tt>+</tt> and <tt>/</tt> with <tt>-</tt> and <tt>_</tt> respectively. |
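Since only two characters of the alphabet differ, Base64 text can be translated to and from the URL-safe form character by character. The helpers below are a sketch with made-up names (<hask>toUrlSafe</hask>, <hask>fromUrlSafe</hask>); they are not functions exported by <hask>Codec.Binary.Base64Url</hask>.

```haskell
-- Translate standard Base64 text to the URL and filename safe alphabet.
toUrlSafe :: String -> String
toUrlSafe = map swap
  where
    swap '+' = '-'
    swap '/' = '_'
    swap c   = c

-- Translate URL-safe text back to the standard Base64 alphabet.
fromUrlSafe :: String -> String
fromUrlSafe = map swap
  where
    swap '-' = '+'
    swap '_' = '/'
    swap c   = c
```

For example, the two octets 0xFB 0xFF encode to "+/8=" in Base64, and <hask>toUrlSafe</hask> turns that into "-_8=".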
||
− | |||
− | === Base85 === |
||
− | |||
− | Implemented as described in the [http://en.wikipedia.org/wiki/Ascii85 Wikipedia article]. |
||
− | |||
− | === Python string escaping === |
||
− | |||
− | Implementation of Python's string escaping. |
||
− | |||
− | === Quoted-Printable === |
||
− | |||
− | Implemented as defined in [http://tools.ietf.org/html/rfc2045 RFC 2045]. |
||
− | |||
− | === URL encoding === |
||
− | |||
− | Implemented as defined in [http://tools.ietf.org/html/rfc3986 RFC 3986]. |
||
− | |||
− | === Uuencode === |
||
− | |||
− | Unfortunately uuencode is badly specified and there are in fact several differing implementations of it. This implementation attempts to encode data in the same way as the <tt>uuencode</tt> utility found in [http://www.gnu.org/software/sharutils/ GNU's sharutils]. The behaviour of <hask>chop</hask> and <hask>unchop</hask> also follows how sharutils splits and rejoins encoded lines. |
||
− | |||
− | === Xxencode === |
||
− | |||
− | Implemented as described in the [http://en.wikipedia.org/wiki/Xxencode Wikipedia article]. |
||
− | |||
− | === yEncoding === |
||
− | |||
− | Implemented as it's defined in [http://yence.sourceforge.net/docs/protocol/version1_3_draft.html the 1.3 draft]. |
||
− | |||
− | == Downloading == |
||
− | |||
− | The current release is available from [http://hackage.haskell.org/cgi-bin/hackage-scripts/package/dataenc HackageDB]. |
||
− | |||
− | See [[#Contributing]] below for how to get the development version. |
||
− | |||
− | == Example of use == |
||
− | |||
− | The package [http://hackage.haskell.org/cgi-bin/hackage-scripts/package/omnicodec omnicodec] contains two command line tools for encoding and decoding data. |
||
− | |||
− | == Contributing == |
||
− | |||
− | The source is hosted on [https://github.com/magthe/dataenc/ GitHub] and can be downloaded using git: |
||
− | |||
− | git clone https://github.com/magthe/dataenc.git |
||
− | |||
− | Patches can be sent to magnus@therning.org, but I suggest using GitHub's pull requests if possible. |