Difference between revisions of "Cookbook/Lists and strings"

Latest revision as of 21:02, 6 January 2019

Lists

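In Haskell, lists fill the role that arrays play in most other languages. They are singly linked lists, though, so indexing with (!!) takes time linear in the index.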

Creating simple lists

Problem	Solution	Examples
creating a list with given elements	-	3 : 12 : 42 : [] --> [3,12,42]; 'f' : 'o' : 'o' : [] --> "foo"
creating a list with step size 1	-	[1..10] --> [1,2,3,4,5,6,7,8,9,10]; ['a'..'z'] --> "abcdefghijklmnopqrstuvwxyz"
creating a list with a different step size	-	[1,3..10] --> [1,3,5,7,9]; ['a','c'..'z'] --> "acegikmoqsuwy"
creating an infinite constant list	-	[1,1..] --> [1,1,1,1,1,...
creating an infinite list with step size 1	-	[1..] --> [1,2,3,4,5,...
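
The constructions in the table can be tried together in a small program. The following is a minimal sketch; the names `explicit`, `evens` and `firstFive` are made up for this illustration. Because lists are lazy, `take` can cut a finite prefix off an infinite list.

```haskell
-- Illustrative names only; none of them come from the cookbook itself.

-- Building a list with (:) gives the same result as the bracket syntax.
explicit :: [Int]
explicit = 3 : 12 : 42 : []

-- An infinite list with step size 2, starting at 0.
evens :: [Int]
evens = [0,2..]

-- Laziness lets us take a finite prefix of an infinite list.
firstFive :: [Int]
firstFive = take 5 evens

main :: IO ()
main = do
  print (explicit == [3,12,42])   -- True
  print firstFive                 -- [0,2,4,6,8]
```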

List comprehensions

The list of all squares can be written concisely using a list comprehension:

squares = [x*x | x <- [1..]]

List comprehensions allow for constraints as well:

-- multiples of 3 or 5
mults = [ x | x <- [1..], mod x 3 == 0 || mod x 5 == 0 ]
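
Both lists above are infinite, so in practice they are consumed with functions such as `take` or `takeWhile`. The sketch below repeats the two definitions with type signatures and adds a comprehension with two generators; `pairs` is a hypothetical name used only for this example.

```haskell
squares :: [Integer]
squares = [x*x | x <- [1..]]

-- multiples of 3 or 5
mults :: [Integer]
mults = [x | x <- [1..], mod x 3 == 0 || mod x 5 == 0]

-- Several generators may be combined; a later generator can depend on an
-- earlier one. This yields all pairs (a, b) with 1 <= a < b <= n.
pairs :: Int -> [(Int, Int)]
pairs n = [(a, b) | a <- [1..n], b <- [a+1..n]]

main :: IO ()
main = do
  print (take 5 squares)            -- [1,4,9,16,25]
  print (takeWhile (< 13) mults)    -- [3,5,6,9,10,12]
  print (pairs 3)                   -- [(1,2),(1,3),(2,3)]
```

Note that `filter (< 13) mults` would never finish, because `filter` has to inspect the whole infinite list; `takeWhile` stops at the first element that fails the test.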

Combining lists

Problem	Solution	Examples
combining two lists	(++)	"foo" ++ "bar" --> "foobar"; [42,43] ++ [60,61] --> [42,43,60,61]
combining many lists	concat	concat ["foo", "bar", "baz"] --> "foobarbaz"
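
Two close relatives of `concat` are often useful when combining lists: `intercalate` from `Data.List`, which puts a separator between the pieces, and `concatMap` from the Prelude. A short sketch, with `commaSeparated` and `doubled` as made-up names:

```haskell
import Data.List (intercalate)

-- Join the pieces with a separator between them.
commaSeparated :: String
commaSeparated = intercalate ", " ["foo", "bar", "baz"]

-- concatMap applies a list-producing function to every element and
-- concatenates the results.
doubled :: String
doubled = concatMap (replicate 2) "abc"

main :: IO ()
main = do
  putStrLn ("foo" ++ "bar")   -- foobar
  putStrLn commaSeparated     -- foo, bar, baz
  putStrLn doubled            -- aabbcc
```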

Accessing sublists

Problem	Solution	Examples
accessing the first element	head	head "foo bar baz" --> 'f'
accessing the last element	last	last "foo bar baz" --> 'z'
accessing the element at a given index	(!!)	"foo bar baz" !! 4 --> 'b'
accessing the first `n` elements	take	take 3 "foo bar baz" --> "foo"
accessing the last `n` elements	reverse, take	reverse . take 3 . reverse $ "foobar" --> "bar"
accessing the `n` elements starting from index `m`	drop, take	take 4 $ drop 2 "foo bar baz" --> "o ba"
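
The last two rows of the table can be wrapped into reusable functions. The sketch below uses hypothetical helper names (`lastN`, `slice`, `safeHead`). Since `head`, `last` and `(!!)` raise an error on an empty list or an out-of-range index, it also shows a `Maybe`-returning alternative to `head`.

```haskell
-- Hypothetical helper names, chosen for this example only.

-- The last n elements of a list (the whole list if it is shorter than n).
lastN :: Int -> [a] -> [a]
lastN n = reverse . take n . reverse

-- The n elements starting at index m.
slice :: Int -> Int -> [a] -> [a]
slice m n = take n . drop m

-- A total variant of head: Nothing for the empty list.
safeHead :: [a] -> Maybe a
safeHead []    = Nothing
safeHead (x:_) = Just x

main :: IO ()
main = do
  putStrLn (lastN 3 "foobar")          -- bar
  putStrLn (slice 2 4 "foo bar baz")   -- o ba
  print (safeHead "foo bar baz")       -- Just 'f'
  print (safeHead "")                  -- Nothing
```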

Splitting lists

Problem	Solution	Examples
splitting a string into a list of words	words	words "foo bar\t baz\n" --> ["foo","bar","baz"]
splitting a list into two parts	splitAt	splitAt 3 "foo bar baz" --> ("foo"," bar baz")
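
The Prelude has no function for splitting on an arbitrary delimiter, but one can be built from `break`. The following is a minimal sketch; `splitOn'` is a hypothetical helper written for this example, not a library function. `lines` and `span` are Prelude functions.

```haskell
-- splitOn' is a hypothetical helper, not part of the Prelude.
-- It splits a string at every occurrence of the separator character.
splitOn' :: Char -> String -> [String]
splitOn' sep s = case break (== sep) s of
  (chunk, [])       -> [chunk]                      -- no separator left
  (chunk, _ : rest) -> chunk : splitOn' sep rest    -- drop the separator

main :: IO ()
main = do
  print (splitOn' ',' "foo,bar,,baz")   -- ["foo","bar","","baz"]
  print (lines "foo\nbar\n")            -- ["foo","bar"]
  -- span splits at the first element that fails the predicate.
  print (span (/= ' ') "foo bar baz")   -- ("foo"," bar baz")
```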

Strings

Since strings are lists of characters, you can use any available list function.

Multiline strings

"foo\
\bar"               --> "foobar"
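
The backslash pair above is a string gap: it lets a literal continue on the next source line without adding any characters. To get a string that really contains line breaks, use `\n` or build it with `unlines`. A small sketch, with `gapped` and `multi` as illustrative names:

```haskell
-- A string gap: backslash, whitespace (including the newline), backslash.
-- The gap itself contributes no characters to the string.
gapped :: String
gapped = "foo\
         \bar"

-- unlines appends a newline to every element and concatenates them.
multi :: String
multi = unlines ["foo", "bar"]

main :: IO ()
main = do
  putStrLn gapped   -- foobar
  print multi       -- "foo\nbar\n"
```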

Converting between characters and values

Problem	Solution	Examples
converting a character to a numeric value	ord (from Data.Char)	ord 'A' --> 65
converting a numeric value to a character	chr (from Data.Char)	chr 99 --> 'c'
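
A typical use of `ord` and `chr` together is arithmetic on characters. As an illustration, the sketch below rotates lowercase ASCII letters through the alphabet; `shiftLower` is a hypothetical helper defined only for this example.

```haskell
import Data.Char (chr, isAsciiLower, ord)

-- shiftLower is a hypothetical helper: it rotates lowercase ASCII letters
-- by n positions and leaves every other character unchanged.
shiftLower :: Int -> Char -> Char
shiftLower n c
  | isAsciiLower c = chr (ord 'a' + (ord c - ord 'a' + n) `mod` 26)
  | otherwise      = c

main :: IO ()
main = do
  putStrLn (map (shiftLower 1) "foo bar")    -- gpp cbs
  putStrLn (map (shiftLower 13) "foo")       -- sbb
  putStrLn (map (shiftLower (-1)) "gpp")     -- foo
```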

Reversing a string by words or characters

Problem	Solution	Examples
reversing a string by characters	reverse	reverse "foo bar baz" --> "zab rab oof"
reversing a string by words	words, reverse, unwords	unwords $ reverse $ words "foo bar baz" --> "baz bar foo"
reversing a string by characters by words	words, reverse, map, unwords	unwords $ map reverse $ words "foo bar baz" --> "oof rab zab"
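
Written in point-free style, the two word-based recipes become small reusable functions. The names `reverseWords` and `reverseEachWord` are assumptions made for this sketch. One caveat: `words` followed by `unwords` does not preserve the original whitespace.

```haskell
-- Reverse the order of the words.
reverseWords :: String -> String
reverseWords = unwords . reverse . words

-- Reverse the characters of each word, keeping the word order.
reverseEachWord :: String -> String
reverseEachWord = unwords . map reverse . words

main :: IO ()
main = do
  putStrLn (reverseWords "foo bar baz")      -- baz bar foo
  putStrLn (reverseEachWord "foo bar baz")   -- oof rab zab
  -- Runs of whitespace (spaces, tabs, newlines) collapse to single spaces.
  print (reverseWords "foo   bar\tbaz")      -- "baz bar foo"
```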

Converting case

Problem	Solution	Examples
converting a character to upper-case	toUpper (from Data.Char)	toUpper 'a' --> 'A'
converting a character to lower-case	toLower (from Data.Char)	toLower 'A' --> 'a'
converting a string to upper-case	toUpper (from Data.Char), map	map toUpper "Foo Bar" --> "FOO BAR"
converting a string to lower-case	toLower (from Data.Char), map	map toLower "Foo Bar" --> "foo bar"
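
Combining the case functions with `words` and `unwords` gives a simple way to capitalize text. This is only a sketch; `capitalize` and `titleCase` are hypothetical names, and `toUpper` and `toLower` convert one character at a time.

```haskell
import Data.Char (toLower, toUpper)

-- Upper-case the first character and lower-case the rest.
capitalize :: String -> String
capitalize []     = []
capitalize (c:cs) = toUpper c : map toLower cs

-- Capitalize every word; whitespace is normalized by words/unwords.
titleCase :: String -> String
titleCase = unwords . map capitalize . words

main :: IO ()
main = do
  putStrLn (capitalize "fOO")          -- Foo
  putStrLn (titleCase "foo bAR baz")   -- Foo Bar Baz
```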

Interpolation

TODO

Performance

Text (from Data.Text) generally handles character strings with better performance than String; it should be the preferred data type for UTF-8 encoded strings.

If you observe that Text does not give sufficient performance, consider Data.ByteString, which is essentially a byte array. It can contain UTF-8 encoded characters, but handle it with care: its functions operate on bytes, not on characters.
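
As a rough illustration, many of the list recipes above have direct counterparts in `Data.Text`. The sketch below assumes the `text` and `bytestring` packages are available (both ship with GHC); `greeting` is a made-up name. Conversion between `Text` and `ByteString` goes through `Data.Text.Encoding`.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Text as T
import qualified Data.Text.IO as TIO
import Data.Text.Encoding (decodeUtf8, encodeUtf8)

-- pack converts a String to Text; unpack converts back.
greeting :: T.Text
greeting = T.pack "Foo Bar"

main :: IO ()
main = do
  TIO.putStrLn (T.toUpper greeting)        -- FOO BAR
  print (T.words greeting)                 -- ["Foo","Bar"]
  putStrLn (T.unpack (T.reverse greeting)) -- raB ooF
  -- Encoding to UTF-8 yields a ByteString; decodeUtf8 reverses it
  -- (and throws an exception if the bytes are not valid UTF-8).
  let bytes = encodeUtf8 greeting
  print (BS.length bytes)                  -- 7
  print (decodeUtf8 bytes == greeting)     -- True
```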

Unicode

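In current GHC (later than 6), a Char is a Unicode code point, so both String and Text can hold characters beyond the traditional ASCII range. This may change the behavior of some of the functions explained above when they are applied to such characters. UTF-8 comes into play once text is encoded to bytes, for example for I/O or in a ByteString. Remember that not every character in UTF-8 encoding is one byte!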
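
The difference between counting characters and counting bytes can be seen in a short sketch. It assumes the `text` and `bytestring` packages; `sample` is an illustrative name, and the non-ASCII character is written as a numeric escape so the example does not depend on the encoding of the source file.

```haskell
import qualified Data.ByteString as BS
import Data.Char (ord)
import qualified Data.Text as T
import Data.Text.Encoding (encodeUtf8)

-- The word "naive" with a diaeresis on the i; '\239' is U+00EF.
sample :: String
sample = "na\239ve"

main :: IO ()
main = do
  print (length sample)                            -- 5 characters
  print (map ord sample)                           -- [110,97,239,118,101]
  -- U+00EF takes two bytes in UTF-8, so the byte count is larger.
  print (BS.length (encodeUtf8 (T.pack sample)))   -- 6 bytes
```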

@@ Line 70: / Line 70: @@
 |-
 |  combining two lists
-|  [http://haskell.org/ghc/docs/latest/html/libraries/base/Prelude.html#v%3A%2B%2B (++)]
+|  [http://haskell.org/ghc/docs/latest/html/libraries/base-4.12.0.0/Prelude.html#v%3A%2B%2B (++)]
 |<haskell>
 "foo" ++ "bar"                  --> "foobar"
@@ Line 77: / Line 77: @@
 |-
 |  combining many lists
-|  [http://haskell.org/ghc/docs/latest/html/libraries/base/Prelude.html#v:concat concat]
+|  [http://haskell.org/ghc/docs/latest/html/libraries/base-4.12.0.0/Prelude.html#v:concat concat]
 | <haskell>
 concat ["foo", "bar", "baz"]    --> "foobarbaz"
@@ Line 92: / Line 92: @@
 |-
 |  accessing the first element
-|  [http://haskell.org/ghc/docs/latest/html/libraries/base/Prelude.html#v:head head]
+|  [http://haskell.org/ghc/docs/latest/html/libraries/base-4.12.0.0/Prelude.html#v:head head]
 |<haskell>
 head "foo bar baz"      --> 'f'
@@ Line 98: / Line 98: @@
 |-
 |  accessing the last element
-|  [http://haskell.org/ghc/docs/latest/html/libraries/base/Prelude.html#v%3Alast last]
+|  [http://haskell.org/ghc/docs/latest/html/libraries/base-4.12.0.0/Prelude.html#v%3Alast last]
 |<haskell>
 last "foo bar baz"      --> 'z'
@@ Line 104: / Line 104: @@
 |-
 |  accessing the element at a given index
-|  [http://haskell.org/ghc/docs/latest/html/libraries/base/Prelude.html#v%3A!! (!!)]
+|  [http://haskell.org/ghc/docs/latest/html/libraries/base-4.12.0.0/Prelude.html#v%3A!! (!!)]
 |<haskell>
 "foo bar baz" !! 4      --> 'b'
@@ Line 110: / Line 110: @@
 |-
 |  accessing the first <code>n</code> elements
-|  [http://haskell.org/ghc/docs/latest/html/libraries/base/Prelude.html#v:take take]
+|  [http://haskell.org/ghc/docs/latest/html/libraries/base-4.12.0.0/Prelude.html#v:take take]
 | <haskell>
 take 3 "foo bar baz"    --> "foo"
@@ Line 116: / Line 116: @@
 |-
 |  accessing the last <code>n</code> elements
-|  [http://haskell.org/ghc/docs/latest/html/libraries/base/Prelude.html#v:reverse reverse ], [http://haskell.org/ghc/docs/latest/html/libraries/base/Prelude.html#v:take take]
+|  [http://haskell.org/ghc/docs/latest/html/libraries/base-4.12.0.0/Prelude.html#v:reverse reverse ], [http://haskell.org/ghc/docs/latest/html/libraries/base-4.12.0.0/Prelude.html#v:take take]
 | <haskell>
 reverse . take 3 . reverse $ "foobar"    --> "bar"
@@ Line 122: / Line 122: @@
 |-
 |  accessing the <code>n</code> elements starting from index <code>m</code>
-|  [http://haskell.org/ghc/docs/latest/html/libraries/base/Prelude.html#v:drop drop], [http://haskell.org/ghc/docs/latest/html/libraries/base/Prelude.html#v:take take]
+|  [http://haskell.org/ghc/docs/latest/html/libraries/base-4.12.0.0/Prelude.html#v:drop drop], [http://haskell.org/ghc/docs/latest/html/libraries/base-4.12.0.0/Prelude.html#v:take take]
 | <haskell>
 take 4 $ drop 2 "foo bar baz"            --> "o ba"
@@ Line 138: / Line 138: @@
 |-
 |  splitting a string into a list of words
-|  [http://haskell.org/ghc/docs/latest/html/libraries/base/Prelude.html#v:words words]
+|  [http://haskell.org/ghc/docs/latest/html/libraries/base-4.12.0.0/Prelude.html#v:words words]
 | <haskell>words "foo bar\t baz\n"    --> ["foo","bar","baz"]
 </haskell>
 |-
 |  splitting a list into two parts
-|  [http://haskell.org/ghc/docs/latest/html/libraries/base/Prelude.html#v%3AsplitAt splitAt]
+|  [http://haskell.org/ghc/docs/latest/html/libraries/base-4.12.0.0/Prelude.html#v%3AsplitAt splitAt]
 | <haskell>splitAt 3 "foo bar baz"    --> ("foo"," bar baz")
 </haskell>
@@ Line 167: / Line 167: @@
 |-
 |  converting a character to a numeric value
-|  [http://haskell.org/ghc/docs/latest/html/libraries/base/Data-Char.html#v:ord ord]
+|  [http://haskell.org/ghc/docs/latest/html/libraries/base-4.12.0.0/Data-Char.html#v:ord ord]
 |<haskell>
-import Char
+import Data.Char
 ord 'A'    --> 65
 </haskell>
 |-
 |  converting a numeric value to a character
-|  [http://haskell.org/ghc/docs/latest/html/libraries/base/Data-Char.html#v%3Achr chr]
+|  [http://haskell.org/ghc/docs/latest/html/libraries/base-4.12.0.0/Data-Char.html#v%3Achr chr]
 | <haskell>
-import Char
+import Data.Char
 chr 99     --> 'c'
 </haskell>
@@ Line 190: / Line 190: @@
 |-
 |  reversing a string by characters
-|  [http://haskell.org/ghc/docs/latest/html/libraries/base/Prelude.html#v:reverse reverse]
+|  [http://haskell.org/ghc/docs/latest/html/libraries/base-4.12.0.0/Prelude.html#v:reverse reverse]
 |<haskell>
 reverse "foo bar baz"                        --> "zab rab oof"
@@ Line 196: / Line 196: @@
 |-
 |  reversing a string by words
-|  [http://haskell.org/ghc/docs/latest/html/libraries/base/Prelude.html#v%3Awords words], [http://haskell.org/ghc/docs/latest/html/libraries/base/Prelude.html#v:reverse reverse], [http://haskell.org/ghc/docs/latest/html/libraries/base/Prelude.html#v%3Aunwords unwords]
+|  [http://haskell.org/ghc/docs/latest/html/libraries/base-4.12.0.0/Prelude.html#v%3Awords words], [http://haskell.org/ghc/docs/latest/html/libraries/base-4.12.0.0/Prelude.html#v:reverse reverse], [http://haskell.org/ghc/docs/latest/html/libraries/base-4.12.0.0/Prelude.html#v%3Aunwords unwords]
 | <haskell>
 unwords $ reverse $ words "foo bar baz"      --> "baz bar foo"
@@ Line 202: / Line 202: @@
 |-
 |  reversing a string by characters by words
-|  [http://haskell.org/ghc/docs/latest/html/libraries/base/Prelude.html#v%3Awords words], [http://haskell.org/ghc/docs/latest/html/libraries/base/Prelude.html#v:reverse reverse], [http://haskell.org/ghc/docs/latest/html/libraries/base/Prelude.html#v:map map], [http://haskell.org/ghc/docs/latest/html/libraries/base/Prelude.html#v%3Aunwords unwords]
+|  [http://haskell.org/ghc/docs/latest/html/libraries/base-4.12.0.0/Prelude.html#v%3Awords words], [http://haskell.org/ghc/docs/latest/html/libraries/base-4.12.0.0/Prelude.html#v:reverse reverse], [http://haskell.org/ghc/docs/latest/html/libraries/base-4.12.0.0/Prelude.html#v:map map], [http://haskell.org/ghc/docs/latest/html/libraries/base-4.12.0.0/Prelude.html#v%3Aunwords unwords]
 | <haskell>
 unwords $ map reverse $ words "foo bar baz"  --> "oof rab zab"
@@ Line 217: / Line 217: @@
 |-
 |  converting a character to upper-case
-|  [http://haskell.org/ghc/docs/latest/html/libraries/base/Data-Char.html#v%3AtoUpper toUpper]
+|  [http://haskell.org/ghc/docs/latest/html/libraries/base-4.12.0.0/Data-Char.html#v%3AtoUpper toUpper]
 |<haskell>
-import Char
+import Data.Char
-toUpper 'a'            --> "A"
+toUpper 'a'            --> 'A'
 </haskell>
 |-
 |  converting a character to lower-case
-|  [http://haskell.org/ghc/docs/latest/html/libraries/base/Data-Char.html#v%3AtoLower toLower]
+|  [http://haskell.org/ghc/docs/latest/html/libraries/base-4.12.0.0/Data-Char.html#v%3AtoLower toLower]
 | <haskell>
-import Char
+import Data.Char
-toLower 'A'            --> "a"
+toLower 'A'            --> 'a'
 </haskell>
 |-
 |  converting a string to upper-case
-|  [http://haskell.org/ghc/docs/latest/html/libraries/base/Data-Char.html#v%3AtoUpper toUpper], [http://haskell.org/ghc/docs/latest/html/libraries/base/Prelude.html#v:map map]
+|  [http://haskell.org/ghc/docs/latest/html/libraries/base-4.12.0.0/Data-Char.html#v%3AtoUpper toUpper], [http://haskell.org/ghc/docs/latest/html/libraries/base-4.12.0.0/Prelude.html#v:map map]
 |<haskell>
-import Char
+import Data.Char
 map toUpper "Foo Bar"  --> "FOO BAR"
 </haskell>
 |-
 |  converting a string to lower-case
-|  [http://haskell.org/ghc/docs/latest/html/libraries/base/Data-Char.html#v%3AtoLower toLower], [http://haskell.org/ghc/docs/latest/html/libraries/base/Prelude.html#v:map map]
+|  [http://haskell.org/ghc/docs/latest/html/libraries/base-4.12.0.0/Data-Char.html#v%3AtoLower toLower], [http://haskell.org/ghc/docs/latest/html/libraries/base-4.12.0.0/Prelude.html#v:map map]
 | <haskell>
-import Char
+import Data.Char
 map toLower "Foo Bar"  --> "foo bar"
 </haskell>
@@ Line 251: / Line 251: @@
 === Performance ===
+Text handles character strings with better performance than Strings; it should be the preferred data type for UTF-8 encoded strings.
-For high performance requirements (where you would typically consider
-C), consider using [http://hackage.haskell.org/packages/archive/bytestring/latest/doc/html/Data-ByteString.html Data.ByteString].
+If you observe that Text does not give sufficient performance, consider [http://hackage.haskell.org/packages/archive/bytestring/latest/doc/html/Data-ByteString.html Data.ByteString], which is essentially a byte array. It can contain UTF-8 characters, but handle with care!
 === Unicode ===
+Current GHC (later than 6) encodes Strings and Text in UTF-8. This may change the behavior of some of the functions explained above when applied to characters beyond the traditional ASCII characters. Remember that not every character in UTF-8 encoding is one byte!
-TODO