List function suggestions: Difference between revisions
Revision as of 03:50, 19 August 2006
Let's fix this
We need these useful functions in Data.List; I'll call them 'replace' and 'splitBy'. They are easy to implement, but everyone keeps reinventing them. The goal is clarity and uniformity (everyone uses them widely and recognizes them) and portability (I don't have to keep reimplementing them or copying that one file, UsefulMissingFunctions.hs).
Use this page to record consensus as reached on the Talk Page. (Use four tildes to sign your post automatically with your name/timestamp.) Diverging opinions welcome! Note: a lot of good points (diverging opinions!) are covered in the mailing lists, but if we include all of these cases, split* will have 9 variants! I'm trying to organize all this into something meaningful.
Summary
Hacking up your own custom split (or a tokens/splitOnGlue) must be one of the most common questions from beginners on the irc channel.
Anyone remember what the result of the "let's get split into the base library" movement's work was?
ISTR there wasn't a consensus, so nothing happened. Which is silly, really - I agree we should definitely have a Data.List.split.
A thread July 2006
http://www.haskell.org/pipermail/haskell-cafe/2006-July/thread.html#16559
A thread July 2004
http://www.haskell.org/pipermail/libraries/2004-July/thread.html#2342
Goal
The goal is to reach some kind of reasonable consensus, specifically on naming and semantics, even if we need pairs of functions to satisfy various usage and algebraic needs. Failing to accommodate every possible use of these functions should not be a sufficient reason to abandon the whole project.
Note: I (Jared Updike) am working with the belief that efficiency should not be a valid argument to bar these otherwise universally useful functions from the libraries; regexes are overkill for 'splitBy' and 'replace' for common simple situations. Let's assume people will know (or learn) when they need heavier machinery (regexes, FPS/ByteString) and will use it when efficiency is important. We can try to facilitate this by reusing any names from FastPackedString and/or ByteString, etc.
The Data.List functions
splitOn (working name)
We need at least a few of these:
splitOn
splitOn :: Eq a => [a] -> [a] -> [[a]]
One that preserves:
join sep (splitOn sep x) === x
See below for 'join'
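To make the property concrete, here is a minimal sketch of one possible splitOn, not a settled implementation. It assumes the separator is non-empty (raising an error otherwise is my own choice, not something agreed on), and it repeats the 'join' definition from further down this page so the block stands alone.

```haskell
import Data.List (intersperse, isPrefixOf)

-- | Sketch: split a list on every occurrence of a separator sublist,
-- keeping empty chunks so that joining with the separator restores the input.
splitOn :: Eq a => [a] -> [a] -> [[a]]
splitOn []  _  = error "splitOn: empty separator"  -- assumption: reject empty separators
splitOn sep xs = go xs
  where
    n = length sep
    -- Emit one chunk, then continue after the separator if one was found.
    go ys = case breakOn ys of
      (chunk, Nothing)   -> [chunk]
      (chunk, Just rest) -> chunk : go rest
    -- Scan up to the first occurrence of the separator.
    breakOn [] = ([], Nothing)
    breakOn ys@(y:ys')
      | sep `isPrefixOf` ys = ([], Just (drop n ys))
      | otherwise           = let (c, r) = breakOn ys' in (y : c, r)

-- | The 'join' from this page, repeated here for self-containment.
join :: [a] -> [[a]] -> [a]
join sep = concat . intersperse sep
```

With this sketch, splitOn "," "a,b,,c" gives ["a","b","","c"], and an empty input gives [[]], which is what lets join sep (splitOn sep x) return x in every case.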
splitOn'
splitOn' :: Eq a => [a] -> [a] -> [[a]]
One that uses the above splitOn and then filters out the empty elements. It does not preserve the above property. Easy enough:
splitOn' sep x = filter (/=[]) (splitOn sep x)
splitBy
splitBy :: (a -> Bool) -> [a] -> [[a]]
One that takes a function that determines if the element is part of a contiguous group of separator characters:
(use of 'By' mirroring groupBy, sortBy, etc.)
Usage would be:
splitws = splitBy (`elem` " \f\v\t\n\r\b")
splitws "Hello there\n \n Haskellers! " ===>
["Hello", "there", "Haskellers!"]
The 'join' property is not preserved.
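Until code from the threads is collected here, this is a minimal sketch of a splitBy with the behaviour described above. It is modelled on how the Prelude's words treats whitespace: runs of separator elements are collapsed and no empty chunks are produced. Treat it as one possible reading of the semantics, not the agreed definition.

```haskell
-- | Sketch: split on runs of elements satisfying the predicate,
-- dropping the separators and never producing empty chunks.
splitBy :: (a -> Bool) -> [a] -> [[a]]
splitBy isSep xs = case dropWhile isSep xs of
  []   -> []
  rest -> let (chunk, rest') = break isSep rest
          in chunk : splitBy isSep rest'

-- | The whitespace splitter from the usage example above.
splitws :: String -> [String]
splitws = splitBy (`elem` " \f\v\t\n\r\b")
```

Because leading, trailing and repeated separators are all discarded, there is no way to rebuild the original input from the result, which is why the 'join' property cannot hold here.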
TODO: give code, copy-paste from threads mentioned above
TODO: list names and reasons for/against
replace (working name)
replace :: Eq a => [a] -> [a] -> [a] -> [a]
Like Python's replace (the Eq constraint is needed to compare elements against the pattern):
replace "the" "a" "the quick brown fox jumped over the lazy black dog"
===>
"a quick brown fox jumped over a lazy black dog"
TODO: give code, copy-paste from threads mentioned above
TODO: list names and reasons for/against
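As a starting point for the code TODO, here is a minimal sketch of replace. It scans left to right and replaces non-overlapping occurrences. Note the Eq constraint, which is needed to compare elements. How to treat an empty pattern is an open question; leaving the input unchanged is my assumption here and differs from what Python does.

```haskell
import Data.List (isPrefixOf)

-- | Sketch: replace every non-overlapping occurrence of 'old' with 'new',
-- scanning from the left.
replace :: Eq a => [a] -> [a] -> [a] -> [a]
replace []  _   xs = xs  -- assumption: an empty pattern leaves the input unchanged
replace old new xs = go xs
  where
    n = length old
    go [] = []
    go ys@(y:rest)
      | old `isPrefixOf` ys = new ++ go (drop n ys)  -- skip past the match
      | otherwise           = y : go rest
```

On the example above this yields "a quick brown fox jumped over a lazy black dog". Since a match is skipped in full before scanning resumes, replace "aa" "b" "aaa" gives "ba".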
join (working name)
join :: [a] -> [[a]] -> [a]
join sep = concat . intersperse sep
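For the naming discussion, it may help to note that later versions of Data.List export intercalate, which does the same job as this join. The sketch below puts the two side by side. The helper agreesWithIntercalate is a hypothetical name used only for illustration.

```haskell
import Data.List (intercalate, intersperse)

-- | The definition from this page.
join :: [a] -> [[a]] -> [a]
join sep = concat . intersperse sep

-- | Hypothetical helper: checks that 'join' and Data.List.intercalate
-- agree on a given separator and list of chunks.
agreesWithIntercalate :: Eq a => [a] -> [[a]] -> Bool
agreesWithIntercalate sep xss = join sep xss == intercalate sep xss
```

For example, join ", " ["a", "b", "c"] gives "a, b, c", and joining an empty list of chunks gives the empty list.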
TODO: copy-paste things from threads mentioned above
TODO: list names and reasons for/against
other favorites
Such as endsWith, beginsWith, etc.
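These two are thin wrappers over functions Data.List already has (isPrefixOf and isSuffixOf). A minimal sketch follows. The argument order, with the list being tested first so that the backtick form reads naturally, is my assumption and not something decided here.

```haskell
import Data.List (isPrefixOf, isSuffixOf)

-- | Sketch: does the list start with the given prefix?
-- Assumed argument order: list first, prefix second.
beginsWith :: Eq a => [a] -> [a] -> Bool
beginsWith xs prefix = prefix `isPrefixOf` xs

-- | Sketch: does the list end with the given suffix?
endsWith :: Eq a => [a] -> [a] -> Bool
endsWith xs suffix = suffix `isSuffixOf` xs
```

Usage would be "Main.hs" `endsWith` ".hs" or "Data.List" `beginsWith` "Data.".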
TODO: copy-paste from threads mentioned above, or from your own code