Foldable and Traversable: Difference between revisions
RossPaterson (talk | contribs) (add hierarchy diagram) |
(Adding the insight that filter requires data structures to have an empty representation.) |
||
(14 intermediate revisions by 8 users not shown) | |||
Line 8: | Line 8: | ||
operations. | operations. | ||
When you've been using it for a little while, there seem to be some | When you've been using it for a little while, there seem to be some baffling omissions from the API. The first couple you are likely to notice are the absence of "<hask>map</hask>" and "<hask>toList</hask>". | ||
baffling omissions from the API. The first couple you are likely to | The answer to these lies in the long list of instances which Sequence has: | ||
notice are the absence of "<hask>map</hask>" and "<hask>toList</hask>". | * The Sequence version of map is "<hask>fmap</hask>", which comes from the Functor class. | ||
* The Sequence version of <hask>toList</hask> is in the <hask>Foldable</hask> [[class]]. | |||
The answer to these lies in the long list of instances which Sequence | |||
Functor class. The Sequence version of <hask>toList</hask> is in the <hask>Foldable</hask> [[class]]. | |||
When working with <hask>Sequence</hask> you also want to refer to the documentation | When working with <hask>Sequence</hask> you also want to refer to the documentation | ||
for at least <hask>Foldable</hask> and <hask>Traversable</hask>. <hask>Functor</hask> only has the single | for at least <hask>Foldable</hask> and <hask>Traversable</hask>. <hask>Functor</hask> only has the single [[method]], so we've already covered that. | ||
[[method]], so we've already covered that. | |||
==What do these classes all mean? A brief tour:== | ==What do these classes all mean? A brief tour:== | ||
Line 26: | Line 22: | ||
===<hask>Functor</hask>=== | ===<hask>Functor</hask>=== | ||
A [[functor]] is simply a [[container]]. Given a container, and a [[function]] | A [[functor]] is simply a [[container]]. Given a container, and a [[function]] which works on the elements, we can apply that function to each element. For lists, the familiar "<hask>map</hask>" does exactly this. | ||
which works on the elements, we can apply that function to each | |||
element. For lists, the familiar "<hask>map</hask>" does exactly this. | |||
Note that the function can produce elements of a different [[type]], so we | Note that the function can produce elements of a different [[type]], so we | ||
Line 44: | Line 38: | ||
===Foldable=== | ===Foldable=== | ||
A <hask>Foldable</hask> [[type]] is also a [[container]] | A <hask>Foldable</hask> [[type]] is also a [[container]]. | ||
The [[class]] does not require <hask>Functor</hask> superclass | |||
can be 'folded' to a summary value. In other words, it is a type which | in order to allow containers like <hask>Set</hask> or <hask>StorableVector</hask> | ||
supports "<hask>foldr</hask>". | that have additional constraints on the element type. | ||
But many interesting <hask>Foldable</hask>s are also <hask>Functor</hask>s. | |||
A foldable container is a container with the added property | |||
that its items can be 'folded' to a summary value. | |||
In other words, it is a type which supports "<hask>foldr</hask>". | |||
Once you support <hask>foldr</hask>, of course, | Once you support <hask>foldr</hask>, of course, it can be turned into a list, by using <hask>toList = foldr (:) []</hask>. This means that all <hask>Foldable</hask>s have a representation as a list, but the order of the items may or may not have any particular significance. However, if a <hask>Foldable</hask> is also a <hask>Functor</hask>, [[parametricity]] and the [[Functor law]] guarantee that <hask>toList</hask> and <hask>fmap</hask> commute. Further, in the case of <hask>Data.Sequence</hask>, there '''is''' a well defined order and it is exposed as expected by <hask>toList</hask>. | ||
using <hask>foldr (:) []</hask>. This means that all <hask>Foldable</hask>s have a | |||
representation as a list | |||
not have any particular significance. | |||
also a <hask>Functor</hask>, <hask>toList</hask> and <hask>fmap</hask> | |||
there '''is''' a well defined order and it is | |||
A particular kind of fold well-used by Haskell programmers is | A particular kind of fold well-used by Haskell programmers is <hask>mapM_</hask>, which is a kind of fold over <hask>(>>)</hask>, and <hask>Foldable</hask> provides this along with the related <hask>sequence_</hask>. | ||
<hask>mapM_</hask>, which is a kind of fold over | |||
<hask>(>>)</hask>, and <hask>Foldable</hask> provides this along with the | |||
related <hask>sequence_</hask>. | |||
===Traversable=== | ===Traversable=== | ||
A <hask>Traversable</hask> [[type]] is a kind of upgraded <hask>Foldable</hask>. Where <hask>Foldable</hask> | A <hask>Traversable</hask> [[type]] is a kind of upgraded <hask>Foldable</hask>. Where <hask>Foldable</hask> gives you the ability to go through the structure processing the elements (<hask>foldr</hask>) but throwing away the shape, <hask>Traversable</hask> allows you to do that whilst preserving the shape and, e.g., putting new values in. | ||
gives you the ability to go through the structure processing the | |||
elements (<hask>foldr</hask>) but throwing away the shape, <hask>Traversable</hask> allows you | |||
to do that whilst preserving the shape and, e.g., putting new values | |||
in. | |||
<hask>Traversable</hask> is what we need for <hask>mapM</hask> and | <hask>Traversable</hask> is what we need for <hask>mapM</hask> and <hask>sequence</hask> : note the apparently surprising fact that the "_" versions are in a different [[typeclass]]. | ||
<hask>sequence</hask> : note the apparently surprising fact that the | |||
"_" versions are in a different [[typeclass]]. | |||
== Some trickier functions: concatMap and filter == | == Some trickier functions: concatMap and filter == | ||
Neither <hask>Traversable</hask> nor <hask>Foldable</hask> contain elements for <hask>concatMap</hask> and <hask>filter</hask>. That is because <hask>Foldable</hask> is about tearing down the structure | Neither <hask>Traversable</hask> nor <hask>Foldable</hask> contain elements for <hask>concatMap</hask> and <hask>filter</hask>. That is because <hask>Foldable</hask> is about tearing down the structure completely, while <hask>Traversable</hask> is about preserving the structure exactly as-is. On the other hand <hask>concatMap</hask> tries to 'squeeze more elements in' at a place and <hask>filter</hask> tries to cut them out. | ||
completely, while <hask>Traversable</hask> is about preserving the structure | |||
exactly as-is. On the other hand <hask>concatMap</hask> tries to | |||
'squeeze more elements in' at a place and <hask>filter</hask> tries to | |||
cut them out. | |||
You can write <hask>concatMap</hask> for <hask>Sequence</hask> as follows: | You can write <hask>concatMap</hask> for <hask>Sequence</hask> as follows: | ||
Line 91: | Line 68: | ||
</haskell> | </haskell> | ||
But why does it work? It works because sequence is an instance of<hask>Monoid</hask>, where the [[monoid]]al operation is "appending". The same | But why does it work? It works because sequence is an instance of <hask>Monoid</hask>, where the [[monoid]]al operation is "appending". The same definition works for lists, and we can write it more generally as: | ||
definition works for lists, and we can write it more generally as: | |||
<haskell> | <haskell> | ||
Line 99: | Line 75: | ||
</haskell> | </haskell> | ||
And that works with lists and sequences both. Does it work with any | And that works with lists and sequences both. Does it work with any Monoid which is Foldable? Only if the Monoid 'means the right thing'. If you have <hask>toList (f `mappend` g) = toList f ++ toList g</hask> then it definitely makes sense. In fact this easy to write condition is stronger than needed; it would be good enough if they were permutations of each other. | ||
Monoid which is Foldable? Only if the Monoid 'means the right | |||
thing'. If you have <hask>toList (f `mappend` g) = toList f ++ toList g</hask> then it definitely makes sense. In fact this easy to write | <hask>filter</hask> turns out to be slightly harder still. Note the type signature of filter from Data.List: <code>filter :: (a -> Bool) -> [a] -> [a]</code>. Every element in the list is evaluated by a predicate function <code>(a -> Bool)</code>. If that evaluation returns <hask>False</hask>, the element is removed. If every evaluation returns False, all elements will be removed; therefore there must be an empty representation of the data structure. In the case of list this would be <hask>[]</hask>. A general representation of this might be <hask>mempty</hask> found in the <hask>Monoid</hask> typeclass. | ||
condition is stronger than needed; it would be good enough if they | |||
were permutations of each other. | |||
Additionally, for building structure around values you need something like 'singleton' (from <hask>Sequence</hask>), or <hask>\a -> [a]</hask> for lists. We can use <hask>pure</hask> from <hask>Applicative</hask>, although it's not really right to bring <hask>Applicative</hask> in for this, and get: | |||
something like 'singleton' (from <hask>Sequence</hask>), or <hask>\a -> [a]</hask> | |||
for lists. We can use <hask>pure</hask> from <hask>Applicative</hask>, although | |||
it's not really right to bring <hask>Applicative</hask> in for this, and get: | |||
<haskell> | <haskell> | ||
Line 116: | Line 87: | ||
</haskell> | </haskell> | ||
It's interesting to note that, under these conditions, we have a candidate | It's interesting to note that, under these conditions, we have a candidate to help us turn the <hask>Foldable</hask> into a <hask>Monad</hask>, since <hask>concatMap</hask> is a good definition for <hask>>>=</hask>, and we can use <hask>pure</hask> for <hask>return</hask>. | ||
to help us turn the <hask>Foldable</hask> into a <hask>Monad</hask>, since <hask>concatMap</hask> is a good | |||
definition for <hask>>>=</hask>, and we can use <hask>pure</hask> for <hask>return</hask>. | |||
== Generalising zipWith == | == Generalising zipWith == | ||
Another really useful list [[combinator]] that doesn't appear in the | Another really useful list [[combinator]] that doesn't appear in the interfaces for <hask>Sequence</hask>, <hask>Foldable</hask> or <hask>Traversable</hask> is <hask>zipWith</hask>. The most general kind of <hask>zipWith</hask> over <hask>Traversable</hask>s will keep the exact shape of the <hask>Traversable</hask> on the left, whilst zipping against the values on the right. It turns out you can get away with a <hask>Foldable</hask> on the right, but you need to use a <hask>Monad</hask> (or an <hask>Applicative</hask>, actually) to thread the values through: | ||
interfaces for <hask>Sequence</hask>, <hask>Foldable</hask> or <hask>Traversable</hask> is <hask>zipWith</hask>. The most general kind of <hask>zipWith</hask> over <hask>Traversable</hask>s will keep the exact shape of | |||
the <hask>Traversable</hask> on the left, whilst zipping against the values on the right. It turns out you can get away with a <hask>Foldable</hask> on the right, but you need to use a <hask>Monad</hask> (or an <hask>Applicative</hask>, actually) to thread the | |||
values through: | |||
<haskell> | <haskell> | ||
Line 241: | Line 207: | ||
where map_one (x:xs) y = (xs, g y x) | where map_one (x:xs) y = (xs, g y x) | ||
</haskell> | </haskell> | ||
Replace <hask>mapAccumL</hask> with <hask>mapAccumR</hask> and the elements of the Foldable are zipped in reverse order. | Replace <hask>mapAccumL</hask> with <hask>mapAccumR</hask> and the elements of the Foldable are zipped in reverse order. Similarly, we can define a generalization of <hask>reverse</hask> on Traversables, which preserves the shape but reverses the left-to-right position of the elements: | ||
Similarly, we can define a generalization of <hask>reverse</hask> on Traversables, which preserves the shape but reverses the left-to-right position of the elements: | |||
<haskell> | <haskell> | ||
reverseT :: (Traversable t) => t a -> t a | reverseT :: (Traversable t) => t a -> t a | ||
reverseT t = snd (mapAccumR (\ (x:xs) _ -> (xs, x)) (toList t) t) | reverseT t = snd (mapAccumR (\ (x:xs) _ -> (xs, x)) (toList t) t) | ||
</haskell> | </haskell> |
Latest revision as of 19:00, 18 May 2020
Data.Sequence is recommended as an efficient alternative to [list]s, with a more symmetric feel and better complexity on various operations.
When you've been using it for a little while, there seem to be some baffling omissions from the API. The first couple you are likely to notice are the absence of "map
" and "toList
".
The answer to these lies in the long list of instances which Sequence has:
- The Sequence version of map is "
fmap
", which comes from the Functor class. - The Sequence version of
toList
is in theFoldable
class.
When working with Sequence
you also want to refer to the documentation
for at least Foldable
and Traversable
. Functor
only has the single method, so we've already covered that.
What do these classes all mean? A brief tour:
Functor
Functor
A functor is simply a container. Given a container, and a function which works on the elements, we can apply that function to each element. For lists, the familiar "map
" does exactly this.
Note that the function can produce elements of a different type, so we may have a different type at the end.
Examples:
Prelude Data.Sequence> map (\n -> replicate n 'a') [1,3,5]
["a","aaa","aaaaa"]
Prelude Data.Sequence> fmap (\n -> replicate n 'a') (1 <| 3 <| 5 <| empty)
fromList ["a","aaa","aaaaa"]
Foldable
A Foldable
type is also a container.
The class does not require Functor
superclass
in order to allow containers like Set
or StorableVector
that have additional constraints on the element type.
But many interesting Foldable
s are also Functor
s.
A foldable container is a container with the added property
that its items can be 'folded' to a summary value.
In other words, it is a type which supports "foldr
".
Once you support foldr
, of course, it can be turned into a list, by using toList = foldr (:) []
. This means that all Foldable
s have a representation as a list, but the order of the items may or may not have any particular significance. However, if a Foldable
is also a Functor
, parametricity and the Functor law guarantee that toList
and fmap
commute. Further, in the case of Data.Sequence
, there is a well defined order and it is exposed as expected by toList
.
A particular kind of fold well-used by Haskell programmers is mapM_
, which is a kind of fold over (>>)
, and Foldable
provides this along with the related sequence_
.
Traversable
A Traversable
type is a kind of upgraded Foldable
. Where Foldable
gives you the ability to go through the structure processing the elements (foldr
) but throwing away the shape, Traversable
allows you to do that whilst preserving the shape and, e.g., putting new values in.
Traversable
is what we need for mapM
and sequence
: note the apparently surprising fact that the "_" versions are in a different typeclass.
Some trickier functions: concatMap and filter
Neither Traversable
nor Foldable
contain elements for concatMap
and filter
. That is because Foldable
is about tearing down the structure completely, while Traversable
is about preserving the structure exactly as-is. On the other hand concatMap
tries to 'squeeze more elements in' at a place and filter
tries to cut them out.
You can write concatMap
for Sequence
as follows:
concatMap :: (a -> Seq b) -> Seq a -> Seq b
concatMap = foldMap
But why does it work? It works because sequence is an instance of Monoid
, where the monoidal operation is "appending". The same definition works for lists, and we can write it more generally as:
concatMap :: (Foldable f, Monoid (f b)) => (a -> f b) -> f a -> f b
concatMap = foldMap
And that works with lists and sequences both. Does it work with any Monoid which is Foldable? Only if the Monoid 'means the right thing'. If you have toList (f `mappend` g) = toList f ++ toList g
then it definitely makes sense. In fact this easy to write condition is stronger than needed; it would be good enough if they were permutations of each other.
filter
turns out to be slightly harder still. Note the type signature of filter from Data.List: filter :: (a -> Bool) -> [a] -> [a]
. Every element in the list is evaluated by a predicate function (a -> Bool)
. If that evaluation returns False
, the element is removed. If every evaluation returns False, all elements will be removed; therefore there must be an empty representation of the data structure. In the case of list this would be []
. A general representation of this might be mempty
found in the Monoid
typeclass.
Additionally, for building structure around values you need something like 'singleton' (from Sequence
), or \a -> [a]
for lists. We can use pure
from Applicative
, although it's not really right to bring Applicative
in for this, and get:
filter :: (Applicative f, Foldable f, Monoid (f a)) =>
(a -> Bool) -> f a -> f a
filter p = foldMap (\a -> if p a then pure a else mempty)
It's interesting to note that, under these conditions, we have a candidate to help us turn the Foldable
into a Monad
, since concatMap
is a good definition for >>=
, and we can use pure
for return
.
Generalising zipWith
Another really useful list combinator that doesn't appear in the interfaces for Sequence
, Foldable
or Traversable
is zipWith
. The most general kind of zipWith
over Traversable
s will keep the exact shape of the Traversable
on the left, whilst zipping against the values on the right. It turns out you can get away with a Foldable
on the right, but you need to use a Monad
(or an Applicative
, actually) to thread the values through:
import Prelude hiding (sequence)
import Data.Sequence
import Data.Foldable
import Data.Traversable
import Control.Applicative
data Supply s v = Supply { unSupply :: [s] -> ([s],v) }
instance Functor (Supply s) where
fmap f av = Supply (\l -> let (l',v) = unSupply av l in (l',f v))
instance Applicative (Supply s) where
pure v = Supply (\l -> (l,v))
af <*> av = Supply (\l -> let (l',f) = unSupply af l
(l'',v) = unSupply av l'
in (l'',f v))
runSupply :: (Supply s v) -> [s] -> v
runSupply av l = snd $ unSupply av l
supply :: Supply s s
supply = Supply (\(x:xs) -> (xs,x))
zipTF :: (Traversable t, Foldable f) => t a -> f b -> t (a,b)
zipTF t f = runSupply (traverse (\a -> (,) a <$> supply) t) (toList f)
zipWithTF :: (Traversable t,Foldable f) => (a -> b -> c) -> t a -> f b -> t c
zipWithTF g t f = runSupply (traverse (\a -> g a <$> supply) t) (toList f)
zipWithTFM :: (Traversable t,Foldable f,Monad m) =>
(a -> b -> m c) -> t a -> f b -> m (t c)
zipWithTFM g t f = sequence (zipWithTF g t f)
zipWithTFA :: (Traversable t,Foldable f,Applicative m) =>
(a -> b -> m c) -> t a -> f b -> m (t c)
zipWithTFA g t f = sequenceA (zipWithTF g t f)
The code above fails with a pattern match error when the Foldable
container doesn't have enough input. Here is an alternative version which provides friendlier error reports and makes use of State
instead of the self defined Supply monad.
module GenericZip
(zipWithTF,
zipTF,
zipWithTFA,
zipWithTFM) where
import Data.Foldable
import Data.Traversable
import qualified Data.Traversable as T
import Control.Applicative
import Control.Monad.State
-- | The state contains the list of values obtained form the foldable container
-- and a String indicating the name of the function currectly being executed
data ZipState a = ZipState {fName :: String,
list :: [a]}
-- | State monad containing ZipState
type ZipM l a = State (ZipState l) a
-- | pops the first element of the list inside the state
pop :: ZipM l l
pop = do
st <- get
let xs = list st
n = fName st
case xs of
(a:as) -> do put st{list=as}
return a
[] -> error $ n ++ ": insufficient input"
-- | pop a value form the state and supply it to the second
-- argument of a binary function
supplySecond :: (a -> b -> c) -> a -> ZipM b c
supplySecond f a = do b <- pop
return $ f a b
zipWithTFError :: (Traversable t,Foldable f) =>
String -> (a -> b -> c) -> t a -> f b -> t c
zipWithTFError str g t f = evalState (T.mapM (supplySecond g) t)
(ZipState str (toList f))
zipWithTF :: (Traversable t,Foldable f) => (a -> b -> c) -> t a -> f b -> t c
zipWithTF = zipWithTFError "GenericZip.zipWithTF"
zipTF :: (Traversable t, Foldable f) => t a -> f b -> t (a,b)
zipTF = zipWithTFError "GenericZip.zipTF" (,)
zipWithTFM :: (Traversable t,Foldable f,Monad m) =>
(a -> b -> m c) -> t a -> f b -> m (t c)
zipWithTFM g t f = T.sequence (zipWithTFError "GenericZip.zipWithTFM" g t f)
zipWithTFA :: (Traversable t,Foldable f,Applicative m) =>
(a -> b -> m c) -> t a -> f b -> m (t c)
zipWithTFA g t f = sequenceA (zipWithTFError "GenericZip.zipWithTFA" g t f)
Recent versions of Data.Traversable
include generalizations of mapAccumL
and mapAccumR
from lists to Traversables (encapsulating the state monad used above):
mapAccumL :: Traversable t => (a -> b -> (a, c)) -> a -> t b -> (a, t c)
mapAccumR :: Traversable t => (a -> b -> (a, c)) -> a -> t b -> (a, t c)
Using these, the first version above can be written as
zipWithTF :: (Traversable t, Foldable f) => (a -> b -> c) -> t a -> f b -> t c
zipWithTF g t f = snd (mapAccumL map_one (toList f) t)
where map_one (x:xs) y = (xs, g y x)
Replace mapAccumL
with mapAccumR
and the elements of the Foldable are zipped in reverse order. Similarly, we can define a generalization of reverse
on Traversables, which preserves the shape but reverses the left-to-right position of the elements:
reverseT :: (Traversable t) => t a -> t a
reverseT t = snd (mapAccumR (\ (x:xs) _ -> (xs, x)) (toList t) t)