# Difference between revisions of "Foldable and Traversable"

RossPaterson (talk | contribs) (shorter zip using mapAccumL) |
(Adding the insight that filter requires data structures to have an empty representation.) |
||

(16 intermediate revisions by 9 users not shown) | |||

Line 4: | Line 4: | ||

<center>'' or "Where is Data.Sequence.toList?"''</center> | <center>'' or "Where is Data.Sequence.toList?"''</center> | ||

− | Data.Sequence is recommended as an efficient alternative to [list]s, | + | [http://haskell.org/ghc/docs/latest/html/libraries/containers/Data-Sequence.html Data.Sequence] is recommended as an efficient alternative to [list]s, |

with a more symmetric feel and better complexity on various | with a more symmetric feel and better complexity on various | ||

operations. | operations. | ||

− | When you've been using it for a little while, there seem to be some | + | When you've been using it for a little while, there seem to be some baffling omissions from the API. The first couple you are likely to notice are the absence of "<hask>map</hask>" and "<hask>toList</hask>". |

− | baffling omissions from the API. The first couple you are likely to | + | The answer to these lies in the long list of instances which Sequence has: |

− | notice are the absence of "<hask>map</hask>" and "<hask>toList</hask>". | + | * The Sequence version of map is "<hask>fmap</hask>", which comes from the Functor class. |

− | + | * The Sequence version of <hask>toList</hask> is in the <hask>Foldable</hask> [[class]]. | |

− | The answer to these lies in the long list of instances which Sequence | ||

− | |||

− | Functor class. The Sequence version of <hask>toList</hask> is in the <hask>Foldable</hask> [[class]]. | ||

When working with <hask>Sequence</hask> you also want to refer to the documentation | When working with <hask>Sequence</hask> you also want to refer to the documentation | ||

− | for at least <hask>Foldable</hask> and <hask>Traversable</hask>. <hask>Functor</hask> only has the single | + | for at least <hask>Foldable</hask> and <hask>Traversable</hask>. <hask>Functor</hask> only has the single [[method]], so we've already covered that. |

− | [[method]], so we've already covered that. | ||

==What do these classes all mean? A brief tour:== | ==What do these classes all mean? A brief tour:== | ||

+ | |||

+ | [[Image:FunctorHierarchy.svg]] | ||

===<hask>Functor</hask>=== | ===<hask>Functor</hask>=== | ||

− | A [[functor]] is simply a [[container]]. Given a container, and a [[function]] | + | A [[functor]] is simply a [[container]]. Given a container, and a [[function]] which works on the elements, we can apply that function to each element. For lists, the familiar "<hask>map</hask>" does exactly this. |

− | which works on the elements, we can apply that function to each | ||

− | element. For lists, the familiar "<hask>map</hask>" does exactly this. | ||

Note that the function can produce elements of a different [[type]], so we | Note that the function can produce elements of a different [[type]], so we | ||

Line 42: | Line 38: | ||

===Foldable=== | ===Foldable=== | ||

− | A <hask>Foldable</hask> [[type]] is also a [[container]] | + | A <hask>Foldable</hask> [[type]] is also a [[container]]. |

− | + | The [[class]] does not require <hask>Functor</hask> superclass | |

− | can be 'folded' to a summary value. In other words, it is a type which | + | in order to allow containers like <hask>Set</hask> or <hask>StorableVector</hask> |

− | supports "<hask>foldr</hask>". | + | that have additional constraints on the element type. |

+ | But many interesting <hask>Foldable</hask>s are also <hask>Functor</hask>s. | ||

+ | A foldable container is a container with the added property | ||

+ | that its items can be 'folded' to a summary value. | ||

+ | In other words, it is a type which supports "<hask>foldr</hask>". | ||

− | Once you support <hask>foldr</hask>, of course, | + | Once you support <hask>foldr</hask>, of course, it can be turned into a list, by using <hask>toList = foldr (:) []</hask>. This means that all <hask>Foldable</hask>s have a representation as a list, but the order of the items may or may not have any particular significance. However, if a <hask>Foldable</hask> is also a <hask>Functor</hask>, [[parametricity]] and the [[Functor law]] guarantee that <hask>toList</hask> and <hask>fmap</hask> commute. Further, in the case of <hask>Data.Sequence</hask>, there '''is''' a well defined order and it is exposed as expected by <hask>toList</hask>. |

− | using <hask>foldr (:) []</hask>. This means that all <hask>Foldable</hask>s have a | ||

− | representation as a list | ||

− | not have any particular significance. | ||

− | also a <hask>Functor</hask>, <hask>toList</hask> and <hask>fmap</hask> | ||

− | |||

− | |||

− | there '''is''' a well defined order and it is | ||

− | |||

− | A particular kind of fold well-used by Haskell programmers is | + | A particular kind of fold well-used by Haskell programmers is <hask>mapM_</hask>, which is a kind of fold over <hask>(>>)</hask>, and <hask>Foldable</hask> provides this along with the related <hask>sequence_</hask>. |

− | <hask>mapM_</hask>, which is a kind of fold over | ||

− | <hask>(>>)</hask>, and <hask>Foldable</hask> provides this along with the | ||

− | related <hask>sequence_</hask>. | ||

===Traversable=== | ===Traversable=== | ||

− | A <hask>Traversable</hask> [[type]] is a kind of upgraded <hask>Foldable</hask>. Where <hask>Foldable</hask> | + | A <hask>Traversable</hask> [[type]] is a kind of upgraded <hask>Foldable</hask>. Where <hask>Foldable</hask> gives you the ability to go through the structure processing the elements (<hask>foldr</hask>) but throwing away the shape, <hask>Traversable</hask> allows you to do that whilst preserving the shape and, e.g., putting new values in. |

− | gives you the ability to go through the structure processing the | ||

− | elements (<hask>foldr</hask>) but throwing away the shape, <hask>Traversable</hask> allows you | ||

− | to do that whilst preserving the shape and, e.g., putting new values | ||

− | in. | ||

− | <hask>Traversable</hask> is what we need for <hask>mapM</hask> and | + | <hask>Traversable</hask> is what we need for <hask>mapM</hask> and <hask>sequence</hask> : note the apparently surprising fact that the "_" versions are in a different [[typeclass]]. |

− | <hask>sequence</hask> : note the apparently surprising fact that the | ||

− | "_" versions are in a different [[typeclass]]. | ||

== Some trickier functions: concatMap and filter == | == Some trickier functions: concatMap and filter == | ||

− | Neither <hask>Traversable</hask> nor <hask>Foldable</hask> contain elements for <hask>concatMap</hask> and <hask>filter</hask>. That is because <hask>Foldable</hask> is about tearing down the structure | + | Neither <hask>Traversable</hask> nor <hask>Foldable</hask> contain elements for <hask>concatMap</hask> and <hask>filter</hask>. That is because <hask>Foldable</hask> is about tearing down the structure completely, while <hask>Traversable</hask> is about preserving the structure exactly as-is. On the other hand <hask>concatMap</hask> tries to 'squeeze more elements in' at a place and <hask>filter</hask> tries to cut them out. |

− | completely, while <hask>Traversable</hask> is about preserving the structure | ||

− | exactly as-is. On the other hand <hask>concatMap</hask> tries to | ||

− | 'squeeze more elements in' at a place and <hask>filter</hask> tries to | ||

− | cut them out. | ||

You can write <hask>concatMap</hask> for <hask>Sequence</hask> as follows: | You can write <hask>concatMap</hask> for <hask>Sequence</hask> as follows: | ||

Line 89: | Line 68: | ||

</haskell> | </haskell> | ||

− | But why does it work? It works because sequence is an instance of<hask>Monoid</hask>, where the [[monoid]]al operation is "appending". The same | + | But why does it work? It works because sequence is an instance of <hask>Monoid</hask>, where the [[monoid]]al operation is "appending". The same definition works for lists, and we can write it more generally as: |

− | definition works for lists, and we can write it more generally as: | ||

<haskell> | <haskell> | ||

Line 97: | Line 75: | ||

</haskell> | </haskell> | ||

− | And that works with lists and sequences both. Does it work with any | + | And that works with lists and sequences both. Does it work with any Monoid which is Foldable? Only if the Monoid 'means the right thing'. If you have <hask>toList (f `mappend` g) = toList f ++ toList g</hask> then it definitely makes sense. In fact this easy to write condition is stronger than needed; it would be good enough if they were permutations of each other. |

− | Monoid which is Foldable? Only if the Monoid 'means the right | + | |

− | thing'. If you have <hask>toList (f `mappend` g) = toList f ++ toList g</hask> then it definitely makes sense. In fact this easy to write | + | <hask>filter</hask> turns out to be slightly harder still. Note the type signature of filter from Data.List: <code>filter :: (a -> Bool) -> [a] -> [a]</code>. Every element in the list is evaluated by a predicate function <code>(a -> Bool)</code>. If that evaluation returns <hask>False</hask>, the element is removed. If every evaluation returns False, all elements will be removed; therefore there must be an empty representation of the data structure. In the case of list this would be <hask>[]</hask>. A general representation of this might be <hask>mempty</hask> found in the <hask>Monoid</hask> typeclass. |

− | condition is stronger than needed; it would be good enough if they | ||

− | were permutations of each other. | ||

− | + | Additionally, for building structure around values you need something like 'singleton' (from <hask>Sequence</hask>), or <hask>\a -> [a]</hask> for lists. We can use <hask>pure</hask> from <hask>Applicative</hask>, although it's not really right to bring <hask>Applicative</hask> in for this, and get: | |

− | something like 'singleton' (from <hask>Sequence</hask>), or <hask>\a -> [a]</hask> | ||

− | for lists. We can use <hask>pure</hask> from <hask>Applicative</hask>, although | ||

− | it's not really right to bring <hask>Applicative</hask> in for this, and get: | ||

<haskell> | <haskell> | ||

Line 114: | Line 87: | ||

</haskell> | </haskell> | ||

− | It's interesting to note that, under these conditions, we have a candidate | + | It's interesting to note that, under these conditions, we have a candidate to help us turn the <hask>Foldable</hask> into a <hask>Monad</hask>, since <hask>concatMap</hask> is a good definition for <hask>>>=</hask>, and we can use <hask>pure</hask> for <hask>return</hask>. |

− | to help us turn the <hask>Foldable</hask> into a <hask>Monad</hask>, since <hask>concatMap</hask> is a good | ||

− | definition for <hask>>>=</hask>, and we can use <hask>pure</hask> for <hask>return</hask>. | ||

== Generalising zipWith == | == Generalising zipWith == | ||

− | Another really useful list [[combinator]] that doesn't appear in the | + | Another really useful list [[combinator]] that doesn't appear in the interfaces for <hask>Sequence</hask>, <hask>Foldable</hask> or <hask>Traversable</hask> is <hask>zipWith</hask>. The most general kind of <hask>zipWith</hask> over <hask>Traversable</hask>s will keep the exact shape of the <hask>Traversable</hask> on the left, whilst zipping against the values on the right. It turns out you can get away with a <hask>Foldable</hask> on the right, but you need to use a <hask>Monad</hask> (or an <hask>Applicative</hask>, actually) to thread the values through: |

− | interfaces for <hask>Sequence</hask>, <hask>Foldable</hask> or <hask>Traversable</hask> is <hask>zipWith</hask>. The most general kind of <hask>zipWith</hask> over <hask>Traversable</hask>s will keep the exact shape of | ||

− | the <hask>Traversable</hask> on the left, whilst zipping against the values on the right. It turns out you can get away with a <hask>Foldable</hask> on the right, but you need to use a <hask>Monad</hask> (or an <hask>Applicative</hask>, actually) to thread the | ||

− | values through: | ||

<haskell> | <haskell> | ||

Line 239: | Line 207: | ||

where map_one (x:xs) y = (xs, g y x) | where map_one (x:xs) y = (xs, g y x) | ||

</haskell> | </haskell> | ||

− | Replace <hask>mapAccumL</hask> with <hask>mapAccumR</hask> and the elements of the Foldable are zipped in reverse order. | + | Replace <hask>mapAccumL</hask> with <hask>mapAccumR</hask> and the elements of the Foldable are zipped in reverse order. Similarly, we can define a generalization of <hask>reverse</hask> on Traversables, which preserves the shape but reverses the left-to-right position of the elements: |

− | Similarly, we can define a generalization of <hask>reverse</hask> on Traversables, which preserves the shape but reverses the left-to-right position of the elements: | ||

<haskell> | <haskell> | ||

reverseT :: (Traversable t) => t a -> t a | reverseT :: (Traversable t) => t a -> t a | ||

reverseT t = snd (mapAccumR (\ (x:xs) _ -> (xs, x)) (toList t) t) | reverseT t = snd (mapAccumR (\ (x:xs) _ -> (xs, x)) (toList t) t) | ||

</haskell> | </haskell> |

## Latest revision as of 19:00, 18 May 2020

**Notes on Foldable, Traversable and other useful classes**

*or "Where is Data.Sequence.toList?"*

Data.Sequence is recommended as an efficient alternative to [list]s, with a more symmetric feel and better complexity on various operations.

When you've been using it for a little while, there seem to be some baffling omissions from the API. The first couple you are likely to notice are the absence of "`map`

" and "`toList`

".
The answer to these lies in the long list of instances which Sequence has:

- The Sequence version of map is "
`fmap`

", which comes from the Functor class. - The Sequence version of
`toList`

is in the`Foldable`

class.

When working with `Sequence`

you also want to refer to the documentation
for at least `Foldable`

and `Traversable`

. `Functor`

only has the single method, so we've already covered that.

## Contents

## What do these classes all mean? A brief tour:

`Functor`

A functor is simply a container. Given a container, and a function which works on the elements, we can apply that function to each element. For lists, the familiar "`map`

" does exactly this.

Note that the function can produce elements of a different type, so we may have a different type at the end.

Examples:

```
Prelude Data.Sequence> map (\n -> replicate n 'a') [1,3,5]
["a","aaa","aaaaa"]
Prelude Data.Sequence> fmap (\n -> replicate n 'a') (1 <| 3 <| 5 <| empty)
fromList ["a","aaa","aaaaa"]
```

### Foldable

A `Foldable`

type is also a container.
The class does not require `Functor`

superclass
in order to allow containers like `Set`

or `StorableVector`

that have additional constraints on the element type.
But many interesting `Foldable`

s are also `Functor`

s.
A foldable container is a container with the added property
that its items can be 'folded' to a summary value.
In other words, it is a type which supports "`foldr`

".

Once you support `foldr`

, of course, it can be turned into a list, by using `toList = foldr (:) []`

. This means that all `Foldable`

s have a representation as a list, but the order of the items may or may not have any particular significance. However, if a `Foldable`

is also a `Functor`

, parametricity and the Functor law guarantee that `toList`

and `fmap`

commute. Further, in the case of `Data.Sequence`

, there **is** a well defined order and it is exposed as expected by `toList`

.

A particular kind of fold well-used by Haskell programmers is `mapM_`

, which is a kind of fold over `(>>)`

, and `Foldable`

provides this along with the related `sequence_`

.

### Traversable

A `Traversable`

type is a kind of upgraded `Foldable`

. Where `Foldable`

gives you the ability to go through the structure processing the elements (`foldr`

) but throwing away the shape, `Traversable`

allows you to do that whilst preserving the shape and, e.g., putting new values in.

`Traversable`

is what we need for `mapM`

and `sequence`

: note the apparently surprising fact that the "_" versions are in a different typeclass.

## Some trickier functions: concatMap and filter

Neither `Traversable`

nor `Foldable`

contain elements for `concatMap`

and `filter`

. That is because `Foldable`

is about tearing down the structure completely, while `Traversable`

is about preserving the structure exactly as-is. On the other hand `concatMap`

tries to 'squeeze more elements in' at a place and `filter`

tries to cut them out.

You can write `concatMap`

for `Sequence`

as follows:

```
concatMap :: (a -> Seq b) -> Seq a -> Seq b
concatMap = foldMap
```

But why does it work? It works because sequence is an instance of `Monoid`

, where the monoidal operation is "appending". The same definition works for lists, and we can write it more generally as:

```
concatMap :: (Foldable f, Monoid (f b)) => (a -> f b) -> f a -> f b
concatMap = foldMap
```

And that works with lists and sequences both. Does it work with any Monoid which is Foldable? Only if the Monoid 'means the right thing'. If you have `toList (f `mappend` g) = toList f ++ toList g`

then it definitely makes sense. In fact this easy to write condition is stronger than needed; it would be good enough if they were permutations of each other.

`filter`

turns out to be slightly harder still. Note the type signature of filter from Data.List: `filter :: (a -> Bool) -> [a] -> [a]`

. Every element in the list is evaluated by a predicate function `(a -> Bool)`

. If that evaluation returns `False`

, the element is removed. If every evaluation returns False, all elements will be removed; therefore there must be an empty representation of the data structure. In the case of list this would be `[]`

. A general representation of this might be `mempty`

found in the `Monoid`

typeclass.

Additionally, for building structure around values you need something like 'singleton' (from `Sequence`

), or `\a -> [a]`

for lists. We can use `pure`

from `Applicative`

, although it's not really right to bring `Applicative`

in for this, and get:

```
filter :: (Applicative f, Foldable f, Monoid (f a)) =>
(a -> Bool) -> f a -> f a
filter p = foldMap (\a -> if p a then pure a else mempty)
```

It's interesting to note that, under these conditions, we have a candidate to help us turn the `Foldable`

into a `Monad`

, since `concatMap`

is a good definition for `>>=`

, and we can use `pure`

for `return`

.

## Generalising zipWith

Another really useful list combinator that doesn't appear in the interfaces for `Sequence`

, `Foldable`

or `Traversable`

is `zipWith`

. The most general kind of `zipWith`

over `Traversable`

s will keep the exact shape of the `Traversable`

on the left, whilst zipping against the values on the right. It turns out you can get away with a `Foldable`

on the right, but you need to use a `Monad`

(or an `Applicative`

, actually) to thread the values through:

```
import Prelude hiding (sequence)
import Data.Sequence
import Data.Foldable
import Data.Traversable
import Control.Applicative
data Supply s v = Supply { unSupply :: [s] -> ([s],v) }
instance Functor (Supply s) where
fmap f av = Supply (\l -> let (l',v) = unSupply av l in (l',f v))
instance Applicative (Supply s) where
pure v = Supply (\l -> (l,v))
af <*> av = Supply (\l -> let (l',f) = unSupply af l
(l'',v) = unSupply av l'
in (l'',f v))
runSupply :: (Supply s v) -> [s] -> v
runSupply av l = snd $ unSupply av l
supply :: Supply s s
supply = Supply (\(x:xs) -> (xs,x))
zipTF :: (Traversable t, Foldable f) => t a -> f b -> t (a,b)
zipTF t f = runSupply (traverse (\a -> (,) a <$> supply) t) (toList f)
zipWithTF :: (Traversable t,Foldable f) => (a -> b -> c) -> t a -> f b -> t c
zipWithTF g t f = runSupply (traverse (\a -> g a <$> supply) t) (toList f)
zipWithTFM :: (Traversable t,Foldable f,Monad m) =>
(a -> b -> m c) -> t a -> f b -> m (t c)
zipWithTFM g t f = sequence (zipWithTF g t f)
zipWithTFA :: (Traversable t,Foldable f,Applicative m) =>
(a -> b -> m c) -> t a -> f b -> m (t c)
zipWithTFA g t f = sequenceA (zipWithTF g t f)
```

The code above fails with a pattern match error when the `Foldable`

container doesn't have enough input. Here is an alternative version which provides friendlier error reports and makes use of `State`

instead of the self defined Supply monad.

```
module GenericZip
(zipWithTF,
zipTF,
zipWithTFA,
zipWithTFM) where
import Data.Foldable
import Data.Traversable
import qualified Data.Traversable as T
import Control.Applicative
import Control.Monad.State
-- | The state contains the list of values obtained form the foldable container
-- and a String indicating the name of the function currectly being executed
data ZipState a = ZipState {fName :: String,
list :: [a]}
-- | State monad containing ZipState
type ZipM l a = State (ZipState l) a
-- | pops the first element of the list inside the state
pop :: ZipM l l
pop = do
st <- get
let xs = list st
n = fName st
case xs of
(a:as) -> do put st{list=as}
return a
[] -> error $ n ++ ": insufficient input"
-- | pop a value form the state and supply it to the second
-- argument of a binary function
supplySecond :: (a -> b -> c) -> a -> ZipM b c
supplySecond f a = do b <- pop
return $ f a b
zipWithTFError :: (Traversable t,Foldable f) =>
String -> (a -> b -> c) -> t a -> f b -> t c
zipWithTFError str g t f = evalState (T.mapM (supplySecond g) t)
(ZipState str (toList f))
zipWithTF :: (Traversable t,Foldable f) => (a -> b -> c) -> t a -> f b -> t c
zipWithTF = zipWithTFError "GenericZip.zipWithTF"
zipTF :: (Traversable t, Foldable f) => t a -> f b -> t (a,b)
zipTF = zipWithTFError "GenericZip.zipTF" (,)
zipWithTFM :: (Traversable t,Foldable f,Monad m) =>
(a -> b -> m c) -> t a -> f b -> m (t c)
zipWithTFM g t f = T.sequence (zipWithTFError "GenericZip.zipWithTFM" g t f)
zipWithTFA :: (Traversable t,Foldable f,Applicative m) =>
(a -> b -> m c) -> t a -> f b -> m (t c)
zipWithTFA g t f = sequenceA (zipWithTFError "GenericZip.zipWithTFA" g t f)
```

Recent versions of `Data.Traversable`

include generalizations of `mapAccumL`

and `mapAccumR`

from lists to Traversables (encapsulating the state monad used above):

```
mapAccumL :: Traversable t => (a -> b -> (a, c)) -> a -> t b -> (a, t c)
mapAccumR :: Traversable t => (a -> b -> (a, c)) -> a -> t b -> (a, t c)
```

Using these, the first version above can be written as

```
zipWithTF :: (Traversable t, Foldable f) => (a -> b -> c) -> t a -> f b -> t c
zipWithTF g t f = snd (mapAccumL map_one (toList f) t)
where map_one (x:xs) y = (xs, g y x)
```

Replace `mapAccumL`

with `mapAccumR`

and the elements of the Foldable are zipped in reverse order. Similarly, we can define a generalization of `reverse`

on Traversables, which preserves the shape but reverses the left-to-right position of the elements:

```
reverseT :: (Traversable t) => t a -> t a
reverseT t = snd (mapAccumR (\ (x:xs) _ -> (xs, x)) (toList t) t)
```