https://wiki.haskell.org/index.php?title=List_traversal&feed=atom&action=historyList traversal - Revision history2016-08-27T06:23:09ZRevision history for this page on the wikiMediaWiki 1.19.14+dfsg-1https://wiki.haskell.org/index.php?title=List_traversal&diff=52018&oldid=prevLemming: link to lazy pattern match2012-09-07T07:39:04Z<p>link to lazy pattern match</p>
<table class='diff diff-contentalign-left'>
<tr valign='top'>
<td colspan='1' style="background-color: white; color:black;">← Older revision</td>
<td colspan='1' style="background-color: white; color:black;">Revision as of 07:39, 7 September 2012</td>
</tr></table>Lemminghttps://wiki.haskell.org/index.php?title=List_traversal&diff=52016&oldid=prevLemming: show five implementations of partitionEithers and their pros and cons2012-09-07T07:31:22Z<p>show five implementations of partitionEithers and their pros and cons</p>
<p><b>New page</b></p><div>Traversing a list is sometimes more difficult<br />
than it seems to be at the first glance.<br />
With "traversal" I mean to consume one or more lists<br />
and produce one or more new ones.<br />
Our goal is to do this efficiently and lazily.<br />
<br />
As a running example I use the <hask>partitionEithers</hask> function<br />
that can be found in the <hask>Data.Either</hask> module<br />
since <code>base-4.0</code>.<br />
<br />
Its type signature is<br />
<haskell><br />
partitionEithers :: [Either a b] -> ([a], [b])<br />
</haskell><br />
and it does what you expect:<br />
<haskell><br />
Prelude Data.Either> partitionEithers [Left 'a', Right False, Left 'z']<br />
("az",[False])<br />
Prelude Data.Either> take 100 $ snd $ partitionEithers $ cycle [Left 'a', Right (0 :: Int)]<br />
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]<br />
</haskell><br />
<br />
The second example is especially important<br />
because it shows that the input can be infinitely long<br />
and the output can be, too.<br />
That's the proof that the implementation is lazy.<br />
We will use this example as test for our implementations below.<br />
<br />
<br />
== First attempt - quadratic runtime, not lazy ==<br />
<br />
In our first attempt we maintain a state containing two lists<br />
that we want to extend to the result lists step by step.<br />
<br />
<haskell><br />
partitionEithers2 :: [Either a b] -> ([a], [b])<br />
partitionEithers2 =<br />
let aux ab [] = ab<br />
aux (as, bs) (Left a : es) = aux (as ++ [a], bs) es<br />
aux (as, bs) (Right b : es) = aux (as, bs ++ [b]) es<br />
in aux ([], [])<br />
</haskell><br />
<br />
This implementation works for finite lists<br />
but fails for infinite ones.<br />
You will also notice that it is quite slow.<br />
The reason is that appending something to a list like <hask>as</hask><br />
requires <hask>length as</hask> steps<br />
in order to reach the end of <hask>as</hask>.<br />
Since we do this repeatedly we end up with quadratic runtime.<br />
<br />
<br />
== Second attempt - linear runtime, still not lazy ==<br />
<br />
We have learned that appending something to a list is expensive.<br />
However prepending a single element is very cheap,<br />
it needs only constant number of operations.<br />
Thus we will implement the following idea:<br />
We prepend new elements to the result list<br />
and since this reverses the order of elements,<br />
we reverse the result lists in the end.<br />
<haskell><br />
partitionEithers1 :: [Either a b] -> ([a], [b])<br />
partitionEithers1 xs =<br />
let aux ab [] = ab<br />
aux (as, bs) (Left a : es) = aux (a : as, bs) es<br />
aux (as, bs) (Right b : es) = aux (as, b : bs) es<br />
(ys,zs) = aux ([], []) xs<br />
in (reverse ys, reverse zs)<br />
</haskell><br />
This implementation is much faster than the first one<br />
but it cannot be lazy because <hask>reverse</hask> is not lazy.<br />
<br />
<br />
== Third attempt - linear runtime and full laziness ==<br />
<br />
In order to get linear runtime and full laziness<br />
we must produce the list in the same order as the input.<br />
However we must avoid appending to the end of the list.<br />
Instead we must prepend elements to lists that become known in the future.<br />
We must be very careful that the leading elements of the result lists<br />
can be generated without touching the following elements.<br />
Here is the solution:<br />
<haskell><br />
partitionEithers :: [Either a b] -> ([a], [b])<br />
partitionEithers [] = ([], [])<br />
partitionEithers (Left a : es) =<br />
let (as,bs) = partitionEithers es<br />
in (a:as, bs)<br />
partitionEithers (Right b : es) =<br />
let (as,bs) = partitionEithers es<br />
in (as, b:bs)<br />
</haskell><br />
It is crucial to know that a <hask>let</hask> binding<br />
matches the top-most data constructor lazily.<br />
The following expressions would match strictly and thus would fail:<br />
<haskell><br />
(\(as,bs) -> (a:as, bs)) $ partitionEithers es<br />
</haskell><br />
<haskell><br />
case partitionEithers es of (as,bs) -> (a:as, bs)<br />
</haskell><br />
Matching the pair constructor strictly means<br />
that the recursive call to <hask>partitionEithers</hask> is triggered<br />
before the pair constructor of the result is generated.<br />
This starts a cascade that forces all recursive calls<br />
until the end of the input list.<br />
<br />
This is different for lazy pattern matches.<br />
The above <hask>let</hask> can be rewritten equivalently to:<br />
<haskell><br />
let ~(as,bs) = partitionEithers es<br />
in (a:as, bs)<br />
</haskell><br />
<haskell><br />
(\ ~(as,bs) -> (a:as, bs)) $ partitionEithers es<br />
</haskell><br />
<haskell><br />
case partitionEithers es of ~(as,bs) -> (a:as, bs)<br />
</haskell><br />
or without the tilde as syntactic sugar:<br />
<haskell><br />
case partitionEithers es of ab -> (a : fst ab, snd ab)<br />
</haskell><br />
Of course, both <hask>fst</hask> and <hask>snd</hask><br />
contain strict pattern matches on the pair constructor<br />
but the key difference to above is<br />
that these matches happen inside the pair constructor of<br />
<hask>(a : fst ab, snd ab)</hask>.<br />
That is, the outer pair constructor can be generated<br />
before the evaluation of <hask>ab</hask> is started.<br />
<br />
<br />
== Fourth attempt - expert solution ==<br />
<br />
Now real experts would not recurse manually<br />
but would let <hask>foldr</hask> do this job.<br />
This allows for [[fusion]].<br />
Additionally real experts would add the line<br />
<hask>(\ ~(as,bs) -> (as,bs))</hask><br />
in order to generate the pair constructor of the result<br />
completely independent from the input.<br />
This yields maximum laziness.<br />
<haskell><br />
partitionEithersFoldr :: [Either a b] -> ([a], [b])<br />
partitionEithersFoldr =<br />
(\ ~(as,bs) -> (as,bs)) .<br />
foldr<br />
(\e ~(as,bs) -><br />
case e of<br />
Left a -> (a:as, bs)<br />
Right b -> (as, b:bs))<br />
([], [])<br />
</haskell><br />
<br />
<br />
== Fifth attempt - your solution ==<br />
<br />
If you are tired of all these corner cases<br />
that we need to respect in order to get full laziness<br />
then you might prefer to solve the problem<br />
by just combining functions that are known to be lazy.<br />
It is good style anyway to avoid explicit recursion.<br />
Of course, when combining lazy functions<br />
you must still take care that the combinators maintain laziness.<br />
Thus my exercise for you at the end of this article<br />
is to implement <hask>partitionEithers</hask> using standard functions,<br />
say, from <code>base</code> before version 4.<br />
A small hint: the module <hask>Data.Maybe</hask><br />
turns out to be very useful.<br />
<br />
<br />
[[Category:Idioms]]</div>Lemming