Difference between revisions of "Arrow tutorial"

From HaskellWiki
Jump to navigation Jump to search
(Various formatting changes)
 
(19 intermediate revisions by 7 users not shown)
Line 1: Line 1:
  +
{| border=0 align=right cellpadding=4 cellspacing=0
<haskell>
 
  +
|<haskell>
  +
> {-# LANGUAGE Arrows #-}
 
> module ArrowFun where
 
> module ArrowFun where
 
> import Control.Arrow
 
> import Control.Arrow
  +
> import Control.Category
  +
> import Prelude hiding (id,(.))
 
</haskell>
 
</haskell>
  +
|}
   
  +
== The <code>Arrow</code> class ==
Arrow a b c represents a process that takes as input something of
 
  +
A value of type <code>(Arrow a) => a b c</code> (commonly referred to as just an <i>arrow</i>) represents a process that takes as input a value of type <code>b</code> and outputs a value of type <code>c</code>.
type b and outputs something of type c.
 
   
  +
The class includes the following methods:
Arr builds an arrow out of a function. This function is
 
arrow-specific. It's signature is
 
   
  +
* <code>arr</code> builds an arrow value out of a function:
<haskell>
 
  +
> -- arr :: (Arrow a) => (b -> c) -> a b c
 
  +
:<haskell>
  +
arr :: (Arrow a) => (b -> c) -> a b c
 
</haskell>
 
</haskell>
   
  +
* <code>(>>>)</code> composes two arrow values to form a new one by "chaining" them them together, one after another:
Arrow composition is achieved with (>>>). This takes two arrows
 
and chains them together, one after another. It is also arrow-
 
specific. It's signature is:
 
   
<haskell>
+
:<haskell>
> -- (>>>) :: (Arrow a) => a b c -> a c d -> a b d
+
(>>>) :: (Arrow a) => a b c -> a c d -> a b d
 
</haskell>
 
</haskell>
   
First and second make a new arrow out of an existing arrow. They
+
* <code>first</code> and <code>second</code> make a new arrow value out of an existing one. They perform a transformation (given by their argument) on either the first or the second item of a pair:
perform a transformation (given by their argument) on either
 
the first or the second item of a pair. These definitions are
 
arrow-specific. Their signatures are:
 
   
<haskell>
+
:<haskell>
> -- first :: (Arrow a) => a b c -> a (b, d) (c, d)
+
first :: (Arrow a) => a b c -> a (b, d) (c, d)
> -- second :: (Arrow a) => a b c -> a (d, b) (d, c)
+
second :: (Arrow a) => a b c -> a (d, b) (d, c)
 
</haskell>
 
</haskell>
   
First and second may seem pretty strange at first, but they'll make sense
+
:<code>first</code> and <code>second</code> may seem pretty strange at first, but they'll make sense in a few minutes.
in a few minutes.
 
 
That's it for the arrow-specific definitions.
 
   
Let's define a really simple arrow as an example. Our simple arrow is
+
== A simple arrow type ==
just a function mapping an input to an output. We don't really need
+
Let's define a really simple arrow type as an example, based on a function mapping an input to an output:
arrows for something this simple, but we could use something this
 
simple to explain arrows.
 
   
<haskell>
+
:<haskell>
 
> newtype SimpleFunc a b = SimpleFunc {
 
> newtype SimpleFunc a b = SimpleFunc {
 
> runF :: (a -> b)
 
> runF :: (a -> b)
Line 53: Line 49:
 
> second (SimpleFunc f) = SimpleFunc (mapSnd f)
 
> second (SimpleFunc f) = SimpleFunc (mapSnd f)
 
> where mapSnd g (a,b) = (a, g b)
 
> where mapSnd g (a,b) = (a, g b)
  +
>
> (SimpleFunc f) >>> (SimpleFunc g) = SimpleFunc (g . f)
 
  +
> instance Category SimpleFunc where
  +
> (SimpleFunc g) . (SimpleFunc f) = SimpleFunc (g . f)
  +
> id = arr id
 
</haskell>
 
</haskell>
   
Now lets define some operations that are generic to all arrows.
+
== Some other arrow operations ==
  +
Now let's define some operations that are generic to all arrow types:
   
Split is an arrow that splits a single value into a pair of duplicate
+
* <code>split</code> is an arrow value that splits a single value into a pair of duplicate values:
values:
 
   
<haskell>
+
:<haskell>
 
> split :: (Arrow a) => a b (b, b)
 
> split :: (Arrow a) => a b (b, b)
 
> split = arr (\x -> (x,x))
 
> split = arr (\x -> (x,x))
 
</haskell>
 
</haskell>
   
Unsplit is an arrow that takes a pair of values and combines them
+
* <code>unsplit</code> is an arrow value that takes a pair of values and combines them to return a single value:
to return a single value:
 
   
<haskell>
+
:<haskell>
 
> unsplit :: (Arrow a) => (b -> c -> d) -> a (b, c) d
 
> unsplit :: (Arrow a) => (b -> c -> d) -> a (b, c) d
 
> unsplit = arr . uncurry
 
> unsplit = arr . uncurry
> -- arr (\op (x,y) -> x `op` y)
+
> -- = \op -> arr (\(x,y) -> x `op` y)
 
</haskell>
 
</haskell>
   
  +
* <code>(***)</code> combines two arrow values by running them on a pair (the first arrow value on the first component of the pair; the second arrow value on the second component of the pair):
(***) combines two arrows into a new arrow by running the two arrows
 
on a pair of values (one arrow on the first pair and one arrow on the
 
second pair).
 
   
<haskell>
+
:<haskell>
> -- f *** g = first f >>> second g
+
f *** g = first f >>> second g
 
</haskell>
 
</haskell>
   
(&&&) combines two arrows into a new arrow by running the two arrows on
+
* <code>(&&&)</code> combines two arrow values by running them with the same input:
the same value:
 
   
<haskell>
+
:<haskell>
> -- f &&& g = split >>> first f >>> second g
+
f &&& g = split >>> first f >>> second g
> -- split >>> f *** g
+
-- = split >>> f *** g
 
</haskell>
 
</haskell>
   
LiftA2 makes a new arrow that combines the output from two arrows using
+
* <code>liftA2</code> makes a new arrow value that combines the output from two other arrow values using a binary operation. It works by splitting a value and operating on both halves and then combining the result:
a binary operation. It works by splitting a value and operating on
 
both halfs and then combining the result:
 
   
<haskell>
+
:<haskell>
 
> liftA2 :: (Arrow a) => (b -> c -> d) -> a e b -> a e c -> a e d
 
> liftA2 :: (Arrow a) => (b -> c -> d) -> a e b -> a e c -> a e d
 
> liftA2 op f g = split >>> first f >>> second g >>> unsplit op
 
> liftA2 op f g = split >>> first f >>> second g >>> unsplit op
> -- f &&& g >>> unsplit op
+
> -- = f &&& g >>> unsplit op
 
</haskell>
 
</haskell>
   
  +
== An example ==
  +
Now let's build something using our simple arrow definition and some of the tools we've just created. We start with two simple arrow values, <code>f</code> and <code>g</code>:
   
  +
* <code>f</code> halves its input:
   
  +
:<haskell>
Now let's build something using our simple arrow definition and
 
  +
> f :: SimpleFunc Int Int
some of the tools we just created. We start with two simple
 
arrows, f and g. F halves its input and g triples its input and
 
adds one:
 
 
<haskell>
 
> f, g :: SimpleFunc Int Int
 
 
> f = arr (`div` 2)
 
> f = arr (`div` 2)
  +
</haskell>
  +
  +
* and <code>g</code> triples its input and adds one:
  +
  +
:<haskell>
  +
> g :: SimpleFunc Int Int
 
> g = arr (\x -> x*3 + 1)
 
> g = arr (\x -> x*3 + 1)
 
</haskell>
 
</haskell>
   
We can combine these together using liftA2:
+
We can combine these together using <code>liftA2</code>:
   
<haskell>
+
:<haskell>
  +
> h :: SimpleFunc Int Int
 
> h = liftA2 (+) f g
 
> h = liftA2 (+) f g
  +
>
  +
> hOutput :: Int
 
> hOutput = runF h 8
 
> hOutput = runF h 8
 
</haskell>
 
</haskell>
   
What is h? How does it work?
+
What is <code>h</code>? How does it work?
The process defined by h is (split >>> first f >>> second g >>> unsplit (+)).
 
Lets work through an application of h to some value, 8:
 
   
  +
The process defined by <code>h</code> is <code>split >>> first f >>> second g >>> unsplit (+)</code>. Let's work through an application of <code>h</code> to the value <code>8</code>:
8 -> (8, 8) split
 
(8, 8) -> (4, 8) first f (x `div` 2 of the first element)
 
(4, 8) -> (4, 25) second g (3*x + 1 of the second element)
 
(4, 25) -> 29 applies (+) to tuple elements.
 
   
  +
:{|
+------> f --------------+
 
  +
|<code>8</code>
| v
 
  +
|→
8 ---> (split) ---> g -----> (unsplit (+)) ----> 29
 
  +
|<code>(8, 8)</code>
  +
|<code>split</code>
  +
|-
  +
|<code>(8, 8)</code>
  +
|→
  +
|<code>(4, 8)</code>
  +
|<code>first f</code> ⇔ <code>x `div` 2</code>, where <code>x</code> is the first component of the pair
  +
|-
  +
|<code>(4, 8)</code>
  +
|→
  +
|<code>(4, 25)</code>
  +
|<code>second g</code> ⇔ <code>3*y + 1</code>, where <code>y</code> is the second component of the pair
  +
|-
  +
|<code>(4, 25)</code>
  +
|→
  +
|<code>29</code>
  +
|apply <code>(+)</code> to the components of the pair
  +
|}
  +
:::{|
  +
|
  +
f
  +
↗ ↘
  +
8 → (split) (unsplit (+)) → 29
  +
↘ ↗
  +
g
  +
|}
   
so we see that h is a new arrow that when applied to 8, applies 8 to f
+
We can see that <code>h</code> is a new arrow value that, when applied to <code>8</code>, will apply both <code>f</code> and <code>g</code> to <code>8</code>, then adds their results.
and applies 8 to g and adds the results.
 
   
  +
A lot of juggling occurred to get the plumbing right since <code>h</code> wasn't defined as a linear combination of arrow values. GHC has a syntactic notation that simplifies this in a similar way to how
  +
<code>do</code>-notation simplifies monadic computations. The <code>h</code> function can then be defined as:
   
  +
:<haskell>
Let's move on to something a little fancier now: Kleisli arrows.
 
  +
> h' :: SimpleFunc Int Int
A Kleisli arrow (Kleisli m a b) is the arrow (a -> m b) for all
 
  +
> h' = proc x -> do
monads. It's defined in Control.Arrows similarly to our SimpleFunc:
 
  +
> fx <- f -< x
 
  +
> gx <- g -< x
<haskell>
 
  +
> returnA -< (fx + gx)
> -- newtype Kleisli m a b = Kleisli {
 
  +
>
> -- runKleisli :: (a -> m b)
 
  +
> hOutput' :: Int
> -- }
 
  +
> hOutput' = runF h' 8
 
</haskell>
 
</haskell>
   
  +
== <code>Kleisli</code> arrow values ==
It comes complete with its own definitions for arr, first, second and
 
  +
Let's move on to something a little fancier now: Kleisli arrows.
(>>>). This means that all multi-value functions (a -> [b]) are already
 
defined as Kleisli arrows (because [] is a monad)! (>>>) performs
 
composition, keeping track of all the multiple results. Split, (&&&)
 
and (***) are all defined as before. So for example:
 
   
  +
A Kleisli arrow type (<code>Kleisli m a b</code>) corresponds to the type <code>(a -> m b)</code>, where <code>m</code> is a monadic type. It's defined in <code>Control.Arrows</code> similarly to our <code>SimpleFunc</code>:
<haskell>
 
  +
> -- XXX I am getting type problems with split, unsplit and liftA2! why?
 
  +
:<haskell>
> split' = arr (\x -> (x,x))
 
  +
newtype Kleisli m a b = Kleisli {
> unsplit' = arr . uncurry
 
  +
runKleisli :: (a -> m b)
> --liftA2' :: (Arrow a) => (b -> c -> d) -> a e b -> a e c -> a e d
 
  +
}
> liftA2' op f g = split' >>> first f >>> second g >>> unsplit' op
 
 
</haskell>
 
</haskell>
   
  +
It comes complete with its own definitions for <code>arr</code>, <code>(>>>)</code>, <code>first</code>, and <code>second</code>. This means that all multi-value functions (i.e of type <code>a -> [b]</code>) are already defined as Kleisli arrows (because the list type <code>[]</code> is monadic)! <code>(>>>)</code> performs composition, keeping track of all the multiple results. <code>split</code>, <code>(&&&)</code> and <code>(***)</code> are all defined as before. For example:
<haskell>
 
  +
  +
:<haskell>
 
> plusminus, double, h2 :: Kleisli [] Int Int
 
> plusminus, double, h2 :: Kleisli [] Int Int
 
> plusminus = Kleisli (\x -> [x, -x])
 
> plusminus = Kleisli (\x -> [x, -x])
> double = arr (* 2)
+
> double = arr (* 2)
> h2 = liftA2' (+) plusminus double
+
> h2 = liftA2 (+) plusminus double
  +
>
  +
> h2Output :: [Int]
 
> h2Output = runKleisli h2 8
 
> h2Output = runKleisli h2 8
 
</haskell>
 
</haskell>
   
  +
== A Teaser ==
Finally, here's a little teaser. There's an arrow function called
 
  +
Finally, here is a little teaser. There is an arrow function called <code>returnA</code> which returns an identity arrow. There is an <code>ArrowPlus</code> class that includes a <code>zeroArrow</code> (which for the list type is an arrow value that always returns the empty list) and a <code>(<+>)</code> operator (which takes the results from two arrow values and concatenates them). We can build up some pretty interesting string transformations (multi-valued functions of type <code>String -> [String]</code>) using Kleisli arrow values:
returnA which returns an identity arrow. There's a ArrowPlus class
 
that includes a zeroArrow (which for the list monad is an arrow that
 
always returns the empty list) and a <+> operator (which takes the
 
results from two arrows and concatenates them). We can build up
 
some pretty interesting string transformations (the multi-valued
 
function String -> [String]) using Kleisli arrows:
 
   
<haskell>
+
:<haskell>
  +
> main :: IO ()
 
> main = do
 
> main = do
 
> let
 
> let
 
> prepend x = arr (x ++)
 
> prepend x = arr (x ++)
> append x = arr (++ x)
+
> append x = arr (++ x)
> withId t = returnA <+> t
+
> withId t = returnA <+> t
 
> xform = (withId $ prepend "<") >>>
 
> xform = (withId $ prepend "<") >>>
 
> (withId $ append ">") >>>
 
> (withId $ append ">") >>>
Line 192: Line 212:
   
 
An important observation here is that
 
An important observation here is that
f >> g
+
f >>> g
  +
  +
is a multi-valued composition <code>(g . f)</code>, and
  +
:{|
  +
|
  +
|<code>(withId f) >>> (withId g)</code>
  +
|-
  +
|=
  +
|<code>(returnA <+> f) >>> (returnA <+> g)</code>
  +
|-
  +
|=
  +
|<code>((arr id) <+> f) >>> ((arr id) <+> g)</code>
  +
|}
  +
  +
which, when applied to an input <code>x</code>, returns all values:
  +
:{|
  +
|
  +
|<code>((id . id) x) ++ ((id . f) x) ++ ((id . g) x) ++ ((g . f) x)</code>
  +
|-
  +
| =
  +
|<code>x ++ (f x) ++ (g x) ++ ((g . f) x)</code>
  +
|}
  +
  +
which are all permutations of using the arrow values <code>f</code> and <code>g</code>.
  +
  +
== Tutorial Meta ==
  +
The wiki file source is literate Haskell. Save the source in a file called <code>ArrowFun.lhs</code> to compile it (or run in GHCi).
   
  +
The code is adapted to GHC 6.10.1; use [http://www.haskell.org/haskellwiki/?title=Arrow_tutorial&oldid=15443] for older versions of GHC and other Haskell implementations.
is multi-valued composition (g . f), and
 
(withId f) >>> (withId g) =
 
(returnA <+> f) >>> (returnA <+> g) =
 
((arr id) <+> f) >>> ((arr id) <+> g)
 
   
  +
* Original version - Nov 19, 2006, Tim Newsham.
which, when applied to an input x, returns all values:
 
((id . id) x) ++ ((id . f) x) ++ ((id . g) x) ++ ((g . f) x) =
 
x ++ (f x) ++ (g x) ++ ((g . f) x)
 
   
  +
[[Category:Tutorials]]
which are all permutations of using arrows f and g.
 
  +
[[Category:Arrow]]

Latest revision as of 00:00, 17 June 2021

> {-# LANGUAGE Arrows #-}
> module ArrowFun where
> import Control.Arrow
> import Control.Category
> import Prelude hiding (id,(.))

The Arrow class

A value of type (Arrow a) => a b c (commonly referred to as just an arrow) represents a process that takes as input a value of type b and outputs a value of type c.

The class includes the following methods:

  • arr builds an arrow value out of a function:
arr :: (Arrow a) => (b -> c) -> a b c
  • (>>>) composes two arrow values to form a new one by "chaining" them them together, one after another:
(>>>) :: (Arrow a) => a b c -> a c d -> a b d
  • first and second make a new arrow value out of an existing one. They perform a transformation (given by their argument) on either the first or the second item of a pair:
first  :: (Arrow a) => a b c -> a (b, d) (c, d)
second :: (Arrow a) => a b c -> a (d, b) (d, c)
first and second may seem pretty strange at first, but they'll make sense in a few minutes.

A simple arrow type

Let's define a really simple arrow type as an example, based on a function mapping an input to an output:

> newtype SimpleFunc a b = SimpleFunc {
>     runF :: (a -> b)
> }
>
> instance Arrow SimpleFunc where
>     arr f = SimpleFunc f
>     first (SimpleFunc f) = SimpleFunc (mapFst f)
>                   where mapFst g (a,b) = (g a, b)
>     second (SimpleFunc f) = SimpleFunc (mapSnd f)
>                   where mapSnd g (a,b) = (a, g b)
>
> instance Category SimpleFunc where
>     (SimpleFunc g) . (SimpleFunc f) = SimpleFunc (g . f)
>     id = arr id

Some other arrow operations

Now let's define some operations that are generic to all arrow types:

  • split is an arrow value that splits a single value into a pair of duplicate values:
> split :: (Arrow a) => a b (b, b)
> split = arr (\x -> (x,x))
  • unsplit is an arrow value that takes a pair of values and combines them to return a single value:
> unsplit :: (Arrow a) => (b -> c -> d) -> a (b, c) d
> unsplit = arr . uncurry       
>      -- = \op -> arr (\(x,y) -> x `op` y)
  • (***) combines two arrow values by running them on a pair (the first arrow value on the first component of the pair; the second arrow value on the second component of the pair):
f *** g = first f >>> second g
  • (&&&) combines two arrow values by running them with the same input:
f &&& g = split >>> first f >>> second g
     -- = split >>> f *** g
  • liftA2 makes a new arrow value that combines the output from two other arrow values using a binary operation. It works by splitting a value and operating on both halves and then combining the result:
> liftA2 :: (Arrow a) => (b -> c -> d) -> a e b -> a e c -> a e d
> liftA2 op f g = split >>> first f >>> second g >>> unsplit op
>            -- = f &&& g >>> unsplit op

An example

Now let's build something using our simple arrow definition and some of the tools we've just created. We start with two simple arrow values, f and g:

  • f halves its input:
> f :: SimpleFunc Int Int
> f = arr (`div` 2)
  • and g triples its input and adds one:
> g :: SimpleFunc Int Int
> g = arr (\x -> x*3 + 1)

We can combine these together using liftA2:

> h :: SimpleFunc Int Int
> h = liftA2 (+) f g
>
> hOutput :: Int
> hOutput = runF h 8

What is h? How does it work?

The process defined by h is split >>> first f >>> second g >>> unsplit (+). Let's work through an application of h to the value 8:

8 (8, 8) split
(8, 8) (4, 8) first fx `div` 2, where x is the first component of the pair
(4, 8) (4, 25) second g3*y + 1, where y is the second component of the pair
(4, 25) 29 apply (+) to the components of the pair
              f
            ↗   ↘
 8 → (split)     (unsplit (+)) → 29
            ↘   ↗
              g 

We can see that h is a new arrow value that, when applied to 8, will apply both f and g to 8, then adds their results.

A lot of juggling occurred to get the plumbing right since h wasn't defined as a linear combination of arrow values. GHC has a syntactic notation that simplifies this in a similar way to how do-notation simplifies monadic computations. The h function can then be defined as:

> h' :: SimpleFunc Int Int
> h' = proc x -> do
>       fx <- f -< x
>       gx <- g -< x
>       returnA -< (fx + gx)
>
> hOutput' :: Int
> hOutput' = runF h' 8

Kleisli arrow values

Let's move on to something a little fancier now: Kleisli arrows.

A Kleisli arrow type (Kleisli m a b) corresponds to the type (a -> m b), where m is a monadic type. It's defined in Control.Arrows similarly to our SimpleFunc:

newtype Kleisli m a b = Kleisli {
  runKleisli :: (a -> m b) 
}

It comes complete with its own definitions for arr, (>>>), first, and second. This means that all multi-value functions (i.e of type a -> [b]) are already defined as Kleisli arrows (because the list type [] is monadic)! (>>>) performs composition, keeping track of all the multiple results. split, (&&&) and (***) are all defined as before. For example:

> plusminus, double, h2 :: Kleisli [] Int Int
> plusminus = Kleisli (\x -> [x, -x])
> double    = arr (* 2)
> h2        = liftA2 (+) plusminus double 
>
> h2Output :: [Int]
> h2Output = runKleisli h2 8

A Teaser

Finally, here is a little teaser. There is an arrow function called returnA which returns an identity arrow. There is an ArrowPlus class that includes a zeroArrow (which for the list type is an arrow value that always returns the empty list) and a (<+>) operator (which takes the results from two arrow values and concatenates them). We can build up some pretty interesting string transformations (multi-valued functions of type String -> [String]) using Kleisli arrow values:

> main :: IO ()
> main = do
>    let
>        prepend x = arr (x ++)
>        append  x = arr (++ x)
>        withId  t = returnA <+> t
>        xform = (withId $ prepend "<") >>>
>                (withId $ append ">") >>>
>                (withId $ ((prepend "!") >>> (append "!")))
>        xs = ["test", "foobar"] >>= (runKleisli xform)
>    mapM_ putStrLn xs

An important observation here is that

   f >>> g

is a multi-valued composition (g . f), and

(withId f) >>> (withId g)
= (returnA <+> f) >>> (returnA <+> g)
= ((arr id) <+> f) >>> ((arr id) <+> g)

which, when applied to an input x, returns all values:

((id . id) x) ++ ((id . f) x) ++ ((id . g) x) ++ ((g . f) x)
= x ++ (f x) ++ (g x) ++ ((g . f) x)

which are all permutations of using the arrow values f and g.

Tutorial Meta

The wiki file source is literate Haskell. Save the source in a file called ArrowFun.lhs to compile it (or run in GHCi).

The code is adapted to GHC 6.10.1; use [1] for older versions of GHC and other Haskell implementations.

  • Original version - Nov 19, 2006, Tim Newsham.