Difference between revisions of "Performance/Monads"
Derek Elkins (talk | contribs) m |
(→Use Continuation Passing Style) |
||
Line 44: | Line 44: | ||
type MCPS a = forall r. Cont (M r) a |
type MCPS a = forall r. Cont (M r) a |
||
− | runMCPS m = runCont m |
+ | runMCPS m = runCont m Just |
</haskell> |
</haskell> |
||
Line 88: | Line 88: | ||
Note that this code is just normal CPS code and completely independent of the Maybe monad. There are no case analyses. So, "failure free" code will run as normal (CPS) code. |
Note that this code is just normal CPS code and completely independent of the Maybe monad. There are no case analyses. So, "failure free" code will run as normal (CPS) code. |
||
+ | |||
+ | How and why does this work? We're basically specializing <hask>>>=</hask>. Before, we had |
||
+ | <haskell> |
||
+ | m >>= f = case m of |
||
+ | Just a -> f a |
||
+ | Nothing -> Nothing |
||
+ | </haskell> |
||
+ | i.e. the monadic bind does a case analysis on how to proceed. In the CPS-representation, we're specializing bind to the two possible constructors: |
||
+ | <haskell> |
||
+ | return' a := (return a >>=) = (Just a >>=) = \f -> f a |
||
+ | mzero' := (mzero >>=) = (Nothing >>=) = \f -> Nothing |
||
+ | </haskell> |
||
+ | and the case analysis is now "built-in" into <hask>return'</hask> and <hask>mzero'</hask>. In general, embedding a monad in <hask>Cont</hask> specializes <hask>>>=</hask> to its primitive operations, like for example <hask>mzero</hask> or <hask>get</hask>. This is close to specifying the operational semantics of <hask>>>=</hask> directly and can hence be used to implement the monad in the first place, too. |
||
Now we need to implement the operations. |
Now we need to implement the operations. |
Revision as of 11:30, 31 October 2007
Haskell Performance Resource
Constructs: Techniques: |
Unroll your MTL stacks
MTL is an excellent library for programming with monads. However stacked monad transformers do not inline well and the library is in need of an optimization pass. As a result, it can often impose a performance hit of up to 300% (your code will run up to three times slower).
If you care about this, the best option is to flatten you stack of transformers into a single, hand unrolled monad. An extreme example follows.
This is a typical MTL monad stack
newtype DRM a = DRM {unDRM:: ErrorT Finish (RWS () DNA RNA) a} deriving (MonadState DNA, MonadWriter RNA, MonadError Finish, Monad)
We can unroll it as follows:
type DRM = DRMonad Finish DNA RNA
newtype DRMonad e s w a = DRMonad {runDRMonad :: s -> (Either e a,s,w)}
instance (Monoid m, Error e) => Monad (DRMonad e s w) where
return x = DRMonad(\s -> (Right x, s, mempty))
(>>= = bindDRMonad
fail _ = DRMonad (\s->(Left e,s,mempty))
{-# INLINE bindDRMonad #-}
{-# INLINE bindDRMonad2 #-}
bindDRMonad :: Monoid m => DRMonad a e s w -> (a -> DRMonad b e s w) -> DRMonad b e s w
bindDRMonad m f = DRMonad$ \s -> case runDRMonad m s of
(x',s',w) ->
bindDRMonad2 x' (s',w,f)
bindDRMonad2 x' (s',w, f) = case x' of
Left e -> (Left e, s', w)
Right r -> case runDRMonad (f r) s' of
(x'',s'',w') ->
(x'', s'', w `mappend` w')
After this, you will also want to add the instances for MonadState, MonadWriter, etc.
Use Continuation Passing Style
It is well known that every monad can be "embedded" into the continuation passing monad, Cont. All that is necessary is to make the "answer type" of the Cont monad be the desired monad, e.g.
type MCPS a = forall r. Cont (M r) a
runMCPS m = runCont m Just
Note that M's (>>=) operation isn't used. It comes up when you implement the other operations M supports.
In many cases, this will effectively avoid a layer of interpretation in much of the code using M. To see this, let's look at code that would benefit from using the Maybe monad.
liftMaybe2 :: (a -> b -> Maybe c) -> Maybe a -> Maybe b -> Maybe c
liftMaybe2 f a b =
case a of
Nothing -> Nothing
Just a' -> case b of
Nothing -> Nothing
Just b' -> f a' b'
This code screams for using Maybe as a monad in which case it will look like,
liftMaybe2 f a b = do
a' <- a
b' <- b
f a' b'
-- or just
liftMaybe2 = liftM2
One things to note about the original code is that it must constantly check the returned value even though a failure (Nothing) is most likely a rare occurrence, and further more it's possible that we will need to propagate a Nothing through an arbitrary large amount of code, though a lot of times this won't happen.
Ideally, we want "failure free" code to run as normal and only deal with failure when it occurs and immediately fail in those cases rather than propagating the failure. This is what using continuation passing style gets us.
Explicitly expanding Cont we get,
newtype MaybeCPS a = MaybeCPS { unMaybeCPS :: forall r. (a -> Maybe r) -> Maybe r }
runMaybeCPS m = unMaybeCPS m return
-- The Cont definitions of return and (>>=)
instance Monad MaybeCPS where
return a = MaybeCPS (\k -> k a)
MaybeCPS m >>= f = MaybeCPS (\k -> m (\a -> unMaybeCPS (f a) k))
Note that this code is just normal CPS code and completely independent of the Maybe monad. There are no case analyses. So, "failure free" code will run as normal (CPS) code.
How and why does this work? We're basically specializing >>=
. Before, we had
m >>= f = case m of
Just a -> f a
Nothing -> Nothing
i.e. the monadic bind does a case analysis on how to proceed. In the CPS-representation, we're specializing bind to the two possible constructors:
return' a := (return a >>=) = (Just a >>=) = \f -> f a
mzero' := (mzero >>=) = (Nothing >>=) = \f -> Nothing
and the case analysis is now "built-in" into return'
and mzero'
. In general, embedding a monad in Cont
specializes >>=
to its primitive operations, like for example mzero
or get
. This is close to specifying the operational semantics of >>=
directly and can hence be used to implement the monad in the first place, too.
Now we need to implement the operations.
instance MonadPlus MaybeCPS where
mzero = MaybeCPS (\_ -> Nothing) -- equivalent to MaybeCPS (Nothing >>=)
m `mplus` n = case runMaybeCPS m of
Nothing -> n
Just a -> return a
There are two things to note about this code. mplus is where we use the case analysis. mplus is the only place that we look at what was returned and to do it we have to actually run the computation. This means that we only deal with "effects" when we need to. Further, mzero discards its continuation. This is the typical pattern for aborting a computation in CPS and will lead to an immediate termination of the computation.
MaybeCPS should be faster than using Maybe in most cases and should be almost a drop in replacement for Maybe. Usually, using a CPS implementation would be a drop in replacement, but Maybe is not an abstract data type. Anyway, some results for a more complicated example are here.