Difference between revisions of "Memoization"
MichalPalka (talk  contribs) (→See also: Add draft by Daniel Brown and William R. Cook) 
(→Memoizing fix point operator: rewrite introducing a more modular approach and a bit more explaination) 

(18 intermediate revisions by 12 users not shown)  
Line 8:  Line 8:  
We don't go into the details of this case. 
We don't go into the details of this case. 

If you want a general solution for several types, 
If you want a general solution for several types, 

−  you need a type class, say < 
+  you need a type class, say <code>Memoizable</code>. 
<haskell> 
<haskell> 

memoize :: Memoizable a => (a>b) > (a>b) 
memoize :: Memoizable a => (a>b) > (a>b) 

Line 14:  Line 14:  
Now, how to implement something like this? Of course, one needs a finite 
Now, how to implement something like this? Of course, one needs a finite 

−  map that stores values < 
+  map that stores values <code>b</code> for keys of type <code>a</code>. 
−  It turns out that such a map can be constructed recursively based on the structure of < 
+  It turns out that such a map can be constructed recursively based on the structure of <code>a</code>: 
<haskell> 
<haskell> 

Map () b := b 
Map () b := b 

Line 22:  Line 22:  
</haskell> 
</haskell> 

−  Here, 
+  Here, <code>Map a b</code> is the type of a finite map from keys <code>a</code> to values <code>b</code>. 
Its construction is based on the following laws for functions 
Its construction is based on the following laws for functions 

<haskell> 
<haskell> 

() > b =~= b 
() > b =~= b 

−  (a + a') > b =~= (a > b) 
+  (a + a') > b =~= (a > b) × (a' > b)  = case analysis 
−  (a 
+  (a × a') > b =~= a > (a' > b)  = currying 
</haskell> 
</haskell> 

For further and detailed explanations, see 
For further and detailed explanations, see 

−  * 
+  * Ralf Hinze: [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.43.3272 Memo functions, polytypically!] 
−  * 
+  * Ralf Hinze: [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.8.4069 Generalizing generalized tries] 
+  * Conal Elliott: [http://conal.net/blog/posts/elegantmemoizationwithfunctionalmemotries/ Elegant memoization with functional memo tries] and other [http://conal.net/blog/tag/memoization/ posts on memoization]. 

+  * Conal Elliott [http://conal.net/papers/typeclassmorphisms/ Denotational design with type class morphisms], section 9 (Memo tries). 

== Memoization with recursion == 
== Memoization with recursion == 

Line 42:  Line 42:  
The naive implementation of Fibonacci numbers without memoization is horribly slow. 
The naive implementation of Fibonacci numbers without memoization is horribly slow. 

−  Try < 
+  Try <code>slow_fib 30</code>, not too much higher than that and it hangs. 
<haskell> 
<haskell> 

slow_fib :: Int > Integer 
slow_fib :: Int > Integer 

Line 51:  Line 51:  
The memoized version is much faster. 
The memoized version is much faster. 

−  Try < 
+  Try <code>memoized_fib 10000</code>. 
<haskell> 
<haskell> 

memoized_fib :: Int > Integer 
memoized_fib :: Int > Integer 

−  memoized_fib = 
+  memoized_fib = (map fib [0 ..] !!) 
−  +  where fib 0 = 0 

−  fib 1 = 1 
+  fib 1 = 1 
−  fib n = memoized_fib (n2) + memoized_fib (n1) 
+  fib n = memoized_fib (n2) + memoized_fib (n1) 
−  in (map fib [0 ..] !!) 

</haskell> 
</haskell> 

Line 64:  Line 64:  
=== Memoizing fix point operator === 
=== Memoizing fix point operator === 

−  You can factor out the memoizing trick to a function, the memoizing fix point operator, 

+  We can divide the functionality of <code>memoized_fib</code> into the actual calculation and the memoization. 

−  which we will call <hask>memoFix</hask> here. 

+  To recombine the functionality, we will use ordinary function composition <code>(.)</code> and the fix point operator <code>fix</code> (cf. <code>Data.Function</code>): 

+  
+  <haskell> 

+  (.) :: (b > c) > (a > b) > a > c 

+  (.) f g = \ x > f (g x) 

+  
+  fix :: (a > a) > a 

+  fix f = let x = f x in x 

+  </haskell> 

+  
+  For the calculation, we define the function <code>fib</code>, such that <code>fix fib :: Int > Integer</code> computes the fibonacci numbers. 

<haskell> 
<haskell> 

Line 71:  Line 81:  
fib f 0 = 1 
fib f 0 = 1 

fib f 1 = 1 
fib f 1 = 1 

−  fib f n = f (n1) + f (n2) 
+  fib f n = f (n  1) + f (n  2) 
+  </haskell> 

−  fibonacci :: Int > Integer 

+  Note, that the recursion is factored out to <code>fix</code>. 

−  fibonacci = memoFix fib 

+  For the memoization, we write the function <code>memoize</code>. 

+  We could also use the function <code>memoizeInt</code> from the next section for this purpose. 

+  <haskell> 

+  memoize :: (Int > a) > (Int > a) 

+  memoize f = (map f [0 ..] !!) 

</haskell> 
</haskell> 

−  I suppose if you want to "put it in a library", 

+  Now, we have all the tools we need to build a memoizing fibonacci function. 

−  you should just put <hask>fib</hask> in, 

−  and allow the user to call <hask>memoFix fib</hask> to make a new version when necessary. 

−  This allows the user e.g. to define the data structure used for memoization. 

−  The memoising fixpoint operator works 

+  <haskell> 

−  by putting the result of the first call of the function 

+  fibMemo :: Int > Integer 

−  for each natural number into a data structure and 

+  fibMemo = fix (memoize . fib) 

−  using that value for subsequent calls ;) 

+  </haskell> 

+  
+  Let us replace <code>fix</code> by its definition, to see, why this works. 

−  In general it is 

<haskell> 
<haskell> 

−  memoFix :: ((a > b) > (a > b)) > a > b 

+  fibMemo = fix (memoize . fib) 

−  memoFix f = 

+  = let x = (memoize . fib) x in x 

−  +  = (memoize . fib) fibMemo 

+  = memoize (fib fibMemo) 

</haskell> 
</haskell> 

+  
+  Assuming, <code>fibMemo</code> does indeed calculate the fibonacci numbers, it is a perfect candidate to be plugged into <code>fib</code>. 

+  So, <code>fib fibMemo :: Int > Integer</code> also calculates the fibonacci numbers, but the first calculation step is done by <code>fib</code> instead of <code>fibMemo</code> itself. 

+  Additionally, <code>memoize</code> is applied to introduce memoization. 

+  Let us look at a small example to make the functionality more clear. 

+  
+  <haskell> 

+  fibMemo 3 

+  = memoize (fib fibMemo) 3 

+  = map (fib fibMemo) [0 ..] !! 3 

+   Note: The following step does not exactly follow lazy evaluation. 

+  = fib fibMemo 0 : fib fibMemo 1 : fib fibMemo 2 : fib fibMemo 3 : map (fib fibMemo) [4 ..] !! 3 

+  = fib fibMemo 3 

+  = fibMemo 2 + fibMemo 1 

+  = fib fibMemo 2 + fibMemo 1 

+  = (fibMemo 1 + fibMemo 0) + fibMemo 1 

+   Note: Because of the memoization, both “fibMemo 1” terms refer to the same thunk, 

+   so it will only be evaluated once. 

+  = (fib fibMemo 1 + fibMemo 0) + fibMemo 1 

+  = (1 + fibMemo 0) + fibMemo 1 

+  = (1 + fib fibMemo 0) + fibMemo 1 

+  = (1 + 0) + fibMemo 1 

+  = 1 + fibMemo 1 

+   Remember: “fibMemo 1” was already evaluated, so we can directly replace it by its value. 

+  = 1 + 1 

+  = 2 

+  </haskell> 

+  
+  As we can see, the whole calculation boils down to the definition of <code>fib</code>, while <code>memoize</code> introduces sharing of thunks. 

== Efficient tree data structure for maps from Int to somewhere == 
== Efficient tree data structure for maps from Int to somewhere == 

−  Here we present a special tree data type 
+  Here we present a special tree data type 
+  ({{HackagePackageid=datainttrie}}) 

+  which is useful as memoizing data structure e.g. for the Fibonacci function. 

<haskell> 
<haskell> 

memoizeInt :: (Int > a) > (Int > a) 
memoizeInt :: (Int > a) > (Int > a) 

Line 159:  Line 200:  
'''Note: This is migrated from the old wiki.''' 
'''Note: This is migrated from the old wiki.''' 

−  Memoising constructor functions gives you HashConsing, and you can sometimes use MemoisingCafs to implement that. 
+  Memoising constructor functions gives you HashConsing, and you can sometimes use MemoisingCafs ([[constant applicative form]]s) to implement that. 
The MemoisingCafs idiom also supports recursion. 
The MemoisingCafs idiom also supports recursion. 

Line 169:  Line 210:  
wonderous 1 = 0 
wonderous 1 = 0 

wonderous x 
wonderous x 

−   
+   even x = 1 + wonderous (x `div` 2) 
−   otherwise 
+   otherwise = 1 + wonderous (3*x+1) 
</haskell> 
</haskell> 

This function is not at all understood by mathematicians and has a surprisingly complex recursion pattern, so if you need to call it many times with different values, optimising it would not be easy. 
This function is not at all understood by mathematicians and has a surprisingly complex recursion pattern, so if you need to call it many times with different values, optimising it would not be easy. 

Line 187:  Line 228:  
wonderous2' 1 = 0 
wonderous2' 1 = 0 

wonderous2' x 
wonderous2' x 

−   
+   even x = 1 + wonderous2 (x `div` 2) 
−   otherwise 
+   otherwise = 1 + wonderous2' (3*x+1) 
</haskell> 
</haskell> 

When using this pattern in your own code, note carefully when to call the memoised version (wonderous2 in the above example) and when not to. In general, the partially memoised version (wonderous2' in the above example) should call the memoised version if it needs to perform a recursive call. However, in this instance, we only memoize for small values of x, so the branch of the recursion that passes a larger argument need not bother checking the memo table. (This does slow the array initialization, however.) 
When using this pattern in your own code, note carefully when to call the memoised version (wonderous2 in the above example) and when not to. In general, the partially memoised version (wonderous2' in the above example) should call the memoised version if it needs to perform a recursive call. However, in this instance, we only memoize for small values of x, so the branch of the recursion that passes a larger argument need not bother checking the memo table. (This does slow the array initialization, however.) 

−  Thanks to [[ 
+  Thanks to [[lazy evaluation]], we can even memoise an infinite domain, though we lose constant time lookup. This data structure is O(log N): 
<haskell> 
<haskell> 

Line 229:  Line 270:  
wonderous3' 1 = 0 
wonderous3' 1 = 0 

wonderous3' x 
wonderous3' x 

−   
+   even x = 1 + wonderous3 (x `div` 2) 
−   otherwise 
+   otherwise = 1 + wonderous3 (3*x+1) 
</haskell> 
</haskell> 

Line 236:  Line 277:  
 [[AndrewBromage]] 
 [[AndrewBromage]] 

+  
+  == Memoizing polymorphic functions == 

+  
+  What about memoizing polymorphic functions defined with polymorphic recursion? 

+  How can such functions be memoized? 

+  The caching data structures used in memoization typically handle only one type of argument at a time. 

+  For instance, one can have finite maps of differing types, but each concrete finite map holds just one type of key and one type of value. 

+  
+  See the discussion on ''Memoizing polymorphic functions'', [http://conal.net/blog/posts/memoizingpolymorphicfunctionspartone/ part one] and [http://conal.net/blog/posts/memoizingpolymorphicfunctionsparttwo/ part two], as well as [http://conal.net/blog/posts/memoizingpolymorphicfunctionsviaunmemoization/ ''Memoizing polymorphic functions via unmemoization'']. 

== See also == 
== See also == 

−  * [http://www.haskell.org/pipermail/haskellcafe/2007February/ 
+  * [http://www.haskell.org/pipermail/haskellcafe/2007February/021288.html HaskellCafe "speeding up fibonacci with memoizing"] 
−  * [http://www.haskell.org/pipermail/haskellcafe/2007May/ 
+  * [http://www.haskell.org/pipermail/haskellcafe/2007May/024689.html HaskellCafe about memoization utility function] 
−  * [http://www.haskell.org/pipermail/haskellcafe/2007February/ 
+  * [http://www.haskell.org/pipermail/haskellcafe/2007February/021563.html HaskellCafe "memoisation"] 
−  * [http://www.haskell.org/pipermail/haskellcafe/2005October/ 
+  * [http://www.haskell.org/pipermail/haskellcafe/2005October/010287.html HaskellCafe about Memoization and Data.Map] 
* http://programming.reddit.com/info/16ofr/comments 
* http://programming.reddit.com/info/16ofr/comments 

* [http://www.cs.utexas.edu/~wcook/Drafts/2006/MemoMixins.pdf Monadic Memoization Mixins] by Daniel Brown and William R. Cook 
* [http://www.cs.utexas.edu/~wcook/Drafts/2006/MemoMixins.pdf Monadic Memoization Mixins] by Daniel Brown and William R. Cook 

+  * [http://hackage.haskell.org/cgibin/hackagescripts/package/datamemocombinators datamemocombinators: Combinators for building memo tables.] 

+  * [http://hackage.haskell.org/cgibin/hackagescripts/package/MemoTrie MemoTrie: Triebased memo functions] 

+  * [http://hackage.haskell.org/package/monadmemo monadmemo: memoization monad transformer] 

+  * [http://hackage.haskell.org/package/memoize memoize: uses Template Haskell to derive memoization code] 

+  * [http://hackage.haskell.org/package/arraymemoize arraymemoize: memoize finite (and/or discrete) subdomains of a function using arrays] 
Latest revision as of 16:05, 28 April 2014
Memoization is a technique for storing values of a function instead of recomputing them each time the function is called.
Contents
Memoization without recursion
You can just write a memoization function using a data structure that is suitable for your application.
We don't go into the details of this case.
If you want a general solution for several types,
you need a type class, say Memoizable
.
memoize :: Memoizable a => (a>b) > (a>b)
Now, how to implement something like this? Of course, one needs a finite
map that stores values b
for keys of type a
.
It turns out that such a map can be constructed recursively based on the structure of a
:
Map () b := b
Map (Either a a') b := (Map a b, Map a' b)
Map (a,a') b := Map a (Map a' b)
Here, Map a b
is the type of a finite map from keys a
to values b
.
Its construction is based on the following laws for functions
() > b =~= b
(a + a') > b =~= (a > b) × (a' > b)  = case analysis
(a × a') > b =~= a > (a' > b)  = currying
For further and detailed explanations, see
 Ralf Hinze: Memo functions, polytypically!
 Ralf Hinze: Generalizing generalized tries
 Conal Elliott: Elegant memoization with functional memo tries and other posts on memoization.
 Conal Elliott Denotational design with type class morphisms, section 9 (Memo tries).
Memoization with recursion
Things become more complicated if the function is recursively defined and it should use memoized calls to itself. A classic example is the recursive computation of Fibonacci numbers.
The naive implementation of Fibonacci numbers without memoization is horribly slow.
Try slow_fib 30
, not too much higher than that and it hangs.
slow_fib :: Int > Integer
slow_fib 0 = 0
slow_fib 1 = 1
slow_fib n = slow_fib (n2) + slow_fib (n1)
The memoized version is much faster.
Try memoized_fib 10000
.
memoized_fib :: Int > Integer
memoized_fib = (map fib [0 ..] !!)
where fib 0 = 0
fib 1 = 1
fib n = memoized_fib (n2) + memoized_fib (n1)
Memoizing fix point operator
We can divide the functionality of memoized_fib
into the actual calculation and the memoization.
To recombine the functionality, we will use ordinary function composition (.)
and the fix point operator fix
(cf. Data.Function
):
(.) :: (b > c) > (a > b) > a > c
(.) f g = \ x > f (g x)
fix :: (a > a) > a
fix f = let x = f x in x
For the calculation, we define the function fib
, such that fix fib :: Int > Integer
computes the fibonacci numbers.
fib :: (Int > Integer) > Int > Integer
fib f 0 = 1
fib f 1 = 1
fib f n = f (n  1) + f (n  2)
Note, that the recursion is factored out to fix
.
For the memoization, we write the function memoize
.
We could also use the function memoizeInt
from the next section for this purpose.
memoize :: (Int > a) > (Int > a)
memoize f = (map f [0 ..] !!)
Now, we have all the tools we need to build a memoizing fibonacci function.
fibMemo :: Int > Integer
fibMemo = fix (memoize . fib)
Let us replace fix
by its definition, to see, why this works.
fibMemo = fix (memoize . fib)
= let x = (memoize . fib) x in x
= (memoize . fib) fibMemo
= memoize (fib fibMemo)
Assuming, fibMemo
does indeed calculate the fibonacci numbers, it is a perfect candidate to be plugged into fib
.
So, fib fibMemo :: Int > Integer
also calculates the fibonacci numbers, but the first calculation step is done by fib
instead of fibMemo
itself.
Additionally, memoize
is applied to introduce memoization.
Let us look at a small example to make the functionality more clear.
fibMemo 3
= memoize (fib fibMemo) 3
= map (fib fibMemo) [0 ..] !! 3
 Note: The following step does not exactly follow lazy evaluation.
= fib fibMemo 0 : fib fibMemo 1 : fib fibMemo 2 : fib fibMemo 3 : map (fib fibMemo) [4 ..] !! 3
= fib fibMemo 3
= fibMemo 2 + fibMemo 1
= fib fibMemo 2 + fibMemo 1
= (fibMemo 1 + fibMemo 0) + fibMemo 1
 Note: Because of the memoization, both “fibMemo 1” terms refer to the same thunk,
 so it will only be evaluated once.
= (fib fibMemo 1 + fibMemo 0) + fibMemo 1
= (1 + fibMemo 0) + fibMemo 1
= (1 + fib fibMemo 0) + fibMemo 1
= (1 + 0) + fibMemo 1
= 1 + fibMemo 1
 Remember: “fibMemo 1” was already evaluated, so we can directly replace it by its value.
= 1 + 1
= 2
As we can see, the whole calculation boils down to the definition of fib
, while memoize
introduces sharing of thunks.
Efficient tree data structure for maps from Int to somewhere
Here we present a special tree data type (datainttrie) which is useful as memoizing data structure e.g. for the Fibonacci function.
memoizeInt :: (Int > a) > (Int > a)
memoizeInt f = (fmap f (naturals 1 0) !!!)
A data structure with a node corresponding to each natural number to use as a memo.
data NaturalTree a = Node a (NaturalTree a) (NaturalTree a)
Map the nodes to the naturals in this order:
0
1 2
3 5 4 6
7 ...
Look up the node for a particular number
Node a tl tr !!! 0 = a
Node a tl tr !!! n =
if odd n
then tl !!! top
else tr !!! (top1)
where top = n `div` 2
We surely want to be able to map on these things...
instance Functor NaturalTree where
fmap f (Node a tl tr) = Node (f a) (fmap f tl) (fmap f tr)
If only so that we can write cute,
but inefficient things like the below,
which is just a NaturalTree
such that naturals!!!n == n
:
naturals = Node 0 (fmap ((+1).(*2)) naturals) (fmap ((*2).(+1)) naturals)
The following is probably more efficient
(and, having arguments won't hang around at top level, I think)
 have I put more $!
s than necessary?
naturals r n =
Node n
((naturals $! r2) $! (n+r))
((naturals $! r2) $! (n+r2))
where r2 = 2*r
Memoising CAFS
Note: This is migrated from the old wiki.
Memoising constructor functions gives you HashConsing, and you can sometimes use MemoisingCafs (constant applicative forms) to implement that.
The MemoisingCafs idiom also supports recursion.
Consider, for example:
wonderous :: Integer > Integer
wonderous 1 = 0
wonderous x
 even x = 1 + wonderous (x `div` 2)
 otherwise = 1 + wonderous (3*x+1)
This function is not at all understood by mathematicians and has a surprisingly complex recursion pattern, so if you need to call it many times with different values, optimising it would not be easy.
However, we can memoise some of the domain using an array CAF:
wonderous2 :: Integer > Integer
wonderous2 x
 x <= maxMemo = memoArray ! x
 otherwise = wonderous2' x
where
maxMemo = 100
memoArray = array (1,maxMemo)
[ (x, wonderous2' x)  x < [1..maxMemo] ]
wonderous2' 1 = 0
wonderous2' x
 even x = 1 + wonderous2 (x `div` 2)
 otherwise = 1 + wonderous2' (3*x+1)
When using this pattern in your own code, note carefully when to call the memoised version (wonderous2 in the above example) and when not to. In general, the partially memoised version (wonderous2' in the above example) should call the memoised version if it needs to perform a recursive call. However, in this instance, we only memoize for small values of x, so the branch of the recursion that passes a larger argument need not bother checking the memo table. (This does slow the array initialization, however.) Thanks to lazy evaluation, we can even memoise an infinite domain, though we lose constant time lookup. This data structure is O(log N):
type MemoTable a = [(Integer, BinTree a)]
data BinTree a = Leaf a  Node Integer (BinTree a) (BinTree a)
wonderous3 :: Integer > Integer
wonderous3 x
= searchMemoTable x memoTable
where
memoTable :: MemoTable Integer
memoTable = buildMemoTable 1 5
buildMemoTable n i
= (nextn, buildMemoTable' n i) : buildMemoTable nextn (i+1)
where
nextn = n + 2^i
buildMemoTable' base 0
= Leaf (wonderous3' base)
buildMemoTable' base i
= Node (base + midSize)
(buildMemoTable' base (i1))
(buildMemoTable' (base + midSize) (i1))
where
midSize = 2 ^ (i1)
searchMemoTable x ((x',tree):ms)
 x < x' = searchMemoTree x tree
 otherwise = searchMemoTable x ms
searchMemoTree x (Leaf y) = y
searchMemoTree x (Node mid l r)
 x < mid = searchMemoTree x l
 otherwise = searchMemoTree x r
wonderous3' 1 = 0
wonderous3' x
 even x = 1 + wonderous3 (x `div` 2)
 otherwise = 1 + wonderous3 (3*x+1)
Naturally, these techniques can be combined, say, by using a fast CAF data structure for the most common part of the domain and an infinite CAF data structure for the rest.
Memoizing polymorphic functions
What about memoizing polymorphic functions defined with polymorphic recursion? How can such functions be memoized? The caching data structures used in memoization typically handle only one type of argument at a time. For instance, one can have finite maps of differing types, but each concrete finite map holds just one type of key and one type of value.
See the discussion on Memoizing polymorphic functions, part one and part two, as well as Memoizing polymorphic functions via unmemoization.
See also
 HaskellCafe "speeding up fibonacci with memoizing"
 HaskellCafe about memoization utility function
 HaskellCafe "memoisation"
 HaskellCafe about Memoization and Data.Map
 http://programming.reddit.com/info/16ofr/comments
 Monadic Memoization Mixins by Daniel Brown and William R. Cook
 datamemocombinators: Combinators for building memo tables.
 MemoTrie: Triebased memo functions
 monadmemo: memoization monad transformer
 memoize: uses Template Haskell to derive memoization code
 arraymemoize: memoize finite (and/or discrete) subdomains of a function using arrays