# Talk:MapReduce as a monad

### From HaskellWiki

Julianporter (Talk | contribs) (→Monads) |
(→Monads: details matter) |
||

Line 4: | Line 4: | ||

Because the key point is that both Map and Reduce can be seen as monadic functions, and so then MapReduce is just a matter of repeated bind operations. Think of it as a generalised State monad. [[User:Julianporter|julianporter]] 21:16, 2 April 2011 (UTC) | Because the key point is that both Map and Reduce can be seen as monadic functions, and so then MapReduce is just a matter of repeated bind operations. Think of it as a generalised State monad. [[User:Julianporter|julianporter]] 21:16, 2 April 2011 (UTC) | ||

+ | |||

+ | Having read this article, and having had some experience with Hadoop, I wonder... | ||

+ | First, these parameterized monads that you use, is there a categorical interpretation of these? Sure one can talk about bifunctors... but is this the case here? | ||

+ | Second, apart from partitioning, which seems to be crucial for any mapreduce, and which is kind of hard to formalize categorically (we have to, though), the big difference from monoids is that while in a monoid we do have a neutral element, and expect an empty dataset to produce that neutral element (see, e.g. the "two-page explanation" by Fokkinga), I honestly do not see how one can do anything like that if you deal with monads. E.g. in your word count, I could not see how it will produce 0 on an empty dataset - will it? It is a big issue, I'm afraid. | ||

+ | [[User:Vlad Patryshev|Vlad Patryshev]] 05:43, 26 December 2011 (UTC) |

## Latest revision as of 05:43, 26 December 2011

## [edit] Monads

Why is it mapreduce as a *monad*? Map just requires Functor, and reduce sounds like `mappend`, so it'd just be MapReduce as a monoid. --Gwern 20:31, 2 April 2011 (UTC)

Because the key point is that both Map and Reduce can be seen as monadic functions, and so then MapReduce is just a matter of repeated bind operations. Think of it as a generalised State monad. julianporter 21:16, 2 April 2011 (UTC)

Having read this article, and having had some experience with Hadoop, I wonder... First, these parameterized monads that you use, is there a categorical interpretation of these? Sure one can talk about bifunctors... but is this the case here? Second, apart from partitioning, which seems to be crucial for any mapreduce, and which is kind of hard to formalize categorically (we have to, though), the big difference from monoids is that while in a monoid we do have a neutral element, and expect an empty dataset to produce that neutral element (see, e.g. the "two-page explanation" by Fokkinga), I honestly do not see how one can do anything like that if you deal with monads. E.g. in your word count, I could not see how it will produce 0 on an empty dataset - will it? It is a big issue, I'm afraid. Vlad Patryshev 05:43, 26 December 2011 (UTC)