Talk:SantaClausProblemV2

Latest revision as of 09:39, 12 January 2009

Beautiful concurrency

NOTE: this paper is now finished. It's available at http://research.microsoft.com/~simonpj/papers/stm, and there's a new talk page (Talk:SantaClausProblem) for any post-publication comments.


I am writing a chapter for a book called "Beautiful code", edited by Greg Wilson. The chapter is a tutorial about Software Transactional Memory in Haskell.

After a lot of useful feedback, I have now completed Version 2 of the paper. The Haskell code is also available.

I would welcome any comments or questions you have on the paper, or constructive suggestions for improving it; the more concrete the better. You can see what people said about version 1 on the version 1 talk page: Talk:SantaClausProblem.

The book is aimed at a general audience of programmers, not Haskell geeks, so I have tried to explain everything necessary as I go along. So if you are not a Haskell expert, your input would be particularly valuable to me.

You can email me directly (simonpj@microsoft.com), or add Wiki talk notes below.

If you give your real name somewhere in your text (or email it to me), I'll add you to the acknowledgements at the end of the chapter. Notably, I'd like to acknowledge Steven807, Mvanier, Fernando, Gknauth, EngineerScotty, BoAdler, Schmitz, Genneth, Pitarou, Asilovy.


Simonpj 14:26, 22 December 2006 (UTC) To add a note, begin with four tilde signs ~~~~; the Wiki will fill in your user name and date.

Brecknell 13:38, 11 January 2007 (UTC) In Figure 1, retry should be STM a, not STM (). (SLPJ: done)
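
For readers who want to see why the type matters: retry's result type is polymorphic, so it can stand in for a transaction of any result type. A minimal sketch using the standard Control.Concurrent.STM interface (readPositive is a made-up name for illustration):

import Control.Concurrent.STM

-- retry :: STM a, so it can sit in a branch whose other arm returns an Int
readPositive :: TVar Int -> STM Int
readPositive tv = do
  v <- readTVar tv
  if v > 0 then return v else retry

main :: IO ()
main = do
  tv <- newTVarIO 42
  atomically (readPositive tv) >>= print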

Sylvain 13:45, 11 January 2007 (UTC) small points: since you go through explaining that the do-notation is overloaded and works for STM as well as IO (page 8), you might want to say the same for return on page 11 (return being given the type a -> IO a on page 5). (SLPJ: done) There is also a double "Here is the code" at the bottom of page 17. (SLPJ: done)

Brecknell 13:59, 11 January 2007 (UTC) The rewrite of the paragraph at top of page 17 has left behind a fragment that I think just needs to be deleted: "just runs each of the actions in sequence".(SLPJ: done)

Brecknell 14:15, 11 January 2007 (UTC) Last sentence of page 18: "Here is another way approach that..." Delete either "way" or "approach". (SLPJ: done)

ArthurVanLeeuwen 14:27, 11 January 2007 (UTC) (SLPJ: done) minor stuff, really:

  • page 2, middle of page: syncrhonized -> synchronized
  • page 5, top of page: For example, here are two Haskell functions -> For example, here are the type signatures for two Haskell functions (the functions themselves may be primitives, after all)
  • page 5, second paragraph: do the parentheses around hPutStr serve more than to confuse? Especially given their absence for hEchoLine...
  • page 5, footnote 3: comma after effects
  • page 6: make discarding the return value of forkIO in the second example main more explicit, e.g. with a footnote
  • page 7, after Atomicity: This ensured -> This ensures

Brecknell 14:32, 11 January 2007 (UTC) I like the way you build up to the definition of "choose" in this new revision. I think this helps to convey how STM and actions as first-class values can improve modularity and composability in concurrent applications. So, if you like, you can consider my last comment on the previous revision "done", as well.

Maeder 14:38, 11 January 2007 (UTC) (SLPJ: done)

  • page 11, use subtract (+ -> -) in limitedWithDraw
  • page 22 (2nd line), the the -> the

Genneth 15:07, 11 January 2007 (UTC) p4: make analogy between () and void to help non-haskellers (SLPJ: done)

ArthurVanLeeuwen 15:10, 11 January 2007 (UTC) The bottom of page 17 has 'Here is his code:' in duplicate. (SLPJ: done)

ArthurVanLeeuwen 15:19, 11 January 2007 (UTC) The note 'The code for choose is brief, but a little mind-bending:' on page 19 is very short, for such a leap in required understanding. In just one single block of 7 lines you introduce both the concept of IO actions encapsulated inside STM actions as well as the concept of list comprehension. For a non-haskeller this is quite a lot. (SLPJ: True; but I'm not sure whether to say more, or less! Or just leave it)

Maeder 15:25, 11 January 2007 (UTC) maybe generalize nTimes on page 7 to type: Int -> (Int -> IO ()) -> IO () and use nTimes on page 16 instead of sequence_ and list comprehensions. (The function choose on page 19 is also higher-order.) (SLPJ: I decided to leave it)
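
A minimal sketch of the generalisation being suggested here, assuming the paper's own nTimes has type Int -> IO () -> IO ():

nTimes :: Int -> (Int -> IO ()) -> IO ()
nTimes 0 _   = return ()
nTimes n act = do { act n; nTimes (n-1) act }

main :: IO ()
main = nTimes 3 (\i -> putStrLn ("iteration " ++ show i))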

Malcolm 15:27, 11 January 2007 (UTC) (SLPJ: done)

  • Intro, para 2, "Sadly, parallel program[s] are" - missing plural
  • sec 2.1, para 3, "if two thread[s] call" - missing plural
  • sec 3.2, Atomicity, "This ensured that" -> "This ensures that"

Maeder 15:35, 11 January 2007 (UTC) maybe use putStrLn (hPutStrLn) instead of \n in the strings (SLPJ: actually this does not work well. When concurrent threads do putStrLns, the newlines can be separated form the lines they follow. This doesn't happen with putStr. In principle it could happen with putStr too; GHC makes no guarantees about interleaving. The Right Thing would be to use more locking etc, but I thought that was too big a distraction.)

Malcolm 16:07, 11 January 2007 (UTC) (SLPJ: done)

  • page 11, limitedWithdraw2 in type signature, is called simply withdraw2 in definition
  • sec 4.2 "remaining capactity" -> "remaining capacity" (x3)
  • sec 4.3 "newGate makes [a] new Gate" - missing article
  • top of page 17, orphan (non-)sentence ". just runs each of the actions in sequence."
  • sec 4.4, "futher" -> "further"
  • sec 4.4 "another way approach" -> "another approach"
  • sec 6 "exectued" -> "executed"
  • sec 6 "the the treatment" -> "the treatment"

Jmuk 04:14, 12 January 2007 (UTC) page 13, semicolons in helper1 are out of alignment (SLPJ: done)

Fanf 19:34, 12 January 2007 (UTC) (SLPJ: done) Minor nits:

  • p.2: need semicolon at end of last java statement in each function.
  • p.2: should you comment that the class has a withdraw method similar to the deposit method?
  • p.4 paragraph about locks: you introduce the word "compositional" here - perhaps better to stick with "modular"?
  • bottom of p. 4: should "can" be omitted, or should it read "we must explain"?
  • bottom of p. 7: inconsistent tense: began -> begins.
  • Should the layout of the do notation be more like normal C layout, to avoid freaking out the mundanes?
  • would openGate be a clearer name for operateGate?
  • footnote p. 14: mkGate -> MkGate
  • santa's code: perhaps a comment between two gates to note that the elves or reindeers do their things at that point.
  • limitedWithdraw: is the test (amount > 0) redundant?

Fanf 19:45, 12 January 2007 (UTC) (SLPJ: hmm... I'm re-using the description in earlier papers) Section 4 introduction: The description of the problem reads like a description of an implementation, and I think it should be more story-like. e.g.

Santa spends most of his time asleep. On Christmas Eve, his nine reindeer return from their holidays and wake up Santa. He harnesses each of them to his sleigh, delivers toys with them and finally unharnesses them, allowing them to return to their holidays. All year his ten elves are busy working to make toys, but every so often they will need to consult Santa about toy R&D in his study. Santa will only wake up and talk to three of them at a time, and of course their consultations are a lower priority than delivering toys.

  • p.13 perhaps "sleep" should be "delay" for consistency with the terminology in the code, and to avoid confusion with what santa does - the elves are making toys during the delay, not sleeping. (SLPJ: done)
  • Also, is it worth noting that the reindeer gates model the harnessing and unharnessing? (SLPJ: not sure. Anyone else?)

Justin Bailey 20:35, 12 January 2007 (UTC) Really interesting paper. I've only been exposed to Haskell (through a graduate Programming Languages course) for 4 or 5 months, but I found it very readable. My comments:

  • The introduction showing problems with locks and how atomically can be used in the bank account example is very well done.
  • Development of the gate operations is pretty clear. It would have helped if a short gloss on the connection between the whimsical problem statement and the opening/closing of gates was made. Re-reading, you do cover it in one paragraph on p.13. Sadly, I'd forgotten a lot of that by the time I got to reading about the implementation of Gates and Groups. Maybe make connections throughout that section?
  • I agree with ArthurVanLeeuwen above about the choose function. It's extremely mind-bending. I didn't even notice actions was a list until I saw the post - I just skimmed past that syntax trying to figure out the do portion. I also think the introduction of foldr1 would throw people not familiar with Haskell or functional programming. I still have trouble with it :)
  • p.14: "capacity" is misspelled "capactity" several times. (SLPJ: done)
  • p.17: "1000,000 microseconds" instead of "1,000,000 microseconds".(SLPJ: done)


LuciusGregoryMeredith 02:00, 13 January 2007 (UTC) Simon, lovely paper. Your first paragraph is very compelling. Let me quote it here.

  • "The free lunch is over [8]. We have grown used to the idea that our programs will go faster when we buy a next-generation processor, but that time has passed. While that next-generation chip will have more CPUs, each individual CPU will be no faster than the previous year’s model. If we want our program to run faster, we must learn to write parallel programs [9]."
  • What evidence does this paper offer that using STM will actually get my program to run faster? (SLPJ: none whatsoever. The only claims I'm trying to make are (a) we need to learn to write concurrent programs; and (b) if we want to write parallel programs, then STM is promising. In fact we need to write concurrent programs for all sorts of reasons, and performance is only one of them; I'm just using it as a convenient "way in". A bit sloppy perhaps.) For all I can see in this example, it may be a beautiful abstraction that when used at scale on interesting programs cannot actually take advantage of the multicore architecture. To be convinced, starting from this particularly motivating opening, I would like to see an example that begins with an algorithm that is not parallel, and then be shown a beautiful, STM-based parallel version that is demonstrably faster.
  • It would be particularly compelling to see just how a good implementation of STM for Haskell takes advantage of Intel and/or AMD multicore hardware.
  • It would be even more compelling to see the corresponding lock-based program and how it fares relative to the STM version in terms of performance, usage of the hardware platform as well as program understandability and analysis.
  • Clearly, one of the real rubs is getting from current expressions of algorithms to parallel ones, especially parallel ones that map onto modern architectures. Perhaps your point is that STM helps one start to think and conceptualize computation as a concurrent activity -- which then offers some hope to the ordinary programmer to develop programs that will actually take advantage of today's and tomorrow's architectures. If so, then the paper is an excellent start, but I would very much like to see this point made more explicit and central, especially if you only give some lip service to the argument that STM can actually be made to take advantage of the multicore architectures. In particular, evidence for this sort of point might come in the form of a study of how long it takes an ordinary programmer -- not a Simon P-J -- to develop a beautiful solution like a solution to the Santa Claus problem.
  • Evidence against this sort of point would come in the form of people finding the basic constructs of the solution "mind-bending".

Pitarou 00:51, 15 January 2007 (UTC) In the definition of the function forever, there is the comment -- Repeatedly perform the action, taking a rest each time. I'm not sure what you mean by "taking a rest". (SLPJ: done)
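
For context, the combinator under discussion simply repeats an IO action indefinitely. A minimal sketch of its usual shape (the paper's definition may differ, and Control.Monad also exports a more general forever):

forever :: IO () -> IO ()
forever act = do { act; forever act }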

Asilovy 16:43, 15 January 2007 (UTC) Page 1 first paragraph, last sentence: I'm not a native English speaker, but I wonder if you wouldn't write "If we want our programs..." with an 's', since you use plurals elsewhere. (SLPJ: done)

Asilovy 16:43, 15 January 2007 (UTC) Page 3 last sentence before 2.2: Isn't it "... on there being INsufficient..." rather than "sufficient"? (SLPJ: done)

Asilovy 18:33, 15 January 2007 (UTC) Page 6, last sentence: Shouldn't you write "...being explicit about SIDE effects..." and the same for the last sentence of the next paragraph (beginning of page 7) "This ability to make SIDE effects..." (SLPJ: done)

Asilovy 18:33, 15 January 2007 (UTC) Page 8 second paragraph: I can understand that, in some sense, atomicity and isolation are related (in fact two faces, seen by different threads, of the same problem) but you start by saying the model does not ensure atomicity and explain why, finishing with "...thereby destroying the isolation guarantee". Sounds confusing. (SLPJ: indeed. I've changed it to say "does not ensure isolation...".)

Asilovy 18:33, 15 January 2007 (UTC) Page 10 code of limitedWithdraw: isn't it "writeTVar acc (bal - amount)" with a minus? (SLPJ: done)
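
A sketch of what the corrected code would look like, assuming the check-based guard from the paper's section on blocking (the type Account = TVar Int is an assumption here, and check comes from Control.Concurrent.STM):

import Control.Concurrent.STM

type Account = TVar Int

limitedWithdraw :: Account -> Int -> STM ()
limitedWithdraw acc amount = do
  bal <- readTVar acc
  check (amount <= 0 || amount <= bal)  -- block until the withdrawal is allowed
  writeTVar acc (bal - amount)          -- subtract, as the comment points out

main :: IO ()
main = do
  acc <- atomically (newTVar 100)
  atomically (limitedWithdraw acc 30)
  atomically (readTVar acc) >>= print   -- prints 70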

Asilovy 18:33, 15 January 2007 (UTC) Page 11: name confusion in the code: limitedWithdraw2 in the type signature, but withdraw2 in the definition and the comment (SLPJ: done)

Asilovy 18:33, 15 January 2007 (UTC) Page 15 first line: maybe "The function newGate makes A new Gate..."? (SLPJ: done)

Jeremygibbons 12:47, 22 January 2007 (UTC) "in reality" (p14) - I hate to break it to you, Simon, but I think "in the specification" would be more appropriate here. (SLPJ: ha ha; I think I'll leave it!) Also, on p19, "a atomic" should be "an atomic" and on p21, "sophisiticated" is misspelt. (SLPJ: done)

Norbertk 20:12, 12 February 2007 (UTC) (p4) I am a Haskell Newbie, so I don't understand at first read the different meanings of the '->' in the type signature: it separates the types of the input parameters and it separates the type of the result. Writing this I presume that the last token is always the result, isn't it? Maybe a remark would be in order, saying that Haskell functions always return exactly one value.
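
A hypothetical example of the point being asked about: every '->' except the last separates an argument type, and the type after the last '->' is the single result (the name transfer is made up for illustration):

-- transfer takes two Strings and an Int, and its (single) result is an IO action
transfer :: String -> String -> Int -> IO ()
transfer from to amount =
  putStrLn ("move " ++ show amount ++ " from " ++ from ++ " to " ++ to)

main :: IO ()
main = transfer "savings" "checking" 10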



Below here are comments on version 1

ArthurVanLeeuwen 16:25, 22 December 2006 (UTC) (SLPJ: done) There's a couple of typos in the paper as it is now. More importantly, the footnote on page 2 has (hGetLine h "hello") where it should state (hPutStr h "hello").

Neil Mitchell 16:28, 22 December 2006 (UTC) (SLPJ: done) Sect 1, Para 2. If we want to write parallel program[s] - missing s.


Steven807 17:14, 22 December 2006 (UTC) (SLPJ: done) There is no definition or description of check

Tibbe 18:33, 22 December 2006 (UTC); (SLPJ: done) The footnote on page 2 now has an incorrectly capitalized T in hPutSTr.

Fanf 18:51, 22 December 2006 (UTC) (SLPJ: done) page 1 para 2 "as well shall see" should be "as we shall see"

Fanf 18:51, 22 December 2006 (UTC) (SLPJ: done) page 3 "a bit like *t in C" should be "a bit like t* in C" since t is a type

Garious 18:56, 22 December 2006 (UTC) (SLPJ: done) page 10 "at which point An elf" should be "at which point an elf"

Garious 18:58, 22 December 2006 (UTC) (SLPJ: done) page 10 "Here, then is a possible" should be "Here then, is a possible"

Garious 19:16, 22 December 2006 (UTC) (SLPJ: done) page 11 "Again, we define Group is declared" whaaa? maybe: "Group is declared as a data type with constructor MkGroup. MkGroup is passed the Group's capacity, and a TVar containing the number of current members and the Group's two Gates."

Garious 19:16, 22 December 2006 (UTC) (SLPJ: done) page 11 "Creating a new Group is simply..." Is a process of three somewhat-abstract steps simple enough to say 'simply'? Instead, maybe show the code and let the reader think, "Hey, that's simple!"

Rycee 20:46, 22 December 2006 (UTC) (SLPJ: done) Page 4. You probably want to change "... the action a does ..." to "... the action act does ..."

Rycee 20:46, 22 December 2006 (UTC) (SLPJ: done) Page 8. Typographic nitpick: The space after "i.e." looks wide, perhaps you forgot to write "i.e.\ " to force a regular (non sentence ending) space in LaTeX?

MichalPalka 22:32, 22 December 2006 (UTC) You could add a reference to SQL and triggers. They are similar in that it is programming with transactions and seeing familiar names will make applied programmers feel more comfortable. (SLPJ: hard to know what is "familiar" to a random reader. For a reader unfamiliar with SQL and triggers, referring to that might just make it more confusing. What do others think?)

DavidHouse 22:55, 22 December 2006 (UTC) (SLPJ: done) Page 3, "That is, return v is a function that, when performed, does no side effects and returns v." Your use of 'that is' implies that you could deduce the fact that it does no side effects from the type signature just given, which isn't true. It's an auxiliary property of 'return'. Maybe just miss out the 'that is'.

DavidHouse 23:04, 22 December 2006 (UTC) (SLPJ: done) Bottom of Page 4, "Then 'atomically act' grabs the lock, performs the action 'at',". Missing 'c' out of 'at'.

DavidHouse 23:23, 22 December 2006 (UTC) (SLPJ: done) Page 5, "Then 'withdraw' is an STM action that adds amount to the balance in the account." 1) For consistency with the rest of the paper, it should be "Then 'withdraw account amount'..." 2) withdraw subtracts from, rather than adds to, the balance in the account.

DavidHouse 23:40, 22 December 2006 (UTC) (SLPJ: done at defn of hEchoLine) Page 9, you probably want to mention that ++ is string concatenation.

DavidHouse 23:45, 22 December 2006 (UTC) (SLPJ: done) Page 10, I would expect to see the type signatures for the Gate interface rewritten alongside their definitions.

DavidHouse 23:47, 22 December 2006 (UTC) (SLPJ: difficult to do this crisply, but I take the point) Page 9/10, algebraic datatypes are a little weird for the non-initiated, especially with constructors that don't appear to do anything ("Where do you define MkGate?" etc.). You might want to liken them to C's structs, and explain the constructors as tags?

DavidHouse 23:57, 22 December 2006 (UTC) (SLPJ: done) Page 13, I don't think you sufficiently explain 'sequence'. You may wish to add a sentence along the lines of "'sequence' is a function that takes a list of IO actions and returns the action that, when executed, runs each of the actions you passed it in turn."
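
A tiny illustration of that description (a made-up example, not from the paper):

main :: IO ()
main = do
  rs <- sequence [putStr "Ho ", putStr "Ho ", putStrLn "Ho!"]
  print (length rs)   -- rs :: [()], one unit result per action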

Dalejordan 02:15, 23 December 2006 (UTC) (SLPJ: done) For the sake of us newbs you might mention in Section 2.1 how all these actions ever get performed. Also, in your description of nTimes I think it would be clearer to say it creates a composite action that performs its argument action n times, rather than say it performs it (directly) n times, even though the value of n is not yet known. Another example of the "beauty" of first class actions (and recursion).

Brecknell 03:29, 23 December 2006 (UTC) (SLPJ: done) Page 7: withdraw should subtract, not add amount to the balance (sadly). Also, since this withdraw has different semantics to the version on p5, you may want to consider giving it a different name.

Brecknell 03:29, 23 December 2006 (UTC) (SLPJ: done) Page 8, towards the end of section 2.4: Two instances of "withdraw" should be "withdraw2", one in the function definition and one in the comment.

Brecknell 05:04, 23 December 2006 (UTC) (SLPJ: done) In the problem definition, "Santa repeatedly sleeps until wakened...", but with the definition of "forever" given in the text, and without some sort of asynchronous signalling, it looks to me like Santa will sleep for at least as long as his random number generator tells him to. Perhaps move the sleep out of forever and into elf and reindeer, then Santa will just sleep in awaitGroup.

Mvanier 19:21, 23 December 2006 (UTC) (SLPJ: done) In section 3.3, "sequence" might as well just be "sequence_". It's confusing to return a list of unit values and not do anything with them. In section 5, I think that the problem with composing lock-based concurrent functions could be explained better; the statement "instead we must expose the locking protocol" could be expanded to show why this is the case.

ChrisKuklewicz 22:17, 24 December 2006 (UTC) In learning Haskell STM back in October of 2005 I had written a solution to the same Santa problem. The main suggestion I have is that the elf and reindeer are given two actions (useGate inGate) and (useGate outGate) and are required to properly bracket their actual action between these two. If the (joinGroup group) call returned a single scoping combinator (bracket_ (useGate inGate) (useGate outGate)) then the concurrency model for the elf and reindeer would be even easier to get correct (and beautiful?). Furthermore they could be passed (joinGroup group) instead of (group). This prevents the elf and reindeer from calling any other group functions.
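
A hypothetical sketch of the suggestion: hand each elf or reindeer one scoping combinator built from the two gate actions, rather than the Group itself. Here passInGate and passOutGate stand in for the (useGate inGate) and (useGate outGate) actions, which are not defined in this fragment:

import Control.Exception (bracket_)

-- build one combinator that brackets an action between the two gate actions
scoped :: IO () -> IO () -> IO a -> IO a
scoped passInGate passOutGate = bracket_ passInGate passOutGate

main :: IO ()
main = scoped (putStrLn "pass in gate") (putStrLn "pass out gate")
              (putStrLn "meet with Santa")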

Conal 16:39, 25 December 2006 (UTC) (SLPJ: done) Page 11, first full paragraph, "then waits to the TVar to be decremented to zero." Replace "waits to" with "waits for".

Brecknell 03:41, 27 December 2006 (UTC) This was my first exposure to STM, and I found it clear and easy enough to follow. However, I think it is mainly the discussion of bank accounts that conveys the "beauty" of STM, since that's where the important concepts are demonstrated: composability of transactions, implicit locking and deadlock avoidance, etc. In the Santa Claus problem, those concepts are somewhat obscured. I think the Santa Claus problem is still useful for demonstrating how STM can handle a tricky concurrency scenario with relative ease, but it didn't give me the sense of clarity that the discussion of bank accounts did. Maybe the Santa Claus solution needs more focus on the abstractions possible in STM, how they can help modularise concurrent programs, and why STM is better for solving concurrency problems than the concurrency primitives available elsewhere.

Gaal 08:20, 27 December 2006 (UTC) (SLPJ: done) The binding and usage of the reindeer group on page 12 are inconsistent:

 ; rein_gp <- newGroup 9
 ; sequence [ reindeer gp n | n <- [1 .. 9]]
                    -- ^^
 ; forever (santa elf_group rein_group)

Fernando 14:11, 29 December 2006 (UTC) SLPJ: done In page 3, in the definition of incRef, the last line should probably read, writeIORef var (val+1), instead of writeIORef (val+1). The latter version won't compile.
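
For reference, a reconstruction of incRef as the comment says it should read (not necessarily the paper's exact text):

import Data.IORef

incRef :: IORef Int -> IO ()
incRef var = do { val <- readIORef var
                ; writeIORef var (val + 1) }

main :: IO ()
main = do { r <- newIORef 0; incRef r; readIORef r >>= print }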

PaulRBrown 07:11, 2 January 2007 (UTC) It would be convenient if the download link for the source code were active. (Otherwise, reviewers have to cut and paste the fragments in by hand.) SLPJ: try http://research.microsoft.com/~simonpj/stm/papers/Santa.hs.gz

PaulRBrown 07:11, 2 January 2007 (UTC) I had trouble with the motivation for the Santa Claus example, and I ended up resorting to scratch paper to sketch out the problem and the model. That section would benefit from some additional introduction, perhaps a diagram. (Something schematic with "(n of m) || (p of q)" would be nice.) You could also explain it in terms of concurrency APIs that people are familiar with, e.g., count-down semaphores in Java's concurrency package. In fact, the additional development of the counting barrier prior to the Santa Claus problem would help overall clarity. SLPJ: Hmm

DavidHouse 10:30, 2 January 2007 (UTC) I actually had trouble with this one too. I think even the addition of a sentence like the following would help a great deal: "We want to model this setup in Haskell, where 'sleeping' is blocking and 'waking up' is doing something." SLPJ: where exactly? I'm not convinced this is going to help, but concrete suggestions are always useful.

EricWilligers 12:43, 4 January 2007 (UTC) Page 4: As page 14 currently contains a leap in difficulty with the definition of choose, perhaps foldr1 could be introduced before the last paragraph of 2.1, with an example call. This would also be an opportunity to introduce [ ] in a type signature - it does appear with regards to "sequence" but is easy to miss as the return value isn't discussed. A simple example would avoid (op) notation and would make no sense for empty lists. e.g. max_n values = foldr1 max_2 values. SLPJ: I've re-done that bit; see what you think when I publish version 2.
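
The kind of warm-up example being suggested might look like this (the names are made up):

maxN :: [Int] -> Int
maxN values = foldr1 max values   -- foldr1 requires a non-empty list

main :: IO ()
main = print (maxN [3, 1, 4, 1, 5])   -- prints 5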

EricWilligers 12:43, 4 January 2007 (UTC) SLPJ: done Page 10: Missing closing ) at the end of useGate

EricWilligers 12:43, 4 January 2007 (UTC) SLPJ: done Page 12: elf_group => elf_gp, rein_group => rein_gp

EricWilligers 12:43, 4 January 2007 (UTC) SLPJ: done Page 13: elf-gp 1 => elf_gp 1

EricWilligers 12:43, 4 January 2007 (UTC) SLPJ: done Page 14 and page 8 claim all the code is presented. To meet this claim:- Perhaps the following can appear on p9, just before "Since IO actions are first-class, ..."

deliverToys :: String -> IO ()
deliverToys s = putStr (s ++ " delivering toys\n")

Perhaps the following can appear on p13, just before "Working inside-out, ..."

reindeer :: Group -> Int -> IO ()
reindeer gp id = forkIO (forever (reindeer1 gp id))

Alternately, perhaps deliverToys and reindeer could appear in section 3.5 so they don't affect the flow in 3.1 and 3.3.

Brian Hulley 22:06, 4 January 2007 (UTC) Page 5: Trivial typo "be execute simultaneously" should be "execute simultaneously" or "be executed simultaneously" SLPJ: done. I spent ages trying to understand why the 'brutal lock' method wouldn't ensure isolation, then I realised it was because I'd just assumed all other threads would be stopped dead in their tracks while the atomic block was running, whereas if I'd read the text more carefully I'd have seen that the lock indeed only applies to actions wrapped in atomically. I think if you said something like "if a thread is accessing an IORef while holding the atomically lock there is nothing to stop a *different* thread from accessing the same IORef directly at the same time" it would help emphasise the fact that other threads can still be running even though only one thread can be in an atomic block at any given time. SLPJ: I have re-worded. See if you prefer version 2

Brian Hulley 09:49, 5 January 2007 (UTC)

  • Page 6: re-execution: this looks like you could end up in an infinite loop, since if thread A needs to do something complicated in an atomic block while thread B just sits in a tight loop continuously incrementing the same TVar (used by A), then A would probably always have an invalid view of memory by the time it was ready to validate (see the sketch after this list). Does the implementation address this problem? SLPJ: starvation is indeed possible. I deliberately left it out for brevity, but perhaps I should devote a couple of paragraphs to it. What do others think?
  • Page 10: No definition for check. Section 2.4 only defines retry, so perhaps section 3.2 should include:
    check c = if not c then retry else return ()

(to avoid introducing yet another Haskell function, unless) SLPJ: done

  • Page 12: Use of sequence and list comprehensions seems more complicated than something like:
   mapM_ (elf elf_gp) [1..10]

SLPJ: Hmm

  • Page 16: Section 5 on locks. It seems to me that STM solves these problems by conceptually using just a single lock (the 'brute force' lock on atomically), but a clever thing about STM is that it gets the simplicity of having a single global lock without the performance penalty, thanks to the use of logging (i.e. the global lock is only held for the duration of the validate and commit instead of for the whole duration of the atomic block). SLPJ: and perhaps not even then; you can do the commit using CAS only, with no locks at all!
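Regarding the re-execution point above, here is a rough sketch (my own example, not code from the paper) of the starvation scenario: one thread hammers a TVar with tiny transactions while another runs a longer transaction over the same TVar, so the longer transaction may fail validation and be re-executed repeatedly before it finally commits.

 import Control.Concurrent (forkIO)
 import Control.Concurrent.STM
 import Control.Monad (forever)
 
 main :: IO ()
 main = do
   tv <- newTVarIO (0 :: Int)
   -- Thread B: a tight loop of tiny transactions on tv.
   _ <- forkIO (forever (atomically (do { n <- readTVar tv; writeTVar tv (n + 1) })))
   -- Thread A: a "long" transaction; if B commits a change to tv before A does,
   -- A's log fails validation and the whole transaction is run again.
   atomically $ do
     n <- readTVar tv
     let slow = sum [1 .. 1000000 :: Int]      -- stand-in for real work
     slow `seq` writeTVar tv (n + slow - slow) -- writes n back, after forcing the work
   putStrLn "long transaction finally committed"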

These are just minor points, though - I thought it was a really good intro to STM. I've never used or looked at STM before now, and yet after reading just this one chapter I feel I've got enough understanding to start using it - Thanks.

EricWilligers 01:02, 7 January 2007 (UTC) SLPJ: done I needed to change elf and reindeer due to a compilation error - Expected type: IO (), Inferred type: IO ThreadId. Each of the following worked:-

elf :: Group -> Int -> IO ()
elf gp id = do { _ <- forkIO (forever (elf1 gp id)) ; return () }

elf :: Group -> Int -> IO ThreadId
elf gp id = forkIO (forever (elf1 gp id))

elf gp id = forkIO (forever (elf1 gp id))

The last option avoids the need to mention ThreadId.
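A fourth possibility (a hedged sketch: forkIO_ is a helper name I made up, and Group/elf1 are the paper's definitions, shown here only in comments) keeps the IO () signature while discarding the ThreadId in one place:

 import Control.Concurrent (forkIO)
 
 -- Fork a thread and throw away its ThreadId.
 forkIO_ :: IO () -> IO ()
 forkIO_ act = do { _ <- forkIO act; return () }
 
 -- then, with the paper's Group and elf1:
 -- elf :: Group -> Int -> IO ()
 -- elf gp id = forkIO_ (forever (elf1 gp id))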

Danwang74 04:06, 7 January 2007 (UTC) I think the paper conflates the STM programming model (atomics and various combinators) with the implementation details, by focusing only on optimistic methods of implementing the STM programming model. There are fundamental problems with optimistic methods of STM, such as situations which require IO, as well as implementation challenges which have not been solved.

Pessimistic methods (i.e. lock-based methods) can implement the STM programming model efficiently. There have been several papers that make progress in this area, which I'm sure you are aware of, yet they get no mention in the discussion.

Also, the comparison of the programming model using bare locks with the STM programming model implemented with optimistic methods is a completely unfair comparison. It is like comparing raw x86 assembly to Haskell compiled to a RISC processor. Programming in raw unstructured x86 assembly is surely a bad idea because the programming model is too low-level, but there is nothing "broken" about x86 assembly when compared to a RISC processor, ignoring aesthetics. In fact one could compile Haskell to x86 assembly with the programmer unaware. The modularity claims are also unfair. Is Haskell compiled to RISC assembly more modular than Haskell compiled to x86?

Locks can be used modularly, just as you can program in raw x86 assembly modularly with some effort. Unfortunately, use of locks in a modular way is not well supported. One way to do this is to use the strawman "acquire one global lock" implementation of atomics (sketched below). The whole discussion confuses the programming model and the runtime implementation of it. SLPJ: Actually I don't agree that locks can be used modularly. I think they fundamentally can't be, and that's why I don't think the comparison is misleading. Our "composable memory transactions" paper goes into a bit more detail.
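For concreteness, here is a strawman sketch (my own code; it is neither the paper's design nor GHC's implementation) of the "acquire one global lock" reading of atomic blocks mentioned above; it is correct but serialises all atomic sections and offers nothing like retry or orElse:

 import Control.Concurrent.MVar
 import System.IO.Unsafe (unsafePerformIO)
 
 -- One process-wide lock; NOINLINE so there is exactly one MVar.
 {-# NOINLINE globalLock #-}
 globalLock :: MVar ()
 globalLock = unsafePerformIO (newMVar ())
 
 -- Every "atomic" block takes the same lock, so no two atomic blocks overlap.
 atomicGlobal :: IO a -> IO a
 atomicGlobal act = withMVar globalLock (\_ -> act)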

In fact the name STM is itself confusing. There's an "atomic" programming model, possibly implemented via STM. It would be more insightful to talk more about alternative methods of implementing the atomic programming model, independent of STM. It is ironic and confusing to use "STM" to refer to the programming model, which the authors emphasize could be implemented with locks. There needs to be a clearer separation between the programming model and the implementation details.

Gknauth 19:32, 7 January 2007 (UTC) (I'm definitely a Haskell newb.) page 2, Sec. 2.1, 1st paragraph

 and returns a value of type IO ().
 ...
 returns a value of type ().

I read this, and wondered, "Is it type IO (), or type () ?" (SLPJ: this is hard to explain the first time. I'll try to clarify.)
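Perhaps a tiny example helps here (my own, not the paper's wording): the action itself has type IO (), while the value it delivers when performed has type ().

 hello :: IO ()               -- the action's type is IO ()
 hello = putStr "hello\n"
 
 main :: IO ()
 main = do
   r <- hello                 -- performing the action; its result r has type ()
   print r                    -- prints (), the unit value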

Gknauth 19:32, 7 January 2007 (UTC) (SLPJ: done) page 3, line 4, "Then it will performs" s.b. "Then it will perform"

Gknauth 19:32, 7 January 2007 (UTC) (SLPJ: I have clarified)page 3, 2nd paragraph. I'm not sure what "IO a" means. From the previous page, I thought that IO implied there would be a side effect. Now you're saying the "return v" action "does no side effects."
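And for the "return v does no side effects" point, a similarly small sketch (again my own example): return 42 has type IO Int, so it is an I/O action, but performing it does nothing except deliver 42.

 noSideEffects :: IO Int
 noSideEffects = return 42    -- an action by type, but with no effects
 
 main :: IO ()
 main = do
   n <- noSideEffects         -- nothing is printed or read here
   print n                    -- prints 42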

Gknauth 19:32, 7 January 2007 (UTC) page 3, paragraph 6, "Gentle..." After 5 dense paragraphs, suddenly this one seems clear. Maybe you intended this. I'm still scratching my head, wondering if IO and IORef are reserved words, or whether they're things you've defined somewhere but not yet revealed. SLPJ: great! IO and IORef are primitive, built-in types, but they are not reserved words in the sense of keywords (e.g. you could choose not to import them and then redefine them). I'm not sure this is worth amplifying though.

Gknauth 19:32, 7 January 2007 (UTC) page 4, end of Section 2.1

I'm still left wondering if IO is a marker that signals "side effect(s) can happen," or if it's a type. Put another way, I don't know if IO is an adjective, noun or verb, and I don't know if it's a part of the language or something Joe Haskell wrote for demonstration. SLPJ: it's a type. A value of that type is an action that, when performed, may have side effects. If you can help me adjust earlier text to say that more clearly it'd be great, but I'm not sure it'd help simply to repeat it! You may like to try "Tackling the awkward squad" and then come back to suggest improved wording for this section.

Gknauth 19:32, 7 January 2007 (UTC) page 5, footnote 2. Why do you say "But it's too late now!" Are so many people in the world using Haskell that it's impossible to tweak nomenclature? I thought Haskell was a young up and coming language, not an old language set in its ways. Also, programmers are used to being able to set things right themselves if the language somehow has gotten something wrong. The exclamation implies that once a design mistake is made, however small, programmers just have to lump it. SLPJ: reasonable point, send email to libraries@haskell.org!

Gknauth 19:32, 7 January 2007 (UTC) page 5, code example 3. cts -- shorthand for cents? Why not cents? SLPJ: short for "contents"! I've replaced it with bal, short for balance

Gknauth 19:32, 7 January 2007 (UTC) page 6, paragraph 3, "When..." This is very clear. I could substitute that paragraph for ten times as many words I've seen in database texts.

Gknauth 19:32, 7 January 2007 (UTC) page 6, "This process is called re-execution." I read that and thought, "What is to keep a failed transaction from re-executing forever and never being satisfied?" But it looks like you get to that in Section 2.4. SLPJ: indeed

Gknauth 19:32, 7 January 2007 (UTC) page 10, code examples at bottom. Where did check(...) come from? How does it work? Does it block? Does it abort? Is it part of the language? Is it defined later in the chapter? Based on the last sentence of the first full paragraph on page 11, I'm guessing it blocks. SLPJ: try now. I forgot to define check

Gknauth 19:32, 7 January 2007 (UTC) (SLPJ: done) page 11, paragraph 3, "Again...", rewrite: "we define Group..." -> "Group is declared as a fresh data type..."

Gknauth 19:32, 7 January 2007 (UTC) page 14, last paragraph in section 3.4. That is very cool. The only extra thing I wanted was a picture of how all the pieces you built combined. I was curious which ones you thought of first. I wondered about the order in which you conceived of the pieces, as it compares with the order in which you described them. SLPJ: glad you liked it; but I've simplified it. I don't have much idea how to give the "how did I get here" stuff without making it a lot longer.


Gknauth 19:32, 7 January 2007 (UTC) page 17, conclusion, 1st paragraph, last sentence. I can definitely see the need for transactional memory techniques in the years ahead, but I don't know much about how other languages are addressing the issue. Maybe you could provide some pointers. If Haskell is ahead of the others, then so much the better for Haskell, but at the moment I have nothing to compare. SLPJ: now citing Larus's new book

EngineerScotty 17:23, 8 January 2007 (UTC) SLPJ: done Page 2. The problem is stated such that no thread may "observe a state in which money has left one account but not arrived in the other". This may be stating the obvious, but should the make-before-break condition also be proscribed: no thread should be able to observe a state in which money has arrived in one account without having left the other? A minor quibble, but one which, if not addressed, will cause me to open an account with your bank. :)

BoAdler 20:10, 10 January 2007 (UTC) SLPJ: done page 3. You describe "IORef t" as being "a bit like *t in C", but that notation in C is used to dereference pointers. I know this is pedantic, but it should be "(t *)". I got thrown for a second, and had to remind myself that 't' was a type, not a variable.


Robin 13:52, 18 January 2007 (UTC) page 2, section 2.1. "syncrhonized" has a typo.