Talk:Questions and answers

From HaskellWiki
Revision as of 13:23, 5 February 2007 by MathematicalOrchid (talk | contribs) (...and now the code snippet actually makes sense!)

IRC

Is this a good place to ask questions? MathematicalOrchid 15:30, 18 January 2007 (UTC)

Not yet, at least. Try the #haskell IRC channel on freenode, which is usually manned by at least a few very helpful people. Alternatively, try the haskell-cafe mailing list. --Johannes Ahlmann 12:29, 22 January 2007 (UTC)

A question of speed

I have just performed a benchmark regarding the speed of (++) vs putStr, and received an extremely puzzling and counter-intuitive result. Perhaps somebody can explain?

Test #1

  writeFile "Test1.txt" $ concat $ replicate n "test"

For n = 10,000,000, that takes about 35 seconds wall-clock time and about 17 seconds CPU time on my test machine. (It also uses about 1.4 MB RAM.)
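The post does not say how the timings were taken. As a rough sketch only, CPU time for a single action could be measured from inside the program with getCPUTime from System.CPUTime, which reports picoseconds. The helper name timeCPU is hypothetical and not part of the original tests.

```haskell
import System.CPUTime (getCPUTime)

-- Hypothetical helper (not from the original post): run an action and
-- return its result together with the CPU time it consumed, in seconds.
-- getCPUTime reports picoseconds, hence the division by 1e12.
timeCPU :: IO a -> IO (a, Double)
timeCPU action = do
  start  <- getCPUTime
  result <- action
  end    <- getCPUTime
  return (result, fromIntegral (end - start) / 1e12)
```

For example, timeCPU (writeFile "Test1.txt" $ concat $ replicate n "test") would return the CPU seconds spent in Test #1. This measures CPU time only, so it would not account for a gap between CPU time and wall-clock time.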

Test #2

  writeFile "Test2.txt" $ concatMap (\x -> "test") [1..n]

For the same value of n, that takes about 43 seconds wall-clock time and about 20 seconds CPU time. (Uses 2.4 MB RAM for some reason...)

Test #3

  writeFile "Test3.txt" $ build n ""

  build 0 x = x
  build n x = build (n-1) (x ++ "test")

This test does not finish. It simply consumes memory without limit, never writing anything to disk. (At 90 MB, Windoze warned me that 'the system is getting dangerously low on virtual memory'.)

I added a couple of calls to seq in there - but this made no noticeable difference to anything.
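One possible reading of Test #3, offered as a sketch rather than a diagnosis: build threads the string through an accumulator, so nothing can be written until the last recursive call returns, and every x ++ "test" is nested to the left, so producing the output has to walk through all the earlier appends again. Calling seq on x only evaluates it as far as its first cons cell, which would not change either property. The hypothetical buildRight below (not from the original post) nests to the right instead, so a consumer such as writeFile can take the string piece by piece.

```haskell
-- The definition from Test #3, renamed for comparison.
-- The appends nest to the left: (("" ++ "test") ++ "test") ++ ...
buildLeft :: Int -> String -> String
buildLeft 0 x = x
buildLeft n x = buildLeft (n - 1) (x ++ "test")

-- Hypothetical alternative (an assumption, not the author's code).
-- The recursive call sits to the right of (++), so the first "test"
-- is available before the rest of the string has been computed.
buildRight :: Int -> String
buildRight n
  | n <= 0    = ""
  | otherwise = "test" ++ buildRight (n - 1)
```

Both functions produce the same string for small n. The difference is that take 4 (buildRight 10000000) returns immediately, whereas buildLeft has to finish all of its recursive calls before the first character is available.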

Test #4

And now the really interesting test:

  do h <- openFile "Test4.txt" WriteMode ; mapM_ (\x -> hPutStr h "help") [1..n] ; hClose h

This takes 80 seconds wall-time and 70 seconds CPU time. (Memory usage appears to be 2.4 MB or less.)

Summary

I'm surprised that concatMap should be slower than concat and replicate. But then we're not talking about a huge speed difference.

I am absolutely astonished that (++) should be 50% faster than the much more efficient I/O calls. Does anybody have the slightest clue how this can be? Currently the only thing I can think of is that each I/O call has some kind of constant overhead, and so the number of I/O calls affects speed more than the amount of data processed. But even so... 50%??
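The per-call overhead idea could be checked by keeping the amount of data fixed and varying only the number of hPutStr calls. The following is a minimal sketch under stated assumptions: the helper name writeChunked and its parameters are hypothetical, chunk is assumed to be positive and to divide total evenly, and block buffering is requested explicitly so that the buffering mode is the same in every run.

```haskell
import System.IO

-- Hypothetical helper (not from the original post): write `total` copies
-- of "help" to `path` using `total `div` chunk` calls to hPutStr, each
-- call carrying `chunk` copies. Assumes chunk > 0 and chunk divides total.
writeChunked :: FilePath -> Int -> Int -> IO ()
writeChunked path total chunk =
  withFile path WriteMode $ \h -> do
    -- Request block buffering explicitly so every run uses the same mode.
    hSetBuffering h (BlockBuffering Nothing)
    let piece = concat (replicate chunk "help")
        calls = total `div` chunk
    mapM_ (\_ -> hPutStr h piece) [1 .. calls]
```

Timing writeChunked "Test4.txt" n 1 against writeChunked "Test4.txt" n 1000 would write the same bytes with a thousand times fewer calls. If the running time drops sharply, that would support the overhead explanation; if it does not, the cost lies elsewhere. withFile also closes the handle, so the final buffer is flushed.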

All this is with code compiled by GHC 6.6 - as if that makes any difference.