Benchmarks Game/Parallel/BinaryTreesDPH
< Benchmarks Game | Parallel
Jump to navigation
Jump to search
Revision as of 20:26, 7 October 2008 by Thoughtpolice (talk | contribs)
Binary Trees
Data Parallel Haskell
- Not submitted, requires GHC 6.10 beta or above
Hardware:
- 2.2gHz core2duo, Macbook Pro
Build and run:
$ ~/ghc-6.10/bin/ghc --make -fcpr-off -threaded -fdph-par -package dph-base -Odph binarytrees.hs $ ./binarytrees 20 +RTS -N3 -sstderr -A350M
Times
Note, these times have a 350mb heap (+RTS -A350M)
Single core vs. dual core
- Single core (no -N3): 18.945
- Dual core (with -N3): 14.338 (see below)
DPH vs. parallel strategies
- Parallel strategies dual core: 11.471 (see below)
- DPH Dual core: 14.338 (see below)
Comparison with parallel strategies version of parallel binary trees:
Data Parallel Haskell
time ./binarytrees 20 +RTS -N3 -sstderr -A350M -RTS ./binarytrees 20 +RTS -N3 -sstderr -A350M stretch tree of depth 21 check: -1 2097152 trees of depth 4 check: -2097152 524288 trees of depth 6 check: -524288 131072 trees of depth 8 check: -131072 32768 trees of depth 10 check: -32768 8192 trees of depth 12 check: -8192 2048 trees of depth 14 check: -2048 512 trees of depth 16 check: -512 128 trees of depth 18 check: -128 32 trees of depth 20 check: -32 long lived tree of depth 20 check: -1 14,584,449,064 bytes allocated in the heap 215,093,024 bytes copied during GC 34,674,344 bytes maximum residency (2 sample(s)) 51,688 bytes maximum slop 1083 MB total memory in use (9 MB lost due to fragmentation)
Generation 0: 16 collections, 16 parallel, 1.97s, 1.13s elapsed Generation 1: 2 collections, 2 parallel, 0.83s, 0.47s elapsed
Parallel GC work balance: 2.34 (53773189 / 22969253, ideal 3)
Task 0 (worker) : MUT time: 23.18s ( 12.54s elapsed) GC time: 0.37s ( 0.21s elapsed)
Task 1 (worker) : MUT time: 21.46s ( 12.54s elapsed) GC time: 2.09s ( 1.19s elapsed)
Task 2 (worker) : MUT time: 23.53s ( 12.54s elapsed) GC time: 0.02s ( 0.02s elapsed)
Task 3 (worker) : MUT time: 23.56s ( 12.57s elapsed) GC time: 0.00s ( 0.00s elapsed)
Task 4 (worker) : MUT time: 23.24s ( 12.57s elapsed) GC time: 0.32s ( 0.18s elapsed)
INIT time 0.02s ( 0.03s elapsed) MUT time 20.75s ( 12.54s elapsed) GC time 2.80s ( 1.60s elapsed) EXIT time 0.00s ( 0.02s elapsed) Total time 23.57s ( 14.17s elapsed)
%GC time 11.9% (11.3% elapsed)
Alloc rate 702,222,726 bytes per MUT second
Productivity 88.1% of total user, 146.4% of total elapsed
recordMutableGen_sync: 0 gc_alloc_block_sync: 14318885 whitehole_spin: 0 gen[0].steps[0].sync_todo: 0 gen[0].steps[0].sync_large_objects: 0 gen[0].steps[1].sync_todo: 354 gen[0].steps[1].sync_large_objects: 0 gen[1].steps[0].sync_todo: 10112 gen[1].steps[0].sync_large_objects: 0 ./binarytrees 20 +RTS -N3 -sstderr -A350M -RTS 23.57s user 2.13s system 179% cpu 14.338 total
Parallel Strategies
time ./binarytrees 20 +RTS -N3 -sstderr -A350M -RTS ./binarytrees 20 +RTS -N3 -sstderr -A350M stretch tree of depth 21 check: -1 2097152 trees of depth 4 check: -2097152 524288 trees of depth 6 check: -524288 131072 trees of depth 8 check: -131072 32768 trees of depth 10 check: -32768 8192 trees of depth 12 check: -8192 2048 trees of depth 14 check: -2048 512 trees of depth 16 check: -512 128 trees of depth 18 check: -128 32 trees of depth 20 check: -32 long lived tree of depth 20 check: -1 9,719,681,300 bytes allocated in the heap 164,038,148 bytes copied during GC (scavenged) 160 bytes copied during GC (not scavenged) 33,718,272 bytes maximum residency (2 sample(s))
11 collections in generation 0 ( 0.33s) 2 collections in generation 1 ( 0.20s)
1094 Mb total memory in use
Task 0 (worker) : MUT time: 18.43s ( 10.65s elapsed) GC time: 0.17s ( 0.19s elapsed)
Task 1 (worker) : MUT time: 18.54s ( 10.65s elapsed) GC time: 0.06s ( 0.06s elapsed)
Task 2 (worker) : MUT time: 18.32s ( 10.65s elapsed) GC time: 0.29s ( 0.36s elapsed)
Task 3 (worker) : MUT time: 18.60s ( 10.67s elapsed) GC time: 0.01s ( 0.01s elapsed)
Task 4 (worker) : MUT time: 18.61s ( 10.68s elapsed) GC time: 0.00s ( 0.00s elapsed)
INIT time 0.01s ( 0.03s elapsed) MUT time 18.08s ( 10.65s elapsed) GC time 0.53s ( 0.63s elapsed) EXIT time 0.00s ( 0.01s elapsed) Total time 18.62s ( 11.31s elapsed)
%GC time 2.8% (5.6% elapsed)
Alloc rate 537,291,716 bytes per MUT second
Productivity 97.1% of total user, 159.8% of total elapsed
./binarytrees 20 +RTS -N3 -sstderr -A350M -RTS 18.62s user 2.15s system 181% cpu 11.471 total