# Shootout/Nsieve

### From HaskellWiki


## Revision as of 01:24, 19 January 2008

A ShootoutEntry for the nsieve benchmark

Diff the program output for N = 2 against this output file to check that your program is correct before contributing.

Each program should count the prime numbers from 2 to M, using the same naïve Sieve of Eratosthenes algorithm:

- create a sequence of M boolean flags
- for each index number
  - if the flag value at that index is true
    - set all the flag values at multiples of that index false
    - increment the count
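The steps above can be sketched directly in Haskell. This is a minimal illustration, not one of the submitted entries: it assumes the `array` package that ships with GHC, uses a checked `STUArray` of `Bool` flags, and the name `countPrimes` is made up for this example.

```haskell
import Control.Monad (forM_, when)
import Control.Monad.ST (ST, runST)
import Data.Array.ST (STUArray, newArray, readArray, writeArray)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- | Count the primes in [2, m] with the naive sieve described above.
-- (countPrimes is a hypothetical name used only in this sketch.)
countPrimes :: Int -> Int
countPrimes m = runST $ do
    -- a sequence of boolean flags, all initially true
    flags <- newArray (2, m) True :: ST s (STUArray s Int Bool)
    count <- newSTRef (0 :: Int)
    forM_ [2 .. m] $ \i -> do
        isPrime <- readArray flags i
        when isPrime $ do
            -- clear the flag at every multiple of i, starting at 2*i
            forM_ [i + i, i + i + i .. m] $ \j -> writeArray flags j False
            -- increment the count
            modifySTRef' count (+ 1)
    readSTRef count
```

For example, `countPrimes 10000` evaluates to 1229. The sketch favours clarity over speed; the entries below avoid the bounds checks and list traversals used here.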

Calculate 3 prime counts, for M = 2^N × 10000, 2^(N-1) × 10000, and 2^(N-2) × 10000.
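As a small worked example, the three sieve sizes can be computed with the same expression the entries on this page use in `main`. The name `sieveSizes` is hypothetical, and the sketch assumes N is at least 2, because `(^)` rejects negative exponents.

```haskell
-- | The three sieve sizes for a given N: 2^N, 2^(N-1) and 2^(N-2),
-- each multiplied by 10000. Assumes n >= 2.
-- (sieveSizes is a hypothetical helper, not part of any entry.)
sieveSizes :: Int -> [Int]
sieveSizes n = map ((10000 *) . (2 ^)) [n, n - 1, n - 2]
```

With N = 2, `sieveSizes 2` gives `[40000, 20000, 10000]`.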

The basic benchmark was described in "A High-Level Language Benchmark." BYTE, September 1981, p. 180, Jim Gilbreath.

Of course, there are more efficient implementations of the Sieve of Eratosthenes, and there are more efficient ways to sieve prime numbers, for example "Prime sieves using binary quadratic forms".

For more information see Eric W. Weisstein, "Sieve of Eratosthenes." From MathWorld PrimeCountingFunction--A Wolfram Web Resource.


## 1 Benchmarks

Debian Linux/x86, N=10

| Entry | Time  |
|-------|-------|
| Old   | 1.961 |
| New   | 0.669 |

## 2 Current entry

Following the spec, uses Word8's to represent bools. Instead of IOUArray Int Word8, which has a slow initialisation phase, we use a ByteString, and call memset to init. Faster than unoptimised C.

The GHC native code gen seems to do a better job on this one.

    {-# OPTIONS -O2 -fasm -fbang-patterns #-}
    --
    -- The Computer Language Shootout
    -- http://shootout.alioth.debian.org/
    --
    -- Contributed by Don Stewart.
    -- Uses Word8 values to represent Bools, avoiding a bit-packing Array Bool
    --

    import System
    import Foreign
    import Data.ByteString.Internal
    import Data.ByteString.Unsafe (unsafeIndex)
    import Text.Printf

    main = do
        n <- getArgs >>= readIO . head
        mapM_ (sieve . (10000 *) . (2 ^)) [n, n-1, n-2]

    sieve n = do
        a <- create n $ \p -> memset p 0 (fromIntegral n) >> return ()
        r <- go n a 0 2
        printf "Primes up to %8d %8d\n" (n::Int) (r::Int)

    go m !a !c !n
        | n == m    = return c
        | true a n  = go m a c (n+1)
        | otherwise = set (n+n)
      where
        set !j | j <= m    = false a j >> set (j+n)
               | otherwise = go m a (c+1) (n+1)

    true  !a !n          = unsafeIndex a n == 1
    false (PS fp _ _) !n = withForeignPtr fp $ \p -> pokeByteOff p n (1 :: Word8)
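The entry above depends on `Data.ByteString.Internal` as it was in GHC 6.8, and those internals have changed since. As a rough sketch of the same idea (one byte per flag, with a memset-style initialisation), the following swaps in a plain foreign buffer from `base`, using `allocaBytes` and `fillBytes` instead of a ByteString. The name `countPrimesBytes` is hypothetical, and the loop keeps every offset below `m` so that all accesses stay inside the buffer.

```haskell
{-# LANGUAGE BangPatterns #-}

import Data.Word (Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Utils (fillBytes)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peekByteOff, pokeByteOff)

-- | Count the primes in [2, m) using one byte per flag:
-- 0 means "still a candidate", 1 means "known composite".
-- (countPrimesBytes is a hypothetical name for this sketch.)
countPrimesBytes :: Int -> IO Int
countPrimesBytes m
    | m <= 2    = return 0
    | otherwise = allocaBytes m $ \buf -> do
        fillBytes buf 0 m          -- memset-style initialisation to 0
        go buf 0 2
  where
    go :: Ptr Word8 -> Int -> Int -> IO Int
    go !buf !count !n
        | n >= m    = return count
        | otherwise = do
            flag <- peekByteOff buf n :: IO Word8
            if flag /= 0
                then go buf count (n + 1)
                else mark (n + n) >> go buf (count + 1) (n + 1)
      where
        -- flag every multiple of n that lies inside the buffer
        mark !j
            | j < m     = pokeByteOff buf j (1 :: Word8) >> mark (j + n)
            | otherwise = return ()
```

Calling `countPrimesBytes 10000` yields 1229. This is an illustration of the technique only and has not been benchmarked against the figures in the table above.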

## 3 IOUArrays

A bit slower than using ByteStrings, since the initialisation phase is poor.

    {-# OPTIONS -fbang-patterns #-}
    --
    -- The Computer Language Shootout
    -- http://shootout.alioth.debian.org/
    --
    -- Contributed by Don Stewart
    -- Uses Word8 values to represent Bools, avoiding a bit-packing Array Bool
    --

    import System
    import Data.Array.IO
    import Data.Array.Base
    import Text.Printf
    import Word

    main = do
        n <- getArgs >>= readIO . head
        mapM_ (sieve . (10000 *) . (2 ^)) [n, n-1, n-2]

    sieve n = do
        a <- newArray (2,n) 0 :: IO (IOUArray Int Word8) -- avoid bit packing Bool type
        r <- go n a 0 2
        printf "Primes up to %8d %8d\n" (n::Int) (r::Int)

    go m !a !c !n
        | n == m    = return c
        | otherwise = do
            e <- unsafeRead a n
            if e == 0
                then let loop !j
                            | j <= m    = unsafeWrite a j 1 >> loop (j+n)
                            | otherwise = go m a (c+1) (n+1)
                     in loop (n+n)
                else go m a c (n+1)

## 4 Illegal entry

Ported to GHC 6.6. Submitted.

This version uses a bit-packed Bool array, so it serves as the nsieve-bits entry instead.

    {-# OPTIONS -fbang-patterns #-}
    --
    -- The Computer Language Shootout
    -- http://shootout.alioth.debian.org/
    --
    -- Contributed by Don Stewart
    -- Nsieve over a Bool array
    --

    import Data.Array.IO
    import Data.Array.Base
    import System
    import Text.Printf

    main = do
        n <- getArgs >>= readIO . head :: IO Int
        mapM_ (sieve . (10000 *) . (2 ^)) [n, n-1, n-2]

    sieve n = do
        a <- newArray (2,n) True :: IO (IOUArray Int Bool) -- an array of Bool
        r <- go a n 2 0
        printf "Primes up to %8d %8d\n" (n::Int) (r::Int) :: IO ()

    go !a !m !n !c
        | n == m    = return c
        | otherwise = do
            e <- unsafeRead a n
            if e
                then let loop !j
                            | j <= m    = unsafeWrite a j False >> loop (j+n)
                            | otherwise = go a m (n+1) (c+1)
                     in loop (n+n)
                else go a m (n+1) c

## 5 Old entry

Studying the Core shows that the mapM was preventing things from being unboxed properly. Removing it gives a big speedup. The extra `seq`s also help the unboxing. This should be around the fastest entry now.

*Is anyone else bothered by the usage of unsafeWrite? I agree that
excellent speed is being gained, but this speed is gained at the cost
of functional-style coding. I'm a novice Haskeller, but using unsafeWrite
seems like an imperative hack to gain speed.* -- AlsonKemp

*Response: unsafeWrite is just the same as a normal array update,
without an extraneous bounds check (since it's already performed in the
outer loop). We're just avoiding redundant computation (as we do in all
entries for this problem). SPJ actually recommends the unsafe array
operations; see the "loop performance bug" thread in the Haskell mailing
list archives from 2005.*

*In summary, it's not a hack: the name just makes clear that you have to
do your own bounds check (Haskell is good at letting you know when there
are extra proof considerations for the programmer to take). Since this
program runs faster than GCC, I think we should not be worried
about avoiding a redundant bounds check :)* -- DonStewart
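To make the point in the response concrete, here is a small sketch contrasting the checked and unchecked writes. It assumes the `array` package bundled with current GHC; `markChecked`, `markUnchecked` and `sameResult` are hypothetical names, and both loops assume `start` is at least the lower bound and `step` is positive. Note that `unsafeWrite` takes a zero-based offset into the array rather than an index in the array's own bounds, so the sketch subtracts the lower bound explicitly.

```haskell
import Data.Array.Base (unsafeWrite)
import Data.Array.IO (IOUArray, getBounds, getElems, newArray, writeArray)
import Data.Word (Word8)

-- | Checked version: writeArray validates every index against the bounds,
-- even though the loop guard has already established j <= hi.
markChecked :: IOUArray Int Word8 -> Int -> Int -> IO ()
markChecked arr start step = do
    (_, hi) <- getBounds arr
    let loop j
            | j <= hi   = writeArray arr j 1 >> loop (j + step)
            | otherwise = return ()
    loop start

-- | Unchecked version: the loop guard is the only bounds check.
-- unsafeWrite expects a zero-based offset, hence (j - lo).
markUnchecked :: IOUArray Int Word8 -> Int -> Int -> IO ()
markUnchecked arr start step = do
    (lo, hi) <- getBounds arr
    let loop j
            | j <= hi   = unsafeWrite arr (j - lo) 1 >> loop (j + step)
            | otherwise = return ()
    loop start

-- | Both versions leave identical contents behind.
sameResult :: IO Bool
sameResult = do
    a <- newArray (2, 20) 0
    b <- newArray (2, 20) 0
    markChecked a 4 2
    markUnchecked b 4 2
    (==) <$> getElems a <*> getElems b
```

The unchecked loop is only correct because its guard proves the offset is in range; drop that guard and nothing else will catch the mistake.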

    --
    -- $Id: nsieve-ghc.code,v 1.27 2006/01/08 22:44:56 igouy-guest Exp $
    -- Written by Einar Karttunen, optimised further by Don Stewart
    --

    import Data.Array.IO; import Data.Array.Base; import Data.Bits; import System; import Text.Printf

    main = (\n -> mapM_ (sieve.(10000 *).shiftL 1) [n,n-1,n-2]) . read . head =<< getArgs

    sieve m = do
        c <- newArray (2,m) True >>= \a -> loop a m 2 0
        printf "Primes up to %8d %8d\n" (m::Int) (c::Int) :: IO ()

    loop arr m n c | arr `seq` m `seq` n `seq` c `seq` False = undefined
    loop arr m n c = if n == m then return c else do
        el <- unsafeRead arr n
        if el
            then let loop' j | j > m     = loop arr m (n+1) (c + 1)
                             | otherwise = unsafeWrite arr j False >> loop' (j+n)
                 in loop' (n+n)
            else loop (arr :: IOUArray Int Bool) m (n+1) c
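The first equation of `loop` in the Old entry uses a guard that is always `False`: evaluating it forces every argument with `seq` before the real equation is tried. A minimal sketch of that idiom, next to the bang-pattern form used by the later entries, might look like this. The functions `sumSeq` and `sumBang` are hypothetical, and both assume a non-negative second argument.

```haskell
{-# LANGUAGE BangPatterns #-}

-- | Sum n + (n-1) + ... + 1 onto acc, using the always-False guard idiom.
-- The guard never succeeds, but evaluating it forces acc and n.
sumSeq :: Int -> Int -> Int
sumSeq acc n | acc `seq` n `seq` False = undefined
sumSeq acc 0 = acc
sumSeq acc n = sumSeq (acc + n) (n - 1)

-- | The same loop with bang patterns doing the forcing.
sumBang :: Int -> Int -> Int
sumBang !acc 0 = acc
sumBang !acc n = sumBang (acc + n) (n - 1)
```

Both `sumSeq 0 100` and `sumBang 0 100` evaluate to 5050. With explicit `Int` signatures and optimisation on, GHC may well unbox these arguments unaided; the idiom matters more when the types or control flow hide the strictness, as the note about mapM above suggests.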

## 6 Entry 2

The shortest entry in any language.

    {-# OPTIONS -O2 -optc-O3 #-}
    -- $Id: nsieve-ghc.code,v 1.27 2006/01/08 22:44:56 igouy-guest Exp $
    -- Written by Einar Karttunen, shortened by Don Stewart

    import Data.Array.IO; import Data.Array.Base; import Data.Bits; import System; import Text.Printf

    loop arr m n c = if n == m then return c else do
        el <- unsafeRead arr n
        if el then do mapM_ (flip (unsafeWrite arr) False) (tail [n,n+n..m])
                      loop arr m (n+1) $! c + 1
              else loop (arr :: IOUArray Int Bool) m (n+1) c

    sieve m = do
        c <- newArray (2,m) True >>= \a -> loop a m 2 0
        printf "Primes up to %8d %8d\n" (m::Int) (c::Int) :: IO ()

    main = (\n -> mapM_ (sieve.(10000 *).shiftL 1) [n,n-1,n-2]) . read . head =<< getArgs
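Much of the brevity of the entry above comes from generating the multiples with an arithmetic sequence, `tail [n,n+n..m]`, instead of an explicit inner loop. The sketch below isolates that expression. The name `multiples` is hypothetical, and it is written with `drop 1` rather than `tail` so that it stays total when `n` exceeds `m`, a case the entry never reaches.

```haskell
-- | Multiples of n that are greater than n and at most m.
-- Assumes n >= 1. (multiples is a hypothetical helper for illustration.)
multiples :: Int -> Int -> [Int]
multiples n m = drop 1 [n, n + n .. m]
```

For instance, `multiples 3 20` is `[6,9,12,15,18]`.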

## 7 Original entry

    {-# OPTIONS -O2 -optc-O3 #-}
    -- $Id: nsieve-ghc.code,v 1.27 2006/01/08 22:44:56 igouy-guest Exp $
    -- written by Einar Karttunen

    import Data.Array.IO
    import Data.Array.Base
    import Data.Bits (shiftL)
    import System (getArgs)

    loop :: IOUArray Int Bool -> Int -> Int -> Int -> IO Int
    loop arr m n c
        | n == m    = return c
        | otherwise = do
            el <- unsafeRead arr n
            if el
                then do
                    mapM_ (\i -> unsafeWrite arr i False) (tail [n,n+n..m])
                    loop arr m (n+1) $! c+1
                else do
                    loop arr m (n+1) c

    fmt width i =
        let is = show i
        in (take (width - length is) (repeat ' ')) ++ is

    sieve n = do
        let m = (1 `shiftL` n) * 10000
        arr <- newArray (2,m) True
        c <- loop arr m 2 0
        putStrLn ("Primes up to "++fmt 8 m++" "++fmt 8 c)

    main = do
        n <- getArgs >>= readIO.head
        sieve n >> sieve (n-1) >> sieve (n-2)