Difference between revisions of "Prime numbers"
m (Add Category:Mathematics) 
(more edits, Euler's sieve) 

Line 1:  Line 1:  
== Prime Number Resources == 
== Prime Number Resources == 

−  In mathematics, a prime number (or a prime) is a natural number which has exactly two distinct natural number divisors: 1 and itself. 
+  In mathematics, a prime number (or a prime) is a natural number which has exactly two distinct natural number divisors: 1 and itself. The smallest prime is thus 2. 
[http://www.haskell.org/haskellwiki/Prime_numbers Prime Numbers] at Wikipedia. 
[http://www.haskell.org/haskellwiki/Prime_numbers Prime Numbers] at Wikipedia. 

Line 15:  Line 15:  
== Finding Primes == 
== Finding Primes == 

⚫  
+  
⚫  
+  Any natural number is representable as a product of powers of its prime factors, and so a prime has no prime divisors other then itself. That means that starting with 2, for each newly found prime we can eliminate from the rest of the numbers all such numbers that have this prime as their divisor. This is known as "sieving" the natural numbers, so that in the end what we are left with are just primes. 

+  
⚫  
⚫  
<haskell> 
<haskell> 

primes :: [Integer] 
primes :: [Integer] 

−  primes = sieve [2..] 
+  primes = sieve [2..] 
+  where 

sieve (p:xs) 
sieve (p:xs) 

−  = p : sieve [x  x<xs, x `mod` p /= 0] 
+  = p : sieve [x  x<xs, x `mod` p /= 0] 
+   or: filter ((/=0).(`mod`p)) xs 

</haskell> 
</haskell> 

−  While appearing simple, this method is extremely inefficient and not recommended for more than a few 
+  While appearing simple, this method is <i>extremely inefficient</i> and not recommended for more than a few thousand prime numbers, even when compiled. Here, every number is tested for divisibility by each prime up to its smallest factor, so any prime is checked against all its preceding primes, when in fact just those not greater than its square root would suffice. That means e.g. that to find the 1001st prime (<code>7927</code>) 1000 filters are used, when in fact just the first 24 are needed (upto <code>89</code>'s filter only). 
−  +  So it in effect creates a cascade of filters in front of the infinite numbers supply, in an <i>extremely premature</i> fashion. One way of fixing that would be to <i>postpone</i> the creation of filters until the right moment, as in 

==== Postponed Filters Prime Sieve ==== 
==== Postponed Filters Prime Sieve ==== 

Line 33:  Line 36:  
<haskell> 
<haskell> 

primes :: [Integer] 
primes :: [Integer] 

−  primes = 2: 3: sieve (tail primes) [5,7..] 
+  primes = 2: 3: sieve (tail primes) [5,7..] 
+  where 

sieve (p:ps) xs 
sieve (p:ps) xs 

−  = h ++ sieve ps [xx<t, x `rem` p /= 0] 
+  = h ++ sieve ps [xx<t, x `rem` p /= 0] 
−  +   or: filter ((/=0).(`rem`p)) t 

+  where (h,~(_:t)) = span (< p*p) xs 

</haskell> 
</haskell> 

Line 70:  Line 73:  
</haskell> 
</haskell> 

−  Instead of relying on nested filters, it tests by an explicit list of all the needed prime factors, but it recomputes this list, which will be the same for the increasing spans of numbers between the successive squares of primes. 
+  Instead of relying on nested filters, it tests each odd number by an explicit list of all the needed prime factors, but it recomputes this list, which will be the same for the increasing spans of numbers between the successive squares of primes. 
=== Simple Prime Sieve III === 
=== Simple Prime Sieve III === 

Line 97:  Line 100:  
sieve k (p:ps) x 
sieve k (p:ps) x 

= let fs = take k (tail primes) in 
= let fs = take k (tail primes) in 

−  [x  x<[x,x+2..p*p2], 
+  [x  x<[x,x+2..p*p2], and [x`rem`p/=0  p<fs]] 
++ sieve (k+1) ps (p*p+2) 
++ sieve (k+1) ps (p*p+2) 

</haskell> 
</haskell> 

Line 113:  Line 116:  
It can thus produce a few primes e.g. above <code>239812076741689</code>, which is a square of the millionth odd prime, without having to compute all the preceding primes (which would number in trillions). 
It can thus produce a few primes e.g. above <code>239812076741689</code>, which is a square of the millionth odd prime, without having to compute all the preceding primes (which would number in trillions). 

−  === 
+  === Sieve of Eratosthenes === 
−  All the versions thus far try to <i>keep the primes</i> among all numbers by testing <i>each number</i> in isolation. Testing composites is cheap (as most of them will have small factors, so the test is soon aborted), but testing prime numbers is costly. The 
+  All the versions thus far try to <i>keep the primes</i> among all numbers by testing <i>each number</i> in isolation. Testing composites is cheap (as most of them will have small factors, so the test is soon aborted), but testing prime numbers is costly. The sieve of Eratosthenes tries to <i>get rid of composites</i> by working on <i>spans</i> of numbers <i>at once</i> and thus gets its primes <i>"for free"</i>, as gaps between the generated/marked/crossedoff composites: 
<haskell> 
<haskell> 

Line 121:  Line 124:  
primes = 2: 3: 5: sieve [] (tail primes) 7 
primes = 2: 3: 5: sieve [] (tail primes) 7 

where 
where 

−  sieve fs (p:ps) x = 
+  sieve fs (p:ps) x = gaps x comps 
++ sieve ((2*p,q+2*p):fs') ps (q+2) 
++ sieve ((2*p,q+2*p):fs') ps (q+2) 

where 
where 

−  +  gaps x (y:ys)  x==y = gaps (x+2) ys 

−   True = x: 
+   True = x: gaps (x+2) (y:ys) 
−  +  gaps x [] = if x==q then [] else [x] 

q = p*p 
q = p*p 

comps = foldl mrg [] mults 
comps = foldl mrg [] mults 

−  (mults,fs') = unzip (map 
+  (mults,fs') = unzip (map mark fs) 
−  +  mark (s,x) = (h,(s,x')) 

−  +  where (h,~(x':_)) = span(<q)[x,x+s..] 

mrg u@(a:t) w@(b:r) 
mrg u@(a:t) w@(b:r) 

 a<b = a: t `mrg` w 
 a<b = a: t `mrg` w 

Line 142:  Line 145:  
Compared with the previous version, if compiled with GHC, it is steadily 1.2x slower and takes 3.5x more memory. But interpreted under both GHCi and WinHugs, it runs <i>2x to 4x faster</i>, takes <i>less</i> memory, and has better asymptotic behavior. 
Compared with the previous version, if compiled with GHC, it is steadily 1.2x slower and takes 3.5x more memory. But interpreted under both GHCi and WinHugs, it runs <i>2x to 4x faster</i>, takes <i>less</i> memory, and has better asymptotic behavior. 

+  
+  ==== Euler's Sieve ==== 

+  
+  This is the sieve of Eratosthenes variation which is directly generating, starting with 2, for each newly found prime all numbers that <i>would</i> have this prime as their divisor, i.e. its multiples, to be eliminated from the rest of the numbers (starting at its square of course), resulting in a straightforward modification of the postponedfilters sieve: 

+  
+  <haskell> 

+  primes :: [Integer] 

+  primes = 2: 3: sieve (tail primes) [5,7..] 

+  where 

+  sieve (p:ps) xs 

+  = h ++ sieve ps (t `minus` [q+2*p,q+4*p..]) 

+  where (h,~(_:t)) = span (< q) xs 

+  q = p*p 

+  xs@(x:xt) `minus` ys@(y:yt) 

+   x<y = x:xt `minus` ys 

+   x>y = xs `minus` yt 

+   True = xt `minus` yt 

+  </haskell> 

+  
+  This is easily adaptable to wheel optimization. 

+  
+  ==== Euler's Sieve with merged multiples ==== 

+  
+  The explicit partitioning into spans between the successive primes squares is actually superfluous and is implicitly achieved when combing for gaps in the directly merged inifinite primes multiples streams: 

+  
+  <haskell> 

+  primes :: [Integer] 

+  primes = (2:) . (3:) . gaps 5 $ 

+  foldr (\p> let q=p*p 

+  in (q:).merge [q+2*p,q+4*p..]) 

+  [] (tail primes) 

+  where 

+  gaps x s@(y:ys) 

+   x==y = gaps (x+2) ys 

+   True = x: gaps (x+2) s 

+  merge a@(x:xs) b@(y:ys) 

+   x > y = y:merge a ys 

+   x < y = x:merge xs b 

+   True = x:merge xs ys 

+  </haskell> 

+  
+  Here the specific clause order in <code>merge</code> achieves additional 30%  50% speedup. 

+  
=== Prime Wheels === 
=== Prime Wheels === 
Revision as of 04:07, 5 December 2009
Contents
 1 Prime Number Resources
 2 Finding Primes
 3 Testing Primality
 4 External links
Prime Number Resources
In mathematics, a prime number (or a prime) is a natural number which has exactly two distinct natural number divisors: 1 and itself. The smallest prime is thus 2.
Prime Numbers at Wikipedia.
Sieve of Eratosthenes at Wikipedia.
HackageDB packages:
Numbers: An assortment of number theoretic functions.
NumberSieves: Number Theoretic Sieves: primes, factorization, and Euler's Totient.
primes: Efficient, purely functional generation of prime numbers.
Finding Primes
Any natural number is representable as a product of powers of its prime factors, and so a prime has no prime divisors other then itself. That means that starting with 2, for each newly found prime we can eliminate from the rest of the numbers all such numbers that have this prime as their divisor. This is known as "sieving" the natural numbers, so that in the end what we are left with are just primes.
Simple Primes Sieve
The following is a direct translation of this idea, generating a list of all the prime numbers in the universe:
primes :: [Integer]
primes = sieve [2..]
where
sieve (p:xs)
= p : sieve [x  x<xs, x `mod` p /= 0]
 or: filter ((/=0).(`mod`p)) xs
While appearing simple, this method is extremely inefficient and not recommended for more than a few thousand prime numbers, even when compiled. Here, every number is tested for divisibility by each prime up to its smallest factor, so any prime is checked against all its preceding primes, when in fact just those not greater than its square root would suffice. That means e.g. that to find the 1001st prime (7927
) 1000 filters are used, when in fact just the first 24 are needed (upto 89
's filter only).
So it in effect creates a cascade of filters in front of the infinite numbers supply, in an extremely premature fashion. One way of fixing that would be to postpone the creation of filters until the right moment, as in
Postponed Filters Prime Sieve
primes :: [Integer]
primes = 2: 3: sieve (tail primes) [5,7..]
where
sieve (p:ps) xs
= h ++ sieve ps [xx<t, x `rem` p /= 0]
 or: filter ((/=0).(`rem`p)) t
where (h,~(_:t)) = span (< p*p) xs
This only tests odd numbers, and by the least amount of primes for each numbers span between successive squares of primes.
Whereas the first version exhibits near O() behavior, this one exhibits near O() behavior, and is good for creating a few 100,000s of primes (compiled). It takes as much time to generate 100,000 primes with it as it takes for the first one to generate 5500, GHCcompiled.
Primality Test and Integer Factorization
Given an infinite list of prime numbers, we can implement primality tests and integer factorization:
isPrime n = n > 1 && n == head (primeFactors n)
primeFactors 1 = []
primeFactors n = go n primes
where
go n ps@(p:pt)
 p*p > n = [n]
 n `rem` p == 0 = p : go (n `quot` p) ps
 otherwise = go n pt
Simple Prime Sieve II
The following method is slightly faster, creating 114,000 primes in the same period of time. It works well for generating a few 100,000s of primes as well:
primes :: [Integer]
primes = 2:filter isPrime [3,5..]
where
isPrime n = all (not . divides n) $ takeWhile (\p > p*p <= n) primes
divides n p = n `mod` p == 0
Instead of relying on nested filters, it tests each odd number by an explicit list of all the needed prime factors, but it recomputes this list, which will be the same for the increasing spans of numbers between the successive squares of primes.
Simple Prime Sieve III
This is faster still, creating 158,000 primes (again, GHCcompiled) in the same time span:
primes :: [Integer]
primes = 2:3:go 5 [] (tail primes)
where
divisibleBy d n = n `mod` d == 0
go start ds (p:ps) = let pSq = p*p in
foldr (\d > filter (not . divisibleBy d)) [start, start + 2..pSq  2] ds
++ go (pSq + 2) (p:ds) ps
This one explicitly maintains the list of primes needed for testing each odds span between successive primes squares, which it also explicitly generates. But it tests with nested filter
s, which it repeatedly recreates.
Simple Prime Sieve IV
The list of primes needed to test each range of odds is actually just the prefix of the primes list itself, of known length, and need not be specifically generated at all. Combined with onecall testing by the explicit list of primes, and direct generation of odds between the successive primes squares, this leads to:
primes :: [Integer]
primes = 2: 3: sieve 0 (tail primes) 5
sieve k (p:ps) x
= let fs = take k (tail primes) in
[x  x<[x,x+2..p*p2], and [x`rem`p/=0  p<fs]]
++ sieve (k+1) ps (p*p+2)
It produces about 222,000 primes in the same time span, and is good for creating about a million first primes, compiled.
The reason to have sieve
function available separately too is that it can also be used to produce primes above a given number, as in
primesFrom m = sieve (length h) ps $ m`div`2*2+1
where
(h,(_:ps)) = span (<= (floor.sqrt.fromIntegral) m) primes
It can thus produce a few primes e.g. above 239812076741689
, which is a square of the millionth odd prime, without having to compute all the preceding primes (which would number in trillions).
Sieve of Eratosthenes
All the versions thus far try to keep the primes among all numbers by testing each number in isolation. Testing composites is cheap (as most of them will have small factors, so the test is soon aborted), but testing prime numbers is costly. The sieve of Eratosthenes tries to get rid of composites by working on spans of numbers at once and thus gets its primes "for free", as gaps between the generated/marked/crossedoff composites:
primes :: [Integer]
primes = 2: 3: 5: sieve [] (tail primes) 7
where
sieve fs (p:ps) x = gaps x comps
++ sieve ((2*p,q+2*p):fs') ps (q+2)
where
gaps x (y:ys)  x==y = gaps (x+2) ys
 True = x: gaps (x+2) (y:ys)
gaps x [] = if x==q then [] else [x]
q = p*p
comps = foldl mrg [] mults
(mults,fs') = unzip (map mark fs)
mark (s,x) = (h,(s,x'))
where (h,~(x':_)) = span(<q)[x,x+s..]
mrg u@(a:t) w@(b:r)
 a<b = a: t `mrg` w
 b<a = b: u `mrg` r
 True = a: t `mrg` r
mrg u w = if null u then w else u
This "marks" the odd composites in a given range by generating them  just as a person performing the original sieve would do, marking one by one the multiples of the relevant primes  and then combs through them looking for gaps.
Compared with the previous version, if compiled with GHC, it is steadily 1.2x slower and takes 3.5x more memory. But interpreted under both GHCi and WinHugs, it runs 2x to 4x faster, takes less memory, and has better asymptotic behavior.
Euler's Sieve
This is the sieve of Eratosthenes variation which is directly generating, starting with 2, for each newly found prime all numbers that would have this prime as their divisor, i.e. its multiples, to be eliminated from the rest of the numbers (starting at its square of course), resulting in a straightforward modification of the postponedfilters sieve:
primes :: [Integer]
primes = 2: 3: sieve (tail primes) [5,7..]
where
sieve (p:ps) xs
= h ++ sieve ps (t `minus` [q+2*p,q+4*p..])
where (h,~(_:t)) = span (< q) xs
q = p*p
xs@(x:xt) `minus` ys@(y:yt)
 x<y = x:xt `minus` ys
 x>y = xs `minus` yt
 True = xt `minus` yt
This is easily adaptable to wheel optimization.
Euler's Sieve with merged multiples
The explicit partitioning into spans between the successive primes squares is actually superfluous and is implicitly achieved when combing for gaps in the directly merged inifinite primes multiples streams:
primes :: [Integer]
primes = (2:) . (3:) . gaps 5 $
foldr (\p> let q=p*p
in (q:).merge [q+2*p,q+4*p..])
[] (tail primes)
where
gaps x s@(y:ys)
 x==y = gaps (x+2) ys
 True = x: gaps (x+2) s
merge a@(x:xs) b@(y:ys)
 x > y = y:merge a ys
 x < y = x:merge xs b
 True = x:merge xs ys
Here the specific clause order in merge
achieves additional 30%  50% speedup.
Prime Wheels
The idea of only testing odd numbers can be extended further. For instance, it is a useful fact that every prime number other than 2 and 3 must be of the form or . Thus, we only need to test these numbers:
primes :: [Integer]
primes = 2:3:primes'
where
1:p:candidates = [6*k+r  k < [0..], r < [1,5]]
primes' = p : filter isPrime candidates
isPrime n = all (not . divides n) $ takeWhile (\p > p*p <= n) primes'
divides n p = n `mod` p == 0
Here, primes'
is the list of primes greater than 3 and isPrime
does not test for divisibility by 2 or 3 because the candidates
by construction don't have these numbers as factors. We also need to exclude 1 from the candidates and mark the next one as prime to start the recursion.
Such a scheme to generate candidate numbers first that avoid a given set of primes as divisors is called a prime wheel. Imagine that you had a wheel of circumference 6 to be rolled along the number line. With spikes positioned 1 and 5 units around the circumference, rolling the wheel will prick holes exactly in those positions on the line whose numbers are not divisible by 2 and 3.
A wheel can be represented by its circumference and the spiked positions.
data Wheel = Wheel Integer [Integer]
We prick out numbers by rolling the wheel.
roll (Wheel n rs) = [n*k+r  k < [0..], r < rs]
The smallest wheel is the unit wheel with one spike, it will prick out every number.
w0 = Wheel 1 [1]
We can create a larger wheel by rolling a smaller wheel of circumference n
along a rim of circumference p*n
while excluding spike positions at multiples of p
.
nextSize (Wheel n rs) p =
Wheel (p*n) [r'  k < [0..(p1)], r < rs, let r' = n*k+r, r' `mod` p /= 0]
Combining both, we can make wheels that prick out numbers that avoid a given list ds
of divisors.
mkWheel ds = foldl nextSize w0 ds
Now, we can generate prime numbers with a wheel that for instance avoids all multiples of 2, 3, 5 and 7.
primes :: [Integer]
primes = small ++ large
where
1:p:candidates = roll $ mkWheel small
small = [2,3,5,7]
large = p : filter isPrime candidates
isPrime n = all (not . divides n) $ takeWhile (\p > p*p <= n) large
divides n p = n `mod` p == 0
It's a pretty big wheel with a circumference of 210 and allows us to calculate the first 10000 primes in convenient time.
A fixed size wheel is fine, but how about adapting the wheel size while generating prime numbers? See the functional pearl titled Lazy wheel sieves and spirals of primes for more.
Implicit Heap
The following is a more efficient prime generator, implementing the sieve of Eratosthenes.
See also the message threads Re: "nocoding" functional data structures via lazyness for more about how merging ordered lists amounts to creating an implicit heap and Re: Code and Perf. Data for Prime Finders for an explanation of the People a
structure that makes it work when tying a knot.
data People a = VIP a (People a)  Crowd [a]
mergeP :: Ord a => People a > People a > People a
mergeP (VIP x xt) ys = VIP x $ mergeP xt ys
mergeP (Crowd xs) (Crowd ys) = Crowd $ merge xs ys
mergeP xs@(Crowd ~(x:xt)) ys@(VIP y yt) = case compare x y of
LT > VIP x $ mergeP (Crowd xt) ys
EQ > VIP x $ mergeP (Crowd xt) yt
GT > VIP y $ mergeP xs yt
merge :: Ord a => [a] > [a] > [a]
merge xs@(x:xt) ys@(y:yt) = case compare x y of
LT > x : merge xt ys
EQ > x : merge xt yt
GT > y : merge xs yt
diff xs@(x:xt) ys@(y:yt) = case compare x y of
LT > x : diff xt ys
EQ > diff xt yt
GT > diff xs yt
foldTree :: (a > a > a) > [a] > a
foldTree f ~(x:xs) = f x . foldTree f . pairs $ xs
where pairs ~(x: ~(y:ys)) = f x y : pairs ys
primes, nonprimes :: [Integer]
primes = 2:3:diff [5,7..] nonprimes
nonprimes = serve . foldTree mergeP . map multiples $ tail primes
where
multiples p = vip [p*k  k < [p,p+2..]]
vip (x:xs) = VIP x $ Crowd xs
serve (VIP x xs) = x:serve xs
serve (Crowd xs) = xs
nonprimes
effectively implements a heap, exploiting lazy evaluation.
Bitwise prime sieve
Count the number of prime below a given 'n'. Shows fast bitwise arrays, and an example of Template Haskell to defeat your enemies.
{# OPTIONS O2 optcO XBangPatterns #}
module Primes (nthPrime) where
import Control.Monad.ST
import Data.Array.ST
import Data.Array.Base
import System
import Control.Monad
import Data.Bits
nthPrime :: Int > Int
nthPrime n = runST (sieve n)
sieve n = do
a < newArray (3,n) True :: ST s (STUArray s Int Bool)
let cutoff = truncate (sqrt $ fromIntegral n) + 1
go a n cutoff 3 1
go !a !m cutoff !n !c
 n >= m = return c
 otherwise = do
e < unsafeRead a n
if e then
if n < cutoff then
let loop !j
 j < m = do
x < unsafeRead a j
when x $ unsafeWrite a j False
loop (j+n)
 otherwise = go a m cutoff (n+2) (c+1)
in loop ( if n < 46340 then n * n else n `shiftL` 1)
else go a m cutoff (n+2) (c+1)
else go a m cutoff (n+2) c
And places in a module:
{# OPTIONS fth #}
import Primes
main = print $( let x = nthPrime 10000000 in [ x ] )
Run as:
$ ghc make o primes Main.hs
$ time ./primes
664579
./primes 0.00s user 0.01s system 228% cpu 0.003 total
Using IntSet for a traditional sieve
module Sieve where
import qualified Data.IntSet as I
 findNext  finds the next member of an IntSet.
findNext c is  I.member c is = c
 c > I.findMax is = error "Ooops. No next number in set."
 otherwise = findNext (c+1) is
 mark  delete all multiples of n from n*n to the end of the set
mark n is = is I.\\ (I.fromAscList (takeWhile (<=end) (map (n*) [n..])))
where
end = I.findMax is
 primes  gives all primes up to n
primes n = worker 2 (I.fromAscList [2..n])
where
worker x is
 (x*x) > n = is
 otherwise = worker (findNext (x+1) is) (mark x is)
Testing Primality
MillerRabin Primality Test
find2km :: Integral a => a > (a,a)
find2km n = f 0 n
where
f k m
 r == 1 = (k,m)
 otherwise = f (k+1) q
where (q,r) = quotRem m 2
millerRabinPrimality :: Integer > Integer > Bool
millerRabinPrimality n a
 a <= 1  a >= n1 =
error $ "millerRabinPrimality: a out of range ("
++ show a ++ " for "++ show n ++ ")"
 n < 2 = False
 even n = False
 b0 == 1  b0 == n' = True
 otherwise = iter (tail b)
where
n' = n1
(k,m) = find2km n'
b0 = powMod n a m
b = take (fromIntegral k) $ iterate (squareMod n) b0
iter [] = False
iter (x:xs)
 x == 1 = False
 x == n' = True
 otherwise = iter xs
pow' :: (Num a, Integral b) => (a > a > a) > (a > a) > a > b > a
pow' _ _ _ 0 = 1
pow' mul sq x' n' = f x' n' 1
where
f x n y
 n == 1 = x `mul` y
 r == 0 = f x2 q y
 otherwise = f x2 q (x `mul` y)
where
(q,r) = quotRem n 2
x2 = sq x
mulMod :: Integral a => a > a > a > a
mulMod a b c = (b * c) `mod` a
squareMod :: Integral a => a > a > a
squareMod a b = (b * b) `rem` a
powMod :: Integral a => a > a > a > a
powMod m = pow' (mulMod m) (squareMod m)
External links
 A collection of prime generators; the file "ONeillPrimes.hs" contains one of the fastest pureHaskell prime generators. WARNING: Don't use the priority queue from that file for your projects: it's broken and works for primes only by a lucky chance.