Prime numbers: Difference between revisions
m (m) |
(m) |
||
Line 24: | Line 24: | ||
</haskell> | </haskell> | ||
While appearing simple, this method is extremely inefficient and not recommended for more than a few 1000s of prime numbers, even when compiled. For every number it will test its divisibility by all prime numbers smaller than its smallest prime factor. | While appearing simple, this method is extremely inefficient and not recommended for more than a few 1000s of prime numbers, even when compiled. For every number it will test its divisibility by all prime numbers smaller than its smallest prime factor. This means that for primes, it will check their divisibility by all their preceding numbers, when in fact no number above its square root needs to be tested at all. | ||
This code creates in effect a nested linear structure of filters in front of the infinite numbers supply, and does so in extremely premature fashion, leading to a blowup of unnecessary filters. One way of fixing that would be to postpone the creation of filters until the right moment, as in | This code creates in effect a nested linear structure of filters in front of the infinite numbers supply, and does so in extremely premature fashion, leading to a blowup of unnecessary filters. One way of fixing that would be to postpone the creation of filters until the right moment, as in | ||
Line 34: | Line 34: | ||
</haskell> | </haskell> | ||
It only tests odd numbers, and only those that are larger than a prime factor's square, for an increasing supply of primes | It only tests odd numbers, and only those that are larger than a prime factor's square, but smaller than the next primes' square, for an increasing supply of primes as they are generated. | ||
Whereas the first version exhibits near O(<math>{n^3}</math>) behavior, this one exhibits near O(<math>{n^{1.5}}</math>) behavior, and is good for creating a few 100,000s of primes. It takes as much time to generate <i>100,000 primes</i> with it as it takes for the first one to generate <i>5500</i> of them. | Whereas the first version exhibits near O(<math>{n^3}</math>) behavior, this one exhibits near O(<math>{n^{1.5}}</math>) behavior, and is good for creating a few 100,000s of primes. It takes as much time to generate <i>100,000 primes</i> with it as it takes for the first one to generate <i>5500</i> of them. |
Revision as of 10:57, 6 November 2009
Prime Number Resources
In mathematics, a prime number (or a prime) is a natural number which has exactly two distinct natural number divisors: 1 and itself.
Prime Numbers at Wikipedia.
Sieve of Eratosthenes at Wikipedia.
HackageDB packages:
Numbers: An assortment of number theoretic functions.
NumberSieves: Number Theoretic Sieves: primes, factorization, and Euler's Totient.
primes: Efficient, purely functional generation of prime numbers.
Finding Primes
Simple Prime Sieve
The following is an elegant way to generate a list of all the prime numbers in the universe:
primes :: [Integer]
primes = sieve [2..]
where sieve (p:xs) = p : sieve [x | x<-xs, x `mod` p /= 0]
While appearing simple, this method is extremely inefficient and not recommended for more than a few 1000s of prime numbers, even when compiled. For every number it will test its divisibility by all prime numbers smaller than its smallest prime factor. This means that for primes, it will check their divisibility by all their preceding numbers, when in fact no number above its square root needs to be tested at all.
This code creates in effect a nested linear structure of filters in front of the infinite numbers supply, and does so in extremely premature fashion, leading to a blowup of unnecessary filters. One way of fixing that would be to postpone the creation of filters until the right moment, as in
primes = 2: 3: sieve (tail primes) [5,7..] where
sieve (p:ps) xs
= h ++ sieve ps (filter ((/=0).(`rem`p)) t)
where (h,(_:t))=span (< p*p) xs
It only tests odd numbers, and only those that are larger than a prime factor's square, but smaller than the next primes' square, for an increasing supply of primes as they are generated.
Whereas the first version exhibits near O() behavior, this one exhibits near O() behavior, and is good for creating a few 100,000s of primes. It takes as much time to generate 100,000 primes with it as it takes for the first one to generate 5500 of them.
Given an infinite list of prime numbers, we can implement primality tests and integer factorization:
isPrime n = n > 1 && n == head (primeFactors n)
primeFactors 1 = []
primeFactors n = go n primes
where
go n ps@(p:pt)
| p*p > n = [n]
| n `rem` p == 0 = p : go (n `quot` p) ps
| otherwise = go n pt
Simple Prime Sieve II
The following method is slightly faster, creating 114,000 primes in the same period of time. It works well for generating a few 100,000s of primes as well:
primes :: [Integer]
primes = 2:filter isPrime [3,5..]
where
isPrime n = all (not . divides n) $ takeWhile (\p -> p*p <= n) primes
divides n p = n `mod` p == 0
It too only tests odd numbers and avoids testing for prime factors of that are larger than .
Simple Prime Sieve III
This is faster still, creating 158,000 primes in the same time span:
primes :: [Integer]
primes = 2:3:go 5 [] (tail primes)
where
divisibleBy d n = n `mod` d == 0
go start ds (p:ps) = let pSq = p*p in
foldr (\d -> filter (not . divisibleBy d)) [start, start + 2..pSq - 2] ds
++ go (pSq + 2) (p:ds) ps
Compared with the previous sieve, it makes the list of primes needed for testing, and tests the whole batch of odd numbers in the corresponding range with it, instead of repeatedly generating this list with takeWhile
for each individual number. It tests them by passing each number through a series of filter
function calls, each filtering by its own prime factor from that list. Were both foldr
and filter
to be compiled away by a smart compiler, that would've been like the and
call from the following version.
Simple Prime Sieve IV
primes :: [Integer]
primes = 2: 3: sieve 0 primes' 5
primes' = tail primes
sieve k (p:ps) x
= [x | x<-[x,x+2..p*p-2], and [(x`rem`p)/=0 | p<-take k primes']]
++ sieve (k+1) ps (p*p+2)
This version filters each odd number through by just one call to the and
function with explicit list of prime factors. It also avoids building this list of factors-to-test-by altogether, since it is actully the prefix of primes list itself and thus can be used directly, given that its length is known in advance for each iteration step. It is thus the fastest, making about 222,000 primes in the same time span. It is good for creating about a million first primes, compiled.
IOW this sieve
version explicitly uses successive prefixes of the primes list to test batches of odds between successive squares of primes, to keep only the prime numbers among them. Wherein lies the key deficiency of all the versions thus far which try to keep the primes among all numbers by testing each number in isolation, since testing composites is cheap (as most of them will have small factors), and testing prime numbers is costly. The real sieves try to get rid of composites working on spans of numbers at once and are thus able to get their primes for free, as gaps between the generated/marked/crossed-off composites.
The reason to have sieve
function available separately too is that it can also be used to produce primes above a given number, as in
primesFrom m = sieve (length h) ps $ m`div`2*2+1
where
(h,(_:ps)) = span (<= (floor.sqrt.fromIntegral) m) primes
It can thus produce a few primes e.g. above 239812076741689, which is a square of the millionth odd prime.
Prime Wheels
The idea of only testing odd numbers can be extended further. For instance, it is a useful fact that every prime number other than 2 and 3 must be of the form or . Thus, we only need to test these numbers:
primes :: [Integer]
primes = 2:3:primes'
where
1:p:candidates = [6*k+r | k <- [0..], r <- [1,5]]
primes' = p : filter isPrime candidates
isPrime n = all (not . divides n) $ takeWhile (\p -> p*p <= n) primes'
divides n p = n `mod` p == 0
Here, primes'
is the list of primes greater than 3 and isPrime
does not test for divisibility by 2 or 3 because the candidates
by construction don't have these numbers as factors. We also need to exclude 1 from the candidates and mark the next one as prime to start the recursion.
Such a scheme to generate candidate numbers that avoid the first a given set of primes as divisors is called a prime wheel. Imagine that you had a wheel of circumference 6 to be rolled along the number line. With spikes positioned 1 and 5 units around the circumference, rolling the wheel will prick holes exactly in those positions on the line whose numbers are not divisible by 2 and 3.
A wheel can be represented by its circumference and the spiked positions.
data Wheel = Wheel Integer [Integer]
We prick out numbers by rolling the wheel.
roll (Wheel n rs) = [n*k+r | k <- [0..], r <- rs]
The smallest wheel is the unit wheel with one spike, it will prick out every number.
w0 = Wheel 1 [1]
We can create a larger wheel by rolling a smaller wheel of circumference n
along a rim of circumference p*n
while excluding spike positions at multiples of p
.
nextSize (Wheel n rs) p =
Wheel (p*n) [r' | k <- [0..(p-1)], r <- rs, let r' = n*k+r, r' `mod` p /= 0]
Combining both, we can make wheels that prick out numbers that avoid a given list ds
of divisors.
mkWheel ds = foldl nextSize w0 ds
Now, we can generate prime numbers with a wheel that for instance avoids all multiples of 2, 3, 5 and 7.
primes :: [Integer]
primes = small ++ large
where
1:p:candidates = roll $ mkWheel small
small = [2,3,5,7]
large = p : filter isPrime candidates
isPrime n = all (not . divides n) $ takeWhile (\p -> p*p <= n) large
divides n p = n `mod` p == 0
It's a pretty big wheel with a circumference of 210 and allows us to calculate the first 10000 primes in convenient time.
A fixed size wheel is fine, but how about adapting the wheel size while generating prime numbers? See the functional pearl titled Lazy wheel sieves and spirals of primes for more.
Implicit Heap
The following is a more efficient prime generator, implementing the sieve of Eratosthenes.
See also the message threads Re: "no-coding" functional data structures via lazyness for more about how merging ordered lists amounts to creating an implicit heap and Re: Code and Perf. Data for Prime Finders for an explanation of the People a
structure that makes it work when tying a knot.
data People a = VIP a (People a) | Crowd [a]
mergeP :: Ord a => People a -> People a -> People a
mergeP (VIP x xt) ys = VIP x $ mergeP xt ys
mergeP (Crowd xs) (Crowd ys) = Crowd $ merge xs ys
mergeP xs@(Crowd ~(x:xt)) ys@(VIP y yt) = case compare x y of
LT -> VIP x $ mergeP (Crowd xt) ys
EQ -> VIP x $ mergeP (Crowd xt) yt
GT -> VIP y $ mergeP xs yt
merge :: Ord a => [a] -> [a] -> [a]
merge xs@(x:xt) ys@(y:yt) = case compare x y of
LT -> x : merge xt ys
EQ -> x : merge xt yt
GT -> y : merge xs yt
diff xs@(x:xt) ys@(y:yt) = case compare x y of
LT -> x : diff xt ys
EQ -> diff xt yt
GT -> diff xs yt
foldTree :: (a -> a -> a) -> [a] -> a
foldTree f ~(x:xs) = f x . foldTree f . pairs $ xs
where pairs ~(x: ~(y:ys)) = f x y : pairs ys
primes, nonprimes :: [Integer]
primes = 2:3:diff [5,7..] nonprimes
nonprimes = serve . foldTree mergeP . map multiples $ tail primes
where
multiples p = vip [p*k | k <- [p,p+2..]]
vip (x:xs) = VIP x $ Crowd xs
serve (VIP x xs) = x:serve xs
serve (Crowd xs) = xs
nonprimes
effectively implements a heap, exploiting lazy evaluation.
Bitwise prime sieve
Count the number of prime below a given 'n'. Shows fast bitwise arrays, and an example of Template Haskell to defeat your enemies.
{-# OPTIONS -O2 -optc-O -XBangPatterns #-}
module Primes (nthPrime) where
import Control.Monad.ST
import Data.Array.ST
import Data.Array.Base
import System
import Control.Monad
import Data.Bits
nthPrime :: Int -> Int
nthPrime n = runST (sieve n)
sieve n = do
a <- newArray (3,n) True :: ST s (STUArray s Int Bool)
let cutoff = truncate (sqrt $ fromIntegral n) + 1
go a n cutoff 3 1
go !a !m cutoff !n !c
| n >= m = return c
| otherwise = do
e <- unsafeRead a n
if e then
if n < cutoff then
let loop !j
| j < m = do
x <- unsafeRead a j
when x $ unsafeWrite a j False
loop (j+n)
| otherwise = go a m cutoff (n+2) (c+1)
in loop ( if n < 46340 then n * n else n `shiftL` 1)
else go a m cutoff (n+2) (c+1)
else go a m cutoff (n+2) c
And places in a module:
{-# OPTIONS -fth #-}
import Primes
main = print $( let x = nthPrime 10000000 in [| x |] )
Run as:
$ ghc --make -o primes Main.hs
$ time ./primes
664579
./primes 0.00s user 0.01s system 228% cpu 0.003 total
Using IntSet for a traditional sieve
module Sieve where
import qualified Data.IntSet as I
-- findNext - finds the next member of an IntSet.
findNext c is | I.member c is = c
| c > I.findMax is = error "Ooops. No next number in set."
| otherwise = findNext (c+1) is
-- mark - delete all multiples of n from n*n to the end of the set
mark n is = is I.\\ (I.fromAscList (takeWhile (<=end) (map (n*) [n..])))
where
end = I.findMax is
-- primes - gives all primes up to n
primes n = worker 2 (I.fromAscList [2..n])
where
worker x is
| (x*x) > n = is
| otherwise = worker (findNext (x+1) is) (mark x is)
Testing Primality
Miller-Rabin Primality Test
find2km :: Integral a => a -> (a,a)
find2km n = f 0 n
where
f k m
| r == 1 = (k,m)
| otherwise = f (k+1) q
where (q,r) = quotRem m 2
millerRabinPrimality :: Integer -> Integer -> Bool
millerRabinPrimality n a
| a <= 1 || a >= n-1 =
error $ "millerRabinPrimality: a out of range ("
++ show a ++ " for "++ show n ++ ")"
| n < 2 = False
| even n = False
| b0 == 1 || b0 == n' = True
| otherwise = iter (tail b)
where
n' = n-1
(k,m) = find2km n'
b0 = powMod n a m
b = take (fromIntegral k) $ iterate (squareMod n) b0
iter [] = False
iter (x:xs)
| x == 1 = False
| x == n' = True
| otherwise = iter xs
pow' :: (Num a, Integral b) => (a -> a -> a) -> (a -> a) -> a -> b -> a
pow' _ _ _ 0 = 1
pow' mul sq x' n' = f x' n' 1
where
f x n y
| n == 1 = x `mul` y
| r == 0 = f x2 q y
| otherwise = f x2 q (x `mul` y)
where
(q,r) = quotRem n 2
x2 = sq x
mulMod :: Integral a => a -> a -> a -> a
mulMod a b c = (b * c) `mod` a
squareMod :: Integral a => a -> a -> a
squareMod a b = (b * b) `rem` a
powMod :: Integral a => a -> a -> a -> a
powMod m = pow' (mulMod m) (squareMod m)
External links
- A collection of prime generators; the file "ONeillPrimes.hs" contains one of the fastest pure-Haskell prime generators. WARNING: Don't use the priority queue from that file for your projects: it's broken and works for primes only by a lucky chance.