https://wiki.haskell.org/api.php?action=feedcontributions&user=Steve+C&feedformat=atomHaskellWiki - User contributions [en]2021-05-08T08:44:08ZUser contributionsMediaWiki 1.27.4https://wiki.haskell.org/index.php?title=Tail_recursion&diff=34800Tail recursion2010-05-23T09:13:04Z<p>Steve C: Make 'foldl' the object of the link, and move below quote from Brent.</p>
<hr />
<div>A recursive function is tail recursive if the final result of the<br />
recursive call is the final result of the function itself. If the<br />
result of the recursive call must be further processed (say, by adding<br />
1 to it, or consing another element onto the beginning of it), it is<br />
not tail recursive.<br />
<br />
With that said, tail recursion is not that useful of a concept in a<br />
lazy language like Haskell. The important concept to know in Haskell<br />
is [[guarded recursion]], where any recursive calls occur within a data<br />
constructor (such as <hask>foldr</hask>, where the recursive call to foldr occurs<br />
as an argument to <hask>(:)</hask>). This allows the result of the function to be<br />
consumed lazily, since it can be evaluated up to the data constructor<br />
and the recursive call delayed until needed.<br />
<br />
Note that [[Fold|foldl]] is always tail recursive.<br />
<br />
== Source ==<br />
<br />
* Brent Yorgey in Haskell-Cafe on [http://www.haskell.org/pipermail/haskell-cafe/2009-March/058607.html Definition of "tail recursive" wrt Folds]<br />
<br />
[[Category:Glossary]]<br />
[[Category:Idioms]]</div>Steve Chttps://wiki.haskell.org/index.php?title=Prime_numbers&diff=32791Prime numbers2009-12-31T11:56:27Z<p>Steve C: Add "Using ST Array" section</p>
<hr />
<div>== Prime Number Resources ==<br />
In mathematics, a <i>prime number</i> (or a <i>prime</i>) is a natural number which has exactly two distinct natural number divisors: 1 and itself. The smallest prime is thus 2.<br />
<br />
[http://en.wikipedia.org/wiki/Prime_numbers Prime Numbers] at Wikipedia.<br />
<br />
[http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes Sieve of Eratosthenes] at Wikipedia.<br />
<br />
HackageDB packages:<br />
<br />
[http://hackage.haskell.org/package/Numbers Numbers]: An assortment of number theoretic functions.<br />
<br />
[http://hackage.haskell.org/package/NumberSieves NumberSieves]: Number Theoretic Sieves: primes, factorization, and Euler's Totient.<br />
<br />
[http://hackage.haskell.org/package/primes primes]: Efficient, purely functional generation of prime numbers.<br />
<br />
== Finding Primes ==<br />
<br />
Any natural number is representable as a product of powers of its prime factors, and so a prime has no prime divisors other than itself. That means that starting with 2, <i>for each</i> newly found <i>prime</i> we can <i>eliminate</i> from the rest of the numbers <i>all such numbers</i> that have this prime as their divisor, giving us the <i>next available</i> number as next prime. This is known as <i>sieving</i> the natural numbers, so that in the end what we are left with are just primes. <br />
<br />
=== The Classic Simple Primes Sieve ===<br />
<br />
Attributed to David Turner <i>(SASL Language Manual, 1975)</i>, the following is a direct translation of that idea, generating a list of all prime numbers:<br />
<br />
<haskell><br />
primes :: [Integer]<br />
primes = sieve [2..]<br />
where<br />
sieve (p:xs) = p : sieve [x | x<-xs, x `mod` p /= 0] <br />
-- or: filter ((/=0).(`mod`p)) xs<br />
</haskell><br />
<br />
This should only be considered a <i>specification</i>, not a <i>code</i>. When run as is, it is <i>extremely inefficient</i> because it starts up the filters prematurely, immediately after each prime, instead of only after the prime's square has been reached. To be admitted as prime, <i>each number</i> will be <i>tested for divisibility</i> here by all its preceding primes, while just those not greater than its square root would suffice. This means that e.g. to find the <b>1001</b>st prime (<code>7927</code>), <b>1000</b> filters are used, when in fact just the first <b>24</b> are needed (upto <code>89</code>'s filter only).<br />
<br />
So this in effect creates a cascade of nested filters in front of the infinite numbers supply, and in <i>extremely premature</i> fashion at that. One way of fixing that would be to <i>postpone</i> the creation of filters until the right moment, by decoupling the primes supply from the numbers supply, as in<br />
<br />
=== Postponed Filters Sieve ===<br />
<haskell><br />
primes :: [Integer]<br />
primes = 2: 3: sieve (tail primes) [5,7..]<br />
where <br />
sieve (p:ps) xs = h ++ sieve ps [x | x<-t, x `rem` p /= 0] <br />
-- or: filter ((/=0).(`rem`p)) t<br />
where (h,~(_:t)) = span (< p*p) xs<br />
</haskell><br />
<br />
This can be seen as essential framework for all the code to come. It only tests odd numbers, and only by the primes that are needed, for each <i>numbers span</i> between successive squares of primes. To find the <b>1001</b>st prime, the divisibility test is performed by only <b>24</b> nested filters corresponding to the first <b>24</b> odd primes.<br />
<br />
Whereas the first version exhibits near O(<math>{n^2}</math>) behavior, this one exhibits near O(<math>{n^{1.5}}</math>) behavior, with an orders-of-magnitude speedup. <br />
<br />
There is another way for the composites to be found - by generating all the multiples of successive primes, in advance. Any number thus generated will of course be divisible by the corresponding prime.<br />
<br />
==== Postponed Multiples Removal i.e. Euler's Sieve ====<br />
Instead of testing <i>each number</i> for divisibility by a prime we can just <i>remove</i> the prime's <i>multiples</i> in advance. We gain in speed because we now get the primes <i>for free</i>, after all the multiples are removed on a particular span, <i>without</i> performing any divisibility tests <i>at all</i>:<br />
<br />
<haskell><br />
primes :: [Integer]<br />
primes = 2: 3: sieve (tail primes) [5,7..] <br />
where<br />
sieve (p:ps) xs = h ++ sieve ps (t `minus` tail [q,q+2*p..])<br />
where (h,~(_:t)) = span (< q) xs <br />
q = p*p<br />
<br />
<br />
minus :: (Ord a) => [a] -> [a] -> [a]<br />
minus a@(x:xs) b@(y:ys) = case compare x y of <br />
LT -> x: xs `minus` b<br />
EQ -> xs `minus` ys<br />
GT -> a `minus` ys<br />
minus a b = a<br />
</haskell> <br />
<br />
This is, in fact, Euler's sieve.<br />
<br />
==== Merged Multiples Removal Sieve ====<br />
<code>(...((s-a)-b)-...)</code> is the same as <code>(s-(a+b+...))</code>, and so we can just remove the ''merged'' infinite primes multiples, each starting at its prime's square, from the ''initial'' numbers supply. This way we needn't explicitly jump to a prime's square because it's where its multiples start anyway:<br />
<haskell><br />
primes :: [Integer]<br />
primes = 2:primes'<br />
where<br />
primes' = [3] ++ [5,7..] `minus` foldr merge' [] mults<br />
mults = map (\p->let q=p*p in (q,tail [q,q+2*p..])) $ primes'<br />
merge' (q,qs) xs = q : merge qs xs<br />
<br />
<br />
<br />
merge :: (Ord a) => [a] -> [a] -> [a] <br />
merge a@(x:xs) b@(y:ys) = case compare x y of <br />
LT -> x: merge xs b <br />
EQ -> x: merge xs ys<br />
GT -> y: merge a ys<br />
merge a b = if null a then b else a<br />
</haskell><br />
<br />
This code is yet faster. Its main deficiency still is that it creates a linear nested merging structure, instead of a tree-like structure. Each multiple produced by a prime has to percolate to the top eventually, so it's better to make its path shorter. It'll have to go through fewer merge nodes this way.<br />
<br />
The linearity is imposed by type asymmetry of our <code>(merge' :: a -> b -> b)</code> function, forcing us into the <code>1+(1+(1+(1+...)))</code> pattern, <code>+</code> standing for <code>merge'</code> (which was defined that way to prevent the run-ahead when folding over the infinite list of lists of multiples).<br />
<br />
We need to turn it into an associative operation of uniform type <code>(:: a -> a -> a)</code> to be able to freely rearrange the combinations into arbitrary tree-like patterns, as in e.g. ((1+1)+(2+2))+(...) etc. The type uniformity is what makes compositionality possible.<br />
<br />
The code in the "Implicit Heap" section below improves on that, and is essentially equivalent to using a treefold instead of a standard linear <code>foldr</code>, as in:<br />
<br />
==== Treefold Merged Multiples Removal ====<br />
<haskell><br />
primes :: [Integer]<br />
primes = 2:primes'<br />
where<br />
primes' = [3,5] ++ drop 2 [3,5..] `minus` comps<br />
mults = map (\p-> let q=p*p in ([q],tail [q,q+2*p..])) $ primes'<br />
comps = fst $ tfold mergeSP (pairwise mergeSP mults)<br />
<br />
pairwise f (x:y:ys) = f x y : pairwise f ys<br />
<br />
tfold f (a: ~(b: ~(c:xs)))<br />
= (a `f` (b `f` c)) `f` tfold f (pairwise f xs)<br />
<br />
mergeSP (a,b) ~(c,d) = let (bc,b') = spMerge b c<br />
in (a ++ bc, merge b' d)<br />
where <br />
spMerge :: (Ord a) => [a] -> [a] -> ([a],[a]) <br />
spMerge a@(x:xs) b@(y:ys) = case compare x y of<br />
LT -> (x:c,d) where (c,d) = spMerge xs b<br />
EQ -> (x:c,d) where (c,d) = spMerge xs ys<br />
GT -> (y:c,d) where (c,d) = spMerge a ys<br />
spMerge a [] = ([] ,a)<br />
spMerge [] b = ([] ,b)<br />
</haskell><br />
The fold used here creates a <code>(2+(2+2))+( (4+(4+4)) + ( (8+(8+8)) + ... ))</code> structure, better adjusted for primes multiples production than <code>1+(2+(4+(8+...)))</code>, used by the "Implicit Heap" code, giving it additional 10% speedup.<br />
<br />
<code> mergeSP </code> is an associative operation, preserving of the invariant such that for a list of multiples <code>[(a,b),(c,d), ...]</code>, it's always the case that <code> last a < head b && last a < head c</code>. These "split pairs" represent ordered list as a pair of its known and (currently) finite prefix, and the rest of it. Such pairs under <code>mergeSP</code> operation form a <i>monoid</i>, and if we were to declare a <hask>newtype SplitPair a = SP ([a],[a])</hask> a <hask>Monoid</hask> instance, with <code>mergeSP</code> its <hask>mappend</hask> (and <code>tfold</code> its <hask>mconcat</hask>), the above code for <code>comps</code> would just become <hask>SP (comps,_) = mconcat mults</hask>.<br />
<br />
This code exhibits approximately O(<math>{n^{1.20}}</math>)..O(<math>{n^{1.15}}</math>) local asymptotic behavior (tested interpreted, in GHCi, for 10,000 to 300,000 primes produced). When compared with Melissa O'Neill's PQ code from the ZIP package which was modified to work on odds only as well, it is 3.2x times faster, with used memory reported about 2.5x times smaller.<br />
<br />
It can be further improved with the wheel optimization (as described below at the Prime Wheels section):<br />
<br />
==== Treefold Merged Multiples, with Wheel ====<br />
<haskell><br />
primes :: [Integer]<br />
primes = 2:3:5:7:primes' <br />
where<br />
primes' = [11,13] ++ drop 2 (rollFrom 11) `minus` comps<br />
mults = map (\p-> fromList $ map (p*) $ rollFrom p) $ primes'<br />
comps = fst $ tfold mergeSP (pairwise mergeSP mults)<br />
fromList (x:xs) = ([x],xs)<br />
rollFrom n = let x = (n-11) `mod` 210<br />
(y,_) = span (< x) wheelNums<br />
in roll n $ drop (length y) wheel<br />
<br />
wheelNums = roll 0 wheel<br />
roll = scanl (+)<br />
wheel = 2:4:2:4:6:2:6:4:2:4:6:6:2:6:4:2:6:4:6:8:4:2:4:2:<br />
4:8:6:4:6:2:4:6:2:6:6:4:2:4:6:2:6:4:2:4:2:10:2:10:wheel<br />
</haskell><br />
<br />
This runs about 2.5x times faster than the Priority Queue-based code as present in Melissa O'Neill's ZIP package, with similar local asymptotic behavior of about O(<math>{n^{1.25}}</math>)..O(<math>{n^{1.17}}</math>) (tested interpreted, in GHCi, for 10,000 to 300,000 primes produced), with used memory reported about twice less as well.<br />
<br />
=== More Filtering Sieves ===<br />
<br />
The primes list creation with divisibility testing can be reformulated in a few more ways, using the list of primes <i>as it is being built</i> (a la "circular programming").<br />
<br />
==== Odd numbers, by Trial Division ====<br />
This is also good for generating a few 100,000s primes (when GHC-compiled as well):<br />
<br />
<haskell><br />
primes :: [Integer]<br />
primes = 2: 3: filter isPrime [5,7..]<br />
where<br />
isPrime n = all (notDivs n) $ takeWhile (\p-> p*p <= n) (tail primes)<br />
notDivs n p = n `mod` p /= 0 <br />
</haskell><br />
<br />
Instead of relying on nested filters, it tests each odd number by an explicit list of all the needed prime factors. But for each number tested it re-fetches this list <i>anew</i> which will be <i>the same</i> for the increasing spans of numbers between the successive squares of primes.<br />
<br />
==== Generated Spans, by Nested Filters ====<br />
The other way to go instead of concentrating on the numbers supply, is to directly work on the successive spans between the primes squares.<br />
<br />
This version is a bit faster still, creating <i>158,000 primes</i> (again, GHC-compiled) in the same time as the postponed filters does 100,000 primes:<br />
<br />
<haskell><br />
primes :: [Integer]<br />
primes = 2: 3: sieve [] (tail primes) 5<br />
where<br />
notDivsBy d n = n `mod` d /= 0<br />
sieve ds (p:ps) x = foldr (filter . notDivsBy) [x, x+2..p*p-2] ds<br />
++ sieve (p:ds) ps (p*p+2)<br />
</haskell><br />
<br />
This one explicitly maintains the list of primes needed for testing each odds span between successive primes squares, which it also explicitly generates. But it tests with nested <code>filter</code>s, which it repeatedly recreates.<br />
<br />
==== Generated Spans, by List of Primes ====<br />
The list of primes needed to test each range of odds is actually just the prefix of the primes list itself, of known length, and need not be specifically generated at all. Combined with one-call testing by the explicit list of primes, and direct generation of odds between the successive primes squares, this leads to:<br />
<br />
<haskell><br />
primes :: [Integer]<br />
primes = 2: 3: sieve 0 (tail primes) 5<br />
sieve k (p:ps) x = [x | x<-[x,x+2..p*p-2], and [x`rem`p/=0 | p<-fs]]<br />
-- or: all ((>0).(x`rem`)) fs<br />
++ sieve (k+1) ps (p*p+2) <br />
where fs = take k (tail primes) <br />
</haskell><br />
<br />
It produces about <i>222,000 primes</i> in the same amount of time, and is good for creating about a million first primes, compiled.<br />
<br />
The reason to have <code>sieve</code> function available separately too is that it can also be used to produce primes above a given number, as in<br />
<br />
<haskell><br />
primesFrom m = sieve (length h) ps $ m`div`2*2+1 <br />
where <br />
(h,(_:ps)) = span (<= (floor.sqrt.fromIntegral) m) primes<br />
</haskell><br />
<br />
It can thus produce a few primes e.g. above <code>239812076741689</code>, which is a square of the millionth odd prime, without having to compute all the preceding primes (which would number in trillions).<br />
<br />
==== Multiples Removal on Generated Spans, or Sieve of Eratosthenes ====<br />
<br />
The divisibility testing too should be considered a specification (as in "no multiples of p"), and not a code per se, because although testing composites is cheap (as most of them will have small factors, so the test is soon aborted), testing prime numbers is costly, and is to be avoided.<br />
<br />
All the filtering versions thus far try to <i>keep the primes</i> among all numbers by testing <i>each number</i> in isolation. Instead, all the relevant primes' multiples can be removed from the corresponding segments of odds, and what's left after that will be just primes:<br />
<br />
<haskell><br />
primes :: [Integer]<br />
primes = 2: 3: sieve [] (tail primes) 5<br />
where<br />
sieve fs (p:ps) x = [x,x+2..q-2] `minus` foldl merge [] mults<br />
++ sieve ((2*p,q):fs') ps (q+2)<br />
where<br />
q = p*p<br />
mults = [ [y+s,y+2*s..q] | (s,y)<- fs]<br />
fs' = [ (s,last ms) | ((s,_),ms)<- zip fs mults]<br />
</haskell> <br />
<br />
This modifies the preceding sieve to "mark" the odd composites in a given range (instead of testing their divisibility) by generating them - just as a person performing the original sieve of Eratosthenes would do, marking one by one the multiples of the relevant primes.<br />
<br />
Compared with the previous version, interpreted under both GHCi and WinHugs, it runs <i>faster</i>, takes <i>less</i> memory, and has better asymptotic behavior, its performance approximately the same as in the Merged Multiples Removal sieve. The advantage in working with spans explicitly is that this code is easily amendable to using arrays for the composites marking and removal on each <i>finite</i> span. <br />
<br />
=== Using Immutable Arrays ===<br />
<br />
==== Generating Segments of Primes ====<br />
<br />
The removal of multiples on each segment of odds in the sieve of Eratosthenes can be done by actually marking them in an array, instead of manipulating lists with "minus" and "merge":<br />
<haskell><br />
primes :: [Integer]<br />
primes = 2: 3: sieve [] (tail primes) 5 <br />
where <br />
sieve fs (p:ps) x = [i | i<- [x,x+2..q-2], a!i] <br />
++ sieve ((2*p,q):fs') ps (q+2)<br />
where<br />
q = p*p<br />
mults = [ [y+s,y+2*s..q] | (s,y)<- fs]<br />
fs' = [ (s,last ms) | ((s,_),ms)<- zip fs mults]<br />
a = accumArray (\a b->False) True (x,q-2) <br />
[(i,()) | ms<- mults, i<- ms] <br />
</haskell><br />
Apparently, arrays are ''fast'' too. This code, compared with Treefold Merged with Wheel version (itself 2.5x times faster than Melissa O'Neill's PQ version), runs at about the same time and memory usage, but improving slightly with slightly better local asymptotics.<br />
<br />
==== Calculating Primes Upto a Given Value ====<br />
<br />
<haskell><br />
primesToN n = 2: [i | i<-[3,5..n], ar!i]<br />
where<br />
ar = f 5 $ accumArray (\a b->False) True (3,n) <br />
[(i,()) | i<- [9,15..n]]<br />
f p a | q>=n = a<br />
| True = if null x then a' else f (head x) a'<br />
where q = p*p<br />
a'= a//[(i,False)|i<-[q,q+2*p..n]]<br />
x = [i | i<-[p+2,p+4..n], a' !i]<br />
</haskell><br />
<br />
==== Calculating Primes in a Given Range ====<br />
<br />
<haskell><br />
primesFromTo a b = (if a<3 then [2] else []) <br />
++ [i | i<-[o,o+2..b], ar!i]<br />
where <br />
o = max (if even a then a+1 else a) 3<br />
r = (1+).floor.sqrt.fromInteger $ b<br />
ar = accumArray (\a b->False) True (o,b) <br />
[(i,()) | p <- [3,5..r]<br />
, let q = p*p <br />
s = 2*p <br />
(n,x) = quotRem (o-q) s <br />
q' = if o <= q then q<br />
else if x==0 then q+n*s<br />
else q+(n+1)*s<br />
, i<- [q',q'+s..b] ]<br />
</haskell><br />
<br />
Although testing by odds instead of by primes, the array generation is so fast that it is very much feasible and even preferable for quick generation of some short spans of relatively big primes.<br />
<br />
=== Using Mutable Arrays ===<br />
<br />
Using mutable arrays is the fastest and most memory efficient way to calculate prime numbers in Haskell.<br />
<br />
==== Using ST Array ====<br />
<br />
This method implements the Sieve of Eratosthenes, similar to how you might do it<br />
in C.<br />
<br />
<haskell><br />
import Control.Monad<br />
import Control.Monad.ST<br />
import Data.Array.IArray<br />
import Data.Array.MArray<br />
import Data.Array.ST<br />
import Data.Array.Unboxed<br />
<br />
primesToNA :: Int -> UArray Int Bool<br />
primesToNA n = runSTUArray sieve where<br />
sieve = do<br />
a <- newArray (2,n) True :: ST s (STUArray s Int Bool)<br />
let sr = floor . (sqrt::Double->Double) . fromIntegral $ n+1<br />
forM_ [4,6..n] $ \j -> writeArray a j False<br />
forM_ [3,5..sr] $ \i -> do<br />
si <- readArray a i<br />
when si $<br />
forM_ [i*i,i*i+i+i..n] $ \j -> writeArray a j False<br />
return a<br />
<br />
primesToN :: Int -> [Int]<br />
primesToN n = [i | (i,e) <- assocs . primesToNA $n, e]<br />
</haskell><br />
<br />
==== Bitwise prime sieve with Template Haskell ====<br />
<br />
Count the number of prime below a given 'n'. Shows fast bitwise arrays,<br />
and an example of [[Template Haskell]] to defeat your enemies.<br />
<br />
<haskell><br />
{-# OPTIONS -O2 -optc-O -XBangPatterns #-}<br />
module Primes (nthPrime) where<br />
<br />
import Control.Monad.ST<br />
import Data.Array.ST<br />
import Data.Array.Base<br />
import System<br />
import Control.Monad<br />
import Data.Bits<br />
<br />
nthPrime :: Int -> Int<br />
nthPrime n = runST (sieve n)<br />
<br />
sieve n = do<br />
a <- newArray (3,n) True :: ST s (STUArray s Int Bool)<br />
let cutoff = truncate (sqrt $ fromIntegral n) + 1<br />
go a n cutoff 3 1<br />
<br />
go !a !m cutoff !n !c<br />
| n >= m = return c<br />
| otherwise = do<br />
e <- unsafeRead a n<br />
if e then<br />
if n < cutoff then<br />
let loop !j<br />
| j < m = do<br />
x <- unsafeRead a j<br />
when x $ unsafeWrite a j False<br />
loop (j+n)<br />
| otherwise = go a m cutoff (n+2) (c+1)<br />
in loop ( if n < 46340 then n * n else n `shiftL` 1)<br />
else go a m cutoff (n+2) (c+1)<br />
else go a m cutoff (n+2) c<br />
</haskell><br />
<br />
And places in a module:<br />
<br />
<haskell><br />
{-# OPTIONS -fth #-}<br />
import Primes<br />
<br />
main = print $( let x = nthPrime 10000000 in [| x |] )<br />
</haskell><br />
<br />
Run as:<br />
<br />
<haskell><br />
$ ghc --make -o primes Main.hs<br />
$ time ./primes<br />
664579<br />
./primes 0.00s user 0.01s system 228% cpu 0.003 total<br />
</haskell><br />
<br />
=== Implicit Heap ===<br />
<br />
The following is a more efficient prime generator, implementing the sieve of<br />
Eratosthenes.<br />
<br />
See also the message threads [http://thread.gmane.org/gmane.comp.lang.haskell.cafe/25270/focus=25312 Re: "no-coding" functional data structures via lazyness] for more about how merging ordered lists amounts to creating an implicit heap and [http://thread.gmane.org/gmane.comp.lang.haskell.cafe/26426/focus=26493 Re: Code and Perf. Data for Prime Finders] for an explanation of the <hask>People a</hask> structure that makes it work when tying a knot.<br />
<br />
<haskell><br />
data People a = VIP a (People a) | Crowd [a]<br />
<br />
mergeP :: Ord a => People a -> People a -> People a<br />
mergeP (VIP x xt) ys = VIP x $ mergeP xt ys<br />
mergeP (Crowd xs) (Crowd ys) = Crowd $ merge xs ys<br />
mergeP xs@(Crowd (x:xt)) ys@(VIP y yt) = case compare x y of<br />
LT -> VIP x $ mergeP (Crowd xt) ys<br />
EQ -> VIP x $ mergeP (Crowd xt) yt<br />
GT -> VIP y $ mergeP xs yt<br />
<br />
merge :: Ord a => [a] -> [a] -> [a]<br />
merge xs@(x:xt) ys@(y:yt) = case compare x y of<br />
LT -> x : merge xt ys<br />
EQ -> x : merge xt yt<br />
GT -> y : merge xs yt<br />
<br />
diff xs@(x:xt) ys@(y:yt) = case compare x y of<br />
LT -> x : diff xt ys<br />
EQ -> diff xt yt<br />
GT -> diff xs yt<br />
<br />
foldTree :: (a -> a -> a) -> [a] -> a<br />
foldTree f ~(x:xs) = x `f` foldTree f (pairs xs)<br />
where pairs ~(x: ~(y:ys)) = f x y : pairs ys<br />
<br />
primes, nonprimes :: [Integer]<br />
primes = 2:3:diff [5,7..] nonprimes<br />
nonprimes = serve . foldTree mergeP . map multiples $ tail primes<br />
where<br />
multiples p = vip [p*p,p*p+2*p..]<br />
<br />
vip (x:xs) = VIP x $ Crowd xs<br />
serve (VIP x xs) = x:serve xs<br />
serve (Crowd xs) = xs<br />
</haskell><br />
<br />
<hask>nonprimes</hask> effectively implements a heap, exploiting lazy evaluation.<br />
<br />
=== Prime Wheels ===<br />
<br />
The idea of only testing odd numbers can be extended further. For instance, it is a useful fact that every prime number other than 2 and 3 must be of the form <math>6k+1</math> or <math>6k+5</math>. Thus, we only need to test these numbers:<br />
<br />
<haskell><br />
primes :: [Integer]<br />
primes = 2:3:primes'<br />
where<br />
1:p:candidates = [6*k+r | k <- [0..], r <- [1,5]]<br />
primes' = p : filter isPrime candidates<br />
isPrime n = all (not . divides n) $ takeWhile (\p -> p*p <= n) primes'<br />
divides n p = n `mod` p == 0<br />
</haskell><br />
<br />
Here, <hask>primes'</hask> is the list of primes greater than 3 and <hask>isPrime</hask> does not test for divisibility by 2 or 3 because the <hask>candidates</hask> by construction don't have these numbers as factors. We also need to exclude 1 from the candidates and mark the next one as prime to start the recursion.<br />
<br />
Such a scheme to generate candidate numbers first that avoid a given set of primes as divisors is called a '''prime wheel'''. Imagine that you had a wheel of circumference 6 to be rolled along the number line. With spikes positioned 1 and 5 units around the circumference, rolling the wheel will prick holes exactly in those positions on the line whose numbers are not divisible by 2 and 3.<br />
<br />
A wheel can be represented by its circumference and the spiked positions.<br />
<haskell><br />
data Wheel = Wheel Integer [Integer]<br />
</haskell><br />
We prick out numbers by rolling the wheel.<br />
<haskell><br />
roll (Wheel n rs) = [n*k+r | k <- [0..], r <- rs]<br />
</haskell><br />
The smallest wheel is the unit wheel with one spike, it will prick out every number.<br />
<haskell><br />
w0 = Wheel 1 [1]<br />
</haskell><br />
We can create a larger wheel by rolling a smaller wheel of circumference <hask>n</hask> along a rim of circumference <hask>p*n</hask> while excluding spike positions at multiples of <hask>p</hask>.<br />
<haskell><br />
nextSize (Wheel n rs) p =<br />
Wheel (p*n) [r' | k <- [0..(p-1)], r <- rs, let r' = n*k+r, r' `mod` p /= 0]<br />
</haskell><br />
Combining both, we can make wheels that prick out numbers that avoid a given list <hask>ds</hask> of divisors.<br />
<haskell><br />
mkWheel ds = foldl nextSize w0 ds<br />
</haskell><br />
<br />
Now, we can generate prime numbers with a wheel that for instance avoids all multiples of 2, 3, 5 and 7.<br />
<haskell><br />
primes :: [Integer]<br />
primes = small ++ large<br />
where<br />
1:p:candidates = roll $ mkWheel small<br />
small = [2,3,5,7]<br />
large = p : filter isPrime candidates<br />
isPrime n = all (not . divides n) $ takeWhile (\p -> p*p <= n) large<br />
divides n p = n `mod` p == 0<br />
</haskell><br />
It's a pretty big wheel with a circumference of 210 and allows us to calculate the first 10000 primes in convenient time.<br />
<br />
A fixed size wheel is fine, but how about adapting the wheel size while generating prime numbers? See the [[Research papers/Functional pearls|functional pearl]] titled [http://citeseer.ist.psu.edu/runciman97lazy.html Lazy wheel sieves and spirals of primes] for more.<br />
<br />
<br />
=== Using IntSet for a traditional sieve ===<br />
<haskell><br />
<br />
module Sieve where<br />
import qualified Data.IntSet as I<br />
<br />
-- findNext - finds the next member of an IntSet.<br />
findNext c is | I.member c is = c<br />
| c > I.findMax is = error "Ooops. No next number in set." <br />
| otherwise = findNext (c+1) is<br />
<br />
-- mark - delete all multiples of n from n*n to the end of the set<br />
mark n is = is I.\\ (I.fromAscList (takeWhile (<=end) (map (n*) [n..])))<br />
where<br />
end = I.findMax is<br />
<br />
-- primes - gives all primes up to n <br />
primes n = worker 2 (I.fromAscList [2..n])<br />
where<br />
worker x is <br />
| (x*x) > n = is<br />
| otherwise = worker (findNext (x+1) is) (mark x is)<br />
</haskell><br />
<br />
== Testing Primality ==<br />
<br />
=== Primality Test and Integer Factorization ===<br />
<br />
Given an infinite list of prime numbers, we can implement primality tests and integer factorization:<br />
<br />
<haskell><br />
isPrime n = n > 1 && n == head (primeFactors n)<br />
<br />
primeFactors 1 = []<br />
primeFactors n = go n primes<br />
where<br />
go n ps@(p:pt)<br />
| p*p > n = [n]<br />
| n `rem` p == 0 = p : go (n `quot` p) ps<br />
| otherwise = go n pt<br />
</haskell><br />
<br />
=== Miller-Rabin Primality Test ===<br />
<haskell><br />
find2km :: Integral a => a -> (a,a)<br />
find2km n = f 0 n<br />
where <br />
f k m<br />
| r == 1 = (k,m)<br />
| otherwise = f (k+1) q<br />
where (q,r) = quotRem m 2 <br />
<br />
millerRabinPrimality :: Integer -> Integer -> Bool<br />
millerRabinPrimality n a<br />
| a <= 1 || a >= n-1 = <br />
error $ "millerRabinPrimality: a out of range (" <br />
++ show a ++ " for "++ show n ++ ")" <br />
| n < 2 = False<br />
| even n = False<br />
| b0 == 1 || b0 == n' = True<br />
| otherwise = iter (tail b)<br />
where<br />
n' = n-1<br />
(k,m) = find2km n'<br />
b0 = powMod n a m<br />
b = take (fromIntegral k) $ iterate (squareMod n) b0<br />
iter [] = False<br />
iter (x:xs)<br />
| x == 1 = False<br />
| x == n' = True<br />
| otherwise = iter xs<br />
<br />
pow' :: (Num a, Integral b) => (a -> a -> a) -> (a -> a) -> a -> b -> a<br />
pow' _ _ _ 0 = 1<br />
pow' mul sq x' n' = f x' n' 1<br />
where <br />
f x n y<br />
| n == 1 = x `mul` y<br />
| r == 0 = f x2 q y<br />
| otherwise = f x2 q (x `mul` y)<br />
where<br />
(q,r) = quotRem n 2<br />
x2 = sq x<br />
<br />
mulMod :: Integral a => a -> a -> a -> a<br />
mulMod a b c = (b * c) `mod` a<br />
squareMod :: Integral a => a -> a -> a<br />
squareMod a b = (b * b) `rem` a<br />
powMod :: Integral a => a -> a -> a -> a<br />
powMod m = pow' (mulMod m) (squareMod m)<br />
</haskell><br />
<br />
== External links ==<br />
* http://www.cs.hmc.edu/~oneill/code/haskell-primes.zip<br />
: A collection of prime generators; the file "ONeillPrimes.hs" contains one of the fastest pure-Haskell prime generators. WARNING: Don't use the priority queue from that file for your projects: it's broken and works for primes only by a lucky chance.<br />
<br />
[[Category:Code]]<br />
[[Category:Mathematics]]</div>Steve Chttps://wiki.haskell.org/index.php?title=Prime_numbers&diff=32790Prime numbers2009-12-31T11:19:06Z<p>Steve C: Rename "Arrays" section to "Immutable Arrays", add "Mutable Arrays" section</p>
<hr />
<div>== Prime Number Resources ==<br />
In mathematics, a <i>prime number</i> (or a <i>prime</i>) is a natural number which has exactly two distinct natural number divisors: 1 and itself. The smallest prime is thus 2.<br />
<br />
[http://en.wikipedia.org/wiki/Prime_numbers Prime Numbers] at Wikipedia.<br />
<br />
[http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes Sieve of Eratosthenes] at Wikipedia.<br />
<br />
HackageDB packages:<br />
<br />
[http://hackage.haskell.org/package/Numbers Numbers]: An assortment of number theoretic functions.<br />
<br />
[http://hackage.haskell.org/package/NumberSieves NumberSieves]: Number Theoretic Sieves: primes, factorization, and Euler's Totient.<br />
<br />
[http://hackage.haskell.org/package/primes primes]: Efficient, purely functional generation of prime numbers.<br />
<br />
== Finding Primes ==<br />
<br />
Any natural number is representable as a product of powers of its prime factors, and so a prime has no prime divisors other than itself. That means that starting with 2, <i>for each</i> newly found <i>prime</i> we can <i>eliminate</i> from the rest of the numbers <i>all such numbers</i> that have this prime as their divisor, giving us the <i>next available</i> number as next prime. This is known as <i>sieving</i> the natural numbers, so that in the end what we are left with are just primes. <br />
<br />
=== The Classic Simple Primes Sieve ===<br />
<br />
Attributed to David Turner <i>(SASL Language Manual, 1975)</i>, the following is a direct translation of that idea, generating a list of all prime numbers:<br />
<br />
<haskell><br />
primes :: [Integer]<br />
primes = sieve [2..]<br />
where<br />
sieve (p:xs) = p : sieve [x | x<-xs, x `mod` p /= 0] <br />
-- or: filter ((/=0).(`mod`p)) xs<br />
</haskell><br />
<br />
This should only be considered a <i>specification</i>, not a <i>code</i>. When run as is, it is <i>extremely inefficient</i> because it starts up the filters prematurely, immediately after each prime, instead of only after the prime's square has been reached. To be admitted as prime, <i>each number</i> will be <i>tested for divisibility</i> here by all its preceding primes, while just those not greater than its square root would suffice. This means that e.g. to find the <b>1001</b>st prime (<code>7927</code>), <b>1000</b> filters are used, when in fact just the first <b>24</b> are needed (upto <code>89</code>'s filter only).<br />
<br />
So this in effect creates a cascade of nested filters in front of the infinite numbers supply, and in <i>extremely premature</i> fashion at that. One way of fixing that would be to <i>postpone</i> the creation of filters until the right moment, by decoupling the primes supply from the numbers supply, as in<br />
<br />
=== Postponed Filters Sieve ===<br />
<haskell><br />
primes :: [Integer]<br />
primes = 2: 3: sieve (tail primes) [5,7..]<br />
where <br />
sieve (p:ps) xs = h ++ sieve ps [x | x<-t, x `rem` p /= 0] <br />
-- or: filter ((/=0).(`rem`p)) t<br />
where (h,~(_:t)) = span (< p*p) xs<br />
</haskell><br />
<br />
This can be seen as essential framework for all the code to come. It only tests odd numbers, and only by the primes that are needed, for each <i>numbers span</i> between successive squares of primes. To find the <b>1001</b>st prime, the divisibility test is performed by only <b>24</b> nested filters corresponding to the first <b>24</b> odd primes.<br />
<br />
Whereas the first version exhibits near O(<math>{n^2}</math>) behavior, this one exhibits near O(<math>{n^{1.5}}</math>) behavior, with an orders-of-magnitude speedup. <br />
<br />
There is another way for the composites to be found - by generating all the multiples of successive primes, in advance. Any number thus generated will of course be divisible by the corresponding prime.<br />
<br />
==== Postponed Multiples Removal i.e. Euler's Sieve ====<br />
Instead of testing <i>each number</i> for divisibility by a prime we can just <i>remove</i> the prime's <i>multiples</i> in advance. We gain in speed because we now get the primes <i>for free</i>, after all the multiples are removed on a particular span, <i>without</i> performing any divisibility tests <i>at all</i>:<br />
<br />
<haskell><br />
primes :: [Integer]<br />
primes = 2: 3: sieve (tail primes) [5,7..] <br />
where<br />
sieve (p:ps) xs = h ++ sieve ps (t `minus` tail [q,q+2*p..])<br />
where (h,~(_:t)) = span (< q) xs <br />
q = p*p<br />
<br />
<br />
minus :: (Ord a) => [a] -> [a] -> [a]<br />
minus a@(x:xs) b@(y:ys) = case compare x y of <br />
LT -> x: xs `minus` b<br />
EQ -> xs `minus` ys<br />
GT -> a `minus` ys<br />
minus a b = a<br />
</haskell> <br />
<br />
This is, in fact, Euler's sieve.<br />
<br />
==== Merged Multiples Removal Sieve ====<br />
<code>(...((s-a)-b)-...)</code> is the same as <code>(s-(a+b+...))</code>, and so we can just remove the ''merged'' infinite primes multiples, each starting at its prime's square, from the ''initial'' numbers supply. This way we needn't explicitly jump to a prime's square because it's where its multiples start anyway:<br />
<haskell><br />
primes :: [Integer]<br />
primes = 2:primes'<br />
where<br />
primes' = [3] ++ [5,7..] `minus` foldr merge' [] mults<br />
mults = map (\p->let q=p*p in (q,tail [q,q+2*p..])) $ primes'<br />
merge' (q,qs) xs = q : merge qs xs<br />
<br />
<br />
<br />
merge :: (Ord a) => [a] -> [a] -> [a] <br />
merge a@(x:xs) b@(y:ys) = case compare x y of <br />
LT -> x: merge xs b <br />
EQ -> x: merge xs ys<br />
GT -> y: merge a ys<br />
merge a b = if null a then b else a<br />
</haskell><br />
<br />
This code is yet faster. Its main deficiency still is that it creates a linear nested merging structure, instead of a tree-like structure. Each multiple produced by a prime has to percolate to the top eventually, so it's better to make its path shorter. It'll have to go through fewer merge nodes this way.<br />
<br />
The linearity is imposed by type asymmetry of our <code>(merge' :: a -> b -> b)</code> function, forcing us into the <code>1+(1+(1+(1+...)))</code> pattern, <code>+</code> standing for <code>merge'</code> (which was defined that way to prevent the run-ahead when folding over the infinite list of lists of multiples).<br />
<br />
We need to turn it into an associative operation of uniform type <code>(:: a -> a -> a)</code> to be able to freely rearrange the combinations into arbitrary tree-like patterns, as in e.g. ((1+1)+(2+2))+(...) etc. The type uniformity is what makes compositionality possible.<br />
<br />
The code in the "Implicit Heap" section below improves on that, and is essentially equivalent to using a treefold instead of a standard linear <code>foldr</code>, as in:<br />
<br />
==== Treefold Merged Multiples Removal ====<br />
<haskell><br />
primes :: [Integer]<br />
primes = 2:primes'<br />
where<br />
primes' = [3,5] ++ drop 2 [3,5..] `minus` comps<br />
mults = map (\p-> let q=p*p in ([q],tail [q,q+2*p..])) $ primes'<br />
comps = fst $ tfold mergeSP (pairwise mergeSP mults)<br />
<br />
pairwise f (x:y:ys) = f x y : pairwise f ys<br />
<br />
tfold f (a: ~(b: ~(c:xs)))<br />
= (a `f` (b `f` c)) `f` tfold f (pairwise f xs)<br />
<br />
mergeSP (a,b) ~(c,d) = let (bc,b') = spMerge b c<br />
in (a ++ bc, merge b' d)<br />
where <br />
spMerge :: (Ord a) => [a] -> [a] -> ([a],[a]) <br />
spMerge a@(x:xs) b@(y:ys) = case compare x y of<br />
LT -> (x:c,d) where (c,d) = spMerge xs b<br />
EQ -> (x:c,d) where (c,d) = spMerge xs ys<br />
GT -> (y:c,d) where (c,d) = spMerge a ys<br />
spMerge a [] = ([] ,a)<br />
spMerge [] b = ([] ,b)<br />
</haskell><br />
The fold used here creates a <code>(2+(2+2))+( (4+(4+4)) + ( (8+(8+8)) + ... ))</code> structure, better adjusted for primes multiples production than <code>1+(2+(4+(8+...)))</code>, used by the "Implicit Heap" code, giving it additional 10% speedup.<br />
<br />
<code> mergeSP </code> is an associative operation, preserving of the invariant such that for a list of multiples <code>[(a,b),(c,d), ...]</code>, it's always the case that <code> last a < head b && last a < head c</code>. These "split pairs" represent ordered list as a pair of its known and (currently) finite prefix, and the rest of it. Such pairs under <code>mergeSP</code> operation form a <i>monoid</i>, and if we were to declare a <hask>newtype SplitPair a = SP ([a],[a])</hask> a <hask>Monoid</hask> instance, with <code>mergeSP</code> its <hask>mappend</hask> (and <code>tfold</code> its <hask>mconcat</hask>), the above code for <code>comps</code> would just become <hask>SP (comps,_) = mconcat mults</hask>.<br />
<br />
This code exhibits approximately O(<math>{n^{1.20}}</math>)..O(<math>{n^{1.15}}</math>) local asymptotic behavior (tested interpreted, in GHCi, for 10,000 to 300,000 primes produced). When compared with Melissa O'Neill's PQ code from the ZIP package which was modified to work on odds only as well, it is 3.2x times faster, with used memory reported about 2.5x times smaller.<br />
<br />
It can be further improved with the wheel optimization (as described below at the Prime Wheels section):<br />
<br />
==== Treefold Merged Multiples, with Wheel ====<br />
<haskell><br />
primes :: [Integer]<br />
primes = 2:3:5:7:primes' <br />
where<br />
primes' = [11,13] ++ drop 2 (rollFrom 11) `minus` comps<br />
mults = map (\p-> fromList $ map (p*) $ rollFrom p) $ primes'<br />
comps = fst $ tfold mergeSP (pairwise mergeSP mults)<br />
fromList (x:xs) = ([x],xs)<br />
rollFrom n = let x = (n-11) `mod` 210<br />
(y,_) = span (< x) wheelNums<br />
in roll n $ drop (length y) wheel<br />
<br />
wheelNums = roll 0 wheel<br />
roll = scanl (+)<br />
wheel = 2:4:2:4:6:2:6:4:2:4:6:6:2:6:4:2:6:4:6:8:4:2:4:2:<br />
4:8:6:4:6:2:4:6:2:6:6:4:2:4:6:2:6:4:2:4:2:10:2:10:wheel<br />
</haskell><br />
<br />
This runs about 2.5x times faster than the Priority Queue-based code as present in Melissa O'Neill's ZIP package, with similar local asymptotic behavior of about O(<math>{n^{1.25}}</math>)..O(<math>{n^{1.17}}</math>) (tested interpreted, in GHCi, for 10,000 to 300,000 primes produced), with used memory reported about twice less as well.<br />
<br />
=== More Filtering Sieves ===<br />
<br />
The primes list creation with divisibility testing can be reformulated in a few more ways, using the list of primes <i>as it is being built</i> (a la "circular programming").<br />
<br />
==== Odd numbers, by Trial Division ====<br />
This is also good for generating a few 100,000s primes (when GHC-compiled as well):<br />
<br />
<haskell><br />
primes :: [Integer]<br />
primes = 2: 3: filter isPrime [5,7..]<br />
where<br />
isPrime n = all (notDivs n) $ takeWhile (\p-> p*p <= n) (tail primes)<br />
notDivs n p = n `mod` p /= 0 <br />
</haskell><br />
<br />
Instead of relying on nested filters, it tests each odd number by an explicit list of all the needed prime factors. But for each number tested it re-fetches this list <i>anew</i> which will be <i>the same</i> for the increasing spans of numbers between the successive squares of primes.<br />
<br />
==== Generated Spans, by Nested Filters ====<br />
The other way to go instead of concentrating on the numbers supply, is to directly work on the successive spans between the primes squares.<br />
<br />
This version is a bit faster still, creating <i>158,000 primes</i> (again, GHC-compiled) in the same time as the postponed filters does 100,000 primes:<br />
<br />
<haskell><br />
primes :: [Integer]<br />
primes = 2: 3: sieve [] (tail primes) 5<br />
where<br />
notDivsBy d n = n `mod` d /= 0<br />
sieve ds (p:ps) x = foldr (filter . notDivsBy) [x, x+2..p*p-2] ds<br />
++ sieve (p:ds) ps (p*p+2)<br />
</haskell><br />
<br />
This one explicitly maintains the list of primes needed for testing each odds span between successive primes squares, which it also explicitly generates. But it tests with nested <code>filter</code>s, which it repeatedly recreates.<br />
<br />
==== Generated Spans, by List of Primes ====<br />
The list of primes needed to test each range of odds is actually just the prefix of the primes list itself, of known length, and need not be specifically generated at all. Combined with one-call testing by the explicit list of primes, and direct generation of odds between the successive primes squares, this leads to:<br />
<br />
<haskell><br />
primes :: [Integer]<br />
primes = 2: 3: sieve 0 (tail primes) 5<br />
sieve k (p:ps) x = [x | x<-[x,x+2..p*p-2], and [x`rem`p/=0 | p<-fs]]<br />
-- or: all ((>0).(x`rem`)) fs<br />
++ sieve (k+1) ps (p*p+2) <br />
where fs = take k (tail primes) <br />
</haskell><br />
<br />
It produces about <i>222,000 primes</i> in the same amount of time, and is good for creating about a million first primes, compiled.<br />
<br />
The reason to have <code>sieve</code> function available separately too is that it can also be used to produce primes above a given number, as in<br />
<br />
<haskell><br />
primesFrom m = sieve (length h) ps $ m`div`2*2+1 <br />
where <br />
(h,(_:ps)) = span (<= (floor.sqrt.fromIntegral) m) primes<br />
</haskell><br />
<br />
It can thus produce a few primes e.g. above <code>239812076741689</code>, which is a square of the millionth odd prime, without having to compute all the preceding primes (which would number in trillions).<br />
<br />
==== Multiples Removal on Generated Spans, or Sieve of Eratosthenes ====<br />
<br />
The divisibility testing too should be considered a specification (as in "no multiples of p"), and not a code per se, because although testing composites is cheap (as most of them will have small factors, so the test is soon aborted), testing prime numbers is costly, and is to be avoided.<br />
<br />
All the filtering versions thus far try to <i>keep the primes</i> among all numbers by testing <i>each number</i> in isolation. Instead, all the relevant primes' multiples can be removed from the corresponding segments of odds, and what's left after that will be just primes:<br />
<br />
<haskell><br />
primes :: [Integer]<br />
primes = 2: 3: sieve [] (tail primes) 5<br />
where<br />
sieve fs (p:ps) x = [x,x+2..q-2] `minus` foldl merge [] mults<br />
++ sieve ((2*p,q):fs') ps (q+2)<br />
where<br />
q = p*p<br />
mults = [ [y+s,y+2*s..q] | (s,y)<- fs]<br />
fs' = [ (s,last ms) | ((s,_),ms)<- zip fs mults]<br />
</haskell> <br />
<br />
This modifies the preceding sieve to "mark" the odd composites in a given range (instead of testing their divisibility) by generating them - just as a person performing the original sieve of Eratosthenes would do, marking one by one the multiples of the relevant primes.<br />
<br />
Compared with the previous version, interpreted under both GHCi and WinHugs, it runs <i>faster</i>, takes <i>less</i> memory, and has better asymptotic behavior, its performance approximately the same as in the Merged Multiples Removal sieve. The advantage in working with spans explicitly is that this code is easily amendable to using arrays for the composites marking and removal on each <i>finite</i> span. <br />
<br />
=== Using Immutable Arrays ===<br />
<br />
==== Generating Segments of Primes ====<br />
<br />
The removal of multiples on each segment of odds in the sieve of Eratosthenes can be done by actually marking them in an array, instead of manipulating lists with "minus" and "merge":<br />
<haskell><br />
primes :: [Integer]<br />
primes = 2: 3: sieve [] (tail primes) 5 <br />
where <br />
sieve fs (p:ps) x = [i | i<- [x,x+2..q-2], a!i] <br />
++ sieve ((2*p,q):fs') ps (q+2)<br />
where<br />
q = p*p<br />
mults = [ [y+s,y+2*s..q] | (s,y)<- fs]<br />
fs' = [ (s,last ms) | ((s,_),ms)<- zip fs mults]<br />
a = accumArray (\a b->False) True (x,q-2) <br />
[(i,()) | ms<- mults, i<- ms] <br />
</haskell><br />
Apparently, arrays are ''fast'' too. This code, compared with Treefold Merged with Wheel version (itself 2.5x times faster than Melissa O'Neill's PQ version), runs at about the same time and memory usage, but improving slightly with slightly better local asymptotics.<br />
<br />
==== Calculating Primes Upto a Given Value ====<br />
<br />
<haskell><br />
primesToN n = 2: [i | i<-[3,5..n], ar!i]<br />
where<br />
ar = f 5 $ accumArray (\a b->False) True (3,n) <br />
[(i,()) | i<- [9,15..n]]<br />
f p a | q>=n = a<br />
| True = if null x then a' else f (head x) a'<br />
where q = p*p<br />
a'= a//[(i,False)|i<-[q,q+2*p..n]]<br />
x = [i | i<-[p+2,p+4..n], a' !i]<br />
</haskell><br />
<br />
==== Calculating Primes in a Given Range ====<br />
<br />
<haskell><br />
primesFromTo a b = (if a<3 then [2] else []) <br />
++ [i | i<-[o,o+2..b], ar!i]<br />
where <br />
o = max (if even a then a+1 else a) 3<br />
r = (1+).floor.sqrt.fromInteger $ b<br />
ar = accumArray (\a b->False) True (o,b) <br />
[(i,()) | p <- [3,5..r]<br />
, let q = p*p <br />
s = 2*p <br />
(n,x) = quotRem (o-q) s <br />
q' = if o <= q then q<br />
else if x==0 then q+n*s<br />
else q+(n+1)*s<br />
, i<- [q',q'+s..b] ]<br />
</haskell><br />
<br />
Although testing by odds instead of by primes, the array generation is so fast that it is very much feasible and even preferable for quick generation of some short spans of relatively big primes.<br />
<br />
=== Using Mutable Arrays ===<br />
<br />
==== Bitwise prime sieve ====<br />
<br />
Count the number of prime below a given 'n'. Shows fast bitwise arrays,<br />
and an example of [[Template Haskell]] to defeat your enemies.<br />
<br />
<haskell><br />
{-# OPTIONS -O2 -optc-O -XBangPatterns #-}<br />
module Primes (nthPrime) where<br />
<br />
import Control.Monad.ST<br />
import Data.Array.ST<br />
import Data.Array.Base<br />
import System<br />
import Control.Monad<br />
import Data.Bits<br />
<br />
nthPrime :: Int -> Int<br />
nthPrime n = runST (sieve n)<br />
<br />
sieve n = do<br />
a <- newArray (3,n) True :: ST s (STUArray s Int Bool)<br />
let cutoff = truncate (sqrt $ fromIntegral n) + 1<br />
go a n cutoff 3 1<br />
<br />
go !a !m cutoff !n !c<br />
| n >= m = return c<br />
| otherwise = do<br />
e <- unsafeRead a n<br />
if e then<br />
if n < cutoff then<br />
let loop !j<br />
| j < m = do<br />
x <- unsafeRead a j<br />
when x $ unsafeWrite a j False<br />
loop (j+n)<br />
| otherwise = go a m cutoff (n+2) (c+1)<br />
in loop ( if n < 46340 then n * n else n `shiftL` 1)<br />
else go a m cutoff (n+2) (c+1)<br />
else go a m cutoff (n+2) c<br />
</haskell><br />
<br />
And places in a module:<br />
<br />
<haskell><br />
{-# OPTIONS -fth #-}<br />
import Primes<br />
<br />
main = print $( let x = nthPrime 10000000 in [| x |] )<br />
</haskell><br />
<br />
Run as:<br />
<br />
<haskell><br />
$ ghc --make -o primes Main.hs<br />
$ time ./primes<br />
664579<br />
./primes 0.00s user 0.01s system 228% cpu 0.003 total<br />
</haskell><br />
<br />
=== Implicit Heap ===<br />
<br />
The following is a more efficient prime generator, implementing the sieve of<br />
Eratosthenes.<br />
<br />
See also the message threads [http://thread.gmane.org/gmane.comp.lang.haskell.cafe/25270/focus=25312 Re: "no-coding" functional data structures via lazyness] for more about how merging ordered lists amounts to creating an implicit heap and [http://thread.gmane.org/gmane.comp.lang.haskell.cafe/26426/focus=26493 Re: Code and Perf. Data for Prime Finders] for an explanation of the <hask>People a</hask> structure that makes it work when tying a knot.<br />
<br />
<haskell><br />
data People a = VIP a (People a) | Crowd [a]<br />
<br />
mergeP :: Ord a => People a -> People a -> People a<br />
mergeP (VIP x xt) ys = VIP x $ mergeP xt ys<br />
mergeP (Crowd xs) (Crowd ys) = Crowd $ merge xs ys<br />
mergeP xs@(Crowd (x:xt)) ys@(VIP y yt) = case compare x y of<br />
LT -> VIP x $ mergeP (Crowd xt) ys<br />
EQ -> VIP x $ mergeP (Crowd xt) yt<br />
GT -> VIP y $ mergeP xs yt<br />
<br />
merge :: Ord a => [a] -> [a] -> [a]<br />
merge xs@(x:xt) ys@(y:yt) = case compare x y of<br />
LT -> x : merge xt ys<br />
EQ -> x : merge xt yt<br />
GT -> y : merge xs yt<br />
<br />
diff xs@(x:xt) ys@(y:yt) = case compare x y of<br />
LT -> x : diff xt ys<br />
EQ -> diff xt yt<br />
GT -> diff xs yt<br />
<br />
foldTree :: (a -> a -> a) -> [a] -> a<br />
foldTree f ~(x:xs) = x `f` foldTree f (pairs xs)<br />
where pairs ~(x: ~(y:ys)) = f x y : pairs ys<br />
<br />
primes, nonprimes :: [Integer]<br />
primes = 2:3:diff [5,7..] nonprimes<br />
nonprimes = serve . foldTree mergeP . map multiples $ tail primes<br />
where<br />
multiples p = vip [p*p,p*p+2*p..]<br />
<br />
vip (x:xs) = VIP x $ Crowd xs<br />
serve (VIP x xs) = x:serve xs<br />
serve (Crowd xs) = xs<br />
</haskell><br />
<br />
<hask>nonprimes</hask> effectively implements a heap, exploiting lazy evaluation.<br />
<br />
=== Prime Wheels ===<br />
<br />
The idea of only testing odd numbers can be extended further. For instance, it is a useful fact that every prime number other than 2 and 3 must be of the form <math>6k+1</math> or <math>6k+5</math>. Thus, we only need to test these numbers:<br />
<br />
<haskell><br />
primes :: [Integer]<br />
primes = 2:3:primes'<br />
where<br />
1:p:candidates = [6*k+r | k <- [0..], r <- [1,5]]<br />
primes' = p : filter isPrime candidates<br />
isPrime n = all (not . divides n) $ takeWhile (\p -> p*p <= n) primes'<br />
divides n p = n `mod` p == 0<br />
</haskell><br />
<br />
Here, <hask>primes'</hask> is the list of primes greater than 3 and <hask>isPrime</hask> does not test for divisibility by 2 or 3 because the <hask>candidates</hask> by construction don't have these numbers as factors. We also need to exclude 1 from the candidates and mark the next one as prime to start the recursion.<br />
<br />
Such a scheme to generate candidate numbers first that avoid a given set of primes as divisors is called a '''prime wheel'''. Imagine that you had a wheel of circumference 6 to be rolled along the number line. With spikes positioned 1 and 5 units around the circumference, rolling the wheel will prick holes exactly in those positions on the line whose numbers are not divisible by 2 and 3.<br />
<br />
A wheel can be represented by its circumference and the spiked positions.<br />
<haskell><br />
data Wheel = Wheel Integer [Integer]<br />
</haskell><br />
We prick out numbers by rolling the wheel.<br />
<haskell><br />
roll (Wheel n rs) = [n*k+r | k <- [0..], r <- rs]<br />
</haskell><br />
The smallest wheel is the unit wheel with one spike, it will prick out every number.<br />
<haskell><br />
w0 = Wheel 1 [1]<br />
</haskell><br />
We can create a larger wheel by rolling a smaller wheel of circumference <hask>n</hask> along a rim of circumference <hask>p*n</hask> while excluding spike positions at multiples of <hask>p</hask>.<br />
<haskell><br />
nextSize (Wheel n rs) p =<br />
Wheel (p*n) [r' | k <- [0..(p-1)], r <- rs, let r' = n*k+r, r' `mod` p /= 0]<br />
</haskell><br />
Combining both, we can make wheels that prick out numbers that avoid a given list <hask>ds</hask> of divisors.<br />
<haskell><br />
mkWheel ds = foldl nextSize w0 ds<br />
</haskell><br />
<br />
Now, we can generate prime numbers with a wheel that for instance avoids all multiples of 2, 3, 5 and 7.<br />
<haskell><br />
primes :: [Integer]<br />
primes = small ++ large<br />
where<br />
1:p:candidates = roll $ mkWheel small<br />
small = [2,3,5,7]<br />
large = p : filter isPrime candidates<br />
isPrime n = all (not . divides n) $ takeWhile (\p -> p*p <= n) large<br />
divides n p = n `mod` p == 0<br />
</haskell><br />
It's a pretty big wheel with a circumference of 210 and allows us to calculate the first 10000 primes in convenient time.<br />
<br />
A fixed size wheel is fine, but how about adapting the wheel size while generating prime numbers? See the [[Research papers/Functional pearls|functional pearl]] titled [http://citeseer.ist.psu.edu/runciman97lazy.html Lazy wheel sieves and spirals of primes] for more.<br />
<br />
<br />
=== Using IntSet for a traditional sieve ===<br />
<haskell><br />
<br />
module Sieve where<br />
import qualified Data.IntSet as I<br />
<br />
-- findNext - finds the next member of an IntSet.<br />
findNext c is | I.member c is = c<br />
| c > I.findMax is = error "Ooops. No next number in set." <br />
| otherwise = findNext (c+1) is<br />
<br />
-- mark - delete all multiples of n from n*n to the end of the set<br />
mark n is = is I.\\ (I.fromAscList (takeWhile (<=end) (map (n*) [n..])))<br />
where<br />
end = I.findMax is<br />
<br />
-- primes - gives all primes up to n <br />
primes n = worker 2 (I.fromAscList [2..n])<br />
where<br />
worker x is <br />
| (x*x) > n = is<br />
| otherwise = worker (findNext (x+1) is) (mark x is)<br />
</haskell><br />
<br />
== Testing Primality ==<br />
<br />
=== Primality Test and Integer Factorization ===<br />
<br />
Given an infinite list of prime numbers, we can implement primality tests and integer factorization:<br />
<br />
<haskell><br />
isPrime n = n > 1 && n == head (primeFactors n)<br />
<br />
primeFactors 1 = []<br />
primeFactors n = go n primes<br />
where<br />
go n ps@(p:pt)<br />
| p*p > n = [n]<br />
| n `rem` p == 0 = p : go (n `quot` p) ps<br />
| otherwise = go n pt<br />
</haskell><br />
<br />
=== Miller-Rabin Primality Test ===<br />
<haskell><br />
find2km :: Integral a => a -> (a,a)<br />
find2km n = f 0 n<br />
where <br />
f k m<br />
| r == 1 = (k,m)<br />
| otherwise = f (k+1) q<br />
where (q,r) = quotRem m 2 <br />
<br />
millerRabinPrimality :: Integer -> Integer -> Bool<br />
millerRabinPrimality n a<br />
| a <= 1 || a >= n-1 = <br />
error $ "millerRabinPrimality: a out of range (" <br />
++ show a ++ " for "++ show n ++ ")" <br />
| n < 2 = False<br />
| even n = False<br />
| b0 == 1 || b0 == n' = True<br />
| otherwise = iter (tail b)<br />
where<br />
n' = n-1<br />
(k,m) = find2km n'<br />
b0 = powMod n a m<br />
b = take (fromIntegral k) $ iterate (squareMod n) b0<br />
iter [] = False<br />
iter (x:xs)<br />
| x == 1 = False<br />
| x == n' = True<br />
| otherwise = iter xs<br />
<br />
pow' :: (Num a, Integral b) => (a -> a -> a) -> (a -> a) -> a -> b -> a<br />
pow' _ _ _ 0 = 1<br />
pow' mul sq x' n' = f x' n' 1<br />
where <br />
f x n y<br />
| n == 1 = x `mul` y<br />
| r == 0 = f x2 q y<br />
| otherwise = f x2 q (x `mul` y)<br />
where<br />
(q,r) = quotRem n 2<br />
x2 = sq x<br />
<br />
mulMod :: Integral a => a -> a -> a -> a<br />
mulMod a b c = (b * c) `mod` a<br />
squareMod :: Integral a => a -> a -> a<br />
squareMod a b = (b * b) `rem` a<br />
powMod :: Integral a => a -> a -> a -> a<br />
powMod m = pow' (mulMod m) (squareMod m)<br />
</haskell><br />
<br />
== External links ==<br />
* http://www.cs.hmc.edu/~oneill/code/haskell-primes.zip<br />
: A collection of prime generators; the file "ONeillPrimes.hs" contains one of the fastest pure-Haskell prime generators. WARNING: Don't use the priority queue from that file for your projects: it's broken and works for primes only by a lucky chance.<br />
<br />
[[Category:Code]]<br />
[[Category:Mathematics]]</div>Steve Chttps://wiki.haskell.org/index.php?title=Prime_numbers&diff=32789Prime numbers2009-12-31T09:47:27Z<p>Steve C: Fix typo</p>
<hr />
<div>== Prime Number Resources ==<br />
In mathematics, a <i>prime number</i> (or a <i>prime</i>) is a natural number which has exactly two distinct natural number divisors: 1 and itself. The smallest prime is thus 2.<br />
<br />
[http://en.wikipedia.org/wiki/Prime_numbers Prime Numbers] at Wikipedia.<br />
<br />
[http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes Sieve of Eratosthenes] at Wikipedia.<br />
<br />
HackageDB packages:<br />
<br />
[http://hackage.haskell.org/package/Numbers Numbers]: An assortment of number theoretic functions.<br />
<br />
[http://hackage.haskell.org/package/NumberSieves NumberSieves]: Number Theoretic Sieves: primes, factorization, and Euler's Totient.<br />
<br />
[http://hackage.haskell.org/package/primes primes]: Efficient, purely functional generation of prime numbers.<br />
<br />
== Finding Primes ==<br />
<br />
Any natural number is representable as a product of powers of its prime factors, and so a prime has no prime divisors other than itself. That means that starting with 2, <i>for each</i> newly found <i>prime</i> we can <i>eliminate</i> from the rest of the numbers <i>all such numbers</i> that have this prime as their divisor, giving us the <i>next available</i> number as next prime. This is known as <i>sieving</i> the natural numbers, so that in the end what we are left with are just primes. <br />
<br />
=== The Classic Simple Primes Sieve ===<br />
<br />
Attributed to David Turner <i>(SASL Language Manual, 1975)</i>, the following is a direct translation of that idea, generating a list of all prime numbers:<br />
<br />
<haskell><br />
primes :: [Integer]<br />
primes = sieve [2..]<br />
where<br />
sieve (p:xs) = p : sieve [x | x<-xs, x `mod` p /= 0] <br />