Difference between revisions of "Prime numbers"
(better wheel version for Treefold Merged Multiples  2.5x faster than PQ !) 
(→Turner's sieve  Trial division: mention complexity of the first variant) 

(483 intermediate revisions by 5 users not shown)  
Line 1:  Line 1:  
+  In mathematics, ''amongst the natural numbers greater than 1'', a ''prime number'' (or a ''prime'') is such that has no divisors other than itself (and 1). 

+  
== Prime Number Resources == 
== Prime Number Resources == 

−  In mathematics, a <i>prime number</i> (or a <i>prime</i>) is a natural number which has exactly two distinct natural number divisors: 1 and itself. The smallest prime is thus 2. 

−  [http://www.haskell.org/haskellwiki/Prime_numbers Prime Numbers] at Wikipedia. 

+  * At Wikipedia: 

+  **[http://en.wikipedia.org/wiki/Prime_numbers Prime Numbers] 

+  **[http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes Sieve of Eratosthenes] 

−  [http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes Sieve of Eratosthenes] at Wikipedia. 

+  * HackageDB packages: 

+  ** [http://hackage.haskell.org/package/arithmoi arithmoi]: Various basic number theoretic functions; efficient arraybased sieves, Montgomery curve factorization ... 

+  ** [http://hackage.haskell.org/package/Numbers Numbers]: An assortment of number theoretic functions. 

+  ** [http://hackage.haskell.org/package/NumberSieves NumberSieves]: Number Theoretic Sieves: primes, factorization, and Euler's Totient. 

+  ** [http://hackage.haskell.org/package/primes primes]: Efficient, purely functional generation of prime numbers. 

−  HackageDB packages: 

+  * Papers: 

+  ** O'Neill, Melissa E., [http://www.cs.hmc.edu/~oneill/papers/SieveJFP.pdf "The Genuine Sieve of Eratosthenes"], Journal of Functional Programming, Published online by Cambridge University Press 9 October 2008 doi:10.1017/S0956796808007004. 

−  [http://hackage.haskell.org/package/Numbers Numbers]: An assortment of number theoretic functions. 

+  == Definition == 

−  [http://hackage.haskell.org/package/NumberSieves NumberSieves]: Number Theoretic Sieves: primes, factorization, and Euler's Totient. 

+  In mathematics, ''amongst the natural numbers greater than 1'', a ''prime number'' (or a ''prime'') is such that has no divisors other than itself (and 1). The smallest prime is thus 2. Nonprime numbers are known as ''composite'', i.e. those representable as product of two natural numbers greater than 1. 

−  [http://hackage.haskell.org/package/primes primes]: Efficient, purely functional generation of prime numbers. 

+  To find out a prime's multiples we can either '''a.''' test each new candidate number for divisibility by that prime, giving rise to a kind of ''trial division'' algorithm; or '''b.''' we can directly generate the multiples of a prime ''p'' by counting up from it in increments of ''p'', resulting in a variant of the ''sieve of Eratosthenes''. 

−  == Finding Primes == 

+  The set of prime numbers is thus 

−  Any natural number is representable as a product of powers of its prime factors, and so a prime has no prime divisors other then itself. That means that starting with 2, <i>for each</i> newly found <i>prime</i> we can <i>eliminate</i> from the rest of the numbers <i>all such numbers</i> that have this prime as their divisor, giving us the <i>next available</i> number as next prime. This is known as <i>sieving</i> the natural numbers, so that in the end what we are left with are just primes. 

+  : '''P''' = { ''n'' ∈ '''N'''<sub>2</sub> ''':''' (∀ ''m'' ∈ '''N'''<sub>2</sub>) ( (''m''  ''n'') ⇒ m = n) } 

−  === The Classic Simple Primes Sieve === 

+  :: = { ''n'' ∈ '''N'''<sub>2</sub> ''':''' (∀ ''m'' ∈ '''N'''<sub>2</sub>) ( ''m''  < ''n'' ⇒ ¬(''m''  ''n'')) } 

−  Attributed to David Turner <i>(SASL Language Manual, 1975)</i>, the following is a direct translation of that idea, generating a list of all prime numbers: 

+  :: = { ''n'' ∈ '''N'''<sub>2</sub> ''':''' (∀ ''m'' ∈ '''N'''<sub>2</sub>) ( ''m'' ⋅ ''m'' ≤ ''n'' ⇒ ¬(''m''  ''n'')) } 

+  
+  :: = '''N'''<sub>2</sub> \ { ''n'' ⋅ ''m'' ''':''' ''n'',''m'' ∈ '''N'''<sub>2</sub> } 

+  
+  :: = '''N'''<sub>2</sub> \ '''⋃''' { { ''n'' ⋅ ''m'' ''':''' ''m'' ∈ '''N'''<sub>n</sub> } ''':''' ''n'' ∈ '''N'''<sub>2</sub> } 

+  
+  :: = '''N'''<sub>2</sub> \ '''⋃''' { { ''n'' ⋅ ''n'', ''n'' ⋅ ''n''+''n'', ''n'' ⋅ ''n''+''n''+''n'', ... } ''':''' ''n'' ∈ '''N'''<sub>2</sub> } 

+  
+  :: = '''N'''<sub>2</sub> \ '''⋃''' { { ''p'' ⋅ ''p'', ''p'' ⋅ ''p''+''p'', ''p'' ⋅ ''p''+''p''+''p'', ... } ''':''' ''p'' ∈ '''P''' } 

+  :::: where '''N'''<sub>k</sub> = { ''n'' ∈ '''N''' ''':''' ''n'' ≥ k } 

+  
+  Thus starting with 2, for each newly found prime we can ''eliminate'' from the rest of the numbers ''all the multiples'' of this prime, giving us the next available number as next prime. This is known as ''sieving'' the natural numbers, so that in the end all the composites are eliminated and what we are left with are just primes. <small>(This is what the last formula is describing, though seemingly [http://en.wikipedia.org/wiki/Impredicativity impredicative], because it is selfreferential. But because '''N'''<sub>2</sub> is wellordered with the order preserved under addition, the formula is welldefined.)</small> 

+  
+  Prototypically, it is 

<haskell> 
<haskell> 

−  primes :: [Integer] 

+  primes = map head $ scanl (\\) [2..] [[p, p+p..]  p < primes] 

−  primes = sieve [2..] 

+  where 

−  where 

+  (\\) = Data.List.Ordered.minus 

−  sieve (p:xs) = p : sieve [x  x<xs, x `mod` p /= 0] 

−   or: filter ((/=0).(`mod`p)) xs 

</haskell> 
</haskell> 

−  This should only be considered a <i>specification</i>, not a <i>code</i>. When run as is, it is <i>extremely inefficient</i> because it starts up the filters prematurely, immediately after each prime, instead of only after the prime's square has been reached. To be admitted as prime, <i>each number</i> will be <i>tested for divisibility</i> here by all its preceding primes, while just those not greater than its square root would suffice. This means that e.g. to find the <b>1001</b>st prime (<code>7927</code>), <b>1000</b> filters are used, when in fact just the first <b>24</b> are needed (upto <code>89</code>'s filter only). 

+  Having (a chain of) directaccess mutable arrays indeed enables easy marking of these multiples as is usually done in imperative languages; but to get an [[#Tree merging with Wheelefficient ''list''based code]] we have to be smart about combining those streams of multiples of each prime  which gives us also the memory efficiency in generating the results incrementally, one by one. 

−  So this in effect creates a cascade of nested filters in front of the infinite numbers supply, and in <i>extremely premature</i> fashion at that. One way of fixing that would be to <i>postpone</i> the creation of filters until the right moment, by decoupling the primes supply from the numbers supply, as in 

+  Short exposition is [[Prime numbers miscellaneous#A Tale of Sieveshere]]. 

−  === Postponed Filters Sieve === 

+  == Sieve of Eratosthenes == 

+  Simplest, bounded, ''very'' inefficient formulation: 

<haskell> 
<haskell> 

−  primes :: [Integer] 

+  import Data.List (\\)  (\\) is setdifference for unordered lists 

−  primes = 2: 3: sieve (tail primes) [5,7..] 

+  
−  where 

+  primesTo m = sieve [2..m] 

−  sieve (p:ps) xs = h ++ sieve ps [x  x<t, x `rem` p /= 0] 

+  where 

−   or: filter ((/=0).(`rem`p)) t 

+  sieve (x:xs) = x : sieve (xs \\ [x,x+x..m]) 

−  +  sieve [] = [] 

+   or: 

+  = ps 

+  where 

+  ps = map head $ takeWhile (not.null) 

+  $ scanl (\\) [2..m] [[p, p+p..m]  p < ps] 

</haskell> 
</haskell> 

−  This can be seen as essential framework for all the code to come. It only tests odd numbers, and only by the primes that are needed, for each <i>numbers span</i> between successive squares of primes. To find the <b>1001</b>st prime, the divisibility test is performed by only <b>24</b> nested filters corresponding to the first <b>24</b> odd primes. 

+  The (unbounded) sieve of Eratosthenes calculates primes as ''integers above 1 that are not multiples of primes'', i.e. ''not composite'' — whereas composites are found as enumeration of multiples of each prime, generated by counting up from prime's square in constant increments equal to that prime (or twice that much, for odd primes). This is much more efficient and runs at about <i>n<sup>1.2</sup></i> empirical orders of growth (corresponding to <i>n (log n)<sup>2</sup> log log n</i> complexity, more or less, in ''n'' primes produced): 

−  Whereas the first version exhibits near O(<math>{n^2}</math>) behavior, this one exhibits near O(<math>{n^{1.5}}</math>) behavior, with an ordersofmagnitude speedup. 

+  <haskell> 

+  import Data.List.Ordered (minus, union, unionAll) 

−  There is another way for the composites to be found  by generating all the multiples of successive primes, in advance. Any number thus generated will of course be divisible by the corresponding prime. 

+  primes = 2 : 3 : minus [5,7..] (unionAll [[p*p, p*p+2*p..]  p < tail primes]) 

−  ==== Postponed Multiples Removal i.e. Euler's Sieve ==== 

+  { Using `under n = takeWhile (<= n)`, with ordered increasing lists, 

−  Instead of testing <i>each number</i> for divisibility by a prime we can just <i>remove</i> the prime's <i>multiples</i> in advance. We gain in speed because we now get the primes <i>for free</i>, after all the multiples are removed on a particular span, <i>without</i> performing any divisibility tests <i>at all</i>: 

+  `minus`, `union` and `unionAll` satisfy, for any `n` and `m`: 

−  <haskell> 

+  under n (minus a b) == nub . sort $ under n a \\ under n b 

−  primes :: [Integer] 

+  under n (union a b) == nub . sort $ under n a ++ under n b 

−  primes = 2: 3: sieve (tail primes) [5,7..] 

+  under n . unionAll . take m == under n . foldl union [] . take m 

−  where 

+  under n . unionAll == nub . sort . concat 

−  sieve (p:ps) xs = h ++ sieve ps (t `minus` tail [q,q+2*p..]) 

+  . takeWhile (not.null) . map (under n) } 

−  where (h,~(_:t)) = span (< q) xs 

+  </haskell> 

−  q = p*p 

+  The definition is primed with 2 and 3 as initial primes, to avoid the vicious circle. 

−  minus :: (Ord a) => [a] > [a] > [a] 

+  The ''"big union"'' <code>unionAll</code> function could be defined as the folding of <code>(\(x:xs) > (x:) . union xs)</code>; or it could use a <code>Bool</code> array as a sorting and duplicatesremoving device. The processing naturally divides up into the segments between successive squares of primes. 

−  minus a@(x:xs) b@(y:ys) = case compare x y of 

−  LT > x: xs `minus` b 

−  EQ > xs `minus` ys 

−  GT > a `minus` ys 

−  minus a b = a 

−  </haskell> 

−  This is, in fact, Euler's sieve. 

+  Stepwise development follows (the fully developed version is [[#Tree merging with Wheelhere]]). 

−  ==== Merged Multiples Removal Sieve ==== 

+  === Initial definition === 

−  <code>(...((sa)b)...)</code> is the same as <code>(s(a+b+...))</code>, and so we can just remove the ''merged'' infinite primes multiples, each starting at its prime's square, from the ''initial'' numbers supply. This way we needn't explicitly jump to a prime's square because it's where its multiples start anyway: 

+  
+  First of all, working with ''ordered'' increasing lists, the sieve of Eratosthenes can be genuinely represented by 

<haskell> 
<haskell> 

−  primes :: [Integer] 

+   genuine yet wasteful sieve of Eratosthenes 

−  primes = 2:primes' 

+   primes = eratos [2.. ]  unbounded 

−  where 

+  primesTo m = eratos [2..m]  bounded, up to m 

−  primes' = [3] ++ [5,7..] `minus` foldr merge' [] mults 

+  where 

−  mults = map (\p>let q=p*p in (q,tail [q,q+2*p..])) $ primes' 

+  eratos [] = [] 

−  +  eratos (p:xs) = p : eratos (xs `minus` [p, p+p..]) 

+   eratos (p:xs) = p : eratos (xs `minus` map (p*) [1..]) 

+   eulers (p:xs) = p : eulers (xs `minus` map (p*) (p:xs)) 

+   turner (p:xs) = p : turner [x  x < xs, rem x p /= 0] 

+   fix ( map head . scanl minus [2..] . map (\p> [p, p+p..]) ) 

+  </haskell> 

−  
+  This should be regarded more like a ''specification'', not a code. It runs at [https://en.wikipedia.org/wiki/Analysis_of_algorithms#Empirical_orders_of_growth empirical orders of growth] worse than quadratic in number of primes produced. But it has the core defining features of the classical formulation of S. of E. as '''''a.''''' being bounded, i.e. having a top limit value, and '''''b.''''' finding out the multiples of a prime directly, by counting up from it in constant increments, equal to that prime. 

−  merge :: (Ord a) => [a] > [a] > [a] 

+  
−  merge a@(x:xs) b@(y:ys) = case compare x y of 

+  The canonical listdifference <code>minus</code> and duplicatesremoving <code>union</code> functions (cf. [http://hackage.haskell.org/packages/archive/dataordlist/latest/doc/html/DataListOrdered.html Data.List.Ordered]) are: 

−  LT > x: merge xs b 

+  <haskell> 

−  EQ > x: merge xs ys 

+   ordered lists, difference and union 

−  GT > y: merge a ys 

+  minus (x:xs) (y:ys) = case (compare x y) of 

−  +  LT > x : minus xs (y:ys) 

+  EQ > minus xs ys 

+  GT > minus (x:xs) ys 

+  minus xs _ = xs 

+  union (x:xs) (y:ys) = case (compare x y) of 

+  LT > x : union xs (y:ys) 

+  EQ > x : union xs ys 

+  GT > y : union (x:xs) ys 

+  union xs [] = xs 

+  union [] ys = ys 

</haskell> 
</haskell> 

−  This code is yet faster. Its main deficiency still is that it creates a linear nested merging structure, instead of a treelike structure. Each multiple produced by a prime has to percolate to the top eventually, so it's better to make its path shorter. It'll have to go through fewer merge nodes this way. 

+  The name ''merge'' ought to be reserved for duplicatespreserving merging operation of the merge sort. 

−  The linearity is imposed by type asymmetry of our <code>(merge' :: a > b > b)</code> function, forcing us into the <code>1+(1+(1+(1+...)))</code> pattern, <code>+</code> standing for <code>merge'</code> (which was defined that way to prevent the runahead when folding over the infinite list of lists of multiples). 

+  === Analysis === 

−  We need to turn it into an associative operation of uniform type <code>(:: a > a > a)</code> to be able to freely rearrange the combinations into arbitrary treelike patterns, as in e.g. ((1+1)+(2+2))+(...) etc. The type uniformity is what makes compositionality possible. 

+  So for each newly found prime ''p'' the sieve discards its multiples, enumerating them by counting up in steps of ''p''. There are thus <math>O(m/p)</math> multiples generated and eliminated for each prime, and <math>O(m \log \log(m))</math> multiples in total, with duplicates, by virtues of [http://en.wikipedia.org/wiki/Prime_harmonic_series prime harmonic series]. 

−  The code in the "Implicit Heap" section below improves on that, and is essentially equivalent to using a treefold instead of a standard linear <code>foldr</code>, as in: 

+  If each multiple is dealt with in <math>O(1)</math> time, this will translate into <math>O(m \log \log(m))</math> RAM machine operations (since we consider addition as an atomic operation). Indeed mutable randomaccess arrays allow for that. But lists in Haskell are sequentialaccess, and complexity of <code>minus(a,b)</code> for lists is <math>\textstyle O(a \cup b)</math> instead of <math>\textstyle O(b)</math> of the direct access destructive array update. The lower the complexity of each ''minus'' step, the better the overall complexity. 

−  ==== Treefold Merged Multiples Removal ==== 

+  So on ''k''th step the argument list <code>(p:xs)</code> that the <code>eratos</code> function gets, starts with the ''(k+1)''th prime, and consists of all the numbers ≤ ''m'' coprime with all the primes ≤ ''p''. According to the M. O'Neill's article (p.10) there are <math>\textstyle\Phi(m,p) \in \Theta(m/\log p)</math> such numbers. 

−  <haskell> 

−  primes :: [Integer] 

−  primes = 2:primes' 

−  where 

−  primes' = [3,5] ++ drop 2 [3,5..] `minus` comps 

−  mults = map (\p> let q=p*p in ([q],tail [q,q+2*p..])) $ primes' 

−  comps = fst $ tfold mergeSP (pairwise mergeSP mults) 

−  pairwise f (x:y:ys) = f x y : pairwise f ys 

+  It looks like <math>\textstyle\sum_{i=1}^{n}{1/log(p_i)} = O(n/\log n)</math> for our intents and purposes. Since the number of primes below ''m'' is <math>n = \pi(m) = O(m/\log(m))</math> by the prime number theorem (where <math>\pi(m)</math> is a prime counting function), there will be ''n'' multiplesremoving steps in the algorithm; it means total complexity of at least <math>O(m n/\log(n)) = O(m^2/(\log(m))^2)</math>, or <math>O(n^2)</math> in ''n'' primes produced  much much worse than the optimal <math>O(n \log(n) \log\log(n))</math>. 

−  tfold f (a: ~(b: ~(c:xs))) 

+  === From Squares === 

−  = (a `f` (b `f` c)) `f` tfold f (pairwise f xs) 

−  mergeSP (a,b) ~(c,d) = let (bc,b') = spMerge b c 

+  But we can start each elimination step at a prime's square, as its smaller multiples will have been already produced and discarded on previous steps, as multiples of smaller primes. This means we can stop early now, when the prime's square reaches the top value ''m'', and thus cut the total number of steps to around <math>\textstyle n = \pi(m^{0.5}) = \Theta(2m^{0.5}/\log m)</math>. This does not in fact change the complexity of randomaccess code, but for lists it makes it <math>O(m^{1.5}/(\log m)^2)</math>, or <math>O(n^{1.5}/(\log n)^{0.5})</math> in ''n'' primes produced, a dramatic speedup: 

−  in (a ++ bc, merge b' d) 

+  <haskell> 

−  where 

+  primesToQ m = eratos [2..m] 

−  spMerge :: (Ord a) => [a] > [a] > ([a],[a]) 

+  where 

−  spMerge a@(x:xs) b@(y:ys) = case compare x y of 

+  eratos [] = [] 

−  LT > (x:c,d) where (c,d) = spMerge xs b 

+  eratos (p:xs) = p : eratos (xs `minus` [p*p, p*p+p..m]) 

−  EQ > (x:c,d) where (c,d) = spMerge xs ys 

+   eratos (p:xs) = p : eratos (xs `minus` map (p*) [p..div m p]) 

−  GT > (y:c,d) where (c,d) = spMerge a ys 

+   eulers (p:xs) = p : eulers (xs `minus` map (p*) (under (div m p) (p:xs))) 

−  spMerge a [] = ([] ,a) 

+   turner (p:xs) = p : turner [x  x<xs, x<p*p  rem x p /= 0] 

−  spMerge [] b = ([] ,b) 

</haskell> 
</haskell> 

−  The fold used here creates a <code>(2+(2+2))+( (4+(4+4)) + ( (8+(8+8)) + ... ))</code> structure, better adjusted for primes multiples production than <code>1+(2+(4+(8+...)))</code>, used by the "Implicit Heap" code, giving it additional 10% speedup. 

−  <code> mergeSP </code> is an associative operation, preserving of the invariant such that for a list of multiples <code>[(a,b),(c,d), ...]</code>, it's always the case that <code> last a < head b && last a < head c</code>. These "split pairs" represent ordered list as a pair of its known and (currently) finite prefix, and the rest of it. Such pairs under <code>mergeSP</code> operation form a <i>monoid</i>, and if we were to declare a <hask>newtype SplitPair a = SP ([a],[a])</hask> a <hask>Monoid</hask> instance, with <code>mergeSP</code> its <hask>mappend</hask> (and <code>tfold</code> its <hask>mconcat</hask>), the above code for <code>comps</code> would just become <hask>SP (comps,_) = mconcat mults</hask>. 

+  Its empirical complexity is around <math>O(n^{1.45})</math>. This simple optimization works here because this formulation is bounded (by an upper limit). To start late on a bounded sequence is to stop early (starting past end makes an empty sequence – ''see warning below''<sup><sub> 1</sub></sup>), thus preventing the creation of all the superfluous multiples streams which start above the upper bound anyway <small>(note that Turner's sieve is unaffected by this)</small>. This is acceptably slow now, striking a good balance between clarity, succinctness and efficiency. 

−  This code exhibits approximately O(<math>{n^{1.20}}</math>)..O(<math>{n^{1.15}}</math>) local asymptotic behavior (tested interpreted, in GHCi, for 10,000 to 300,000 primes produced). When compared with Melissa O'Neill's PQ code from the ZIP package which was modified to work on odds only as well, it is 3.2x times faster, with used memory reported about 2.5x times smaller. 

+  <sup><sub>1</sub></sup><small>''Warning'': this is predicated on a subtle point of <code>minus xs [] = xs</code> definition being used, as it indeed should be. If the definition <code>minus (x:xs) [] = x:minus xs []</code> is used, the problem is back and the complexity is bad again.</small> 

−  It can be further improved with the wheel optimization (as described below at the Prime Wheels section): 

+  === Guarded === 

+  This ought to be ''explicated'' (improving on clarity, though not on time complexity) as in the following, for which it is indeed a minor optimization whether to start from ''p'' or ''p*p''  because it explicitly ''stops as soon as possible'': 

+  <haskell> 

+  primesToG m = 2 : sieve [3,5..m] 

+  where 

+  sieve (p:xs) 

+   p*p > m = p : xs 

+   otherwise = p : sieve (xs `minus` [p*p, p*p+2*p..]) 

+   p : sieve (xs `minus` map (p*) [p,p+2..]) 

+   p : eulers (xs `minus` map (p*) (p:xs)) 

+  </haskell> 

+  (here we also flatly ignore all evens above 2 a priori.) It is now clear that it ''can't'' be made unbounded just by abolishing the upper bound ''m'', because the guard can not be simply omitted without changing the complexity back for the worst. 

−  ==== Treefold Merged Multiples, with Wheel ==== 

+  === Accumulating Array === 

+  
+  So while <code>minus(a,b)</code> takes <math>O(b)</math> operations for randomaccess imperative arrays and about <math>O(a)</math> operations here for ordered increasing lists of numbers, when using Haskell's immutable array for ''a'' one ''could'' expect the array update time to be nevertheless closer to <math>O(b)</math> if destructive update was used implicitly by compiler for an array being passed along as an accumulating state parameter: 

<haskell> 
<haskell> 

−  primes :: [Integer] 

+  {# OPTIONS_GHC O2 #} 

−  primes = 2:3:5:7:primes' 

+  import Data.Array.Unboxed 

−  where 

−  primes' = [11,13] ++ drop 2 (rollFrom 11) `minus` comps 

−  mults = map (\p> fromList $ map (p*) $ rollFrom p) $ primes' 

−  comps = fst $ tfold mergeSP (pairwise mergeSP mults) 

−  fromList (x:xs) = ([x],xs) 

−  rollFrom n = let x = (n11) `mod` 210 

−  (y,_) = span (< x) wheelSums 

−  in roll n $ drop (length y) wheel 

−  wheelSums = roll 0 wdiffs 

+  primesToA m = sieve 3 (array (3,m) [(i,odd i)  i<[3..m]] 

−  roll = scanl (+) 

+  :: UArray Int Bool) 

−  wheel = wdiffs ++ wheel 

+  where 

−  wdiffs = 2:4:2:4:6:2:6:4:2:4:6:6:2:6:4:2:6:4:6:8:4:2:4:2: 

+  sieve p a 

−  4:8:6:4:6:2:4:6:2:6:6:4:2:4:6:2:6:4:2:4:2:10:2:10:wdiffs 

+   p*p > m = 2 : [i  (i,True) < assocs a] 

+   a!p = sieve (p+2) $ a // [(i,False)  i < [p*p, p*p+2*p..m]] 

+   otherwise = sieve (p+2) a 

</haskell> 
</haskell> 

−  This runs about 2.5x times faster than the Priority Queuebased code as present in Melissa O'Neill's ZIP package, with similar local asymptotic behavior of about O(<math>{n^{1.25}}</math>)..O(<math>{n^{1.17}}</math>) (tested interpreted, in GHCi, for 10,000 to 300,000 primes produced), with used memory reported about twice less as well. 

+  Indeed for unboxed arrays <small>(suggested by Daniel Fischer; with regular, boxed arrays it is ''very'' slow)</small>, the above code runs pretty fast, but with empirical complexity of ''O(n<sup>1.15..1.45</sup>)'' in ''n'' primes produced <small>(for producing from few hundred thousands to few millions primes, memory usage also slowly growing)</small>. If the update for each index were an <i>O(1)</i> operation, the empirical complexity would be seen as diminishing, as ''O(n<sup>1.15..1.05</sup>)'', reflecting the true, linearithmic complexity. 

−  === More Filtering Sieves === 

+  We could use explicitly mutable monadic arrays ([[#Using Mutable Arrays''see below'']]) to remedy this, but we can also think about it a little bit more on the functional side of things still. 

−  The primes list creation with divisibility testing can be reformulated in a few more ways, using the list of primes <i>as it is being built</i> (a la "circular programming"). 

+  === Postponed === 

+  Going back to ''guarded'' Eratosthenes, first we notice that though it works with minimal number of prime multiples streams, it still starts working with each prematurely. Fixing this with explicit synchronization won't change complexity but will speed it up some more: 

+  <haskell> 

+  primesPE1 = 2 : sieve [3..] primesPE1 

+  where 

+  sieve xs (p:pt)  q < p*p , (h,t) < span (< q) xs = 

+  h ++ sieve (t `minus` [q, q+p..]) pt 

+   h ++ turner [x  x<t, rem x p>0] pt 

+  </haskell> 

+  <! primesPE1 = 2 : sieve [3,5..] 9 (tail primesPE1) 

+  where 

+  sieve xs q ~(p:pt)  (h,t) < span (< q) xs = 

+  h ++ sieve (t `minus` [q, q+2*p..]) (head pt^2) pt 

+   h ++ turner [x  x<t, rem x p>0] ... 

+  > 

+  Inlining and fusing <code>span</code> and <code>(++)</code> we get: 

+  <! 

+  primesPE = 2 : ops 

+  where 

+  ops = sieve [3,5..] 9 ops  odd primes 

+  sieve (x:xs) q ps@ ~(p:pt) 

+   x < q = x : sieve xs q ps 

+   otherwise = sieve (xs `minus` [q, q+2*p..]) (head pt^2) pt > 

+  <haskell> 

+  primesPE = 2 : sieve [3..] [[p*p, p*p+p..]  p < primesPE] 

+  where 

+  sieve (x:xs) t@((q:cs):r) 

+   x < q = x : sieve xs t 

+   otherwise = sieve (minus xs cs) r 

+  </haskell> 

+  Since the removal of a prime's multiples here starts at the right moment, and not just from the right place, the code could now finally be made unbounded. Because no multiplesremoval process is started ''prematurely'', there are no ''extraneous'' multiples streams, which were the reason for the original formulation's extreme inefficiency. 

−  ==== Odd numbers, by Trial Division ==== 

+  === Segmented === 

−  This is also good for generating a few 100,000s primes (when GHCcompiled as well): 

+  With work done segmentwise between the successive squares of primes it becomes 

<haskell> 
<haskell> 

−  primes :: [Integer] 

+  primesSE = 2 : ops 

−  primes = 2: 3: filter isPrime [5,7..] 

+  where 

−  where 

+  ops = sieve 3 9 ops []  odd primes 

−  isPrime n = all (notDivs n) $ takeWhile (\p> p*p <= n) (tail primes) 

+  sieve x q ~(p:pt) fs = 

−  notDivs n p = n `mod` p /= 0 

+  foldr (flip minus) [x,x+2..q2]  chain of subtractions 

+  [[y+s, y+2*s..q]  (s,y) < fs]  OR, 

+   [x,x+2..q2] `minus` foldl union []  subtraction of merged 

+   [[y+s, y+2*s..q]  (s,y) < fs]  lists 

+  ++ sieve (q+2) (head pt^2) pt 

+  ((2*p,q):[(s,qrem (qy) s)  (s,y) < fs]) 

+  </haskell> 

+  
+  This "marks" the odd composites in a given range by generating them  just as a person performing the original sieve of Eratosthenes would do, counting ''one by one'' the multiples of the relevant primes. These composites are independently generated so some will be generated multiple times. 

+  
+  The advantage to working in spans explicitly is that this code is easily amendable to using arrays for the composites marking and removal on each ''finite'' span; and memory usage can be kept in check by using fixed sized segments. 

+  
+  ====Segmented Treemerging==== 

+  Rearranging the chain of subtractions into a subtraction of merged streams ''([[#Linear mergingsee below]])'' and using [[#Tree mergingtreelike folding]] structure, further [http://ideone.com/pfREP speeds up the code] and ''significantly'' improves its asymptotic time behavior (down to about <math>O(n^{1.28} empirically)</math>, space is leaking though): 

+  
+  <haskell> 

+  primesSTE = 2 : ops 

+  where 

+  ops = sieve 3 9 ops []  odd primes 

+  sieve x q ~(p:pt) fs = 

+  ([x,x+2..q2] `minus` joinST [[y+s, y+2*s..q]  (s,y) < fs]) 

+  ++ sieve (q+2) (head pt^2) pt 

+  ((++ [(2*p,q)]) [(s,qrem (qy) s)  (s,y) < fs]) 

+  
+  joinST (xs:t) = (union xs . joinST . pairs) t 

+  where 

+  pairs (xs:ys:t) = union xs ys : pairs t 

+  pairs t = t 

+  joinST [] = [] 

</haskell> 
</haskell> 

−  Instead of relying on nested filters, it tests each odd number by an explicit list of all the needed prime factors. But for each number tested it refetches this list <i>anew</i> which will be <i>the same</i> for the increasing spans of numbers between the successive squares of primes. 

+  ====Segmented merging via an array==== 

−  ==== Generated Spans, by Nested Filters ==== 

+  The removal of composites is easy with arrays. Starting points can be calculated directly: 

−  The other way to go instead of concentrating on the numbers supply, is to directly work on the successive spans between the primes squares. 

−  This version is a bit faster still, creating <i>158,000 primes</i> (again, GHCcompiled) in the same time as the postponed filters does 100,000 primes: 

+  <haskell> 

+  import Data.List (inits, tails) 

+  import Data.Array.Unboxed 

+  
+  primesSAE = 2 : sieve 2 4 (tail primesSAE) (inits primesSAE) 

+   (2:) . (sieve 2 4 . tail <*> inits) $ primesSAE 

+  where 

+  sieve r q ps (fs:ft) = [n  (n,True) < assocs ( 

+  accumArray (\ _ _ > False) True (r+1,q1) 

+  [(m,())  p < fs, let s = p * div (r+p) p, 

+  m < [s,s+p..q1]] :: UArray Int Bool )] 

+  ++ sieve q (head ps^2) (tail ps) ft 

+  </haskell> 

+  The pattern of iterated calls to <code>tail</code> is captured by a higherorder function <code>tails</code>, which explicitly generates the stream of tails of a stream, making for a bit more readable (even if possibly a bit less efficient) code: 

<haskell> 
<haskell> 

−  primes :: [Integer] 

+  psSAGE = 2 : [n  (r:q:_, fs) < (zip . tails . (2:) . map (^2) <*> inits) psSAGE, 

−  primes = 2: 3: sieve [] (tail primes) 5 

+  (n,True) < assocs ( 

−  where 

+  accumArray (\_ _ > False) True (r+1, q1) 

−  notDivsBy d n = n `mod` d /= 0 

+  [(m,())  p < fs, let s = (r+p)`div`p*p, 

−  sieve ds (p:ps) x = foldr (filter . notDivsBy) [x, x+2..p*p2] ds 

+  m < [s,s+p..q1]] :: UArray Int Bool )] 

−  ++ sieve (p:ds) ps (p*p+2) 

</haskell> 
</haskell> 

−  This one explicitly maintains the list of primes needed for testing each odds span between successive primes squares, which it also explicitly generates. But it tests with nested <code>filter</code>s, which it repeatedly recreates. 

+  === Linear merging === 

+  But segmentation doesn't add anything substantially, and each multiples stream starts at its prime's square anyway. What does the [[#PostponedPostponed]] code do, operationally? With each prime's square passed by, there emerges a nested linear ''leftdeepening'' structure, '''''(...((xsa)b)...)''''', where '''''xs''''' is the original oddsproducing ''[3,5..]'' list, so that each odd it produces must go through more and more <code>minus</code> nodes on its way up  and those odd numbers that eventually emerge on top are prime. Thinking a bit about it, wouldn't another, ''rightdeepening'' structure, '''''(xs(a+(b+...)))''''', be better? This idea is due to Richard Bird, seen in his code presented in M. O'Neill's article, equivalent to: 

+  <haskell> 

+  primesB = 2 : minus [3..] (foldr (\p r> p*p : union [p*p+p, p*p+2*p..] r) 

+  [] primesB) 

+  </haskell> 

+  or, 

−  ==== Generated Spans, by List of Primes ==== 

+  <haskell> 

−  The list of primes needed to test each range of odds is actually just the prefix of the primes list itself, of known length, and need not be specifically generated at all. Combined with onecall testing by the explicit list of primes, and direct generation of odds between the successive primes squares, this leads to: 

+  primesLME1 = 2 : prs 

+  where 

+  prs = 3 : minus [5,7..] (joinL [[p*p, p*p+2*p..]  p < prs]) 

+  
+  joinL ((x:xs):t) = x : union xs (joinL t) 

+  </haskell> 

+  
+  Here, ''xs'' stays near the top, and ''more frequently'' oddsproducing streams of multiples of smaller primes are ''above'' those of the bigger primes, that produce ''less frequently'' their multiples which have to pass through ''more'' <code>union</code> nodes on their way up. Plus, no explicit synchronization is necessary anymore because the produced multiples of a prime start at its square anyway  just some care has to be taken to avoid a runaway access to the indefinitelydefined structure, defining <code>joinL</code> (or <code>foldr</code>'s combining function) to produce part of its result ''before'' accessing the rest of its input (thus making it ''productive''). 

+  
+  Melissa O'Neill [http://hackage.haskell.org/packages/archive/NumberSieves/0.0/doc/html/src/NumberTheorySieveONeill.html introduced double primes feed] to prevent unneeded memoization (a memory leak). We can even do multistage. Here's the code, faster still and with radically reduced memory consumption, with empirical orders of growth of around ~ <math>n^{1.40}</math> (initially better, yet worsening for bigger ranges): 

<haskell> 
<haskell> 

−  primes :: [Integer] 

+  primesLME = 2 : _Y ((3:) . minus [5,7..] . joinL . map (\p> [p*p, p*p+2*p..])) 

−  primes = 2: 3: sieve 0 (tail primes) 5 

+  
−  sieve k (p:ps) x = [x  x<[x,x+2..p*p2], and [x`rem`p/=0  p<fs]] 

+  _Y :: (t > t) > t 

−  +  _Y g = g (_Y g)  multistage, nonsharing, g (g (g (g ...))) 

−  +   g (let x = g x in x)  two g stages, sharing 

−  where fs = take k (tail primes) 

</haskell> 
</haskell> 

−  It produces about <i>222,000 primes</i> in the same amount of time, and is good for creating about a million first primes, compiled. 

+  <code>_Y</code> is a nonsharing fixpoint combinator, here arranging for a recursive ''"telescoping"'' multistage primes production (a ''tower'' of producers). 

−  The reason to have <code>sieve</code> function available separately too is that it can also be used to produce primes above a given number, as in 

+  This allows the <code>primesLME</code> stream to be discarded immediately as it is being consumed by its consumer. For <code>prs</code> from <code>primesLME1</code> definition above it is impossible, as each produced element of <code>prs</code> is needed later as input to the same <code>prs</code> corecursive stream definition. So the <code>prs</code> stream feeds in a loop into itself and is thus retained in memory, being consumed by self much slower than it is produced. With multistage production, each stage feeds into its consumer above it at the square of its current element which can be immediately discarded after it's been consumed. <code>(3:)</code> jumpstarts the whole thing. 

+  
+  === Tree merging === 

+  Moreover, it can be changed into a '''''tree''''' structure. This idea [http://www.haskell.org/pipermail/haskellcafe/2007July/029391.html is due to Dave Bayer] and [[Prime_numbers_miscellaneous#Implicit_HeapHeinrich Apfelmus]]: 

<haskell> 
<haskell> 

−  primesFrom m = sieve (length h) ps $ m`div`2*2+1 

+  primesTME = 2 : _Y ((3:) . gaps 5 . joinT . map (\p> [p*p, p*p+2*p..])) 

−  where 

+  
−  (h,(_:ps)) = span (<= (floor.sqrt.fromIntegral) m) primes 

+   joinL ((x:xs):t) = x : union xs (joinL t) 

+  joinT ((x:xs):t) = x : union xs (joinT (pairs t))  set union, ~= 

+  where pairs (xs:ys:t) = union xs ys : pairs t  nub.sort.concat 

+  
+  gaps k s@(x:xs)  k < x = k:gaps (k+2) s  ~= [k,k+2..]\\s, 

+   True = gaps (k+2) xs  when null(s\\[k,k+2..]) 

</haskell> 
</haskell> 

−  It can thus produce a few primes e.g. above <code>239812076741689</code>, which is a square of the millionth odd prime, without having to compute all the preceding primes (which would number in trillions). 

+  This code is [http://ideone.com/p0e81 pretty fast], running at speeds and empirical complexities comparable with the code from Melissa O'Neill's article (about <math>O(n^{1.2})</math> in number of primes ''n'' produced). 

−  ==== Multiples Removal on Generated Spans, or Sieve of Eratosthenes ==== 

+  As an aside, <code>joinT</code> is equivalent to [[Fold#Treelike_foldsinfinite treelike folding]] <code>foldi (\(x:xs) ys> x:union xs ys) []</code>: 

−  The divisibility testing too should be considered a specification (as in "no multiples of p"), and not a code per se, because although testing composites is cheap (as most of them will have small factors, so the test is soon aborted), testing prime numbers is costly, and is to be avoided. 

+  [[Image:Treelike_folding.gifframelesscenter458pxtreelike folding]] 

−  All the filtering versions thus far try to <i>keep the primes</i> among all numbers by testing <i>each number</i> in isolation. Instead, all the relevant primes' multiples can be removed from the corresponding segments of odds, and what's left after that will be just primes: 

+  [https://hackage.haskell.org/package/dataordlist0.4.7.0/docs/DataListOrdered.html#v:foldt <code>Data.List.Ordered.foldt</code>] of the <i>dataordlist</i> package builds the same structure, but in a lazier fashion, consuming its input at the slowest pace possible. Here this sophistication is not needed (evidently). 

+  
+  === Tree merging with Wheel === 

+  Wheel factorization optimization can be further applied, and another tree structure can be used which is better adjusted for the primes multiples production (effecting about 5%10% at the top of a total ''2.5x speedup'' w.r.t. the above tree merging on odds only, for first few million primes): 

<haskell> 
<haskell> 

−  primes :: [Integer] 

+  primesTMWE = [2,3,5,7] ++ _Y ((11:) . tail . gapsW 11 wheel 

−  primes = 2: 3: sieve [] (tail primes) 5 

+  . joinT . hitsW 11 wheel) 

−  +  
−  +  gapsW k (d:w) s@(c:cs)  k < c = k : gapsW (k+d) w s  set difference 

−  +   otherwise = gapsW (k+d) w cs  k==c 

−  +  hitsW k (d:w) s@(p:ps)  k < p = hitsW (k+d) w s  intersection 

−  +   otherwise = scanl (\c d>c+p*d) (p*p) (d:w) 

−  +  : hitsW (k+d) w ps  k==p 

−  fs' = [ (s,last ms)  ((s,_),ms)< zip fs mults] 

−  </haskell> 

−  This modifies the preceding sieve to "mark" the odd composites in a given range (instead of testing their divisibility) by generating them  just as a person performing the original sieve of Eratosthenes would do, marking one by one the multiples of the relevant primes. 

+  wheel = 2:4:2:4:6:2:6:4:2:4:6:6:2:6:4:2:6:4:6:8:4:2:4:2: 

+  4:8:6:4:6:2:4:6:2:6:6:4:2:4:6:2:6:4:2:4:2:10:2:10:wheel 

+   cycle $ zipWith () =<< tail $ [i  i < [11..11+_210], gcd i _210 == 1] 

+   where _210 = product [2,3,5,7] 

+  </haskell> 

−  Compared with the previous version, interpreted under both GHCi and WinHugs, it runs <i>faster</i>, takes <i>less</i> memory, and has better asymptotic behavior, its performance approximately the same as in the Merged Multiples Removal sieve. The advantage in working with spans explicitly is that this code is easily amendable to using arrays for the composites marking and removal on each <i>finite</i> span. 

+  The <code>hitsW</code> function is there to find the starting point for rolling the wheel for each prime, but this can be found directly: 

−  === Using Arrays === 

+  <haskell> 

+  primesW = [2,3,5,7] ++ _Y ( (11:) . tail . gapsW 11 wheel . joinT . 

+  map (\p > 

+  map (p*) . dropWhile (< p) $ 

+  scanl (+) (p  rem (p11) 210) wheel) ) 

+  </haskell> 

−  ==== Generating Segments of Primes ==== 

+  Seems to run about 1.4x faster, too. 

−  The removal of multiples on each segment of odds in the sieve of Eratosthenes can be done by actually marking them in an array, instead of manipulating lists with "minus" and "merge": 

+  ====Above Limit  Offset Sieve==== 

+  Another task is to produce primes above a given value: 

<haskell> 
<haskell> 

−  primes :: [Integer] 

+  {# OPTIONS_GHC O2 fnocse #} 

−  primes = 2: 3: sieve [] (tail primes) 5 

+  primesFromTMWE primes m = dropWhile (< m) [2,3,5,7,11] 

−  where 

+  ++ gapsW a wh2 (compositesFrom a) 

−  sieve fs (p:ps) x = [i  i< [x,x+2..q2], a!i] 

+  where 

−  ++ sieve ((2*p,q):fs') ps (q+2) 

+  (a,wh2) = rollFrom (snapUp (max 3 m) 3 2) 

−  where 

+  (h,p2:t) = span (< z) $ drop 4 primes  p < z => p*p<=a 

−  +  z = ceiling $ sqrt $ fromIntegral a + 1  p2>=z => p2*p2>a 

−  +  compositesFrom a = joinT (joinST [multsOf p a  p < h ++ [p2]] 

−  +  : [multsOf p (p*p)  p < t] ) 

−  +  
−  +  snapUp v o step = v + (mod (ov) step)  full steps from o 

+  multsOf p from = scanl (\c d>c+p*d) (p*x) wh  map (p*) $ 

+  where  scanl (+) x wh 

+  (x,wh) = rollFrom (snapUp from p (2*p) `div` p)  , if p < from 

+  
+  wheelNums = scanl (+) 0 wheel 

+  rollFrom n = go wheelNums wheel 

+  where 

+  m = (n11) `mod` 210 

+  go (x:xs) ws@(w:ws2)  x < m = go xs ws2 

+   True = (n+xm, ws)  (x >= m) 

</haskell> 
</haskell> 

−  Apparently, arrays are ''fast'' too. This code, compared with Treefold Merged with Wheel version (itself 2.5x times faster than Melissa O'Neill's PQ version), runs at about the same time and memory usage, but improving slightly with slightly better local asymptotics. 

−  ==== Calculating Primes Upto a Given Value ==== 

+  A certain preprocessing delay makes it worthwhile when producing more than just a few primes, otherwise it degenerates into simple [[#Optimal trial divisiontrial division]], which is then ought to be used directly: 

<haskell> 
<haskell> 

−  primesToN n = 2: [i  i<[3,5..n], ar!i] 

+  primesFrom m = filter isPrime [m..] 

−  where 

−  ar = f 5 $ accumArray (\a b>False) True (3,n) 

−  [(i,())  i< [9,15..n]] 

−  f p a  q>=n = a 

−   True = if null x then a' else f (head x) a' 

−  where q = p*p 

−  a'= a//[(i,False)i<[q,q+2*p..n]] 

−  x = [i  i<[p+2,p+4..n], a' !i] 

</haskell> 
</haskell> 

−  ==== Calculating Primes in a Given Range ==== 

+  === Mapbased === 

+  Runs ~1.7x slower than [[#Tree_mergingTME version]], but with the same empirical time complexity, ~<math>n^{1.2}</math> (in ''n'' primes produced) and same very low (near constant) memory consumption: 

<haskell> 
<haskell> 

−  primesFromTo a b = (if a<3 then [2] else []) 

+  import Data.List  based on 

−  ++ [i  i<[o,o+2..b], ar!i] 

+  import qualified Data.Map as M  http://stackoverflow.com/a/1140100 

−  where 

+  
−  o = max (if even a then a+1 else a) 3 

+  primesMPE :: [Integer] 

−  r = (1+).floor.sqrt.fromInteger $ b 

+  primesMPE = 2 : mkPrimes 3 M.empty prs 9  postponed sieve enlargement 

−  ar = accumArray (\a b>False) True (o,b) 

+  where  by decoupled primes feed loop 

−  [(i,())  p < [3,5..r] 

+  prs = 3 : mkPrimes 5 M.empty prs 9 

−  , let q = p*p 

+  mkPrimes n m ps@ ~(p:pt) q = case (M.null m, M.findMin m) of 

−  +  { (False, (n2, skips))  n == n2 > 

−  +  mkPrimes (n+2) (addSkips n (M.deleteMin m) skips) ps q 

−  +  ; _ > if n < q 

−  +  then n : mkPrimes (n+2) m ps q 

−  +  else mkPrimes (n+2) (addSkip n m (2*p)) pt (head pt^2) 

−  +  } 

+  addSkip n m s = M.alter (Just . maybe [s] (s:)) (n+s) m 

+  addSkips = foldl' . addSkip 

</haskell> 
</haskell> 

−  Although testing by odds instead of by primes, the array generation is so fast that it is very much feasible and even preferable for quick generation of some short spans of relatively big primes. 

+  == Turner's sieve  Trial division == 

−  === Implicit Heap === 

+  David Turner's ''(SASL Language Manual, 1983)'' formulation replaces nonstandard <code>minus</code> in the sieve of Eratosthenes by stock list comprehension with <code>rem</code> filtering, turning it into a trial division algorithm, for clarity and simplicity: 

−  The following is a more efficient prime generator, implementing the sieve of 

+  <haskell> 

−  Eratosthenes. 

+   unbounded sieve, premature filters 

+  primesT = sieve [2..] 

+  where 

+  sieve (p:xs) = p : sieve [x  x < xs, rem x p > 0] 

−  See also the message threads [http://thread.gmane.org/gmane.comp.lang.haskell.cafe/25270/focus=25312 Re: "nocoding" functional data structures via lazyness] for more about how merging ordered lists amounts to creating an implicit heap and [http://thread.gmane.org/gmane.comp.lang.haskell.cafe/26426/focus=26493 Re: Code and Perf. Data for Prime Finders] for an explanation of the <hask>People a</hask> structure that makes it work when tying a knot. 

+   map head 

+   $ iterate (\(p:xs) > [x  x < xs, rem x p > 0]) [2..] 

+  </haskell> 

+  
+  This creates many superfluous implicit filters, because they are created prematurely. To be admitted as prime, ''each number'' will be ''tested for divisibility'' here by all its preceding primes, while just those not greater than its square root would suffice. To find e.g. the '''1001'''st prime (<code>7927</code>), '''1000''' filters are used, when in fact just the first '''24''' are needed (up to <code>89</code>'s filter only). Operational overhead here is huge; theoretically, it has ''quadratic'' time complexity, in the number of produced primes. 

+  
+  === Guarded Filters === 

+  But this really ought to be changed into the ''abortive'' variant, [[#From Squaresagain achieving]] the ''"miraculous"'' complexity improvement from above quadratic to about <math>O(n^{1.45})</math> ''empirically'' (in ''n'' primes produced) by stopping the sieving as soon as possible: 

<haskell> 
<haskell> 

−  data People a = VIP a (People a)  Crowd [a] 

+  primesToGT m = sieve [2..m] 

+  where 

+  sieve (p:xs) 

+   p*p > m = p : xs 

+   True = p : sieve [x  x < xs, rem x p > 0] 

−  mergeP :: Ord a => People a > People a > People a 

+   (\(a,b:_) > map head a ++ b) . span ((< m).(^2).head) $ 

−  mergeP (VIP x xt) ys = VIP x $ mergeP xt ys 

+   iterate (\(p:xs) > [x  x < xs, rem x p > 0]) [2..m] 

−  mergeP (Crowd xs) (Crowd ys) = Crowd $ merge xs ys 

+  </haskell> 

−  mergeP xs@(Crowd (x:xt)) ys@(VIP y yt) = case compare x y of 

−  LT > VIP x $ mergeP (Crowd xt) ys 

−  EQ > VIP x $ mergeP (Crowd xt) yt 

−  GT > VIP y $ mergeP xs yt 

−  merge :: Ord a => [a] > [a] > [a] 

+  === Postponed Filters === 

−  merge xs@(x:xt) ys@(y:yt) = case compare x y of 

+  Or it can remain unbounded, just filters creation must be ''postponed'' until the right moment: 

−  LT > x : merge xt ys 

+  <haskell> 

−  EQ > x : merge xt yt 

+  primesPT1 = 2 : sieve primesPT1 [3..] 

−  GT > y : merge xs yt 

+  where 

+  sieve (p:pt) xs = let (h,t) = span (< p*p) xs 

+  in h ++ sieve pt [x  x < t, rem x p > 0] 

−  diff xs@(x:xt) ys@(y:yt) = case compare x y of 

+   fix $ concatMap (fst . fst) 

−  LT > x : diff xt ys 

+   . iterate (\((_,xs), p:pt) > let (h,t) = span (< p*p) xs in 

−  EQ > diff xt yt 

+   ((h, [x  x < t, rem x p > 0]), pt)) 

−  +   . (,) ([2],[3..]) 

+  </haskell> 

+  It can be rewritten with <code>span</code> and <code>(++)</code> inlined and fused into the <code>sieve</code>: 

+  <haskell> 

+  primesPT = 2 : oddprimes 

+  where 

+  oddprimes = sieve [3,5..] 9 oddprimes 

+  sieve (x:xs) q ps@ ~(p:pt) 

+   x < q = x : sieve xs q ps 

+   True = sieve [x  x < xs, rem x p /= 0] (head pt^2) pt 

+  </haskell> 

+  creating here [[#Linear merging as well]] the linear filtering nested structure at runtime, <code>(...(([3,5..] >>= filterBy [3]) >>= filterBy [5])...)</code>, but unlike the nonpostponed code each filter being created at its proper moment, not sooner than the prime's square is seen. 

+  <haskell> 

+  filterBy ds n = [n  noDivs n ds]  `ds` assumed to be nondecreasing 

−  foldTree :: (a > a > a) > [a] > a 

+  noDivs n ds = foldr (\d r > d*d > n  (rem n d > 0 && r)) True ds 

−  foldTree f ~(x:xs) = x `f` foldTree f (pairs xs) 

+  </haskell> 

−  where pairs ~(x: ~(y:ys)) = f x y : pairs ys 

−  primes, nonprimes :: [Integer] 

+  === Optimal trial division === 

−  primes = 2:3:diff [5,7..] nonprimes 

+  
−  nonprimes = serve . foldTree mergeP . map multiples $ tail primes 

+  The above is algorithmically equivalent to the traditional formulation of trial division, 

+  <haskell> 

+  ps = 2 : [i  i < [3..], 

+  and [rem i p > 0  p < takeWhile (\p > p^2 <= i) ps]] 

+  </haskell> 

+  or, 

+  <haskell> 

+   primes = filter (`noDivs`[2..]) [2..] 

+   = 2 : filter (`noDivs`[3,5..]) [3,5..] 

+  primesTD = 2 : 3 : filter (`noDivs` tail primesTD) [5,7..] 

+  
+  isPrime n = n > 1 && noDivs n primesTD 

+  </haskell> 

+  except that this code is rechecking for each candidate number which primes to use, whereas for every candidate number in each segment between the successive squares of primes these will just be the same prefix of the primes list being built. 

+  
+  Trial division is used as a simple [[Testing primality#Primality Test and Integer Factorizationprimality test and prime factorization algorithm]]. 

+  
+  === Segmented Generate and Test === 

+  Next we turn [[#Postponed Filters the list of filters]] into one filter by an ''explicit'' list, each one in a progression of prefixes of the primes list. This seems to eliminate most recalculations, explicitly filtering composites out from batches of odds between the consecutive squares of primes. 

+  <haskell> 

+  import Data.List 

+  
+  primesST = 2 : ops 

where 
where 

−  multiples p = vip [p*p,p*p+2*p..] 

+  ops = sieve 3 9 ops (inits ops)  odd primes 

+   (sieve 3 9 <*> inits) ops  inits: [],[3],[3,5],... 

+  sieve x q ~(_:pt) (fs:ft) = 

+  filter ((`all` fs) . ((> 0).) . rem) [x,x+2..q2] 

+  ++ sieve (q+2) (head pt^2) pt ft 

+  </haskell> 

+  This can also be coded as, arguably more readable, 

+  <haskell> 

+  primesSGT = 2 : ops 

+  where 

+  ops = 3 : [n  (r:q:_, px) < (zip . tails . (3:) . map (^2)) ops (inits ops), 

+  n < [r+2,r+4..q2], all ((> 0) . rem n) px] 

+  
+   n < foldl (>>=) [r+2,r+4..q2]  chain of filters 

+   [filterBy [p]  p < px]]  OR, 

−  vip (x:xs) = VIP x $ Crowd xs 

+   n < [r+2,r+4..q2] >>= filterBy px]  a filter by a list 

−  serve (VIP x xs) = x:serve xs 

−  serve (Crowd xs) = xs 

</haskell> 
</haskell> 

−  <hask>nonprimes</hask> effectively implements a heap, exploiting lazy evaluation. 

+  ==== Generate and Test Above Limit ==== 

−  === Prime Wheels === 

+  The following will start the segmented Turner sieve at the right place, using any primes list it's supplied with (e.g. [[#Tree_merging_with_Wheel  TMWE]] etc.) or itself, as shown, demand computing it just up to the square root of any prime it'll produce: 

−  The idea of only testing odd numbers can be extended further. For instance, it is a useful fact that every prime number other than 2 and 3 must be of the form <math>6k+1</math> or <math>6k+5</math>. Thus, we only need to test these numbers: 

+  <haskell> 

+  primesFromST m  m <= 2 = 2 : primesFromST 3 

+  primesFromST m  m > 2 = 

+  sieve (m`div`2*2+1) (head ps^2) (tail ps) (inits ps) 

+  where 

+  (h,ps) = span (<= (floor.sqrt $ fromIntegral m+1)) ops 

+  sieve x q ps (fs:ft) = 

+  filter ((`all` (h ++ fs)) . ((> 0) .) . rem) [x,x+2..q2] 

+  ++ sieve (q+2) (head ps^2) (tail ps) ft 

+  ops = 3 : primesFromST 5  odd primes 

+  
+   ~> take 3 $ primesFromST 100000001234 

+   [100000001237,100000001239,100000001249] 

+  </haskell> 

+  
+  This is usually faster than testing candidate numbers for divisibility [[#Optimal trial divisionone by one]] which has to refetch anew the needed prime factors to test by, for each candidate. Faster is the [[99_questions/Solutions/39#Solution_4.offset sieve of Eratosthenes on odds]], and yet faster the one [[#Above_Limit__Offset_Sievew/ wheel optimization]], on this page. 

+  
+  === Conclusions === 

+  All these variants being variations of trial division, finding out primes by direct divisibility testing of every candidate number by sequential primes below its square root (instead of just by ''its factors'', which is what ''direct generation of multiples'' is doing, essentially), are thus principally of worse complexity than that of Sieve of Eratosthenes. 

+  
+  The initial code is just a oneliner that ought to have been regarded as ''executable specification'' in the first place. It can easily be improved quite significantly with a simple use of bounded, guarded formulation to limit the number of filters it creates, or by postponement of filter creation. 

+  
+  == Euler's Sieve == 

+  === Unbounded Euler's sieve === 

+  With each found prime Euler's sieve removes all its multiples ''in advance'' so that at each step the list to process is guaranteed to have ''no multiples'' of any of the preceding primes in it (consists only of numbers ''coprime'' with all the preceding primes) and thus starts with the next prime: 

<haskell> 
<haskell> 

−  primes :: [Integer] 

+  primesEU = 2 : eulers [3,5..] 

−  primes = 2:3:primes' 

where 
where 

−  1:p:candidates = [6*k+r  k < [0..], r < [1,5]] 

+  eulers (p:xs) = p : eulers (xs `minus` map (p*) (p:xs)) 

−  primes' = p : filter isPrime candidates 

+   eratos (p:xs) = p : eratos (xs `minus` [p*p, p*p+2*p..]) 

−  isPrime n = all (not . divides n) $ takeWhile (\p > p*p <= n) primes' 

−  divides n p = n `mod` p == 0 

</haskell> 
</haskell> 

−  Here, <hask>primes'</hask> is the list of primes greater than 3 and <hask>isPrime</hask> does not test for divisibility by 2 or 3 because the <hask>candidates</hask> by construction don't have these numbers as factors. We also need to exclude 1 from the candidates and mark the next one as prime to start the recursion. 

+  This code is extremely inefficient, running above <math>O({n^{2}})</math> empirical complexity (and worsening rapidly), and should be regarded a ''specification'' only. Its memory usage is very high, with empirical space complexity just below <math>O({n^{2}})</math>, in ''n'' primes produced. 

−  Such a scheme to generate candidate numbers first that avoid a given set of primes as divisors is called a '''prime wheel'''. Imagine that you had a wheel of circumference 6 to be rolled along the number line. With spikes positioned 1 and 5 units around the circumference, rolling the wheel will prick holes exactly in those positions on the line whose numbers are not divisible by 2 and 3. 

+  In the streambased sieve of Eratosthenes we are able to ''skip'' along the input stream <code>xs</code> directly to the prime's square, consuming the whole prefix at once, thus achieving the results equivalent to the postponement technique, because the generation of the prime's multiples is independent of the rest of the stream. 

−  A wheel can be represented by its circumference and the spiked positions. 

+  But here in the Euler's sieve it ''is'' dependent on all <code>xs</code> and we're unable ''in principle'' to skip along it to the prime's square  because all <code>xs</code> are needed for each prime's multiples generation. Thus efficient unbounded streambased implementation seems to be impossible in principle, under the simple scheme of producing the multiples by multiplication. 

+  
+  === Wheeled list representation === 

+  
+  The situation can be somewhat improved using a different list representation, for generating lists not from a last element and an increment, but rather a last span and an increment, which entails a set of helpful equivalences: 

<haskell> 
<haskell> 

−  data Wheel = Wheel Integer [Integer] 

+  { fromElt (x,i) = x : fromElt (x + i,i) 

+  === iterate (+ i) x 

+  [n..] === fromElt (n,1) 

+  === fromSpan ([n],1) 

+  [n,n+2..] === fromElt (n,2) 

+  === fromSpan ([n,n+2],4) } 

+  
+  fromSpan (xs,i) = xs ++ fromSpan (map (+ i) xs,i) 

+  
+  { === concat $ iterate (map (+ i)) xs 

+  fromSpan (p:xt,i) === p : fromSpan (xt ++ [p + i], i) 

+  fromSpan (xs,i) `minus` fromSpan (ys,i) 

+  === fromSpan (xs `minus` ys, i) 

+  map (p*) (fromSpan (xs,i)) 

+  === fromSpan (map (p*) xs, p*i) 

+  fromSpan (xs,i) === forall (p > 0). 

+  fromSpan (concat $ take p $ iterate (map (+ i)) xs, p*i) } 

+  
+  spanSpecs = iterate eulerStep ([2],1) 

+  eulerStep (xs@(p:_), i) = 

+  ( (tail . concat . take p . iterate (map (+ i))) xs 

+  `minus` map (p*) xs, p*i ) 

+  
+  { > mapM_ print $ take 4 spanSpecs 

+  ([2],1) 

+  ([3],2) 

+  ([5,7],6) 

+  ([7,11,13,17,19,23,29,31],30) } 

</haskell> 
</haskell> 

−  We prick out numbers by rolling the wheel. 

+  
+  Generating a list from a span specification is like rolling a ''[[#Prime_Wheelswheel]]'' as its pattern gets repeated over and over again. For each span specification <code>w@((p:_),_)</code> produced by <code>eulerStep</code>, the numbers in <code>(fromSpan w)</code> up to <math>{p^2}</math> are all primes too, so that 

+  
<haskell> 
<haskell> 

−  roll (Wheel n rs) = [n*k+r  k < [0..], r < rs] 

+  eulerPrimesTo m = if m > 1 then go ([2],1) else [] 

+  where 

+  go w@((p:_), _) 

+   m < p*p = takeWhile (<= m) (fromSpan w) 

+   True = p : go (eulerStep w) 

</haskell> 
</haskell> 

−  The smallest wheel is the unit wheel with one spike, it will prick out every number. 

+  
+  This runs at about <math>O(n^{1.5..1.8})</math> complexity, for <code>n</code> primes produced, and also suffers from a severe space leak problem (IOW its memory usage is also very high). 

+  
+  == Using Immutable Arrays == 

+  
+  === Generating Segments of Primes === 

+  
+  The sieve of Eratosthenes' [[#Segmentedremoval of multiples on each segment of odds]] can be done by actually marking them in an array, instead of manipulating ordered lists, and can be further sped up more than twice by working with odds only: 

+  
<haskell> 
<haskell> 

−  w0 = Wheel 1 [1] 

+  import Data.Array.Unboxed 

+  
+  primesSA :: [Int] 

+  primesSA = 2 : oddprimes () 

+  where 

+  oddprimes = (3 :) . sieve 3 [] . oddprimes 

+  sieve x fs (p:ps) = [i*2 + x  (i,True) < assocs a] 

+  ++ sieve (p*p) ((p,0) : 

+  [(s, rem (yq) s)  (s,y) < fs]) ps 

+  where 

+  q = (p*px)`div`2 

+  a :: UArray Int Bool 

+  a = accumArray (\ b c > False) True (1,q1) 

+  [(i,())  (s,y) < fs, i < [y+s, y+s+s..q]] 

</haskell> 
</haskell> 

−  We can create a larger wheel by rolling a smaller wheel of circumference <hask>n</hask> along a rim of circumference <hask>p*n</hask> while excluding spike positions at multiples of <hask>p</hask>. 

+  
+  Runs significantly faster than [[#Tree merging with WheelTMWE]] and with better empirical complexity, of about <math>O(n^{1.10..1.05})</math> in producing first few millions of primes, with constant memory footprint. 

+  
+  === Calculating Primes Upto a Given Value === 

+  
+  Equivalent to [[#Accumulating ArrayAccumulating Array]] above, running somewhat faster (compiled by GHC with optimizations turned on): 

+  
<haskell> 
<haskell> 

−  nextSize (Wheel n rs) p = 

+  {# OPTIONS_GHC O2 #} 

−  Wheel (p*n) [r'  k < [0..(p1)], r < rs, let r' = n*k+r, r' `mod` p /= 0] 

+  import Data.Array.Unboxed 

+  
+  primesToNA n = 2: [i  i < [3,5..n], ar ! i] 

+  where 

+  ar = f 5 $ accumArray (\ a b > False) True (3,n) 

+  [(i,())  i < [9,15..n]] 

+  f p a  q > n = a 

+   True = if null x then a2 else f (head x) a2 

+  where q = p*p 

+  a2 :: UArray Int Bool 

+  a2 = a // [(i,False)  i < [q, q+2*p..n]] 

+  x = [i  i < [p+2,p+4..n], a2 ! i] 

</haskell> 
</haskell> 

−  Combining both, we can make wheels that prick out numbers that avoid a given list <hask>ds</hask> of divisors. 

+  
+  === Calculating Primes in a Given Range === 

+  
<haskell> 
<haskell> 

−  mkWheel ds = foldl nextSize w0 ds 

+  primesFromToA a b = (if a<3 then [2] else []) 

+  ++ [i  i < [o,o+2..b], ar ! i] 

+  where 

+  o = max (if even a then a+1 else a) 3  first odd in the segment 

+  r = floor . sqrt $ fromIntegral b + 1 

+  ar = accumArray (\_ _ > False) True (o,b)  initially all True, 

+  [(i,())  p < [3,5..r] 

+  , let q = p*p  flip every multiple of an odd 

+  s = 2*p  to False 

+  (n,x) = quotRem (o  q) s 

+  q2 = if o <= q then q 

+  else q + (n + signum x)*s 

+  , i < [q2,q2+s..b] ] 

</haskell> 
</haskell> 

−  Now, we can generate prime numbers with a wheel that for instance avoids all multiples of 2, 3, 5 and 7. 

+  Although sieving by odds instead of by primes, the array generation is so fast that it is very much feasible and even preferable for quick generation of some short spans of relatively big primes. 

+  
+  == Using Mutable Arrays == 

+  
+  Using mutable arrays is the fastest but not the most memory efficient way to calculate prime numbers in Haskell. 

+  
+  === Using ST Array === 

+  
+  This method implements the Sieve of Eratosthenes, similar to how you might do it 

+  in C, modified to work on odds only. It is fast, but about linear in memory consumption, allocating one (though apparently packed) sieve array for the whole sequence to process. 

+  
<haskell> 
<haskell> 

−  primes :: [Integer] 

+  import Control.Monad 

−  primes = small ++ large 

+  import Control.Monad.ST 

−  where 

+  import Data.Array.ST 

−  1:p:candidates = roll $ mkWheel small 

+  import Data.Array.Unboxed 

−  small = [2,3,5,7] 

+  
−  large = p : filter isPrime candidates 

+  sieveUA :: Int > UArray Int Bool 

−  isPrime n = all (not . divides n) $ takeWhile (\p > p*p <= n) large 

+  sieveUA top = runSTUArray $ do 

−  +  let m = (top1) `div` 2 

+  r = floor . sqrt $ fromIntegral top + 1 

+  sieve < newArray (1,m) True  :: ST s (STUArray s Int Bool) 

+  forM_ [1..r `div` 2] $ \i > do 

+  isPrime < readArray sieve i 

+  when isPrime $ do  ((2*i+1)^21)`div`2 == 2*i*(i+1) 

+  forM_ [2*i*(i+1), 2*i*(i+2)+1..m] $ \j > do 

+  writeArray sieve j False 

+  return sieve 

+  
+  primesToUA :: Int > [Int] 

+  primesToUA top = 2 : [i*2+1  (i,True) < assocs $ sieveUA top] 

</haskell> 
</haskell> 

−  It's a pretty big wheel with a circumference of 210 and allows us to calculate the first 10000 primes in convenient time. 

−  A fixed size wheel is fine, but how about adapting the wheel size while generating prime numbers? See the [[Research papers/Functional pearlsfunctional pearl]] titled [http://citeseer.ist.psu.edu/runciman97lazy.html Lazy wheel sieves and spirals of primes] for more. 

+  Its [http://ideone.com/KwZNc empirical time complexity] is improving with ''n'' (number of primes produced) from above <math>O(n^{1.20})</math> towards <math>O(n^{1.16})</math>. The reference [http://ideone.com/FaPOB C++ vectorbased implementation] exhibits this improvement in empirical time complexity too, from <math>O(n^{1.5})</math> gradually towards <math>O(n^{1.12})</math>, where tested ''(which might be interpreted as evidence towards the expected [http://en.wikipedia.org/wiki/Computation_time#Linearithmic.2Fquasilinear_time quasilinearithmic] <math>O(n \log(n)\log(\log n))</math> time complexity)''. 

−  === Bitwise prime sieve === 
+  === Bitwise prime sieve with Template Haskell === 
Count the number of prime below a given 'n'. Shows fast bitwise arrays, 
Count the number of prime below a given 'n'. Shows fast bitwise arrays, 

Line 416:  Line 692:  
e < unsafeRead a n 
e < unsafeRead a n 

if e then 
if e then 

−  +  if n < cutoff then 

−  +  let loop !j 

−  +   j < m = do 

−  +  x < unsafeRead a j 

−  +  when x $ unsafeWrite a j False 

−  +  loop (j+n) 

−  +   otherwise = go a m cutoff (n+2) (c+1) 

−  +  in loop ( if n < 46340 then n * n else n `shiftL` 1) 

−  +  else go a m cutoff (n+2) (c+1) 

else go a m cutoff (n+2) c 
else go a m cutoff (n+2) c 

</haskell> 
</haskell> 

−  And 
+  And place in a module: 
<haskell> 
<haskell> 

Line 446:  Line 722:  
</haskell> 
</haskell> 

−  === Using IntSet for a traditional sieve === 

+  == Implicit Heap == 

−  <haskell> 

−  module Sieve where 

+  See [[Prime_numbers_miscellaneous#Implicit_Heap  Implicit Heap]]. 

−  import qualified Data.IntSet as I 

−   findNext  finds the next member of an IntSet. 

+  == Prime Wheels == 

−  findNext c is  I.member c is = c 

−   c > I.findMax is = error "Ooops. No next number in set." 

−   otherwise = findNext (c+1) is 

−   mark  delete all multiples of n from n*n to the end of the set 

+  See [[Prime_numbers_miscellaneous#Prime_Wheels  Prime Wheels]]. 

−  mark n is = is I.\\ (I.fromAscList (takeWhile (<=end) (map (n*) [n..]))) 

−  where 

−  end = I.findMax is 

−   primes  gives all primes up to n 

+  == Using IntSet for a traditional sieve == 

−  primes n = worker 2 (I.fromAscList [2..n]) 

−  where 

−  worker x is 

−   (x*x) > n = is 

−   otherwise = worker (findNext (x+1) is) (mark x is) 

−  </haskell> 

−  == Testing Primality == 

+  See [[Prime_numbers_miscellaneous#Using_IntSet_for_a_traditional_sieve  Using IntSet for a traditional sieve]]. 

−  +  == Testing Primality, and Integer Factorization == 

−  Given an infinite list of prime numbers, we can implement primality tests and integer factorization: 

+  See [[Testing_primality  Testing primality]]: 

−  <haskell> 

+  * [[Testing_primality#Primality_Test_and_Integer_Factorization  Primality Test and Integer Factorization ]] 

−  isPrime n = n > 1 && n == head (primeFactors n) 

+  * [[Testing_primality#MillerRabin_Primality_Test  MillerRabin Primality Test]] 

−  primeFactors 1 = [] 

+  == Oneliners == 

−  primeFactors n = go n primes 

+  See [[Prime_numbers_miscellaneous#Oneliners  primes oneliners]]. 

−  where 

−  go n ps@(p:pt) 

−   p*p > n = [n] 

−   n `rem` p == 0 = p : go (n `quot` p) ps 

−   otherwise = go n pt 

−  </haskell> 

−  
−  === MillerRabin Primality Test === 

−  <haskell> 

−  find2km :: Integral a => a > (a,a) 

−  find2km n = f 0 n 

−  where 

−  f k m 

−   r == 1 = (k,m) 

−   otherwise = f (k+1) q 

−  where (q,r) = quotRem m 2 

−  
−  millerRabinPrimality :: Integer > Integer > Bool 

−  millerRabinPrimality n a 

−   a <= 1  a >= n1 = 

−  error $ "millerRabinPrimality: a out of range (" 

−  ++ show a ++ " for "++ show n ++ ")" 

−   n < 2 = False 

−   even n = False 

−   b0 == 1  b0 == n' = True 

−   otherwise = iter (tail b) 

−  where 

−  n' = n1 

−  (k,m) = find2km n' 

−  b0 = powMod n a m 

−  b = take (fromIntegral k) $ iterate (squareMod n) b0 

−  iter [] = False 

−  iter (x:xs) 

−   x == 1 = False 

−   x == n' = True 

−   otherwise = iter xs 

−  
−  pow' :: (Num a, Integral b) => (a > a > a) > (a > a) > a > b > a 

−  pow' _ _ _ 0 = 1 

−  pow' mul sq x' n' = f x' n' 1 

−  where 

−  f x n y 

−   n == 1 = x `mul` y 

−   r == 0 = f x2 q y 

−   otherwise = f x2 q (x `mul` y) 

−  where 

−  (q,r) = quotRem n 2 

−  x2 = sq x 

−  
−  mulMod :: Integral a => a > a > a > a 

−  mulMod a b c = (b * c) `mod` a 

−  squareMod :: Integral a => a > a > a 

−  squareMod a b = (b * b) `rem` a 

−  powMod :: Integral a => a > a > a > a 

−  powMod m = pow' (mulMod m) (squareMod m) 

−  </haskell> 

== External links == 
== External links == 

* http://www.cs.hmc.edu/~oneill/code/haskellprimes.zip 
* http://www.cs.hmc.edu/~oneill/code/haskellprimes.zip 

−  : A collection of prime generators; the file "ONeillPrimes.hs" contains one of the fastest pureHaskell prime generators 
+  : A collection of prime generators; the file "ONeillPrimes.hs" contains one of the fastest pureHaskell prime generators; code by Melissa O'Neill. 
+  : WARNING: Don't use the priority queue from ''older versions'' of that file for your projects: it's broken and works for primes only by a lucky chance. The ''revised'' version of the file fixes the bug, as pointed out by Eugene Kirpichov on February 24, 2009 on the [http://www.mailarchive.com/haskellcafe@haskell.org/msg54618.html haskellcafe] mailing list, and fixed by Bertram Felgenhauer. 

+  
+  * [http://ideone.com/willness/primes test entries] for (some of) the above code variants. 

+  
+  * Speed/memory [http://ideone.com/p0e81 comparison table] for sieve of Eratosthenes variants. 

+  
+  * [http://en.wikipedia.org/wiki/Analysis_of_algorithms#Empirical_orders_of_growth Empirical orders of growth] on Wikipedia. 

[[Category:Code]] 
[[Category:Code]] 
Latest revision as of 16:55, 8 August 2020
In mathematics, amongst the natural numbers greater than 1, a prime number (or a prime) is such that has no divisors other than itself (and 1).
Contents
 1 Prime Number Resources
 2 Definition
 3 Sieve of Eratosthenes
 4 Turner's sieve  Trial division
 5 Euler's Sieve
 6 Using Immutable Arrays
 7 Using Mutable Arrays
 8 Implicit Heap
 9 Prime Wheels
 10 Using IntSet for a traditional sieve
 11 Testing Primality, and Integer Factorization
 12 Oneliners
 13 External links
Prime Number Resources
 At Wikipedia:
 HackageDB packages:
 arithmoi: Various basic number theoretic functions; efficient arraybased sieves, Montgomery curve factorization ...
 Numbers: An assortment of number theoretic functions.
 NumberSieves: Number Theoretic Sieves: primes, factorization, and Euler's Totient.
 primes: Efficient, purely functional generation of prime numbers.
 Papers:
 O'Neill, Melissa E., "The Genuine Sieve of Eratosthenes", Journal of Functional Programming, Published online by Cambridge University Press 9 October 2008 doi:10.1017/S0956796808007004.
Definition
In mathematics, amongst the natural numbers greater than 1, a prime number (or a prime) is such that has no divisors other than itself (and 1). The smallest prime is thus 2. Nonprime numbers are known as composite, i.e. those representable as product of two natural numbers greater than 1.
To find out a prime's multiples we can either a. test each new candidate number for divisibility by that prime, giving rise to a kind of trial division algorithm; or b. we can directly generate the multiples of a prime p by counting up from it in increments of p, resulting in a variant of the sieve of Eratosthenes.
The set of prime numbers is thus
 P = { n ∈ N_{2} : (∀ m ∈ N_{2}) ( (m  n) ⇒ m = n) }
 = { n ∈ N_{2} : (∀ m ∈ N_{2}) ( m < n ⇒ ¬(m  n)) }
 = { n ∈ N_{2} : (∀ m ∈ N_{2}) ( m ⋅ m ≤ n ⇒ ¬(m  n)) }
 = N_{2} \ { n ⋅ m : n,m ∈ N_{2} }
 = N_{2} \ ⋃ { { n ⋅ m : m ∈ N_{n} } : n ∈ N_{2} }
 = N_{2} \ ⋃ { { n ⋅ n, n ⋅ n+n, n ⋅ n+n+n, ... } : n ∈ N_{2} }
 = N_{2} \ ⋃ { { p ⋅ p, p ⋅ p+p, p ⋅ p+p+p, ... } : p ∈ P }
 where N_{k} = { n ∈ N : n ≥ k }
 = N_{2} \ ⋃ { { p ⋅ p, p ⋅ p+p, p ⋅ p+p+p, ... } : p ∈ P }
Thus starting with 2, for each newly found prime we can eliminate from the rest of the numbers all the multiples of this prime, giving us the next available number as next prime. This is known as sieving the natural numbers, so that in the end all the composites are eliminated and what we are left with are just primes. (This is what the last formula is describing, though seemingly impredicative, because it is selfreferential. But because N_{2} is wellordered with the order preserved under addition, the formula is welldefined.)
Prototypically, it is
primes = map head $ scanl (\\) [2..] [[p, p+p..]  p < primes]
where
(\\) = Data.List.Ordered.minus
Having (a chain of) directaccess mutable arrays indeed enables easy marking of these multiples as is usually done in imperative languages; but to get an efficient listbased code we have to be smart about combining those streams of multiples of each prime  which gives us also the memory efficiency in generating the results incrementally, one by one.
Short exposition is here.
Sieve of Eratosthenes
Simplest, bounded, very inefficient formulation:
import Data.List (\\)  (\\) is setdifference for unordered lists
primesTo m = sieve [2..m]
where
sieve (x:xs) = x : sieve (xs \\ [x,x+x..m])
sieve [] = []
 or:
= ps
where
ps = map head $ takeWhile (not.null)
$ scanl (\\) [2..m] [[p, p+p..m]  p < ps]
The (unbounded) sieve of Eratosthenes calculates primes as integers above 1 that are not multiples of primes, i.e. not composite — whereas composites are found as enumeration of multiples of each prime, generated by counting up from prime's square in constant increments equal to that prime (or twice that much, for odd primes). This is much more efficient and runs at about n^{1.2} empirical orders of growth (corresponding to n (log n)^{2} log log n complexity, more or less, in n primes produced):
import Data.List.Ordered (minus, union, unionAll)
primes = 2 : 3 : minus [5,7..] (unionAll [[p*p, p*p+2*p..]  p < tail primes])
{ Using `under n = takeWhile (<= n)`, with ordered increasing lists,
`minus`, `union` and `unionAll` satisfy, for any `n` and `m`:
under n (minus a b) == nub . sort $ under n a \\ under n b
under n (union a b) == nub . sort $ under n a ++ under n b
under n . unionAll . take m == under n . foldl union [] . take m
under n . unionAll == nub . sort . concat
. takeWhile (not.null) . map (under n) }
The definition is primed with 2 and 3 as initial primes, to avoid the vicious circle.
The "big union" unionAll
function could be defined as the folding of (\(x:xs) > (x:) . union xs)
; or it could use a Bool
array as a sorting and duplicatesremoving device. The processing naturally divides up into the segments between successive squares of primes.
Stepwise development follows (the fully developed version is here).
Initial definition
First of all, working with ordered increasing lists, the sieve of Eratosthenes can be genuinely represented by
 genuine yet wasteful sieve of Eratosthenes
 primes = eratos [2.. ]  unbounded
primesTo m = eratos [2..m]  bounded, up to m
where
eratos [] = []
eratos (p:xs) = p : eratos (xs `minus` [p, p+p..])
 eratos (p:xs) = p : eratos (xs `minus` map (p*) [1..])
 eulers (p:xs) = p : eulers (xs `minus` map (p*) (p:xs))
 turner (p:xs) = p : turner [x  x < xs, rem x p /= 0]
 fix ( map head . scanl minus [2..] . map (\p> [p, p+p..]) )
This should be regarded more like a specification, not a code. It runs at empirical orders of growth worse than quadratic in number of primes produced. But it has the core defining features of the classical formulation of S. of E. as a. being bounded, i.e. having a top limit value, and b. finding out the multiples of a prime directly, by counting up from it in constant increments, equal to that prime.
The canonical listdifference minus
and duplicatesremoving union
functions (cf. Data.List.Ordered) are:
 ordered lists, difference and union
minus (x:xs) (y:ys) = case (compare x y) of
LT > x : minus xs (y:ys)
EQ > minus xs ys
GT > minus (x:xs) ys
minus xs _ = xs
union (x:xs) (y:ys) = case (compare x y) of
LT > x : union xs (y:ys)
EQ > x : union xs ys
GT > y : union (x:xs) ys
union xs [] = xs
union [] ys = ys
The name merge ought to be reserved for duplicatespreserving merging operation of the merge sort.
Analysis
So for each newly found prime p the sieve discards its multiples, enumerating them by counting up in steps of p. There are thus multiples generated and eliminated for each prime, and multiples in total, with duplicates, by virtues of prime harmonic series.
If each multiple is dealt with in time, this will translate into RAM machine operations (since we consider addition as an atomic operation). Indeed mutable randomaccess arrays allow for that. But lists in Haskell are sequentialaccess, and complexity of minus(a,b)
for lists is instead of of the direct access destructive array update. The lower the complexity of each minus step, the better the overall complexity.
So on kth step the argument list (p:xs)
that the eratos
function gets, starts with the (k+1)th prime, and consists of all the numbers ≤ m coprime with all the primes ≤ p. According to the M. O'Neill's article (p.10) there are such numbers.
It looks like for our intents and purposes. Since the number of primes below m is by the prime number theorem (where is a prime counting function), there will be n multiplesremoving steps in the algorithm; it means total complexity of at least , or in n primes produced  much much worse than the optimal .
From Squares
But we can start each elimination step at a prime's square, as its smaller multiples will have been already produced and discarded on previous steps, as multiples of smaller primes. This means we can stop early now, when the prime's square reaches the top value m, and thus cut the total number of steps to around . This does not in fact change the complexity of randomaccess code, but for lists it makes it , or in n primes produced, a dramatic speedup:
primesToQ m = eratos [2..m]
where
eratos [] = []
eratos (p:xs) = p : eratos (xs `minus` [p*p, p*p+p..m])
 eratos (p:xs) = p : eratos (xs `minus` map (p*) [p..div m p])
 eulers (p:xs) = p : eulers (xs `minus` map (p*) (under (div m p) (p:xs)))
 turner (p:xs) = p : turner [x  x<xs, x<p*p  rem x p /= 0]
Its empirical complexity is around . This simple optimization works here because this formulation is bounded (by an upper limit). To start late on a bounded sequence is to stop early (starting past end makes an empty sequence – see warning below^{ 1}), thus preventing the creation of all the superfluous multiples streams which start above the upper bound anyway (note that Turner's sieve is unaffected by this). This is acceptably slow now, striking a good balance between clarity, succinctness and efficiency.
^{1}Warning: this is predicated on a subtle point of minus xs [] = xs
definition being used, as it indeed should be. If the definition minus (x:xs) [] = x:minus xs []
is used, the problem is back and the complexity is bad again.
Guarded
This ought to be explicated (improving on clarity, though not on time complexity) as in the following, for which it is indeed a minor optimization whether to start from p or p*p  because it explicitly stops as soon as possible:
primesToG m = 2 : sieve [3,5..m]
where
sieve (p:xs)
 p*p > m = p : xs
 otherwise = p : sieve (xs `minus` [p*p, p*p+2*p..])
 p : sieve (xs `minus` map (p*) [p,p+2..])
 p : eulers (xs `minus` map (p*) (p:xs))
(here we also flatly ignore all evens above 2 a priori.) It is now clear that it can't be made unbounded just by abolishing the upper bound m, because the guard can not be simply omitted without changing the complexity back for the worst.
Accumulating Array
So while minus(a,b)
takes operations for randomaccess imperative arrays and about operations here for ordered increasing lists of numbers, when using Haskell's immutable array for a one could expect the array update time to be nevertheless closer to if destructive update was used implicitly by compiler for an array being passed along as an accumulating state parameter:
{# OPTIONS_GHC O2 #}
import Data.Array.Unboxed
primesToA m = sieve 3 (array (3,m) [(i,odd i)  i<[3..m]]
:: UArray Int Bool)
where
sieve p a
 p*p > m = 2 : [i  (i,True) < assocs a]
 a!p = sieve (p+2) $ a // [(i,False)  i < [p*p, p*p+2*p..m]]
 otherwise = sieve (p+2) a
Indeed for unboxed arrays (suggested by Daniel Fischer; with regular, boxed arrays it is very slow), the above code runs pretty fast, but with empirical complexity of O(n^{1.15..1.45}) in n primes produced (for producing from few hundred thousands to few millions primes, memory usage also slowly growing). If the update for each index were an O(1) operation, the empirical complexity would be seen as diminishing, as O(n^{1.15..1.05}), reflecting the true, linearithmic complexity.
We could use explicitly mutable monadic arrays (see below) to remedy this, but we can also think about it a little bit more on the functional side of things still.
Postponed
Going back to guarded Eratosthenes, first we notice that though it works with minimal number of prime multiples streams, it still starts working with each prematurely. Fixing this with explicit synchronization won't change complexity but will speed it up some more:
primesPE1 = 2 : sieve [3..] primesPE1
where
sieve xs (p:pt)  q < p*p , (h,t) < span (< q) xs =
h ++ sieve (t `minus` [q, q+p..]) pt
 h ++ turner [x  x<t, rem x p>0] pt
Inlining and fusing span
and (++)
we get:
primesPE = 2 : sieve [3..] [[p*p, p*p+p..]  p < primesPE]
where
sieve (x:xs) t@((q:cs):r)
 x < q = x : sieve xs t
 otherwise = sieve (minus xs cs) r
Since the removal of a prime's multiples here starts at the right moment, and not just from the right place, the code could now finally be made unbounded. Because no multiplesremoval process is started prematurely, there are no extraneous multiples streams, which were the reason for the original formulation's extreme inefficiency.
Segmented
With work done segmentwise between the successive squares of primes it becomes
primesSE = 2 : ops
where
ops = sieve 3 9 ops []  odd primes
sieve x q ~(p:pt) fs =
foldr (flip minus) [x,x+2..q2]  chain of subtractions
[[y+s, y+2*s..q]  (s,y) < fs]  OR,
 [x,x+2..q2] `minus` foldl union []  subtraction of merged
 [[y+s, y+2*s..q]  (s,y) < fs]  lists
++ sieve (q+2) (head pt^2) pt
((2*p,q):[(s,qrem (qy) s)  (s,y) < fs])
This "marks" the odd composites in a given range by generating them  just as a person performing the original sieve of Eratosthenes would do, counting one by one the multiples of the relevant primes. These composites are independently generated so some will be generated multiple times.
The advantage to working in spans explicitly is that this code is easily amendable to using arrays for the composites marking and removal on each finite span; and memory usage can be kept in check by using fixed sized segments.
Segmented Treemerging
Rearranging the chain of subtractions into a subtraction of merged streams (see below) and using treelike folding structure, further speeds up the code and significantly improves its asymptotic time behavior (down to about , space is leaking though):
primesSTE = 2 : ops
where
ops = sieve 3 9 ops []  odd primes
sieve x q ~(p:pt) fs =
([x,x+2..q2] `minus` joinST [[y+s, y+2*s..q]  (s,y) < fs])
++ sieve (q+2) (head pt^2) pt
((++ [(2*p,q)]) [(s,qrem (qy) s)  (s,y) < fs])
joinST (xs:t) = (union xs . joinST . pairs) t
where
pairs (xs:ys:t) = union xs ys : pairs t
pairs t = t
joinST [] = []
Segmented merging via an array
The removal of composites is easy with arrays. Starting points can be calculated directly:
import Data.List (inits, tails)
import Data.Array.Unboxed
primesSAE = 2 : sieve 2 4 (tail primesSAE) (inits primesSAE)
 (2:) . (sieve 2 4 . tail <*> inits) $ primesSAE
where
sieve r q ps (fs:ft) = [n  (n,True) < assocs (
accumArray (\ _ _ > False) True (r+1,q1)
[(m,())  p < fs, let s = p * div (r+p) p,
m < [s,s+p..q1]] :: UArray Int Bool )]
++ sieve q (head ps^2) (tail ps) ft
The pattern of iterated calls to tail
is captured by a higherorder function tails
, which explicitly generates the stream of tails of a stream, making for a bit more readable (even if possibly a bit less efficient) code:
psSAGE = 2 : [n  (r:q:_, fs) < (zip . tails . (2:) . map (^2) <*> inits) psSAGE,
(n,True) < assocs (
accumArray (\_ _ > False) True (r+1, q1)
[(m,())  p < fs, let s = (r+p)`div`p*p,
m < [s,s+p..q1]] :: UArray Int Bool )]
Linear merging
But segmentation doesn't add anything substantially, and each multiples stream starts at its prime's square anyway. What does the Postponed code do, operationally? With each prime's square passed by, there emerges a nested linear leftdeepening structure, (...((xsa)b)...), where xs is the original oddsproducing [3,5..] list, so that each odd it produces must go through more and more minus
nodes on its way up  and those odd numbers that eventually emerge on top are prime. Thinking a bit about it, wouldn't another, rightdeepening structure, (xs(a+(b+...))), be better? This idea is due to Richard Bird, seen in his code presented in M. O'Neill's article, equivalent to:
primesB = 2 : minus [3..] (foldr (\p r> p*p : union [p*p+p, p*p+2*p..] r)
[] primesB)
or,
primesLME1 = 2 : prs
where
prs = 3 : minus [5,7..] (joinL [[p*p, p*p+2*p..]  p < prs])
joinL ((x:xs):t) = x : union xs (joinL t)
Here, xs stays near the top, and more frequently oddsproducing streams of multiples of smaller primes are above those of the bigger primes, that produce less frequently their multiples which have to pass through more union
nodes on their way up. Plus, no explicit synchronization is necessary anymore because the produced multiples of a prime start at its square anyway  just some care has to be taken to avoid a runaway access to the indefinitelydefined structure, defining joinL
(or foldr
's combining function) to produce part of its result before accessing the rest of its input (thus making it productive).
Melissa O'Neill introduced double primes feed to prevent unneeded memoization (a memory leak). We can even do multistage. Here's the code, faster still and with radically reduced memory consumption, with empirical orders of growth of around ~ (initially better, yet worsening for bigger ranges):
primesLME = 2 : _Y ((3:) . minus [5,7..] . joinL . map (\p> [p*p, p*p+2*p..]))
_Y :: (t > t) > t
_Y g = g (_Y g)  multistage, nonsharing, g (g (g (g ...)))
 g (let x = g x in x)  two g stages, sharing
_Y
is a nonsharing fixpoint combinator, here arranging for a recursive "telescoping" multistage primes production (a tower of producers).
This allows the primesLME
stream to be discarded immediately as it is being consumed by its consumer. For prs
from primesLME1
definition above it is impossible, as each produced element of prs
is needed later as input to the same prs
corecursive stream definition. So the prs
stream feeds in a loop into itself and is thus retained in memory, being consumed by self much slower than it is produced. With multistage production, each stage feeds into its consumer above it at the square of its current element which can be immediately discarded after it's been consumed. (3:)
jumpstarts the whole thing.
Tree merging
Moreover, it can be changed into a tree structure. This idea is due to Dave Bayer and Heinrich Apfelmus:
primesTME = 2 : _Y ((3:) . gaps 5 . joinT . map (\p> [p*p, p*p+2*p..]))
 joinL ((x:xs):t) = x : union xs (joinL t)
joinT ((x:xs):t) = x : union xs (joinT (pairs t))  set union, ~=
where pairs (xs:ys:t) = union xs ys : pairs t  nub.sort.concat
gaps k s@(x:xs)  k < x = k:gaps (k+2) s  ~= [k,k+2..]\\s,
 True = gaps (k+2) xs  when null(s\\[k,k+2..])
This code is pretty fast, running at speeds and empirical complexities comparable with the code from Melissa O'Neill's article (about in number of primes n produced).
As an aside, joinT
is equivalent to infinite treelike folding foldi (\(x:xs) ys> x:union xs ys) []
:
Data.List.Ordered.foldt
of the dataordlist package builds the same structure, but in a lazier fashion, consuming its input at the slowest pace possible. Here this sophistication is not needed (evidently).
Tree merging with Wheel
Wheel factorization optimization can be further applied, and another tree structure can be used which is better adjusted for the primes multiples production (effecting about 5%10% at the top of a total 2.5x speedup w.r.t. the above tree merging on odds only, for first few million primes):
primesTMWE = [2,3,5,7] ++ _Y ((11:) . tail . gapsW 11 wheel
. joinT . hitsW 11 wheel)
gapsW k (d:w) s@(c:cs)  k < c = k : gapsW (k+d) w s  set difference
 otherwise = gapsW (k+d) w cs  k==c
hitsW k (d:w) s@(p:ps)  k < p = hitsW (k+d) w s  intersection
 otherwise = scanl (\c d>c+p*d) (p*p) (d:w)
: hitsW (k+d) w ps  k==p
wheel = 2:4:2:4:6:2:6:4:2:4:6:6:2:6:4:2:6:4:6:8:4:2:4:2:
4:8:6:4:6:2:4:6:2:6:6:4:2:4:6:2:6:4:2:4:2:10:2:10:wheel
 cycle $ zipWith () =<< tail $ [i  i < [11..11+_210], gcd i _210 == 1]
 where _210 = product [2,3,5,7]
The hitsW
function is there to find the starting point for rolling the wheel for each prime, but this can be found directly:
primesW = [2,3,5,7] ++ _Y ( (11:) . tail . gapsW 11 wheel . joinT .
map (\p >
map (p*) . dropWhile (< p) $
scanl (+) (p  rem (p11) 210) wheel) )
Seems to run about 1.4x faster, too.
Above Limit  Offset Sieve
Another task is to produce primes above a given value:
{# OPTIONS_GHC O2 fnocse #}
primesFromTMWE primes m = dropWhile (< m) [2,3,5,7,11]
++ gapsW a wh2 (compositesFrom a)
where
(a,wh2) = rollFrom (snapUp (max 3 m) 3 2)
(h,p2:t) = span (< z) $ drop 4 primes  p < z => p*p<=a
z = ceiling $ sqrt $ fromIntegral a + 1  p2>=z => p2*p2>a
compositesFrom a = joinT (joinST [multsOf p a  p < h ++ [p2]]
: [multsOf p (p*p)  p < t] )
snapUp v o step = v + (mod (ov) step)  full steps from o
multsOf p from = scanl (\c d>c+p*d) (p*x) wh  map (p*) $
where  scanl (+) x wh
(x,wh) = rollFrom (snapUp from p (2*p) `div` p)  , if p < from
wheelNums = scanl (+) 0 wheel
rollFrom n = go wheelNums wheel
where
m = (n11) `mod` 210
go (x:xs) ws@(w:ws2)  x < m = go xs ws2
 True = (n+xm, ws)  (x >= m)
A certain preprocessing delay makes it worthwhile when producing more than just a few primes, otherwise it degenerates into simple trial division, which is then ought to be used directly:
primesFrom m = filter isPrime [m..]
Mapbased
Runs ~1.7x slower than TME version, but with the same empirical time complexity, ~ (in n primes produced) and same very low (near constant) memory consumption:
import Data.List  based on
import qualified Data.Map as M  http://stackoverflow.com/a/1140100
primesMPE :: [Integer]
primesMPE = 2 : mkPrimes 3 M.empty prs 9  postponed sieve enlargement
where  by decoupled primes feed loop
prs = 3 : mkPrimes 5 M.empty prs 9
mkPrimes n m ps@ ~(p:pt) q = case (M.null m, M.findMin m) of
{ (False, (n2, skips))  n == n2 >
mkPrimes (n+2) (addSkips n (M.deleteMin m) skips) ps q
; _ > if n < q
then n : mkPrimes (n+2) m ps q
else mkPrimes (n+2) (addSkip n m (2*p)) pt (head pt^2)
}
addSkip n m s = M.alter (Just . maybe [s] (s:)) (n+s) m
addSkips = foldl' . addSkip
Turner's sieve  Trial division
David Turner's (SASL Language Manual, 1983) formulation replaces nonstandard minus
in the sieve of Eratosthenes by stock list comprehension with rem
filtering, turning it into a trial division algorithm, for clarity and simplicity:
 unbounded sieve, premature filters
primesT = sieve [2..]
where
sieve (p:xs) = p : sieve [x  x < xs, rem x p > 0]
 map head
 $ iterate (\(p:xs) > [x  x < xs, rem x p > 0]) [2..]
This creates many superfluous implicit filters, because they are created prematurely. To be admitted as prime, each number will be tested for divisibility here by all its preceding primes, while just those not greater than its square root would suffice. To find e.g. the 1001st prime (7927
), 1000 filters are used, when in fact just the first 24 are needed (up to 89
's filter only). Operational overhead here is huge; theoretically, it has quadratic time complexity, in the number of produced primes.
Guarded Filters
But this really ought to be changed into the abortive variant, again achieving the "miraculous" complexity improvement from above quadratic to about empirically (in n primes produced) by stopping the sieving as soon as possible:
primesToGT m = sieve [2..m]
where
sieve (p:xs)
 p*p > m = p : xs
 True = p : sieve [x  x < xs, rem x p > 0]
 (\(a,b:_) > map head a ++ b) . span ((< m).(^2).head) $
 iterate (\(p:xs) > [x  x < xs, rem x p > 0]) [2..m]
Postponed Filters
Or it can remain unbounded, just filters creation must be postponed until the right moment:
primesPT1 = 2 : sieve primesPT1 [3..]
where
sieve (p:pt) xs = let (h,t) = span (< p*p) xs
in h ++ sieve pt [x  x < t, rem x p > 0]
 fix $ concatMap (fst . fst)
 . iterate (\((_,xs), p:pt) > let (h,t) = span (< p*p) xs in
 ((h, [x  x < t, rem x p > 0]), pt))
 . (,) ([2],[3..])
It can be rewritten with span
and (++)
inlined and fused into the sieve
:
primesPT = 2 : oddprimes
where
oddprimes = sieve [3,5..] 9 oddprimes
sieve (x:xs) q ps@ ~(p:pt)
 x < q = x : sieve xs q ps
 True = sieve [x  x < xs, rem x p /= 0] (head pt^2) pt
creating here as well the linear filtering nested structure at runtime, (...(([3,5..] >>= filterBy [3]) >>= filterBy [5])...)
, but unlike the nonpostponed code each filter being created at its proper moment, not sooner than the prime's square is seen.
filterBy ds n = [n  noDivs n ds]  `ds` assumed to be nondecreasing
noDivs n ds = foldr (\d r > d*d > n  (rem n d > 0 && r)) True ds
Optimal trial division
The above is algorithmically equivalent to the traditional formulation of trial division,
ps = 2 : [i  i < [3..],
and [rem i p > 0  p < takeWhile (\p > p^2 <= i) ps]]
or,
 primes = filter (`noDivs`[2..]) [2..]
 = 2 : filter (`noDivs`[3,5..]) [3,5..]
primesTD = 2 : 3 : filter (`noDivs` tail primesTD) [5,7..]
isPrime n = n > 1 && noDivs n primesTD
except that this code is rechecking for each candidate number which primes to use, whereas for every candidate number in each segment between the successive squares of primes these will just be the same prefix of the primes list being built.
Trial division is used as a simple primality test and prime factorization algorithm.
Segmented Generate and Test
Next we turn the list of filters into one filter by an explicit list, each one in a progression of prefixes of the primes list. This seems to eliminate most recalculations, explicitly filtering composites out from batches of odds between the consecutive squares of primes.
import Data.List
primesST = 2 : ops
where
ops = sieve 3 9 ops (inits ops)  odd primes
 (sieve 3 9 <*> inits) ops  inits: [],[3],[3,5],...
sieve x q ~(_:pt) (fs:ft) =
filter ((`all` fs) . ((> 0).) . rem) [x,x+2..q2]
++ sieve (q+2) (head pt^2) pt ft
This can also be coded as, arguably more readable,
primesSGT = 2 : ops
where
ops = 3 : [n  (r:q:_, px) < (zip . tails . (3:) . map (^2)) ops (inits ops),
n < [r+2,r+4..q2], all ((> 0) . rem n) px]
 n < foldl (>>=) [r+2,r+4..q2]  chain of filters
 [filterBy [p]  p < px]]  OR,
 n < [r+2,r+4..q2] >>= filterBy px]  a filter by a list
Generate and Test Above Limit
The following will start the segmented Turner sieve at the right place, using any primes list it's supplied with (e.g. TMWE etc.) or itself, as shown, demand computing it just up to the square root of any prime it'll produce:
primesFromST m  m <= 2 = 2 : primesFromST 3
primesFromST m  m > 2 =
sieve (m`div`2*2+1) (head ps^2) (tail ps) (inits ps)
where
(h,ps) = span (<= (floor.sqrt $ fromIntegral m+1)) ops
sieve x q ps (fs:ft) =
filter ((`all` (h ++ fs)) . ((> 0) .) . rem) [x,x+2..q2]
++ sieve (q+2) (head ps^2) (tail ps) ft
ops = 3 : primesFromST 5  odd primes
 ~> take 3 $ primesFromST 100000001234
 [100000001237,100000001239,100000001249]
This is usually faster than testing candidate numbers for divisibility one by one which has to refetch anew the needed prime factors to test by, for each candidate. Faster is the offset sieve of Eratosthenes on odds, and yet faster the one w/ wheel optimization, on this page.
Conclusions
All these variants being variations of trial division, finding out primes by direct divisibility testing of every candidate number by sequential primes below its square root (instead of just by its factors, which is what direct generation of multiples is doing, essentially), are thus principally of worse complexity than that of Sieve of Eratosthenes.
The initial code is just a oneliner that ought to have been regarded as executable specification in the first place. It can easily be improved quite significantly with a simple use of bounded, guarded formulation to limit the number of filters it creates, or by postponement of filter creation.
Euler's Sieve
Unbounded Euler's sieve
With each found prime Euler's sieve removes all its multiples in advance so that at each step the list to process is guaranteed to have no multiples of any of the preceding primes in it (consists only of numbers coprime with all the preceding primes) and thus starts with the next prime:
primesEU = 2 : eulers [3,5..]
where
eulers (p:xs) = p : eulers (xs `minus` map (p*) (p:xs))
 eratos (p:xs) = p : eratos (xs `minus` [p*p, p*p+2*p..])
This code is extremely inefficient, running above empirical complexity (and worsening rapidly), and should be regarded a specification only. Its memory usage is very high, with empirical space complexity just below , in n primes produced.
In the streambased sieve of Eratosthenes we are able to skip along the input stream xs
directly to the prime's square, consuming the whole prefix at once, thus achieving the results equivalent to the postponement technique, because the generation of the prime's multiples is independent of the rest of the stream.
But here in the Euler's sieve it is dependent on all xs
and we're unable in principle to skip along it to the prime's square  because all xs
are needed for each prime's multiples generation. Thus efficient unbounded streambased implementation seems to be impossible in principle, under the simple scheme of producing the multiples by multiplication.
Wheeled list representation
The situation can be somewhat improved using a different list representation, for generating lists not from a last element and an increment, but rather a last span and an increment, which entails a set of helpful equivalences:
{ fromElt (x,i) = x : fromElt (x + i,i)
=== iterate (+ i) x
[n..] === fromElt (n,1)
=== fromSpan ([n],1)
[n,n+2..] === fromElt (n,2)
=== fromSpan ([n,n+2],4) }
fromSpan (xs,i) = xs ++ fromSpan (map (+ i) xs,i)
{ === concat $ iterate (map (+ i)) xs
fromSpan (p:xt,i) === p : fromSpan (xt ++ [p + i], i)
fromSpan (xs,i) `minus` fromSpan (ys,i)
=== fromSpan (xs `minus` ys, i)
map (p*) (fromSpan (xs,i))
=== fromSpan (map (p*) xs, p*i)
fromSpan (xs,i) === forall (p > 0).
fromSpan (concat $ take p $ iterate (map (+ i)) xs, p*i) }
spanSpecs = iterate eulerStep ([2],1)
eulerStep (xs@(p:_), i) =
( (tail . concat . take p . iterate (map (+ i))) xs
`minus` map (p*) xs, p*i )
{ > mapM_ print $ take 4 spanSpecs
([2],1)
([3],2)
([5,7],6)
([7,11,13,17,19,23,29,31],30) }
Generating a list from a span specification is like rolling a wheel as its pattern gets repeated over and over again. For each span specification w@((p:_),_)
produced by eulerStep
, the numbers in (fromSpan w)
up to are all primes too, so that
eulerPrimesTo m = if m > 1 then go ([2],1) else []
where
go w@((p:_), _)
 m < p*p = takeWhile (<= m) (fromSpan w)
 True = p : go (eulerStep w)
This runs at about complexity, for n
primes produced, and also suffers from a severe space leak problem (IOW its memory usage is also very high).
Using Immutable Arrays
Generating Segments of Primes
The sieve of Eratosthenes' removal of multiples on each segment of odds can be done by actually marking them in an array, instead of manipulating ordered lists, and can be further sped up more than twice by working with odds only:
import Data.Array.Unboxed
primesSA :: [Int]
primesSA = 2 : oddprimes ()
where
oddprimes = (3 :) . sieve 3 [] . oddprimes
sieve x fs (p:ps) = [i*2 + x  (i,True) < assocs a]
++ sieve (p*p) ((p,0) :
[(s, rem (yq) s)  (s,y) < fs]) ps
where
q = (p*px)`div`2
a :: UArray Int Bool
a = accumArray (\ b c > False) True (1,q1)
[(i,())  (s,y) < fs, i < [y+s, y+s+s..q]]
Runs significantly faster than TMWE and with better empirical complexity, of about in producing first few millions of primes, with constant memory footprint.
Calculating Primes Upto a Given Value
Equivalent to Accumulating Array above, running somewhat faster (compiled by GHC with optimizations turned on):
{# OPTIONS_GHC O2 #}
import Data.Array.Unboxed
primesToNA n = 2: [i  i < [3,5..n], ar ! i]
where
ar = f 5 $ accumArray (\ a b > False) True (3,n)
[(i,())  i < [9,15..n]]
f p a  q > n = a
 True = if null x then a2 else f (head x) a2
where q = p*p
a2 :: UArray Int Bool
a2 = a // [(i,False)  i < [q, q+2*p..n]]
x = [i  i < [p+2,p+4..n], a2 ! i]
Calculating Primes in a Given Range
primesFromToA a b = (if a<3 then [2] else [])
++ [i  i < [o,o+2..b], ar ! i]
where
o = max (if even a then a+1 else a) 3  first odd in the segment
r = floor . sqrt $ fromIntegral b + 1
ar = accumArray (\_ _ > False) True (o,b)  initially all True,
[(i,())  p < [3,5..r]
, let q = p*p  flip every multiple of an odd
s = 2*p  to False
(n,x) = quotRem (o  q) s
q2 = if o <= q then q
else q + (n + signum x)*s
, i < [q2,q2+s..b] ]
Although sieving by odds instead of by primes, the array generation is so fast that it is very much feasible and even preferable for quick generation of some short spans of relatively big primes.
Using Mutable Arrays
Using mutable arrays is the fastest but not the most memory efficient way to calculate prime numbers in Haskell.
Using ST Array
This method implements the Sieve of Eratosthenes, similar to how you might do it in C, modified to work on odds only. It is fast, but about linear in memory consumption, allocating one (though apparently packed) sieve array for the whole sequence to process.
import Control.Monad
import Control.Monad.ST
import Data.Array.ST
import Data.Array.Unboxed
sieveUA :: Int > UArray Int Bool
sieveUA top = runSTUArray $ do
let m = (top1) `div` 2
r = floor . sqrt $ fromIntegral top + 1
sieve < newArray (1,m) True  :: ST s (STUArray s Int Bool)
forM_ [1..r `div` 2] $ \i > do
isPrime < readArray sieve i
when isPrime $ do  ((2*i+1)^21)`div`2 == 2*i*(i+1)
forM_ [2*i*(i+1), 2*i*(i+2)+1..m] $ \j > do
writeArray sieve j False
return sieve
primesToUA :: Int > [Int]
primesToUA top = 2 : [i*2+1  (i,True) < assocs $ sieveUA top]
Its empirical time complexity is improving with n (number of primes produced) from above towards . The reference C++ vectorbased implementation exhibits this improvement in empirical time complexity too, from gradually towards , where tested (which might be interpreted as evidence towards the expected quasilinearithmic time complexity).
Bitwise prime sieve with Template Haskell
Count the number of prime below a given 'n'. Shows fast bitwise arrays, and an example of Template Haskell to defeat your enemies.
{# OPTIONS O2 optcO XBangPatterns #}
module Primes (nthPrime) where
import Control.Monad.ST
import Data.Array.ST
import Data.Array.Base
import System
import Control.Monad
import Data.Bits
nthPrime :: Int > Int
nthPrime n = runST (sieve n)
sieve n = do
a < newArray (3,n) True :: ST s (STUArray s Int Bool)
let cutoff = truncate (sqrt $ fromIntegral n) + 1
go a n cutoff 3 1
go !a !m cutoff !n !c
 n >= m = return c
 otherwise = do
e < unsafeRead a n
if e then
if n < cutoff then
let loop !j
 j < m = do
x < unsafeRead a j
when x $ unsafeWrite a j False
loop (j+n)
 otherwise = go a m cutoff (n+2) (c+1)
in loop ( if n < 46340 then n * n else n `shiftL` 1)
else go a m cutoff (n+2) (c+1)
else go a m cutoff (n+2) c
And place in a module:
{# OPTIONS fth #}
import Primes
main = print $( let x = nthPrime 10000000 in [ x ] )
Run as:
$ ghc make o primes Main.hs
$ time ./primes
664579
./primes 0.00s user 0.01s system 228% cpu 0.003 total
Implicit Heap
See Implicit Heap.
Prime Wheels
See Prime Wheels.
Using IntSet for a traditional sieve
See Using IntSet for a traditional sieve.
Testing Primality, and Integer Factorization
See Testing primality:
Oneliners
See primes oneliners.
External links
 A collection of prime generators; the file "ONeillPrimes.hs" contains one of the fastest pureHaskell prime generators; code by Melissa O'Neill.
 WARNING: Don't use the priority queue from older versions of that file for your projects: it's broken and works for primes only by a lucky chance. The revised version of the file fixes the bug, as pointed out by Eugene Kirpichov on February 24, 2009 on the haskellcafe mailing list, and fixed by Bertram Felgenhauer.
 test entries for (some of) the above code variants.
 Speed/memory comparison table for sieve of Eratosthenes variants.
 Empirical orders of growth on Wikipedia.