Difference between revisions of "Talk:99 questions/31 to 41"

From HaskellWiki
Jump to navigation Jump to search
m
Line 28: Line 28:
 
primefactors :: Integer -> [Integer]
 
primefactors :: Integer -> [Integer]
 
primefactors n | n < 2 = []
 
primefactors n | n < 2 = []
| even n = o2 ++ (primefactors o1)
+
| even n = o2 ++ (primefactors o1)
| otherwise = if z /=1 then revres ++ [z] else revres
+
| otherwise = if z /=1 then revres ++ [z] else revres
where
+
where
res = divisions (DivIter { dividend = o1,
+
res = divisions (DivIter { dividend = o1,
divisor = 3,
+
divisor = 3,
bound = intsqrt(o1),
+
bound = intsqrt(o1),
result = o2})
+
result = o2})
revres = reverse (result res)
+
revres = reverse (result res)
z = dividend res
+
z = dividend res
--indeed gets 1 in certain situations
+
--indeed gets 1 in certain situations
(o1, o2) = twosect (n, [])
+
(o1, o2) = twosect (n, [])
   
 
twosect :: (Integer, [Integer]) -> (Integer, [Integer])
 
twosect :: (Integer, [Integer]) -> (Integer, [Integer])

Revision as of 23:47, 16 December 2007

Hi,

there are several problems with the solution to 35.

First: The signature of primefactors is Integer -> [Integer]. That means, the function is designed to work on integers larger than 2^32. (a reasonable task now, when ghc outperforms Fortran). But then we cannot make a list of the primes up to sqrt n. And there are doubts anyway concerning the price for building such a list first.

Second:

a) If found a 'matching' divisor, we must factor it out completely, that means divide by it, as often as possible. Then that divisor is done completely. Thus we must go through the divisors only once for the whole process (not start by dividing by 2 again after every "successful" division). This reduces the number of computations at least to the half.

b) The division by a current divisor is done twice in the solution, once for getting the quotient, a second time for getting the remainder. We must use divMod. This reduces the number of the expensive divisions to the half again.

c) We first deal with the (possibly multiple) factor 2, then go through the odd numbers as divisors only. Another reduction by 50%.

d) n*n is computed at each step. We should compute a bound ("intsqrt" below) only once for each "successful" division.

All in all these proposals should give improvement in performance in a magnitude of 1000% to the solution in the text, with code still naive and easily readable for beginners in Haskell (as me):


data DivIter = 
    DivIter { dividend :: Integer, 
              divisor  :: Integer,
              bound    :: Integer,
              result   :: [Integer] }

intsqrt m = floor (sqrt $ fromInteger m)

primefactors :: Integer -> [Integer]
primefactors n | n < 2 = []
               | even n = o2 ++ (primefactors o1)
               | otherwise = if z /=1 then revres ++ [z] else revres
                    where 
                        res = divisions (DivIter { dividend = o1,
                                                   divisor = 3,  
                                                   bound = intsqrt(o1),
                                                   result = o2})
                        revres = reverse (result res)
                        z = dividend res  
                               --indeed gets 1 in certain situations
                        (o1, o2) = twosect (n, [])

twosect :: (Integer, [Integer]) -> (Integer, [Integer])
twosect m  | odd (fst m) = m
           | even (fst m) = twosect ( div (fst m) 2,  2:(snd m) )

divstep :: DivIter -> DivIter 
divstep y = if ximod>0 then 
            y { divisor=(divisor y) +2 } 
        else
            y {dividend = xidiv, 
               bound = intsqrt xidiv, 
               result = (divisor y):(result y) }
    where (xidiv, ximod) = divMod (dividend y) (divisor y)

divisions :: DivIter -> DivIter
divisions y = if divisor y <= bound y then divisions (divstep y) 
              else y


This computes

  primefactors (2^90+1) = [5,5,13,37,41,61,109,181,1321,54001,29247661]

in a moment, as an equivalent python script does, but fails to beat that script for 2^120+1 - due to the recursion in "divisions". Compiled with this versions for "divisions"

divisions = do
    y <- get
    if divisor y <= bound y then do
        put ( divstep y )
        divisions
        else 
            return y

called from primefactors by

res = execState divisions (DivIter { dividend = o1, 
                                     divisor = 3, 
                                     bound = intsqrt(o1),
                                     result = o2 })

it computes

   primefactors (2^120+1) = [97,257,673,394783681,4278255361,46908728641]

in about 30 minutes (with a constant footprint of 2MB) on my system, which still is not very good compared with the two hours python needs. Indeed the function "divisions" is separated from "divstep" here not for readability, but for possible replacement.