m (Added question)
m (→Use unboxed arrays (UArray, IOUArray): making code lines shorter to fit to screen; longer lines cause scroll bars to appear with bars taking up text space instead of appearing next to it! a BUG!)
|(9 intermediate revisions by 5 users not shown)|
Latest revision as of 10:37, 23 June 2011
|Haskell Performance Resource|
 1 General Array techniques
- Remember that ordinary arrays are monolithic, and individual elements are not mutable. In particular, the (//) operator copies the entire array, so it is rarely what you want. (Data.Array.Diff provides a variant of arrays with O(1) (//), but that library has performance problems of its own).
- Monolithic arrays are by no means useless! Powerful array-construction facilities like accumArray can often eliminate the need for truly mutable arrays.
- If you really need mutable arrays for speed, then if possible use the ST variant, so that the stateful part of your program can be encapsulated (Data.Array.ST).
- Avoid redundant bounds checks by using Data.Array.Base.unsafeAt, Data.Array.Base.unsafeRead, Data.Array.Base.unsafeWrite (these are currently undocumented, unfortunately). This can make a large difference.
- See also Arrays, a thorough exploration of the various array types available in most Haskell compilers.
 2 GHC-specific techniques
 2.1 Use unboxed arrays (UArray, IOUArray)
GHC supports arrays of unboxed elements, for several basic arithmetic element types including Int and Char: see the Data.Array.Unboxed library for details. Unboxed arrays support the same programmer interface as ordinary boxed arrays, so converting your code is easy. Using unboxed arrays will be a win in terms of both time and space.
There are also mutable unboxed arrays: IOUArray and STUArray (see Data.Array.IO and Data.Array.ST respectively). Using unboxed mutable arrays is often a good way to translate imperative algorithms into Haskell with similar performance.
Extra strictness on functions taking unboxed arrays as arguments can help, as GHC will then unbox the array argument to a primitive type. For example, the following language shootout program for the nsieve benchmark, combining unboxed arrays, unsafe writes and strictness, runs faster than the equivalent hand-optimised C program:
import Data.Array.IO import Data.Array.Base import Data.Bits import System import Text.Printf
main = (\n -> mapM_ (sieve.(10000 *).shiftL 1) [n,n-1,n-2]) . read . head =<< getArgs
sieve m = do c <- newArray (2,m) True >>= \a -> loop a m 2 0 printf "Primes up to %8d %8d\n" (m::Int) (c::Int) :: IO ()
loop arr m n c | arr `seq` m `seq` n `seq` c `seq` False = undefined loop arr m n c = if n == m then return c else do el <- unsafeRead arr n if el then let loop' j | j > m = loop arr m (n+1) (c + 1) | otherwise = unsafeWrite arr j False >> loop' (j+n) in loop' (n+n) else loop (arr :: IOUArray Int Bool) m (n+1) c
It also uses bitwise arithmetic, which can further improve some code. Another note, for ghc 6.4.1 at least, an explicit loop seems to be better than mapM_.