Haskell Performance Resource
1 Don't use Float
The speed claims may not hold on every platform, because Doubles are not necessarily aligned as the machine prefers. Benchmarks on a range of platforms would settle the question.
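As a starting point for such benchmarking, here is a rough sketch that times the same strict loop over Float and over Double using only the base library. The names sumSquaresFloat, sumSquaresDouble and timeIt are made up for this illustration, and the loop itself is an arbitrary workload, so treat any numbers it prints as indicative at best. A dedicated benchmarking library such as criterion would give more trustworthy measurements.

```haskell
{-# LANGUAGE BangPatterns #-}

import Control.Exception (evaluate)
import System.CPUTime (getCPUTime)
import Text.Printf (printf)

-- | Sum of the squares of 1..n, accumulated strictly in a Float.
-- (Hypothetical workload, chosen only to exercise floating-point arithmetic.)
sumSquaresFloat :: Int -> Float
sumSquaresFloat n = go 0 1
  where
    go !acc i
      | i > n     = acc
      | otherwise = let x = fromIntegral i in go (acc + x * x) (i + 1)

-- | The same loop, accumulated strictly in a Double.
sumSquaresDouble :: Int -> Double
sumSquaresDouble n = go 0 1
  where
    go !acc i
      | i > n     = acc
      | otherwise = let x = fromIntegral i in go (acc + x * x) (i + 1)

-- | Run an action, print the CPU time it took, and return its result.
timeIt :: String -> IO a -> IO a
timeIt label action = do
  start  <- getCPUTime              -- CPU time in picoseconds
  result <- action
  end    <- getCPUTime
  let ms = fromIntegral (end - start) / 1.0e9 :: Double  -- picoseconds to milliseconds
  printf "%-8s %10.3f ms\n" label ms
  return result

main :: IO ()
main = do
  let n = 10000000 :: Int
  -- evaluate forces each result inside the timed region.
  f <- timeIt "Float"  (evaluate (sumSquaresFloat n))
  d <- timeIt "Double" (evaluate (sumSquaresDouble n))
  putStrLn ("results: " ++ show f ++ " (Float), " ++ show d ++ " (Double)")
```

Compile with optimisation (for example -O2) before drawing any conclusions, and run it on each platform of interest. The two printed results will generally differ in their low digits, since the Float accumulator carries fewer bits of precision.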
2 GHC-specific advice
On x86 (and on other platforms with GHC versions before 6.4.2), use the -fexcess-precision flag to speed up floating-point-intensive code; speedups of up to 2x have been seen. The flag keeps more intermediate values in registers instead of memory, at the cost of occasional differences in results caused by unpredictable rounding. See the GHC documentation for more details. Switching on GCC's -ffast-math and -O3 can also help; pass them through GHC as -optc-ffast-math and -optc-O3.
Where available, the -optc-march=pentium4 -optc-mfpmath=sse flags may also help.
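One way to apply these flags to a single module is an OPTIONS_GHC pragma at the top of the file, as in the sketch below. The dot function is a hypothetical kernel included only so that the file compiles on its own, and the -O2 setting is an assumption rather than part of the advice above. The architecture-specific flags (-optc-march=pentium4 -optc-mfpmath=sse) are left out because they only make sense on matching hardware; where they apply they can be appended in the same way.

```haskell
{-# OPTIONS_GHC -O2 -fexcess-precision -optc-O3 -optc-ffast-math #-}
-- The pragma above applies the flags to this module only.
-- The -optc-* options are handed to the C compiler, so they only have an
-- effect when GHC actually compiles code via C.

-- | Hypothetical example kernel: dot product of two lists of Doubles.
dot :: [Double] -> [Double] -> Double
dot xs ys = sum (zipWith (*) xs ys)

main :: IO ()
main = print (dot [1, 2, 3] [4, 5, 6])
```

The same flags can instead be given on the ghc command line, which applies them to every module being compiled.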
Note that the -fexcess-precision flag may make programs behave oddly, e.g. after falling an