GHC/SIMD

From HaskellWiki
[[Category:Pages to be removed]]
== Overview ==
 
 
This page is intended as a place to discuss extending GHC to take advantage of CPU SIMD instructions, including the SSE and AltiVec instruction sets.
 
 
SSE provides 'packed' data types of floats and integers that fit into 128-bit xmm registers.
 
 
The operations on these data types include the standard mathematical operations (Add/Mul/...). There are also additional mathematical operations (reciprocal, reciprocal-square-root) and packed-specific operations such as dot-product, horizontal add/sub/add-sub.
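To make the difference between lane-wise and packed-specific operations concrete, here is a minimal pure-Haskell model of a four-lane float vector. It is only a sketch: the names `FloatX4`, `addX4`, `mulX4`, `haddX4` and `dotX4` are hypothetical and are not GHC primops, and the code runs on ordinary scalar arithmetic rather than on xmm registers. The horizontal add is assumed to follow the lane layout of the SSE3 `HADDPS` instruction.

```haskell
-- Pure-Haskell model of a 4-lane packed float.
-- All names here are hypothetical; nothing below is a GHC primop.
data FloatX4 = FloatX4 !Float !Float !Float !Float
  deriving (Eq, Show)

-- Lane-wise operations: each result lane depends only on the
-- corresponding lanes of the two operands.
addX4, mulX4 :: FloatX4 -> FloatX4 -> FloatX4
addX4 (FloatX4 a b c d) (FloatX4 e f g h) =
  FloatX4 (a + e) (b + f) (c + g) (d + h)
mulX4 (FloatX4 a b c d) (FloatX4 e f g h) =
  FloatX4 (a * e) (b * f) (c * g) (d * h)

-- Horizontal add, modelled on the lane layout of SSE3 HADDPS:
-- adjacent pairs of the first operand fill the low two lanes,
-- adjacent pairs of the second operand fill the high two lanes.
haddX4 :: FloatX4 -> FloatX4 -> FloatX4
haddX4 (FloatX4 a b c d) (FloatX4 e f g h) =
  FloatX4 (a + b) (c + d) (e + f) (g + h)

-- Dot product built from one lane-wise multiply and two horizontal adds.
dotX4 :: FloatX4 -> FloatX4 -> Float
dotX4 x y =
  let p  = mulX4 x y
      h1 = haddX4 p p    -- every lane now holds a sum of two products
      h2 = haddX4 h1 h1  -- every lane now holds the full sum
  in case h2 of
       FloatX4 s _ _ _ -> s
```

A real implementation would map each of these functions to one or a few machine instructions; the model only pins down the intended lane semantics.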
 
 
Also, to support data-streaming operations, there are memory operations that bypass the cache and transfer data directly between memory and the xmm registers.
 
 
Because xmm registers are 128 bits wide and hold both packed integer and packed float types, I suggest that a new `PackedReg` data constructor be added to represent them.
 
 
In terms of an implementation plan:
 
 
* Add new packed data types and 'standard' operations on those types to Cmm and primops.txt.pp
 
 
** Int32Packed4#, ...
 
 
** Width = ... | W32_4 | ...
 
 
* Implement the new types and operations in the backends (C/LLVM/ASM)
 
 
So far this is straightforward.
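The `Width` extension from the plan can be sketched as a small stand-alone data type. This is a simplified model for discussion, not GHC's actual Cmm definition: only the constructor name `W32_4` comes from the plan, and the helpers `widthLanes`, `laneBits` and `totalBits` are hypothetical.

```haskell
-- Simplified stand-alone model of a Cmm-style width type.
-- This is NOT GHC's real definition; only W32_4 is taken from the plan,
-- and all helper names are hypothetical.
data Width
  = W8 | W16 | W32 | W64  -- scalar widths
  | W32_4                 -- four packed 32-bit lanes
  deriving (Eq, Show)

-- Number of lanes held by a value of the given width.
widthLanes :: Width -> Int
widthLanes W32_4 = 4
widthLanes _     = 1

-- Size of a single lane in bits.
laneBits :: Width -> Int
laneBits W8    = 8
laneBits W16   = 16
laneBits W32   = 32
laneBits W64   = 64
laneBits W32_4 = 32

-- Total size in bits; a packed width must fill a 128-bit xmm register.
totalBits :: Width -> Int
totalBits w = widthLanes w * laneBits w
```

Keeping the lane count and the lane size separate is one possible design; it lets a backend ask for either without pattern matching on every packed constructor.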
 
 
* As has been mentioned on the developers' [http://hackage.haskell.org/trac/ghc/ticket/3557 wiki], a 'packed-size' agnostic optimising layer of vector operations would be great. It seems that this could be implemented on top of the CPU-specific primops, without adding further primops.
 
 
* What mechanism should be used for constructing/accessing elements of a packed data type? (LLVM has a <n x type> vector datatype, with extractelement and insertelement instructions for accessing individual elements.)
 
 
* Stream fusion would allow complex operations on 'map'ed and 'zip'ed vectors of Floats, etc., to be optimised so that they make use of the CPU's vector instructions.
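As a starting point for the open questions above, the following sketch shows one possible shape for element construction and access, plus a packed-size agnostic addition built on top of it. Everything here is an assumption for illustration: `FloatX4`, `pack`, `unpack`, `extractLane`, `insertLane`, `addX4` and `addVec` are hypothetical names, the code is a pure-Haskell model rather than real SIMD, and `addVec` is a plain chunk-plus-remainder loop, not stream fusion.

```haskell
-- Pure-Haskell model; all names are hypothetical, none are GHC primops.
data FloatX4 = FloatX4 !Float !Float !Float !Float
  deriving (Eq, Show)

-- Construction from and deconstruction into scalars.
pack :: (Float, Float, Float, Float) -> FloatX4
pack (a, b, c, d) = FloatX4 a b c d

unpack :: FloatX4 -> (Float, Float, Float, Float)
unpack (FloatX4 a b c d) = (a, b, c, d)

-- Read lane i (0..3), in the spirit of LLVM's extractelement.
-- Returns Nothing for an out-of-range index.
extractLane :: Int -> FloatX4 -> Maybe Float
extractLane 0 (FloatX4 a _ _ _) = Just a
extractLane 1 (FloatX4 _ b _ _) = Just b
extractLane 2 (FloatX4 _ _ c _) = Just c
extractLane 3 (FloatX4 _ _ _ d) = Just d
extractLane _ _                 = Nothing

-- Replace lane i (0..3), in the spirit of LLVM's insertelement.
-- Returns Nothing for an out-of-range index.
insertLane :: Int -> Float -> FloatX4 -> Maybe FloatX4
insertLane 0 x (FloatX4 _ b c d) = Just (FloatX4 x b c d)
insertLane 1 x (FloatX4 a _ c d) = Just (FloatX4 a x c d)
insertLane 2 x (FloatX4 a b _ d) = Just (FloatX4 a b x d)
insertLane 3 x (FloatX4 a b c _) = Just (FloatX4 a b c x)
insertLane _ _ _                 = Nothing

-- Lane-wise addition, standing in for a CPU-specific primop.
addX4 :: FloatX4 -> FloatX4 -> FloatX4
addX4 (FloatX4 a b c d) (FloatX4 e f g h) =
  FloatX4 (a + e) (b + f) (c + g) (d + h)

-- Size-agnostic addition: full groups of four elements go through the
-- packed operation; any remainder falls back to scalar addition.
-- The result always equals zipWith (+) xs ys.
addVec :: [Float] -> [Float] -> [Float]
addVec (a:b:c:d:xs) (e:f:g:h:ys) =
  let (p, q, r, s) = unpack (addX4 (pack (a, b, c, d)) (pack (e, f, g, h)))
  in p : q : r : s : addVec xs ys
addVec xs ys = zipWith (+) xs ys
```

The caller of `addVec` never sees the packed size, which is the property the optimising layer would need; a fused implementation would aim to produce the same chunked loop without materialising intermediate lists.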
 

Latest revision as of 03:59, 26 April 2021