Difference between revisions of "Library/ArrayRef"
m (formatting and cleanup) 
m 

Line 1:  Line 1:  
−  Arrays&References library supports Hugs 20032006 and GHC 6.4. 
+  The Arrays&References library supports Hugs 20032006 and GHC 6.4. 
It includes the following features: 
It includes the following features: 

Line 5:  Line 5:  
This substitutes the numerous "fast mutable Ints", "fast mutable 
This substitutes the numerous "fast mutable Ints", "fast mutable 

−  Bools" 
+  Bools" and "fast mutable Ptrs" ghcspecific modules that are used in 
almost any large project. In contrast to them, this library mimics 
almost any large project. In contrast to them, this library mimics 

−  wellknown interface of IORef/STRef: 
+  the wellknown interface of IORef/STRef: 
<haskell> 
<haskell> 

Line 17:  Line 17:  
</haskell> 
</haskell> 

−  Unboxed references for IO monad 
+  Unboxed references for IO monad have the type "IOURef a" and operations 
newIOURef, readIOURef, writeIOURef. Unboxed references for ST monad 
newIOURef, readIOURef, writeIOURef. Unboxed references for ST monad 

−  +  have the type "STURef s a" and operations newSTURef, readSTURef, 

writeSTURef. 
writeSTURef. 

−  Unboxed references can 
+  Unboxed references can only contain values of following types: 
Bool, Char, Int, Int8..Int64, Word, Word8..Word64, Float, Double, 
Bool, Char, Int, Int8..Int64, Word, Word8..Word64, Float, Double, 

Ptr a, FunPtr a, StablePtr a. These types are members of Unboxed class 
Ptr a, FunPtr a, StablePtr a. These types are members of Unboxed class 

and you can implement new instances of this class by converting values 
and you can implement new instances of this class by converting values 

−  of some other type (say, CChar) to values of already supported type. 
+  of some other type (say, CChar) to values of an already supported type. 
Despite all these improvements, operations with unboxed references are 
Despite all these improvements, operations with unboxed references are 

compiled to the same code as for any "fast mutable variables". Moreover, 
compiled to the same code as for any "fast mutable variables". Moreover, 

−  unboxed references are available even for Hugs 
+  unboxed references are available even for Hugs which allows simplified 
−  debugging of programs that 
+  debugging of programs that use them. Please note that unboxed references 
−  always hold computed values, in contrast to boxed references, 
+  always hold computed values, in contrast to boxed references, which can 
−  contain unevaluated 
+  contain unevaluated thunks. 
−  I wish to thank Simon Marlow and especially Oleg Kiselyov who 
+  I wish to thank Simon Marlow and especially Oleg Kiselyov who proposed 
−  idea of these references and 
+  the idea of these references and their implementation (in particular, see 
http://www.haskell.org/pipermail/haskellcafe/2006February/014324.html) 
http://www.haskell.org/pipermail/haskellcafe/2006February/014324.html) 

Line 46:  Line 46:  
Sometimes you need to write code that will be compatible both with IO 
Sometimes you need to write code that will be compatible both with IO 

−  and ST monads, and even better with any monad that 
+  and ST monads, and even better with any monad that is lifted from 
one of these two. This is especially useful for writing library code that 
one of these two. This is especially useful for writing library code that 

should be as generic as possible. Operations for arrays, for example, 
should be as generic as possible. Operations for arrays, for example, 

−  are ready for such 
+  are ready for such a kind of usage  readArray and writeArray can work 
in any monad. But it's not true for references  you need to use 
in any monad. But it's not true for references  you need to use 

readIORef for IO monad, but readSTRef for ST monad, so if you need to 
readIORef for IO monad, but readSTRef for ST monad, so if you need to 

−  implement monadindependent algorithm that uses references, you will 
+  implement a monadindependent algorithm that uses references, you will 
be in trouble. This module solves this problem by providing 
be in trouble. This module solves this problem by providing 

monadindependent operations on boxed and unboxed references. So, the 
monadindependent operations on boxed and unboxed references. So, the 

Line 73:  Line 73:  
This example uses the boxed references, unboxed references can be used 
This example uses the boxed references, unboxed references can be used 

−  in similar way with operations newURef, readURef, writeURef. 
+  in a similar way with operations newURef, readURef, writeURef. 
You can find examples of writing monadindependent routines in 
You can find examples of writing monadindependent routines in 

−  "Examples/Universal.hs". Another 
+  "Examples/Universal.hs". Another library of mine, [[Library/Streams]], widely uses this 
facility to implement common functionality for streams working in 
facility to implement common functionality for streams working in 

different monads. 
different monads. 

Line 84:  Line 84:  
==Syntax sugar for mutable types== 
==Syntax sugar for mutable types== 

−  Haskell 
+  Haskell doesn't support a convenient syntax for using mutable vars, such 
−  as references, arrays and hash tables. The library includes module 
+  as references, arrays and hash tables. The library includes a module 
that partially simplifies their usage. For example: 
that partially simplifies their usage. For example: 

Line 111:  Line 111:  
+= increase 
+= increase 

= decrease 
= decrease 

−  .= apply pure function to contents 
+  .= apply a pure function to the contents 
−  .< apply monadic computation to contents 
+  .< apply a monadic computation to the contents 
val return current value 
val return current value 

−  +  The left part of these operations can be a reference, array or hash element. Code examples: 

reference x += 1 
reference x += 1 

Line 121:  Line 121:  
(hash,key) (hash,"str") .= (*2) 
(hash,key) (hash,"str") .= (*2) 

−  You can also omit extra parentheses when indexing 
+  You can also omit extra parentheses when indexing a two or threedimensional array: 
(arr,0,1) =: 1 
(arr,0,1) =: 1 

is equivalent to 
is equivalent to 

Line 134:  Line 134:  
==Reimplemented Arrays library== 
==Reimplemented Arrays library== 

−  The library also includes modified 
+  The library also includes modified implementations of Data.Array.* 
−  modules. The main benefit of these modifications is simplified internal 
+  modules. The main benefit of these modifications is a simplified internal 
library structure 
library structure 

Line 141:  Line 141:  
 Unboxed arrays now can be used in polymorphic functions, they are defined 
 Unboxed arrays now can be used in polymorphic functions, they are defined 

−  for every element type that belongs to classes Unboxed and HasDefaultValue 
+  for every element type that belongs to the classes Unboxed and HasDefaultValue 
(again, look at http://www.haskell.org/pipermail/haskellcafe/2004July/006400.html). 
(again, look at http://www.haskell.org/pipermail/haskellcafe/2004July/006400.html). 

You can add new instances to these classes 
You can add new instances to these classes 

Line 155:  Line 155:  
is selected (or disabled) on array creation. 
is selected (or disabled) on array creation. 

−   Unboxed arrays of Bool values occupy one byte per element (in old 
+   Unboxed arrays of Bool values occupy one byte per element (in the old 
−  implementation they 
+  implementation they used one bit per element) 
 castUArray/castIOUArray/castSTUArray operations are nonmonadic, 
 castUArray/castIOUArray/castSTUArray operations are nonmonadic, 

Line 162:  Line 162:  
size of elements: UArray (1,2) Word32 > UArray (1,8) Word8 
size of elements: UArray (1,2) Word32 > UArray (1,8) Word8 

−   Some operations 
+   Some operations may be slower in the new implementation, because I'm 
−  not sure that 
+  not sure that I discovered all the clever tricks used in original lib. 
Please test speed and report me about any problems 
Please test speed and report me about any problems 

Line 192:  Line 192:  
</haskell> 
</haskell> 

−  This means that definitions like this will no 
+  This means that definitions like this will no longer work: 
<haskell> 
<haskell> 

Line 200:  Line 200:  
</haskell> 
</haskell> 

−  because `bounds` operation is part of HasBounds class that is no 
+  because the `bounds` operation is part of HasBounds class that is no longer a 
−  base class for MArray. 
+  base class for MArray. What can you do to fix this problem? Either: 
−   Add HasBounds restriction to the operation type: 
+   Add a HasBounds restriction to the operation type: 
<haskell> 
<haskell> 

Line 209:  Line 209:  
</haskell> 
</haskell> 

−  This way, your code will become compatible with both old and new 
+  This way, your code will become compatible with both the old and the new 
versions of Arrays library, but it will work only with "old" mutable 
versions of Arrays library, but it will work only with "old" mutable 

−  arrays and 
+  arrays and won't support dynamic arrays. 
−   Replace 
+   Replace calls to the `bounds` operation with calls to `getBounds`. This 
way, your function will become compatible with any instance of MArray 
way, your function will become compatible with any instance of MArray 

class, including dynamic arrays: 
class, including dynamic arrays: 

Line 222:  Line 222:  
</haskell> 
</haskell> 

−  I should mention that despite MArray now 
+  I should mention that despite MArray now isn't based on the HasBounds 
class, all the old mutable array types (IOArray..StorableArray) still 
class, all the old mutable array types (IOArray..StorableArray) still 

−  +  implement this interface. Only the new dynamic arrays don't implement 

it because this is impossible. So, you can use the `bounds` operation 
it because this is impossible. So, you can use the `bounds` operation 

in code that works with one of "old" array constructors: 
in code that works with one of "old" array constructors: 

Line 237:  Line 237:  
===Using dynamic (resizable) arrays=== 
===Using dynamic (resizable) arrays=== 

−  Just to let you know  current implementation of dynamic arrays is 
+  Just to let you know  the current implementation of dynamic arrays is 
very trivial: it just saves reference (IORef or STRef) to the mutable 
very trivial: it just saves reference (IORef or STRef) to the mutable 

−  array. When dynamic array resized, new mutable array is allocated and 
+  array. When a dynamic array is resized, a new mutable array is allocated and the 
−  contents copied. New elements are filled with the value that was 
+  contents is copied. New elements are filled with the value that was 
supported as default if array was created with the newArray 
supported as default if array was created with the newArray 

−  or newDynamicArray operation. If dynamic array 
+  or newDynamicArray operation. If a dynamic array is created with 
−  newArray_ or newDynamicArray_ 
+  newArray_ or newDynamicArray_, then new elements will be left 
undefined. 
undefined. 

−  +  A dynamic array can be resized explicitly by the resizeDynamicArray operation: 

<haskell> 
<haskell> 

Line 253:  Line 253:  
where (l,u) are new array bounds. If the dynamic array was created 
where (l,u) are new array bounds. If the dynamic array was created 

−  by newArray or newArray_ operation, it is the only way to resize it  
+  by a newArray or newArray_ operation, it is the only way to resize it  
−  attempts to write beyond current bounds will raise exception: 
+  attempts to write beyond current bounds will raise an exception: 
<haskell> 
<haskell> 

arr < newArray (0,1) 99 :: IO (DynamicIOArray Int Int) 
arr < newArray (0,1) 99 :: IO (DynamicIOArray Int Int) 

resizeDynamicArray arr (0,0) 
resizeDynamicArray arr (0,0) 

−  writeArray arr 1 1  this operation raises 
+  writeArray arr 1 1  this operation raises an exception 
</haskell> 
</haskell> 

−  To create array that will be automatically resized on attempt to write 
+  To create an array that will be automatically resized on attempt to write 
beyond current bounds, you should use newDynamicArray or 
beyond current bounds, you should use newDynamicArray or 

−  newDynamicArray_ operation (former 
+  newDynamicArray_ operation (the former initializes an array with a given value, while the latter leaves the array uninitialized). Their first argument 
−  +  determines the array expansion policy: 

−  determine array expansion policy: 

<haskell> 
<haskell> 

Line 272:  Line 272:  
</haskell> 
</haskell> 

−  This array will grow at least two times each time 
+  This array will grow at least two times each time automatic 
−  expansion occurs, 
+  expansion occurs, which is determined by using the `growTwoTimes` 
−  parameter. This parameter is just the ordinary function 
+  parameter. This parameter is just the ordinary function that has the 
following type: 
following type: 

Line 282:  Line 282:  
This function accepts old array bounds and offending index and 
This function accepts old array bounds and offending index and 

−  returns new array bounds. You can write new functions for expansion 
+  returns new array bounds. You can write new functions for an expansion 
−  +  policies yourself, or use one of premastered ones: 

growTwoTimes  expand array at least two times 
growTwoTimes  expand array at least two times 

growMinimally  minimal growth that ensures inclusion of new index 
growMinimally  minimal growth that ensures inclusion of new index 

−  noGrow  disable automatic growth. This policy used for arrays created by newArray or newArray_ 
+  noGrow  disable automatic growth. This policy is used for arrays created by newArray or newArray_ 
−  Please note that not every array can work with 
+  Please note that not every array can work with every expansion policy 
−  and 
+  and that is why I supported freedom of selection of this policy. Only 
−  noGrow policy is compatible with every index type. growMinimally 
+  noGrow policy is compatible with every index type. The growMinimally 
policy by it's type is compatible with any index, but it will not work 
policy by it's type is compatible with any index, but it will not work 

for partially ordered indexes, in particular for multidimensional 
for partially ordered indexes, in particular for multidimensional 

arrays. Imagine, for example, array with bounds (0,0)..(9,9). When you 
arrays. Imagine, for example, array with bounds (0,0)..(9,9). When you 

−  +  try to write to index (15,5), this expansion policy function will 

−  be unable to determine what new bounds should be (0,0)..(15,9). So you 
+  be unable to determine what the new bounds should be (0,0)..(15,9). So you 
−  anyway should provide custom expansion policy function for partially 
+  anyway should provide a custom expansion policy function for partially 
ordered indexes. At last, growTwoTimes policy is compatible only with 
ordered indexes. At last, growTwoTimes policy is compatible only with 

−  indexes belonging to class Num, but it is most useful policy 
+  indexes belonging to class Num, but it is the most useful policy of all, because it ensures that the program will not spend all it's 
−  supported because it ensures that the program will not spend all it's 

time expanding the array. On the other side, you can provide your own 
time expanding the array. On the other side, you can provide your own 

−  policy function that will, for example, expand array only 1.5 times. 
+  policy function that will, for example, expand an array only 1.5 times. 
Dynamic array supports the same MArray and HasMutableBounds interfaces 
Dynamic array supports the same MArray and HasMutableBounds interfaces 

−  as other mutable arrays, but they don't support HasBounds interface. 
+  as other mutable arrays, but they don't support the HasBounds interface. 
And now about types of dynamic arrays. These types reflect all the 
And now about types of dynamic arrays. These types reflect all the 

−  types you can use for mutable arrays, and 
+  types you can use for mutable arrays, and include DynamicIOArray, 
−  DynamicIOUArray, DynamicSTArray, DynamicSTUArray, 
+  DynamicIOUArray, DynamicSTArray, DynamicSTUArray, which have the 
−  same parameters as corresponding arrays without "Dynamic" prefix. 
+  same parameters as corresponding arrays without the "Dynamic" prefix. 
Some examples are: 
Some examples are: 

Revision as of 21:32, 6 June 2006
The Arrays&References library supports Hugs 20032006 and GHC 6.4. It includes the following features:
Contents
Unboxed references
This substitutes the numerous "fast mutable Ints", "fast mutable Bools" and "fast mutable Ptrs" ghcspecific modules that are used in almost any large project. In contrast to them, this library mimics the wellknown interface of IORef/STRef:
import Data.Ref
main = do x < newIOURef (0::Int)
writeIOURef x 1
a < readIOURef x
print a
Unboxed references for IO monad have the type "IOURef a" and operations newIOURef, readIOURef, writeIOURef. Unboxed references for ST monad have the type "STURef s a" and operations newSTURef, readSTURef, writeSTURef.
Unboxed references can only contain values of following types: Bool, Char, Int, Int8..Int64, Word, Word8..Word64, Float, Double, Ptr a, FunPtr a, StablePtr a. These types are members of Unboxed class and you can implement new instances of this class by converting values of some other type (say, CChar) to values of an already supported type.
Despite all these improvements, operations with unboxed references are compiled to the same code as for any "fast mutable variables". Moreover, unboxed references are available even for Hugs which allows simplified debugging of programs that use them. Please note that unboxed references always hold computed values, in contrast to boxed references, which can contain unevaluated thunks.
I wish to thank Simon Marlow and especially Oleg Kiselyov who proposed the idea of these references and their implementation (in particular, see http://www.haskell.org/pipermail/haskellcafe/2006February/014324.html)
You can find examples of using unboxed references in "Examples/URef.hs"
Monadindependent references
Sometimes you need to write code that will be compatible both with IO and ST monads, and even better with any monad that is lifted from one of these two. This is especially useful for writing library code that should be as generic as possible. Operations for arrays, for example, are ready for such a kind of usage  readArray and writeArray can work in any monad. But it's not true for references  you need to use readIORef for IO monad, but readSTRef for ST monad, so if you need to implement a monadindependent algorithm that uses references, you will be in trouble. This module solves this problem by providing monadindependent operations on boxed and unboxed references. So, the following routine:
test_Ref = do x < newRef (0::Int)
writeRef x 1
readRef x
can be executed both in IO and ST monads:
main = do a < test_Ref
print a
let b = runST test_Ref
print b
This example uses the boxed references, unboxed references can be used in a similar way with operations newURef, readURef, writeURef.
You can find examples of writing monadindependent routines in "Examples/Universal.hs". Another library of mine, Library/Streams, widely uses this facility to implement common functionality for streams working in different monads.
Syntax sugar for mutable types
Haskell doesn't support a convenient syntax for using mutable vars, such as references, arrays and hash tables. The library includes a module that partially simplifies their usage. For example:
main = do  syntax sugar used for reference:
x < ref (0::Int)
x += 1
x .= (*2)
a < val x
print a
 syntax sugar used for array:
arr < newArray (0,9) 0 :: IO Array Int Int
(arr,0) =: 1
b < val (arr,0)
print b
Basically, the module supports syntax sugar for using the following data types: all types of references, arrays and hash tables. Also, it includes two operations to creating references  ref (=newRef) and uref (=newURef). Other operations include
=: assign += increase = decrease .= apply a pure function to the contents .< apply a monadic computation to the contents val return current value
The left part of these operations can be a reference, array or hash element. Code examples:
reference x += 1 (array,index) (arr,0) =: 1 (hash,key) (hash,"str") .= (*2)
You can also omit extra parentheses when indexing a two or threedimensional array:
(arr,0,1) =: 1
is equivalent to
(arr,(0,1)) =: 1
Let's pay attention that this module supports arrays implementation included in the library, not standard Data.Array.* modules. Module "Examples/SyntaxSugar.hs" should contain further examples.
Reimplemented Arrays library
The library also includes modified implementations of Data.Array.* modules. The main benefit of these modifications is a simplified internal library structure
Nevertheless, it also includes a few uservisible changes:
 Unboxed arrays now can be used in polymorphic functions, they are defined for every element type that belongs to the classes Unboxed and HasDefaultValue (again, look at http://www.haskell.org/pipermail/haskellcafe/2004July/006400.html). You can add new instances to these classes
 MArray class now supports arrays with dynamic bounds. It includes monadic operation getBounds, and if you will change your code to use this operation with mutable arrays instead of `bounds`, your code also will be ready to work with dynamic (resizable) arrays
 Support for dynamic (resizable) arrays included. Their bounds can be changed either explicitly (by `resizeDynamicArray`) or implicitly (by writing to nonexisting position). Policy of automatic array expansion is selected (or disabled) on array creation.
 Unboxed arrays of Bool values occupy one byte per element (in the old implementation they used one bit per element)
 castUArray/castIOUArray/castSTUArray operations are nonmonadic, require "Enum ix" and recalculates upper bound of array according to size of elements: UArray (1,2) Word32 > UArray (1,8) Word8
 Some operations may be slower in the new implementation, because I'm not sure that I discovered all the clever tricks used in original lib. Please test speed and report me about any problems
In other aspects, using of new arrays are equivalent to the old ones. Just change "Array" to the "ArrayBZ" in your import statements and enjoy! :) Directory "Examples/Array" contains demonstrations of using each array type
Changes in MArray usage
Old Arrays library contained the following definitions:
class HasBounds a where
bounds :: Ix i => a i e > (i,i)
class (Monad m, HasBounds a) => MArray a e m where
...
In new library, MArray class defined as:
class (Monad m) => HasMutableBounds a m where
getBounds :: Ix i => a i e > m (i,i)
class (Monad m, HasMutableBounds a m) => MArray a e m where
...
This means that definitions like this will no longer work:
arrayHead :: (MArray a e m, Ix i) => a i e > m e
arrayHead marr = case bounds marr of
(l,_) > readArray marr l
because the `bounds` operation is part of HasBounds class that is no longer a base class for MArray. What can you do to fix this problem? Either:
 Add a HasBounds restriction to the operation type:
arrayHead :: (MArray a e m, HasBounds a, Ix i) => a i e > m e
This way, your code will become compatible with both the old and the new versions of Arrays library, but it will work only with "old" mutable arrays and won't support dynamic arrays.
 Replace calls to the `bounds` operation with calls to `getBounds`. This way, your function will become compatible with any instance of MArray class, including dynamic arrays:
arrayHead marr = do (l,_) < getBounds marr
readArray marr l
I should mention that despite MArray now isn't based on the HasBounds class, all the old mutable array types (IOArray..StorableArray) still implement this interface. Only the new dynamic arrays don't implement it because this is impossible. So, you can use the `bounds` operation in code that works with one of "old" array constructors:
arrayHead :: IOArray i e > IO e
arrayHead marr = case bounds marr of
(l,_) > readArray marr l
Using dynamic (resizable) arrays
Just to let you know  the current implementation of dynamic arrays is very trivial: it just saves reference (IORef or STRef) to the mutable array. When a dynamic array is resized, a new mutable array is allocated and the contents is copied. New elements are filled with the value that was supported as default if array was created with the newArray or newDynamicArray operation. If a dynamic array is created with newArray_ or newDynamicArray_, then new elements will be left undefined.
A dynamic array can be resized explicitly by the resizeDynamicArray operation:
resizeDynamicArray array (l,u)
where (l,u) are new array bounds. If the dynamic array was created by a newArray or newArray_ operation, it is the only way to resize it  attempts to write beyond current bounds will raise an exception:
arr < newArray (0,1) 99 :: IO (DynamicIOArray Int Int)
resizeDynamicArray arr (0,0)
writeArray arr 1 1  this operation raises an exception
To create an array that will be automatically resized on attempt to write
beyond current bounds, you should use newDynamicArray or
newDynamicArray_ operation (the former initializes an array with a given value, while the latter leaves the array uninitialized). Their first argument
determines the array expansion policy:
arr < newDynamicArray_ growTwoTimes (0,1) :: IO (DynamicIOArray Int Int)
This array will grow at least two times each time automatic expansion occurs, which is determined by using the `growTwoTimes` parameter. This parameter is just the ordinary function that has the following type:
type GrowBoundsF i = (i,i) > i > (i,i)
This function accepts old array bounds and offending index and returns new array bounds. You can write new functions for an expansion policies yourself, or use one of premastered ones:
growTwoTimes  expand array at least two times growMinimally  minimal growth that ensures inclusion of new index noGrow  disable automatic growth. This policy is used for arrays created by newArray or newArray_
Please note that not every array can work with every expansion policy and that is why I supported freedom of selection of this policy. Only noGrow policy is compatible with every index type. The growMinimally policy by it's type is compatible with any index, but it will not work for partially ordered indexes, in particular for multidimensional arrays. Imagine, for example, array with bounds (0,0)..(9,9). When you try to write to index (15,5), this expansion policy function will be unable to determine what the new bounds should be (0,0)..(15,9). So you anyway should provide a custom expansion policy function for partially ordered indexes. At last, growTwoTimes policy is compatible only with indexes belonging to class Num, but it is the most useful policy of all, because it ensures that the program will not spend all it's time expanding the array. On the other side, you can provide your own policy function that will, for example, expand an array only 1.5 times.
Dynamic array supports the same MArray and HasMutableBounds interfaces as other mutable arrays, but they don't support the HasBounds interface.
And now about types of dynamic arrays. These types reflect all the
types you can use for mutable arrays, and include DynamicIOArray,
DynamicIOUArray, DynamicSTArray, DynamicSTUArray, which have the
same parameters as corresponding arrays without the "Dynamic" prefix.
Some examples are:
DynamicIOArray Int Double
DynamicSTUArray s (Int,Int) Bool
You can also create dynamic arrays from other mutable array types working in IO monad:
DynamicIO StorableArray Int Double
or ST monad:
DynamicST s (STUArray s) (Int,Int) Bool
or any other monad (ask me if you need this). Btw, implementation of dynamic arrays use the monadindependent references class mentioned above.
See "Examples/Array/Dynamic.hs" for further examples on using these arrays.