IO, partible-style
It is interesting that novices in lazy functional programming in general expect that there is some direct (side-effecting) I/O using a function call.
A Partial Rehabilitation of Side-Effecting I/O:, Manfred Schmidt-Schauß.
...like how I/O works in Standard ML?
val echoML : unit -> unit fun echoML () = let val c = getcML () in if c = #"\n" then () else let val _ = putcML c in echoML () end end
Alright, now look at this:
echo :: OI -> ()
echo u = let !(u1:u2:u3:_) = parts u
!c = primGetChar u1 in
if c == '\n' then
()
else
let !_ = primPutChar c u2 in
echo u3
Are you interested?
Wadler's echo
Those two versions of that small program are based on the running example from Philip Wadler's How to Declare an Imperative. If we compare the two:
val echoML : unit -> unit fun echoML () = let val c = getcML () in if c = #"\n" then () else let val _ = putcML c in echoML () end end |
echo :: OI -> ()
echo u = let !(u1:u2:u3:_) = parts u
!c = primGetChar u1 in
if c == '\n' then
()
else
let !_ = primPutChar c u2 in
echo u3
--
|
...we can see just how similar the two versions of echo
really are: apart from the obvious changes of syntax and names:
- the Haskell version replaces the
unit
arguments forechoML
andgetcML
, - and provides an extra argument for
putcML
, - with the replacement parameter
u
being used to define the new local bindingsu1
,u2
andu3
as the result of a call toparts
.
So there you have it: for the price of some extra calls and bindings, we can have SML-style I/O in Haskell. Furthermore, as the prevailing definition for SML has been available since 1997, there should be plenty of I/O tutorials to choose from...
Resisting temptation
If you're now thinking about using something like:
primitive might_get_Char :: () -> Char primitive might_put_Char :: Char -> ()
to acheive a more direct translation...don't - it might for this small program, but it just isn't reliable in general. Why?
- Short answer: unlike SML, Haskell's nonstrict evaluation means expressions should be referentially transparent.
- Long answer: read section 2.2 (pages 4-5) of Wadler's paper.
- Longer answer: read Lennart Augustsson's More points for lazy evaluation.
- Extended answer: read John Hughes's Why Functional Programming Matters.
But, if after all that, you're still not convinced...maybe Standard ML really is the programming language for you :-)
OI
: what is it?
OI
is an abstract partible type:
data OI a
primitive primPartOI :: OI -> (OI, OI)
instance Partible OI where
part = primPartOI
Like primPartOI
, most other primitives for the OI
type also accept an OI
-value as their last (or only) argument e.g:
primitive primGetChar :: OI -> Char
primitive primPutChar :: Char -> OI -> ()
⋮
For consistency, the last argument of a OI
-based definition should also be an OI
-value:
interact :: (String -> String) -> OI -> ()
interact d u = let !(u1, u2) = part u in
putStr (d $ getContents u1) u2
putStr :: String -> OI -> ()
putStr s u = foldr (\(!_) -> id) () $ zipWith primPutChar s $ parts u
getContents :: OI -> String
getContents u = case map getChar (parts u) of
l@(!c:_) -> l
l -> l
IO
, using OI
So how do we get from IO
to OI
?
- Haskell is now used far and wide, so good ol' "search and replace" is a non-starter!
- There are some who still prefer C, and there are others who are content with
IO
- convincing them to switch will probably take a lot more than a solitary page on some wiki!
Fortunately, it's quite easy to define IO
with OI
:
type IO a = OI -> a
...provided you followed that hint about putting the OI
argument last:
interact :: (String -> String) -> IO ()
putStr :: String -> IO ()
getContents :: IO String
primitive primGetChar :: IO Char
primitive primPutChar :: Char -> IO ()
⋮
Of course, a realistic implementation of IO
in Haskell requires a certain interface:
unitIO :: a -> IO a
unitIO x = \ u -> let !_ = part u in x
bindIO :: IO a -> (a -> IO b) -> IO b
bindIO m k = \ u -> let !(u1, u2) = part u in
let !x = m u1 in
let !y = k x u2 in
y
You didn't put the OI
argument last? Oh well, there's always the applicative interface...
Some annoyances
- Extra parameters and arguments - As noted by Sigbjørn Finne and Simon Peyton Jones in Programming Reactive Systems in Haskell, passing around all those
OI
-values correctly can be tedious for large definitions.
- Polymorphic references - It's been known for a very long time in the SML community that naive declarations for operations using mutable references breaks type safety:
primitive newPolyRef :: a -> OI -> PolyRef a primitive readPolyRef :: PolyRef a -> OI -> a primitive writePolyRef :: PolyRef a -> a -> OI -> () kah_BOOM u = let … !vehicle = newPolyRef undefined u1 !_ = writePolyRef ("0" :: [Char]) u2 !crash = readPolyRef vehicle u3 burn = 1 :: Int in crash + burn
- SML's solution is to make all mutable references monomorphic through the use of dedicated syntax:
let val r = ref (…) ⋮
- One alternative for Haskell would be to extend type signatures to support monomorphic type-variables:
primitive newIORef :: monomo a . a -> OI -> IORef a primitive readIORef :: monomo a . IORef a -> OI -> a primitive writeIORef :: monomo a . IORef a -> a -> OI -> () {- would be rejected by the extended type system: kah_BOOM u = let !(u1:u2:u3:_) = parts u !vehicle = newIORef undefined u1 -- vehicle :: monomo a . IORef a !_ = writeIORef ("0" :: [Char]) u2 -- vehicle :: IORef [Char] !crash = readIORef vehicle u3 -- vehicle :: IORef [Char] ≠ IORef Int burn = 1 :: Int in crash + burn -}
- In standard Haskell, one of the few places this already occurs (albeit implicitly) is the parameters of a function:
{- will be rejected by the standard Haskell type system ker_plunk f = (f True, f 'b') -}
One solution
- Extra parameters and arguments - What is needed is a succinct interface to "hide the plumbing" used to pass around
OI
-values. Here's one we prepared earlier:
unitIO :: a -> IO a unitIO x = \ u -> let !_ = part u in x bindIO :: IO a -> (a -> IO b) -> IO b bindIO m k = \ u -> let !(u1, u2) = part u in let !x = m u1 in let !y = k x u2 in y
- Polymorphic references - we now make
IO
into an abstract data type:
module Abstract.IO ( Monad (..), getChar, putChar, … newIORef, readIORef, writeIORef, ⋮ ) where instance Monad ((->) OI) where return = unitIO (>>=) = bindIO getChar :: IO Char getChar = primGetChar putChar :: Char -> IO () putChar = primPutChar newIORef :: a -> IO (IORef a) newIORef = primNewIORef readIORef :: IORef a -> IO a readIORef = primReadIORef writeIORef :: IORef a -> a -> IO () writeIORef = primWriteIORef -- these are now local, private entities -- type IO a = OI -> a unitIO :: a -> IO a unitIO x = \ u -> let !_ = part u in x bindIO :: IO a -> (a -> IO b) -> IO b bindIO m k = \ u -> let !(u1, u2) = part u in let !x = m u1 in let !y = k x u2 in y data OI a primitive primPartOI :: OI -> (OI, OI) primitive primGetChar :: OI -> Char primitive primPutChar :: Char -> OI -> () ⋮ data IORef primitive primNewIORef :: a -> OI -> IORef a primitive primReadIORef :: IORef a -> OI -> a primitive primWriteIORef :: IORef a -> a -> OI -> () ⋮
- With the
IO
type now made abstract, the only way to useIO
-values is by using:
- the visible
IO
operations:getChar
,putChar
, etc. - the monadic interface -
Monad(return, (>>=), …)
(or via Haskell'sdo
-notation).
- the visible
- So how does making
IO
an ADT prevent polymophic references? It's all to do with the type of(>>=)
when used withIO
-actions:
(>>=) :: IO a -> (a -> IO b) -> IO b
- in particular, the type of the second argument:
(a -> IO b)
- ...it's a function, so the value it receives will be rendered monomorphic in the function's result (of type
IO b
).
- As
(>>=)
is now the onlyIO
operation which can retrieve a result from anIO
-action, mutable references (IORef …
) simply cannot be used polymorphically.
GHC's solution
newtype IO a = IO (State# RealWorld -> (# State# RealWorld, a #))
...you may have noticed that we've already made liberal use of one Haskell extension - bang-patterns - and it would be useful to stay as close as possible to standard Haskell, so we'll simplify matters:
newtype IO a = IO (IOState -> (IOState, a)) -- unboxed-tuple replaced by standard one
type IOState = State# RealWorld
Now to make the changes:
- to the type -
IOState
uses anOI
-value:
newtype IOState = IOS OI
- to the I/O-specific operations - each one will use the
OI
-value in the initial state to provide two newOI
-values: one to make up the final state; the other being used by theOI
-primitive:
getChar :: IO Char getChar = IO $ \(IOS u) -> let !(u1, u2) = part u !c = primGetChar u1 in (IOS u2, c) putChar :: Char -> IO () putChar c = IO $ \(IOS u) -> let !(u1, u2) = part u !t = primPutChar c u1 in (IOS u2, t) -- etc.
- to the overloaded operations - you've probably seen it all before:
instance Monad IO where return x = IO $ \(!s) -> (s, x) IO m >>= k = IO $ \(!s) -> let !(s', x) = m s !(IO w) = k x in w s'
- (...if you haven't: it's ye ol'
pass-the-planetstate-passing technique.)
One aspect which doesn't change is IO
and its operations being abstract. In fact, the need is even more pressing: in addition to preventing the misuse of certain OI
-operations, being an abstract data type prevents IOState
-values from being erroneously reused.
Conclusions
- Why is Haskell I/O monadic - to avoid having to use extra arguments and parameters everywhere.
- Why is Haskell I/O abstract - to ensure I/O works as intended, by preventing the misuse of internal data.
- Why is Haskell I/O unusual - because of Haskell's nonstrict evaluation and thus its focus on referential transparency, contrary to most other programming languages.
Further reading
If you've managed to get all the way to here, State in Haskell by John Launchbury and Simon Peyton Jones is also worth reading, if you're interested in how GHC eventually arrived at its current definition of IO
.