Difference between revisions of "IO, partible-style"

From HaskellWiki
Jump to navigation Jump to search
m (Extra reference added)
(Page superseded by "IO in action")
Tag: New redirect
 
Line 1: Line 1:
  +
#redirect [[IO in action]]
<div style="border-left:1px solid lightgray; padding: 1em" alt="blockquote">
 
The difficulty [with new or unfamiliar ways of looking at things] is considerably greater in the case of practical programmers for whom an abstract concept [...] has little reality until they can clothe it with a representation and so understand what it is that they are dealing with.
 
   
  +
[[Category:Pages to be removed]]
<tt>[https://www.cs.cmu.edu/~crary/819-f09/Strachey67.pdf Fundamental Concepts in Programming Languages], Christopher Strachey.</tt>
 
</div>
 
<span> </span>
 
<div style="border-left:1px solid lightgray; padding: 1em" alt="blockquote">
 
It is interesting that novices in lazy functional programming in general expect that there is some direct (side-effecting) I/O using a function call.
 
 
<tt>[https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.55.1076&rep=rep1&type=pdf A Partial Rehabilitation of Side-Effecting I/O:], Manfred Schmidt-Schauß.</tt>
 
</div>
 
 
...like how I/O works in [https://www.smlnj.org/sml.html Standard ML]?
 
 
<pre>
 
val echoML : unit -> unit
 
fun echoML () = let val c = getcML () in
 
if c = #"\n" then
 
()
 
else
 
let val _ = putcML c in
 
echoML ()
 
end
 
end
 
</pre>
 
 
Alright, now look at this:
 
 
<haskell>
 
echo :: OI -> ()
 
echo u = let !(u1:u2:u3:_) = parts u
 
!c = primGetChar u1 in
 
if c == '\n' then
 
()
 
else
 
let !_ = primPutChar c u2 in
 
echo u3
 
</haskell>
 
 
So how is this possible?
 
<br>
 
<br>
 
__TOC__
 
<sub> </sub>
 
----
 
=== <u>Wadler's </u><code>echo</code> ===
 
 
Those two versions of that small program are based on the running example from Philip Wadler's [https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.3579&rep=rep1&type=pdf How to Declare an Imperative]. If we compare the two:
 
 
{|
 
|<pre>
 
val echoML : unit -> unit
 
fun echoML () =
 
let val c = getcML () in
 
if c = #"\n" then
 
()
 
else
 
let val _ = putcML c in
 
echoML ()
 
end
 
end
 
</pre>
 
|<haskell>
 
echo :: OI -> ()
 
echo u = let !(u1:u2:u3:_) = parts u
 
!c = primGetChar u1 in
 
if c == '\n' then
 
()
 
else
 
let !_ = primPutChar c u2 in
 
echo u3
 
 
--
 
</haskell>
 
|}
 
 
...we can see just how similar the two versions of <code>echo</code> really are: apart from the obvious changes of syntax and names:
 
* the Haskell version replaces the <code>unit</code> arguments for <code>echoML</code> and <code>getcML</code>,
 
* and provides an extra argument for <code>putcML</code>,
 
* with the replacement parameter <code>u</code> being used to define the new local bindings <code>u1</code>, <code>u2</code> and <code>u3</code> as the result of a call to <code>parts</code>.
 
 
So for the price of some extra calls and bindings, we can have SML-style I/O in Haskell. Furthermore, as the prevailing [https://smlfamily.github.io/sml97-defn.pdf definition for Standard ML] has been available since 1997, there should be plenty of I/O tutorials to choose from...
 
 
----
 
=== <u>Resisting temptation</u> ===
 
 
If you're now thinking about using something like:
 
 
<pre>
 
primitive might_get_Char :: () -> Char
 
primitive might_put_Char :: Char -> ()
 
</pre>
 
 
to achieve a more direct translation...'''don't''' - it ''might'' for this small program, but it just isn't reliable in general. Why?
 
 
* ''Short answer'': unlike SML, Haskell's nonstrict evaluation means expressions should be [[Referential transparency|referentially transparent]].
 
* ''Long answer'': read section 2.2 (pages 4-5) of Wadler's [https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.3579&rep=rep1&type=pdf paper].
 
* ''Longer answer'': read Lennart Augustsson's [https://augustss.blogspot.com/2011/05/more-points-for-lazy-evaluation-in.html More points for lazy evaluation].
 
* ''Extended answer'': read John Hughes's [https://www.cse.chalmers.se/~rjmh/Papers/whyfp.pdf Why Functional Programming Matters].
 
 
But, if after all that, you're still not convinced...a small functional language which combines non-strictness with effect-centric definitions in a similar fashion can be found [http://h2.jaguarpaw.co.uk/posts/impure-lazy-language here] - ''"have fun!"''
 
 
----
 
===<code>OI</code><u>: what is it?</u> ===
 
 
<code>OI</code> is an abstract [[Plainly partible|partible]] type:
 
 
<haskell>
 
data OI a
 
primitive primPartOI :: OI -> (OI, OI)
 
 
instance Partible OI where
 
part = primPartOI
 
</haskell>
 
 
Like <code>primPartOI</code>, most other primitives for the <code>OI</code> type also accept an <code>OI</code>-value as their last (or only) argument e.g:
 
 
<haskell>
 
primitive primGetChar :: OI -> Char
 
primitive primPutChar :: Char -> OI -> ()
 
 
</haskell>
 
 
For consistency, the last argument of a <code>OI</code>-based definition should also be an <code>OI</code>-value:
 
 
<haskell>
 
interact :: (String -> String) -> OI -> ()
 
interact d u = let !(u1, u2) = part u in
 
putStr (d $ getContents u1) u2
 
 
putStr :: String -> OI -> ()
 
putStr s u = foldr (\(!_) -> id) () $ zipWith primPutChar s $ parts u
 
 
getContents :: OI -> String
 
getContents u = case map getChar (parts u) of
 
l@(!c:_) -> l
 
l -> l
 
</haskell>
 
<sub> </sub>
 
 
----
 
=== <u>Some annoyances</u> ===
 
 
==== Extra parameters and arguments ====
 
As noted by Sigbjørn Finne and Simon Peyton Jones in [https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.46.1260&rep=rep1&type=pdf Programming Reactive Systems in Haskell], passing around all those <code>OI</code>-values ''correctly'' can be tedious for large definitions. Fortunately, they can be also used more concisely with a variety of abstract interfaces:
 
 
* [[Arrow|arrow]]
 
 
:<haskell>
 
type A b c = (OI -> b) -> (OI -> c)
 
 
arr :: (b -> c) -> A b c
 
arr f = \ c' u -> f $! c' u
 
 
 
both :: A b c -> A b' c' -> A (b, b') (c, c')
 
f' `both` g' = \ c' u -> let !(u1:u2:u3:_) = parts u in
 
let !(x, x') = c' u1 in
 
let !y = f' (unit x) u2 in
 
let !y' = g' (unit x') u3 in
 
(y, y')
 
where
 
unit x u = let !_ = part u in x
 
</haskell>
 
 
* [[Comonad|comonad]]
 
 
:<haskell>
 
type C a = (a, OI)
 
 
extract :: C a -> a
 
extract (x, u) = let !_ = part u in x
 
 
duplicate :: C a -> C (C a)
 
duplicate (x, u) = let !(u1, u2) = part u in
 
((x, u1), u2)
 
 
extend :: (C a -> b) -> C a -> C b
 
extend h (x, u) = let !(u1, u2) = part u in
 
let !y = h (x, u1) in
 
(y, u2)
 
</haskell>
 
 
* [[Monad|along with]]
 
 
:<haskell>
 
type M a = OI -> a
 
 
unit :: a -> M a
 
unit x = \ u -> let !_ = part u in x
 
 
bind :: M a -> (a -> M b) -> M b
 
bind m k = \ u -> let !(u1, u2) = part u in
 
let !x = m u1 in
 
let !y = k x u2 in
 
y
 
</haskell>
 
 
:...provided you followed that hint about putting the <code>OI</code> argument last:
 
 
:<haskell>
 
interact :: (String -> String) -> M ()
 
putStr :: String -> M ()
 
getContents :: M String
 
 
primitive primGetChar :: M Char
 
primitive primPutChar :: Char -> M ()
 
 
</haskell>
 
 
:You didn't put the <code>OI</code> argument last? Oh well, there's always the [[Applicative functor|applicative]] interface...
 
 
===== Why bother with monadic I/O? =====
 
 
<div style="border-left:1px solid lightgray; padding: 1em" alt="blockquote">
 
Considering that the concept of monads is not likely to disappear from the functional programming landscape any time soon, it is vital that we, as the functional programming community, somehow overcome the problems novices encounter when first studying monads.
 
 
<tt>[https://pms.cs.ru.nl/iris-diglib/src/getContent.php?id=2017-Steenvoorden-SupportLearning Visual Support for Learning Monads], Tim Steenvoorden, Jurriën Stutterheim, Erik Barendsen and Rinus Plasmeijer.</tt>
 
</div>
 
 
Futhermore:
 
* Haskell is now used far and wide, so good ol' ''"search and replace"'' is a non-starter!
 
* There are some who [https://www.humprog.org/~stephen//research/papers/kell17some-preprint.pdf still prefer C], and there are others who are content with <code>IO</code> - convincing them to switch will probably take a lot more than a solitary page on some wiki!
 
 
So it's actually rather fortunate that <code>IO</code> can be defined so easily using <code>OI</code>:
 
<haskell>
 
type IO a = M a -- = OI -> a
 
</haskell>
 
<sub> </sub>
 
 
----
 
==== Mutable state shared between different types ====
 
It's been known for a very long time in the SML community that <i>naïve</i> declarations of operations for mutable state breaks type safety:
 
 
<pre>
 
primitive newPolyRef :: a -> OI -> PolyRef a
 
primitive readPolyRef :: PolyRef a -> OI -> a
 
primitive writePolyRef :: PolyRef a -> a -> OI -> ()
 
 
kah_BOOM u = let …
 
!vehicle = newPolyRef undefined u1
 
!_ = writePolyRef ("0" :: [Char]) u2
 
!crash = readPolyRef vehicle u3
 
burn = 1 :: Int
 
in
 
crash + burn
 
</pre>
 
 
The solution used by SML is to render the use of mutable state ''monomorphic'' through the use of dedicated syntax:
 
 
<pre>
 
let val r = ref (…)
 
 
</pre>
 
 
===== Options for Haskell =====
 
One alternative would be to extend type signatures to support [[Monomorphism by annotation of type variables|monomorphic type-variables]]:
 
 
<haskell>
 
primitive newIORef :: monomo a . a -> OI -> IORef a
 
primitive readIORef :: monomo a . IORef a -> OI -> a
 
primitive writeIORef :: monomo a . IORef a -> a -> OI -> ()
 
 
{- would be rejected by the extended type system:
 
kah_BOOM u = let !(u1:u2:u3:_) = parts u
 
!vehicle = newIORef undefined u1 -- vehicle :: monomo a . IORef a
 
!_ = writeIORef ("0" :: [Char]) u2 -- vehicle :: IORef [Char]
 
!crash = readIORef vehicle u3 -- vehicle :: IORef [Char] ≠ IORef Int
 
burn = 1 :: Int
 
in
 
crash + burn
 
-}
 
</haskell>
 
 
In standard Haskell, there are a few places where this already occurs, albeit implicitly:
 
* on the parameters of a function,
 
 
:<haskell>
 
{- will be rejected by the standard Haskell type system
 
 
ker_plunk f = (f True, f 'b')
 
 
-}
 
</haskell>
 
 
* and the type-parameter for a type class.
 
 
:<haskell>
 
primitive newIORef :: Eq a => a -> OI -> IORef a
 
primitive readIORef :: Eq a => IORef a -> OI -> a
 
primitive writeIORef :: Eq a => IORef a -> a -> OI -> ()
 
</haskell>
 
 
One other alternative is to make <code>IO</code> into an ''abstract data type'':
 
 
<haskell>
 
module Abstract.IO
 
(
 
Monad (..),
 
getChar, putChar, …
 
newIORef, readIORef, writeIORef,
 
 
)
 
where
 
 
instance Monad ((->) OI) where
 
return = unitIO
 
(>>=) = bindIO
 
 
 
getChar :: IO Char
 
getChar = primGetChar
 
 
putChar :: Char -> IO ()
 
putChar = primPutChar
 
 
newIORef :: a -> IO (IORef a)
 
newIORef = primNewIORef
 
 
readIORef :: IORef a -> IO a
 
readIORef = primReadIORef
 
 
writeIORef :: IORef a -> a -> IO ()
 
writeIORef = primWriteIORef
 
 
 
-- these are now local, private entities --
 
type IO a = OI -> a
 
 
unitIO :: a -> IO a
 
unitIO x = \ u -> let !_ = part u in x
 
 
bindIO :: IO a -> (a -> IO b) -> IO b
 
bindIO m k = \ u -> let !(u1, u2) = part u in
 
let !x = m u1 in
 
let !y = k x u2 in
 
y
 
 
data OI a
 
primitive primPartOI :: OI -> (OI, OI)
 
 
primitive primGetChar :: OI -> Char
 
primitive primPutChar :: Char -> OI -> ()
 
 
 
data IORef
 
primitive primNewIORef :: a -> OI -> IORef a
 
primitive primReadIORef :: IORef a -> OI -> a
 
primitive primWriteIORef :: IORef a -> a -> OI -> ()
 
 
 
</haskell>
 
 
With <code>IO</code> now abstract, the only way to use <code>IO</code>-actions is by using:
 
 
* the visible <code>IO</code> operations: <code>getChar</code>, <code>putChar</code>, etc.
 
* the monadic interface - <code>Monad(return, (>>=), …)</code> (or via Haskell's <code>do</code>-notation).
 
 
So how does making <code>IO</code> an ADT prevent the polymorphic sharing of mutable state? It's all to do with the type of <code>(>>=)</code> when used with <code>IO</code>-actions:
 
 
<haskell>
 
(>>=) :: IO a -> (a -> IO b) -> IO b
 
</haskell>
 
 
in particular, the type of the second argument:
 
 
:{|
 
|<haskell>
 
(a -> IO b)
 
</haskell>
 
|}
 
 
...it's a function, so the value it receives will be rendered monomorphic in it's result (of type <code>IO b</code>).
 
 
As <code>(>>=)</code> is now the only <code>IO</code> operation which can retrieve a result from an <code>IO</code>-action, mutable state (<code>IORef …</code>) simply cannot be used polymorphically.
 
 
----
 
=== <u>How GHC does I/O</u> ===
 
<sub> </sub>
 
<haskell>
 
newtype IO a = IO (State# RealWorld -> (# State# RealWorld, a #))
 
</haskell>
 
 
...you may have noticed that we've already made liberal use of one Haskell extension - [https://downloads.haskell.org/~ghc/7.8.4/docs/html/users_guide/bang-patterns.html bang-patterns] - and it would be useful to stay as close as possible to standard Haskell, so we'll simplify matters:
 
 
<haskell>
 
newtype IO a = IO (IOState -> (IOState, a)) -- unboxed-tuple replaced by standard one
 
 
type IOState = State# RealWorld
 
</haskell>
 
 
Now to make the changes:
 
 
* ''to the type'' - <code>IOState</code> uses an <code>OI</code>-value:
 
 
:<haskell>
 
newtype IOState = IOS OI
 
</haskell>
 
 
* ''to the I/O-specific operations'' - each one will use the <code>OI</code>-value in the initial state to provide two new <code>OI</code>-values: one to make up the final state; the other being used by the <code>OI</code>-primitive:
 
 
:<haskell>
 
getChar :: IO Char
 
getChar = IO $ \(IOS u) -> let !(u1, u2) = part u
 
!c = primGetChar u1
 
in (IOS u2, c)
 
 
putChar :: Char -> IO ()
 
putChar c = IO $ \(IOS u) -> let !(u1, u2) = part u
 
!t = primPutChar c u1
 
in (IOS u2, t)
 
 
-- etc.
 
</haskell>
 
 
* ''to the overloaded operations'' - you've probably seen it all before:
 
 
:<haskell>
 
instance Monad IO where
 
return x = IO $ \(!s) -> (s, x)
 
IO m >>= k = IO $ \(!s) -> let !(s', x) = m s
 
!(IO w) = k x
 
in w s'
 
</haskell>
 
 
:(...if you haven't: it's ye ol' <del>''pass-the-planet''</del> state-passing technique.)
 
 
One aspect which doesn't change is <code>IO</code> and its operations being abstract. In fact, the need is even more pressing: in addition to preventing the misuse of certain <code>OI</code>-operations, being an abstract data type prevents <code>IOState</code>-values from being erroneously reused.
 
----
 
=== <u>Conclusions</u> ===
 
 
* ''Why is Haskell I/O monadic'' - to avoid having to use extra arguments and parameters ''everywhere''.
 
 
* ''Why is Haskell I/O abstract'' - to ensure I/O works as intended, by preventing the misuse of internal data.
 
 
* ''Why is Haskell I/O unusual'' - because of Haskell's ''nonstrict evaluation'' and thus its focus on ''referential transparency'', contrary to most other programming languages.
 
 
----
 
=== <u>Further reading</u> ===
 
 
If you've managed to get all the way to here:
 
* [https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.52.3656&rep=rep1&type=pdf State in Haskell] by John Launchbury and Simon Peyton Jones is also worth reading, if you're interested in how GHC eventually arrived at its current definition of <code>IO</code>.
 
* [[Output/Input]] provides more details about the type <code>OI -> a</code>.
 
 
[[Category:Tutorials]]
 

Latest revision as of 21:27, 14 June 2022

Redirect to: