Difference between revisions of "IO in action"
m |
m |
||
(11 intermediate revisions by the same user not shown) | |||
Line 1: | Line 1: | ||
<div style="border-left:1px solid lightgray; padding: 1em" alt="blockquote"> |
<div style="border-left:1px solid lightgray; padding: 1em" alt="blockquote"> |
||
− | The <code>IO</code> type serves as a tag for operations (actions) that interact with the outside world. [...] |
+ | The <code>IO</code> type serves as a tag for operations (actions) that interact with the outside world. The <code>IO</code> type is abstract: no constructors are visible to the user. [...] |
− | < |
+ | <small>[https://www.haskell.org/definition/haskell2010.pdf The Haskell 2010 Report] (page 95 of 329).</small> |
</div> |
</div> |
||
+ | Instead of that conventional approach: |
||
− | So what are I/O actions? |
||
− | <sup> </sup> |
||
− | |||
− | == A false start == |
||
− | |||
− | Unlike most other programming languages, Haskell's [[Non-strict semantics|non-strict semantics]] and thus its focus on [[referential transparency]] means the common approach to I/O <b>won't work</b>. Even if there was some way to actually introduce it: |
||
<haskell> |
<haskell> |
||
+ | data IO -- abstract |
||
− | # cat NoDirectIO.hs |
||
− | module NoDirectIO where |
||
+ | getChar :: IO Char |
||
− | foreign import ccall unsafe "c_getchar" getchar :: () -> Char |
||
− | + | putChar :: Char -> IO () |
|
+ | ⋮ |
||
− | # |
||
− | # ghci NoDirectIO.hs |
||
− | GHCi, version 9.0.1: https://www.haskell.org/ghc/ :? for help |
||
− | [1 of 1] Compiling NoDirectIO ( NoDirectIO.hs, interpreted ) |
||
− | |||
− | NoDirectIO.hs:3:1: error: |
||
− | • Unacceptable argument type in foreign declaration: |
||
− | ‘()’ cannot be marshalled in a foreign call |
||
− | • When checking declaration: |
||
− | foreign import ccall unsafe "c_getchar" getchar :: () -> Char |
||
− | | |
||
− | 3 | foreign import ccall unsafe "c_getchar" getchar :: () -> Char |
||
− | | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
||
− | Failed, no modules loaded. |
||
− | ghci> :q |
||
− | Leaving GHCi. |
||
− | # |
||
</haskell> |
</haskell> |
||
+ | describe <code>IO</code> using other types, ones with no visible constructors: |
||
− | ...such entities (because they are certainly <i>not</i> regular Haskell functions!) are practically useless. For example, what should the output of this be in Haskell? |
||
<haskell> |
<haskell> |
||
+ | data (->) a b -- abstract |
||
− | let |
||
− | + | data OI -- also abstract |
|
− | g x y = h y y |
||
− | h x y = "what?" |
||
− | in f (putstr "hello ") (putstr "world\n") |
||
− | </haskell> |
||
+ | type IO a = OI -> a |
||
− | <small>(That's just one of the counter-examples from section 3.1 (page 43 of 210) in Claus Reinke's [https://macau.uni-kiel.de/servlets/MCRFileNodeServlet/macau_derivate_00002884/1998_tr04.pdf Functions, Frames and Interactions]!)</small> |
||
+ | getChar :: OI -> Char -- an I/O action |
||
− | |||
+ | putChar :: Char -> (OI -> ()) -- a function with one parameter, whose result is an I/O action |
||
− | For a language like Haskell, there are two options: |
||
− | |||
− | * avoid I/O altogether and be [[Denotative|denotative]]; |
||
− | |||
− | * use existing language features to build a framework and adapt I/O-centric entities to work within it: a model of I/O. |
||
− | |||
− | == Actions and functions == |
||
− | |||
− | In Haskell, functions have their basis in mathematics, not subroutines. It requires all functions to obey this essential rule: |
||
− | |||
− | * if a function's result changes, it is <b>only</b> because it's arguments have changed. |
||
− | |||
− | So if <code>getchar</code> and <code>putchar</code> were applied to a different value at each call site, they <i>could</i> be used like functions: |
||
− | |||
− | <haskell> |
||
− | foreign import ccall unsafe "c_getchar" getchar :: ... -> Char |
||
− | foreign import ccall unsafe "c_putchar" putchar :: Char -> ... -> () |
||
− | </haskell> |
||
− | |||
− | These curious values need types: |
||
− | |||
− | * the requirement for different values is now extended to avoid having two different types - each value can only be used as an argument <i>once</i>: |
||
− | |||
− | :<haskell> |
||
− | let u = ... in |
||
− | let !c1 = getchar u in -- invalid: |
||
− | let !c2 = getchar u in -- reusing u |
||
− | in [c1, c2] |
||
− | </haskell> |
||
− | |||
− | :<haskell> |
||
− | let [c1, c2] = ... in |
||
− | let u = ... in |
||
− | let !_ = putchar c1 u in -- invalid |
||
− | let !_ = putchar c2 u in -- too |
||
− | in () |
||
− | </haskell> |
||
− | |||
− | :<haskell> |
||
− | let u = ... in |
||
− | let !c = getchar u in -- invalid |
||
− | let !_ = putchar c u in -- again |
||
− | in () |
||
− | </haskell> |
||
− | |||
− | :This extended requirement will also apply to any other <code>OI</code>-based entities, primitive or otherwise. |
||
− | |||
− | * since outside interactions are involved, let's [https://www.interaction-design.org/literature/article/kiss-keep-it-simple-stupid-a-design-principle keep it simple]: |
||
− | |||
− | :<haskell> |
||
− | data OI -- abstract |
||
− | getChar :: OI -> Char |
||
− | putChar :: Char -> OI -> () |
||
− | </haskell> |
||
− | |||
− | Having previously referred to them as <i>"entities"</i>, these new type signatures make for more useful descriptions: |
||
− | |||
− | <haskell> |
||
⋮ |
⋮ |
||
− | getChar :: (OI -> Char) -- this is an I/O action |
||
− | putChar :: Char -> (OI -> ()) -- this resembles a function returning an I/O action |
||
</haskell> |
</haskell> |
||
+ | <sub> </sub> |
||
− | == |
+ | == Starting up == |
+ | <code>Main.main</code> is also an I/O action: |
||
− | Since the origin of <code>OI</code> values are unspecified, let's start with some pseudo-code: |
||
<haskell> |
<haskell> |
||
+ | main :: OI -> () |
||
− | getLine :: (OI -> [Char]) -- another I/O action |
||
− | putLine :: [Char] -> (OI -> ()) -- also resembles a function returning an I/O action |
||
− | |||
− | getLine u = let !c = getChar ... in |
||
− | if c == '\n' then |
||
− | [] |
||
− | else |
||
− | let !cs = getLine ... |
||
− | in c:cs |
||
− | |||
− | putLine (c:cs) u = let !_ = putChar c ... in putLine cs ... |
||
− | putLine [] u = putChar '\n' ... |
||
</haskell> |
</haskell> |
||
+ | Therefore an internal subroutine in the Haskell implementation provides each running program with an initial <code>OI</code> value. However, most programs will require more than just one: |
||
− | Because <code>OI</code> values can only be used once: |
||
<haskell> |
<haskell> |
||
− | + | getLine :: OI -> [Char] |
|
+ | getLine u = let !(u1, u2) = partOI u in |
||
− | |||
+ | let !c = getChar u1 in |
||
− | getLine u = let (u1, u2) = ... in |
||
− | + | if c == '\n' then |
|
− | + | [] |
|
− | + | else |
|
− | + | let !cs = getLine u2 |
|
− | + | in c:cs |
|
− | in c:cs |
||
− | putLine |
+ | putLine :: [Char] -> OI -> () |
+ | putLine (c:cs) u = let !(u1, u2) = partOI u in |
||
let !_ = putChar c u1 in |
let !_ = putChar c u1 in |
||
putLine cs u2 |
putLine cs u2 |
||
Line 150: | Line 55: | ||
</haskell> |
</haskell> |
||
+ | So another I/O action is needed in order to access that same internal subroutine from Haskell: |
||
− | Those new local bindings <code>u1</code> and <code>u2</code> in <code>getLine</code> must be defined somehow, and there's only one parameter available: |
||
<haskell> |
<haskell> |
||
+ | partOI :: OI -> (OI, OI) |
||
− | ⋮ |
||
− | |||
− | getLine u = let (u1, u2) = ... u ... in |
||
− | let !c = getChar u1 in |
||
− | if c == '\n' then |
||
− | [] |
||
− | else |
||
− | let !cs = getLine u2 |
||
− | in c:cs |
||
− | |||
− | ⋮ |
||
</haskell> |
</haskell> |
||
+ | If more than two new <code>OI</code> values are needed: |
||
− | Now for [[Partible|an extra]] abstraction in the form of another primitive, to complete the new local bindings: |
||
<haskell> |
<haskell> |
||
+ | partsOI :: OI -> [OI] |
||
− | ⋮ |
||
− | + | partsOI u = let !(u1, u2) = partOI u in u1 : partsOI u2 |
|
− | |||
− | getLine u = let (u1, u2) = partOI u in |
||
− | let !c = getChar u1 in |
||
− | if c == '\n' then |
||
− | [] |
||
− | else |
||
− | let !cs = getLine u2 |
||
− | in c:cs |
||
− | |||
− | putLine (c:cs) u = let (u1, u2) = partOI u in |
||
− | let !_ = putChar c u1 in |
||
− | putLine cs u2 |
||
− | putLine [] u = putChar '\n' u |
||
</haskell> |
</haskell> |
||
− | + | But why are these abstract <code>OI</code> values needed at all - what purpose do they serve? |
|
+ | == Actions and functions == |
||
− | * suggests the existence of a single ancestral <code>OI</code> value in the entire program: |
||
+ | Looking more closely at <code>getLine</code>, <code>putLine</code> and <code>partsOI</code> reveals an interesting fact: |
||
− | :<haskell> |
||
− | main :: (OI -> ()) -- a program is an I/O action |
||
− | </haskell> |
||
+ | * each <code>OI</code> value is only used once (if at all). |
||
− | * and clearly shows that the only <code>safe</code> way to use an I/O action is from within the definition of another I/O action: |
||
+ | Why is this important? Because in Haskell, functions have their basis in mathematics. That imposes certain requirements on function, including this one: |
||
− | :<haskell> |
||
− | trace :: [Char] -> a -> a |
||
− | trace msg x = let u = ... in -- how's this going to work? |
||
− | let !_ = putLine msg u in x |
||
− | </haskell> |
||
+ | * if a function's result changes, it is <b>only</b> because one or more of it's arguments has changed. |
||
− | == Other interfaces == |
||
+ | If they're always used with different <code>OI</code> values, then I/O actions can be used like functions: |
||
− | The simplicity of the <code>OI</code>-based interface: |
||
<haskell> |
<haskell> |
||
+ | partsOI :: OI -> [OI] |
||
− | data OI |
||
+ | partsOI = unfoldr (Just . partOI) |
||
− | partOI :: (OI -> (OI, OI)) |
||
− | getChar :: (OI -> Char) |
||
− | putChar :: Char -> (OI -> ()) |
||
</haskell> |
</haskell> |
||
+ | even if they're defined using subroutines: |
||
− | makes it very adept at implementing other models of I/O: |
||
+ | <haskell> |
||
− | * [[Comonad|comonad]]: |
||
+ | foreign import "oi_part" partOI :: OI -> (OI, OI) |
||
− | |||
+ | foreign import "oi_getchar" getChar :: OI -> Char |
||
− | :<haskell> |
||
+ | foreign import "oi_putchar" putChar :: Char -> OI -> () |
||
− | type C a = (OI, a) |
||
− | |||
− | extract :: C a -> a |
||
− | extract (u, x) = let !_ = partOI u in x |
||
− | |||
− | duplicate :: C a -> C (C a) |
||
− | duplicate (u, x) = let !(u1, u2) = partOI u in |
||
− | (u2, (u1, x)) |
||
− | |||
− | extend :: (C a -> b) -> C a -> C b |
||
− | extend h (u, x) = let !(u1, u2) = partOI u in |
||
− | let !y = h (u1, x) in |
||
− | (u2, y) |
||
</haskell> |
</haskell> |
||
+ | The need for an <code>OI</code> value also helps to prevent I/O actions from being used as subroutines: |
||
− | * [[Arrow|arrow]]: |
||
− | + | <haskell> |
|
− | + | trace :: [Char] -> a -> a |
|
+ | trace msg x = case putLine msg of !_ -> x -- how is this supposed to work? |
||
− | |||
− | arr :: (b -> c) -> A b c |
||
− | arr f = \ c' u -> f $! c' u |
||
− | |||
− | both :: A b c -> A b' c' -> A (b, b') (c, c') |
||
− | f' `both` g' = \ c' u -> let !(u1:u2:u3:_) = partsOI u in |
||
− | let !(x, x') = c' u1 in |
||
− | let !y = f' (unit x) u2 in |
||
− | let !y' = g' (unit x') u3 in |
||
− | (y, y') |
||
− | |||
− | |||
− | unit :: a -> OI -> a |
||
− | unit x u = let !_ = partOI u in x |
||
− | |||
− | partsOI :: OI -> [OI] |
||
− | partsOI u = let !(u1, u2) = partOI u in u1 : partsOI u2 |
||
</haskell> |
</haskell> |
||
+ | == Monadic actions == |
||
− | * that [[Monad|other]] interface: |
||
+ | The monadic interface: |
||
− | :<haskell> |
||
− | type M a = OI -> a |
||
+ | <haskell> |
||
− | unit :: a -> M a |
||
+ | instance Monad ((->) OI) |
||
− | unit x = \ u -> let !_ = partOI u in x |
||
+ | return x = \ u -> let !_ = partOI u in x |
||
− | |||
− | + | m >>= k = \ u -> let !(u1, u2) = partOI u in |
|
− | + | let !x = m u1 in |
|
− | let ! |
+ | let !y = k x u2 in |
− | + | y |
|
− | y |
||
</haskell> |
</haskell> |
||
+ | allows <code>getLine</code> and <code>getLine</code> to be defined more compactly: |
||
− | * ...and even the state-passing style used by [https://www.cambridge.org/core/services/aop-cambridge-core/content/view/2EFAEBBE3A19EA03A8D6D75A5348E194/S0956796800001258a.pdf/the-ins-and-outs-of-clean-io.pdf Clean], [https://www.researchgate.net/publication/220997216_Controlling_chaos_on_safe_side-effects_in_data-parallel_operations Single-Assignment C], some Haskell implementations and as part of the I/O model used for the verification of interactive programs in [https://cakeml.org/vstte18.pdf CakeML], remembering that <code>OI</code> values can only be used once: |
||
− | + | <haskell> |
|
+ | getLine :: OI -> [Char] |
||
− | newtype W = W OI |
||
+ | getLine = do c <- getChar |
||
+ | if c == '\n' then |
||
+ | return [] |
||
+ | else |
||
+ | do cs <- getLine |
||
+ | return (c:cs) |
||
− | + | putLine :: [Char] -> OI -> () |
|
+ | putLine [] = putChar '\n' |
||
− | readchar (W u) = let !(u1, u2) = partOI u in |
||
+ | putLine (c:cs) = putChar c >> putLine cs |
||
− | let !c = getChar u1 in |
||
− | (c, W u2) |
||
− | |||
− | writechar :: Char -> W -> W |
||
− | writechar c (W u) = let !(u1, u2) = partOI u in |
||
− | let !_ = putChar u1 in |
||
− | W u2 |
||
</haskell> |
</haskell> |
||
+ | and conceals the use of all those <code>OI</code> values. But not all definitions will benefit from being monadic: |
||
− | It can also be used to implement the models of [https://dl.acm.org/doi/pdf/10.1145/262009.262011 I/O used in earlier versions] of Haskell: |
||
− | |||
− | * dialogues: |
||
− | |||
− | :<haskell> |
||
− | runD :: ([Response] -> [Request]) -> OI -> () |
||
− | runD d u = foldr (\(!_) -> id) () $ yet $ \ l -> zipWith respond (d l) (partsOI u) |
||
− | |||
− | yet :: (a -> a) -> a |
||
− | yet f = f (yet f) |
||
+ | <haskell> |
||
partsOI :: OI -> [OI] |
partsOI :: OI -> [OI] |
||
− | partsOI |
+ | partsOI = do (u1, u2) <- partOI; return (u1 : partsOI u2) |
− | |||
− | respond :: Request -> OI -> Response |
||
− | respond Getq u = let !c = getChar u in Getp c |
||
− | respond (Putq c) u = let !_ = putChar c u in Putp |
||
− | |||
− | data Request = Getq | Putq Char |
||
− | data Response = Getp Char | Putp |
||
− | </haskell> |
||
− | |||
− | * continuations: |
||
− | |||
− | :<haskell> |
||
− | type Answer = OI -> () |
||
− | |||
− | runK :: Answer -> IO -> () |
||
− | runK a u = a u |
||
− | |||
− | doneK :: Answer |
||
− | doneK = \ u -> let !_ = partOI u in () |
||
− | |||
− | getcharK :: (Char -> Answer) -> Answer |
||
− | getcharK k = \ u -> let !(u1, u2) = partOI u in |
||
− | let !c = getChar u1 in |
||
− | let !a = k c in |
||
− | a u2 |
||
− | |||
− | putcharK :: Char -> Answer -> Answer |
||
− | putcharK c a = \ u -> let !(u1, u2) = partOI u in |
||
− | let !_ = putChar c u1 in |
||
− | a u2 |
||
</haskell> |
</haskell> |
||
== Further reading == |
== Further reading == |
||
− | * [[ |
+ | * [[Merely monadic]] provides more information about Haskell's implementation of the monadic interface. |
− | * For those who prefer it, John Launchbury and Simon Peyton Jones's [https:// |
+ | * For those who prefer it, John Launchbury and Simon Peyton Jones's [https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.52.3656&rep=rep1&type=pdf State in Haskell] explains the state-passing approach currently in widespread use. |
[[Category:Tutorials]] |
[[Category:Tutorials]] |
Latest revision as of 07:12, 12 June 2023
The IO
type serves as a tag for operations (actions) that interact with the outside world. The IO
type is abstract: no constructors are visible to the user. [...]
The Haskell 2010 Report (page 95 of 329).
Instead of that conventional approach:
data IO -- abstract
getChar :: IO Char
putChar :: Char -> IO ()
⋮
describe IO
using other types, ones with no visible constructors:
data (->) a b -- abstract
data OI -- also abstract
type IO a = OI -> a
getChar :: OI -> Char -- an I/O action
putChar :: Char -> (OI -> ()) -- a function with one parameter, whose result is an I/O action
⋮
Starting up
Main.main
is also an I/O action:
main :: OI -> ()
Therefore an internal subroutine in the Haskell implementation provides each running program with an initial OI
value. However, most programs will require more than just one:
getLine :: OI -> [Char]
getLine u = let !(u1, u2) = partOI u in
let !c = getChar u1 in
if c == '\n' then
[]
else
let !cs = getLine u2
in c:cs
putLine :: [Char] -> OI -> ()
putLine (c:cs) u = let !(u1, u2) = partOI u in
let !_ = putChar c u1 in
putLine cs u2
putLine [] u = putChar '\n' u
So another I/O action is needed in order to access that same internal subroutine from Haskell:
partOI :: OI -> (OI, OI)
If more than two new OI
values are needed:
partsOI :: OI -> [OI]
partsOI u = let !(u1, u2) = partOI u in u1 : partsOI u2
But why are these abstract OI
values needed at all - what purpose do they serve?
Actions and functions
Looking more closely at getLine
, putLine
and partsOI
reveals an interesting fact:
- each
OI
value is only used once (if at all).
Why is this important? Because in Haskell, functions have their basis in mathematics. That imposes certain requirements on function, including this one:
- if a function's result changes, it is only because one or more of it's arguments has changed.
If they're always used with different OI
values, then I/O actions can be used like functions:
partsOI :: OI -> [OI]
partsOI = unfoldr (Just . partOI)
even if they're defined using subroutines:
foreign import "oi_part" partOI :: OI -> (OI, OI)
foreign import "oi_getchar" getChar :: OI -> Char
foreign import "oi_putchar" putChar :: Char -> OI -> ()
The need for an OI
value also helps to prevent I/O actions from being used as subroutines:
trace :: [Char] -> a -> a
trace msg x = case putLine msg of !_ -> x -- how is this supposed to work?
Monadic actions
The monadic interface:
instance Monad ((->) OI)
return x = \ u -> let !_ = partOI u in x
m >>= k = \ u -> let !(u1, u2) = partOI u in
let !x = m u1 in
let !y = k x u2 in
y
allows getLine
and getLine
to be defined more compactly:
getLine :: OI -> [Char]
getLine = do c <- getChar
if c == '\n' then
return []
else
do cs <- getLine
return (c:cs)
putLine :: [Char] -> OI -> ()
putLine [] = putChar '\n'
putLine (c:cs) = putChar c >> putLine cs
and conceals the use of all those OI
values. But not all definitions will benefit from being monadic:
partsOI :: OI -> [OI]
partsOI = do (u1, u2) <- partOI; return (u1 : partsOI u2)
Further reading
- Merely monadic provides more information about Haskell's implementation of the monadic interface.
- For those who prefer it, John Launchbury and Simon Peyton Jones's State in Haskell explains the state-passing approach currently in widespread use.