Difference between revisions of "IO in action"

From HaskellWiki
(Initial content)
 
m
 
(17 intermediate revisions by the same user not shown)
Line 1: Line 1:
 
<div style="border-left:1px solid lightgray; padding: 1em" alt="blockquote">
 
<div style="border-left:1px solid lightgray; padding: 1em" alt="blockquote">
The <code>IO</code> type serves as a tag for operations (actions) that interact with the outside world. [...]
+
The <code>IO</code> type serves as a tag for operations (actions) that interact with the outside world. The <code>IO</code> type is abstract: no constructors are visible to the user. [...]
  
<tt>[https://www.haskell.org/definition/haskell2010.pdf The Haskell 2010 Report], (page 95 of 329).</tt>
+
<tt>[https://www.haskell.org/definition/haskell2010.pdf The Haskell 2010 Report] (page 95 of 329).</tt>
 
</div>
 
</div>
  
So what are I/O actions?
+
[[Output/Input|Instead of]] the conventional approach:
 
 
== A false start ==
 
 
 
Unlike most other programming languages, Haskell's [[nonstrict evaluation]] and thus its focus on [[referential transparency]] means the common approach to I/O <b>won't work</b>. Even if there was some way to actually introduce it:
 
  
 
<haskell>
 
<haskell>
# cat NoDirectIO.hs
+
data IO  -- abstract
module NoDirectIO where
 
  
foreign import ccall unsafe "c_getchar" getchar :: () -> Char
+
getChar ::         IO Char
foreign import ccall unsafe "c_putchar" putchar :: Char -> ()
+
putChar :: Char -> IO ()
#
+
        ⋮
# ghci NoDirectIO.hs
 
GHCi, version 9.0.1: https://www.haskell.org/ghc/  :? for help
 
[1 of 1] Compiling NoDirectIO      ( NoDirectIO.hs, interpreted )
 
 
 
NoDirectIO.hs:3:1: error:
 
    • Unacceptable argument type in foreign declaration:
 
        ‘()’ cannot be marshalled in a foreign call
 
    • When checking declaration:
 
        foreign import ccall unsafe "c_getchar" getchar :: () -> Char
 
  |
 
3 | foreign import ccall unsafe "c_getchar" getchar :: () -> Char
 
  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
Failed, no modules loaded.
 
ghci> :q
 
Leaving GHCi.
 
#
 
 
</haskell>
 
</haskell>
  
...such entities (because they are certainly <i>not</i> regular Haskell functions!) are practically useless. For example, what should the output of this be in Haskell?
+
describe <code>IO</code> using other types, ones with no visible constructors:
  
 
<haskell>
 
<haskell>
let                                                 
+
data (->) a b  -- abstract
  f x y = g y x
+
data OI        -- also abstract
  g x y = h y y
 
  h x y = "what?"
 
in f (putstr "hello ") (putstr "world\n")
 
</haskell>
 
  
<small>(That's just one of the counter-examples from section 3.1 (page 43 of 210) in Claus Reinke's [https://macau.uni-kiel.de/servlets/MCRFileNodeServlet/macau_derivate_00002884/1998_tr04.pdf Functions, Frames and Interactions]!)</small>
+
type IO a =        OI -> a
 
+
getChar ::         OI -> Char -- an I/O action
For a language like Haskell, there are two options:
+
putChar :: Char -> (OI -> ())   -- a function with one parameter, whose result is an I/O action
 
 
* avoid I/O altogether and be [[Denotative|denotative]];
 
 
 
* use existing language features to build a framework and adapt I/O-centric entities to work within it: a model of I/O.
 
 
 
== Actions and functions ==
 
 
 
In Haskell, functions have their basis in mathematics, not subroutines. It requires all functions to obey this essential rule:
 
 
 
* if a function's result changes, it is <b>only</b> because its arguments have changed.
 
 
 
So if <code>getchar</code> and <code>putchar</code> were applied to a different value at each call site, they <i>could</i> be used like functions:
 
 
 
<haskell>
 
foreign import ccall unsafe "c_getchar" getchar :: ... -> Char
 
foreign import ccall unsafe "c_putchar" putchar :: Char -> ... -> ()
 
</haskell>
 
 
 
These curious values need types:
 
 
 
* the requirement for different values is now extended to avoid having two different types - each value can only be used as an argument <i>once</i>:
 
 
 
:<haskell>
 
let u  = ... in
 
let !c1 = getchar u in  -- invalid:
 
let !c2 = getchar u in  -- reusing u
 
in  [c1, c2]
 
</haskell>
 
 
 
:<haskell>
 
let [c1, c2] = ... in
 
let u = ... in
 
let !_  = putchar c1 u in  -- invalid
 
let !_  = putchar c2 u in  --  too
 
in  ()
 
</haskell>
 
       
 
:<haskell>
 
let u = ... in
 
let !c  = getchar u  in   -- invalid
 
let !_  = putchar c u in  --  again
 
in  ()
 
</haskell>
 
 
 
:This extended requirement will also apply to any other <code>OI</code>-based entities, primitive or otherwise.
 
 
 
* since outside interactions are involved, let's [https://www.interaction-design.org/literature/article/kiss-keep-it-simple-stupid-a-design-principle keep it simple]:
 
 
 
:<haskell>
 
data OI  -- abstract
 
getChar :: OI -> Char
 
putChar :: Char -> OI -> ()
 
</haskell>
 
 
 
Having previously referred to them as <i>"entities"</i>, these new type signatures make for more useful descriptions:
 
 
 
<haskell>
 
 
         ⋮
 
         ⋮
getChar ::        (OI -> Char)  -- this is an I/O action
 
putChar :: Char -> (OI -> ())    -- this resembles a function returning an I/O action
 
 
</haskell>
 
</haskell>
 +
<sub> </sub>
  
== An example in action ==
+
== Starting up ==
  
Since the origin of <code>OI</code> values is unspecified, let's start with some pseudo-code:
+
<code>Main.main</code> is also an I/O action:
  
 
<haskell>
 
<haskell>
getLine ::           (OI -> [Char])  -- another I/O action
+
main :: OI -> ()
putLine :: [Char] -> (OI -> ())      -- also resembles a function returning an I/O action
 
 
 
getLine u = let !c = getChar ... in
 
              if c == '\n' then
 
                []
 
              else
 
                let !cs = getLine ...
 
                in c:cs
 
 
 
putLine (c:cs) u = let !_ = putChar c ... in putLine cs ...
 
putLine []    u = putChar '\n' ...
 
 
</haskell>
 
</haskell>
  
Because <code>OI</code> values can only be used once:
+
Therefore an internal subroutine in the Haskell implementation provides each running program with an initial <code>OI</code> value. However, most programs will require more than just one:
  
 
<haskell>
 
<haskell>
        ⋮
+
getLine        :: OI -> [Char]
 +
getLine u        = let !(u1, u2) = partOI u in
 +
                  let !c = getChar u1 in
 +
                  if c == '\n' then
 +
                    []
 +
                  else
 +
                    let !cs = getLine u2
 +
                    in c:cs
  
getLine u = let (u1, u2) = ... in
+
putLine        :: [Char] -> OI -> ()
            let !c = getChar u1 in
+
putLine (c:cs) u = let !(u1, u2) = partOI u in
            if c == '\n' then
 
              []
 
            else
 
              let !cs = getLine u2
 
              in c:cs
 
 
 
putLine (c:cs) u = let (u1, u2) = ... in
 
 
                   let !_ = putChar c u1 in
 
                   let !_ = putChar c u1 in
 
                   putLine cs u2
 
                   putLine cs u2
Line 149: Line 55:
 
</haskell>
 
</haskell>
  
Those new local bindings <code>u1</code> and <code>u2</code> in <code>getLine</code> must be defined somehow, and there's only one parameter available:
+
So another I/O action is needed in order to access that same internal subroutine from Haskell:
  
 
<haskell>
 
<haskell>
        ⋮
+
partOI :: OI -> (OI, OI)
 
 
getLine u = let (u1, u2) = ... u ... in
 
            let !c = getChar u1 in
 
            if c == '\n' then
 
              []
 
            else
 
              let !cs = getLine u2
 
              in c:cs
 
 
 
        ⋮
 
 
</haskell>
 
</haskell>
  
Now for some more abstraction in the form of an extra primitive, to complete the new local bindings:
+
If more than two new <code>OI</code> values are needed:
  
 
<haskell>
 
<haskell>
        ⋮
+
partsOI :: OI -> [OI]
partOI  :: (OI -> (OI, OI))  -- also an I/O action
+
partsOI u = let !(u1, u2) = partOI u in u1 : partsOI u2
 
 
getLine u = let (u1, u2) = partOI u in
 
            let !c = getChar u1 in
 
            if c == '\n' then
 
              []
 
            else
 
              let !cs = getLine u2
 
              in c:cs
 
 
 
putLine (c:cs) u = let (u1, u2) = partOI u in
 
                  let !_ = putChar c u1 in
 
                  putLine cs u2
 
putLine []    u = putChar '\n' u
 
 
</haskell>
 
</haskell>
  
Noticing the tree-like way in which the various local <code>OI</code> values are being defined and used:
+
But why are these abstract <code>OI</code> values needed at all - what purpose do they serve?
  
* suggests the existence of a single ancestral <code>OI</code> value in the entire program:
+
== Actions and functions ==
  
:<haskell>
+
Looking more closely at <code>getLine</code>, <code>putLine</code> and <code>partsOI</code> reveals an interesting fact:
main :: (OI -> ())  -- a program is an I/O action
 
</haskell>
 
  
* and clearly shows that the only <code>safe</code> way to use an I/O action is from within the definition of another I/O action:
+
* each <code>OI</code> value is only used once (if at all).
  
:<haskell>
+
Why is this important? Because in Haskell, functions have their basis in mathematics. That imposes certain requirements on functions, including this one:
trace :: [Char] -> a -> a
 
trace msg x = let u = ... in  -- how's this going to work?
 
              let !_ = putLine u in x
 
</haskell>
 
  
== Other interfaces ==
+
* if a function's result changes, it is <b>only</b> because one or more of its arguments has changed.
  
The simplicity of the <code>OI</code>-based interface:
+
If they're always used with different <code>OI</code> values, then I/O actions can be used like functions:
  
 
<haskell>
 
<haskell>
data OI
+
partsOI :: OI -> [OI]
partOI  ::         (OI -> (OI, OI))
+
partsOI = unfoldr (Just . partOI)
getChar ::        (OI -> Char)
 
putChar :: Char -> (OI -> ())
 
 
</haskell>
 
</haskell>
  
makes it very adept at implementing other models of I/O:
+
even if they're defined using subroutines:  
  
* [[Comonad|comonad]]:
+
<haskell>
 
+
foreign import "oi_part"    partOI ::         OI -> (OI, OI)
:<haskell>
+
foreign import "oi_getchar" getChar ::         OI -> Char
type C a        = (a, OI)
+
foreign import "oi_putchar" putChar :: Char -> OI -> ()
 
 
extract          :: C a -> a
 
extract (x, u)   =  let !_ = partOI u in x
 
 
 
duplicate        :: C a -> C (C a)
 
duplicate (x, u) =  let !(u1, u2) = partOI u in
 
                    ((x, u1), u2)
 
 
 
extend          :: (C a -> b) -> C a -> C b
 
extend h (x, u)  =  let !(u1, u2) = partOI u in
 
                    let !y        = h (x, u1) in
 
                    (y, u2)
 
 
</haskell>
 
</haskell>
  
* [[Arrow|arrow]]:
+
The need for an <code>OI</code> value also helps to prevent I/O actions from being used as subroutines:
  
:<haskell>
+
<haskell>
type A b c  =  (OI -> b) -> (OI -> c)
+
trace :: [Char] -> a -> a
 
+
trace msg x = case putLine msg of !_ -> x -- how is this supposed to work?
arr          :: (b -> c) -> A b c
 
arr f        =  \ c' u -> f $! c' u
 
 
 
both        :: A b c -> A b' c' -> A (b, b') (c, c')
 
f' `both` g' =  \ c' u -> let !(u1:u2:u3:_) = partsOI u in
 
                          let !(x, x')      = c' u1 in
 
                          let !y            = f' (unit x) u2 in
 
                          let !y'          = g' (unit x') u3 in
 
                          (y, y')                         
 
 
 
 
 
unit        :: a -> OI -> a
 
unit x u    = let !_ = partOI u in x
 
 
 
partsOI      :: OI -> [OI]
 
partsOI u    = let !(u1, u2) = partOI u in u1 : partsOI u2
 
 
</haskell>
 
</haskell>
  
* that [[Monad|other]] interface:
+
== Monadic actions ==
  
:<haskell>
+
The monadic interface:
type M a  =  OI -> a
 
  
unit      :: a -> M a
+
<haskell>
unit x     =  \ u -> let !_ = partOI u in x  
+
instance Monad ((->) OI) where
 
+
    return x =  \ u -> let !_ = partOI u in x  
bind      :: M a -> (a -> M b) -> M b
+
    m >>= k =  \ u -> let !(u1, u2) = partOI u in
bind m k   =  \ u -> let !(u1, u2) = partOI u in
+
                      let !x = m u1 in
                    let !x = m u1 in
+
                      let !y = k x u2 in
                    let !y = k x u2 in
+
                      y
                    y
 
 
</haskell>
 
</haskell>
  
* ...and even the state-passing style used by [https://www.cambridge.org/core/services/aop-cambridge-core/content/view/2EFAEBBE3A19EA03A8D6D75A5348E194/S0956796800001258a.pdf/the-ins-and-outs-of-clean-io.pdf Clean], [https://www.researchgate.net/publication/220997216_Controlling_chaos_on_safe_side-effects_in_data-parallel_operations Single-Assignment C] and some Haskell implementations, remembering that <code>OI</code> values can only be used once:
+
allows <code>getLine</code> and <code>putLine</code> to be defined more compactly:
  
:<haskell>
+
<haskell>
newtype W = W OI
+
getLine :: OI -> [Char]
 
+
getLine = do c <- getChar
readchar :: W -> (Char, W)
+
            if c == '\n' then
readchar (W u) = let !(u1, u2) = partOI u in
+
              return []
                let !c = getChar u1 in
+
            else
                (c, W u2)
+
              do cs <- getLine
 +
                  return (c:cs)
  
writechar :: Char -> W -> W
+
putLine :: [Char] -> OI -> ()
writechar c (W u) = let !(u1, u2) = partOI u in
+
putLine []    = putChar '\n'
                    let !_ = putChar c u1 in
+
putLine (c:cs) = putChar c >> putLine cs
                    W u2
 
 
</haskell>
 
</haskell>
  
It can also be used to implement the models of [https://dl.acm.org/doi/pdf/10.1145/262009.262011 I/O used in earlier versions] of Haskell:
+
and conceals the use of all those <code>OI</code> values. But not all definitions will benefit from being monadic:
 
 
* dialogues:
 
 
 
:<haskell>
 
runD :: ([Response] -> [Request]) -> OI -> ()
 
runD d u = foldr (\(!_) -> id) () $ yet $ \ l -> zipWith respond (d l) (partsOI u)
 
 
 
yet :: (a -> a) -> a
 
yet f = f (yet f)
 
  
 +
<haskell>
 
partsOI :: OI -> [OI]
 
partsOI :: OI -> [OI]
partsOI u = let !(u1, u2) = partOI u in u1 : partsOI u2
+
partsOI = do (u1, u2) <- partOI; return (u1 : partsOI u2)
 
 
respond :: Request -> OI -> Response
 
respond Getq    u = let !c = getChar u in Getp c
 
respond (Putq c) u = let !_ = putChar c u in Putp
 
 
 
data Request  = Getq | Putq Char
 
data Response = Getp Char | Putp
 
</haskell>
 
 
 
* continuations:
 
 
 
:<haskell>
 
type Answer = OI -> ()
 
 
 
runK :: Answer -> OI -> ()
 
runK a u = a u
 
 
 
doneK :: Answer
 
doneK = \ u -> let !_ = partOI u in ()
 
 
 
getcharK :: (Char -> Answer) -> Answer
 
getcharK k  = \ u -> let !(u1, u2) = partOI u in
 
                      let !c        = getChar u1 in
 
                      let !a        = k c in
 
                      a u2
 
 
 
putcharK :: Char -> Answer -> Answer
 
putcharK c a = \ u -> let !(u1, u2) = partOI u in
 
                      let !_        = putChar c u1 in
 
                      a u2
 
 
</haskell>
 
</haskell>
  
 
== Further reading ==
 
== Further reading ==
  
If you've managed to get all the way to here:
+
* [[Merely monadic]] provides more information about Haskell's implementation of the monadic interface.
 
 
* [[Output/Input]] goes into more detail about the type <code>OI -> a</code>.
 
  
* For those who prefer it, John Launchbury and Simon Peyton Jones's [https://galois.com/wp-content/uploads/2014/08/pub_JL_StateInHaskell.pdf State in Haskell] explains the state-passing approach currently in widespread use.
+
* For those who prefer it, John Launchbury and Simon Peyton Jones's [https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.52.3656&rep=rep1&type=pdf State in Haskell] explains the state-passing approach currently in widespread use.
  
 
[[Category:Tutorials]]
 
[[Category:Tutorials]]

Latest revision as of 05:15, 10 September 2022

The IO type serves as a tag for operations (actions) that interact with the outside world. The IO type is abstract: no constructors are visible to the user. [...]

The Haskell 2010 Report (page 95 of 329).

Instead of the conventional approach:

data IO  -- abstract

getChar ::         IO Char
putChar :: Char -> IO ()
         

describe IO using other types, ones with no visible constructors:

data (->) a b  -- abstract
data OI        -- also abstract

type IO a =         OI -> a
getChar ::          OI -> Char  -- an I/O action
putChar :: Char -> (OI -> ())   -- a function with one parameter, whose result is an I/O action
         

Starting up

Main.main is also an I/O action:

main :: OI -> ()

Therefore an internal subroutine in the Haskell implementation provides each running program with an initial OI value. However, most programs will require more than just one:

getLine         :: OI -> [Char]
getLine u        = let !(u1, u2) = partOI u in
                   let !c = getChar u1 in
                   if c == '\n' then
                     []
                   else
                     let !cs = getLine u2
                     in c:cs

putLine         :: [Char] -> OI -> ()
putLine (c:cs) u = let !(u1, u2) = partOI u in
                   let !_ = putChar c u1 in
                   putLine cs u2
putLine []     u = putChar '\n' u

So another I/O action is needed in order to access that same internal subroutine from Haskell:

partOI :: OI -> (OI, OI)

If more than two new OI values are needed:

partsOI :: OI -> [OI]
partsOI u = let !(u1, u2) = partOI u in u1 : partsOI u2
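
The tree-like way in which partOI and partsOI hand out new values can be explored without any real I/O. The following is only a sketch: it assumes a hypothetical stand-in for the abstract OI type in which each value is just a position in an infinite binary tree. The names rootOI and firstParts, and the Integer representation, are illustrative assumptions and not part of the interface described here.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Hypothetical stand-in for the abstract OI type: each value is merely a
-- position in an infinite binary tree, so no interaction takes place.
newtype OI = OI Integer deriving (Eq, Show)

-- Assumed initial value; in the text this is supplied by the implementation.
rootOI :: OI
rootOI = OI 1

-- Splitting a value yields the two children of its position.
partOI :: OI -> (OI, OI)
partOI (OI n) = (OI (2 * n), OI (2 * n + 1))

-- Same definition as in the text: an unbounded supply of new values.
partsOI :: OI -> [OI]
partsOI u = let !(u1, u2) = partOI u in u1 : partsOI u2

-- The first k values handed out, starting from the assumed root.
firstParts :: Int -> [OI]
firstParts k = take k (partsOI rootOI)
```

In this model every value produced by partsOI sits at a different position in the tree, so no two of them are equal. That mirrors the requirement that each OI value be distinct and used at most once.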

But why are these abstract OI values needed at all - what purpose do they serve?

Actions and functions

Looking more closely at getLine, putLine and partsOI reveals an interesting fact:

  • each OI value is only used once (if at all).

Why is this important? Because in Haskell, functions have their basis in mathematics. That imposes certain requirements on functions, including this one:

  • if a function's result changes, it is only because one or more of its arguments has changed.

If they're always used with different OI values, then I/O actions can be used like functions:

import Data.List (unfoldr)

partsOI :: OI -> [OI]
partsOI = unfoldr (Just . partOI)

even if they're defined using subroutines:

foreign import "oi_part"    partOI  ::         OI -> (OI, OI)
foreign import "oi_getchar" getChar ::         OI -> Char
foreign import "oi_putchar" putChar :: Char -> OI -> ()
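
One way to experiment with subroutine-backed actions in a stock GHC is to model them with unsafePerformIO. This is a rough sketch under several assumptions, not how any Haskell implementation actually provides OI: here an OI value is a mutable flag recording whether it has been used, the "outside world" is an in-memory log rather than a terminal, and the helper names newOI, useOI, outputLog and runLogged are all hypothetical. It is intended for unoptimised builds or GHCi, since optimisation can interfere with code built on unsafePerformIO.

```haskell
{-# LANGUAGE BangPatterns #-}
import Prelude hiding (putChar)
import Control.Exception (evaluate)
import Data.IORef (IORef, newIORef, readIORef, atomicModifyIORef')
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical model: an OI value is a flag that is set on first use.
newtype OI = OI (IORef Bool)

newOI :: IO OI
newOI = OI <$> newIORef False

-- Perform an effect on behalf of an OI value, rejecting any second use.
{-# NOINLINE useOI #-}
useOI :: OI -> IO a -> a
useOI (OI r) act = unsafePerformIO $ do
  used <- atomicModifyIORef' r (\b -> (True, b))
  if used then error "OI value used more than once" else act

partOI :: OI -> (OI, OI)
partOI u = useOI u ((,) <$> newOI <*> newOI)

-- Stand-in for the outside world: characters written so far, newest first.
{-# NOINLINE outputLog #-}
outputLog :: IORef [Char]
outputLog = unsafePerformIO (newIORef [])

putChar :: Char -> OI -> ()
putChar c u = useOI u (atomicModifyIORef' outputLog (\cs -> (c : cs, ())))

-- Same definition as in the text.
putLine :: [Char] -> OI -> ()
putLine (c:cs) u = let !(u1, u2) = partOI u in
                   let !_ = putChar c u1 in
                   putLine cs u2
putLine []     u = putChar '\n' u

-- Plays the role of the internal subroutine: supplies the initial OI value,
-- runs the action, then returns everything written to the log in order.
runLogged :: (OI -> ()) -> IO String
runLogged action = do
  u <- newOI
  _ <- evaluate (action u)
  reverse <$> readIORef outputLog
```

With these definitions, runLogged (putLine "hi") should leave the three characters 'h', 'i' and a newline in the log. Applying two actions to the same OI value makes the second call to useOI fail, which is the model's way of enforcing the single-use rule.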

The need for an OI value also helps to prevent I/O actions from being used as subroutines:

trace :: [Char] -> a -> a
trace msg x = case putLine msg of !_ -> x  -- how is this supposed to work?

Monadic actions

The monadic interface:

instance Monad ((->) OI) where
    return x =  \ u -> let !_ = partOI u in x
    m >>= k  =  \ u -> let !(u1, u2) = partOI u in
                       let !x = m u1 in
                       let !y = k x u2 in
                       y

allows getLine and putLine to be defined more compactly:

getLine :: OI -> [Char]
getLine = do c <- getChar
             if c == '\n' then
               return []
             else
               do cs <- getLine
                  return (c:cs)

putLine :: [Char] -> OI -> ()
putLine []     = putChar '\n'
putLine (c:cs) = putChar c >> putLine cs

and conceals the use of all those OI values. But not all definitions will benefit from being monadic:

partsOI :: OI -> [OI]
partsOI = do (u1, u2) <- partOI; return (u1 : partsOI u2)
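
To try the monadic interface in a current GHC, two adjustments are needed: base already provides a Monad instance for every function type (->) r, and Monad now requires Functor and Applicative instances. The sketch below therefore wraps OI -> a in a newtype. The names Action, runAction, whoAmI and pairOfIds are hypothetical, and OI is again replaced by an assumed pure stand-in (a position in a binary tree), so that the flow of OI values can be observed without performing I/O.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Hypothetical pure stand-in for the abstract OI type.
newtype OI = OI Integer deriving (Eq, Show)

partOI :: OI -> (OI, OI)
partOI (OI n) = (OI (2 * n), OI (2 * n + 1))

-- Wrapper around OI -> a, avoiding a clash with base's instance for (->) r.
newtype Action a = Action { runAction :: OI -> a }

instance Functor Action where
  fmap f (Action m) = Action (\u -> f (m u))

instance Applicative Action where
  pure x = Action (\u -> let !_ = partOI u in x)
  Action mf <*> Action mx =
    Action (\u -> let !(u1, u2) = partOI u in
                  let !f = mf u1 in
                  let !x = mx u2 in
                  f x)

instance Monad Action where
  Action m >>= k =
    Action (\u -> let !(u1, u2) = partOI u in
                  let !x = m u1 in
                  let !y = runAction (k x) u2 in
                  y)

-- Toy primitive for inspection only: report which OI value was received.
whoAmI :: Action Integer
whoAmI = Action (\(OI n) -> n)

-- Two uses of the same action receive two different OI values.
pairOfIds :: Action (Integer, Integer)
pairOfIds = do
  a <- whoAmI
  b <- whoAmI
  return (a, b)
```

Running runAction pairOfIds (OI 1) gives each use of whoAmI its own OI value, even though no OI value appears in the definition of pairOfIds. The toy whoAmI exists only to make this visible; a genuinely abstract OI type would offer no way to tell its values apart.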

Further reading

  • Merely monadic provides more information about Haskell's implementation of the monadic interface.
  • For those who prefer it, John Launchbury and Simon Peyton Jones's State in Haskell explains the state-passing approach currently in widespread use.