Difference between revisions of "IO, partible-style"

From HaskellWiki
Jump to navigation Jump to search
(Sections reorganised; extra content added)
m (Extra reference added)
Line 1: Line 1:
  +
<div style="border-left:1px solid lightgray; padding: 1em" alt="blockquote">
  +
The difficulty [with new or unfamiliar ways of looking at things] is considerably greater in the case of practical programmers for whom an abstract concept [...] has little reality until they can clothe it with a representation and so understand what it is that they are dealing with.
  +
  +
<tt>[https://www.cs.cmu.edu/~crary/819-f09/Strachey67.pdf Fundamental Concepts in Programming Languages], Christopher Strachey.</tt>
  +
</div>
  +
<span> </span>
 
<div style="border-left:1px solid lightgray; padding: 1em" alt="blockquote">
 
<div style="border-left:1px solid lightgray; padding: 1em" alt="blockquote">
 
It is interesting that novices in lazy functional programming in general expect that there is some direct (side-effecting) I/O using a function call.
 
It is interesting that novices in lazy functional programming in general expect that there is some direct (side-effecting) I/O using a function call.

Revision as of 06:28, 21 April 2022

The difficulty [with new or unfamiliar ways of looking at things] is considerably greater in the case of practical programmers for whom an abstract concept [...] has little reality until they can clothe it with a representation and so understand what it is that they are dealing with.

Fundamental Concepts in Programming Languages, Christopher Strachey.

It is interesting that novices in lazy functional programming in general expect that there is some direct (side-effecting) I/O using a function call.

A Partial Rehabilitation of Side-Effecting I/O:, Manfred Schmidt-Schauß.

...like how I/O works in Standard ML?

val echoML    : unit -> unit
fun echoML () = let val c = getcML () in
                if c = #"\n" then
                  ()
                else
                  let val _ = putcML c in
                  echoML ()
                  end
                end

Alright, now look at this:

echo         :: OI -> ()
echo u       =  let !(u1:u2:u3:_) = parts u
                    !c            = primGetChar u1 in
                if c == '\n' then
                  ()
                else
                  let !_ = primPutChar c u2 in
                  echo u3

So how is this possible?


Wadler's echo

Those two versions of that small program are based on the running example from Philip Wadler's How to Declare an Imperative. If we compare the two:

val echoML    : unit -> unit
fun echoML () =
                let val c = getcML () in
                if c = #"\n" then
                  ()
                else
                  let val _ = putcML c in
                  echoML ()
                  end
                end
echo   :: OI -> ()
echo u =  let !(u1:u2:u3:_) = parts u
              !c            = primGetChar u1 in
          if c == '\n' then
            ()
          else
            let !_ = primPutChar c u2 in
            echo u3

--

...we can see just how similar the two versions of echo really are: apart from the obvious changes of syntax and names:

  • the Haskell version replaces the unit arguments for echoML and getcML,
  • and provides an extra argument for putcML,
  • with the replacement parameter u being used to define the new local bindings u1, u2 and u3 as the result of a call to parts.

So for the price of some extra calls and bindings, we can have SML-style I/O in Haskell. Furthermore, as the prevailing definition for Standard ML has been available since 1997, there should be plenty of I/O tutorials to choose from...


Resisting temptation

If you're now thinking about using something like:

primitive might_get_Char :: () -> Char
primitive might_put_Char :: Char -> ()

to achieve a more direct translation...don't - it might for this small program, but it just isn't reliable in general. Why?

But, if after all that, you're still not convinced...a small functional language which combines non-strictness with effect-centric definitions in a similar fashion can be found here - "have fun!"


OI: what is it?

OI is an abstract partible type:

data OI a
primitive primPartOI :: OI -> (OI, OI)

instance Partible OI where
    part = primPartOI

Like primPartOI, most other primitives for the OI type also accept an OI-value as their last (or only) argument e.g:

primitive primGetChar :: OI -> Char
primitive primPutChar :: Char -> OI -> ()
        

For consistency, the last argument of a OI-based definition should also be an OI-value:

interact      :: (String -> String) -> OI -> ()
interact d u  =  let !(u1, u2) = part u in
                 putStr (d $ getContents u1) u2

putStr        :: String -> OI -> ()
putStr s u    =  foldr (\(!_) -> id) () $ zipWith primPutChar s $ parts u

getContents   :: OI -> String
getContents u =  case map getChar (parts u) of
                   l@(!c:_) -> l
                   l        -> l


Some annoyances

Extra parameters and arguments

As noted by Sigbjørn Finne and Simon Peyton Jones in Programming Reactive Systems in Haskell, passing around all those OI-values correctly can be tedious for large definitions. Fortunately, they can be also used more concisely with a variety of abstract interfaces:

type A b c   =  (OI -> b) -> (OI -> c)

arr          :: (b -> c) -> A b c
arr f        =  \ c' u -> f $! c' u


both         :: A b c -> A b' c' -> A (b, b') (c, c')
f' `both` g' =  \ c' u -> let !(u1:u2:u3:_) = parts u in
                          let !(x, x')      = c' u1 in
                          let !y            = f' (unit x) u2 in
                          let !y'           = g' (unit x') u3 in
                          (y, y')                           
                where
                  unit x u = let !_ = part u in x
type C a         =  (a, OI)

extract          :: C a -> a
extract (x, u)   =  let !_ = part u in x

duplicate        :: C a -> C (C a)
duplicate (x, u) =  let !(u1, u2) = part u in
                    ((x, u1), u2)

extend           :: (C a -> b) -> C a -> C b
extend h (x, u)  =  let !(u1, u2) = part u in
                    let !y        = h (x, u1) in
                    (y, u2)
type M a   =  OI -> a

unit       :: a -> M a
unit x     =  \ u -> let !_ = part u in x 

bind       :: M a -> (a -> M b) -> M b
bind m k   =  \ u -> let !(u1, u2) = part u in
                     let !x = m u1 in
                     let !y = k x u2 in
                     y
...provided you followed that hint about putting the OI argument last:
interact      :: (String -> String) -> M ()
putStr        :: String -> M ()
getContents   :: M String

primitive primGetChar :: M Char
primitive primPutChar :: Char -> M ()
        
You didn't put the OI argument last? Oh well, there's always the applicative interface...
Why bother with monadic I/O?

Considering that the concept of monads is not likely to disappear from the functional programming landscape any time soon, it is vital that we, as the functional programming community, somehow overcome the problems novices encounter when first studying monads.

Visual Support for Learning Monads, Tim Steenvoorden, Jurriën Stutterheim, Erik Barendsen and Rinus Plasmeijer.

Futhermore:

  • Haskell is now used far and wide, so good ol' "search and replace" is a non-starter!
  • There are some who still prefer C, and there are others who are content with IO - convincing them to switch will probably take a lot more than a solitary page on some wiki!

So it's actually rather fortunate that IO can be defined so easily using OI:

type IO a = M a  -- = OI -> a


Mutable state shared between different types

It's been known for a very long time in the SML community that naïve declarations of operations for mutable state breaks type safety:

primitive newPolyRef :: a -> OI -> PolyRef a
primitive readPolyRef :: PolyRef a -> OI -> a
primitive writePolyRef :: PolyRef a -> a -> OI -> ()

kah_BOOM u = let …
                 !vehicle = newPolyRef undefined u1
                 !_       = writePolyRef ("0" :: [Char]) u2
                 !crash   = readPolyRef vehicle u3
                 burn     = 1 :: Int
             in
                 crash + burn

The solution used by SML is to render the use of mutable state monomorphic through the use of dedicated syntax:

let val r = ref (…)
         ⋮
Options for Haskell

One alternative would be to extend type signatures to support monomorphic type-variables:

primitive newIORef   :: monomo a . a -> OI -> IORef a
primitive readIORef  :: monomo a . IORef a -> OI -> a
primitive writeIORef :: monomo a . IORef a -> a -> OI -> ()

{- would be rejected by the extended type system: 
kah_BOOM u = let !(u1:u2:u3:_) = parts u
                 !vehicle      = newIORef undefined u1          -- vehicle :: monomo a . IORef a
                 !_            = writeIORef ("0" :: [Char]) u2  -- vehicle :: IORef [Char]
                 !crash        = readIORef vehicle u3           -- vehicle :: IORef [Char] ≠ IORef Int
                 burn          = 1 :: Int
             in
                 crash + burn
-}

In standard Haskell, there are a few places where this already occurs, albeit implicitly:

  • on the parameters of a function,
{- will be rejected by the standard Haskell type system

ker_plunk f = (f True, f 'b')

-}
  • and the type-parameter for a type class.
primitive newIORef   :: Eq a => a -> OI -> IORef a
primitive readIORef  :: Eq a => IORef a -> OI -> a
primitive writeIORef :: Eq a => IORef a -> a -> OI -> ()

One other alternative is to make IO into an abstract data type:

module Abstract.IO
(
    Monad (..),
    getChar, putChar, 
    newIORef, readIORef, writeIORef,
                 
)
where

instance Monad ((->) OI) where
     return = unitIO
     (>>=)  = bindIO


getChar    :: IO Char
getChar    =  primGetChar

putChar    :: Char -> IO ()
putChar    =  primPutChar

newIORef   :: a -> IO (IORef a)
newIORef   =  primNewIORef

readIORef  :: IORef a -> IO a
readIORef  =  primReadIORef

writeIORef :: IORef a -> a -> IO ()
writeIORef =  primWriteIORef
                 

 -- these are now local, private entities --
type IO a = OI -> a

unitIO     :: a -> IO a
unitIO x   =  \ u -> let !_ = part u in x 

bindIO     :: IO a -> (a -> IO b) -> IO b
bindIO m k =  \ u -> let !(u1, u2) = part u in
                     let !x = m u1 in
                     let !y = k x u2 in
                     y

data OI a
primitive primPartOI  :: OI -> (OI, OI)

primitive primGetChar :: OI -> Char
primitive primPutChar :: Char -> OI -> ()
        

data IORef
primitive primNewIORef    :: a -> OI -> IORef a
primitive primReadIORef   :: IORef a -> OI -> a
primitive primWriteIORef  :: IORef a -> a -> OI -> ()
        

With IO now abstract, the only way to use IO-actions is by using:

  • the visible IO operations: getChar, putChar, etc.
  • the monadic interface - Monad(return, (>>=), …) (or via Haskell's do-notation).

So how does making IO an ADT prevent the polymorphic sharing of mutable state? It's all to do with the type of (>>=) when used with IO-actions:

(>>=) :: IO a -> (a -> IO b) -> IO b

in particular, the type of the second argument:

(a -> IO b)

...it's a function, so the value it receives will be rendered monomorphic in it's result (of type IO b).

As (>>=) is now the only IO operation which can retrieve a result from an IO-action, mutable state (IORef …) simply cannot be used polymorphically.


How GHC does I/O

newtype IO a = IO (State# RealWorld -> (# State# RealWorld, a #))

...you may have noticed that we've already made liberal use of one Haskell extension - bang-patterns - and it would be useful to stay as close as possible to standard Haskell, so we'll simplify matters:

newtype IO a = IO (IOState -> (IOState, a))  -- unboxed-tuple replaced by standard one 

type IOState = State# RealWorld

Now to make the changes:

  • to the type - IOState uses an OI-value:
newtype IOState = IOS OI
  • to the I/O-specific operations - each one will use the OI-value in the initial state to provide two new OI-values: one to make up the final state; the other being used by the OI-primitive:
getChar   :: IO Char
getChar   =  IO $ \(IOS u) -> let !(u1, u2) = part u
                                  !c        = primGetChar u1
                              in  (IOS u2, c)

putChar   :: Char -> IO ()
putChar c =  IO $ \(IOS u) -> let !(u1, u2) = part u
                                  !t        = primPutChar c u1
                              in  (IOS u2, t)

 -- etc.
  • to the overloaded operations - you've probably seen it all before:
instance Monad IO where
    return x   = IO $ \(!s) -> (s, x)
    IO m >>= k = IO $ \(!s) -> let !(s', x) = m s
                                   !(IO w)  = k x
                               in  w s'
(...if you haven't: it's ye ol' pass-the-planet state-passing technique.)

One aspect which doesn't change is IO and its operations being abstract. In fact, the need is even more pressing: in addition to preventing the misuse of certain OI-operations, being an abstract data type prevents IOState-values from being erroneously reused.


Conclusions

  • Why is Haskell I/O monadic - to avoid having to use extra arguments and parameters everywhere.
  • Why is Haskell I/O abstract - to ensure I/O works as intended, by preventing the misuse of internal data.
  • Why is Haskell I/O unusual - because of Haskell's nonstrict evaluation and thus its focus on referential transparency, contrary to most other programming languages.

Further reading

If you've managed to get all the way to here:

  • State in Haskell by John Launchbury and Simon Peyton Jones is also worth reading, if you're interested in how GHC eventually arrived at its current definition of IO.
  • Output/Input provides more details about the type OI -> a.