Difference between revisions of "Output/Input"

From HaskellWiki
Jump to: navigation, search
m (More minor changes)
(Closed mostly-redundant page)
 
(28 intermediate revisions by the same user not shown)
Line 1: Line 1:
Let me guess...you've read every other guide, tutorial, lesson and introduction and none of them have helped - you still don't understand I/O in Haskell.
 
  +
[[Category:Pages to be removed]]
   
Alright then - have a look at this:
 
  +
There is currently no text in this page.
 
  +
You can <span style="color:brown;">search for this page title</span> in other pages,
<haskell>
 
  +
<span style="color:brown;">search the related logs</span>,
data OI -- abstract, primitive
 
  +
or <span style="color:brown;">log in</span> to create this page.
 
partOI :: OI -> (OI, OI) --
 
getchar :: OI -> Char -- primitives
 
putchar :: Char -> OI -> () --
 
 
seq :: a -> b -> b -- also primitive
 
 
instance Partible OI where ...
 
 
class Partible a where
 
part :: a -> (a, a)
 
parts :: a -> [a]
 
.
 
.
 
.
 
</haskell>
 
 
No up-front explanation; I'm guessing you've seen more than enough of those, so I'm trying something different. I will explain it later...
 
 
Yes, of course there's more to Haskell I/O than <code>getchar</code> and <code>putchar</code>; I've downsized it for convenience. If you want, you can add the rest afterwards...
 
 
Yes, they're somewhat arcane, but they can be used to emulate all the classic approaches to I/O in Haskell, albeit in miniature:
 
 
<haskell>
 
module ClassicIO where
 
import Prelude(Char, String)
 
import Prelude(($), (.))
 
import Data.List(map, foldr, zipWith)
 
import OutputInput
 
import Partible
 
 
-- simple text --
 
 
{- main :: (String -> String) -}
 
 
runMain_text :: (String -> String) -> OI -> ()
 
runMain_text main = \u -> case part u of
 
(u1, u2) ->
 
putchars (main (getchars u1)) u2
 
 
getchars :: OI -> String
 
getchars = map getchar . parts
 
 
putchars :: String -> OI -> ()
 
putchars s = foldr seq () . zipWith putchar s . parts
 
 
 
-- dialogues --
 
 
{- main :: Dialogue -}
 
 
runMain_dial :: Dialogue -> OI -> ()
 
runMain_dial main = \u -> foldr seq () $ yet $
 
\l -> zipWith respond (main l) (parts u)
 
 
type Dialogue = [Response] -> [Request]
 
 
data Request = Getq | Putq Char
 
data Response = Getp Char | Putp
 
 
yet :: (a -> a) -> a
 
yet f = f (yet f)
 
 
respond :: Request -> OI -> Response
 
respond Getq = \u -> case getchar u of c -> Getp c
 
respond (Putq c) = \u -> seq (putchar c u) Putp
 
 
 
-- continuations --
 
 
{- main :: (() -> IOResult) -> IOResult -}
 
 
runMain_cont :: ((() -> IOResult) -> IOResult) -> OI -> ()
 
runMain_cont main = call (main done)
 
 
newtype IOResult = R (OI -> ())
 
 
call :: IOResult -> OI -> ()
 
call (R a) = a
 
 
done :: () -> IOResult
 
done () = R $ \ u -> part u `seq` ()
 
 
getchar_cont :: (Char -> IOResult) -> IOResult
 
getchar_cont k = R $ \u -> case part u of
 
(u1, u2) ->
 
case getchar u1 of
 
c -> seq c (call (k c) u2)
 
 
putchar_cont :: Char -> (() -> IOResult) -> IOResult
 
putchar_cont c k = R $ \u -> case part u of
 
(u1, u2) ->
 
seq (putchar c u) (call (k ()) u2)
 
 
-- state-passing --
 
 
{- main :: IOState -> ((), IOState) -}
 
 
runMain_stat :: (IOState -> ((), IOState)) -> OI -> ()
 
runMain_stat main = \u -> seq (main (ini_st u)) ()
 
 
newtype IOState = S OI
 
 
ini_st :: OI -> IOState
 
ini_st = S
 
 
getchar_stat :: IOState -> (Char, IOState)
 
getchar_stat (S u) = case part u of
 
(u1, u2) ->
 
case getchar u1 of
 
c -> seq c (c, S u2)
 
 
putchar_stat :: Char -> IOState -> ((), IOState)
 
putchar_stat c (S u) = case part u of
 
(u1, u2) ->
 
seq (putchar c u1) ((), S u2)
 
 
-- and those weird, fickle things ;-)
 
 
{- main :: IO () -}
 
 
runMain_wfth :: IO () -> OI -> ()
 
runMain_wfth main = main
 
 
type IO a = OI -> a
 
 
getchar_wfth :: IO Char
 
getchar_wfth = getchar
 
 
putchar_wfth :: Char -> IO ()
 
putchar_wfth = putchar
 
 
unit :: a -> IO a
 
unit x = \u -> part u `seq` x
 
 
bind :: IO a -> (a -> IO b) -> IO b
 
bind m k = \u -> case part u of
 
(u1, u2) -> (\x -> x `seq` k x u2) (m u1)
 
</haskell>
 
 
Here are examples for each of those approaches:
 
 
<haskell>
 
module Echoes where
 
import Prelude(String, Char(..), Eq(..))
 
import Prelude(($))
 
import ClassicIO
 
import OutputInput(runOI)
 
 
echo_text :: String -> String
 
echo_text (c:cs) = if c == '\n' then [] else c : echo_text cs
 
 
echo_dial :: Dialogue
 
echo_dial p = Getq :
 
case p of
 
Getp c : p' ->
 
if c == '\n' then
 
[]
 
else
 
Putq c :
 
case p' of
 
Putp : p'' -> echo_dial p''
 
 
echo_cont :: (() -> IOResult) -> IOResult
 
echo_cont k = getchar_cont $ \c ->
 
if c == '\n' then
 
k ()
 
else
 
putchar_cont c (\_ -> echo_cont k)
 
 
echo_stat :: IOState -> ((), IOState)
 
echo_stat s = case getchar_stat s of
 
(c, s') ->
 
if c == '\n' then
 
((), s')
 
else
 
case putchar_stat c s' of
 
(_, s'') -> echo_stat s''
 
 
echo_wfth :: IO ()
 
echo_wfth = getchar_wfth `bind` \c ->
 
if c == '\n' then
 
unit ()
 
else
 
putchar_wfth c `bind` \_ -> echo_wfth
 
</haskell>
 
 
What was that - using <code>Prelude.seq</code> that way won't work in Haskell 2010? ''You are correct!''
 
 
This should work as expected[1][2]:
 
 
<haskell>
 
-- for GHC 8.6.5
 
{-# LANGUAGE CPP #-}
 
#define during seq
 
module Sequential(seq) where
 
import qualified Prelude(during)
 
 
infixr 0 `seq`
 
seq :: a -> b -> b
 
seq x y = Prelude.during x (case x of _ -> y)
 
</haskell>
 
 
It didn't work? Try this instead:
 
 
<haskell>
 
-- for GHC 8.6.5
 
{-# LANGUAGE CPP #-}
 
#define during seq
 
module Sequential(seq) where
 
import qualified Prelude(during)
 
import GHC.Base(lazy)
 
 
infixr 0 `seq`
 
seq :: a -> b -> b
 
seq x y = Prelude.during x (lazy y)
 
</haskell>
 
 
As for those extensions - they stay with each definition.
 
 
That still didn't work? Well, give this a try:
 
 
<haskell>
 
yet :: (a -> a) -> a
 
yet f = y where y = f y
 
</haskell>
 
 
Now that we're firmly on the topic of implementation details, did you notice how easy it was to define that allegedly ''warm, fuzzy''[3] <code>IO</code> type using this curious new <code>OI</code> type, and those primitives?
 
 
Sometimes that can be a hint that doing the opposite will be difficult or even impossible to do while staying within standard Haskell 2010. As it happens, this is one of those cases...
 
 
To define <code>OI</code>, <code>partOI</code>, <code>getchar</code> and <code>putchar</code> will require:
 
 
* modifying your preferred Haskell implementation - lots of work;
 
 
* using some other language for the definitions, with Haskell then calling the foreign code - extra work to deal with two different languages;
 
 
* using unsafe or implementation-specific primitives - work needed to avoid conflicts with Haskell semantics;
 
 
* using implementation-specific extensions - work needed to track relevant extensions, and possible conflicts with Haskell semantics.
 
 
For now, I'll just use the extensions - they're ugly, but at least they'll be contained, as they are in those alternate definitions of <code>seq</code>. But who knows - if this approach to I/O proves useful enough, it might make its way into a future Haskell standard...that's how <code>IO</code> happened[4].
 
 
In the meantime, take a deep breath:
 
 
<haskell>
 
-- for GHC 8.6.5
 
{-# LANGUAGE MagicHash, UnboxedTuples #-}
 
module OutputInput(OI, runOI, seq, getchar, putchar) where
 
import Prelude(Char, String)
 
import Prelude(($), (++), putChar, getChar, error)
 
import Partible
 
import Sequential
 
import GHC.Base(IO(..), State#, MutVar#, RealWorld)
 
import GHC.Base(seq#, realWorld#, newMutVar#, atomicModifyMutVar#)
 
 
data OI = OI OI#
 
 
instance Partible OI where
 
part = partOI
 
 
partOI :: OI -> (OI, OI)
 
partOI (OI r) = case expire# "partOI" r realWorld# of
 
s -> case newMutVar# () s of
 
(# s', r1 #) ->
 
case newMutVar# () s' of
 
(# _, r2 #) -> (OI r1, OI r2)
 
 
runOI :: (OI -> a) -> IO a
 
runOI g = IO $ \s -> case newMutVar# () s of
 
(# s', r #) -> seq# (g (OI r)) s'
 
 
getchar :: OI -> Char
 
getchar (OI r) = case expire# "getchar" r realWorld# of
 
s -> case undo# getChar s of
 
(# _, c #) -> c
 
 
putchar :: Char -> OI -> ()
 
putchar c (OI r) = case expire# "putchar" r realWorld# of
 
s -> case undo# (putChar c) s of
 
(# _, x #) -> x
 
 
 
-- Local definitions
 
--
 
type OI# = MutVar# RealWorld ()
 
 
expire# :: String -> MutVar# s () -> State# s -> State# s
 
expire# name r s = case atomicModifyMutVar# r flick s of
 
(# s', _ #) -> s'
 
where
 
flick :: () -> (a, ())
 
flick x@() = (error nowUsed, x)
 
 
nowUsed = name ++ ": argument already used"
 
 
undo# :: IO a -> State# RealWorld -> (# State# RealWorld, a #)
 
undo# (IO a) = a
 
</haskell>
 
 
Now you can start breathing again :-)
 
 
<haskell>
 
module Partible where
 
 
class Partible a where
 
part :: a -> (a, a)
 
parts :: a -> [a]
 
 
-- Minimal complete definition: part or parts
 
part u = case parts u of u1:u2:_ -> (u1, u2)
 
parts u = case part u of (u1, u2) -> u1 : parts u2
 
</haskell>
 
 
If you remember, I dispensed with an up-front explanation to try something different. Now that you've
 
seen just how different this all is, here's the explanation...
 
 
That abstract <code>partOI</code> and its overloaded associates <code>part</code> and <code>parts</code>? They help an optimising Haskell implementation to determine when it's safe to use those optimisations. Consider this definition:
 
 
<haskell>
 
testme n = n^2 + n^2
 
</haskell>
 
 
One simple optimisation would be to replace the duplicates of <code>n^2</code> with a single, shared local definition:
 
 
<haskell>
 
testme n = let x = n^2 in x + x
 
</haskell>
 
 
This definition:
 
 
<haskell>
 
main' u = putchars "ha" u `seq` putchars "ha" u
 
 
</haskell>
 
 
would likewise be rewritten, with the result being:
 
 
<haskell>
 
main' u = let x = putchars "ha" u in x `seq` x
 
</haskell>
 
 
but, as noted by Philip Wadler[5]:
 
 
<blockquote>''[...] the laugh is on us: the program prints only a single <code>"ha"</code>, at the time variable <br><code>x</code> is bound. In the presence of side effects, equational reasoning in its simplest form <br>becomes invalid.''</blockquote>
 
 
''Equational reasoning'' is the basis for that simple optimisation and many others in implementations like GHC - so far they've been serving us quite well.
 
 
What - just treat I/O-centric definitions as some special case by modifying GHC? Haskell implementations like GHC are complicated enough as is!
 
 
The problem is being caused by the code being treated as though it's pure, so let's modify the code instead. In this case, the easiest solution is to make all calls to I/O-centric definitions unique:
 
 
<haskell>
 
main u = case part u of
 
(u1, u2) ->
 
putchars "ha" u1 `seq` putchars "ha" u2
 
</haskell>
 
 
But what about:
 
 
<haskell>
 
oops g h u = g u `seq` h u
 
 
main' = oops (putchars "ha") (putchars "ha")
 
</haskell>
 
 
Will the laugh be on us, again?
 
 
This is Haskell, not Clean[6] - there are no uniqueness types to help fend off such potentially-troublesome expressions. For now, the simplest way to make sure <code>OI</code> values are only used once is to have the implementation treat their reuse as being invalid e.g. by throwing an exception or raising an error to stop the offending program.
 
 
In the prototype implementation, the maintenance of this all-important ''single-use'' property is performed by <code>expire#</code>.
 
 
Now for the much-maligned[7] <code>seq</code>...you could be tempted into avoiding its use by using a new data type:
 
 
<haskell>
 
newtype Result a = Is a
 
 
getchar' :: OI -> Result Char
 
putchar' :: Char -> OI -> Result ()
 
</haskell>
 
 
and case-expressions:
 
 
<haskell>
 
respond' :: Request -> OI -> Response
 
respond' Getq = \u -> case getchar' u of Is c -> Getp c
 
respond' (Putq c) = \u -> case putchar' c u of Is _ -> Putp
 
</haskell>
 
 
But before you succumb:
 
 
<haskell>
 
unit_Result :: a -> Result a
 
unit_Result = Is
 
 
bind_Result :: Result a -> (a -> Result b) -> Result b
 
bind_Result (Is x) k = k x
 
</haskell>
 
 
Oh look - <code>Result</code> is one of '''those''' types...
 
 
The bang-pattern extension? So you can instead write:
 
 
<haskell>
 
respond'' :: Request -> OI -> Response
 
respond'' Getq = \u -> let !c = getchar u in Getp c
 
respond'' (Putq c) = \u -> let !z = putchar c u in Putp
 
</haskell>
 
 
As you can see, <code>z</code> isn't used anywhere - there is no need for it. This being Haskell, if it isn't needed, it normally isn't evaluated. For now, the bang-pattern extension modifies the evaluation of
 
<code>z</code> in order to prevent <code>respond''</code> being rewritten as:
 
 
<haskell>
 
respond'' :: Request -> OI -> Response
 
respond'' Getq = \u -> let !c = getchar u in Getp c
 
respond'' (Putq c) = \u -> Putp
 
</haskell>
 
 
While you can use bang-patterns instead of <code>seq</code>, you'll be assuming that they appear in a future Haskell standard, and can still be used the same way - if either assumption is wrong, you might be looking at one very big clean-up job...
 
 
You could try all manner of ways to avoid using <code>seq</code> - you might even find one that you like; all well and good...but others might not. For me, the simplest way I've found to make this approach to I/O work is with <code>seq</code> - one that's actually sequential.
 
 
But maybe - after all that - you still want <code>seq</code> banished from Haskell. Perhaps you still don't understand I/O in Haskell. It could be that you're dismayed by what you've read here. Alternately, you may have seen or tried this all before, and know it doesn't work - darn...
 
 
If that's you, the corresponding language proposal[8] has a list of other articles and research papers I've found which describe or refer to other approaches - perhaps one (or more) of them will be more acceptable.
 
 
As noted by Owen Stephens[9]:
 
 
<blockquote>''I/O is not a particularly active area of research, but new approaches are still being discovered, <br>iteratees being a case in point.''</blockquote>
 
 
Who knows - the Haskell language could return to having a pure, fully-defined approach to I/O...and it could be you that finds it :-D
 
 
 
P.S: Why the name <code>OI</code>? Many years ago I was tinkering with arrows for performing I/O, labelling them <code>OI a b</code> out of expediency. More recently, I discovered a set of slides[10] describing another approach to I/O which used values of type <code>OI a</code> in a similar fashion to what I've been describing here. I've reused the name because of that similarity.
 
 
 
References:
 
 
[1] [[Sequential ordering of evaluation]]; Haskell Wiki.<br>
 
 
[2] [https://gitlab.haskell.org/ghc/ghc/-/issues/5129 Ticket# 5129: "evaluate" optimized away]; GHC bug tracker.<br>
 
 
[3] [https://www.cs.nott.ac.uk/~pszgmh/appsem-slides/peytonjones.ppt Wearing the hair shirt: a retrospective on Haskell]; Simon Peyton Jones.<br>
 
 
[4] [https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.168.4008&rep=rep1&type=pdf A History of Haskell: being lazy with class]; Paul Hudak, John Hughes, Simon Peyton Jones and Philip Wadler.<br>
 
 
[5] [https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.3579&rep=rep1&type=pdf How to Declare an Imperative]; Philip Wadler.<br>
 
 
[6] [https://clean.cs.ru.nl/Clean The Clean homepage]; Radboud University, Nijmegen, The Netherlands.<br>
 
 
[7] [https://mail.haskell.org/pipermail/haskell/2002-May/009622.html Thread: State monads don't respect the monad laws in Haskell]; Haskell mail archive.<br>
 
 
[8] [[Partibles for composing monads]]; Haskell Wiki.<br>
 
 
[9] [https://www.owenstephens.co.uk/assets/static/research/masters_report.pdf Approaches to Functional I/O]; Owen Stephens.<br>
 
 
[10] <span style="color:#ba0000">Non-Imperative Functional Programming</span>; Nobuo Yamashita.<br>
 
 
 
[[User:Atravers|Atravers]] 03:05, 20 August 2020 (UTC)
 

Latest revision as of 22:54, 16 April 2021


There is currently no text in this page. You can search for this page title in other pages, search the related logs, or log in to create this page.