Difference between revisions of "IO at work"

From HaskellWiki
Jump to navigation Jump to search
(Transferred archived content in 'Using IO', from 2002-08-25, to here)
 
(Various general improvements, post-transfer)
Line 10: Line 10:
 
</haskell>
 
</haskell>
   
Calling <code>readfirstline "/foo/bar"</code> could yield different results at different times while executing the program. For example, someone could change the file in-between, or it's a special file which seems to change anyway (terminal devices, named pipes, for example). As the Haskell compiler assumes referential transparency (for constant <code>f</code> and <code>g</code>, <code>f g</code> always yields the same result, for some built-in sense of equality), it could transform the program, calling the function more or less often, or earlier or later. Surely, you don't want that for I/O effects.
+
Calling <code>readfirstline "/foo/bar"</code> could yield different results at different times while executing the program - for example:
  +
* someone could change the file between the calls;
  +
* it's a special file which seems to change anyway (e.g. terminal devices, named pipes);
  +
* the file no longer exists (e.g. was deleted, network connection lost).
  +
  +
As the Haskell compiler assumes referential transparency (for constant <code>f</code> and <code>g</code>, <code>f g</code> always yields the same result, for some built-in sense of equality), it could transform the program, calling the function more or less often, or earlier or later. Surely, you don't want that for I/O effects.
   
 
So, there must be another solution.
 
So, there must be another solution.
   
The solution in Haskell is to keep the action I/O outside of Haskell's expression evaluation, moving it into the surrounding runtime system. The program (the function <code>main</code> in the module Main) is a value which merely describes what the program should do. The runtime evaluates <code>main</code> and executes the actions, evaluating functions or other lazy expressions when needed.
+
The one chosen for Haskell is to keep the I/O of an ''action'' (<code>IO</code> computation) outside of Haskell's expression evaluation by moving it into the surrounding runtime system. The program (the function <code>main</code> in the module Main) is then one large action - a value which merely describes what the program should do. The runtime evaluates <code>main</code> and executes the its constituent actions, evaluating functions or other lazy expressions when needed.
   
 
This is where the <code>IO</code> types come into play.
 
This is where the <code>IO</code> types come into play.
Line 20: Line 25:
 
A value of type <code>IO a</code> describes a computation, potentially with I/O effects, that yields a result of type <code>a</code> when it's really performed. <code>IO</code> is an abstract datatype, so you can't directly manipulate its internal structure.
 
A value of type <code>IO a</code> describes a computation, potentially with I/O effects, that yields a result of type <code>a</code> when it's really performed. <code>IO</code> is an abstract datatype, so you can't directly manipulate its internal structure.
   
The <code>main</code> function must have an <code>IO</code> type. The type parameter (named <code>a</code> above) is irrelevant in Haskell 98, but must be <code>()</code> (the trivial type) in Haskell 1.4.
+
The <code>main</code> function must have an <code>IO</code> type. The type parameter (named <code>a</code> above) is irrelevant since Haskell 98, but was <code>()</code> (the trivial type) for earlier versions.
   
Instead, there are some primitives (functions with no <code>IO</code> parameters, but with <code>IO</code> result), that generate <code>IO</code> values (computations), and there are combinators (functions having one or more <code>IO</code> values as parameters). The combinators usually have an <code>IO</code> result. In standard Haskell, there's 'no' combinator having <code>IO</code> parameters, but no <code>IO</code> result, except trivial ones (functions which just ignore their <code>IO</code> parameters).
+
Instead, there are some primitives (functions with no <code>IO</code> parameters, but with <code>IO</code> result), that generate <code>IO</code> values (computations), and there are combinators (functions having one or more <code>IO</code> values as parameters). The combinators usually have an <code>IO</code> result. In standard Haskell, there's ''no'' combinator having <code>IO</code> parameters, but no <code>IO</code> result, except trivial ones (functions which just ignore their <code>IO</code> parameters).
   
 
Now, let's see some examples.
 
Now, let's see some examples.
   
<haskell>
+
* <haskell>
 
putStr :: String -> IO ()
 
putStr :: String -> IO ()
 
</haskell>
 
</haskell>
   
<code>putStr</code> someString denotes the computation of printing the given string on the program's standard output, and returning a trivial value <code>()</code>.
+
:<code>putStr</code> someString denotes the computation of printing the given string on the program's standard output, and returning a trivial value <code>()</code>.
   
So, one of the simplest programs is:
+
:So, one of the simplest programs is:
   
<haskell>
+
:<haskell>
 
main = putStr "Hello world\n"
 
main = putStr "Hello world\n"
 
</haskell>
 
</haskell>
   
Other primitive actions are:
+
* Other primitive actions are:
   
<haskell>
+
:<haskell>
 
getLine :: IO String
 
getLine :: IO String
 
return :: a -> IO a
 
return :: a -> IO a
 
</haskell>
 
</haskell>
   
<code>getLine</code> is the computation that, when executed, reads a line from standard input and returns it, with the newline character stripped.
+
:<code>getLine</code> is the computation that, when executed, reads a line from standard input and returns it, with the newline character stripped.
   
<code>return someValue</code> is the computation that has no side effects, but just returns <code>someValue</code>, when executed. That is used when you want to combine primitive side-effecting I/O actions with pure computations, processing and/or combining the results of the side-effecting actions into other results. You'll see more about <code>return</code> in the examples further down.
+
:<code>return someValue</code> is the computation that has no side effects, but just returns <code>someValue</code>, when executed. That is used when you want to combine primitive side-effecting I/O actions with pure computations, processing and/or combining the results of the side-effecting actions into other results. You'll see more about <code>return</code> in the examples further down.
   
So, in Haskell 98, the program
+
:So, in Haskell, the program
   
<haskell>
+
:<haskell>
 
main = getLine
 
main = getLine
 
</haskell>
 
</haskell>
   
will read a line and discard it (as the result of the <code>main</code> action, if there's a non-trivial one, will be silently discarded).
+
:will read a line and discard it (as the result of the <code>main</code> action, if there's a non-trivial one, will be silently discarded).
   
 
Now, with only those measures, no really interesting programs can be written. You usually must combine more than one primitive action into a whole, doing what you want to do.
 
Now, with only those measures, no really interesting programs can be written. You usually must combine more than one primitive action into a whole, doing what you want to do.
Line 75: Line 80:
 
The computation <code>computation1 >>= (\result1 -> computation2)</code> will, when executed, execute <code>computation1</code>, pass the result to the lambda on the right of <code>(>>=)</code>. That lambda application will yield a computation by pure expression evaluation. That resulting computation will be executed then.
 
The computation <code>computation1 >>= (\result1 -> computation2)</code> will, when executed, execute <code>computation1</code>, pass the result to the lambda on the right of <code>(>>=)</code>. That lambda application will yield a computation by pure expression evaluation. That resulting computation will be executed then.
   
  +
Some examples:
Examples:
 
   
<haskell>
+
* <haskell>
 
main = putStr "Hello " >> putStr "world\n"
 
main = putStr "Hello " >> putStr "world\n"
 
</haskell>
 
</haskell>
   
The same as above, but in two different <code>putStr</code> actions.
+
:The same as above, but in two different <code>putStr</code> actions.
   
<haskell>
+
* <haskell>
 
main = getLine >>= (\inputStr -> putStr ("You typed " ++ inputStr ++ "\n"))
 
main = getLine >>= (\inputStr -> putStr ("You typed " ++ inputStr ++ "\n"))
 
</haskell>
 
</haskell>
   
In fact, you can omit the parentheses, and you can format it differently:
+
:In fact, you can omit the parentheses, and you can format it differently:
   
<haskell>
+
:<haskell>
 
main =
 
main =
 
getLine >>= \inputStr ->
 
getLine >>= \inputStr ->
Line 96: Line 101:
 
</haskell>
 
</haskell>
   
This reads a line, then echoes it with some text around it.
+
:This reads a line, then echoes it with some text around it.
   
Now, where does the <code>do</code>-notation come into play? My, it's just syntactic sugar for expressions combined with <code>(>>)</code> and <code>(>>=)</code>.
+
Now, where does the <code>do</code>-notation come into play? It's merely syntactic sugar for expressions combined with <code>(>>)</code> and <code>(>>=)</code>.
   
 
The last two examples read this in the <code>do</code>-notation:
 
The last two examples read this in the <code>do</code>-notation:
   
<haskell>
+
* <haskell>
 
main = do
 
main = do
 
putStr "Hello "
 
putStr "Hello "
Line 108: Line 113:
 
</haskell>
 
</haskell>
   
 
* <haskell>
and
 
 
<haskell>
 
 
main = do
 
main = do
 
inputStr <- getLine
 
inputStr <- getLine

Revision as of 20:39, 22 March 2021

What's that IO type anyway? How do I do I/O in Haskell?

Because of the referential transparency of Haskell, you can't do I/O in Haskell in a way like in impure functional or imperative programming languages.

One example is this: Assume there were a function

readfirstline :: String {- file name -} -> String {- first line of file -}

Calling readfirstline "/foo/bar" could yield different results at different times while executing the program - for example:

  • someone could change the file between the calls;
  • it's a special file which seems to change anyway (e.g. terminal devices, named pipes);
  • the file no longer exists (e.g. was deleted, network connection lost).

As the Haskell compiler assumes referential transparency (for constant f and g, f g always yields the same result, for some built-in sense of equality), it could transform the program, calling the function more or less often, or earlier or later. Surely, you don't want that for I/O effects.

So, there must be another solution.

The one chosen for Haskell is to keep the I/O of an action (IO computation) outside of Haskell's expression evaluation by moving it into the surrounding runtime system. The program (the function main in the module Main) is then one large action - a value which merely describes what the program should do. The runtime evaluates main and executes the its constituent actions, evaluating functions or other lazy expressions when needed.

This is where the IO types come into play.

A value of type IO a describes a computation, potentially with I/O effects, that yields a result of type a when it's really performed. IO is an abstract datatype, so you can't directly manipulate its internal structure.

The main function must have an IO type. The type parameter (named a above) is irrelevant since Haskell 98, but was () (the trivial type) for earlier versions.

Instead, there are some primitives (functions with no IO parameters, but with IO result), that generate IO values (computations), and there are combinators (functions having one or more IO values as parameters). The combinators usually have an IO result. In standard Haskell, there's no combinator having IO parameters, but no IO result, except trivial ones (functions which just ignore their IO parameters).

Now, let's see some examples.

  • putStr :: String -> IO ()
    
putStr someString denotes the computation of printing the given string on the program's standard output, and returning a trivial value ().
So, one of the simplest programs is:
main = putStr "Hello world\n"
  • Other primitive actions are:
getLine :: IO String
return  :: a -> IO a
getLine is the computation that, when executed, reads a line from standard input and returns it, with the newline character stripped.
return someValue is the computation that has no side effects, but just returns someValue, when executed. That is used when you want to combine primitive side-effecting I/O actions with pure computations, processing and/or combining the results of the side-effecting actions into other results. You'll see more about return in the examples further down.
So, in Haskell, the program
main = getLine
will read a line and discard it (as the result of the main action, if there's a non-trivial one, will be silently discarded).

Now, with only those measures, no really interesting programs can be written. You usually must combine more than one primitive action into a whole, doing what you want to do.

And, there they are, some combinators:

(>>) :: IO a -> IO b -> IO b

The computation computation1 >> computation2 will, when executed, first execute computation1, discard the result, then execute computation2, and return the result of executing computation2.

And, if you want to carry on results of earlier computations:

(>>=) :: IO a -> (a -> IO b) -> IO b

The computation computation1 >>= (\result1 -> computation2) will, when executed, execute computation1, pass the result to the lambda on the right of (>>=). That lambda application will yield a computation by pure expression evaluation. That resulting computation will be executed then.

Some examples:

  • main = putStr "Hello " >> putStr "world\n"
    
The same as above, but in two different putStr actions.
  • main = getLine >>= (\inputStr -> putStr ("You typed " ++ inputStr ++ "\n"))
    
In fact, you can omit the parentheses, and you can format it differently:
main =
      getLine >>=                                     \inputStr ->
      putStr ("You typed " ++ inputStr ++ "\n")
This reads a line, then echoes it with some text around it.

Now, where does the do-notation come into play? It's merely syntactic sugar for expressions combined with (>>) and (>>=).

The last two examples read this in the do-notation:

  • main = do
          putStr "Hello "
          putStr "world\n"
    
  • main = do
          inputStr <- getLine
          putStr ("You typed " ++ inputStr ++ "\n")
    

Now, at last, back to the opener: the readfirstline example. That's also an example for the use of return.

In fact, that function is easy to define:

readfirstline :: String -> IO String
readfirstline filename =
      readFile filename >>=                   \fileContent ->
      return (head (lines fileContent))

Try to understand that for yourself! Because readFile reads the file lazily, only the first line of the file will be read at all (or some more, because of buffering) rather than the whole file.

The above code would be (in do-notation):

readfirstline filename = do
      fileContent <- readFile filename
      return (head(lines fileContent))