Error vs. Exception

From HaskellWiki
Revision as of 01:38, 6 December 2009 by Lemming (error - checks in advance, exception - checks afterwards)


There has been confusion about the distinction between errors and exceptions for a long time, as shown by repeated threads on Haskell-Cafe and by more and more packages that handle errors, exceptions, or something in between. Although both terms are related and sometimes hard to tell apart, it is important to distinguish them carefully. This is like the confusion between parallelism and concurrency.

The first problem is that "exception" seems to me to be the historically younger term. Before, there were only "errors", regardless of whether they were programming errors, I/O errors or user errors. In this article we use the term exception for expected but irregular situations at runtime, and the term error for mistakes in the running program that can be resolved only by fixing the program. We do not distinguish between different ways of representing exceptions: Maybe, Either, exceptions in the IO monad, or return codes all represent exceptions and are worth considering for exception handling.
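To illustrate that these representations are interchangeable, here is a minimal sketch that reports the same exceptional situation (malformed user input) in three ways. The function names and the port-number scenario are made up for this illustration; only readMaybe, userError, throwIO and try come from the base library.

```haskell
import Control.Exception (IOException, throwIO, try)
import Text.Read (readMaybe)

-- Hypothetical scenario: a port number typed by the user.
-- Malformed input is expected at runtime, so it is an exception.

-- Representation 1: Maybe, which carries no information about the cause.
parsePortMaybe :: String -> Maybe Int
parsePortMaybe = readMaybe

-- Representation 2: Either, with a description of the cause.
parsePortEither :: String -> Either String Int
parsePortEither s =
  maybe (Left ("not a number: " ++ show s)) Right (readMaybe s)

-- Representation 3: an exception in the IO monad.
parsePortIO :: String -> IO Int
parsePortIO s = either (throwIO . userError) return (parsePortEither s)

-- 'try' converts the IO exception back into an Either value.
parsePortTried :: String -> IO (Either IOException Int)
parsePortTried = try . parsePortIO
```

Which representation fits best depends on the surrounding code. The point is only that all three describe the same kind of situation and all three must be handled by the caller.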

The history may have led to the identifiers we find today in the Haskell language and standard Haskell modules.

  • Exceptions: Prelude.catch, Control.Exception.catch, Control.Exception.try, IOError, Control.Monad.Error
  • Errors: error, assert, Control.Exception.catch, Debug.Trace.trace

Note that the catch function from Prelude handles exceptions exclusively, whereas its counterpart from Control.Exception also catches certain kinds of undefined values.

Prelude> catch (error "bla") (\msg -> putStrLn $ "catched " ++ show msg)
*** Exception: bla

Prelude> Control.Exception.catch (error "bla") (\msg -> putStrLn $ "catched " ++ show (msg::Control.Exception.SomeException))
catched bla

This is unsafe, since Haskell's error is just sugar for undefined that is meant to help with spotting a programming error. A program should work just as well when all errors and undefineds are replaced by infinite loops. However, infinite loops in general cannot be caught, whereas calls to sugared functions like error can.
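The following sketch shows how a call to error can be observed with Control.Exception, and why laziness makes this fragile. The function buggyHead and the wrapper observeError are hypothetical names for this illustration; ErrorCall, evaluate and try are from the base library.

```haskell
import Control.Exception (ErrorCall, evaluate, try)

-- Hypothetical buggy function: the 'error' call marks a programming mistake.
buggyHead :: [Int] -> Int
buggyHead []      = error "buggyHead: empty list"
buggyHead (x : _) = x

-- 'evaluate' forces the value inside IO, so that 'try' can observe the
-- ErrorCall. Without it, the unevaluated thunk would be returned as a
-- seemingly successful result and fail later, outside of 'try'.
observeError :: [Int] -> IO (Either ErrorCall Int)
observeError xs = try (evaluate (buggyHead xs))
```

If buggyHead contained an infinite loop instead of the error call, observeError would simply not terminate. This is why a program should not rely on catching such values.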

Even more confusion was introduced by the Java programming language, which uses the term "exceptions" for programming errors like the NullPointerException and introduced the distinction between checked and unchecked exceptions.


Let me give some examples to explain the difference between errors and exceptions and why the distinction is important.

First consider a compiler like GHC. If you feed it a program that contains syntax or type errors, it emits a descriptive message about the problem. For GHC these are exceptions. GHC must expect all of these problems and handles them by generating a useful message for the user. However, sometimes you "succeed" in making GHC emit something like "Panic! This should not happen: ... Write a bug report to". Then you have encountered a bug in GHC. For GHC this is an error. It cannot be handled by GHC itself. The report "didn't expect TyVar in TyCon after unfolding" or so isn't of much help to the user. It's the business of the GHC developers to fix the problem.

OK, these are possible reactions to user input. Now a more difficult question: How should GHC handle corruption in the files it has generated itself, like the interface (.hi) and object files (.o)? Such corruption can easily be introduced by the user editing the files in a simple text editor, by network problems, by exchanging files between operating systems or different GHC versions, or by virus programs. Thus GHC must be prepared for it, which means that it must generate and handle exceptions here. It must at least tell the user that there is some problem with the file it read. Next question: Must GHC also be prepared for corrupt memory or damage in the CPU? Good question. According to the above definition, corrupt memory is an exception, not an error. However, GHC cannot do much about such situations, so I don't think it must be prepared for them.

Now we proceed with two examples that show what happens if you try to treat errors like exceptions:
I was involved in the development of a library that was written in C++. One of the developers told me that the developers were divided into those who like exceptions and those who prefer return codes. It seems to me that the friends of return codes won. However, I got the impression that they debated the wrong point: Exceptions and return codes are equally expressive, but neither should be used to describe errors. In fact, the return codes contained definitions like ARRAY_INDEX_OUT_OF_RANGE. But I wondered: How shall my function react when it gets this return code from a subroutine? Shall it send a mail to its programmer? It could return this code to its caller in turn, but the caller will not know how to cope with it either. Even worse, since I cannot make assumptions about the implementation of a function, I have to expect an ARRAY_INDEX_OUT_OF_RANGE from every subroutine. My conclusion is that ARRAY_INDEX_OUT_OF_RANGE is a (programming) error. It cannot be handled or fixed at runtime, it can only be fixed by its developer. Thus there should be no corresponding return code, but instead there should be asserts.

The second example is a library for advanced arithmetic in Modula-3. I decided to use exceptions for signalling problems. One of the exceptions was VectorSizeMismatch, which was raised whenever two vectors of different sizes were to be added or multiplied in a scalar product. However, I quickly found that almost every function in the library could potentially raise this exception, and Modula-3 urges you to declare all potential exceptions. (However, ignoring potential exceptions only yields a compiler warning, which can even be suppressed.) I also noticed that, due to the way I generated and combined the vectors and matrices, the sizes would always match. Thus a mismatch means that there is a problem not with user input but with my program. Consequently, I removed this exception and replaced the checks by ASSERT. These ASSERTs can be disabled by a compiler switch for efficiency. A correct program fulfils all ASSERTs, and thus it makes no difference whether they are present in the compiled program or not. In a faulty program the presence of ASSERTs only controls the way the program fails: with a failed assertion if they are enabled, otherwise by giving wrong results or segmentation faults.
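The same design can be sketched in Haskell with assert from Control.Exception. The Vector type and addVector are hypothetical stand-ins for the Modula-3 library described above, not a translation of it.

```haskell
import Control.Exception (assert)

-- Hypothetical vector type for illustration.
newtype Vector = Vector [Double]
  deriving (Eq, Show)

-- A size mismatch is a programming error, so it is guarded by an
-- assertion instead of being reported via Maybe, Either or an exception.
-- The type of addVector therefore does not mention failure at all.
addVector :: Vector -> Vector -> Vector
addVector (Vector xs) (Vector ys) =
  assert (length xs == length ys) $
    Vector (zipWith (+) xs ys)
```

GHC ignores assertions when compiling with optimisation (-O) or with -fignore-asserts. In that case a faulty caller no longer gets an assertion failure but a silently truncated result, because zipWith stops at the shorter list.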

With the new handling of vector size compatibility, if the operands of a vector addition originate from user input, then you have to check that their sizes match before you call vector addition. However, this is a cheap check. Thus, if you want another criterion for distinguishing errors and exceptions: Errors can be prevented by (cheap) checks in advance, whereas exceptions can only be handled after a risky action was run. You can easily check that array indices are within array bounds, that pointers are not NULL, and that divisors are not zero before calling the respective functions. In many cases you will not need those checks, because e.g. you have a loop traversing all valid indices of an array, and consequently you know that every index is allowed. You do not need to check for exceptions afterwards.

In contrast to that, memory full, disk full, file not existing, file without write permission and even overflows are clearly exceptions. Even if you check that there is enough memory available before allocating, the required chunk of memory might be allocated by someone else between your memory check and your allocation. The file permission might be changed between checking the permission and writing to the file. Permissions might even change while you write. Overflows are deterministic, but in order to prevent an overflow, say for a multiplication, you have to reimplement the multiplication in an overflow-proof way. This will be slower than the actual multiplication. (Many processors signal overflows through flags, but almost none of the popular high-level languages allow you to query this information.)
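The two strategies can be put side by side in a small sketch. All function names here are hypothetical; try and IOException are from Control.Exception, and readFile is the Prelude function.

```haskell
import Control.Exception (IOException, try)

-- Cheap check in advance: applied to user input before calling a
-- function that requires a valid index.
validIndex :: [a] -> Int -> Bool
validIndex xs i = i >= 0 && i < length xs

-- Hypothetical use: the index comes from the user, so it is checked first
-- and only then passed to the partial operator (!!).
pick :: [String] -> Int -> Either String String
pick xs i
  | validIndex xs i = Right (xs !! i)
  | otherwise       = Left ("no entry number " ++ show i)

-- Handling afterwards: an existence or permission check could be
-- invalidated before the file is opened, so we run the risky action
-- and inspect the outcome. This catches failures raised while opening
-- the file; readFile reads the contents lazily.
readFileChecked :: FilePath -> IO (Either IOException String)
readFileChecked path = try (readFile path)
```

Note that there is deliberately no test like "does the file exist?" in front of readFile. Such a test would not make the subsequent handling unnecessary.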

My conclusion is that (programming) errors can only be handled by the programmer, not by the running program. Thus the term "error handling" sounds contradictory to me. However, supporting programmers in finding errors (bugs) in their programs is a good thing. I just wouldn't call it "error handling" but "debugging". An important example in Haskell is the module Debug.Trace. It provides the function trace, which looks like a non-I/O function but actually outputs something on the console. It is natural that debugging functions employ hacks. For finding a programming error it would be inappropriate to transform the program code to allow I/O in a set of functions that do not need it otherwise. The change would only persist until the bug is detected and fixed. In summary, hacks in debugging functions are necessary for quickly finding problems without large restructuring of the program, and they are not problematic, because they only exist until the bug is removed.
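A minimal sketch of such a temporary debugging aid follows. The function average is a hypothetical example; trace is the function from Debug.Trace, which writes its message to stderr when the result is demanded.

```haskell
import Debug.Trace (trace)

-- Hypothetical function under investigation. The type stays pure;
-- 'trace' prints its message and then returns its second argument.
average :: [Double] -> Double
average xs =
  trace ("average: " ++ show (length xs) ++ " elements") $
    sum xs / fromIntegral (length xs)
```

Once the bug is found, the trace line is deleted again and neither the type of average nor any of its callers has to change.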

In contrast to that, exceptions are things you cannot fix in advance. You will always have to live with files that cannot be found and user input that is malformed. You can insist that the user does not hit the X key, you may threaten to send lawyers, but your program has to be prepared to receive an "X key pressed" message nonetheless. Thus exceptions belong to the program, and the program must be adapted to treat exceptional values where they can occur. No hacks can be accepted for exception handling.

When exceptions become errors

Another issue that makes the distinction between exceptions and errors difficult is that sometimes one gets converted into the other.

It is an error not to handle an exception. If a file cannot be opened, you must respect that result. You can proceed as if the file could be opened, though. If you do so, you might crash the machine, or the runtime system might terminate your program. All of these effects are possible consequences of a (programming) error. Again, it does not matter whether the exceptional situation is signaled by a return code that you ignore or by an IO exception for which you did not run a catch.

When errors become exceptions

The distinction between errors and exceptions is often criticised because there are software architectures where even programming errors in one part shall not crash a larger piece of software. Typical examples are: A process in an operating system shall not crash the whole system if it crashes itself. A buggy browser plugin shall not terminate the browser. A corrupt CGI script shall not bring down the web server it runs on.

In these cases errors are handled like exceptions. But this is no reason to dismiss the distinction between errors and exceptions altogether. Obviously there are levels, and when crossing level boundaries it is OK to turn an error into an exception. The part that contains an error cannot do anything to recover from it. The next higher level cannot fix it either, but it can restrict the damage. Within one encapsulated part of an architecture, errors and exceptions shall be strictly separated. (Or put differently: If at one place you think you have to handle an error like an exception, why not divide the program into two parts at this position? :-) )
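Such a level boundary could look like the following sketch, in which a host runs a plugin and turns the plugin's programming errors into exceptions of its own. The Plugin type and runPlugin are hypothetical and much simpler than a real plugin system; SomeException, evaluate and try are from Control.Exception.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Hypothetical plugin type: a pure function that may contain bugs.
type Plugin = Int -> Int

-- The host runs a plugin behind a boundary. A programming error inside
-- the plugin surfaces here as an exceptional result that the host can
-- report. The host cannot fix the bug, it can only restrict the damage.
-- 'evaluate' forces the result to weak head normal form, which for an
-- Int means the complete value.
runPlugin :: Plugin -> Int -> IO (Either String Int)
runPlugin plugin input = do
  result <- try (evaluate (plugin input))
  return $ case result of
    Left e  -> Left ("plugin failed: " ++ show (e :: SomeException))
    Right y -> Right y
```

Inside the plugin the call to error remains an error. Only the host, one level above, treats "the plugin is broken" as an expected situation. A plugin that loops forever would still block this host, so a real boundary needs more than this, for example a separate process.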

See also