Difference between revisions of "Safely running untrusted Haskell code"

Revision as of 17:05, 27 May 2007

Obviously, don't run code in the IO monad, just show pure results (or possibly make your own monad that is a restricted subset of IO). But it's a lot more complicated than that...

Verifying safety : lambdabot's approach

Since 2004, lambdabot has executed arbitrary strings of Haskell provided by user's of various IRC channels, in particular, the Haskell channel. In order to do this, a particular security policy is required. The policy, and its implementation, is described here.

The policy

Only allow execution of pure Haskell expressions.

The implementation

The evaluator is essentially a function, eval :: String -> IO String, which takes a random Haskell string, verifies it, compiles it, and evaluates the result, returning a String representing the result, back over the network.

This function is implemented as two separate processes:

The driver reads a String from the network, and then subjects it to a simple test:

parse the expression to check it is a Haskell 98 expression

If the string parses as a Haskell 98 expression, the 'runplugs' process is then forked to evaluate the string, and the following checks are put in place:

Only a trusted module set is imported, avoiding unsafePerformIO and stToIO and such like.
Module imports are disallowed
Time and space limitations on the runplugs process are set by the OS
The expression is parsed as an expression, hopefully preventing code injection (is this true? and can any string that can parse as a valid Haskell expression become something more sinister when put in a particular context?)
The expression is bound to a random top level identifier (harmless to guess)
The expression is wrapped in 'show', and must be an instance of Show
An instance of Show IO is defined, which prints "<IO>", rendering IO impossible (otherwise, showing the IO would result in a type error, and still not run it). Actually instances for IO and (->), using Typeable, are used to show the specific types of all IO and functions (e.g. putStrLn --> <[Char] -> IO ()> )
The expression type checked, with the show constraint, enforcing purity
If it type checks, and the type checker doesn't time out, it is compiled to native code with -fasm
Only -fextended-default-rules are allowed, as language extensions over H98.
The resulting .o file is dynamically linked into the throw-away runplugs instance
The value is evaluated inside an exception handler.
If an exception is thrown, only the first 1024 characters of the exception string are returned.
If all went well, the first 2048 characters of the shown string are returned to the caller.

Exploits

A variety of interesting exploits have been found, or thought of, over the years. Those we remember are listed below:

using newtype recursion to have the typechecker not terminate
using pathological type inference cases to have the type checker not terminate
code injection of code fragments that arne't haskell expressions
Template Haskell used to run IO actions during type checking
stToIO to convert a safe ST action, into an IO action that is run
large strings returned in exceptions
unsafePerformIO, of course
unsafeCoerce#
throwing a piece of code as an exception, which is evaluated when the exception is shown
non-terminating code, in a tight loop that doesn't allocate, can't use GHC's threadDelay/scheduler (let f () = f () in f ()) to timeout (must use OS resource limits).
large array allocations can fill memory
very large array allocations can integer overflow the storage manager, allowing arbitrary memory access
creating class instances that violate assumed laws (cf EvilIx)
various literal strings that print IRC protocol commands could be printed using exceptions.
if a user guesses the top level identifier the expression is bound to, it can be used to print a silly string
zombies could be created by multiple runplugs calls, leading to blocking on endless output. the resulting zombies accumulate, eventually leading to DOS.

Template Haskell

We believe that Template Haskell can be made safe for users by hiding runIO and reify.

@@ Line 36: / Line 36: @@
 * Module imports are disallowed
 * Time and space limitations on the runplugs process are set by the OS
+* The expression is parsed as an expression, hopefully preventing code injection (is this true? and can any string that can parse as a valid Haskell expression become something more sinister when put in a particular context?)
 * The expression is bound to a random top level identifier (harmless to guess)
 * The expression is wrapped in 'show', and must be an instance of Show
 * An instance of Show IO is defined, which prints "<IO>", rendering IO impossible (otherwise, showing the IO would result in a type error, and still not run it). Actually instances for IO and (->), using Typeable, are used to show the specific types of all IO and functions (e.g. putStrLn  -->   <[Char] -> IO ()> )
-* The expression is type checked, with the show constraint, enforcing purity
+* The expression type checked, with the show constraint, enforcing purity
 * If it type checks, and the type checker doesn't time out, it is compiled to native code with -fasm
 * Only -fextended-default-rules are allowed, as language extensions over H98.

Difference between revisions of "Safely running untrusted Haskell code"

Revision as of 17:05, 27 May 2007

Contents

Verifying safety : lambdabot's approach

The policy

The implementation

Exploits

Template Haskell

See also

Navigation menu

Search