# Haskell Quiz/Bytecode Compiler/Solution Justin Bailey

### From HaskellWiki

< Haskell Quiz | Bytecode Compiler(Difference between revisions)

(Found bug and documented) |
|||

Line 1: | Line 1: | ||

+ | [[Category:Code]] | ||

This solution isn't the cleanest or quite tested, but if you load it in hugs and run: | This solution isn't the cleanest or quite tested, but if you load it in hugs and run: | ||

## Revision as of 09:01, 6 November 2006

This solution isn't the cleanest or quite tested, but if you load it in hugs and run:

compile $ parse $ tokenize $ stmt1 compile $ parse $ tokenize $ stmt2 ... compile $ parse $ tokenize $ stmt8

You'll see the byte codes generated. I didn't implement any optimizations (i.e. using SWAP).

Sadly, the parsing algorithm does not preserve precedence correctly. For example, parsing "2*2+2" produces

> parse $ tokenize "2*2+2" Statement Mult (Statement Plus (Val 2) (Val 2)) (Val 2)

Which is incorrect. Any suggestions?

The code below is literate Haskell.

\begin{code} import Data.Char(isAlpha, isDigit, isSeparator) data Op = Plus | Minus | Mult | Div | Pow | Mod deriving (Show, Eq) data Token = RightParen | LeftParen | Number Integer | Operator Op deriving (Show, Eq) data Expression = Statement Op Expression Expression | Val Integer | Empty deriving (Show) data Bytecode = NOOP | CONST Integer | LCONST Integer | ADD | SUB | MUL | POW | DIV | MOD | SWAP deriving (Show) expr1 = Statement Plus (Val 1) (Val 2) expr2 = Statement Mult (Statement Plus (Val 1) (Val 2)) (Val 3) -- Take a statement and evaluate eval (Val n) = n eval (Statement op left right) | op == Mult = eval left * eval right | op == Minus = eval left - eval right | op == Plus = eval left + eval right | op == Div = eval left `div` eval right | op == Pow = eval left ^ eval right | op == Mod = eval left `mod` eval right -- Take a string and tokenize it tokenize [] = [] tokenize ('(':xs) = LeftParen : tokenize xs tokenize (')':xs) = RightParen : tokenize xs tokenize (x:xs) | isDigit x = let (val, remaining) = span isDigit (x:xs) in Number (read val) : tokenize remaining | isOperator x = toOperator x : tokenize xs | isSeparator x = tokenize (dropWhile isSeparator xs) where makeOp '+' = Just $ Operator Plus makeOp '-' = Just $ Operator Minus makeOp '*' = Just $ Operator Mult makeOp '/' = Just $ Operator Div makeOp '^' = Just $ Operator Pow makeOp '%' = Just $ Operator Mod makeOp _ = Nothing toOperator x = case makeOp x of Just x -> x _ -> error "Bad operator." isOperator x = case makeOp x of Just x -> True _ -> False stmt1 = "1 + 2" -- 3 stmt2 = "1 + 2 * 3" -- 7 stmt3 = "(1 + 2) * 3" -- 9 stmt4 = "4 - 5 * 2" -- -6 stmt5 = "5 * (2 - 4)" -- -10 stmt6 = "(1*3)*4*(5*6)" -- 360 stmt7 = "2^(2+(3/2)^2)" -- 8 stmt8 = "(10%3)*(2+2)" -- 4 {- Based on http://www.engr.mun.ca/~theo/Misc/exp_parsing.htm E -> E + E E -> E * E E -> E / E E -> E - E E -> E % E E -> E ^ E E -> ( E ) E -> n Transform to E --> Exp(0) Exp(p) --> P {B Exp(q)} P --> "(" E ")" | v B --> "+" | "-" | "*" |"/" | "^" | "%" Precedence ^ 3 *, /, % 2 +, - 1 -} -- define precdence of operators precedence Plus = 1 precedence Minus = 1 precedence Mult = 2 precedence Div = 3 precedence Mod = 2 precedence Pow = 3 -- Precedence comparison - gets precedence of -- given operator and determines if its greater than the value given. (>*) op val = precedence op >= val -- Precedence addition - for left associative operators, -- return its precedence + 1. For righ associative, just return the operators -- precedence. prec_add p@(Pow) = precedence p prec_add p = 1 + precedence p parse [] = error "Can't parse empty list of tokens" parse tokens = fst $ parseE tokens 0 where parseE tokens prec = let (p, remaining) = parseP tokens prec in if remaining == [] then (p, remaining) else case head remaining of Operator op -> if op >* prec then let (right, rest) = parseE (tail remaining) $ prec_add op in (Statement op p right, rest) else let (left, rest) = parseE (tail remaining) $ prec in (Statement op left p, rest) _ -> (p, remaining) parseP ((Number n):ts) prec = (Val n, ts) parseP ((LeftParen):ts) prec = let (e, remaining) = parseE ts 0 in (e, tail remaining) compile stmt = compile' stmt [] compile' (Statement op left right) instr = let li = compile' left instr ri = compile' right instr lri = li ++ ri in case op of Plus -> lri ++ [ADD] Minus -> lri ++ [SUB] Mult -> lri ++ [MUL] Div -> lri ++ [DIV] Mod -> lri ++ [MOD] Pow -> lri ++ [POW] compile' (Val n) instr = instr ++ [LCONST n] \end{code}