Haskell Quiz/Bytecode Compiler/Solution Justin Bailey
Jump to navigation
Jump to search
The printable version is no longer supported and may have rendering errors. Please update your browser bookmarks and please use the default browser print function instead.
This solution isn't the cleanest or quite tested, but if you load it in hugs and run:
compile $ parse $ tokenize $ stmt1
compile $ parse $ tokenize $ stmt2
...
compile $ parse $ tokenize $ stmt8
You'll see the byte codes generated. I didn't implement any optimizations (i.e. using SWAP).
Sadly, the parsing algorithm does not preserve precedence correctly. For example, parsing "2*2+2" produces
> parse $ tokenize "2*2+2"
Statement Mult (Statement Plus (Val 2) (Val 2)) (Val 2)
Which is incorrect. Any suggestions?
The code below is literate Haskell.
\begin{code}
import Data.Char(isAlpha, isDigit, isSeparator)
data Op = Plus | Minus | Mult | Div | Pow | Mod
deriving (Show, Eq)
data Token = RightParen
| LeftParen
| Number Integer
| Operator Op
deriving (Show, Eq)
data Expression = Statement Op Expression Expression
| Val Integer
| Empty
deriving (Show)
data Bytecode = NOOP | CONST Integer | LCONST Integer
| ADD
| SUB
| MUL
| POW
| DIV
| MOD
| SWAP
deriving (Show)
expr1 = Statement Plus (Val 1) (Val 2)
expr2 = Statement Mult (Statement Plus (Val 1) (Val 2)) (Val 3)
-- Take a statement and evaluate
eval (Val n) = n
eval (Statement op left right)
| op == Mult = eval left * eval right
| op == Minus = eval left - eval right
| op == Plus = eval left + eval right
| op == Div = eval left `div` eval right
| op == Pow = eval left ^ eval right
| op == Mod = eval left `mod` eval right
-- Take a string and tokenize it
tokenize [] = []
tokenize ('(':xs) = LeftParen : tokenize xs
tokenize (')':xs) = RightParen : tokenize xs
tokenize (x:xs) | isDigit x = let (val, remaining) = span isDigit (x:xs)
in Number (read val) : tokenize remaining
| isOperator x = toOperator x : tokenize xs
| isSeparator x = tokenize (dropWhile isSeparator xs)
where
makeOp '+' = Just $ Operator Plus
makeOp '-' = Just $ Operator Minus
makeOp '*' = Just $ Operator Mult
makeOp '/' = Just $ Operator Div
makeOp '^' = Just $ Operator Pow
makeOp '%' = Just $ Operator Mod
makeOp _ = Nothing
toOperator x = case makeOp x of
Just x -> x
_ -> error "Bad operator."
isOperator x = case makeOp x of
Just x -> True
_ -> False
stmt1 = "1 + 2" -- 3
stmt2 = "1 + 2 * 3" -- 7
stmt3 = "(1 + 2) * 3" -- 9
stmt4 = "4 - 5 * 2" -- -6
stmt5 = "5 * (2 - 4)" -- -10
stmt6 = "(1*3)*4*(5*6)" -- 360
stmt7 = "2^(2+(3/2)^2)" -- 8
stmt8 = "(10%3)*(2+2)" -- 4
{-
Based on http://www.engr.mun.ca/~theo/Misc/exp_parsing.htm
E -> E + E
E -> E * E
E -> E / E
E -> E - E
E -> E % E
E -> E ^ E
E -> ( E )
E -> n
Transform to
E --> Exp(0)
Exp(p) --> P {B Exp(q)}
P --> "(" E ")" | v
B --> "+" | "-" | "*" |"/" | "^" | "%"
Precedence
^ 3
*, /, % 2
+, - 1
-}
-- define precdence of operators
precedence Plus = 1
precedence Minus = 1
precedence Mult = 2
precedence Div = 3
precedence Mod = 2
precedence Pow = 3
-- Precedence comparison - gets precedence of
-- given operator and determines if its greater than the value given.
(>*) op val = precedence op >= val
-- Precedence addition - for left associative operators,
-- return its precedence + 1. For righ associative, just return the operators
-- precedence.
prec_add p@(Pow) = precedence p
prec_add p = 1 + precedence p
parse [] = error "Can't parse empty list of tokens"
parse tokens = fst $ parseE tokens 0
where parseE tokens prec
= let (p, remaining) = parseP tokens prec
in
if remaining == []
then (p, remaining)
else case head remaining of
Operator op ->
if op >* prec
then
let (right, rest) = parseE (tail remaining) $ prec_add op
in (Statement op p right, rest)
else let (left, rest) = parseE (tail remaining) $ prec
in (Statement op left p, rest)
_ -> (p, remaining)
parseP ((Number n):ts) prec = (Val n, ts)
parseP ((LeftParen):ts) prec
= let (e, remaining) = parseE ts 0
in (e, tail remaining)
compile stmt = compile' stmt []
compile' (Statement op left right) instr =
let li = compile' left instr
ri = compile' right instr
lri = li ++ ri
in case op of
Plus -> lri ++ [ADD]
Minus -> lri ++ [SUB]
Mult -> lri ++ [MUL]
Div -> lri ++ [DIV]
Mod -> lri ++ [MOD]
Pow -> lri ++ [POW]
compile' (Val n) instr = instr ++ [LCONST n]
\end{code}