# Difference between revisions of "Haskell Quiz/Bytecode Compiler/Solution Justin Bailey"

From HaskellWiki

(Found bug and documented) |
|||

Line 9: | Line 9: | ||

You'll see the byte codes generated. I didn't implement any optimizations (i.e. using SWAP). |
You'll see the byte codes generated. I didn't implement any optimizations (i.e. using SWAP). |
||

+ | |||

+ | Sadly, the parsing algorithm does not preserve precedence correctly. For example, parsing "2*2+2" produces |
||

+ | |||

+ | <haskell> |
||

+ | > parse $ tokenize "2*2+2" |
||

+ | Statement Mult (Statement Plus (Val 2) (Val 2)) (Val 2) |
||

+ | </haskell> |
||

+ | |||

+ | Which is incorrect. Any suggestions? |
||

The code below is literate Haskell. |
The code below is literate Haskell. |

## Revision as of 08:50, 4 November 2006

This solution isn't the cleanest or quite tested, but if you load it in hugs and run:

```
compile $ parse $ tokenize $ stmt1
compile $ parse $ tokenize $ stmt2
...
compile $ parse $ tokenize $ stmt8
```

You'll see the byte codes generated. I didn't implement any optimizations (i.e. using SWAP).

Sadly, the parsing algorithm does not preserve precedence correctly. For example, parsing "2*2+2" produces

```
> parse $ tokenize "2*2+2"
Statement Mult (Statement Plus (Val 2) (Val 2)) (Val 2)
```

Which is incorrect. Any suggestions?

The code below is literate Haskell.

```
\begin{code}
import Data.Char(isAlpha, isDigit, isSeparator)
data Op = Plus | Minus | Mult | Div | Pow | Mod
deriving (Show, Eq)
data Token = RightParen
| LeftParen
| Number Integer
| Operator Op
deriving (Show, Eq)
data Expression = Statement Op Expression Expression
| Val Integer
| Empty
deriving (Show)
data Bytecode = NOOP | CONST Integer | LCONST Integer
| ADD
| SUB
| MUL
| POW
| DIV
| MOD
| SWAP
deriving (Show)
expr1 = Statement Plus (Val 1) (Val 2)
expr2 = Statement Mult (Statement Plus (Val 1) (Val 2)) (Val 3)
-- Take a statement and evaluate
eval (Val n) = n
eval (Statement op left right)
| op == Mult = eval left * eval right
| op == Minus = eval left - eval right
| op == Plus = eval left + eval right
| op == Div = eval left `div` eval right
| op == Pow = eval left ^ eval right
| op == Mod = eval left `mod` eval right
-- Take a string and tokenize it
tokenize [] = []
tokenize ('(':xs) = LeftParen : tokenize xs
tokenize (')':xs) = RightParen : tokenize xs
tokenize (x:xs) | isDigit x = let (val, remaining) = span isDigit (x:xs)
in Number (read val) : tokenize remaining
| isOperator x = toOperator x : tokenize xs
| isSeparator x = tokenize (dropWhile isSeparator xs)
where
makeOp '+' = Just $ Operator Plus
makeOp '-' = Just $ Operator Minus
makeOp '*' = Just $ Operator Mult
makeOp '/' = Just $ Operator Div
makeOp '^' = Just $ Operator Pow
makeOp '%' = Just $ Operator Mod
makeOp _ = Nothing
toOperator x = case makeOp x of
Just x -> x
_ -> error "Bad operator."
isOperator x = case makeOp x of
Just x -> True
_ -> False
stmt1 = "1 + 2" -- 3
stmt2 = "1 + 2 * 3" -- 7
stmt3 = "(1 + 2) * 3" -- 9
stmt4 = "4 - 5 * 2" -- -6
stmt5 = "5 * (2 - 4)" -- -10
stmt6 = "(1*3)*4*(5*6)" -- 360
stmt7 = "2^(2+(3/2)^2)" -- 8
stmt8 = "(10%3)*(2+2)" -- 4
{-
Based on http://www.engr.mun.ca/~theo/Misc/exp_parsing.htm
E -> E + E
E -> E * E
E -> E / E
E -> E - E
E -> E % E
E -> E ^ E
E -> ( E )
E -> n
Transform to
E --> Exp(0)
Exp(p) --> P {B Exp(q)}
P --> "(" E ")" | v
B --> "+" | "-" | "*" |"/" | "^" | "%"
Precedence
^ 3
*, /, % 2
+, - 1
-}
-- define precdence of operators
precedence Plus = 1
precedence Minus = 1
precedence Mult = 2
precedence Div = 3
precedence Mod = 2
precedence Pow = 3
-- Precedence comparison - gets precedence of
-- given operator and determines if its greater than the value given.
(>*) op val = precedence op >= val
-- Precedence addition - for left associative operators,
-- return its precedence + 1. For righ associative, just return the operators
-- precedence.
prec_add p@(Pow) = precedence p
prec_add p = 1 + precedence p
parse [] = error "Can't parse empty list of tokens"
parse tokens = fst $ parseE tokens 0
where parseE tokens prec
= let (p, remaining) = parseP tokens prec
in
if remaining == []
then (p, remaining)
else case head remaining of
Operator op ->
if op >* prec
then
let (right, rest) = parseE (tail remaining) $ prec_add op
in (Statement op p right, rest)
else let (left, rest) = parseE (tail remaining) $ prec
in (Statement op left p, rest)
_ -> (p, remaining)
parseP ((Number n):ts) prec = (Val n, ts)
parseP ((LeftParen):ts) prec
= let (e, remaining) = parseE ts 0
in (e, tail remaining)
compile stmt = compile' stmt []
compile' (Statement op left right) instr =
let li = compile' left instr
ri = compile' right instr
lri = li ++ ri
in case op of
Plus -> lri ++ [ADD]
Minus -> lri ++ [SUB]
Mult -> lri ++ [MUL]
Div -> lri ++ [DIV]
Mod -> lri ++ [MOD]
Pow -> lri ++ [POW]
compile' (Val n) instr = instr ++ [LCONST n]
\end{code}
```