Talk:Toy compression implementations
Much kudos for fixing the underflow error. The new LZW implementation is much smaller, but... how in the name of God does it actually work? o_O MathematicalOrchid 11:36, 9 March 2007 (UTC)
To understand it I rewrote it a bit:
encode_LZW :: (Eq t) => [t] -> [t] -> [Int] encode_LZW alphabet = work (map (:) alphabet) where work table  =  work table lst = index : work table' rest where (tok, rest) = last . takeWhile ((`elem` table) . fst) . tail $ zip (inits lst) (tails lst) index = fromJust (elemIndex tok table) table' = table ++ [tok'] tok' = tok ++ [head rest]
The idea of the the table, which is the 1st argument to 'work', is that some prefix of the input is already in the table.
(encode_LZW chars) uses 'chars' to make the initial table for the 'work' function by turning the list of characters into a list of length 1 strings.
where (tok,rst) definition can be read right to left:
zip (inits lst) (tails lst)computes every possible way to split
lstinput into a prefix and suffix, in increasing length of prefix.
tailfunction just drops the head because it doesn't want to consider the length 0 prefix
takeWhileapplies the predicate
(`elem` table)to the prefix. This will always succeed on the length 1 prefix, and may find longer prefixes in the table.
lastfunction take the last prefix in the table, which will always be the longest such prefix
tokis this prefix, and
restis the remaining suffix to process.
Wow... a most ingenious (and inefficient) approach! Well, now it makes sense anyway. MathematicalOrchid 13:34, 9 March 2007 (UTC)