Haskell Quiz/Index and Query/Solution Jethr0

From HaskellWiki


Unfortunately this solution doesn't really address the problem :)

Neither are bit-arrays used nor is this solution saving much space. I just wanted to experiment with the State Monad, and I'm quite happy with what I learned.

Example:

> let docs = [("Doc1", "The quick brown fox")
             ,("Doc2", "Jumped over the brown dog")
             ,("Doc3", "Cut him to the quick")]
> finder docs "brown"
["Doc1","Doc2"]
> finder docs "the"
["Doc2","Doc3"]


Solution:

import qualified Control.Monad.State as State
import qualified Data.Map as Map
import qualified Data.Set as Set

data Rd = Rd {rdN   :: Integer
             ,rdMap :: Map.Map String Integer
             } deriving (Show)

-- process words of a file and return the set of indices
processWords :: [String] -> State.State Rd (Set.Set Integer)
processWords = foldM step (Set.empty) where 
    step ws x = do mp <- State.gets rdMap
                   i <- case Map.lookup x mp of
                            Nothing -> do n <- State.gets rdN
                                          State.modify (\s -> s{rdN=(n+1), rdMap=Map.insert x n (rdMap s)})
                                          return n
                            Just a  -> return a
                   return $ Set.insert i ws                          

processFile :: (String,String) -> State.State Rd (String, [Integer])
processFile (doc,str) = do indices <- processWords (words str)
                           return (doc, Set.toList indices)

-- find all documents containing string "str" as a word.
findDocs :: String -> [(String,[Integer])] -> State.State Rd [String]
findDocs str indices = do mp <- State.gets rdMap
                          case Map.lookup str mp of
                            Nothing -> return []
                            Just i  -> return . map fst . filter (\(_,is) -> i `elem` is) $ indices

runIt f = State.evalState f (Rd {rdN=0, rdMap=Map.empty})
finder ds str = runIt (mapM processFile ds >>= findDocs str)