Difference between revisions of "The Monad.Reader/Issue4/Solving Sudoku"
(some fmt) 

Line 1:  Line 1:  
'''This article needs reformatting! Please help tidy it up.'''[[User:WouterSwierstraWouterSwierstra]] 14:23, 9 May 2008 (UTC) 
'''This article needs reformatting! Please help tidy it up.'''[[User:WouterSwierstraWouterSwierstra]] 14:23, 9 May 2008 (UTC) 

−  = 
+  =Solving Sudoku= 
−  
When I first started writing this article, Sudoku was  as far as I knew  the name of a numberplacing puzzle of Japanese origin, set by The Times newspaper. Since then, almost every UK national newspaper has started carrying a weekly or even daily Sudoku puzzle; even the Northampton Chronicle has them (although the Chron's are usually pitifully easy). Sudoku has become a popular craze, and books of puzzles together with hints on how to solve them are even now on sale in newsagents up and down the country. 
When I first started writing this article, Sudoku was  as far as I knew  the name of a numberplacing puzzle of Japanese origin, set by The Times newspaper. Since then, almost every UK national newspaper has started carrying a weekly or even daily Sudoku puzzle; even the Northampton Chronicle has them (although the Chron's are usually pitifully easy). Sudoku has become a popular craze, and books of puzzles together with hints on how to solve them are even now on sale in newsagents up and down the country. 

Line 8:  Line 8:  
The solver I wrote, which is presented here, is not a purely deductive solver. I had never attempted to solve a Sudoku puzzle by hand when I came up with the algorithm, and was unaware of the full range of deductive procedures that are possible (there may still be quite a few I haven't fathomed). My solver uses a combination of bruteforce search, rulebased pruning of the search tree, and a simple but effective heuristic to guide it to the correct answer more quickly. It was written in Haskell because I wanted to explore some methods of writing programs of this kind in a purely functional language. 
The solver I wrote, which is presented here, is not a purely deductive solver. I had never attempted to solve a Sudoku puzzle by hand when I came up with the algorithm, and was unaware of the full range of deductive procedures that are possible (there may still be quite a few I haven't fathomed). My solver uses a combination of bruteforce search, rulebased pruning of the search tree, and a simple but effective heuristic to guide it to the correct answer more quickly. It was written in Haskell because I wanted to explore some methods of writing programs of this kind in a purely functional language. 

−  == 
+  ==Grids and moves== 
−  
Here is an example Sudoku puzzle: 
Here is an example Sudoku puzzle: 

Line 49:  Line 49:  
One way to think of this in Haskell is to think of each new number added to a specific cell in the grid as a "move", the state of the grid before the move as a value of some type, and the state of the grid after the move as another value of the same type. The act of adding a number to the grid can then be represented as a function: 
One way to think of this in Haskell is to think of each new number added to a specific cell in the grid as a "move", the state of the grid before the move as a value of some type, and the state of the grid after the move as another value of the same type. The act of adding a number to the grid can then be represented as a function: 

−  {{{ 

+  <haskell> 

move :: SudokuState > Move > SudokuState 
move :: SudokuState > Move > SudokuState 

−  }}} 

+  </haskell> 

To set up a puzzle, given a list of initiallyplaced digits and a `SudokuState` representing a blank grid, we could then "fold" the list of initial moves into the grid like this: 
To set up a puzzle, given a list of initiallyplaced digits and a `SudokuState` representing a blank grid, we could then "fold" the list of initial moves into the grid like this: 

−  {{{ 

+  <haskell> 

moves :: SudokuState > [Move] > SudokuState 
moves :: SudokuState > [Move] > SudokuState 

moves = foldl move 
moves = foldl move 

−  }}} 

+  </haskell> 

Given a function, Validate, that returns a value of True if a `SudokuState` is valid and False if it is not, we can already try any sequence of random moves and find out which ones result in a grid that breaks the rules of Sudoku and which ones result in a valid grid. The rules we have to test for are simply that no number may appear more than once in any given row, column, or 3x3 subgrid. 
Given a function, Validate, that returns a value of True if a `SudokuState` is valid and False if it is not, we can already try any sequence of random moves and find out which ones result in a grid that breaks the rules of Sudoku and which ones result in a valid grid. The rules we have to test for are simply that no number may appear more than once in any given row, column, or 3x3 subgrid. 

−  == 
+  ==Forced moves== 
−  
Rather than trying moves that may break the rules and then testing the resulting grid to see whether they do or not, we can analyse the grid and find out which moves are permissible and which are not. One way to do this is to give every empty cell a list of all nine digits, and then remove from that list all of the digits that occur elsewhere in the same row, column or 3x3 subgrid as that cell. If we are lucky, we may find that some cells have only one digit left in their list of possible digits after we have done this. These cells give rise to ''forced'' moves, moves that we have no alternative but to play. 
Rather than trying moves that may break the rules and then testing the resulting grid to see whether they do or not, we can analyse the grid and find out which moves are permissible and which are not. One way to do this is to give every empty cell a list of all nine digits, and then remove from that list all of the digits that occur elsewhere in the same row, column or 3x3 subgrid as that cell. If we are lucky, we may find that some cells have only one digit left in their list of possible digits after we have done this. These cells give rise to ''forced'' moves, moves that we have no alternative but to play. 

If we can find all of the forced moves in a given position and play them, then we will arrive at a new position in which there may be other moves that are forced by the moves we have just played. It's possible to devise a Sudoku puzzle that could be completely solved just by doing this, but in fact most interesting puzzles require other kinds of deductive steps to reach a conclusion. Nevertheless, identifying and playing forced moves helps us to get closer to a solution without recourse to trialanderror. 
If we can find all of the forced moves in a given position and play them, then we will arrive at a new position in which there may be other moves that are forced by the moves we have just played. It's possible to devise a Sudoku puzzle that could be completely solved just by doing this, but in fact most interesting puzzles require other kinds of deductive steps to reach a conclusion. Nevertheless, identifying and playing forced moves helps us to get closer to a solution without recourse to trialanderror. 

−  Just as playing forced moves may bring us nearer to a solution, it can also help us to identify a grid that ''cannot'' be solved, because we have placed the numbers in such a way that there are sequences of forced moves that result in an invalid grid. By using forced moves to look ahead for 
+  Just as playing forced moves may bring us nearer to a solution, it can also help us to identify a grid that ''cannot'' be solved, because we have placed the numbers in such a way that there are sequences of forced moves that result in an invalid grid. By using forced moves to look ahead for unavoidable failure, we can save ourselves the trouble of trying any more combinations of moves based on a hopeless position. Instead, we can go back immediately to a point where we had some choice about what move to play, and play a different move. 
−  
−  == Monads: Definitely Maybe == 

+  ==Monads: Definitely Maybe== 

Now, it happens that our algorithm involves two kinds of operation, transitions from one state to another and failing and backtracking to a previous state, that can be represented quite conveniently in Haskell using monads. 
Now, it happens that our algorithm involves two kinds of operation, transitions from one state to another and failing and backtracking to a previous state, that can be represented quite conveniently in Haskell using monads. 

The `State` monad allows us to thread a state value invisibly through a sequence of actions that alter that state, so that instead of explicitly taking that value, applying a change to it and returning an altered value, we can express the statechanging actions in a more imperative style. We can thus replace 
The `State` monad allows us to thread a state value invisibly through a sequence of actions that alter that state, so that instead of explicitly taking that value, applying a change to it and returning an altered value, we can express the statechanging actions in a more imperative style. We can thus replace 

−  {{{ 

+  <haskell> 

move :: SudokuState > Move > SudokuState 
move :: SudokuState > Move > SudokuState 

−  }}} 

+  </haskell> 

with 
with 

−  {{{ 

+  <haskell> 

doMove :: Move > SudokuState () 
doMove :: Move > SudokuState () 

−  }}} 

+  </haskell> 

where `SudokuState` is a version of the State monad that encapsulates changes to the state of a Sudoku grid. We can then rewrite 
where `SudokuState` is a version of the State monad that encapsulates changes to the state of a Sudoku grid. We can then rewrite 

−  {{{ 

+  <haskell> 

moves :: SudokuState > [Move] > SudokuState 
moves :: SudokuState > [Move] > SudokuState 

moves = foldl move 
moves = foldl move 

−  }}} 

+  </haskell> 

as 
as 

−  {{{ 

+  <haskell> 

doMoves :: [Move] > SudokuState () 
doMoves :: [Move] > SudokuState () 

doMoves = mapM_ doMove 
doMoves = mapM_ doMove 

−  }}} 

+  </haskell> 

The `State` monad also has some helper functions defined, `gets` and `modify`, that obtain a value from the encapsulated state using a projection function, and update it using a transformation function. 
The `State` monad also has some helper functions defined, `gets` and `modify`, that obtain a value from the encapsulated state using a projection function, and update it using a transformation function. 

Line 101:  Line 102:  
For failing and backtracking, we can use the `Maybe` monad, which is a member of the `MonadPlus` typeclass. The `MonadPlus` laws, applied to the `Maybe` monad, mean that 
For failing and backtracking, we can use the `Maybe` monad, which is a member of the `MonadPlus` typeclass. The `MonadPlus` laws, applied to the `Maybe` monad, mean that 

−  {{{ 

+  <haskell> 

(Just a) `mplus` (Just b) = Just a 
(Just a) `mplus` (Just b) = Just a 

Nothing `mplus` (Just b) = Just b 
Nothing `mplus` (Just b) = Just b 

(Just a) `mplus` Nothing = Just a 
(Just a) `mplus` Nothing = Just a 

Nothing `mplus` Nothing = Nothing 
Nothing `mplus` Nothing = Nothing 

−  }}} 

+  </haskell> 

In practice, this means that if we have two functions that return a value in the `Maybe` monad, we can `mplus` them together and take the first that succeeds (failure is defined as returning `Nothing`, which is done by `mzero`). Based on this, we can combine a series of possible moves using `msum`, and return the first that succeeds. This makes it possible to perform a depthfirst traversal of a search tree, by summing together all of the branches at each node in the tree: the first branch that succeeds is taken. 
In practice, this means that if we have two functions that return a value in the `Maybe` monad, we can `mplus` them together and take the first that succeeds (failure is defined as returning `Nothing`, which is done by `mzero`). Based on this, we can combine a series of possible moves using `msum`, and return the first that succeeds. This makes it possible to perform a depthfirst traversal of a search tree, by summing together all of the branches at each node in the tree: the first branch that succeeds is taken. 

Line 112:  Line 113:  
To combine the `State` and `Maybe` monads, we use the monad transformer `StateT`. This allows us to return values in the `Maybe` monad from the `State` monad, and gives us the semantics of both monads combined in a useful way: 
To combine the `State` and `Maybe` monads, we use the monad transformer `StateT`. This allows us to return values in the `Maybe` monad from the `State` monad, and gives us the semantics of both monads combined in a useful way: 

−  {{{ 

+  <haskell> 

type StatePlus s a = StateT s Maybe a 
type StatePlus s a = StateT s Maybe a 

−  }}} 

+  </haskell> 

(I am grateful to Michael Weber for showing me how to do this) 
(I am grateful to Michael Weber for showing me how to do this) 

−  == 
+  ==Implementing the Solver== 
This section is a literate Haskell implementation of the solver. 
This section is a literate Haskell implementation of the solver. 

We start with some external resources we're going to need. 
We start with some external resources we're going to need. 

−  {{{ 

+  <haskell> 

> import List 
> import List 

Line 128:  Line 129:  
> import Control.Monad.Trans 
> import Control.Monad.Trans 

−  }}} 

+  </haskell> 

Now for the basic types used to represent the contents of the Sudoku world. 
Now for the basic types used to represent the contents of the Sudoku world. 

The state of the grid can be represented simply as a list of lists of 
The state of the grid can be represented simply as a list of lists of 

integers: a 9x9 array of cells. Empty cells are represented by a zero. 
integers: a 9x9 array of cells. Empty cells are represented by a zero. 

−  {{{ 

+  <haskell> 

> type SudokuGrid = [[Int]] 
> type SudokuGrid = [[Int]] 

Line 141:  Line 142:  
> initialGrid = times 9 emptyRow 
> initialGrid = times 9 emptyRow 

−  }}} 

+  </haskell> 

A position on the grid is represented by a coordinate pair of two integers, 
A position on the grid is represented by a coordinate pair of two integers, 

and a move is represented by a position and a digit. 
and a move is represented by a position and a digit. 

−  {{{ 

+  <haskell> 

> type Position = (Int, Int) 
> type Position = (Int, Int) 

> type Move = (Position, Int) 
> type Move = (Position, Int) 

−  }}} 

+  </haskell> 

The State/Maybe monad encapsulating the game state is defined using a monad 
The State/Maybe monad encapsulating the game state is defined using a monad 

transformer. 
transformer. 

−  {{{ 

+  <haskell> 

> type StatePlus s a = StateT s Maybe a 
> type StatePlus s a = StateT s Maybe a 

> type SudokuState a = StatePlus SudokuGrid a 
> type SudokuState a = StatePlus SudokuGrid a 

−  }}} 

+  </haskell> 

We need some ways to read values out of the grid, and update its contents. For 
We need some ways to read values out of the grid, and update its contents. For 

this we use `gets` and `modify`, supplying a variety of projection and 
this we use `gets` and `modify`, supplying a variety of projection and 

Line 165:  Line 166:  
`select`, which are used to obtain ranges of values from the grid, and 
`select`, which are used to obtain ranges of values from the grid, and 

`replace`, which replaces a member of a list at a given index. 
`replace`, which replaces a member of a list at a given index. 

−  {{{ 

+  <haskell> 

> getRows :: SudokuState [[Int]] 
> getRows :: SudokuState [[Int]] 

Line 186:  Line 187:  
> getSubGrids :: SudokuState [[Int]] 
> getSubGrids :: SudokuState [[Int]] 

> getSubGrids = gets $ \rows > [concat $ slice (select rows [0..2]) [0..2], 
> getSubGrids = gets $ \rows > [concat $ slice (select rows [0..2]) [0..2], 

−  > 
+  > concat $ slice (select rows [0..2]) [3..5], 
−  > 
+  > concat $ slice (select rows [0..2]) [6..8], 
−  > 
+  > concat $ slice (select rows [3..5]) [0..2], 
−  > 
+  > concat $ slice (select rows [3..5]) [3..5], 
−  > 
+  > concat $ slice (select rows [3..5]) [6..8], 
−  > 
+  > concat $ slice (select rows [6..8]) [0..2], 
−  > 
+  > concat $ slice (select rows [6..8]) [3..5], 
−  > 
+  > concat $ slice (select rows [6..8]) [6..8]] 
> getSubGrid :: Int > SudokuState [Int] 
> getSubGrid :: Int > SudokuState [Int] 

Line 203:  Line 204:  
> getAllCellValues :: SudokuState [(Position, Int)] 
> getAllCellValues :: SudokuState [(Position, Int)] 

> getAllCellValues = gets $ \rows > 
> getAllCellValues = gets $ \rows > 

−  > 
+  > let indexed = zip rows [0..] in 
−  > 
+  > concatMap (\(row, y) > zip (zip [0..8] (times 9 y)) row) indexed 
> replace index value list = 
> replace index value list = 

−  > 
+  > let indexed = zip list [0..] in 
−  > 
+  > map (\(v, i) > if i==index then value else v) indexed 
> update :: Int > Int > Int > SudokuState () 
> update :: Int > Int > Int > SudokuState () 

> update x y v = modify $ \rows > replace y (replace x v (rows !! y)) rows 
> update x y v = modify $ \rows > replace y (replace x v (rows !! y)) rows 

−  }}} 

+  </haskell> 

Based on this, we can define `doMove` using `update`, and `doMoves` using `Move`. 
Based on this, we can define `doMove` using `update`, and `doMoves` using `Move`. 

−  {{{ 

+  <haskell> 

> doMove ((x, y), v) = update x y v 
> doMove ((x, y), v) = update x y v 

> doMoves = mapM_ doMove 
> doMoves = mapM_ doMove 

−  }}} 

+  </haskell> 

For notational convenience, we define a little parser for moves. `val` 
For notational convenience, we define a little parser for moves. `val` 

−  converts digit characters into 
+  converts digit characters into Ints; `val'` does the same for the letters AH. 
−  +  <haskell> 

> val c = let (Just p) = elemIndex c ['1'..'9'] in p 
> val c = let (Just p) = elemIndex c ['1'..'9'] in p 

Line 231:  Line 232:  
> makeMove (x:y:_:z:_) = ((val' x, val y), (val z) + 1) 
> makeMove (x:y:_:z:_) = ((val' x, val y), (val z) + 1) 

−  }}} 

+  </haskell> 

To set up the initial game state, we apply a list of moves to the empty grid. 
To set up the initial game state, we apply a list of moves to the empty grid. 

−  {{{ 

+  <haskell> 

> initialize = doMoves . map makeMove 
> initialize = doMoves . map makeMove 

−  }}} 

+  </haskell> 

Now we can do some solving! The first thing we need to do is identify all of 
Now we can do some solving! The first thing we need to do is identify all of 

the empty squares in the grid. We then build a list of the permitted values in 
the empty squares in the grid. We then build a list of the permitted values in 

Line 243:  Line 244:  
(N.B. `containingSubGrid` returns the index of the subgrid containing a given position.) 
(N.B. `containingSubGrid` returns the index of the subgrid containing a given position.) 

−  {{{ 

+  <haskell> 

> containingSubGrid (x, y) = (x `div` 3) + (3 * (y `div` 3)) 
> containingSubGrid (x, y) = (x `div` 3) + (3 * (y `div` 3)) 

> getEmptyPositions = do 
> getEmptyPositions = do 

−  > 
+  > allCellValues < getAllCellValues 
−  > 
+  > return $ map (\(p, _) > p) . filter (\(_, v) > v == 0) $ allCellValues 
> getPermittedValues (x, y) = do 
> getPermittedValues (x, y) = do 

−  > 
+  > row < getRow y 
−  > 
+  > col < getCol x 
−  > 
+  > subGrid < getSubGrid $ containingSubGrid (x, y) 
−  > 
+  > return $ [1..9] \\ (row ++ col ++ subGrid) 
> getPossibleMoves = do 
> getPossibleMoves = do 

−  > 
+  > getEmptyPositions >>= mapM (\pos > do pv < getPermittedValues pos; return (pos, pv)) 
−  }}} 

+  </haskell> 

Once we have this analysis of what moves are possible, we can find all the 
Once we have this analysis of what moves are possible, we can find all the 

forced moves in the current position. These are any positions where there is 
forced moves in the current position. These are any positions where there is 

Line 268:  Line 269:  
current grid is valid. If it is, we will take one of the positions with the 
current grid is valid. If it is, we will take one of the positions with the 

lowest number of possible moves, and try each of them in turn. 
lowest number of possible moves, and try each of them in turn. 

−  {{{ 

+  <haskell> 

> atom (x:[]) = True 
> atom (x:[]) = True 

Line 274:  Line 275:  
> getForcedMoves = do 
> getForcedMoves = do 

−  > 
+  > possibleMoves < getPossibleMoves 
−  > 
+  > let forcedMoves = filter (\(_, ms) > atom ms) possibleMoves 
−  > 
+  > return [(pos, v)(pos, (v:_)) < forcedMoves] 
> doForcedMoves = do 
> doForcedMoves = do 

−  > 
+  > forcedMoves < getForcedMoves 
−  > 
+  > if null forcedMoves 
−  > 
+  > then return () 
−  > 
+  > else doMoves forcedMoves >> doForcedMoves 
> getNextMove = do 
> getNextMove = do 

−  > 
+  > possibleMoves < getPossibleMoves 
−  > 
+  > let blocked = filter (\(_, cs) > null cs) possibleMoves 
−  > 
+  > if null possibleMoves 
−  > 
+  > then return [] 
−  > 
+  > else if not (null blocked) 
−  > 
+  > then mzero 
−  > 
+  > else return (fewestOptions possibleMoves) 
> fewestOptions es = 
> fewestOptions es = 

−  > 
+  > let (e:_) = sortBy (\(_, cs1) (_, cs2) > compare (length cs1) (length cs2)) es 
−  > 
+  > (pos, cs) = e in 
−  > 
+  > [(pos, c)c < cs] 
> solve = do 
> solve = do 

−  > 
+  > doForcedMoves 
−  > 
+  > valid < checkValid 
−  > 
+  > if valid 
−  > 
+  > then do nextMove < getNextMove 
−  > 
+  > if null nextMove 
−  > 
+  > then return () 
−  > 
+  > else msum . map (\move > doMove move >> solve) $ nextMove 
−  > 
+  > else mzero 
> checkValid = do 
> checkValid = do 

−  > 
+  > rows < getRows 
−  > 
+  > cols < getCols 
−  > 
+  > subGrids < getSubGrids 
−  > 
+  > return $ (allValid rows) && (allValid cols) && (allValid subGrids) 
> allValid = foldr (\x b > b && (noDups x)) True 
> allValid = foldr (\x b > b && (noDups x)) True 

Line 319:  Line 320:  
> noDups (v:vs) = (if v/=0 then (not (v `elem` vs)) else True) && (noDups vs) 
> noDups (v:vs) = (if v/=0 then (not (v `elem` vs)) else True) && (noDups vs) 

−  }}} 

+  </haskell> 

All that's left is to provide some pretty printing for results, and to create 
All that's left is to provide some pretty printing for results, and to create 

a `play` function that can invoke the solver. 
a `play` function that can invoke the solver. 

−  {{{ 

+  <haskell> 

> prettyPrint :: [[Int]] > IO () 
> prettyPrint :: [[Int]] > IO () 

Line 333:  Line 334:  
> play moves = case (execStateT (initialize moves >> solve) initialGrid) of 
> play moves = case (execStateT (initialize moves >> solve) initialGrid) of 

−  > 
+  > (Just result) > prettyPrint result 
−  > 
+  > Nothing > putStrLn "Failed" 
−  }}} 

+  </haskell> 

For testing purposes, here's a sample puzzle and solution. 
For testing purposes, here's a sample puzzle and solution. 

−  {{{ 

+  <haskell> 

> sample = ["A1:1", "C2:2", "D2:7", "E2:4", "D3:5", "I3:4", "B4:3", "A5:7", "B5:5", "F6:9", "G6:6", "B7:4", "F7:6", "H8:7", "I8:1", "F9:1", "H9:3"] 
> sample = ["A1:1", "C2:2", "D2:7", "E2:4", "D3:5", "I3:4", "B4:3", "A5:7", "B5:5", "F6:9", "G6:6", "B7:4", "F7:6", "H8:7", "I8:1", "F9:1", "H9:3"] 

> solution = ["184963725", 
> solution = ["184963725", 

−  > 
+  > "562748319", 
−  > 
+  > "397512864", 
−  > 
+  > "239657148", 
−  > 
+  > "756184293", 
−  > 
+  > "418239657", 
−  > 
+  > "941376582", 
−  > 
+  > "623895471", 
−  > 
+  > "875421936"] 
> solutionGrid = map (map (\c > (val c) + 1)) solution 
> solutionGrid = map (map (\c > (val c) + 1)) solution 

> test = case (execStateT (initialize sample >> solve) initialGrid) of 
> test = case (execStateT (initialize sample >> solve) initialGrid) of 

−  > 
+  > (Just result) > result==solutionGrid 
−  > 
+  > Nothing > False 
−  
−  }}} 

−  == Conclusion == 

+  </haskell> 

−  I hope that this program shows that it is easy to write simple recursive solvers in Haskell, with state and backtracking semantics layered into the program using monads and monad transformers. In terms of functional programming elegance, I feel this program is on the launch pad but has not yet taken off: the mechanism used for recursion should probably be made generic, and separated out from the particular program that uses it. There are other ways of practising nondeterminism in Haskell, too: the List monad is typically used whenever multiple correct results are possible. 

+  ==Conclusion== 

+  I hope that this program shows that it is easy to write simple recursive solvers in Haskell, with state and backtracking semantics layered into the program using monads and monad transformers. In terms of functional programming elegance, I feel this program is on the launch pad but has not yet taken off: the mechanism used for recursion should probably be made generic, and separated out from the particular program that uses it. There are other ways of practising nondeterminism in Haskell, too: the List monad is typically used whenever multiple correct results are possible. 

Finally, the algorithm presented here is neither particularly intelligent nor particularly fast: there are other ways of deducing forced moves, given an initial state, and all wellwritten Sudoku puzzles should be able to be solved using only deduction (which would eliminate the need for backtracking altogether). 
Finally, the algorithm presented here is neither particularly intelligent nor particularly fast: there are other ways of deducing forced moves, given an initial state, and all wellwritten Sudoku puzzles should be able to be solved using only deduction (which would eliminate the need for backtracking altogether). 
Latest revision as of 00:55, 10 May 2008
This article needs reformatting! Please help tidy it up.WouterSwierstra 14:23, 9 May 2008 (UTC)
Contents
Solving Sudoku
When I first started writing this article, Sudoku was  as far as I knew  the name of a numberplacing puzzle of Japanese origin, set by The Times newspaper. Since then, almost every UK national newspaper has started carrying a weekly or even daily Sudoku puzzle; even the Northampton Chronicle has them (although the Chron's are usually pitifully easy). Sudoku has become a popular craze, and books of puzzles together with hints on how to solve them are even now on sale in newsagents up and down the country.
I started thinking about how to write an automated solver for Sudoku almost as soon as I was introduced to the Times' daily puzzle by my father. On the face of it, it's a fairly straightforward task. The Sudoku "board" is a grid of ninebynine squares, which is further subdivided into nine threebythree cells. Some of the squares have a digit between 1 and 9 written in them; the objective of the puzzle is to place digits in the remaining squares such that every row, every column, and every threebythree region contains every digit from 1 to 9. Each puzzle has only a single solution, which in the case of a properlyset puzzle can always be arrived at through a process of deduction (that is, without recourse to trialanderror methods).
The solver I wrote, which is presented here, is not a purely deductive solver. I had never attempted to solve a Sudoku puzzle by hand when I came up with the algorithm, and was unaware of the full range of deductive procedures that are possible (there may still be quite a few I haven't fathomed). My solver uses a combination of bruteforce search, rulebased pruning of the search tree, and a simple but effective heuristic to guide it to the correct answer more quickly. It was written in Haskell because I wanted to explore some methods of writing programs of this kind in a purely functional language.
Grids and moves
Here is an example Sudoku puzzle:
 # <#999999> A <#999999> B <#999999> C <#999999> D <#999999> E <#999999> F <#999999> G <#999999> H <#999999> I  <#999999> 1 <#EEEEEE> 7 <#EEEEEE>  <#EEEEEE>   1     <#EEEEEE>  <#EEEEEE>  <#EEEEEE> 2  <#999999> 2 <#EEEEEE>  <#EEEEEE>  <#EEEEEE>       6 <#EEEEEE>  <#EEEEEE> 8 <#EEEEEE>   <#999999> 3 <#EEEEEE>  <#EEEEEE>  <#EEEEEE>   8     <#EEEEEE> 1 <#EEEEEE>  <#EEEEEE> 9  <#999999> 4      7 <#EEEEEE>  <#EEEEEE>  <#EEEEEE> 9    1    <#999999> 5    9  3 <#EEEEEE>  <#EEEEEE>  <#EEEEEE>   5  4    <#999999> 6    6   <#EEEEEE> 4 <#EEEEEE>  <#EEEEEE>   9      <#999999> 7 <#EEEEEE> 3 <#EEEEEE>  <#EEEEEE> 8      4 <#EEEEEE>  <#EEEEEE>  <#EEEEEE>   <#999999> 8 <#EEEEEE>  <#EEEEEE> 4 <#EEEEEE>   3     <#EEEEEE>  <#EEEEEE>  <#EEEEEE>   <#999999> 9 <#EEEEEE> 1 <#EEEEEE>  <#EEEEEE>       5 <#EEEEEE>  <#EEEEEE>  <#EEEEEE> 3 
We can encode the puzzle as a list of positions together with the numbers that appear in each position, e.g.
[A1:7, D1:1, I1:2, F2:6, H2:8, D3:8, G3:1, I3:9, C4:7, F4:9, H4:1, B5:9, C5:3, G5:5, H5:4, B6:6, D6:4, G6:9, A7:3, C7:8, F7:4, B8:4, D8:3, A9:1, F9:5, I9:3]
Using the same notation, we can describe any new numbers that we write onto the grid; for instance, G9:4, which would give the grid:
 # <#999999> A <#999999> B <#999999> C <#999999> D <#999999> E <#999999> F <#999999> G <#999999> H <#999999> I  <#999999> 1 <#EEEEEE> 7 <#EEEEEE>  <#EEEEEE>   1     <#EEEEEE>  <#EEEEEE>  <#EEEEEE> 2  <#999999> 2 <#EEEEEE>  <#EEEEEE>  <#EEEEEE>       6 <#EEEEEE>  <#EEEEEE> 8 <#EEEEEE>   <#999999> 3 <#EEEEEE>  <#EEEEEE>  <#EEEEEE>   8     <#EEEEEE> 1 <#EEEEEE>  <#EEEEEE> 9  <#999999> 4      7 <#EEEEEE>  <#EEEEEE>  <#EEEEEE> 9    1    <#999999> 5    9  3 <#EEEEEE>  <#EEEEEE>  <#EEEEEE>   5  4    <#999999> 6    6   <#EEEEEE> 4 <#EEEEEE>  <#EEEEEE>   9      <#999999> 7 <#EEEEEE> 3 <#EEEEEE>  <#EEEEEE> 8      4 <#EEEEEE>  <#EEEEEE>  <#EEEEEE>   <#999999> 8 <#EEEEEE>  <#EEEEEE> 4 <#EEEEEE>   3     <#EEEEEE>  <#EEEEEE>  <#EEEEEE>   <#999999> 9 <#EEEEEE> 1 <#EEEEEE>  <#EEEEEE>       5 <#FF4444> 4 <#EEEEEE>  <#EEEEEE> 3 
One way to think of this in Haskell is to think of each new number added to a specific cell in the grid as a "move", the state of the grid before the move as a value of some type, and the state of the grid after the move as another value of the same type. The act of adding a number to the grid can then be represented as a function:
move :: SudokuState > Move > SudokuState
To set up a puzzle, given a list of initiallyplaced digits and a `SudokuState` representing a blank grid, we could then "fold" the list of initial moves into the grid like this:
moves :: SudokuState > [Move] > SudokuState
moves = foldl move
Given a function, Validate, that returns a value of True if a `SudokuState` is valid and False if it is not, we can already try any sequence of random moves and find out which ones result in a grid that breaks the rules of Sudoku and which ones result in a valid grid. The rules we have to test for are simply that no number may appear more than once in any given row, column, or 3x3 subgrid.
Forced moves
Rather than trying moves that may break the rules and then testing the resulting grid to see whether they do or not, we can analyse the grid and find out which moves are permissible and which are not. One way to do this is to give every empty cell a list of all nine digits, and then remove from that list all of the digits that occur elsewhere in the same row, column or 3x3 subgrid as that cell. If we are lucky, we may find that some cells have only one digit left in their list of possible digits after we have done this. These cells give rise to forced moves, moves that we have no alternative but to play.
If we can find all of the forced moves in a given position and play them, then we will arrive at a new position in which there may be other moves that are forced by the moves we have just played. It's possible to devise a Sudoku puzzle that could be completely solved just by doing this, but in fact most interesting puzzles require other kinds of deductive steps to reach a conclusion. Nevertheless, identifying and playing forced moves helps us to get closer to a solution without recourse to trialanderror.
Just as playing forced moves may bring us nearer to a solution, it can also help us to identify a grid that cannot be solved, because we have placed the numbers in such a way that there are sequences of forced moves that result in an invalid grid. By using forced moves to look ahead for unavoidable failure, we can save ourselves the trouble of trying any more combinations of moves based on a hopeless position. Instead, we can go back immediately to a point where we had some choice about what move to play, and play a different move.
Monads: Definitely Maybe
Now, it happens that our algorithm involves two kinds of operation, transitions from one state to another and failing and backtracking to a previous state, that can be represented quite conveniently in Haskell using monads.
The `State` monad allows us to thread a state value invisibly through a sequence of actions that alter that state, so that instead of explicitly taking that value, applying a change to it and returning an altered value, we can express the statechanging actions in a more imperative style. We can thus replace
move :: SudokuState > Move > SudokuState
with
doMove :: Move > SudokuState ()
where `SudokuState` is a version of the State monad that encapsulates changes to the state of a Sudoku grid. We can then rewrite
moves :: SudokuState > [Move] > SudokuState
moves = foldl move
as
doMoves :: [Move] > SudokuState ()
doMoves = mapM_ doMove
The `State` monad also has some helper functions defined, `gets` and `modify`, that obtain a value from the encapsulated state using a projection function, and update it using a transformation function.
For failing and backtracking, we can use the `Maybe` monad, which is a member of the `MonadPlus` typeclass. The `MonadPlus` laws, applied to the `Maybe` monad, mean that
(Just a) `mplus` (Just b) = Just a
Nothing `mplus` (Just b) = Just b
(Just a) `mplus` Nothing = Just a
Nothing `mplus` Nothing = Nothing
In practice, this means that if we have two functions that return a value in the `Maybe` monad, we can `mplus` them together and take the first that succeeds (failure is defined as returning `Nothing`, which is done by `mzero`). Based on this, we can combine a series of possible moves using `msum`, and return the first that succeeds. This makes it possible to perform a depthfirst traversal of a search tree, by summing together all of the branches at each node in the tree: the first branch that succeeds is taken.
To combine the `State` and `Maybe` monads, we use the monad transformer `StateT`. This allows us to return values in the `Maybe` monad from the `State` monad, and gives us the semantics of both monads combined in a useful way:
type StatePlus s a = StateT s Maybe a
(I am grateful to Michael Weber for showing me how to do this)
Implementing the Solver
This section is a literate Haskell implementation of the solver. We start with some external resources we're going to need.
> import List
> import Monad
> import Control.Monad.State
> import Control.Monad.Trans
Now for the basic types used to represent the contents of the Sudoku world.
The state of the grid can be represented simply as a list of lists of integers: a 9x9 array of cells. Empty cells are represented by a zero.
> type SudokuGrid = [[Int]]
> times n v = take n $ repeat v
> emptyRow :: [Int]
> emptyRow = times 9 0
> initialGrid = times 9 emptyRow
A position on the grid is represented by a coordinate pair of two integers, and a move is represented by a position and a digit.
> type Position = (Int, Int)
> type Move = (Position, Int)
The State/Maybe monad encapsulating the game state is defined using a monad transformer.
> type StatePlus s a = StateT s Maybe a
> type SudokuState a = StatePlus SudokuGrid a
We need some ways to read values out of the grid, and update its contents. For this we use `gets` and `modify`, supplying a variety of projection and transformation functions.
We also define three helper functions: `slice` and `select`, which are used to obtain ranges of values from the grid, and `replace`, which replaces a member of a list at a given index.
> getRows :: SudokuState [[Int]]
> getRows = gets id
> getRow :: Int > SudokuState [Int]
> getRow index = getRows >>= return . flip (!!) index
> select :: [[Int]] > [Int] > [[Int]]
> select values range = map (\n > values !! n) range
> slice :: [[Int]] > [Int] > [[Int]]
> slice values range = map (\n > map (flip (!!) n) values) range
> getCols :: SudokuState [[Int]]
> getCols = gets $ \rows > slice rows [0..8]
> getCol :: Int > SudokuState [Int]
> getCol index = getCols >>= return . flip (!!) index
> getSubGrids :: SudokuState [[Int]]
> getSubGrids = gets $ \rows > [concat $ slice (select rows [0..2]) [0..2],
> concat $ slice (select rows [0..2]) [3..5],
> concat $ slice (select rows [0..2]) [6..8],
> concat $ slice (select rows [3..5]) [0..2],
> concat $ slice (select rows [3..5]) [3..5],
> concat $ slice (select rows [3..5]) [6..8],
> concat $ slice (select rows [6..8]) [0..2],
> concat $ slice (select rows [6..8]) [3..5],
> concat $ slice (select rows [6..8]) [6..8]]
> getSubGrid :: Int > SudokuState [Int]
> getSubGrid index = getSubGrids >>= return . flip (!!) index
> getCellValue :: Int > Int > SudokuState Int
> getCellValue x y = gets $ \rows > (rows !! y) !! x
> getAllCellValues :: SudokuState [(Position, Int)]
> getAllCellValues = gets $ \rows >
> let indexed = zip rows [0..] in
> concatMap (\(row, y) > zip (zip [0..8] (times 9 y)) row) indexed
> replace index value list =
> let indexed = zip list [0..] in
> map (\(v, i) > if i==index then value else v) indexed
> update :: Int > Int > Int > SudokuState ()
> update x y v = modify $ \rows > replace y (replace x v (rows !! y)) rows
Based on this, we can define `doMove` using `update`, and `doMoves` using `Move`.
> doMove ((x, y), v) = update x y v
> doMoves = mapM_ doMove
For notational convenience, we define a little parser for moves. `val` converts digit characters into Ints; `val'` does the same for the letters AH.
> val c = let (Just p) = elemIndex c ['1'..'9'] in p
> val' c = let (Just p) = elemIndex c ['A'..'I'] in p
> makeMove (x:y:_:z:_) = ((val' x, val y), (val z) + 1)
To set up the initial game state, we apply a list of moves to the empty grid.
> initialize = doMoves . map makeMove
Now we can do some solving! The first thing we need to do is identify all of the empty squares in the grid. We then build a list of the permitted values in each of those squares.
(N.B. `containingSubGrid` returns the index of the subgrid containing a given position.)
> containingSubGrid (x, y) = (x `div` 3) + (3 * (y `div` 3))
> getEmptyPositions = do
> allCellValues < getAllCellValues
> return $ map (\(p, _) > p) . filter (\(_, v) > v == 0) $ allCellValues
> getPermittedValues (x, y) = do
> row < getRow y
> col < getCol x
> subGrid < getSubGrid $ containingSubGrid (x, y)
> return $ [1..9] \\ (row ++ col ++ subGrid)
> getPossibleMoves = do
> getEmptyPositions >>= mapM (\pos > do pv < getPermittedValues pos; return (pos, pv))
Once we have this analysis of what moves are possible, we can find all the forced moves in the current position. These are any positions where there is only one move permitted.
First we will find the forced moves and make them, then we will check that the current grid is valid. If it is, we will take one of the positions with the lowest number of possible moves, and try each of them in turn.
> atom (x:[]) = True
> atom _ = False
> getForcedMoves = do
> possibleMoves < getPossibleMoves
> let forcedMoves = filter (\(_, ms) > atom ms) possibleMoves
> return [(pos, v)(pos, (v:_)) < forcedMoves]
> doForcedMoves = do
> forcedMoves < getForcedMoves
> if null forcedMoves
> then return ()
> else doMoves forcedMoves >> doForcedMoves
> getNextMove = do
> possibleMoves < getPossibleMoves
> let blocked = filter (\(_, cs) > null cs) possibleMoves
> if null possibleMoves
> then return []
> else if not (null blocked)
> then mzero
> else return (fewestOptions possibleMoves)
> fewestOptions es =
> let (e:_) = sortBy (\(_, cs1) (_, cs2) > compare (length cs1) (length cs2)) es
> (pos, cs) = e in
> [(pos, c)c < cs]
> solve = do
> doForcedMoves
> valid < checkValid
> if valid
> then do nextMove < getNextMove
> if null nextMove
> then return ()
> else msum . map (\move > doMove move >> solve) $ nextMove
> else mzero
> checkValid = do
> rows < getRows
> cols < getCols
> subGrids < getSubGrids
> return $ (allValid rows) && (allValid cols) && (allValid subGrids)
> allValid = foldr (\x b > b && (noDups x)) True
> noDups [] = True
> noDups (v:vs) = (if v/=0 then (not (v `elem` vs)) else True) && (noDups vs)
All that's left is to provide some pretty printing for results, and to create a `play` function that can invoke the solver.
> prettyPrint :: [[Int]] > IO ()
> prettyPrint rows = mapM_ prettyPrintRow rows >> putStr "\n"
> prettyPrintRow r = mapM_ prettyPrintPos r >> putStr "\n"
> prettyPrintPos 0 = putStr "."
> prettyPrintPos p = putStr (show p)
> play moves = case (execStateT (initialize moves >> solve) initialGrid) of
> (Just result) > prettyPrint result
> Nothing > putStrLn "Failed"
For testing purposes, here's a sample puzzle and solution.
> sample = ["A1:1", "C2:2", "D2:7", "E2:4", "D3:5", "I3:4", "B4:3", "A5:7", "B5:5", "F6:9", "G6:6", "B7:4", "F7:6", "H8:7", "I8:1", "F9:1", "H9:3"]
> solution = ["184963725",
> "562748319",
> "397512864",
> "239657148",
> "756184293",
> "418239657",
> "941376582",
> "623895471",
> "875421936"]
> solutionGrid = map (map (\c > (val c) + 1)) solution
> test = case (execStateT (initialize sample >> solve) initialGrid) of
> (Just result) > result==solutionGrid
> Nothing > False
Conclusion
I hope that this program shows that it is easy to write simple recursive solvers in Haskell, with state and backtracking semantics layered into the program using monads and monad transformers. In terms of functional programming elegance, I feel this program is on the launch pad but has not yet taken off: the mechanism used for recursion should probably be made generic, and separated out from the particular program that uses it. There are other ways of practising nondeterminism in Haskell, too: the List monad is typically used whenever multiple correct results are possible.
Finally, the algorithm presented here is neither particularly intelligent nor particularly fast: there are other ways of deducing forced moves, given an initial state, and all wellwritten Sudoku puzzles should be able to be solved using only deduction (which would eliminate the need for backtracking altogether).
In practice, it is much more rewarding to solve Sudoku puzzles manually than to feed them into a computer and wait a few seconds for an automated solution. On the other hand, writing a solver such as this one can also be a stimulating challenge, and I hope you have enjoyed reading about it