----+----|----+----|----+----|----+----|----+----|----+----|----+----
SuDoku1.txt: Describing the Su Doku solver and related code
2006-Jun-03 / TN
Introduction: Representing the board
------------------------------------
SuDoku.hs contains functions for solving SuDoku puzzles. The board
under consideration is represented by a rectangular array of fields.
(These "fields" are called "cells" in other SuDoku-related writings,
sorry about that.) For each field, the set of possible elements that
may be placed in the field is maintained.
An ordinary SuDoku puzzle uses a 9*9 array and the possible elements
are the digits 1, 2, .. 9. Initially, the fields with given contents
in the SuDoku puzzle hold a set containing just the single given
element. And the blank fields of the puzzle hold the set of all
possible elements.
Solving: Reduction and trials
-----------------------------
Solving works by reducing the size of the sets that represent
possible field contents in a sequence of steps. A solution has been
found when each set contains exactly one element.
If reduction alone fails to produce a solution, a single trial step
is taken: A field with (the fewest number of) non-single possiblities
is selected and each possibility is tried one by one. For each
possibility, renewed reduction is performed recursively, possibly
leading to further trial steps and so on, until a solution is found.
The solutions produced by each trial are combined, thus all solutions
will be found.
This solution strategy is implemented by the sdkSolve function.
Constraint sets
---------------
Basic reduction is performed on one constraint set at a time. A
constraint set is some selection of fields that must fulfill the
basic SoDuko condition: That each field must contain precisely one
element. And that all fields in the constraint set must contain
different elements.
For ordinary SoDuko puzzles, the constraint sets are the nine rows,
the nine columns and the nine 3*3 squares that subdivide the 9*9
board.
The SuDoku solver is parameterized over the constraint sets. It is
not restricted to solve ordinary 9*9 SuDoku puzzles.
The basic board is assumed to be a rectangular array of fields. A
constraint set is then represented by the list of indexes that
address the fields of the constraint set. For example, the upper
right 3*3 square constraint set of an ordinary 9*9 SuDoku puzzle is
represented by the list (array indexes are 0-based):
[(0,6),(0,7),(0,8),(1,6),(1,7),(1,8),(2,6),(2,7),(2,8)]
The function sdkConstraintSetsTraditional computes constraint sets
for a class of SuDoku puzzles called "traditional" which is a n^2*n^2
square that generalizes the ordinary SuDoku puzzle that has n=3. The
function sdkConstraintSetsClover computes constraint sets for a
class of puzzles called "clover" that consists of 5 n^2*n^2 squares
in a clover-like pattern with a single n*n square of each "leaf"
square overlapping the "center" square. The clover pattern is
illustrated for n=3 as follows:
. . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . .
Additional classes of SuDoku puzzles may be defined similarly.
Reducing the board
------------------
A single round of board reduction is performed by doing a basic
reduction on each constraint set (the function sdkBoardReduceAll).
Reduction closure, then, repeatedly performs single board reductions
until the board changes no more (the function sdkBoardClosure).
Basic reduction strategies: By field and by element
---------------------------------------------------
Two different basic reduction strategies are implemented for a
constraint set: Reduction by field and reduction by element.
Field reduction is based on the consideration of subsets of fields of
the constraint set to be reduced. For some subset of the fields, the
union of the elements that can be contained in any of the fields is
computed. If the number of elements in this union is less or equal to
the size of the subset, the subset is said to be reducing and the
elements of the union cannot be used outside this subset of fields.
Hence the elements of the union can be removed from all the other
fields of the constraint set.
As an obvious special case of this, if a single field can only
contain a single element, then that element can be excluded from all
the other fields.
Field reduction uses this idea repeatedly: The non-empty subsets of
the fields of the constraint set are considered in order of
increasing size. If a subset is found to be reducing, the element
union is excluded from the remaining fields of the constraint set and
field reduction is repeated on this remaining set of fields. Field
reduction is done by function sdkReduceByField.
(Technical note: sdkListLengthLE is used rather than comparison of
list lengths to improve performance.)
For element reduction, subsets of elements are considered. For some
subset of elements, the set of fields that can contain any of the
elements is computed. If the number of such fields equals the number
of elements in the subset, any additional elements recorded as
candidates for these fields can be excluded.
For example, consider the subset {1,2} of elements and assume that we
find that the only fields that may contain 1 or 2 have {1,2} and
{1,2,3} as candidate element sets. Then 1 and 2 must occupy precisely
these two fields and 3 can be excluded as a possibility for the
second field.
As for field reduction, element reduction uses this idea repeatedly,
considering at each step the shortest reducing element subset
(function sdkReduceByElement).
Field reduction and element reduction produce identical results
---------------------------------------------------------------
It has come as a significant surprise to me that the two reduction
strategies, although they certainly appear quite different to me,
nevertheless produce identical results. At least in the practical
circumstances of solving real SuDoku puzzles. And since field
reduction is much faster than element reduction, the present version
of the code uses field reduction only to find the final solutions:
The call of sdkSolve in sdkProcessSuDoku uses by=sdkReduceByField.
The element reduction code has been retained, however, for possible
further experiments.
Command line interface to sdkSolve
----------------------------------
The Main module contained in SdkMSol2.hs provides a command line
interface to the SuDoku solver. Here is an example of a call of
the solver that uses runhugs:
runhugs SdkMSol2 \
tn1 \
Traditional 3 \
-#123456789 \
1-53---9- \
---6----- \
------271 \
82------- \
---487--- \
------53- \
23------- \
--7-59--- \
--6---8-4
This is a command for a Linux system. A similar command can be used
under Windows. (I usually keep these long commmands in individual
command files, i.e. Shell scripts or BAT files.) And, of course,
SdkMSol2 can be compiled by GHC and run in a similar manner.
The first parameter "tn1" is simply a name for the puzzle used in the
output. In this case, the first (and so far only) SuDoku puzzle
invented by Yours Truly. The next parameter "Traditional" selects
the class of puzzle desired. Following parameters may be used to
qualify specific puzzles within the class, in this case the "3" which
selects the ordinary 9*9 SuDoku puzzle.
The remaining parameters lay out the initial contents of the
rectangular board. Parameters of the form
#
("#" is used rather than "=" which gives problems in BAT files) are
definitions where the first character is defined to represent the set
of characters listed. Thus, the parameter "-#123456789" defines "-"
to represent the set "123456789". Parameters not of this form
represent rows of the, generally rectangular, board: In such a row,
characters that have been "defined" represent their set, other
characters represent the set containing that single character.
(The command line interface thus restricts the elements used by the
solver to be individual characters. However, the solver itself is not
restricted to handle single-character elements.)
Notice that the puzzle class with parameters represents the
constraint sets used independently of the board, whose shape is
basically defined by the number of rows specified and their maximum
length. Conflicts are not verified explicitly, but generally result
in (sometimes obscure) runtime errors.
Selected lines from the output of running the above command:
SdkMSol2: 2006-Jun-04 13.37
SdkMSol2(tn1)
Height 9 Width 9
1-53---9-
---6-----
------271
82-------
---487---
------53-
23-------
--7-59---
--6---8-4
tn1: Problem:
1 123456789 5 3 123456789 123456789 123456789 9 123456789
123456789 123456789 123456789 6 123456789 123456789 123456789 123456789 123456789
123456789 123456789 123456789 123456789 123456789 123456789 2 7 1
8 2 123456789 123456789 123456789 123456789 123456789 123456789 123456789
123456789 123456789 123456789 4 8 7 123456789 123456789 123456789
123456789 123456789 123456789 123456789 123456789 123456789 5 3 123456789
2 3 123456789 123456789 123456789 123456789 123456789 123456789 123456789
123456789 123456789 7 123456789 5 9 123456789 123456789 123456789
123456789 123456789 6 123456789 123456789 123456789 8 123456789 4
tn1: Closure by field:
...
1 78 5 3 27 28 4 9 6
79 479 2 6 1479 14 3 8 5
369 4689 3489 589 49 458 2 7 1
8 2 139 159 1369 1356 1679 4 79
3569 1569 139 4 8 7 169 16 2
679 14679 149 129 1269 126 5 3 8
2 3 18 18 46 46 79 5 79
4 18 7 128 5 9 16 126 3
59 59 6 7 123 123 8 12 4
tn1: Solutions:
1 7 5 3 2 8 4 9 6
9 4 2 6 7 1 3 8 5
3 6 8 5 9 4 2 7 1
8 2 9 1 3 5 6 4 7
6 5 3 4 8 7 9 1 2
7 1 4 9 6 2 5 3 8
2 3 1 8 4 6 7 5 9
4 8 7 2 5 9 1 6 3
5 9 6 7 1 3 8 2 4
tn1: Total of 1 solutions
tn1: CPU Used: 53.4
Thus, the problem is stated (in two different ways), the sequence of
intermediate boards resulting from performing field reduction on the
initial board is printed and, finally, all the solutions are printed.
(And since field reduction is also carried out, starting from the
initial board, as part of finding the solution, field reduction of
the initial board is actually carred out twice. Something that may be
changed by editing the sdkProcessSuDoku function.)
Here is an example of a command for solving a 16*16 SuDoku puzzle:
runhugs SdkMSol2 \
MasterSUDOKU_106_20051208 \
Traditional 4 \
-#0123456789ABCDEF \
--6--37--B-C--E- \
-C4A08D-5-7-2--- \
-9--F----0--C1-- \
-1E8--6-2D-4--5- \
F-CBE---12-39-7- \
-A8-B-----F7---1 \
7-D4-9---C-0---8 \
--------D-9--B-- \
--2--0-9-------- \
E---8-A---4-12-B \
4---67-----F-90- \
-0-FD-C1---E38-5 \
-B--4-97-E--80A- \
--F0--2----9--3- \
---E-F-3-A0B6C1- \
-4--C-0--81--F--
The final example uses the clover class:
runhugs SdkMSol2 \
JuleSuDoku_Fem_I_En_20060102 \
Clover 3 \
-#123456789 \
---6327-1...--7---23- \
--24--5-6...4--1----- \
--85---9-...1-5-8--7- \
94----127...-6----8-5 \
1-5------...-3---714- \
87--9----...-41-62-97 \
5873-------3----3--19 \
--4--8----4----52--6- \
2---65-----9---7--58- \
......5-----12-...... \
......-8---67-9...... \
......--1372---...... \
64---9----14----4---- \
--263-------------74- \
-93-4----93----5-3--2 \
----68-7-...2-5--8-9- \
---------...84--9-57- \
5--3--84-...-----5-26 \
--54-36--...7-------- \
2--9-5--7...-5438-61- \
-3782----...--37694--
Notice the dots (".") used. They do not enter any constraint sets,
but are needed to fill out the board.
Performance
-----------
A single performance improvement has been implemented: It is the
matter mentioned above that relates to sdkListLengthLE. I have no
idea how the performance of this solver compares to other solvers.
But it is clear that considerable improvement, I would guess
certainly one, perhaps even two, orders of magnitude, would result
from using bit vectors rather than Haskell Strings to represent
sets.
Experimenting with reduction strategies
---------------------------------------
The main module in t44.hs uses miscellaneous functions in SuDoku.hs
to exhaustively check whether field reduction is identical to
element reduction under reasonable assumptions.
Test
----
The Main module in t40.hs contains test code.
Final words
-----------
Best wishes
Thorkil Naur