Path: physics insights > misc >Sudoku 
Puzzle 1 Before we begin, here's one to try your luck on while you're reading the page. Maybe you can solve it by the time you get to the bottom of this page (if you read slowly enough)! (Sorry, no slick user interface here; to conveniently work on this, you'll need to print it and use a pencil. Click on it to view the image, which is a little more printerfriendly than printing the whole page.) The solution (yes, it has a solution!), along with notes on some features of the puzzle, is given at the bottom of this page. 
I'm going to call the five things we need to worry about boards, cells, boxes, rows, and columns on this page.
The cells are the things you put numbers in.
The "boxes" are the nine 3x3 subarrays of the board.
If I slip up and call something a square, I'll usually mean a
"cell".
I've spent far too much time fiddling with sudoku puzzles since I first looked at one a few years ago. Not too surprisingly, that first puzzle left me completely at a loss as to how to proceed. So, of course, I wrote a program to solve it.
That first Perl script worked by guessing a value, one cell at at time, and backtracking when it got to a point with no legal guesses. It worked but it was horribly slow.
Of course, I eventually figured out that one can annotate the cells with the possible values, and fill in any which have just one possible value, and then update the annotations, and so on and so forth until the board's done. And I learned some things. Mostly what I learned is that I made a lot of mistakes. There were two kinds of errors I committed over and over again:
So, to eliminate these little problems, I stopped annotating (no more annotation errors!) and I switched from a pencil to a black pen (same color as the printer's ink). Much better!
Of course, there were some drawbacks to my new way of doing
things. Without annotations, it definitely takes longer to do the
puzzles (except when I would have messed up the annotations, of
course). (And, as I learned when doing these pages, I can't
do some puzzles without annotation  there comes a point where I get
stuck. So, I would correct this to say I do most puzzles
without annotating...)
And when I make a mistake with a pen, I need to catch it right away;
otherwise, if I've filled in more than a few cells wrong, the puzzle's a
total loss. So I try not to make mistakes. (Hah hah.)
Before I get to the point of this page (assuming it has one), I'm going
to take a moment to explain my own approach to these puzzles. I
figure at least a few people reading that I do them without annotations
and never erase may have wondered a bit at what I actually do.
Please note that I'm not suggesting anyone else take this approach; it's
not especially efficient, and you'll certainly never win any sudoku
speed solving contests with it.
I use negative constraints as much as I can, and I use a visual
approach to filling in values. I start by working with the boxes,
and only look at rows and columns after I've mostly run out of boxbased
operations.
I'm sure that all meant nothing to anyone reading it, so I'll draw it, and work through several moves on an example board. Suppose we have a puzzle that looks like this (click on the image to get something printable):
I would start this by first working with 1, and then iterating up through all nine possible values.
For each value, I move (my eyes) around the board, looking at each instance of that value in turn. I then mentally cross out the row and column in which that cell resides, along with the row and column of any other cell which contains the same value, and see if that gets me anywhere. If it does, I fill in a value, and continue on.
Usually, to help with the visualization, I run the back end of the pen
along the rows and columns to (invisibly) "cross out" the affected rows
and columns. (And sometimes I use the business end of the pen by
mistake, and then I make a mess...)
Looking over the above board, I don't get any joy from the 1's, nor the 2's, nor the 3's (and I'm beginning to wonder why I chose this board for my example). Finally, with the 4's, I see something useful:
The cell with the green circle must be a 4. So we fill that in, and continue on up through 9.
(By the way, if you try this step with Puzzle 1, at the top of the
page, you won't find any values you can fill in this way.)
And that's all I see on this pass. So we need to look a little harder. (In fact, I'd actually do a more complex search to start with; I'm breaking what I do down into smaller steps to make the process  ahem  slightly less murky.)
We look at each of the numbers again, but this time we're going to pay some attention to what's going on in nearby cells. 1 doesn't get us anywhere, but with 2 we see something. The 2 in the upper right box blocks one cell in the lower right box, which means one of two cells in the top row of that box must be a 2 (green oval):
We don't know which cell outlined in green has the 2, but we know one of them does, and we can (mentally) cross out the row containing them.
We extend the "cross out" across the board to the left, and then we
"cross out" the column containing the 2 in the upper left box. In
the lower left box, we see that the purple oval must consequently
contain a 2. Unfortunately, this doesn't actually get us
anywhere, since the oval spans two cells, but it illustrates the
technique. (By the way, this technique won't help with Puzzle
1, at the top of the page.)
At this point we've gone about as far as we can go with "negative constraints" and we need to look at the positive constraints: We need to pay some attention to the rule that every digit must be present in every row, column, and box.
We'll start with the lower right box. (We happened to be looking in its direction already, and it only has three blank cells, which is a manageable number; there's no deeper reason for starting there.) It contains the values 1, 4, 5, 6, 7, and 9, so it's missing 2, 3, and 8. Let's see where they can go.
We see a 2 in the upper right box and an 8 in the lower left box which have some effect on the lower right box. Let's draw their "crossout" lines, pink for the 2, and blue for the 8:
The 2 and the 8 are both restricted to the blue oval in the lower right
box. So, the cell with the green circle in it must contain a 3,
since it can't be a 2 or 8, and there are no other values missing from
the lower right box.
The top two corner boxes don't have anything interesting for us at this time, and the center box, which is surrounded by empty boxes, can't be solved until we've got something filled in in one of the side boxes. So let's look at the lower left box.
The lower left box is missing four values: 2, 6, 7, and 9. Glancing around the board, we see that 7 and 9 both appear in the bottom row. Crossing that out (shown in pink), we see the following:
7 and 9 are restricted to the squares under the green oval in the lower left box.
So, the two empty cells in the bottom row of the lower left box must contain 2 and 6. We've also crossed out (in blue) the column containing a 2 in the upper left corner. From that, we can see that the missing 2 must go in the cell with the blue circle in it, and that leaves only the lower left corner cell for the 6.
(By the way, this technique, also, will not help with Puzzle 1.)
That's about as far as we can go working with the boxes, so let's look at a row. The bottom row has just 3 blanks in it, so let's see if we can make any progress there.
It's missing the values 1, 4, and 8. We've crossed out the column which contains 1 and 8 (in pink):
1 and 8 must both go into the cells marked with the green circles. That leaves only the center cell for the missing 4.
We're now in a position to go back to using simple "crossing out" by itself to make some more progress.
If we cross out the two columns containing 4s in the center box, and cross out the two rows containing 4s in the top left and right boxes, we see:
And the only place a 4 can go in the upper middle box is the cell with the green circle.
I think this is sufficient to give a pretty clear picture of how I go
about solving puzzles without annotations. Continuing the same
way, it's possible to solve this puzzle completely, as you can see here (I
diverged from my usual black pen, and switched to a blue pen for the
additional values which I filled after the 4 in the upper middle box).
As I said to start with, it's slower to do it this way  annotations allow you to make the same deductions without the need to visually rescan the board each time. But, as I also said, this way I never mess up the annotations (and I find it more fun to do it this way).
(Finally, I should mention that none of the techniques I've described so far will help in the least with solving Puzzle 1, at the top of the page.)
N.B.  Since writing the first draft of this page, I renumbered my "solver levels" to split level 4, thus leading to six total levels, rather than five. There may still be a few places where I refer to the former 5level scheme. If so, sorry for the confusion!
The Perl script I use to analyze (and solve) sudokus is given here.
The solver script reads a text file which contains the board to be
solved. The format is simple: there's one line per row, and the
values are delimited by blanks. Once it's read the board, it tries
to solve it; it proceeds exactly the way a human would do it. If
you want it to analyze a board which you subsequently plan to solve by
hand, pass it the argument no_show; it will
still print a rating and some additional information about the board,
but won't print the solution. (If you just want to know how
difficult a board is, use the script score_puzzle,
which runs the solver and digests the output to produce just the
difficulty information.)
Solving is done by determining the permissible values for each empty cell, and then operating on the permissible value sets. (In other words, it annotates the board and looks for cells that have just one value on their list.) It runs up to 5 passes over the board, using progressively more complex operations. After each pass, it checks to see if it's filled all cells on the board; if it has, then it's done.
The difficulty level of a puzzle, typically indicated by a number of stars, is a rather mushy concept. What, exactly, makes a puzzle hard to do? The number of empty squares, the number of squares you can fill in easily to start with, the lengths of the possibility lists you'll need to deal with  a number of things affect it. It's obvious, after looking at puzzles from several different companies, that everybody who sets out to rate boards does it a little differently.
The solver script discussed here provides an alternative, much clearer
way of rating a board. The solver level of the puzzle is
the number of passes it took the solver program to solve it. Since
each pass uses more sophisticated techniques, this is a direct measure
of how clever you need to be to solve the puzzle. (Unfortunately
this doesn't correspond all that well with how "hard" a puzzle seems to
be, which is probably why everybody who makes these things comes up with
some other way of assigning stars.)
When the solver script runs, it keeps a work queue of rows, columns,
and boxes to be processed; before each pass, all rows, columns, and
boxes are pushed onto the queue. (Unlike a human, the program
never forgets to update the annotations on affected cells when it fills
one in!) Here are the passes that it runs, with each pass being
used only if the previous passes failed to generate a solution:
a 
b 
c 
4 
5 
6 
7 
8 
9 
a 
b 
3 
4 
5 
6 
7 
8 
9 
a 
b 
c 
d 
5 
6 
7 
8 
9 
a 
b 
c 
d 
e 
6 
7 
8 
9 
a 
b 
3 
d 
e 
6 
7 
8 
9 
a 
b 
c 
d 
5 
6 
7 
8 
9 
a 
b 
c 
4 
5 
6 
7 
8 
9 
a 
b 
c 
d 
e 
f 
7 
8 
9 
1 
2 
3 
d 
e 
f  7 
8 
9 
Well I was planning on writing this section, but this is all taking way too long. I plan to get to it but it's going to be a while yet.
Here's our example board, with the fetching name Board A1c:
It's worth taking a moment to admire it. It doesn't look like much  not symmetric, no blank rows or columns, no entirely missing values, nothing especially "cute" about it  but it's the end result of running the board builder through about 5,000 boards under control of a driver script, looking for that one special board. It has the remarkable property that there are cells which can be filled in on (nearly) every pass of the solver, and so it is an ideal example board. (I wouldn't recommend trying to do it by hand, however, for reasons we'll see a little later.)
And I'll fill in the rest of this section just any day now!
I said above that every Sudoku (but one) which I've seen (and analyzed) from a newspaper was level 1, 2, or 3.
A few months ago, I would have said that was true of all the Sudokus I'd seen (save for my own creations).
A couple years ago, someone gave me a book of Sudoku puzzles from Kappa (volume 21, as it happens). They were rated with from 1 to 4 stars. The 1star puzzles (at the beginning of the book) were really easy, so I tried the 4star puzzles (at the back of the book). All of them which I tried were very easy, also, and I put the book aside.
A couple months back, for no particular reason, I took it down off the book case and started doing the 4star puzzles. As I said, they all seemed quite easy  until I got stuck on #95. That sometimes happens to me (comes from doing them without notes, or so I claim, anyway); normally it just means I've overlooked something. I put it aside, and looked at it again the next day.
Still stuck.
And the next day: still stuck. At that point, I went over the puzzle with a fine tooth comb, checking everything carefully, paying close attention to what I was doing. I even annotated it:
Still stuck.
So, I dusted off my old solver script, which I hadn't run in years, and
fed it puzzle 95. (And that was what got me to thinking again
about finally writing this page, by the way.)
It's a level 6. And it's in among a bunch of level 2 socalled "4star" puzzles; there's no hint in the text that there's a wolf in there among the sheep.
A bug in Kappa's software? I don't know. (All sudokus are
created by computer, of course  hand building one would be well nigh
impossible.) I went through their remaining 4star puzzles which I
hadn't yet done, feeding them all to the solver (with the no_show
argument, just to get the ratings). Five of the 27 "4star"
puzzles in the book were level 6; a couple were level 3; the rest were
easy level 2 puzzles. Some major inconsistency there!
Since, as far as I know, a level 6 puzzle is essentially unsolvable, save by machine (unless you have the patience to use guessandbacktrack techniques manually) this seems pretty tacky to me, but maybe people get a kick out of hitting an occasional "impossible" puzzle. Maybe it increases the satisfaction of solving the easier ones which surround them.
I get dailypaper puzzles from Dave Green (Conceptus puzzles), in the Ottawa Citizen, and I get them from Fabian Savary, in Devoir. I used to see them in the Boston Globe (a long time ago). I also occasionally do puzzles from Yan Georget, which appear in LeMonde, when I feel moved to print them out (I just get it the online version of that paper).
The puzzles in the papers are all, in general, pretty easy; I never am (quite) forced to annotate them in the course of solving them. So, imagine my surprise when I printed out a puzzle from Le Monde and got stuck trying to solve it. Here it is, at the point where I got stuck:
You guessed it  it is a level 6.
Mistake?
I don't know. It was rated "Expert", after all. But again,
this seems pretty tacky to me; it's such a huge jump in difficulty from
their usual "Expert" puzzles that it is, essentially, unsolvable.
The Perl script I used to create the puzzles on these pages is given here.
It has a large number of arguments, with a very primitive interface, and its output is simple text. See the README.txt file in the scripts directory, and check the argument list in the script for detailed information. I may eventually write more about it. In the mean time, if you don't know Perl you may not want to bother with it.
If you do use it, and its sister script, build_sudoku_to_spec, you will
probably want to run make_sudoku_board
on the output to turn it into something which can be printed nicely.
See: Some Puzzle Examples.
These are mostly level 4 or 5 boards, the like of which one doesn't see
in the daily papers. There are just a few, but they're all
hand tested, with comments on the results...
As promised, here are some comments on Puzzle 1, given at the top
of the page. If you tried to do it, you're probably already aware that there's something strange going on there. In fact, you might even suspect that it's broken, or a mistake, and you may be starting to doubt that there's a solution to it at all. Well, let me reassure you; the puzzle isn't broken, and there's no error in it. It has a solution, and what's more it has just one (in other words, it's fully determined, like any good Sudoku puzzle). So, here's the solution, along with some notes from the solver output when run on that puzzle. And in case you're wondering, I certainly did not solve that one by hand. I gave it a whirl when I put this page together, but it's ... well ... strornry hard, as Christopher Robin might have expressed it. If you solve it (by hand) please email me (the webmaster), and let me know how you went about it! 
