How many Sudokus?

The answer to the question in the headline is: Too many. After I wrote about Sudoku a couple a years ago, I thought I had cured my addiction; but I’ve been a shameless backslider.

I return to the subject now, in this public confessional mode, to correct an error in my 2006 column. Citing a paper by Bertram Felgenhauer and Fraser Jarvis (pdf), I stated that there are 6,670,903,752,021,072,936,960 Sudoku solutions. This is simply the number of ways of filling in the 81 squares without violating the no-two-alike constraints in rows, columns and blocks. I then went on to discuss the symmetries of the Sudoku grid, and noted:

When all these symmetries are taken into account, the number of essentially different Sudoku patterns is reduced substantially…. although [the reduction] still leaves an impressive number of genuinely different solutions: 3,546,146,300,288, or 4×1012.

The latter number also comes from Felgenhauer and Jarvis, but I misinterpreted it. Contrary to my statement, it takes into account only a few of the simplest symmetries. The true number of essentially distinct Sudoku solutions is reported in a slightly later publication by Ed Russell and Jarvis. The correct tally is smaller by three orders of magnitude: 5,472,730,538.

Thanks to Richard E. Dickerson for pointing out my error.

It needs to be emphasized that both of these calculations count the number of solutions, not the number of puzzles. For any given solution, there are many possible puzzles, each created by erasing a different subset of the entries in the solution. How many puzzles per solution? Does anyone know? An exact count seems beyond reckoning, but random sampling ought to yield a rough estimate. For example, consider just those puzzles that have 17 initial clues (the apparent minimum). For any given solution grid, there are (81 choose 17) ways of forming a 17-clue puzzle; this works out to 128,447,994,798,305,325. Most of those puzzles will not have a unique solution, and so they are disqualified as proper Sudokus. The random sampling is needed to estimate how many of the potential puzzles are admissible.

I’d try doing the calculation myself, but I’m too busy solving Sudokus.

Update 2008-04-05: Richard E. Dickerson discovered the error mentioned above in the course of preparing his own discourse on Sudoku-solving strategies. That work now has its own web site, The Sudokerson Matrix, where you can download the 61-page PDF file and an accompanying spreadsheet. Dickerson also mentions some other resources on Sudoku strategy, including a recent book by Andrew C. Stuart that I haven’t yet seen. Dickerson comments:

Few books on Sudoku go into strategies in any depth. Stuart’s The Logic of Sudoku (Michael Mepham, 2007) is a wonderful exception. But his book and mine represent the two opposite poles of Sudoku-solving strategy. I regard many of his methods as superfluous, and I am sure that he regards most of my strategies as heretical. You need to have a close look at both.

What I find intriguing in all this is the very idea that we could have contending doctrines of Sudoku strategy. But before the talk of “heresy” gets out of hand and leads to Sudoku holy wars, I’d like to remind everyone that there’s another view of Sudoku in which all strategies become superfluous. Give me a computer and a program for solving constraint-satisfaction problems, and every 9×9 Sudoku yields in milliseconds.

Posted in games, mathematics | 8 Comments