Archive for December, 2009

17 x 17: A nonprogress report

Tuesday, December 22nd, 2009

The question again: Is there a four-coloring of the 17-by-17 grid in which none of the 18,496 rectangles have the same color at all four corners? As I said last time, Bill Gasarch would not have put a bounty on this problem if it had an easy solution. Over the past couple of weeks I’ve invested some 1014 CPU cycles in the search, and a few neural cycles too. I have nothing to show for the effort, except maybe a slightly clearer intuition about the nature of the problem.

If you generate a bunch of four-colored n-by-n grids at random, the average number of monochromatic rectangles per grid increases quite smoothly with n:

random-grid-stats.png

This gradual progression might lead you to suspect that the difficulty of finding or producing an n-by-n grid that is totally devoid of monochrome rectangles would also be a smooth function of n. The truth is quite different.

success-graph.png

Finding a solution is easy for square grids of any size up to 15 by 15. The task suddenly becomes very hard at size 16 by 16. As for 17 by 17, it’s much harder still–and indeed is not yet known not to be impossible. (Details on the data behind this graph: For each size class from n=8 to n=17 I started with 1,000 randomly four-colored n-by-n grids. Then I applied a simple heuristic search (the first of the algorithms listed below) to each grid, running the program for 1,000 × n2 steps. The graph records the number of times this procedure succeeded–i.e., produced a grid with no monochrome rectangles–at each grid size. Up to n=14, the search never failed; at n=16 and beyond, it never succeeded.)

This kind of sudden transition from easy to hard is a familiar feature in the realm of constraint satisfaction. Well-known intractable problems such as graph coloring and boolean satisfiability have the same structure. That doesn’t bode well for any of the simple-minded computational methods I’ve tried. Here’s a brief catalog of my failures. These are algorithms that I’m pretty sure are never going to pay off.

  • Biased random walk. Start with a randomly colored grid. Repeatedly choose a site at random, then try changing its color; accept the move if it reduces the overall number of monochrome rectangles. This is the simplest of all the algorithms. None of the more-elaborate schemes is decisively better.
  • Whack-a-mole. Find all the monochrome rectangles in the grid; choose one of them and alter the color of one corner, thereby eliminating the rectangle. In the simplest version of this algorithm, you choose the rectangle and the corner and the new color at random; in more sophisticated versions, you might evaluate the alternatives and take the one that offers the greatest benefit.
  • Steepest descent. Examine all possible moves (for the 17-by-17 grid there are 867 of them) and choose one that minimizes the rectangle count.
  • Lookahead steepest descent. Examine all possible moves, and then all possible sequels to each such move (for the 17-by-17 grid there are 751,689 two-move sequences); choose a sequence that minimizes the rectangle count. In principle this method could be extended to chains of three or more moves, but the cost soon gets out of hand. (The lookahead technique is the mirror image of backtrack search; it explores the tree of possible moves breadth-first instead of depth-first.)
  • Color-balanced search. Allow only moves that maintain the overall balance of colors in the grid. For example, in a 16-by-16 four-colored grid, color balance implies 16 sites in each color. One way to maintain balance is to make moves that swap the colors of two sites. (There is no reason to think that a rectangle-free grid will have exact color balance; on the other hand, a solution for a large grid cannot depart too far from perfect balance. Thus a color-balanced search might be an effective trick for finding a neighborhood where solutions are more common.)
  • Row-and-column-balanced search. Allow only moves that maintain the balance of colors within each row and each column of the grid. In a 16-by-16 grid, each row and column should have four sites in each of four colors. A simple way to maintain this detailed color balance is to search for “harlequin rectangles” with the color pattern \(\begin{array}{cc}a & b\\b & a\end{array}\) and permute them to \(\begin{array}{cc}b & a\\a & b\end{array}\).

Most of these techniques are greedy: At each stage the algorithm chooses an action that maximizes some measure of progress. On hard instances, a pure greedy strategy almost always fails; the search gets stuck in some local optimum. Thus it’s usually best to temper the greediness to some extent, occasionally choosing a move other than the one that yields the best immediate return. (The family of methods known as simulated annealing are more elaborate variations on this idea, based on insights from thermal physics.)

greediness-traces.png

Here we see traces of three runs of the algorithm identified above as steepest descent, with differing values of a greediness parameter m. (The grid size is 15 × 15.) At m=0 (no greediness at all), all moves are equally likely to be chosen, and the algorithm executes a random walk on the space of grid colorings. At m=1 (maximum greediness), the program always chooses the highest-ranked move, which works well until the system stumbles into a state where no move can reduce the rectangle count. A value of m=0.3 seems to be a good compromise. (I’ll say a little more below on the greediness parameter; indeed, I have a question about how best to define and implement it.)

After all this fussing with a dozen variations on local-search algorithms, I’m afraid the outlook for success is not promising. With a little patience and some tuning of parameters, any one of these algorithms can solve grids up to 15 × 15. With a lot more patience and tuning, they’ll eventually yield answers for 16 × 16. But none of the algorithms come even close to cracking the 17 × 17 barrier. Solving that one is going to require a fundamentally new idea. Perhaps someone will find an analytic approach to constructing a solution, rather than blindly searching for one. Or perhaps someone will prove that no solution exists.

On the computational front, I suspect the best hope is a family of algorithms known in various contexts as belief propagation, survey propagation and the cavity method. I’ve been hoping that friends who are expert in these techniques might swoop in and solve the problem for me, but if not I may have to give it a try myself.

In the meantime, here’s the thing about greediness (an apt subject for this time of year?). We want to define a function greedy whose arguments are a vector of alternative moves ranked from best to worst and a number m such that 0 ≤ m ≤ 1. If the greediness parameter m is 0, the function returns a random element of the vector. If m = 1, the returned value is always the first (highest-ranked) move. Otherwise, we must somehow interpolate between these behaviors. One attractive notion is to return the first element of the vector with probability m, the second choice with probability m(1 − m), and so on. Thus for m = 1/2 the series of probabilities would begin 1/2, 1/4, 1/8…. For m = 1/3 the first few values are 1/3, 2/9, 4/27….

This scheme works just fine for a vector of infinite length, but there’s a problem with shorter vectors. Consider what happens with the procedure call greedy(v=[1, 2, 3, 4], m=0.5). We have the following table of probabilities:

     1 --> 1/2
     2 --> 1/4
     3 --> 1/8
     4 --> 1/16

But on adding up those values, we come up 1/16th short of 1. What happens to the missing probability? I took an easy way out, distributing the “extra” probability equally over all the elements of the vector. The code looks something like this:

function greedy(v, m)
  for i=0 to length(v)
     if (i==length(v))
        return v[random(length(v))]
     elseif (random(1.0) < m)
        return v[i]

This procedure seems to give sensible results, but I wonder if there might be a better or more natural definition of greedy probabilities. Also, the running time for my code is logarithmic in the length of the vector (assuming m < 1). Is there a constant-time algorithm that gives the same results? (We don’t know the length of the vector in advance, so merely precomputing the table of probabilities is not an option.)

The 17×17 challenge

Saturday, December 5th, 2009

William Gasarch is not the Clay Mathematics Institute. He isn’t paying a million bucks for proofs of famous conjectures. But Gasarch is putting up 172 of his own dollars for the solution to an intriguing little stumper. And the prize problem appears to be somewhat easier than the Riemann hypothesis or the P=NP question. (Unless it’s impossible!)

Gasarch sets forth his prize challenge in a blog post, with further background in a paper and in the slides from a talk. All of those works are well worth reading, but for those who don’t want to chase down the references, here’s the gist. Our mission, should we choose to accept it, is to color the nodes of an n-by-m grid, using only a specified number of colors, and observing a particular constraint: Nowhere in the grid may the four corners of a rectangle all have the same color. (Only rectangles with sides parallel to the x and y axes are considered.) For example, here is a four-colored 15-by-15 grid that satisfies the no-monochromatic-rectangles constraint:

grid15r0a.png

In this array of dots there are \( {15 \choose 2}^2=11025\) distinct rectangles. If you care to check through all of them one by one, you’ll find that in no case do all four corners have the same color. In contrast, here is a 16-by-16 grid that is almost but not quite rectangle-free:

grid16r2a.png

Gasarch offers his $289 bounty for any four-colored 17-by-17 grid with no monochromatic rectangles. Why is that grid of particular interest? It’s a border case. Among square grids, all those up through 16-by-16 have been shown to have rectangle-free four-colorings. For the 18-by-18 grid and all larger squares, rectangle-free four-coloring has been proved impossible. For squares larger than 18-by-18, four-coloring has been proved impossible. The status of the 17-by-17 and 18-by-18 grids remains unsettled, but Gasarch believes that both are four-colorable.

Gasarch has much more to say about the mathematics behind this problem. Here I would like to muse on some computational aspects of searching for a 17-by-17 four-coloring.

To state the obvious first, this is not a problem we can expect to solve by exhaustive enumeration. There are 4289 possible colorings of the grid. Casting out symmetries brings that down only to 2 × 4287. There’s not world enough or time for checking them all.

Testing grids at random is also hopeless in the 17-by-17 case. This nonstrategy actually works quite well for small grids. For example, you can readily find a four-coloring of an 8-by-8 grid just by generating a few hundred thousand random colorings. But the method fails for larger grids because the proportion of all colorings that are rectangle-free falls steeply with grid size. (Consider the 2-by-2 grid: There are 256 four-colorings, and all but four of them are rectangle-free.)

To make any progress toward the 17-by-17 case, we’ll have to do at least a little thinking, rather than expecting the computer to do all the work. Here’s one idea that’s very easy to implement: Find a monochrome rectangle somewhere within the grid, change the color of one of its corners, and repeat until you can’t find any more rectangles. This algorithm works reasonably well for grids up to about 12-by-12, but then it runs out of steam. On larger grids, changing the color of a node to eliminate one rectangle is likely to create another rectangle elsewhere (or several more of them). As a result, the system merely takes a random walk, with trendless fluctuations in the number of rectangles at any given moment. You discover a rectangle-free coloring only if the walk happens to stumble on the zero point.

I found the 15-by-15 four-coloring shown above with an algorithm that’s a little more effective even thought it’s no more sophisticated than the corner-twiddling method. The program repeatedly chooses a node at random and tries assigning it all four possible colors, tallying up the number of rectangles for each color choice. Some color or set of colors must minimize the rectangle count; from among these optimal colors the program chooses one at random and sets the node to that color before repeating the loop. This is a “greedy” method: At each step the number of rectangles can decrease or remain constant but can never increase. Greedy methods are notorious for getting stuck in local optima that are inferior to the global optimum. Maybe that’s what happens to my program when I try it on 16-by-16 and 17-by-17 grids. Or maybe the search space is just too large. In any case, when I woke up this morning and checked the results of an overnight run, I did not find a rectangle-free four-colored 17-by-17 grid awaiting me.

Of course I really didn’t have to do any algorithm analysis at all to know that I wasn’t going to win $289 and eternal fame with a day or two of idle hacking. If the problem were that easy, Gasarch and his students would have solved it for themselves long ago.

In spite of these various failures and frustrations, the grid-coloring problem still looks tantalizingly solvable. If a four-coloring of the Gasarch grid exists, it seems like we should be able to find it by some practical computation.

There are certainly lots of approaches more powerful than the blind dart game I’ve been playing. For example, if local optima are the major impediment, some variation on simulated annealing might help.

A more radical possibility is to try to construct an instance rather than merely search for it. If we assume that the four colors are represented as evenly as possible, then the 17-by-17 grid must have 72 nodes in each of three colors and 73 nodes in the fourth color. Starting from a blank grid, it’s easy to mark off 73 nodes in a single color without creating a forbidden rectangle. Adding 72 nodes of a second color is only a little harder. But then the job gets tricky. When you try to fill in a third color, you also by default choose nodes for the fourth color at the same time, and conflicts pile up in a hurry. Some kind of backtracking approach is probably needed here. Gasarch links to a paper by Elizabeth Kupin of Rutgers that explores these ideas in more detail. (If you want to prove the nonexistence of a four-coloring, this is presumably the way to go.)

Gasarch mentions two other promising avenues: integer programming (the discrete variant of linear programming) and SAT solvers–algorithms for the satisfiability problem. Having spent some time hanging out with a few master SAT solvers, I’m intrigued by the latter possibility. You can almost encode the grid-coloring problem as an instance of NAE-SAT, or not-all-equal SAT. Each node of the grid is represented by a variable that can take on any of four values. We group subsets of variables four at a time into clauses, where each clause includes the variables representing the four corners of a rectangle somewhere in the grid. For the 17-by-17 grid there are \({17 \choose 2}^2=18496\) clauses of this kind. The entire formula is satisfied if we can assign values to the 289 variables in such a way that none of the 18496 clauses has all four of its variables with the same value. After 40 years of work on SAT, there’s a highly developed technology for solving such problems. However, there’s a hitch. SAT problems are formulated in terms of boolean variables, with just two values each, but the grid-color variables have four values. Thus a further layer of encoding is likely to be needed, bringing a further explosion in the size of the problem instance.

One final hackerish note: What’s the best way to detect the presence of a monochromatic rectangle in a grid? My candidate goes like this. We encode the rows of the grid in a set of bit vectors–four vectors for each row, representing the four possible colors. For example, the red vector for a row has a 1 at each position where the corresponding node is red, and zeroes elsewhere. The blue vector has 1s for blue nodes, and so forth. Now we can detect a rectangle merely by taking the logical AND of two rows (an operation that could be a single machine instruction). A rectangle exists if and only if the result of the AND is nonzero. at least two bits are set in the resulting vector.

[Thanks to all the commenters for corrections and elaborations.]