bit-player | An amateur's outlook on computation and mathematics

William Gasarch is not the Clay Mathematics Institute. He isn’t paying a million bucks for proofs of famous conjectures. But Gasarch is putting up 17² of his own dollars for the solution to an intriguing little stumper. And the prize problem appears to be somewhat easier than the Riemann hypothesis or the P=NP question. (Unless it’s impossible!)

Gasarch sets forth his prize challenge in a blog post, with further background in a paper and in the slides from a talk. All of those works are well worth reading, but for those who don’t want to chase down the references, here’s the gist. Our mission, should we choose to accept it, is to color the nodes of an n-by-m grid, using only a specified number of colors, and observing a particular constraint: Nowhere in the grid may the four corners of a rectangle all have the same color. (Only rectangles with sides parallel to the x and y axes are considered.) For example, here is a four-colored 15-by-15 grid that satisfies the no-monochromatic-rectangles constraint:

In this array of dots there are $ {15 \choose 2}^2=11025$ distinct rectangles. If you care to check through all of them one by one, you’ll find that in no case do all four corners have the same color. In contrast, here is a 16-by-16 grid that is almost but not quite rectangle-free:

Gasarch offers his $289 bounty for any four-colored 17-by-17 grid with no monochromatic rectangles. Why is that grid of particular interest? It’s a border case. Among square grids, all those up through 16-by-16 have been shown to have rectangle-free four-colorings. ~~For the 18-by-18 grid and all larger squares, rectangle-free four-coloring has been proved impossible.~~ For squares larger than 18-by-18, four-coloring has been proved impossible. The status of the 17-by-17 and 18-by-18 grids remains unsettled, but Gasarch believes that both are four-colorable.

Gasarch has much more to say about the mathematics behind this problem. Here I would like to muse on some computational aspects of searching for a 17-by-17 four-coloring.

To state the obvious first, this is not a problem we can expect to solve by exhaustive enumeration. There are 4²⁸⁹ possible colorings of the grid. Casting out symmetries brings that down only to 2 × 4²⁸⁷. There’s not world enough or time for checking them all.

Testing grids at random is also hopeless in the 17-by-17 case. This nonstrategy actually works quite well for small grids. For example, you can readily find a four-coloring of an 8-by-8 grid just by generating a few hundred thousand random colorings. But the method fails for larger grids because the proportion of all colorings that are rectangle-free falls steeply with grid size. (Consider the 2-by-2 grid: There are 256 four-colorings, and all but four of them are rectangle-free.)

To make any progress toward the 17-by-17 case, we’ll have to do at least a little thinking, rather than expecting the computer to do all the work. Here’s one idea that’s very easy to implement: Find a monochrome rectangle somewhere within the grid, change the color of one of its corners, and repeat until you can’t find any more rectangles. This algorithm works reasonably well for grids up to about 12-by-12, but then it runs out of steam. On larger grids, changing the color of a node to eliminate one rectangle is likely to create another rectangle elsewhere (or several more of them). As a result, the system merely takes a random walk, with trendless fluctuations in the number of rectangles at any given moment. You discover a rectangle-free coloring only if the walk happens to stumble on the zero point.

I found the 15-by-15 four-coloring shown above with an algorithm that’s a little more effective even thought it’s no more sophisticated than the corner-twiddling method. The program repeatedly chooses a node at random and tries assigning it all four possible colors, tallying up the number of rectangles for each color choice. Some color or set of colors must minimize the rectangle count; from among these optimal colors the program chooses one at random and sets the node to that color before repeating the loop. This is a “greedy” method: At each step the number of rectangles can decrease or remain constant but can never increase. Greedy methods are notorious for getting stuck in local optima that are inferior to the global optimum. Maybe that’s what happens to my program when I try it on 16-by-16 and 17-by-17 grids. Or maybe the search space is just too large. In any case, when I woke up this morning and checked the results of an overnight run, I did not find a rectangle-free four-colored 17-by-17 grid awaiting me.

Of course I really didn’t have to do any algorithm analysis at all to know that I wasn’t going to win $289 and eternal fame with a day or two of idle hacking. If the problem were that easy, Gasarch and his students would have solved it for themselves long ago.

In spite of these various failures and frustrations, the grid-coloring problem still looks tantalizingly solvable. If a four-coloring of the Gasarch grid exists, it seems like we should be able to find it by some practical computation.

There are certainly lots of approaches more powerful than the blind dart game I’ve been playing. For example, if local optima are the major impediment, some variation on simulated annealing might help.

A more radical possibility is to try to construct an instance rather than merely search for it. If we assume that the four colors are represented as evenly as possible, then the 17-by-17 grid must have 72 nodes in each of three colors and 73 nodes in the fourth color. Starting from a blank grid, it’s easy to mark off 73 nodes in a single color without creating a forbidden rectangle. Adding 72 nodes of a second color is only a little harder. But then the job gets tricky. When you try to fill in a third color, you also by default choose nodes for the fourth color at the same time, and conflicts pile up in a hurry. Some kind of backtracking approach is probably needed here. Gasarch links to a paper by Elizabeth Kupin of Rutgers that explores these ideas in more detail. (If you want to prove the nonexistence of a four-coloring, this is presumably the way to go.)

Gasarch mentions two other promising avenues: integer programming (the discrete variant of linear programming) and SAT solvers–algorithms for the satisfiability problem. Having spent some time hanging out with a few master SAT solvers, I’m intrigued by the latter possibility. You can almost encode the grid-coloring problem as an instance of NAE-SAT, or not-all-equal SAT. Each node of the grid is represented by a variable that can take on any of four values. We group subsets of variables four at a time into clauses, where each clause includes the variables representing the four corners of a rectangle somewhere in the grid. For the 17-by-17 grid there are ${17 \choose 2}^2=18496$ clauses of this kind. The entire formula is satisfied if we can assign values to the 289 variables in such a way that none of the 18496 clauses has all four of its variables with the same value. After 40 years of work on SAT, there’s a highly developed technology for solving such problems. However, there’s a hitch. SAT problems are formulated in terms of boolean variables, with just two values each, but the grid-color variables have four values. Thus a further layer of encoding is likely to be needed, bringing a further explosion in the size of the problem instance.

One final hackerish note: What’s the best way to detect the presence of a monochromatic rectangle in a grid? My candidate goes like this. We encode the rows of the grid in a set of bit vectors–four vectors for each row, representing the four possible colors. For example, the red vector for a row has a 1 at each position where the corresponding node is red, and zeroes elsewhere. The blue vector has 1s for blue nodes, and so forth. Now we can detect a rectangle merely by taking the logical AND of two rows (an operation that could be a single machine instruction). A rectangle exists if and only if ~~the result of the AND is nonzero.~~ at least two bits are set in the resulting vector.

[Thanks to all the commenters for corrections and elaborations.]

Update: If you are coming late to this topic, please note there are three later posts discussing it, the last of which gives a solution: 22 December 2009, 13 November 2010, and 8 February 2012.

The 17×17 challenge