Archive for the ‘mathematics’ Category

When life gives you lemmas, make lemma-ade

Sunday, January 16th, 2011

peg-legged Santa and two aligators in the lobby of the Sheraton, New Orleans

Santa and alligators in the hotel lobby: This must be New Orleans. Laptoppers and iPhoners paying no attention whatever: This must be the Joint Mathematics Meeting.

It was a good meeting. I learned stuff. My take-home impression: The world of mathematics is thriving, and everybody seems to be having fun. There’s adventure to be found if you go seek it.

A week later, though, as I mull over the lectures I heard and try to make sense of my notes, not all my thoughts are so jolly and upbeat. One subject in particular leaves me feeling vaguely uneasy, and I can’t seem to let it go.

The seed of doubt was sown in the annual Current Events session organized by David Eisenbud. In this special series of talks, always one of the highlights of the meeting for me, “the speakers do not report on their own work, but survey some of the most interesting current developments in mathematics, pure and applied…. Excellence in exposition is a prime consideration.”

The final talk of the session was given by David Nadler of Northwestern University; he spoke on “The geometric nature of the Fundamental Lemma.” The subject is an apt one for a Current Events talk: The lemma in question was recently proved by Ngô Bảo Châu, who was just named a Fields medalist for this work.

So what is the Fundamental Lemma? Nadler began his talk by saying he wasn’t going to answer that question. He would not work through the proof. He would not even state the lemma. In an hour’s talk, it can’t be done, he said. The slide below is as close as he came.

Nadler-slide-0646.jpg

(The blurriness of this image is entirely my fault, not Nadler’s. But perhaps the fuzziness is not altogether inappropriate. I begin to think that the lemma is so abstruse it can’t even be photographed clearly.)

My aim here is not to gripe about Nadler’s talk, which was in fact a masterful presentation; I found it engaging and entertaining even when the material soared by miles over my head. (A writeup of the talk is available online (3 MB PDF), although it’s quite different from the oral presentation.)

In any event, no one else has done better with this topic. The Wikipedia page isn’t much help. When Paul Basken wrote an account for the The Chronicle of Higher Education last fall, he started out bravely but punted when it came time to state the lemma:

It is an element of an overall structure known as the Langlands Program, a “grand unifying theory” that shows common links between number theory and harmonic analysis, said Edward Frenkel, a professor of mathematics at the University of California at Berkeley.

Thomas Hales has written “A Statement of the Fundamental Lemma” for the Clay Mathematics Institute, but I’m afraid it leaves me utterly unenlightened.

Of course it doesn’t really matter much whether I grasp these ideas. I’m merely an eavesdropper, and I have no right to demand that the conversations of professional mathematicians be tailored to my needs. But in the Current Events session Nadler wasn’t addressing an audience of amateurs like me. He was speaking to a roomful of his peers and colleagues. When even an audience of mathematicians can’t understand the statement of a problem—much less the solution—then I begin to worry about the sustainability of the enterprise.

In his history of 19th century mathematics Felix Klein wrote:

When I was a student, abelian functions were … considered the uncontested summit of mathematics and each of us was ambitious to make progress in this field. And now? The younger generation hardly knows abelian functions.

How did this happen? In mathematics, as in other sciences, the same process can be observed again and again. First new questions arise, for internal or external reasons, and draw researchers away from the old questions. And the old questions, just because they have been worked on so much, need ever more comprehensive study for their mastery. This is unpleasant, and so one is glad to turn to problems that have been less developed and therefore require less foreknowledge—even if it is only a matter of axiomatics, or set theory, or some such thing.

There is some dispute about whether Klein’s obituary for abelian functions might have been premature. But the general process he describes, in which a field of study sinks under the weight of accumulated scholarship, seems like a real hazard in some areas of mathematics.

It may well be that the Fundamental Lemma is so truly fundamental that students will continue to make the necessary investment of “ever more comprehensive study.” But I do hope someone will figure out how to communicate the meaning of this work to the rest of us, or at least to the rest of the mathematical community.

More on snow and balls

Monday, January 10th, 2011

Here in New Orleans there’s nothing to make a snowball out of—not even a semisnowball—but I did hear several talks on snow and balls during the Joint Math Meetings.

An invited address by Yuval Peres was followed by a session of six talks related in one way or another to the theme of diffusion-limited aggregation. This is a subject dear to me; the graphic in the bit-player heading is a product of a DLA simulation (see my old column on the topic).

The snowiest of the DLA talks was by Janko Gravner, whose work on snowfakes I have discussed before. I won’t say more here except to mention one illuminating offhand remark. In discussing Norman Packard’s cellular automaton for modeling snowflakes, Gravner quoted Steven Levy’s book Artificial Life:

An elementary schoolchild could look at any of the gorgeous pictures of computer screens in Packard’s collection and instantly identify it as a snowflake.

What the schoolchild would actually recognize, Gravner pointed out, is not a snowflake but “Christmas iconography.” And surely it’s true: We’ve all seen so many of those lacy doily patterns with hexagonal symmetry that we believe snowflakes really look like that. Some do, but many others are plates or needles or even wheel-and-axle assemblies.

DLA is one way of modeling snowflake growth: Water-vapor molecules drift in from some distant source, wander randomly, and stick fast when they touch any part of the solid snowflake. The feathery patterns in the bit-player header were created by such an algorithm. But most of the talks in the JMM session concerned an inside-out version of this process: Particles (or “chips”) are released at the center and wander across the structure formed by the deposition of earlier chips, stopping as soon as they find a vacant site. What’s intriguing about this “interior” DLA is that the resulting shapes are not at all feathery or lacy. They are compact “blobs.” And if you let the process run long enough, the blob evolves into a ball, with a boundary approximating a circle.

Jim Propp invented a derandomized version of interior DLA called the rotor-router model. Each site in the lattice is equipped with an arrow, and when a chip reaches the site, it exits in the direction indicated by the arrow; then the arrow rotates by some fixed rule, such as turning 90 degrees clockwise. This scheme has many properties of a random walk—the chips are sent in all directions with equal probability—but it is fully deterministic. (Propp was one of the speakers in the DLA session. He promises to put his slides online, presumably here, but when I checked this morning they weren’t yet up.)

The rotor-router patterns are also round blobs, but in addition they have an intricate internal structure that becomes visible when the sites are colored according to the rotor-arrow direction. Tobias Friedrich, who gave another of the talks, has some spectacular examples on his web site. Using the technology of the Google Maps API, Friedrich’s site allows you to zoom into images of billion-pixel DLA patterns.

The big question is: Why so round? It’s not really surprising that the forms created in this was are generally compact, without the voids or branched structures seen in “external” DLA experiments. But the pure circular shape remains unexplained. In these models all the action takes place on a lattice—usually square, but triangular and hexagonal lattices have been tried as well—and you might expect some remnant of the lattice symmetry to show through in the deposited pattern. There’s no sign of it.

On the other hand, Matthew Cook, in the final talk of the session, revealed that although the blobs are very round indeed (within about a third of a pixel, on average), they are apparently not perfect circles. There are persistent bulges, not quite aligned with the lattice axes, that don’t seem to be smoothed away as the number of chips increases.

This is posted from the New Orleans airport, on my way back to the land of real snowflakes.

Floor e?

Friday, January 7th, 2011

On what floor was this sign placed?

At most conferences, the sign above might provoke a moment of puzzlement. But this is the JMM, the Joint Mathematics Meetings, being held this year in New Orleans. As far as I can tell, the 5,400 mathematicians gathered here are quite untroubled by these instructions. And after all, they know perfectly well that a whole uncountable continuum lies between floors 2 and 3.

Onward and upward, indeed. And downward. And everything in between.

 

 

Snowballs

Thursday, January 6th, 2011

The first snow of the season is always a treat. When the flakes start flying, I want to go out and play. I even want to go out and shovel.

Here in the Boston area our first storm left a particularly magical landscape. Glistening flakes fell steadily all night, with little wind, so that every fencepost was topped by a fluffy white pompom in the morning. The decorations were remarkably regular and symmetrical. In the photo below I have carved a cross-section out of the snow lying atop the back-porch railing, which is 3½ inches wide.

semicircular snow mound from storm of 2010-12-21

The profile looks like a fairly good approximation to a semicircle. (Below is a red semicircle fitted to the image by eye.)

the snow semiball with a semicircle fitted by eye

Later, I began wondering why the snow would assume just this shape. Is it an equilibrium form that would be maintained indefinitely if the snowfall continued? Is there some deep and universal reason for the semicircular form, or is it just an accident, a matter of contingency, something peculiar to this snowfall but nothing to generalize about. Semicircular distributions are not all that common in nature. They turn up in the distribution of eigenvalues of certain random matrices, but that seems a pretty far-fetched explanation for the lumps of snow on my back porch.

I would expect the profile to vary somewhat with the properties of the snow—moisture content, size and form of the flakes, etc.—but I’m not at all sure just how it would vary. In the past 20 years there’s been lots written about the self-organized shape of sand piles, which are essentially conical in profile. Clearly, snow is not sand. It’s fluffier and stickier. A flake doesn’t necessarily have to be supported from below. It could adhere to a surface, perhaps with probability given by some function of the cosine of the slope….

That’s as far as I’d gotten when the next storm hit.

not a semicircle, 2010-12-26-0537.jpg 

Perhaps I should note that my boyish joy at the first snowfall tends to dissipate as the winter wears on. 

Whack-a-Rectangle

Saturday, November 13th, 2010

It’s been almost a year since Bill Gasarch gave us the problem of four-coloring the nodes of a 17 × 17 grid in such a way that no rectangle has all four corners the same color. (See my earlier commentary here and here.)

The problem remains open, the prize of $172 unclaimed.

A few weeks ago I had a momentary fantasy that I might have come upon the answer. I was browsing in a strange New England emporium called Building 19, full of merchandise found nowhere else (thankfully). From across the room I spotted a sofa cushion that looked like it might well be a rectangle-free four-colored grid.

A rectangle-free coloring on a sofa cushion at Building 19?

Sadly, a closer look showed that the cushion is neither rectangle-free nor four-colored. Nor 17 × 17 for that matter. And the price tag reads $399.90, so if Bill wants to receive the solution in the form of tacky furniture, he’s going to have to up his offer.

But let us set aside these disappointments. There’s happier news from elsewhere. Martin Schweitzer has turned the square-free-rectangle problem into a game! He explains that he was exploring the new canvas element of HTML5, and thought the rectangle-free problem would make a nice Javascript programming project. The result is entertaining and may even lead to better intuition about the structure of the problem.

On the other hand, the hardest level (”Ninja”) in Schweitzer’s game is only a 12 × 12 array, so don’t think you’re going to click away at little colored squares and earn yourself $289.

The program requires a canvas-capable browser, such a recent version of Chrome, Firefox, Opera or Safari.

Kenken-friendly numbers

Saturday, October 23rd, 2010

I messed up a kenken the other day, carelessly forgetting that 2, 3, 4 and 6 are not the only divisors of 12. Later, mulling over my mistake, I realized that it’s the presence of the multiplicative identity that gives a kenken puzzle much of its combinatorial variety. If we played kenken without 1’s, the three-cell “cage” below would have only the single solution shown.

Three-cell cage labeled 12x, with unique solution 2 x 3 x 2

But because 1 is included in the set of candidate numbers, the cage could also be filled in with various permutations of 1 × 3 × 4 and 1 × 2 × 6, thereby increasing the entropy of the puzzle.

[For those of you who don't fritter away enough of your time on the puzzles page at the back of the newspaper, here's a kenken résumé. You are given a k × k grid to be filled in with the integers from 1 through k, observing certain constraints. Each integer must appear exactly once in each column and each row (thus making the grid a Latin square). And the numbers within each thickly outlined cage must combine to produce the target number in the upper left corner of the cage, using the operation shown there (+, –, × or ÷). For the noncommutative and nonassociative operations, the requirement is merely that some ordering of the operands yields the target value.]

If having the multiplicative identity among the candidate numbers makes the puzzle more interesting, an obvious idea is to include the additive identity as well. This would be easy to arrange: Just shift from the set {1, 2, …, k} to the set {0, 1, …, k–1}. Then a target value of 3+ would no longer have the unique solution 1 + 2; it could also be 0 + 3. Moreover, a target value of 0× would be a kind of wild card, with very high entropy indeed!

Other sets of candidate numbers might produce even harder (and weirder) puzzles. How about a 5 × 5 kenken with the set {–2, –1, 0, 1, 2}? Or, venturing a little farther afield, we might try a 4 × 4 puzzle with the candidates {1, –1, i, –i}.

a complex kenken, with candidate set {1, -1, i, -i}

(Roll your mouse over the grid above to see a valid solution—but I’m not at all sure the solution is unique or that it can be found by a purely deductive process. At best this is a jokenken.)

Looking in the opposite direction, certain candidate sets should lower the entropy of the system and thus can be expected to make the puzzle easier. In particular, a grid to be filled in only with prime numbers would have unambiguous solutions (except for order) in all multiplicative cages. (If we stick with the convention that target values are always integers, there could be no division cages in such a puzzle.)

Following up on this line of thought, I began to wonder about candidate sets for which all operations would yield unambiguous results. If we choose any two distinct numbers from such a set, and combine them with any of the kenken operations +, –, ×, ÷, we would get a result different from what we would see with any other pair of numbers or any other operation. Do such sets exist? Yes, an example is {2, 8, 10, 11}, for which the four arithmetic operations yield these results:

operation tables for kenken candidate set {2, 8, 10, 11}

Following kenken convention, we take only numbers from the lower triangle of each table, and in the case of division we ignore any noninteger quotient. Thus the full set of results are the numbers circled in red. It’s easy to see there are no duplicated values. Sorted by magnitude, the set is: {1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 13, 16, 18, 19, 20, 21, 22, 80, 88, 110}.

A cute property of such sets is that they allow the construction (and solution!) of kenken puzzles like this one:

six-by-six kenken to be solved with numbers from the set {2,8,22,27,29,30}; mouseover for solution

Note that the target values in the cages do not indicate which operation is to be performed. There’s no need to specify an operator, because each of the target values can be generated in only one way from the set of numbers given. If a set allows such an operatorless puzzle grid, I’m going to call it a kenken-friendly set, or a kkf set.

[According to Wikipedia, operatorless kenkens are nothing new:

More complex KenKen problems are formed ... omitting the symbols +, −, × and ÷, thus leaving them as yet another unknown to be determined.

I've never seen such a kenken in the wild. Are they based on kenken-friendly sets of the kind defined here? I rather doubt it if the puzzles are meant to be harder than normal. (The example above is quite easy.)]

A few things I know about kenken-friendly sets:

  1. No kkf set can include either 0 or 1 among its members.
  2. No three elements of a kkf set can form an arithmetic progression. It follows that if {m, …, n} is a kkf set of k elements, then nm is at least k(k–1)/2. (Let’s call this the minimum span of the set.)
  3. Every kkf set defines a Golomb ruler, a set of integers in which all differences are unique. But of course only some Golomb rulers also satisfy the constraints that sums, products and quotients be unique.
  4. Among sets made up entirely of small numbers, kkf sets are rare, but when you select sets of k integers from the range [1, n], and n is much larger than k, nearly every set has the kkf property.

proportion of random sets that have the kkf property for k = 4 to 8

A few things I don’t know about kenken-friendly sets:

  1. For each value of k, what are the smallest kkf sets?
  2. Before we can address question 1, we have to answer this: What is the best measure of size in a kkf set? In sets of the form {m, …, n}, should we work to minimize m, minimize n, or perhaps minimize the span from m to n? Or maybe the right measure is the sum or product of all the elements of the set? (This question doesn’t come up for Golomb rulers, where it’s customary to set m=0, then look for the set with the smallest n, which is designated the optimal ruler for a given k.)
  3. One way to search for small kkf sets is to take a known Golomb ruler {0, … n}, then slide it along the number line, adding a constant c to each element. This preserves Golomb-ruler-ness, and at each c you can test for kkf-ness. The question is: Does every Golomb ruler become a kkf set for some finite value of c? The answer is yes for the 35 known optimal Golomb rulers, and I have not found a counterexample among various larger rulers. (I’ve tried a sampling of those listed by James B. Shearer.) But I have no clue to a proof, one way or the other.
  4. The special treatment of division in these sets may make sense for a puzzle printed in the newspaper, but mathematically it’s a bit of a wart. There’s no good rationale for accepting quotients when they happen to be integers and ignoring them otherwise. So what’s the proper formulation of the problem? Should we open the door to the whole field of rational numbers? Or, on the contrary, should we exclude division altogether and just work with +, – and ×? Another option is to replace division with the remainder operation. (This last approach looks intriguing. Sets that have all-unique sums, products, differences and remainders seem to be rare, but they do exist. One example is {3, 6, 16, 47}, which yields the result set {0, 1, 2, 3, 4, 5, 9, 10, 13, 15, 18, 19, 22, 31, 41, 44, 48, 50, 53, 63, 96, 141, 282, 752}).
  5. Have all these questions been answered already, perhaps in some other context?

Finally, some data. Here are a few small (by some measure) kkf sets, along with the result sets they generate from the operations {+, –, ×, ÷}:

k=4:

{2, 8, 9, 11} ⇒ {1, 2, 3, 4, 6, 7, 9, 10, 11, 13, 16, 17, 18, 19, 20, 22, 72, 88, 99}
{3, 6, 10, 11} ⇒ {1, 2, 3, 4, 5, 7, 8, 9, 13, 14, 16, 17, 18, 21, 30, 33, 60, 66, 110}
{4, 6, 9, 10} ⇒ {1, 2, 3, 4, 5, 6, 10, 13, 14, 15, 16, 19, 24, 36, 40, 54, 60, 90}
{5, 6, 9, 11} ⇒ {1, 2, 3, 4, 5, 6, 11, 14, 15, 16, 17, 20, 30, 45, 54, 55, 66, 99}

k=5:
{2, 5, 13, 18, 19} ⇒ {1, 3, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 18, 20, 21, 23, 24, 26, 31, 32, 36, 37, 38, 65, 90, 95, 234, 247, 342}
{9, 11, 16, 19, 20} ⇒ {1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 20, 25, 27, 28, 29, 30, 31, 35, 36, 39, 99, 144, 171, 176, 180, 209, 220, 304, 320, 380}}

k=6:
{2, 3, 17, 19, 26, 29} ⇒ {1, 2, 3, 5, 6, 7, 9, 10, 12, 13, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 26, 27, 28, 29, 31, 32, 34, 36, 38, 43, 45, 46, 48, 51, 52, 55, 57, 58, 78, 87, 323, 442, 493, 494, 551, 754}
{7, 11, 13, 16, 23, 24} ⇒ {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 16, 17, 18, 20, 23, 24, 27, 29, 30, 31, 34, 35, 36, 37, 39, 40, 47, 77, 91, 112, 143, 161, 168, 176, 208, 253, 264, 299, 312, 368, 384, 552}
{13, 17, 19, 22, 29, 30} ⇒ {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 16, 17, 30, 32, 35, 36, 39, 41, 42, 43, 46, 47, 48, 49, 51, 52, 59, 221, 247, 286, 323, 374, 377, 390, 418, 493, 510, 551, 570, 638, 660, 870}}

k = 7:
{6, 13, 17, 18, 27, 33, 35} ⇒ {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 27, 29, 30, 31, 33, 35, 39, 40, 41, 44, 45, 46, 48, 50, 51, 52, 53, 60, 62, 68, 78, 102, 108, 162, 198, 210, 221, 234, 306, 351, 429, 455, 459, 486, 561, 594, 595, 630, 891, 945, 1155}

I could carry less

Tuesday, August 31st, 2010

The fabled carefree residents of the Carryless Islands in the remote South Pacific have very few possessions, which is just as well, since their notion of arithmetic is ill-suited to accurate record-keeping. When they add or multiply numbers, they follow similar rules to ours, except that there are no carries into other digit positions. Addition and multiplication of single-digit numbers are performed by a process that we would call “reduction mod 10.” Any carry digits are simply ignored. So 9 + 4 = 3, 5 + 5 = 0, 9 × 4 = 6, 5 × 4 = 0, and so on.

With this fable, David Applegate, Marc LeBrun and N. J. A. Sloane introduce a new scheme of arithmetic in a paper newly posted on the arXiv.

And if you think the mathematics sounds trivial, try explaining the structure of this sequence:

21, 23, 25, 27, 29, 41, 43, 45, 47, 49, 51, 52, 53, 54, 56, 57, 58, 59, 61, 63,

which lists the first 20 “carryless primes.”

The ormat game

Monday, August 16th, 2010

Here’s the deal. I’m going to give you a square grid, with some of the cells colored and others possibly left blank. We’ll call this a template. Perhaps the grid will be one of these 3×3 templates:

colored 3x3 ormat grids

You have a supply of transparent plastic overlays that match the grid in size and shape and that also bear patterns of black dots:

dot patterns for the six 3x3 permutation matrices

Note that each of these patterns has exactly three dots, with one dot in each row and each column. The six overlays shown are the only 3×3 grids that have this property.

Your task is to assemble a subset of the overlays and lay them on the template in such a way that dots cover all the colored squares but none of the blank squares. You are welcome to superimpose multiple dots on any colored square, but overall you want to use as few overlays as possible. To make things interesting, I’ll suggest a wager. I’ll pay you $3 for a correct covering of a 3×3 template, but you have to pay me $1 for each overlay you use. Is this a good bet?

Before going further, I should mention that not every conceivable template can be covered under these rules. To take an obvious example, no 3×3 template with fewer than three colored squares can possibly be covered by any combination of the six overlays. But I promise to submit only templates that can be covered by some combination of the given dot patterns; if I err about this, I forfeit the bet.

How does the game play out? If I give you the template marked “1″ above, you can easily win; just choose permutations a and b, which together cover all the colored squares and no others. You pay $2 and get $3. Template 2, with all nine squares colored, looks like it might be the toughest challenge. Clearly, it cannot be covered with fewer than three overlays, since we need a total of nine dots; and it turns out that exactly three overlays are required. Indeed, there are two ways of covering the template with three overlays: a + d + e and b + c + f. Thus this template is a breakeven proposition: You earn $3 and pay $3.

Now we come to template 3, which has eight colored squares and one blank. Surely if you can cover the full nine squares with just three overlays, then you should also be able to cover eight squares—no? I invite you to try it. In fact the only covering that works requires four overlays: b + d + e + f. Thus you shouldn’t take my bet, since I can always give you a template with just one blank, and you’ll have a net loss of $1.

Some background. I’ll return to the gaming table momentarily, but first let me explain what this is all about and where it came from. A few weeks ago, I was writing about “ranges of rankings,” which led me into the topic of permutation matrices. To recapitulate:

  • A permutation matrix is a square matrix with a single 1 in each column and each row, and all the rest of the elements 0.
  • An ormat is a superposition of permutation matrices, formed by applying the Boolean OR function to corresponding elements of the permutation matrices. For example: matrix-or-sum.png
  • Not all square matrices with (0,1) entries can be formed by OR-ing permutation matrices, but there’s an efficient algorithm for deciding whether or not a given matrix is an ormat. (I thank some helpful commenters for enlightening me on this point.)
  • Given an ormat, the total number of distinct permutation paths that can be threaded through the 1 entries of the matrix is equal to the permanent of the matrix. Calculating the permanent is known to be a hard computational problem.

In a comment, Barry Cipra posed the following query:

The permanent tells us the maximum number of different permutations that can be OR-summed to produce a given ormat, but what is the corresponding minimum number? Also, in how many different ways can the minimum be achieved?

The connection between ormats and my little game is probably apparent by now. The template of colored and blank squares is an ormat; the dotted overlays represent permutation matrices; to maximize your payoff in the game (or to minimize your loss), you need to answer Barry’s first question, finding the minimum number of permutations that can be combined to yield the given ormat.

For 3×3 matrices, we can solve this problem by exhaustive search, calculating the OR-sums of all possible combinations of the six 3×3 permutation matrices taken 1, 2, 3, …, 6 at a time. I did this with pencil and paper on a recent airplane trip. Here is a summary of the results:

number of ormats generated by various combinations of permutation matrices

Some of the numbers on this card are easy to explain. The six ormats with just three 1 entries are the permutation matrices themselves. There are six of them because there are 3! = 6 permutations of three things. There are no ormats with four 1 entries for a reason that bears thinking about: There can be no permutations that differ from one another in just one element. When you superimpose any of the six overlays shown above, you can wind up with three, five or six dots, but never four.

At the other end of the scale, it’s no surprise that there’s exactly one ormat with nine 1 entries, and that it takes three permutations to produce it. And then there are the nine ormats with eight 1 entries, which each require four permutations to be OR-ed. These are the single-blank patterns like template 3 above.

Based on these results, I began speculating about what I would see in a tabulation of all 4×4 ormats.

guesses about stats for 4x4 ormats

There would have to be 4! = 24 patterns with four 1 entries, and just one pattern with all 1s, generated by OR-ing four permutations. And there should be 16 ormats that require five permutations, namely the 16 matrices with a single 0 element. This last prediction seemed a little less self-evident than the others.

Pocket change and Cheerios. My thoughts about the single-zero (or single-blank) case went something like this. To cover 15 squares with sets of four dots each, we need at least four sets, or else we simply won’t have enough dots. So a useful starting point is one of the optimal arrangements that cover all 16 squares without gaps or overlaps. By this time I had grown tired of drawing zillions of dots, and so I started working with sets of coins.

initial configuration of four permutations of coins

In this arrangement each coin denomination forms a permutation, with no two pennies, nickels, dimes or quarters in the same row or the same column. We have successfully covered all the colored squares, but unfortunately we’ve also covered the blank at the lower right. Thus this pattern of coins is not an acceptable solution, but maybe we can fix it up somehow?

adjusted configuration after one coin is moved

Moving the penny from the blank square to another square in the same column solves one problem but creates another: Now the arrangement of pennies is no longer a permutation. There are two pennies in the third row.

coins after second adjustment to restore permutation

So now we have to shift another penny to restore the one-per-row-and-column property. Inevitably, this leaves a colored square uncovered. The only way we can cover that exposed square is to introduce a fifth permutation. Since I had run out of coin denominations, I chose a popular brand of breakfast toroids. Voila:

coins and cheerios -- the five-permutation solution

There’s nothing special about the particular moves I chose in this sequence. If you try some alternatives, you should be able to persuade yourself that moving the penny that covers the blank to any other square in the fourth column (or in the fourth row) would lead to essentially the same situation. Likewise the game would come out the same if the single blank square were placed anywhere else in the grid. And you could also start with a different set of initial permutations (provided they cover all the squares).

This coin-shuffling exercise demonstrates that we can cover any 4×4 template that has a single blank by combining no more than five permutations, but how do we know that five are actually needed? Maybe there’s some totally different arrangement that would do the job with just four permutations? Well, think about what such an arrangement would look like. It would have to differ at exactly one position from some other layout of permutations that covers the full 16-square grid. But no two permutations can differ at one and only one place. Thus the reason there can be no four-permutation cover of 15 squares is essentially the same as the reason no 4×4 ormat pattern can cover just five squares.

This argument generalizes to k×k matrices: For any integer k, there must be at least k ormat patterns that cannot be covered with fewer than k+1 permutations. But then comes the bigger speculative leap: Perhaps k+1 is an upper bound. Perhaps part of the answer to Barry’s question is that no k×k ormat pattern requires more than k+1 permutations. At one point I even had a “proof” of this conjecture. Then I wrote a program to check it, doing much the same thing I did with the dots on the airplane.

Out of bounds. My program found the expected 16 ormat patterns that require five permutations—and it found many more as well. In all it identified 2,032 4×4 ormats that can’t be composed from fewer than five permutations. And then came a bigger surprise: The program also found 480 patterns that require six permutations. So much for my proposed upper bound.

One of those problematic 480 ormats takes this form:

((1 1 1 1) (1 1 1 1) (1 1 1 1) (1 1 0 0))

Looking over this pattern, I thought I understood where my earlier reasoning had gone awry. This matrix is just like the single-zero pattern, but with two zeros! (I do mean for that statement to make sense. Bear with me.) Suppose we start again with a set of four permutations that completely cover the grid, including the two blanks.

starting configuration of 16 coins on 4x4 template with two blanks

Then we can uncover each blank just as we did in the coin-shuffling procedure above, although we have to be careful the two sets of movements don’t interfere with each other. (Not much point in removing the penny from a blank square, then putting the nickel there.) Here is a strategy for clearing both blank squares while maintaining the one-per-column-and-row permutation property:

four coins and two blanks: first solution

Inevitably, when we uncover the two blank squares, we also remove coins from two colored squares, which now have to be filled in again. The key point is that no single permutation can repair that damage, because the two open colored squares are in the same row. To cover both of those squares we need two additional permutations.

Other ways of reshuffling the coins avoid putting the two open squares in the same row or column, but they still foil all attempts to complete the covering with just five permutations. Try adding four Cheerios to the diagram below. If you cover both of the open blue squares, then either you also cover one of the blank squares or you wind up with two Cheerios in the same row.

four coins and two blanks: second solution

So now it’s clear we need as many as six permutations to cover a 4×4 ormat. Does that suggest that the general upper bound might be k+2 rather than k+1? Or perhaps the appropriate formula is 2k–2? In support of this latter possibility I offer these two ormats, which require 8 and 10 permutations respectively:

2k-2ormats.png

Another wager. Having fooled myself several times about the upper bound on minimal ormat coverings, I feel I should build in a little margin for error before I invite you to make a further wager. We already have direct evidence that covering a k×k ormat can take as many as 2k–2 permutations. So I’ll be generous and offer a full $2k for a proper covering, while charging $1 per permutation. If k=3 or k=4, you can definitely make money on this deal. But is it a good bet for larger k? (Hint: I’d be willing to play the game on these terms for real money.)

•     •     •

Update 2010-08-19: No takers for my bet, eh? Too bad; I had already spent my winnings.

Barry Cipra, who raised the question about minimal ormat covers in the first place, sends this illuminating letter:

I’m going to tiptoe a short ways out on a long long limb and conjecture (really just guess) that the “worst case” behavior, in terms of the minimum number of permutations it takes to produce a given ormat, occurs for ormats of the following form, shown here for k=7:

((1 1 1 1 1 1 1) (1 1 1 1 1 1 1) (0 1 1 1 1 1 1) (0 0 1 1 1 1 1) (0 0 0 1 1 1 1) (0 0 0 0 1 1 1) (0 0 0 0 0 1 1))

So as not to abuse existing matrix terminology, I’ll call any (square) matrix of this type—i.e., whose entries below the main subdiagonal are all 0—”uppity triangular.” I can (and will!) show that this uppity triangular ormat for k=7 requires (at least) 16 permutations—and the number appears to grow concavely upwards from that, so I, for one, will definitely not take you up on your $2k wager.

The trick, I realized, is to view each ormat as the “shadow” of what I’ll call an “addmat.” If you let P1, P2, …, Pr be k×k permutation matrices, their addmat is simply the ordinary result of addition: S = P1 + P2 + … + Pr, whose entries are positive integers wherever one or more of the constituent permutations has a 1 and otherwise 0. The associated ormat is obtained by changing each of these entries to a 1, while leaving the 0’s alone. In this sense, the ormat’s 1’s are the “shadows” of the addmat’s positive entries.

What’s crucial is that addmats have a lovely little property not shared with their shadows: the row and columns sums of the entries of an addmat all equal the number of permutations that produce them, r.

Come now, let us reason together…. The uppity triangular ormat example above (for k=7) must come from an addmat of the form

{{

where a, b, and all the *’s are positive integers. In particular, each * is at least 1. Since all row and column sums must be equal, the sum a+b must equal the sum of b and all 6 *’s above it. Hence a is at least 6. Likewise a+b must equal the sum of a and all 6 *’s above it, so b is also at least 6. Hence a+b is at least 12, which means the OR-sum that produced the given ormat involves at least 12 permutations.

This clearly generalizes to arbitrary k, which is more than “direct evidence” that covering a k×k ormat can take as many as 2k–2 permutations, it’s rigorous proof! But we can immediately do better, at least on a case-by-case basis. If we try to get by with just 12 permutations for this uppity triangular ormat, we quickly run into trouble. We obviously must have a = b = 6, and it follows that all the *’s above them are 1’s (to make those column sums 12). That is, we have the addmat

{{

where I now wish to call your attention to the entry labeled “@”. To make its row-sum equal 12, we need @ = 10. But that means its column sum (with the 5 *’s above it) is at least 15, which cannot be! So we are forced to try larger values of a and/or b—which is to say, we need more permutation matrices to produce this addmat.

It turns out you can’t satisfy the row and column sum condition until you get to a = b = 8. I won’t take you through all the steps, but just give you a taste with the penultimate possibility, a = 7, b = 8. The best you can hope for in this case is

{{

Note that I put as much of the “weight” in the last two columns as close to the 7 and 8 as possible, so that I could use the smallest possible value (10) as the entry with 5 positive entries above it. This makes the last three rows, and the right three columns all have the same sum, 15, but now we see a problem in the 12’s column: Its sum is at least 16. So once again, we’re screwed. It’s only with the next attempt that we avoid contradiction:

{{8, 3, 1, 1, 1, 1, 1}, {8, 3, 1, 1, 1, 1, 1}, {0, 10, 2, 1, 1, 1,<br />
  1}, {0, 0, 12, 1, 1, 1, 1}, {0, 0, 0, 12, 2, 1, 1}, {0, 0, 0, 0, 10, 3, 3}, {0, 0, 0, 0, 0, 8, 8}}

This matrix finally has all its row and column sums equal. Please note, this may or may not be an actual addmat of a set of 16 permutation matrices—I suspect it probably is, but I haven’t bothered to check. All we know is that it satisfies a necessary condition of being an addmat, namely that its row and column sums are all equal. (It’d be nice if that were also a sufficient condition, but something tells me it isn’t.)

This example, which can clearly be played out for larger values of k, suggests that not only are you safe with a $2k wager, but with a $(2k+2) wager and higher I’ve played around with this a bit, and persuaded myself that the number of permutation matrices will go to 2k + a lot—for k=10, if I did things correctly, you need 24 permutations (or possibly more, if the uppity triangular matrix the analysis leads to is not an actual addmat). I am entirely convinced that some additional careful thought can streamline the analysis into a nice, slick proof. I’m just not sure I haven’t already made a mistake, and built an elaborate house of cards….

Does any of this jibe with what you’ve already found to be the case?

It does indeed jibe.

First of all, to answer a small question Barry left open, here is a set of 16 permutations that will successfully cover his 7×7 “uppity triangular” matrix:

{1,2,3,4,5,6,7} {2,1,4,3,6,7,5} {1,3,2,5,4,7,6} {1,3,4,2,6,5,7}
{2,3,1,5,6,4,7} {1,2,3,5,6,7,4} {1,2,4,5,3,6,7} {1,2,4,5,6,3,7}
{1,2,4,5,6,7,3} {1,3,4,5,2,6,7} {1,3,4,5,6,2,7} {1,3,4,5,6,7,2}
{2,3,4,1,5,6,7} {2,3,4,5,1,6,7} {2,3,4,5,6,1,7} {2,3,4,5,6,7,1}

This was found with a simple greedy search.

My own attempts to find an upper bound have focused not on uppity triangular matrices but on matrices I’ve been calling “flags,” like this 7×7 case:

{{0, 0, 0, 1, 1, 1, 1}, {0, 0, 0, 1, 1, 1, 1}, {0, 0, 0, 1, 1, 1, 1}, {1, 1, 1, 1, 1, 1, 1}, {1, 1, 1, 1, 1, 1, 1}, {1, 1, 1, 1, 1, 1, 1}, {1, 1, 1, 1, 1, 1, 1}}

This matrix also requires 16 permutations for a proper covering. To see why, try threading permutations through the columns of the matrix, starting at the left edge and in each column choosing a 1 element (never a 0) from a different row. Because of the block of zeros at the upper left, the first three elements of every permutation must lie in rows 4 through 7. Thus each permutation “uses up” three of the last four rows in the first three columns, and the rest of the permutation can revisit this range of rows only once. It follows that each permutation can touch only one element in the 4×4 block of 1s in the lower right corner of the matrix, and at least 16 permutations are needed to cover all the 1s in the matrix. Showing that 16 are sufficient is not hard.

This kind of analysis works for any odd k, and thus we know that such matrices can require as many as

{\biggl\lceil\frac{k}{2}\biggr\rceil}^2

permutations. (For even k the situation is a little less symmetrical, and I haven’t worked out the exact details.)

These results give us a lower bound on the upper bound on the number of permutations that may be needed to cover a k×k ormat. But we haven’t proved it’s the true upper bound. Are there other ormats that require even more permutations? My guess is no, but keep in mind that almost all my conjectures along these lines have turned out to be wrong.

Four questions about fuzzy rankings

Saturday, July 24th, 2010

The National Research Council is getting ready to release a new assessment of graduate-education programs in the U.S. The previous study, published in 1995, gave each Ph.D.-granting department a numerical score between 0 and 5, then listed all the programs in each discipline in rank order. For example, here’s the top-10 list for doctoral programs in mathematics (as presented by H. J. Newton of Texas A&M University):

 rank    school                 score   
    1    Princeton               4.94
    2    Cal Berkeley            4.94
    3    MIT                     4.92
    4    Harvard                 4.90
    5    Chicago                 4.69
    6    Stanford                4.68
    7    Yale                    4.55
    8    NYU                     4.49
    9    Michigan                4.23
   10    Columbia                4.23

Note that the scores of the first two schools are identical (to two decimal places), and the first four scores differ by less than 1 percent. Given the uncertainties in the data, it seems reasonable to suppose that the ranking could have turned out differently. If the whole survey had been repeated, the first few schools might have appeared in a different order. Doctoral candidates in mathematics are presumably sophisticated enough to understand this point. Nevertheless, the spot at top of the list still carries undeniable prestige, even when you know that the distinction could be merely an artifact of statistical noise.

The committee appointed by the NRC to conduct the new graduate-school study wants to avoid this “spurious precision problem.” They’ve adopted some jazzy statistical methods—mainly a technique called resampling—to model the uncertainty in the data, and they’ve also decreed that the results will be presented differently. There will be no sorted master list showing overall ranks in descending order. Instead the programs in each discipline will be listed alphabetically, and each program will be given a range of possible ranks. For example, a program might be estimated to rank between fifth place and ninth place. Let’s call such a range of ranks a rank-interval, and denote it {5, 6, 7, 8, 9} or {5–9}.

For a hypothetical set of 10 institutions, A through J, here’s what a set of rank-intervals might look like.

bar graph showing ranges of rankings for schools A through J.png

Acknowledging the uncertainty in your findings is commendable. But let’s be realistic. If you actually want to make use of these results—for example, if you’re a student choosing a grad-school program—the first thing you’re going to do is sort those bars into some sort of rank order, trying to figure out which school is best and how they all stack up against one another. In other words, you’re going to undo all the elaborate efforts the NRC committee has put into obscuring that information.

Below is one possible ordering of the bars. I have sorted first on the top of the rank-intervals, then, if two columns have the same top rank, I’ve sorted on the bottom rank. Other sorting rules give similar but not identical results. For example, sorting on the midpoints of the intervals would interchange columns B and F.

bar graphs showing rank-ranges sorted into one canonical order.png

Question 1. Does sorting a set of rank-intervals by one of these simple rules yield a consistent and meaningful total ordering of the data? To put it another way, can you trust this attempt to reconstruct a ranking?

I hasten to add that this is not really a practical question about finding the best grad school. If you’re facing such a choice in real life, the NRC rank-intervals are not the only available source of information. But, for the sake of the mathematical puzzle, let’s pretend that all we know about schools A through J is embodied in those ranges of rankings.

It turns out that rank-intervals have some fairly peculiar behavior. Ranges of ratings are not a problem. If the NRC merely gave each school a fuzzy rating on the 0-to-5 scale, no one would have much trouble interpreting the results. But when you turn fuzzy ratings into fuzzy rankings, there are hidden constraints. For example, not all sets of rank-intervals are well-formed.

two impossible sets of rank-ranges

The set at left is impossible because there’s no one in last place. (We can’t all be above average.) The example at right is also nonsensical because D has no ranking at all. For a set of rank-intervals to be valid, there has to be at least one entry in each row and each column.

That’s a necessary condition, but not a sufficient one, as the two graphs below illustrate.

two more impossible rank-intervals

Do you see the problem with the example at left? Column B has a rank-interval of {1–2}, but in fact B can never rank first because A has no alternative to being first. The case at right is conceptually similar but a little subtler: If B is ranked third, then either first place or second place will have to remain vacant.

The underlying issue here is the presence of constraints or linkages within a set of rankings. Suppose you have calculated ratings and rankings of several schools, and then some new information turns up about one school. You can change the rating of that school without any need to adjust other ratings, but not so the ranking. If a school goes from third place to fourth place, the old fourth-place school has to move to some other rung of the ladder, and somebody has to fill the vacancy in third place. These interdependencies are obvious in a non-fuzzy ranking, but they also exist in the fuzzy case. You can’t just assign arbitrary rank-intervals to the items in a set and assume they’ll all fit together. This observation leads to a second question:

Question 2. What are the admissible sets of rank-intervals? How do we characterize them?

I have a partial answer to this question. It goes like this. Any ranking of k things must be a permutation of the integers from 1 through k. A permutation can be embodied in a permutation matrix—a square k × k matrix in which every row has a single 1, every column has a single 1, and all the other entries are 0. For example, here are the six possible 3 × 3 permutation matrices:

3x3-permutation-matrices.png

They correspond to the rankings (1, 2, 3), (1, 3, 2), (2, 1, 3), (3, 1, 2), (2, 3, 1) and (3, 2, 1).

Since a permutation matrix represents a specific (non-fuzzy) ranking, we can build up a set of rank-intervals by taking the OR-sum of two or more permutation matrices. What do I mean by an OR-sum? It’s just the element-by-element sum of the matrices using the boolean OR operator, ∨, instead of ordinary addition. OR has the following addition table:

                      0 ∨ 0 = 0
                      0 ∨ 1 = 1
                      1 ∨ 0 = 1
                      1 ∨ 1 = 1

For the first two 3 × 3 matrices shown above the arithmetic sum is:

matrix-addition.png

whereas the OR-sum looks like this:

matrix-or-sum.png

Every valid set of rank-intervals must correspond to an OR-sum of permutation matrices, simply because a set of rank-intervals is in fact a collection of permutations. The converse also holds: Any OR-sum of permutation matrices yields an admissible set of rank-intervals. Thus the OR-sums of permutation matrices—let’s call them ormats for brevity—are in one-to-one correspondence with the admissible sets of rank-intervals. (There’s just one catch when applying this idea to the NRC study. The columns of an ormat may well have “gaps,” as in the column pattern (0 1 1 0 0 1 1), which corresponds to the rank-interval {2–3, 6–7}. Will the NRC allow such discontinuous ranges in their grad-school assessments? Perhaps the issue will never come up in practice. In any case, I’m ignoring it here.)

Arithmetic sums of permutation matrices form an open-ended, infinite series; in contrast, there are only finitely many distinguishable OR-sums. The reason is easy to see: Ormats have k2 entries, each of which can take on only two possible values, and so there can’t be more than \(2^{k^{2}}\) distinct matrices. Because of the various constraints on the arrangement of the entries, the actual number of ormats is smaller. For example, at k = 3 the \(2^{k^{2}}\) upper bound allows for 512 ormats, but there are only 49:

the-49-3-by-3-or-sums.png

Thus we come to the next question.

Question 3. For each k ≥ 1, how many distinct ormats can we build by OR-ing subsets of k × k permutation matrices? Is there a closed-form expression for this number?

I have answers only for puny values of k.

   k       upper bound        # of ormats  
   1                 1                  1
   2                16                  3
   3               512                 49
   4            65,536              7,443
   5        33,554,432          6,092,721
   6    68,719,476,736                  ?

The tallies of ormats were calculated by direct enumeration, which is not a promising approach for larger k. (I note—to spare folks the bother of looking—that the sequence 1, 3, 49, 7443, 6092721 does not yet appear in the OEIS.)

To extend this series, we might try to exploit the internal structure and symmetries of the ormats. By sorting the columns and rows of the matrices, we can reduce the 49 3×3 ormats to just six equivalence classes, with the following exemplars:

exemplars of six ormat equivalence classes

Enumerating just these reduced sets of matrices should make it possible to reach larger values of k, but I have not pursued this idea. (Furthermore, the two-dimensional sorting of matrices looks to be a curiously challenging task in itself.)

By the way, I think the number of ormats will approach the \(2^{k^{2}}\) upper bound asymptotically as k increases. Many of the features that disqualify a matrix from ormathood—such as all-zero rows or columns—become rarer when k is large. I have tested this conjecture by generating random (0,1) matrices and then counting how many of them turn out to be ormats.

fraction-of-ormats.png

For k = 1 through 5 the results are in close agreement with the actual counts of ormats, and up to k = 10 the trend is clearly upward. But continuing this inquiry to larger values of k will depend on a positive answer to the next question.

Question 4. Given a square matrix with (0,1) entries, is there an efficient algorithm for deciding whether or not it is an OR-sum of permutation matrices, and thus an admissible set of rank-intervals?

The question asks for a recognition predicate—a procedure that will return true if a matrix is an ormat and otherwise false. If efficiency doesn’t matter, there’s no question such an algorithm exists. At worst, we can generate all the k × k ormats and see if a given matrix is among them. But that’s like saying we can factor integers by producing a complete multiplication table. It just won’t do in practice. Isn’t there a quick and easy shortcut, some distinctive property of ormats that will let us recognize them at a glance?

If we could replace the OR-sum with the ordinary arithmetic sum, the answer would be yes. Permutation matrices have the handy property that all rows and columns sum to 1. An arithmetic sum of r permutation matrices has rows and columns that all sum to r. (It is a semi-magic square.) The converse is also true (though harder to prove): If a matrix of nonnegative integers has rows and columns that all sum to r, it is a sum of r permutation matrices. This fact yields a simple test: Sum the rows and the columns and check for equality.

Unfortunately, the trick won’t work for ormats, because the boolean OR operation throws away even more information than summing does. Because 0 ∨ 1 = 1 ∨ 0 = 1 ∨ 1, infinitely many sets of operands map into the same result, and there’s no obvious way to recover the operands or even to determine how many permutation matrices entered into the OR-sum.

Maybe there’s some other clever trick for recognizing ormats, but I haven’t found it. Let me make the question more concrete. Below are three (0,1) square matrices. Two of them are ormats but the third is not. Can you tell the difference?

three-puzzle-matrices.png

If it’s so hard to recognize an ormat, how did I count the ormats among a bunch of randomly generated (o,1) matrices? By hard work: I reconstructed the set of permutations allowed by each matrix. Visualize a permutation as a path threading its way through the matrix from left to right, connecting only non-zero elements and touching each column and each row just once. When you have drawn all possible permutation paths, check to see if every non-zero element is included in at least one path; if so, then the matrix is an ormat. Note that this is not an efficient recognition procedure. In the worst case (namely, an all-ones matrix), there are k! permutations, so this method has exponential running time. But k! is better than \(2^{k^2}\); and, besides, for sparse matrices the number of permutations is much smaller than k!. The 10 × 10 matrix presented as an example at the start of this post gives rise to 580 permutations, a manageable number. Here’s what they look like, plotted as a spider web of red paths across the bar chart.

ranges-with-paths

Every nonzero site is visited by at least one permutation path, so this set of rank-intervals is indeed valid.

This process of lacing permutations through a matrix finally brings me back to Question 1, about how to make sense of the NRC’s fuzzy ranking scheme. Let’s take a small example:

probability-example-1.png

Examining the graph above shows that A must rank either first or second—but which is more likely? In the absence of more-detailed information, it seems reasonable to assume the two cases are equally likely; we assign them each a probability of 1/2. Similarly, B has the rank-interval {1–3}, and so we might suppose that each of these three cases has probability 1/3. Continuing in the same way, we assign probabilities to every element of the matrix.

probability-example-2.png

But wait! This can’t be right; our probabilities have sprung a leak. Any proper set of probabilities has to sum to 1. Our procedure assures that each column obeys this rule, but there is no such guarantee for the rows. In row 1, we’re missing one-sixth of our probability, and in row 2 we have an excess of 1/2; row 4 comes up short by 1/3.

Is there any self-consistent assignment of probabilities for the elements of this matrix? Sure. As a matter of fact, there are infinitely many such assignments, including this one:

probability-example-3.png

I’ll return in a moment to the question of how I plucked those particular numbers out of the air, but note first what they imply about the ranking of items A through D. For item A, with the rank-interval {1–2}, the odds are two-to-one that it ranks first rather than second. B has the behavior we expected from the outset, with probability uniformly distributed over the three cases. But if you pick either C or D, each with the rank-interval {2–4}, your chance of getting second place is only 1/6, and half the time you’ll be in last place.

Where do these numbers come from? Instead of starting with the assumption that probability is uniformly distributed over each rank-interval, assume that each possible permutation of the ranks is equiprobable. For this matrix there are six allowed permutations: (1, 2, 3, 4), (1, 2, 4, 3), (1, 3, 2, 4), (1, 3, 4, 2), (2, 1, 3, 4) and (2, 1, 4, 3). Observe that four of the six ordering put A first, and only two permutations place A second. We can also tally up such “occupation numbers” for all the other matrix elements:

probability-example-4.png

Dividing these numbers by the total number of permutations, 6, yields the probabilities given above.

We can do the same computation for the 10 × 10 example matrix, which turns out to allow 580 permutations:

ranges-with-path-weights.png

If you care to check, you’ll find that each column and each row sums to 580; dividing all the entries by this number yields a probability matrix with columns and rows that sum to 1 (also known as a doubly stochastic matrix).

This process of tabulating permutation paths recovers some of the information we would have gotten from the arithmetic sum of the permutation matrices—information that was lost in the OR-ing operation. But we get back only some of the information because we have to assume that each permutation included in the OR-sum appears only once. (This is just another way of saying that the allowed permutations are equiprobable.) There’s no particularly good reason to make this assumption, but at least it leads to a feasible probability matrix.

Is there any way of calculating the entries in the doubly stochastic matrix without explicitly tracing out all the permutation paths? I’m sure there is. I think the construction of the matrix can be approached as an integer-programming problem, and perhaps through other kinds of optimization technology. What seems less likely is that there’s some simple and efficient shortcut algorithm. But I could be wrong about that; there’s a lot of mathematics connected with this subject that I don’t understand well enough to write about (e.g., the Birkhoff polytope). I hope others will fill in the gaps.

Getting back to the assessment of grad schools—have we finally found the right way to understand those rank-intervals that the NRC promises to publish any day now? My sense is that a semi-magic square (or, equivalently, a doubly stochastic matrix) will give a less-misleading impression than a simple eyeball sorting on the spans or midpoints of the rank-intervals. But what a lot of bother to get to that point! How many prospective grad students are going to repeat this analysis?

Acknowledgment: Thanks to Geoff Davis of PhDs.org for introducing me to this story. PhDs.org will have the new ratings as soon as the NRC releases them, and may even find a way to make them intelligible! Disclaimer: I’ve done paid work for the PhDs.org web site (but this is not a paid endorsement).

Update 2010-07-27: If you’ve gotten this far, please read the comments as well. A number of commenters have provided important insights and context, which have helped me understand what’s going on in the matrices I’ve been calling ormats. But I’m still a bit murky about the best way to recognize and count them. I’m not sure that publishing my still-murky thoughts is terribly helpful, but maybe someone else will read what follows and give us a dazzling, gemlike synthesis.

For the ormat-recognition problem (Question 4 above), three basic approaches have been mentioned: enumerating the permutation paths through the matrix, examining matrix minors, and looking for perfect matchings in a bipartite graph defined by the matrix. It seems to me that all of these methods are doing the same thing.

Start with Barry Cipra’s method of minors. The basic operation is to choose a nonzero matrix element, then delete the row and the column in which that element occurs. You then apply the same operation to the remaining, smaller matrix.

In tracing permutation paths, we’re looking for sequences of nonzero elements, drawing one element from each column and each row. A way of organizing this search is to choose a nonzero element and then, after recording its location, delete the corresponding column and row, so that no other elements can be chosen from that column or row.

In the method based on Hall’s theorem, as explained by John R., we view the ormat as the adjacency matrix of a bipartite graph, where every nonzero element designates an edge connecting a row vertex to a column vertex. To find a matching, we delete an edge, along with the two vertices it connects (and also all the other edges incident on those vertices). Then we recurse on the smaller remaining graph. (See further update below.) If you translate this operation on the graph back into the language of matrices, deleting an edge and its endpoints amounts to deleting a row and a column of the adjacency matrix.

I am not asserting that these three algorithms are all identical, but they all rely on the same underlying operation. To say more, we would need to consider the control structure of the algorithms—how the basic operations are organized, how the recursion works, all the details of the bookkeeping. I don’t trust myself to make those comparisons without trying to implement the three methods, which I have not yet done. However, at this point I just don’t see how any method can guarantee correct results without something resembling backtracking (or else exhaustive search through an exponential space). After all, we’re not looking for just one matching in the graph, or one decomposition into matrix minors, or one permutation path; we have to examine them all.

Here’s a further hand-wavy argument for the essential difficulty of the task. For a (0,1) matrix, the number of permutation paths that avoid all zero entries is equal to the permanent of the matrix. Computing the permanent of such a matrix is known to be #P-complete.

Update 2010-07-31: With lots of help from my friends, I think I finally get it. Although there could be as many as k! permutation paths in a k × k matrix, you don’t need to examine all of the paths to decide whether or not the matrix is an ormat. It’s enough to establish that one such path passes through each nonzero element. This is what the algorithm based on Hall’s theorem does. As Frans points out in a comment below, I misunderstood the essential nature of that algorithm (in spite of having it explained to me several times). There is no recursive deconstruction into progressively smaller matrix minors; instead, we just loop over all the nonzero elements of the matrix, find the minor associated with each such element, then check for a perfect matching in the minor. (Still more refinements are possible—but already we have a polynomial algorithm.)

With this efficient recognizer predicate, it’s easy to measure the proportion of ormats in random matrices at larger values of k:

fraction-of-ormats-k25.png

As expected, the fraction of ormats approaches 1 beyond about k = 20.

So much for identifying ormats. I am still unable to extend the series of exact counts beyond k = 5. The tabulations for random (0,1) matrices suggest that for k = 6 there should be about 20 billion ormats, and counting that high is just too painful. I need to work out the symmetries of the problem.

As far as I can tell, assigning exact probabilities to the nonzero matrix elements requires a full enumeration of all the permutation paths, and thus a calculation equivalent to the permanent. There may be a useful approximation.

Barry Cipra asks a really good question: The permanent tells us the maximum number of permutations that could possibly be included in a given ormat, but what is the minimum number? A naive upper bound is the number of 1s in the matrix, but I don’t see an easy path to an exact count. But enough for now.

A new Handbook

Monday, May 17th, 2010

The Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables (better known as Abramowitz and Stegun) is a much-storied book. Not that it’s a book full of stories; truth is, there’s not much of a narrative thread running through those formulas, graphs and mathematical tables. But it’s a book with a story behind it.

The story began in the late 1930s, when the Mathematical Tables Project was launched in a factory building on the West Side of Manhattan. Supported by the Works Progress Administration, the Tables Project had dual aims: first, preparing high-quality tables of trigonometric functions, logarithms, and the like; and, second, providing work for unemployed New Yorkers. The 450 human computers hired for the project were chosen more on the basis of need than skill, and the work was done on a kind of numerical assembly line. According to David Alan Grier:

Each group was taught to perform a single arithmetic operation. One group knew how to add positive numbers, a second to subtract, and the third to multiply single digits. The last and most sophisticated group did long division.

I have to admit there’s a certain nightmare aspect to this scene. Working in a numbers factory sounds no more appealing than stamping sheet metal all day, although there was less danger of getting a finger crushed in the machinery.

A few years later, the Tables Project was swept up in war work; then, afterward, many of the key personnel moved to the National Bureau of Standards (now NIST, the National Institute of Standards and Technology). There they conceived the Handbook. Apparently the initial plan was a greatest-hits album of tables, but by the early 1950s the future of table-making was looking pretty dim. And so the project changed course and put more emphasis on the mathematical functions that lay behind the tables—familiar functions such logarithms, more specialized ones such as Bessel functions and some more recondite topics such as Mathieu functions and orthogonal polynomials. There were still tables listing numeric values of functions, but the Handbook also presented the mathematics you would need to evaluate (or approximate) the function for yourself.

The editors in charge of the Handbook were Milton Abramowitz and Irene Stegun, both veterans of the New York table office. They recruited about 30 young mathematicians to write chapters. In 1958 Abramowitz died suddenly; Stegun saw the project through to publication in 1964.

The Handbook seems an unlikely best-seller, but the U.S. government has distributed more than 150,000 copies, and editions from other publishers are estimated to bring the total copies in print to something near a million. (As a product of government work, the Handbook is not covered by copyright; there are scanned versions on the web.)

Announced last week is a new Handbook, officially retitled the NIST Handbook of Mathematical Functions. The ink-and-paper version, which I have not yet seen, is published by Cambridge University Press. Perhaps even more interesting is the web edition, called the NIST Digital Library of Mathematical Functions (DLMF), which I have just begun to explore. It is recognizably the same book as Abramowitz and Stegun, with the same terse style of presentation. But much has changed. The hundreds of pages of tables are finally gone; this is not the place to look up the sine of 23 degrees. But there are handsome color graphics now, and a new emphasis on methods of computation, including pointers to recommended software. And the selection of topics has expanded somewhat. For example, there are new chapters on the Painlevé equations and on functions whose argument is a matrix. Elsewhere, the Lambert W function (a personal favorite of mine) is a newcomer to the chapter on elementary functions.

Apart from the content, the DLMF is interesting as an experiment in presenting mathematics on the web. It’s the most ambitious project I’ve seen based on MathML, and it seems to work well, at least when viewed in recent versions of Firefox. (In other browsers I’ve tried, MathML gets garbled, but the equations can be displayed as images and are still quite readable in that format—even with an ancient version of Internet Explorer.) Here’s part of a page as seen in Firefox:

DLMF-zeta-450.jpg

Mousing over the “i” icon at the right margin provides access to encodings of the equations in MathML or TeX and as PNG images, as well as definitions, cross-references and such. Very slick.

The editor in chief of the new Handbook is Frank W. J. Olver of the University of Maryland College Park and NIST, whom I have mentioned before both here at bit-player and in American Scientist. As a young mathematician, half a century ago, Olver wrote one of the Handbook chapters on Bessel functions. (As an even younger mathematician, in the 1940s, he worked with Alan Turing at the National Physical Laboratory in Britain.) There are three more principal editors: Daniel W. Lozier, Ronald F. Boisvert and Charles W. Clark, all of NIST, as well as a long roster of associate editors and domain experts. It’s too soon to say whether some combination of these names will eventually replace the moniker “Abramowitz and Stegun.”

Notes: The quotation above from David Alan Grier appears in “The Math Tables Project of the Work Projects Administration: The Reluctant Start of the Computing Era,” IEEE Annals of the History of Computing, Vol. 20, No. 3, 1998. Grier has also written a profile of Stegun in “Irene Stegun, the Handbook of Mathematical Functions, and the Lingering Influence of the New Deal,” American Mathematical Monthly, August-September 2006. Boisvert and Lozier have written a brief account of the history of the Handbook. (Oddly, the images in this PDF file are negatives.)