My dekasabbatical

14 October 2009

The new issue of American Scientist is on the newsstands and on the web. My “Computing Science” column takes a stab at explaining the Hubbard model, a staple of condensed-matter physics that I’ve been struggling to understand for at least a decade. Have I finally figured it out? You can see for yourself.

The issue also includes an announcement that this new column will be my last for a year. I’m taking a sabbatical—or is it a dekasabbatical? (Both etymology and academic habit imply that a sabbatical is a break from the routine that’s supposed to come along every seven years or so. I’ve been writing the column for seventeen years.)

Friends ask what I’m going to be doing for the next twelve months. Well, one of the great advantages of my line of work is that you get to learn something completely different with every issue of the magazine—flitting from pseudorandom numbers to genetic codes to ichnofossils to interval arithmetic to ferromagnets to programming languages to the childhood of C. F. Gauss, and on and on, like some maniacal butterfly visiting every pretty flower in the field. One of the great disadvantages of my line of work is that you get to learn something completely different with every issue of the magazine—and as soon as you start to make a little progress, you have to set it aside and start all over on a whole nother subject. Thus I’m looking forward to being a little more single-minded for a while. Just one flower. Now if only I knew which flower.

Argiope aurantia

7 October 2009

Argiope-aurantia-2499.jpg

It’s orb-weaving season in my part of the world. Out in the ivy, I have four webs of the golden orb weaver, Argiope aurantia, all within one square meter.

The engineering talents of all the orb weavers are impressive, but what attracts the eye to these particular webs is that bizarre zig-zag decoration, known as a stabilimentum. What’s it for? Does it attract prey? Or mates? Does it camouflage the spider? Does it make the spider look larger than it is, to discourage predators? Does it make the web more conspicuous, to ward off inadvertent damage from passing birds or mammals? Maybe it’s just a skein of spare silk? Or a sunscreen.

Someday we may know the answer, but the spider never will.

Congruent numbers

6 October 2009

A press release from the American Institute of Mathematics two weeks ago announced that all the congruent numbers up to 1 trillion have been enumerated. Two questions leap to mind. What the heck is a congruent number? And who cares?

I’ll return to those questions. But first I’d like to pause just a moment to marvel at the idea of calculating anything up to 10¹². A few decades ago, such a project would have been unthinkable. Today, counting to a trillion takes only an hour or so, even on plain vanilla hardware. This is truly one of the wonders of the age, and we shouldn’t grow too blasé about it. But the computation of all those congruent numbers involved a lot more than looping through “+1” a trillion times, and it took considerably longer than an hour.

Okay, so what’s a congruent number? I would like to sidle up to that question rather than face it straightaway. The definition is not hard to understand—it’s all about right triangles with rational side lengths and integer areas—but when I started looking into this story, it took me a while to appreciate why those particular triangles might be worth thinking about.

Let’s begin with Pythagorean triples—sets of positive integers (a, b, c) that define a right triangle, with a and b giving the lengths of the legs and c the hypotenuse. The familiar (3, 4, 5) triangle is everybody’s favorite example. For which right triangles with integer side lengths is the area also an integer? That’s easy: All of them. The area of a right triangle is ab/2; in any Pythagorean triple either a or b (or both) must be an even number, which implies that ab/2 is a whole number. For the (3, 4, 5) triangle the area is (3 × 4)/2 = 6.

Knowing one such triangle, we can make more. Lots more. Just multiply a, b and c by any integer k, which has the effect of multiplying the area by k². If k = 2, we get the (6, 8, 10) triangle with area 24; k = 3 yields the (9, 12, 15) triangle with area 54, and so on. The resulting sequence of triangles is infinite but not very interesting; all the larger triangles are just scaled-up versions of the original, like photographic enlargements. We can view the entire infinite series as a single class of triangles, and take the smallest member of the series as the prototype. That smallest triangle is given by a primitive Pythagorean triple—one where a, b and c have no factors in common. It turns out there are also infinitely many primitive triples. Euclid bequeathed us an algorithm that will generate all of them, if you let it run forever.

Given this infinite set of infinite series of triangles, it’s clear that infinitely many integers can be the area of a Pythagorean triangle. But that certainly doesn’t mean that every positive integer has this property. For example, there’s no Pythagorean triangle with an area of 5. To convince yourself of this fact, just measure the area of every integer-sided right triangle with legs no longer than 10. None of those triangles has an area of 5, and no triangle with longer integer legs can have an area that small. In principle, the same laborious but reliable procedure could be applied to any integer N to determine whether or not it is the area of some Pythagorean triangle. Here are the first few integers that appear in such an enumeration (sequence A009112):

6, 24, 30, 54, 60, 84, 96, 120, 150, 180, 210, 216, 240, 270, 294, 330
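The enumeration described above can be sketched in a few lines of Python. This is only an illustrative brute-force search, not the method behind the OEIS entry; the function name `pythagorean_areas` and its `limit` parameter are made up for this example.

```python
from math import isqrt

def pythagorean_areas(limit):
    """Return the sorted distinct areas <= limit of right triangles with integer sides."""
    areas = set()
    a = 1
    # With legs a <= b and area a*b/2 <= limit, the shorter leg satisfies a*a <= 2*limit.
    while a * a <= 2 * limit:
        for b in range(a, 2 * limit // a + 1):
            c_squared = a * a + b * b
            c = isqrt(c_squared)
            if c * c == c_squared:        # hypotenuse is an integer
                areas.add(a * b // 2)     # a*b is always even for such a triangle
        a += 1
    return sorted(areas)

print(pythagorean_areas(330))
```

Because every candidate pair of legs is tried, a number that fails to appear (such as 5) really is not the area of any integer-sided right triangle.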

Rational triangles. Suppose we relax the constraints a little, allowing the sides of a right triangle to be rational numbers—either fractions or integers but not irrational values such as the square root of 2. The area is still required to be an integer.

A first question is whether right triangles with fractional sides and integer areas even exist. Maybe the cupboard is bare; maybe there are no such triangles. But no, they do exist. Here’s a proof by example:

[Figure: 35-12triangle.png, a right triangle with legs 35/12 and 24/5 and hypotenuse 337/60]

There’s no trickery here. If you’re in any doubt, do the arithmetic; you’ll find that the numbers define a genuine right triangle (the Pythagorean theorem holds) and the area really is exactly 7.

At last we have arrived in the realm of congruent numbers. A congruent number is an integer that’s the area of a right triangle with rational sides. All integer-sided right triangles are included in the category, along with those whose sides are rational fractions. Here’s a more formal definition:

A positive integer N is congruent if there exist rational numbers a, b and c such that a² + b² = c² and ab/2 = N.

All the same questions we were asking about Pythagorean triples also come up in this new context. Where do we find rational triangles with integer area, or how can we manufacture them? Which integers N can be the area of such a triangle? Answers aren’t quite as easy to come by in this case. In particular, the strategy of enumerating triangles from the smallest up won’t work, because there is no smallest rational triangle. There are other ways of imposing an ordering on the rationals, but none of them lead to a good algorithm for enumerating congruent numbers.

So where did I get the example illustrated above? I didn’t just stumble upon it by trying random rational triangles. Note that multiplying each rational side length by the common denominator of the fractions will return us to the land of integer-sided triangles. For the example shown, multiplying through by 60 yields a (175, 288, 337) triangle; these numbers have no factor in common, so they form a primitive Pythagorean triple. The area of the new triangle is 7 × 60 × 60, or 25,200. Thus we’ve recovered an integer triangle from a rational one; what’s more important, the process can be reversed to derive rational triangles from integer-sided ones.
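For anyone who would rather let the machine do the arithmetic, here is a small check using Python's exact `Fraction` type. The side lengths 35/12, 24/5 and 337/60 are simply the (175, 288, 337) triple divided by 60; the variable names are arbitrary.

```python
from fractions import Fraction

# Sides of the rational triangle: the (175, 288, 337) triple scaled down by 60.
a, b, c = Fraction(35, 12), Fraction(24, 5), Fraction(337, 60)

is_right = a * a + b * b == c * c                    # Pythagorean theorem, exactly
area = a * b / 2                                     # area of a right triangle
scaled = tuple(int(side * 60) for side in (a, b, c)) # back to integer sides

print(is_right, area, scaled)
```

Exact rational arithmetic matters here: a floating-point version could only confirm the identity approximately.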

I mentioned a Euclidean method for generating primitive Pythagorean triples. It goes like this: Take any two positive integers m and n that satisfy the following conditions:

  • m > n;
  • one of the numbers is odd and the other even;
  • the numbers are coprime (no common factors other than 1).

Then (2mn, m² – n², m² + n²) is a primitive Pythagorean triple. By counting through all suitable values of (m, n) starting with (2, 1), all primitive triples are generated.
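The three conditions and the formula translate almost directly into code. The following is a minimal sketch, assuming we stop at some largest m (the parameter `max_m` is invented for this example, since a real program cannot run forever).

```python
from math import gcd

def primitive_triples(max_m):
    """Yield (2mn, m^2 - n^2, m^2 + n^2) for every admissible pair with 2 <= m <= max_m."""
    for m in range(2, max_m + 1):
        for n in range(1, m):                        # guarantees m > n
            opposite_parity = (m - n) % 2 == 1       # one odd, one even
            if opposite_parity and gcd(m, n) == 1:   # coprime
                yield (2 * m * n, m * m - n * n, m * m + n * n)

for triple in primitive_triples(5):
    print(triple)
```

The pairs are visited in order of increasing m and then increasing n, which is one natural way of "counting through" them; other orderings would produce the same triples in a different sequence.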

A postprocessing step built atop Euclid’s procedure will generate congruent numbers. The plan is to produce a primitive Pythagorean triple and calculate the area N = ab/2 of the corresponding triangle. If the area is “square-free”—that is, it has no pairs of repeated prime factors and thus cannot be divided evenly by a perfect square—then we’re done; that value of N is a congruent number and cannot be reduced to a smaller congruent number. But if N is not square-free, we can divide out each square factor, leaving a smaller triangle with rational sides and integer area, and thereby generating another congruent number.

A worked example: The (m, n) pair (5, 4) produces the primitive triple (9, 40, 41), with area N = (9 × 40)/2 = 180. Thus 180 is identified as a congruent number, but it is not square-free; it factors as 2² × 3² × 5. We can therefore divide the area by 4 and the sides by 2, producing a shrunken (9/2, 20, 41/2) triangle of area N = 45. In this way we learn that 45 is also a congruent number, but again it is not square-free. Dividing the area by 9 and the sides by 3 yields the (3/2, 20/3, 41/6) triangle with area N = 5. And so 5, too, is congruent; it’s also square-free and therefore cannot be reduced further. There is no smaller triangle similar to the (9, 40, 41) triangle that has integer area.
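The square-stripping step can be sketched as follows. This is my own illustration of the procedure rather than the program used to produce the lists in this post; `reduction_chain` is a hypothetical name, and the function assumes it is handed an integer-sided right triangle.

```python
from fractions import Fraction

def reduction_chain(a, b, c):
    """Yield (area, sides) for the triangle and for each smaller similar triangle
    obtained by dividing a square factor p*p out of the area (and p out of the sides)."""
    sides = (Fraction(a), Fraction(b), Fraction(c))
    area = a * b // 2
    yield area, sides
    p = 2
    while p * p <= area:
        if area % (p * p) == 0:
            area //= p * p
            sides = tuple(s / p for s in sides)
            yield area, sides
        else:
            p += 1

for area, sides in reduction_chain(9, 40, 41):
    print(area, [str(s) for s in sides])
```

Run on (9, 40, 41), the chain visits the areas 180, 45 and 5, matching the worked example; the last triangle has sides 3/2, 20/3 and 41/6.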

This scheme can produce an unlimited supply of congruent numbers. Unfortunately, it’s not so well suited to answering the question of whether a particular integer is congruent. The problem is that the congruent values are not generated in order from smallest to largest. If we turn the crank for a while and discover that a certain integer N appears in the algorithm’s output, then we know for sure that N is congruent. But if N has not shown up, we can’t conclude that it is not among the congruent numbers; we might simply have to wait longer for N’s turn to come.

When I turn the crank on my own little program for generating congruent numbers, these are the first 101 values to pop out, in order of appearance:

6, 30, 60, 15, 84, 21, 210, 180, 45, 5, 330, 630, 70, 924, 231, 546, 504, 126, 14, 1320, 1560, 390, 840, 1386, 154, 2340, 585, 65, 1224, 306, 34, 990, 110, 2730, 3570, 1710, 190, 2574, 286, 4620, 1155, 5610, 5016, 1254, 2310, 1716, 429, 7140, 1785, 7980, 1995, 3036, 759, 4290, 7956, 1989, 221, 10374, 10920, 8970, 3900, 975, 39, 7854, 11970, 1330, 14490, 1610, 11550, 462, 4914, 6630, 12540, 3135, 19320, 4830, 6090, 4080, 1020, 255, 11856, 2964, 741, 18480, 23184, 5796, 1449, 161, 25200, 6300, 1575, 175, 7

Here’s the same list sorted into ascending order of magnitude:

5, 6, 7, 14, 15, 21, 30, 34, 39, 45, 60, 65, 70, 84, 110, 126, 154, 161, 175, 180, 190, 210, 221, 231, 255, 286, 306, 330, 390, 429, 462, 504, 546, 585, 630, 741, 759, 840, 924, 975, 990, 1020, 1155, 1224, 1254, 1320, 1330, 1386, 1449, 1560, 1575, 1610, 1710, 1716, 1785, 1989, 1995, 2310, 2340, 2574, 2730, 2964, 3036, 3135, 3570, 3900, 4080, 4290, 4620, 4830, 4914, 5016, 5610, 5796, 6090, 6300, 6630, 7140, 7854, 7956, 7980, 8970, 10374, 10920, 11550, 11856, 11970, 12540, 14490, 18480, 19320, 23184, 25200

The problem, again, is that we can’t conclude anything about the numbers that don’t appear in this collection. For example, 13 is absent, even though it is in fact a congruent number; it just doesn’t turn up until we get well down the list—it comes from the 49,485th triangle examined, which has sides (780/323, 323/30, 106921/9690). On the other hand, 8 is unlisted because it is not congruent; it will never show up in the output hopper no matter how long we keep cranking away.

Here are the first 101 true congruent numbers (sequence A003273):

5, 6, 7, 13, 14, 15, 20, 21, 22, 23, 24, 28, 29, 30, 31, 34, 37, 38, 39, 41, 45, 46, 47, 52, 53, 54, 55, 56, 60, 61, 62, 63, 65, 69, 70, 71, 77, 78, 79, 80, 84, 85, 86, 87, 88, 92, 93, 94, 95, 96, 101, 102, 103, 109, 110, 111, 112, 116, 117, 118, 119, 120, 124, 125, 126, 127, 133, 134, 135, 136, 137, 138, 141, 142, 143, 145, 148, 149, 150, 151, 152, 154, 156, 157, 158, 159, 161, 164, 165, 166, 167, 173, 174, 175, 180, 181, 182, 183, 184, 188, 189

Ancient history. The search for congruent numbers does not stretch back all the way to Pythagoras or Euclid, although Diophantus apparently considered a couple of special cases. L. E. Dickson’s big history of number theory attributes the first full statement of the problem to “an anonymous Arab manuscript, written before 972.” Other sources cite the works of Abu Bakr al-Karaji, a mathematician who worked in Baghdad at the end of the 10th century. A couple of centuries later Fibonacci, who straddled the Arabic and European worlds, had more to say about the problem in his Liber Quadratorum. Later still, Pierre de Fermat proved (in a marginal note, for which he had sufficient room!) that 1 is not a congruent number. (The proof extends to 4 and 9 and 16 and all other square numbers, none of which are congruent.)

These early writers formulated the problem in a somewhat different way than the rational-triangle scheme explained above. Given an integer N, they asked whether it’s possible to find three perfect squares in arithmetic progression with the interval between the squares equal to N. In other words, they sought a number s² such that s² – N, s², and s² + N are all perfect squares. For example, if N = 24, then the solution is s = 5; the three perfect squares are 25 – 24 = 1, 25, and 25 + 24 = 49.

The two versions of the problem—the integer-area right triangles and the squares in arithmetic progression—are equivalent, although that’s not exactly obvious. Here’s the connection: If a² + b² = c² and ab/2 = N, then setting s = c/2 guarantees that s² – N, s², and s² + N are all perfect squares. (For the algebraic exercise proving this, see the book by Neal Koblitz cited below.)
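One direction of that exercise is short enough to sketch here (Koblitz gives the complete argument, including the converse). Substituting s = c/2 and N = ab/2, and using a² + b² = c²:

\[\left(\frac{c}{2}\right)^2 \pm N \;=\; \frac{a^2 + b^2}{4} \pm \frac{ab}{2} \;=\; \frac{a^2 \pm 2ab + b^2}{4} \;=\; \left(\frac{a \pm b}{2}\right)^2\]

So the three squares are ((a – b)/2)², (c/2)² and ((a + b)/2)², all squares of rational numbers. For the (3, 4, 5) triangle with N = 6, this gives s = 5/2 and the progression 1/4, 25/4, 49/4.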

By the way, the squares-in-arithmetic-progression version is where we get the term “congruent numbers.” The three numbers s² – N, s², and s² + N are all congruent modulo N. For example, 1, 25 and 49 are all congruent to 1 modulo 24. If you ask me, “congruent numbers” is a poor excuse for a name, and is particularly confusing in the context of triangles, where “congruent” has another meaning altogether. But after a thousand years I suppose it’s too late to fix it. “Karaji numbers,” anyone?

By 1915, all the integers up to 100 had been classified as either congruent or noncongruent. But as recently as 1980, when Ronald Alter wrote a brief review of the status of the problem, there were still 189 square-free numbers less than 1,000 for which the question of congruence remained unsettled. Then everything changed in 1983. In that year the nature of the congruent-number problem was transformed by Jerrold B. Tunnell of Rutgers, who not only found a better way to calculate congruent numbers but also showed why it’s worth the effort to do so.

From right triangles to elliptic curves. Before I go on, a warning: I am about to walk up to the blackboard with a piece of chalk in my hand and impersonate one of those masterful, self-confident lecturers who rattle off long trains of equations, invent lemmas on the fly, and always come to QED just as the blackboard fills up and the class ends. In my case any such air of mastery is a complete illusion; I’m just learning this stuff as I go along. But I’ll do my best to make it an entertaining illusion.

So here goes. It’s not hard to see that any product of perfect squares is also a perfect square: a²b²c² = (abc)². Therefore, if N is a congruent number, and s² – N, s², and s² + N are all squares, we can set the product of these three factors equal to another square; call it y². Thus we get the equation:

\[y^2 = (s^2 - N)\,s^2\,(s^2 + N).\]

Multiplying the three terms on the right, this becomes:

\[y^2 = (s^2)^3 - N^2 s^2.\]

Now perform a simple substitution of variables, setting s² = x:

\[y^2 = x^3 - N^2 x.\]

This is the equation of an elliptic curve—another mathematical object with a really unfortunate name. An elliptic curve looks nothing like an ellipse. Here’s the graph of a particular elliptic curve, the one for N = 6:

elliptic-curve-6.jpg

The two pieces of the blue line mark the locus of all points (x, y) that satisfy the equation

\[y^2 = x^3 - 36x.\]

And the hot pink dot? That marks a rational point on the curve—a point where x and y both take on rational values. (Insiders, I’m told, call it a ratpoint.) Specifically, the dot identifies the point x = 25/4, y = 35/8. If you care to plug those numbers into the equation, you’ll find that indeed 1225/64 = 15625/64 – 225, and so the point does lie on the curve.

Tunnell, elaborating on earlier work of Kurt Heegner, showed that N is congruent if the elliptic curve y² = x³ – N²x passes through a certain kind of point on the (x, y) plane. Specifically, we have to search for points whose x and y coordinates are both rational numbers, but we ignore the three points with y = 0. Then the x coordinate of the point has to satisfy three more conditions. Letting x = u/v, we require that:

  • u and v are perfect squares,
  • v is even,
  • u has no factors in common with N.

If we can find just one point on the curve that matches all these properties, then we can generate an infinity of other rational points. Moreover, the existence of such a point implies, according to Tunnell’s theorem, that N is congruent. (The hot pink point above clearly qualifies.) Thus the congruent-number problem appears to be solved: We have an unambiguous criterion for deciding if a given N is congruent or not. Just construct the corresponding elliptic curve and check the ratpoints.
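To see that the hot pink point qualifies, the conditions can be checked mechanically. The sketch below is only a checker for a point someone hands you, not a search procedure; the helper names `is_square` and `check_point` are invented for this illustration.

```python
from fractions import Fraction
from math import gcd, isqrt

def is_square(k):
    """True if the integer k is a perfect square."""
    return k >= 0 and isqrt(k) ** 2 == k

def check_point(N, x, y):
    """Check that (x, y) lies on y^2 = x^3 - N^2 x with y != 0, and that
    x = u/v (in lowest terms) meets the three extra conditions in the text."""
    on_curve = y * y == x ** 3 - N * N * x
    u, v = x.numerator, x.denominator      # Fraction keeps u/v in lowest terms
    return (on_curve and y != 0
            and is_square(u) and is_square(v)   # u and v are perfect squares
            and v % 2 == 0                      # v is even
            and gcd(u, N) == 1)                 # u shares no factor with N

print(check_point(6, Fraction(25, 4), Fraction(35, 8)))
print(check_point(6, Fraction(-3), Fraction(9)))
```

The second call uses the point (–3, 9), which is a perfectly good rational point on the N = 6 curve but fails the extra conditions, since –3 is not a perfect square.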

Regrettably, we’ve been set up for yet another disappointment. Searching for rational points on elliptic curves is a task for which we have no efficient general method. We’re really no better off than trying to generate all Pythagorean triples.

But wait! We’re not done yet. From the properties of elliptic curves, Tunnell derived yet another criterion—and this one turns out to be the key to a simple and practical test. It all hinges on the number of integer solutions to some quadratic equations that on first glance appear to be arbitrary strings of symbols plucked out of thin air. The criterion is this: If N is a square-free congruent number, and if N is odd, then the number of integer solutions to the equation N = 2x² + y² + 8z² must be exactly double the number of integer solutions to N = 2x² + y² + 32z². (If both equations have no integer solutions, the condition is satisfied, since 2 × 0 = 0.) For even N, the two equations are slightly different, but the test works in exactly the same way.

tunnell-criterion-1000.png

The graphic above shows the 361 square-free numbers up to 1,000 that pass the Tunnell test. The height of each dot indicates the number of integer solutions to the equation N = 2x2 + y2 + 8z2 (or the corresponding equation for even N). The two values of N with 160 solutions are 689 and 761. (It’s curious that in most cases—308 out of 361—the number of solutions is actually zero. I don’t know what this means or whether that pattern continues with larger N.) [See Update 2009-10-10, below.]

What sets the Tunnell criterion apart from all the others is that we can actually carry out the test in a reasonable and predictable amount of time. For any given N, counting the solutions should be doable in time proportional to N^(3/2), simply by trying all integer values of x, y and z less than the square root of N.
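Here is a deliberately naive sketch of that brute-force count. The odd-N equations are the ones quoted above; the even-N pair (N = 8x² + 2y² + 16z² and N = 8x² + 2y² + 64z²) is not given in this post and is taken from the usual statement of Tunnell's theorem, so treat it as an assumption of the example. The function names are invented.

```python
from math import isqrt

def count_solutions(n, a, b, c):
    """Count integer triples (x, y, z) with a*x^2 + b*y^2 + c*z^2 == n."""
    r = isqrt(n)
    span = range(-r, r + 1)
    return sum(1 for x in span for y in span for z in span
               if a * x * x + b * y * y + c * z * z == n)

def is_square_free(n):
    """True if no square larger than 1 divides n."""
    return all(n % (p * p) for p in range(2, isqrt(n) + 1))

def passes_tunnell(n):
    """Apply the solution-counting test to a square-free n."""
    if n % 2 == 1:
        first, second = count_solutions(n, 2, 1, 8), count_solutions(n, 2, 1, 32)
    else:
        # Even case: equations assumed from the standard form of the theorem.
        first, second = count_solutions(n, 8, 2, 16), count_solutions(n, 8, 2, 64)
    return first == 2 * second

print([n for n in range(1, 51) if is_square_free(n) and passes_tunnell(n)])
```

The triple loop over values between –√N and +√N is exactly the N^(3/2) procedure described above, with negative values included because a solution such as (1, 1, 0) and its sign-flipped variants count separately. It is fine for small N and hopeless for a trillion.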

So that’s it, right? Problem solved at last? Well, no, there’s still a small hitch—a bit of awkward fine print. Tunnell proved that if N is square-free and congruent, then the criterion on the number of integer solutions must be satisfied. This much is unconditionally true. But what about the converse? If the criterion is satisfied, can we be certain that N is a congruent number? In this direction, the proof is not unconditional. It’s contingent on a proposition known as the Birch–Swinnerton-Dyer conjecture, which is widely believed, and supported by an abundance of empirical evidence, but still unproved. Which leaves just enough doubt to make the game still interesting.

Why should anyone care about this stuff? On first acquaintance, the search for congruent numbers sounds like one of those mathematical pastimes that appeal to amateurs (like me!) but don’t really command the attention of the research community. There are so many kinds of cutely named numbers out there—perfect, amicable, lucky, happy—and not all of them carry deep significance. The congruent numbers might well be just another amusing sideshow. But it turns out they’re not. There’s serious mathematics here, enough to engage the interest of serious mathematicians.

Elliptic curves have been a focus of intense scrutiny for decades. Henri Poincaré studied them in the early years of the 20th century. In the 1920s Louis Mordell proved a theorem about the rational points on elliptic curves: Even when there are infinitely many points, they all come from a finite set of “generators”; the number of generators and hence the number of infinite families is called the rank of the curve. In the 1950s Yutaka Taniyama and Goro Shimura, with later refinements by André Weil, worked out a conjecture about elliptic curves and another class of mathematical objects, called modular forms. Andrew Wiles and Richard Taylor proved part of that conjecture in the course of settling Fermat’s Last Theorem in the 1990s; the rest of the Taniyama-Shimura conjecture has since been proved as well.

And now we have the Birch–Swinnerton-Dyer conjecture, one of the famous million-dollar math problems. I’m not going to try to explain the conjecture in any detail—I’ve used up all my chalk, and probably my readers’ patience, too—but I think the basic idea goes something like this. On the one hand we have an elliptic curve, drawn in the (x, y) plane. On the other hand we have something called an L-function, which is defined on the plane of complex numbers. Think of the L-function as an undulating landscape with mountains rising above the plane and submarine canyons deep under it. The topography of this surface is determined in part by points selected from the elliptic curve, so there’s a connection between the two objects. The conjecture formulated in the 1960s by Bryan Birch and Peter Swinnerton-Dyer says that the shape of the L-function in the neighborhood where it passes through zero gives us information about the rank of the elliptic curve and thus about the number of rational points.

There’s an analogy between the BSD conjecture and an even more celebrated problem on the million-dollar list, the Riemann hypothesis. The zeros of the Riemann zeta function (which is much like an L-function) carry information about the distribution of prime numbers. The Birch–Swinnerton-Dyer conjecture invites us to suppose that the zeros of L-functions of certain elliptic curves tell us something about the distribution of congruent numbers.

Incidentally, the BSD conjecture must be one of the earliest products of computer-driven experimental mathematics. The numerical explorations that led to the conjecture were done with the EDSAC, the pioneering electronic computer built at the University of Cambridge.

The trillion triangles. Hand-waving about elliptic curves and L-functions might provide a vague rationale for interest in congruent numbers, but one part of the story I still didn’t get was why anyone would bother to compute mass quantities of congruent numbers. So I asked Michael Rubinstein of the University of Waterloo. Rubinstein had earlier computed the congruent numbers up to 10⁹, and it was his challenge that provoked the recent thousandfold expansion of the inventory. Rubinstein explained:

I’m interested in the statistical distribution of congruent numbers. A few years ago, Brian Conrey, Jon Keating, myself, and Nina Snaith came up with a prediction for the asymptotic number of congruent numbers up to x, akin to the prime number theorem. This prediction grew out of remarkable models created by the same researchers and David Farmer for related elliptic curve L-functions that were inspired by random matrix theory and based on a number of unproved assumptions.

The prediction is that the number of congruent numbers less than x, arising from even-rank elliptic curves, is asymptotically

\[c x^{3/4} \log(x)^{11/8}\]

where c is a constant. Our model doesn’t let us completely nail down the constant c, and the data will help us understand what the correct constant is. The asymptotic behaviour stabilizes at a logarithmic rate, so going to 10¹² is only 50 percent better than going to 10⁸.

It will also provide a good amount of data for studying the statistics of ‘higher rank elliptic curves’ in the family of elliptic curves related to congruent numbers. The correct model to use for higher rank elliptic curves is not really understood, and the billions of congruent numbers found will end up yielding several million higher rank curves with which we hope to gain insight into the statistics of higher rank curves (compared to just thousands of higher rank curves from earlier computations).

The enumeration of the trillion triangles was done by two teams. William Hart of the University of Warwick and Gonzalo Tornaria of the Universidad de la Republica in Uruguay ran their program on a computer called Selmer at Warwick. Mark Watkins of the University of Sydney, David Harvey of the Courant Institute and Robert Bradshaw of the University of Washington used the SAGE computer at Washington.

The strategy of the computation is based on the Tunnell criterion, but it turns out there’s a better way to go about it than explicitly counting the number of integer solutions to those various quadratic equations in (x, y, z). Tunnell showed that all the information about the number of solutions could be encoded in a formal power series, which looks like this:

\[A(q) = a_{1}q^{1} + a_{2}q^{2} + a_{3}q^{3} + a_{4}q^{4} + ... \]

It’s a “formal” series in the sense that we don’t actually care about evaluating the sum for any particular value of q; instead, we just want to know the various coefficients a_i. This is how the Tunnell criterion gets encoded in the series: If a_i is zero in this series, and if i is a square-free number, then i is congruent. (Actually, there are two separate series, one for odd i and one for even i.)

The odd and even power series are generated by a couple of formidable-looking products. Here’s the one for the odd series:

\[A(q)=q\prod_{n=1}^\infty(1-q^{8n})(1-q^{16n})\left(1+\sum_{n=1}^\infty 2q^{2n^2}\right)\]

Since both the overall product and the sum in the third factor call for infinitely many values of n, we can’t multiply out this whole expression. Fortunately, though, it turns out that larger n contribute only to higher coefficients, and so we can compute early terms in the series without worrying about what might happen later. Kent Morrison notes that taking just the factors

\[q(1-q^8)(1-q^{16})(1-q^{16})(1+2q^2+2q^8+2q^{18})\]

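is enough to identify all the odd congruent numbers up to 25. Those five factors fix every coefficient below q²⁵; with the next factor, (1 – q²⁴), also included, which in this range changes only the q²⁵ coefficient, the series begins: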

\[A(q)=q^1+2q^3+q^9-2q^{11}-4q^{17}-2q^{19}-3q^{25}+...\text{ higher terms}\]

The terms missing from this series—those with a zero coefficient—are just the odd square-free congruent numbers in this range: 5, 7, 13, 15, 21, 23.
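The truncation argument is easy to try out. The sketch below multiplies out the odd-series product, discarding every term above a chosen degree; it uses plain schoolbook polynomial multiplication on Python lists, nothing like the out-of-core FFT machinery of the real computation, and the name `tunnell_odd_series` is invented.

```python
def tunnell_odd_series(max_degree):
    """Coefficients a[0..max_degree] of
    q * prod_n (1 - q^(8n)) (1 - q^(16n)) * (1 + 2 * sum_n q^(2 n^2)),
    with all terms above max_degree thrown away."""
    D = max_degree

    def multiply(p, r):
        out = [0] * (D + 1)
        for i, p_i in enumerate(p):
            if p_i:
                for j, r_j in enumerate(r):
                    if i + j > D:
                        break
                    out[i + j] += p_i * r_j
        return out

    def one_minus_q_to(k):
        poly = [0] * (D + 1)
        poly[0] = 1
        if k <= D:
            poly[k] = -1
        return poly

    series = [0] * (D + 1)
    series[1] = 1                              # the leading factor q (needs D >= 1)
    n = 1
    while 8 * n <= D:                          # larger n cannot touch degrees <= D
        series = multiply(series, one_minus_q_to(8 * n))
        series = multiply(series, one_minus_q_to(16 * n))
        n += 1

    theta = [0] * (D + 1)                      # 1 + 2q^2 + 2q^8 + 2q^18 + ...
    theta[0] = 1
    n = 1
    while 2 * n * n <= D:
        theta[2 * n * n] = 2
        n += 1
    return multiply(series, theta)

a = tunnell_odd_series(25)
print([i for i in range(1, 26, 2) if a[i] == 0])
```

The printed list contains the odd exponents up to 25 whose coefficient vanishes. Only odd exponents matter here; the even ones belong to the other series.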

The big computation by Bradshaw et al. used essentially this same scheme, with a few refinements and optimizations. The challenge is that when you’re trying to compute all the terms of the series out to the coefficient of q^1,000,000,000,000, you wind up multiplying some enormous numbers. I don’t mean just that the numbers are too big to fit into the registers of a 32-bit or a 64-bit computer. They’re too big to fit into the main memory of a computer with 128 gigabytes of RAM. Moreover, the arithmetic has to be done exactly; approximations are useless. Thus a major part of the effort in setting up the computation was devising efficient “out of core” multiplication routines for ridiculously large numbers. The basic algorithm was a fast-Fourier-transform method.

And the result? Up to 10¹², the computation identified 3,148,379,694 square-free congruent-number candidates with N in the residue classes 1, 2, and 3 modulo 8. I look forward to reading more about the analysis of their distribution. [See Update 2009-10-10, below.]

References and resources. Two weeks of reading and tinkering have not made me the master of this subject. For those who would like to explore further I can offer some resources I’ve found helpful, listed in no particular order.

  • Ed Eikenberg has posted some lecture notes from a talk on elliptic curves and congruent numbers given in 2000.
  • William Stein offers slides from a Harvard lecture in 2001.
  • The web page for another William Stein course (this one at the University of Washington in 2006) provides notes and handouts and a collection of function definitions that will let you play with congruent numbers and elliptic curves in the Sage computer mathematics system.
  • Koblitz, Neal. 1993. Introduction to Elliptic Curves and Modular Forms. Second edition. New York: Springer-Verlag. Some kind soul has posted page images for the first chapter online.
  • Guy, Richard K. 2004. Unsolved Problems in Number Theory. Third edition. New York: Springer. (Section D27, p. 306, discusses congruent numbers.)
  • Tunnell, J. B. 1983. A classical diophantine problem and modular forms of weight 3/2. Inventiones Mathematicae 72:323–334. (Available online via the DigiZeitschriften.)
  • Barry Cipra writes on the recent trillion-triangle computation: Tallying the class of congruent numbers, ScienceNOW, September 23, 2009. (Get it now, while it’s free.)
  • Dickson, Leonard Eugene. 1952. History of the Theory of Numbers. New York: Chelsea Pub. Co. (For congruent numbers see Vol. 2, Chapter 16, p. 459.)
  • Brown, Ezra. 2000. Three Fermat trails to elliptic curves. The College Mathematics Journal 31:162–172. Free PDF available online (unlike most CMJ articles), apparently because it won a Polya prize.
  • Rubin, Karl, and Alice Silverberg. 2002. Ranks of elliptic curves. Bulletin of the American Mathematical Society 39:455–474. Online.
  • Silverberg, Alice. 2001. Open questions in arithmetic algebraic geometry. In Arithmetic Algebraic Geometry, Institute for Advanced Study/Park City Mathematics Series 9, pp. 83–142. Providence: American Mathematical Society. Postscript preprint.
  • The American Institute of Mathematics web page on congruent numbers, which includes several helpful appendices, such as a discussion of FFT multiplication, an essay on congruent numbers by Kent Morrison and a link to the paper by Bradshaw, Hart, Harvey, Tornaria and Watkins reporting the trillion-triangle computation. (At this writing, the posted version of the paper is still an incomplete draft.)
  • Conrey, J. B., J. P. Keating, M. O. Rubinstein and N. C. Snaith. 2000. On the frequency of vanishing of quadratic twists of modular L-functions. arXiv preprint. (This paper and the one listed below discuss conjectures about the distribution of congruent numbers, among other topics.)
  • Conrey, J. Brian, Jon P. Keating, Michael O. Rubinstein and Nina C. Snaith. 2004. Random matrix theory and the Fourier coefficients of half-integral weight forms. arXiv preprint.
  • Brown, Jim. 2007. Congruent numbers and elliptic curves. PDF (Brown refers to this text as lecture notes for an undergraduate course, but in fact it is a very polished and well-organized expository article. Includes some Sage code for working with rational points of elliptic curves.)
  • Alter, Ronald. 1980. The congruent number problem. American Mathematical Monthly 87:43–45. (Tables of what was known and unknown shortly before Tunnell’s breakthrough.)

Special thanks to David Farmer for alerting me to this story in the first place and to Michael Rubinstein for help understanding what it all means.

Update 2009-10-10: In the comments, Barry Cipra points out that the quadratic equations in the Tunnell criterion necessarily have no solution whenever N is congruent to 5, 6, or 7 modulo 8. In fact all of the zero-solution points in the graph I presented above come from such N. Numbers in the residue classes 0 and 4 modulo 8 cannot be square-free. If we plot only those N that are square-free and congruent to 1, 2 or 3 modulo 8, we are left with just 53 square-free congruent-number candidates up to 1,000:

tunnell-criterion-123.png

Only the numbers in these three residue classes were included in the trillion-triangle computation.
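For readers who want to experiment, here is a minimal Python sketch of the counting test. It assumes the usual textbook statement of Tunnell's criterion, which is not spelled out in this update: for odd square-free N, twice the number of integer solutions of 2x^2 + y^2 + 32z^2 = N must equal the number of solutions of 2x^2 + y^2 + 8z^2 = N; for even N the two forms are 8x^2 + 2y^2 + 64z^2 and 8x^2 + 2y^2 + 16z^2. The function names are invented for illustration, and passing the test makes N only a candidate, since the converse direction rests on the Birch and Swinnerton-Dyer conjecture.

```python
from math import isqrt

def count_solutions(n, a, b, c):
    """Count integer triples (x, y, z) with a*x^2 + b*y^2 + c*z^2 == n."""
    count = 0
    x_max = isqrt(n // a)
    for x in range(-x_max, x_max + 1):
        rest_x = n - a * x * x
        y_max = isqrt(rest_x // b)
        for y in range(-y_max, y_max + 1):
            rest = rest_x - b * y * y      # never negative here
            if rest % c == 0:
                q = rest // c
                r = isqrt(q)
                if r * r == q:
                    count += 1 if r == 0 else 2   # z = 0, or z = +r and z = -r
    return count

def is_square_free(n):
    return all(n % (d * d) for d in range(2, isqrt(n) + 1))

def passes_tunnell(n):
    """True if n satisfies the counting condition (assumed form, see above)."""
    if n % 2:
        return 2 * count_solutions(n, 2, 1, 32) == count_solutions(n, 2, 1, 8)
    return 2 * count_solutions(n, 8, 2, 64) == count_solutions(n, 8, 2, 16)

# Square-free N in the residue classes 1, 2, 3 (mod 8) that pass the test.
candidates = [n for n in range(1, 1001)
              if n % 8 in (1, 2, 3) and is_square_free(n) and passes_tunnell(n)]
print(len(candidates), candidates[:10])
```

If the criterion coded here matches the one behind the graph, the length of `candidates` should agree with the count quoted above; the sketch does not hard-code that number. For N congruent to 5, 6 or 7 modulo 8 both solution counts come out zero, which is Cipra's point.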

Hey Google, gimme back my widgets!

11 September 2009

Sometime this morning, a web page that I visit occasionally changed its appearance from this:

google-pretty.png

to this:

google-ugly.png

The images are grabbed from Firefox on OS X; same effect in Safari. Some CSS forensics reveals what’s gone wrong here: Google has added a “height:28px” property to the style for the two buttons, and boosted their type size from 13 pixels to 16. (Type inside the search box is also given the giant billboard treatment.) Apparently the height property prevents the browser from using those cute jelly-bean widgets for the buttons.

Can we strike a bargain, Google? I’ll let you take over the world and track my every movement; I’ll sign over all my copyrights and promise never to block your advertising, if you’ll just give me my widgets back. Go ahead and Be Evil; just Don’t Be Ugly.

Or am I going to have to override you in userContent.css?

Update: Marissa Mayer, Google’s Vice President for Search Products & User Experience, explains:

For us, search has always been our focus. And, starting today, you’ll notice on our homepage and on our search results pages, our search box is growing in size. Although this is a very simple idea and an even simpler change, we’re excited about it — because it symbolizes our focus on search and because it makes our clean, minimalist homepage even easier and more fun to use. The new, larger Google search box features larger text when you type so you can see your query more clearly. It also uses a larger text size for the suggestions below the search box, making it easier to select one of the possible refinements.

For Firefox users who find the “supersized” page a nonimprovement, I offer the following quick hack:

@-moz-document domain(google.com) {
  /* buttons: restore the smaller type and drop the fixed height */
  .lsb, .gac_sb {
    font-size: 13px !important;
    height: inherit !important;
  }

  /* search-box text */
  .lst, .gac_m {
    font-size: 14px !important;
  }
}

Add these lines to your userContent.css file (see the Customizing Mozilla site for further instructions). This seems to fix the widget problem and the input-field text size, and maybe the drop-down menu of suggestions, but the rest of the drop-down gadgetry is still a mess. This will break if Google’s next whim is to change the class names “lsb,” “lst,” etc. If anyone has a cleaner solution, please pass it on.

Nautical numeration

8 September 2009

I’ve been goofing off in Nova Scotia for a few days. In Halifax I climbed up to the Citadel, a hilltop fortress built to protect the city from the French and later rebuilt to fend off the Americans; now it welcomes both those nationalities and anyone else willing to pay $12 for the tour and the costume show.

signal-balls-vert.jpg

In one room of the museum I discovered the curious diagrams reproduced at right. These figures are extracted from a poster on military codes designed for ship-to-shore communication. Signals were sent by hoisting large balls or disks on a mast reaching high above the fortress ramparts; ships in the harbor could reply by raising similar signal disks on their own masts.

The part of the code shown here obviously pertains to the transmission of numbers, but I’ve never seen a weirder system of numeration. Based on these diagrams and others, I infer that the signal disks were raised on three halyards. (A fourth halyard is present in the diagrams but always seems to be empty.) Each halyard could display zero or one or two disks, and each displayed disk could be in either an upper or a lower position. That comes to five possible patterns per halyard, and thus 5^3 = 125 patterns in all. What baffles me is why the ten decimal digits were assigned to the specific patterns shown in the diagrams. Who counts in the sequence 1, 2, 3, 4, 5, 0, 6, 7, 8, 9, 9, 9?

I’m not even sure I understand the basic signaling protocol. When I first looked at the poster, I assumed that the ten digits were represented thus:

positional-encoding.png

Presumably, the three encodings for ‘9’ were equivalent, and any of them could be used at the signaler’s option.

Later, after much musing over this puzzle, I decided that another interpretation of the diagrams seemed more plausible:

cumulative-encoding.png

In this scheme at least the numerals from 1 through 5 have some kind of logic to them, in that the number of disks shown matches the numeral being transmitted; in essence we have a unary notation within that limited range. On the other hand, if this interpretation of the diagrams is correct, we have to ask: Why did the designers stop at 5? They could have extended the same unary scheme to the range from 1 to 6, and then used doubled disks at the top of the mast for the numerals 7, 8 and 9.

If we were creating a code like this today, I’m sure everyone’s first impulse would be a system based on binary notation. It’s not terribly surprising that the British naval authorities circa 1850 didn’t think of that. But I still find the apparent arbitrariness of this numerical code utterly perplexing. Am I missing some pattern in the data, either obvious or subtle? (I suppose the encoding could be deliberately obscure, but there’s no evidence that this was meant to be a secret code, and the mere existence of the poster—which appears to be a 19th-century artifact—suggests otherwise.)

Added 2009-09-10: Below is the signal mast on which the disks were displayed, seen from the landward side. The disks (actually fabric-covered crossed hoops, which have an approximately circular cross section from any azimuth) were 36 inches in diameter. I think the diameter of the mast is about 12 inches at the base.

photo of Halifax military signal mast

* * *

The Googleverse has so far failed to solve this mystery for me, but in the process of searching I stumbled upon a remarkable book that suggests there was some lucid thinking in the 19th century on themes that we would now identify as information theory. The book is A Manual of Signals: For the Use of Signal Officers in the Field, and for Military and Naval Students, Military Schools, Etc., by Albert J. Myer. The full text is available on Google Books in either HTML or PDF.

Myer founded the U.S. Army Signal Corps just before the outbreak of the Civil War. He had studied medicine, writing a dissertation on “A New Sign Language for Deaf Mutes,” and had also worked as a telegraph operator. Then, in the army, he devised the signaling method commonly known as wigwag, using a single flag waved to the left and the right. Fort Myer in Virginia is named for him. (These biographical snippets come from a Signal Corps history, which offers much further detail.)

Myer’s book was first published in 1864 and then revised in 1868, but it takes quite a modern mathematical approach to problems of communication and information. He begins with a tutorial on permutations and combinations, then applies these ideas to the encoding of signals:

We wish for example to make a large number of signals…. We take any few different and simple known signs, sounds, motions, or indications, which we can easily make, and we join them together, twos or threes, or more at a time, making one after another into many and different and more complex signs or arrangements. Each of these new signs becomes, when a meaning is given to it, a signal. We can increase the number of such signals to any limit by continuing to join together the known signals in greater numbers or in new arrangements.

That’s a pretty good description of Σ*, the set of all strings over a finite alphabet.

* * *

Guest update 2009-09-10: The following comes from Barry Cipra.

I have an idea for an alternative interpretation of the way digits are meant to be represented by disks on halyards. My basic premise is that a distant viewer can tell the difference between a disk being up high and it being down low, but cannot accurately judge which halyard a disk is on unless there’s at least one disk on each halyard. (In particular, two disks on outer halyards could be confused with two disks on adjacent halyards if the mast is at an angle to the distant viewer.) In this case, the digits 0 through 9 can be unambiguously represented as follows:

disks-on-halyards encoding suggested by Barry Cipra

In other words, basically your second interpretation for 0 through 8, but with an extra “anchoring” pip for the numbers 1 through 5, but your first interpretation for the number 9. This makes it clear why the small-number format stops at 5.

If my premise is correct (or accepted as correct), then out of the 216 distinct patterns, there are only 1+5+25+125 = 156 unambiguous signals possible. That is, there is 1 signal with no halyards having any disks, 5 with exactly one halyard showing at least one disk, 25 with exactly two halyards showing at least one disk each, and 125 with disks showing on all three halyards.

But maybe we should also require the meaning of a pattern to be unambiguous when the orientation of the ship is uncertain — i.e., A,B,C should have the same meaning as C,B,A. If I’ve thought through things correctly, this cuts the number of unambiguous signals down to 1+5+15+75 = 96.

It would be great to find out how the Navy really did things!
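Cipra's two counts can be checked by brute force. The sketch below assumes a model inferred from his figure of 216 patterns: each of three halyards is in one of six states (empty, or one of five non-empty displays), and a distant viewer sees only the sequence of non-empty displays, not which halyards carry them. The state labels 0 through 5 are arbitrary placeholders.

```python
from itertools import product

STATES = range(6)   # 0 = empty halyard; 1..5 = the five non-empty displays (arbitrary labels)

def visible(pattern):
    """What the distant viewer can make out: the non-empty displays, in order."""
    return tuple(s for s in pattern if s != 0)

patterns = list(product(STATES, repeat=3))

# Patterns that differ only in where the empty halyards sit look alike.
seen = {visible(p) for p in patterns}

# If the viewing direction is also unknown, A,B,C and C,B,A look alike too.
seen_either_way = {min(v, v[::-1]) for v in seen}

print(len(patterns), len(seen), len(seen_either_way))   # 216 156 96
```

Under these assumptions the enumeration agrees with both of the sums given above.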

* * *

Update 2009-09-17: Many thanks to Jim Ward (in the comments) for unearthing a 2007 document by Spurgeon G. Roscoe that illuminates the history of this signaling system, even if it doesn’t quite pin down how the code worked.

Roscoe describes a system of signaling towers founded by Edward, Duke of Kent, during his busy tenure as military commander in the Maritime Provinces at the end of the 18th century. Kent’s optical telegraph line was to run from Halifax to Annapolis in Nova Scotia and on to Fredericton in New Brunswick. It is not known with certainty how much of the network was completed or whether it ever served as a practical communications link. (Roscoe is skeptical, pointing out among other things that the territory around the Bay of Fundy is very foggy.)

The chart on exhibit at the Citadel in Halifax is not directly connected with the Kent telegraph system; it comes from a later era and is identified as a device for ship-to-shore communication; nevertheless, there is an unmistakable familial resemblance. Sketches reproduced by Roscoe give the following code system for the Duke of Kent’s overland telegraph:

numeric code scheme from Roscoe manuscript

This is merely a cyclic permutation of positions 0, 6, 7, 8 and 9 in the Citadel code. (On first glance there also seems to be a mirror reversal involved, but that’s not the case; we’re merely looking at the signal mast from the opposite direction.)

Roscoe says nothing about how to interpret these diagrams—whether, for example, the display for “6” consisted of six disks or a single disk in the lower right corner, or one of the other schemes discussed above or in the comments. But his description of how messages were composed seems so impossibly cumbersome that it leaves me wondering if this plan was ever really put into practice. Apparently, alphabetic characters did not have codes of their own; each letter was encoded in a sequence of numerals. In one system, “A” was 2, 3, 0; “B” was 1, 4, 0; “Q” was 4; “T” was 2, 3. (The compact, single-digit encoding of “Q” makes this a kind of anti-Hamming code.) Can anything so maladaptive have survived the trial of use in the field?

Update 2009-09-22: This is a response to Carl Witty’s comment. I’m putting it here in the main text rather than in a further comment because I think Witty has essentially solved the mystery.

Witty’s interpretation of the Roscoe manuscript rules out all schemes in which the code is “cumulative” — where the signal for 3, for example, consists of the disks numbered 1, 2 and 3. Instead, digits 1 through 6 are each represented by a single displayed disk, and digits 7 through 9 each consist of a single pair of disks. (The triple code for 9 is to be interpreted as an “OR”: Any of the three positions can be used with the same meaning.) Then these digits — which are really just opaque symbols, with no numerical meaning — are combined in various ways to represent the letters of the alphabet and the numbers from 1 to 99. (Supplementary flags bring the range of numbers up to 499.)

Thus when Roscoe indicates that the letter S has the coding “1, 3, 5,” that means the disks labeled 1, 3 and 5 are all displayed simultaneously in order to transmit an S.

There’s a test for this hypothesis. If the scheme is to be workable, the alphabetic and numerical codes composed by displaying two or three or four digits at once must have a particular form. There can be no repeated digits in any encoding, and no two of the digits can use the same position on the mast. Specifically, in the Duke of Kent code illustrated in the 2009-09-17 update above, there can be no composite code that calls for both a 1 and a 7, or a 3 and an 8, or a 9 and a 5.

Does the code transcribed by Roscoe pass this test? Roscoe actually gives two codes, both apparently found in an 1802 “Signal Book” from Camperdowne Station in Nova Scotia. Roscoe’s second listing of alphabetic and numeric encodings does indeed satisfy the constraints. In fact, it obeys a stronger restriction: No combination of symbols in the code calls for hoisting more than two disks on a single halyard. For example, the code avoids not only 1 and 7 but also 2 and 7.

The other code that Roscoe lists fails the test — or so it seems at first glance. There are patterns such as “1, 7” encoding the number 54 and “3, 8” for 64. But on looking closer, it appears that this code adheres to a different set of constraints: There are no instances where a 6 appears together with either a 1 or 2, or where 7 is combined with 3 or 4, or finally where 8 comes together with 5 or 0. These are exactly the restrictions that have to be applied in the Citadel code!
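The test is easy to mechanize. The following sketch uses only the conflicting pairs named in the text; the full mast layouts live in the diagrams, so these sets are an assumption and may be incomplete (the stronger "2 and 7" restriction and the three alternative positions for 9 are ignored). The function and constant names are hypothetical.

```python
from itertools import combinations

# Pairs of digits that cannot be hoisted together, as listed in the text.
KENT_CONFLICTS = {frozenset(p) for p in [(1, 7), (3, 8), (9, 5)]}
CITADEL_CONFLICTS = {frozenset(p) for p in
                     [(6, 1), (6, 2), (7, 3), (7, 4), (8, 5), (8, 0)]}

def displayable(digits, conflicts):
    """True if the digits can all be shown at once under the given conflict set."""
    if len(set(digits)) != len(digits):      # no repeated digits
        return False
    return not any(frozenset(pair) in conflicts
                   for pair in combinations(digits, 2))

# The examples quoted from Roscoe:
print(displayable((1, 3, 5), KENT_CONFLICTS))     # True: the coding for S
print(displayable((1, 7), KENT_CONFLICTS))        # False: fails the Kent test
print(displayable((1, 7), CITADEL_CONFLICTS))     # True: fine in the Citadel layout
```

Running every entry of each of Roscoe's two listings through `displayable` against both conflict sets would be the full version of the check described here.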

Spammy weather

14 August 2009

July brought quite an impressive spam storm, which dumped 10,738 messages on me:

spam-2007-08-to-2009-07.png

That’s a record for my inbox, well beyond the spike of 7,506 messages received last October. The mean number of messages over the two-year period shown is 3,867; the standard deviation is 2,190.

I’m intrigued by the amount of noise in this signal. The magnitude of the fluctuations suggests to me that somewhere in the spam economy there is a small-number bottleneck. Maybe there are only a few high-volume spammers in the world, so that when one of them goes on vacation, the overall volume sinks dramatically. Or maybe there are only a limited number of customers willing to pay for big spam mailings, so that the renewal or cancellation of a single contract can have a noticeable impact. Or it could be that there are only a few major lists of harvested email addresses; when the scraper misses one of my mailboxes, I see a big change.

A bottleneck of this kind is not the only possible explanation. In some other areas with high volatility–the stock market, for example–the apparent cause is not a small population of agents but strong correlations between agents, who all follow the same signs and signals. I suppose that might happen in the spam market, too, with broader economic trends affecting everyone in the same way, but somehow it seems less likely.

The graph below breaks down the monthly totals according to which of my various email addresses the spam targeted. Again the numbers seem to be bouncing around pretty wildly. For example, the July spike mostly came from my address here at bit-player, but the peak last October was dominated by an address at amsci.org.

spam-by-address-2009-08.png

Of the seven addresses I monitor, five are openly published on the web, and thus I shouldn’t be surprised that they attract their share of spam. But the other two addresses have never been published, and one of them I have never used or even handed out to friends. Those obscure addresses are getting about 1,000 spams a month in total.

One feature of my spam that doesn’t seem to fluctuate much from month to month is the proportion written in Russian or other languages that use the Cyrillic alphabet. The fraction has hovered near one-half for the past year. I have a hard time imagining a model that produces such linguistic stability along with volatility in other dimensions.

Outnumbered

10 August 2009

My column in the new issue of American Scientist is about the challenges of computing with very large numbers. The column ends with an open question, which I want to restate here as an invitation to commentary.

Any system of arithmetic in which numbers occupy a fixed amount of space can accommodate only a finite set of numbers. For example, if all numbers have to fit into a 32-bit register, then there are only 2^32 representable numbers. So here’s the question: If our numbering system can represent only 4,294,967,296 distinct values, which 4,294,967,296 values should we choose to represent?

There is an obvious trade-off here between range and precision: To reach farther out on the number line, we have to leave wider gaps between successive representable numbers. But that’s not the only choice to be made. We must also decide whether to sprinkle the numbers uniformly across the available range or to pack them densely in some regions while spreading them thinly elsewhere. I’m not at all sure what principles to adopt in trying to identify the best distribution of numbers.

To frame the question more clearly, I’d like to introduce a series of toy number systems. In each of these systems a number is represented by a block of six bits. There are 2^6 = 64 possible six-bit patterns, and so there can be no more than 64 distinct numbers. A function f(x) maps each bit pattern x to a number. Sometimes it’s convenient to regard the bit patterns themselves as numbers, ordered in the obvious way, in which case f(x) is a function from numbers to numbers.

In the simplest case, f(x) is just the identity function, and so the bit patterns from 000000 through 111111 map into the counting numbers from 0 through 63. A slightly more general scheme allows f(x) to be a linear transformation, f(x)=ax+b, where the parameters a and b are constants. If a=1 and b=0 we again get the nonnegative counting numbers. If a=1 and b=–32, we have a range of integers from –32 to +31. With a=1/64 and b=0, the entire spectrum of numbers is packed into the interval [0,1). Numbers formed in this way are called fixed-point numbers, since there’s an implied radix point at the same position in all the numbers. By adjusting the values of a and b, we can generate fixed-point numbers to cover any range, but they are always distributed uniformly across that range.
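As a small illustration, the three fixed-point examples can be generated directly from f(x)=ax+b. This is only a sketch; the function name `fixed_point` is made up, and exact fractions are used so that the uniform spacing is visible without rounding noise.

```python
from fractions import Fraction

def fixed_point(a, b, bits=6):
    """All values of f(x) = a*x + b as x runs over the bit patterns 0 .. 2**bits - 1."""
    a, b = Fraction(a), Fraction(b)
    return [a * x + b for x in range(2 ** bits)]

counting = fixed_point(1, 0)                 # 0 .. 63
signed = fixed_point(1, -32)                 # -32 .. +31
unit = fixed_point(Fraction(1, 64), 0)       # 0, 1/64, ..., 63/64

for name, values in [("counting", counting), ("signed", signed), ("unit", unit)]:
    gaps = {later - earlier for earlier, later in zip(values, values[1:])}
    print(name, values[0], values[-1], "gap:", gaps)
```

In every case the set of gaps has a single element, which is the uniform distribution described above.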

The well-known floating-point number system, modeled on scientific notation, is a tad more complicated. In my toy version, three bits are allocated to an exponent t, which therefore ranges from 0 to 7. The other three bits specify a fractional significand s, interpreted as having values in the range 8/16, 9/16, 10/16, …, 15/16. The mapping function is defined as f(t, s)=s×2^t. The 64 numbers generated in this way range from 1/2 to 120. (Real floating-point systems, such as those defined by the IEEE standard, not only have more bits but also have a more elaborate structure, with allowance for positive and negative significands and exponents and various other doodads.)

Floating-point numbers are not distributed uniformly; they are densest at the bottom of their range and grow sparse toward the outskirts of the number line. Specifically, the density of binary floating-point numbers is reduced by half at every integer power of 2. In the toy model, the gap between successive representable numbers is 1/16 in the range between 1/2 and 1, then 1/8 between 1 and 2, then 1/4 between 2 and 4, and so on. At the top of the range—beyond 64—the only numbers that exist are those divisible by 8.
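The toy floating-point system is small enough to list in full. The sketch below assumes a particular bit layout, with the exponent in the high three bits and the significand selector in the low three; the text does not specify which bits are which, and `toy_float` is an invented name.

```python
from fractions import Fraction

def toy_float(x):
    """Decode a six-bit pattern: high three bits give t, low three bits select s."""
    t, low = divmod(x, 8)
    s = Fraction(8 + low, 16)      # 8/16, 9/16, ..., 15/16
    return s * 2 ** t

values = [toy_float(x) for x in range(64)]
gaps = [later - earlier for earlier, later in zip(values, values[1:])]

print(values[0], values[-1])                      # smallest and largest values
print(gaps[0], gaps[8], gaps[16])                 # gaps just above 1/2, 1 and 2
print([int(v) for v in values if v >= 64])        # the sparse top of the range
```

With this layout the values come out in increasing order, running from 1/2 to 120, and the gap doubles each time the exponent increases by one.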

The distribution of floating-point numbers describes a piecewise-linear curve, which approximates the function 2^x. In other words, if you squint and turn your head sideways, you can see that floating-point numbers look a lot like base-2 logarithms. This suggests another numbering scheme: Instead of approximating logarithms, why not use the real thing? We can simply interpret a binary pattern as a logarithm and generate numbers by exponentiating. For the six-bit model, we can take the first three bits as the integer part of the logarithm (the characteristic) and the last three bits as the fractional part (the mantissa). Assuming we continue to work in base 2, the number-defining function is just f(x)=2^x. The resulting system of six-bit numbers has a range from 1 to about 235. (By simple rescaling, we could make the range of these logarithmic numbers match that of the floating-point numbers, 1/2 to 120. I’ve not done so for a trivial reason: So that in the graphs presented below the two curves do not overlap and obscure each other.)
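Here is a quick numerical check of the logarithmic system, offered as a sketch. The split of the six bits is left as a parameter (`frac_bits`, a name invented here): three bits of characteristic plus three of mantissa gives a top value near 235, whereas two bits plus four would top out near 15.

```python
def toy_log(x, frac_bits=3):
    """Read the six-bit pattern x as a base-2 logarithm with frac_bits fractional bits."""
    return 2.0 ** (x / 2 ** frac_bits)

values = [toy_log(x) for x in range(64)]
ratios = [later / earlier for earlier, later in zip(values, values[1:])]

print(values[0], round(values[-1], 2))        # bottom and top of the range
print(min(ratios), max(ratios))               # successive values differ by a fixed ratio
print(round(toy_log(63, frac_bits=4), 2))     # top of the range with a 2 + 4 split
```

The constant ratio between neighbors (the eighth root of 2 in the three-bit case) is what makes this system a straight line on a semilogarithmic plot.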

Finally, I want to mention a fourth numbering scheme, called the level-index system. Indeed, what led me to this whole topic in the first place was an encounter with some papers on level-index numbering written in the 1980s by Charles W. Clenshaw and Frank W. J. Olver. The level-index system takes a step beyond logarithms to iterated logarithms and beyond exponents to towers of exponents. The idea is easiest to explain in terms of the mapping function f(x). As with logarithmic numbers, we view the bit pattern x as the concatenation of an integer part (the level) and a fractional part (the index). Then f(x) takes the following recursive form:

level-index-eqn.png

For any level greater than 0, this formula gives rise to a tower of exponentiations. For example, the level-index number 3.25 expands as follows:

tower-of-e.png

In the six-bit model, we can dedicate two bits to the integer level and four bits to the fractional index. The result is a numbering system that covers a very wide range: from 0 to almost 400,000 in 64 steps. The distribution of representable numbers in this range is extremely nonuniform: the gap between the two smallest numbers is 1/16, whereas that between the largest numbers is more than 320,000. (Note also that we have shifted from powers of 2 to powers of e. Actually, a level-index system could be defined on any base, but e has some pleasant properties.)
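The six-bit level-index system can be tabulated with a few lines of Python. This sketch assumes the usual form of the mapping, f(x) = x for 0 ≤ x < 1 and f(x) = e^f(x−1) otherwise, which is what the formula image is taken to show; the function name is invented.

```python
import math

def level_index(x):
    """Assumed mapping: f(x) = x for 0 <= x < 1, otherwise e ** f(x - 1)."""
    if x < 1:
        return x
    return math.exp(level_index(x - 1))

# Two bits of level and four bits of index: the bit pattern k stands for k/16.
values = [level_index(k / 16) for k in range(64)]

smallest_gap = values[1] - values[0]       # between the two smallest numbers
largest_gap = values[-1] - values[-2]      # between the two largest numbers
tower = level_index(3.25)                  # e ** (e ** (e ** 0.25))

print(values[0], round(values[-1]))
print(smallest_gap, round(largest_gap))
print(round(tower, 3))
```

Each unit increase in the level wraps the value in one more exponential, which is why the last few entries of the table dwarf everything that comes before them.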

The graph below compares the distribution of numbers in the four six-bit systems.

linear-curves.png

For fixed-point numbers the graph is necessarily a straight line, since the numbers are distributed at equal intervals. The floating-point graph is a jointed sequence of straight-line segments, with the slope doubling at every power of 2. The logarithmic numbers produce a smooth curve, concave-upward. (Again, this curve could be made to coincide with that of the floating-point numbers.) The level-index curve is also smooth but has an even more pronounced elbow or hockey-stick form.

Plotting the same data on log-linear scales clarifies certain details:

log-curves.png

Now it’s the logarithmic system that yields a straight-line graph. The floating-point curve approximates a parallel straight line. The fixed-point graph is concave-downward, while the level-index curve is sigmoid.

By twiddling various knobs and dials, we could create a number system that would approximate any monotonic line or curve in this space, giving us fine-grained control over the distribution of numbers. For example, on the semilogarithmic graph a straight line of any positive slope can be generated by adjusting the base of the logarithmic number system or the radix of a floating-point system; a larger base or radix yields a steeper slope. The inflection of the level-index curve can be altered in the same way, by choosing different bases. If we wished, we could interpolate between the fixed-point curve and the logarithmic curve by creating number systems in which f(x) is some polynomial, such as x^2. If we wanted a flatter curve than the fixed-point system, we could build numbers around a function such as f(x)=√x.

So many ways of counting! Let me count the ways.

And so I return to the main question. On what basis should we choose a number system for a digital computer? Let’s leave aside practicalities of implementation: Suppose we could do arithmetic with equal efficiency using any of these schemes. Which of the curves should we prefer? There are plausible arguments for allocating a greater share of our numerical resources to smaller numbers, on the grounds that we spend most of our time on that part of the number line—it’s like the middle-C octave on the piano keyboard. That would seem to favor the logarithmic or floating-point system over fixed-point numbers. But can we make the argument quantitative, so that it will tell us the ideal slope for the logarithmic line, and hence the mathematically optimal base or radix? (Does Benford’s law help at all in making such a decision?)

And what about the level-index system, which skews the distribution even further, providing higher resolution in the neighborhood of 1 in exchange for a sparser population out in the boondocks? An argument in favor of this strategy is that overflow is a more serious failure than loss of precision; the choice is between a less-accurate answer and no answer at all. Level-index numbers grow so fast that they can effectively eliminate overflow: Although it’s a finite system, you can’t fall off the end of the number line.

Some years ago Nick Trefethen, a numerical analyst, wrote down some predictions for the future of scientific computing, including these remarks:

One thing that I believe will last is floating point arithmetic. Of course, the details will change, and in particular, word lengths will continue their progression from 16 to 32 to 64 to 128 bits and beyond…. But I believe the two defining features of floating point arithmetic will persist: relative rather than absolute magnitudes, and rounding of all intermediate quantities. Outside the numerical community, some people feel that floating point arithmetic is an anachronism, a 1950s kludge that is destined to be cast aside as machines become more sophisticated. Computers may have been born as number crunchers, the feeling goes, but now that they are fast enough to do arbitrary symbolic manipulations, we must move to a higher plane. In truth, however, no amount of computer power will change the fact that most numerical problems cannot be solved symbolically. You have to make approximations, and floating-point arithmetic is the best general-purpose approximation idea ever devised.

At this point, predicting the continued survival of floating-point formats seems like a safe bet, if only because of the QWERTY factor—the system is deeply entrenched, and any replacement would have to overcome a great deal of inertia. But Trefethen’s “two defining features of floating point arithmetic” are not actually unique to the floating-point system; logarithmic and level-index numbers (and doubtless others as well) also employ “relative magnitudes” and require “rounding of all intermediate quantities.” It may well be that floating point is the best general-purpose approximation idea ever devised, but I’m not convinced that all the alternatives have been given serious evaluation.

Free Pi

Two further notes. Because of space constraints, my American Scientist column appeared with a painfully truncated bibliography. I have posted an ampler list of references here.

Also, would anyone like to have 10 million digits of pi on microfiche cards? While scrounging in my files for interesting lore on large numbers, I came upon a yellow envelope in which 10,015,000 digits of pi fill up nine fiche cards. The envelope bears the notation “rec’d from Y. Kanada, Univ. of Tokyo, 1983.” The 10 million digits set a record back then, which I noted in a brief Scientific American news item. It seems a shame to destroy such a curious artifact; on the other hand, I can think of no earthly use for it.

Kanada’s group has gone on to calculate more than a trillion digits of pi, but he didn’t send me the 900,000 microfiche cards it would take to hold the listing.

The Oracle of Wolfram

25 June 2009

In a comment on my earlier note about Wolfram Alpha, Daniel Asimov takes me to task for failing to explain “what Wolfram Alpha is.” I’ll accept the criticism, but I have to add that the question he raises is a real toughie. What, indeed, is Wolfram Alpha? Much of the prerelease hype (e.g., CIO, ZDNet, Telegraph) suggested it was to be some kind of search engine—a “Google killer,” or else, as Steven Levy wrote, “more like the anti-Google.” Another common theme (Infotoday, Guardian) suggests that Alpha is a manifestation of the semantic web, “that thing that Sir Tim Berners-Lee has been banging on about.”

There have been lots of other attempts to answer the ontological question. Jonathan Zittrain (via the New York Times) calls Alpha a “computable almanac.” Larry Greenemeier, writing for Scientific American, says: “Think of it as Ask Jeeves with [a] PhD.” Stephen Wolfram, the creator of Alpha, tells Rudy Rucker: “If anything, you might call it a platonic search engine, unearthing eternal truths that may never have been written down before.” Yuri Alkin, in a blog called Connections, had the wit to present the question to Alpha itself: “Who are you?” he asked. The polite reply, worthy of HAL or Commander Data, was “I am a computational knowledge engine.”

After a few weeks of sporadic poking around with Alpha, I’m finally ready to take my own shot at answering the big question. If you ask me, Wolfram Alpha is an oracle. Not an oracle in the computer-science sense—a hypothetical black box that simplifies complexity analysis by always giving the correct answer for queries of a specific form. I mean an oracle in the Greek-mythology sense—a sibyl in a cave or a temple, whose responses to questions are often helpful but tend to be enigmatic and require careful interpretation. Sometimes you get just the answer you were looking for. Sometimes you get no answer at all. Sometimes the answer leaves you more perplexed than when you began.

All in all, perhaps it’s better to set aside the question of what Alpha is and ask what it can do.

It can do your homework (or your students’ homework):

Query: Limit (x^p)^(1/p) as p->0

Answer: \(\lim_{p \to 0}(x^p)^{1/p} = x\)

It can graph a function:

Query: plot sin(x)/x from -10 to 10

Answer:

sinxoverxgraph.gif

It makes a handy desk calculator:

Query: 12 choose 3

Answer: 220

Query: factor 8549176323

Answer: 3 × 127 × 22438783 (3 distinct factors)

It provides access to a rich trove of “curated” data:

Query: molecular weight vanadium dioxide

Answer: 82.9403 (grams per mole)

It offers links to “live” data, updated in real time, on topics such as the weather and financial markets:

Query: weather Buenos Aires

Answer:

weatherAlpha.png

But the big payoff of a service like this lies in combining factual queries with mathematical or algorithmic analysis. Surely that’s what a “computational knowledge engine” should be good at, no? I’ve been trying hard to make Alpha perform in this way. So far I’ve found the process pretty frustrating.

Here’s a case study. Remembering an old story about Kansas being flatter than a pancake, I submitted the query, “flattest state in the U.S.” The response was another question: “Did you mean ‘fastest state in the U.S.?’”

Well, no, I didn’t mean that, but out of curiosity I clicked the link to learn which is the fastest state in the U.S. The reply, in its entirety, was this:

fastestUSstate.png

The oracle was in deep enigma mode. I decided to go back to the “flattest” question. Let me add that I hadn’t really expected my first query to work; a ranking of states by flatness is not something you’d find in an almanac (computable or otherwise), and indeed the concept of flatness has various possible definitions. I thought I could give Alpha some help by being more explicit.

Query: All US states maximum elevation - minimum elevation

Answer: Did you mean: US states maximum elevation minimum elevation

I wasn’t quite sure how to respond here, but it doesn’t cost anything to try, so I accepted Alpha’s rephrasing of the query. What I got back was not the answer I was looking for, but it was not entirely without interest:

Query: US states maximum elevation minimum elevation

Answer:

elevationscatterplot.png

The scatterplot of highest and lowest elevations by states tipped me off that the data I’m looking for are in the system somewhere. Indeed, one of those dots in the lower left corner, with both lowest and highest elevations near zero, is probably the answer to my question (at least if we define flatness as the difference between maximum and minimum elevation). But how to identify the dot? Or, for that matter, how to identify the conspicuous outlier—the one state with a minimum elevation well above 1,000 feet?

I allowed myself to be distracted by the latter question. There are a couple of obvious guesses for the state with the highest lowest elevation, so I tried one:

Query: Colorado minimum elevation

Answer: 3314 feet

Hmm. That’s not the outlier in the scatterplot; 3300 feet is well off the chart. That means at least one state was clipped from the graph. Another query makes this more obvious:

Query: US states minimum elevation

Answer:

elevationranks.png

It appears that ranks 1 through 4 lie somewhere above the top edge of the graph. Is there some way to force Alpha to plot the complete data set, without arbitrary cropping? For some kinds of plotting, I’ve figured out how to control the range of the independent variable (see the command “plot sin(x)/x from -10 to 10” above), but in this context I’ve not discovered the key, if there is one. And, as far as I can tell, there is no warning given when a plot is chopped.

Nevertheless, I was able to identify the four missing states. Accompanying the rank-order graph above was a helpful list, whose first entries were: Colorado 3314, Wyoming 3100, New Mexico 2844, and Utah 2001. The highest visible dot in the graph represents the fifth state in the sequence—Montana, with a minimum elevation of 1801 feet. The list gave the first five states and the last five in the ranking. This looked promising. If I could get a complete list of minimum elevations for all the states, and then the corresponding list of maximum elevations, perhaps Alpha could also give me the differences. I would ask it to alphabetize both lists, then subtract them element by element, and finally take the minimum of the result, or else sort again according to magnitude.

A button next to the truncated list of states promised “More.” I pressed it. Now I had the first 10 and the last 10 states, but I was still missing the 30 in the middle. Something else had changed as well: All the numbers were different, with the list of elevations beginning 1010, 945, 867. After a moment’s perplexity, I realized that Alpha had decided to shift from feet to meters. No matter. Three more presses of the “More” button finally got me a complete list of minimum state elevations (in meters). And the same rigmarole soon produced the analogous list of maximum elevations (again in meters).

But now I was stumped. How do I sort the list alphabetically? Can I subtract one list from another? Can I do anything to transform the output of a command? Is there any way to compose commands, so that the output of one routine becomes the input of another? Not a clue.

But perhaps I could do it the other way, slicing the salami crossways instead of longitudinally. Instead of compiling a list of maxes and a list of mins and then subtracting, I could subtract lowest point from highest point state by state and then list the results. Searching through various help files and lists of examples, I eventually came to a page on “Elevation Data,” with a subcategory “Minimum and Maximum Elevations.” And there, at the bottom of the page, was this suggested query: “Montana maximum elevation - minimum elevation.” Clicking on it gave me the result “11,007 feet.” So I could get the elevation range for a single state. All that remained was to persuade Alpha to map the same computation over all the states….

But wait. That’s where this story began, with the query “All US states maximum elevation - minimum elevation.” It didn’t work when I tried it before, and it still doesn’t work now.

I tried some minor variations in phrasing and punctuation, such as this one:

Query: (US states maximum elevation) - (US states minimum elevation)

Answer: 4341 feet

What does the number 4341 mean? A “Show Details” button led to the explanation:

medianelevations.png

Instead of subtracting the vectors element by element, the program is taking the median of each elevation list and then subtracting. (If I had wanted to do that, I wouldn’t have known how to ask for it.)
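For comparison, here is a minimal Python sketch of the two computations: the element-by-element subtraction the query was meant to express, and the difference of medians that Alpha apparently returned. The place names and elevations are made-up placeholders, not Alpha's data, and the function names are hypothetical, chosen only for illustration.

```python
from statistics import median

# Hypothetical placeholder data (feet); NOT the values Alpha reported.
max_elev = {"Aland": 1200, "Bland": 9000, "Cland": 400}
min_elev = {"Aland": 900, "Bland": 100, "Cland": 0}

def elevation_ranges(highs, lows):
    """Subtract element by element, matching entries by state name."""
    return {state: highs[state] - lows[state] for state in highs}

def flattest(highs, lows):
    """Return the (state, range) pair with the smallest elevation range."""
    ranges = elevation_ranges(highs, lows)
    state = min(ranges, key=ranges.get)
    return state, ranges[state]

def median_difference(highs, lows):
    """Median of the highs minus median of the lows (a single number)."""
    return median(highs.values()) - median(lows.values())

print(flattest(max_elev, min_elev))           # ('Aland', 300)
print(median_difference(max_elev, min_elev))  # 1100
```

Keying both tables by state name sidesteps the need to alphabetize two lists before subtracting them, which is the step that had no obvious expression in Alpha's query language.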

Finally, shown below in full detail is what came back after one further attempt to formulate the “flattest state” query:

albanianlekfeet.png

Who asked about Albanian currency? I guess this is what the sibyl says when she’s tired of listening to all of my questions.

*   *   *

Wolfram Alpha is an ambitious project, as its makers would be the first to proclaim. Here’s what the “About” page tells us:

Wolfram|Alpha’s long-term goal is to make all systematic knowledge immediately computable and accessible to everyone. We aim to collect and curate all objective data; implement every known model, method, and algorithm; and make it possible to compute whatever can be computed about anything.

It’s hard to resist making fun of these lofty and all-encompassing aims, especially when a fairly simple geographic query returns a result expressed in units of Albanian Lek-feet. All the same, I still applaud the attempt to create such a service, and I hope that Stephen Wolfram and his colleagues achieve some reasonable fraction of their goals.

The main sticking point, it seems pretty obvious, is not in collecting and curating data or in formulating models, methods and algorithms. It’s the access part. How am I to communicate with the system? How am I to specify which bits of systematic knowledge I’d like to retrieve, and how do I tell Alpha which models, methods and algorithms to apply? For more than 50 years the answer to this question has generally been a programming language of some kind. The designers of Wolfram Alpha have deliberately turned their back on that option, in favor of a natural-language interface. I’m sure they made this choice with the best of motives, in order to reach out to a wider audience that might be intimidated by formal notation. Unfortunately, the natural-language interface is so limited that we’re effectively left with no notation at all.

In a way, talking to Wolfram Alpha is rather like communicating in a natural language—a foreign language you don’t happen to speak. With grunts and gestures and a few stray nouns you may be able to get across the most rudimentary touristic needs—“Where toilet?” or “How much?”—but if you want to carry on a real conversation, you need more vocabulary and, most of all, you need grammar. I’m skeptical that Wolfram Alpha will ever be of much use without such a linguistic structure.

Not up to norm

20 June 2009

I have a new column out in American Scientist, on “compressive sensing”:

When you take a photograph with a digital camera, the sensor behind the lens has just a few milliseconds to gather in a huge array of data. A 10-megapixel camera captures some 30 megabytes—one byte each for the red, green and blue channels in each of the 10 million pixels. Yet the image you download from the camera is often only about 3 megabytes. A compression algorithm based on the JPEG standard squeezes the file down to a tenth of its original size. This saving of storage space is welcome, but it provokes a question: Why go to the trouble of capturing 30 megabytes of data if you’re going to throw away 90 percent of it before anyone even sees the picture? Why not design the sensor to select and retain just the 3 megabytes that are worth keeping?

The answer to this question leads to some ingenious mathematics and technology—but I’ll leave all that for the column itself. Here, regrettably, I need to try to patch up a few soft spots in my telling of the story. To make the best of the situation, perhaps I can find something interesting to say about my mistakes.

Here’s the first trouble spot. I wrote:

If we have N equations in N unknowns (and if the equations are all distinct) the system is certain to have a unique solution.

The equations in question are linear, and they have coefficients restricted to the set {0,1}. In my first draft, the parenthesis read “(and if the equations are all linearly independent),” but I wanted to avoid that bit of jargon, and I persuaded myself that because of the restriction on the values of the coefficients, distinct equations would necessarily be linearly independent. That’s wrong. Take the system:

\[ \begin{split}
0x + 0y + 1z& = 2 \\
1x + 1y + 0z& = 3 \\
1x + 1y + 1z& = 5 \\
\end{split}\]

We have three distinct equations in three unknowns with all coefficients in {0,1}, but the system is not linearly independent because the third equation is the sum of the first two. And the system has infinitely many solutions (namely all those that satisfy \(x + y = 3\) and \(z = 2\)).

So I’m left wondering: Is there some way I could have explained the point correctly without a long digression on linear independence, the rank of a matrix and other niceties of linear algebra? Peter Renz, who pointed out my error in the first place, suggested this phrasing: “What is needed is that each equation adds information to the whole, so that no equation can be derived from the others.” I like that way of putting it, but on the other hand it doesn’t tell you how to look at a system of equations and determine whether each one adds information. Still, maybe it’s the best we can do.
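One mechanical way to check whether each equation adds information is to compute the rank of the coefficient matrix: if the rank is less than the number of equations, at least one of them can be derived from the others. The following is only a sketch, using plain Gaussian elimination with exact fractions; the function name `rank` is an assumption made for this example, not something from the column.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix by Gaussian elimination in exact rational arithmetic."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0  # number of pivot rows found so far
    for col in range(len(m[0])):
        # Look for a row at or below r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        # Clear the entries below the pivot.
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

# Coefficients of the three equations in the example system.
coefficients = [[0, 0, 1],
                [1, 1, 0],
                [1, 1, 1]]
print(rank(coefficients))  # 2: only two of the three equations add information
```

A rank of 2 for three equations confirms that one of them is redundant, which is why the system cannot pin down a unique solution.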

The second expository pothole reads as follows:

For each vector element \(x\), the least-squares rule calculates \(x^2\) and then sums all the results. The search for a sparsest vector can be framed in similar terms, the only change being that \(x^2\) is replaced by \(x^0\). The zeroth power of \(0\) is \(0\), but for any other value of \(x\), \(x^0\) is equal to \(1\).

A reader noted that he had been taught—and had gone on to teach others—that \(0^0\) is an “indeterminate form,” without a definite value. And indeed it’s true: \(0^0 = 1\) if we take the expression as the limit of \(x^0\) as \(x\) approaches \(0\), but \(0^0 = 0\) if we view it as the limit of \(0^x\) as \(x\) goes to \(0\) from above. Given this discontinuity, I should not have declared so glibly that \(0^0 = 0\), as if there could be no controversy about it. Instead I could have phrased it along the lines of, “In this context it’s convenient to adopt the convention that \(0^0\) is equal to \(0\).”

But, on looking at the situation a little more closely, the whole business seems really murky. Here’s a passage from Concrete Mathematics, by Graham, Knuth and Patashnik:

Some textbooks leave the quantity \(0^0\) undefined, because the functions \(x^0\) and \(0^x\) have different limiting values when \(x\) decreases to \(0\). But this is a mistake. We must define

\(x^0 = 1\) for all \(x\),

if the binomial theorem is to be valid when \(x=0, y=0\), and/or \(x=-y\). The theorem is too important to be arbitrarily restricted! By contrast, the function \(0^x\) is quite unimportant.

In spite of this authoritative pronouncement, however, workers in certain other fields adopt the opposite convention. The tangled sentences that got me into this pickle were written in an attempt to explain—again without introducing some hard-core terminology—the difference between the \(L_2\) and the \(L_0\) vector norms. And it appears that the usual definition of the \(L_0\) norm just doesn’t work unless \(0^0 = 0\). But I had never bothered to think clearly about this until my reader raised the question.

Suppose we’re given the four-dimensional vector \(\mathbf{x} = (2,0,1,5)\), and we’re asked to define its length \(\|\mathbf{x}\|\). The most familiar definition of length is the Euclidean “square root of the sum of the squares” formula, which corresponds to the \(L_2\) norm:

\[\|\mathbf{x}\|_2 = (2^2 + 0^2 + 1^2 + 5^2)^\frac{1}{2} \approx 5.477.\]

The \(L_1\) norm is defined in the same way but with first powers rather than second powers:

\[\|\mathbf{x}\|_1 = (2^1 + 0^1 + 1^1 + 5^1)^1 = 8.\]

The \(L_1\) norm is really just the sum of the absolute values of the vector components. In the same way that the \(L_2\) norm measures Euclidean distance, the \(L_1\) norm implements the Manhattan, or taxicab, metric.

Following the same scheme, we can invent lots of higher norms. Here’s the length of our example vector in the \(L_5\) norm:

\[\|\mathbf{x}\|_5 = (2^5 + 0^5 + 1^5 + 5^5)^\frac{1}{5} \approx 5.011.\]

We can generalize this idea in a formula that defines the \(L_p\) norm for any positive integer \(p\):

\[\|\mathbf{x}\|_p = \left( \sum_{i=1}^n |x_i|^p \right)^\frac{1}{p}.\]

As \(p\) increases, the \(L_p\) calculation gives greater and greater weight to the larger components of the vector. By taking the limit as \(p \to \infty\), we can even define the \(L_\infty\) norm, which is essentially a max operation—only the largest element contributes. For our example vector, \(\|(2,0,1,5)\|_\infty = 5\).
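The numbers above are easy to reproduce. Here is a minimal Python sketch of the general formula, along with a loop showing the value creeping toward the largest component as \(p\) grows; the function names `lp_norm` and `linf_norm` are assumptions made for this example.

```python
def lp_norm(x, p):
    """L_p norm: (sum of |x_i|**p) ** (1/p), intended for p >= 1."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

def linf_norm(x):
    """L-infinity norm: the largest absolute value among the components."""
    return max(abs(v) for v in x)

x = (2, 0, 1, 5)
for p in (1, 2, 5, 20, 100):
    print(p, lp_norm(x, p))
print("inf", linf_norm(x))
```

For the example vector the printed values fall from 8 at \(p = 1\) toward 5, the value of the \(L_\infty\) norm.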

What about going in the other direction, to \(p = 0\)? In a fuzzy-headed, hand-waving way, it’s clear that what we want in this case is the sum of the zeroth-powers of the vector components. In other words, we simply want to count the components, ignoring differences in their magnitudes. Unfortunately, though, we can’t just plug \(p=0\) into the formula. That would give:

\[\|\mathbf{x}\|_0 = \left( \sum_{i=1}^n |x_i|^0 \right)^\frac{1}{0},\]

which just won’t do. The most serious problem, of course, is the \(\frac{1}{0}\) exponent, but there’s also the question of what meaning to assign to \(0^0\). As I understand it, the communities of workers who care about using the \(L_0\) norm solve these problems by introducing a special-case definition such as this:

\[\|\mathbf{x}\|_0 = \sum_{i=1}^n |x_i|^0,\]

along with the further proviso that \(0^0 = 0\). Accordingly, \(\|\mathbf{x}\|_0\) is a count of the nonzero components of the vector \(\mathbf{x}\). For our running example,

\[\|\mathbf{x}\|_0 = (2^0 + 0^0 + 1^0 + 5^0) = (1 + 0 + 1 + 1) = 3.\]

I guess this works, but if we need a special case anyway, wouldn’t it be easier just to define the \(L_0\) norm directly as the count of the nonzero elements? In formal terms this could be done using some sort of delta function instead of \(x^0\). Taking this approach, the \(0^0\) issue would never enter the case—and maybe I’d have a better chance of staying out of trouble.
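A short Python sketch makes the contrast concrete. Defining the count directly never touches \(0^0\), whereas summing zeroth powers depends on the language's convention, and Python happens to evaluate `0 ** 0` as `1`. The function name `l0_norm` is hypothetical.

```python
def l0_norm(x):
    """Count the nonzero components directly; no powers involved."""
    return sum(1 for v in x if v != 0)

x = (2, 0, 1, 5)
print(l0_norm(x))                    # 3
print(sum(abs(v) ** 0 for v in x))   # 4, because Python evaluates 0 ** 0 as 1
```

So a literal transcription of the sum-of-zeroth-powers formula would report every vector as fully dense, which is one more argument for the direct definition.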

Treats Tropiques

17 June 2009

I’ve let the tropical mathematics bandwagon pass me by. It’s not that I’ve been unaware. I noticed the stream of preprints, the special session at a joint mathematics meeting a few years back, workshops in Palo Alto and Moscow, upcoming events in Montreal and Berkeley. I’ve noticed all this, but I’ve been resisting it. Maybe it was the term “tropical” that put me off—a little too whimsical, with too-deliberate designs on my curiosity. But now the tropical craze has made it to the cover of Mathematics Magazine, with a lead article by David Speyer and Bernd Sturmfels (preprint version available here). I guess it’s finally time for me to figure out what the fuss is all about. The notes below—a work in progress—trace my efforts to do so.

MathMag-cover-img0514.jpg

For starters, what is that word “tropical” supposed to mean? Speyer and Sturmfels explain: “The adjective tropical was coined by French mathematicians, including Jean-Eric Pin, in honor of their Brazilian colleague Imre Simon.” Pin, in a 1998 paper (.pdf), deflects the credit to another French mathematician, Dominique Perrin, again noting that the name honors “the pioneering work of our brazilian colleague and friend Imre Simon.” Simon himself, in a 1988 paper (.ps), attributes the term to yet a third French mathematician, Christian Choffrut. Apparently, no one wants to lay claim to the word, and I can’t entirely blame them. Speyer and Sturmfels go on: “There is no deeper meaning in the adjective ‘tropical’. It simply stands for the French view of Brazil.”

The fundamental objects in the tropical world are the real numbers \(\mathcal{R}\) augmented with \(\infty\), and two operations on those numbers, tropical addition, denoted \(\oplus\), and tropical multiplication, \(\otimes\). Tropical addition is simply the minimum operation; that is, a \(\oplus\) b is equal to the lesser of a and b. For example:

\[ \begin{split}
8 \oplus 3& = 3 \\
-8 \oplus 3& = -8 \\
x \oplus \infty& = x
\end{split}\]

Tropical multiplication is equivalent to addition in conventional (non-tropical) arithmetic: a \(\otimes\) b = a + b. Some examples:

\[ \begin{split}
8 \otimes 3& = 11\\
-8 \otimes 3& = -5\\
x \otimes \infty& = \infty .
\end{split}\]

So that’s the key idea in a nutshell: plus is min and times is plus. This looks like a fairly strange and arbitrary reshuffling of definitions. First we get rid of addition, then we bring it back but call it multiplication. Why not just leave addition alone and redefine multiplication as min? There is a reason for doing it the other way around. In tropical arithmetic, the min operation \(\oplus\) plays the same role that addition plays in ordinary arithmetic, and \(\otimes\) has the same role as multiplication. In particular, consider the distributive law:

\[ a \otimes (b \oplus c) = (a \otimes b) \oplus (a \otimes c) .\]

This rule holds correctly when \(\oplus\) is min and \(\otimes\) is +, but it wouldn’t work if the definitions were swapped. The commutative and associative laws are also valid in tropical arithmetic. Speyer and Sturmfels point out that the system even fulfills “the Freshman’s Dream”:

\[ (a \oplus b)^{3} = a^{3} \oplus b^{3} .\]

On the other hand, there’s trouble with subtraction. No value of \(x\) can satisfy an equation such as \(x \oplus 2 = 5\). Because of this asymmetry, the algebraic structure of tropical arithmetic is classified as a semiring (whereas ordinary arithmetic on \(\mathcal{R}\) is a ring). Here’s another bit of jargon for jargon’s sake: the tropical semiring is idempotent. This just means that \(a \oplus a \oplus a \oplus … \oplus a = a\).
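These rules are simple enough to try out directly. The sketch below defines the two operations in Python and checks the distributive law, the Freshman's Dream and idempotence over a handful of sample values; the names `oplus`, `otimes` and `tpow` are assumptions for this example, and a check over a few integers is an illustration, not a proof.

```python
import math
from itertools import product

INF = math.inf  # identity element for tropical addition

def oplus(a, b):
    """Tropical addition: the lesser of a and b."""
    return min(a, b)

def otimes(a, b):
    """Tropical multiplication: ordinary addition."""
    return a + b

def tpow(a, n):
    """a tropically multiplied by itself n times, i.e. n * a."""
    return n * a

print(oplus(8, 3), oplus(-8, 3), oplus(7, INF))     # 3 -8 7
print(otimes(8, 3), otimes(-8, 3), otimes(7, INF))  # 11 -5 inf

samples = [-8, -1, 0, 2, 3, 8]
for a, b, c in product(samples, repeat=3):
    # Distributive law: a (x) (b (+) c) == (a (x) b) (+) (a (x) c)
    assert otimes(a, oplus(b, c)) == oplus(otimes(a, b), otimes(a, c))
for a, b in product(samples, repeat=2):
    assert tpow(oplus(a, b), 3) == oplus(tpow(a, 3), tpow(b, 3))  # Freshman's Dream
    assert oplus(a, a) == a                                       # idempotence

# With the roles swapped (plus as "multiplication", min as "addition"),
# distributivity fails: min(2, 3 + 8) is 2, but min(2, 3) + min(2, 8) is 4.
print(min(2, 3 + 8), min(2, 3) + min(2, 8))  # 2 4
```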

And now you ask: What’s it all good for?

As I understand it, Imre Simon and those French mathematicians who take a tropical view of Brazil were interested in the tropical semiring mainly in connection with automata theory. The idea goes something like this. A finite-state automaton can be represented as a directed graph, where the nodes of the graph are states of the machine and the edges are transitions between states. Suppose the edges are assigned numeric labels, and a path from an initial state to a final state accumulates the sum of all the edges traversed along the way. There may be multiple paths that connect the same initial and final nodes, and so it makes sense to ask which of these paths has the minimal sum. These questions are conveniently addressed in the algebra of tropical arithmetic, where the summation of the labels is the tropical product \(\otimes\) and the taking of a minimum is the tropical sum \(\oplus\). (Expressing the rules in a formal algebra rather than an ad hoc convention helps with certain proofs about the automata.)
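To see how the algebra packages up this path-finding idea, here is a sketch in Python of matrix multiplication over the tropical semiring. The four-state machine and its edge labels are invented for illustration, and `minplus_matmul` is a hypothetical helper name. Entry \((i, j)\) of the \(k\)th tropical power of the label matrix is the minimal sum over all paths of exactly \(k\) edges from state \(i\) to state \(j\).

```python
import math

INF = math.inf  # stands for "no edge" or "no path"

def minplus_matmul(A, B):
    """Matrix product with (+) as min and (x) as ordinary addition."""
    inner = len(B)
    return [[min(A[i][t] + B[t][j] for t in range(inner))
             for j in range(len(B[0]))]
            for i in range(len(A))]

# Hypothetical machine: W[i][j] is the label on the edge from state i to
# state j, or INF if there is no such edge.
W = [[INF, 1,   4,   INF],
     [INF, INF, 2,   6],
     [INF, INF, INF, 1],
     [INF, INF, INF, INF]]

W2 = minplus_matmul(W, W)    # best paths using exactly two edges
W3 = minplus_matmul(W2, W)   # best paths using exactly three edges

print(W2[0][3])  # 5, via 0 -> 2 -> 3
print(W3[0][3])  # 4, via 0 -> 1 -> 2 -> 3
print(min(W[0][3], W2[0][3], W3[0][3]))  # 4, the minimal sum over all paths
```

The final line is a tropical sum of the three entries, so the whole calculation stays within the two operations \(\oplus\) and \(\otimes\).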

Automata theory may be where it all began, but the current interest in tropical mathematics seems to have a different focus. The emphasis has shifted to algebraic geometry—an area of mathematics that I look upon with equal parts of fascination and fright. The main aim in this realm is to understand the solution sets of polynomial equations.

If you plot the values of an ordinary polynomial, you get a smooth curve. The graph below shows the simple quadratic function \(y=2x^2+x-1\) with its two real roots at \(x=-1\) and \(x=1/2\).

quadratic.png

What is the equivalent graph for a tropical polynomial? To construct such a beast, we first have to decide what \(2x^2\) might mean in a tropical context. In ordinary arithmetic, \(2x^2\) is shorthand for \(2 \times x \times x\), and so the obvious translation is \(2 \otimes x \otimes x\); translating back to conventional arithmetic, this expression becomes \(2 + x + x\), or in other words \(2x + 2\). Thus the quadratic term \(2x^2\) becomes a linear function, and the same transformation applies to higher powers also: In general \(kx^{n}\) is converted into \(nx + k\). Thus the exponent in the original equation becomes the integer coefficient that determines the slope of a line. The graph of the tropical polynomial is shown below.

tropical-quad.png

The green line traces the complete function \(y=2x^2 \oplus x \oplus -1\); it is the union of segments and rays drawn from the three lines \(y=2x+2\), \(y=x\) and \(y=-1\). For each value of \(x\), the function takes on the least value of \(y\) that lies on one of these lines (the “lower envelope” of the lines). Graphs of all tropical polynomials share this same basic appearance: They are piecewise linear functions and they are always concave downward.

Just as the ordinary polynomial \(2x^2+x-1\) can be factored into the form \((x+1)(2x-1)\), the tropical polynomial \(y=2x^2 \oplus x \oplus -1\) has the unique factorization

\[2 \otimes (x \oplus -2) (x \oplus -1).\]

The constants –2 and –1 appearing in these factors correspond to the “bend points” in the graph of the function, and they play a role analogous to the roots of the ordinary polynomial. They are values of the variable \(x\) where two or more of the linear constraints are simultaneously satisfied. The locus of all such points is the solution set, the object of particular interest to the algebraic geometers.
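The factorization can be spot-checked numerically. In this Python sketch both forms are translated back into ordinary arithmetic and compared on a grid of sample points; the function names are made up for the example, and agreement on a grid is a sanity check, not a proof.

```python
def tropical_poly(x):
    """2x^2 (+) x (+) -1, i.e. the lower envelope min(2x + 2, x, -1)."""
    return min(2 * x + 2, x, -1)

def tropical_factored(x):
    """2 (x) (x (+) -2) (x) (x (+) -1), i.e. 2 + min(x, -2) + min(x, -1)."""
    return 2 + min(x, -2) + min(x, -1)

# Sample points from -5 to 5 in steps of 1/4 (exactly representable floats).
points = [k / 4 for k in range(-20, 21)]
assert all(tropical_poly(x) == tropical_factored(x) for x in points)

# At the bend points two of the three linear terms tie for the minimum.
print(2 * (-2) + 2, -2)   # -2 -2: the lines y = 2x + 2 and y = x meet at x = -2
print(-1, tropical_poly(-1))  # -1 -1: the lines y = x and y = -1 meet at x = -1
```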

For a polynomial in one variable, the solution set generally consists of isolated points. The situation gets hairier in higher dimensions, where the solution sets become algebraic “curves”—though they sure don’t look very curvaceous. Take the two-variable tropical polynomial function \(z(x,y) = x^2 \oplus y^2 \oplus 1\). The three terms of the polynomial define three planes in \(\mathcal{R}^3\), and the value of the function is the lower envelope of these planes:

xy-polynomial-3d.png

The “bend points” are now “bend lines” where the planes intersect. Projected onto the \(x,y\) plane, this tropical “curve” has a distinctive Y-shaped form.

zxy-solution-set.png

In the diagram, the rays in red represent the solution set. They divide the plane into three regions, in each of which a different linear relation has minimal height; the rays themselves define the locus of points where two (or, at the vertex where the rays meet, all three) of the terms are minimal. This trident motif is preserved in all tropical curves, not just this simple example. The illustration on the cover of Mathematics Magazine shows a more complicated graph, part of a proof of the tropical version of Bézout’s theorem. (The theorem states, with various caveats, that two algebraic curves of degree m and n intersect in mn points. If I understand correctly, it carries over fully from “temperate” to tropical mathematics.)

The mere idea of reconstructing all of arithmetic with different operations, and then building higher-level structures (algebra, geometry) on top of this new foundation—all that’s enough fun to justify the project. But it seems there are also practical applications. Speyer and Sturmfels discuss a biological problem: Given genetic sequences that define pairwise distances between species, construct a phylogenetic tree showing the evolutionary history of the species. Tropical arithmetic turns out to be useful in finding a tree where all the distances are metrically consistent—that is, they obey the triangle inequality.

That’s about as far as I’ve been able to get in my explorations of the tropical latitudes. But in closing I’d like to say another word about the naming of things in science and mathematics. My sober, fuddy-duddy opinion is that we’d be better off in the long run giving things plain, descriptive names, without too much metaphorical spin on them. Physicists once had a lot of fun with quarks and gluons, charm and strangeness, infrared slavery and ultraviolet freedom; but eventually the joke wears thin. I gather that biologists are now regretting the game of giving genes names like Sonic Hedgehog. Tropical mathematics may also get tired of explaining over and over again about “the French view of Brazil.” On the other hand, I have to admit that if all those papers and workshops had talked about min-plus algebra, they never would have caught my eye.

Update 2009-08-25: I’ve just learned that Imre Simon died August 12.