Archive for the ‘computing’ Category

Quantum numbers

Monday, June 25th, 2007

Quantum computing gets a lot of attention, but we don’t hear much about quantum mathematics. The very idea is an affront to Platonist thinkers everywhere—those of us who consider the elements of mathematics to be independent of the physical universe. Is the truth of the Pythagorean theorem subject to the same uncertainty as the fate of Schrodinger’s cat? Surely the counting numbers 1, 2, 3,… should be the same in any universe, whether quantum or classical; indeed, if you’re a full-gospel, whole-Bible Platonist, you might well argue that those numbers existed even before there was a universe. (At the opposite extreme are the constructivists, who warn that the only numbers you can count on are those you have fingers for.)

Questions like these are easily settled by experiment: Just annihilate the universe, abolish space and time, and try doing a few sums in the void. I would put the proposition to the test right now except that I don’t want to miss this week’s episode of “1 vs. 100.”

Paul Benioff of Argonne National Laboratory has been thinking and writing about these issues for some time. A few weeks ago he posted a preprint on the arXiv titled “Space of quantum theory representations of natural numbers, integers, and rational numbers.” I’ve been struggling to understand that paper, along with a couple of earlier ones on the same theme (here and here). I’m still a long way from mastering this material, but here’s what I’ve made of it so far.

Even if you go along with the Platonist view that the true home of all things mathematical is an ideal realm outside of time and space, we don’t live in that realm. Whenever we want to do something with numbers (or other mathematical entities), we have to represent them somehow in this universe. We chalk them on the blackboard; we twiddle the beads of an abacus; we load patterns of bits into a computer memory; even mental arithmetic requires a mind, which in turn seems to require a body. Thus no one does mathematics without atoms and photons and other toys and tools of the physicist, and so perhaps it’s not utterly crazy to suppose that what we can accomplish in mathematics may depend to some extent on the laws of physics. All the evidence suggests that those laws are inescapably quantum mechanical.

Benioff argues that numbers have to be represented and manipulated in ways that accord with quantum principles; for example, all operations should be reversible, and they should enforce “unitarity,” meaning that the probabilities of all possible outcomes should sum to exactly 1. At the same time, it’s important that numbers behave like numbers: They have to obey familiar axioms of arithmetic, or they’re of no use to mathematics. In the case of the counting numbers (a.k.a. the natural numbers) we have to be able to add and multiply. For the integers (the natural numbers augmented with zero and negative values), subtraction must also be allowed. The rationals introduce division. And for all these kinds of numbers we should be able to determine if two numbers are equal, and put them in order if they are not.

Quantum computing is usually discussed in terms of qubits, or the quantum equivalent of binary digits. Benioff extends the discussion to qukits, the quantum equivalent of base-k digits. Think of a qukit as a black box that holds one of k values, but until you open the lid, you can’t be sure which value. Once you look inside, the uncertainty is resolved, and the box is found to contain a specific value. The question Benioff addresses is: How do you do arithmetic with such unruly numbers? If you can’t even be sure of what numbers you’re working with, how can you add or subtract them, or test them for equality? The answer, ultimately, is that you have to put up with a degree of uncertainty. In the quantum world, the commutative law of addition, a+b = b+a, is not a bedrock principle but a statement whose truth is a matter of probabilities.

Benioff presents a specific implementation of quantum-theoretical numbers, based on a two-dimensional lattice of qukits. Each number occupies its own row of the lattice, with the digits arrayed from left to right. For integers, a designated qukit (actually a qubit, since it has just two states) serves as a sign bit. For rationals, this special qubit also marks the position of the decimal point (or “k-al” point), separating the integer part from the fractional part. On first glance, this lattice of qukits doesn’t seem too different from the hardware of an ordinary computer, but the quantum nature of the qukits brings some peculiarities. For example, the result of every operation has to go into a newly allocated row of the lattice; you can never erase or overwrite an existing value, because quantum operations have to be reversible and cannot destroy information.

Benioff introduces a set of parameters for his numbering system: m and h are the coordinates of a number within the lattice of qukits, k is the base of the qukits, and g is a “gauge-fixing function.” This last item sounds quite arcane, but I think it has a fairly simple explanation. You can imagine a qukit as a vector that can point in any of k directions; the gauge-fixing function defines a reference direction from which all the others are measured.

Here’s a taste of what Benioff has to say about his quantum numbers:

Transformations (k, (m, h), g) → (k’, (m’, h’), g’) in the parameter set induce transformations in the representation space. These consist of unitary translations that move the qukit strings on the lattice, transformations that change states of strings of base-k qukits to states of strings of base-k’ qukits, and unitary gauge transformations for each k….

An interesting result is that the axioms and theorems for each of the three types of numbers [i.e., natural numbers, integers and rationals] are invariant under these transformations. They represent symmetries of the systems. This is the case even though the specific expressions of the axioms and theorems in terms of basic arithmetic relations and operations are different for different representations. This is like the situation in physics where the laws of physics are invariant under Lorentz transformations even though their specific expression in different reference frames may be different.

Another interesting result is that qukits qk where k is a prime number function as elementary qukits. These are the “elementary particles” as far as quantum representations of numbers are concerned. Qukits where k is not prime can be considered as composites of the prime number qk.

If you want to delve more deeply into these matters, I must send you to Benioff’s paper. Here I want to return to the broad question of whether this line of inquiry can really lead us to some kind of quantum mathematics, as opposed to an abstract version of quantum computing. Personally, I’m not quite persuaded, although I find the proposition intriguing.

In most of this work the focus is on the representation of numbers, rather than the numbers themselves—a preoccupation that seems more characteristic of computing than of mathematics. Interesting mathematical properties of numbers tend to be independent of their representation; for example, the number seven is a prime whether you write it in decimal notation as 7, in binary as 111 or as the Roman numeral vii. Of course you must choose some representation, and the choice can make a difference in what you can accomplish. Benioff points out that unary notation (in which seven becomes 1111111) is inherently less efficient than other schemes, because the amount of work expended in manipulating a number is proportional to the number itself rather than to the logarithm of the number. For similar reasons Benioff objects to building all of arithmetic on the successor function (n → n+1). But this fretting over efficiency again suggests a more computational than mathematical frame of mind. After all, unary numbers and the successor function are essential building blocks in theories of the foundations of mathematics, such as in the Principia Mathematica of Whitehead and Russell—who were blissfully unconcerned with efficiency.

If you accept that the physical representation of mathematical objects is a fundamental issue in mathematics, then it’s an easy step to the conclusion that any such representation has to be consistent with quantum principles. But is the mathematical imagination truly fettered by the bounds of the physical universe? Mathematicians routinely reason about objects and operations that have no explicit material representation—irrational numbers, for example, or Georg Cantor’s infinite sets. A century ago, in response to another challenge from those who wanted to fence in the scope of mathematics, David Hilbert defiantly proclaimed: “No one shall expel us from the paradise that Cantor has created for us.”

If I remain a tad doubtful about the mathematical status of quantum-theoretical numbers, I do think they offer an interesting perspective on quantum computing. So far, no one has succeeded in building a practical quantum computer with a large number of interacting qubits (or qukits). Technological skeptics contend it will not be done any time soon. Benioff’s work turns this argument on its head. Quantum computers are the only ones we can build, he says, because we live in a world where quantum physics is the law of the land. We may think we have classical computers, but that’s an illusion. We merely have quantum computers that are heavily biased toward specific classical outputs, but they always retain the possibility of delivering a quantum surprise. Usually we regard computing—whether it’s done with a machine or with pencil and paper—as an approximation to a mathematical ideal. If a calculation shows that a+b does not equal b+a, we don’t question the commutative law of addition; we look for a bug or an error. But maybe we need to consider the possibility that it’s the quantum computation that’s the ultimate reality, and the mathematical law is just our convenient and tidy approximation to it.

V1@gra

Wednesday, June 13th, 2007

I watched the spelling bee on TV a couple of weeks ago and was stumped by word after word: aniseikonia, oberek, randkluft, cachalot, schuhplattler, cilice. It’s all enough to send you reeling back to Andrew Jackson or Mark Twain or Winston Churchill or whoever the hell it was who said “I don’t give a damn for a man that can only spell a word one way!” As it happens, I’ve been writing lately about words that get spelled and misspelled in lots and lots of ways. My Computing Science column in the July–August issue of American Scientist asks the question: “How many ways can you spell V1@gra?”

Disclaimer: I ask the question but I can’t answer it—or at least I can’t give a definite number or a close approximation.

Another question also goes unanswered. The curious and creative spellings prevalent in spam are (presumably) intended to evade the filters that most of us have installed on our e-mail. Because the word “Viagra” is uncommonly common in spam, most e-mail that mentions it gets dumped in the junk bin. So what do you do if you frequently need to discuss Viagra in your correspondence? In particular, what about Pfizer, the company that makes and markets the stuff? Surely their corporate mail servers can’t be running filters that block all references to their own product.

I tried to find out how Pfizer deals with this problem. I sent an e-mail query to their public relations department. I got no response—which could of course be taken as an answer to my question. I tried following up by telephone, but no one I spoke with was able to shed any light on the issue. So, if anyone from Pfizer should read this, please get in touch; I’d still like to know the answer.

By the way, has anyone noticed that “Pfizer” looks a spelling invented by a spammer?

Twenty-six twiddles suffice

Monday, June 4th, 2007

Among the 250 million Rubik’s cubes manufactured since 1980, how many lie abandoned in a scrambled state, having never regained their original configuration since being taken out of the box? Most of them, I would guess. Now comes word that those cubes might be restored to pristinity with a little less effort. The upper bound on the number of moves required to solve any state of Rubik’s cube has been lowered from 27 to 26. The result is reported by Daniel Kunkle and Gene Cooperman of Northeastern University in a preprint available here.

It’s well-known that Rubik’s cube isn’t really a toy or a puzzle but rather a group-theory machine in disguise. Each possible move, or twiddle—rotating a face by some multiple of 90 degrees—is a transformation that permutes the positions and orientations of the 26 “cubies.” From any position there are just 18 distinct nontrivial moves, but in various combinations they generate a total of 43,252,003,274,489,856,000 permutations. What is the maximum number of moves needed to transform any one state of the cube into any other? That’s the question Kunkle and Cooperman are addressing. In other words, they are looking for the longest shortest path.

In principle we could get the answer by exhaustive search. It would be done by constructing the “Cayley graph” for the Rubik group: a graph in which each possible configuration of the cube is represented by a node, and two nodes x and y are connected by an edge if some single move allows configuration x to be transformed into y. The most efficient solution for any state is the shortest path (i.e., sequence of edges) leading from that state to the node representing the solved state. The worst-case solution length for the entire puzzle is the maximum of the shortest solution paths for all the nodes. The Cayley graph approach was used by Cooperman and two colleagues to find the worst-case path for the 2 × 2 × 2 version of Rubik’s cube. The Cayley graph for this device has 88,179,840 nodes, and it turns out the longest paths within it have 11 steps. For the 3 × 3 × 3 cube, however, exhaustive search is just too exhausting.

To make the search feasible, Cubists have had to scale back their ambitions and accept an upper bound rather than a true maximum. First it was shown that the worst-case solution path cannot require more than 52 moves; then the bound was lowered to 30, and then 29; a year ago Silviu Radu got the number down to 27.

Kunkle and Cooperman’s strategy is to factor the problem into two pieces by segregating half-turn (180-degree) and quarter-turn (±90-degree) moves. The subgroup of states reachable by half-turn moves is quite small by Rubik group standards, with just 663,552 elements, and the number is further reduced to only 15,752 after various symmetries are eliminated. Calculating the best solution for each of these states took only about a day on an ordinary PC. The longest such paths have 13 steps.

The second phase of the analysis tackles the “cosets” of the half-turn subgroup. For each of the 663,552 positions reachable by making half-turns only, there are 65,182,537,728,000 additional positions reachable by making quarter-turn moves. Kunkle and Cooperman had to search for the longest shortest path connecting any two elements of this set. They exploited various symmetries to reduce the scope of the problem and devised fast algorithms for some of the basic steps in the calculation. Even so, they wound up running their computer program for 63 hours on 128 processors at the San Diego Supercomputer Center. At one point the data store for intermediate results ballooned to seven terabytes. At the end of the run, they found that the longest shortest path consisted of 16 moves.

Combining the two phases of this solution yields 13 moves for the half-turn part of the path and 16 moves for the coset part, for a total of 29 moves. This is not an improvement over the best existing result of 27. However, Kunkle and Cooperman found that the two-stage algorithm reports path lengths greater than 26 only for a comparative handful of states—about 14,000. By a brute-force search they were able to find solutions of length 26 or less for each of these states, and thereby established the 26-move bound overall. It’s possible that further brute-force searching will lower the limit further to 25 moves.

Other experiments done by Herbert Kociemba, Richard Korf and others have shown that most randomly selected cube states can be solved in 18 moves or less. Korf has conjectured that all configurations are solvable in about 20 steps. We’ll see whether this is true—but probably not soon.

Amazon poker

Thursday, May 10th, 2007

Investors are constantly checking the stock ticker, gamblers check the point spread, and everybody is forever checking their e-mail. For a writerly type like me, however, the unshakeable obsession is checking my Amazon sales rank. Amazon.com calculates a sales rank for every book listed on its Web site, and updates the ranking hourly. Here’s a graph of the hourly fluctuations in the ranking of my book Infrastructure: A Field Guide to the Industrial Landscape over the course of a single day last week:

Rankforest graph of Infrastructure sales rank 2007-05-03

At any given moment the current rank is listed in the “Product Details” section of the Amazon page for both the hardcover and the paperback editions. The graph above comes from a service called Rankforest, which also tracks both the hardcover and the paperback versions.

From an author’s point of view, there’s a lot of mystery in these numbers. How are the hourly rankings calculated, and what (if anything) do they mean? How do the rankings correlate with actual sales of the book? Amazon is not telling, and so authors and other interested parties have been left to speculate and experiment. The obvious experiment is to order a copy of the book and observe the effect of this purchase on the ranking. I welcome such experimentation with my book. (I would prefer that you do it with the hardcover edition, for which I get a more generous royalty.)

Morris Rosenthal of Foner Books seems to be the leading scryer of signs in this field. The interpretation of the rankings has also been discussed by Chris Anderson, the editor of Wired, in an article, a blog, and a book (current Amazon rank = 463, way above mine, dammit). It’s clear that the hourly ranking cannot simply reflect the number of copies sold within the past hour. After all, in any given hour the vast majority of books sell zero copies, and so there would be a gigantic tie for last place. The rankings also can’t be based in any simple way on total sales since publication, because the standings are much too volatile. The received wisdom is that each sale produces an uptick, followed by an exponential decay until the next sale.

I find all these speculations fascinating, but they are not what I want to write about today. My topic is even more trivial. I’ve been tracking my Amazon ranking for more than a year and a half now, ever since the hardcover edition was first listed in September 2005. (The paperback came out about a year later.) Here’s what the trend looks like:

Amazon rankings of Infrastructure since publication

The graph records the highest (i.e., numerically smallest) rank noted on each day, with rare gaps when I happened to be offline all day. Without question it would be fairer to use the daily average rather than the daily peak. But an author’s ego is a fragile thing, and so I have gone out of my way to make the outlook as rosy as I could.

Here are some of the actual numbers I recorded, for the month of April 2007:

   date          HB      PB
2007-04-01     138263   47878
2007-04-02     192152   22728
2007-04-03      29146   41862
2007-04-04      73628   57155
2007-04-05      37172   16948
2007-04-06     127858   12779
2007-04-07     171363   25212
2007-04-08      55256   20770
2007-04-09
2007-04-10      36333    7121
2007-04-11     112423   19015
2007-04-12     183063   40015
2007-04-13     225457   46781
2007-04-14     239879   29259
2007-04-15     142252   16030
2007-04-16      24803   39200
2007-04-17      93485   18939
2007-04-18     173691   26434
2007-04-19     217440   19360
2007-04-20      44426   17276
2007-04-21     213765   26652
2007-04-22      39014   21699
2007-04-23     150012   18598
2007-04-24      61301   46268
2007-04-25      33800   22474
2007-04-26      10603   39335
2007-04-27      19746   15984
2007-04-28      14027   19324
2007-04-29      16527   25999
2007-04-30       5844   63439

Notice anything out of the ordinary? What I’m looking at is not the overall pattern of rising and falling magnitudes but rather the inner patterns of digits within the numbers. As I’ve been writing down these rankings over the past 20 months, I have had the persistent impression of a peculiar overabundance of repeated digits. Just in this small sample we have 47878, 22728, 192152, 57155, 25212, 36333, 44426, 33800, 39335, 25999, and lots more. Is there something going on here? Is Jeff Bezos broadcasting secret signals hidden in the digit patterns of the Amazon sales rankings?

I consider myself a reasonably sophisticated probabilist. I know I’m not supposed to be shocked when I bring together 23 people and find that two of them share a birthday. And I know that when people try to generate a random sequence of digits by plucking numbers out of their imagination, the result is almost always too homogeneous, with a deficiency of repetitions and other patterns of the kind I’m calling attention to. Thus I was prepared to believe that the patterns I perceived were conjured up out of nothing—phantom regularities in purely random data. Still, day after day, I would note numbers like 55256 and 20770 (which in fact appeared on the same day, one for the hardcover, one for the paperback), and wonder about the odds of such coincidences. Maybe I wasn’t just letting my imagination run away with me.

Finally, this past weekend, I could stand it no longer; I had to find out.

I’m going to have to give away the punchline right now, lest it come as a disappointment later: There is nothing unusual about those numbers. The distribution of digits—the number of pairs and triples and what-not—is well within the range expected for randomly generated numbers of the same size. In other words, I have confirmed the null hypothesis. Often, such a negative result is considered unpublishable, but this one cost me a fair amount of effort, and so I’m determined to get a blog item out of it for better or worse. If the conclusion itself is not very interesting, perhaps I can find something to say about the techniques and technologies that led to it.

How do you decide whether a number like 36333 is too unusual to be a product of random processes? I decided to analyze the numbers in my sample as if they were poker hands (with the rules of poker adapted to a deck of cards with ten ranks and no suits). I confined my attention to the five-digit numbers, which are the most numerous in my sample; of the 850 rankings I had recorded, 458 were in the range between 10,000 and 99,999. Then I wrote a five-line program that takes any such number and classifies it as one of seven types of poker hands:

  • five of a kind (e.g., 77777)
  • four of a kind (36333, 11119)
  • full house (44555, 28288)
  • three of a kind (57155, 20900)
  • two pairs (97097, 28002)
  • one pair (36739, 14912)
  • bust (53208, 16897)

I decided to ignore straights; the poker hand 56789 is simply a “bust” according to this scheme of classification. The concept of a flush—all cards of the same suit—doesn’t arise, since there are no suits.

When I ran my little program over the 458 five-digit Amazon sales rankings, here’s what I got:

   hand            number       frequency

five of a kind        0          0.0000
four of a kind        3          0.0066
full house            7          0.0153
three of a kind      34          0.0742
two pairs            51          0.1114
one pair            233          0.5087
bust                130          0.2838

   TOTAL            458          1.0000

The question now, of course, is what I should expect to see in such a data set, on the hypothesis that the digit patterns are random rather than contrived for some secret, nefarious purpose. Should I find it remarkable that half of the hands have a single pair, or that more than 70 percent have a pair or better? What about those seven full-house hands—should that abundance arouse suspicion? To begin answering questions like these, we need to calculate the probabilities of the various hands.

Having just boasted of my sophistication as a probabilist, I must now confess that the main thing I’ve learned about calculating probabilities is that getting the right answer is highly improbable. I understand the principle of the thing. It’s just a matter of counting. You count the “success” cases and divide by the total number of cases. But counting—even though it tends to come early in the mathematical curriculum—is not always easy.

In the Amazon poker problem, the first trap for the unwary is in counting the total number of cases. You might think that with five decimal digits, there would be 105 possible arrangements, but in fact 104 of those arrangements are excluded from the sample, because numbers in this context cannot have a leading digit of zero. Thus the denominator in the probability calculation will be 90,000 rather than 100,000.

I think I can trust myself to calculate the probability of a five-of-a-kind hand. The first card dealt must not be a zero but can be any of the other nine digits; thereafter, each subsequent digit must be identical to the first one. Thus the number of ways of forming a five-of-a-kind hand is 9×1×1×1×1, and the probability of this outcome is 9/90,000, or 0.0001. You can expect five of a kind in one hand out of every 10,000, if the deal is fair.

I believe this answer is correct, and I am proud of having obtained it; on the other hand, the five-of-a-kind calculation is by far the easiest case. Allow me to try to work out a harder problem: the odds of a full house. I invite you to listen in on my so-called thought process:

Well, a full house is a hand that matches the pattern aaabb. The first digit can be anything but zero, and so there are nine possibilities, but then the second and third digits have to match the first. The fourth digit must differ from the first three, and so there are eight candidates left…. No, wait…. This time zero is allowed, and so there are nine possibilities again. Then the fifth digit has to be identical to the fourth. That gives us the product 9×1×1×9×1, or 81 successes out of 90,000 total cases, for a probability of 0.0009…. Did I get that right?… Of course not. What I’ve calculated is the probability of seeing the pattern aaabb in that precise sequence; I am counting occurrences of 11100, 11122, 11133, …, 99988, but I am not including sequences such as 23233 or 45454. Okay. So one approach, now that I have the number of aaabb sequences, is to multiply by the number of ways of permuting that sequence. We can just take the aaa and intercalate the two b’s in all possible positions: bbaaa, babaa, baaba,…. No, wait…. The b could be a zero, and so it’s not always allowed to appear in the first position. Hmmm. This is not going well. For the time being, let’s forget about the prohibition of leading zeros, and we’ll correct for it later. That way we can consider all possible permutations of aaabb. As a first step, take aaa, where a can be any of the ten decimal digits, and place a b is all possible positions, allowing b to assume any value that differs from a, so that there are nine choices. There are four places to put the b: baaa, abaa, aaba and aaab. Now, for each of these four sequences, the second b can be placed in any of five positions, and so there are 4×5 = 20 permutations overall. Which means that the total number of full houses is…. Hold on. No, no, no, no, no. Not all those 20 permutations are distinguishable. If we start with baaa and insert a b in either the first or the second slot, we wind up with bbaaa in either case; this sequence should not be counted twice. So how many permutations are there, really? Offhand, I don’t see any way to count them that’s easier than direct enumeration: bbaaa, babaa, baaba, baaab, abbaa, ababa, abaab, aabba, aabab, aaabb. That’s ten cases. Let’s sum up. We know that a can take any of ten values and b has nine possible values, and there are ten ways of arranging three a’s and two b’s. Thus the total number of combinations is 10×9×10 = 900. But, don’t forget, now we have to subtract away all those sequences that start with a zero. For this purpose it doesn’t matter whether the first symbol is an a or a b; exactly 10 percent of the sequences will have a leading zero. Thus the number of full houses is 900–90=810. This gives a probability of 810/90,000 = 0.009.

I happen to know, from an independent calculation, that this answer is correct, and so it’s time to stop. If I didn’t know, however, I might well go on to ask whether we need to interchange the a’s and b’s—that is, consider the case of sequences aaabb where a has only nine allowed values and b has ten. (Why don’t we have to take that into account?)

I am mildly embarrassed to put this lurching, stumbling, caricature of a probability calculation on public exhibition, and yet it is a fair description of how I often struggle with a problem of this kind. Am I the only one who suffers so? Not everyone does. I know people who could carry out the same computation quite deftly; they command the intuition, the spürkraft, to zero in immediately on the right approach, like a chess player who doesn’t waste time considering fruitless moves. I admire that kind of finesse, but I don’t possess it.

On the other hand, I know something else. I know another way to solve the problem, with less fuss. As I mentioned above, I have already cooked up a little program that can take any five-digit Amazon rank and classify it into one of the seven categories of poker hands. In milliseconds I can run that program on all possible five-digit numbers; after all, there are only 90,000 of them, and it’s quite easy to generate all of them in sequence. Then I can just count the number of hands in each category, and all the probabilities come tumbling out in a neatly formatted table:

   hand            number       frequency

five of a kind        9          0.0001
four of a kind      405          0.0045
full house          810          0.0090
three of a kind    6480          0.0720
two pairs          9720          0.1080
one pair          45360          0.5040
bust              27216          0.3024

   TOTAL          90000          1.0000

If Fermat or Pascal or the Bernoullis were asked their opinion on this approach to probability, something tells me they would find it distasteful. I’m ambivalent myself. Returning to the chess analogy, this is the equivalent of the machine that beats a grandmaster by brute force, scanning millions of positions but knowing nothing of strategy. You can win that way, but you don’t get any style points. More important, the method is frustratingly opaque. I get the answers, and I have reasonable confidence that they’re correct, but the computation gives me no understanding of why they’re correct. Still another objection is that the method does not scale well. If my Amazon rankings had ten digits instead of five (perish the thought!), I’d have a hard time classifying the nine billion possible hands. (But then again I’m not sure the more analytic method scales all that well either.)

Setting aside these qualms, we can now take the predictions of theory and the observations of the Amazon sample and compare them side by side:

   hand            prediction       observation

five of a kind        0.0001          0.0000
four of a kind        0.0045          0.0066
full house            0.0090          0.0153
three of a kind       0.0720          0.0742
two pairs             0.1080          0.1114
one pair              0.5040          0.5087
bust                  0.3024          0.2838

   TOTAL              1.0000          1.0000

Some of these frequencies match quite closely (three of a kind, one pair); others are a bit off the mark (full house, bust). In a finite sample, of course, you would never expect an exact match to the theoretical frequencies—but are the discrepancies we’re seeing significant or not? My personal instinct says that there’s nothing amiss here, that the observed frequencies are consistent with the null hypothesis. In other words, the rankings could just as well be random numbers. The old rule of thumb that the variation should be less than the square root of the observation leads to the same conclusion. We could quantify these intuitions by calculating variances or standard deviations, doing a Χ2 test, and so on. But if I can barely calculate a simple probability, can I be trusted to navigate all those treacherous subtleties such as choosing the correct number of degrees of freedom?

Again there’s another way to go about it, relying on lots of ignorant computation to replace a little smart mathematics. The question we want to answer is this: Given a set of 458 randomly generated five-digit numbers, what is the probability that the random set will differ from the predicted frequencies by at least as much as the observed Amazon set? Suppose we measure distance from the theoretical prediction in terms of the sum of the squared differences:

S^2=\\sum_{i=1}^7(y_i-\\bar{y}_i)^2

where the index i ranges over the seven types of poker hand, and the expression in parentheses is the difference between the observed and predicted frequency for each type of hand. (Technical note: The seven numbers entering into this statistic are not independent. For example, if full-house hands are in surfeit, there has to be a compensating deficiency somewhere else. How much should I worry about this?) For the actual Amazon results, S2 works out to 4.267×10–4. Now we can generate lots of batches of random Amazon poker hands, each batch consisting of 458 five-digit numbers, and calculate S2 for each batch. What proportion of them will have an S2 value exceeding 4.267×10–4? The answer, based on half a million batches, is 80 percent. Thus all the anomalies I thought I was seeing in those numbers are pure delusion.

Needless to say, I was rooting for another outcome. I would have enjoyed finding something spooky and inexplicable in the Amazon rankings. Instead, all I’ve proved is that my intuition about what random numbers look like is not to be trusted. This is a disappointment, but maybe I can salvage something from the ruins. Perhaps I can claim the discovery that the Amazon rankings are a fairly good source of random numbers.

Large-scale differences and movements in the rankings are surely not random. It’s not purely a matter of chance that my book’s current rank is 43,888 (what an interesting number!), while some preposterous tale about a pubescent wizard occupies the top of the list—even though that book hasn’t actually been published yet. (Do I sound bitter?) The rankings are nonrandom in another way as well: The first digits are not uniformly distributed but have a Benford or Zipf distribution, with an excess of ones and a shortage of nines. It’s interesting that even though the randomly generated numbers do not share this property—the first digits have a uniform distribution over the range 1 through 9—the poker-hand analysis shows that the frequency of pairs, triples, and other patterns is identical in the two data sets. Thus the digits to the right of the first digit do seem to have the statistical properties of random numbers. The source of the randomness is presumably the hourly reshuffling of the rankings by the actions of thousands of Amazon shoppers—actions that are all too predictable in the aggregate (curse you, Harry Potter) but quite random in detail.

Perhaps it’s worth noting that when the RAND Corporation prepared their famous book A Million Random Digits with 100,000 Normal Deviates in 1955, they also employed the poker test as a measure of randomness. I’m relieved to find that their calculation of the theoretical frequency of the seven types of hands agrees with mine. (They don’t say how they performed the calculation.) The current Amazon sales rank for the book is 2,668,928 (hardcover) and 281,270 (paperback).

The stepchild

Tuesday, February 6th, 2007

A new jeremiad foretelling the doom of computer science as an academic discipline has just been published by the British Computer Society. It’s by Neil McBride, principal lecturer in the School of Computing, De Montfort University, Leicester.

Some of what McBride says strikes me as breathtakingly wrong-headed:

Now vastly complex applications for businesses, for science and for leisure can be developed using sophisticated high-level tools and components. Virtual robots—Zooks—can be created by eight-year olds without needing programming, logic or discrete mathematics skills. Web designers build complex business sites, graphic designers build animations, accountants assemble business systems without needing to go object-oriented.

Computer science has lost its mystique. There is no longer a need for a vast army of computer scientists. The applications, games and databases that students once built laboriously in final year projects are bought at bookshops and newsagents.

If I understand this argument correctly, McBride is saying that computing (and in particular software development) has become so easy that there’s no longer a need for any sort of formal curriculum. Any eight-year-old can do it, or even an accountant. I wish I could believe that, but in fact I’d be inclined to argue in exactly the opposite direction: Computer science is in trouble not because all the big problems have been solved but because the problems are so hard that no one has a clue how to make any progress. In computing theory, the P = NP? question has been out there for almost 40 years, with no sign of resolution. In engineering, large software projects routinely run over budget and behind schedule. At a more personal level, maybe eight-year-olds in Leicester can conjure up virtual robots at will, but the rest of us struggle with computing environments that still seem overly complex, fragile, unreliable, inconsistent, and just plain buggy. I see no shortage of challenges for computer science.

Still, the underlying question is not to be shrugged off: Do computer science departments have a future in the university? Enrollments are down, and chairs are fretting about funding and the head count. Students, meanwhile, worry about finding a job. Those who began their education in the era of IPO frenzy have graduated into a more-somber world. (But James H. Morris and Peter Lee resist blaming it all on the boom-and-bust cycle in Silicon Valley; they point out that undergraduate enrollment peaked in 1987, and thus it was already falling long before the vulture capitalists began picking at the bones of dot-com failures.)

A few years ago, a committee formed by the National Academy of Sciencies and chaired by Mary Shaw of Carnegie Mellon University issued a spirited defense of computer science as a free-standing, independent area of research, distinct from mathematics, engineering and all other disciplines. More recently, Bernard Chazelle of Princeton University has been carrying on the campaign in a series of essays, editorials, and panel discussions. The following Q&A comes from a press release issued in advance of a panel discussion at the AAAS meeting last year:

Isn’t computer science really just a stepchild of mathematics?

As the recent breakthroughs on Fermat’s Last Theorem indicate, the field of mathematics has never been more fertile with new ideas. Mathematics is original and deep, but it does not force you to think differently. If a math giant from the past—someone like Gauss—were to come back to Earth, he would have a lot of catching up to do but he would find that math is done much the same way that it was done during his life.

Computer science, by contrast, is a new way of thinking, a new way of looking at things. For example, mathematics can’t come near to describing the complexity of human endeavors in the way that computer science can. To make a literary analogy, mathematics produces the equivalent of one-liners—equations that are pithy, insightful, brilliant. Computer science is more like a novel by Tolstoy: it is messy and infuriatingly complex. But that is exactly what makes it unique and appealing—computer algorithms are infinitely more capable of capturing nuances of complex reality in a way that pure mathematics cannot.

“Stepchild”! That one word says it all. There’s a whole Tolstoyan novel in it, or maybe a Henry James, or at least a Grimm Brothers fairy tale. Computer science is the poor relation, the orphan, the foundling—taken in by a grand family but never taken seriously, never allowed to forget its doubtful origins. Or else computer science is the arriviste, the rich American cousin with too much cash and too little taste, who tears down the mossy old manse and puts up a spiffy new house designed by Frank Gehry. Or do we have a tale of sibling rivalry—the two children with different visions of the future for the family firm, neither of whom can ever feel secure in the love of a cold and distant parent?

I don’t know how this tear-jerker will end. Without sectarian violence, I hope. In a sense it’s none of my business, since I’m not a member of either department. But I do have to say that the idea of being asked to choose between mathematics and computer science is preposterous. If I were pleading this case, I would emphasize the connections between those fields, not the distinctions.

Jacobsthal numbers, part 3

Monday, December 11th, 2006

Our story so far: Having stumbled upon the Jacobsthal numbers, 1, 3, 5, 11, 21, 43, 85, 171, 341,…, I idly asked, “Who was Jacobsthal?” Keith Matthews promptly responded with a wealth of biographical information, even arranging to have an obituary translated from the Norwegian. So I asked, “Where did Jacobsthal mention these numbers?” and Barry Cipra quickly supplied a reference: “Fibonaccische Polynome und Kreisteilungsgleichungen” in Sitzungsberichte der Berliner Mathematischen Gesellschaft 17 (1919-1920), 43-57.

I dare not ask another question, lest someone else go running off to do more of my library errands for me. Embarrassing.

Unfortunately, though, when I look into the Jacobsthal paper, further questions are inescapable. Nowhere in the article, as far as I can tell, does Jacobsthal list any of the numbers in the sequence that now bears his name. Admittedly, I don’t actually read German; I just make it up as I go along. But, ungebildet as I am, I can at least recognize numerals, and 1, 3, 5, 11, etc. are not to be found. The closest approach is in this passage:

Jacobsthal excerpt

Here we see the recurrence relation f(n+1) = f(n) + xf(n–1), which produces the Jacobsthal sequence in the case when x = 2. But nowhere does Jacobsthal mention the specific case of x = 2, or any other specific example for that matter, except for noting that x = 1 corresponds to the “so-called” Fibonacci numbers.

So once again I’m left wondering: Exactly how did Jacobsthal’s name get attached to the numbers 1, 3, 5, 11, 21, 43…?

Incidentally, Google Language Tools offers a wonderful translation of Jacobsthal’s title: “Fibonacci polynomials and circling hurrying equations.”

Jacobsthal numbers

Thursday, November 23rd, 2006

In an item published here last May I stumbled across the sequence

1 3 5 11 21 43 85 171 341 683 1365 2731 5461 10923 21845 43691
87381 174763 349525 699051

which I dubbed “the oddest numbers” but which Neil J. A. Sloane’s Encyclopedia of Integer Sequences calls the Jacobsthal sequence. I asked: Who is or was Jacobsthal? Keith Matthews, creator and maintainer of the Number Theory Web, has stepped forward with an answer. Ernst Jacobsthal (1882-1965) was a German-born mathematician who fled to Norway when the Nazis came to power and eventually became a Norwegian citizen and professor at Trondheim. The story is told in an obituary that Matthews has made available on the Number Theory Web both in the original Norwegian and in an English translation by Jan Kristian Haugland. Matthews also mentions a brief biography in German. And the Mathematics Genealogy Project offers the further information that Jacobsthal was a student of Georg Frobenius and Issai Schur.

A bit more poking around reveals that Jacobsthal’s doctoral dissertation is available online, published by the Humboldt-Universität zu Berlin. In this work Jacobsthal gives a proof that every prime of the form 4n+1 can be written as a sum of two squares.

I’m very grateful to Matthews for answering my idle question about Jacobsthal—but there’s no end to questions. What I’d like to know now is when and where and in what context Jacobsthal discussed his sequence. The numbers are most readily defined by the recurrence Jn = Jn-1 + 2Jn-2, with J0 = J1 = 1. How did this idea come up in his work? Several journal articles and web sites mention the Jacobsthal numbers, but I’ve yet to find a reference to any specific publication.

R6RS

Friday, September 15th, 2006

Scheme, the dialect of Lisp invented by Guy Lewis Steele, Jr., and Gerald Jay Sussmann, is my first love among programming languages, though I haven’t always been faithful. There’s a new draft standard for Scheme now circulating at http://www.r6rs.org/. The proposal is open for comments until next March; on final approval it will become “The Revised6 Report on the Algorithmic Language Scheme.”

Scheme has always been a language under both tension and pressure. It started out (more than 30 years ago) as a sandbox for playing with some new ideas in programming-language semantics, and also (I think) as an attempt to “do Lisp right.” Soon it became an important teaching language, and it still is. But there’s always been pressure to make more of it. Shouldn’t a language for grownups have modules, libraries, foreign-function interfaces—not to mention classes, methods, inheritance and all the other apparatus of OO? That’s the pressure part. The tension has come from unceasing debates over issues such as the ideal syntax and semantics of macros, and the morality of creating a version of Lisp without an eval procedure.

I’ve only just begun to read the new draft, which is officially the Revised5.91 Report. But I think one can draw some preliminary conclusions just from its heft. Here are the page counts of the principal versions of the standard:

Date Pages
The Revised Report 1978 34
The Revised Revised Report 1985 76
The Revised3 Report 1986 42
The Revised4 Report 1991 55
The Revised5 Report 1998 76
The Revised5.91 Report 2006 142

An introductory essay that first appeared in the Revised3 Report begins:

Programming languages should be designed not by piling feature on top of feature, but by removing the weaknesses and restrictions that make additional features appear necessary.

The same text is still present in the new document. I’ll be interested to learn whether the same spirit has survived. It’s not to be taken for granted that just because the standard has doubled and then redoubled in size, the language has also grown obese. But it’s an obvious worry.

Only connect!

Thursday, September 14th, 2006

Euclid famously said, “There is no royal road to geometry.” Among the non-royal roads, the computational pathway is notably muddy, rutty and potholed. Over the weekend I needed to write a program for a simple geometric task—finding the intersection of two line segments in the plane. It’s Wednesday Thursday now, and I finally have my program. Let me tell you some of my adventures along the way.

First steps. This is a low-key, off-the-cuff, back-of-the-envelope project. The program doesn’t have to be blazingly fast; it isn’t going to be doing real-time animation in a video game. Nor is it going to trace the trajectories of aircraft in the holding pattern at JFK; no lives will be lost if I make a goof. Still, I would like to get correct answers, and not waste cpu cycles too extravagantly.

In Euclid’s way of doing geometry—with straightedge and compass—there’s not much to be said about algorithms for finding intersections. You construct the lines and see where they cross. Most computers, however, are not adept with straightedge and compass; the problem has to be encoded somehow in a more-algebraic language. Descartes knew how to do this. The cartesian equivalent of the Euclidean procedure goes something like this: Given two segments specified by the x and y coordinates of their end points, write the equation y = mx + b for each of the infinite lines that pass through these pairs of points. Then try to solve the two simultaneous equations to find a point (x,y) that lies on both lines; this point, if it exists, must be the intersection of the lines. Now go back to check whether the intersection lies within each of the segments.

Seems straightforward—but watch out for those potholes. For starters, in the equation y = mx + b the slope m is defined as Δy/ Δx. What if the segment is vertical? Then Δx = 0, and the slope is undefined. More insidiously, what if the two end points of a segment are actually the same point, so that the segment has zero length? Again the slope is undefined; now it’s 0/0. Then there’s the matter of parallel line segments. In Euclid’s world, parallel lines just don’t intersect; that’s pretty much how “parallel” is defined. But in dealing with segments, it does makes sense to ask about the intersection of two segments lying along the same line. In this case the intersection could be empty, or it could be a single point of overlap, or it could be a segment. The method of slopes and intercepts and simultaneous equations won’t work in this situation.

Down the garden path. I like a good puzzle, and I try not to cheat, but sometimes it’s just crazy to pretend you’re the first person on Earth who ever faced a problem. So I did some scouting around on the Net, as well as in filing cabinets and on bookshelves.

If you do a Web search for “segment intersection algorithm,” you’ll find lots of references to works by eminent computer scientists: Jon Louis Bentley, Bernard Chazelle, Herbert Edelsbrunner, and others. Following those leads, you’ll find some eminently clever algorithms, the result of a long series of inventions and refinements in a sort of algorithmic arms race that has gone on for more than 20 years. Unfortunately, these powerful tools solve the wrong problem—or at least they don’t solve my problem. They deal with the task of efficiently identifying all the intersections among a large set of line segments; the subtask of finding the specific point of intersection between two given segments is a minor detail generally left as an exercise for the reader.

Furthermore, most of the papers describing these algorithms take a fairly cavalier attitude to the various singularities and degeneracies mentioned above. I quote from Chazelle and Edelsbrunner 1992 (An optimal algorithm for intersecting line segments in the plane, Journal of the Association for Computing Machinery 39:1–54):

For the ease of exposition, we shall assume that no two endpoints have the same x or y coordinates. This, in particular, applies to the two endpoints of the same segment, and thus rules out the presence of vertical or horizontal segments…. Our rationale is that the key ideas of the algorithm are best explained without having to worry about special cases every step of the way. Relaxing the assumptions is very easy (no new ideas are required) but tedious. That’s for the theory. Implementing the algorithm so that the program works in all cases, however, is a daunting task. There are also numerical problems that alone would justify writing another paper. Following a venerable tradition, however, we shall try not to worry too much about it.

Thanks, guys.

Out on the Web I did find a few odd bits of code that address the specifics of the two-segments problem. One author finesses the inconvenience of vertically oriented segments by setting Δy/0 equal to 1010, with the comment, “close enough to infinity.” I wasn’t seriously tempted to follow this example. It’s not that I want a better approximation to infinity; I’d agree that 1010 is probably close enough. But this is one of those cases where infinity itself is not a very good approximation to the result of dividing by zero. Even if you believe that parallel lines meet at infinity, single isolated points don’t get there even in the limit. Also, despite venerable tradition, I do worry that a finite infinity may invite numerical trouble somewhere down the road.

The road not taken. What’s most annoying about the whole vertical-line mess is that it’s not a fundamental geometric constraint but just an artifact of how lines are represented. It’s only by convention that we measure slope with respect to the y axis; we could choose another orientation. Which suggests a way around the problem. If one of the segments turns out to be vertical, rotate the whole coordinate frame before attempting to test for intersections, and then rotate it back afterwards. The rotation is just a matrix multiplication, so it’s not a big computational burden, and you can do it only when needed. I seriously considered this strategy, and even now I wonder if it isn’t the right way to go about it. I finally decided against it because it doesn’t solve the problem of a segment that’s a single point: That kind of degenerate line has no well-defined slope in any coordinate frame.

Perhaps this fact is an argument for outlawing single-point segments altogether. I considered that too, but it seemed a bit pusillanimous, legislating a problem out of existence just because I found it inconvenient to solve. Mathematically, a one-point line segment may be pathological, but computationally it’s just an instance of aliasing—a very common occurrence, where two things turn out to be the same thing. Besides, it would be handy to have an intersection routine that can also test whether a given point is on a line.

On the road again. Lacking a brilliant Gordian-knot insight that would solve the problem with a single stroke, I had no choice but to push on into the tangled maze of if-then-else case analysis. Is either segment vertical? Is either segment a single point? Are the segments parallel (i.e., identical slopes)? If they are parallel, are they also collinear (identical y intercepts)? But be careful: If they’re parallel, they could both be vertical, with undefined y intercepts. Each of these cases could require a separate calculation of the intersection point. I’m not even sure how many cases there are—at least a dozen, but it depends on how you count.

The procedure I eventually wrote (after exploring several more blind alleys) is not quite as ugly as I feared it would be, but I still have the firm sense that there’s a better way. If not a royal road, then at least a trail that doesn’t require four-wheel drive. I was able to get the number of cases down to five, though only by exploiting a little Lisp hocus-pocus. For convenience of reference in what follows I label the segments red and blue.

  • Case 1. Both slopes exist, and they are not equal. This is the general case of two lines that are not vertical and not parallel. We can solve the simultaneous equations.
  • Case 2. The red slope is well-defined but the blue slope does not exist. The blue segment could be either a vertical segment or a single point. In either case, if the intersection exists, its x coordinate must be that of the blue segment.
  • Case 3. The blue slope is well-defined but the red slope does not exist. Ibid, mutatis mutandis.
  • Case 4. Both segments have identical slopes and also identical y intercepts; hence they are parallel and collinear. This is where the hocus-pocus comes in: The Lisp predicate
           (equalp (slope red-segment)
                   (slope blue-segment))
    

    returns true if both slopes are numeric and are equal, and it also yields true if both slopes are nil, the Lisp idiom for things that don’t exist. By this trickery, with a similar test on the y intercepts, I pack several cases into a single cond clause: two collinear vertical segments, two collinear non-vertical segments, and various combinations of vertical segments and single points.

  • Case 5. None of the above. The only possibility remaining—unless I am mistaken—is equal slopes and different y intercepts: The lines are parallel but not collinear, and so there is no intersection.

All that analysis merely determines whether or not the lines intersect; we still have to check whether or not the intersection lies within the segments. That involves still more case analysis, since the process is a little different for parallel segments than for others. I found a way of doing it that’s not grotesquely ugly.

The road back. Belatedly, after I had a working procedure, I did some more scouting in the literature and discovered a few paths worth exploring.

Robert Sedgewick’s Algorithms textbook suggests a quite different and ingenious method of detecting intersections of segments. Suppose you walk along the red segment, and when you come to the end, you have to turn counterclockwise to reach one end point of the blue segment, and clockwise to reach the other blue end point. Now try the same experiment walking on the blue segment; if again you must turn counterclockwise in one case and clockwise in the other to see the red end points, then the two segments intersect. Conversely, if in either case both end points are reached by turning in the same direction, there can be no intersection. (To be comprehensive, we also have to consider the case where an end point is straight ahead or directly behind.) Nifty! Unfortunately, the procedure only detects the existence of an intersection; I see no easy way to make it yield the coordinates.

Joseph O’Rourke’s Computational Geometry text (I looked at the C version) also has a discussion of intersection-finding. O’Rourke suggests working with parametric equations for the two segments—defining x and y as functions of distance along the segment. This avoids the problem with undefined slopes but encounters another singularity with parallel lines. Overall, the computation does not appear to be notably simpler.

Stuck in the mud. The animus behind this entire rant is a feeling that I must be missing something obvious, that a problem like finding the intersection of two line segments shouldn’t be this hard. The difficulty I’m talking about is not computational but conceptual. There are lots of hard computational problems—graph coloring, say, or factoring integers—for which one can write a very tidy and perspicuous program. True, the program may have a running time that exceeds your lifespan, but it’s easy to describe what needs to be done. Programs for geometric problems, in contrast, seem often to be efficient but hideous, with tangled logic, an abundance of special cases, and hidden numerical perils. Why is that? Nature seems to have no trouble at all detecting intersections or collisions. If two wires cross on a circuit board, you can count on blowing a fuse no matter what the slopes of the conductors. Why can’t we compute the same result so effortlessly?

Or maybe I have indeed missed something obvious—or even something subtle. If anyone has a better intersection algorithm, please pass it on.

Update 2006-09-15. My thanks to all of the readers who so promptly weighed in with good ideas. Over the weekend I’m going to try turning a couple of those suggestions into working code, and I’ll report back.

Scheduled procrastination

Tuesday, June 27th, 2006

Forgive me if this story is slightly stale. I meant to write it two weeks ago, but for some reason I kept putting it off.

A paper titled “Scheduling Algorithms for Procrastinators” has been posted on the computer-science section of the arXiv. The authors are Michael A. Bender (Stony Brook University), Raphaël Clifford (University of Bristol) and Kostas Tsichlas (University of Patras). They confess that they got a late start on writing the article, and they finished at the last minute.

Their theme brings to mind another well-known paper: “The Effects of Moore’s Law and Slacking on Large Computations,” by Chris Gottbrath, Jeremy Bailin, Casey Meakin, Todd Thompson and J. J. Charfman (all of the University of Arizona). Gottbrath et al. considered the options that you face when you are about to begin a long-running computational project. Suppose the job would take three years of continuous work on the best computer available today. The obvious, no-nonsense approach to the problem is to get to work immediately with a state-of-the-art machine, and finish three years hence. But Moore’s Law—the widely noted observation that computer performance doubles every 18 months—suggests an alternative. You could go to the beach for a year and a half; then, returning with an enviable tan, you could buy a machine twice as fast as anything on the market now, so that you would still finish after a total of three years.

Bender, Clifford and Tsichlas consider a slightly different and more-general class of tasks—not necessarily computations—whose shared trait is that the rate of progress toward a conclusion is nonconstant; instead, the pace of work accelerates as a deadline approaches. This is a familiar phenomenon, at least for the dawdlers among us; personally, I find that my productivity gets a big boost on the night before a major project is due, especially if the penalty for missing the deadline is being flunked or fired. Bender and his colleagues focus on tasks with a simple linear speed law, where the rate of work increases steadily as the deadline nears. In particular, they assume that the initial speed is zero and the acceleration is constant until the job is done.

Given a set of such tasks, each with its own start time, deadline and linear speed function, can we find an optimum schedule? The answer is a curiously qualified maybe. If there is a feasible schedule (one where all tasks are completed on time), then there’s an algorithm for finding an optimal schedule (one where the total processing time is minimal). The trouble is, deciding whether or not a feasible schedule exists looks to be a hard problem. No known algorithm can settle the question in polynomial time, and it hasn’t even been established that the problem is in the class NP (nondeterministic polynomial time). For a problem in NP, you may or not be able to find a solution efficiently, but if a candidate solution is dropped in your lap, you can quickly test its correctness; even this limited ability is not guaranteed for the procrastinating scheduler (or the scheduling procrastinator). The root of the difficulty comes from an unexpected quarter. It’s a matter of numerical computation: Scheduled procrastination is hard because evaluating sums of square roots is hard. (Quick: Which is larger, √7 + √26 or √10 + √21?)

There’s more on the sums-of-square-roots problem at The Open Problems Project. For more on how the square-root problem relates to procrastination, see the paper by Bender, Clifford and Tsichlas.

As for me, I’ve got to run; I have a pressing deadline.