Archive for October, 2007

Bumped off

Tuesday, October 23rd, 2007

Two of the best science-blog articles of 2007 appeared early in the year at Cosmic Variance. They were written by John Conway (not the John Conway, the other John Conway) and gave an inside view of what happens in a large experimental collaboration when there’s a hint of glory in the air. The CDF experiment—one of the two large detector groups at the Fermilab Tevatron—had found a bump, and it could be a sign of the fabled Higgs boson, the one Leon Lederman called the God particle. As Conway put it: “Holy crap!”

The bump consisted of a few data points that stood two standard deviations above the expected background level.

janplot.png

In medicine, two sigma is plenty of significance to decide whether a new drug will cure you or kill you, but physicists are more discriminating. Conway explained that they really needed five sigma to claim a discovery, although a lesser level might qualify as “evidence” or “indication.” Two sigma could too easily be dismissed as a statistical fluke.

Today Conway has posted the long-awaited followup. With a larger data sample, those dots and whiskers are back in their bins:

octplot.png

Conway is totally forthright about this outcome:

Gone! The data points all fell within an error bar or so of every bin…no excess, no bump, no Higgs… I am sure you are thinking “no tickets to Stockholm” too.

He then goes on to engage in a little preemptive spin control, in a paragraph aimed straight at the likes of me:

Now, gentle readers, one thing we’ve learned is that among you are many science journalists who use blogs as a means to catch wind of breaking news. You all have to have an angle, and a story to tell (sell). In the past, our field has been treated to stories of the ilk “300 Physicists Fail to Find Supersymmetry” with the subtitle “Study Illustrates the Risks of Big Science”. (New York Times, 1993). I sincerely hope that’s not your angle here, science writers! Our favorite put-down of that is to ask whether there should have been an 1888 story titled “Physicists Fail to Find Ether in Vacuum” about the Michelson and Morley null result. But okay, we’re nerds.

If I weren’t such a thick-skinned cynical journalist, I might chafe a little at this scolding-in-advance. I might ask, for example, what headline should have appeared on a contemporaneous story about the Michelson-Morley experiment. Is Conway suggesting that the Higgs boson is in the same conceptual category as the electromagnetic ether?

But I want to set aside my own defensiveness, and also try to disarm Conway’s. Even if we loutish journalists refrain from public cackling at this delicate moment, it seems all too likely that the next time Conway spots a two-sigma bump, he’s going to keep his mouth shut and his head down. We’ll all be the losers if that happens. What was so refreshing about this affair was that we could all follow the story as it unfolded, rather than waiting for the carefully vetted post-hoc redaction. (Given the eventual negative result, if the bump had been kept under wraps at the outset, we might never have heard of it at all.)

Conway concludes on a rousing note:

I challenge you journalists out there to tell it like it is: this is a great human adventure, with all the twists and turns any good adventure has.

Yes! I second that emotion. But we journalists can report the twists and turns only if we get to follow the twists and turns along with you, including the wrong turns.

The family tree

Monday, October 22nd, 2007

When is a tree (large, woody plant) not a tree (connected acyclic graph)?

crepe myrtle with anastomosing stems

This has something or other to do with the topic of the previous post.

(The tree (?) is a crepe myrtle near the campus of North Carolina State University in Raleigh.)

How many of your ancestors are you related to?

Sunday, October 21st, 2007

David Aldous asked me that question over lunch one day. I didn’t have an answer, so he explained: In the simplest model of human genetics, you get half your genes from each parent, a fourth from each grandparent, and so on. Thus the fraction of your genes contributed by each member of the nth generation is 1/2n. But there must be some value of n for which 2n exceeds the total number of genes in the human genome. Suppose you have 50,000 genes. Well, 16 generations ago you had 216 = 65,536 ancestors, so roughly 15,000 thousand of those family members were left out of the lottery. They’re your ancestors, but you inherited no genes from them.

There’s also a value of n for which 2n is greater than the entire human population, so if you look back far enough, you have more ancestors than there were people on the planet. This has got to be a sign of something awry in the model; these calculations are not to be taken as a quantitatively accurate guide to the human family tree. Nevertheless, the idea of counting genes and counting ancestors is basically sound.

These issues have come to the fore lately with news coverage of the discovery that Barack Obama is a distant cousin of Dick Cheney. According to Lynne Cheney, both are descended from Mareen Duvall, a 17th-century Hugenot. In today’s New York Times, Nicholas Wade comments on the significance (or otherwise) of this genealogical connection:

Mr. Obama probably inherited a minute fraction — one divided by two to the 11th power — of Mareen Duvall’s genome, which would amount to less than one gene, assuming the Y chromosome was not inherited.

Alas, though the concept is right, the numbers don’t quite add up. Two to the 11th power is only 2,048, and we surely have more genes than that. Under simple assumptions of random assortment, the expected number of genes passed down to the eleventh generation would be ten or so.

Wade correctly notes that the candidate and the vice president are very unlikely to have inherited any of the same genes from their common ancestor. Not that I would change my vote just because they had a few snippets of DNA in common.

Hung over

Sunday, October 21st, 2007

The drawing below, brazenly swiped from a 1964 Martin Gardner column, illustrates the solution to a well-known puzzle. If you stack n bricks on a table, how far can you make them extend over the edge without toppling?

Martin Gardner, Mathematical Games, Scientific American, 211(5):126-133

The answer given, for bricks of unit length, is one-half the nth harmonic number, the sum of the series 1/2 + 1/4 + 1/6 + 1/8 + … + 1/2n. Since this series diverges, the overhang can be as large as you please, given enough bricks (and a strong enough table).

In describing this result, Gardner uses the word “unbelievable,” and when I first read about it, that was my reaction too. It is one of those facts that are inescapably true, and yet still implausible. Now I learn that the truth is even stranger—that far larger overhangs can be achieved.

The overhang for the harmonic stack is approximately equal to 1/2 ln n; it turns out other stacks achieve an overhang on the order of the cube root of n. This is an enormous improvement. For a million bricks, the harmonic stack can’t get beyond 13.8 bricklengths, whereas the cube-root stack extends 100 bricklengths beyond the edge.

How is it done? I’m not telling. If you can’t figure out the trick, you’ll have to consult the links below.

The key idea in the alternative solutions has apparently been known for at least 20 years. (Actually, I’d be surprised if it wasn’t known to stonemasons centuries ago, as well as generations of children playing with wood blocks—but I don’t have a reference to support that conjecture.) Quite new, however, is the mathematical analysis showing exactly how large the overhang can be. Last year Mike Paterson of the University of Warwick and Uri Zwick of Tel Aviv University presented a paper at SODA (the Symposium on Discrete Algorithms) giving optimal solutions for n up to 30, and showing that the asymptotic rate of growth is n1/3. Now Peterson and Zwick, along with Yuval Peres (Berkeley), Mikkel Thorup (AT&T Labs) and Peter Winkler (Dartmouth) have proved that order n1/3 is the best that can achieved:

More specifically, we show that any n-block stack with an overhang of at least 6n1/3 is unbalanced, and must hence collapse. Thus we conclude that the maximum overhang with n blocks is of order n1/3.

Links: The 2006 Paterson-Zwick paper is here in the SODA proceedings (subscription required) and here in the arXiv. (It is also to appear in the American Mathematical Monthly.) The five-author proof is at the arXiv here.

Multicore madness

Thursday, October 11th, 2007

The new issue of American Scientist is now available both on the web and on paper. The subtitle of my “Computing Science” column makes the following rash assertion:

Multicore chips could bring about the biggest change in computing since the microprocessor

It’s always wise to be a little skeptical of such superlative claims. In Redmond, Washington, last week, I asked a foreman at a construction site what his big hole in the ground was going to be. He told me that a local software company was building the world’s largest underground parking garage. I have no particular reason to doubt that statement, but I would have been more firmly convinced if he had been able to answer my followup question about where I could find the second-largest underground parking garage.

The assertion in my subtitle doesn’t stand up particularly well to this kind of scrutiny. I really don’t want to be asked about the second-biggest change in computing in the past 30 years. But I hope my readers will forgive or overlook my moment of impetuous hyperbole and accept the broader point that the shift to dual-core and quad-core (and eventually many-core) processors really is going to make a difference in how computers work. We’ve seen decades of research on parallel processing and concurrency, but through it all the mainstream of computing has remained steadfastly single-minded. It looks like that’s finally going to change.

Conquering divide

Tuesday, October 9th, 2007

This morning I am enjoying the benefits of jet lead. My watch says it’s 7:30, but the hotel-room clock reads 4:30, so I have a few hours free to lie awake and solve the world’s problems. As a warmup exercise I’m doing mental arithmetic. I’m dividing integers, and trying to figure out how I do it.

I know what got me started on this particular line of predawn predaydreaming. On the flight out here I was reading a paper on division by Eric Allender of Rutgers. Then yesterday I was reminded again of division’s divisiveness when I ran into a friend, Robert L. Constable of Cornell (another refugee from the Eastern time zone, who’s probably awake now too). Constable said something I found provocative—which is a little odd, because he was repeating my own words to me, from an article I wrote a while ago. Paraphrasing Constable paraphrasing me: We should count ourselves lucky that multiplication is a polynomial-time operation.

In the article I went on to elaborate:

If you are given two prime numbers, it’s easy to multiply them, yielding a bigger number as the product. But trying to undo this process—to take the product and recover the two unknown factors—seems to be much more difficult. We have fast algorithms for multiplying but not for factoring. Why is that?

The contrast between multiplying and factoring is indeed fascinating, but maybe not quite as dramatic as that passage suggests. In the first place, we don’t really know that factoring is difficult. At the moment I don’t have an efficient, polynomial-time factoring program installed on my computer, but I can’t prove that you don’t have one. It’s not even true that we have no fast algorithms for factoring. We do know of one such algorithm; it’s just that you need a quantum computer to run it.

In any case, factoring isn’t how I’m passing my time this morning; I’m dividing. The two things are related, of course; either of them might claim to be the inverse of multiplication. (If you stop people on the street and ask, “What’s the opposite of multiplying?” few of them would reply “factoring.”) We can think of division as lazy factoring, or factoring with a head start. Dividing 40321 by 29 looks like an easier task than factoring 40321. But just how easy is it to divide? Specifically, what is the complexity class of division? Is it one of those tough exponential problems, or can we count on doing it in polynomial time, even in the worst case?

Here’s the most elementary division algorithm: Find x ÷ y by counting how many times you can subtract y from x. Stop counting when x is less than y; the count at that point becomes the quotient, and whatever is left of x is the remainder. In other words:

function divide(x,y)
   q = 0
   while x ≥ y
      x = x - y
      q++
   return (q,x)   //quotient, remainder

The procedure requires exactly q subtraction steps, and so its time complexity is proportional to x / y. This sounds pretty good—a linear-time algorithm. Unfortunately, though, that’s not the proper way to measure the algorithm’s running time. The correct complexity is not a function of x or y or q but a function of the number of bits or digits in these quantities. And that means the repeated-subtraction algorithm is exponential; for an n-bit number, the running time is O(2n).

Of course no one really divides by repeated subtraction, even at five o’clock in the morning.

Like most of my contemporaries, I was exposed at a tender age to the algorithm known as long division, or guzzinta. I understand that this technique has now been dropped from the curriculum in many schools, to the disgust of educators who believe it’s immoral for youth to be spared any trauma their parents endured. Far be it from me to shirk my patriotic duty to torture children, but I’m actually glad that long division is being retired from the canon of skills that every educated person is expected to master. Part of my reason is that I never learned the trick very well myself, and I’m hoping to avoid embarrassment. But there’s also the fact that this is the first nontrivial algorithm most of us encounter in life, and it’s a really weird one. It makes for a rough and ragged introduction to the charms of algorithmic reasoning.

Here’s part of a worked example of long division from the Ask Dr. Math section of the Math Forum at Drexel University:

Let's divide 175 by 3:
      ______
    3 ) 175

We take the dividend (175) one digit at a time starting
at the left. If we tried to divide 1 by 3, we would get zero,
so let's skip that and use the first two digits, 17:

      ___5_
    3 ) 17
        15
        --
         2

What I did was to look through the multiplication
table for 3 (in my mind), looking for the biggest multiple
of 3 that is less than 17: 3, 6, 9, 12, 15, 18 - since 18 is
too big, we use 3 x 5 = 15.

It’s the bit about looking through the multiplication table in your mind that I find unsettling. Note that what’s involved here is not just a matter of using the familiar 10-by-10 multiplication table “in reverse.” When you multiply decimal numbers, you take each digit of the multiplier and the multiplicand, find the corresponding column and row in the table, and, presto, there’s the partial product at the intersection. Reversing this process would mean using the divisor to select a row in the table, finding the dividend in that row, then reading off the index of the column to determine a digit of the quotient. But, as Dr. Math notes, the dividend is generally not present in the table; you have to search for “the biggest multiple less than….” No such searching (or guessing) step is needed with multiplication.

I have another gripe with the long-division algorithm: It mixes up operations on numbers with operations on digits—on the written representation of numbers. The various multiplication and subtraction steps that enter into the long-division procedure are well-defined operations in arithmetic, but when you come to that business about “bringing down the next digit,” you’re manipulating arbitrary symbols rather than acting on numerical values. How would you express the “bringing down” step in terms of +, – and × operations? If you think this is a frivolous quibble, try writing a computer program to implement the long-division algorithm. At some points in the program you’ll want numbers to be numbers—things you can add, subtract and multiply—but at other points it’s more convenient to represent numbers as lists of digits, which are easy to select and move around but not so easy to combine numerically.

* * *

Okay. So now it’s six o’clock local time, the horizon is brightening, and I seem to have persuaded myself that division of integers is so awkward as to be all but impossible. And yet…. I have just divided 40321 by 29, getting a quotient of 1390 and a remainder of 11. My algorithm—though I hesitate to glorify it with that term—seems to be a crude mishmash of long-division and Newton’s method. I make guesses, do some trial multiplication, and use the sign and magnitude of the error to suggest a better guess. It works, but it’s not pretty, and I wouldn’t want to be teaching it to impressionable youth. There’s a faint whiff of nondeterminism about it, as if I were guessing the answer rather than calculating it.

Why is division so hard?

At times I feel this is a dumb question, and that the universe doesn’t owe us an answer to such questions. Some things are just harder than others; that’s the way the world works. We can count ourselves lucky that multiplication has a polynomial-time algorithm; we have no right to expect that our luck will hold when we turn to division.

But then I get to thinking about physical systems that multiply and divide, where the two operations seem perfectly symmetrical. An example is an electrical circuit governed by Ohm’s law.

Image scarfed from Cornell University, Center for Nanoscale Systems, Institute for Physics Teachers

If we attach the terminals of a voltmeter to opposite ends of a resistor, we observe Ohm’s law in the form E = IR: Voltage is equal to the product of current and resistance. If we install an ammeter in series with the resistor, we see the effects of the law I = E/R: Current is equal to voltage divided by resistance. If you prefer pipes to wires, you can do the hydraulic version of the experiment, and there are also lots of mechanical schemes, where the same principle is illustrated with levers, gears, pulleys and such. In all of these systems, nature seems to divide just as easily as it multiplies. Why can’t we do the same with pencil and paper, or transistors, or neurons?

Much of theoretical computer science is given over to questions just like these. Given a certain model of computation, how many operations are needed to solve a problem? If we were given some extra resource (randomness, perhaps, or quantum superposition), would that make the job any easier? Can we reduce one problem to another and thereby solve both of them with the same method?

In the case of division, the answers are in. Specifically, they’re in the paper by Eric Allender that got me started on this strange morning’s meditation. The conclusions are simply stated. Forget about those whiffs of nondeterminism; division, along with the other three arithmetic operations, is a member in good standing of the familiar and friendly complexity class P, made up of deterministic polynomial-time problems. Indeed, all four operations belong to a subclass of P called L, which consists of just those problems that can be solved by a deterministic algorithm in polynomial time and with a logarithmic amount of memory (or, again, with memory proportional to the number of digits in the input). [Correction: As Ryan Williams and Jonathan Katz point out in the comments: a logspace solution has to fit in an amount of memory proportional to the logarithm of the number of digits.] Allender narrows down the complexity of division still further: It belongs to a class called DLOGTIME-uniform TC0. If you want to know just what that means, I must send you off to Allender’s paper (PDF).

Putting multiplication and division in the same complexity class doesn’t mean they’re exactly equal in difficulty. Complexity theory adopts the amusing practice of ignoring all constant factors and all terms other than the fastest-growing one, so that a running time of, say, 106n2 + 1000n is considered the same as n2. But in the case of multiplication and division this approximation is reasonably close to reality.

Most of the results on the complexity of division are quite recent. People have been dividing for millennia, but Allender’s paper is titled “The Division Breakthroughs,” and it describes results published within the past decade. For most of the history of Homo calculensis, we didn’t know how good we had it.

Should I be surprised to wake up in the morning and find out that division isn’t as hard as it seems? On the one hand, it’s common sense that division can’t be really, really hard, because we do it all the time, getting answers promptly and reliably. On the other hand, division algorithms are so ugly we have to hide them from the children. I don’t know what to think. My mind is divided.

Update 2007-10-11. In the comments, Jonathan Katz asks a very good question: How is division done on real computers? I wish I could give a detailed answer. Here’s what little I know.

Modern general-purpose processors come with built-in hardware for floating-point division. The most popular method seems to be an algorithm called SRT division, after D. Sweeney of IBM, J. E. Robertson of the University of Illinois and T. D. Tocher of Imperial College London, who independently came up with variations on the same theme in the 1950s. The underlying idea is not fundamentally different from long division, but there are a couple of tricks and tweaks to speed the process. Although the numbers are represented in binary, the calculation is usually done in radix 4, so that the quotient is calculated two bits at a time. Moreover, the quotient digits are not 0, 1, 2, 3 but –2, –1, 0, 1, 2; this choice of digits has the advantage that the multiplication steps in the algorithm can be implemented as simple shifts and negations. But with five possible quotient digits in radix 4, there is some ambiguity or redundancy: Some numbers can be represented in more than one way. This has to be resolved, and the answer converted to conventional notation, at the end of the operation.

What I find particularly intriguing is that the “whiff of nondeterminism” I thought I smelled in schoolbook division also turns up here. David L. Harris, Stuart F. Oberman and Mark A. Horowitz explain:

The key idea of SRT division is to guess the quotient digit based on a few of the most significant divisor and partial remainder bits, rather than computing it exactly. As long as the guess is close enough, the algorithm can correct on subsequent steps using redundant quotient digits.

Antinomies on the landscape

Monday, October 1st, 2007

False Creek in Vancouver, from near Granville Island (which is not an island)

From the Dorling Kindersley Eyewitness Travel Guide to the Pacific Northwest, page 218:

In spite of its name, False Creek is not a creek at all….