Archive for February, 2006

0.203188

Tuesday, February 28th, 2006

In a “Computing Science” column titled “Rumours and Errours,” not quite a year ago, a leading role went to the nondescript number 0.203188. That number emerged from a simulation of how rumors spread through a society; given certain assumptions, 0.203188 is the proportion of the population that never hears the rumor.

A few weeks ago Paul Krapivsky of Boston University wrote me to say that the number mentioned in the column looked familiar, and he directed me to two other papers in which it appears:

Chang, Joseph T. 1999. Recent common ancestors of all present-day individuals. With commentary. Advances in Applied Probability 31:1002–1038.

Derrida, Bernard, Susanna C. Manrubia and Damián H. Zanette 1999. Statistical properties of genealogical trees. Physical Review Letters 82:1987–1990.

Both of those articles take up questions of human ancestry and genealogy, a topic that might seem remote from rumor-mongering, but in fact the conceptual connection is close. The starting point for both discussions of genealogical trees is an extraordinary all-or-nothing observation: If you look back just a few dozen generations in human history, it’s a reasonable approximation to say that anyone who was alive then is either an ancestor of everyone alive now or else is an ancestor of no one living today—the lineage went extinct. And the probability of extinction for human families is the same curious number encountered in rumor studies; Chang gives the value as 0. 20319, whereas Derrida, Manrubia and Zanette supply a few more decimal places—0.20318787. (Derrida, Manrubia and Zanette have also presented this work in American Scientist, but in that article they just miss mentioning the number.)

Krapivsky adds that it’s not altogether suprising for a number like 0.203188 to turn up in diverse contexts, because it derives from a fairly simple mathematical process. The number can be defined as the fixed point of the equation

S=e^{2S-2}.

What’s a fixed point? Suppose you make an initial guess about the value of S, such as S=0.5. Plugging this estimate into the righthand side of the equation yields a new approximation of S equal to e–1, or 0.367879. Inserting the newly calculated value into the righthand side produces a third estimate, 0.282454. If you keep going in this way, you’ll converge on a fixed point: 0.20318787…. (Strictly speaking, the process never quite gets to the fixed point, but the “…” notation means that we don’t even know where the fixed point is, exactly, so this is understandable. Also note that the initial guess needs to be less than 1.)

For purposes of describing rumors and family trees, having five or six digits of this number is more than enough precision, but when you look at it as a mathematical constant, there’s no reason not to calculate some more decimal places. It’s not hard to do that with the iterative scheme described above. If you perform this operation with the kind of floating-point arithmetic common in many computers and programming languages, the most you can hope to get is about 16 decimal digits. The value I came up with in this way is 0.2031878699799799. What a suspicious-looking number! Could the pattern continue? Of course not; the 699799799 sequence here is no more meaningful than the 18281828 found among the early decimal digits of e. Here are some more digits, for your reading pleasure:

0.203187869979979953838479062062419879105498787590570 3175002477441519575075919060248836250361690779642914 691870155025100376806318211969115434998991230687792 8607856089683113536220973669731670519088210949

Since this number appears in the literature of rumor-propagation and in studies of theoretical genealogy, there’s every reason to expect that it might be found elsewhere as well. I tried typing some digits into Simon Plouffe’s Inverse Symbolic Calculator, now known as Plouffe’s Inverter, but it appears that fixed points of exponential equations are not among the 215,000,000 numbers in Plouffe’s database. The number can be found, as a sequence of decimal digits, in Neil J. A. Sloane’s marvelous Online Encyclopedia of Integer Sequences, but it was put there (by one Robert G. Wilson v) following the publication of “Rumours and Errours,” and so I learned nothing new by looking it up.

Where to turn? Google, of course. A Google search for the first five or six digits yields many hits but nothing of interest. The documents found in this way are chock-full of numbers—coordinates, dimensions, measurements, statistics, timings, coefficients—but they are all just numeric underbrush. Appending another digit or two to the search string thinned out the background and revealed a preprint by Derrida, Manrubia and Zanette, with more on the family tree of mankind. Finally, searching for 0.2031878699 took me to a 2004 dissertation in engineering by Corine Marchand, who was evidently then a doctoral candidate at the Institut National Polytechnique de Grenoble, in France. The subject is detecting failures in communication networks; if I understand correctly (after reading only a small part of the dissertation), 0.2031878699 is the probability, given certain parameter values, that a “heartbeat” signal fails to reach a node of the network.

An interesting number, is it not? But there are a lot more where that one came from.

Taxation without rationalization

Friday, February 24th, 2006

I am the child of a bookkeeper, and I’ve inherited the habit of double-checking receipts and balancing accounts. My friends make fun of me when I carefully note down the dime that I put in a parking meter, but lately I’ve been fretting over even smaller sums—charges that come to less than a penny. It’s all about sales tax.

For the benefit of those who live outside the U.S. (or in New Hampshire) I should explain how American sales tax works. The price marked on an item in a store excludes the tax, which is added at the cash register when you pay for your purchases. Where I live, in North Carolina, the tax rate for most merchandise is 7 percent. Thus if I buy a magazine with a cover price of $3.95, I’ll actually pay $3.95 + (0.07 × $3.95), or $4.2265. Fractions of a cent are rounded to the nearest penny, making the final price $4.23. The rounding is what I’ve been worrying about.

If rounding is done fairly, one might expect that the tax would be rounded up and rounded down with roughly equal frequency, and everything would come out even in the end. But for a long time I’ve been noticing that the sales tax on my purchases is almost always rounded up; very seldom, it seems, does my shopping basket produce a total for which the tax amount needs to be rounded down. Is this bias real, or do I just imagine that numbers are conspiring against me? As I was gathering up papers for a different kind of tax—the annual income-tax ritual—I decided to find out.

For any tax rate that’s an integer percentage, all possible rounding situations are covered by considering prices modulo $1. In other words, if the tax on $0.xy rounds in a certain way, then $1.xy will get exactly the same treatment, and likewise $2.xy, $3.xy, and so on. Thus for purposes of tax rounding, we can pretend that all prices lie between $0.00 and $0.99. In the sales-tax tables (PDF) issued by the state of North Carolina, the tax on 50 of these amounts is rounded up; in 49 cases it is rounded down; the one remaining case needs no rounding in either direction. This rounding protocol introduces a very slight upward bias, amounting to $0.00005 when averaged over all possible prices. Apart from this tiny asymmetry, the system looks like it ought to be fair if purchase prices are uniformly distributed among the 100 possibilities.

The problem, of course, is that the distribution of prices is far from uniform. I took a fat envelope full of sales receipts from 2005 and recorded the prices of all items subject to the 7 percent North Carolina tax. I was able to identify 471 items. Here is the distribution of their prices, modulo $1:

bar graph of item prices modulo $1

Red bars are prices for which the tax is rounded up, blue bars represent prices whose tax rounds down; the tax on $0.00 (black bar) is exact. The source of the rounding bias that I’ve suspected is immediately obvious: Almost two-thirds of the items I purchased over the course of the year had a price (modulo $1) of 99 cents, which happens to be an amount for which the tax rounds upward. I’ve always heard that the popularity of 99-cent pricing has a psychological premise: Shaving that penny off makes things seem cheaper and therefore encourages sales. Could there be another motive as well?

Before I start accusing shopkeepers of a nefarious plot to mulct American consumers, there’s a complicating factor to consider. Sales tax is not calculated separately on each item purchased; instead, when you dump your basketful of goods at the checkout counter, the prices are totalled, and the tax is calculated only once, on the sum. (In accounting lingo, the calculation is done “per invoice” not “per item.”) It turns out that my 471 taxable purchases were grouped into 195 invoices. (On a typical trip to the store I bought 2.4 items.) When I reanalyze my receipts on this basis, the 99-cent effect is somewhat softened:

bar chart of number of invoices as a function of invoice total modulo $1

Nevertheless, the invoice totals are still strongly clustered in the region between $0.93 and $0.99, where tax amounts are all rounded upward.

With these data in hand I can now calculate the total “roundage,” which I’m going to define as N(Texact â€“ Trounded), where N is the number of transactions at a given invoice amount. With this convention, the roundage is positive when it favors the merchant and negative when it favors me.

bar chart of total roundage

An interesting statistical question is whether this pattern offers any evidence of a deliberate policy of setting prices in order to maximize roundage for the merchant. Obviously the prevalence of prices just under a dollar has this effect, but since prices in that range have other possible justifications, they can’t support any firm conclusion. Are the sharp upward spikes at 79 cents and 50 cents a result of careful strategizing, or are they just random accidents? I have not tried to answer that question, and I don’t think I have enough data to do so. It should be noted that if you exclude the region of the graph between $0.93 and $0.99, the net roundage of the remaining transactions is slightly in my favor.

How much money are we talking about here? For calendar year 2005, based on the 195 transactions I was able to document, the skewed rounding of sales tax cost me 13.43 cents. This deficit is probably not the biggest hole in my pocket. On the other hand, if my case is typical, and if we extrapolate my loss to the entire U.S. population, that comes to $40 or $50 million slipping through the cracks.

Where is the insidious excess of roundage going to? That’s the question. If merchants are required to remit to the state the exact amount of tax actually collected, then the bias in roundage merely raises the effective tax rate a tad. No big deal. As far as I can tell, however, that’s not what happens—at least not here in North Carolina. The form on which sales tax is reported asks for total receipts (exclusive of tax collected) and then multiplies this amount by the appropriate percentage. In this way all of a merchant’s sales over an entire month or quarter are lumped together before any rounding is done, and any bias in the roundage of individual transactions will not be taken into account. The form does include a line for “excess collections,” and so a conscientious merchant could keep track of the roundage and report it there. But the instructions that accompany the form do not mention this possibility, and my supposition is that the excess roundage usually stays with the retailer.

We can fight back, though. Using my sample of 471 item prices, I ran a Monte Carlo simulation to estimate the average roundage as a function of the number of items purchased on each shopping trip:

roundage per year as a function of number of items per shopping trip

The model assumes that I buy exactly k items, selected at random (with replacement) from the set of 471 prices in the 2005 data set. The sales-tax roundage on this shopping basket is calculated, and then is multiplied by 471/k, to give the total expected roundage for the year. The graph shows averages of 106 repetitions of this process for each value of k from 1 through 25. The lesson is clear: If I bought 9 or 10 items every time I went to the store, I would come out slightly ahead in the roundage game.

However, I have a better solution to propose, one that could make the system more resistant to manipulation by either the seller or the buyer. Any integer tax rate brings the curse of commensurability: a pattern of roundage that repeats every dollar, rewarding strategies such as setting all prices to $x.99. Commerce might be more interesting with an irrational tax rate. For example, if the tax percentage were not 7 but rather the square root of 50 (equal to 7.0710678118654755…), then it would be a little harder to rig prices so that the tax consistently rounds upward. Besides, it would help with a problem discussed in my February 20th post: Pundits could no longer claim that algebra has no role in ordinary life. Every time you buy a cup of coffee, you’d have to solve the quadratic equation x2 â€“ 50 = 0.

Further notes and questions.

The Monte Carlo model discussed above chooses k items at random. If you could plan a year’s worth of shopping in advance, you could list all N items you intend to buy and search for the best possible partitioning of these goods into k-item batches. How hard a computational problem is that? The brute-force approach (trying all possible partitionings) is exponential in N, but some such problems turn out to be easier than they look. What if you remove the constraint that each batch must have exactly k items?

Playing the game from the point of view of the vendor, what pricing policy will maximize roundage gains, given a buyer who is determined (at all costs!) to minimize them? Under a fixed, integer tax rate, is there any set of prices that guarantees a win for the merchant, no matter how the shopper assembles purchases into market baskets? Setting all prices to $x.00 enforces a trivial tie. Can the seller do better than that?

Instead of an irrational tax rate, we might try giving the dollar a prime number of cents. The obvious choice is 101 (in which case we might call them not cents but centunos). How would this affect roundage calculations?

The state of Ohio has recently changed its sales-tax regulations, allowing merchants to calculate tax either on a per-invoice or a per-item basis. The per-item option foils all the market-basket strategies for manipulating roundage, and it gives the merchant the potential of earning up to an extra $0.005 per item sold. It will be interesting to see whether pricing patterns change in Ohio.

Life after algebra

Monday, February 20th, 2006

Three weeks ago, Duke Helfand of the Los Angeles Times wrote a thoughtful article on high school algebra. A one-semester course in algebra has recently become a requirement for graduation in the Los Angeles unified school district, and many students are having a hard time with it. The Times article tells the story of Gabriela Ocampo, who failed the course six times and finally dropped out of school when it became apparent she would have no more success on her seventh attempt.

Last Wednesday, Richard Cohen, an op-ed columnist for the Washington Post, was inspired by Ocampo’s plight to publish his own thoughts on mathematics education, from which I extract a few sentences:

Gabriela, this is Richard: There’s life after algebra….

I confess to be one of those people who hate math. I can do my basic arithmetic all right (although not percentages) but I flunked algebra (once), barely passed it the second time — the only proof I’ve ever seen of divine intervention — somehow passed geometry and resolved, with a grateful exhale of breath, that I would never go near math again….

Here’s the thing, Gabriela: You will never need to know algebra. I have never once used it and never once even rued that I could not use it…. Most of math can now be done by a computer or a calculator….

Look, Gabriela, I am not anti-algebra. It has its uses, I suppose, and I think it should be available for people who want to take it. Maybe students should even be compelled to take it, but it should not be a requirement for graduation.

Richard, this is Brian. Although I’d sooner swallow Drano than speak these words, I have to admit it: You’re right. There is life after algebra. Millions of our compatriots are living proof—contented and productive citizens who couldn’t solve a quadratic equation to save their lives. But, Richard, you don’t go far enough. Why stop at algebra? Elementary arithmetic is also highly overrated as a passport to riches and fulfillment. On this point I can offer personal testimony: I’m very shaky on my multiplication tables (in spite of the valiant efforts of Miss Cross, who drilled me relentlessly in the second grade), and yet I have managed to stay off the welfare rolls, and I even file my own income-tax returns. Although your claim that computers and calculators can do “most” math strikes me as a little ahead of its time, those machines sure are handy for +, –, × and ÷, not to mention those irksome percentages. Come to think of it, maybe we can cut off the mathematics curriculum even before the kids get as far as arithmetic. After all, unless your ambitions tend toward professional gambling or ballroom dancing, you can probably get through life today without even knowing how to count. Why demand that our children learn such frivolous skills?

“You will never need to know algebra,” you tell Gabriela Ocampo. But what do you mean by “need to know”? Apply a strict enough standard, and very little of what we teach in the schools would survive the test of need. Who really needs to know the names of the planets or what happened in 1066 or why Achilles sulked in his tent. Does anyone’s life depend on memorizing the prelude to “Evangeline” or knowing the price of the Louisiana Purchase? Reason not the need, Richard. Culture begins just beyond where need ends.

I too have a gripe with algebra in high school. My complaint is not that schools insist on teaching it, but that they teach too litttle else of mathematics. Even for the enthusiastic students, there’s nothing on offer but a year or two of algebra, some geometry and trigonometry, then finally calculus, which is viewed as the culmination and capstone of the whole enterprise, the end point toward which everyone have been striving from kindergarten on. There’s something musty and 18th century about this curriculum. It bears no resemblance to the much broader spectrum of interests among mathematicians today. Where are the courses on combinatorics and number theory (the “higher arithmetic”)? How about spending a semester on topology, a field with beautiful and mystifying ideas that might appeal to some kids who don’t go for the standard lineup. A course in probability and statistics might score well on the need-to-know scale for certain students. Then there’s the matter of computers in mathematics and the mathematics of computation. Even the algebra of traditional high school courses is only a pale shadow of what that word really encompasses; there’s more to it, Richard, than just doing arithmetic with x’s and y’s.

Finally, I have a further gripe about the school that failed Gabriela Ocampo. How could they let a student flunk the same course six times—whether the course is algebra or anything else—and not intervene in some way? According to Helfand, 48,000 ninth-grade students enrolled in algebra in the fall semester of 2004, and 44 percent of those students failed the course. If only you could do percentages, Richard, you’d see there is something desperately wrong with the mathematics of that situation, and the problem lies elsewhere than in the algebra.

Playfair’s Powerpoint Presentation

Saturday, February 18th, 2006

“To those interested in the effective visual communication of quantitative phenomena, William Playfair’s Atlas is like the Bible: an ancient and revered book that is often cited but rarely read.”

—Howard Wainer and Ian Spence

The Commercial and Political Atlas is rarely read simply because it’s rare. Playfair published three versions of the book between 1786 and 1801, but apparently it has never been reprinted since then except in a microfilm edition. Looking at the microfilm a few years ago, I thought that someone really ought to put together a handsome facsimile edition, equipped with a learned introduction that would explain the work’s place in the history of scientific illustration. Now someone has done exactly that. Actually sometwo: Howard Wainer and Ian Spence. Wainer is with the National Board of Medical Examiners and the University of Pennsylvania and also has a recent book of his own on graphics and visualization, Graphic Discovery: A Trout in the Milk and Other Visual Adventures. Spence is professor of psychology at the University of Toronto.

The new edition of Playfair is published by Cambridge University Press (ISBN 9780521855549, U.S. price $39.99). It presents the third and final edition of the Atlas, photographically reproduced from a copy held by the University of Pennsylvania. As a bonus, the volume also includes a later Playfair pamphlet, The Statistical Breviary.

Playfair’s work is treated with Biblical reverence because it was the first publication to include graphs and charts as a way of presenting quantitative information. The Atlas is full of line graphs or area graphs showing the balance of trade between Britain and various other countries through the 1700s. A bar chart made an appearance in one of the early editions of the Atlas but was later dropped. The Breviary has pie charts.

In their introduction, Wainer and Spence address the obvious question, which is not “How did Playfair come up with these novelties?” but rather “Why didn’t anyone do it before?” Although the costs and difficulties of reproducing graphics were surely a factor, Wainer and Spence argue that the major impediment was distrust of such graphic devices. Scholars and scientists of the time might have drawn graphs for their own private use, but in publications they wanted to see raw data, in big tables of numbers. Graphs and charts had the same doubtful reputation then as Powerpoint presentations do today.

And perhaps Playfair’s colleagues were justified in their skepticism. Wainer and Spence point out that Playfair’s interpolation often looks either careless or fanciful, and some of the circular charts in the Breviary suffer from perceptual problems that still plague graphic artists today. All the same, it’s great to have the book in one’s hands.

Zeroing in on zeta zeros

Friday, February 17th, 2006

Casual observers of the mathematical arts might be forgiven for feeling that mathematicians sometimes make rapid progress in the wrong direction.

For example, the concept of a prime number is simple enough to be understood by anyone who knows a little arithmetic. In order to explain the distribution of primes along the number line, however, mathematicians have turned to studying the distribution of the zeros of the Riemann zeta function. And in order to describe the distribution of the zeros of the zeta function, they have been looking into the distribution of eigenvalues of random symmetric matrices. For many of us, converting a problem about prime numbers into a problem about zeta functions and eigenvalues is not progress toward understanding. Nevertheless, if we are to follow the ideas of modern mathematics, that’s where we’ve got to go.

Without actually trying to explain anything about zeta functions or eigenvalues, I can note that the correspondence between those two distributions is looking very good these days. For more than 20 years, Andrew M. Odlyzko (now of the University of Minnesota) has been calculating the positions of hundreds of millions of zeta zeros along a certain line in the complex plane. When the distribution of spacings between those zeros is plotted along with the predictions of random-matrix theory, the match is extremely close—although not quite perfect. The hypothesis is that if we could construct such a plot for all the zeta zeros (there are infinitely many of them) and compare it with the eigenvalues of infinitely large random matrices, the correspondence would be exact. But building such infinite structures is not a practical option.

A group of European mathematicians has now tried another approach. They have analyzed the distortions introduced by looking at only finitely many zeta zeros and by using matrices of finite size, and they have applied corrections to compensate for these finite-size effects. They find that the systematic errors disappear, leaving only what looks to be a structureless and unbiased statistical scatter. Thus the agreement between the zeta zeros and the eigenvalues seems all but perfect. Now if we can only understand what that means.

The new paper is: “On the spacing distribution of the Riemann zeros: corrections to the asymptotic result”, by E. Bogomolny, O. Bohigas and P. Leboeuf, all of the Université de Paris XI, and A. G. Monastra of the Technische Universität Dresden.

For a hand-waving introduction to random-matrix theory I mention my own American Scientist column, “The Spectrum of Riemannium.” And I take this opportunity to correct an unfortunate error in that column: In my acknowledgments I thanked Oriol Bohigas for the phrase that gave me my title, but in fact the phrase was invented jointly by Bohigas and Patricio Leboeuf (both of whom are among the authors of the new paper).

Bidirectional subroutines

Tuesday, February 14th, 2006

More on reversible computing.

If all it took to reverse a computation was stepping through a program backwards, there wouldn’t be much to say about the idea. In general, this kind of straightforward reversal doesn’t work. However, I have learned that a computer architecture outlined more than 40 years ago would have allowed such back-and-forth operation, at least for selected subroutines.

The scheme was proposed by Edwin D. Reilly, Jr., now retired from the State University of New York at Albany, and the late Francis D. Federighi. They presented their idea in the context of a machine called the MOHAC, or Mohawk-Hudson Automatic Computer, which they describe as a hypothetical computer designed for teaching programming in high schools. The MOHAC was never built, but Reilly and Federighi created a simulator that ran on the IBM 1620. By using (or abusing) certain unusual features of the MOHAC architecture, a program could be made to march either forward or backward through any sequence of contiguous instructions. Only in rare cases would such a subroutine produce meaningful results in both directions, but Reilly and Federighi suggested a further refinement that could have made the technique more broadly useful. When the direction of execution was reversed, they proposed, certain machine instructions would be replaced by their “conjugates.” Addition going forward would become subtraction on the backward phase; multiplication would become division; an instruction that reads data into a register would instead write the value of the register to a memory location. In this way many simple calculations could be made bidirectional.

Such two-way subroutines would do nothing to lower power consumption, which is now the main goal of reversible-computing research. But the dual-use instructions might have saved a few words of memory—a precious resource in 1965.

For the whole story see: Reilly, E. D. Jr., and F. D. Federighi. 1965. On reversible subroutines and computers that run backwards. Communications of the ACM 8:557–558. Reilly has made a PDF available on his web site.

Packed primes

Friday, February 10th, 2006

A few weeks ago I reported on two talks about patterns in prime numbers—pairs of primes that lie close together and long sequences of primes in arithmetic progression. Thomas J Engelsma writes to tell me of a closely related undertaking: The search for dense clusters of primes. Although we know that the average density of primes falls steadily as one looks further out along the number line—that’s the message of the Prime Number Theorem—there’s every reason to expect occasional dense local aggregations. Indeed, a conjecture of Hardy and Littlewood says there should be infinitely many clusters of the maximum possible density—intervals where every number that might conceivably be prime is in fact prime. (Giving a precise definition of the maximum density is a little messy, but it’s obvious that half of all numbers can’t be primes because they’re even, and a third are excluded because they’re divisible by 3, and so on.) Small clusters of maximum density—called prime k-tuples—are easy to find; for example, the numbers 191, 193, 197 and 199 make up a prime 4-tuple.

Curiously, another Hardy-Littlewood conjecture points in the opposite direction. The conjecture is framed in terms of the function pi(n), which counts the number of primes less than or equal to n. Hardy and Littlewood claimed that for any integers x and y greater than 2, pi(x) + pi(y) is at least as large as pi(x + y). Another way of stating this idea is to say there are no prime clusters out in the further reaches of the number line that are any denser than the cluster at the very start of the line; the primes between x and y are never more numerous than those between 1 and y â€“ x.

The two conjectures are inconsistent; they can’t both be true. And the nature of the conflict between them can be made very concrete: it turns out that pi(3159)—the number of primes between 1 and 3159—is 446, but the maximum possible number of primes in an interval of length 3159 is 447. Thus Hardy and Littlewood’s first conjecture—claiming unlimited clusters of maximum density—says that somewhere out there we ought to find an x with 447 primes between x and x + 3159. But the existence of such a cluster would refute the second conjecture, because pi(x + 3159) would be greater than pi(x) + pi(3159).

All the smart money in mathematics is betting on an infinity of k-tuples, but so far neither conjecture has been proved. Engelsma and several collaborators have been exploring the direct approach to settling the matter, simply by finding a prime 447-tuple. They remain a long way from success, with estimates that the first 447-tuple lies somewhere out toward 101200. In the meantime, though, there are lots of interesting preliminaries about the patterns of k-tuples at Engelsma’s web site. More useful background comes from Tomás Oliveira e Silva of the Universidade de Aveiro in Portugal.

A reversible eraser

Thursday, February 9th, 2006

Still more on reversible and zero-energy computing (see earlier bit-player posts here and here, and the American Scientist column):

M. Maissam Barkeshli of the University of California at Berkeley has a preprint titled “Dissipationless Information Erasure and Landauer’s Principle.” (The paper was first submitted to the arXiv last April, but I missed it then, and noticed it today only because it has just been updated.) Barkeshli’s “dissipationless erasure” does not challenge the basic premise that a reversible computer could operate (in principle) without energy loss, whereas an irreversible computer must dissipate at least some energy. He argues, however, that the energy-dissipating step in an irreversible machine need not necessarily be the erasure of a bit of information, as Rolf Landauer first suggested in 1961. Barkeshli describes a hypothetical computer technology in which erasing is free but writing a new value takes energy. The energy cost for the entire cycle remains the same.

Newton and Notwen

Tuesday, February 7th, 2006

This is another loose end from my new column on reversible computing (now available online; also see the earlier bit-player item on swapping).

In the column I mention Henry G. Baker’s idea of reversing Newton’s method for approximating the square root of a number. Undoing the extraction of square roots doesn’t seem like much of a trick: After all, you can just multiply. But that’s not good enough for a reversible computer, which has to be able to back up through all the stages of a calculation, step by step.

Baker wrote, in a 1992 paper:

The Newton iterative square root algorithm x[i+1]=(x[i]+N/x[i])/2 exemplifies a large class of computations in which the output appears to be independent of the input, since the Newton iteration produces the square root independent of the initial approximation. Nevertheless, if this algorithm is run a fixed number of iterations on infinite precision rational numbers, and if x[0]>=sqrt(N), it is reversible! Newton’s iteration essentially encodes the initial conditions into lower and lower order bits, until these bits become insignificant. It is the erasure of these low order bits through rounding and/or truncation that dissipates the information needed for reversibility. In a sense, Newton’s iteration converges only because Newton’s inverted iteration produces chaos.

I first read this passage at least 10 years ago, and it has stuck with me ever since. Exploring the zany idea of a reversible algorithm whose output seems to be independent of its input was one of the reasons I wanted to write about reversible computing in the first place. Nevertheless, when my column went to press a couple of weeks ago, I still had not looked closely at the inverse-square-root algorithm; I was simply reporting Baker’s comments. This is a situation that always leaves me feeling nervous. It’s not that I worried Baker might be wrong; he’s a reliable source (moreso than I am). It’s that my own understanding of the inverted algorithm was much too shallow, and if anyone asked me to explain it in greater detail, I would have been stumped. In particular, why does Baker mention that reversibility is guaranteed only when the initial guess x0 is greater than or equal to the square root of N? I had only the vaguest notion of the answer. But now, after a pleasant weekend immersed in algebra and Lisp, I have a somewhat clearer sense of what’s going on.

To reiterate, Newton’s method is based on this iterative computation:

x_{i+1}=\\frac{x_i+{\\frac n x_i}} {2}

In other words, given a guess xi about the value of the square root of n, we take the average of xi and n/xi, which becomes the new guess xi+1 to be fed back into the same formula. Each stage of the iteration yields a closer estimate of the square root—unless of course the estimate is already exact, in which case x is equal to n/x.

Here’s an example of how the process goes. Let’s take the square root of n=2, starting with an initial guess of x0=2. Plugging these values into the formula yields (2 + 2/2)/2, or 3/2, as the next iterate, x1. Then applying the same method again yields a value of x2=17/12. Continuing in the same way, we get x3=577/408 and x4=665857/470832. The next iterate, x5=886731088897/627013566048, has a decimal value of 1.4142135623730951, which is correct to the number of places given.

In a reversible computer, we need the capacity to turn this process around, to work backward through the sequence of approximations—to convert the Newton iteration into the Notwen iteration. The first task is to define the inverse transformation mathematically. To avoid tripping over a thicket of subscripts, let’s rewrite the equation as:

y=\\frac{x+{\\frac n x}} {2}

Then, multiplying both sides by 2x and rearranging yields this quadratic equation:

0=x^2 + 2yx + n

The forward Newton iteration is what you get when you solve this equation for y. For the Notwen iteration we have to solve it for x instead. The answer comes straight out of the quadratic formula:

x = \frac {2y \pm \sqrt {4y^2 - 4n}} 2

But this result is rather troubling. At each step in the iteration we will have to take the square root of 4y2–4n, a process that could yield an irrational value, or even an imaginary one. Can this algorithm really get us back to where we came from? Let’s try one of the examples encountered above, with n=2 and y=17/12. Then y2=289/144, and we’re looking for:

\\sqrt {4 \\times \\frac {289} {144} - 8}

Miraculously, the value under the radical reduces to 1/36, and so its square root is 1/6. Completing the rest of the calculation recovers the previous iterate x=3/2. Of course there’s actually nothing miraculous about this. Whenever the Notwen method is used to invert a Newton calculation, it will turn out that the quantity under the radical is a perfect square. As Baker points out, all the information from the original Newton iteration has been encoded in those unwieldy fractions (or in the low-order digits of their decimal expansions). We’re exploiting all of that information, like a trail of breadcrumbs, to find our way back to the starting point of the computation.

Here’s a trace of a program descending six levels into the Newton iteration and then climbing back out again through the corresponding Notwen iterations.

x0 = 2
x1 = 3/2
x2 = 17/12
x3 = 577/408
x4 = 665857/470832
x5 = 886731088897/627013566048
x6 = 1572584048032918633353217/1111984844349868137938112
x5 = 886731088897/627013566048
x4 = 665857/470832
x3 = 577/408
x2 = 17/12
x1 = 3/2
x0 = 2

Note that it’s essential to write this program in a language that calculates with exact rational numbers. Rounding or truncating would be disastrous.

There’s just one more loose thread. Why does Baker warn us to start the Newton algorithm with an x0 greater than the square root of n? The Newton iteration converges on the square root from any starting guess (except x=0), so why shouldn’t the Notwen algorithm offer the same versatility?

Let’s try it. Here’s a trace of the Newton program’s progress toward the square root of 2 from an initial guess of 1/2, followed by the Notwen program’s attempt to recover that guess:

x0 = 1/2
x1 = 9/4
x2 = 113/72
x3 = 23137/16272
x4 = 1064876737/752970528
x5 = 2267891697076964737/1603641597827614272
x4 = 1064876737/752970528
x3 = 23137/16272
x2 = 113/72
x1 = 9/4
x0 = 4

Everything goes smoothly until the very last step, when Notwen suddenly goes off the rails; applied to 9/4, it returns 4 instead of 1/2. How come? This is not just some minor bug or oversight in the program. The problem is that every quadratic equation has two roots—as indicated by the plus-or-minus symbol in the quadratic formula—and we have arbitrarily been choosing just one of those solutions, namely the larger one. That choice gives correct results as long as x0 is greater than the square root of n, but not otherwise. To apply the method more generally, we would need some scheme for keeping track of which root to choose on each iteration. The same is true of the “forward” algorithm for extracting square roots: After all, every number has two square roots, and the Newton method will converge on the negative one if given a negative initial guess.

With these bifurcations at every step, a square-root program becomes a lot more complicated than it seems on the surface, whether you run it forward or backward. That really shouldn’t come as a surprise. Algorithms only a little more elaborate—they derive from cubic equations rather than quadratic ones—underly the famous Julia and Mandelbrot sets, the very emblems and icons of complexity.

Afterthought: To reverse the Newton algorithm for extracting a square root, we have to extract a square root. Aha! We can use the Newton algorithm to calculate its own inverse! Lamentably, it won’t work. The one thing Newton’s method is no good at is finding the exact square root of a perfect square—and that’s just what we need in this case. The successive approximations of the Newton algorithm converge very quickly on the correct value, but they never actually get there.

Afterafterthought. I have discovered (or rediscovered, or remembered—it’s hard to say which) that Baker also discussed the Newton square-root algorithm in several installments of his “Garbage In/Garbage Out” column in ACM SIGPLAN Notices (the newsletter of the Association for Computing Machinery’s Special Interest Group on Programming Languages). Here are the references:

  • Baker, Henry G. 1998. Garbage In/Garbage Out: You could learn a lot from a quadratic: overloading considered harmful. ACM SIGPLAN Notices 33(1):30–38.
  • Baker, Henry G. 1998. Garbage In/Garbage Out: You could learn a lot from a quadratic: II: digital dentistry. ACM SIGPLAN Notices 33(2):34–39.
  • Baker, Henry G. 1998. Garbage In/Garbage Out: March Möbius madness with a polynomial PostScript. ACM SIGPLAN Notices 33(3):24–35.
  • Baker, Henry G. 1998. Garbage In/Garbage Out: You could learn a lot from a quadratic: Newton squares the circle. ACM SIGPLAN Notices 33(6):27–31.

Finally, while searching for these titles, I have also “discovered” yet another Baker article that has much to say about reversibility in computing systems:

  • Baker, Henry G. 1994. Thermodynamics and garbage collection. ACM SIGPLAN Notices 29(4):58–63.

Rediscovering America

Sunday, February 5th, 2006

In today’s New York Times (registration required), Gina Kolata writes on rediscoveries and reinventions in the sciences. Her essay is based in part on the experience of Rakesh V. Vohra of Northwestern University, who has discovered that one of his own results has been rediscovered at least 16 times. Kolata also quotes Stephen M. Stigler of the University of Chicago, who points out that the very notion of repeated rediscovery is something that has been repeatedly rediscovered. (But Stigler modestly does not mention Stigler’s Law of Eponymy, which says that discoveries are never named for the people who first discovered them.)

In keeping with this theme of oft-repeated repetition, there’s a recent paper in the arXiv that re-covers some of the same ground. M. V. Simkin and V. P. Roychowdhury of the University of California at Los Angeles review 20th-century work on various probabilistic models, on the structure of networks, on branching processes in genetics and taxonomy, on self-organized criticality, and more. Over and over again, they find that things have been discovered over and over again. For example:

The distribution of words by the frequency of their use follows a power law. This fact was discovered sometime before 1916 by Estoup, re-discovered in 1928 by Condon, and once more in 1935 by Zipf. Nowadays it is widely known as Zipf’s law. To explain this observation, Simon proposed [a] model…. In 1976 Price used Simon’s model to explain a power-law distribution of citations to scientific papers, which he discovered in 1965. He proposed to call it “cumulative advantage process”. Simon’s model was re-discovered in 1992 by Günter et al and in 1999 by Barabasi and Albert. In the latter case it acquired a new name: “preferential attachment”.

(In the quotation above I have silently elided several references, as well as a long passage describing Simon’s model.)

Although Simkin and Roychowdhury don’t say so directly, I get the impression that they take a dim view of all this rediscovery, seeing it as wasteful duplication of effort. But it’s time for someone to stand up in defense of rediscovery. When we find that the same mathematical law turns up in distant realms of life, that in itself is a discovery of some note, and surely worth rediscovering a time or two. Simkin and Roychowdhury conclude: “We estimate that scientists are busy re-discovering America about 2/3 of time.” I would respond by quoting Alvin E. Roth and Marilda A. Oliveira Sotomayor, in Two-sided Matching: A Study in Game-theoretic Modeling and Analysis:

Columbus is viewed as the discover of America, even though every school child knows that the Americas were inhabited when he arrived, and that he was not even the first to have made a round trip, having been preceded by Vikings and perhaps by others. What is important about Columbus’s discovery of America is not that it was the first, but that it was the last. After Columbus, America was never lost again; no subsequent explorers can claim its discovery.

Update: Soon after writing the above I discovered that the Kolata article had already been discovered by Lance Fortow, who also (will coincidence never cease?) quotes another version of the Columbus quote.