Archive for April, 2009

Sic transit

Thursday, April 30th, 2009

SciAm-cover-April-1949.jpg

The tattered magazine shown above was on newsstands 60 years ago. You could have bought a copy for 50 cents—which wasn’t cheap at the time. The article featured on the cover, “Mathematical Machines,” surveys the whole topic of computational technology, from a primer on binary arithmetic to breathless reports on the hottest new machines—the Edvac, the Univac, Project Whirlwind. Other articles in the same issue include “Submarine Canyons,” “Greek Astronomy,” “Titanium: A New Metal” and “The Evolution of Sex.”

The creators of this ambitious magazine were Dennis Flanagan and Gerard Piel, two young writers who had become friends while working at Life magazine. In 1947 they set out on their own, planning to launch a wholly new magazine about science. When they learned that Scientific American was about to fold up after more than a century of publication, they bought it and made it over. By the time of that April 1949 issue, the new magazine had already found the formula and the format that would define it for a generation of readers.

One important part of the formula was a bit of an accident. Flanagan and Piel had expected to hire professional writers to produce the main articles, but they couldn’t find enough of them. So they began inviting scientists to tell their own story, in collaboration with an editor. This shortcut would soon become a key element of the magazine’s identity and its claim to credibility.

I grew up reading Scientific American, though I was not a subscriber. I discovered a disreputable shop on Race Street in Philadelphia that sold out-of-date issues for a dime each. These were copies with their covers torn off; years later I learned why. Newsstands are entitled to return unsold copies of magazines for full credit, but shipping all that paper back to the publisher was expensive and wasteful. So the dealers sent back only the covers and promised to pulp the rest of the magazine. Some of the discarded copies never made it to the recycling yard.

My adolescent reading of all those gray-market magazines was probably my main qualification when I joined the staff of Scientific American in 1973. I spent a decade there. I’ve done lots of other stuff since then, but the years with Flanagan and Piel were the great formative experience of my working life; in my own mind, I’ll always be a former editor of Scientific American. And I still have a deep affection for the magazine, even though I dislike what’s been done to it in recent years—so much so that I find it painful to read.

What went wrong? I think the genre of this story is tragedy: a noble enterprise undone by its own success. By the 1980s the magazine was prosperous enough to attract the attention of predatory investors. Their reasoning went something like this: If those eggheads can sell half a million copies and a hundred pages of ads with a magazine crammed full of physics and mathematics and molecular biology, just imagine how well we can do if we get rid of all that boring science. In 1986, under pressure from stockholders, the company was sold to Verlagsgruppe Georg von Holtzbrinck—by no means the worst of the suitors, but the terms of the purchase left the magazine desperate for new revenue. And the collateral damage was even sadder: Flanagan and Piel parted ways and never spoke again.

Over this past weekend I learned that Scientific American is going through another rough patch and will likely see further changes. (Sources: The New York Times, Folio, Portfolio.) Ad pages are down 18 percent. John Rennie, the editor since 1994, has “taken the opportunity… to find other engaging things to do.” At least 20 staff members will lose their jobs. The magazine will be uprooted from its own offices and will move in with the parent organization, Nature Publishing Group (another Holtzbrinck acquisition). Rumor has it that the new editorial direction will be even more consumer-oriented, focusing on science that’s “useful in everyday life.” The old commitment to articles “written by the scientists who did the work described” is under question.

I’ve been stewing over all this for a few days—or maybe it’s been a few decades. In any case, I’m tired of stewing; it’s time to move on.

At a personal level, I commiserate with those who are going to be hurt most—the staff members (including some old friends) whose jobs are in jeopardy. But in journalism and publishing these days we’re all sailing in leaky tubs and bailing as fast as we can.

When I take a step back and look at the larger cultural significance of these events, I think first of how important Scientific American was in my own upbringing and education. It’s my reflex to ask: How will kids like me get turned on to science without access to those coverless bootleg magazines at a dime a piece? But what a ridiculous question! The world has changed. The resources available today to a curious 14-year-old are far superior to anything I could have imagined back in the sixties. The little shop on Race Street was a treasure chest in its time, but it can’t compare with Wikipedia and the arXiv and PLoS and Weisstein’s World of Mathematics and Sloane’s Sequence Server and all the other bounty that’s now just a click away. Indeed, that’s a big part of the problem that faces Scientific American, and other publications.

We did good work at Scientific American, back in the old days. I’m proud to have been a part of it. But what we were building was not a monument to be preserved for the perpetual admiration of future generations. It was a channel of communication, a way of linking scientists with a broader audience. I believe it’s still important to keep those channels open, but the best way of doing it may be different in 2009 than it was in 1949.

On page 1 of the April 1949 issue of Scientific American is an advertisement from the Radio Corporation of America announcing—with considerable fanfare—the company’s latest innovation for audiophiles: the 45-rpm vinyl record. Today, the replacement of the replacement of the replacement of that recording medium is teetering on the edge of obsolescence. And yet music goes on! I want to believe that science journalism can also make a transition to a new medium, and perhaps come out of it better than ever.

The control room

Sunday, April 26th, 2009

My “Computing Science” column in the new issue of American Scientist looks at economics—including the current malaise—through the lens of control theory. This is not a new idea. The Keynesian prescription for smoothing out cycles of boom and bust is essentially an application of feedback control, although Keynes himself never described it that way. By the late 1940s the connection became explicit, as economists and control engineers worked together to refine tools for taming the business cycle. The two disciplines had another flirtation in the 1970s. Economists today seem less enthusiastic about methods borrowed from control theory, but they have certainly not given up on the aim of bringing the economy under control. In the past six months governments and central banks have intervened on a scale never seen before, all in an effort to “stimulate” economic recovery.

I’ll let the column itself tell the rest of this story; here I would just like to add a a couple of personal reminiscences about control engineering as seen by an interested outside observer.

Over the years I’ve had the opportunity to visit a lot of control rooms—at oil refineries, power plants, steel mills, railroad switch yards, sewage-treatment plants, particle accelerators, observatories, Internet operations centers. They’re not all exactly alike, but they do seem to share a common architecture, and also aspects of a common culture. For example, standard control-room iconography shows normally running machinery in red, whereas stopped or faulty equipment is green—the opposite of what you might guess after a lifetime of exposure to traffic lights. Years ago most control rooms had a “mimic board”—a long wall covered with a schematic diagram of the plant, festooned with gauges and switches and indicator lights. Now the operators are more likely to sit at ordinary computer screens. (I do hope they’re not watching YouTube videos while the reactor melts down.)

Two control rooms at the same plant—a paper mill in Catawba, South Carolina—serve to frame my impressions.

One of those rooms was in charge of a papermaking machine, a stupendous contraption the size of a football stadium, which takes in a slush of digested trees at one end and delivers huge reels of pristine white paper at the other end, with the sheet of paper-to-be threading between banks of steel rollers at a rate of more than 60 kilometers per hour. The control room for the paper machine was a calm and laid-back place. I sat with an engineer for an hour or so, talking about the thundering machinery outside the soundproofed booth, and all the while keeping an eye on the display screen in front of us. The indicators hardly budged the whole time. One key set of sensors in the papermaking process monitors the thickness of the moving sheet; if the thickness varies from the set point, a correction is made by blowing either warm or cool air onto one of the steel rolls. The temperature change causes the roll to expand or contract very slightly, and thus to squeeze the sheet a little more or less tightly. This is a subtle adjustment. Here’s what I wrote about it at the time:

It should be noted that the overall control strategy in papermaking is to operate the machine in a stable, steady-state condition. Blowing hot air on a steel roll cannot effect a large or a rapid change in the roll’s diameter; in the vocabulary of control theory, the actuators for this system have only limited authority. The limitation is intentional. The paper machine must run continuously for long periods, and changes in its state are designed to be gradual and confined to narrow bounds.

The other memorable control room at the Catawba mill was in the powerhouse, where boilers generate steam needed throughout the plant, as well as electricity. The main fuel for the boilers is “black liquor,” a strange watery brew created in the digesters that leach lignin and other kinds of organic gunk out of the wood. Black liquor is not a particularly good fuel, but the plant has to get rid of the stuff anyway, so burning it makes sense. But because of wide variations in the caloric value of the liquor, the boilers are hard to control. The first thing I noticed when I sat down with an operator in the powerhouse control room was that all the finish had been rubbed off one area of the console, surrounding a button whose label had also been rubbed away. I soon learned the function of that button: It silenced the alarm buzzer that sounds whenever some parameter wanders out of bounds. On the day of my visit, the buzzer was going off every minute or two. The operator displayed a well-developed Pavlovian reflex, slapping the silence button in milliseconds every time the buzzer sounded. Then he would turn to the screen, find the cause of the problem, and make some adjustment to fix it. Thus the powerhouse control room was a much busier place, and the mood was tense. If I had been doing that job, I’d have been ready for a beer at the end of my shift.

Somewhere in the bowels of the Treasury Department or the Federal Reserve (or maybe the World Bank), there must be a control room where operators at glowing screens watch the ebb and flow of money and make adjustments when some economic parameter strays from the set point. I wonder about the mood of that room. Is it one of those placid, hushed working environments, where controllers gently nudge the system back toward equilibrium? Or are they batting at buzzers all day?

Himalayan mathematics

Monday, April 20th, 2009

Browsing through some notes I jotted down sometime in 1988, I come upon this sentence:

Mathematicians feel about computers much as the Nepalese feel about jet aircraft.

Did I read that somewhere, or did I make it up? My notes give no clue to the provenance of the thought, and Google is unhelpful.

Bits from its

Wednesday, April 15th, 2009

Sometime in the early 1980s my friend Greg Chaitin decided to go digital. He wanted to have all of his own writings, along with some other documents he particularly valued, ready at hand in machine-readable form. This was long before Postscript or PDF; scanners and OCR were exotic technology. So Greg hired a typist. I know about all this because he sent me a printout of his digitized oeuvre—an impressively thick sheaf of fanfold paper, with sprocket strips still attached. (The irony of this mode of distribution was not lost on either of us.)

I’m finally trying to catch up with Greg. Instead of hiring a typist, though, I’ve bought a cute little document scanner, which has been busily transforming heaps of moldering cellulose into shiny new bits. The scanner produces PDFs, which then run through an OCR program to build an index, making the page images searchable. (Bring on the Googlebot!) The result falls short of ideal—files are obese, graphics are low-res, there’s no opportunity to edit or reformat—but still it’s better than rummaging through cobwebby filing cabinets in my basement. And I don’t have to FedEx cartons of fanfold to my friends.

I’ll be adding links to my publications list as I upload the files. Here are a few teasers:

The HP-41C: A Literate Calculator? Byte, January 1981, pages 118-138. PDF (3.3 MB)

40,000 Points of Light. Pixel, Vol. 1, No. 2, May-June 1990, pages 36-41. [On a graphics language in which an image is defined as a locus of points.] PDF (2.4 MB)

Rank-and-File Thinking. Lotus, June 1985, pages 73-77. PDF (1.8 MB)

Theory & Practice: On the bathtub algorithm for dot-matrix holograms. Computer Language, October 1986, pages 21-32. [On patterns generated by severely oversampled signals.] PDF (4.6 MB)

The Information Age: The Numbering Crisis in World Zone 1. The Sciences, Vol. 32, No. 6, November-December 1992, pages 12-15. [On the impending shortage of telephone numbers.] PDF (1.8 MB)

I note for the record that every one of the publications mentioned in the list above is now defunct. I wonder if there’s any significance in that?

Pub date

Tuesday, April 14th, 2009

For those of you who have been waiting patiently for the paperback edition of Group Theory in the Bedroom, and Other Mathematical Diversions: It’s finally here. Today is the official release date.

GTiB-pb-cover.jpg

For those of you waiting for the movie version, well, so am I.

Computing in the classroom

Wednesday, April 8th, 2009

As the students file into the classroom, each of them reaches into a basket by the door and selects a ticket, on which a number is printed. For convenience, let’s assume the numbers are integers, they’re of reasonable size, and no two tickets bear the same number. Once everyone has settled down, the instructor assigns the class this exercise: Consult among yourselves and show me the ticket with the largest number.

How will the students solve this problem? Perhaps they pass all the tickets to one nimble kid in the first row, who thumbs through them looking for the maximum. This is a sequential algorithm, where the running time ought to be roughly proportional to N, the number of tickets. An obvious parallel solution is based on a tournament tree: Pairs of students compare their numbers, then the winners of each of these matches get together in pairs, and so on, until only one student remains. This routine yields an answer in about log2 N rounds of comparisons.

Here’s another idea. The students all stand up and shout out their numbers repeatedly, like commodity dealers in the trading pit. If a student hears a number larger than her own, she sits down and remains quiet thereafter. The last student standing has the largest number.

The efficiency of this shouting-match algorithm is hard to quantify. If we assume that everyone can yell and listen at the same time without confusion, then the problem of identifying the maximum number is solved instantaneously. For large N, however, that assumption is surely unrealistic.

A variant of the outcry algorithm is easier to analyze: A single student chosen at random calls out his number, and all those students with smaller numbers sit down. Then another student is selected from among those still standing, and the procedure is repeated until only one student remains. This algorithm also seems to have an expected running time of log2 N, since each round of eliminations should exclude (on average) half the students still in contention.

Suppose the students are asked to deliver the sum of the numbers, rather than the maximum. The tournament-tree computation adapts readily to this task, again requiring depth log2 N. On the other hand, I don’t know any reliable way to do addition by shouting.

What about calculating the average, or arithmetic mean, of the numbers? Of course this could be done by summing the numbers and then dividing by N—or by dividing first and then summing—but either of these strategies requires the students to count themselves before they can compute the average. Can’t they just use the tournament-tree algorithm to compute the average directly, as they did for the maximum and the sum? Suppose they replace the ‘max’ or ‘+’ operator with a new binary average operator, ‘•’, defined as (a • b) = (a + b)/2. Plugging this operator into the tree algorithm works just fine if N happens to be an exact power of 2, but otherwise it fails. In the case of N=5, for example, the computational tree might look like this:

averagetree.png

The four values on the left are combined in a regular, symmetrical tree to yield the correct average, but then the fifth value has to be inserted in an extra, ad hoc operation. Applying the binary average operator at the top node of the tree would give a result of 3.75, whereas of course the true average is 3. The underlying problem here is that the averaging operator is not associative: (a • (b • c)) ≠ ((a • b) • c). [Forgive me if all this is thunderingly obvious to everyone but me. It actually took me a while to figure it out.]

One could fix this problem by passing pairs of numbers up the tree; one member of each pair is the running average, and the other member is a weight equal to the number of numbers that went into calculating that average. But this is just the sum-and-divide procedure in disguise. Let me note that there is a very simple algorithm for approximating the average. Have the students form random pairs repeatedly; in each pair, both students replace the value on their ticket with the average of the two values (in other words, they apply the ‘•’ operator). In this procedure the students never actually calculate either N or the sum, but both of those quantities are conserved as invariants of the computation. As the random process continues, all of the tickets eventually converge on the same value, namely the overall average. Experiments suggest that about log2 N rounds of random coupling yield a reasonably good approximation.

In judging the merits of these algorithms, my criteria have more to do with style than with computational complexity. My dislike for the sum-and-divide method probably has no firmer basis than stubborn prejudice. All the same, a computer in which the processors are people does seem to call for a particular style of program. Ideally (in my view) an algorithm for the classroom computer would be fully parallel (every student gets an equal chance to play), homogeneous (all the students get the same instructions) and local (students interact in small groups, without needing to know much about the global state of the system—such as the value of N). An algorithm gets extra points if it exploits the mobility and flexible communication patterns of human computers.

So here’s another challenge for the class. As before, the students take numbered tickets, but now the instructor asks to see the ticket with the median number—the value that is larger than half the other numbers and smaller than half. (Let’s make life easy and assume that N is odd.) Here is a median-finding algorithm—a variation on a method sometimes called Quickselect—that I would like to believe a group of students might invent. First, all the students raise their right hand. Then they repeat the following series of steps until the median is identified.

  • Choose a student at random from among those who have a hand raised. This student’s ticket number is designated the pivot.
  • All students whose number is smaller than the pivot line up on the left and those with larger numbers line up on the right.
  • If the two lines are of equal length, then the pivot is the median. Stop.
  • Otherwise, all students in the shorter line lower their hand (if it’s not down already); the pivot student also lowers her hand.
  • Repeat.

The expected number of times the loop will be executed again appears to be approximately log2 N. Note that in this hand-waving algorithm the students never have to sort themselves. The method also requires no counting—not even for the comparison of line lengths. But we do assume that all the students can simultaneously compare their ticket numbers with the pivot.

The diagram below shows an example calculation. The green dots at the top of each configuration are the successive pivots; students holding lesser and greater numbers are queued up to the left and right of the pivot; blue dots represent numbers still eligible for consideration as pivots (i.e., students with raised hands); red dots are numbers that have been excluded because at some stage of the computation they were in the shorter queue or were pivots that turned out not to be the median.

mediandiagram.png

One could dream up plenty of other exercises to keep the students busy. They could search for subsets of numbers with special properties, such as Pythagorean triples. They could be asked to identify the longest sequence of consecutive integers in the set of numbers they hold. And then there’s the balanced subset-sum problem: The students divide themselves in two groups in such a way that the sums of the numbers in the groups are equal, or as close as possible to equal. (This is an NP-complete problem, but many instances yield easily to the right heuristics.)

The various marching-band patterns and parliamentary shouting matches described above are my own fantasies about how groups of people might “enact” algorithms. I’m curious to know how real students, in a real classroom setting, would actually go about solving such problems. Would they adopt the kinds of local and homogeneous algorithms I find aesthetically pleasing? Or would they favor a simpler and more pragmatic approach? In some cases I suspect that the ruse of passing all the tickets to the class whiz might well be fastest and most reliable way to get an answer. Moreover, for any other approach the time needed for the students to choose an algorithm and organize themselves may well exceed the execution time. Indeed, the very process of coming to agreement on how to solve a problem calls for significant group dynamics; this is not something we see in other kinds of parallel computers.

Another interesting point about people as processors is that their capabilities are not very sharply defined. For example, most algorithms assume that comparisons are strictly pairwise; a predicate such as ‘>’ or ‘=’ is applied to exactly two numbers, and the efficiency of the algorithm is measured by counting those operations. But it seems likely that three or four students could huddle together and find the maximum of their numbers about as quickly as two students could. Thus the complexity of some algorithms might be reduced to log3 N or log4 N.

On the other hand, I’ve assumed the existence of some primitive operations that may be difficult to realize. In particular, the axiom of random choice seems a little dubious in this context. If a group of students is asked to choose one their members at random, how do they go about it? Can they always do it in unit time?

There’s also the question of how such algorithms scale as N increases. A scheme that works well in a classroom with 30 students might not be ideal in a lecture hall with 300. And surely the process would go very differently in a stadium holding 30,000.

I assume that many experiments of this kind have been tried over the years, although I haven’t found much literature on the subject. (Perhaps the closest match is a 1994 article by Bachelis, James, Maxim and Stout; they also note the scarcity of published material.) If anyone has reports from the field, I’d be delighted to learn about them.

Update 2009-04-14: Thanks to the commenters for some particularly helpful and lucid responses. I would like to explore one of these themes a little further.

Mario Klingemann’s “small optimization” to the outcry algorithm is more than just a minor improvement; it corrects an outright bug in the procedure I described. Klingemann notes that when a student calls out a number and all the students with smaller numbers sit down, the calling student must also sit unless no one else remains standing. This is important! Consider the difference it makes when N=2.

So which version of the algorithm—my buggy one or Klingemann’s correct one—matches up with JeffE’s analytic result showing that the expected number of rounds is the Nth harmonic number? In the graph below the harmonic numbers H(N) are represented by the red line; the green and blue lines are empirical results averaged over 100,000 repetitions of programs implementing the two algorithms.

outcry.png

For a given N, the Klingemann result is H(N) – 1/2; the buggy algorithm yields H(N) + 1 (asymptotically, for large N).