Archive for December, 2007

The new toy

Tuesday, December 25th, 2007

The XO laptop computer, showing screen and keyboard

In this season of giving and getting, you’ve got to admire the marketing savvy of the One Laptop Per Child project. They named their introductory sales campaign “Give One, Get One.” The computers cost $200 apiece. For a donation of $400, you send one machine to a child somewhere far away (Afghanistan, Cambodia, Haiti, Mongolia and Rwanda are mentioned as possible destinations), and you get another for your own gadget-happy life.

The question that everyone seems to ask about OLPC is whether laptop computers are really what the children of Afghanistan, Cambodia, etc., need most urgently. Well, reason not the need! What I find most appealing (and most subversive) about the whole idea is that it leapfrogs over matters of brute survival. If children lack adequate food and shelter, is that grounds for denying them intellectual diversion and enrichment as well?

In connection with an earlier attempt to bring computers to children, Alan Kay wrote (pdf):

[A computer] can be like a piano: (a product of technology, yes), but one which can be a tool, a toy, a medium of expression, a source of unending pleasure and delight… and, as with most gadgets in unenlightened hands, a terrible drudge!!

So let’s not deny the world’s children either delight or drudgery. Once OLPC achieves its goal, I say we start a One Piano Per Village program.

But enough about giving. What about getting? What’s the new computer like?

They call it the XO. I’ve had mine for only a few days, and I’ve hardly begun to play with it, so all I can report here are first impressions. Basically, I think it’s brilliant, but I also wish the system software had been built on a somewhat different model.

The XO closed up.

The hardware is undeniably cute. No one will mistake it for a Sony Vaio or a Dell Inspiron. I wouldn’t be surprised if the XO becomes something of a fashion statement, the laptop to be seen with at Starbucks, or for liveblogging NANOG or SODA. I’ve heard complaints that the keyboard is too small for adult fingers, but typing on these little green keys can hardly be worse than thumbing a Treo or Blackberry. (Disclaimer: I type with two fingers, so I’m not much of a judge of keyboard ergonomics.)

If the styling has a whiff of Fisher-Price about it, there’s also some thoughtful ingenuity at work here, and designers of machines for grownups might learn something from it. The screen is gorgeous—crisp and bright in its standard color mode, with a secondary black-and-white mode that’s legible in full sunlight. The screen also swivels 360 degrees and can be laid face-up over the keyboard, so that the XO can be used as a tablet for reading e-books and the like.

There’s no disk drive; the machine has 256 megabytes of RAM and a gigabyte of flash memory for nonvolatile storage. Three USB ports and a slot for SD memory cards offer opportunities for supplemental storage. One consequence of the no-moving-parts architecture is that the computer is absolutely silent in use—no clicking or whirring from a disk drive, no fan noise.

The wifi transceiver is amazing. I never knew I had so many well-connected neighbors—people named linksys and netgear, for example. No other computer I’ve had in the house has ever detected any of these networks. Why is the wireless card in my other laptop (which cost an order of magnitude more, by the way) so wimpy in comparison? The XO can also form ad hoc “mesh” networks with other XOs, but evidently I’m the first XOwner in my neighborhood, so I haven’t been able to experiment with that capability.

When it comes to software, I can’t be quite so wholeheartedly enthusiastic. There are some bright spots. A set of four programs called TamTam provide first-rate tools for creating and playing music; the calculator is very elegantly done; there’s a cool built-in oscilloscope, and a “Write” program that seems just right for the intended audience. Also a Python interpreter named Pippi. And the software called Etoys is a highly ambitious system, willed into existence by Alan Kay and a bunch of his friends, which looks like a plaything but has a complete Smalltalk development environment inside.

Some other spots are not so bright. For one thing, the software is just not finished yet. Some basic capabilities (printing, a sleep mode) are not yet implemented, and there are various buttons that don’t yet have functions. The web browser is primitive (no tabs, very limited facilities for bookmarks). There’s an RSS reader that doesn’t seem to work. An update planned for early next year is supposed to fix some of these problems.

Anyway, software is always behind schedule; I can wait. As a matter of fact, I’d be happier if they had taken longer and built the software in a different way.

I have opined elsewhere that computing is in a horrible rut, unable to escape the grip of precedents established in the distant past. A case in point is the Unix operating system, which is now almost 40 years old but is still with us in the form of Linux and OS X. The longevity of Unix argues that it must be quite a solid product, and I won’t argue with that. On the other hand, it’s a sad commentary on progress in computer science if no one has had a better idea since 1969.

OLPC offered a rare opportunity. It was a chance to start fresh, to throw away the past, to build a computer that has no legacy code to run; there was no need to worry about backward compatibility. But that’s not the way the project evolved. The XO software turns out to be a sugar-coated version of the Linux operating system. I can understand the reasons for choosing this path. Building an entirely new software system from scratch would have slowed the project down, and it would have raised the risk of catastrophic failure. Adopting Linux provided a major head start and allowed the project to draw on a global community of experienced programmers. But the choice also had an opportunity cost. A multiuser operating system designed in the era of timesharing and dumb terminals, and tuned to the needs of professional programmers, is not an ideal match for a laptop built for kids.

When I say the XO software is sugar-coated Linux, that’s just what I mean: The user-interface layer is named Sugar. It’s an interesting program, which tries to escape the ubiquitous desktop metaphor by emphasizing community and interaction. Instead of launching a program or application, the user starts an “activity.” Many activities can be shared with other XOwners; for example, two XOs can be set up to use acoustic signals to measure the distance between them. One intriguing “feature” of Sugar is the absence of a conventional file system. You don’t create documents and then decide where to put them and what to name them. Pointers to all activities are saved automatically in reverse chronological order in a linear list called the journal.

The sugar layer is nifty, but it’s also perilously thin. Underneath lies the Unix command line, and at present there are many things that can be done only by reciting the proper command-line incantations (many of them preceded by a magic ’sudo’). Will the kids master that kind of wizardry?

Maybe I shouldn’t worry; I’m sure they’re better equipped and quicker learners than I am. In any case, it’s their judgment that counts, not mine.

Googling for graphs

Wednesday, December 12th, 2007

The latest cute trick from Google is a service for producing quantitative graphics on demand. For example, here’s a bar chart of the traffic to this web site over the past year:

bar chart from Google Charts

The cute part is how you get Google to produce the graphic. The whole specification—including data, scales, labels and stylistic preferences—gets packed into a URL. In this case, the URL reads as follows (with line breaks added for clarity):

http://chart.apis.google.com/chart?
chs=450x300
&cht=bvs
&chbh=27,6
&chtt=bit-player.org+hits+per+month
&chco=ee2233
&chf=bg,s,efedea
&chxt=x,y
&chxl=0:|J|F|M|A|M|J|J|A|S|O|N|D|
      1:|0|20,000|40,000|60,000|80,000|100,000
&chd=t:54.9,59.1,61.3,69.5,83.6,77.6,60.3,60.5,69.2,77.3,86.0,-1

(If you want the key to decode all this curious bric-a-brac, see the Google Chart API Developer’s Guide.)

There’s something vaguely unsettling about this whole idea. When the web was young, a URL was a resource locator—in other words, the address of a document stored on a server somewhere. Times have changed, and most web servers now assemble pages by issuing queries to a database rather than by grabbing documents from a simple file system; the “back end” server fetches some text here, an image there, an ad from a third source. Still, the basic paradigm remains information retrieval; the vast majority of web sites are in the business of serving up some kind of pre-cooked static content, even when it’s embellished with dynamic sauce. But I really don’t think that Google Charts works that way. I don’t think they’ve produced an archive of all possible graphs and were just waiting for me to send the URL that retrieved the one you see above.

The charts service raises anew a question that comes up all the time in computing: What should we compute just once and store away in case it’s needed again later, and what should we recompute every time? Years ago we had precomputed tables of logarithms and trig functions, but now it’s easier to just calculate those numbers on the fly. Evidently graphs have now entered that realm as well—things so cheap to compute it’s not worth hanging onto them. Note that the graph you’re looking at above is not one that I produced when I wrote and published this page; instead, when you opened the page, your web browser (or RSS reader) sent a new request to Google’s servers, which then regenerated the graph from scratch. If a thousand people read the page, the graph will be recreated a thousand times. It’s monstrously profligate and inefficient, but I guess that’s what makes computers so wonderful.

One constraint on this mechanism is the length of the URL. Is there a limit? I didn’t know the answer, and so I turned—where else?—to Google. I quickly found a helpful web page that says the standards are silent on this issue, but some web browsers and servers impose a limit of as few as 2,000 characters. The Google Charts designers probably had this limit in mind when they created their scheme for cramming data into a URL. There are actually three encodings available. Above I chose a method in which each data point is a number between 0.0 and 100.0 and values are separated by commas; this allows for a few hundred data points before you start bumping into the 2,000-character limit. The most compact encoding packs each data item into a single ASCII character in the ranges from A=0 to Z=25, a=26 to z=51 and 0=52 to 9=61. Not the most intuitive encoding, but it allows for a further cute trick: graphing words and phrases.

http://chart.apis.google.com/chart?chs=200x200&cht=lc&chd=s:Agraphisworth1000words
bar chart from Google Charts

Copy the URL into the location bar of a browser window, change the text at the end of the line, and you can see a graph of your name and address.

To P or NP, that is the question

Monday, December 10th, 2007

It’s time for my bimonthly self-promotional horn toot. The new issue of American Scientist is now available online, and paper copies should soon be stuffing mailboxes everywhere. My “Computing Science” column is on a new class of “holographic” algorithms invented a few years ago by Leslie G. Valiant of Harvard. The ideas have been further developed by Jin-Yi Cai of the University of Wisconsin (who is currently visiting Harvard as a Radcliffe Fellow) and several students and colleagues both at Wisconsin and in China.

If you want to know what holographic algorithms are all about, you’ll have to go read the column. I’m not going to rehash it here. Instead, I’m going to recycle a few paragraphs of a version of the column that might have been but never will be. This was a false start, an attempt at a lead that wound up leading me in the wrong direction:

It was the final day of a workshop, and I joined a group for dinner at a farmhouse restaurant in the village of Muggia, around the corner from Trieste, where Italy meets Istria. As the evening was winding down, we sat drinking the last of the wine, watching the sun sink into the Adriatic, and debating the P = NP question. The debate was somewhat lopsided: No one at the table was willing to defend the proposition that P is indeed equal to NP. But there was plenty of room for argument between hard-liners who insisted that P is most certainly not equal to NP and those who took a more agnostic view.

What are P and NP? They are classes of computational tasks or problems. Roughly speaking (I’ll try to speak less roughly in a moment), the class P includes all the things we know how to compute quickly. NP is a larger class, encompassing all the problems known to be in P and many others besides; for these additional tasks we don’t have efficient algorithms, and yet no one has proved that such algorithms don’t exist. If it should turn out that P = NP, then thousands of seemingly hard problems actually have some quick shortcut solution, if only we could find the secret.

A mathematician at my end of the table took an inflexible position: P = NP is not only wrong but unthinkable—a notion akin to time travel or perpetual motion. I had been lurking silently through most of this conversation, but at this point I spoke up. Time travel seems implausible, I said, because it would undermine causality, and perpetual motion would violate the conservation of energy. What bedrock principle would be overturned by the discovery that P = NP? The mathematician replied: “Do you believe in miracles? No? Well, wouldn’t it be miraculous if we lived in a universe where every last one of those NP problems has an easy solution?”

This was good enough for me on that mild summer night in Muggia. The fact is, however, the miracle argument cuts both ways. There’s a subclass of NP problems, designated NP-complete, with a special property: If there’s an efficient algorithm for any one of these tasks, the same method can be used to solve all NP problems efficiently. Thus for P to be different from NP requires the “miracle” that not even one out of thousands of NP-complete problems has a quick solution….

One more note on the column. This is difficult subject matter, and my grasp of it is tenuous. I could not have written the piece without a lot of help and hand-holding from Valiant and Cai, who were reading drafts and answering phone calls up until the very moment the issue went to press. I have a reason for thanking them here rather than in the column itself: If I’ve goofed in spite their efforts, I don’t want them implicated.

Measure twice, average once

Friday, December 7th, 2007

plywood panel with seven measurements in crayon or magic marker

Whenever Norm Abram tells me to “measure twice, cut once,” I wonder what I’m supposed to do if the two measurements disagree. Perhaps I should measure a third time, in hope of settling the question by majority rule; but then I might well wind up with three discrepant values.

Strolling by a construction site the other day, I came upon the plywood panel shown above. There was no one around to help me interpret these curious scrawled measurements, but I could easily enough imagine the scene. A carpenter—Skilsaw at the ready—is surrounded by a group of statisticians and decision theorists eager to advise him on where to make the cut.

“Obviously,” says the first consultant, “we take the average—the arithmetic mean. Gauss proved 200 years ago that the sample mean is always the best estimator for a measurement subject to normally distributed random errors.”

“Actually, he proved just the opposite,” says another hardhatted and hardheaded savant. “He started by assuming that the mean is the most probable value, and then he invented the normal distribution as a way of ensuring that this rule will hold.”

“Whatever. But we’ve come a long ways since 1805. We know that the mean is an admissible estimator. Even without assuming a normal distribution, the sample mean is the estimator that minimizes the sum of the squared errors.”

“But who says the sum of the squared errors is the function we want to optimize? It’s just one of many possibilities. And it gives undue influence to the extremes of the distribution. In this case, the presence of that peculiar-looking eight-and-an-eighth value pulls the mean down to 55.875. Is that really where we should saw the board?”

“That 8.125 is obviously an outlier. Somebody was reading the wrong end of the tape measure. Excluding that bogus value, the mean is 63.833.”

“If you’re going to be picking and choosing which data points to trust, what about the one at the upper right? I’m not even sure I can read it: 64 and seven-eighths? And somebody seems to have crossed it out. Maybe we should drop that one, too.”

“And 64 is the only other item that isn’t circled. That must mean something.”

Another direction is suggested: “Instead of Gauss’s sum of the squared errors, we could adopt the criterion of Laplace, the sum of the absolute errors. With this choice, the favored estimator is the median rather than the mean. The median of our data is 63.625. And the median is much less sensitive to outliers and strangely shaped distributions. Whether we include or exclude the eight-and-an-eighth measurement makes only a minor difference.”

“What makes you all so sure we’re seeing several attempts to measure the same quantity? I think we actually have three distinct sets of measurements here, which just happen to be scribbled on the same piece of wood. The eight-and-an-eighth is clearly on its own. The two uncircled measurements form another set. And then we have four circled values all clustering around 63-and-something. If we want to simultaneously optimize the least-squares error for all three sets, we should be using a James-Stein estimator, which shrinks the average of each set toward the overall average.”

At this point a Bayesian is heard from. Others mention maximum likelihood, Pitman’s measure of closeness, minimum variance, the method of moments….

The conversation does not end here, but the rest is lost in the whine of the power saw. The carpenter has cut off the plank somewhere out beyond 64 inches and explains this choice as follows: Cutting long may mean cutting twice, but cutting short means buying twice.

*       *       *

One lesson you might draw from this little farce and fable is that if you have a hard decision to make, you should call a carpenter rather than a statistician. But that’s not the conclusion I intended.

You sometimes get the impression that statistics is a dry and lifeless discipline, where all the interesting questions were answered long ago, and all that remains now is to memorize some formulas and learn when to apply them. I think not!

Problems in statistics don’t get much simpler than this one. It concerns a small set of observations, with one variable in one dimension and one parameter to be estimated. It’s a problem that would have been perfectly familiar to Gauss and Laplace, Legendre and Adrain. And yet there’s still room for doubt and controversy about how best to approach such questions.

I found the plywood puzzle challenging enough that I was led to do some reading. Most of it is well above my grade level, and so I can’t claim to have absorbed everything the authors have to say. But I’ll offer a few pointers in case anyone else wants to follow along:

  • Colin R. Blyth (1951) directly confronts the Norm Abram question: How do you decide when to stop measuring and start cutting? I gather that this paper was a major landmark in estimation theory. R. H. Farrell (1964) follows up on related themes. (A number of other papers could be mentioned in the same context; I draw attention to these two because they are freely available online through Cornell’s Project Euclid.)
  • There’s an “Introduction to Estimation Theory” by Don Johnson of Rice at the Connexions web site. The context is signal processing, but there’s plenty of use to carpenters.
  • For the history of statistics, Stephen Stigler is always the place to start. His article on “Gauss and the Invention of Least Squares” is chapter 17 in Statistics on the Table (Harvard University Press, 1999). The original 1981 version from Annals of Statistics is online here through Project Euclid.
  • For a gentle introduction to the James-Stein estimator, I recommend a Scientific American article by Bradley Efron and Carl Morris, “Stein’s Paradox in Statistics” (Vol. 236 No. 5, May 1977, pp. 119–127). (Disclaimer: I was the editor of that article.)
  • Finally, at the moment I’m halfway through Pitman’s Measure of Closeness: A Comparison of Statistical Estimators, by Jerome P. Keating, Robert L. Mason and Pranab K. Sen (SIAM, 1993). I really don’t yet know what to make of this, but it has opened up a world I knew nothing about.