Archive for the ‘science’ Category

The acceleration of history

Tuesday, December 20th, 2011

Four hundred years ago, the idea that the Earth goes around the Sun rather than vice versa was not just a scientific breakthrough but also a cultural bombshell. People were asked to reimagine the world they were living in. Not everyone welcomed the opportunity. Books were burned. In the case of Giordano Bruno, an author was burned.

In the modern world, cosmological revolutions seem to cause hardly a ripple in public consciousness. Inflation, dark matter, dark energy—these ideas also call for a reimagining of the world we live in, but they have provoked very little fuss outside the community of science. It’s certainly a relief that no one will be burned at the stake over matters of cosmological doctrine. But are we really more liberal and open-minded, or just not paying attention?

Those are the final paragraphs of my new column in American Scientist. Here I want to say a few words more about the reception of these new ideas in cosmology, but first I should explain that the column is really about something else, namely the Bolshoi computer simulation of the large-scale structure of the universe, led by Joel Primack of UC Santa Cruz and Anatoly Klypin of New Mexico State University.

While preparing to write the column, I picked up Marcia Bartusiak’s recent book The Day We Found the Universe, which tells the story of the discovery that the “nebulae” we see in the sky are actually distant galaxies much like our own—what Kant called “island universes.” It’s a grand story, and Bartusiak gives a splendid account of it, with engaging portraits of the dozen or so principal players. Highly recommended.

I’m not going to retell the whole story here, but I want to point out that it took 175 years for the idea of island universes to be accepted by astronomers. The earliest known proposal was by Thomas Wright in 1750; Bartusiak’s story culminates on January 1, 1925, when Edwin Hubble’s paper “Cepheids in Spiral Nebulae” was read to a joint session of the American Astronomical Society and the American Association for the Advancement of Science. In between, there was a great deal of backing and forthing. For example, William Herschel, the preeminent observational astronomer of the 18th century, initially supported the island-universe theory, but later he changed his mind. As late as 1900 many astronomers believed the nebulae were relatively small, nearby objects, perhaps protostars about to condense. It took new instruments and a barrelful of observational evidence to overturn this view. (Specifically: telescopes that could resolve individual stars in distant galaxies, better spectroscopes, better photographic film, the understanding of redshifts, the discovery of a relation between period and luminosity in the stars called Cepheid variables.)

I find it wholly unsurprising that people might need a century or two to digest such a major shift in how we view the universe around us. What’s remarkable is that lately the pace of change has accelerated, and nobody seems to be having much trouble keeping up.

Consider what’s happened in cosmology in the 80-some years since Hubble’s revelation. There was the battle between the steady-state and the big-bang models, which can be traced back to the 1920s and 30s and which was finally resolved in the 1960s with the discovery of the cosmic background radiation. Then there’s “dark matter.” Fritz Zwicky pointed out in the 1930s that the dynamics of galaxies imply there’s a lot more mass out there than we’re seeing, and this discrepancy became more troubling with later observations. By the 1980s or 90s most astronomers had accepted the remarkable conclusion that we don’t know what the universe is made of; all of the familiar “baryonic” matter of stars and planets is a minority constituent; the bulk of the mass is some unidentified stuff that Primack dubbed cold dark matter.

Even weirder (if that’s possible) is the notion of cosmic inflation: In a period of 10⁻³⁶ second, the universe expanded by a factor of 10⁷⁸. The inflationary hypothesis was first put forward in 1980, was tweaked a bit later in that decade, and was soon swallowed whole by the cosmological community (with the exception of a very few skeptics).

Finally comes “dark energy,” the force that’s causing the cosmic expansion to accelerate. It’s well known that this concept goes back to the early years of general relativity, with Einstein’s cosmological constant Λ. But Einstein soon disavowed the idea, and it remained moribund until about 15 years ago, when two groups of astronomers found direct observational evidence that the expansion is indeed accelerating. The resurrection of Λ was so quick and total that this year’s Nobel prize in physics was awarded for this work.

I find it astonishing and disquieting to live in a universe that’s so very different from the one I was born into. We already had external galaxies in my childhood, and Fred Hoyle and George Gamow were sparring over the big-bang/steady-state issue. But I grew up with no inkling of dark matter, dark energy or cosmic inflation. Now it turns out that most of the universe disappeared over the event horizon in the inflationary era, a fraction of a second after it all began, and long before any of us had a chance to see what we were missing. Of what’s left, less than 1 percent is the kind of matter we know and love—and nobody has a very good idea what the rest of all that stuff might be.

Given the contentious history of earlier innovations in cosmology (starting, of course, with the post-Copernican civil war), I would have expected more controversy over these ideas. But the whole rapid-fire series of head-spinning revolutions seems to have been accepted rather placidly, both within astronomy and by the wider scientific community. Why so little resistance? Is the evidence so compelling as to overwhelm all opposition? Or, on the contrary, have we become so complacently accepting of what experts tell us to believe that we’ve lost all independent judgment?

In a telephone conversation I asked Primack how he would explain the lack of controversy. He broadened the scope of the question, pointing out that when you consider the public at large, rather than the scientific community, the issue is not uncritical acceptance but rather ignorance and indifference. A population that doubts Darwinian evolution and anthropogenic climate change is not too easily convinced by evidence or cowed by authority. If no one has risen up to denounce the teaching of dark matter and dark energy in the public schools, it’s simply because they are unaware of those ideas. I think Primack is right about this, but I don’t understand why questions about the basic nature of the universe—which once excited such passion—could now lie beneath the notice even of the most benighted citizens.

(By the way, the headline on this post is borrowed from my former boss, Gerard Piel, who published a book under that title. Now that Gerry is gone, I can confess that I never read the book, but I always liked the title.)

Snowdunes

Monday, February 28th, 2011

Several weeks ago, on the morning after the first winter storm here in the Boston area, I wrote about some peculiar snow geometry on porch railings. Now, following another storm (which I wish I could believe might be the last of the season), I have more puzzling snow shapes to present.

What gives rise to the quasi-periodic undulations in the snow filling my neighbors’ rain gutters?

Scallops0722

Note that on the roof above the gutter the snow depth appears to be perfectly uniform. The scalloped shapes appear only in the gutters.

Scallops0718

Elsewhere on the same house, the waves are less regular in period and smaller in amplitude, but they are still clearly present.

Scallops0710

On the opposite side of a different house I find more undulations, this time in the gutters draining a relatively flat porch roof.

All of the gutters shown above are oriented northwest-southeast, and were roughly parallel to the prevailing wind direction during the storm (although winds were very light). Gutters perpendicular to these revealed little or no evidence of wavelike disturbances. Waves were also absent from other linear surfaces, such as railings, regardless of orientation.

I’ve seen no sign of such wavelike formations in any previous snowfall this winter. To my eye, the waveforms are cusplike—cycloidal, perhaps, rather than sinusoidal. The snow was fairly heavy and sticky (note the coating on tree branches); temperatures were just a few degrees below freezing; total accumulation was about 10 cm.

Anyone care to propose a mechanism?

Update: After a welcome day of warm rain, the answer has slowly emerged from the wilting snow, and it has nothing to do with subtleties of fluid dynamics or particle deposition.

Hangers0732

In the photo above, peeking out from behind the last remnants of the melting snowdunes, are the metal spikes and straps that fasten the gutters to the fascia board. Their somewhat irregular spacing is a good match for the pattern of cusplike peaks. Of course! How could I have failed to think of that?

Why did the dunes appear only in this storm? Earlier snows were fluffier and deeper, and probably less affected by the hanger straps. Why only on certain sides of the houses? I’m not sure about that, but the one duneless gutter I’ve been able to examine closely has a different hanger system, without the obstructing straps.

 

How I didn’t invent the memristor

Monday, February 14th, 2011

The memristor is a new electronic component, an addition to the family of “passive” circuit elements—a family that for well over 150 years had just three members: the resistor, the capacitor and the inductor. Enthusiasts for memristor technology argue that it will bring higher-density, lower-cost, nonvolatile computer memory, with chips that hold terabits rather than mere gigabits. There’s also talk about “neuromorphic” computer architectures, where the memristor would play the role of a synapse in a nerve-like network. Of course these wonders are still in the future. While you’re waiting, you could visit the memristor web site, where you could buy a memristor tee shirt. Or you could read more about memristors in my new “Computing Science” column in American Scientist.

Will the memristor turn out to be a transformative technology, the key to putting hundreds of trillions of devices in the palm of your hand? Or will we be asking, a few years from now, “Whatever happened to the memristor?”

You won’t find an answer to that question in the column. You won’t find the answer here, either. Instead of trying to predict the future, I’m going to tell a story from the remote past, having to do with a different electronic device, the thermistor. There’s a connection with memristors, although it may take a while to emerge.

•     •     •

Once upon a time I was a nerd with a soldering iron who knew the resistor color code by heart. I had an older friend, Dave, an electrical engineer, whose confidence in my ability was inspiring and empowering even though it was also misplaced. Dave’s company had built an oven for drying the ink on silk-screened fabrics. The customer wanted to monitor the temperature of the cloth as it moved through the oven. Dave gave me the contract to design and build a suitable instrument.

I considered doing it with thermocouples, which I had played with in a physics class. But thermocouples measure the temperature difference between two points, and I had no way to provide a stable reference temperature for the cold junction. Besides, thermocouples produce signals in the millivolt range, and I worried about stray currents in an environment where electric heating elements were drawing about 20 kilowatts. So I turned to a different temperature-sensing device, the thermistor, which has a history going back to Faraday but still seemed to be slightly exotic in the 1960s.

A thermistor is a resistor with an unusually large temperature coefficient—that is, the electrical resistance changes dramatically with temperature. Furthermore, in all the thermistors available in those days, the coefficient was negative, meaning that the resistance decreased as the device was heated. (This is the reverse of the relation in most materials.)

bridge circuit with thermistor and three fixed resistors

I designed a bridge circuit something like the sketch at right, where the T labels the thermistor and the other resistances are ordinary, fixed resistors. Component values are lost to memory, but I think the supply voltage was 12 or 18 volts and the voltmeter in the middle of the bridge read 5 or 10 volts full scale. The baseline thermistor resistance was a few hundred ohms, dropping from maybe 500 ohms to 250 over the temperature range of interest. I chose the fixed resistance values so that the bridge would balance somewhere below room temperature and produce full deflection of the voltmeter at about 300 degrees Fahrenheit.
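For readers who want the arithmetic of such a bridge, here is a minimal Python sketch. The topology is an assumption (thermistor in the lower arm of one voltage divider, a fixed reference divider on the other side), and every component value is hypothetical, since the real ones are lost to memory.

```python
def bridge_voltage(supply_v, r_top_left, thermistor_ohm, r_top_right, r_bottom_right):
    """Meter reading across an assumed Wheatstone-style bridge, in volts.

    Left divider: r_top_left above the thermistor.
    Right divider: r_top_right above r_bottom_right (the fixed reference).
    """
    v_thermistor_side = supply_v * thermistor_ohm / (r_top_left + thermistor_ohm)
    v_reference_side = supply_v * r_bottom_right / (r_top_right + r_bottom_right)
    return v_reference_side - v_thermistor_side


# Hypothetical values: 12 V supply, bridge balanced when the thermistor is 600 ohms.
SUPPLY_V = 12.0
R_TOP = 100.0
R_REFERENCE = 600.0

for thermistor_ohm in (600.0, 500.0, 250.0):
    reading = bridge_voltage(SUPPLY_V, R_TOP, thermistor_ohm, R_TOP, R_REFERENCE)
    print(f"thermistor {thermistor_ohm:5.0f} ohm -> meter {reading:.3f} V")
```

With these made-up values the meter reads zero at balance and climbs as the thermistor warms and its resistance falls, which is the behavior the design was after.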

The components were ordered by mail from the Allied or the Lafayette Electronics catalog, whose long lists of part numbers and tables of specifications I drooled over in those days. When the package of hardware arrived, I wired up the shiny new parts. Amazingly, it worked. I did a two-point calibration by immersing the thermistor in an ice bath and in boiling water, and I drew a custom scale for the meter. I also lavished attention on a logotype identifying the manufacturer of the instrument; I think I created it with leftover decals from a model-airplane kit.

When it came time to install the gadget, all went well at first. We got a room-temperature reading, then we turned the oven on and watched the needle move slowly across the scale. But later I began to notice a troubling anomaly. The needle kept rising even when the fabric temperature should have stabilized. A few more experiments showed that even with the oven cold, the indicated temperature would very slowly climb to 100 degrees or so, then accelerate on the way up to 150 degrees, and after 10 minutes would be at the top of the scale.

It was a wrenching moment when I realized what was the matter. In designing the bridge circuit I had successfully worked out various applications of Ohm’s Law: E = IR. But I had forgotten all about W = I²R, which defines the power dissipated by current flowing through a resistance. The 10 or 20 milliamps flowing through the thermistor were enough to heat it, thereby lowering its resistance and increasing the current further, bringing still more heating and more current, in a positive feedback loop. I had not observed this behavior in the calibration runs because the water bath efficiently carried away the heat. But the self-heating effect rendered the whole apparatus useless in the fabric-drying oven.
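The feedback loop can be sketched numerically. What follows is an illustration under stated assumptions, not a reconstruction of the original instrument: the thermistor follows the common beta model, it sits in series with one fixed resistor across the supply, heat leaves in proportion to the temperature rise, and every constant (beta, heat capacity, the two dissipation constants) is hypothetical.

```python
import math


def ntc_resistance(temp_k, r0=500.0, t0=298.15, beta=3500.0):
    """Beta-model NTC thermistor: resistance falls as temperature rises."""
    return r0 * math.exp(beta * (1.0 / temp_k - 1.0 / t0))


def simulate_self_heating(dissipation_w_per_k, supply_v=12.0, series_ohm=100.0,
                          ambient_k=298.15, heat_capacity_j_per_k=0.5,
                          duration_s=600.0, dt=0.1):
    """Euler integration of C dT/dt = I^2 R(T) - k (T - T_ambient)."""
    temp = ambient_k
    for _ in range(round(duration_s / dt)):
        r = ntc_resistance(temp)
        current = supply_v / (series_ohm + r)      # Ohm's law for the series pair
        power = current ** 2 * r                   # W = I^2 R, heat generated
        loss = dissipation_w_per_k * (temp - ambient_k)
        temp += dt * (power - loss) / heat_capacity_j_per_k
    return temp


AMBIENT_K = 298.15
# Hypothetical dissipation constants: still air sheds heat poorly, water well.
rise_in_air = simulate_self_heating(0.002) - AMBIENT_K
rise_in_water = simulate_self_heating(0.2) - AMBIENT_K
print(f"rise after 10 minutes in air:   {rise_in_air:.1f} K")
print(f"rise after 10 minutes in water: {rise_in_water:.1f} K")
```

In this model the starting current is 20 milliamps, and the heating grows as the resistance falls only while the thermistor's resistance exceeds the series resistance; past that point the power declines and the temperature levels off. The water-bath case stays within a degree or two of ambient, which is why a calibration run would not reveal the problem.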

•     •     •

In retrospect, I can look upon this phenomenon and say: How interesting! What a pretty demonstration of nonlinear dynamics. Maybe one could even find a way to put the nonlinear effect to good use, for example by creating a bistable system with low- and high-resistance attractors.

A year or two before these events, a comprehensive review article, “Theory and Application of Self-Heated Thermistors,” by M. Sapoff and R. M. Oppenheim, had appeared in Proceedings of the IEEE (Vol. 51, October 1963, pp. 1292–1305), which set forth exactly these possibilities. If I had read and understood that paper, I would have saved myself a lot of bother. But I knew nothing of the article at the time; indeed I learned of it only a few weeks ago, when I found it cited in various works on memristors.

The foundation paper on the memristor itself appeared just a few years after my mortifying experience with the fabric-drying oven. The article is: “Memristor—the missing circuit element,” by Leon O. Chua, IEEE Transactions on Circuit Theory Vol. 18, September 1971, pp. 507–519 (available online through a link here). Chua, then and now at UC Berkeley, conceived of the memristor and gave it a name almost 40 years before a hardware implementation was discovered and understood in 2008 by R. Stanley Williams and several colleagues at Hewlett-Packard. In the interim, Chua described several memristor-like circuit elements, the first of them being the thermistor.

A memristor is a special kind of nonlinear, variable resistor—one whose resistance at any given moment depends on the entire history of currents and voltages it has experienced in the past. Under conditions where self-heating is important, a thermistor has this very property. As Chua put it in a 1976 paper: “[A] thermistor is in fact not a memoryless temperature-dependent linear resistor—as is usually assumed to be the case—but rather a first-order time-invariant current-controlled memristive one-port.” If only I had known.

So, in the light of history, I return to that teenage kid with good soldering skills but a somewhat shaky grasp of circuit theory. Was he on the verge of inventing the memristor ahead of Chua? No, sadly, not at all. This was not a near miss. At the time, I saw nothing in the outcome of my experiment but a humiliating failure. Looking back, though, I’m inclined to go easy on myself for that missed opportunity. I did manage to understand where I’d gone wrong; I just couldn’t see how to turn my goof into a gold mine.

By the way, the silk-screening outfit did eventually get their temperature monitor. Dave found an off-the-shelf solution, which turned out to work just like my design, but with higher resistances and lower currents, ducking below the thermistor’s region of nonlinear self-heating.

Sic transit

Thursday, April 30th, 2009

SciAm-cover-April-1949.jpg

The tattered magazine shown above was on newsstands 60 years ago. You could have bought a copy for 50 cents—which wasn’t cheap at the time. The article featured on the cover, “Mathematical Machines,” surveys the whole topic of computational technology, from a primer on binary arithmetic to breathless reports on the hottest new machines—the Edvac, the Univac, Project Whirlwind. Other articles in the same issue include “Submarine Canyons,” “Greek Astronomy,” “Titanium: A New Metal” and “The Evolution of Sex.”

The creators of this ambitious magazine were Dennis Flanagan and Gerard Piel, two young writers who had become friends while working at Life magazine. In 1947 they set out on their own, planning to launch a wholly new magazine about science. When they learned that Scientific American was about to fold up after more than a century of publication, they bought it and made it over. By the time of that April 1949 issue, the new magazine had already found the formula and the format that would define it for a generation of readers.

One important part of the formula was a bit of an accident. Flanagan and Piel had expected to hire professional writers to produce the main articles, but they couldn’t find enough of them. So they began inviting scientists to tell their own story, in collaboration with an editor. This shortcut would soon become a key element of the magazine’s identity and its claim to credibility.

I grew up reading Scientific American, though I was not a subscriber. I discovered a disreputable shop on Race Street in Philadelphia that sold out-of-date issues for a dime each. These were copies with their covers torn off; years later I learned why. Newsstands are entitled to return unsold copies of magazines for full credit, but shipping all that paper back to the publisher was expensive and wasteful. So the dealers sent back only the covers and promised to pulp the rest of the magazine. Some of the discarded copies never made it to the recycling yard.

My adolescent reading of all those gray-market magazines was probably my main qualification when I joined the staff of Scientific American in 1973. I spent a decade there. I’ve done lots of other stuff since then, but the years with Flanagan and Piel were the great formative experience of my working life; in my own mind, I’ll always be a former editor of Scientific American. And I still have a deep affection for the magazine, even though I dislike what’s been done to it in recent years—so much so that I find it painful to read.

What went wrong? I think the genre of this story is tragedy: a noble enterprise undone by its own success. By the 1980s the magazine was prosperous enough to attract the attention of predatory investors. Their reasoning went something like this: If those eggheads can sell half a million copies and a hundred pages of ads with a magazine crammed full of physics and mathematics and molecular biology, just imagine how well we can do if we get rid of all that boring science. In 1986, under pressure from stockholders, the company was sold to Verlagsgruppe Georg von Holtzbrinck—by no means the worst of the suitors, but the terms of the purchase left the magazine desperate for new revenue. And the collateral damage was even sadder: Flanagan and Piel parted ways and never spoke again.

Over this past weekend I learned that Scientific American is going through another rough patch and will likely see further changes. (Sources: The New York Times, Folio, Portfolio.) Ad pages are down 18 percent. John Rennie, the editor since 1994, has “taken the opportunity… to find other engaging things to do.” At least 20 staff members will lose their jobs. The magazine will be uprooted from its own offices and will move in with the parent organization, Nature Publishing Group (another Holtzbrinck acquisition). Rumor has it that the new editorial direction will be even more consumer-oriented, focusing on science that’s “useful in everyday life.” The old commitment to articles “written by the scientists who did the work described” is under question.

I’ve been stewing over all this for a few days—or maybe it’s been a few decades. In any case, I’m tired of stewing; it’s time to move on.

At a personal level, I commiserate with those who are going to be hurt most—the staff members (including some old friends) whose jobs are in jeopardy. But in journalism and publishing these days we’re all sailing in leaky tubs and bailing as fast as we can.

When I take a step back and look at the larger cultural significance of these events, I think first of how important Scientific American was in my own upbringing and education. It’s my reflex to ask: How will kids like me get turned on to science without access to those coverless bootleg magazines at a dime apiece? But what a ridiculous question! The world has changed. The resources available today to a curious 14-year-old are far superior to anything I could have imagined back in the sixties. The little shop on Race Street was a treasure chest in its time, but it can’t compare with Wikipedia and the arXiv and PLoS and Weisstein’s World of Mathematics and Sloane’s Sequence Server and all the other bounty that’s now just a click away. Indeed, that’s a big part of the problem that faces Scientific American, and other publications.

We did good work at Scientific American, back in the old days. I’m proud to have been a part of it. But what we were building was not a monument to be preserved for the perpetual admiration of future generations. It was a channel of communication, a way of linking scientists with a broader audience. I believe it’s still important to keep those channels open, but the best way of doing it may be different in 2009 than it was in 1949.

On page 1 of the April 1949 issue of Scientific American is an advertisement from the Radio Corporation of America announcing—with considerable fanfare—the company’s latest innovation for audiophiles: the 45-rpm vinyl record. Today, the replacement of the replacement of the replacement of that recording medium is teetering on the edge of obsolescence. And yet music goes on! I want to believe that science journalism can also make a transition to a new medium, and perhaps come out of it better than ever.

Survey on computing in the sciences

Friday, October 17th, 2008

Do you create software for scientific computing, or use such software in doing research? Then my friend Greg Wilson would like to hear from you. Together with colleagues from the University of Toronto, Simula Research Laboratory and the National Research Council of Canada, he is conducting a survey on practices in scientific computing. Greg plans to report the results next year in an American Scientist article.

Life Curves

Sunday, August 24th, 2008

J. John Sepkoski, Jr., was a fossil-hunter who did most of his digging in the library, sifting through the literature of paleontology to build a detailed, quantitative timeline of life on earth. Focusing on marine animals, he recorded the earliest and the latest known appearances of thousands of ancient organisms. The final edition of his compendium, published in 2002 (three years after his death at age 50), lists dates for more than 36,000 genera.

A few years ago I had a chance to get closely acquainted with Sepkoski’s compendium, when I needed a machine-readable version of the timeline. The listings were published on CD-ROM (remember those?), but the files were merely unstructured plain text. I needed something I could compute with, and so I spent a week or two reformatting the records and importing them into a database. (Others have done the same thing. Shanan Peters of the University of Wisconsin–Madison maintains an online version.)

Here is the summary graph that was the goal of my data-conversion project; it shows the number of extant genera as a function of time, according to Sepkoski’s tally of comings and goings:

Spekoski.png
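The tally behind a curve of this kind is simple once the records are machine-readable. A minimal Python sketch, with invented genus names and dates standing in for the real compendium:

```python
# Hypothetical records: (genus, first appearance, last appearance),
# with ages in millions of years before the present.
ranges = [
    ("Alphagenus", 500, 440),
    ("Betagenus", 480, 250),
    ("Gammagenus", 300, 0),
    ("Deltagenus", 100, 0),
]


def extant_genera(ranges, age_ma):
    """Count genera whose known range spans the given age."""
    return sum(1 for _, first_ma, last_ma in ranges if first_ma >= age_ma >= last_ma)


# Sample the curve every 50 million years, oldest first.
curve = [(age, extant_genera(ranges, age)) for age in range(500, -1, -50)]
for age, count in curve:
    print(f"{age:3d} Ma: {count} genera")
```

The sketch treats a genus as alive at every moment between its first and last known appearance, which is the assumption built into a compilation of ranges.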

My brief hands-on experience with Sepkoski’s compilation gave me a sense of how much care went into its preparation. Getting any large data collection into a computer tends to be a fiddly process. Irregularities that a human reader would hardly notice are sand in the gears of automated text processing. Sepkoski’s data files caused less trouble than I expected. The problems I encountered were mainly trivial typographic anomalies—missing punctuation, erratic spacing—and even those were surprisingly rare. The only hints of potentially meaningful errors were a dozen pairs of duplicated entries, where the same genus appeared twice in the listings. It’s easy to see how that would happen in a project that went on for almost three decades; indeed, it’s amazing there weren’t more duplicates.

In any case, I came away from this project with great respect for Sepkoski’s accomplishment, but that doesn’t mean that the curve reproduced above represents the final word on the history of life. It’s not even clear that the main features of the curve and its overall shape give an accurate portrait of changes in global biodiversity.

In constructing any such historical time series, certain biases and distortions are hard to overcome. Of particular importance in this case, fossils from more recent intervals are more likely to survive and to be discovered than those from more ancient times. This “pull of the recent” effect raises questions about the steep upward trend that dominates the Sepkoski curve from the Cretaceous to the present. Has evolution really been going crazy with innovation throughout the past 150 million years, or is that hockey-stick curve an artifact of preservational and sampling bias?

A newly completed analysis of another big fossil database addresses this question (and others). The data source for the new analysis is the Paleobiology Database, a large collaborative project coordinated by John Alroy of the University of California–Santa Barbara. The Paleobiology Database might be called a metacompilation: It brings together statistical and descriptive information from thousands of more-specialized fossil collections (83,444 at the latest count). Initial work on the database began a decade ago (Sepkoski was an early contributor), but it has shown a recent growth spurt.

Of course the new database is vulnerable to the same kinds of systematic bias that Sepkoski had to confront. There’s no avoiding the fact that, on the whole, younger geological strata are more accessible and better studied, and younger fossils are better preserved. But by organizing the data differently and retaining more information about each taxonomic group, Alroy and his colleagues see an opportunity to correct or compensate for some of the biases. Of particular note, whereas Sepkoski recorded only the first and last known appearance of each genus, Alroy et al. attempt to keep track of every occurrence of an organism. This extra information allows sampling bias to be estimated and corrected.

Consider these hypothetical fossil records, where each dot represents a single occurrence of a fossil organism in one of nine labeled intervals:

Alroy.png

In both cases Sepkoski’s protocol would merely indicate that the taxonomic group originated in period 3 and became extinct in or after period 8. The new database records each time unit in which the fossil was found and, whenever possible, the number of occurrences per interval. This data might seem like superfluous detail. After all, if an organism was alive in periods 3 and 8, we can safely infer that it must have existed in periods 4, 5, 6 and 7 as well, whether or not fossil evidence has come to light. But it turns out that recording occurrences rather than just chronological ranges allows for some helpful statistical magic.
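The difference between the two bookkeeping schemes can be shown with a pair of made-up records. In this Python sketch the occurrence counts are hypothetical (they are not read off the figure); the point is only that two quite different records collapse to the same first-and-last range.

```python
from collections import Counter

# Hypothetical occurrence counts, keyed by interval number (1 through 9).
sparse_record = Counter({3: 1, 8: 1})
dense_record = Counter({3: 2, 4: 5, 5: 3, 6: 4, 7: 1, 8: 2})


def first_last(record):
    """Range-only summary: earliest and latest interval with any occurrence."""
    intervals = [interval for interval, count in record.items() if count > 0]
    return min(intervals), max(intervals)


def unsampled_gaps(record):
    """Intervals inside the range where presence is inferred, not observed."""
    first, last = first_last(record)
    return [interval for interval in range(first, last + 1) if record[interval] == 0]


for name, record in (("sparse", sparse_record), ("dense", dense_record)):
    print(name, first_last(record), unsampled_gaps(record))
```

Both records reduce to the range (3, 8), but only the occurrence-level version reveals that one of them rests on two finds with four inferred intervals in between.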

As I understand it, the scheme works something like this. Suppose we could gather together all the fossils ever collected by paleontologists, and sort them into bins according to age. Because of the various sampling and preservational biases, the bins for fairly recent periods (say 50 million years ago, in the Tertiary) would be much fuller than the bins for earlier times (say 400 million years ago, in the Devonian). Any bin with more specimens would be likely to exhibit more diversity as well, simply because rare organisms have a better chance of showing up at least once in a larger sample. But we can control for this bias through a simple subsampling procedure: Draw a fixed number of specimens from each bin, making each selection at random and with replacement. The counts of genera in the subsamples should reflect the true diversity of the biota in each bin.
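Here is a minimal Python sketch of that idealized procedure. It is not the method of Alroy and his colleagues, only the fixed-quota draw described above; the two bins are hypothetical and are built so that the same imagined biota is represented by roughly 1,000 specimens in one bin and roughly 100 in the other.

```python
import random


def subsampled_richness(specimens, quota, trials=200, seed=0):
    """Mean number of distinct genera in `quota` draws made with replacement."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        draw = rng.choices(specimens, k=quota)   # random, with replacement
        total += len(set(draw))
    return total / trials


common = [f"common_{i}" for i in range(10)]
rare = [f"rare_{i}" for i in range(10)]

# Hypothetical bins: each list entry is one specimen, labeled by genus.
recent_bin = [genus for genus in common for _ in range(100)] + rare
ancient_bin = [genus for genus in common for _ in range(10)] + rare[:1]

for name, specimens in (("recent", recent_bin), ("ancient", ancient_bin)):
    raw = len(set(specimens))
    sub = subsampled_richness(specimens, quota=50)
    print(f"{name:8s} specimens={len(specimens):5d} raw genera={raw:2d} subsampled={sub:.2f}")
```

The raw counts differ (20 genera against 11) only because the larger bin had more chances to catch a rare genus. Drawing the same quota from each bin brings the two estimates close together.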

In practice it gets more complicated than that, because we can’t actually sample the entire fossil record at the level of individual specimens; the best we can do is to randomly choose collections of fossils or the publications that describe them. And the publications vary greatly in how much quantitative data they include; some are just lists of species observed.

After many adjustments, refinements and calibrations, Alroy and 34 co-authors have published a diversity curve based on the subsampling technique:

Alroy.png

(Graph courtesy of John Alroy.)

Their article (subscription required) appeared last month in Science, along with 67 pages of supplementary material.

The Sepkoski and the Alroy graphs are twins separated at birth—widely separated. The overall upward trend still exists in the newer graph, but it is much less dramatic, especially in the past 100 million years. Some of the famous mass-extinction events, such as those at the end of the Permian (P) and at the end of the Cretaceous (K), are visible in the new graph but are altered in character; instead of a sudden crash after a sustained build-up, we see something more like a return to normal after a brief, sharp spike in diversity. (Alroy elaborates on the dynamics of mass extinctions in a second recent article, this one in PNAS.)

Looking at the two curves, I arrive at this question: How is the interested but nonexpert reader to evaluate these contrasting views of our planetary past? I want to emphasize that the question animating me is not “Who is right?” but “How can we know who is right?” Is there some way that the ordinary, scientifically literate outsider can form a reasoned judgment about such competing claims to truth?

It was questions like these that got me in trouble the last time I wandered into this area. In 2005 Richard A. Muller of the Lawrence Berkeley National Laboratory and Robert A. Rohde, a graduate student at UC Berkeley, published a report in Nature claiming to detect periodic cycles of rising and falling diversity in the Sepkoski data. Applying Fourier analysis to the time series, they reported finding a strong signal at a period of 62 million years and a weaker one at 140 million years. The claim was controversial from the start, and I decided to take a do-it-yourself approach to understanding the issue. I went back to the original data, reimplemented the analytic methods and tried to assess the robustness of the conclusion. I told the story in an American Scientist column.

The column pleased no one. It certainly didn’t please Muller and Rohde, who objected that I was out of my depth in my amateur attempt to replicate their work. It didn’t please the critics of the Muller-Rohde hypothesis, who thought my focus on certain narrow technical issues deflected attention from deeper conceptual flaws in the argument. And it didn’t please me, because I agreed with the criticisms from both sides.

I should also mention that my column had zero impact on the controversy, which not only continues to rage but has also been extended to the new database. Alroy writes in the PNAS article that some of the peaks and valleys forming the supposed cycles fail to materialize in the new data set. On the other hand, a preprint from Adrian L. Melott of the University of Kansas argues that cycles with periods of 62 and 150 million years emerge from the Paleobiology Database with higher statistical significance than they had in the Sepkoski collection.

All in all, I think I’ll sit this one out. I’ve been itching to get my hands on some records from the new database and implement the subsampling algorithm (which sounds both intriguing and readily accessible). It would be fun to play with these ideas. But I’ll let someone else have the fun this time.

Science builds its credibility on the bedrock idea that experiments and other kinds of results are subject to independent confirmation or refutation. And the advent of computational science has made this egalitarian ideal much more practical than it used to be. Although experiments in high-energy physics remain beyond the means of most amateurs, anything done with a computer rather than a particle accelerator is pretty much fair game these days. Still, there are bounds. If every reader set out to replicate every experiment, the world wouldn’t make much progress.

On the spot

Saturday, May 24th, 2008
redspot.jpg

Wow. Jupiter has sprouted a third red spot. It was just two years ago that the Great Red Spot was joined by a smaller companion, which was quickly dubbed “Junior.” I guess the new red spot, discovered in the past few weeks, will have to be called “III.”

In the view above, from the Hubble Space Telescope, Junior is southwest of the Great Spot, and the new, smallest member of the family is due west of the big one and a little farther downwind. This is a false-color image, constructed by assigning colors to monochromatic images recorded at three wavelengths, but the intent is to correctly render colors as perceived by the human eye. Evidently none of the spots are really red at the moment. If they were all newly discovered right now, we would have the Great Peach of Jupiter and the Two Little Apricots.

When I get beyond merely admiring the glorious, painterly spectacle of this Jello-chiffon dessert in the sky, what fascinates me most is the time scale of the red spot phenomenon. The Great Spot has been there for at least a century or two, and probably much longer. It is a storm, with rapid counterclockwise circulation clearly visible in the time-lapse photos returned by the Voyager I spacecraft in 1979.

Storms are something we can relate to from our earthling experience; we have cyclones here too. But what kind of storm lasts for hundreds of years? Even allowing for the larger spatial scale of events on Jupiter, the Great Spot seems extraordinarily long-lived. The rotation period is roughly one earth-week, which means the spot has survived for something on the order of 10,000 revolutions. And it is geographically stable, too: Although the spot drifts in longitude, it seems to be pinned in latitude, hovering at a swirling boundary between easterly and westerly wind belts.

Very likely, the key to the Great Spot’s longevity is that Jupiter has no continents or other surface irregularities to disrupt the flow of the atmosphere. But that fact makes the uniqueness of the spot somewhat mysterious. If such features can arise spontaneously, purely from the dynamics of the atmospheric flow, like a pearl created without any need for a grain of sand, then why is there just one red spot? You’d think that such storms would develop from time to time wherever conditions were favorable.

And now we have our answer: There’s not just one red spot. But the question of time scales doesn’t entirely go away. It seems implausible that one storm would go on for centuries in lonely splendor, and then suddenly two more would evolve within a couple of years. Perhaps there have been others and we just didn’t notice? Not within the past 50 years, I think. Another possible explanation of this improbable coincidence is that the births of Junior and III are not independent events. All three storms are nearby (at least by Jovian standards) and are surely interacting. If that’s the case, we may not have seen the end of this sequence of events. Will there be more spots? Will they collide or coalesce? Stay tuned.

In the matter of time scales, I can’t help noting that Jupiter has a connection with another epochal event in the modern Internet era. In July of 1994 comet Shoemaker-Levy 9 crashed into Jupiter, and the world followed along via the web. The idea that anyone with a modem could download the images directly from JPL—no waiting for the news media—made quite an impression. The Netscape icon was the apotheosis of this event.

Links:

More third-spot images and explanations of how they were made, from Imre de Pater, UC Berkeley.

Reporting from Science Blog.

Reporting from New Scientist.

A report from the Philippine Daily Inquirer with some background on who first spotted the new spot.

The Wikipedia article on the Great Red Spot (which already has a note on the new one).

The temblor forecast

Tuesday, April 15th, 2008

From the Associated Press, via the New York Times:

LOS ANGELES (AP) — California faces an almost certain risk of being rocked by a strong earthquake by 2037, scientists said in the first statewide temblor forecast.

New calculations reveal there is a 99.7 percent chance a magnitude 6.7 quake or larger will strike in the next 30 years. The odds of such an event are higher in Southern California than Northern California, 97 percent versus 93 percent.

caquake.jpg

I read this report with a certain sense of wonder. What impressed me was not the prediction itself; it’s not the first time I’ve heard that the Big One is coming. What took me by surprise was the level of mathematical sophistication that we can now take for granted in readers of the morning newspaper. No more do we have to worry that people will add up 97 percent and 93 percent to get 190 percent. Evidently, we’ve reached a state of universal numeracy, where everyone knows how to combine probabilities, and there’s no need to explain the calculation. We don’t even need to remind anyone that when we compute 1 – (1 – p)(1 – q), or p + q – pq, we are assuming that p and q represent probabilities of statistically independent events; everybody knows that. And everybody understands that in this context “a chance of a quake” really means “a chance of at least one quake.”

I guess the only place where we might still stumble is in actually doing the arithmetic. My calculator tells me the number is 99.8 percent, not 99.7.
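
For anyone who wants to check the arithmetic, here is a short sketch. The variable names are mine, and the calculation assumes, as the formula does, that the northern and southern events are statistically independent.

```python
# Probabilities quoted in the news report for the 30-year window.
p_south = 0.97
p_north = 0.93

# Chance of at least one quake somewhere in the state, assuming independence.
p_either = 1 - (1 - p_south) * (1 - p_north)

# The expanded form p + q - pq gives the same number.
p_expanded = p_south + p_north - p_south * p_north

print(f"{100 * p_either:.2f} percent")  # 99.79 percent
```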

A further note: The original report on which the news item is based leaves me even more perplexed. The probability model adopted in the forecast is explained as follows:

The simplest assumption is that earthquakes occur randomly in time at a constant rate; i.e., they obey Poisson statistics. This model, which is used in constructing the national seismic hazard maps, is “time independent” in the sense that the probability of each earthquake rupture is completely independent of the timing of all others. Here we depart from the… conventions by considering “time-dependent” earthquake rupture forecasts that condition the event probabilities… on the date of the last major rupture. Such models… are motivated by the elastic rebound theory of the earthquake cycle…; they are based on stress-renewal models, in which probabilities drop immediately after a large earthquake releases tectonic stress on a fault and rise as the stress re-accumulates due to constant tectonic loading of the fault.

In other words, it doesn’t sound as though the assumption of independence is even approximately satisfied. I must be missing something. The 99.7 percent combined probability is mentioned in the executive summary of the report, but I found no explanation of how that number was calculated.

Perhaps I shouldn’t worry so much. I live thousands of kilometers away in a zone of seismic serenity.

Update, several hours later: After reading a little more carefully, I think the report does assume that all possible earthquake sites are independent. At each site the probability of an event is a function of time, but it is independent of probabilities at other sites. Thus calculating a joint probability for the northern and southern parts of the state does seem to be a valid operation. And the distinction between “exactly one” and “at least one” doesn’t really enter into the matter either. That’s because the model is only valid until the next major earthquake occurs; after that, all bets are off, since the time-dependent probabilities have to be recalculated.

If this interpretation of the model is correct, I think the way the result is expressed is somewhat misleading. To say there’s a 97 percent chance in Socal and a 93-percent chance in Nocal implies there’s a high probability (90.2 percent) of seeing both events in the course of the 30-year period. But the model is no longer valid after the first quake.

I wonder if there isn’t a better way to express the concept at the heart of this story. Qualitatively, it’s easy enough to grasp: In the next 30 years there will almost certainly be a major earthquake somewhere in California, and the event is more likely to happen in the southern part of the state than in the northern part. Putting this into numbers is somewhat tricky—or at least I’ve had a lot of trouble with it. Having finally surrendered to the computer and performed a Monte Carlo simulation, I come up with this statement: There’s a 99.8 percent chance that the next major California earthquake will happen by 2037. If indeed such a quake occurs, the odds are about 57 to 43 it will hit in Southern California.
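
A simulation along these lines can be sketched in a few lines of Python. This is not necessarily the model behind the numbers above, and it is certainly not the time-dependent model of the report: it assumes each region has a constant hazard rate (exponential waiting times), calibrated so that the 30-year probabilities come out to 97 and 93 percent, and it assumes the two regions are independent. The function name and parameters are hypothetical.

```python
import math
import random

def simulate(p_south=0.97, p_north=0.93, horizon=30.0, trials=200_000, seed=0):
    """Estimate (chance of a quake within `horizon` years,
    chance the first quake is in the south, given that one occurs)."""
    rng = random.Random(seed)
    # Constant-rate assumption: P(event by T) = 1 - exp(-rate * T).
    rate_south = -math.log(1 - p_south) / horizon
    rate_north = -math.log(1 - p_north) / horizon
    quakes = 0
    south_first = 0
    for _ in range(trials):
        t_south = rng.expovariate(rate_south)  # waiting time in the south
        t_north = rng.expovariate(rate_north)  # waiting time in the north
        # Only the first quake matters; after it the model is reset.
        if min(t_south, t_north) <= horizon:
            quakes += 1
            if t_south < t_north:
                south_first += 1
    return quakes / trials, south_first / quakes

p_quake, p_south_given_quake = simulate()
print(f"quake by end of window: {100 * p_quake:.1f} percent")
print(f"first quake in the south: {100 * p_south_given_quake:.0f} percent")
```

Under the constant-rate assumption the split can also be worked out by hand: it is ln(0.03) / (ln(0.03) + ln(0.07)), or about 0.569, which is consistent with odds of roughly 57 to 43.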

Working on the railroad

Saturday, February 10th, 2007

The March-April issue of American Scientist is now available on the Web; paper copies should be on their way soon. My column is about hump yards and turnouts and wyes—in other words, about algorithms for railroad workers. “Computing with locomotives and box cars takes a one-track mind.” There’s a small puzzle near the end of the column. You’re welcome to post comments, complaints and solutions here.

In the new issue I also recommend a “Macroscope” article on Avogadro’s number by Ronald F. Fox and Theodore P. Hill of Georgia Tech. For those who have forgotten their chemistry, Avogadro’s number is the number of molecules in a mole of a substance (an amount in grams numerically equal to the molecular weight). Specifically, N_A is defined as the number of carbon atoms in 12 grams of carbon-12, and its value is roughly 6.02 × 10^23. Fox and Hill suggest turning the definition upside-down: Instead of trying to count the atoms in a gram, define the gram as a certain number of atoms. They have a specific number to recommend: 602,214,141,070,409,084,099,072. I invite you to deduce what’s so special about this particular number and why they favor it over other candidates in the same range.

Running on empty

Friday, November 24th, 2006

Driving over the river and through the woods yesterday, I was running low on fuel. My car has two kinds of instruments to tell me that I’ll soon be standing by the side of the road feeling foolish. A conventional gas gauge shows the fraction of a tankful remaining, presumably based on readings from some sort of float mechanism inside the tank. The second instrument measures the rate of fuel flow to the engine, showing the result on a digital display that can be set to any of three modes, labeled “consumption,” “average consumption” and “range.” The two “consumption” modes are calibrated in miles per gallon; the “range” mode gives an estimate of the distance remaining until the tank is empty, in miles. When you’re nervous about whether or not you can make it to the next gas station, the range is clearly of interest.

But keeping an eye on the range estimate is also somewhat disconcerting. If the meter says you can last another 23 miles, and then you drive a mile, it seems reasonable to expect that the meter will report a remaining range of 22 miles. In fact it may well say 19 miles, or 23, or even 26. It’s particularly bizarre to see the range increase as you continue driving.

What’s going on here? It’s not hard to guess. The estimated range is simply the number of gallons remaining in the tank multiplied by the fuel-use rate in miles per gallon. Both measurements doubtless have some noise in them, but variations in the fuel flow rate are the major cause of fluctuations in the range estimate. For my car, the instantaneous fuel economy dips down below 10 miles per gallon under hard acceleration, and it appears to go well above 100 miles per gallon when coasting downhill. (The meter tops out at 99.9 mpg.) These variations could alter the range estimate by a factor of 10 or more. Strictly speaking, the fluctuating estimates are not wrong—they indicate the actual range if you were to continue driving exactly as you were at the moment of measurement—but some averaging or filtering would seem sensible.

In fact, I think the range readings I see on my dashboard instrument are smoothed to some extent. The number is updated at intervals of about 30 seconds, and it may reflect an average calculated over a somewhat longer period. The question I want to ask is this: What is the optimum averaging interval—optimal in the sense that it minimizes some measure of error in the estimates? I doubt there can be any definitive answer without making some assumptions about the nature of the fluctuations, but I have a heuristic proposal that seems pretty good to me.

Here’s how I was thinking about the problem during my Thanksgiving pilgrimage. Suppose you keep driving until the tank runs dry, all the while recording your distance and rate of fuel consumption, moment by moment. Retrospectively, then, it’s easy to determine the number that the range meter should have been displaying at any point during the trip: You just measure the distance backward from the point where the engine died, and by definition that’s the range remaining. But note that for every point along the route, this “retrodicted” range should be equal to the number of gallons left in the tank at that point multiplied by the fuel-use rate (in miles per gallon) averaged over the remainder of the distance. This fact suggests a perfect estimation strategy: You should always average the fuel-use rate over the remaining range. Unfortunately, the remaining range is exactly what you’re trying to calculate, so this algorithm is not very practical. But perhaps we can approximate it.

To reiterate: If the remaining range is n miles, then the ideal is to estimate this range by averaging the fuel consumption over the next n miles. We can’t quite do that, for two reasons. First, we don’t know what the average consumption will be in the n miles to come; it will depend on terrain, speed, traffic conditions, and many other imponderables. Second, we don’t even know what n is; that’s what we’re trying to estimate. But don’t despair. To cope with the first problem, we can choose some other interval of n miles as a surrogate for the n miles just ahead; the obvious choice is the n miles just behind us. As for the unknown quantity n, we calculate it iteratively. Make some initial guess r0 about the fuel consumption rate, perhaps using the long-term average since the car was manufactured. Multiply r0 by the gallons in the tank to get a first estimate n0 of the remaining range. Then take the average fuel consumption over the preceding n0 miles and again multiply by the number of gallons to get a better range estimate, n1. Only a few repetitions of this process ought to be needed to converge on a pretty good estimate of n. That estimate, of course, is what the dashboard meter will report.
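
Here is one way the iteration might look in code. It is a rough sketch rather than anything a real dashboard does: the function name, the mile-by-mile log of fuel economy, and the fixed number of iterations are all assumptions made for illustration. One detail worth noting is that the average economy over a stretch of road is total miles divided by total gallons, so the per-mile readings are combined as a harmonic mean and not a simple average.

```python
def estimate_range(mpg_history, gallons_left, initial_mpg, iterations=5):
    """Iteratively estimate the remaining range in miles.

    mpg_history[i] is the (positive) fuel economy over the i-th most
    recent mile, so index 0 is the mile just driven.
    """
    # First guess n0: long-term rate r0 times the fuel left in the tank.
    estimate = initial_mpg * gallons_left
    for _ in range(iterations):
        # Look back over as many miles as we currently think lie ahead,
        # limited by how much history has been recorded.
        window = max(1, min(len(mpg_history), round(estimate)))
        recent = mpg_history[:window]
        # Miles per gallon over the window = miles driven / gallons burned.
        gallons_used = sum(1.0 / mpg for mpg in recent)
        rate = window / gallons_used
        estimate = rate * gallons_left
    return estimate

# Hypothetical trip log: the last 10 miles at 40 mpg, the 90 before at 20 mpg.
history = [40.0] * 10 + [20.0] * 90
print(estimate_range(history, gallons_left=1.0, initial_mpg=30.0))
```

With this made-up log and one gallon left, the successive estimates run 30, 24, about 25.3, and then settle at about 25 miles, so a handful of iterations is enough here. Nothing guarantees convergence in general; a log with wild swings could make the estimate oscillate between two window sizes.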

With this scheme, as the estimated range gets smaller, it will also get more volatile, because the consumption rate will be averaged over a smaller interval. I argue that this tendency to wider fluctuations is not a failure of the algorithm. In the last few miles before the tank runs dry, the range really does depend sensitively on whether you’re descending a hill on the open highway or stopping and starting at a series of city traffic lights.

Something tells me I’m not the first person to think about this problem or the first to propose this solution. The same issues arise in lots of other contexts, such as predicting how long the battery will last in a laptop computer. If anyone has a plan that can beat mine, I’d be pleased to hear about it.

By the way, I arrived on time for Thanksgiving dinner, with a few drops left in the tank.