Statistical error

Tom Siegfried, the editor of Science News, has published a blistering indictment of statistical methods in science and medicine. I am moved to speak for the defense.

Siegfried discusses a number of specific cases, mainly drawn from the biomedical literature, where faulty statistical reasoning has led to unreliable or erroneous conclusions. I don’t want to quibble over the particulars of those cases; I’ll concede that science provides plentiful examples of statistical analyses gone wrong. Indeed, I could add to Siegfried’s list. But I see these events mainly as failures to use the tools of statistics properly; Siegfried suggests that the problem goes deeper. If I understand him correctly, he believes that the tools themselves are defective and that science would be better off without statistics. Here is how he begins his essay:

For better or for worse, science has long been married to mathematics. Generally it has been for the better. Especially since the days of Galileo and Newton, math has nurtured science. Rigorous mathematical methods have secured science’s fidelity to fact and conferred a timeless reliability to its findings.

During the past century, though, a mutant form of math has deflected science’s heart from the modes of calculation that had long served so faithfully. Science was seduced by statistics, the math rooted in the same principles that guarantee profits for Las Vegas casinos. Supposedly, the proper use of statistics makes relying on scientific results a safe bet. But in practice, widespread misuse of statistical methods makes science more like a crapshoot.

It’s science’s dirtiest secret: The “scientific method” of testing hypotheses by statistical analysis stands on a flimsy foundation.

This argument strikes me as so totally wrong-headed that I have a hard time believing Siegfried is serious about it.

To begin with, the snide remarks about Las Vegas and crapshoots are off-target. The branch of mathematics with roots in the study of gambling is not statistics but the theory of probability. The two fields are closely allied, but they’re not identical. Early statistical ideas came out of astronomy and geodesy, with later developments in the social sciences, genetics and agriculture. If you really must find some vaguely disreputable locale for statistics, the apt choice is not the casino but the brewery (a notable Student of statistics worked for Guinness).

More disturbing than this minor historical flub is Siegfried’s vision of a lost golden age of “rigorous mathematical methods,” debased by the seductive wiles of statistics. I don’t believe there was any such fall from grace. Siegfried doesn’t tell us much about the nature of his prestatistical mathematical paradise, but since he mentions Galileo and Newton, I suppose he may be thinking of classical mechanics as an exemplar of lost innocence. It’s true that the study of planetary orbits and ballistic trajectories does offer up some pithy mathematical laws that purport to be exact descriptions of nature:

[Image: the canonical equations of classical mechanics, such as F = ma and the inverse-square law of gravitation, F = Gm₁m₂/r²]

We don’t usually attach error bars to these expressions, or hedge our bets by saying “Force is equal to mass times acceleration within one standard deviation.” But where do such “exact” laws come from? When Galileo performed his experiments with balls rolling down an inclined plane, the measured data did not exactly conform to a parabolic trajectory. Likewise with Newton’s inverse-square law: No real-world observations precisely follow the form 1/r²–not unless the experiment has been fudged. Making the leap from experimental data to mathematical law requires a process of statistical inference, where we extract some plausible model from the data and attribute any departures from the model to measurement error.
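To make that leap concrete, here is a minimal sketch of my own (not from the original essay; every number in it is invented) of the kind of inference involved: simulate noisy measurements from an inclined-plane experiment, fit the quadratic law of uniformly accelerated motion to them, and treat whatever is left over as measurement error.

```python
import numpy as np

# Hypothetical inclined-plane experiment (all numbers invented for illustration):
# the underlying law is d = 0.5 * a * t**2, but each measurement carries error.
rng = np.random.default_rng(0)
a_true = 1.6                                  # assumed acceleration along the plane, m/s^2
t = np.linspace(0.5, 4.0, 15)                 # measurement times, seconds
d_meas = 0.5 * a_true * t**2 + rng.normal(scale=0.02, size=t.size)

# Infer the law: fit the one-parameter model d = c * t**2 to the noisy data.
design = t[:, None] ** 2                      # column of t^2 values
c_hat, *_ = np.linalg.lstsq(design, d_meas, rcond=None)
a_hat = 2.0 * c_hat[0]

print(f"true acceleration:     {a_true:.3f} m/s^2")
print(f"inferred acceleration: {a_hat:.3f} m/s^2")
print(f"residual scatter:      {np.std(d_meas - c_hat[0] * t**2):.3f} m")
```

With enough measurements the fitted coefficient lands close to the “true” value, even though no individual data point obeys the law exactly.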

In the time of Galileo and Newton, tools for statistical inference were crude; by the time of Gauss, they were much sharper. In 1801 the newly discovered planetoid Ceres was observed for 41 days before it was lost in the glare of the sun. Astronomers hustled to predict where and when it would reappear in the sky. Among all the attempts, the clear winner was the prediction of Gauss, whose advantage in this competition was not so much superior astronomy as superior statistics. His secret was the method of least squares, which he later backed up with a comprehensive theory of measurement error, introducing the idea of the normal distribution.
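As a toy illustration of that error theory (again mine, not part of the original post, with invented numbers): when the errors in individual measurements follow a normal distribution, the least-squares estimate of a single quantity is simply the mean of the measurements, and its uncertainty shrinks in proportion to 1/√n.

```python
import numpy as np

# Toy version of Gaussian measurement-error theory (numbers are invented):
# measure one quantity n times with normally distributed errors, combine the
# measurements by least squares (for one unknown, that is just the mean), and
# watch the spread of the estimate shrink like sigma / sqrt(n).
rng = np.random.default_rng(1)
true_value = 2.77     # assumed value of the quantity being measured
sigma = 0.10          # assumed standard deviation of a single measurement

for n in (5, 50, 500):
    # 10,000 simulated campaigns of n measurements each
    estimates = rng.normal(true_value, sigma, size=(10_000, n)).mean(axis=1)
    print(f"n = {n:3d}: spread of estimate = {estimates.std():.4f}  "
          f"(theory: {sigma / np.sqrt(n):.4f})")
```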

Later still, statistics had a role in showing that the “exact” mathematics of Newtonian celestial mechanics is not exact after all. It took careful observations–and careful statistical analysis of those observations–to quantify a tiny anomalous precession in the perihelion of Mercury, explained by general relativity but not by classical gravitation.

Statistics is no “mutant form of math”; it’s the way that science answers the fundamental and inescapable question, “How do we know what is true?” I really can’t imagine how science could survive without statistics. What would replace it? Divination?

Siegfried complains that statistical tools offer no certainty–that when a result is reported as statistically significant at the 0.05 level (roughly the two-sigma level), there’s still a 1-in-20 chance that it’s a meaningless fluke. Quite so; that’s essentially what significance at the 0.05 level means. But the uncertainty is not some methodological malfunction. It reflects the true limits of our knowledge. The strength of statistical reasoning is that it makes those limits explicit.
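That 1-in-20 figure is easy to check by simulation (a sketch of my own, not from the essay, and it assumes SciPy is available for the test itself): generate experiments in which the null hypothesis is true by construction and count how often a standard two-sample t test nonetheless crosses the 0.05 threshold.

```python
import numpy as np
from scipy import stats

# Simulate experiments in which there is genuinely no effect: both groups are
# drawn from the same distribution, so every "significant" result is a fluke.
rng = np.random.default_rng(2)
n_experiments = 20_000
false_positives = 0

for _ in range(n_experiments):
    a = rng.normal(0.0, 1.0, size=30)
    b = rng.normal(0.0, 1.0, size=30)      # same distribution as a: null is true
    result = stats.ttest_ind(a, b)
    if result.pvalue < 0.05:
        false_positives += 1

print(f"fraction of null experiments significant at 0.05: "
      f"{false_positives / n_experiments:.3f}")    # comes out close to 0.05
```

Roughly one run in twenty comes out “significant” even though nothing is there, which is exactly what the threshold promises.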

Again, I’ll readily agree that standards of statistical practice should be strengthened, and that weak or faulty conclusions are too common in some areas of the published literature. But the claim that “any single scientific study alone is quite likely to be incorrect, thanks largely to the fact that the standard statistical system for drawing conclusions is, in essence, illogical” is, in essence, illogical. Yes, we have lies, damn lies and statistics. But we also have lies and damn lies about statistics.
