I have a few questions for the complexity theorists among us.
Have you ever tried to explain to your grandmother why NP is named NP? Does she get it when you say that problems labeled NP-complete are the hardest problems in NP, but NP-hard problems might be harder, and not in NP?
My concern here is not with the difficulty of the concepts—there’s not much we can do about that—but with the notation and terminology. Do locutions like P#P and NISZK and (NP ∩ coNP)/poly roll trillingly off your tongue? How about EXP, EEXP, NEXP, PEXP and SUBEXP? And while we’re on the subject of EXP and friends, I’ve been wondering how to pronounce NEXPTIME. (I’m kind of hoping the “P” is silent, as in Pterodactyl.)
Presumably, the argot of complexity theory works well enough for communication among members of the club—those of you, say, who know that “IP” is neither “internet protocol” nor “intellectual property” but “interactive proof.” If you’d ever like to talk to the rest of the world, however, I think there might be room for improvement.
Let’s start with the term “complexity” itself. The root meaning of “complex” is “made up of many interconnected parts”; for example, the innards of a mechanical clock are complex. Is this the best word to describe the property of being hard to compute? To me, it seems the difficulty of an exponential algorithm is not the complexity of the task but the mere mind-numbing repetition of the same operations. An exponential heap of manure is no more complex than a polynomial heap of manure, it just takes longer to shovel it.
Another gripe: WHAT’S WITH ALL THE CAPITAL LETTERS? IS IT REALLY NECESSARY TO SHOUT ALL THE TIME?
But the thing that worries me most is a certain opacity in the names of complexity classes. According to the Complexity Zoo, there are currently 465 named classes, which is quite a menagerie. It’s enough that you can’t really expect people to keep track of meanings and relations without some internal cues from the structure of the names themselves.
There are hints of structure in the naming scheme, with terms assembled from stems, prefixes and suffixes, and combined according to quasi-grammatical rules. For example, the superscript in P#P indicates the action of an “oracle”: The class encompasses problems that can be solved in polynomial time with the help of a #P (“sharp-P” or “counting-P”) oracle. The prefix “co-” in “coNP” designates the complement of NP: If a “Does there exist…?” question is in NP, then the corresponding “Does there not exist…?” question is in coNP. These devices are applied consistently; as far as I can tell, superscripting and the “co-” prefix always mean the same thing. Unfortunately, other elements of the names are not so easy to parse. The letter P generally stands for “polynomial” (except where it’s “probabilistic”). N usually denotes “nondeterministic” (but NC is “Nick’s Class”). Likewise the prefix D is for “deterministic” (except that it’s usually omitted, and sometimes it means “difference” or “dynamical” instead). B stands for “bounded-error” (except that BH is “Boolean hierarchy” and “BPd(P)” is “Polynomial Size d-Times-Only Branching Program”). Q is for “quantum” (except “QH” is the “query hierarchy” and “QP” is “quasi-polynomial time”).
The sad truth is, the naming conventions for furniture at Ikea make for a more consistent language than those of complexity theory.
How did we get into this fix? Back in 1974, when complexity theory was still young and impressionable, Don Knuth published “A Terminological Proposal” in SIGACT News, the quarterly newsletter of what was then the ACM Special Interest Group on Automata and Computability Theory. (By the way, the issue in which Knuth’s letter appears, Vol. 6, No. 1, seems to be missing from the ACM Digital Library. Can anyone supply a copy for scanning?) Knuth had polled 30 of his friends, asking for advice on the best adjective to describe problems “at least as difficult to solve in polynomial time as those of the Cook-Karp class NP.” His own suggestions were “Herculean,” “formidable” and “arduous,” but none of those got strong support. The people he consulted had proposals of their own: “impractical,” “intractable,” “prodigious,” “exorbitant,” “Sisyphean” and several others. The idea I like best came from Shen Lin, who suggested the abbreviation PET:
PET stands for “probably exponential time”. But if this gets proved, PET stands for “provably exponential time”. And if the proof goes the other way, it stands for “previously exponential time”.
On the whole, I think it’s just as well that we are not now speaking of Herculean or Sisyphean or PET problems, but I haven’t much enthusiasm for the solution that was adopted, either:
The “winning” write-in vote is the term NP-hard, which was put forward mainly by several people at Bell Labs, after what I understand was considerable discussion. Similar if not identical proposals were made by Steve Cook, by Ron Rivest, and in earlier publications by Sakti Sahni. This term is intended for use together with another new one, NP-complete, which abbreviates ‘polynomial complete’ and is at the same time more exact.
Knuth endorses “NP-hard” and “NP-complete” primarily because they have precise definitions and thus meet the needs of the insiders, the “automata-theorists.” But he also goes on to address the issue of communicating with a wider audience, and his thoughts are worth quoting at some length:
I don’t consider it a major goal to invent completely descriptive terminology for every conceivably interesting question of the type considered here; the major goal is to have a good term to use for the masses, in the one case which experience shows is almost omnipresent. To say NP-hard actually smacks of being a little too technical for a mass audience, but it’s not so bad as to be unusable; fortunately the term does easily generalize to a lot of other cases considered by automata theorists, so it appears to be a good compromise. When more technicalities are introduced, there is less need for a universally accepted term or notation. Although it’s very interesting to consider, e.g., log-space reductions, I don’t think it’s necessary to come up with a special short name for them.
If I understand this passage correctly, Knuth is arguing that although “NP-complete” and “NP-hard” may be a bit abstruse for casual coffeeshop chatter, they are the only terms that nonspecialists will ever need to learn, and so they won’t be too much of a burden. But that’s not quite the way it has worked out.
I suppose that if I’m going to complain about somebody else’s notation, I ought to propose an alternative of my own that I believe to be better. I can’t do that, but I would like to point out that certain other fields of study offer models that might be emulated. The most obvious model is organic chemistry, where the naming problem is orders of magnitude larger than the collection of 465 items in the complexity zoo. The sesquipedalian names of organic molecules are not exactly pretty, but the notational system has the wonderful property that if you know the name of a thing, you also know its composition and structure. Wouldn’t it be grand to have the same kind of perspicuous scheme for complexity classes, so that just naming a class would tell you where it lies in the inclusion diagram?
A saving grace of complexity theorists everywhere is that they are not without a sense of humor when it comes to the obscurity of their own jargon. For example, there is a complexity class named PINC, which the Zoo translates as “Polynomial Ignorance of Names of Classes.” And you don’t want to miss Alan T. Sherman’s discourse (or descant) on superpolylogarithmic subexponential functions (.ps).
Finally, I should mention a discovery I’ve made at the end of Knuth’s terminological proposal. Everyone knows that settling the P = NP question will earn you a prize of $1 million from the Clay Mathematics Institute. I suspect that many are unaware of another prize offer, made more than 30 years ago. Knuth writes:
I’m willing to give a prize of one live turkey to the first person who proves that P = NP.
Note that this appears to be a one-way-only offer. Proving that P ≠ NP won’t win you the turkey. (You have three days left until Thanksgiving.)