The Oracle of Wolfram

In a comment on my earlier note about Wolfram Alpha, Daniel Asimov takes me to task for failing to explain “what Wolfram Alpha is.” I’ll accept the criticism, but I have to add that the question he raises is a real toughie. What, indeed, is Wolfram Alpha? Much of the prerelease hype (e.g., CIO, ZDNet, Telegraph) suggested it was to be some kind of search engine—a “Google killer,” or else, as Steven Levy wrote, “more like the anti-Google.” Another common theme (Infotoday, Guardian) suggests that Alpha is a manifestation of the semantic web, “that thing that Sir Tim Berners-Lee has been banging on about.”

There have been lots of other attempts to answer the ontological question. Jonathan Zittrain (via the New York Times) calls Alpha a “computable almanac.” Larry Greenemeier, writing for Scientific American, says: “Think of it as Ask Jeeves with [a] PhD.” Stephen Wolfram, the creator of Alpha, tells Rudy Rucker: “If anything, you might call it a platonic search engine, unearthing eternal truths that may never have been written down before.” Yuri Alkin, in a blog called Connections, had the wit to present the question to Alpha itself: “Who are you?” he asked. The polite reply, worthy of HAL or Commander Data, was “I am a computational knowledge engine.”

After a few weeks of sporadic poking around with Alpha, I’m finally ready to take my own shot at answering the big question. If you ask me, Wolfram Alpha is an oracle. Not an oracle in the computer-science sense—a hypothetical black box that simplifies complexity analysis by always giving the correct answer for queries of a specific form. I mean an oracle in the Greek-mythology sense—a sibyl in a cave or a temple, whose responses to questions are often helpful but tend to be enigmatic and require careful interpretation. Sometimes you get just the answer you were looking for. Sometimes you get no answer at all. Sometimes the answer leaves you more perplexed than when you began.

All in all, perhaps it’s better to set aside the question of what Alpha is and ask what it can do.

It can do your homework (or your students’ homework):

Query: Limit (x^p)^(1/p) as p->0

Answer: \(\lim_{p \to 0}(x^p)^{1/p} = x\)

It can graph a function:

Query: plot sin(x)/x from -10 to 10



It makes a handy desk calculator:

Query: 12 choose 3

Answer: 220

Query: factor 8549176323

Answer: 3 × 127 × 22438783 (3 distinct factors)
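For readers who want to double-check the two calculator answers without consulting the oracle, here is a minimal Python sketch. The helper name `prime_factors` is my own invention, and plain trial division is simply the easiest method to write down; I make no claim that Alpha works this way.

```python
import math


def prime_factors(n):
    """Return the prime factors of n in ascending order, using trial division."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        # Whatever remains after dividing out all smaller primes is itself prime.
        factors.append(n)
    return factors


# Binomial coefficient "12 choose 3"
print(math.comb(12, 3))

# Factor the same integer that was submitted to Alpha
factors = prime_factors(8549176323)
print(factors)

# Sanity check: the factors multiply back to the original number
print(math.prod(factors) == 8549176323)
```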

It provides access to a rich trove of “curated” data:

Query: molecular weight vanadium dioxide

Answer: 82.9403 (grams per mole)

It offers links to “live” data, updated in real time, on topics such as the weather and financial markets:

Query: weather Buenos Aires



But the big payoff of a service like this lies in combining factual queries with mathematical or algorithmic analysis. Surely that’s what a “computational knowledge engine” should be good at, no? I’ve been trying hard to make Alpha perform in this way. So far I’ve found the process pretty frustrating.

Here’s a case study. Remembering an old story about Kansas being flatter than a pancake, I submitted the query, “flattest state in the U.S.” The response was another question: “Did you mean ‘fastest state in the U.S.?’”

Well, no, I didn’t mean that, but out of curiosity I clicked the link to learn which is the fastest state in the U.S. The reply, in its entirety, was this:


The oracle was in deep enigma mode. I decided to go back to the “flattest” question. Let me add that I hadn’t really expected my first query to work; a ranking of states by flatness is not something you’d find in an almanac (computable or otherwise), and indeed the concept of flatness has various possible definitions. I thought I could give Alpha some help by being more explicit.

Query: All US states maximum elevation - minimum elevation

Answer: Did you mean: US states maximum elevation minimum elevation

I wasn’t quite sure how to respond here, but it doesn’t cost anything to try, so I accepted Alpha’s rephrasing of the query. What I got back was not the answer I was looking for, but it was not entirely without interest:

Query: US states maximum elevation minimum elevation



The scatterplot of highest and lowest elevations by state tipped me off that the data I’m looking for are in the system somewhere. Indeed, one of those dots in the lower left corner, with both lowest and highest elevations near zero, is probably the answer to my question (at least if we define flatness as the difference between maximum and minimum elevation). But how to identify the dot? Or, for that matter, how to identify the conspicuous outlier—the one state with a minimum elevation well above 1,000 feet?

I allowed myself to be distracted by the latter question. There are a couple of obvious guesses for the state with the highest lowest elevation, so I tried one:

Query: Colorado minimum elevation

Answer: 3314 feet

Hmm. That’s not the outlier in the scatterplot; 3300 feet is well off the chart. That means at least one state was clipped from the graph. Another query makes this more obvious:

Query: US states minimum elevation



It appears that ranks 1 through 4 lie somewhere above the top edge of the graph. Is there some way to force Alpha to plot the complete data set, without arbitrary cropping? For some kinds of plotting, I’ve figured out how to control the range of the independent variable (see the command “plot sin(x)/x from -10 to 10” above), but in this context I’ve not discovered the key, if there is one. And, as far as I can tell, there is no warning given when a plot is chopped.

Nevertheless, I was able to identify the four missing states. Accompanying the rank-order graph above was a helpful list, whose first entries were: Colorado 3314, Wyoming 3100, New Mexico 2844, and Utah 2001. The highest visible dot in the graph represents the fifth state in the sequence—Montana, with a minimum elevation of 1801 feet. The list gave the first five states and the last five in the ranking. This looked promising. If I could get a complete list of minimum elevations for all the states, and then the corresponding list of maximum elevations, perhaps Alpha could also give me the differences. I would ask it to alphabetize both lists, then subtract them element by element, and finally take the minimum of the result, or else sort again according to magnitude.
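For comparison, here is roughly what that plan amounts to in an ordinary programming language. This is only a sketch in Python: the three state names and all the numbers are placeholders I made up for illustration, not real elevation data, and the variable names are my own.

```python
# Placeholder values in feet for three hypothetical states. NOT real data.
min_elev = {"Cedar": 480, "Aspen": 3300, "Birch": 0}
max_elev = {"Birch": 350, "Cedar": 6700, "Aspen": 14400}

# Step 1: alphabetize, so that both lists line up state by state.
states = sorted(min_elev)
mins = [min_elev[s] for s in states]
maxs = [max_elev[s] for s in states]

# Step 2: subtract the lists element by element.
relief = [hi - lo for hi, lo in zip(maxs, mins)]

# Step 3: sort by magnitude; the first entry is the "flattest" state
# under the max-minus-min definition of flatness.
ranked = sorted(zip(relief, states))
flattest_relief, flattest = ranked[0]

print(states)
print(relief)
print(flattest, flattest_relief)
```

Each step feeds its output to the next, which is exactly the kind of composition the plan depends on.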

A button next to the truncated list of states promised “More.” I pressed it. Now I had the first 10 and the last 10 states, but I was still missing the 30 in the middle. Something else had changed as well: All the numbers were different, with the list of elevations beginning 1010, 945, 867. After a moment’s perplexity, I realized that Alpha had decided to shift from feet to meters. No matter. Three more presses of the “More” button finally got me a complete list of minimum state elevations (in meters). And the same rigmarole soon produced the analogous list of maximum elevations (again in meters).
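The unit switch is easy to confirm by hand. A short Python check, assuming only the standard definition of the international foot as 0.3048 meter, takes the first three minimum elevations from the original list and rounds them to whole meters:

```python
FEET_TO_METERS = 0.3048  # exact, by definition of the international foot

# First three minimum elevations from Alpha's list, in feet
minimum_elevations_ft = [3314, 3100, 2844]

# Convert and round to the nearest whole meter
minimum_elevations_m = [round(ft * FEET_TO_METERS) for ft in minimum_elevations_ft]

print(minimum_elevations_m)  # [1010, 945, 867]
```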

But now I was stumped. How do I sort the list alphabetically? Can I subtract one list from another? Can I do anything to transform the output of a command? Is there any way to compose commands, so that the output of one routine becomes the input of another? Not a clue.

But perhaps I could do it the other way, slicing the salami crossways instead of longitudinally. Instead of compiling a list of maxes and a list of mins and then subtracting, I could subtract lowest point from highest point state by state and then list the results. Searching through various help files and lists of examples, I eventually came to a page on “Elevation Data,” with a subcategory “Minimum and Maximum Elevations.” And there, at the bottom of the page, was this suggested query: “Montana maximum elevation - minimum elevation.” Clicking on it gave me the result “11,007 feet.” So I could get the elevation range for a single state. All that remained was to persuade Alpha to map the same computation over all the states….

But wait. That’s where this story began, with the query “All US states maximum elevation - minimum elevation.” It didn’t work when I tried it before, and it still doesn’t work now.

I tried some minor variations in phrasing and punctuation, such as this one:

Query: (US states maximum elevation) - (US states minimum elevation)

Answer: 4341 feet

What does the number 4341 mean? A “Show Details” button led to the explanation:


Instead of subtracting the vectors element by element, the program is taking the median of each elevation list and then subtracting. (If I had wanted to do that, I wouldn’t have known how to ask for it.)
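The distinction between the two readings of the query is easy to state in code. In this Python sketch the elevations are again invented placeholders for four hypothetical states, not real data. The first computation collapses each list to its median and subtracts, yielding a single number; the second subtracts state by state, which is what I was after.

```python
from statistics import median

# Placeholder values in feet for four hypothetical states. NOT real data.
max_elev = [14400, 350, 6700, 2000]
min_elev = [3300, 0, 480, 100]

# Reading 1 (what Alpha appears to do): median of each list, then subtract.
difference_of_medians = median(max_elev) - median(min_elev)

# Reading 2 (what I wanted): subtract element by element, one result per state.
elementwise = [hi - lo for hi, lo in zip(max_elev, min_elev)]

print(difference_of_medians)
print(elementwise)
```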

Finally, shown below in full detail is what came back after one further attempt to formulate the “flattest state” query:


Who asked about Albanian currency? I guess this is what the sibyl says when she’s tired of listening to all of my questions.

*   *   *

Wolfram Alpha is an ambitious project, as its makers would be the first to proclaim. Here’s what the “About” page tells us:

Wolfram|Alpha’s long-term goal is to make all systematic knowledge immediately computable and accessible to everyone. We aim to collect and curate all objective data; implement every known model, method, and algorithm; and make it possible to compute whatever can be computed about anything.

It’s hard to resist making fun of these lofty and all-encompassing aims, especially when a fairly simple geographic query returns a result expressed in units of Albanian Lek-feet. All the same, I still applaud the attempt to create such a service, and I hope that Stephen Wolfram and his colleagues achieve some reasonable fraction of their goals.

The main sticking point, it seems pretty obvious, is not in collecting and curating data or in formulating models, methods and algorithms. It’s the access part. How am I to communicate with the system? How am I to specify which bits of systematic knowledge I’d like to retrieve, and how do I tell Alpha which models, methods and algorithms to apply? For more than 50 years the answer to this question has generally been a programming language of some kind. The designers of Wolfram Alpha have deliberately turned their backs on that option, in favor of a natural-language interface. I’m sure they made this choice with the best of motives, in order to reach out to a wider audience that might be intimidated by formal notation. Unfortunately, the natural-language interface is so limited that we’re effectively left with no notation at all.

In a way, talking to Wolfram Alpha is rather like communicating in a natural language—a foreign language you don’t happen to speak. With grunts and gestures and a few stray nouns you may be able to get across the most rudimentary touristic needs—“Where toilet?” or “How much?”—but if you want to carry on a real conversation, you need more vocabulary and, most of all, you need grammar. I’m skeptical that Wolfram Alpha will ever be of much use without such a linguistic structure.

