bit-player | An amateur's outlook on computation and mathematics

This morning I am enjoying the benefits of jet lead. My watch says it’s 7:30, but the hotel-room clock reads 4:30, so I have a few hours free to lie awake and solve the world’s problems. As a warmup exercise I’m doing mental arithmetic. I’m dividing integers, and trying to figure out how I do it.

I know what got me started on this particular line of predawn predaydreaming. On the flight out here I was reading a paper on division by Eric Allender of Rutgers. Then yesterday I was reminded again of division’s divisiveness when I ran into a friend, Robert L. Constable of Cornell (another refugee from the Eastern time zone, who’s probably awake now too). Constable said something I found provocative—which is a little odd, because he was repeating my own words to me, from an article I wrote a while ago. Paraphrasing Constable paraphrasing me: We should count ourselves lucky that multiplication is a polynomial-time operation.

In the article I went on to elaborate:

If you are given two prime numbers, it’s easy to multiply them, yielding a bigger number as the product. But trying to undo this process—to take the product and recover the two unknown factors—seems to be much more difficult. We have fast algorithms for multiplying but not for factoring. Why is that?

The contrast between multiplying and factoring is indeed fascinating, but maybe not quite as dramatic as that passage suggests. In the first place, we don’t really know that factoring is difficult. At the moment I don’t have an efficient, polynomial-time factoring program installed on my computer, but I can’t prove that you don’t have one. It’s not even true that we have no fast algorithms for factoring. We do know of one such algorithm; it’s just that you need a quantum computer to run it.

In any case, factoring isn’t how I’m passing my time this morning; I’m dividing. The two things are related, of course; either of them might claim to be the inverse of multiplication. (If you stop people on the street and ask, “What’s the opposite of multiplying?” few of them would reply “factoring.”) We can think of division as lazy factoring, or factoring with a head start. Dividing 40321 by 29 looks like an easier task than factoring 40321. But just how easy is it to divide? Specifically, what is the complexity class of division? Is it one of those tough exponential problems, or can we count on doing it in polynomial time, even in the worst case?

Here’s the most elementary division algorithm: Find x ÷ y by counting how many times you can subtract y from x. Stop counting when x is less than y; the count at that point becomes the quotient, and whatever is left of x is the remainder. In other words:

function divide(x,y)
   q = 0
   while x ≥ y 
      x = x - y
      q++
   return (q,x)   //quotient, remainder

The procedure requires exactly q subtraction steps, and so its time complexity is proportional to x / y. This sounds pretty good—a linear-time algorithm. Unfortunately, though, that’s not the proper way to measure the algorithm’s running time. The correct complexity is not a function of x or y or q but a function of the number of bits or digits in these quantities. And that means the repeated-subtraction algorithm is exponential; for an n-bit number, the running time is O(2ⁿ).

Of course no one really divides by repeated subtraction, even at five o’clock in the morning.

Like most of my contemporaries, I was exposed at a tender age to the algorithm known as long division, or guzzinta. I understand that this technique has now been dropped from the curriculum in many schools, to the disgust of educators who believe it’s immoral for youth to be spared any trauma their parents endured. Far be it from me to shirk my patriotic duty to torture children, but I’m actually glad that long division is being retired from the canon of skills that every educated person is expected to master. Part of my reason is that I never learned the trick very well myself, and I’m hoping to avoid embarrassment. But there’s also the fact that this is the first nontrivial algorithm most of us encounter in life, and it’s a really weird one. It makes for a rough and ragged introduction to the charms of algorithmic reasoning.

Here’s part of a worked example of long division from the Ask Dr. Math section of the Math Forum at Drexel University:

Let's divide 175 by 3:
      ______
    3 ) 175
  
We take the dividend (175) one digit at a time starting 
at the left. If we tried to divide 1 by 3, we would get zero, 
so let's skip that and use the first two digits, 17:
  
      ___5_
    3 ) 17
        15
        --
         2
  
What I did was to look through the multiplication 
table for 3 (in my mind), looking for the biggest multiple 
of 3 that is less than 17: 3, 6, 9, 12, 15, 18 - since 18 is 
too big, we use 3 x 5 = 15.

It’s the bit about looking through the multiplication table in your mind that I find unsettling. Note that what’s involved here is not just a matter of using the familiar 10-by-10 multiplication table “in reverse.” When you multiply decimal numbers, you take each digit of the multiplier and the multiplicand, find the corresponding column and row in the table, and, presto, there’s the partial product at the intersection. Reversing this process would mean using the divisor to select a row in the table, finding the dividend in that row, then reading off the index of the column to determine a digit of the quotient. But, as Dr. Math notes, the dividend is generally not present in the table; you have to search for “the biggest multiple less than….” No such searching (or guessing) step is needed with multiplication.

I have another gripe with the long-division algorithm: It mixes up operations on numbers with operations on digits—on the written representation of numbers. The various multiplication and subtraction steps that enter into the long-division procedure are well-defined operations in arithmetic, but when you come to that business about “bringing down the next digit,” you’re manipulating arbitrary symbols rather than acting on numerical values. How would you express the “bringing down” step in terms of +, – and × operations? If you think this is a frivolous quibble, try writing a computer program to implement the long-division algorithm. At some points in the program you’ll want numbers to be numbers—things you can add, subtract and multiply—but at other points it’s more convenient to represent numbers as lists of digits, which are easy to select and move around but not so easy to combine numerically.

* * *

Okay. So now it’s six o’clock local time, the horizon is brightening, and I seem to have persuaded myself that division of integers is so awkward as to be all but impossible. And yet…. I have just divided 40321 by 29, getting a quotient of 1390 and a remainder of 11. My algorithm—though I hesitate to glorify it with that term—seems to be a crude mishmash of long-division and Newton’s method. I make guesses, do some trial multiplication, and use the sign and magnitude of the error to suggest a better guess. It works, but it’s not pretty, and I wouldn’t want to be teaching it to impressionable youth. There’s a faint whiff of nondeterminism about it, as if I were guessing the answer rather than calculating it.

Why is division so hard?

At times I feel this is a dumb question, and that the universe doesn’t owe us an answer to such questions. Some things are just harder than others; that’s the way the world works. We can count ourselves lucky that multiplication has a polynomial-time algorithm; we have no right to expect that our luck will hold when we turn to division.

But then I get to thinking about physical systems that multiply and divide, where the two operations seem perfectly symmetrical. An example is an electrical circuit governed by Ohm’s law.

Image scarfed from Cornell University, Center for Nanoscale Systems, Institute for Physics Teachers

If we attach the terminals of a voltmeter to opposite ends of a resistor, we observe Ohm’s law in the form E = IR: Voltage is equal to the product of current and resistance. If we install an ammeter in series with the resistor, we see the effects of the law I = E/R: Current is equal to voltage divided by resistance. If you prefer pipes to wires, you can do the hydraulic version of the experiment, and there are also lots of mechanical schemes, where the same principle is illustrated with levers, gears, pulleys and such. In all of these systems, nature seems to divide just as easily as it multiplies. Why can’t we do the same with pencil and paper, or transistors, or neurons?

Much of theoretical computer science is given over to questions just like these. Given a certain model of computation, how many operations are needed to solve a problem? If we were given some extra resource (randomness, perhaps, or quantum superposition), would that make the job any easier? Can we reduce one problem to another and thereby solve both of them with the same method?

In the case of division, the answers are in. Specifically, they’re in the paper by Eric Allender that got me started on this strange morning’s meditation. The conclusions are simply stated. Forget about those whiffs of nondeterminism; division, along with the other three arithmetic operations, is a member in good standing of the familiar and friendly complexity class P, made up of deterministic polynomial-time problems. Indeed, all four operations belong to a subclass of P called L, which consists of just those problems that can be solved by a deterministic algorithm in polynomial time and with a logarithmic amount of memory (or, again, with memory proportional to the number of digits in the input). [Correction: As Ryan Williams and Jonathan Katz point out in the comments: a logspace solution has to fit in an amount of memory proportional to the logarithm of the number of digits.] Allender narrows down the complexity of division still further: It belongs to a class called DLOGTIME-uniform TC⁰. If you want to know just what that means, I must send you off to Allender’s paper (PDF).

Putting multiplication and division in the same complexity class doesn’t mean they’re exactly equal in difficulty. Complexity theory adopts the amusing practice of ignoring all constant factors and all terms other than the fastest-growing one, so that a running time of, say, 10⁶n² + 1000n is considered the same as n². But in the case of multiplication and division this approximation is reasonably close to reality.

Most of the results on the complexity of division are quite recent. People have been dividing for millennia, but Allender’s paper is titled “The Division Breakthroughs,” and it describes results published within the past decade. For most of the history of Homo calculensis, we didn’t know how good we had it.

Should I be surprised to wake up in the morning and find out that division isn’t as hard as it seems? On the one hand, it’s common sense that division can’t be really, really hard, because we do it all the time, getting answers promptly and reliably. On the other hand, division algorithms are so ugly we have to hide them from the children. I don’t know what to think. My mind is divided.

Update 2007-10-11. In the comments, Jonathan Katz asks a very good question: How is division done on real computers? I wish I could give a detailed answer. Here’s what little I know.

Modern general-purpose processors come with built-in hardware for floating-point division. The most popular method seems to be an algorithm called SRT division, after D. Sweeney of IBM, J. E. Robertson of the University of Illinois and T. D. Tocher of Imperial College London, who independently came up with variations on the same theme in the 1950s. The underlying idea is not fundamentally different from long division, but there are a couple of tricks and tweaks to speed the process. Although the numbers are represented in binary, the calculation is usually done in radix 4, so that the quotient is calculated two bits at a time. Moreover, the quotient digits are not 0, 1, 2, 3 but –2, –1, 0, 1, 2; this choice of digits has the advantage that the multiplication steps in the algorithm can be implemented as simple shifts and negations. But with five possible quotient digits in radix 4, there is some ambiguity or redundancy: Some numbers can be represented in more than one way. This has to be resolved, and the answer converted to conventional notation, at the end of the operation.

What I find particularly intriguing is that the “whiff of nondeterminism” I thought I smelled in schoolbook division also turns up here. David L. Harris, Stuart F. Oberman and Mark A. Horowitz explain:

The key idea of SRT division is to guess the quotient digit based on a few of the most significant divisor and partial remainder bits, rather than computing it exactly. As long as the guess is close enough, the algorithm can correct on subsequent steps using redundant quotient digits.

Conquering divide