The short arm of coincidence

James Tanton tosses off number theory problems the way John D. Rockefeller handed out dimes. I wrote about one of Tanton’s problems back in January. Then a few weeks ago this tweet about factorials and squares snagged my attention, and it hasn’t let go:

Tweet reads: 4!+1 = 25, a square number. 5!+1 = 121, a square number. Another example? Two more examples?

With pencil and paper it’s easy to show that \(6!\) doesn’t work. The factorial of \(6\) is \(1 \times 2 \times 3 \times 4 \times 5 \times 6 = 720\); adding \(1\) brings us to \(721\), which is not a square. (It factors as \(7 \times 103\).) On the other hand, \(7!\) is \(5040\), and adding \(1\) yields \(5041\), which is equal to \(71^2\). This makes for a very cute equation:

\[7! + 1 = 71^2.\]

Continuing on, you can establish that \(8! + 1\), \(9! +1\) and \(10! + 1\) are not square numbers. But to extend the search much further, we need mechanized assistance. Here’s a Julia function that does the obvious thing, generating successive factorials and checking each one to see if it is \(1\) less than a perfect square:

function search_fac_sqr(maxn)
  fac = big(1)                      # bigints needed for n > 20
  for n in 1:maxn
    fac *= n                        # incremental factorial
    r = isqrt(fac + 1)              # floor of sqrt
    if r * r == fac + 1
      println(n, "! + 1 = ", r, "^2 = ", r^2)
  println("That's all folks!")

With this tool in hand, let’s check out \(n! + 1\) for all \(n\) between \(1\) and \(100\). Here’s what the program reports:

4! + 1 = 5^2 = 25
5! + 1 = 11^2 = 121
7! + 1 = 71^2 = 5041
That's all folks!

Those are the three cases we’ve already discovered with pencil and paper—and no more are listed. In other words, among all values of \(n! + 1\) up to \(n = 100\), only \(n = 4\), \(n = 5\), and \(n = 7\) yield squares. When I continued the search up to \(n = 1{,}000\), I got exactly the same result: no more squares. Likewise \(n = 10{,}000\) and \(n = 100{,}000\). Allow me to mention that the factorial of \(100{,}000\) is a rather large number, with \(456{,}574\) decimal digits. At this point in the search, I began to grow weary; furthermore, I began to lose hope. When \(99{,}993\) successive values of \(n\) fail to produce a single square, it’s hard to sustain faith that success might be just around the corner. Nevertheless, I persisted. I got as far as \(n = 500{,}000\), which has \(2{,}632{,}341\) decimal digits. Not one more perfect square in the whole lot.

What can we learn from this evidence—or lack of evidence? Are 4, 5, and 7 the only values of \(n!\) that lie \(1\) short of a perfect square? Or are there more such cases somewhere out there along the number line, maybe just beyond my reach, waiting to be found? Could there be infinitely many? If so, where are they? If not, why not?

To my taste, the most satisfying way to resolve these questions would be to find some number-theoretical principle ensuring that \(n! + 1 \ne m^2\) for \(n \gt 7\). I have not discovered any such principle, but in a dreamy sort of way I can imagine what a proof might look like. Suppose we eliminate the “\(+1\)” part of the formula, and search for integers such that \(n! = m^2\). It turns out there is just one solution to this equation, with \(n = m = 1\). You needn’t bother lathering up your laptop in the quest for larger examples; there’s a simple proof they don’t exist. In any square number, all the prime factors must be present an even number of times, as in \(36 = 2 \times 2 \times 3\times 3\). In a factorial, at least one prime factor—the largest one—always appears just once. (If you’re not sure why, check out Bertrand’s postulate/Chebyshev’s theorem.)

Of course when we put the “\(+1\)” back into the formula, this whole line of reasoning falls to pieces. In general, the factorization of \(n!\) and of \(n! + 1\) are totally different. But maybe there’s some other property of \(n! + 1\) that conflicts with squareness. It might have something to do with congruence classes, or quadratic residues. From the definition of a factorial, we know that \(n!\) is divisible by all positive integers less than or equal to \(n\), which means that \(n! + 1\) cannot be divisible by any of those numbers (except \(1\)). This observation rules out certain kinds of squares, namely those that have small primes in their factorization. But for all \(n \gt 4\) the square root of \(n!\) greatly exceeds \(n\), so there’s plenty of room for larger factors, as in the case of \(7! + 1 = 71^2\).

Here’s another avenue that might be worth exploring. The decimal representation of any large factorial ends with a string of \(0\)s, formed as the products of \(5\)s and \(2\)s among the factors of the number. Thus \(n! + 1\) must look like

\[XXXXX \ldots XXXXX00000 \ldots 00001,\]

where \(X\) represents any decimal digit, and the trailing sequence of \(0\)s now ends with a single terminal \(1\). Can we figure out a way to prove that a number of this form is never a square? Well, if the final digit were anything other than \(1, 4,\) or \(9\), the proof would be easy, but lots of squares end in \(\ldots 01\), such as \(10{,}201 = 101^2\) and \(62{,}001 = 249^2\). If there’s some algebraic argument along these lines showing that \(n! + 1\) can’t be a square, it will have to be something subtler.

All of the above is make-believe mathematics. I have stirred up some ingredients that look like they might make a tasty confection, but I have no idea how to bake the cake. Perhaps someone else will supply the recipe. In the meantime, I want to entertain an alternative hypothesis: that nothing prevents \(n! + 1\) from being a square except improbability.

The pattern observed in the \(n! + 1 = m^2\) problem—a few matches among the smallest elements of the sequences, and then nothing more for many thousands of terms—is not unique to factorials and squares. Other pairs of sequences exhibit similar behavior. For example, I have tried matching factorials with triangular numbers. The triangulars, beginning \(1, 3, 6, 10, 15, 21, \ldots\), are defined by the formula \(T(m) = m(m + 1)/2\). If we look for factorials that are also triangular, we get \(1! = T(1) = 1\), then \(3! = T(3) = 6\), and finally \(5! = T(15) = 120\). No more examples appear through \(n = 100{,}000\).

What about factorials that are \(1\) less than a triangular, satisfying the equation \(n! + 1 = T(m)\)? I know of only one case: \(2! + 1 = 3\). Broadening the search a little, I found that \(n! + 4\) is triangular for \(n \in {2, 3, 4}\), again with no more hits up to \(100{,}000\).

For another experiment we can bring back the square numbers and swap out the factorials, replacing them with the ever-popular Fibonacci sequence, \(1, 1, 2, 3, 5, 8, 13, \ldots\), defined by the recurrence \(F(n) = F(n - 1) + F(n - 2)\), with \(F(1) = F(2) = 1\). It’s been known since the 1960s that \(1\) and \(144\) are the only positive integers that are both Fibonacci numbers and perfect squares. Looking for Fibonacci numbers that are \(1\) less than a square, I found that \(F(4) + 1 = 4\) and \(F(6) + 1 = 9\), with no other instances up to \(F(500{,}000)\).

We can do the same sort of thing with the Catalan numbers, \(1, 1, 2, 5, 14, 42, 132 \ldots\), another sequence with a huge fan club. I find no squares other than \(1\) among the Catalan numbers up to \(n = 100{,}000\); I don’t know if anyone has proved that none exist. A search for cases where \(C(n) + 1 = m^2\) also comes up empty, but there are a few low-lying matches for \(C(n) + k = m^2\) for \(k \in {2, 3, 4}\).

Finding similar behavior in all of these diverse sequences changes the complexion of the problem, in my view. If we discover some obscure, special property of \(n! + 1\) that explains why it never lands on a square (for large values of \(n\)), do we then have to invent another mechanism for Fibonacci numbers and still another for Catalan numbers? Isn’t it more plausible that some single, generic cause lies behind all the observations?

But the cause can’t be too generic. It’s not the case that you can take any two numeric sequences and expect to see the same kind of pattern in their intersections. Consider the factorials and the prime numbers. By the very nature of a factorial, none of them except 2! = 2 can possibly be prime, but there’s no obvious reason that \(n! + 1\) can’t be a prime. And, indeed, for \(n \le 100\) nine values of \(n! + 1\) are prime. Extending the search to \(n \le 1000\) turns up another seven. Here is the full set of known numbers for which \(n! + 1\) is prime:

\[1, 2, 3, 11, 27, 37, 41, 73, 77, 116, 154, 320, 340, 399, 427, 872, 1477, \\ 6380, 26951, 110059, 150209\]

They get rare as \(n\) increases, but there’s no hint of a sharp cutoff, as there is in the other cases explored above. Does the sequence continue indefinitely? That seems a reasonable conjecture. (For more on this sequence, including references, see Chris K. Caldwell’s factorial prime page.)

My question is this: Can we understand these curious patterns in terms of mere chance coincidence? The values of \(n! + 1\) form an infinite sequence of integers spread over the number line, dense near the origin but becoming extremely sparse as \(n\) increases. The values of \(m^2\) form another infinite sequence, again with diminishing density, although the dropoff is not as steep. Maybe factorials bump into squares among the smallest integers because there just aren’t enough of those integers to go around, and some of them have to do double duty. But in the vast open spaces out in the farther reaches of the number line, a factorial can wander around for years—maybe forever—and not meet a square.

Let me try to state this idea more precisely. Since \(n!\) cannot be a square, we know that it must lie somewhere between two square numbers; the arrangement on the number line is \((m - 1)^2 \lt n! \lt m^2\). The distance between the end points of this interval is \(m^2 - (m - 1)^2 = 2m - 1\). Now choose a number \(k\) at random from the interval, and ask whether \(n! + k = m^2\). Exactly one value of \(k\) must satisfy this condition, and so the probability of success is \(1/(2m - 1)\), or roughly \(1 / (2 \sqrt{n!})\). Because \(\sqrt{n!}\) increases very rapidly, this probability takes a nosedive toward zero as \(n\) increases. It is represented by the red curve in the graph below. Note that by \(n = 100\) the red curve has already reached \(10^{-80}\).

Plot of probability of coincidence of factorial and square, fibonacci and square, factorial and prime

The green curve gives the probability of a collision between Fibonacci numbers and squares; the shape is similar, though it dives off the precipice a little later. The Fibonacci-square curve approximates a negative exponential: The probability is proportional to \(\phi^{-\sqrt{F(n)}}\), where \(\phi = (\sqrt{5} + 1) / 2 \approx 1.618\). The factorial-square curve is even steeper because the factorial function is superexponential: \(n!\) grows faster than \(c^n\) for any fixed \(c\).

The blue curve, recording the probability of coincidences between factorials and primes, has a very different shape. In the neighborhood of \(n!\) the average distance between consecutive primes is approximately \(\log n!\), which grows just a little faster than \(n\) itself and very much slower than \(n!\). The probability of collision between factorials and primes is roughly \(1 / \log n!\). The continuous blue curve corresponds to this smooth approximation. The blue dots sprinkled near that line give the probability based on actual distances between consecutive primes.

What to make of those curves? Is it legitimate to apply probability theory to these totally deterministic sequences of numbers? I’m not quite sure. Before confronting the question directly, I’d like to retreat a few steps and look at a simpler model where probability is clearly entitled to a seat at the table.

Let us borrow one of Jacob Bernouilli’s famous urns, which have room to hold an infinite number of ping pong balls. Start with one black ball and one white ball in the urn, then reach in and take a ball at random. Clearly, the probability of choosing black is \(1/2\). Put the chosen ball back in the urn, and also add another white ball. Now there are three balls and only one is black, so the probability of drawing black is \(1/3\). Add a fourth ball, and the probability of black falls to \(1/4\). Continuing in this way, the probability of black on the \(n\)th draw must be \(\frac{1}{n + 1}\).

If we go on with this protocol forever—always choosing a ball at random, putting it back, and adding an extra white ball—what is the probability of eventually seeing the black ball at least once? It’s easier to answer the complement of this question, calculating the probability of never seeing the black ball. This is the infinite product \(\frac{1}{2} \times \frac{2}{3} \times\frac{3}{4} \times\frac{4}{5} \ldots\), or:

\[P(\textrm{never black}) = \prod_{n = 1}^{\infty} 1 - \frac{1}{n+1}\]

The product goes to zero as \(n\) goes to infinity. In other words, in an endless series of trials, the probability of never drawing black is \(0\), which means the probability of seeing black at least once must be \(1\). (“Probability \(1\)” is not exactly the same thing as “certain,” but it’s mighty close.)

Now let’s try a different experiment. Again start with one black ball and one white ball, but after the first draw-and-replace cycle add two white balls, then four white balls, and so on, so that the total number of balls in the urn at stage \(n\) is \(2^n\); throughout the process all of the balls but one are white. Now the probability of never seeing the black ball is \(\frac{1}{2} \times \frac{3}{4} \times\frac{7}{8} \times\frac{15}{16} \ldots\), or:

\[P(\textrm{never black}) = \prod_{n = 1}^{\infty} 1 - \frac{1}{2^n}\]

This product does not go to zero, no matter how large \(n\) becomes. Neither does it go to \(1\). The product converges to a constant with the approximate value \(0.288788095\). Strange, isn’t it? Even in an infinite series of draws from the urn, you can’t be sure whether the black ball will turn up or not.

These two urn experiments do not correspond directly to any of the sequence coincidence problems described above; they simply illustrate a range of possible outcomes. But we can rig up an urn process that mimics the probabilistic treatment of the factorials-and-squares problem. At the \(n\)th stage, the urn holds \(1 + 2 \sqrt{n!}\) balls, only one of which is black. The probability of never seeing the black ball, even in an infinite series of trials, is

\[\prod_{n = 1}^{\infty} 1 - \frac{1}{1 + 2 \sqrt{n!}}.\]

This expression converges to a value of approximately \(0.2921426977\). It follows that the probability of seeing black at least once is \(1 - 0.2921426977\), or \(0.7078573023\). (No, that number is not \(1/\sqrt{2}\), although it’s close.)

An urn process resembling the factorials-and-primes problem gives a somewhat different result. Here the number of balls in the urn at stage \(n\) is \(\log n!\), again with just one black ball. The infinite product governing the cumulative probability is

\[\prod_{n = 2}^{\infty} 1 - \frac{1}{\log n!}.\]

On numerical evidence this expression seems to dwindle away to zero as \(n\) goes to infinity (although I’m not \(100\) percent sure of that). If it does go to \(0\), then the complementary probability that the black ball will eventually appear must be \(1\).

Some of these results leave me feeling befuddled, and even a little grumpy. Call me old-fashioned, but I always thought that rolling the dice infinitely many times ought to be enough to settle beyond doubt whether a pattern appears or not. In the harsh light of eternity, I would have said, everything is either forbidden or mandatory; as \(n\) goes to infinity, probability goes to \(0\) or it goes to \(1\). But apparently that’s not so. In the factorial urn model the probability of never seeing a black ball is neither \(0\) nor \(1\) but lies somewhere in the neighborhood of \(0.2921426977\). What does that mean, exactly? How am I supposed to verify the number, or even check its first few digits? Running an infinite series of trials is not enough; you need to collect a statistically significant sample of infinite experiments. For an exact result, try an infinite series of infinite experiments. Sigh.

The urn model corresponds in a natural way to the randomized version of the factorial-square problem, where we look at \(n! + k = m^2\) and choose \(k\) at random from an appropriate range of values. But what about the original problem of \(n! + 1 = m^2\)? In this case there’s no random variable, and hence there’s no point in running multiple trials for each value of \(n\). The system is deterministic. For each \(n\) the factorial of \(n\) has a definite value, and either it is or it isn’t adjacent to a perfect square. There’s no maybe.

Nevertheless, there might be a way to sneak probabilities in through the back door. To do so we have to assume that factorials and squares form a kind of ergodic system, where observing one chain of events for a long period is equivalent to watching many shorter chains. Suppose that factorials and squares are uncorrelated in their positions on the number line—that when a factorial lands between two squares, its distance from the larger square can be treated as a random variable, with every possible distance being equally likely. If this assumption holds, then instead of looking at one value of \(n!\) and trying many random values of \(k\), we can adopt a single value of \(k\) (namely \(k = 1\)) and look at \(n!\) for many values of \(n\).

Is the ergodic assumption defensible? Not entirely. Some distances between \(n!\) and \(m^2\) are known to be more likely than others, and indeed some distances are impossible. However, the empirical evidence suggests that the deviations must be slight. The histogram below shows the distribution of distances between a factorial and the next larger square for the first \(100{,}000\) values of \(n!\). The distances have all been normalized to the range \((0, 1)\) and classified in \(100\) bins. There is no obvious sign of bias. Calculating the mean and standard deviation of the same \(100{,}000\) relative distances yields values within \(1\) percent of those expected for a uniform random distribution. (The expected values are \(\mu = 1/2\) and \(\sigma = 1/12\).)

relative distance of n! from m^2, in 100 bins, for n ranging from 2 to 100,000

If this probabilistic approach can be taken seriously, I can make some quantitative statements about the prospects for ever finding a large factorial adjacent to a perfect square. As mentioned above, the overall probability that one or more values of \(n! + 1\) are equal to squares is about \(0.7078573023\). Thus we should not be too surprised that three such cases are already known, namely the examples with \(n = 4, 5,\) and \(7\). Now we can apply the same method to calculate the probability of finding at least one more case with \(n \gt 7\). Let’s make the question more general: “Whether or not I have seen any squares among the first \(C\) values of \(n! + 1\), what are the chances I’ll see any thereafter?” To answer this question, we can just remove the first \(C\) elements from the infinite product:

\[\prod_{n = C+1}^{\infty} 1 - \frac{1}{1 + 2\sqrt{n!}}.\]

For \(C = 7\), the answer is about \(0.0037\). For \(C = 100\), it’s about \(5.7 \times 10^{-80}\). We are sliding down the steep slope of the red curve.

As a practical matter, further searching for another factorial-square couple does not look like a promising way to spend time and CPU cycles. The probability of success soon falls into the realm of ridiculously small numbers like \(10^{-1{,}000{,}000}\). And yet, from the mathematical point of view, the probability never vanishes. Removing a finite number of terms from the front of an infinite product cannot change its convergence properties. If the original product converged to a nonzero value, then so will the truncated version. Thus we have wandered into the canyon of maximal frustration, where there’s no realistic hope of finding the prize, but the probabilities tell us it still might exist.

I am going to close this shambling essay by considering one more example—a cautionary one. Suppose we apply probabilistic reasoning to the search for a cube that is \(1\) less than a square. If we were looking for exact matches between cubes and squares, we’d find plenty of them: They are the sixth powers: \(1, 64, 729, \ldots\). But integer solutions to the equation \(n^3 + 1 = m^2\) are not so abundant. One low-lying example is easy to find: \(2^3 + 1 = 3^2\), but after 8 and 9 where can we expect to see the next consecutive cube and square?

The probabilistic approach suggests there might be reason for optimism. Compared with factorials and Fibonaccis, cubes grow quite slowly; the rate is polynomial rather than exponential or superexponential. As a result, the probability of finding a cube at a given distance from a square falls off much less steeply than it does for \(n!\) or \(F(n)\). In the graph below, \(P(n^3 + k = m^2)\) is the orange curve.

Plot of probability of coincidence of factorial and square, fibonacci and square, factorial and prime, with added curve for cubes and squares

Note that the orange curve lies just below the blue one, which represents the probability that \(n!\) lies near a prime. The proximity of the two curves suggests that the two problems—factorials adjacent to primes, cubes adjacent to squares—might belong to the same class. We already know that factorial primes do seem to go on and on, perhaps endlessly. The analogy leads to a surmise: Maybe cube-square coincidences are also unbounded. If we keep looking, we’ll find lots more besides \(8\) and \(9\).

The surmise is utterly wrong. The problem has a long history. In 1844 Eugène Catalan conjectured that \(8\) and \(9\) are the only consecutive perfect powers among the integers; the conjecture was finally proved in 2004 by Preda Mihăilescu. For the special case of squares and cubes, Euler had already settled the matter in the 18th century. Thus, probabilities are beside the point.

All of the questions considered here belong to the category of Diophantine analysis—the study of equations whose solutions are required to be integers. It is a field notorious for problems that are easy to state but hard to solve. Catalan’s conjecture is one of the most famous examples, along with Fermat’s Last Theorem. When Diophantine problems are ultimately resolved, the proofs tend to be non-elementary, drawing on sophisticated tools from distant realms of mathematics—algebraic geometry in the proof of Fermat’s Last Theorem by Andrew Wiles and Richard Taylor, cyclotomic fields in Mihăilescu’s proof of the Catalan conjecture. As far as I know, probability theory has not played a central role in any such proof.

When I started wrestling with these questions a few weeks ago, I did not expect to discover a definitive solution. I’ve certainly fulfilled my expectations! As a matter of fact, in my own head the situation is more muddled now than it was at the outset. The realization that even an infinite series of experiments would not necessarily resolve some of the questions is deeply unsettling, and makes me wonder how much I really understand about probability theory. But that’s hardly unprecedented in mathematics. I suppose I’ll just have to get used to it.

Update: Thanks to a further tip from Tanton, I have learned that the problem has an extensive history, and also a name: Brocard’s problem, after Henri Brocard, who published on it in 1876 and 1885. Ramanujan mentioned it in 1913. Erdos conjectured there are no more solutions. Marius Overholt connected it with the abc conjecture. Bruce C. Berndt and William F. Galway established that there are no more solutions up \(10^9\). All this comes from the Wikipedia entry on Brocard’s problem. That article also mentions (but does not explain) that the solutions are called Brown numbers.

I have some more reading to do.

Posted in computing, featured, mathematics, problems and puzzles | 14 Comments

The uniqueness constraint

For the past few weeks the Sunday New York Times has been publishing a puzzle called Capsules, devised by Wei-Hwa Huang. Here are the instructions:

Place numbers in the grid so that each outlined region contains the numbers 1 to n, where n is the number of squares in the region. The same number can never touch itself, not even diagonally.

Here is a partially completed example:

Capsules puzzle with numbers entered in some of the cells.

The black, pre-printed numbers are the “givens,” supplied by the puzzle creator. I filled in the pencil-written numbers in a sequence of “forced” moves dictated by two simple rules:

  1. A number can be placed in a square if no other number is allowed there. For example, the three singleton squares in the bottom row must each hold a 1, and these squares are the obvious place to start solving the puzzle. After the 1s are written in, the square outlined in yellow in the diagram below can also be filled in; its neighbors forbid any number other than 3.
  2. A number can be placed in a square if the number has no other possible home within a region. The blue-outlined 1 in the diagram below was determined by this rule. There must be a 1 somewhere in the region, but none of the other squares can accommodate it.

The same partially completed puzzle grid, with two squares marked by blue and yellow outlines.

At this point in the solution process, with the grid in the state shown above, I was unable to find any other blank squares whose contents could be decided by following these two rules and no others. But I did spot a move based on a different kind of reasoning. Consider the two pairs of open squares marked in color:

The same puzzle, with two blank squares colored.

The salmon-pink squares must hold the numbers 2 and 5, but it’s not immediately clear which number goes in which square. Likewise the lime-green squares must hold 2 and 4, in one order or the other. I submit that the numbers must have the following arrangement:

A correct extension of the partial solution.

How do I justify that choice? Suppose the green 2 and 4 were transposed:

An incorrect extension of the partial solution, allowing an amiguity in two squares.

Then the pink 2 and 5 could be placed in either permutation, and no later moves elsewhere in the puzzle would ever resolve the ambiguity. This outcome is not acceptable if we assume the puzzle must have a unique solution. The uniqueness constraint might be expressed as a third rule:

  1. A number can be placed in a square if it is needed to prevent other squares from having multiple legal configurations.

I have vague qualms about this mode of puzzle-solving. It’s surely not cheating, but the third rule has a different character from the others. It exploits an assumed global property of the solution, rather than relying on local interactions. We are not making a choice because it is forced on us; we are choosing a cofiguration that will force a choice elsewhere.

In this particular puzzle it’s not actually necessary to apply the uniqueness constraint. There is at least one other pathway to a solution—which I’ll leave to you to find. Can we devise a puzzle that requires rule 3? I’m not quite sure the question is even well-formed. All constraint-satisfaction problems can be solved by a mindless brute-force algorithm: Just write in some numbers at random until you reach a contradiction, then backtrack. So if we want to force the solver to use a specific tool, we somehow have to outlaw that universal jackhammer.

The uniqueness constraint is not unique to the Capsules puzzle. I’ve encountered it often in kenkens, and occasionally in sudokus. I even have a sense of deja lu as I write this. I feel sure I’ve read a discussion of this very issue, somewhere in recent years, but I haven’t been able to lay hands on it. Pointers to precedents are welcome.

Addendum 2017-03-19: Jim Propp reminds me of his marvelous Self Referential Aptitude Test. The instructions begin:

The solution to the following puzzle is unique; in some cases the
knowledge that the solution is unique may actually give you a short-cut
to finding the answer to a particular question.

I completed the 20-question puzzle when SRAT first went public some years ago. This morning I found I was able to do it again with no diminution in enjoyment—or effort. I remembered none of the answers or the sequence of deductions needed to find them.

Highly recommended. And while you’re at it, check out Propp’s Mathematical Enchantments blog and his Twitter feed: @JimPropp.

Posted in problems and puzzles | 17 Comments

A Tantonalizing Problem

In times like these one craves distraction, or maybe anaesthesia. On the whole, mathematics is better for you than ethanol, and you can even do it while driving. So in spare moments I’ve been noodling away at the following problem, tweeted a week ago by James Tanton:

Tanton tweet

The answer to Tanton’s question is surely No: The series will never again land on an integer. I leaped to that conclusion immediately after reading the definition of the series and glancing at the first few terms. But what makes me so sure? Can I prove it?

I wrote a quick program to generate more terms:


Overall, the trend visible in these results seemed to confirm my initial intuition. When the fractions are expressed in lowest terms, the denominator generally grows larger with each successive term. Looking at the terms more closely, it turns out that the denominators tend to be products of many small primes, whereas the numerators are either primes or products of a few comparatively large primes. For example:

\[\frac{9649}{2520} = \frac{9649}{2^3 \cdot 3^2 \cdot 5 \cdot 7} \qquad \textrm{and} \qquad \frac{18358381}{4084080} = \frac{59 \cdot 379 \cdot 821}{2^4 \cdot 3 \cdot 5 \cdot 7 \cdot 11 \cdot 13 \cdot 17}.\]

To produce an integer, we need to cancel all the primes in the factorization of the denominator by matching primes in the numerator; given the pattern of these numbers, that looks like an unlikely coincidence.

But there is reason for caution. Note the seventh term in the sequence, where the denominator has decreased from \(60\) to \(20\). To understand how that happens, we can run through the calculation of the term, which starts by summing the six previous terms.

\[\frac{60}{60} + \frac{120}{60} + \frac{150}{60} + \frac{170}{60} + \frac{185}{60} + \frac{197}{60} = \frac{882}{60}.\]

Then we calculate the mean, and add 1 to get the seventh term:

\[\require{cancel}\frac{882}{60} \cdot \frac{1}{6} = \frac{882}{360} = \frac{\cancel{2} \cdot \cancel{3} \cdot \cancel{3} \cdot 7 \cdot 7}{\cancel{2} \cdot 2 \cdot 2 \cdot \cancel{3} \cdot \cancel{3} \cdot 5} = \frac{49}{20} + 1 = \frac{69}{20}\]

Cancelations reduce the numerator and denominator of the mean by a factor of 18. It seems possible that somewhere farther out in the sequence there might be a term where all the factors in the denominator cancel, leaving an integer.

Another point to keep in mind: For large \(n\), the value of the Tanton function grows very slowly. Thus if integer values are not absent but merely rare, we might have to compute a huge number of terms to get to the next one. Reaching the neighborhood of 100 would take more than \(10^{40}\) terms.

value of tanton(n) for n from 1 to 300

So what do you think? Can we prove that no further integers appear in Tanton’s sequence? Or, on the contrary, might my instant conviction that no such integers exist turn out to be an alternative fact?

I’ve had my fun with this problem. I know the answer now, but I’m not going to reveal it yet. Others also deserve a chance to be distracted, or anaesthetized. I’ll be back in a few days to follow up—unless commenters explain what’s going on so thoroughly there’s nothing left for me to say.

Update 2017-01-30: Okay, pencils down. Not that anyone needs more time. As usual, my readers are way ahead of me. (See comments below, if you haven’t read them already.)

My own slow and roundabout voyage of discovery went like this. I had written a little piece of code for printing out n terms of the series, directly implementing the definition given in James Tanton’s tweet:

from fractions import Fraction as F
from statistics import mean

def tanton (n):
    seq = [F(1)]
    for i in range(n):
        seq.append(mean(seq) + 1)

But this is criminally inefficient. On every pass through the loop we calculate the mean of the entire sequence, then throw that work away and do it all again the next time. Once you have the mean of \(n-1\) terms, isn’t there some way of updating it to incorporate the nth term? Well, yes, of course there is. You just have to appropriately weight the new term, dividing by n, before adding it to the mean. Here’s the improved code:

from fractions import Fraction as F

def faster_tanton (n):
    m = F(1)
    for i in range(1, n):
        m += F(1, i)

Tracing the execution of this function, we start out with 1, then add 1, then add 1/2, then 1/3, then 1/4, and so on. This is 1 plus the harmonic series. That series is defined as:

\[H_{n} = \sum_{i=1}^{n} \frac{1}{i} = \frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}\]

The first 10 partial sums are:


One fact about the harmonic series is very widely known: It diverges. Although \(H_{n}\) grows very slowly, that growth continues without bound as \(n\) goes to infinity. Another fact, not quite as well known but of prime importance here, is that no term of the series after the first is an integer. The simplest proof shows that when you factor the numerator and the denominator, the denominator always has more \(2\)s than the numerator; thus when the fraction is expressed in lowest terms, the numerator is odd and the denominator even. This proof can be found in various places on the internet, such as StackExchange. There’s also a good explanation in Julian Havil’s book Gamma: Exploring Euler’s Constant.

Neither of those sources mentions anything about the origin or author of the proof. When I scouted around for more information, I found more than a dozen sources that attribute the proof to “Taeisinger 1915,” but with no reference to an original publication. For example, a recent paper by Carlo Sanna (Journal of Number Theory, Vol. 166, September 2016, pp. 41–46) mentions Taeisinger and cites Eric Weisstein’s Concise Encyclopedia of Mathematics; consulting the online version of that work, Taeisinger is indeed credited with the theorem, but the only reference is to another secondary source, Paul Hoffman’s biography of Erdős, The Man Who Loved Only Numbers; there, on page 157, Hoffman writes, “In 1915, a man named Taeisinger proved. . .” and gives no reference or further identification. So who was this mysterious and oddly named Taeisinger? I have never heard of him, and neither has MathSciNet or the Zentralblatt or the MacTutor math biography pages. In Number Theory: A Historical Approach John J. Watkins gives a slender further clue: The first initial “L.”

After some further rummaging through bookshelves and online material, I finally stumbled on a reference to a 1915 publication I could actually track down. In the Comptes Rendus Mathematique (Vol. 349, February 2011, pp. 115–117) Rachid Aït Amranea and Hacène Belbachir include this item in their list of references:

L. Taeisinger, Bemerkung über die harmonische Reihe, Monatsch. Math. Phys. 26 (1915) 132–134.

When I got ahold of that paper, here’s what I found:

Opening lines of the Theisinger 1915 paper

Not Taeisinger but Theisinger!

I still don’t know much of anything about Theisinger. His first name was Leopold; he came from Stockerau, a small town in Austria that doesn’t seem to have a university; he wrote on geometry as well as number theory.

What I do know is that a lot of authors have been copying each other’s references, going back more than 20 years, without ever bothering to look at the original publication.

Posted in mathematics, problems and puzzles | 8 Comments

The Polite Apocalypse

On the Beach cover of 1957 editionI’ve been rereading On the Beach, Nevil Shute’s novel about humanity’s last gasp in the aftermath of a nuclear war. The book was published 60 years ago, in 1957. I first read it in 1963, which is roughly when the events in the story are supposed to take place. It was also just a few months after the Cuban missile crisis, when annihilation was in the air.

Spoiler alert: Everybody dies.

The setting is Melbourne, Australia, the southernmost major city on the planet. The entire population of the Northern Hemisphere was wiped out in the war, and airborne radioactivity is slowly creeping across the Equator. Darwin and Cairns, on Australia’s north coast, are already ghost towns, and the people of Melbourne are told they have less than a year to go.

A U.S. submarine takes refuge in Melbourne’s harbor. Over a period of weeks the captain of that vessel, Dwight, forms an attachment to a young woman named Moira. There’s affection on both sides, and maybe passion, but Dwight is determined to remain faithful to his wife and children back in Connecticut. He buys them presents: a diamond bracelet, a fishing rod, a pogo stick. He speaks of them in the present tense. Is Dwight delusional? Not exactly. He knows perfectly well that his family are all dead, and that he’ll never rejoin them except in the sense that he too will soon be dead. But those deaths are abtractions.

He had seen nothing of the destruction of the war . . . ; in thinking of his wife and of his home it was impossible for him to visualize them in any other circumstances than those in which he had left them. He had little imagination, and that formed a solid core for his contentment in Australia.

It’s not just Dwight who lacks imagination—or chooses to ignore the truths it reveals. Moira studies shorthand and typing for a future job that will never exist. Her father harrows fields for crops that will never grow. Another couple plant hundreds of daffodils whose blooms they will never see, and they invest in a lawn mower for grass they’ll never cut.

The author himself seems to share this selective connection to reality. Everyone in his doomed society is unfailingly polite, and usually cheerful. Civilization may be ending, but not civility. There’s not a single act of violence or malice or even selfishness in the entire story. Shute mentions no hoarding or profiteering, much less rape and pillage. No marauding bandits or desperate refugees from the contaminated north descend on this last haven. On the Beach is the antithesis of that other Australian vision of apocalypse: the Mad Max movies. (The first of the series, in 1979, was filmed near Melbourne.)

It’s also worth noting that no one in Shute’s world takes any steps to prolong life. The government is not hollowing out mountains to keep the human germ line going until the atmosphere clears. Families are not digging fallout shelters in the back yard. These last few representatives of Homo sapiens may indulge in a variety of follies, but hope isn’t one of them.

Why am I writing about this sad book just now? Well, obviously, it’s inauguration day. Which feels more like termination day.

The threat of nuclear disaster has continued to shadow us through all the years since Shute wrote his novel. The danger of a planet-scouring war seemed particularly urgent when I was 13 and reading On the Beach for the first time. I stood up in front of my eighth-grade English class to give an oral report on the book. My performance was not interrupted by a duck-and-cover drill, but it could have been.

Now we have handed control of 4,700 nuclear warheads to a petulant brat, and the danger seems greater than ever.

Revisiting that sense of menace is why I picked up the book, but it’s not what has made the strongest impression on me this second time around. I am both drawn to and appalled by the stoic acceptance of Shute’s fictional Melbournites. Given their circumstances, their reaction is not inappropriate. The worst has already happened, there’s nothing they can do to change it, they may as well make the best of it. In the face of certain extinction, what can you do but shrug your shoulders? Maybe the best way of muddling through is just to plant some daffodils.

Given the current mood of the nation and the world, I suddenly find it easier to understand Dwight’s behavior. The urge to pretend is powerful. I too want to believe that life can go on as normal, that I can continue to enjoy the private pleasures of family and friends, that I can retreat to a cozy office or library and lose myself in the world of ideas, in the “less fretful cosmos” of mathematics and science, or art and literature for that matter.

But we are not yet huddled on the beach, the last of the doomed. It’s late, but not yet too late. This is not the moment for resignation and acquiescence. Tomorrow we march!

Posted in off-topic | 2 Comments

Joint Mathematics Morsels

Atlanta. I’m at the Joint Mathematics Meetings, the annual smorgasbord where I never have time to fully digest my helping of algebraic geometry before I move on to a desert of cohomology. Here are a few easily swallowed morsels.

Carey’s Equality. Has everyone but me known all about this for ages and ages?

In a stationary population—where births equal deaths—the number of individuals who have lived a years is the same as the number who still have a years left to live. Here’s a more precise statement from James W. Vaupel of the Max-Planck-Institute for Demographic Research:

If an individual is chosen at random from a stationary population with a positive force of mortality at all ages, then the probability the individual is one who has lived a years equals the probability the individual is one who has that number of years left to live. For example, it is as likely the individual is age 80 as it is the individual has 80 years to live—not 80 years of remaining life expectancy but a remaining lifetime of precisely 80 years.

Carey's Equality for 80 years

Is this fact obvious, a trivial consequence of symmetry? Or is it deep and mysterious? Apparently it was not clearly recognized until about 10 years ago, by James R. Carey, a biological demographer at UC Davis and UC Berkeley who was studying the age structure of fruitfly populations. The equality was proved in 2009 by Vaupel. A more general statement of the theorem and a more mathematically oriented proof were published in 2014 by Carey and Arni S. R. Srinivasa Rao of Augusta University.

I learned all this from a wide-ranging talk by Rao: “From Fibonacci to Alfred Lotka and beyond: Modeling the dynamics of population and age-structures.”

Go with the Green. Every weekday you walk from your home at the corner of 1st Avenue and 1st Street to your office at 9th Avenue and 9th Street. Since your city is laid out with a perfectly rectilinear grid, you have to go eight blocks east and eight blocks north. Assuming you never waste steps by turning south or west, or by straying outside the bounding rectangle, how many routes can you choose from?

8 block street grid with one possible sw to ne path

It would be quite a chore to count the paths one by one, but combinatorics comes to the rescue. The answer is \(\binom{16}{8}\), the number of ways of choosing eight items (such as eastbound or northbound blocks) from a set with 16 members:

\[\binom{16}{8} = \frac{16!}{(8!)(8!)} = 12{,}870{.}\]

You could walk to work for 50 years without ever taking the same route twice. Which of those 12,870 paths is the shortest? That’s the beauty of the Manhattan metric: They all are. Every such path is exactly 16 blocks long.

But just because the routes are equally long doesn’t mean they are equally fast. Suppose there’s a traffic light at every intersection. Depending on the state of the signal, you can proceed either north or east without interruption, but you’ll have to wait for the light to change if you want to cross the other way. A sensible strategy, it seems, is always to go with the green if you can. Following this rule, you will never have to wait for a light unless you are on the north or the east boundary edge of the square.

The street grid with traffic lights came up in a talk by Ivan Corwin of Columbia University, titled “A Drunk Walk in a Drunk World.” The more conventional term for this subject is “random walks in random environments.” In an ordinary random walk (with a nonrandom environment), the walker chooses a direction at each step according to a fixed probability distribution—the same at all sites and at all times. With a random environment, the probabilities vary both with position and with time. In a brief aside, Corwin offered the street grid with traffic lights as an example of a random environment. If the lights are uncorrelated on the time scale of a pedestrian’s progress through the grid, the favored direction at any intersection is an independent random variable. Then the following question arises: If the walker always takes the green-light direction when that’s possible, which paths are the most heavily traveled?

Corwin’s answer is that the walker will likely follow a stairstep path, never venturing very far from the diagonal drawn between home and office. Thus even though the distance metric says all routes are equal, the walker winds up approximating the Euclidean shortest path.

Corwin gave no proof of his assertion, although he did show the result of a computer simulation. After ruminating on the problem for a while, I think I understand what’s going on. One way of thinking about it is to break the 16-block walk into two eight-block segments, then consider the single vertex that the two segments have in common. Suppose the common point is the central intersection at 5th Avenue and 5th Street. There are 70 ways of getting from home to this point, and for each of those paths there are another 70 ways to continuing on to the office. Thus 4,900 paths pass through the center of the grid. In contrast, only one path goes through the corner of 9th Avenue and 1st Street. The same kind of analysis can be applied recursively to show that the initial eight-block segment of the walk is more likely to pass through 3rd Avenue and 3rd Street than through 5th Avenue and 1st Street.

Another way to look at it is that it’s all about the binomial theorem and Pascal’s triangle. The binomial coefficient \(\binom{n}{m}\) is largest when \(m = n/2\), making the “middle-way” paths the likeliest.

This argument says that always going with the green will give you the fastest route across town (at least in terms of expectation value), and the route you follow is likely to lie near the diagonal. What the argument doesn’t say is that deliberately biasing your choices so that you stay near the diagonal will get you to work sooner; that’s clearly not true.

When I mentioned Corwin’s example to my friends Dan Silver and Susan Williams, Susan immediately pointed out that the model fails to capture some important features of walking in an urban environment. Streets have two sides, and generally two sidewalks. To get from the southwest corner of an intersection to the northeast corner, you need two green lights. I’m not sure whether the conclusions hold up when these complications are taken into account.

I should add that solving this citified problem was not the main point of Corwin’s talk. Instead, he was addressing the problem of a bartender who wants to build a tavern in rough and ever-changing terrain near the rim of the Grand Canyon. The bartender needs to know how close he can come to the edge without endangering inebriated customers who might wander over the cliff.

TASEP. I’m a sucker for simple models of complex behavior. This week I learned of a new one—new to me, anyway. Jinho Baik of the University of Michigan talked about TASEP, a “totally asymmetric simple exclusion process” (admittedly not the most vividly descriptive name). Here’s what little I understand of the model so far.

The setting is a one-dimensional lattice, which could be either an infinite line or a closed loop of finite size. Some lattice sites are vacant and some are occupied by a particle. (No site can ever host multiple particles.) At random intervals—random with an exponential distribution—a particle “wakes up” and tries to move one space to the right (on a line) or one space clockwise (on a loop). The move succeeds if the adjacent site is vacant; otherwise the particle goes back to sleep until the next time the exponential alarm clock rings. Given some initial distribution of the particles, how does that distribution evolve over time.


When I see a model like this one, my impulse is to write some code and see what it looks like in action. I haven’t yet done that, but this is my current understanding of what I should expect to see. If you start with the smoothest possible particle distribution (alternating occupied and vacant sites), the particles will tend to clump together. If you start with a maximally clumpy state (one area solidly filled, another empty), the particles will tend to spread out. Baik and his colleagues seek a more precise description of how the density fluctuations evolve over time. And they have found one! Unfortunately, I’m not yet prepared to explain it, even in my hand-waviest way. The best I can do is refer you to the most recent paper by Baik and Zhipeng Liu.

Debunking Guy. If you ever have an opportunity to hear Doron Zeilberger speak, don’t pass it up. At this meeting he gave a spirited and inspiring defense of experimental mathematics, under the title “Debunking Richard Guy’s Law of Small Numbers.” Sitting in the front row was 100-year-old Richard Guy. Neither one of them was in any way daunted by this confrontation. In any case, Doron’s talk was more homage than attack. Later, I had a chance to ask Guy what he thought of it. “His heart is in the right place,” he said.

Guy’s Strong Law of Small Numbers says:

There aren’t enough small numbers to meet the many demands made of them.

As a consequence, if you discover that \(f(n)\) yields the same value as \(g(n)\) for several small values of \(n\), it’s not always safe to assume that \(f(n) = g(n)\) for all \(n\). Euler discovered a cautionary example that’s now well known: The equation \(n^2 + n + 41\) evaluates to a prime for all \(n\) from \(-40\) to \(+39\), but not outside that range.

Zeilberger doesn’t deny the risk of mistaking such accidents for mathematical truths. As a matter of fact, he discusses some of the most dramatic examples: the Pisot numbers, some of which produce coincidences that persist for thousands of terms, and yet ultimately break down. But such pathologies are not a sign that “empirical” mathematics is useless, he says; rather, they suggest the need to refine our proof techniques to distinguish true identities from false coincidences. In the case of the Pisot numbers, he offers just such a mechanism.

A paper by Zeilberger, Neil J. A. Sloane, and Shalosh B. Ekhad (Zeilberger’s computer/collaborator) outlines the main ideas of the JMM talk, though sadly it cannot capture the theatrics.

Soundararajan on Tao on Erdős. Take a sequence of +1s and –1s, and add them up. Can you design the sequence so that the absolute value of the sum is never greater than 1? That’s easy: Just write down the alternating sequence, +1, –1, +1, –1, +1, –1, . . . . But what if, after you’ve selected your sequence, an adversary applies a rule that selects some subset of the entries. Can you still count on keeping the absolute value of the sum below a specified bound? This is a version of the Erdős discrepancy problem, which Paul Erdős first formulated in the 1930s.

The question was finally given a definitive answer in 2015 by Terry Tao of UCLA. In the “Current Events” session of the JMM, Kannan Soundararajan of Stanford gave a lucid account of thre proof. You can read it for yourself, along with three other Current Events talks, by downloading the Bulletin.

Proust’s Powdered-Wig Party. Finally, a personal note. In the closing pages of Marcel Proust’s immense novel A la Recherche du Temps Perdu, the narrator attends a party where he runs into many old friends from Parisian high and not-so-high society. He is annoyed that no one told him the party was a costume ball: All of the guests are wearing white powdered wigs, as if they were gathering at the court of Louis XIV. Then the narrator catches sight of himself in a mirror and realizes that he too is coiffed in white.

At these annual math gatherings I run into people I have known for 30 years or more. For some time I’ve been aware that the members of this cohort, including me, are no longer in the first blush of youth. This year, however, the powdered wigs have seemed particularly conspicuous. Everyone I talk to, it seems, is planning for imminent retirement.

But of course this geriatric impression owes more to selection effects than to the aging of the mathematical population overall. Indeed, the corridors here are full of youngsters attending their first or third or fifth JMM. Which brings us back to Carey’s Equality. If we can safely assume that the population of meeting attendees is stationary, then the proportion of people who have been coming to these affairs for 30 years should be equal to the proportion who will attend 30 more meetings.

Posted in mathematics | 1 Comment