Extrapolating the steep trend line of the past five years predicts a thousandfold increase in capacity by about 2012; in other words, today’s 120-gigabyte drive becomes a 120-terabyte unit.
Extending that same growth curve into 2016 would allow for another four doublings, putting us on the threshold of the petabyte disk drive (i.e., \(10^{15}\) bytes).
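The arithmetic behind that extrapolation is easy to check. A thousandfold increase is roughly ten doublings (2^10 = 1024), and four more doublings multiply the result by 16. The sketch below assumes decimal units (1 TB = 1000 GB); the function name is invented for illustration.

```javascript
// Capacity after a given number of doublings, starting from a base capacity.
function capacityAfterDoublings(baseGigabytes, doublings) {
  return baseGigabytes * Math.pow(2, doublings);
}

const GB_PER_TB = 1000;       // decimal units, as drive makers count them
const GB_PER_PB = 1000000;

// Ten doublings is roughly the "thousandfold increase" (2^10 = 1024).
const by2012 = capacityAfterDoublings(120, 10);   // in gigabytes
// Four more doublings carry the curve from 2012 to 2016.
const by2016 = capacityAfterDoublings(120, 14);

console.log((by2012 / GB_PER_TB).toFixed(1) + " TB");   // about 123 TB
console.log((by2016 / GB_PER_PB).toFixed(2) + " PB");   // about 1.97 PB
```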
None of that has happened. The biggest drives in the consumer marketplace hold 2, 4, or 6 terabytes. A few 8- and 10-terabyte drives were recently introduced, but they are not yet widely available. In any case, 10 terabytes is only 1 percent of a petabyte. We have fallen way behind the growth curve.
The graph below extends an illustration that appeared in my 2002 article, recording growth in the areal density of disk storage, measured in bits per square inch:
The blue line shows historical data up to 2002 (courtesy of Edward Grochowski of the IBM Almaden Research Center). The bright green line represents what might have been, if the 1997–2002 trend had continued. The orange line shows the real status quo: We are three orders of magnitude short of the optimistic extrapolation. The growth rate has returned to the more sedate levels of the 1970s and 80s.
What caused the recent slowdown? I think it makes more sense to ask what caused the sudden surge in the 1990s and early 2000s, since that’s the kink in the long-term trend. The answers lie in the details of disk technology. More sensitive read heads developed in the 90s allowed information to be extracted reliably from smaller magnetic domains. Then there was a change in the geometry of the domains: the magnetic axis was oriented perpendicular to the surface of the disk rather than parallel to it, allowing more domains to be packed into the same surface area. As far as I know, there have been no comparable innovations since then, although a new writing technology is on the horizon. (It uses a laser to heat the domain, making it easier to change the direction of magnetization.)
As the pace of magnetic disk development slackens, an alternative storage medium is coming on strong. Flash memory, a semiconductor technology, has recently surpassed magnetic disk in areal density; Micron Technology reports a laboratory demonstration of 2.7 terabits per square inch. And Samsung has announced a flash-based solid-state drive (SSD) with 15 terabytes of capacity, larger than any mechanical disk drive now on the market. SSDs are still much more expensive than mechanical disks, by a factor of 5 or 10, but they offer higher speed and lower power consumption. They also offer the virtue of total silence, which I find truly golden.
Flash storage has replaced spinning disks in about a quarter of new laptops, as well as in all phones and tablets. It is also increasingly popular in servers (including the machine that hosts bit-player.org). Do disks have a future?
In my sentimental moments, I’ll be sorry to see spinning disks go away. They are such jewel-like marvels of engineering and manufacturing prowess. And they are the last link in a long chain of mechanical contrivances connecting us with the early history of computing—through Turing’s bombe and Babbage’s brass gears all the way back to the Antikythera mechanism two millennia ago. From here on out, I suspect, most computers will have no moving parts.
Maybe in a decade or two the spinning disk will make a comeback, the way vinyl LPs and vacuum tube amplifiers have. “Data that comes off a mechanical disk has a subtle warmth and presence that no solid-state drive can match,” the cognoscenti will tell us.
“You can never be too rich or too thin,” someone said. And a computer can never be too fast. But the demand for data storage is not infinitely elastic. If a file cabinet holds everything in the world you might ever want to keep, with room to spare, there’s not much added utility in having 100 or 1,000 times as much space.
In 2002 I questioned whether ordinary computer users would ever fill a 1-terabyte drive. Specifically, I expressed doubts that my own files would ever reach the million megabyte mark. Several readers reassured me that data will always expand to fill the space available. I could only respond “We’ll see.” Fourteen years later, I now have the terabyte drive of my dreams, and it holds all the words, pictures, music, video, code, and whatnot I’ve accumulated in a lifetime of obsessive digital hoarding. The drive is about half full. Or half empty. So I guess the outcome is still murky. I can probably fill up the rest of that drive, if I live long enough. But I’m not clamoring for more space.
One factor that has surely slowed demand for data storage is the emergence of cloud computing and streaming services for music and movies. I didn’t see that coming back in 2002. If you choose to keep some of your documents on Amazon or Azure, you obviously reduce the need for local storage. Moreover, offloading data and software to the cloud can also reduce the overall demand for storage, and thus the global market for disks or SSDs. A typical movie might take up 3 gigabytes of disk space. If a million people load a copy of the same movie onto their own disks, that’s 3 petabytes. If instead they stream it from Netflix, then in principle a single copy of the file could serve everyone.
In practice, Netflix does not store just one copy of each movie in some giant central archive. They distribute rack-mounted storage units to hundreds of internet exchange points and internet service providers, bringing the data closer to the viewer; this is a strategy for balancing the cost of storage against the cost of communications bandwidth. The current generation of the Netflix Open Connect Appliance has 36 disk drives of 8 terabytes each, plus 6 SSDs that hold 1 terabyte each, for a total capacity of just under 300 terabytes. (Even larger units are coming soon.) In the Netflix distribution network, files are replicated hundreds or thousands of times, but the total demand for storage space is still far smaller than it would be with millions of copies of every movie.
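The storage figures in the last two paragraphs can be reproduced in a few lines. This is only a back-of-the-envelope sketch using the numbers quoted in the text and decimal units; the variable names are my own, and the count of movies per appliance ignores file-system overhead and multiple encodings.

```javascript
// Storage needed if every viewer keeps a private copy of one movie.
const movieGB = 3;                    // "a typical movie", per the text
const viewers = 1000000;
const privateCopiesPB = (movieGB * viewers) / 1e6;    // GB -> PB

// Raw capacity of one Open Connect Appliance, as described in the text:
// 36 disks of 8 TB each plus 6 SSDs of 1 TB each.
const applianceTB = 36 * 8 + 6 * 1;

// Number of 3 GB movies that would fit on one appliance (overhead ignored).
const moviesPerAppliance = Math.floor((applianceTB * 1000) / movieGB);

console.log(privateCopiesPB + " PB");   // 3 PB
console.log(applianceTB + " TB");       // 294 TB
console.log(moviesPerAppliance);        // 98000
```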
A recent blog post by Eric Brewer, Google’s vice president for infrastructure, points out:
The rise of cloud-based storage means that most (spinning) hard disks will be deployed primarily as part of large storage services housed in data centers. Such services are already the fastest growing market for disks and will be the majority market in the near future. For example, for YouTube alone, users upload over 400 hours of video every minute, which at one gigabyte per hour requires more than one petabyte (1M GB) of new storage every day or about 100x the Library of Congress.
Thus Google will not have any trouble filling up petabyte drives. An accompanying white paper argues that as disks become a data center specialty item, they ought to be redesigned for this environment. There’s no compelling reason to stick with the present physical dimensions of 2½ or 3½ inches. Moreover, data-center disks have different engineering priorities and constraints. Google would like to see disks that maximize both storage capacity and input-output bandwidth, while minimizing cost; reliability of individual drives is less critical because data are distributed redundantly across thousands of disks.
The white paper continues:
An obvious question is why are we talking about spinning disks at all, rather than SSDs, which have higher [input-output operations per second] and are the “future” of storage. The root reason is that the cost per GB remains too high, and more importantly that the growth rates in capacity/$ between disks and SSDs are relatively close . . . , so that cost will not change enough in the coming decade.
If the spinning disk is remodeled to suit the needs and the economics of the data center, perhaps flash storage can become better adapted to the laptop and desktop environment. Most SSDs today are plug-compatible replacements for mechanical disk drives. They have the same physical form, they expect the same electrical connections, and they communicate with the host computer via the same protocols. They pretend to have a spinning disk inside, organized into tracks and sectors. The hardware might be used more efficiently if we were to do away with this charade.
Or maybe we’d be better off with a different charade: Instead of dressing up flash memory chips in the disguise of a disk drive, we could have them emulate random access memory. Why, after all, do we still distinguish between “memory” and “storage” in computer systems? Why do we have to open and save files, launch and shut down applications? Why can’t all of our documents and programs just be everpresent and always at the ready?
In the 1950s the distinction between memory and storage was obvious. Memory was the few kilobytes of magnetic cores wired directly to the CPU; storage was the rack full of magnetic tapes lined up along the wall on the far side of the room. Loading a program or a data file meant finding the right reel, mounting it on a drive, and threading the tape through the reader and onto the take-up reel. In the 1970s and 80s the memory/storage distinction began to blur a little. Disk storage made data and programs instantly available, and virtual memory offered the illusion that files larger than physical memory could be loaded all in one go. But it still wasn’t possible to treat an entire disk as if all the data were present in memory. The processor’s address space wasn’t large enough. Early Intel chips, for example, used 20-bit addresses, and therefore could not address more than \(2^{20} \approx 10^6\) bytes in total.
We live in a different world now. A 64-bit processor can potentially address \(2^{64}\) bytes of memory, or 16 exabytes (i.e., 16,000 petabytes). Most existing processor chips are limited to 48-bit addresses, but this still gives direct access to 281 terabytes. Thus it would be technically feasible to map the entire content of even the largest disk drive onto the address space of main memory.
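Here is a small sketch of those address-space sizes, using JavaScript's BigInt because \(2^{64}\) cannot be represented exactly as an ordinary number. The helper name is hypothetical. Note that the "16 exabytes" figure counts in binary units of \(2^{60}\) bytes; in decimal units \(2^{64}\) bytes is a little over 18 exabytes.

```javascript
// Number of bytes addressable with a given address width, as a BigInt.
function addressableBytes(bits) {
  return 2n ** BigInt(bits);
}

const TB = 10n ** 12n;   // decimal terabyte
const EB = 10n ** 18n;   // decimal exabyte

console.log(addressableBytes(20));                // 1048576n
console.log(addressableBytes(48) / TB);           // 281n (terabytes, rounded down)
console.log(addressableBytes(64) / EB);           // 18n  (decimal exabytes, rounded down)
console.log(addressableBytes(64) / 2n ** 60n);    // 16n  (binary exabytes)
```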
In current practice, reading from or writing to a location in main memory takes a single machine instruction. Say you have a spreadsheet open; the program can get the value of any cell with a load instruction, or change the value with a store instruction. If the spreadsheet file is stored on disk rather than loaded into memory, the process is quite different, involving not single instructions but calls to input-output routines in the operating system. First you have to open the file and read it as a one-dimensional stream of bytes, then parse that stream to recreate the two-dimensional structure of the spreadsheet; only then can you access the cell you care about. Saving the file reverses these steps: The two-dimensional array is serialized to form a linear stream of bytes, then written back to the disk. Some of this overhead is unavoidable, but the complex conversions between serialized files on disk and more versatile data structures in memory could be eliminated. A modern processor could address every byte of data—whether in memory or storage—as if it were all one flat array. Disk storage would no longer be a separate entity but just another level in the memory hierarchy, turning what we now call main memory into a new form of cache. From the user’s point of view, all programs would be running all the time, and all documents would always be open.
Is this notion of merging memory and storage an attractive prospect or a nightmare? I’m not sure. There are some huge potential problems. For safety and sanity we generally want to limit which programs can alter which documents. Those rules are enforced by the file system, and they would have to be re-engineered to work in the memory-mapped environment.
Perhaps more troubling is the cognitive readjustment required by such a change in architecture. Do we really want everything at our fingertips all the time? I find it comforting to think of stored files as static objects, lying dormant on a disk drive, out of harm’s way; open documents, subject to change at any instant, require a higher level of alertness. I’m not sure I’m ready for a more fluid and frenetic world where documents are laid aside but never put away. But I probably said the same thing 30 years ago when I first confronted a machine capable of running multiple programs at once (anyone remember MultiFinder?).
The dichotomy between temporary memory and permanent storage is certainly not something built into the human psyche. I’m reminded of this whenever I help a neophyte computer user. There’s always an incident like this:
“I was writing a letter last night, and this morning I can’t find it. It’s gone.”
“Did you save the file?”
“Save it? From what? It was right there on the screen when I turned the machine off.”
Finally the big questions: Will we ever get our petabyte drives? How long will it take? What sorts of stuff will we keep on them when the day finally comes?
The last time I tried to predict the future of mass storage, extrapolating from recent trends led me far astray. I don’t want to repeat that mistake, but the best I can suggest is a longer-term baseline. Over the past 50 years, the areal density of mass-storage media has increased by seven orders of magnitude, from about \(10^5\) bits per square inch to about \(10^{12}\). That works out to about seven years for a tenfold increase, on average. If that rate is an accurate predictor of future growth, we can expect to go from the present 10 terabytes to 1 petabyte in about 15 years. But I would put big error bars around that number.
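The long-term estimate can be written out explicitly. This sketch assumes the average rate quoted above (seven orders of magnitude in 50 years) simply continues, which is exactly the kind of assumption that deserves those big error bars; the function name is illustrative only.

```javascript
// Years needed per tenfold increase, from the 50-year baseline in the text.
const yearsPerTenfold = 50 / 7;

// Years to grow from one capacity to another at that constant rate.
function yearsToGrow(fromBytes, toBytes) {
  const ordersOfMagnitude = Math.log10(toBytes / fromBytes);
  return ordersOfMagnitude * yearsPerTenfold;
}

const years = yearsToGrow(10e12, 1e15);   // 10 terabytes -> 1 petabyte
console.log(yearsPerTenfold.toFixed(1));  // about 7.1
console.log(years.toFixed(1));            // about 14.3
```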
I’m even less sure about how those storage units will be used, if in fact they do materialize. In 2002 my skepticism about filling up a terabyte of personal storage was based on the limited bandwidth of the human sensory system. If the documents stored on your disk are ultimately intended for your own consumption, there’s no point in keeping more text than you can possibly read in a lifetime, or more music than you can listen to, or more pictures than you can look at. I’m now willing to concede that a terabyte of information may not be beyond human capacity to absorb. But a petabyte? Surely no one can read a billion books or watch a million hours of movies.
This argument still seems sound to me, in the sense that the conclusion follows if the premise is correct. But I’m no longer so sure about the premise. Just because it’s my computer doesn’t mean that all the information stored there has to be meant for my eyes and ears. Maybe the computer wants to collect some data for its own purposes. Maybe it’s studying my habits or learning to recognize my voice. Maybe it’s gathering statistics from the refrigerator and washing machine. Maybe it’s playing go, or gossiping over some secret channel with the Debian machine across the alley.
We’ll see.
Notice the spacing around the minus sign. It’s too close to the argument on its left, whereas the plus sign lies right in the middle. The proper rendering of this expression looks like this:
Closely comparing the two images, I realized that spacing isn’t the only issue. In the malformed version the minus sign is also a little too long, too low, and too skinny.
For typesetting mathematics, I rely on MathJax, an amazing JavaScript program created by Davide Cervone of Union College. It works like magic: I write in standard TeX (math mode only), and the typeset output appears beautifully formatted in your web browser, with no need to bother about installing fonts or downloading plugins. For the past few years MathJax has been totally reliable, so this spacing glitch came as an annoying surprise.
The notes that follow record both what I did and what I thought as I tried to track down the cause of this problem. If anyone else ever bumps into the bug, the existence of this document might save them some angst and agita. Besides, everybody likes a detective story—even if the detective turns out to be more bumbling than brilliant. (If you just want to know how it comes out, skip to the end.)
Hypothesis: My first thought on seeing the wayward minus sign was that I must have typed something wrong. The TeX source code for the expression shown above is so simple (just a + b - c) that there’s not much room for error, but accidents happen. Maybe one of those space characters is not an ordinary word space (ASCII 0x20) but a non-breaking space (HTML &nbsp;). Or maybe the hyphen that represents a minus sign is not really a hyphen (ASCII 0x2D) but an en-dash (HTML &ndash; or &#8211;) or a discretionary hyphen (HTML &shy; or &#173;). Experiment 1: Try typing the expression again, very carefully. Result: No change. Experiment 2: Copy the original source text into an editor that shows raw hexadecimal byte values. Result: Nothing exotic. Experiment 3: Copy the source text into a different TeX system (Pierre-Yves Chatelier’s LaTeXiT). Result: Typesets correctly. Conclusion: Probably not a typo.
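A check like Experiment 2 can also be scripted. The following is a minimal sketch, not the tool I actually used: it reports every character outside the printable ASCII range, which would flag a non-breaking space (U+00A0), an en-dash (U+2013), or a soft hyphen (U+00AD). The function name is invented for illustration.

```javascript
// List every character outside printable ASCII, with its position and code point.
function findExoticCharacters(text) {
  const found = [];
  let index = 0;
  for (const ch of text) {                 // iterates by code point
    const code = ch.codePointAt(0);
    if (code < 0x20 || code > 0x7e) {
      const hex = code.toString(16).toUpperCase().padStart(4, "0");
      found.push({ index: index, hex: "U+" + hex });
    }
    index += ch.length;                    // position in UTF-16 code units
  }
  return found;
}

console.log(findExoticCharacters("a + b - c"));        // empty list: nothing exotic
console.log(findExoticCharacters("a + b \u2013 c"));   // one entry: index 6, U+2013
```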
Question: Could it be a browser bug? Tests: Try it in Chrome, Firefox, Safari, Opera. Results: Same appearance in all of them. Conclusion: It’s not the browser.
Internet interlude: The most important debugging tools today are Google and Stack Overflow. Most likely the answer is already out there. But searches for “minus sign spacing MathJax” and “minus sign spacing TeX” turn up nothing useful. The most promising leads take me to discussions of the binary subtraction operator \(a - b\) vs. the unary negation operator \(-b\). That’s not the issue here, so I am thrown back on my own resources.
Question: Is it just my machine? Test: Try opening the same page on another laptop. Result: Same appearance. However, these two computers are very similar. In particular, they have the same fonts installed. Test: Try a third machine, with different fonts. Result: No change.
Question: Is the problem confined to the one article I’m currently writing, or does it show up in earlier blog posts as well? Research: Page back through the bit-player archives. I find several more instances of the bug. Followup question: Was the minus-sign spacing in those earlier articles already botched when I wrote and published them? Or were they correct then, and the bug was introduced by some later change in the software environment?
Clue: In the course of rummaging through old blog posts, I discover that the spacing anomaly appears only in “inline” math expressions (those that appear within the flow of a paragraph), not in “display” equations (which are set off on a line of their own). The two rendering modes are invoked by surrounding an expression with different sets of delimiters: \( ... \) for inline and \[ ... \] for display. By merely toggling between round and square brackets, I find I can turn the bug on and off. This discovery leads me to suppose there really might be something awry within MathJax. If it formats an expression correctly in one mode, why does it fail on the same input text in another mode?
Investigation: Using browser developer tools, I examine the HTML markup that MathJax writes into the document. In display mode (where the spacing is correct), here’s the coding for the minus sign:
<span class="mo" id="MathJax-Span-15"
style="font-family: STIXGeneral-Regular;
padding-left: 0.228em;">-</span>
The padding-left declaration is the crucial bit of styling that sets the spacing on the left side of the minus operator. Here’s the corresponding markup for the minus sign in the inline version of the same expression:
<span class="mo" id="MathJax-Span-22"
style="font-family: STIXGeneral-Regular;">–</span>
The padding-left statement is absent. This is the proximate cause of the incorrect spacing. But why does MathJax supply the appropriate spacing in display mode but omit it in inline mode? That’s the puzzle.
Inquiry: I turn to the MathJax source-code repository on GitHub, and browse the issues database. Nothing relevant turns up. Likewise the MathJax user group forum. Baffling. If the problem really is a MathJax bug, someone would surely have reported it, unless it’s quite new. I consider opening a new issue, but decide to wait until I know more.
Question: The bug seems to be everywhere on bit-player.org, but what about the rest of the web? On MathOverflow (which I know uses MathJax) it doesn’t take long to find an inline equation that includes a minus sign. It is formatted perfectly. David Mumford’s blog is another MathJax site; I poke around there and find another inline equation with a correctly spaced minus sign. Uh oh. The finger of blame is pointing back toward me and away from MathJax.
Question: Am I using the same version of MathJax as those other sites, and the same configuration file? Not exactly, but when I try several other versions (including older ones, in case this is a recently introduced bug), there’s no change.
Pause for reflection: MathJax seems to be behaving differently on bit-player than it does on other sites. What could account for that difference? There are dozens of possible factors, but I have a leading candidate: bit-player is built on the WordPress blogging platform, and the other sites I’m looking at are not. I have no idea how the interaction of WordPress and MathJax could lead to this particular outcome, but they are both complicated software systems, with lots going on behind the curtains.
Experiment: I can test the WordPress hypothesis by setting up a web page that has everything in common with the bit-player site—the same server hardware and software, and the same MathJax processor—but that lives outside the WordPress system. I do exactly that, and find that minus signs are correctly formatted in both display and inline equations. Conclusion: It sure looks like WordPress is messing with my TeX!
Revelation: Throughout this diagnostic adventure, I’ve been relying heavily on the developer tools in the Chrome and Firefox browsers. These tools provide a peek into a page’s HTML encoding as it is displayed by the browser, after MathJax and any other JavaScript programs have worked their transformations on the source text. Now, for sheer lack of any better ideas, I decide to try the View Source command, which shows the HTML as received from the server, before any JavaScript programs run, and in particular before MathJax has converted TeX source code into typeset mathematical output. Instantly, the root of the problem is staring me in the face. The display-mode TeX is exactly as I wrote it: \[a + b - c\]. But the inline-mode markup is this: \(a + b &#8211; c\). The HTML entity &#8211; specifies an en-dash. Where did that come from? Actually, I’m pretty sure I know where; what I don’t know is why. WordPress has built-in functions to “prettify” text, converting typewriter quote marks ('', "") to typographer’s quotes (‘ ’, “ ”). More to the point, the program also replaces a double hyphen (--) with an en-dash (–) and a triple hyphen (---) with an em-dash (—). Although I haven’t been typing double hyphens in the math expressions, I still suspect that the WordPress character substitution process has something to do with those troublesome en-dashes.
Confirmation: Before investing more effort in this hypothesis, I try to make sure I’m on the right track. Typing my test expression with an en-dash instead of a hyphen produces output identical to the buggy version, in display mode as well as inline mode. Performing the same experiment in LaTeXiT yields a very similar result.
The culprit exposed: Searching for #8211 in the WordPress source code takes me to the file formatting.php, where I find a function called wptexturize. PHP is not my favorite programming language, but it’s easy enough to guess what these lines are about (I have simplified and abbreviated the statements for clarity):
$static_characters = array( '---', ' -- ', '--', ' - ' );
$static_replacements = array( $em_dash, ' ' . $em_dash . ' ', $en_dash, ' ' . $en_dash . ' ' );
Note the fourth element of the $static_characters array: a hyphen surrounded by spaces. The corresponding element of $static_replacements is an en-dash surrounded by spaces. I call that a smoking gun. MathJax, like other TeX processors, expects an ASCII hyphen as a minus sign; if you feed it an en-dash, it’s not going to recognize it as a mathematical operator. (When Knuth was developing TeX, circa 1980, no standard character encoding existed beyond the 96 codes of plain ASCII.)
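To see the effect of those two arrays in isolation, here is a simplified imitation in JavaScript. It is emphatically not WordPress’s code: the function and variable names are invented, and it applies only the four literal substitutions listed above, in order, with none of the surrounding logic that decides which parts of a post get processed.

```javascript
// A simplified imitation of the static replacements described above.
// NOT WordPress's implementation; it only mimics the two arrays.
const EM_DASH = "\u2014";
const EN_DASH = "\u2013";
const staticCharacters = ["---", " -- ", "--", " - "];
const staticReplacements = [EM_DASH, " " + EM_DASH + " ", EN_DASH, " " + EN_DASH + " "];

function texturizeSketch(text) {
  let result = text;
  staticCharacters.forEach(function (pattern, i) {
    // Replace every occurrence of the pattern with its counterpart.
    result = result.split(pattern).join(staticReplacements[i]);
  });
  return result;
}

console.log(texturizeSketch("\\(a + b - c\\)"));   // the spaced hyphen becomes an en-dash
console.log(texturizeSketch("\\(a+b-c\\)"));       // unchanged: no spaces around the hyphen
```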
The fix: It could be as simple as writing a+b-c instead of a + b - c! When I make that minor change to the text, it works like a charm. Why didn’t I think of trying that sooner? I guess because TeX in math mode promises to ignore whitespace in the source code, and it never occurred to me that WordPress doesn’t have to honor that promise. Thus I can solve the immediate problem just by removing spaces around minus signs. As a permanent remedy, however, changing my writing habits is not appealing. Nor is sifting through all my earlier posts to remove those spaces. The fact is, I don’t want hyphens to magically become en-dashes while I’m not looking. It may be a feature for some people, but for me it’s a bug.
What I did. The first commandment of WordPress development is “Thou shalt not modify the core files.” But in that respect I’m already a sinner, and unrepentant. Yeah, I edited those two arrays in the formatting.php file, and it felt good.
Lessons learned. In hindsight, I see that I missed several opportunities to root out the problem more quickly. Next time I’ll remember View Source. And if I had done a better job of early-stage analysis, I would have been able to find help more efficiently. I am not the only one to confront this glitch, but I needed better search terms to follow the breadcrumbs of those who went before. Also, along the way I misinterpreted some important clues. When I discovered that the bug affects only inline mode and not display mode, I was quite sure that fact implicated MathJax, but I was wrong. (As it happens, I still don’t really understand why display mode is immune to the bug. Why is the hyphen converted to an en-dash when I enclose it in slashed round brackets, but not when it appears in slashed square brackets? Evidently the wptexturizing treatment is skipped in the latter case, but I lack the stamina to slog through all that PHP to figure out why.)
The big picture: I’m not mad at WordPress. I still believe it is a wonder of the age, making millions of people into instant, pushbutton publishers. According to some reports, it powers a quarter of all web sites. In this respect it may well be the most important application-layer software for fulfilling the original promise of the World Wide Web: allowing all of us to be contributors and creators rather than merely consumers of mass media. But there’s a cost: Keeping WordPress easy on the outside seems to require a dense thicket of thorns and briers on the inside. As the years go by I find I spend too much time fighting against its automation, which is a joyless task. I would prefer something simpler. I have Jekyll envy.
Yet my main takeaway after this episode is gratitude for open-source software. If MathJax and WordPress had been sealed, blackbox applications, I would have been helpless to help myself, unable to do anything about the problem beyond whining and pleading.
The web sites numbersaplenty.com and numberworld.info dish up a smorgasbord of facts about every natural number from 1 to 999,999,999,999,999. Type in your favorite positive integer (provided it’s less than \(10^{15}\)) and you’ll get a list of prime factors, a list of divisors, the number’s representation in various bases, its square root, and lots more.
I first stumbled upon these sites (and several others like them) about a year ago. I revisited them recently while putting together the Carnival of Mathematics. I was looking for something cute to say about the new calendar year and was rewarded with the discovery that 2016 is a triangular number: 2016 bowling pins can be arranged in an equilateral triangle with 63 pins per side.
This incident set me to thinking: What does it take to build a web site like these? Clearly, the sites do not have 999,999,999,999,999 HTML files sitting on a disk drive waiting to be served up when a visitor arrives. Everything must be computed on the fly, in response to a query. And it all has to be done in milliseconds. The question that particularly intrigued me was how the programs recognize that a given number has certain properties or is a member of a certain class—a triangular number, a square, a Fibonacci, a factorial, and so on.
I thought the best way to satisfy my curiosity would be to build a toy number site of my own. Here it is:
[Interactive Number Factoids calculator: an input field for a number, a list of its prime factors, yes/no indicators for nine properties (prime, square-free, square-root-smooth, square, triangular, factorial, Fibonacci, Catalan, Somos-4), and the elapsed time.]
This one works a little differently from the number sites I’ve found on the web. The computation is done not on my server but on your computer. When you type a number into the input field above, a JavaScript program running in your web browser computes the prime factors of the number and checks off various other properties. (The source code for this program is available on GitHub, and there’s also a standalone version of the Number Factoids calculator.)
Because the computation is being done by your computer, the performance depends on what hardware and software you bring to the task. Especially important is the JavaScript engine in your browser. As a benchmark, you might try entering the number 999,999,999,999,989, which is the largest prime less than \(10^{15}\). The elapsed time for the computation will be shown at the bottom of the panel. On my laptop, current versions of Chrome, Firefox, Safari, and Opera give running times in the range of 150 to 200 milliseconds. (But an antique iPad takes almost 8 seconds.)
Most of that time is spent in factoring the integer (or attempting to factor it in the case of a prime). Factoring is reputed to be a hard problem, and so you might suppose it would make this whole project infeasible. But the factoring computation bogs down only with really big numbers—and a quadrillion just isn’t that big anymore. Even a crude trial-division algorithm can do the job. In the worst case we need to try dividing by all the odd numbers less than \(\sqrt{10^{15}}\). That means the inner loop runs about 16 million times—a mere blink of the eye.
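For concreteness, here is what a crude trial-division factorizer might look like. This is a minimal sketch in the spirit of the description above, not necessarily the code that runs in the calculator; it assumes the input is an integer between 2 and \(2^{53}\), the range in which JavaScript numbers represent integers exactly.

```javascript
// Trial division: returns the prime factors of n in ascending order,
// with a repeated prime listed once per occurrence.
// Assumes n is an integer with 2 <= n < 2^53.
function primeFactors(n) {
  const factors = [];
  // Strip out factors of 2 first, so the main loop can skip even divisors.
  while (n % 2 === 0) {
    factors.push(2);
    n /= 2;
  }
  // Try odd divisors up to the square root of what remains.
  for (let d = 3; d * d <= n; d += 2) {
    while (n % d === 0) {
      factors.push(d);
      n /= d;
    }
  }
  if (n > 1) factors.push(n);   // whatever remains is itself prime
  return factors;
}

console.log(primeFactors(2016));   // [2, 2, 2, 2, 2, 3, 3, 7]
```

Dividing out each factor as it is found shrinks the remaining number, so the worst case (about 16 million iterations) arises only when the input is prime or has a very large prime factor.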
Once we have the list of prime factors for a number N, other properties come along almost for free. Primality: We can tell whether or not N is prime just by looking at the length of the factor list. Square-freeness: N is square-free if no prime appears more than once in the list. Smoothness: N is said to be square-root smooth if the largest prime factor is no greater than \(\sqrt{N}\). (For example, \(12 = 2 \times 2 \times 3\) is square-root smooth, but \(20 = 2 \times 2 \times 5\) is not.)
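Those three tests might be sketched as follows. The predicates assume the factor list is sorted in ascending order and lists repeated primes once per occurrence (so 12 is represented as [2, 2, 3]), and that N is at least 2; the function names are illustrative and need not match the calculator’s source.

```javascript
// N is prime if its factor list has exactly one entry.
function isPrime(factors) {
  return factors.length === 1;
}

// N is square-free if no prime repeats; in a sorted list,
// a repeated prime shows up as two equal neighbors.
function isSquareFree(factors) {
  for (let i = 1; i < factors.length; i++) {
    if (factors[i] === factors[i - 1]) return false;
  }
  return true;
}

// N is square-root smooth if its largest prime factor is at most sqrt(N).
// Comparing squares avoids rounding in the square root.
function isSquareRootSmooth(N, factors) {
  const largest = factors[factors.length - 1];
  return largest * largest <= N;
}

console.log(isSquareRootSmooth(12, [2, 2, 3]));   // true
console.log(isSquareRootSmooth(20, [2, 2, 5]));   // false
```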
The factor list could also be used to detect square numbers. N is a perfect square if every prime factor appears in the list an even number of times. But there are lots of other ways to detect squares that don’t require factorization. Indeed, running on a machine that has a built-in square-rooter, the JavaScript code for recognizing perfect squares can be as simple as this:
function isSquare(N) {
  var root = Math.floor(Math.sqrt(N));
  return root * root === N;
}
If you want to test this code in the Number Factoids calculator, you might start with 999,999,961,946,176, which is the largest perfect square less than \(10^{15}\).
Note that the isSquare function is a predicate: The return statement in the last line yields a boolean value, either true or false. The program might well be more useful if it could report not only that 121 is a square but also what it’s the square of. But the Number Factoids program is just a proof of concept, so I have stuck to yes-or-no questions.
Metafactoids: The Factoids calculator tests nine boolean properties. No number can possess all of these properties, but 1 gets seven green checkmarks. Can any other number equal this score? The sequence of numbers that exhibit none of the nine properties begins 20, 44, 52, 68, 76, 88, 92,… Are they all even numbers?
What about detecting triangular numbers? N is triangular if it is the sum of all the integers from 1 through k for some integer k. For example, \(2016 = 1 + 2 + 3 + \dots + 63\). Given k, it’s easy enough to find the kth triangular number, but we want to work in the opposite direction: Given N, we want to find out if there is a corresponding k such that \(1 + 2 + 3 + \cdots + k = N\).
Young Carl Friedrich Gauss knew a shortcut for calculating the sums of consecutive integers: \(1 + 2 + 3 + \cdots + k = N = k\,(k+1)\,/\,2\). We need to invert this formula, solving for the value of k that yields a specified N (if there is one). Rearranging the equation gives \(k^2 + k - 2N = 0\), and then we can crank up the trusty old quadratic formula to get this solution:
\[k = \frac{-1 \pm \sqrt{1 + 8N}}{2}.\]
Thus k is an integer—and N is triangular—if and only if \(8N + 1\) is an odd perfect square. (Let’s ignore the negative root, and note that if \(8N + 1\) is a square at all, it will be an odd one.) Detecting perfect squares is a problem we’ve already solved, so the predicate for detecting triangular numbers takes this simple form:
function isTriangular(N) { return isSquare(8 * N + 1); }
Try testing it with 999,999,997,764,120, the largest triangular number less than \(10^{15}\).
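The same quadratic-formula reasoning can also recover k itself. The sketch below uses a hypothetical function name, `triangularIndex`, and assumes \(8N + 1\) stays below \(2^{53}\), which holds for every N up to \(10^{15}\).

```javascript
// Hypothetical helper: returns k if N is the kth triangular number,
// and null otherwise, using k = (-1 + sqrt(8N + 1)) / 2.
function triangularIndex(N) {
  var s = 8 * N + 1;
  var root = Math.floor(Math.sqrt(s));
  if (root * root !== s) {
    return null; // 8N + 1 is not a perfect square, so N is not triangular
  }
  return (root - 1) / 2; // root is odd, so this is an integer
}
```

For N = 2016 we get \(8N + 1 = 16129 = 127^2\), and so \(k = (127 - 1)\,/\,2 = 63\).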
Factorials are the multiplicative analogues of triangular numbers: If N is the kth factorial, then \(N = k! = 1 \times 2 \times 3 \times \cdots \times k\). Is there a multiplicative trick that generates factorials in the same way that Gauss’s shortcut generates triangulars? Well, there’s Stirling’s approximation:
\[k! \approx \sqrt{2 \pi k} \left( \frac{k}{e} \right)^k.\]
We might try to invert this formula to get a function of \(k!\) whose value is \(k\), but I don’t believe this is a promising avenue to explore. The reason is that Stirling’s formula is only an approximation. It predicts, for example, that 5! is equal to 118.02, whereas the true value is 120. Thus taking the output of the inverse function and rounding to the nearest integer would produce wrong answers. We could add correction terms to get a closer approximation—but surely there’s a better way.
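To see the size of the error, Stirling's formula can be evaluated directly. This is only an illustrative sketch; the function name `stirling` is an assumption made for the example.

```javascript
// Stirling's approximation: k! is roughly sqrt(2 * pi * k) * (k / e)^k.
function stirling(k) {
  return Math.sqrt(2 * Math.PI * k) * Math.pow(k / Math.E, k);
}
```

Calling `stirling(5)` gives a value near 118.02, about 1.7 percent short of the true 120, so rounding the result cannot recover the exact factorial.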
One approach is to work with the gamma (\(\Gamma\)) function, which extends the concept of a factorial from the integers to the real and complex numbers; if \(n\) is an integer, then \(\Gamma(n+1) = n!\), but the \(\Gamma\) function also interpolates between the factorial values. A recent paper by Mitsuru Uchiyama gives an explicit, analytic, inverse of the gamma function, but I understand only fragments of the mathematics, and I don’t know how to implement it algorithmically.
Fifteen years ago David W. Cantrell came up with another inverse of the gamma function, although this one is only approximate. Cantrell’s version is much less intimidating, and it is based on one of my favorite mathematical gadgets, the Lambert W function. A Mathematica implementation of Cantrell’s idea works as advertised—when it is given the kth factorial number as input, it returns a real number very close to \(k+1\). However, the approximation is not good enough to distinguish true factorials from nearby numbers. Besides, JavaScript doesn’t come with a built-in Lambert W function, and I am loath to try writing my own.
On the whole, it seems better to retreat from all this higher mathematics and go back to the definition of the factorial as a product of successive integers. Then we can reliably detect factorials with a simple linear search, expressed in the following JavaScript function:
function isFactorial(N) { var d = 2, q = N, r = 0; while (q > 1 && r === 0) { r = q % d; q = q / d; d += 1; } return (q === 1 && r === 0); }
A factorial is built by repeated multiplication, so this algorithm takes it apart by repeated division. Initially, we set \(q = N\) and \(d = 2\). Then we replace \(q\) by \(q / d\), and \(d\) by \(d + 1\), while keeping track of the remainder \(r = q \bmod d\). If we can continue dividing until \(q\) is equal to 1, and the remainder of every division is 0, then N is a factorial. This is not a closed-form solution; it requires a loop. On the other hand, the largest factorial less than \(10^{15}\) is 17! = 355,687,428,096,000, so the program won’t be going around the loop more than 17 times.
The Fibonacci numbers are dear to the hearts of number nuts everywhere (including me). The sequence is defined by the recursion \(F_0 = 0, F_1 = 1, F_k = F_{k-1} + F_{k-2}\). How best to recognize these numbers? There is a remarkable closed-form formula, named for the French mathematician J. P. M. Binet:
\[F_k = \frac{1}{\sqrt{5}} \left[ \left(\frac{1 + \sqrt{5}}{2}\right)^k - \left(\frac{1 - \sqrt{5}}{2}\right)^k\right]\]
I call it remarkable because, unlike Stirling’s approximation for factorials, this is an exact formula; if you give it an integer k and an exact value of \(\sqrt{5}\), it returns the kth Fibonacci number as an integer.
One afternoon last week I engaged in a strenuous wrestling match with Binet’s formula, trying to turn it inside out and thereby create a function of N that returns k if and only if \(N\) is the kth Fibonacci number. With some help from Mathematica I got as far as the following expression, which gives the right answer some of the time:
\[k(N) = \frac{\log \frac{1}{2} \left( \sqrt{5 N^2 + 4} - \sqrt{5} N \right)}{\log \frac{1}{2} \left( \sqrt{5} - 1 \right)}\]
Plugging in a few values of N yields the following table of values for \(k(N)\):
N | k(N) |
---|---|
1 | 2.000000000000001 |
2 | 3.209573979673092 |
3 | 4.0000000000000036 |
4 | 4.578618254581733 |
5 | 5.03325648737724 |
6 | 5.407157747499656 |
7 | 5.724476891770392 |
8 | 6.000000000000018 |
9 | 6.243411773788614 |
10 | 6.4613916654135615 |
11 | 6.658737112471047 |
12 | 6.8390081849422675 |
13 | 7.00491857188792 |
14 | 7.158583717787527 |
15 | 7.3016843734535035 |
16 | 7.435577905992959 |
17 | 7.561376165404197 |
18 | 7.680001269357004 |
19 | 7.792226410280063 |
20 | 7.8987062604216005 |
21 | 7.999999999999939 |
In each of the green rows (N = 1, 3, 8, and 21), the function correctly recognizes a Fibonacci number \(F_k\), returning the value of k as an integer. (Or almost an integer; the value would be exact if we could calculate exact square roots and logarithms.) Specifically, 1 is the second Fibonacci number (though also the first), 3 is the fourth, 8 is the sixth, and 21 is the eighth Fibonacci number. So far so good. But there’s something weird going on with the other Fibonacci numbers in the table, namely those with odd-numbered indices (red rows). For N = 2, 5, and 13, the inverse Binet function returns numbers that are close to the correct k values (3, 5, 7), but not quite close enough. What’s that about?
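The table of \(k(N)\) values above can be regenerated with a direct transcription of the formula into JavaScript. This is a sketch; the function name `inverseBinet` is an assumption, and the last few digits may differ slightly from one machine or math library to another.

```javascript
// k(N) = log((sqrt(5 N^2 + 4) - sqrt(5) N) / 2) / log((sqrt(5) - 1) / 2)
function inverseBinet(N) {
  var sqrt5 = Math.sqrt(5);
  var numerator = Math.log((Math.sqrt(5 * N * N + 4) - sqrt5 * N) / 2);
  var denominator = Math.log((sqrt5 - 1) / 2);
  return numerator / denominator;
}

// Print the table for N = 1 through 21.
for (var N = 1; N <= 21; N++) {
  console.log(N + " | " + inverseBinet(N));
}
```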
If I had persisted in my wrestling match, would I have ultimately prevailed? I’ll never know, because in this era of Google and MathOverflow and StackExchange, a spoiler lurks around every cybercorner. Before I could make any further progress, I stumbled upon pointers to the work of Ira Gessel of Brandeis, who neatly settled the matter of recognizing Fibonacci numbers more than 40 years ago, when he was an undergraduate at Harvard. Gessel showed that N is a Fibonacci number iff either \(5N^2 + 4\) or \(5N^2 - 4\) is a perfect square. Gessel introduced this short and sweet criterion and proved its correctness in a problem published in The Fibonacci Quarterly (1972, Vol. 10, No. 6, pp. 417–419). Phillip James, in a 2009 paper, presents the proof in a way I find somewhat easier to follow.
It is not a coincidence that the expression \(5N^2 + 4\) appears in both Gessel’s formula and in my attempt to construct an inverse Binet function. Furthermore, substituting Gessel’s \(5N^2 - 4\) into the inverse function (with a few other sign adjustments) yields correct results for the odd-indexed Fibonacci numbers. Implementing the Gessel test in JavaScript is a cinch:
function gessel(N) { var s = 5 * N * N; return isSquare(s + 4) || isSquare(s - 4); }
So that takes care of the Fibonacci numbers, right? Alas, no. Although Gessel’s criterion is mathematically unassailable, it fails computationally. The problem arises from the squaring of \(N\). If \(N\) is in the neighborhood of \(10^{15}\), then \(N^2\) is near \(10^{30}\), which is roughly \(2^{100}\). JavaScript does all of its arithmetic with 64-bit double-precision floating-point numbers, which allow 53 bits for representing the mantissa, or significand. With values above \(2^{53}\), not all integers can be represented exactly; there are gaps between them. In this range the mapping between \(N\) and \(N^2\) is no longer a bijection (one-to-one in both directions), and the `gessel` procedure returns many errors.
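For comparison, the Gessel test can be carried out in exact integer arithmetic. The sketch below assumes a JavaScript engine that supports `BigInt`, a feature of newer engines; the helper names `bigIsqrt`, `isSquareBig`, and `gesselBig` are hypothetical. It is not the method used in the calculator, which sticks to ordinary double-precision numbers.

```javascript
// Integer square root of a nonnegative BigInt, by Newton's method.
function bigIsqrt(n) {
  if (n < 2n) {
    return n;
  }
  var x = n;
  var y = (x + 1n) / 2n;
  while (y < x) {
    x = y;
    y = (x + n / x) / 2n; // BigInt division truncates toward zero
  }
  return x;
}

// Exact perfect-square test for BigInt values.
function isSquareBig(n) {
  if (n < 0n) {
    return false;
  }
  var r = bigIsqrt(n);
  return r * r === n;
}

// Gessel's criterion with no rounding: N is a Fibonacci number
// iff 5N^2 + 4 or 5N^2 - 4 is a perfect square.
function gesselBig(N) {
  var n = BigInt(N);
  var s = 5n * n * n;
  return isSquareBig(s + 4n) || isSquareBig(s - 4n);
}
```

Because every intermediate value is an exact integer, the roughly 100-bit size of \(5N^2\) no longer matters, at some cost in speed.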
I had one more hope of coming up with a closed-form Fibonacci recognizer. In the Binet formula, the term \(((1 - \sqrt{5})\,/\,2)^k\) becomes very small in magnitude as k grows large. By neglecting that term we get a simpler formula that still yields a good approximation to Fibonacci numbers:
\[F_k \approx \frac{1}{\sqrt{5}} \left(\frac{1 + \sqrt{5}}{2}\right)^k.\]
For any integer k, the value returned by that expression is within 0.5 of the Fibonacci number \(F_k\), and so simple rounding is guaranteed to yield the correct answer. But the inverse function is not so well-behaved. Although it has no \(N^2\) term that would overflow the 64-bit format, it relies on square-root and logarithm operations whose limited precision can still introduce errors.
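The forward direction of this simplified formula is easy to try out. In the sketch below the name `fibApprox` is an assumption, and floating-point error in `Math.pow` limits the exactness to moderate values of k.

```javascript
// Round phi^k / sqrt(5) to the nearest integer to get F_k.
function fibApprox(k) {
  var phi = (1 + Math.sqrt(5)) / 2;
  return Math.round(Math.pow(phi, k) / Math.sqrt(5));
}
```

For example, `fibApprox(10)` yields 55 and `fibApprox(20)` yields 6765.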
So how does the Factoids calculator detect Fibonacci numbers? The old-fashioned way. It starts with 0 and 1 and iterates through the sequence of additions, stopping as soon as N is reached or exceeded:
function isFibo(N) { var a = 0, b = 1, tmp; while (a < N) { tmp = a; a = b; b = tmp + b; } return a === N; }
As with the factorials, this is not a closed-form solution, and its running time is not constant: the number of loop iterations grows in linear proportion to the index k of the Fibonacci number being approached (equivalently, in proportion to \(\log N\)). There are tricks for computing \(F_k\) in about \(\log k\) steps; Edsger Dijkstra described one such approach. But optimization hardly seems worth the bother. For N < \(10^{15}\), the `while` loop cannot be executed more than 74 times.
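One well-known logarithmic scheme is "fast doubling," which relies on the identities \(F_{2m} = F_m (2F_{m+1} - F_m)\) and \(F_{2m+1} = F_m^2 + F_{m+1}^2\). The sketch below is a generic version of that idea and may differ in detail from Dijkstra's formulation; the name `fibPair` is hypothetical, and the results are exact only while \(F_{k+1}\) stays below \(2^{53}\) (k up to about 77).

```javascript
// Returns the pair [F(k), F(k+1)], using about log2(k) recursive steps.
function fibPair(k) {
  if (k === 0) {
    return [0, 1];
  }
  var half = fibPair(Math.floor(k / 2)); // [F(m), F(m+1)] with m = floor(k/2)
  var a = half[0];
  var b = half[1];
  var c = a * (2 * b - a); // F(2m)
  var d = a * a + b * b;   // F(2m+1)
  return (k % 2 === 0) ? [c, d] : [d, c + d];
}
```

`fibPair(10)[0]` is 55, reached after four levels of recursion rather than ten additions.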
I’ve included two more sequences in the Factoids calculator, just because I’m especially fond of them. The Catalan numbers (1, 1, 2, 5, 14, 42, 132, 429…) are famously useful for counting all sorts of things—ways of triangulating a polygon, paths through the Manhattan street grid, sequences of properly nested parentheses. The usual definition is in terms of binomial coefficients or factorials:
\[C_k = \frac{1}{k+1} \binom{2k}{k} = \frac{(2k)!}{(k+1)! k!}\]
But there is also a recurrence relation:
\[C_0 = 1,\qquad C_k = \frac{4k-2}{k+1} C_{k-1}\]
The recognizer function in the Factoids Calculator does a bottom-up iteration based on the recurrence relation:
function isCatalan(N) { var c = 1, k = 0; while (c < N) { k += 1; c = c * (4 * k - 2) / (k + 1); } return c === N; }
The final sequence in the calculator is one of several discovered by Michael Somos around 1980. It is defined by this recurrence:
\[S_0 = S_1 = S_2 = S_3 = 1,\qquad S_k = \frac{S_{k-1} S_{k-3} + S_{k-2}^2}{S_{k-4}}\]
The surprise here is that the elements of the sequence are all integers, beginning 1, 1, 1, 1, 2, 3, 7, 23, 59, 314, 1529, 8209. In writing a recognizer for these numbers I have made no attempt to be clever; I simply generate the sequence from the beginning and check for equality with the given N:
function isSomos(N) { var next = 1, S = [1, 1, 1, 1]; while (next < N) { next = (S[3] * S[1] + S[2] * S[2]) / S[0]; S.shift(); S.push(next); } return next === N; }
But there’s a problem with this code. Do you see it? As `N` approaches the \(10^{15}\) barrier, the subexpression `S[3] * S[1] + S[2] * S[2]` will surely break through that barrier. In fact, the version of the procedure shown above fails for 32,606,721,084,786, the largest Somos-4 number below \(10^{15}\). For the version of the program that’s actually running in the Factoids calculator I have repaired this flaw by rearranging the sequence of operations. (For details see the GitHub repository.)
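A different way around the problem, shown here only as a sketch, is to do the arithmetic with exact integers. This is not the rearrangement used in the calculator; it assumes a JavaScript engine with `BigInt` support, and the name `isSomosBig` is hypothetical.

```javascript
// Somos-4 recognizer using exact BigInt arithmetic.
function isSomosBig(N) {
  var target = BigInt(N);
  var next = 1n;
  var S = [1n, 1n, 1n, 1n];
  while (next < target) {
    // The numerator can be far larger than 2^53, but BigInt keeps it exact,
    // and the division is always exact for this sequence.
    next = (S[3] * S[1] + S[2] * S[2]) / S[0];
    S.shift();
    S.push(next);
  }
  return next === target;
}
```

With this version, `isSomosBig(32606721084786)` returns `true`.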
The factorial, Fibonacci, Catalan, and Somos sequences all exhibit exponential growth, which means they are sprinkled very sparsely along the number line. That’s why a simple linear search algorithm—which just keeps going until it reaches or exceeds the target—can be so effective. For the same reason, it would be easy to precompute all of these numbers up to \(10^{15}\) and have the JavaScript program do a table lookup. I have ruled out this strategy for a simple reason: It’s no fun. It’s not sporting. I want to do real computing, not just consult a table.
Other number series, such as the square and triangular numbers, are more densely distributed. There are more than 30 million square and triangular numbers up to \(10^{15}\); downloading a table of that size would take longer than recomputing quite a few squares and triangulars. And then there are the primes—all 29,844,570,422,669 of them.
What would happen if we broke out of the 64-bit sandbox and offered to supply factoids about larger numbers? A next step might be a Megafactoids calculator that doubles the digit count, accepting integers up to \(10^{30}\). Computations in this system would require multiple-precision arithmetic, capable of handling numbers with at least 128 bits. Some programming languages offer built-in support for numbers of arbitrary size, and libraries can add that capability to other languages, including JavaScript. Although there is a substantial speed penalty for extended precision, most of the algorithms running in the Factoids program would still give correct results in acceptable time. In particular, there would be no problem recognizing squares and triangulars, factorials, Fibonaccis, Catalans and Somos-4 numbers.
The one real problem area in a 30-digit factoid calculator is factoring. Trial division would be useless; instead of milliseconds, the worst-case running time would be months or years. However, much stronger factoring algorithms have been devised in the past 40 years. The algorithm that would be most suitable for this purpose is called the elliptic curve method, invented by Hendrik Lenstra in the 1980s. An implementation of this method built into PARI/GP, which in turn is built into Sage, can factor 30-digit numbers in about 20 milliseconds. A JavaScript implementation of the elliptic curve method seems quite doable. Whether it’s worth doing is another question. The world is not exactly clamoring for more and better factoids.
Addendum 2016-02-05: I’ve just learned (via Hacker News) that I may need to add a few more recognition predicates: detectors for “artisanal integers” in flavors hand-crafted in the Mission District, Brooklyn, and London.
Tradition obliges me to say something interesting about the number 130. I could mention that 130 = 5 × 5 × 5 + 5. Is that interesting? Meh, eh? The Wikipedia page for 130 reports the following curious observation:
\(130\) is the only integer that is the sum of the squares of its first four divisors, including \(1\): \(1^2 + 2^2 + 5^2 + 10^2 = 130\).
What’s interesting here is not the fact that 130 has this property. The provocative part is the statement that 130 is the only such integer. How can one know this? It would be easy enough to check that no smaller number qualifies, but how do you rule out the possibility that somewhere far out on the number line there lurks another example? Clearly, what’s needed is a proof, but Wikipedia offers none. I spent some time trying to come up with a proof myself but failed to put the pieces together. Maybe you’ll do better. If not, Robert Munafo explains all, and traces the question to the Russian website dxdy.ru.
Enough of 130. I’d like to celebrate another number of the season: 2016. As midnight approached on December 31, Twitter brought this news:
In other words, the year we are now beginning is equal to \(2^{11} - 2^{5}\), and in another 32 years we’ll be celebrating the turn of a binary millennium, marking the passage of two kibiyears. (Year-zero deniers are welcome to wait an extra year.)
Fernando Juan, with an animated GIF, pointed out another nonobvious fact: \(2016 = 3^3 + 4^3 + 5^3 + 6^3 + 7^3 + 8^3 + 9^3\).
Even more noteworthy, May-Li Khoe and Federico Ardillo gave graphic proof that 2016 is a triangular number:
Specifically, 2016 is the sum of the integers \(1 + 2 + 3 + \cdots + 63\), making it the \(63\)rd element of the sequence \(1, 3, 6, 10, 15, \ldots\). You could go bowling with \(2016\) pins!
According to a well-known anecdote, Carl Friedrich Gauss was a schoolboy when he discovered that the nth triangular number is equal to \(n(n + 1) / 2\), which is also the value of the binomial coefficient \(\binom{n + 1}{2}\). Hence 2016 is the number of ways of choosing two items at a time from a collection of 64 objects. A tweet by John D. Cook offers this interpretation: “2016 = The number of ways to place two pawns on a chessboard.”
Patrick Honner expressed the same idea another way: 2016 is the number of edges in a complete graph with 64 vertices.
If you attended a New Year’s Eve party with 64 people present, and at midnight every guest clinked glasses with every other guest, there were 2016 clinks in all. (And probably a lot of broken glassware.)
In a New Year’s story of another kind, Tim Harford warns of the cost of overconfidence when it comes to making resolutions, as when you buy a year’s gym membership and stop going after six weeks. “Some companies base their business models on our tendency to overestimate our willpower.”
Triangular numbers turn up in another holiday item as well. A video by Tipping Point Math offers a quantitative analysis of the gift-giving extravaganza in “The Twelve Days of Christmas.” On the \(n\)th day of Christmas you receive \(1 + 2 + 3 + \cdots + n\) gifts, which is clearly the triangular number \(T_n\). But what’s the total number of gifts when you add up the largesse over \(n\) days? These sums of triangular numbers are the tetrahedral numbers: \(1, 4, 10, 20, 35 \ldots\); the twelfth tetrahedral number is \(364\). The video shows where to find all these numbers in Pascal’s triangle.
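The gift arithmetic is easy to check with a few lines of JavaScript. This is just an illustrative sketch, and the function names are assumptions.

```javascript
// nth triangular number: gifts received on day n alone.
function triangular(n) {
  return n * (n + 1) / 2;
}

// nth tetrahedral number: total gifts received over the first n days,
// computed as a running sum of triangular numbers.
function giftsThrough(n) {
  var total = 0;
  for (var day = 1; day <= n; day++) {
    total += triangular(day);
  }
  return total;
}
```

`giftsThrough(12)` returns 364, in agreement with the closed form \(n(n+1)(n+2)\,/\,6\) for the tetrahedral numbers.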
Vince Knight, in Un Peu de Math, approaches the Christmas gift exchange as a problem in game theory. Two friends agree not to exchange gifts, but then they are both tempted to renege on the agreement and buy a present after all. This situation is a variant of the game-theory puzzle known as the prisoner’s dilemma, which sounds like a rather grim view of holiday tradition. But in fact the conclusion is cheery: “People enjoy giving gifts a lot more than receiving them.”
The December holiday season also brings Don Knuth’s annual Christmas lecture, which has been posted on YouTube. In previous years, this talk was the Christmas Tree lecture, because it touched on some aspect of trees as a data structure or a concept in graph theory. Now Knuth has branched out from the world of trees. His subject is comma-free codes: sets of words that can be jammed together without commas or spaces to mark the boundaries between words, and yet still be read without ambiguity. A set that lacks the comma-free property is `{tea, ate, eat}`, because a concatenation such as `teateatea` has within it not just `tea,tea,tea` but also `eat,eat` and `ate,ate`. But the words `{sat, set, tea}` do qualify as comma-free, because no sequence of the words can be partitioned in more than one way to yield words in the set.
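A brute-force check of the comma-free property is short enough to sketch. The version below assumes the usual definition for words of equal length n: for every pair of codewords x and y (possibly the same word), no length-n substring of xy that straddles the boundary may itself be a codeword. The function name `isCommaFree` is hypothetical, and this is only a checker; it has nothing to do with Eastman's construction.

```javascript
// Returns true if the given equal-length words form a comma-free code.
function isCommaFree(words) {
  var n = words[0].length;
  for (var i = 0; i < words.length; i++) {
    for (var j = 0; j < words.length; j++) {
      var pair = words[i] + words[j];
      // Offsets 1 .. n-1 are the windows that overlap both words.
      for (var offset = 1; offset < n; offset++) {
        if (words.indexOf(pair.slice(offset, offset + n)) !== -1) {
          return false;
        }
      }
    }
  }
  return true;
}
```

Here `isCommaFree(["tea", "ate", "eat"])` is false (the pair `teatea` contains `eat`), while `isCommaFree(["sat", "set", "tea"])` is true.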
The idea of comma-free codes arose in the late 1950s, when it was thought that the genetic code might have a comma-free structure. Experiments soon showed otherwise, and interest in comma-free codes waned. But Knuth has rediscovered a paper by Willard Eastman, published in 1965, that constructs an algorithm for generating a comma-free code (if possible) from a given set of symbols. Knuth’s lecture demonstrates the algorithm and gives a computer implementation.
December 10, 2015, was the 200th birthday of Ada, Countess of Lovelace, who collaborated with Charles Babbage on the proposed computing machine called the Analytical Engine. The anniversary was commemorated in several ways, including an exhibition at the Science Museum in London, but here I want to call attention to a 12,000-word biographical essay by Stephen Wolfram, the creator of Mathematica.
The stories of Lovelace and Babbage have been told many times, but Wolfram’s account is worth reading even if you already know how the plot comes out. The principles of the machines and the algorithm that Lovelace devised as a demo (a computation of Bernoulli numbers) are described in detail, but the heart of the story is the human drama. Today we see these two figures as pioneers and heroes, but their lives were tinged with frustration and disappointment. Lovelace, who died at age 36, never got a proper opportunity to show off her ideas and abilities; in her one published work, all of her original contributions were relegated to footnotes. Babbage in his later years felt aggrieved with the world for failing to support his vision. Wolfram makes an interesting biographer of this pair, never hesitating to see his subjects reflected in his own experience, and vice versa.
Back in the summer of 2012 Shinichi Mochizuki of Kyoto University released four long papers that claim to resolve an important problem in number theory called the abc conjecture. (I’m not going to try to explain the conjecture here; I did so in an earlier post.) More than three years later, no one in the mathematical community has been able to understand Mochizuki’s work well enough to verify that it is indeed a proof. In December more than 50 experts gathered at the University of Oxford to try to make progress on breaking the impasse. Brian Conrad of Stanford University, one of the workshop participants, wrote up his notes as a guest post in Cathy O’Neil’s Mathbabe blog. This is an insider’s account, and parts of the discussion will not make much sense unless you have fairly deep background in modern number theory. (I don’t.) But that inaccessibility illustrates the point, in a way. Number theorists themselves are in the same situation with regard to the Mochizuki papers, which are clotted with idiosyncratic concepts such as Inter-universal Teichmüller Theory (IUT). Conrad writes:
After [Kiran] Kedlaya’s lectures, the remaining ones devoted to the IUT papers were impossible to follow without already knowing the material: there was a heavy amount of rapid-fire new notation and language and terminology, and everyone not already somewhat experienced with IUT got totally lost…. Persistent questions from the audience didn’t help to remove the cloud of fog that overcame many lectures in the final two days. The audience kept asking for examples (in some instructive sense, even if entirely about mathematical structures), but nothing satisfactory to much of the audience along such lines was provided.
It’s still not clear when or how the status of the proof will be resolved. O’Neil herself has taken a stern position on the issue of impenetrable purported proofs. When the Mochizuki papers were first released, she wrote: “If I claim to have proved something, it is my responsibility to convince others I’ve done so; it’s not their responsibility to try to understand it (although it would be very nice of them to try).”
Anthony Bonato, the Intrepid Mathematician, offers a friendlier introduction to the abc conjecture and its consequences. Also see comments on the conjecture and the workshop by Evelyn Lamb in the Scientific American blog Roots of Unity and by Anna Haensch in an American Mathematical Society blog.
Another number-theory event that attracted wide notice in recent weeks was the successful defense and publication of a doctoral thesis at Princeton University by Piper Harron. The title is “The Equidistribution of Lattice Shapes of Rings of Integers of Cubic, Quartic, and Quintic Number Fields: an Artist’s Rendering,” and it’s a great read (PDF). Here is the abstract:
A fascinating tale of mayhem, mystery, and mathematics. Attached to each degree \(n\) number field is a rank \(n\, - 1\) lattice called its shape. This thesis shows that the shapes of \(S_n\)-number fields (of degree \(n = 3, 4,\) or \(5)\) become equidistributed as the absolute discriminant of the number field goes to infinity. The result for \(n = 3\) is due to David Terr. Here, we provide a unified proof for \(n = 3, 4,\) and \(5\) based on the parametrizations of low rank rings due to Bhargava and Delone–Faddeev. We do not assume any of those words make any kind of sense, though we do make certain assumptions about how much time the reader has on her hands and what kind of sense of humor she has.
This is not your grandmother’s doctoral thesis! Harron interleaves sections of “laysplaining” and “mathsplaining,” and illustrates her work with an abundance of metaphors, jokes, asides to the reader, cartoons, and commentary on the course of her life during the 10 years she spent writing the thesis (e.g., a brief interruption to give birth). In terms of expository style, Mochizuki and Harron stand at opposite poles—a fact noted by at least two bloggers. Both of the posts by Evelyn Lamb and by Anna Haensch mentioned above in connection with Mochizuki also discuss Harron’s work.
Harron has a blog of her own with a post titled “Why I Do Not Talk About Math,” and she has written a guest post for Mathbabe with this description of her thesis:
My thesis is this thing that was initially going to be a grenade launched at my ex-prison, for better or for worse, and instead turned into some kind of positive seed bomb where flowers have sprouted beside the foundations I thought I wanted to crumble.
Reading her accounts of life as “an escaped graduate student,” I am in equal measures amused and horrified (not a comfortable combination). If nothing else, Harron’s “thesis grenade” promises to broaden—to diversify—the discussion of diversity in mathematical culture. It’s not just about race and gender (although Harron cares passionately about those issues). Human diversity also ranges across many other dimensions: modes of reasoning, approaches to learning, cultural contexts, styles of explaining, and ways of living. Harron’s thesis is a declaration that one can do original research-level mathematics without adopting the vocabulary, the styles, the attitudes, and the mental apparatus of one established academic community.
If I show you a cube, you can easily place it in a three-dimensional cartesian coordinate system in such a way that all the vertices have rational \(x,y,z\) coordinates. By scaling the cube, you can make all the coordinates integers. The same is true of the regular tetrahedron and the regular octahedron, but for these objects the scaling factor includes an irrational number, \(\sqrt{2}\). For the dodecahedron and the icosahedron, some of the vertex coordinates are themselves irrational no matter how the figure is scaled; the irrational number \(\varphi = (1 + \sqrt{5})\, /\, 2\) plays an essential role in the geometry.
This intrusion of irrationality into geometry troubled the ancients, but we seem to have gotten used to it by now. However, David Eppstein, a.k.a. 0xDE, writes about a class of polyhedra I still find deeply disconcerting: “Polyhedra whose vertex coordinates have no closed form formula.” In this context a closed-form formula is a mathematical expression that can be evaluated in a finite number of operations. There’s no universal agreement on exactly what operations are allowed; Eppstein works with computational models that allow the familiar operations \(+, -, \times, \div\) as well as finding roots of polynomials. He constructs a polyhedron (it looks something like the teepee his children used to play with) whose vertex coordinates cannot all be calculated by a finite sequence of these operations.
With this development we land in territory even stranger than that of the irrational numbers. I cannot draw an exact equilateral triangle on a computer screen because at least one of the coordinates is irrational; nevertheless, I can tell you that the vertex ought to have the coordinate \(y = 1 / \sqrt{3}\). In Eppstein’s polyhedron I can’t give you any such compact, digestible description of where the vertex belongs; the best anyone can offer is a program for approximating it.
Brent Yorgey in The Math Less Traveled offers an unlikely question: What’s the best way to read a manuscript printed on three-sided paper? If you have a sheaf of unbound pages printed on just one side, the obvious procedure is to read the top sheet, then shuffle it to the bottom of the heap, and continue until you come back to page 1. If the pages are printed on two sides, life gets a little more complicated. You can read the top page, flip it over and read the back, then flip it again and move it to the bottom of the sheaf. An alternative (attributed to John Horton Conway) is to read the front page, move it to the bottom of the heap, then flip over the entire stack to read the back side; then flip the stack again to read the front of the following page. In either case, you must alternate two distinct operations, which means you must somehow keep track of which move comes next.
The unsolved problem is to find the best algorithm when the pages are printed on three sides. “You may not be familiar with triple-sided sheets of paper,” Yorgey writes, “so here’s how they work: they stack nicely, just like regular sheets of paper, but you have to flip one over three times before you get back to the original side.”
Yorgey gives some criteria that a successful algorithm ought to satisfy, but the question remains open.
Mr. Honner takes up a problem encountered at a Math for America banquet:
Suppose you are standing several miles from the Pentagon. What is the probability you can see three sides of the building?
He discovers that it’s one of those cases where interpreting the problem is as much of a challenge as finding the answer. “In particular, it’s a reminder of how the different ways we model random selection can make for big differences in our solutions!”
Long-tailed data distributions—where extreme values are more common than they would be in, say, a normal distribution—are notoriously tricky. John D. Cook points out that long tails become even more treacherous when the variables are discrete (e.g., integers) rather than continuous values.
Suppose \(n\) data points are drawn at random from a distribution where each possible value \(x\) appears with frequency proportional to \(x^{-\alpha}\). Working from the data sample, we want to estimate the value of the exponent \(\alpha\). If \(x\) is a continuous variable, there’s a maximum-likelihood formula that generally works well: \(\hat{\alpha} = 1 + n\, / \sum \log x\). But with discrete variables, the same method leads to disastrous errors. In a test case with \(\alpha = 3\), the formula yields \(\hat{\alpha} = 6.87\). But we needn’t lose hope. Cook presents another approach that gives quite reliable results.
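The continuous-variable estimator quoted above takes only a few lines to compute. This sketch assumes, as the formula implies, that the smallest possible value of \(x\) is 1; the function name `alphaHat` is hypothetical. It illustrates only the continuous formula, not the more reliable discrete approach that Cook presents.

```javascript
// Maximum-likelihood estimate for a continuous power law with minimum value 1:
// alphaHat = 1 + n / sum(log x).
function alphaHat(samples) {
  var sumLog = 0;
  for (var i = 0; i < samples.length; i++) {
    sumLog += Math.log(samples[i]);
  }
  return 1 + samples.length / sumLog;
}
```

Note that a sample value of exactly 1 contributes \(\log 1 = 0\) to the sum in the denominator.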
A post on DRHagen.com tackles a mystery found in an xkcd cartoon:
The size of each date is proportional to its frequency of occurrence in the Google Books ngram database for English-language books published since 2000. September 11 is clearly an outlier. That’s not the mystery. The mystery is that the 11th of most other months is noticeably less common than other dates. The cartoonist, Randall Munroe, was puzzled by this; hover over the image to see his message. Hagen convincingly solves the mystery. I’m not going to give away the solution, and I urge you to try to come up with a hypothesis of your own before following the link to DRHagen.com. And if you get that one right, you might work on another calendrical anomaly—that the 2nd, 3rd, 22nd, and 23rd are underrepresented in books printed before the 20th century—which Hagen solves in a followup post.
A shiny new blog called Off the Convex Path, with contributions from Sanjeev Arora, Moritz Hardt, and Nisheeth Vishnoi, “is dedicated to the idea that optimization methods—whether created by humans or nature, whether convex or nonconvex—are exciting objects of study and often lead to useful algorithms and insights into nature.” A December 21 post by Vishnoi looks at optimization methods in the guise of dynamical systems, specifically systems that converge to some set of fixed points. Vishnoi gives two examples, both from the life sciences: a version of Darwinian evolution, and a biological computer in which slime molds solve linear programming problems.
In the Comfortably Numbered blog, hardmath123 writes engagingly and entertainingly about numerical coincidences, like this one:
\(1337^{47168026} \approx \pi \cdot 10^{147453447}\) to within \(0.00000001\%\). It begins with the digits \(31415926 \ldots\).
The existence of such coincidences should not be a big surprise. As hardmath123 writes, “Since the rationals are dense in the reals, we can find a rational number arbitrarily close to any real number.” The question is how to find them. The answer involves continued fractions and the Dirichlet approximation theorem.
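The basic continued-fraction step can be sketched in a few lines. This is not hardmath123's full search for the \(1337\) coincidence, only the standard recurrence that turns a real number into a sequence of good rational approximations. The function name `convergents` is made up for this example, and because it works with floating-point numbers only the first dozen or so convergents can be trusted.

```python
import math
from fractions import Fraction

def convergents(x, count):
    """Return the first `count` continued-fraction convergents of x."""
    h_prev, h = 1, int(math.floor(x))   # numerators
    k_prev, k = 0, 1                    # denominators
    result = [Fraction(h, k)]
    frac = x - math.floor(x)
    while len(result) < count and frac > 0:
        x = 1 / frac
        a = int(math.floor(x))          # next partial quotient
        frac = x - a
        # Standard recurrence: new = a * current + previous.
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
        result.append(Fraction(h, k))
    return result

for c in convergents(math.pi, 4):
    print(c, float(c))
```

Each convergent is the best rational approximation available with a denominator of that size, which is why the familiar 22/7 and 355/113 turn up for \(\pi\).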
For the final acts of this carnival, we have a few items on mathematics in the arts, crafts, games, and other recreations.
In the Huffington Post arts and culture department, Dan Rockmore takes a mathematical gallery walk in New York. The first stop is the Whitney Museum of American Art, showing a retrospective of the paintings of Frank Stella, some of whose canvases are nonrectilinear, and even nonconvex. In another exhibit mathematics itself becomes art, with 10 equations and other expressions calligraphically rendered by 10 noted mathematicians and scientists.
Katherine, writing on the blog Will Knit for Math, tells how “Being a mathematician improves my knitting.” It’s not just a matter of counting stitches (although a scheme for counting modulo 18 does enter into the story). I hope we’ll someday get a followup post explaining how “Being a knitter improves my mathematics.”
Chelsea VanderZwaag, a student majoring in mathematics and elementary education, visits an arts-focused school, where a lesson on paper-folding and fractions blends seamlessly into the curriculum. (An earlier post on the problem of creating a shape by folding paper and making a single cut with scissors provides a little more background.)
On Medium, A Woman in Technology writes about Dobble, a combinatorial card game. “My children got this game for Christmas. We haven’t played it yet. We got as far as me opening the tin and reading the rules, at which point I got distracted by the maths and forgot about the game.”
There are 55 cards, each bearing eight symbols, and each card shares exactly one symbol with each of the other cards. How many distinct symbols do you need to construct such a deck of cards? The game instructions say there are 50, but the author determines through hard work, scribbling, and spreadsheets that the instructions are in error. It all comes to a happy end with the formula \(n^2 - n + 1\) and a suggested improvement to the game that the manufacturer should heed.
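For readers who would rather skip the scribbling, here is a sketch of one standard way to build such a deck, using the finite projective plane of prime order \(q\). With \(n = q + 1\) symbols per card it yields \(n^2 - n + 1\) cards and the same number of symbols. This is not necessarily the route the Medium post takes; the function name `build_deck` and the numbering of symbols are assumptions of the example, and the construction as written works only when \(q\) is prime.

```python
from itertools import combinations

def build_deck(q):
    """Build a deck from the projective plane of prime order q.

    Each card carries q + 1 symbols (numbered from 0), and the deck has
    q*q + q + 1 cards drawing on q*q + q + 1 distinct symbols.
    """
    cards = []
    # q + 1 cards that all share symbol 0 and split the rest into blocks of q.
    for i in range(q + 1):
        cards.append([0] + [1 + i * q + j for j in range(q)])
    # q * q cards, each taking exactly one symbol from every block.
    for i in range(q):
        for j in range(q):
            cards.append([i + 1] + [q + 1 + q * k + (i * k + j) % q for k in range(q)])
    return cards

deck = build_deck(7)  # eight symbols per card
symbols = {s for card in deck for s in card}
shared = {len(set(a) & set(b)) for a, b in combinations(deck, 2)}
print(len(deck), "cards,", len(symbols), "symbols, shared per pair:", shared)
```

With eight symbols per card the full deck has 57 cards, two more than the 55 in the tin.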
That’s it for Carnival of Math 130. My apologies to a few contributors whose interesting work I just couldn’t squeeze in here.
Next month the carnival makes a hometown stop with Katie at The Aperiodical.
]]>Scribble, scribble, scribble. As if the world didn’t get enough of my writing already, with a bimonthly column in American Scientist, now I’m equipped to publish my every thought on a momen’t notice.
That’s how it all began here on bit-player.org: The first post (with the first typo) appeared on January 9, 2006. I’ve published another 340 posts since then (including this one)—and doubtless many more typos and other errors. Many thanks to my readers, especially those who have contributed some 1,800 thoughtful comments.
]]>My closest friends and family must make do with an old-fashioned paper-and-postage greeting card, but for bit-player readers I can send some thoroughly modern pixels. Happy holidays to everyone.
In recent months I’ve been having fun with “deep dreaming,” the remarkable toy/tool for seeing what’s going on deep inside deep neural networks. Those networks have gotten quite good at identifying the subject matter of images. If you train the network on a large sample of images (a million or more) and then show it a picture of the family pet, it will tell you not just whether your best friend is a cat or a dog but whether it’s a Shih Tzu or a Bichon Frise.
What visual features of an image does the network seize upon to make these distinctions? Deep dreaming tries to answer this question. It probes a layer of the network, determines which neural units are most strongly stimulated by the image, and then translates that pattern of activation back into an array of pixels. The result is a strange new image embellished with all the objects and patterns and geometric motifs that the selected layer thinks it might be seeing. Some of these machine dreams are artful abstractions; some are reminiscent of drug-induced hallucinations; some are just bizarre or even grotesque, populated by two-headed birds and sea creatures swimming through the sky.
For this year’s holiday card I chose a scene appropriate to the season and ran it through the deep-dreaming program. You can see some of the output below, starting with the original image and progressing through fantasies extracted from deeper and deeper layers of the network. (Navigate with the icons below the image, or use the left and right arrow keys. Shorthand labels identifying the network layers appear at lower right.)
A few notes and observations:
And some links:
Update 2015-12-31: In the comments, Ed Jones asks, “If the original image is changed slightly, how much do the deep dreaming images change?” It’s a very good question, but I don’t have a very good answer.
The deep dreaming procedure has some stochastic stages, and so the outcome is not deterministic. Even when the input image is unchanged, the output image is somewhat different every time. Below are enlargements cropped from three runs probing layer 4c, all with exactly the same input:
They are all different in detail, and yet at a higher level of abstraction they are all the same: They are recognizably products of the same process. That statement remains true when small changes—and even some not-so-small ones—are introduced into the input image. The figure below has undergone a radical shift in color balance (I have swapped the red and blue channels), but the deep dreaming algorithm produces similar embellishments, with an altered color palette:
In the pair of images below I have cloned a couple of trees from the background and replanted them in the foreground. They are promptly assimilated into the deep dream fantasy, but again the overall look and feel of the scene is totally familiar.
Based on the evidence of these few experiments, it seems the deep dreaming images are indeed quite robust, but there’s another side to the story. When these neural networks are used to recognize or classify images (the original design goal), it’s actually quite easy to fool them. Christian Szegedy and his colleagues have shown that certain imperceptible changes to an image can cause the network to misclassify it; to the human eye, the picture still looks like a school bus, but the network sees it as something else. And Anh Nguyen et al. have tricked networks into confidently identifying images that look like nothing but noise. These results suggest that the classification methods are rather brittle or fragile, but that’s not quite right either. Such errors arise only with carefully crafted images, called “adversarial examples.” There is almost no chance that a random change to an image would trigger such a response.
]]>
Consider Boston, our best guess for where you might be reading this article. It’s very expensive for spending on the average Medicare patient. But, when it comes to private health insurance, it’s about average.
Their best guess about my whereabouts was pretty good. I am indeed in the Boston area. I’m not surprised that they know that, but I was nonplussed to discover that they had altered the story according to my location. Here’s the markup for that paragraph:
<div class="g-insert"> <p class="g-body"> Consider <span class="g-custom-place g-selected-hrr-name"> Boston</span> <span class="g-geotarget-success">, our best guess for where you might be reading this article</span>. <span class="g-new-york-city-addition g-custom-insert g-hidden"> (Here, the New York City region includes all boroughs but the Bronx, which is listed separately.)</span> It’s <span class="g-medicare-adjective g-custom-place"> very expensive</span> for spending on the average Medicare patient. <span class="g-local-insert g-very-different"> But, w</span><span class="g-close g-same g-hidden g-local-insert"> W</span>hen it comes to private health insurance, it’s <span class="g-hidden g-same g-local-insert">also</span> <span class="g-private-adjective g-custom-place"> about average</span>. <span class="g-close g-hidden g-local-insert"> The study finds that the levels of spending for the two programs are unrelated. That means that, for about half of communities, spending is somewhat similar, like it is in <span class="g-custom-place g-selected-hrr-name g-hrr-only-no-state"> Boston</span> </span> <span class="g-same g-hidden g-local-insert g-same-sentence"> <span class="g-custom-place g-hrr-only-no-state">Boston</span> is one of the few places where spending for both programs is very similar – in most, there is some degree of mismatch. </span> <span class="g-new-york-addition g-local-insert g-hidden"> Several parts of the New York metropolitan area are outliers in the data – among the most expensive for both health insurance systems.</span> <!-- <span class="g-atlanta-addition g-local-insert g-hidden"> (Atlanta is one of the few places in the country where spending for both programs is very similar. In most, there is some degree of mismatch.)</p> --> </p> </div>
It seems I live in a g-custom-place (“Boston”), which is associated with a g-medicare-adjective (“very expensive”) and a g-private-adjective (“about average”). Because the Times has been able to track me down, I get a g-geotarget-success message. But for the same reason I don’t get to see certain other text, such as a remark about New York as an outlier; those text spans are g-hidden.
Presumably, a program running on the server has located me by checking my IP address against a geographic database, then added various class names to the span tags. Some of the class names are processed by a CSS stylesheet; for example, g-hidden triggers the style directive display: none. The other class names are apparently processed by a JavaScript program that inserts or removes text, such as those custom adjectives. Generating text in this way looks like a pretty tedious and precarious business, a little like writing poetry with refrigerator magnets. For example, an extra set of class names is needed to make sure that if a place is first mentioned as a city and state (e.g., “Springfield, Massachusetts”), the state won’t be repeated on subsequent references.
I suppose there’s no great harm in this bit of localizing embellishment. After all, they’re not tailoring the article based on whether I’m black or white, male or female, Democrat or Republican, rich or poor. It’s just a geographic split. But it makes me queasy all the same. When I cite an article here on bit-player, I want to think that everyone who follows the link will see the same article. It looks like I can’t count on that.
Update: Turns out this is not the Times’s first adventure in geotargeting. A story last May on “paths out of poverty” used the same technique, as reported by Nieman Lab. (Thanks to Andrew Silver (@asilver360) for the tip via Twitter.)
Update 2015-12-17: Margaret Sullivan, the Public Editor of the Times, writes today on the mixed response to the geotargeted story, concluding:
]]>The Times could have quite easily provided readers with an opt-in: “Want to see results for your area? Click here.”
As the paper continues down this path, it’s important to do so with awareness and caution. For one thing, some readers won’t like any personalization and will regard it as intrusive. For another, personalization could deprive readers of a shared, and expertly curated, news experience, which is what many come to The Times for. Losing that would be a big mistake.
Please contribute! Anything that might engage or delight the mathematical mind is welcome: theorems, problems, games and recreations, notes on math education. We’ll take pure and applied, discrete and continuous, geometrical and arithmetical, differential and integral, polynomial and exponential…. You get the idea. And don’t be shy about proposing your own work!
]]>Porter links continuing growth not just to prosperity but also to a host of civic virtues. “Economic development was indispensable to end slavery,” he declares. “It was a critical precondition for the empowerment of women…. Indeed, democracy would not have survived without it.” As for what might happen to us without ongoing growth, he offers dystopian visions from history and Hollywood. Before the industrial revolution, “Zero growth gave us Genghis Khan and the Middle Ages, conquest and subjugation,” he says. A zero-growth future looks just as grim: “Imagine ‘Blade Runner,’ ‘Mad Max,’ and ‘The Hunger Games’ brought to real life.”
Let me draw you a picture of this vision of economic growth through the ages, as I understand it:
For hundreds or thousands of years before the modern era, average wealth and economic output were low, and they grew only very slowly. Life was solitary, poor, nasty, brutish, and short. Today we have vigorous economic growth, and the world is full of wonders. Life is sweet, for now. If growth comes to an end, however, civilization collapses and we are at the mercy of new barbarian hordes (equipped with a different kind of horsepower).
Something about this scenario puzzles me. In that frightful Mad Max future, even though economic growth has tapered off, the society is in fact quite wealthy; according to the graph, per capita gross domestic product is twice what it is today. So why the descent into brutality and plunder?
Porter has an answer at the ready. The appropriate measure of economic vitality, he implies, is not GDP itself but the rate of growth in GDP, or in other words the first derivative of GDP as a function of time:
If the world follows the trajectory of the blue curve, we have already reached our peak of wellbeing. It’s all downhill from here.
Some economists go even further, urging us to keep an eye on the second derivative of economic activity. Twenty-five years ago I was hired to edit the final report of an MIT commission on industrial productivity. Among the authors were two prominent economists, Robert Solow and Lester Thurow. I argued with them at some length about the following paragraph (they won):
In view of all the turmoil over the apparently declining stature of American industry, it may come as a surprise that the United States still leads the world in productivity. Averaged over the economy as a whole, for each unit of input the United States produces more output than any other nation. With this evidence of economic efficiency, is there any reason for concern? There are at least two reasons. First, American productivity is not growing as fast as it used to, and productivity in the United States is not growing as fast as it is elsewhere, most notably in Japan….
The phrase I have highlighted warns us that even though productivity is high and growing higher, we need to worry about the rate of change in the rate of growth:
Taking the second derivative (the green curve) as the metric of economic health, it appears we have already fallen back to the medieval baseline, and life is about to get even worse than it was at the time of the Mongol conquests; the Mad Max world will be an improvement over what lies in store in the near future.
Why should human happiness and the fate of civilization depend on the time derivative of GDP, rather than on GDP itself? Why do we need not just wealth, but more and more wealth, growing faster and faster? Again, Porter has an answer. Without growth, he says, economic life becomes a zero-sum game. “As Martin Wolf, the Financial Times commentator has noted, the option for everybody to become better off—where one person’s gain needn’t require another’s loss—was critical for the development and spread of the consensual politics that underpin democratic rule.” In other words, the function of economic growth is to blunt the force of envy in a world with highly skewed distributions of income and wealth. I’m not persuaded that growth per se is either necessary or sufficient to deal with this issue.
Porter’s essay on zero growth was prompted by the climate-change negotiations now under way in Paris. He worries (along with many others) that curtailing consumption of fossil fuels will lead to lower overall production and consumption of goods and services. That’s surely a genuine risk, but what’s the alternative? If burning more and more carbon is the only way we can keep our civilization afloat, then somebody had better send for Mad Max. The age of fossil fuels is going to end, sooner or later, if not because of the climatic effects then because the supply is finite.
Economic growth is not necessarily tied to the carbon budget, but it can’t be cut loose entirely from physical resources. Even the ethereal goods that are now so prominent in commerce—code and data—require some sort of material infrastructure. Ultimately, whether growth continues is not a question of social and economic policy or moral philosophy; it’s a matter of physics and mathematics. I’m with Kenneth Boulding. I don’t see Mad Max in our future, but I’m not counting on perpetual growth, either.
]]>When I looked over the collection, I quickly realized that we could not form a set of eight matching glasses. The closest we could come was 6 + 2. But then I saw that we could form a set of eight glasses with no two alike. As I placed them on the table, I thought “Aha, Ramsey theory!”
At the root of Ramsey theory lies this curious assertion: If a collection of objects is large enough, it cannot be entirely without structure or regularity. Dinner parties offer the canonical example: If you invite six people to dinner, then either at least three guests will already be mutual acquaintances (each knows all the others) or at least three guests will be strangers (none has met any of the others). This result has nothing to do with the nature of social networks; it is a matter of pure mathematics, first proved by the Cambridge philosopher, mathematician, and economist Frank Plumpton Ramsey (1903–1930).
Ramsey problems become a little easier to reason about when you transpose them into the language of graph theory. Consider a complete graph on six vertices (where every vertex has an edge connecting it with every other vertex, for a total of 15 edges):
The aim is to color all the edges of the graph red or blue in such a way that no three vertices are connected by edges of the same color (forming a “monochromatic clique”). The red edges might signify “mutually acquainted” and the blue ones “strangers.” As the diagrams below show, it’s easy to find a successful red-and-blue coloring of a complete graph on five vertices: In the pentagon at left, each vertex is connected to two other vertices by red edges, but those vertices are connected to each other by a blue edge. Thus there are no red triangles, and a similar analysis shows there are no blue ones either. The same scheme doesn’t work for a six-vertex graph, however. The attempt shown at right fails with two blue triangles. In fact, any two-coloring of this graph has monochromatic triangles. Ramsey’s 1928 proof of this assertion is based on the pigeonhole principle. These days, we also have the option of just checking all \(2^{15}\) possible colorings.
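The exhaustive check is short enough to sketch. The code below tries every red-or-blue assignment of the edges and counts those with no monochromatic triangle; the function names are invented for the example, and nothing here is tuned for speed.

```python
from itertools import combinations, product

def has_mono_triangle(n, color):
    """color maps each edge (i, j) with i < j to 0 (red) or 1 (blue)."""
    return any(
        color[(a, b)] == color[(a, c)] == color[(b, c)]
        for a, b, c in combinations(range(n), 3)
    )

def colorings_without_mono_triangle(n):
    """Count two-colorings of the complete graph on n vertices that have
    no monochromatic triangle, by trying every assignment."""
    edges = list(combinations(range(n), 2))
    count = 0
    for bits in product((0, 1), repeat=len(edges)):
        if not has_mono_triangle(n, dict(zip(edges, bits))):
            count += 1
    return count

print("5 vertices:", colorings_without_mono_triangle(5))
print("6 vertices:", colorings_without_mono_triangle(6))
```

The five-vertex count is positive (the pentagon coloring is among the survivors), while the six-vertex count is zero.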
More formally, the Ramsey number \(\mathcal{R}(m, n)\) is the number of vertices in the smallest complete graph for which a two-coloring of the edges is certain to yield a red clique of \(m\) vertices or a blue clique of \(n\) vertices (or both). In applying this notion to the wine glass problem, I was asking: How many glasses do I need to have in my cupboard to ensure there are either eight all alike or eight all different?
At dinner that night we cheerfully clinked our eight dissimilar glasses. Maybe we even completed the full round of \((8 \times 7) / 2 = 28\) clinks. Later on, after everyone had gone home and all the glasses were washed, my thoughts returned to Ramsey theory. I was wondering, “What is the value of \(\mathcal{R}(8, 8)\), the smallest complete graph that is sure to have a monochromatic subgraph of at least eight vertices?” Lying awake in the middle of the night, I worked out a solution in terms of wine glasses.
Suppose you start with an empty cupboard and add glasses one at a time, aiming to assemble a collection in which no eight glasses are all alike and no eight glasses are all different. You could start by choosing seven different glasses—but no more than seven, lest you create an all-different set of eight. Every glass you subsequently add to the set must be the same as one of the original seven. You can keep going in this way until you have seven sets of seven identical glasses. When you add the next glass, however, you can’t avoid creating a set that either has eight glasses all alike or eight all different. Thus it appears that \(\mathcal{R}(8, 8) = 7^2 + 1 = 50\).
The moment I reached this conclusion, I knew something was dreadfully wrong. Computing Ramsey numbers is hard. After decades of mathematical and computational labor, exact \(\mathcal{R}(m, n)\) values are known for only nine cases, all with very small values of \(m\) and \(n\). Lying in the dark, without Google at my fingertips, I couldn’t remember the exact boundary between known and unknown, but I was pretty sure that \(\mathcal{R}(8, 8)\) lay on the wrong side. The idea that I might have just calculated this long-sought constant in my head was preposterous. And so, in a state of drowsy perplexity, I fell asleep.
Next morning, the mystery evaporated. Where did my reasoning go wrong? You might want to think a moment before revealing the answer.
No, I wasn’t drunk. The blue trace shows me lurching all over the track, straying onto the soccer field, and taking scandalous shortcuts in the turns—but none of that happened, I promise. During the entire run my feet never left the innermost lane of the oval. All of my apparent detours and diversions result from GPS measurement errors or from approximations made in reconstructing the path from a finite set of measured positions.
At the end of the run, the app tells me how far I’ve gone, and how fast. Can I trust those numbers? Looking at the map, the prospects for getting accurate summary statistics seem pretty dim, but you never know. Maybe, somehow, the errors balance out.
Consider the one-dimensional case, with a runner moving steadily to the right along the \(x\) axis. A GPS system records a series of measured positions \(x_0, x_1, \ldots, x_n\) with each \(x_i\) displaced from its true value by a random amount no greater than \(\pm\epsilon\). When we calculate total distance from the successive positions, most of the error terms cancel. If \(x_i\) is shifted to the right, it is farther from \(x_{i-1}\) but closer to \(x_{i+1}\). For the run as a whole, the worst-case error is just \(\pm 2 \epsilon\)—the same as if we had recorded only the endpoints of the trajectory. As the length of the run increases, the percentage error goes to zero.
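The cancellation is easy to check numerically. The sketch below assumes uniform errors and, importantly, that \(\epsilon\) is less than half the spacing between samples, so that the measured points never swap order; without that assumption the gaps no longer telescope. The function name and the particular numbers are illustrative only.

```python
import random

def measured_length_1d(true_positions, eps, rng):
    """Perturb each position by a uniform error in [-eps, eps] and sum the
    absolute gaps between successive measured positions."""
    measured = [x + rng.uniform(-eps, eps) for x in true_positions]
    return sum(abs(b - a) for a, b in zip(measured, measured[1:]))

rng = random.Random(1)
step, eps, n = 10.0, 1.0, 1000
true_positions = [i * step for i in range(n + 1)]
true_length = step * n
error = measured_length_1d(true_positions, eps, rng) - true_length
print(f"error over {n} steps: {error:+.3f} (bound is +/-{2 * eps})")
```

However many intermediate samples are taken, the total error depends only on the errors at the two endpoints.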
In two dimensions the situation is more complicated, but one might still hope for a compensating mechanism whereby some errors would lengthen the path and others shorten it, and everything would come out nearly even in the end. Until a few days ago I might have clung to that hope. Then I read a paper by Peter Ranacher of the University of Salzburg and four colleagues. (Take your choice of the journal version, which is open access, or the arXiv preprint. Hat tip to Douglas McCormick in IEEE Spectrum, where I learned about the story.)
Ranacher’s conclusion is slightly dispiriting for the runner. On a two-dimensional surface, GPS position errors introduce a systematic bias, tending to exaggerate the length of a trajectory. Thus I probably don’t run as far or as fast as I had thought. But to make up for that disappointment, I have learned something new and unexpected about the nature of measurement in the presence of uncertainty, and along the way I’ve had a bit of mathematical adventure.
The Runmeter app works by querying the phone’s GPS receiver every few seconds and recording the reported longitude and latitude. Then it constructs a path by drawing straight line segments connecting successive points.
Two kinds of error can creep into the GPS trajectory. Measurement errors arise when the reported position differs from the true position. Interpolation errors come from the connect-the-dots procedure, which can miss wiggles in the path between sampling points. Ranacher et al. consider only the inaccuracies of measurement, on the grounds that interpolation errors can be reduced by more frequent sampling or a more sophisticated curve-fitting method (e.g., cubic splines rather than line segments). Interpolation error is eliminated altogether if the runner’s path is a straight line.
Suppose a runner on the \(x, y\) plane shuttles back and forth repeatedly between the points \(p = (0, 0)\) and \(q = (d, 0)\). In other words, the end points of the path lie \(d\) units apart along the \(x\) axis. After \(n\) trips, the true distance covered is clearly \(nd\). A GPS device records the runner’s position at the start and end of each segment, but introduces errors in both the \(x\) and \(y\) coordinates. Call the perturbed positions \(\hat{p}\) and \(\hat{q}\), and the Euclidean distance between them \(\hat{d}\). Ranacher and his colleagues show that for large \(n\) the total GPS distance \(n \hat{d}\) is strictly greater than \(nd\) unless all the measurement errors are perfectly correlated.
I wanted to see for myself how measured distance grows as a function of GPS error, so I wrote a simple Monte Carlo program. The Ranacher proof makes no assumptions about the statistical distribution of the errors, but in a computer simulation it’s necessary to be more concrete. I chose a model where the GPS positions are drawn uniformly at random from square boxes of edge length \(2 \epsilon\) centered on the points \(p\) and \(q\).
In the sketch above, the black dots, separated by distance \(d\), represent the true endpoints of the runner’s path. The red dots are two GPS coordinates \(\hat{p}\) and \(\hat{q}\), and the red line gives the measured distance between them. We want to know the expected length of the red line averaged over all possible \(\hat{p}\) and \(\hat{q}\).
Getting the answer is quite easy if you’ll accept a numerical approximation based on a finite random sample. Write a few lines of code, pick some reasonable values for \(d\) and \(\epsilon\), crank up the random number generator, and run off 10 million iterations. Some results:
\(\epsilon\) | \(d\) | \(\hat{d}\) |
---|---|---|
0.0 | 1.0 | 1.0000 |
0.1 | 1.0 | 1.0034 |
0.2 | 1.0 | 1.0135 |
0.3 | 1.0 | 1.0306 |
0.4 | 1.0 | 1.0554 |
0.5 | 1.0 | 1.0882 |
For each value of \(\epsilon\) the program generated \(10^7\) \((\hat{p}, \hat{q})\) pairs, calculated the Euclidean distance \(\hat{d}\) between them, and finally took the average \(\langle \hat{d} \rangle\) of all the distances. It’s clear that \(\langle \hat{d} \rangle > d\) when \(\epsilon > 0\). Not so clear is where these particular numbers come from. Can we understand how \(\hat{d}\) is determined by \(d\) and \(\epsilon\)?
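A program along these lines might look like the following sketch. It is a reconstruction from the description above, not the original code: the function name, the seed, and the smaller default of 200,000 trials (chosen so it runs in a second or two) are all assumptions.

```python
import math
import random

def mean_measured_distance(d, eps, trials=200_000, seed=0):
    """Average distance between GPS fixes drawn uniformly from squares of
    edge 2 * eps centered on p = (0, 0) and q = (d, 0)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        px, py = rng.uniform(-eps, eps), rng.uniform(-eps, eps)
        qx, qy = d + rng.uniform(-eps, eps), rng.uniform(-eps, eps)
        total += math.hypot(qx - px, qy - py)
    return total / trials

for eps in (0.0, 0.1, 0.3, 0.5):
    print(eps, round(mean_measured_distance(1.0, eps), 4))
```

With fewer trials than the \(10^7\) used for the table, the last digit or two will wobble from run to run, but the upward drift with \(\epsilon\) is unmistakable.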
For a little while, I thought I had a simple explanation. I reasoned as follows: We already know from the one-dimensional case that the \(x\) component of the measured distance has an expected value of \(d\). The \(y\) component, orthogonal to the direction of motion, is the difference between two randomly chosen points on a line of length \(2 \epsilon\); a symmetry argument gives this length an expected value of \(2 \epsilon / 3\). Hence the expected value of the measured distance is:
\[\hat{d} = \sqrt{d^2 + \left(\frac{2 \epsilon}{3}\right)^2}\, .\]
Ta-dah!
Then I tried plugging some numbers into that formula. With \(d = 1\) and \(\epsilon = 0.3\) I got a distance of 1.0198. The discrepancy between this value and the numerical result 1.0306 is much too large to dismiss.
What was my blunder? Repeat after me: The average of the squares is not the same as the square of the average. I was calculating the squared distance as \({ \langle x \rangle}^2 + {\langle y \rangle}^2 \) when what I should have been doing is \(\langle {x^2 + y^2}\rangle\). We need to average over all possible distances between a point in one square and a point in the other, not over all \(x\) and \(y\) components of those distances. Trouble is, I don’t know how to calculate the correct distance.
I thought I’d try to find an easier problem. Suppose the runner stops to tie a shoelace, so that the true distance \(d\) drops to zero; thus any movement detected is a result of GPS errors. As long as the runner remains stopped, the two error boxes exactly overlap, and so the problem reduces to finding the average distance between two randomly selected points in the unit square. Surely that’s not too hard! The answer ought to be some simple and tidy expression—don’t you think?
In fact the problem is not at all easy, and the answer is anything but tidy. We need to evaluate a terrifying quadruple integral:
\[\iiiint_0^1 \sqrt{(x_q - x_p)^2 + (y_q - y_p)^2} \, dx_p \, dx_q \, dy_p \, dy_q\, .\]
Lucky for me, I live in the age of MathOverflow and StackExchange, where powerful wizards have already done my homework for me.
\[\frac{2+\sqrt{2}+5\log(1+\sqrt{2})}{15} \approx 0.52140543316\]
Nothing to it, eh?
The corresponding expression for nonzero \(d\) is doubtless even more of a monstrosity, but I’ve made no attempt to derive it. I am left with nothing but the Monte Carlo results. (For what it’s worth, the simulations do agree on the value \(\hat{d} = 0.5214\) for \(d = 0\).)
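The agreement between the closed form and simulation for the unit square can be checked with a short sketch. The trial count and seed are arbitrary choices for the example.

```python
import math
import random

# Closed-form mean distance between two random points in the unit square.
closed_form = (2 + math.sqrt(2) + 5 * math.log(1 + math.sqrt(2))) / 15

def mean_distance_unit_square(trials=200_000, seed=0):
    """Monte Carlo estimate of the same quantity."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        dx = rng.random() - rng.random()
        dy = rng.random() - rng.random()
        total += math.hypot(dx, dy)
    return total / trials

print(f"closed form: {closed_form:.6f}")
print(f"simulation:  {mean_distance_unit_square():.4f}")
```

Note that `math.log` is the natural logarithm, which is what the formula calls for.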
I tried applying the Monte Carlo program to my 1,600-meter run. In the Runmeter data my position is sampled 100 times, or every six seconds on average, which means that \(d\) (the true average distance between samples) should be about 16 meters. Estimating \(\epsilon\) is not as easy. In the map above there’s one point on the blue path that’s displaced by at least 10 meters, but if we ignore that outlier most of the other points are probably within about 3 meters of the correct lane. Plugging in \(d = 16\) and \(\epsilon = 3\) yields about 1,620 meters as the expected measured distance.
What does the Runmeter app have to say? It reports a total distance of 1,599.5 meters, which is, I’m inclined to say, way too good to be true. Part of the explanation is that the measurement errors are not uniform random variables; there are strong correlations in both space and time. Also, measurement errors and interpolation errors surely have canceled out to some extent. (It’s even possible that the developers of the app have chosen the sampling interval to optimize the balance between the two error types.) Still, I have to say that I am quite surprised by this uncanny accuracy. I’ll have to run some more laps to see if the performance is repeatable.
Another thought: People have been measuring distances for millennia. How is it that no one noticed the asymmetric impact of measurement errors before the GPS era? Wouldn’t land surveyors have figured it out? Or navigators? Distinguished mathematicians, including Gauss and Legendre, took an interest in the statistical analysis of errors in surveying and geodesy. They even did field work. Apparently, though, they never stumbled on the curious fact that position errors orthogonal to the direction of measurement lead to a systematic bias toward greater lengths.
There’s yet another realm in which such biases may have important consequences: measurement in high-dimensional spaces. Inaccuracies that cause a statistical bias of 2 percent in two-dimensional space give rise to a 19 percent overestimate in 10-dimensional space. The reason is that errors along all the axes orthogonal to the direction of measurement contribute to the Euclidean distance. By the time you get to 1,000 spatial dimensions, the measured distance is more than six times the true distance.
Even for creatures like us who live their lives stuck in three-space, this observation might be more than just a mathematical curiosity. Lots of algorithms in machine learning, for example, measure distances between vectors in high-dimensional spaces. Some of those vectors may be closer than they appear.
]]>
- I am greatly indebted to Prof. Riesz for translating the present paper.
- I am indebted to Prof. Riesz for translating the preceding footnote.
- I am indebted to Prof. Riesz for translating the preceding footnote.
Why stop at three? Littlewood explains: “However little French I know I am capable of copying a French sentence.”
I thought of this incident the other day when I received a letter from Medicare. At the top of the single sheet of paper was the heading “A Message About Medicare Premiums,” followed by a few paragraphs of text, and at the bottom this boldface note:
The information is printed in Spanish on the back
Naturally, I turned the page over. I found the heading “Un mensaje sobre las primas de Medicare,” followed by a few paragraphs of Spanish text, and then this in boldface:
La información en español está impresa al dorso
The line is a faithful translation of the English text from the other side of the sheet. (O el inglés es una traducción fiel del español.) But in this case neither copying nor faithful translation quite suffices. It seems we have fallen into the wrong symmetry group. The statement “This sentence is not in Spanish” is true, but its translation into Spanish, “Esta frase no está en español” is false. Apart from that self-referential tangle, if the two boldface notes in the letter are to be of any use to strictly monolingual readers, shouldn’t they be on opposite sides of the paper?
By the way, I had always thought the Littlewood three-footnote story referred to a real paper. But his account in A Mathematician’s Miscellany suggests it was a prank he never had a chance to carry out. And in browsing the Comptes Rendus on Gallica, I find no evidence that Littlewood ever published there. [Please see comment below by Gerry Myerson.]
]]>A place where thousands of people suffered and died makes an uncomfortable tourist destination, yet looking away from the horror seems even worse than staring. And so, when Ros and I were driving from Prague to Dresden last month, we took a slight detour to visit Terezín, the Czech site that was the Theresienstadt concentration camp from late 1941 to mid 1945. We expected to be disturbed, but we stumbled onto something that was disturbing in an unexpected way.
Terezín was not built as a Nazi concentration camp. It began as a fortress, erected in the 1790s to defend the Austrian empire from Prussian threats. Earthen ramparts and bastions surround buildings that were originally the barracks and stables for a garrison of a few thousand troops. By the 20th century the fortress no longer served any military purpose. The troops withdrew, civilians moved in, and the place became a town with a population of about 7,000.
In 1941 the Gestapo and the SS seized Terezín, expelled the Czech residents, and began the “resettlement” of Jews deported from Prague and elsewhere. In the next three years 150,000 prisoners passed through the camp. All but 18,000 perished before the end of the war.
Now Terezín is again a Czech town, as well as a museum and memorial to the holocaust victims. It seems a lonely place. A few boys kick a ball around on the old parade ground, the café has two or three customers, someone is holding a rummage sale—but the town’s population and economy have not recovered. The museum occupies parts of a dozen buildings, but many of the others appear to be vacant.
We looked at the museum exhibits, then wandered off the route of the self-guided tour. At the edge of town, near a construction site, a tunnel passed under the fortifications. Walking through, we came out into a grassy strip of land between the inner and outer ramparts. When we turned back to the tunnel, we noticed graffiti on the walls of the portal.
At first I assumed it was recent adolescent scribbling, but on looking closer we began to see dates in the 1940s, carved into the sandstone blocks. Could it be true? Could these incised names and drawings really be messages from the concentration-camp era? If so, who left them for us? Did the prisoners have access to this tunnel, or was it an SS guard post?
I was skeptical. Too good to be true, I thought. If the carvings were genuine, they would not have been left out here, exposed to the elements and unprotected against vandalism. They would be behind glass in one of the museum galleries. But if they were not genuine, what were they?
I took pictures. (The originals are on Flickr.)
Back home, some days later, my questions were answered. Googling for a few phrases I could read in the inscriptions turned up the website ghettospuren.de, which offers extensive documentation and interpretation (in Česky, Deutsch, and English). Briefly, the carvings are indeed authentic, as shown by photographs made in 1945 soon after the camp was liberated. The markings were made by members of the Ghettowache, the internal police force selected from the prison population. A dozen of the artists have been identified by name.
The website is the project of Uta Fischer, a city planner in Berlin, with the photographer Roland Wildberg and other German and Czech collaborators. They are working to preserve the carvings and several other artifacts discovered in Terezín in the past few years.
I offer a few notes and speculations on some of the inscriptions, drawing heavily on Fischer’s commentary and translations:
“Brána střežena stráží ghetta L.P. 1944.” Translation from ghettospuren.de: “The gate is being guarded by the ghetto guard, A.D. 1944.” This sign, given a prominent position at the entrance to the tunnel, reads like a territorial declaration. The date is interesting. Are we to infer that the gate was not guarded by the stráží ghetta before 1944?
“Pamatce na pobyt 1941–1944.” Translation from ghettospuren.de: “In remembrance of the stay 1941–1944.” Fischer remarks on the formality of the inscription, suggesting that this part of the south wall was created as “a collective place of remembrance.” The carving has been badly damaged since the first photos were made in 1945.
A floral arrangement is the most elaborate of all the carvings. Fischer identifies the artist as Karel Russ, a shopkeeper in the Bohemian town of Kyšperk (now Letohrad). Fischer writes: “In the top center there is still a recognizable outline of the Star of David that was already removed in a rough manner in 1945.” For what it’s worth, I’m not so sure that’s not another flower. The deep hole in the middle was not present in 1945 and is not explained.
Four caricatures of the same figure are lined up on a single sandstone block on the north wall, with a fifth squeezed into a narrow spot on the block below. Why the repetition? And who was the subject? The Italian legend “Il capitano della guardia” and the double stripe on the hat suggest a high-ranking Ghettowache official. Did he take these cartoonish portrayals with good humor? Or could the drawings possibly be selfies?
The menorah at the bottom left of this panel is the only explicitly Jewish iconography I have spotted in these images. (As noted above, Fischer believes the floral panel included a Star of David.) As far as I can tell, there are no Hebrew inscriptions.
Portraits of a man and a woman? That’s my best guess, but the carving is indistinct. The line above presumably reads “M.C. 1944,” but the “1” has been gouged away.
Not all of the inscriptions come from the Second World War. This one, signed “Alchuz Jan,” is dated August 6, 1911. Another (not shown) claims to be from 1871.
It’s only to be expected that there are also later additions to the graffiti. Toward the bottom of this panel we have B.K. ♥ R.V. 1953. The white scrawl at top left is much more recent. On the other hand, the signature of “Waltuch Wilhelm” at upper right is from the war years. Fischer has identified him as the owner of a cinema in Vienna. Elsewhere he also signed his name in Cyrillic script.
I am curious about the chronology of the Ghettowache inscriptions. Are we seeing an accumulation of work carried out over a period of years, or was all the carving done in a few weeks or months? The preponderance of items dated 1944 argues for the latter view. In particular, the inscription “In remembrance of the stay 1941–1944” could not have been written before 1944, and it suggests some foreknowledge that the stay would soon be over.
A lot was going on at Terezín in 1944. In June, the camp was cleaned up for a stage-managed, sham inspection by the Red Cross; to reduce overcrowding in preparation for this event, part of the population was deported to Auschwitz. Later that summer, the SS produced a propaganda film portraying Theresienstadt as a pleasant retreat and retirement village for Jewish families; the film wasn’t really titled “The Führer Gives a Village to the Jews,” but it might as well have been. As soon as the filming was done, thousands more of the residents were sent to the death camps, including most of those who had acted in the movie. In the fall, with the war going badly for Germany, the SS decided to close the camp and transport everyone to the East. Perhaps that is when some of the inscriptions with a tone of finality were carved—but I’m only guessing about this.
As it happens, the liquidation of the ghetto was never completed, and in the spring of 1945 the flow of prisoners was reversed. Trains brought survivors back from the extermination camps in Poland, which were about to be overrun by the Red Army. When Terezín was liberated by the Soviets in early May, there were several thousand inmates. But the tunnel has no inscriptions dated 1945.
Graffiti is a varied genre. It encompasses scatological scribbling in the toilet stall, romantic declarations carved on tree trunks, the existential yawps of spray-paint taggers, dissident political slogans on city walls, religious ranting, sports fanaticism, and much else. It’s often provocative, sometimes indecent, inflammatory, insulting, or funny. The tunnel carvings at Terezín evoke a quite different set of adjectives: poignant, elegiac, calm, tender. It’s not surprising that we see no overtly political or accusatory statements—no strident “Let my people go,” no outing of torturers or collaborators. After all, these messages were written under the noses of a Nazi administration that wielded absolute and arbitrary power of life and death. Even so—even considering the circumstances—there’s an extraordinary emotional restraint on exhibit here.
What audience were the tunnel elegists addressing? I have to believe it was us, an unknown posterity who might wander by in some unimaginable future.
When Ros and I wandered by, the fact that we had discovered the place by pure chance, as if it were a treasure newly unearthed, made the experience all the more moving. Seeing the stones in a museum exhibit—curated, annotated, preserved—would have had less impact. Nevertheless, that is unquestionably where they belong. Uta Fischer and her colleagues are working to make that happen. I hope they succeed in time.
]]>On your desktop is a black box. Actually it’s an orange box, because black boxes are usually painted “a highly visible vermilion colour known as international orange.” In any case, it’s an opaque box: You can’t see the whirling gears or the circuit boards or whatever else might be inside.
Go ahead: Press the button. A number is printed on the tape. Press again and another number appears. Keep going. A few more. Notice anything special about those numbers? The sequence begins:
5, 3, 11, 3, 23, 3, 47, 3, 5, 3, 101, 3, 7, 11, 3, 13, 233, 3, 467, 3, 5, 3, . . .
oeis.org/A137613
Yep, they’re all primes. They are not in canonical order, and some of them appear more than once, but every number in the list is certifiably indivisible by any number other than 1 and itself. Does the pattern continue? Yes, there’s a proof of that. Do all primes eventually find a place in the sequence? The very first prime, 2, is missing. Whether all odd primes eventually turn up remains a matter of conjecture. On the other hand, it’s been proved that infinitely many distinct primes are included.
So what’s inside the box? Here’s the JavaScript function that calculates the numbers printed on the tape. There’s not much to it:
var n = 2, a = 7; // initial values

function nextG() {
  var g = gcd(n, a);
  n = n + 1;
  a = a + g;
  return g;
}
The function gcd(n, a) computes the greatest common divisor of n and a. As it happens, gcd is not a built-in function in JavaScript, but there’s a very famous algorithm we can easily implement:
function gcd(x, y) {
  while (y > 0) {
    var rem = x % y; // remainder operator
    x = y;
    y = rem;
  }
  return x;
}
The value returned by nextG is not always a prime, but it’s always either \(1\) or a prime. To see the primes alone, we can simply wrap nextG in a loop that filters out the \(1\)s. The following function is called every time you press the Next button on the orange black box:
function nextPrime() {
  var g;
  do {
    g = nextG();
  } while (g === 1); // skip 1s
  return g;
}
For a clearer picture of where those primes (and \(1\)s) are coming from, it helps to tabulate the successive values of the three variables n, a, and g.
n    2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
a    7  8  9 10 15 18 19 20 21 22 33 36 37 38 39 40 41 42 43 44 45 46
g    1  1  1  5  3  1  1  1  1 11  3  1  1  1  1  1  1  1  1  1  1 23
From the given initial values \(n = 2\), \(a = 7\), we first calculate \(g = \gcd(2, 7) = 1\). Then \(n\) and \(a\) are updated: \(n = n + 1\), \(a = a + g\). On the next round the gcd operation again yields a \(1\): \(g = \gcd(3, 8) = 1\). But on the fourth iteration we finally get a prime: \(g = \gcd(5, 10) = 5\). The assertion that \(g\) is always either \(1\) or a prime is equivalent to saying that \(n\) and \(a\) have at most one prime factor in common.
This curious generator of primes was discovered in 2003, during a summer school exploring Stephen Wolfram’s “New Kind of Science.” A group led by Matthew Frank investigated various nested recursions, including this one:
\[a(n) = a(n-1) + \gcd(n, a(n-1)).\]
With the initial condition \(a(1) = 7\), the sequence begins:
7, 8, 9, 10, 15, 18, 19, 20, 21, 22, 33, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 69, . . .
oeis.org/A106108
Participants noticed that the sequence of first differences — \(a(n) - a(n-1)\) — seemed to consist entirely of \(1\)s and primes:
1, 1, 1, 5, 3, 1, 1, 1, 1, 11, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 23, 3, . . .
oeis.org/A132199
Stripping out the \(1\)s, the sequence of primes is the same as that generated by the orange black box:
5, 3, 11, 3, 23, 3, 47, 3, 5, 3, 101, 3, 7, 11, 3, 13, 233, 3, 467, 3, . . .
oeis.org/A137613
During the summer school, Frank and his group computed 150 million elements of the sequence and observed no composite numbers, but their conjecture that the value is always \(1\) or prime remained unproved. One of the students present that summer was Eric S. Rowland, who had just finished his undergraduate studies and was about to undertake graduate work with Doron Zeilberger at Rutgers. In 2008 Rowland took another look at the gcd-based prime generator and proved the conjecture.
The sequence beginning with \(a(1) = 7\) is not unique in this respect. Rowland’s proof applies to sequences with many other initial conditions as well—but not to all of them. For example, with the initial condition \(a(1) = 3765\), the list of “primes” begins:
53, 5, 57, 5, 9, 13, 7, 71, 3, 41, 3, 4019, 3, 8039, . . .
Neither 57 nor 9 is a prime.
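To experiment with other starting values, the recurrence can be wrapped in a function that takes \(a(1)\) as a parameter. What follows is only a sketch. The names rowlandDifferences and isPrime are invented for this illustration, and the primality check is plain trial division.

```javascript
// Sketch: collect the nontrivial first differences of
// a(n) = a(n-1) + gcd(n, a(n-1)), starting from an arbitrary a(1).
function gcd(x, y) {
  while (y > 0) {
    var rem = x % y;
    x = y;
    y = rem;
  }
  return x;
}

function rowlandDifferences(a1, count) {
  var a = a1, n = 2, out = [];
  while (out.length < count) {
    var g = gcd(n, a);
    if (g !== 1) out.push(g); // keep only the differences greater than 1
    a = a + g;
    n = n + 1;
  }
  return out;
}

// Trial division, adequate for the small values that appear here
function isPrime(m) {
  if (m < 2) return false;
  for (var k = 2; k * k <= m; k++) {
    if (m % k === 0) return false;
  }
  return true;
}

console.log(rowlandDifferences(7, 10));
var composites = rowlandDifferences(3765, 6).filter(function (g) {
  return !isPrime(g);
});
console.log(composites);
```

Starting from 7 the function reproduces the opening of the sequence shown earlier; starting from 3765 the filter picks out the two composites, 57 and 9.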
A number of other mathematicians have since elaborated on this work. Vladimir Shevelev gave an alternative proof and clarified the conditions that must be met for the proof to apply. Fernando Chamizo, Dulcinea Raboso, and Serafín Ruiz-Cabello showed that even if a sequence includes composites, there is a number \(k\) beyond which all entries \(a(k)\) are \(1\) or prime. Benoit Cloitre explored several variations on the sequence, including one that depends on the least common multiple (lcm) rather than the greatest common factor; the lcm sequence is discussed further in a recent paper by Ruiz-Cabello.
Should we be surprised that a simple arithmetic procedure—two additions, a gcd, and an equality test—can pump out an endless stream of pure primality? I have been mulling over this question ever since I first heard about the Rowland sequence. I’m of two minds.
Part of the mystique of the primes is their unpredictability. We can estimate how many primes will be found in any given interval of the number line, and we can compile various other summary statistics, but no obvious rule or algorithm tells us exactly where the individual primes fall within the interval.
But there’s another side to this story. Exact formulas for the primes do exist. One of them, Gandhi’s formula, gives the \(n\)th prime as
\[p_n = \left\lfloor 1 - \log_2 \left( -\frac{1}{2} + \sum_{d|P_{n-1}} \frac{\mu(d)}{2^d - 1} \right) \right \rfloor.\]
Here \(P_n\) is the primorial product \(p_{1}p_{2}p_{3} \ldots p_{n}\) and \(\mu\) is the Möbius function. (If you don’t know what the Möbius function is or why you should care, Peter Sarnak explains it all.)
Way back in 1947, W. H. Mills offered a formula with just three symbols and a pair of floor brackets. He proved that a real number \(A\) exists such that
\[\left \lfloor A^{3^{n}}\right \rfloor\]
is prime for all positive integers \(n\). One possible value of \(A\) yields the sequence of primes:
2, 11, 1361, 2521008887, 16022236204009818131831320183, . . .
oeis.org/A051254
A third example brings us back to the gcd function. For all \(n > 1\), \(n\) is prime if and only if \[\gcd((n - 1)!, n) = 1.\]
From this fact we can craft an algorithm that generates all the primes (and only the primes) in sequence.
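One way to turn the criterion into a program is sketched below. It keeps a running factorial and tests each \(n\) in turn. The function names are made up for this example, and BigInt arithmetic is used because \((n - 1)!\) outgrows ordinary JavaScript numbers almost immediately.

```javascript
// Sketch: list primes using the test gcd((n-1)!, n) = 1.
function gcdBig(x, y) {
  while (y > 0n) {
    var rem = x % y;
    x = y;
    y = rem;
  }
  return x;
}

function primesByFactorialGcd(limit) {
  var primes = [];
  var factorial = 1n; // holds (n-1)! at the top of each iteration
  for (var n = 2n; n <= BigInt(limit); n++) {
    if (gcdBig(factorial, n) === 1n) primes.push(Number(n));
    factorial = factorial * n; // now n!, ready for the next value of n
  }
  return primes;
}

console.log(primesByFactorialGcd(30));
```

The factorial grows so quickly that this is a curiosity rather than a practical generator, which fits the assessment of the formula that follows.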
The trouble with all these formulas is that they require prior knowledge of the primes, or else they have such knowledge already hidden away inside them. Solomon Golomb showed that Gandhi’s formula is just a disguised version of the sieve of Eratosthenes. The Mills formula requires us to calculate the constant \(A\) to very high accuracy, and the only known way to do that is to work backward from knowledge of the primes. As for \(\gcd((n - 1)!, n) = 1\), it’s really more of a joke than a formula; it just restates the definition that \(n\) is prime iff no integer greater than 1 and less than \(n\) divides it.
Underwood Dudley opined that formulas for the primes range “from worthless, to interesting, to astonishing.” That was back in 1983, before the Rowland sequence was known. Where shall we place this new formula on the Dudley spectrum?
Rowland argues that the sequence differs from the Gandhi and Mills formulas because it “is ‘naturally occurring’ in the sense that it was not constructed to generate primes but simply discovered to do so.” This statement is surely true historically. The group at the Wolfram summer school did not set out to find a prime generator but just stumbled upon it. However, perhaps the manner of discovery is not the most important criterion.
Let’s look again at what happens when the procedure nextG is invoked repeatedly, each time returning either \(1\) or a prime.
n    2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
a    7  8  9 10 15 18 19 20 21 22 33 36 37 38 39 40 41 42 43 44 45 46
g    1  1  1  5  3  1  1  1  1 11  3  1  1  1  1  1  1  1  1  1  1 23
In the \(g\) row we find occasional primes or groups of consecutive primes separated by runs of \(1\)s. If the table were extended, some of these runs would become quite long. A good question to consider is how many times nextG must be called before you can expect to see a prime of a certain size—say, larger than 1,000,000. There’s an obvious lower bound. The value of \(\gcd(n, a)\) cannot be greater than either \(n\) or \(a\), and so you can’t possibly produce a prime greater than a million until \(n\) is greater than a million. Since \(n\) is incremented by \(1\) on each call to nextG, at least a million iterations are needed. And that’s just a lower bound. As it happens, the Rowland sequence first produces a prime greater than 1,000,000 at \(n =\) 3,891,298; the prime is 1,945,649.
The need to invoke nextG at least \(k\) times to find a prime greater than \(k\) means that the Rowland sequence is never going to be a magic charm for generating lots of big primes with little effort. As Rowland remarks, “a prime \(p\) appears only after \(\frac{p - 3}{2}\) consecutive \(1\)s, and indeed the primality of \(p\) is being established essentially by trial division.”
Rowland also points out a shortcut, which is best explained by again printing out our table of successive \(n, a, g\) values, with an extra row for some \(a - n\) values:
n     2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
a     7  8  9 10 15 18 19 20 21 22 33 36 37 38 39 40 41 42 43 44 45 46
g     1  1  1  5  3  1  1  1  1 11  3  1  1  1  1  1  1  1  1  1  1 23
a–n   5  5  5  ·  · 11 11 11 11  ·  · 23 23 23 23 23 23 23 23 23 23  ·
Within each run of \(1\)s, \(a-n\) is a constant—necessarily so, because both \(a\) and \(n\) are being incremented by \(1\) on each iteration. What’s more, based on what we can see in this segment of the sequence, the value of \(a-n\) during a run of \(1\)s is equal to the next value of \(n\) that will yield a nontrivial gcd. This observation suggests a very simple way of skipping over all those annoying little \(1\)s. Whenever \(\gcd(a, n)\) delivers a \(1\), set \(n\) equal to \(a-n\) and increment \(a\) by \(a - 2n\). Here are the first few values returned by this procedure:
5, 3, 11, 3, 23, 3, 47, 3, 95, 3, . . .
Uh oh. 95 is not my idea of a prime number. It turns out the shortcut only works when \(a-n\) is prime. To repair the defect, we could apply a primality test to each value of \(a-n\) before taking the shortcut. But if we’re going to build a primality test into our prime generator, we might as well use it directly to choose the primes.
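The flawed shortcut is easy to reproduce. The sketch below follows the rule as stated above, with no primality test, so it deliberately repeats the error; shortcutValues is a name chosen for this illustration.

```javascript
// Sketch of the (flawed) shortcut: whenever gcd(n, a) is 1, jump n ahead
// to a - n and advance a by a - 2n, so that a - n keeps the same value.
function gcd(x, y) {
  while (y > 0) {
    var rem = x % y;
    x = y;
    y = rem;
  }
  return x;
}

function shortcutValues(count) {
  var n = 2, a = 7, out = [];
  while (out.length < count) {
    var g = gcd(n, a);
    if (g === 1) {
      var target = a - n;  // constant value of a - n during the run of 1s
      a = a + (a - 2 * n); // must be updated before n is overwritten
      n = target;
      g = gcd(n, a);       // after the jump a = 2n, so this equals n
    }
    out.push(g);
    n = n + 1;
    a = a + g;
  }
  return out;
}

console.log(shortcutValues(10));
```

After a jump, \(a\) is exactly \(2n\), so the gcd is simply \(n\) itself, whether or not \(n\) is prime. That is how 95 slips through.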
It seems we are back where we began, and no closer to having a practical prime generator. Nevertheless, on Dudley’s scale I would not rank this idea as “worthless.” When you take a long hike around the shore of a lake, you eventually wind up exactly where you started, but that does not make the trip worthless. Along the way you may have seen something interesting, or even astonishing.
]]>
The technology behind this magic trick is a JavaScript component called CodeMirror, which is also Haverbeke’s creation. It is a code editor that can be embedded in any web page, providing all the little luxuries we’ve come to expect in a modern programming environment: automatic indentation and bracket matching, syntax coloring, autocompletion. The program and all of its many addons are free and open source. By now it is very widely used but not as widely noticed, because it tends to get buried in the infrastructure of other projects. I am writing this brief note to bring a little attention to an underappreciated gem.
Haverbeke’s Eloquent JavaScript is not the only online book that relies on CodeMirror. Another one that I find very engaging is Probabilistic Models of Cognition, by Noah D. Goodman and Joshua B. Tenenbaum, which introduces the Church programming language. There’s also The Design and Implementation of Probabilistic Programming Languages, by Goodman and Andreas Stuhlmüller, which provides a similar introduction to a language called WebPPL. And the Interactive SICP brings in-the-page editing to the Sussman-and-Abelson Dragon Wizard Book.
But CodeMirror has spread far beyond these pedagogic projects. It is built into the developer tools of both the Chrome and the Firefox web browsers. It is the editor module for the IPython notebook interface (a.k.a. Jupyter), which is also used by the Sage mathematics system. It’s in both Brackets and Light Table, two newish open-source code editors (which run as desktop applications rather than in a browser window). You’ll also find CodeMirror working behind the scenes at Bitbucket, JSFiddle, Paper.js, and close to 150 other places.
I had only a vague awareness of CodeMirror until a few months ago, when the New England Science Writers put on a workshop for science journalists, Telling Science Stories with Code and Data. As part of that project I wrote an online tutorial, JavaScript in a Jiffy. CodeMirror was the obvious tool for the job, and it turned out to be a pleasure to work with. A single statement converts an ordinary textarea HTML element into a fully equipped editor panel. Any text entered in that panel will automatically be styled appropriately for the selected programming language. The machinery allowing the user to run the code is almost as simple: Grab the current content of the editor panel, wrap it in a <script>...</script> tag, and append the resulting element to the end of the document. (Admittedly, this process would be messier with any language other than JavaScript.)
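The run step might look something like the following sketch. It is not the tutorial’s actual code. The function name runEditorContents is hypothetical, and the only thing assumed about the editor object is that it offers a getValue() method, as CodeMirror 5 editors do.

```javascript
// Sketch of the "run" step: take the editor's text, wrap it in a script
// element, and append it to the document so the browser executes it.
// Assumptions: `editor` has a getValue() method; `doc` is the page's
// document object (passed in so the function can be tried without a browser).
function runEditorContents(editor, doc) {
  var script = doc.createElement("script");
  script.text = editor.getValue(); // current content of the editor panel
  doc.body.appendChild(script);    // appending triggers execution in a browser
  return script;
}
```

In a page that has loaded CodeMirror 5, the editor itself could be created with a single statement along the lines of CodeMirror.fromTextArea(document.getElementById("code"), { mode: "javascript" }), where the id "code" is a placeholder, and the button handler would then call runEditorContents(editor, document).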
The trickiest part of the project was figuring out how to handle the output of the programs written in CodeMirror panels. Initially I thought it would be best to just use the browser’s JavaScript console, sending textual output as a series of console.log messages. This plan has the advantage of verisimilitude: If you’re actually going to create JavaScript programs, the console is where test results and other diagnostic information get dumped. You need to get used to it. But some of the workshop participants found the rigmarole of opening the browser’s developer tools cumbersome and confusing. So I went back and created pop-up panels within the page to display the output. (It still goes to the console as well.)
A project like this would have been beyond my abilities if I had had to build all the machinery myself. Having free access to such elegant and powerful tools leaves me with the dizzy sensation that I have stumbled into an Emerald City where the streets are paved with jewels. It’s not just that someone has taken the trouble to create a marvel like CodeMirror. They have also chosen to make it available to all of us. And of course Haverbeke is not alone in this; there’s a huge community of talented programmers, fiercely competing with one another to give away marvels of ingenuity. Who’d’ve thunk it?
]]>