## New year, new bit-player

New server, new design, new ambitions. It’s been a big project. More later, when I’ve slept. In the meantime, feel free to poke around.

Posted in meta | 5 Comments

## Powerlessness

It’s been two weeks since the storm called Sandy hit the East Coast, with devastating effects on both people and infrastructure. I have some remarks to make about just one aspect of the damage: the prolonged power outage in parts of New Jersey, New York and a few other states. At the nadir, on October 30th, 8.2 million utility customers were without electricity. As of this morning, almost 99 percent of them have finally been reconnected—but that still leaves 89,000 in the dark, a number that would be considered quite sizeable under ordinary circumstances.

I am outside the blackout area, but I have been following the story with particular care and interest because my daughter and her family are among those who have been living in pre-Edisonian gloom. She finally had service restored last night, after a wait of 13 days.

The numbers charted above come from the U.S. Department of Energy, Office of Electricity Delivery and Energy Reliability. Note that they tally the number of customers—or in essence the number of electric meters—not the number of people. Carl Bialik, writing in the Wall Street Journal, quotes an estimate by Brian Wolff of the Edison Electric Institute: The peak outage of 8.2 million customers may have represented 60 million people in the dark. Over the two-week period, the outages sum up to about 32 million customer-days.

A few days into the ordeal, when my daughter was seeing no sign of power-company line crews, I offered this optimistic hypothesis: Maybe there’s a lot of work going on behind the scenes that will allow many thousands of customers to be switched on all at once. In other words, the plot of outage numbers would be concave-downward, with an initial plateau where there’s little change from day to day, then a steep plunge when all the circuits are finally re-energized. A glance at the actual plot shows that my theory was quite wrong, a product of mere wishful thinking. The shape of the curve is definitely concave-upward. As a matter of fact, over the first nine days, the progress of restoration is modeled fairly well by a simple exponential function: Each day, about 25 percent of the remaining outage victims get their lights back. Thus the number of restorations per day steadily declines. If we suppose that the work crews are putting in constant effort, this pattern of diminishing returns suggests that the work required to restore a single customer doubles every two or three days.
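For the record, here is the back-of-the-envelope arithmetic behind that last claim, written out as a small Python sketch. This is my own toy model of the curve, not anything taken from the DOE data:

```python
from math import log

rate = 0.25            # rough fraction of remaining outages restored each day
outages = 8.2e6        # peak outage count, October 30

for day in range(10):
    print(f"day {day}: about {outages:,.0f} customers still dark")
    outages *= 1 - rate

# If crews apply constant effort while daily restorations shrink by a factor
# of 0.75, the effort needed per restored customer grows by 1/0.75 each day.
doubling_time = log(2) / log(1 / (1 - rate))
print(f"implied doubling time for per-customer effort: {doubling_time:.1f} days")  # about 2.4
```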

Sandy is the third storm in a little over a year to cause very widespread and persistent power-system disruptions in the Northeast and mid-Atlantic states. Hurricane Irene, in August of 2011, left 6.7 million customers without electricity for about a week. Then, a Halloween-weekend snowstorm knocked out power for 3 million customers, many of whom again had to wait several days for repair crews to show up. (My daughter’s family was among the unlucky ones in both of those events, too. They’ve spent roughly one full month out of the last 15 months off the grid.)

It’s my impression, based on nothing more reliable than personal recollection, that power losses of this scale and duration are a new phenomenon. Until 1996, I don’t think I ever experienced a power failure of more than 24 hours. What happened in 1996? I was in North Carolina at the time, and Hurricane Fran had me living by candlelight and charcoal grill for a week. It seemed like quite a novelty at the time.

We’ve certainly seen other large-scale blackouts over the decades. A vast swath of the Northeast went dark in November of 1965. New York City had a particularly traumatic night in 1977. In the summer of 2003 a cascading outage began in Ohio and spread east to the coast and north into Ontario. But these three events differ from the recent storm-associated outages in at least two important ways. First, they didn’t last as long: Few customers were cut off for as much as 24 hours. Second, the source of the problems was at the “wholesale” level of the electric-power grid, with generating plants and high-voltage transmission lines. A single fault in this part of the network can knock out power to millions of people; on the other hand, a single repair can bring it back again.

The power-grid damage caused by Sandy and other recent storms was mainly at the “retail” level, with hundreds or thousands of local distribution lines out of commission. Those snagged lines and broken poles have to be fixed one by one. (Sandy brought at least one exception to this “retail” pattern: The power failure in lower Manhattan was caused by flooding of a major substation.)

•     •     •

If it’s true that storm-damage outages have gotten bigger and more prolonged in the past couple of decades, what might account for the change? I don’t have an answer, but I have a bunch of speculative hypotheses to offer. I’ll let other people shoot these ideas down.

• It’s the climate, stupid. Intense storms are more frequent than they used to be. The problem is only going to get worse. Get used to it.
• Sprawl. Look at the hard-hit areas of New Jersey and Long Island, where millions of suburban and exurban residents have been suffering these past two weeks. Until recent years all those acres were farmland, occupied by cows and a few flinty farm families who knew how to light a kerosene lamp. When the power failed out in the sticks, only a few people were directly affected, and the rest of us never heard about it. Now those cow pastures are full of McMansions, whose occupants expect a much higher level of service.
• Sharpen the axe. When all that rural land was farmed, it was also cleared of trees. In the past 50 years or so, much of the northeast has been reforested at the same time it has been suburbanized. The roadside Norway maples planted circa 1960 are mature now—ready to fall on the nearest power line. If the power companies want to trim them, homeowners and municipalities won’t let them.
• Deregulation, privatization, fragmentation. In the good old days of my youth, the entire electric power industry was regulated by the Federal Power Commission and by state utility commissions. The companies themselves were vertically integrated, handling all aspects of electrical technology from generation to transmission to distribution to retail sales. There was an entrenched engineering culture that valued reliability above all else. After the 1965 blackout, the industry formed regional reliability councils to develop strategies and practices to avoid repetitions of that event. Now, in the post-Enron age, generating electricity, transmitting it and distributing it are economically separate activities. Organizations that were once cooperating divisions of the same parent company have become cut-throat competitors. Regulation has all but disappeared. Maintenance of right of way is seen as a cost and a threat to quarterly earnings, not as a prudent investment. No one corporate entity can be held responsible for keeping the lights on.
• Shoot the messenger. Most utility poles are owned by electric companies, but the owners also rent out the space beneath the power lines. The main tenants are telephone companies and cable TV. With the proliferation of communication circuits, poles carry more weight than they used to. Perhaps more important, fiber-optic telephone and cable-TV lines are often suspended from a very stout steel cable called a messenger wire. The poles along a suburban road could have two or three of these messenger wires. It used to be that a tree falling onto copper or aluminum wires would just snap them off; the repair was a quick splicing job. Now, the steel messenger wires can hold the weight of a fallen tree, and it’s the poles that give way. Replacing a snapped pole takes a lot longer than splicing a few wires. (During the Sandy recovery, there were rumors of pole shortages.)

Following the big blackouts of 1965, 1977 and 2003, boards of inquiry were formed to root out the causes and suggest remedies. I’m hoping we see the same kind of thoughtful inquiry this time.

Posted in modern life, technology | 1 Comment

## Fly me to the end of the alphabet

When I got to the Raleigh-Durham airport for a flight home last week, I was confronted with the following list of departures—almost half of which were actually nondepartures:

I wasn’t surprised by the numerous cancellations and delays (flagged in red). It was October 30th, a day after Post-Tropical Cyclone Sandy came ashore in New Jersey. But I would not have predicted the alphabetic bias visible in this list, which is sorted by city name. It looks as if the storm spared the early letters of the alphabet and clobbered the later ones. For cities in the alphabetic range from “Atlanta” to “Little Rock,” all but four of 35 flights were operating normally. For destinations from “Memphis” to “Washington,” all but three of 31 flights were canceled or delayed. That’s 11 percent versus 90 percent.

How fluky is this? Perhaps the proper question is: How unlikely is it that four major cities of the mid-Atlantic region would all have end-of-alphabet names: New York, Newark, Philadelphia, Washington? While, conversely, the early-letter cities served by flights from RDU are all elsewhere: Atlanta, Austin, Boston, Charlotte, Chicago, Cincinnati, Dallas, Detroit, Fort Lauderdale, Hartford, Houston, Indianapolis, Little Rock. Or maybe we should be asking why Minneapolis, Montreal and Omaha, though beyond the reach of Sandy, were also red-tag cities that afternoon.

I have little to say in answer to the questions above. And, the truth is, this post is not really about aviation or my minor inconveniences in getting home. This post is a test rig for a bit of Javascript that I’ve been fooling around with for the past few days. The display of flight information above is too small to be readable, but try moving your mouse pointer over the image. (If hovering with the mouse—or your finger, on touch devices—does nothing useful, please send me a note or leave a comment.)

Source code here. Inspiration from Trent and aplweb.

Posted in computing, statistics, technology | 8 Comments

## On the sunny side

The invented etymology of posh—which says it’s an acronym for “port out, starboard home”—is utterly bogus. Nevertheless, when I book airline seats I always try to get a window on the shady side. Photography is easier with the sun at your back. This morning, however, I was on the sunny side southbound out of Boston. The photographic result was fairly curious.

You might think you’re looking at a snowy landscape here, with bits of two lakes intruding into the left and right edges of the frame. But the lakes are actually islands, and the white field between them is the sea, with the specular reflection of the sun in the upper lefthand corner. Under these lighting conditions, it seems that small variations in angle or texture cause huge differences in brightness. It’s Snell’s Law in action.

A detail from a second frame—two was all I caught—shows the interference patterns in the boat wakes even more clearly.

Here is the uncropped and unenhanced version of the first image. (Uncropped but with grossly reduced pixel count.)

The two islands are part of an archipelago that extends southwest from Woods Hole on Cape Cod, between the mainland and Martha’s Vineyard. I believe these two are Pasque Island on the left and Nashawena Island on the right. Wikipedia tells me that almost all of these islands, which are known as the Elizabeths, are owned by the Forbes family. How posh!

Posted in photography | 3 Comments

## Remember the memristor?

Less than two years ago, I was reading and writing about a new kind of passive circuit element—the memristor, a companion to the resistor, the capacitor and the inductor. (See the bit-player coverage, a bit-player followup and my American Scientist column.) The principal actors in this story were Leon Chua of Berkeley, who formulated the theory of the memristor 40 years ago, and R. Stanley Williams of Hewlett-Packard, who announced a working device in 2008.

I haven’t been keeping up with memristor news lately, but I did notice that Williams has been making the rounds with a talk on “Mott Memristors, Spiking Neuristors and Turing Complete Computing.” When he came to Harvard last week, I went over to listen in. I discovered that my 20-month-old article is totally out of date.

The memristor I wrote about in 2011 was based on ion drift in titanium dioxide. A layer of TiO2 was fabricated with a conductive doped region and an insulating undoped region. Current flowing through the TiO2 would shift the boundary between the two regions and thereby switch the memristor between conducting and insulating states. The device was bipolar: Currents in opposite directions had opposite effects. It was also nonvolatile: The resistance state would persist even after the current ceased.

The memristor that Williams discussed last week is still recognizable as a member of the same family, but just barely so. Again the material is TiO2 or another transition-metal oxide, but there is no mention of doping. And, again, electric currents cause a change in resistance, but the underlying mechanism is quite different. Williams described the device as a cylinder of insulating oxide with a narrow conductive channel down the middle; current flowing through the channel heats the cylinder from the inside out, causing a phase transition that converts surrounding layers of material to the conducting state. Specifically, the heating induces a Mott transition, in which localized electron clouds begin to overlap, bringing on a sudden increase in conductivity.

During the talk I didn’t hear Williams say anything about reversing the phase transition, but other sources indicate that the high-resistance state is restored by applying a second, larger current. Note that this memristor is not a bipolar device: The switching actions can be triggered by currents flowing in either direction.

A heat-induced phase transition sounds like an unlikely mechanism for a modern computing element. When I heard the idea explained, I couldn’t help thinking of computing with a toaster, whose glowing red wires seem ruinously power-hungry, and awfully slow. But Williams says the Mott memristor can be both fast and efficient, because the active volume is very small, with a typical dimension of 30 nanometers. Even though the temperature swing may be as much as 800 Kelvin, the switching time is nanoseconds, and the energy dissipated is femtojoules.

It’s a little disconcerting to see the basic physical principles of the memristor changing so drastically before the device even reaches the market. But I suppose there’s precedent in the early history of the transistor. The transistor announced by Bell Labs in 1948 was a bipolar, point-contact device made from a block of germanium; the world now runs on field-effect transistors imprinted on a silicon surface.

Which brings us to the big, obvious question I wasn’t able to answer two years ago. Will the memristor become the transistor of the 21st century, transforming electronics and computing? Or will it fade away like so many other once-promising technologies, like magnetic bubbles or superconducting Josephson junctions? Or could it find a niche market, the way charge-coupled devices have?

It’s well-known—and it may even be true—that most innovations fail to dislodge the incumbent technology. That’s a reason for betting against the memristor regardless of its particular merits and handicaps. No doubt it’s a smart bet. Nevertheless, I would like to say one thing in support of efforts to develop the memristor and other novelties like it. The mainstream in digital electronics is still focused on taking devices whose operation we understand at micrometer scale and trying to reproduce the same behavior in devices a thousand times smaller. It seems worthwhile trying the opposite strategy as well: Looking at what happens “naturally” at nanoscale, and trying to build something out of it.

After the talk, I dug up a few papers with more details on the recent developments. Two are by members of the Williams group:

Alexandrov, A. S., A. M. Bratkovsky, B. Bridle, S. E. Savel’ev, D. B. Strukov and R. Stanley Williams. 2011. Current-controlled negative differential resistance due to Joule heating in TiO2. Applied Physics Letters 99:202104. (Preprint.)

Strukov, Dmitri B., Fabien Alibart and R. Stanley Williams. 2012. Thermophoresis/diffusion as a plausible mechanism for unipolar resistive switching in metal–oxide–metal memristors. Applied Physics A DOI 10.1007/s00339-012-6902-x. (Preprint.)

The third is a slightly earlier discussion of Mott memristors by a Korean-UCSD collaboration:

Driscoll, Tom, Hyun-Tak Kim, Byung-Gyu Chae, Massimiliano Di Ventra and D.N. Basov. 2009. Phase-transition driven memristive system. (arXiv preprint.)

Posted in computing, technology | 3 Comments

## Dancing with the Spheres

In my latest American Scientist column I’m packing spheres again. The goal this time is to maximize the number of contact points where spheres touch each other. For example, four spheres in a tetrahedral cluster have six contacts; this is the most possible, since each sphere touches all the rest. Over the past few years groups at Harvard and Yale have solved the maximum-contact problem for clusters of up to 11 spheres. Beyond that, nobody knows.

You can read all about it in your choice of PDF or living HTML, or you might even hunt down one of the rare and precious paper copies of the magazine.

The American Scientist column focuses on the work of the Harvard and Yale groups. Here I’m going to talk about my own adventures with contact-counting while creating illustrations for the article.

Geomag. When I visited some of the people behind the new contact-counting results, the first thing I noticed was their toys. They play with Geomag, a Swiss-made construction kit consisting of plastic-coated magnetic rods and half-inch polished steel balls. As soon as I got home, I bought a set for myself.

The pair of images below, showing two depictions of a regular tetrahedron, should make clear how clusters of spheres map onto Geomag models.

Assume all spheres have a radius of 1/2; then two tangent spheres have a center-to-center distance of 1. In the diagram at right I emphasize this fact by drawing a rod of unit length between the centers of all spheres that are in contact. The Geomag model at left reproduces this skeleton of rods, omitting the spheres themselves.

Natalie Arkus, who initiated the recent contact-counting work when she was a graduate student at Harvard (she is now at Rockefeller University), told me that the Geomag models were more than just a visual aid in her work. They were a tool for discovering new configurations and deciding which candidate structures are geometrically feasible.

Spinners. Having a 3D model to turn over in your hands is definitely the best way to understand the geometry of sphere clusters. Unfortunately, there is no <geomag> HTML tag that would allow me to embed a model in this web page.

A familiar alternative for 3D visualization is the stereoscopic pair:

I’m not a big fan of this technique. Not everyone is adept at fusing the images without the help of a viewing device. And even when you can achieve the stereo illusion, it doesn’t help much in seeing what’s on the back side of the object.

My own favorite trick for 3D visualization is to make the object twirl. It’s no secret that the human visual system does a better job of interpreting three-dimensional structures from a moving image than from a still one. This strategy is of no use in the print edition of the magazine, but I wanted to take advantage of it in the web edition and here on bit-player. If you haven’t done so already, hover your mouse pointer over the image above right; it should start spinning.

“Should,” I said. It works fine in Chrome and Safari and Opera. It even works in Internet Explorer. It works in Firefox version 3.6. But newer versions of Firefox give erratic results. Typically, the animation plays once but then dies. This bug was reported to Mozilla ages ago, but it persists in the latest Firefox update (16.0.1). Sorry for the rant, but it’s not like we’re dealing with bleeding-edge HTML5 technology here. The spinning sphere clusters are animated GIFs, a file format that goes back to 1989.

Coordinates. The first task in creating these illustrations is finding the coordinates of the sphere centers. Both the Harvard and the Yale groups have published extensive lists of coordinates, but I elected to roll my own. I wanted the animated figures to twirl around an axis of symmetry (in those cases where such an axis exists), and I found it easier to build up the geometry from scratch rather than making transformations from some other coordinate system.

Calculating coordinates is generally easy in these clusters because they all share an important property: Each sphere touches at least three other spheres. Suppose you want to add a fifth sphere to the tetrahedral cluster illustrated above. The only way the new sphere can touch three others is if it nestles into the central dimple in one of the four triangular faces of the tetrahedron. You can find the coordinates of the fifth center point by solving three simultaneous equations, defining distances to the three vertices of the triangular face. The equations have two solutions, defining points above and below the plane of the triangle. In the tetrahedron, one of those points is already occupied, and so the new sphere must be placed at the other solution point, on the opposite side of the plane of the triangle.

Having attached a fifth sphere to the cluster, you can use the same method to add a sixth, a seventh, and so on. In clusters made up entirely of regular tetrahedra, the process is especially straightforward. With other geometries it’s not always quite so obvious how to proceed, but with one exception I was able to construct all the clusters of interest by such step-by-step methods. (For the exception, see below under The Problematic Eight.)

Making Movies. With the coordinates in hand, assembling the diagrams is not complicated. At each sphere-center coordinate, draw a semitransparent sphere of radius 1/2. Then, to represent the Geomag-like skeleton of bonds, draw much smaller opaque spheres (radius 1/20) at each point, and add opaque tubes connecting the centers of spheres that are touching. How does the algorithm know which spheres are touching? Simply run through all $$n(n-1)/2$$ pairs of spheres and select those whose euclidean distance is close enough to 1. (“Close enough” allows for some imprecision in the case of coordinates determined by numerical rather than analytic methods.)
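The contact test is just a distance check with a little slack. Here is what it looks like as a Python sketch; the 1e-4 tolerance is an arbitrary choice for illustration:

```python
import numpy as np

def contacts(centers, tol=1e-4):
    """Return pairs (i, j) of spheres whose centers are at distance ~1."""
    centers = np.asarray(centers, dtype=float)
    pairs = []
    for i in range(len(centers)):
        for j in range(i + 1, len(centers)):
            if abs(np.linalg.norm(centers[i] - centers[j]) - 1.0) < tol:
                pairs.append((i, j))
    return pairs

# A regular tetrahedron: all n(n-1)/2 = 6 pairs are in contact.
tetra = [(0, 0, 0), (1, 0, 0), (0.5, 3**0.5 / 2, 0),
         (0.5, 3**0.5 / 6, (2 / 3) ** 0.5)]
print(len(contacts(tetra)))   # 6
```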

To make the cluster twirl, I generate a series of views, rotating the object by some fixed, small angle between snapshots. If the object has rotational symmetry, there’s no need to twirl through a full 360 degrees. The tetrahedron, for example, makes do with a 120-degree subset.

All this was done in Mathematica, which has a rich set of three-dimensional graphics routines. Rotating a 3D object, for example, is a built-in function; all the trig and the linear algebra happens behind the scenes, without human intervention. Another Mathematica command exports the whole series of rotated views as a prepackaged GIF file. The entire makeSpinner procedure is a seven-line program. Nifty! On the other hand, I had to struggle to defeat some of the Mathematica-knows-best automation. It seems a 3D graphic is always sized so that its 2D projection fills the allotted rectangular region on the screen. This may be a good idea most of the time, but it induces nausea with a rotating object. Try hovering on the octahedron at left. I found a directive (SphericalRegion → True) that I thought would cure this behavior, but it didn’t. If anyone reading this knows the right way to mark it up, please do let me know. I wound up with a kludge: I surrounded each cluster with a large, invisible sphere centered at the origin, fooling the Mathematica drawing routines into thinking that the cluster has constant width.

The Mathematica source code for building all the illustrations and animations is available here as a Mathematica notebook.

The Wiggler. Once I had all of those pirouetting purple bunches of grapes, I realized that the same animation apparatus could be adapted to show other kinds of motion.

Among all the sphere packings I’ve met in the course of this project, one of the most pivotal—and I choose that word carefully—is a nine-sphere configuration that exhibits a degree of torsional flexibility. Even though every sphere in the cluster touches at least three others, and therefore cannot move as an individual, the overall structure can twist without breaking any bonds.

Animating this kind of motion is a little more complicated than displaying a rigid rotation of the entire cluster. At each time step, the two spheres at the top of the structure are twisted in opposite directions; then all the rest of the coordinates have to be recalculated to accommodate the change.

Note that the animation shows an exaggerated range of motion. The structure can twist only infinitesimally if you insist on maintaining all center-to-center distances at exactly 1. The animation was drawn with a tolerance of ±1 percent, which allows rotation of ±4 degrees before bonds start breaking.

The Fivefold Way. If you spend any time playing with Geomag models, you are sure to stumble upon the structure shown at right, which consists of four tetrahedra joined along faces. It looks as if you might be able to add one more bond to close the gap, creating a solid of five joined tetrahedra. But it doesn’t work. The gap is slightly too wide. (By the way, I consider this a defect of our universe. We shouldn’t have to put up with such untidiness.) If you remove the central strut from the cluster of four tetrahedra, allowing the top and bottom spheres to move apart slightly, the gap closes and you arrive at a new cluster with exact fivefold symmetry, the pentagonal dipyramid, shown at left.

After the magazine had gone to press, I realized that the transition between these two structures could also be made clearer by an animation. Hover on the figure below to see it in action:

The Arkus Conjecture. The reversible transition between a tetratetrahedron and a pentagonal dipyramid requires breaking one bond and forming one new bond. Here’s another example of a single-bond exchange, where an octahedron is transformed into a cluster of three joined tetrahedra.

Arkus points out that all known maximum-contact packings are connected by some sequence of such single-bond moves. She conjectures that this fact will remain true for all larger clusters as well. If the conjecture is true, it will become much easier to catalogue the clusters for each value of n.

The Problematic Eight. At n=8 we encounter this peculiar structure, which can be understood as two pentagonal dipyramids smushed together at right angles:

This object has stumped all my efforts to determine the coordinates by a step-by-step procedure, working from a known triangle to a fourth vertex, then a fifth, and so on. Making the Geomag model was easy, but to generate the Mathematica graphic I had to solve for five variables simultaneously.

Turn It Up to 12. The Harvard and Yale groups identified all configurations that maximize the contact number for clusters with up to 11 spheres. Beyond this limit is terra incognita. Nevertheless, there’s a 12-sphere solution that looks like a very plausible candidate.

At the upper left is the nine-sphere “wiggler,” with 21 contacts. Then each of the subsequent frames, proceeding from left to right and top to bottom, adds a single node and four more contacts. The final state, with 12 spheres and 33 contacts, is also shown at right in a spinning diagram. Is this structure the best that can be achieved with 12 spheres? For now the only reliable way to answer this question is to catalogue all other 12-sphere candidate clusters, and that would be a formidable undertaking.

I should add that this dense 12-sphere cluster is not a subset of the Kepler packing (hexagonal close-packed or face-centered cubic), and it cannot be extended to fill all of three-dimensional space.

Overflow. In American Scientist, my column this month spilled over two extra pages beyond my usual allotment. Nevertheless, there are still topics that I wasn’t able to fit in. Some pointers:

• In the 1970s, M. R. Hoare and J. McInnes published a pioneering study of “small atomic clusters,” anticipating some of the recent results. They surveyed clusters with up to 13 atoms but considered only clusters that could be formed by combining tetrahedra and octahedra, ignoring various “new seeds” such as the pentagonal dipyramid. Reference: Hoare, M. R., and J. McInnes. 1976. Statistical mechanics and morphology of very small atomic clusters. Faraday Discussions of the Chemical Society 61:12–24.
• In 1995 Neil J. A. Sloane, R. H. Hardin, T. D. S. Duff and John Horton Conway surveyed “minimal-energy” clusters of up to 32 identical spheres. They defined the lowest-energy configuration as the one that minimizes the second moment of the spatial distribution, or in other words the sum of the squares of the distances from the center of mass. Although this criterion is quite different from the contact-counting method, many of the same clusters turn up. Reference: Sloane, N. J. A., R. H. Hardin, T. D. S. Duff and J. H. Conway. 1995. Minimal-energy clusters of hard spheres. Discrete and Computational Geometry 14:237–259.
• Conway and Sloane are also the authors of the standard reference work on the mathematics of sphere packing: Conway, J. H., and N. J. A. Sloane. 1999. Sphere Packings, Lattices, and Groups. 3rd edition. New York: Springer. I can also enthusiastically recommend a less-formal account of packing problems by Tomaso Aste and Denis Weaire: The Pursuit of Perfect Packing, 2008, second edition. New York: Taylor & Francis.
• If I ever get a chance to return to this topic, I would like to take a closer look at the work of Salvatore Torquato and his colleagues at Princeton, who study dense clusters in a context akin to the Newton kissing-number problem. For example, see: Hopkins, Adam B., Frank H. Stillinger and Salvatore Torquato. 2011. Densest local sphere-packing diversity. II. Application to three dimensions. Physical Review E 83:011304. (arXiv preprint)

Thanks. For helpful conversations and other kinds of guidance, my thanks to Rob Hoy of the University of South Florida, Corey O’Hern of Yale, Vinny Manoharan of Harvard and Natalie Arkus of Rockefeller University.

Posted in computing, featured, mathematics | 5 Comments

## College ties

Nate Silver’s fivethirtyeight blog for the New York Times applies computational statistics to U.S. presidential politics. A recent post discusses the possibility of a tie vote in the Electoral College.

If the votes on November 6 should come out according to the map above, we’ll have a 269-to-269 deadlock, leaving it to the House of Representatives to choose the next president. Silver ran 25,001 simulated elections, and the result shown above appeared 137 times. Eight other tied arrangements also turned up, though more rarely. The overall probability of an even split was about 0.6 percent. (The simulations were based on polls taken before last week’s Obama-Romney debate.)

Looking at Silver’s maps, I began to muse about a purely combinatorial question: Setting aside all considerations of political plausibility, in how many different ways can the Electoral College vote come out a tie? To put it another way, how many two-colorings of the U.S. map give each party 269 Electoral College votes?

I now have an answer to this question, which I’ll give below. But first, for the benefit of my non-U.S. readers, I should say a word about that curious institution the Electoral College. It was the wisdom of the Founders that the POTUS should not be elected directly by the people but rather by the states, with each state getting as many votes as it has senators (two each) plus members of the House of Representatives (from 1 to 53, depending on population). Since 1964 the District of Columbia (which isn’t a state, but I’m going to call it one anyway) has also had three votes. Thus there are 51 voting states, with vote allocations ranging from 3 to 55, and the total number of votes is 538.

| State | Votes | State | Votes | State | Votes |
|---|---|---|---|---|---|
| California | 55 | Maryland | 10 | Utah | 6 |
| Texas | 38 | Minnesota | 10 | Nebraska | 5 |
| Florida | 29 | Missouri | 10 | New Mexico | 5 |
| New York | 29 | Wisconsin | 10 | West Virginia | 5 |
| Illinois | 20 | Alabama | 9 | Hawaii | 4 |
| Pennsylvania | 20 | Colorado | 9 | Idaho | 4 |
| Ohio | 18 | South Carolina | 9 | Maine | 4 |
| Georgia | 16 | Kentucky | 8 | New Hampshire | 4 |
| Michigan | 16 | Louisiana | 8 | Rhode Island | 4 |
| North Carolina | 15 | Connecticut | 7 | Alaska | 3 |
| New Jersey | 14 | Oklahoma | 7 | Delaware | 3 |
| Virginia | 13 | Oregon | 7 | District of Columbia | 3 |
| Washington | 12 | Arkansas | 6 | Montana | 3 |
| Arizona | 11 | Iowa | 6 | North Dakota | 3 |
| Indiana | 11 | Kansas | 6 | South Dakota | 3 |
| Massachusetts | 11 | Mississippi | 6 | Vermont | 3 |
| Tennessee | 11 | Nevada | 6 | Wyoming | 3 |

Generally, each state’s votes are cast as a winner-take-all block (but see the note on Maine and Nebraska at the end of this post).

I assume here that only two parties receive Electoral College votes. Thus any partitioning of the list of states into two subsets—red and blue—is a possible election outcome. The number of distinguishable outcomes is simply the product of all these binary choices: $$2^{51}$$, or 2,251,799,813,685,248. We want to know how many of these cases yield exactly 269 red and 269 blue votes. It turns out there are 16,976,480,564,070 ways of arriving at a drawn election, which is roughly 0.75 percent of the total.

Let me emphasize that this number is of no political consequence; in particular, it says nothing about the probability of a tie—at least not unless the states choose their votes by flipping fair coins. I can’t see where the number holds any notable mathematical significance either. What attracted my interest was simply the challenge of computing it.

Doing it the direct and obvious way—enumerating all $$2^{51}$$ partitions and summing the votes for each case—is not utterly unthinkable. Amazingly enough, we live in a world where you can count to a quadrillion and live to tell the tale. But doing so would require more programming effort—or more patience or more hardware—than I’m willing to invest for the sake of idle curiosity.

It’s interesting that some bigger problems are actually easier. Suppose we keep the number of electors constant at 538 but we subdivide the states into smaller territories, each of which gets fewer votes. In the limiting case there are 538 regions with one vote each. Now the problem is immensely larger: There are $$2^{538}$$ ways of two-coloring a map with 538 regions. Nevertheless, with very little fuss we can calculate the number of tied configurations, without having to count them all one by one. The number is:

$$\binom{538}{269} = \frac{538!}{(269!)^2} \approx 3 \times 10^{160}.$$

(The exact value is 30937469012875859932471852213149074559467549792482329322017439200796685904952818662157496842133047631588246424271656522204688362743996012922037456048658575478800.)
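For anyone who wants to check, the computation is a one-liner in any language with exact big integers. Here it is in Python (3.8 or later), as a quick sketch:

```python
from math import comb

ties = comb(538, 269)      # 538! / (269!)**2
print(ties)                # should reproduce the 161-digit number above
print(len(str(ties)))      # 161 digits, consistent with roughly 3 x 10^160
```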

Is there some similar mathematical magic that will solve the original Electoral College problem? Not quite so neatly, as far as I know. But the same underlying principle offers a measure of help.

Let’s focus for a moment on the eight states in the Electoral College that get three votes each. Considering just those states in isolation, they have $$2^8 = 256$$ possible red-blue configurations, but only nine different ways of contributing to the election outcome. If red wins none of the states, it gets zero votes. If red wins any one of the states, it gets three votes, and there are eight ways for this to happen. Winning any two of the states yields six votes, and there are $$\binom{8}{2} = 28$$ combinations with this result. In the map above, red wins five of the three-vote states (Alaska, Montana, North Dakota, South Dakota and Wyoming) while blue wins three (Delaware, the District of Columbia and Vermont). This combination is one of 56 that produce a 15–9 vote split in favor of red. By weighting the vote sums according to the appropriate binomial coefficients, we can deal with just nine cases instead of 256.

The same trick works for the five states that get four votes apiece, the three states that have five votes each, and so on. A neat way of organizing the computation is to count through all the possible values of a mixed-radix number (where each digit has its own radix, independent of its neighbors). For the Electoral College data, the mixed-radix number has a digit for each set of states that are allocated the same number of votes; the radix of this digit is one more than the number of states in the set. The number has 19 digits overall, with radices ranging from 2 to 9.

We can survey the full spectrum of election outcomes by cycling through all the values of this number, starting at 0000000000000000000 and counting up to 1122121111443236358. For each value we calculate the corresponding vote total s, where $$k_i$$ is the $$i$$th digit (the number of states won in group $$i$$) and $$m_i$$ is the number of electoral votes allotted to each state in that group:

$$s = \sum_i k_i m_i$$

Then we use s as an index into an array H, where we keep track of how many ways the total s can be achieved. We increment $$H_s$$ by the appropriate number of combinations, where $$r_i$$ is the number of states in group $$i$$:

$$H_s = H_s + \prod_i \binom{r_i}{k_i}$$

With this algorithm, the number of configurations for which we need to sum up the votes is 6,270,566,400, which is a lot less than $$2^{51}$$. Even allowing for the extra work of calculating binomial coefficients, the mixed-radix strategy reduces the size of the computation by a factor of 10,000 or so. For a Lisp program running on a laptop, it becomes a job of hours rather than years.
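To make the bookkeeping concrete, here is the mixed-radix survey written out in Python. This is a sketch of the same algorithm, not the Lisp program I actually ran, and cycling through all 6,270,566,400 configurations in an interpreted loop will be slow; plan on a long run. The groups dictionary is read straight off the table above.

```python
from itertools import product
from math import comb

# groups[m] = r means r states each cast m electoral votes (19 groups in all).
groups = {55: 1, 38: 1, 29: 2, 20: 2, 18: 1, 16: 2, 15: 1, 14: 1, 13: 1,
          12: 1, 11: 4, 10: 4, 9: 3, 8: 2, 7: 3, 6: 6, 5: 3, 4: 5, 3: 8}

votes = list(groups)                        # m_i, one vote value per group
radices = [groups[m] + 1 for m in votes]    # digit k_i runs from 0 through r_i

H = [0] * (538 + 1)                         # H[s] = ways for red to win exactly s votes

# Cycle through every value of the 19-digit mixed-radix number.
for ks in product(*(range(r) for r in radices)):
    s = sum(k * m for k, m in zip(ks, votes))
    weight = 1
    for k, m in zip(ks, votes):
        weight *= comb(groups[m], k)        # ways to choose k of the group's states
    H[s] += weight

print(H[269])                               # 16,976,480,564,070 ties
```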

Since I was riffling through all of those 6,270,566,400 configurations anyway, I figured I might as well record not just the number of ties but the full spectrum of vote totals. It’s no surprise that the result looks like a binomial distribution:

What did surprise me a little is the smoothness of the curve. I was expecting a bit of lumpiness because of the minimum allocation of three votes and because a few states constitute large, indivisible blocks of votes. If you zoom in on the extreme tails of the distribution, there is indeed some jaggedness:

| Votes | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Combinations | 1 | 0 | 0 | 8 | 5 | 3 | 34 | 43 | 36 | 122 | 201 |


There’s just one way that a party can receive zero votes; winning either one or two votes is impossible; seven votes are more likely than either six or eight. But fluctuations of this kind become undetectable toward the middle of the distribution.

Is there some other approach that would improve on the mixed-radix algorithm? Almost surely. But I would bet against the possibility of a polynomial-time algorithm. Counting Electoral College ties is a variation on another problem for which I harbor a special fondness, called the number partitioning problem (NPP). I was introduced to the NPP by my friend Stephan Mertens, and I wrote about it in a 2002 American Scientist column. The NPP asks whether a set of positive integers has at least one partitioning into two sets that have the same sum. In this form, the problem is NP-complete. The counting version of the problem is surely no easier.

Finally, I have to confess that I haven’t really computed all the ways that the Electoral College can wind up with a tied vote. The snag is that pesky pair of states Maine and Nebraska, which do not necessarily award all their votes to the winner of the state contest. Instead they allocate one elector to the winner in each Congressional district, with the two senatorial electors going to the winner of the entire state. Thus Maine could be viewed as comprising three voting territories with allocations of 1, 1 and 2 electoral votes; Nebraska breaks down into four districts with 1, 1, 1 and 2 votes. This fragmentation of the vote expands the combinatorial challenge of counting ties. Another complication is even more troublesome: The intrastate voting territories are not independent. Unless vote counting is even more treacherous than it seems, you can’t lose all of a state’s Congressional districts but win the state overall. In my calculations I’ve simply ignored the possibility of split votes in Maine and Nebraska; perhaps someone else would like to patch this up.

Yet another caveat is that the Electoral College is not just a mathematical artifice but also a human institution. The electors are people who pledge to vote according to the mandate of their state, but they are not compelled to do so. An elector who ignores his or her instructions and votes for Mickey Mouse may well be punished after the fact, but the vote for Mickey stands, nonetheless.

Update 2012-10-10: I wrote: “Is there some other approach that would improve on the mixed-radix algorithm? Almost surely.” Well, I got that right, and I’ve never felt quite so chagrined about being correct. An anonymous commenter suggested trying dynamic programming, and soon after that Iain Murray posted an elegant, terse Python program based on that principle. His program answers the how-many-ties? question in milliseconds. It’s hard to fathom how I could have spent several days noodling over the tie-counting problem and missed a solution that’s so clearly superior. Especially since I’d seen it before. I mentioned above that Stephan Mertens introduced me to the number-partitioning problem. His lovely book The Nature of Computation (co-authored with Cristopher Moore) includes a detailed discussion of a dynamic-programming algorithm for the NPP.

One final point: Although Murray’s program is fast, it is not polynomial-time (nor does he make any claims to that effect). The anonymous commenter correctly states that the running time of the algorithm is proportional to the product of the number of states and the number of electors. But the size of the input is the logarithm of this number. Thus the running time is an exponential function of the number of bits in the input.
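For completeness, here is a bare-bones rendering of the dynamic-programming idea in Python; this is my own reconstruction of the principle, not Murray's program. It treats the tally as a 0/1 counting knapsack over the 51 vote blocks listed in the table above, and it runs in a blink:

```python
# Electoral votes of the 51 "states" (including D.C.), from the table above.
votes = [55, 38, 29, 29, 20, 20, 18, 16, 16, 15, 14, 13, 12,
         11, 11, 11, 11, 10, 10, 10, 10, 9, 9, 9, 8, 8, 7, 7, 7,
         6, 6, 6, 6, 6, 6, 5, 5, 5, 4, 4, 4, 4, 4,
         3, 3, 3, 3, 3, 3, 3, 3]

# ways[s] = number of red/blue assignments of the states seen so far
# in which red receives exactly s electoral votes.
ways = [0] * (sum(votes) + 1)
ways[0] = 1
for v in votes:
    for s in range(len(ways) - 1, v - 1, -1):   # descending: each state used once
        ways[s] += ways[s - v]

print(ways[269])    # 16,976,480,564,070
```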

Posted in computing, mathematics | 5 Comments

## IMG_1134

Lots of digital cameras come with a default file-naming scheme of IMG_nnnn, where nnnn is a four-digit number assigned sequentially, starting with 0001. The four images above are photos I made with four different cameras over a span of more than a decade, but they all have the same filename: IMG_1134.JPG. As you might guess, name collisions cause occasional trouble when I decide to merge folders. I could easily fix the problem at the source—all of the cameras allow for an override of the default naming scheme—but I’ve never bothered. And I’m not the only one.

The other day it occurred to me that the prevalence of IMG_nnnn filenames offers a way to “randomly” select a sample of images from public archives such as Flickr. Here’s a collage of results produced by searching Google Images with the query string “IMG_1134”:

Below is a similar collage generated by running the same search query on Flickr:

And finally I offer a specimen of what Bing Images delivers for “IMG_1134”:

In what sense are these selections random? The rationale is this: If we look at the 1,134th picture taken by each of many photographers, with many different cameras, there’s no reason to expect much correlation in the subject matter of those images. And indeed the samples above do seem to cover a pretty wide range.

On the other hand, I wouldn’t try to argue that the method produces a totally unbiased sample. The most serious problem is that the list of images returned by a search engine is not itself randomized. Google and Bing rank images according to the estimated prominence of the web pages they appear on. Flickr offers three sorting orders—“relevant,” “recent” and “interesting”—but does not have a “random” option.

In the samples above I dealt with this issue in various ad hoc ways. For the Google Images sample I made the selections based on computer-generated pseudorandom numbers. In the case of Flickr I took every seventh image. On Bing I scrolled down past the first few thousand thumbnail images and then copied an arbitrary rectangular region of the screen. These are feeble attempts at randomization; no doubt the selections still favor higher-ranked or more popular images.

Another source of bias is that we’re only seeing images that people chose to save and to share. The pictures you take with your thumb on the lens are likely to be deleted; unflattering portraits of your spouse will not be uploaded.

Still another bias is subtler and more interesting: Not all nnnn’s are the same. A camera that gets used for a few weeks when it’s fresh out of the box and then gathers dust in the closet over subsequent months and years will never reach the IMG_9000s, and maybe not even the IMG_1000s. To test this hypothesis I ran a search on Google Images for each string of the form “IMG_nn00,” from “IMG_0100” through “IMG_9900.” For each search string I recorded the number of hits reported by Google:

Over much of the range, the number of images decays in a way that looks vaguely logarithmic—but then something weird happens at about nnnn = 6000. I suspect that the weirdness comes from a glitch in Google’s classification and counting algorithms rather than from the habits of the worldwide community of photographers. To test this idea I reran the same script on Flickr, with these results:

The scale is different but the shape of the distribution is very similar—without any spiky weirdness at the high end. A logarithmic curve fits quite well.
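In case anyone wants to repeat the survey, the loop itself is trivial. In the sketch below, get_hit_count is a hypothetical stand-in for whatever query or scraping step returns a result count for a search string:

```python
# A sketch of the hit-count survey. The helper get_hit_count() is hypothetical;
# replace it with a real query against Google, Flickr or Bing.
def get_hit_count(query: str) -> int:
    raise NotImplementedError("replace with a real search-engine query")

counts = {}
for n in range(1, 100):                 # IMG_0100, IMG_0200, ..., IMG_9900
    query = f"IMG_{n:02d}00"
    counts[query] = get_hit_count(query)

for query, hits in sorted(counts.items()):
    print(query, hits)
```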

As the number of IMG_nnnn images falls off with increasing nnnn, the nature of the images may also change. In particular, one might suppose that photographers who practice their craft enough to make several thousand images would produce more accomplished and more interesting results. In other words, by looking at the higher-numbered images, we may weed out the dilettantes.

I grew curious about IMG_0001. What kinds of pictures do people take when they first unpack a new camera, charge up the battery, and begin clicking the shutter? I expected a lot of self-conscious and self-referential images like these:

Obviously I found a few, but I had to sift through several hundred Flickr IMG_0001 thumbnails to come up with these four. (There were also a few pictures of camera boxes and packaging.) To my surprise, the vast majority of IMG_0001 photos did not look at all like test images made with a new toy. Not unless people unpack and test their new cameras inside the Hagia Sophia or on the south rim of the Grand Canyon or at a basketball game.

So what’s going on with the 0001s? Perhaps through some metadata mixup, multiple images are all getting labeled “IMG_0001,” even though that’s not the filename coming out of the camera. But if that’s the case, “IMG_0001” ought to be an outlier, with substantially more exemplars than neighboring file names IMG_0002, IMG_0003, etc. The data indicate otherwise:

An alternative explanation is that photographers frequently reset the camera’s file counter to 0001. Here’s still another possibility: Maybe some of those IMG_0001 pictures are not the camera’s first image but its 10,000th. I’ve never gone all the way around with a camera, so I don’t know exactly what happens when the counter turns over. Is IMG_9999 followed by IMG_0000? (I note that a Flickr search for “IMG_0000” returns 2,430 results. It’s the tiny leftmost lollipop in the graph above.)

•     •     •

Even if searching for IMG_nnnn offers only a crude level of randomization, I think it may still hold some promise of showing us what average or typical photographs look like. How many are portraits and how many are landscapes? Taken indoors or out? What fraction of all photos show children? Pets? What are the proportions of men and women?

Based on a casual perusal of a few thousand photos, I was led to make some tentative observations:

• The genre of family snapshots—children’s birthday parties, vacation trips to Disney World—is not nearly as prominent in these collections as I would have expected. I guess those pictures are all on Facebook.
• Food, on the other hand, is a much more popular subject than I ever would have guessed. In a sample of 100 Bing images, 21 showed comestibles of some kind. Is it really true that a fifth of all the world’s pixels are being used to show what we ate for lunch?
• Vehicles are not quite as ever-present as meals, but there sure are a lot of cars, bikes, trains, trams and aircraft to be seen. It seems a lot of people want to show off their ride.

Addendum: Here’s the data behind the three graphs above.

Posted in uncategorized | 12 Comments

## The abc game

Five years ago I wrote about a rumored proof of the abc conjecture, an idea from the 1980s that stands at the juncture between the additive and the multiplicative parts of number theory. Now there’s more abc news, and this time it’s not just a rumor. Shinichi Mochizuki of Kyoto University has released a series of four long papers in which he claims to provide a proof. I have nothing useful to say about the proof itself—the waters are deep and dark out where Mochizuki swims—but I’m going to take the occasion for some dabbling in the shallow end of the pool.

So what’s the abc conjecture? I gave an account in my earlier post, but here I want to try a different approach. Let’s make a game of it.

You choose a positive integer a, and I’ll choose a larger integer b that’s relatively prime to a. In other words, a < b and no integer n > 1 divides both a and b. Now we calculate two quantities. The first is the sum a + b = c. (I note that c must also be relatively prime to both a and b. Agreed?) The second quantity is called the radical of a, b, c, which I’m going to designate R; it’s the product of all the distinct primes in a, b and c. To compute the radical we list all the prime factors of a, b and c, merge the three lists, cast out duplicates until each prime appears just once, and finally multiply.

Now for the payoff: If R ≥ c, you win. If R < c, the triple {a, b, c} is called an abc-hit, and I win.

Suppose you pick a = 5 and I reply with b = 9, for a sum of c = 14. The distinct prime factors of these numbers are 2, 3, 5 and 7, and so the radical R = 2 × 3 × 5 × 7 = 210. Since R is larger than c, I lose. But if you pick 5 and I pick 27, we have c = 5 + 27 = 32 whereas R = 2 × 3 × 5 = 30. This is an abc-hit, and I’m the winner.
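If you want to referee the game by machine, the arithmetic is easy to code up. Here is a small Python sketch (nothing like the ABC@Home search software, just naive trial division):

```python
def radical(*numbers):
    """Product of the distinct primes dividing any of the given numbers."""
    primes = set()
    for n in numbers:
        d = 2
        while d * d <= n:
            if n % d == 0:
                primes.add(d)
                while n % d == 0:
                    n //= d
            d += 1
        if n > 1:
            primes.add(n)
    result = 1
    for p in primes:
        result *= p
    return result

def is_abc_hit(a, b):
    """True if {a, b, a+b} is an abc-hit, i.e. R < c."""
    c = a + b
    return radical(a, b, c) < c

print(radical(5, 9, 14))    # 210, so {5, 9, 14} is not a hit
print(is_abc_hit(5, 27))    # True: R = 30 < 32
```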

In a sense, this game is very much in your favor, because abc-hits are rare. There are 4,950 integer pairs with a < b ≤ 100, and among them 3,043 pairs are relatively prime; but only six of those pairs produce an abc-hit:

\begin{align} 1 + 8 &= 9 & R &= 6\\ 1 + 48 &= 49 & R &= 42\\ 1 + 63 &= 64 & R &= 42\\ 1 + 80 &= 81 & R &= 30\\ 5 + 27 &= 32 & R &= 30\\ 32 + 49 &= 81 & R &= 42 \end{align}

The same information is presented graphically below. Maroon dots are abc-hits (and their symmetrical counterparts after interchanging a and b); gray dots are all the other relatively prime pairs.

For a broader view, here are the 1,318 abc-hits with a < b ≤ $$10^6$$ (and their reflections). I didn’t compute all these hits myself; the data come from the ABC@Home project, where volunteers have been donating cpu cycles to the task for several years.

Given the rarity of abc-hits, it looks like you have an excellent chance of avoiding them. If we both pick random a’s and b’s in the range from 1 to 1,000,000, you win with probability $$1 - 10^{-10}$$. Before you lay down any heavy wagers on this game, however, consider that I too have an advantage. I play second. If for every a there is some b that yields an abc-hit, then in principle I can always win. All the same, I won’t be making any big bets either. Although it may well be true that there’s at least one abc-hit for every a, I don’t have the computational resources to find a winning b in every case. For instance, if you choose a = 330, I’m stuck. The ABC@Home team has not yet found a b that makes an abc-hit with a = 330; if such a b exists, it must be larger than 999,999,999,999,999,670.

So that’s the basic abc game. Let’s fiddle with the rules a little to make it more interesting.

Not all abc-hits are equally impressive. Sometimes R is much smaller than c, but in other cases the difference is slight and the ratio c/R is barely greater than 1. A few examples suggest the range of possibilities:

\begin{align} 459 + 3125 &= 3584 & R &= 3570 & c/R &= 1.004\\ 5 + 27 &= 32 & R &= 30 & c/R &= 1.067\\ 1 + 4374 &= 4375 & R &= 210 & c/R &= 20.833\\ 2 + 6436341 &= 6436343 & R &= 15042 & c/R &= 427.891 \end{align}

Suppose we adjust the rules of the game so that I win only if I can find a “high-impact” abc-hit, where R is dramatically less than c. Before playing, we agree on a number h, which can be any real number equal to or greater than 1. Then you pick your a, I pick my b, and we calculate the sum c and the radical R, just as before. But in this version of the game, I win only if $$R^h < c$$. (A mathematically equivalent formulation says that I win if (log c)/(log R) > h.)

If h = 1, our new game is identical to the old one (since $$R^1 = R$$). If we assume that an abc-hit exists for every a, and if I have unlimited computing power, then I should always be able to win with h = 1. But everything changes when h > 1. This is where the abc-conjecture comes into the picture. The conjecture asserts that only finitely many abc-hits have $$R^h < c$$ whenever h has any value greater than 1—no matter how slight the excess over 1. An immediate consequence is that you win the game. If there are only finitely many “high-impact” abc-hits for our chosen value of h, then there must be infinitely many a’s for which I can’t find a winning b.

For the record, here’s a statement of the abc conjecture without the apparatus of the two-player game:

Given any real number h > 1, there are at most finitely many relatively prime triples {a, b, c} with a + b = c and radical R(a, b, c) such that $$R^h < c$$.

Note that the definition requires the exponent h to be strictly greater than 1. It says there can’t be an infinity of abc-hits with $$R^2 < c$$ or $$R^{1.5} < c$$ or even $$R^{1.0000001} < c$$. But it doesn’t rule out an infinity of abc-hits with $$R^1 < c$$, and in fact such an infinity is known to exist. (A presentation by Frits Beukers gives a constructive proof, generating an infinite sequence of abc-hits with a = 1.)

I’d like to dwell for just a moment on the strangeness of this statement. Think of h as a parameter controlling the size of the pores in a mathematical sieve. We set h to some specific value and then run all the abc-hits through the sieve, keeping those with (log c)/(log R) > h, and discarding the rest. If we start with h = 1.5, we might find we retain just a few hits. As we reduce the value of h, the size of the collection grows, but (if the abc conjecture is true) the number of qualifying triples remains finite as long as h is greater than 1, even infinitesimally so. And then, when h = 1, the sieve suddenly catches an infinity of abc-hits. So now we can ask: For how many of those infinitely numerous hits is (log c)/(log R) equal to 1? The answer is: 0.

I don’t mean to suggest there’s anything wrong with the mathematics of this situation. It’s just another hint that this paradise Cantor created for us is a mighty weird place to live.

One more digression. The defining property of an abc-hit is R < c. What about the case of R = c? I would love to exhibit a specimen of such a triple, but I haven’t been able to find one. Do they exist? I see no obvious reason why they shouldn’t, and I suspect they’re just very rare. After all, there are many ways for R to be less than c and many ways for R to be greater than c, but there’s only one way for R to be equal to c. My search for low-hanging fruit (examining all triples with c ≤ 25,000) came up empty. I have also sifted through 14 million triples recorded by the ABC@Home project, which cover all territory up to c = $$10^{18}$$. In this data set I found no triples with R = c. However, I don’t know whether that’s because no R = c triples exist in this range or because the ABC@Home software was set to look only for R < c.

•     •     •

My abc games require nothing but simple arithmetic. Not so Mochizuki’s work on the abc conjecture.

Mochizuki is a professor at the Research Institute for Mathematical Sciences in Kyoto. His Ph.D. is from Princeton, where he studied under Gerd Faltings, the Fields Medalist who proved the Mordell Conjecture.

In August Mochizuki posted four manuscripts on his web site at Kyoto University. (Links to PDFs: Part I, Part II, Part III, Part IV.) They carry the collective title “Inter-Universal Teichmüller Theory,” and together they amount to 512 pages. Part IV is where the claimed proof of the abc conjecture appears.

I have made a sincere attempt to extract some meaning from these documents. In the case of Part IV, I have “read” the entire manuscript, in the sense that I have cast my eyes over each line of text and tried to apprehend the sense of the words and the symbols, seeking help in reference works and the web when necessary. Nothing of worth has come from this endeavor. I’m usually pretty good at reading things I don’t understand, but Mochizuki has defeated me.

I haven’t yet found anyone else who claims to fully understand the proof, but several readers have made much more progress than I have. The best commentary so far is on Mathoverflow. Other sources are Jordan Ellenberg’s blog Quomodocumque, a Google+ note by John Baez and Peter Woit’s blog Not Even Wrong.

Note: The paragraph above beginning “I’d like to dwell for just a moment…” has been changed since this article was first published. Resisting the powerful urge to bury my embarrassing mistakes, I reproduce below the original version:

I’d like to dwell for just a moment on the strangeness of this statement. Suppose we go through all the abc-hits and for each one write down the value of h = (log c)/(log R). Then, assuming the abc conjecture is true, we have these results:

• How many hits have h ≥ 1? Answer: infinitely many.
• How many hits have h > 1? Answer: finitely many.
• How many hits have h = 1? Answer: zero.

Addendum 2012-09-12: When I began this piece, I said I was going to dabble in the shallow water. Now I’m in well over my head. As Stevie Smith said: Not waving but drowning.

In the comments, Jim Lyon takes issue with my statement about the “strangeness” of the abc conjecture. Lyon writes:

If N(h) is the number of hits for value h, it’s not that strange that N(h) is finite when h > 1, but infinite when h = 1.

To some extent, strangeness is subjective. What a person finds strange or surprising is the mirror image of what feels familiar and expected. Is it strange that the integers and the rationals can be put into one-to-one correspondence but the real numbers form a larger set? Well, I think I understand the proof of that fact, and yet, when I focus on the fact itself, my brow still furrows. Is it strange or remarkable that the infinite series 1/1 + 1/4 + 1/9 + 1/16 . . . converges but 1/2 + 1/3 + 1/5 + 1/7 . . . does not? What about the Banach-Tarski procedure for turning a pea into a planet? Is there anyone for whom that notion is a comfy piece of mental furniture?

The thing is, though, if strangeness is just a matter of unfamiliarity, it can’t be sustained indefinitely. And, indeed, after wrestling with these ideas for a few days and nights, I find the sense of strangeness fading. I’ve changed my mind; I now tend to agree with Jim Lyon’s statement quoted above.

A simple model helps to clear away some of the sense of mystery. Take the set of all fractions 1/n with n = 1, 2, 3, 4, . . . Now we can count the fractions whose value is greater than some real number m. For 0 < m ≤ 1, only a finite set of fractions exceeds m. As we reduce m, the set becomes larger, and at m = 0 the set of qualifying fractions is infinite. And yet there is no fraction in the set with 1/n = 0. This is the same behavior observed with abc-hits as h is reduced to 1, and there’s nothing very mysterious about it.
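The counting in this toy model is easy to check numerically; here is a trivial sketch, with the cutoff n_max standing in for infinity:

```python
def count_fractions_above(m, n_max=10**6):
    """How many fractions 1/n with 1 <= n <= n_max exceed m?"""
    return sum(1 for n in range(1, n_max + 1) if 1 / n > m)

for m in (0.5, 0.1, 0.01, 0.001):
    print(m, count_fractions_above(m))   # the count grows roughly like 1/m
```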

But perhaps this model explains too much. If the set of abc-hits had the same structure as the set of 1/n fractions, there would be no need for an abc conjecture. The truth of the proposition would be easy to establish.

Is anything known, even empirically, about the asymptotic behavior of the number of solutions as h approaches 1? (Aside from the fact that it goes to infinity.) Like, does it go as 1/(h-1)? 1/(h-1)^2? exp[1/(h-1)]?

I suspect this is actually a deep and difficult question. If we had a definitive answer, surely that would be at least a head start toward resolving the abc conjecture.

In any case, here is some empirical data that might or might not shed light on the limiting behavior. For a sample of 100,000 hits drawn from the ABC@Home file, I calculated ε = h − 1 = (log c)/(log R) − 1. Then I plotted the cumulative distribution of the ε values, using six bins of exponentially increasing width. The first bin includes counts for all ε > 0, the second for all ε > e^−5 ≈ 0.007, the third for all ε > e^−4 ≈ 0.018, and so on.

\begin{align}
\textit{bin} & \quad \textit{count}\\
0 & \quad 100000\\
e^{-5} & \quad 84532\\
e^{-4} & \quad 63839\\
e^{-3} & \quad 30514\\
e^{-2} & \quad 4748\\
e^{-1} & \quad 48\\
\end{align}

The same data in graphic form:

The sample consists of the first 100,000 consecutive abc-hits beginning with the first hit having c ≥ 10^9. I avoided the beginning of the file because the smallest hits seem to have peculiar statistics. Still, I’m not sure whether this is a reasonably fair sample, or even what it means to sample fairly from an infinite data set.
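For anyone who wants to reproduce the binning, here is a sketch of the procedure, applied for convenience to brute-force hits from the earlier sketches (rad and abc_hits) rather than to the ABC@Home sample itself:

```python
from math import exp, log

def epsilon(a, b, c):
    """epsilon = (log c)/(log R) - 1, with R = rad(a) * rad(b) * rad(c)."""
    return log(c) / log(rad(a) * rad(b) * rad(c)) - 1

def bin_counts(triples):
    """Cumulative counts of hits with epsilon > 0, e^-5, e^-4, ..., e^-1."""
    eps = [epsilon(a, b, c) for a, b, c in triples]
    thresholds = [0.0] + [exp(-k) for k in range(5, 0, -1)]
    return [(t, sum(1 for e in eps if e > t)) for t in thresholds]

for threshold, count in bin_counts(abc_hits(1000)):   # rad, abc_hits as above
    print(f"ε > {threshold:.4f}: {count}")
```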

Posted in featured, mathematics | 10 Comments

## Stop lights

For something like 80 years, a traffic light was an ordinary incandescent light bulb mounted behind a colored glass lens. Then, in 1999, LED signal lights began appearing at street corners; the light bulb was replaced by an array of small light-emitting diodes shining through a diffusing lens. LEDs are more efficient than incandescent bulbs, and they last longer. Moreover, having many LEDs instead of one tungsten filament offers the safety advantage of a gradual failure mode. As I wrote in my Field Guide to the Industrial Landscape: “If one small emitter burns out, the rest continue to shine.”

That sentence was written when LED traffic lights were still very new, and I had never actually seen one with a failed emitter. Now that the LED signals have several years of service behind them, I’ve begun to notice a lot of defects, and they don’t look like what I expected. Instead of random, isolated blemishes on the face of the signal, I’m seeing large scars and ragged craters. Oddly shaped blocks of emitters are all failing at once.

This is a pattern that turns up with notable frequency:

This one has three blocks of dead pixels:

And here almost half the emitters have gone dark:

All three of the examples above are green signal lights, which seem to fail much more often than red and yellow ones. But here’s an example of a red light with more than 20 blacked-out pixels and several more dim ones:

Finally, in another green light, a quite different and more orderly pattern of defects appears:

For me these images provoke a couple of questions. First, why do we so often see signal faces with a dozen or more LED failures, while single-pixel outages seem to be very rare? Second, what accounts for the distinctive shapes of the dark regions?

No doubt some diligent googling or wikipedia-wrangling would turn up some answers, but sometimes it’s just more fun to speculate in ignorance. In any case, I have a hypothesis about the first question. The idea comes from childhood experience with Christmas-tree lights. LEDs are generally low-voltage devices, but the available power supply in the traffic light is 120 volts. The cheapest and easiest way to deal with this mismatch is to wire groups of LEDs in series, thereby dividing the voltage. A consequence is that if any one LED in a group fails to an open-circuit state, the entire chain goes dark.
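A back-of-the-envelope calculation shows how series wiring would turn rare single-LED failures into conspicuous block failures. The numbers below (per-LED failure probability, string length, strings per face) are guesses of mine, not data from any manufacturer:

```python
# If each LED independently fails open with probability p, a series string of
# k LEDs goes entirely dark with probability 1 - (1 - p)**k -- and a single
# dark pixel is essentially impossible, since one open LED kills the string.
p = 0.01            # assumed per-LED failure probability over the service life
k = 12              # assumed LEDs per series string
n_strings = 16      # assumed strings per signal face (about 200 LEDs total)

p_string_dark = 1 - (1 - p) ** k
expected_dark_strings = n_strings * p_string_dark
print(f"P(string dark) = {p_string_dark:.3f}")
print(f"Expected dark strings per face = {expected_dark_strings:.2f}")
```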

As for the geometric question, I haven’t got a clue. If I were laying out the circuit board, I would surely wind up with some collection of reasonably tidy, compact, symmetrical clusters. The patterns on exhibit here are not tidy, compact or symmetrical. The last of the images above might be explicable: Wiring together every other element in a row would be a good way of reducing the visual impact of a failure. But all the other patterns are incomprehensible.

Of course my whole Christmas-lights hypothesis may be totally screwy. I welcome alternative hypotheses, and even Google-Wiki-derived answers.

As a final bit of fun, here’s a video of an LED traffic signal shot at 1,000 frames per second, and slowed down by a further factor of 2, showing the 120-hertz flicker from the AC power supply, including the fade at the end of the green cycle.

Posted in technology | 6 Comments