<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>

<channel>
	<title>bit-player</title>
	<atom:link href="http://bit-player.org/feed" rel="self" type="application/rss+xml" />
	<link>http://bit-player.org</link>
	<description>An amateur's outlook on computation and mathematics.</description>
	<pubDate>Tue, 31 Aug 2010 20:30:22 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.6.3</generator>
	<language>en</language>
			<item>
		<title>I could carry less</title>
		<link>http://bit-player.org/2010/i-could-carry-less</link>
		<comments>http://bit-player.org/2010/i-could-carry-less#comments</comments>
		<pubDate>Tue, 31 Aug 2010 20:30:20 +0000</pubDate>
		<dc:creator>brian</dc:creator>
		
		<category><![CDATA[mathematics]]></category>

		<guid isPermaLink="false">http://bit-player.org/?p=778</guid>
		<description><![CDATA[

The fabled carefree residents of the&#160;Carryless Islands in the remote South Pacific&#160;have very few possessions, which is just as well, since their&#160;notion of arithmetic is ill-suited to accurate record-keeping.&#160;When they add or multiply numbers, they follow similar rules&#160;to ours, except that there are no carries into other digit positions.&#160;Addition and multiplication of single-digit numbers&#160;are performed [...]]]></description>
			<content:encoded><![CDATA[
<blockquote>
<p>The fabled carefree residents of the&nbsp;Carryless Islands in the remote South Pacific&nbsp;have very few possessions, which is just as well, since their&nbsp;notion of arithmetic is ill-suited to accurate record-keeping.&nbsp;When they add or multiply numbers, they follow similar rules&nbsp;to ours, except that there are <em>no carries</em> into other digit positions.&nbsp;Addition and multiplication of single-digit numbers&nbsp;are performed by a process that we would call &#8220;reduction mod 10.&#8221;&nbsp;Any carry digits are simply ignored.&nbsp;So&nbsp;9 <strong>+</strong> 4 = 3,&nbsp;5 <strong>+</strong> 5 = 0,&nbsp;9 <strong>&#215;</strong> 4 = 6, 5 <strong>&#215;</strong> 4 = 0, and so on.</p>
</blockquote>
<p>With this fable, David Applegate, Marc LeBrun and N. J. A. Sloane introduce a new scheme of arithmetic in <a href="http://arxiv.org/abs/1008.4633">a paper newly posted on the arXiv</a>.</p>
<p>And if you think the mathematics sounds trivial, try explaining the structure of this sequence:</p>
<blockquote><p>21, 23, 25, 27, 29, 41, 43, 45, 47, 49, 51, 52, 53, 54, 56, 57, 58, 59, 61, 63, &#8230;</p></blockquote>
<p>which lists the first 20 &#8220;carryless primes.&#8221; </p>
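<p>For readers who want to experiment, here is a minimal Python sketch of the arithmetic in the fable. The function names are illustrative and not taken from the paper. The multi-digit multiplication rule (schoolbook long multiplication with every digit product and every column sum reduced mod 10) is an assumption based on the &#8220;no carries&#8221; description above.</p>

```python
def carryless_add(a, b):
    """Add two non-negative integers digit by digit, mod 10, dropping carries."""
    result, place = 0, 1
    while a or b:
        result += ((a % 10 + b % 10) % 10) * place
        a, b, place = a // 10, b // 10, place * 10
    return result


def carryless_mul(a, b):
    """Long multiplication in which each digit product is reduced mod 10
    and the shifted partial products are combined with carryless_add."""
    result, shift = 0, 1
    while b:
        digit, partial, place, rest = b % 10, 0, 1, a
        while rest:
            partial += ((rest % 10) * digit % 10) * place
            rest, place = rest // 10, place * 10
        # Multiplying by a power of ten only shifts digits, so no carry arises.
        result = carryless_add(result, partial * shift)
        b, shift = b // 10, shift * 10
    return result


print(carryless_add(9, 4), carryless_add(5, 5))  # the sums from the fable
print(carryless_mul(9, 4), carryless_mul(5, 4))  # the products from the fable
```

<p>The two print lines reproduce the four single-digit examples quoted above: 3, 0, 6 and 0.</p>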
]]></content:encoded>
			<wfw:commentRss>http://bit-player.org/2010/i-could-carry-less/feed</wfw:commentRss>
		</item>
		<item>
		<title>In the zone</title>
		<link>http://bit-player.org/2010/in-the-zone</link>
		<comments>http://bit-player.org/2010/in-the-zone#comments</comments>
		<pubDate>Tue, 24 Aug 2010 14:46:51 +0000</pubDate>
		<dc:creator>brian</dc:creator>
		
		<category><![CDATA[computing]]></category>

		<category><![CDATA[modern life]]></category>

		<guid isPermaLink="false">http://bit-player.org/?p=776</guid>
		<description><![CDATA[
Before leaving on a trip to the West Coast, I copied my return flight information onto my Google calendar: SFO to BOS, 12:50 p.m. to 9:30 p.m. Now that  I&#8217;m nearing the end of my visit to the Mythical State of Jefferson (see local landmark above), I&#8217;ve just checked the departure details by calling [...]]]></description>
			<content:encoded><![CDATA[<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/08/mtshasta-from-the-coca-cola-hill-0147.jpg" border="0" alt="Mt. Shasta, from a nearby hilltop owned by Coca Cola" title="Mt. Shasta, from a nearby hilltop owned by Coca Cola" width="450" height="602" /></p>
<p>Before leaving on a trip to the West Coast, I copied my return flight information onto my Google calendar: SFO to BOS, 12:50 p.m. to 9:30 p.m. Now that  I&#8217;m nearing the end of my visit to the <a href="http://bad.eserver.org/issues/2000/48/shaw.html">Mythical State of Jefferson</a> (see local landmark above), I&#8217;ve just checked the departure details by calling up the calendar on my cell phone. It tells me the flight departs at 9:50 a.m. and arrives at 6:30 p.m.</p>
<p>It could be worse, of course. As an eastbound traveler all that I risk is wasting three hours at the airport. If I were westbound, I might well miss my flight.</p>
<p>The developers of Google calendar would doubtless argue that automatic time-zone conversion is a feature, not a bug. If the event in question had been a conference call scheduled for 12:50 p.m. Eastern Daylight Time, then 9:50 a.m. Pacific Daylight Time would indeed be the moment to dial in. Or, if I had a series of pills to be taken at fixed intervals, it might be helpful to have the program remind me at the correct times as I wander across continents. But in the case of today&#8217;s flight home, the result is just plain wrong.</p>
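<p>The shift is easy to reproduce. The sketch below uses Python&#8217;s standard <code>zoneinfo</code> module. The calendar date is an assumption (the day of this post), as is the guess that the calendar stored the entry in the home time zone; only the clock times come from the story above.</p>

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # standard library since Python 3.9

BOSTON = ZoneInfo("America/New_York")
SAN_FRANCISCO = ZoneInfo("America/Los_Angeles")

# Assumed: the entry was stored as 12:50 p.m. in the home (Boston) zone.
entered = datetime(2010, 8, 24, 12, 50, tzinfo=BOSTON)

# A device set to Pacific time converts that instant for display.
shown_on_phone = entered.astimezone(SAN_FRANCISCO)

# What was actually meant: 12:50 p.m. on the clock at SFO.
intended = datetime(2010, 8, 24, 12, 50, tzinfo=SAN_FRANCISCO)

print("phone shows:", shown_on_phone.strftime("%H:%M %Z"))
print("intended departure, Boston time:", intended.astimezone(BOSTON).strftime("%H:%M %Z"))
print("error:", intended - entered)
```

<p>The stored instant and the intended instant differ by exactly the three-hour offset between the two zones, which is the time that would be wasted at the airport.</p>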
<p>It&#8217;s not only a Google calendar problem. A few weeks ago a friend was entering a schedule of talks into a Drupal web page for a conference that begins November 7, 2010. All the times were mysteriously shifted by an hour. A 1:00 p.m. talk on the input form became a 2:00 p.m. talk on the displayed web page. The key to solving this mystery is knowing that November 7 is the date daylight saving time ends in (most of) the U.S.</p>
<p>I am certainly not the first to encounter such problems. Peter Neumann&#8217;s <a href="http://catless.ncl.ac.uk/risks">RISKS Digest</a> reports hundreds of computational mishaps involving time zones or daylight saving time, going back over the past 25 years. The issue is <a href="http://www.google.com/support/forum/p/Calendar/thread?tid=1f05577692e541c2&#038;hl=en">known to Google</a>. But what is the right fix?</p>
<p>Some other calendar software offers the option of specifying a time zone for an event. To handle the airline case correctly, the program needs to allow for different zones for the start and the end times. And the Drupal problem suggests we may also need some means of indicating whether or not to adjust for daylight saving time. It gets very messy. Note that an event scheduled for 1:30 a.m. on November 7, 2010, will happen twice. An event at 2:30 a.m. on March 13, 2011, will never take place.</p>
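<p>Both oddities can be detected mechanically. The following sketch relies on the <code>fold</code> attribute that Python datetimes carry for exactly this purpose. The helper name <code>classify</code> and the choice of the <code>America/New_York</code> zone are assumptions made for illustration; any U.S. zone that observes daylight saving time should behave the same way on these dates.</p>

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library since Python 3.9


def classify(naive, zone_name):
    """Say whether a wall-clock time is unique, ambiguous or nonexistent."""
    tz = ZoneInfo(zone_name)
    first = naive.replace(tzinfo=tz, fold=0)   # rules before a transition
    second = naive.replace(tzinfo=tz, fold=1)  # rules after a transition
    if first.utcoffset() == second.utcoffset():
        return "unique"
    # Two offsets apply. If a trip through UTC returns the same wall-clock
    # reading, the time occurs twice; otherwise it falls in a skipped hour.
    round_trip = first.astimezone(timezone.utc).astimezone(tz)
    if round_trip.replace(tzinfo=None) == naive:
        return "ambiguous"
    return "nonexistent"


print(classify(datetime(2010, 11, 7, 1, 30), "America/New_York"))
print(classify(datetime(2011, 3, 13, 2, 30), "America/New_York"))
print(classify(datetime(2010, 8, 24, 12, 50), "America/New_York"))
```

<p>A calendar that stored wall-clock time together with a zone could run a check of this kind when an event is entered, rather than silently picking one interpretation.</p>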
<p>For years my own makeshift solution to these complexities was to live in a single time zone, no matter where I was. I carried a laptop, but I never changed its time-zone setting, even when I was away from home for weeks or months. When conversion was needed, it happened in my head. Even now the Google calendar display on my laptop gives the correct departure time from SFO, because the laptop doesn&#8217;t know that it ever left Boston. But in our new world of location-aware devices, pretending to stay home is no longer an option.</p>
<p>I begin to wonder if the whole railroad-age concept of time zones hasn&#8217;t outlived its usefulness. But I haven&#8217;t time just now to consider the alternatives. Right now it&#8217;s time to leave for the airport. I think.</p>
]]></content:encoded>
			<wfw:commentRss>http://bit-player.org/2010/in-the-zone/feed</wfw:commentRss>
		</item>
		<item>
		<title>The ormat game</title>
		<link>http://bit-player.org/2010/the-ormat-game</link>
		<comments>http://bit-player.org/2010/the-ormat-game#comments</comments>
		<pubDate>Mon, 16 Aug 2010 22:30:59 +0000</pubDate>
		<dc:creator>brian</dc:creator>
		
		<category><![CDATA[computing]]></category>

		<category><![CDATA[games]]></category>

		<category><![CDATA[mathematics]]></category>

		<category><![CDATA[physics]]></category>

		<category><![CDATA[statistics]]></category>

		<guid isPermaLink="false">http://bit-player.org/?p=771</guid>
		<description><![CDATA[Here&#8217;s the deal. I&#8217;m going to give you a square grid, with some of the cells colored and others possibly left blank. We&#8217;ll call this a template. Perhaps the grid will be one of these 3&#215;3 templates:

You have a supply of transparent plastic overlays that match the grid in size and shape and that also [...]]]></description>
			<content:encoded><![CDATA[<p>Here&#8217;s the deal. I&#8217;m going to give you a square grid, with some of the cells colored and others possibly left blank. We&#8217;ll call this a template. Perhaps the grid will be one of these 3&#215;3 templates:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/08/ormat-template-3x31.png" border="0" alt="colored 3x3 ormat grids" width="365" height="120" /></p>
<p>You have a supply of transparent plastic overlays that match the grid in size and shape and that also bear patterns of black dots:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/08/perm-template-3x3-dots.png" border="0" alt="dot patterns for the six 3x3 permutation matrices" width="365" height="257" /></p>
<p>Note that each of these patterns has exactly three dots, with one dot in each row and each column. The six overlays shown are the only 3&#215;3 grids that have this property.</p>
<p>Your task is to assemble a subset of the overlays and lay them on the template in such a way that dots cover all the colored squares but none of the blank squares. You are welcome to superimpose multiple dots on any colored square, but overall you want to use as few overlays as possible. To make things interesting, I&#8217;ll suggest a wager. I&#8217;ll pay you $3 for a correct covering of a 3&#215;3 template, but you have to pay me $1 for each overlay you use. Is this a good bet?</p>
<p>Before going further, I should mention that not every conceivable template can be covered under these rules. To take an obvious example, no 3&#215;3 template with fewer than three colored squares can possibly be covered by any combination of the six overlays. But I promise to submit only templates that can be covered by <em>some</em> combination of the given dot patterns; if I err about this, I forfeit the bet.</p>
<p>How does the game play out? If I give you the template marked &#8220;1&#8221; above, you can easily win; just choose permutations <em>a</em> and <em>b</em>, which together cover all the colored squares and no others. You pay $2 and get $3. Template 2, with all nine squares colored, looks like it might be the toughest challenge. Clearly, it cannot be covered with fewer than three overlays, since we need a total of nine dots; and it turns out that exactly three overlays are required. Indeed, there are two ways of covering the template with three overlays: <em>a</em> + <em>d</em> + <em>e</em> and <em>b</em> + <em>c</em> + <em>f</em>. Thus this template is a breakeven proposition: You earn $3 and pay $3.</p>
<p>Now we come to template 3, which has eight colored squares and one blank. Surely if you can cover the full nine squares with just three overlays, then you should also be able to cover eight squares&#8212;no? I invite you to try it. In fact the only covering that works requires four overlays: <em>b</em> + <em>d</em> + <em>e</em> + <em>f</em>. Thus you shouldn&#8217;t take my bet, since I can always give you a template with just one blank, and you&#8217;ll have a net loss of $1.</p>
<p><strong>Some background.</strong> I&#8217;ll return to the gaming table momentarily, but first let me explain what this is all about and where it came from. A few weeks ago, <a href="http://bit-player.org/2010/four-questions-about-fuzzy-rankings">I was writing</a> about &#8220;ranges of rankings,&#8221; which led me into the topic of permutation matrices. To recapitulate:</p>
<ul>
<li>A permutation matrix is a square matrix with a single 1 in each column and each row, and all the rest of the elements 0.</li>
<li>An <em>ormat</em> is a superposition of permutation matrices, formed by applying the Boolean <em>OR</em> function to corresponding elements of the permutation matrices. For example: <img class="centered" src="http://bit-player.org/wp-content/uploads/2010/07/matrix-or-sum.png" border="0" alt="matrix-or-sum.png" width="242" height="65" /></li>
<li>Not all square matrices with (0,1) entries can be formed by <em>OR</em>-ing permutation matrices, but there&#8217;s an efficient algorithm for deciding whether or not a given matrix is an ormat. (I thank some helpful commenters for enlightening me on this point.)</li>
<li>Given an ormat, the total number of distinct permutation paths that can be threaded through the 1 entries of the matrix is equal to the <em>permanent</em> of the matrix. Calculating the permanent is known to be a hard computational problem.</li>
</ul>
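<p>For small matrices these definitions can be checked by brute force. The sketch below is not the efficient algorithm mentioned in the list above; it simply enumerates all <em>k</em>! permutations, so it is practical only for small <em>k</em>. The function names are illustrative.</p>

```python
from itertools import permutations


def contained_permutations(matrix):
    """All permutations (row r -> column p[r]) that step only on 1 entries."""
    k = len(matrix)
    return [p for p in permutations(range(k))
            if all(matrix[r][p[r]] == 1 for r in range(k))]


def permanent(matrix):
    """For a 0/1 matrix, the permanent counts the contained permutations."""
    return len(contained_permutations(matrix))


def is_ormat(matrix):
    """True if OR-ing every contained permutation reproduces the matrix."""
    k = len(matrix)
    covered = {(r, p[r]) for p in contained_permutations(matrix) for r in range(k)}
    ones = {(r, c) for r in range(k) for c in range(k) if matrix[r][c] == 1}
    return bool(ones) and covered == ones


full = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
stray = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]  # the 1 in the top row, middle column is on no permutation
print(permanent(full), is_ormat(full))
print(permanent(stray), is_ormat(stray))
```

<p>The second matrix fails the test because one of its 1 entries cannot be reached by any permutation that avoids the 0 entries.</p>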
<p>In a comment, Barry Cipra posed the following query:</p>
<blockquote><p>The permanent tells us the maximum number of different permutations that can be <em>OR</em>-summed to produce a given ormat, but what is the corresponding minimum number? Also, in how many different ways can the minimum be achieved?</p></blockquote>
<p>The connection between ormats and my little game is probably apparent by now. The template of colored and blank squares is an ormat; the dotted overlays represent permutation matrices; to maximize your payoff in the game (or to minimize your loss), you need to answer Barry&#8217;s first question, finding the minimum number of permutations that can be combined to yield the given ormat.</p>
<p>For 3&#215;3 matrices, we can solve this problem by exhaustive search, calculating the <em>OR</em>-sums of all possible combinations of the six 3&#215;3 permutation matrices taken 1, 2, 3, &#8230;, 6 at a time. I did this with pencil and paper on a recent airplane trip. Here is a summary of the results:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/08/3x3-ormat-stats-0105.jpg" border="0" alt="number of ormats generated by various combinations of permutation matrices" width="450" height="269" /></p>
<p>Some of the numbers on this card are easy to explain. The six ormats with just three 1 entries are the permutation matrices themselves. There are six of them because there are 3! = 6 permutations of three things. There are no ormats with four 1 entries for a reason that bears thinking about: There can be no permutations that differ from one another in just one element. When you superimpose any of the six overlays shown above, you can wind up with three, five or six dots, but never four.</p>
<p>At the other end of the scale, it&#8217;s no surprise that there&#8217;s exactly one ormat with nine 1 entries, and that it takes three permutations to produce it. And then there are the nine ormats with eight 1 entries, which each require four permutations to be <em>OR</em>-ed. These are the single-blank patterns like template 3 above.</p>
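<p>The pencil-and-paper tabulation can be repeated in a few lines of Python. This is a sketch of the exhaustive search described above, not a transcription of the card. It does not use the overlay labels <em>a</em> through <em>f</em>, since those are defined only in the figure.</p>

```python
from collections import Counter
from itertools import combinations, permutations

K = 3
PERMS = list(permutations(range(K)))  # the six overlays, as row -> column maps


def or_sum(perms):
    """Cells covered when the given permutation matrices are superimposed."""
    return frozenset((r, p[r]) for p in perms for r in range(K))


# Try subsets in order of size, so the first time a pattern shows up
# records the minimum number of permutations that produce it.
min_perms = {}
for size in range(1, len(PERMS) + 1):
    for subset in combinations(PERMS, size):
        min_perms.setdefault(or_sum(subset), size)

# Tally patterns by (number of 1 entries, minimum number of permutations).
table = Counter((len(cells), n) for cells, n in min_perms.items())
for (ones, n), count in sorted(table.items()):
    print(f"{ones} ones, {n} permutations: {count} ormats")
```

<p>The tally should agree with the counts discussed above: six patterns with three 1 entries, none with four, nine single-blank patterns that need four permutations, and one full pattern that needs three.</p>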
<p>Based on these results, I began speculating about what I would see in a tabulation of all 4&#215;4 ormats.</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/08/4x4-ormat-stats-0108.jpg" border="0" alt="guesses about stats for 4x4 ormats" width="450" height="272" /></p>
<p>There would have to be 4! = 24 patterns with four 1 entries, and just one pattern with all 1s, generated by <em>OR</em>-ing four permutations. And there should be 16 ormats that require five permutations, namely the 16 matrices with a single 0 element. This last prediction seemed a little less self-evident than the others.</p>
<p><strong>Pocket change and Cheerios</strong>. My thoughts about the single-zero (or single-blank) case went something like this. To cover 15 squares with sets of four dots each, we need at least four sets, or else we simply won&#8217;t have enough dots. So a useful starting point is one of the optimal arrangements that cover all 16 squares without gaps or overlaps. By this time I had grown tired of drawing zillions of dots, and so I started working with sets of coins.</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/08/coins-stage-1-0097.jpg" border="0" alt="initial configuration of four permutations of coins" width="450" height="450" /></p>
<p>In this arrangement each coin denomination forms a permutation, with no two pennies, nickels, dimes or quarters in the same row or the same column. We have successfully covered all the colored squares, but unfortunately we&#8217;ve also covered the blank at the lower right. Thus this pattern of coins is not an acceptable solution, but maybe we can fix it up somehow?</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/08/coins-stage-2-0098.jpg" border="0" alt="adjusted configuration after one coin is moved" width="450" height="447" /></p>
<p>Moving the penny from the blank square to another square in the same column solves one problem but creates another: Now the arrangement of pennies is no longer a permutation. There are two pennies in the third row.</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/08/coins-stage-3-0099.jpg" border="0" alt="coins after second adjustment to restore permutation" width="450" height="448" /></p>
<p>So now we have to shift another penny to restore the one-per-row-and-column property. Inevitably, this leaves a colored square uncovered. The only way we can cover that exposed square is to introduce a fifth permutation. Since I had run out of coin denominations, I chose a popular brand of breakfast toroids. Voila:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/08/coins-stage-4-0100.jpg" border="0" alt="coins and cheerios -- the five-permutation solution" width="450" height="449" /></p>
<p>There&#8217;s nothing special about the particular moves I chose in this sequence. If you try some alternatives, you should be able to persuade yourself that moving the penny that covers the blank to any other square in the fourth column (or in the fourth row) would lead to essentially the same situation. Likewise the game would come out the same if the single blank square were placed anywhere else in the grid. And you could also start with a different set of initial permutations (provided they cover all the squares).</p>
<p>This coin-shuffling exercise demonstrates that we can cover any 4&#215;4 template that has a single blank by combining no more than five permutations, but how do we know that five are actually needed? Maybe there&#8217;s some totally different arrangement that would do the job with just four permutations? Well, think about what such an arrangement would look like. It would have to differ at exactly one position from some other layout of permutations that covers the full 16-square grid. But no two permutations can differ at one and only one place. Thus the reason there can be no four-permutation cover of 15 squares is essentially the same as the reason no 4&#215;4 ormat pattern can cover just five squares.</p>
<p>This argument generalizes to <em>k</em>&#215;<em>k</em> matrices: For any integer <em>k</em>, there must be at least <em>k</em><sup>2</sup> ormat patterns (the single-zero matrices) that cannot be covered with fewer than <em>k</em>+1 permutations. But then comes the bigger speculative leap: Perhaps <em>k</em>+1 is an upper bound. Perhaps part of the answer to Barry&#8217;s question is that no <em>k</em>&#215;<em>k</em> ormat pattern requires <em>more</em> than <em>k</em>+1 permutations. At one point I even had a &#8220;proof&#8221; of this conjecture. Then I wrote a program to check it, doing much the same thing I did with the dots on the airplane.</p>
<p><strong>Out of bounds.</strong> My program found the expected 16 ormat patterns that require five permutations&#8212;and it found many more as well. In all it identified 2,032 4&#215;4 ormats that can&#8217;t be composed from fewer than five permutations. And then came a bigger surprise: The program also found 480 patterns that require <em>six</em> permutations. So much for my proposed upper bound.</p>
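<p>The program itself is not reproduced here. The following is a sketch of one way such a search could be organized, using a breadth-first search over <em>OR</em>-sums stored as bitmasks. The names <code>min_cover_table</code> and <code>to_mask</code> are hypothetical, and the three test patterns are a full grid, a grid with a single 0, and a grid with three full rows above a row reading 1 1 0 0.</p>

```python
from collections import Counter
from itertools import permutations


def min_cover_table(k):
    """Map each k-by-k ormat (as a bitmask) to the fewest permutations
    whose OR-sum produces it, by breadth-first search over OR-sums."""
    perms = [sum(2 ** (r * k + c) for r, c in enumerate(p))
             for p in permutations(range(k))]
    best = {p: 1 for p in perms}
    frontier, depth = set(perms), 1
    while frontier:
        depth += 1
        reached = set()
        for pattern in frontier:
            for p in perms:
                merged = pattern | p
                if merged not in best:  # first visit gives the minimum depth
                    best[merged] = depth
                    reached.add(merged)
        frontier = reached
    return best


def to_mask(matrix):
    """Pack a list-of-rows 0/1 matrix into the bitmask used above."""
    k = len(matrix)
    return sum(matrix[r][c] * 2 ** (r * k + c) for r in range(k) for c in range(k))


table = min_cover_table(4)
full = [[1, 1, 1, 1]] * 4
one_zero = [[1, 1, 1, 1]] * 3 + [[1, 1, 1, 0]]
two_zeros = [[1, 1, 1, 1]] * 3 + [[1, 1, 0, 0]]
print([table[to_mask(m)] for m in (full, one_zero, two_zeros)])
print(sorted(Counter(table.values()).items()))  # number of ormats at each depth
```

<p>The first line printed gives the minimum number of permutations for the three test patterns; the second gives a count of 4&#215;4 ormats for each minimum, which can be compared with the figures quoted above.</p>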
<p>One of those problematic 480 ormats takes this form:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/08/4x4-six-perm-ormat.png" border="0" alt="((1 1 1 1) (1 1 1 1) (1 1 1 1) (1 1 0 0))" width="81" height="76" /></p>
<p>Looking over this pattern, I thought I understood where my earlier reasoning had gone awry. This matrix is just like the single-zero pattern, but with two zeros! (I do mean for that statement to make sense. Bear with me.) Suppose we start again with a set of four permutations that completely cover the grid, including the two blanks.</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/08/4x4-coins-starting-state-0113.jpg" border="0" alt="starting configuration of 16 coins on 4x4 template with two blanks" width="450" height="448" /></p>
<p>Then we can uncover each blank just as we did in the coin-shuffling procedure above, although we have to be careful the two sets of movements don&#8217;t interfere with each other. (Not much point in removing the penny from a blank square, then putting the nickel there.) Here is a strategy for clearing both blank squares while maintaining the one-per-column-and-row permutation property:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/08/4x4-coins-first-solution-0114.jpg" border="0" alt="four coins and two blanks: first solution" width="450" height="448" /></p>
<p>Inevitably, when we uncover the two blank squares, we also remove coins from two colored squares, which now have to be filled in again. The key point is that no single permutation can repair that damage, because the two open colored squares are in the same row. To cover both of those squares we need <em>two</em> additional permutations.</p>
<p>Other ways of reshuffling the coins avoid putting the two open squares in the same row or column, but they still foil all attempts to complete the covering with just five permutations. Try adding four Cheerios to the diagram below. If you cover both of the open blue squares, then either you also cover one of the blank squares or you wind up with two Cheerios in the same row.</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/08/4x4-coins-second-solution-0112.jpg" border="0" alt="four coins and two blanks: second solution" width="450" height="456" /></p>
<p>So now it&#8217;s clear we need as many as six permutations to cover a 4&#215;4 ormat. Does that suggest that the general upper bound might be <em>k</em>+2 rather than <em>k</em>+1? Or perhaps the appropriate formula is 2<em>k</em>&#8211;2? In support of this latter possibility I offer these two ormats, which require 8 and 10 permutations respectively:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/08/2k-2ormats.png" border="0" alt="2k-2ormats.png" width="256" height="107" /></p>
<p><strong>Another wager.</strong> Having fooled myself several times about the upper bound on minimal ormat coverings, I feel I should build in a little margin for error before I invite you to make a further wager. We already have direct evidence that covering a <em>k</em>&#215;<em>k</em> ormat can take as many as 2<em>k</em>&#8211;2 permutations. So I&#8217;ll be generous and offer a full $2<em>k</em> for a proper covering, while charging $1 per permutation. If <em>k</em>=3 or <em>k</em>=4, you can definitely make money on this deal. But is it a good bet for larger <em>k</em>? (Hint: I&#8217;d be willing to play the game on these terms for real money.)</p>
<p class="centered">&#8226;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&#8226;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&#8226;</p>
<p><strong>Update 2010-08-19:</strong> No takers for my bet, eh? Too bad; I had already spent my winnings.</p>
<p>Barry Cipra, who raised the question about minimal ormat covers in the first place, sends this illuminating letter:</p>
<blockquote><p>I&#8217;m going to tiptoe a short ways out on a long long limb and conjecture (really just guess) that the &#8220;worst case&#8221; behavior, in terms of the minimum number of permutations it takes to produce a given ormat, occurs for ormats of the following form, shown here for <em>k</em>=7:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/08/barry-7x7-uppity-matrix.png" alt="((1 1 1 1 1 1 1) (1 1 1 1 1 1 1) (0 1 1 1 1 1 1) (0 0 1 1 1 1 1) (0 0 0 1 1 1 1) (0 0 0 0 1 1 1) (0 0 0 0 0 1 1))" border="0" width="131" height="124" /></p>
<p>So as not to abuse existing matrix terminology, I&#8217;ll call any (square) matrix of this type&#8212;i.e., whose entries below the main subdiagonal are all 0&#8212;&#8220;uppity triangular.&#8221; I can (and will!) show that this uppity triangular ormat for <em>k</em>=7 requires (at least) 16 permutations&#8212;and the number appears to grow concavely upwards from that, so I, for one, will definitely not take you up on your $2<em>k</em> wager.</p>
<p>The trick, I realized, is to view each ormat as the &#8220;shadow&#8221; of what I&#8217;ll call an &#8220;addmat.&#8221;  If you let <em>P</em>1, <em>P</em>2, &#8230;, <em>Pr</em> be <em>k</em>&times;<em>k</em> permutation matrices, their addmat is simply the ordinary result of addition:  <em>S</em> = <em>P</em>1 + <em>P</em>2 + &#8230; + <em>Pr</em>, whose entries are positive integers wherever one or more of the constituent permutations has a 1 and otherwise 0.  The associated ormat is obtained by changing each of these entries to a 1, while leaving the 0&#8217;s alone.  In this sense, the ormat&#8217;s 1&#8217;s are the &#8220;shadows&#8221; of the addmat&#8217;s positive entries.</p>
<p>What&#8217;s crucial is that addmats have a lovely little property not shared with their shadows:  the row and column sums of the entries of an addmat all equal the number of permutations that produce them, <em>r</em>.</p>
<p>Come now, let us reason together&#8230;.  The uppity triangular ormat example above (for <em>k</em>=7) must come from an addmat of the form</p>
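<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/08/barry-7x7-with-stars.png" alt="{{&quot;*&quot;, &quot;*&quot;, &quot;*&quot;, &quot;*&quot;, &quot;*&quot;, &quot;*&quot;, &quot;*&quot;}, {&quot;*&quot;, &quot;*&quot;, &quot;*&quot;, &quot;*&quot;, &quot;*&quot;, &quot;*&quot;, &quot;*&quot;}, {0, &quot;*&quot;, &quot;*&quot;, &quot;*&quot;, &quot;*&quot;, &quot;*&quot;, &quot;*&quot;}, {0, 0, &quot;*&quot;, &quot;*&quot;, &quot;*&quot;, &quot;*&quot;, &quot;*&quot;}, {0, 0, 0, &quot;*&quot;, &quot;*&quot;, &quot;*&quot;, &quot;*&quot;}, {0, 0, 0, 0, &quot;*&quot;, &quot;*&quot;, &quot;*&quot;}, {0, 0, 0, 0, 0, a, b}}" border="0" width="131" height="121" /></p>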
<p>where <em>a</em>, <em>b</em>, and all the *&#8217;s are positive integers.  In particular, each * is at least 1.  Since all row and column sums must be equal, the sum <em>a</em>+<em>b</em> must equal the sum of <em>b</em> and all 6 *&#8217;s above it.  Hence <em>a</em> is at least 6.  Likewise <em>a</em>+<em>b</em> must equal the sum of <em>a</em> and all 6 *&#8217;s above it, so <em>b</em> is also at least 6.  Hence <em>a</em>+<em>b</em> is at least 12, which means the <em>OR</em>-sum that produced the given ormat involves at least 12 permutations.</p>
<p>This clearly generalizes to arbitrary <em>k</em>, which is more than &#8220;direct evidence&#8221; that covering a <em>k</em>&times;<em>k</em> ormat can take as many as 2<em>k</em>&#8211;2 permutations, it&#8217;s rigorous proof!  But we can immediately do better, at least on a case-by-case basis.  If we try to get by with just 12 permutations for this uppity triangular ormat, we quickly run into trouble.  We obviously must have <em>a</em> = <em>b</em> = 6, and it follows that all the *&#8217;s above them are 1&#8217;s (to make those column sums 12).  That is, we have the addmat</p>
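<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/08/barry-7x7-with-stars-and-at.png" alt="{{&quot;*&quot;, &quot;*&quot;, &quot;*&quot;, &quot;*&quot;, &quot;*&quot;, 1, 1}, {&quot;*&quot;, &quot;*&quot;, &quot;*&quot;, &quot;*&quot;, &quot;*&quot;, 1, 1}, {0, &quot;*&quot;, &quot;*&quot;, &quot;*&quot;, &quot;*&quot;, 1, 1}, {0, 0, &quot;*&quot;, &quot;*&quot;, &quot;*&quot;, 1, 1}, {0, 0, 0, &quot;*&quot;, &quot;*&quot;, 1, 1}, {0, 0, 0, 0, &quot;@&quot;, 1, 1}, {0, 0, 0, 0, 0, a, b}}" border="0" width="131" height="124" /></p>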
<p>where I now wish to call your attention to the entry labeled &#8220;@&#8221;.  To make its row-sum equal 12, we need @ = 10.  But that means its column sum (with the 5 *&#8217;s above it) is at least 15, which cannot be!  So we are forced to try larger values of <em>a</em> and/or <em>b</em>&#8212;which is to say, we need more permutation matrices to produce this addmat.</p>
<p>It turns out you can&#8217;t satisfy the row and column sum condition until you get to <em>a</em> = <em>b</em> = 8.  I won&#8217;t take you through all the steps, but just give you a taste with the penultimate possibility, <em>a</em> = 7, <em>b</em> = 8.  The best you can hope for in this case is</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/08/barry-a7-b8.png" alt="{{&quot;*&quot;, &quot;*&quot;, &quot;*&quot;, &quot;*&quot;, 1, 1, 1}, {&quot;*&quot;, &quot;*&quot;, &quot;*&quot;, &quot;*&quot;, 1, 1, 1}, {0, &quot;*&quot;, &quot;*&quot;, &quot;*&quot;, 1, 1, 1}, {0, 0, &quot;*&quot;, &quot;*&quot;, 1, 1, 1}, {0, 0, 0, 12, 1, 1, 1}, {0, 0, 0, 0, 10, 3, 2}, {0, 0, 0, 0, 0, 7, 8}}" border="0" width="146" height="125" /></p>
<p>Note that I put as much of the &#8220;weight&#8221; in the last two columns as close to the 7 and 8 as possible, so that I could use the smallest possible value (10) as the entry with 5 positive entries above it.  This makes the last three rows, and the right three columns all have the same sum, 15, but now we see a problem in the 12&#8217;s column:  Its sum is at least 16.  So once again, we&#8217;re screwed.  It&#8217;s only with the next attempt that we avoid contradiction:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/08/barrya8b8.png" alt="{{8, 3, 1, 1, 1, 1, 1}, {8, 3, 1, 1, 1, 1, 1}, {0, 10, 2, 1, 1, 1, 1}, {0, 0, 12, 1, 1, 1, 1}, {0, 0, 0, 12, 2, 1, 1}, {0, 0, 0, 0, 10, 3, 3}, {0, 0, 0, 0, 0, 8, 8}}" border="0" width="157" height="127" /></p>
<p>This matrix finally has all its row and column sums equal.  Please note, this may or may not be an actual addmat of a set of 16 permutation matrices&#8212;I suspect it probably is, but I haven&#8217;t bothered to check.  All we know is that it satisfies a necessary condition of being an addmat, namely that its row and column sums are all equal.  (It&#8217;d be nice if that were also a <em>sufficient</em> condition, but something tells me it isn&#8217;t.)</p>
<p>This example, which can clearly be played out for larger values of <em>k</em>, suggests that not only are you safe with a $2<em>k</em> wager, but with a $(2<em>k</em>+2) wager and higher.  I&#8217;ve played around with this a bit, and persuaded myself that the number of permutation matrices will go to 2<em>k</em> + a lot&#8212;for <em>k</em>=10, if I did things correctly, you need 24 permutations (or possibly more, if the uppity triangular matrix the analysis leads to is not an actual addmat).  I am entirely convinced that some additional careful thought can streamline the analysis into a nice, slick proof.  I&#8217;m just not sure I haven&#8217;t already made a mistake, and built an elaborate house of cards&#8230;.</p>
<p>Does any of this jibe with what you&#8217;ve already found to be the case?</p>
</blockquote>
<p>It does indeed jibe. </p>
<p>First of all, to answer a small question Barry left open, here is a set of 16 permutations that will successfully cover his 7&times;7 &#8220;uppity triangular&#8221; matrix:</p>
<p>{1,2,3,4,5,6,7} {2,1,4,3,6,7,5} {1,3,2,5,4,7,6} {1,3,4,2,6,5,7}<br />
{2,3,1,5,6,4,7} {1,2,3,5,6,7,4} {1,2,4,5,3,6,7} {1,2,4,5,6,3,7}<br />
{1,2,4,5,6,7,3} {1,3,4,5,2,6,7} {1,3,4,5,6,2,7} {1,3,4,5,6,7,2}<br />
{2,3,4,1,5,6,7} {2,3,4,5,1,6,7} {2,3,4,5,6,1,7} {2,3,4,5,6,7,1}</p>
<p>This was found with a simple greedy search.</p>
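<p>The list can be checked mechanically. The sketch below does not reproduce the greedy search; it only verifies the result. It assumes that the <em>j</em>th entry of each permutation names the row chosen in column <em>j</em>, which is the reading under which every permutation stays on the 1 entries of the uppity triangular matrix.</p>

```python
K = 7

# The 16 permutations listed above; entry j is the (1-based) row used in column j.
PERMS = [
    (1, 2, 3, 4, 5, 6, 7), (2, 1, 4, 3, 6, 7, 5), (1, 3, 2, 5, 4, 7, 6), (1, 3, 4, 2, 6, 5, 7),
    (2, 3, 1, 5, 6, 4, 7), (1, 2, 3, 5, 6, 7, 4), (1, 2, 4, 5, 3, 6, 7), (1, 2, 4, 5, 6, 3, 7),
    (1, 2, 4, 5, 6, 7, 3), (1, 3, 4, 5, 2, 6, 7), (1, 3, 4, 5, 6, 2, 7), (1, 3, 4, 5, 6, 7, 2),
    (2, 3, 4, 1, 5, 6, 7), (2, 3, 4, 5, 1, 6, 7), (2, 3, 4, 5, 6, 1, 7), (2, 3, 4, 5, 6, 7, 1),
]

# Uppity triangular pattern: a 1 wherever the row is at most one below the diagonal.
ones = {(row, col) for row in range(1, K + 1) for col in range(1, K + 1) if col + 1 >= row}

# Every cell touched by at least one of the permutations.
covered = {(row, col) for p in PERMS for col, row in enumerate(p, start=1)}

all_valid = all(sorted(p) == list(range(1, K + 1)) for p in PERMS)
print(all_valid, covered == ones, len(ones))
```

<p>The comparison <code>covered == ones</code> tests both requirements at once: every 1 entry is hit, and no 0 entry is.</p>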
<p>My own attempts to find an upper bound have focused not on uppity triangular matrices but on matrices I&#8217;ve been calling &#8220;flags,&#8221; like this 7&times;7 case:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/08/7x7-flag.png" alt="{{0, 0, 0, 1, 1, 1, 1}, {0, 0, 0, 1, 1, 1, 1}, {0, 0, 0, 1, 1, 1, 1}, {1, 1, 1, 1, 1, 1, 1}, {1, 1, 1, 1, 1, 1, 1}, {1, 1, 1, 1, 1, 1, 1}, {1, 1, 1, 1, 1, 1, 1}}" border="0" width="132" height="122" /></p>
<p>This matrix also requires 16 permutations for a proper covering. To see why, try threading permutations through the columns of the matrix, starting at the left edge and in each column choosing a 1 element (never a 0) from a different row. Because of the block of zeros at the upper left, the first three elements of every permutation must lie in rows 4 through 7. Thus each permutation &#8220;uses up&#8221; three of the last four rows in the first three columns, and the rest of the permutation can revisit this range of rows only once. It follows that each permutation can touch only one element in the 4&times;4 block of 1s in the lower right corner of the matrix, and at least 16 permutations are needed to cover all the 1s in the matrix. Showing that 16 are sufficient is not hard. </p>
<p>This kind of analysis works for any odd <em>k</em>, and thus we know that such matrices can require as many as </p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/08/ceiling-k-over-2.png" alt="{\biggl\lceil\frac{k}{2}\biggr\rceil}^2" border="0" width="34" height="43" /></p>
<p>permutations. (For even <em>k</em> the situation is a little less symmetrical, and I haven&#8217;t worked out the exact details.)</p>
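<p>The counting argument can be checked by brute force for small odd <em>k</em>. The sketch below assumes the flag has an all-zero block of side (<em>k</em>&#8722;1)/2 in the upper left corner, as in the 7&#215;7 picture; the function names are made up for this illustration. It threads every zero-avoiding permutation through the matrix and records how many of its elements land in the lower right block of 1s.</p>

```python
from itertools import permutations

def flag(k):
    """k x k flag for odd k: zeros in the upper-left m x m block, m = k // 2."""
    m = k // 2
    return [[0 if (r < m and c < m) else 1 for c in range(k)] for r in range(k)]

def block_hits(k):
    """For every permutation path that avoids zeros, count how many of its
    elements fall in the lower-right block of 1s (rows and columns m..k-1)."""
    m = k // 2
    mat = flag(k)
    hits = []
    for rows in permutations(range(k)):      # rows[c] = row chosen in column c
        if all(mat[rows[c]][c] for c in range(k)):
            hits.append(sum(1 for c in range(m, k) if rows[c] >= m))
    return hits

for k in (3, 5, 7):
    hits = block_hits(k)
    block_size = (k - k // 2) ** 2           # ceil(k/2) squared for odd k
    print(k, len(hits), set(hits), block_size)
```

<p>For <em>k</em> = 7 every one of the 576 allowed permutations touches exactly one element of the 4&#215;4 block, so covering its 16 ones takes at least 16 permutations.</p>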
<p>These results give us a lower bound on the upper bound on the number of permutations that may be needed to cover a <em>k</em>&times;<em>k</em> ormat. But we haven&#8217;t proved it&#8217;s the true upper bound. Are there other ormats that require even more permutations? My guess is no, but keep in mind that almost all my conjectures along these lines have turned out to be wrong. </p>
]]></content:encoded>
			<wfw:commentRss>http://bit-player.org/2010/the-ormat-game/feed</wfw:commentRss>
		</item>
		<item>
		<title>The state of the spamosphere</title>
		<link>http://bit-player.org/2010/the-state-of-the-spamosphere</link>
		<comments>http://bit-player.org/2010/the-state-of-the-spamosphere#comments</comments>
		<pubDate>Mon, 09 Aug 2010 00:28:57 +0000</pubDate>
		<dc:creator>brian</dc:creator>
		
		<category><![CDATA[computing]]></category>

		<category><![CDATA[modern life]]></category>

		<guid isPermaLink="false">http://bit-player.org/?p=742</guid>
		<description><![CDATA[BP has finally stoppered the Macondo well with a plug of mud and cement, but the gusher of spam continues to pollute inboxes everywhere. Maybe we need a relief well?

It&#8217;s been six months since my&#160;last spam update.&#160;The good news, I suppose, is that&#160;last summer&#8217;s&#160;huge spurt of spam has subsided. But I&#8217;m still getting 2,000 inanities [...]]]></description>
			<content:encoded><![CDATA[<p>BP has finally stoppered the Macondo well with a plug of mud and cement, but the gusher of spam continues to pollute inboxes everywhere. Maybe we need a relief well?</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/08/spamstats-2008-08-to-2010-07.png" border="0" alt="spam statistics Aug 2008 through July 2010" width="450" height="300" /></p>
<p>It&#8217;s been six months since my&nbsp;<a href="http://bit-player.org/2010/yet-another-spam-update">last spam update</a>.&nbsp;The good news, I suppose, is that&nbsp;last summer&#8217;s&nbsp;huge spurt of spam has subsided. But I&#8217;m still getting 2,000 inanities a month. That&#8217;s six or seven times the rate I was seeing when I first started monitoring my spam intake in 2003. For the two years&nbsp;covered by the graph above, the cumulative total is&nbsp;111,945 unwanted emails received.</p>
<p>Needless to say, most of those messages are utterly unremarkable. But as I reviewed the most recent batch of dreck, one series of emails caught my eye. In the past 24 hours I&#8217;ve received five copies of a spam with the subject line: &#8220;Solve this if you could&#8230;!!!!&#8221;:</p>
<blockquote>
<p>For all Math&#8217;s champs, Accounting Experts &amp; Number Numbssss&#8230;&nbsp;(including future champs too)</p>
<p>Find the 6th Number</p>
<p>1, 2, 6, 42, 1806, ____?</p>
<p>6th number is the password of the attachment.</p>
</blockquote>
<p>The attachment mentioned in the last line is a zip archive that unpacks to reveal an .exe file, which I would not run even if I could. But I confess that I <em>did</em> stop to solve the little puzzle sequence. It&#8217;s not difficult, although it is a little harder than any of the series I use in the spambot filter on the bit-player comment form. Maybe I should add it to my repertory.</p>
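<p>For anyone who wants to check a guess, here is a minimal sketch. It assumes one rule that fits the five given terms (each term is the previous term times its successor); the spam itself states no rule, so this is only a plausible reading.</p>

```python
def next_term(a):
    """Assumed rule, not stated in the puzzle: a(n+1) = a(n) * (a(n) + 1)."""
    return a * (a + 1)

terms = [1]
while len(terms) < 6:
    terms.append(next_term(terms[-1]))

print(terms)   # the first five entries should match the puzzle
```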
]]></content:encoded>
			<wfw:commentRss>http://bit-player.org/2010/the-state-of-the-spamosphere/feed</wfw:commentRss>
		</item>
		<item>
		<title>Four questions about fuzzy rankings</title>
		<link>http://bit-player.org/2010/four-questions-about-fuzzy-rankings</link>
		<comments>http://bit-player.org/2010/four-questions-about-fuzzy-rankings#comments</comments>
		<pubDate>Sat, 24 Jul 2010 19:40:16 +0000</pubDate>
		<dc:creator>brian</dc:creator>
		
		<category><![CDATA[computing]]></category>

		<category><![CDATA[games]]></category>

		<category><![CDATA[mathematics]]></category>

		<category><![CDATA[problems and puzzles]]></category>

		<category><![CDATA[statistics]]></category>

		<guid isPermaLink="false">http://bit-player.org/?p=731</guid>
		<description><![CDATA[The National Research Council is getting ready to release a new assessment of graduate-education programs in the U.S. The previous study, published in 1995, gave each Ph.D.-granting department a numerical score between 0 and 5, then listed all the programs in each discipline in rank order. For example, here&#8217;s the top-10 list for doctoral programs [...]]]></description>
			<content:encoded><![CDATA[<p>The National Research Council is <a href="http://sites.nationalacademies.org/pga/Resdoc/index.htm">getting ready</a> to release a new assessment of graduate-education programs in the U.S. The previous study, published in 1995, gave each Ph.D.-granting department a numerical score between 0 and 5, then listed all the programs in each discipline in rank order. For example, here&#8217;s the top-10 list for doctoral programs in mathematics (as <a href="http://www.stat.tamu.edu/~jnewton/nrc_rankings/nrc1.html">presented</a> by H. J. Newton of Texas A&amp;M University):</p>
<pre><span style="text-decoration: underline;"> rank    school                 score   </span>
    1    Princeton               4.94
    2    Cal Berkeley            4.94
    3    MIT                     4.92
    4    Harvard                 4.90
    5    Chicago                 4.69
    6    Stanford                4.68
    7    Yale                    4.55
    8    NYU                     4.49
    9    Michigan                4.23
   10    Columbia                4.23
</pre>
<p>Note that the scores of the first two schools are identical (to two decimal places), and the first four scores differ by less than 1 percent. Given the uncertainties in the data, it seems reasonable to suppose that the ranking could have turned out differently. If the whole survey had been repeated, the first few schools might have appeared in a different order. Doctoral candidates in mathematics are presumably sophisticated enough to understand this point. Nevertheless, the spot at the top of the list still carries undeniable prestige, even when you know that the distinction could be merely an artifact of statistical noise.</p>
<p>The <a href="http://sites.nationalacademies.org/pga/Resdoc/index.htm#members">committee</a> appointed by the NRC to conduct the new graduate-school study wants to avoid this <a href="http://www.insidehighered.com/news/2010/05/10/nrc">&#8220;spurious precision problem.&#8221;</a> They&#8217;ve adopted some jazzy statistical methods&#8212;mainly a technique called resampling&#8212;to model the uncertainty in the data, and they&#8217;ve also decreed that the results will be presented differently. There will be no sorted master list showing overall ranks in descending order. Instead the programs in each discipline will be listed alphabetically, and each program will be given a range of possible ranks. For example, a program might be estimated to rank between fifth place and ninth place. Let&#8217;s call such a range of ranks a <em>rank-interval</em>, and denote it {5, 6, 7, 8, 9} or {5&#8211;9}.</p>
<p>For a hypothetical set of 10 institutions, <em>A</em> through <em>J</em>, here&#8217;s what a set of rank-intervals might look like.</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/07/ranges-alphabetical.png" border="0" alt="bar graph showing ranges of rankings for schools A through J.png" width="438" height="443" /></p>
<p>Acknowledging the uncertainty in your findings is commendable. But let&#8217;s be realistic. If you actually want to make use of these results&#8212;for example, if you&#8217;re a student choosing a grad-school program&#8212;the first thing you&#8217;re going to do is sort those bars into some sort of rank order, trying to figure out which school is best and how they all stack up against one another. In other words, you&#8217;re going to undo all the elaborate efforts the NRC committee has put into obscuring that information.</p>
<p>Below is one possible ordering of the bars. I have sorted first on the top of the rank-intervals, then, if two columns have the same top rank, I&#8217;ve sorted on the bottom rank. Other sorting rules give similar but not identical results. For example, sorting on the midpoints of the intervals would interchange columns <em>B</em> and <em>F</em>.</p>
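<p>The two sorting rules are easy to compare in code. The intervals below are hypothetical (they are not the ones in the chart, and the names <em>V</em> through <em>Z</em> are invented); each is written as a (top, bottom) pair, with rank 1 the best.</p>

```python
# Hypothetical rank-intervals, written as (top rank, bottom rank).
intervals = {"V": (1, 2), "W": (1, 5), "X": (2, 3), "Y": (3, 5), "Z": (4, 5)}

def by_top_then_bottom(iv):
    """Sort on the top of each interval, breaking ties on the bottom."""
    return sorted(iv, key=lambda s: iv[s])   # tuples compare as (top, bottom)

def by_midpoint(iv):
    """Sort on the midpoint of each interval."""
    return sorted(iv, key=lambda s: (iv[s][0] + iv[s][1]) / 2)

print(by_top_then_bottom(intervals))
print(by_midpoint(intervals))
```

<p>With these made-up data the first rule gives V, W, X, Y, Z and the second gives V, X, W, Y, Z: a wide interval that reaches first place wins under one rule and loses under the other.</p>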
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/07/ranges-sorted.png" border="0" alt="bar graphs showing rank-ranges sorted into one canonical order.png" width="438" height="443" /></p>
<blockquote>
<p><strong>Question 1.</strong> Does sorting a set of rank-intervals by one of these simple rules yield a consistent and meaningful total ordering of the data? To put it another way, can you trust this attempt to reconstruct a ranking?</p>
</blockquote>
<p>I hasten to add that this is not really a practical question about finding the best grad school. If you&#8217;re facing such a choice in real life, the NRC rank-intervals are not the only available source of information. But, for the sake of the mathematical puzzle, let&#8217;s pretend that all we know about schools <em>A</em> through <em>J</em> is embodied in those ranges of rankings.</p>
<p>It turns out that rank-intervals have some fairly peculiar behavior. Ranges of <em>ratings</em> are not a problem. If the NRC merely gave each school a fuzzy rating on the 0-to-5 scale, no one would have much trouble interpreting the results. But when you turn fuzzy ratings into fuzzy rankings, there are hidden constraints. For example, not all sets of rank-intervals are well-formed.</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/07/impossible-ranges-1.png" border="0" alt="two impossible sets of rank-ranges" width="399" height="202" /></p>
<p>The set at left is impossible because there&#8217;s no one in last place. (We can&#8217;t <em>all</em> be above average.) The example at right is also nonsensical because <em>D</em> has no ranking at all. For a set of rank-intervals to be valid, there has to be at least one entry in each row and each column.</p>
<p>That&#8217;s a necessary condition, but not a sufficient one, as the two graphs below illustrate.</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/07/impossible-ranges-2.png" border="0" alt="two more impossible rank-intervals" width="399" height="202" /></p>
<p>Do you see the problem with the example at left? Column <em>B</em> has a rank-interval of {1&#8211;2}, but in fact <em>B</em> can never rank first because <em>A</em> has no alternative to being first. The case at right is conceptually similar but a little subtler: If <em>B</em> is ranked third, then either first place or second place will have to remain vacant.</p>
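<p>A small sketch makes the gap between &#8220;necessary&#8221; and &#8220;sufficient&#8221; concrete. The 3-item example here is hypothetical (it is not one of the pictured cases): <em>A</em> = {1}, <em>B</em> = {1&#8211;2}, <em>C</em> = {3}. Rows stand for ranks and columns for items, and the function names are invented for the illustration.</p>

```python
from itertools import permutations

def passes_necessary_test(mat):
    """At least one 1 in every row and in every column."""
    k = len(mat)
    rows_ok = all(any(row) for row in mat)
    cols_ok = all(any(mat[r][c] for r in range(k)) for c in range(k))
    return rows_ok and cols_ok

def unreachable_cells(mat):
    """1-entries (rank row, item column) that no complete ranking can use."""
    k = len(mat)
    used = set()
    for ranks in permutations(range(k)):     # ranks[c] = row (rank) given to item c
        if all(mat[ranks[c]][c] for c in range(k)):
            used.update((ranks[c], c) for c in range(k))
    return [(r, c) for r in range(k) for c in range(k)
            if mat[r][c] and (r, c) not in used]

# Hypothetical example: A = {1}, B = {1-2}, C = {3}; rows are ranks.
example = [[1, 1, 0],
           [0, 1, 0],
           [0, 0, 1]]

print(passes_necessary_test(example))   # every row and column is occupied
print(unreachable_cells(example))       # yet B can never take first place
```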
<p>The underlying issue here is the presence of constraints or linkages within a set of rankings. Suppose you have calculated ratings and rankings of several schools, and then some new information turns up about one school. You can change the rating of that school without any need to adjust other ratings, but not so the ranking. If a school goes from third place to fourth place, the old fourth-place school has to move to some other rung of the ladder, and somebody has to fill the vacancy in third place. These interdependencies are obvious in a non-fuzzy ranking, but they also exist in the fuzzy case. You can&#8217;t just assign arbitrary rank-intervals to the items in a set and assume they&#8217;ll all fit together. This observation leads to a second question:</p>
<blockquote>
<p><strong>Question 2.</strong> What are the admissible sets of rank-intervals? How do we characterize them?</p>
</blockquote>
<p>I have a partial answer to this question. It goes like this. Any ranking of <em>k</em> things must be a permutation of the integers from 1 through <em>k</em>. A permutation can be embodied in a <em>permutation matrix</em>&#8212;a square <em>k</em> &#215; <em>k</em> matrix in which every row has a single 1, every column has a single 1, and all the other entries are 0. For example, here are the six possible 3 &#215; 3 permutation matrices:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/07/3x3-permutation-matrices.png" border="0" alt="3x3-permutation-matrices.png" width="419" height="61" /></p>
<p>They correspond to the rankings (1, 2, 3), (1, 3, 2), (2, 1, 3), (3, 1, 2), (2, 3, 1) and (3, 2, 1).</p>
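<p>As a sketch, the six matrices can be generated from the rankings. The convention assumed here (not spelled out in the text) is that entry <em>j</em> of a ranking is the rank of item <em>j</em>, so the 1 in column <em>j</em> sits in the row numbered by that rank.</p>

```python
from itertools import permutations

def permutation_matrix(ranking):
    """Matrix with a 1 in row ranking[j] (1-based) of column j, 0 elsewhere."""
    k = len(ranking)
    return [[1 if ranking[col] == row + 1 else 0 for col in range(k)]
            for row in range(k)]

matrices = [permutation_matrix(p) for p in permutations((1, 2, 3))]

for p, m in zip(permutations((1, 2, 3)), matrices):
    print(p, m)
```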
<p>Since a permutation matrix represents a specific (non-fuzzy) ranking, we can build up a set of rank-intervals by taking the <em>OR</em>-sum of two or more permutation matrices. What do I mean by an <em>OR</em>-sum? It&#8217;s just the element-by-element sum of the matrices using the boolean <em>OR</em> operator, &#8744;, instead of ordinary addition. <em>OR</em> has the following addition table:</p>
<pre>                      0 &#8744; 0 = 0
                      0 &#8744; 1 = 1
                      1 &#8744; 0 = 1
                      1 &#8744; 1 = 1
</pre>
<p>For the first two 3 &#215; 3  matrices shown above the arithmetic sum is:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/07/matrix-addition.png" border="0" alt="matrix-addition.png" width="242" height="65" /></p>
<p>whereas the <em>OR</em>-sum looks like this:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/07/matrix-or-sum.png" border="0" alt="matrix-or-sum.png" width="242" height="65" /></p>
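<p>The two kinds of sum differ in a single line of code. This sketch assumes that the first two matrices are the ones for the rankings (1, 2, 3) and (1, 3, 2); the variable names are made up.</p>

```python
def arithmetic_sum(a, b):
    """Element-by-element sum with ordinary addition."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def or_sum(a, b):
    """Element-by-element sum with boolean OR in place of addition."""
    return [[x | y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

p123 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # ranking (1, 2, 3)
p132 = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]   # ranking (1, 3, 2)

print(arithmetic_sum(p123, p132))
print(or_sum(p123, p132))
```

<p>The only entry that differs is the upper left corner, where both matrices have a 1: the arithmetic sum records a 2 there, and the <em>OR</em>-sum records a 1.</p>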
<p>Every valid set of rank-intervals must correspond to an <em>OR</em>-sum of permutation matrices, simply because a set of rank-intervals is in fact a collection of permutations. The converse also holds: Any <em>OR</em>-sum of permutation matrices yields an admissible set of rank-intervals. Thus the <em>OR</em>-sums of permutation matrices&#8212;let&#8217;s call them <em>ormats</em> for brevity&#8212;are in one-to-one correspondence with the admissible sets of rank-intervals. (There&#8217;s just one catch when applying this idea to the NRC study. The columns of an ormat may well have &#8220;gaps,&#8221; as in the column pattern (0 1 1 0 0 1 1), which corresponds to the rank-interval {2&#8211;3, 6&#8211;7}. Will the NRC allow such discontinuous ranges in their grad-school assessments? Perhaps the issue will never come up in practice. In any case, I&#8217;m ignoring it here.)</p>
<p>Arithmetic sums of permutation matrices form an open-ended, infinite series; in contrast, there are only finitely many distinguishable <em>OR</em>-sums. The reason is easy to see: Ormats have <em>k</em><sup>2</sup> entries, each of which can take on only two possible values, and so there can&#8217;t be more than \(2^{k^{2}}\) distinct matrices. Because of the various constraints on the arrangement of the entries, the actual number of ormats is smaller. For example, at <em>k</em> = 3 the \(2^{k^{2}}\) upper bound allows for 512 ormats, but there are only 49:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/07/the-49-3-by-3-or-sums.png" border="0" alt="the-49-3-by-3-or-sums.png" width="450" height="364" /></p>
<p>Thus we come to the next question.</p>
<blockquote><p><strong>Question 3.</strong> For each <em>k</em> &#8805; 1, how many distinct ormats can we build by <em>OR</em>-ing subsets of <em>k</em> &#215; <em>k</em> permutation matrices? Is there a closed-form expression for this number?</p></blockquote>
<p>I have answers only for puny values of <em>k</em>.</p>
<pre style="margin: 8px;">  <span style="text-decoration: underline;"> k       upper bound        # of ormats  </span>
   1                 2                  1
   2                16                  3
   3               512                 49
   4            65,536              7,443
   5        33,554,432          6,092,721
   6    68,719,476,736                  ?</pre>
<p>The tallies of ormats were calculated by direct enumeration, which is not a promising approach for larger <em>k</em>. (I note&#8212;to spare folks the bother of looking&#8212;that the sequence 1, 3, 49, 7443, 6092721 does not yet appear in the <a href="http://www.research.att.com/~njas/sequences/index.html">OEIS</a>.)</p>
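<p>One way to do such an enumeration, offered as a sketch rather than as the program actually used for the table: pack each matrix into an integer, one bit per cell, and keep <em>OR</em>-ing permutation matrices into everything found so far until nothing new turns up. The function names are invented.</p>

```python
from itertools import permutations

def permutation_masks(k):
    """Each k x k permutation matrix packed into an integer, one bit per cell."""
    return [sum(1 << (row * k + col) for col, row in enumerate(p))
            for p in permutations(range(k))]

def count_ormats(k):
    """Count distinct OR-sums of non-empty sets of k x k permutation matrices."""
    perms = permutation_masks(k)
    seen = set(perms)
    frontier = list(perms)
    while frontier:
        m = frontier.pop()
        for p in perms:
            u = m | p                # OR one more permutation matrix into m
            if u not in seen:
                seen.add(u)
                frontier.append(u)
    return len(seen)

for k in range(1, 5):
    print(k, 2 ** (k * k), count_ormats(k))
```

<p>This is quick through <em>k</em> = 4. The work grows with the number of ormats times <em>k</em>!, so in plain Python <em>k</em> = 5 is already slow and <em>k</em> = 6 is out of reach.</p>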
<p>To extend this series, we might try to exploit the internal structure and symmetries of the ormats. By sorting the columns and rows of the matrices, we can reduce the 49 3&#215;3 ormats to just six equivalence classes, with the following exemplars:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/07/ormat-equiv-classes.png" border="0" alt="exemplars of six ormat equivalence classes" width="424" height="69" /></p>
<p>Enumerating just these reduced sets of matrices should make it possible to reach larger values of <em>k</em>, but I have not pursued this idea. (Furthermore, the two-dimensional sorting of matrices looks to be a curiously challenging task in itself.)</p>
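<p>For <em>k</em> = 3 the reduction can be done by brute force, which sidesteps the sorting problem entirely. In this sketch (names invented), the canonical form of a matrix is simply the smallest matrix obtainable by permuting its rows and columns. That choice is an assumption made for the example; it is practical only for tiny <em>k</em>.</p>

```python
from itertools import permutations

def ormats(k):
    """All OR-sums of k x k permutation matrices, as tuples of row tuples."""
    perms = [tuple(tuple(1 if p[c] == r else 0 for c in range(k)) for r in range(k))
             for p in permutations(range(k))]
    seen = set(perms)
    frontier = list(perms)
    while frontier:
        m = frontier.pop()
        for p in perms:
            u = tuple(tuple(a | b for a, b in zip(rm, rp)) for rm, rp in zip(m, p))
            if u not in seen:
                seen.add(u)
                frontier.append(u)
    return seen

def canonical(m):
    """Smallest matrix reachable by permuting rows and columns (brute force)."""
    k = len(m)
    return min(tuple(tuple(m[r][c] for c in cp) for r in rp)
               for rp in permutations(range(k))
               for cp in permutations(range(k)))

classes = {canonical(m) for m in ormats(3)}
print(len(ormats(3)), len(classes))
```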
<p>By the way, I think the number of ormats will approach the \(2^{k^{2}}\) upper bound asymptotically as <em>k</em> increases. Many of the features that disqualify a matrix from ormathood&#8212;such as all-zero rows or columns&#8212;become rarer when <em>k</em> is large. I have tested this conjecture by generating random (0,1) matrices and then counting how many of them turn out to be ormats.</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/07/fraction-of-ormats.png" border="0" alt="fraction-of-ormats.png" width="447" height="297" /></p>
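<p>A sampling experiment of this kind might look like the sketch below. It is not the code behind the graph: the recognizer is a brute-force check over all <em>k</em>! permutations, and the sample size and seed are arbitrary choices.</p>

```python
import random
from itertools import permutations

def is_ormat(mat):
    """True if every 1 lies on some permutation path through 1s (brute force)."""
    k = len(mat)
    covered = set()
    for rows in permutations(range(k)):      # rows[c] = row chosen in column c
        if all(mat[rows[c]][c] for c in range(k)):
            covered.update((rows[c], c) for c in range(k))
    ones = {(r, c) for r in range(k) for c in range(k) if mat[r][c]}
    return bool(ones) and ones == covered    # the all-zero matrix is excluded

def estimate_fraction(k, samples, rng):
    """Fraction of random k x k (0,1) matrices that are ormats."""
    hits = 0
    for _ in range(samples):
        mat = [[rng.randint(0, 1) for _ in range(k)] for _ in range(k)]
        hits += is_ormat(mat)
    return hits / samples

rng = random.Random(2010)
for k in range(1, 5):
    print(k, estimate_fraction(k, 2000, rng))
```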
<p>For <em>k</em> = 1 through 5 the results are in close agreement with the actual counts of ormats, and up to <em>k</em> = 10 the trend is clearly upward. But continuing this inquiry to larger values of <em>k</em> will depend on a positive answer to the next question.</p>
<blockquote><p><strong>Question 4.</strong> Given a square matrix with (0,1) entries, is there an efficient algorithm for deciding whether or not it is an <em>OR</em>-sum of permutation matrices, and thus an admissible set of rank-intervals?</p></blockquote>
<p>The question asks for a recognition predicate&#8212;a procedure that will return <em>true</em> if a matrix is an ormat and otherwise <em>false</em>. If efficiency doesn&#8217;t matter, there&#8217;s no question such an algorithm exists. At worst, we can generate all the <em>k</em> &#215; <em>k</em> ormats and see if a given matrix is among them. But that&#8217;s like saying we can factor integers by producing a complete multiplication table. It just won&#8217;t do in practice. Isn&#8217;t there a quick and easy shortcut, some distinctive property of ormats that will let us recognize them at a glance?</p>
<p>If we could replace the <em>OR</em>-sum with the ordinary arithmetic sum, the answer would be yes. Permutation matrices have the handy property that all rows and columns sum to 1. An arithmetic sum of <em>r</em> permutation matrices has rows and columns that all sum to <em>r</em>. (It is a semi-magic square.) The converse is also true (though harder to prove): If a matrix of nonnegative integers has rows and columns that all sum to <em>r</em>, it is a sum of <em>r</em> permutation matrices. This fact yields a simple test: Sum the rows and the columns and check for equality.</p>
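<p>The test for arithmetic sums takes only a few lines. This is a minimal sketch with an invented function name; it assumes the input is a square matrix of nonnegative integers.</p>

```python
def common_line_sum(mat):
    """Return r if every row and every column of mat sums to r, else None."""
    sums = {sum(row) for row in mat} | {sum(col) for col in zip(*mat)}
    return sums.pop() if len(sums) == 1 else None

print(common_line_sum([[2, 0, 0], [0, 1, 1], [0, 1, 1]]))   # a sum of 2 permutation matrices
print(common_line_sum([[1, 1, 0], [0, 1, 0], [0, 0, 1]]))   # unequal sums, so None
```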
<p>Unfortunately, the trick won&#8217;t work for ormats, because the boolean <em>OR</em> operation throws away even more information than summing does. Because 0 &#8744; 1 = 1 &#8744; 0 = 1 &#8744; 1, infinitely many sets of operands map into the same result, and there&#8217;s no obvious way to recover the operands or even to determine how many permutation matrices entered into the <em>OR</em>-sum.</p>
<p>Maybe there&#8217;s some other clever trick for recognizing ormats, but I haven&#8217;t found it. Let me make the question more concrete. Below are three (0,1) square matrices. Two of them are ormats but the third is not. Can you tell the difference?</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/07/three-puzzle-matrices.png" border="0" alt="three-puzzle-matrices.png" width="309" height="100" /></p>
<p>If it&#8217;s so hard to recognize an ormat, how did I count the ormats among a bunch of randomly generated (0,1) matrices? By hard work: I reconstructed the set of permutations allowed by each matrix. Visualize a permutation as a path threading its way through the matrix from left to right, connecting only non-zero elements and touching each column and each row just once. When you have drawn all possible permutation paths, check to see if every non-zero element is included in at least one path; if so, then the matrix is an ormat. Note that this is <em>not</em> an efficient recognition procedure. In the worst case (namely, an all-ones matrix), there are <em>k!</em> permutations, so this method has exponential running time. But <em>k!</em> is better than \(2^{k^2}\); and, besides, for sparse matrices the number of permutations is much smaller than <em>k!</em>. The 10 &#215; 10 matrix presented as an example at the start of this post gives rise to 580 permutations, a manageable number. Here&#8217;s what they look like, plotted as a spider web of red paths across the bar chart.</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/07/ranges-with-paths.png" border="0" alt="ranges-with-paths" width="438" height="443" /></p>
<p>Every nonzero site is visited by at least one permutation path, so this set of rank-intervals is indeed valid.</p>
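<p>The path-threading procedure can be sketched as a backtracking search that walks the columns from left to right and abandons a path as soon as it gets stuck. This is an illustration with invented names, not the original program, and the two small test matrices are hypothetical.</p>

```python
def permutation_paths(mat):
    """Yield each path as a tuple: path[c] is the row chosen in column c."""
    k = len(mat)
    path = []
    used_rows = set()

    def extend(col):
        if col == k:
            yield tuple(path)
            return
        for row in range(k):
            if mat[row][col] and row not in used_rows:
                path.append(row)
                used_rows.add(row)
                yield from extend(col + 1)
                used_rows.discard(row)       # backtrack
                path.pop()

    yield from extend(0)

def is_ormat_by_paths(mat):
    """True if every nonzero element lies on at least one permutation path."""
    k = len(mat)
    covered = set()
    for path in permutation_paths(mat):
        covered.update((row, col) for col, row in enumerate(path))
    ones = {(r, c) for r in range(k) for c in range(k) if mat[r][c]}
    return bool(ones) and covered == ones

print(is_ormat_by_paths([[1, 0, 0], [0, 1, 1], [0, 1, 1]]))
print(is_ormat_by_paths([[1, 1, 0], [0, 1, 0], [0, 0, 1]]))
```

<p>Because dead ends are pruned early, a sparse matrix costs far less than <em>k</em>! steps, though the all-ones matrix still forces the full enumeration.</p>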
<p>This process of lacing permutations through a matrix finally brings me back to Question 1, about how to make sense of the NRC&#8217;s fuzzy ranking scheme. Let&#8217;s take a small example:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/07/probability-example-1.png" border="0" alt="probability-example-1.png" width="169" height="185" /></p>
<p>Examining the graph above shows that <em>A</em> must rank either first or second&#8212;but which is more likely? In the absence of more-detailed information, it seems reasonable to assume the two cases are equally likely; we assign them each a probability of 1/2. Similarly, <em>B</em> has the rank-interval {1&#8211;3}, and so we might suppose that each of these three cases has probability 1/3. Continuing in the same way, we assign probabilities to every element of the matrix.</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/07/probability-example-2.png" border="0" alt="probability-example-2.png" width="169" height="185" /></p>
<p>But wait! This can&#8217;t be right; our probabilities have sprung a leak. Any proper set of probabilities has to sum to 1. Our procedure assures that each column obeys this rule, but there is no such guarantee for the rows. In row 1, we&#8217;re missing one-sixth of our probability, and in row 2 we have an excess of 1/2; row 4 comes up short by 1/3.</p>
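<p>The leak shows up as soon as the row sums are computed. A minimal sketch, using the four rank-intervals of this example (<em>C</em> and <em>D</em> both span {2&#8211;4}) and exact fractions; the function name is invented.</p>

```python
from fractions import Fraction

# Rank-intervals of the example: A = {1-2}, B = {1-3}, C = D = {2-4}.
intervals = {"A": (1, 2), "B": (1, 3), "C": (2, 4), "D": (2, 4)}

def naive_row_sums(iv, k):
    """Spread each item's probability evenly over its interval, then sum by rank."""
    sums = [Fraction(0)] * k
    for lo, hi in iv.values():
        share = Fraction(1, hi - lo + 1)
        for rank in range(lo, hi + 1):
            sums[rank - 1] += share
    return sums

print(naive_row_sums(intervals, 4))   # row totals 5/6, 3/2, 1, 2/3
```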
<p>Is there any self-consistent assignment of probabilities for the elements of this matrix? Sure. As a matter of fact, there are infinitely many such assignments, including this one:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/07/probability-example-3.png" border="0" alt="probability-example-3.png" width="169" height="185" /></p>
<p>I&#8217;ll return in a moment to the question of how I plucked those particular numbers out of the air, but note first what they imply about the ranking of items <em>A</em> through <em>D</em>. For item <em>A</em>, with the rank-interval {1&#8211;2}, the odds are two-to-one that it ranks first rather than second. <em>B</em> has the behavior we expected from the outset, with probability uniformly distributed over the three cases. But if you pick either <em>C</em> or <em>D</em>, each with the rank-interval {2&#8211;4}, your chance of getting second place is only 1/6, and half the time you&#8217;ll be in last place.</p>
<p>Where do these numbers come from? Instead of starting with the assumption that probability is uniformly distributed over each rank-interval, assume that each possible permutation of the ranks is equiprobable. For this matrix there are six allowed permutations: (1, 2, 3, 4), (1, 2, 4, 3), (1, 3, 2, 4), (1, 3, 4, 2), (2, 1, 3, 4) and (2, 1, 4, 3). Observe that four of the six orderings put <em>A</em> first, and only two permutations place <em>A</em> second. We can also tally up such &#8220;occupation numbers&#8221; for all the other matrix elements:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/07/probability-example-4.png" border="0" alt="probability-example-4.png" width="169" height="185" /></p>
<p>Dividing these numbers by the total number of permutations, 6, yields the probabilities given above.</p>
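<p>Here is the same tally as a short sketch, with rows standing for ranks 1 through 4 and columns for items <em>A</em> through <em>D</em>. The function name is invented, and the enumeration is plain brute force over all 24 orderings.</p>

```python
from fractions import Fraction
from itertools import permutations

# Rows are ranks 1-4, columns are items A-D.
mat = [[1, 1, 0, 0],
       [1, 1, 1, 1],
       [0, 1, 1, 1],
       [0, 0, 1, 1]]

def occupation_numbers(mat):
    """Count, for each cell, the allowed permutations that pass through it."""
    k = len(mat)
    counts = [[0] * k for _ in range(k)]
    total = 0
    for rows in permutations(range(k)):      # rows[c] = rank row of item c
        if all(mat[rows[c]][c] for c in range(k)):
            total += 1
            for c in range(k):
                counts[rows[c]][c] += 1
    return counts, total

counts, total = occupation_numbers(mat)
probs = [[Fraction(n, total) for n in row] for row in counts]

print(total)
for row in counts:
    print(row)
```

<p>Every row and every column of <code>counts</code> sums to the number of permutations, which is why dividing by that number gives a matrix whose rows and columns all sum to 1.</p>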
<p>We can do the same computation for the 10 &#215; 10 example matrix, which turns out to allow 580 permutations:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/07/ranges-with-path-weights.png" border="0" alt="ranges-with-path-weights.png" width="438" height="442" /></p>
<p>If you care to check, you&#8217;ll find that each column and each row sums to 580; dividing all the entries by this number yields a probability matrix with columns and rows that sum to 1 (also known as a doubly stochastic matrix).</p>
<p>This process of tabulating permutation paths recovers some of the information we would have gotten from the arithmetic sum of the permutation matrices&#8212;information that was lost in the <em>OR</em>-ing operation. But we get back only <em>some</em> of the information because we have to assume that each permutation included in the <em>OR</em>-sum appears only once. (This is just another way of saying that the allowed permutations are equiprobable.) There&#8217;s no particularly good reason to make this assumption, but at least it leads to a feasible probability matrix.</p>
<p>Is there any way of calculating the entries in the doubly stochastic matrix without explicitly tracing out all the permutation paths? I&#8217;m sure there is. I think the construction of the matrix can be approached as an integer-programming problem, and perhaps through other kinds of optimization technology. What seems less likely is that there&#8217;s some simple and efficient shortcut algorithm. But I could be wrong about that; there&#8217;s a lot of mathematics connected with this subject that I don&#8217;t understand well enough to write about (e.g., the <a href="http://en.wikipedia.org/wiki/Birkhoff_polytope">Birkhoff polytope</a>). I hope others will fill in the gaps.</p>
<p>Getting back to the assessment of grad schools&#8212;have we finally found the right way to understand those rank-intervals that the NRC promises to publish any day now? My sense is that a semi-magic square (or, equivalently, a doubly stochastic matrix) will give a less-misleading impression than a simple eyeball sorting on the spans or midpoints of the rank-intervals. But what a lot of bother to get to that point! How many prospective grad students are going to repeat this analysis?</p>
<p><strong>Acknowledgment:</strong> Thanks to Geoff Davis of <a href="http://www.phds.org/">PhDs.org</a> for introducing me to this story. PhDs.org will have the new ratings as soon as the NRC releases them, and may even find a way to make them intelligible! <strong>Disclaimer:</strong> I&#8217;ve done paid work for the PhDs.org web site (but this is not a paid endorsement).</p>
<p><strong>Update 2010-07-27:</strong> If you&#8217;ve gotten this far, please read the comments as well. A number of commenters have provided important insights and context, which have helped me understand what&#8217;s going on in the matrices I&#8217;ve been calling ormats. But I&#8217;m still a bit murky about the best way to recognize and count them. I&#8217;m not sure that publishing my still-murky thoughts is terribly helpful, but maybe someone else will read what follows and give us a dazzling, gemlike synthesis.</p>
<p>For the ormat-recognition problem (Question 4 above), three basic approaches have been mentioned: enumerating the permutation paths through the matrix, examining matrix minors, and looking for perfect matchings in a bipartite graph defined by the matrix. It seems to me that all of these methods are doing the same thing.</p>
<p>Start with Barry Cipra&#8217;s method of minors. The basic operation is to choose a nonzero matrix element, then delete the row and the column in which that element occurs. You then apply the same operation to the remaining, smaller matrix.</p>
<p>In tracing permutation paths, we&#8217;re looking for sequences of nonzero elements, drawing one element from each column and each row. A way of organizing this search is to choose a nonzero element and then, after recording its location, delete the corresponding column and row, so that no other elements can be chosen from that column or row.</p>
<p>In the method based on Hall&#8217;s theorem, as explained by John R., we view the ormat as the adjacency matrix of a bipartite graph, where every nonzero element designates an edge connecting a row vertex to a column vertex. To find a matching, we delete an edge, along with the two vertices it connects (and also all the other edges incident on those vertices). <del>Then we recurse on the smaller remaining graph.</del> (See further update below.) If you translate this operation on the graph back into the language of matrices, deleting an edge and its endpoints amounts to deleting a row and a column of the adjacency matrix.</p>
<p>I am not asserting that these three algorithms are all identical, but they all rely on the same underlying operation. To say more, we would need to consider the control structure of the algorithms&#8212;how the basic operations are organized, how the recursion works, all the details of the bookkeeping. I don&#8217;t trust myself to make those comparisons without trying to implement the three methods, which I have not yet done. However, at this point I just don&#8217;t see how any method can guarantee correct results without something resembling backtracking (or else exhaustive search through an exponential space). After all, we&#8217;re not looking for just one matching in the graph, or one decomposition into matrix minors, or one permutation path; we have to examine them all.</p>
<p>Here&#8217;s a further hand-wavy argument for the essential difficulty of the task. For a (0,1) matrix, the number of permutation paths that avoid all zero entries is equal to the permanent of the matrix. Computing the permanent of such a matrix is known to be <a href="http://en.wikipedia.org/wiki/Permanent_is_sharp-P-complete">#P-complete</a>.</p>
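<p>For small matrices the permanent is easy to compute directly. This sketch expands along the first column, exactly like a determinant but with every sign positive. It takes exponential time, so it illustrates the definition and is no way around the hardness result. The 4 &#215; 4 matrix is the small example from the discussion of probabilities.</p>

```python
def permanent(mat):
    """Permanent by expansion along the first column (exponential time)."""
    k = len(mat)
    if k == 0:
        return 1
    total = 0
    for r in range(k):
        if mat[r][0]:
            minor = [row[1:] for i, row in enumerate(mat) if i != r]
            total += mat[r][0] * permanent(minor)
    return total

example = [[1, 1, 0, 0],
           [1, 1, 1, 1],
           [0, 1, 1, 1],
           [0, 0, 1, 1]]

print(permanent(example))   # number of zero-avoiding permutation paths
```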
<p><strong>Update 2010-07-31:</strong> With lots of help from my friends, I think I finally get it. Although there could be as many as <em>k!</em> permutation paths in a <em>k</em> &#215; <em>k</em> matrix, you don&#8217;t need to examine all of the paths to decide whether or not the matrix is an ormat. It&#8217;s enough to establish that one such path passes through each nonzero element. This is what the algorithm based on Hall&#8217;s theorem does. As Frans points out in a comment below, I misunderstood the essential nature of that algorithm (in spite of having it explained to me several times). There is no recursive deconstruction into progressively smaller matrix minors; instead, we just loop over all the nonzero elements of the matrix, find the minor associated with each such element, then check for a perfect matching in the minor. (Still more refinements are possible&#8212;but already we have a polynomial algorithm.)</p>
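<p>Here is a sketch of that loop. The matching step uses the textbook augmenting-path method for bipartite graphs, which is one standard choice and not necessarily what the commenters had in mind; the function names are invented.</p>

```python
def has_perfect_matching(mat, skip_row, skip_col):
    """Augmenting-path bipartite matching on mat with one row and column removed."""
    k = len(mat)
    rows = [r for r in range(k) if r != skip_row]
    cols = [c for c in range(k) if c != skip_col]
    row_of = {}                              # column -> row currently matched to it

    def augment(r, visited):
        for c in cols:
            if mat[r][c] and c not in visited:
                visited.add(c)
                if c not in row_of or augment(row_of[c], visited):
                    row_of[c] = r
                    return True
        return False

    # A perfect matching exists only if every remaining row can be matched.
    return all(augment(r, set()) for r in rows)

def is_ormat(mat):
    """True if every 1 extends to a full permutation path of 1s."""
    k = len(mat)
    ones = [(r, c) for r in range(k) for c in range(k) if mat[r][c]]
    return bool(ones) and all(has_perfect_matching(mat, r, c) for r, c in ones)

print(is_ormat([[1, 1, 0, 0], [1, 1, 1, 1], [0, 1, 1, 1], [0, 0, 1, 1]]))
print(is_ormat([[1, 1, 0], [0, 1, 0], [0, 0, 1]]))
```

<p>There are at most <em>k</em><sup>2</sup> nonzero elements, and each matching test takes polynomial time, so the whole predicate is polynomial in <em>k</em>.</p>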
<p>With this efficient recognizer predicate, it&#8217;s easy to measure the proportion of ormats in random matrices at larger values of <em>k</em>:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/07/fraction-of-ormats-k25.png" border="0" alt="fraction-of-ormats-k25.png" width="447" height="297" /></p>
<p>As expected, the fraction of ormats approaches 1 beyond about <em>k</em> = 20.</p>
<p>So much for identifying ormats. I am still unable to extend the series of exact counts beyond <em>k</em> = 5. The tabulations for random (0,1) matrices suggest that for&nbsp;<em>k</em> = 6 there should be about 20 billion ormats, and counting that high is just too painful. I need to work out the symmetries of the problem.</p>
<p>As far as I can tell, assigning exact probabilities to the nonzero matrix elements requires a full enumeration of all the permutation paths, and thus a calculation equivalent to the permanent. There may be a useful approximation.</p>
<p>Barry Cipra asks a really good question: The permanent tells us the <em>maximum</em> number of permutations that could possibly be included in a given ormat, but what is the <em>minimum</em> number? A naive upper bound is the number of 1s in the matrix, but I don&#8217;t see an easy path to an exact count. But enough for now.</p>
]]></content:encoded>
			<wfw:commentRss>http://bit-player.org/2010/four-questions-about-fuzzy-rankings/feed</wfw:commentRss>
		</item>
		<item>
		<title>The thrill of the chase</title>
		<link>http://bit-player.org/2010/the-thrill-of-the-chase</link>
		<comments>http://bit-player.org/2010/the-thrill-of-the-chase#comments</comments>
		<pubDate>Mon, 12 Jul 2010 00:47:50 +0000</pubDate>
		<dc:creator>brian</dc:creator>
		
		<category><![CDATA[statistics]]></category>

		<guid isPermaLink="false">http://bit-player.org/?p=706</guid>
		<description><![CDATA[How I love to go out hunting on a bright Sunday morning&#8212;though it&#8217;s not my style to shoot furry/feathery/finny animals. My game is to get up early and&#160;stalk a wily factoid.
A posting from Mat Roberts, whose blog I&#8217;ve recently discovered, sent me out this morning to chase down a passage in How Long Is a [...]]]></description>
			<content:encoded><![CDATA[<p>How I love to go out hunting on a bright Sunday morning&#8212;though it&#8217;s not my style to shoot furry/feathery/finny animals. <em>My</em> game is to get up early and&nbsp;stalk a wily factoid.</p>
<p>A <a href="http://moleseyhill.com/blog/2009/05/25/how-many-bugs/">posting</a> from Mat Roberts, whose <a href="http://moleseyhill.com/blog/">blog</a> I&#8217;ve recently discovered, sent me out this morning to chase down a passage in <em><a href="http://plus.maths.org/issue21/reviews/book4/index.html">How Long Is a Piece of String</a></em>, a book by&nbsp;Rob Eastaway and Jeremy Wyndham:</p>
<p><img class="centered" title="Don't be picky about the formula. Yes, it's true, S could be zero. We can handle that, if necessary, with a slightly more elaborate version." src="http://bit-player.org/wp-content/uploads/2010/07/eastaway-wyndham-p160.png" border="0" alt="passage from Eastaway-Wyndham, page 160" width="450" height="329" /></p>
<p>The concept here seemed familiar, but the term &#8220;Lincoln Index&#8221; was new to me. Lincoln who? What index?</p>
<p>Google offered some useful clues. (Also a generous helping of false scents&#8212;books about Honest Abe that happen to have an index.) Without even clicking on a link I had the general context:</p>
<blockquote>
<p>The <em>Lincoln Index</em> provides a way to measure population sizes of individual animal species. It is based on a capture/mark/recapture method&#8230;</p>
</blockquote>
<p>So we&#8217;re talking ecology and population biology. The original idea was not to catch the same typo twice but to catch the same furry/feathery/finny creature twice. Interesting. However, the first couple of web pages that Google sent me to (<a href="http://www.offwell.free-online.co.uk/lincoln.htm">here</a> and <a href="http://teachers.net/lessons/posts/2222.html">here</a>) told me nothing about Lincoln. And, oddly, I found no Wikipedia entry for &#8220;Lincoln Index.&#8221; If it&#8217;s not in Wikipedia, does it exist?</p>
<p>With a little more poking around, I stumbled upon another clue that seemed promising: a <a href="http://www.sbs.utexas.edu/jcabbott/courses/bio208web/labs/populations/populations.htm">mention</a> of &#8220;the Lincoln-Pearson equation for estimating population size.&#8221; I was still in the dark about Lincoln, but Pearson is quite a familiar figure. Surely that&#8217;s Karl Pearson, the pioneering statistician, who did much of his work in the biological sciences and might very well have come up with a scheme for estimating population sizes.</p>
<p>Back at Google, though, searching for &#8220;Lincoln-Pearson&#8221; turned up nothing pertinent other than the page I&#8217;d come from (though I <em>did</em> learn that Karl Pearson &#8220;read in chambers in Lincoln&#8217;s Inn&#8221; during his early years studying law).</p>
<p>More beating the bushes. Eventually I realized I had wandered into a blind alley. Somebody needs to hire a pair of proofreaders: The formula is not &#8220;Lincoln-Pearson&#8221; but &#8220;Lincoln-Petersen.&#8221; Try <em>those</em> names at Google and you&#8217;ll get an abundance of useful pointers. (You&#8217;ll also learn that Abraham Lincoln died in&nbsp;Petersen&#8217;s Boarding House, across the street&nbsp;from Ford&#8217;s Theater. Google is not just a search engine but also a coincidence engine.)</p>
<p>The particular web page where I finally got the correct names (<a href="http://www.cals.ncsu.edu/course/fw353/Estimate.htm">notes for a course at North Carolina State University</a>) explains that capture-mark-recapture methods</p>
<blockquote>
<p>are used extensively to estimate populations of fish, game animals, and many non-game animals.&nbsp;The approach was first used by Petersen (1896) to study European plaice in the Baltic Sea and later proposed by Lincoln (1930) to estimate numbers of ducks. Petersen&#8217;s and Lincoln&#8217;s method is often referred to as the Lincoln-Petersen Index, even though it is not an index but a method to estimate actual population sizes. (Should it not be the Petersen-Lincoln Estimate?)</p>
</blockquote>
<p>I decided to pursue Petersen first&#8212;and immediately ran into a&nbsp;few further bibliographic brambles. Some citations spell the name &#8220;Petersen&#8221; and others &#8220;Peterson.&#8221; Some give the initials &#8220;C. G. T.&#8221; and others &#8220;C. G. J.&#8221; or &#8220;C. J. G.&#8221; The date might be 1895 or 1896 or 1897. Here&#8217;s what I believe to be a correct citation:</p>
<blockquote>
<p>Petersen, C. G. J. 1896. The yearly immigration of young plaice into the Limfjord from the German Sea. <em>Report of the Danish Biological Station to the Home Department</em> 6:1&#8211;48.</p>
</blockquote>
<p>Wikipedia identifies our elusive author as Carl Georg Johannes Petersen (1860-1928).&nbsp;He was a founder of the Danish Biological Station, which was not in fact a station but a mobile laboratory&#8212;a decommissioned naval vessel that was moved around from year to year. In 1895, Petersen took the station to the Limfjord, a chain of bays, lakes and channels cutting across the Jutland peninsula in northern Denmark. There he studied the plaice fishery. (Back to Wikipedia: &#8220;The European plaice is a right-eyed flounder belonging to the Pleuronectidae family.&#8221; But let&#8217;s not get started on right-eyed and left-eyed flatfish, or we&#8217;ll never get to the end of this.)</p>
<p>Petersen&#8217;s report is&nbsp;<a href="http://www.archive.org/details/reportofdanishbi06dans">available online</a>, scanned from a copy belonging to the library of the Marine Biological Laboratory and Woods Hole Oceanographic Institution, and hosted by the Biodiversity Heritage Library of the Internet Archive. A second surprise: The report is written in English. But on reading through it I find only vague and murky connections between the work Petersen reports and the mark-recapture method of&nbsp;estimating&nbsp;populations. There&#8217;s nothing resembling the <em>E<sub>1</sub>E<sub>2</sub>/S</em> formula.</p>
<p>Petersen&nbsp;<em>does</em> describe a series of capture/mark/recapture experiments. A few hundred plaice were caught and marked by attaching numbered buttons, then put back in the water. Fishermen who recaught the labeled fish in later months were asked to report them. But the purpose of this study was not to estimate the total population; instead, Petersen used before-and-after measurements of the marked fish to estimate their growth rate.</p>
<p>In a much larger experiment, some 82,580 plaice (somebody must have counted them!) were transplanted into the fjord, and 10,900 of the fish were marked by having a hole punched in their dorsal fin. The number of marked fish was recorded as the plaice were caught during the coming year. It&#8217;s not clear whether the aim of this project was to estimate the total population, but in any case it didn&#8217;t work. The fraction of marked fish in the transplanted batch was about 1/7, but the marked fraction in the subsequent catches was 1/5. Petersen remarks, &#8220;This result is very strange,&#8221; and I have to agree.</p>
<p>When Petersen did try to estimate the plaice population, he didn&#8217;t rely on a recapture scheme. He went out with seine nets designed to dredge up every bottom fish in a measured plot, then extrapolated from the density of fish per unit area.</p>
<p>The whole report is fascinating fishy stuff, but it leaves me wondering just how Petersen came to be given credit for the resampling idea. As far as I can tell, it&#8217;s not to be found in this paper.</p>
<p>Having chased down Petersen, I turned back to Mr. Lincoln. Without much trouble I was able to identify the work in question:</p>
<blockquote>
<p>Lincoln, F. C. 1930. Calculating waterfowl abundance on the basis of banding returns. <em>United States Department of Agriculture Circular</em> 118:1&#8211;4.</p>
</blockquote>
<p><img class="alignleft" title="Credit: U.S. Geological Survey" src="http://bit-player.org/wp-content/uploads/2010/07/fredericklincoln.jpg" border="0" alt="portrait of Frederick C. Lincoln in his office, with stuffed duck." width="220" height="289" />The author was Frederick C. Lincoln, who was&nbsp;<a href="http://www.pwrc.usgs.gov/BBL/homepage/lincoln.htm">bird-bander-in-chief</a> in the U.S. for some 25 years. The agency he founded has since migrated from the Department of Agriculture to the&nbsp;U.S. Geological Survey&nbsp;and become the Bird Banding Laboratory.</p>
<p>Google returns hundreds of works that cite Lincoln&#8217;s paper (including some quite far afield from population biology). But tracking down the USDA document itself was not so easy. If the USDA has it online, I wasn&#8217;t able to locate it. But a search of <a href="http://www.worldcat.org/">WorldCat</a> eventually turned up an archive in the <a href="http://catalog.hathitrust.org/">Hathi Trust Digital Library</a> where you can page through&nbsp;<a href="http://babel.hathitrust.org/cgi/pt?view=image;size=100;id=umn.31951d02969945h;page=root;seq=640;num=81">Lincoln&#8217;s pamphlet</a> in a copy scanned by Google at the University of Minnesota library.</p>
<p>Lincoln gives only a brief and informal account of the recapture idea, but the basic principle is stated clearly enough:</p>
<blockquote>
<p>If in one season 5,000 ducks were banded and yielded 600 first-season returns, or 12 percent, and if during that same season the total number of ducks killed and reported by sportsmen was about 5,000,000, then this number would be equivalent to approximately 12 per cent of the waterfowl population for that year, which would be about 42,000,000.</p>
</blockquote>
<p>It&#8217;s not hard to translate this formula from the language of duck hunters into the language of proofreaders. The first reader finds 5,000 typos and the second spots 5 million; 600 of these errors are common to both lists, and so the total number of typos is:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/07/typos-eqn.png" border="0" alt="\frac{5\,000 \times 5\,000\,000}{600} = 41\,666\,667" width="208" height="34" /></p>
<p>So that&#8217;s my reward for a morning spent out hunting: 42 million typos.</p>
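<p>The arithmetic is easy to check with a few lines of Python. This is only a sketch of the <em>E<sub>1</sub>E<sub>2</sub>/S</em> estimate; the function and parameter names are assumptions chosen for illustration.</p>

```python
def lincoln_petersen(first_count, second_count, overlap):
    """Estimate a total population from two samples and their overlap.

    first_count  -- items found (or animals marked) in the first pass
    second_count -- items found (or animals caught) in the second pass
    overlap      -- items common to both passes
    All names are hypothetical; the formula is first * second / overlap.
    """
    if overlap == 0:
        # The simple form of the estimate is undefined with no overlap.
        raise ValueError("overlap must be nonzero for the simple estimate")
    return first_count * second_count / overlap


# Lincoln's duck figures, reused for the proofreading analogy above.
estimate = lincoln_petersen(5000, 5000000, 600)
print(round(estimate))
```

<p>With Lincoln&#8217;s numbers the estimate rounds to 41,666,667, matching the formula above.</p>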
<p>Does Frederick Lincoln deserve credit for the Lincoln Index? I&#8217;d say he has a good claim, except that Pierre Simon de Laplace <span style="text-decoration: underline;" title="text added 2010-07-12">had the same idea</span> more than a century earlier. In 1802 Laplace applied his method to estimating the (human) population of France. But maybe that&#8217;s a story for another Sunday morning.</p>
<p><strong>Epilogue</strong>. This is not really a story about typos, or about fish and ducks. It&#8217;s about finding things&#8212;about the phenomenal ease of chasing facts on the world wide web. Does a marked fish have any hope of escaping recapture there?</p>
]]></content:encoded>
			<wfw:commentRss>http://bit-player.org/2010/the-thrill-of-the-chase/feed</wfw:commentRss>
		</item>
		<item>
		<title>A twist of fate</title>
		<link>http://bit-player.org/2010/a-twist-of-fate</link>
		<comments>http://bit-player.org/2010/a-twist-of-fate#comments</comments>
		<pubDate>Fri, 11 Jun 2010 20:51:36 +0000</pubDate>
		<dc:creator>brian</dc:creator>
		
		<category><![CDATA[problems and puzzles]]></category>

		<guid isPermaLink="false">http://bit-player.org/?p=697</guid>
		<description><![CDATA[The school of philosophy called Antipodianism briefly flourished on the fringes of the Hellenistic world more than 2,000 years ago. The sect held that every person has an opposite number, a mirror image who inverts all our beliefs, feelings, actions and attitudes. If I smile, my antipodian counterpart frowns; when I wake, she sleeps. If [...]]]></description>
			<content:encoded><![CDATA[<p>The school of philosophy called Antipodianism briefly flourished on the fringes of the Hellenistic world more than 2,000 years ago. The sect held that every person has an opposite number, a mirror image who inverts all our beliefs, feelings, actions and attitudes. If I smile, my antipodian counterpart frowns; when I wake, she sleeps. If I&#8217;m a Mac, she&#8217;s a PC. For every liberal Democrat there&#8217;s an antipodian Tea Party Republican. In this way the&nbsp;universe is held in&nbsp;balance. It&#8217;s an <em>enforced</em> equilibrium, which none of us has the power to upend, hard as we might try.</p>
<p>The earliest of the Antipodians believed that every such matched pair&nbsp;(commonly designated A and &#8704;)&nbsp;lives at diametrically opposite points on the surface of the earth. This arrangement ensures that A and &#8704; can never meet&#8212;thereby averting a cosmic catastrophe. A&nbsp;later quantum-field version of Antipodianism relaxed the geographic constraint by allowing for the creation and annihilation of A&#8704;&nbsp;pairs, but that idea never really caught on.</p>
<p>One day an Antipodian master was teaching an exchange student from New Zealand. The child was crafty.</p>
<blockquote>
<p>&#8220;Is it not true,&#8221; she asked, &#8220;that&nbsp;A and &#8704;&nbsp;always do the opposite thing?&#8221;</p>
<p>&#8220;Yes, antisymmetry demands it,&#8221; the master replied.</p>
<p>&#8220;If A walks north,&nbsp;&#8704;&nbsp;must walk south?&#8221; the child asked.</p>
<p>Again the master assented.</p>
<p>&#8220;If A goes east,&nbsp;&#8704;&nbsp;must go west?&#8221;</p>
<p>&#8220;Yes.&#8221;</p>
<p>&#8220;If A turns to the right,&nbsp;&#8704;&nbsp;must turn to the left, no?&#8221;</p>
<p>The master agreed, although he sensed trouble coming.</p>
<p>&#8220;I&#8217;m afraid the universe is out of joint,&#8221; said the child. &#8220;If A goes north and turns to the right, while&nbsp;&#8704;&nbsp;goes south and turns to the left, afterwards they are both walking east. They are doing the <em>same</em> thing.&#8221;</p>
</blockquote>
<p>Needless to say, this was a moment of crisis in Antipodian doctrine. The Pythagoreans, you may recall, resolved a similar impasse by resorting to violence. When some upstart challenged their precept that &#8220;all is number&#8221; by showing that no known number can be the square root of 2, the Pythagoreans tossed the troublemaker out of the boat. But in this case the&nbsp;Antipodian master kept his calm.</p>
<blockquote>
<p>&#8220;Ah my little Kiwi,&#8221; he said to the student. &#8220;You are clever but not wise. Your own statements refute your claim. Did you not begin by saying&nbsp;that&nbsp;A and &#8704;&nbsp;always do the opposite thing? When A walks 10 paces north,&nbsp;&#8704;&nbsp;walks 10 paces south. When A turns right,&nbsp;&#8704;&nbsp;turns left. But now you would have us believe they&nbsp;<em>both</em> take a step forward, contradicting the most basic law of their nature. What really happens is that A walks forward and&nbsp;&#8704;&nbsp;walks backward. Thus A goes eastward and&nbsp;&#8704;&nbsp;westward, and all is well with the world.&#8221;</p>
</blockquote>
<p>Through this brittle sophistry the master extricated himself from the classroom&#8212;though he may have had to walk backwards to make good his escape. He never taught again. The Kiwi student went on to a brilliant career studying the weak interactions of neutral <em>K</em> mesons. As for Antipodianism, it vanished without a trace.</p>
<p>Or maybe it left a tiny trace. I&#8217;ve never visited the antipodes, but I hear that <a href="http://www.occa-corkscrews.com/index.html">corkscrews Down Under</a> turn the other way.</p>
]]></content:encoded>
			<wfw:commentRss>http://bit-player.org/2010/a-twist-of-fate/feed</wfw:commentRss>
		</item>
		<item>
		<title>Disentangling Gaussians</title>
		<link>http://bit-player.org/2010/disentangling-gaussians</link>
		<comments>http://bit-player.org/2010/disentangling-gaussians#comments</comments>
		<pubDate>Thu, 10 Jun 2010 21:11:28 +0000</pubDate>
		<dc:creator>brian</dc:creator>
		
		<category><![CDATA[computing]]></category>

		<guid isPermaLink="false">http://bit-player.org/?p=695</guid>
		<description><![CDATA[The printed program for the recent STOC meeting in Cambridge announced the following talk:

Unfortunately, the fourth author could not be present (and he is not listed as an author on the paper itself), so the talk was given by Gregory Valiant.
Here is a motivating example. Take a tape measure, and go record the heights of [...]]]></description>
			<content:encoded><![CDATA[<p>The printed program for the recent STOC meeting in Cambridge announced the following talk:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/06/erdos-on-the-program.png" border="0" alt="Talk by Kalai, Moitra, Valiant and Erdos on printed program, STOC 2010" width="450" height="96" /></p>
<p>Unfortunately, the fourth author could not be present (and he is not listed as an author on the paper itself), so the talk was given by Gregory Valiant.</p>
<p>Here is a motivating example. Take a tape measure, and go record the heights of a few thousand adults chosen at random. You&#8217;ll come back with a distribution that looks something like this:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/06/combined-distribution.png" border="0" alt="combined-distribution.png" width="447" height="315" /></p>
<p>How come the curve is so lumpy and lopsided? Isn&#8217;t height a variable with an approximately normal distribution? Of course it is. The problem is that we have mixed up two subpopulations&#8212;men and women&#8212;with different height distributions:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/06/three-distributions.png" border="0" alt="three-distributions.png" width="447" height="315" /></p>
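<p>In code, the combined curve is just a weighted sum of two normal densities. The following Python sketch shows the idea; the function names and the parameterization by a single weight are assumptions made for illustration, and no particular height statistics are implied.</p>

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, weight, mean1, sd1, mean2, sd2):
    """Density of a two-component Gaussian mixture.

    weight is the share of the first component (between 0 and 1); the second
    component gets the remainder. All names here are hypothetical.
    """
    return weight * normal_pdf(x, mean1, sd1) + (1.0 - weight) * normal_pdf(x, mean2, sd2)
```

<p>Evaluating <em>mixture_pdf</em> over a range of <em>x</em> values with two different means traces out a lumpy or lopsided curve, even though each component on its own is a symmetric bell.</p>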
<p>Question: Given the lumpy combined distribution, and the knowledge that it represents a mixture of exactly two normal distributions, can we somehow recover the component distributions? Specifically, can we determine the means and standard deviations of the two Gaussian curves, as well as their relative weights, or contributions to the total?</p>
<p>The answer is yes. In a series of papers published in the 1950s and 60s, Henry Teicher (then at Purdue, now emeritus at Rutgers) proved a kind of unique-factorization theorem for distributions. He showed that no mixed distribution can be decomposed into two or more different sets of normal distributions, just as no composite whole number can be formed as the product of two or more distinct sets of primes. (Teicher&#8217;s theorem&nbsp;generalizes to many distributions other than normal ones.&nbsp;There are also a few caveats; for example,&nbsp;none of the component distributions can be pointlike, with a standard deviation of zero.) Teicher&#8217;s work implies that a mixed distribution is <em>identifiable:</em> If you can break it down into a set of primitive distributions, then that decomposition is unique.</p>
<p>If STOC were a mathematics meeting, that might be the end of the story. But STOC is the Symposium on the Theory of Computing, and the question here is not &#8220;Does a solution exist?&#8221; but rather &#8220;Can you solve it in polynomial time?&#8221;&nbsp;Teicher&#8217;s result offers no such guarantee. It turns out that his proof of identifiability depends on the behavior of the distributions far out in the tails. Measuring data frequencies in these sparsely populated outlying regions requires an exponentially large number of samples. But other approaches to the problem don&#8217;t run into this snag; Valiant and his colleagues show that &#8220;robust polynomial identifiability&#8221; is indeed possible.</p>
<p>A key idea in their proof goes back at least as far as the 1890s, to work done by the indefatigable statisticians W. F. R. Weldon and Karl Pearson. In 1892 Weldon spent a summer on the Bay of Naples, measuring various features on the carapaces of crabs. One such feature yielded a distinctively lumpy distribution much like the height curve shown above. Weldon thought that the asymmetry might signal the incipient splitting of the crab population into two races or species, each of which if taken individually would have a normal distribution. For help with the analysis of his data he turned to Pearson, who was able to identify two component Gaussian curves that sum up to the observed distribution. (The figure comes from Weldon&#8217;s 1893 paper; see below for references.)</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/06/weldon-pearson.png" border="0" alt="Figure 3 distribution from Weldon 1893." width="441" height="295" /></p>
<p>Pearson&#8217;s method was based on calculating the first six statistical moments of the given distribution. (The <em>n</em>th moment of a distribution is the expectation value of&nbsp;\((x-\bar{x})^n\), where \(x\) is a random value drawn from the distribution and \(\bar{x}\) is the mean value.) The process called for solving a ninth-degree polynomial, which was a heroic feat in 1893.</p>
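<p>Estimating those moments from data is the easy part. Here is a minimal Python sketch that follows the definition just given, with the sample mean standing in for the true mean; the function name and the <em>max_order</em> parameter are assumptions for illustration. Solving for the component Gaussians from the moments, the hard part of the method, is not attempted here.</p>

```python
def central_moments(samples, max_order=6):
    """Return the sample central moments of orders 1 through max_order.

    The nth entry is the average of (x - mean)**n over the samples, where
    mean is the sample mean. The name and signature are hypothetical.
    """
    n = len(samples)
    mean = sum(samples) / n
    return [
        sum((x - mean) ** order for x in samples) / n
        for order in range(1, max_order + 1)
    ]
```

<p>The first central moment is always zero by construction and the second is the variance; for a symmetric sample the odd moments vanish, so any lopsidedness in the data shows up in the third and fifth entries.</p>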
<p>Valiant et al. show that a procedure like Pearson&#8217;s will always succeed in the following sense: If two mixed distributions can be distinguished at all, then they can be distinguished by examining their first six moments, and those moments characterize the component Gaussians. Calculating the moments requires only a polynomial number of samples, and the running time of the overall algorithm is a polynomial function of various parameters such as the required accuracy. Moreover, the process can be extended from one-dimensional distributions to multidimensional Gaussians.</p>
<p>Although the algorithm has polynomial running time, that&#8217;s not a guarantee of practicality. (After all, this is a <em>theory</em> conference.) One stage of the process is essentially a brute-force search through a very large (though polynomially bounded) space of parameter values for the six moments.</p>
<p><strong>Sources</strong>:</p>
<p>The <a href="http://portal.acm.org/toc.cfm?id=1806689&amp;idx=SERIES396&amp;type=proceeding&amp;coll=ACM&amp;dl=ACM&amp;part=series&amp;WantType=Proceedings&amp;title=STOC&amp;CFID=93401172&amp;CFTOKEN=47216168">STOC proceedings</a> with <a href="http://doi.acm.org/10.1145/1806689.1806765">the Kalai-Moitra-Valiant</a> paper are now available online, but only by subscription. A <a href="http://www.cs.berkeley.edu/~gvaliant/papers/KMV_c.pdf">PDF preprint</a> is posted on <a href="http://www.eecs.berkeley.edu/~gvaliant/index.html">Valiant&#8217;s web site</a>; also an <a href="http://www.cs.berkeley.edu/~gvaliant/papers/KMV_full.pdf">expanded version</a>.</p>
<p>For Henry Teicher&#8217;s work see: &#8220;Identifiability of Mixtures,&#8221;&nbsp;<em>The Annals of Mathematical Statistics</em> 32(1):244&#8211;248 (1961) and &#8220;On the Mixture of Distributions,&#8221; &nbsp;<em>The Annals of Mathematical Statistics</em> 31(1):55&#8211;73 (1960).</p>
<p>For Weldon and Pearson see:&nbsp;W. F. R. Weldon: &#8220;On Certain Correlated Variations in <em>Carcinus maenas,</em>&#8221;&nbsp;<em>Proceedings of the Royal Society of London</em> 54:318&#8211;329&nbsp;(1893) and&nbsp;Karl Pearson: &#8220;Contributions to the Mathematical Theory of Evolution,&#8221; <em>Philosophical Transactions of the Royal Society of London A</em> 185:71&#8211;110&nbsp;(1894).</p>
<p>My thanks to Virginia Gold and Irene Frawley at ACM, Lance Fortnow, Paul Oka, Jennifer Chayes and Christian Borgs, all of whom helped make it possible for me to attend some of the STOC sessions.</p>
]]></content:encoded>
			<wfw:commentRss>http://bit-player.org/2010/disentangling-gaussians/feed</wfw:commentRss>
		</item>
		<item>
		<title>The snarXiv</title>
		<link>http://bit-player.org/2010/the-snarxiv</link>
		<comments>http://bit-player.org/2010/the-snarxiv#comments</comments>
		<pubDate>Mon, 07 Jun 2010 12:06:18 +0000</pubDate>
		<dc:creator>brian</dc:creator>
		
		<category><![CDATA[physics]]></category>

		<guid isPermaLink="false">http://bit-player.org/?p=689</guid>
		<description><![CDATA[
The snarXiv is a random high-energy theory paper generator incorporating all the latest trends, entropic reasoning, and exciting moduli spaces. The arXiv is similar, but occasionally less random.

Inspiring! Soon bit-player, too, will be generated by a context-free grammar, and no one will know the difference.
&#160;
]]></description>
			<content:encoded><![CDATA[<blockquote>
<p>The <a href="http://snarxiv.org">snarXiv</a> is a random high-energy theory paper generator incorporating all the latest trends, entropic reasoning, and exciting moduli spaces. The <a href="http://arxiv.org">arXiv</a> is similar, but occasionally less random.</p>
</blockquote>
<p>Inspiring! Soon bit-player, too, will be generated by a context-free grammar, and no one will know the difference.</p>
]]></content:encoded>
			<wfw:commentRss>http://bit-player.org/2010/the-snarxiv/feed</wfw:commentRss>
		</item>
		<item>
		<title>A hole in the bottom of the ocean</title>
		<link>http://bit-player.org/2010/a-hole-in-the-bottom-of-the-ocean</link>
		<comments>http://bit-player.org/2010/a-hole-in-the-bottom-of-the-ocean#comments</comments>
		<pubDate>Fri, 04 Jun 2010 15:09:36 +0000</pubDate>
		<dc:creator>brian</dc:creator>
		
		<category><![CDATA[modern life]]></category>

		<category><![CDATA[off-topic]]></category>

		<guid isPermaLink="false">http://bit-player.org/?p=681</guid>
		<description><![CDATA[The explosion and fire that destroyed the drilling rig Deepwater Horizon on the night of April 20 was a run-of-the-mill industrial accident. In saying this, I don&#8217;t mean to make light of the disaster, in which 11 workers perished. Nevertheless, it&#8217;s important to recognize that events like this one have happened before. They happen all [...]]]></description>
			<content:encoded><![CDATA[<p>The explosion and fire that destroyed the drilling rig Deepwater Horizon on the night of April 20 was a run-of-the-mill industrial accident. In saying this, I don&#8217;t mean to make light of the disaster, in which 11 workers perished. Nevertheless, it&#8217;s important to recognize that events like this one have happened before. They happen all the time. Just two weeks earlier, an explosion in a West Virginia coal mine killed 29 workers. A refinery fire in Anacortes, Washington, killed seven on April 2. In February six workers died in a natural gas explosion at a power plant under construction in Connecticut. In January, leaking phosgene gas killed an operator at a West Virginia chemical plant.&nbsp;Looking back a few years further,&nbsp;a 2005 fire at a BP refinery in Texas City, Texas, killed 15 workers and injured 170 more. There was another fatality at the same refinery in 2008. Texas City was also the site of a&nbsp;fertilizer explosion&nbsp;in 1947 that destroyed much of the port and left almost 600 dead. Then there&#8217;s the Bhopal catastrophe, where toxic fumes from an insecticide plant suffocated thousands of residents of nearby neighborhoods.</p>
<p>Still another accident that belongs in this sad catalogue is&nbsp;the destruction of the Piper Alpha platform in the North Sea oil fields. Piper Alpha was not a drilling rig but a production platform; it pumped oil and natural gas from completed wells to a terminal in the Orkney Islands. On the night of July 6, 1988, a huge explosion shattered the main part of the platform, and a subsequent fire consumed the rest; 167 crew members died. The apparent cause was a miscommunication: The night shift tried to start up a pump, not knowing that the day shift had removed a crucial valve for maintenance, leading to a massive leak of flammable gas.</p>
<p>It&#8217;s a bitter truth: Industrial accidents are business as usual. Death on the job is a cost that we (as a society) are evidently willing to accept and pay. When a mine collapses or a sugar refinery explodes, we read the story in the newspaper and we watch the film at 11, and then we grit our teeth and move on. The Deepwater Horizon sinking is just another in this long series of mishaps.</p>
<p>And yet it&#8217;s also totally different. This accident has consequences that reach beyond the workers and their families. A hole in the bottom of the ocean has been spewing&nbsp;hydrocarbons for six weeks. The rate of loss could well be a million gallons a day. Efforts to control the well are still under way, but if they fail the spill might go on for another three months. All that makes for a nightmare that doesn&#8217;t drop out of the news cycle after the first 24 hours.</p>
<p><img class="centered" title="The Deepwater Horizon oil slick as seen by the NASA Terra satellite, 24 May 2010" src="http://bit-player.org/wp-content/uploads/2010/06/terra-may-24-2010.jpg" border="0" alt="The Deepwater Horizon oil slick as seen by the NASA Terra satellite, 24 May 2010" width="450" height="282" /></p>
<p>In this respect, Deepwater Horizon may turn out to be not another Piper Alpha but another Three Mile Island. The 1979 accident at that nuclear power station had lasting consequences: We haven&#8217;t built another nuclear plant in the U.S. in the 30 years since. (There are other reasons for the long lassitude of the nuclear industry, but TMI was a major factor.) Likewise, in the aftermath of Deepwater Horizon, it seems a fair guess that we won&#8217;t be&nbsp;drilling another deep offshore well anywhere near the U.S. coastline for years to come, and maybe decades. Maybe never.</p>
<p>A moratorium on offshore drilling, enforced by economic as well as legal and political pressure, looks like a perfectly sensible response to the current situation. But we deserve more. By all means, let&#8217;s do whatever is necessary to avoid repeating this particular mistake, but at the same time let&#8217;s address the broader issue of why things all over the industrial landscape keep blowing up and falling down.</p>
<p>Will we ever learn what happened on the Deepwater Horizon? There are certainly a lot of investigators trying to find out. Three Congressional committees have heard testimony already. A board of inquiry formed by the Coast Guard and the Minerals Management Service is holding its own series of hearings. So is a state panel in Louisiana. On May 11 President Obama asked the National Academy of Engineering to look into the cause of the accident, then on May 22 he appointed a special commission to carry out yet another investigation. Meanwhile the Justice Department is studying possible civil or criminal penalties.</p>
<p>For all that effort, we sure haven&#8217;t learned much yet. The well is leaking, but BP isn&#8217;t. It&#8217;s amazing how they&#8217;ve managed to keep the public and the press at a distance; the one art they seem to have mastered is secrecy.&nbsp;Even friendly trade publications (e.g., <a href="http://www.ogj.com/index.html"><em>Oil and Gas Journal</em></a>, <a href="http://www.offshore-mag.com/index.html"><em>Offshore</em></a>) have so far failed to penetrate the security cordon. In the end, though, the story will come out. We&#8217;re going to learn why that blowout preventer didn&#8217;t prevent a blowout. But I&#8217;m not so confident we&#8217;ll learn how to prevent the next blowout.</p>
<p><img class="centered" title="Blowout stack beneath a Chesapeake Energy drilling rig, Marlow, Oklahoma, 2004." src="http://bit-player.org/wp-content/uploads/2010/06/blowout-stack-7957-fg.jpg" border="0" alt="Blowout stack beneath a Chesapeake Energy drilling rig, Marlow, Oklahoma, 2004." width="450" height="504" /></p>
<p>The only blowout preventer I&#8217;ve ever seen up close and personal was on a drilling rig near Oklahoma City in 2004. At the time, I thought it was quite a brawny-looking piece of gear, with all those rings of torqued bolts clamping the flanges together, and the big red hydraulic rams like pincers clasping the pipe. Apparently, this unit is puny compared with the&nbsp;BOP stack installed&nbsp;on the Deepwater Horizon well (&#8220;five stories tall&#8221; in news accounts). Still, the principle of operation is the same: If the well &#8220;kicks&#8221;&#8212;meaning that gas and oil begin to push their way toward the surface&#8212;the rams close off the well and keep everything sealed tight. Some rams are meant to choke off the annular space around the drill pipe; some are &#8220;blind rams&#8221; used when no drill pipe is present in the bore; the last resort is a pipe ram or shear ram meant to crush or cut off the drill pipe.</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/06/bop-controls-7980.jpg" border="0" alt="BOP-controls-7980.jpg" width="450" height="393" /></p>
<p>It sounds foolproof, but evidently not. Controls can stick, valves can leak, parts can break. I&#8217;ve read speculation that the shear ram on the&nbsp;BOP beneath the&nbsp;Deepwater Horizon might have failed because it happened to strike a thicker-walled section of drill pipe where two lengths of tubing are threaded together; I really hope that&#8217;s not a serious concern, because it would mean that every BOP on the planet has a failure probability of 3 percent. (Drill pipe comes in 30-foot lengths, and the thickened sections are about a foot long.)</p>
<p>A 1997 <a href="http://www.offshore-mag.com/index/article-display/23675/articles/offshore/volume-57/issue-1/departments/drilling-production/well-control-ultra-deepwater-blowouts-how-could-one-happen.html">article in <em>Offshore</em></a> lists a fascinating variety of other failure modes for deep-water wells. The article is by Larry H. Flak of Boots and Coots&#8212;a company often called in to deal with crises like this one. Flak is writing for an audience of oil-patch insiders, and I don&#8217;t follow all of his jargon. For example, there&#8217;s this wonderfully opaque passage:</p>
<blockquote>
<p>Broached blowouts could happen with casing failure. Recently, an ultra-deepwater operator swabbed in a kick resulting in over 9,000 psi on the subsea BOPs. Fortunately, the casing was sound and set just on top of the sand. This allowed safe kick bullheading.</p>
</blockquote>
<p>Even for those of us who have never&nbsp;<a href="http://www.glossary.oilfield.slb.com/Display.cfm?Term=bullhead">bullheaded</a> or <a href="http://www.glossary.oilfield.slb.com/Display.cfm?Term=swab">swabbed</a> in a kick, the message comes through: A lot can go wrong. A well is not just a hole in the ground; it&#8217;s a fairly complicated structure with multiple concentric layers of casing and drill pipe, which create several interconnected annular spaces; fluids under pressure can find many pathways to the surface.&nbsp;Flak points out that even the massive steel blades of a hydraulic ram can be torn to pieces by high-velocity streams of abrasive fluids, such as drilling mud. He concludes: &#8220;Blowout control options in ultra-deepwater are very limited. Blowout prevention is of paramount importance.&#8221;</p>
<p>When the story of the Deepwater Horizon accident is finally told, we&#8217;re going to hear at least two interpretations. In one telling, BP and its contractors were incompetent or negligent or criminally greedy; they screwed up. But the basic technology of offshore drilling is sound, and if only the companies had followed established industry practices and common sense, none of this mess would have happened. In the other version,&nbsp;BP and its contractors were incompetent or negligent or criminally greedy; they screwed up. But even if they had&nbsp;followed approved industry practices, a disaster like this was waiting to happen, because the technology of offshore drilling is fatally flawed. I really wish I knew which of these stories to believe. I worry that we lack the institutional means to decide between them.</p>
<p class="centered">&#8226;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&#8226;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&#8226;</p>
<p>I&#8217;m a great admirer of the <a href="http://www.ntsb.gov/Abt_NTSB/history.htm">National Transportation Safety Board</a>. When an airliner crashes or a couple of commuter trains collide, the NTSB mounts a focused, scientific effort to understand the cause of the accident. Investigating accidents is all they do, and this turns out to be a key to their success; they are&nbsp;insulated from the responsibilities and temptations of regulating the industry, adjudicating legal culpability, assessing penalties or even enforcing their own safety recommendations.&nbsp;Separating the investigative and regulatory functions has worked brilliantly, and the board&#8217;s findings command wide respect.&nbsp;But the NTSB won&#8217;t be investigating the Deepwater Horizon blowout, because the event doesn&#8217;t fall within their statutory purview; it&#8217;s not a transportation accident. There&#8217;s also a&nbsp;<a href="http://www.csb.gov/about/mission.aspx">Chemical Safety and Hazard Investigation Board</a> (CSB), which has had a similar role in the chemicals industry since 1998. You might think that petroleum would count as a chemical substance, but as far as I can tell the CSB will also be sitting this one out. Their mission statement says: &#8220;The CSB conducts root cause investigations of chemical accidents at fixed industrial facilities.&#8221; The Deepwater Horizon wasn&#8217;t &#8220;fixed.&#8221;</p>
<p>So here&#8217;s my call for action: We need a single agency, modeled on the NTSB and the CSB, with authority to investigate all accidents involving industry or infrastructure&#8212;everything from well and refinery fires to water-main breaks, power blackouts, mine cave-ins, bridge and dam failures, boiler explosions and sewage spills. I believe we&#8217;re under a moral imperative to learn all we can from every such accident, in the hope of preventing&nbsp;a recurrence. A permanent, independent agency with investigative authority, in-house expertise and technical resources looks to me like the best way to reach this goal.</p>
<p>After the 2005 refinery fire in Texas City, BP&nbsp;hired James Baker III (the former Secretary of State) to convene a special panel of inquiry into the company&#8217;s safety practices and culture. The <a href="http://www.bp.com/bakerpanelreport">Baker panel report</a> observed that BP had an admirable record on <em>personal safety</em> but cited deficiencies in <em>process safety</em>. Personal safety says: Wear your hardhat, your eye protection and your steel-toed boots when you go out in the plant to open a valve. Process safety says: Make sure you open the right valve, not the one that&#8217;s going to create a fireball that engulfs the whole plant. The report remarks: &#8220;BP mistakenly interpreted improving personal injury rates as an indication of acceptable process safety performance.&#8221; On April 20, BP executives flew out to the Deepwater Horizon to celebrate a personal safety milestone on the rig: no loss-of-work accidents in seven years. The executives were still on board when somebody turned the wrong valve.</p>
<p>Note: I have posted a <a href="http://industrial-landscape.com/chapts/chap04.pdf">PDF</a> of the oil and gas chapter from my book <em>Infrastructure: A Field Guide to the Industrial Landscape</em> on <a href="http://industrial-landscape.com/index.html">the book&#8217;s web site</a>.</p>
<p><strong>Update 2010-06-21</strong>: A superb <a href="http://www.nytimes.com/2010/06/21/us/21blowout.html">report</a> on the failure modes of blowout preventers was published this morning in <em>The New York Times</em>. The article is by&nbsp;David Barstow, Laura Dodd, James Glanz, &nbsp;Stephanie Saul and Ian Urbina. Nothing they say is reassuring.</p>
<p>&nbsp;</p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://bit-player.org/2010/a-hole-in-the-bottom-of-the-ocean/feed</wfw:commentRss>
		</item>
		<item>
		<title>On the air</title>
		<link>http://bit-player.org/2010/on-the-air</link>
		<comments>http://bit-player.org/2010/on-the-air#comments</comments>
		<pubDate>Tue, 25 May 2010 18:31:51 +0000</pubDate>
		<dc:creator>brian</dc:creator>
		
		<category><![CDATA[books]]></category>

		<guid isPermaLink="false">http://bit-player.org/?p=665</guid>
		<description><![CDATA[Tomorrow I&#8217;ll be doing a gig on &#8220;The State of Things,&#8221; broadcast by North Carolina Public Radio. The main subject of discussion will be Scott Huler&#8217;s new book On the Grid; I&#8217;ll be present as supporting cast (call me a bit player) and will doubtless find a way to plug my own book Infrastructure.
If you&#8217;re [...]]]></description>
			<content:encoded><![CDATA[<p>Tomorrow I&#8217;ll be doing a gig on &#8220;The State of Things,&#8221; broadcast by North Carolina Public Radio. The main subject of discussion will be Scott Huler&#8217;s new book <em><a href="http://www.scotthuler.com/index.cgi">On the Grid</a></em>; I&#8217;ll be present as supporting cast (call me a bit player) and will doubtless find a way to plug my own book <em><a href="http://industrial-landscape.com/index.html">Infrastructure</a></em>.</p>
<p>If you&#8217;re within listening range of a North Carolina Public Radio station, tune in between noon and 1 p.m. Later in the day the audio should be available <a href="http://wunc.org/programs/tsot/">online</a>.</p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://bit-player.org/2010/on-the-air/feed</wfw:commentRss>
		</item>
		<item>
		<title>A shy woodland creature</title>
		<link>http://bit-player.org/2010/a-shy-woodland-creature</link>
		<comments>http://bit-player.org/2010/a-shy-woodland-creature#comments</comments>
		<pubDate>Mon, 24 May 2010 22:16:26 +0000</pubDate>
		<dc:creator>brian</dc:creator>
		
		<category><![CDATA[games]]></category>

		<category><![CDATA[problems and puzzles]]></category>

		<guid isPermaLink="false">http://bit-player.org/?p=661</guid>
		<description><![CDATA[Martin Gardner died over the weekend. He was 95 and living in Norman, Oklahoma, not too far from his birthplace in Tulsa.
Like many others, I grew up on Martin&#8217;s &#8220;Mathematical Games&#8221; column in Scientific American. Later I joined the staff of that magazine&#8212;but don&#8217;t imagine that Martin and I became office buddies. As a matter [...]]]></description>
			<content:encoded><![CDATA[<p>Martin Gardner died over the weekend. He was 95 and living in Norman, Oklahoma, not too far from his birthplace in Tulsa.</p>
<p>Like many others, I grew up on Martin&#8217;s &#8220;Mathematical Games&#8221; column in <em>Scientific American</em>. Later I joined the staff of that magazine&mdash;but don&#8217;t imagine that Martin and I became office buddies. As a matter of fact,&nbsp;I never once saw him in the office. He worked at home. He attended none of our editorial meetings. We never had lunch together. Indeed, I&nbsp;never would have met him at all except for the coincidence that we lived a few blocks apart, and every now and then I would be called upon to deliver a package of urgent proofs. (By the way, his address in those days was on Euclid Avenue!)</p>
<p>Someone on the magazine staff described Martin as &#8220;a shy woodland creature,&#8221; and the tag stuck. Looking back, however, I think it tells only half the story. Yes, Martin was no schmoozer, and he preferred to stay out of the spotlight, both in person and in print. Most of his best-known columns reported on someone else&#8217;s discoveries&mdash;Conway&#8217;s game of life, RSA&#8217;s cryptosystem, Penrose&#8217;s tilings. He delighted in annotating other people&#8217;s work, as in his celebrated edition of Lewis Carroll.&nbsp;Yet he was anything but timid or retiring. In an argument, the shy woodland creature was a grizzly bear. He had strongly held opinions and philosophical convictions, and he knew the worth of his own work. He was a man of ideas, to be taken seriously, yet also a man who had fun with his ideas.</p>
<p>As a celebration of Martin&#8217;s peculiar genius, I would like to revive the little puzzle that formed the basis of his very first column, in January of 1957. (He had published a few articles in <em>Scientific American</em> earlier, but this was the first column to appear under the &#8220;Mathematical Games&#8221; title.)</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/05/bingo450.png" border="0" alt="bingo450.png" width="450" height="434" /></p>
<p>On the bingo card above, choose any number, circle it, and cross out all the other numbers in the same column or row. Now select a second number from among those that remain unmarked, and again circle your choice and then cross out the rest of the column and row. Continue in this way until there are no unmarked numbers left to choose.</p>
<p>The sum of the circled numbers is 57. How did I know that? How did Martin construct the matrix?</p>
<p><strong>Update 2010-05-29:</strong> Here is Martin&#8217;s own explanation of the 1957 puzzle:</p>
<blockquote>
<p>Like most tricks, this one is absurdly simple when explained. The square is nothing more than an old-fashioned addition table, arranged in a tricky way. The table is generated by two sets of numbers: 12, 1, 4, 18, 0 and 7, 0, 4, 9, 2. The sum of these numbers is 57. If you write the first set of numbers horizontally above the top row of the square, and the second set vertically beside the first column [see Fig. 9], you can see at once how the numbers in the cells are determined. The number in the first cell (top row, first column) is the sum of 12 and 7, and so on through the square.</p>
</blockquote>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/05/gardner-addition-table.png" border="0" alt="Gardner-addition-table.png" width="357" height="369" /></p>
<blockquote>
<p>You can construct a magic square of this kind as large as you like and with any combination of numbers you choose. It does not matter in the least how many cells the square contains or what numbers are used for generating it. They may be positive or negative, integers or fractions, rationals or irrationals. The resulting table will always possess the magic property of forcing a number by the procedure described, and this number will always be the sum of the two sets of numbers that generate the table.</p>
</blockquote>
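<p>The forcing property is easy to check by brute force. What follows is a minimal Python sketch offered as an illustration, not anything from the 1957 column; the function names are hypothetical. It builds the addition table from the two generating sets quoted above and totals every possible way of choosing one number from each row and column.</p>

```python
from itertools import permutations

def addition_table(col_gens, row_gens):
    """Cell (r, c) holds row_gens[r] + col_gens[c]."""
    return [[r + c for c in col_gens] for r in row_gens]

def forced_sums(table):
    """Return the set of totals over every legal selection.

    Circling a number and crossing out its row and column, repeated
    until nothing is left, picks exactly one cell per row and column,
    which amounts to a permutation of the column indices.
    """
    n = len(table)
    return {sum(table[r][p[r]] for r in range(n))
            for p in permutations(range(n))}

col_gens = [12, 1, 4, 18, 0]   # written above the top row
row_gens = [7, 0, 4, 9, 2]     # written beside the first column
table = addition_table(col_gens, row_gens)
print(table[0][0])          # 19, the sum of 12 and 7
print(forced_sums(table))   # {57}
```

<p>Every selection takes one cell from each row and each column, so each generating number is counted exactly once. That is why all 120 selections give the same total.</p>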
]]></content:encoded>
			<wfw:commentRss>http://bit-player.org/2010/a-shy-woodland-creature/feed</wfw:commentRss>
		</item>
		<item>
		<title>The big blip</title>
		<link>http://bit-player.org/2010/the-big-blip</link>
		<comments>http://bit-player.org/2010/the-big-blip#comments</comments>
		<pubDate>Mon, 24 May 2010 13:07:48 +0000</pubDate>
		<dc:creator>brian</dc:creator>
		
		<category><![CDATA[social science]]></category>

		<guid isPermaLink="false">http://bit-player.org/?p=658</guid>
		<description><![CDATA[If you were an astute or lucky stock trader on the afternoon of May 6, you could have bought shares of Accenture PLC for a penny each and sold them a minute later for almost $40. Or you could have invested in Sotheby&#8217;s for about $30 a share and, if your timing was right, sold [...]]]></description>
			<content:encoded><![CDATA[<p>If you were an astute or lucky stock trader on the afternoon of May 6, you could have bought shares of Accenture PLC for a penny each and sold them a minute later for almost $40. Or you could have invested in Sotheby&#8217;s for about $30 a share and, if your timing was right, sold out at a price of $99,999.9999. Did you miss those moneymaking opportunities? Don&#8217;t kick yourself too hard. Those particular trades were canceled by the exchanges as &#8220;clearly erroneous.&#8221; But millions of other bizarre transactions were allowed to stand, even though prices were fluctuating wildly.</p>
<p>A <a href="http://www.sec.gov/sec-cftc-prelimreport.pdf">preliminary report</a> on these events was released last week by a joint committee of the Commodity Futures Trading Commission and the Securities and Exchange Commission. The report reads a lot like an inquiry into an airplane crash, evoking both horror&nbsp;and&nbsp;fascination. But whereas the investigators of aircraft accidents usually come up with a likely cause, the CFTC/SEC committee makes clear that they don&#8217;t yet understand what happened on May 6, and it seems possible we&#8217;ll never know.</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/05/daylong-avg-prices.png" border="0" alt="daylong-avg-prices.png" width="450" height="250" /></p>
<p>Throughout that day, stock prices were trending lower, a decline attributed mainly to worries about the European economy. But those concerns can&#8217;t account for the extraordinary crevasse the market fell into and then climbed out of between 2:30 and 3:00 p.m. The Dow Jones Industrial Average (<em>blue</em>) and the Standard and Poor&#8217;s 500 index (<em>green</em>) both lost 6 or 7 percent of their value in less than 10 minutes, then gained it all back. If those price changes are extrapolated to all U.S. stocks, something like a trillion dollars went missing for half an hour. (The red line in the graph, labeled E-Mini S&amp;P 500, refers to a stock futures contract, which I&#8217;ll discuss below.)</p>
<p>What could cause such rapid whipsawing? The first speculations implicated a &#8220;fat-finger trade&#8221;&mdash;a data-entry error. There have been several such events in recent years; for example,&nbsp;in 2005&nbsp;a Japanese broker who meant to sell 1 share of stock at a price of 610,000 yen keyed in instructions to sell 610,000 shares at 1 yen. However, the committee finds no evidence of such goofs on May 6.</p>
<p>The committee also dismisses the Procter &amp; Gamble theory, put forward by commentators on CNBC who noticed a particularly sharp break in the stock of that company (one of the 30 Dow components).</p>
<blockquote>
<p>The decline in PG did not begin until 2:44 p.m., well after the broader market indices, which began their precipitous drop at approximately 2:40 p.m. Accordingly, early reports that an inordinately large trade in PG may have triggered the broad market decline do not appear well founded.</p>
</blockquote>
<p>Various kinds of deliberate mischief have also been mentioned as possible causes. Maybe some secretive hedge fund has found a way to manipulate the market to its own advantage. Or a hacker might have infiltrated the computer networks that handle stock transactions. The glitch could even be an act of international terrorism. Again, the committee finds no signs of such malevolence but can&#8217;t entirely rule out the possibility.</p>
<p>The committee gives closer scrutiny to high-volume trading on the stock futures market, and in particular to the E-Mini S&amp;P 500 futures, which offer a mechanism for betting on the value of the S&amp;P 500 index a few weeks in the future. Traffic in S&amp;P 500 futures was unusually heavy on May 6, and it spiked at the time of the big dip:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/05/e-mini-price-and-volume.png" border="0" alt="E-mini-price-and-volume.png" width="450" height="294" /></p>
<p>The price excursions were wide enough to trigger a &#8220;Stop Logic&#8221; system that halted trading for five seconds.&nbsp;Furthermore, transactions initiated by a single firm accounted for some 9 percent of the trading volume in the critical half-hour, and all of that firm&#8217;s activity was on the selling side. (The committee report does not name this firm, but <a href="http://www.nytimes.com/2010/05/15/business/15trader.html">others</a> have identified it as Waddell &amp; Reed, a mutual fund in Overland Park, Kansas.) So, do we blame it all on a mutual fund run amok in the KC suburbs? The committee thinks further investigation is warranted, but they also note that the same firm has made similar trades in the past, as have many other parties, all without causing a ripple in the wider market.</p>
<p>Two more items of Wall Street arcana that get a lot of attention in the report are&nbsp;stop-loss orders and&nbsp;stub quotes. A stop-loss order causes a stock to be sold automatically if the price falls below a specified threshold. Traders enter such orders in the expectation that the sale will take place at a price near the threshold level, but if prices are falling rapidly, there&#8217;s no assurance of that. For a few minutes on May 6, certain stop-loss orders had the effect not of stopping losses but of maximizing them. At the instant when the orders were executed, there were no purchase offers at any price higher than a penny, and so that&#8217;s the price the stocks sold for.&nbsp;The offers of $0.01 are thought to have been &#8220;stub quotes,&#8221; placed by brokers who act as market-makers and who are therefore obliged always to have both buy and sell orders in place. Stub quotes are a way of meeting this obligation at times when the broker doesn&#8217;t really want to be in the market. Trades are never supposed to be executed at the stub price, but that&#8217;s what happens if no one else is buying. (Transactions at $100,000 per share reflect stub quotes at the other end of the scale, for shares that no one else is willing to sell.)</p>
<p class="centered">&#8226;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&#8226;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&#8226;</p>
<p>If the&nbsp;Commodity Futures Trading Commission and the Securities and Exchange Commission don&#8217;t know what went wrong on May 6, then I&#8217;m sure I don&#8217;t know either. But a couple of&nbsp;points seem pretty obvious (which may be why the committee left them unstated).</p>
<p>First, whatever happened on May 6 must have been driven by the internal dynamics&nbsp;of the securities markets, not by events in the larger economy. No changes in the business prospects of Accenture PLC would justify 4,000-fold swings in the company&#8217;s market value within half an hour.</p>
<p>Second, there&#8217;s got to be some instability at work here&mdash;some positive feedback loop. A thousand-point dip in the Dow wasn&#8217;t just a freak coincidence, where millions of stockholders acting independently all chose to sell at the same moment, and then a few minutes later changed their minds and decided to buy. Rather, there must have been some mechanism whereby one trader&#8217;s decision to buy or sell induced other traders to do the same.</p>
<p>The committee report points out that stop-loss orders create one such destabilizing loop, which is hard-wired into the market machinery. If a stop-loss order on a particular stock is activated at $100, say, the sale of those shares might drive the market price down to $95, triggering more stop-loss orders and lowering the price still further, in a runaway cascade. More generally, any trading strategy that calls for following trends or tracking &#8220;market momentum&#8221; is susceptible to this kind of instability. For any one individual, selling out when the market sags may or may not be a prudent policy; but if <em>everyone</em> adopts such a rule, the outcome is certain disaster.</p>
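<p>The runaway cascade can be sketched in a few lines of Python. This is a toy model under loud assumptions, not anything taken from the committee report: every triggered sale lowers the price by the same fixed amount, and the order book and function name are invented for illustration.</p>

```python
def stop_loss_cascade(price, stops, impact):
    """Toy model: every triggered stop-loss sale lowers the price by `impact`.

    `stops` lists the trigger prices of resting stop-loss orders. An order
    fires once the price is at or below its trigger. Returns the final
    price and the number of orders that fired.
    """
    pending = sorted(stops, reverse=True)   # highest trigger fires first
    fired = 0
    while pending and pending[0] >= price:
        pending.pop(0)
        price -= impact
        fired += 1
    return price, fired

# Hypothetical book: stops every $5 from $100 down to $80; each sale costs $5.
print(stop_loss_cascade(100, [100, 95, 90, 85, 80], 5))   # (75, 5)
# Same book, but the price sits just above the first trigger: nothing fires.
print(stop_loss_cascade(101, [100, 95, 90, 85, 80], 5))   # (101, 0)
```

<p>In this sketch a one-dollar difference in the starting price separates a quiet market from a 25 percent slide, which is the signature of a positive feedback loop.</p>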
<p>Positive feedbacks of some kind surely had a role in the crash of May 6, but they can&#8217;t be the whole story. If a wave of self-reinforcing selling accounts for the sudden dive in prices, what explains the equally sudden turnaround and recovery? And there&#8217;s an even deeper question. It&#8217;s not hard&nbsp;to dream up models in which every random fluctuation is amplified by positive feedback, but the result is an economy that experiences weird jolts and hiccoughs all the time. A useful theory of May 6 has to explain not only what happened on that day but also&nbsp;why it doesn&#8217;t happen routinely.</p>
<p>Some analysts have compared the May 6 event with the <a href="http://www.federalreserve.gov/pubs/feds/2007/200713/200713pap.pdf">stock market crash of October 1987,</a> which was even deeper than the recent dip, although it played out over a period of days rather than minutes. I have vivid memories of this event; I followed it on the radio (no CNBC in those days) and then I read the post-mortem reports. But apparently my memory is faulty in certain crucial details. The crash was&nbsp;blamed in large part on &#8220;program trading,&#8221; which I took to mean that computer programs were making buy and sell decisions in real time. The root of the problem, as I understood it then, was that multiple programs controlling large investments all shared the same basic logic, so that they would all react in the same way to changing market conditions. It turns out, though, that the computing&nbsp;machinery of the time was not up to operating in this online regime. Instead, the economic models were run in batch mode, and the trades were executed after the fact. There were people in the loop.</p>
<p>Today, in contrast, thousands of computers&nbsp;are plugged directly into the markets, and program trading is everywhere. The big hedge funds and other major players install their servers in colocation facilities next door to the major exchanges, as a way of reducing communication latency. For &#8220;high frequency traders,&#8221; transactions are routinely completed in about a third of a millisecond. From the point of view of these firms, the sudden market collapse on May 6 played out in slow motion. During the 10 minutes of tumbling prices, a trading rate of three transactions per millisecond allows time for 1.8 million transactions.</p>
<p>Perhaps, then, the much-feared runaway automation of 1987 has finally caught up with us in 2010.&nbsp;Ironically, though, the&nbsp;CFTC/SEC report hints that if&nbsp;automated trading&nbsp;was behind the May 6 glitch, the problem might not be the presence of these traders but rather their sudden withdrawal from the market. Julie Creswell tells the story in <a href="http://www.nytimes.com/2010/05/17/business/17trade.html">The New York Times</a>:</p>
<blockquote>
<p>RED BANK, N.J. &mdash; Above the Restoration Hardware in this Jersey Shore town, not far from the Navesink River, lurks a Wall Street giant.</p>
<p>Here, inside the humdrum offices of a tiny trading firm called Tradeworx, workers in their 20s and 30s in jeans and T-shirts quietly tend high-speed computers that typically buy and sell 80 million shares a day.</p>
<p>But on the afternoon of May 6, as the stock market began to plunge in the &ldquo;flash crash,&rdquo; someone here walked up to one of those computers and typed the command HF STOP: sell everything, and shut down.</p>
</blockquote>
<p>According to Creswell, high-frequency traders account for between 40 and 70 percent of all the trading volume on U.S. securities markets, so the sudden departure of these market participants would certainly have a noticeable effect.</p>
<p>Almost everything about the stock market has changed utterly in the years since 1987.&nbsp;Back then, trading was done by guys in colorful blazers yelling at one another on the floor of the New York Stock Exchange. That trading floor still exists, but it&#8217;s a kind of Wall Street theme park, maintained for the benefit of visiting high school classes and CNBC cameras. Most of the actual trading in NYSE stocks is done across the river in Jersey City by electronic&nbsp;&#8220;matching engines&#8221; that line up offers to sell with bids to buy. Once there were &#8220;specialists&#8221; in each stock who were expected to intervene with their own capital to damp out unwarranted price fluctuations. That role has not disappeared entirely, but in most modern markets no one has legal responsibility for maintaining stability. In 1987 most stocks could be bought and sold in only one venue; now, transactions are automatically routed to whatever exchange offers the best terms, including the ominously named &#8220;dark pools,&#8221; where shares change hands anonymously. Back then, brokerage fees and other transaction costs were high enough to discourage strategies such as high-frequency trading; now there is much less friction in the market. It&#8217;s a new world.</p>
<p>Even though the CFTC and the SEC have not yet sorted out the causes of the May 6 blip, they are already proposing remedies. The basic tool is the time out: When the market throws a tantrum, it will be told to sit in the corner for a few minutes. Many such rules already exist, some of them going back to 1987. The rationale is that a pause in trading will allow time for &#8220;additional liquidity to enter the market.&#8221; In other words, if everyone is selling in a panic, we wait a little while for some buyers to show up. Of course the pause&nbsp;might also allow time for more sellers to join the stampede.</p>
<p>A year ago, I <a href="http://amsciadmin.eresources.com/libraries/documents/2009491133257238-2009-05Hayes.pdf">was writing</a> about the uneasy relations between economics and the engineering discipline known as control theory. That was in the&nbsp;context of macroeconomics, where the aim is to control cycles of boom and bust with a time scale of years or decades. The challenges of controlling securities markets are rather different: The time scale is much shorter, which means you have to act quicker, but on the other hand it&#8217;s much easier to measure what&#8217;s happening, to gather information second by second. But the&nbsp;biggest impediment to effective control is the same in both cases: It&#8217;s hard to control the dynamics of a system when you don&#8217;t understand those dynamics&mdash;when you can&#8217;t reliably predict what the system will do in the absence of control or how it will respond to control actions. Given the human element in economic affairs&mdash;including the likely presence of actors who will try to subvert any control strategy&mdash;it&#8217;s not clear that we can ever have that kind of predictive power.</p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://bit-player.org/2010/the-big-blip/feed</wfw:commentRss>
		</item>
		<item>
		<title>A new Handbook</title>
		<link>http://bit-player.org/2010/a-new-handbook</link>
		<comments>http://bit-player.org/2010/a-new-handbook#comments</comments>
		<pubDate>Mon, 17 May 2010 17:42:05 +0000</pubDate>
		<dc:creator>brian</dc:creator>
		
		<category><![CDATA[mathematics]]></category>

		<guid isPermaLink="false">http://bit-player.org/?p=654</guid>
		<description><![CDATA[The Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables (better known as Abramowitz and Stegun) is&#160;a much-storied book. Not that it&#8217;s a&#160;book full of stories; truth is, there&#8217;s not much of a narrative thread running through those formulas, graphs and mathematical tables. But it&#8217;s a book with a story behind it.
The story began [...]]]></description>
			<content:encoded><![CDATA[<p>The <em>Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables</em> (better known as Abramowitz and Stegun) is&nbsp;a much-storied book. Not that it&#8217;s a&nbsp;book full of stories; truth is, there&#8217;s not much of a narrative thread running through those formulas, graphs and mathematical tables. But it&#8217;s a book with a story behind it.</p>
<p>The story began in the late 1930s, when the Mathematical Tables Project was launched in a factory building on the West Side of Manhattan. Supported&nbsp;by the Works Progress Administration, the Tables Project had dual aims: first, preparing high-quality tables of trigonometric functions, logarithms, and the like; and, second, providing work for unemployed New Yorkers. The 450 human computers hired for the project were chosen more on the basis of need than skill, and the work was done on a kind of numerical assembly line. According to David Alan Grier:</p>
<blockquote>
<p>Each group was taught to perform a single arithmetic operation. One group knew how to add positive numbers, a second to subtract, and the third to multiply single digits. The last and most sophisticated group did long division.</p>
</blockquote>
<p>I have to admit there&#8217;s a certain nightmare aspect to this scene. Working in a numbers factory sounds no more appealing than stamping sheet metal all day, although there was less danger of getting a finger crushed in the machinery.</p>
<p>A few years later, the Tables Project was swept up in war work; then, afterward, many of the key personnel&nbsp;moved to the National Bureau of Standards (now NIST, the National Institute of Standards and Technology). There they conceived the <em>Handbook</em>. Apparently the initial plan&nbsp;was a&nbsp;greatest-hits album of tables,&nbsp;but by the early 1950s the future of table-making was looking pretty dim. And so the project changed course and put more emphasis on the mathematical functions that lay behind the tables&mdash;familiar functions such as logarithms, more specialized ones such as Bessel functions and some more recondite topics such as Mathieu functions and orthogonal polynomials. There were still tables listing numeric values of functions, but the <em>Handbook</em> also presented the mathematics you would need to evaluate (or approximate) the function for yourself.</p>
<p>The editors in charge of the <em>Handbook</em> were Milton Abramowitz and Irene Stegun, both veterans of the New York table office. They recruited about 30 young mathematicians to write chapters. In 1958 Abramowitz died suddenly; Stegun saw the project through to publication in 1964.</p>
<p>The <em>Handbook</em> seems an unlikely best-seller, but the U.S. government has distributed more than 150,000 copies, and editions from other publishers are estimated to bring the total copies in print to something near a million. (As a product of government work, the <em>Handbook</em> is not covered by copyright; there are <a href="http://www.math.ucla.edu/~cbm/aands/">scanned versions</a> on the web.)</p>
<p>Announced last week is a new <em>Handbook</em>, officially retitled the <em>NIST Handbook of Mathematical Functions</em>. The ink-and-paper version, which I have not yet seen, is published by <a href="http://www.cambridge.org/catalogue/catalogue.asp?isbn=9780521192255">Cambridge University Press</a>. Perhaps even more interesting is the web edition, called the&nbsp;<a href="http://dlmf.nist.gov/">NIST Digital Library of Mathematical Functions</a> (DLMF), which I have just begun to explore. It is recognizably the same book as Abramowitz and Stegun, with the same terse style of presentation. But much has changed. The hundreds of pages of tables are finally gone; this is not the place to look up the sine of 23 degrees. But there are handsome color graphics now, and a new emphasis on methods of computation, including pointers to recommended software. And the selection of topics has expanded somewhat. For example, there are new chapters on the Painlev&eacute; equations and on functions whose argument is a matrix. Elsewhere, the Lambert W function (a <a href="http://amsciadmin.eresources.com/libraries/documents/2005216151419_306.pdf">personal favorite</a> of mine) is a newcomer to the chapter on elementary functions.</p>
<p>Apart from the content, the DLMF is interesting as an experiment in presenting mathematics on the web. It&#8217;s the most ambitious project I&#8217;ve seen based on <a href="http://www.w3.org/Math/">MathML</a>, and it seems to work well, at least when viewed in recent versions of Firefox. (In other browsers I&#8217;ve tried, MathML gets garbled, but the equations can be displayed as images and are still quite readable in that format&mdash;even with an ancient version of Internet Explorer.) Here&#8217;s part of a page as seen in Firefox:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/05/dlmf-zeta-450.jpg" border="0" alt="DLMF-zeta-450.jpg" width="450" height="569" /></p>
<p>Mousing over the &#8220;i&#8221; icon at the right margin provides access to encodings of the equations in MathML or TeX and as PNG images, as well as definitions, cross-references and such. Very slick.</p>
<p>The editor in chief of the new <em>Handbook</em> is Frank W. J. Olver of the&nbsp;University of Maryland College Park and NIST, whom I have mentioned before both <a href="http://bit-player.org/2009/outnumbered">here at bit-player</a> and <a href="http://amsciadmin.eresources.com/libraries/documents/20097301410207456-2009-09Hayes.pdf">in </a><em><a href="http://amsciadmin.eresources.com/libraries/documents/20097301410207456-2009-09Hayes.pdf">American Scientist</a></em>. As a young mathematician, half a century ago, Olver wrote one of the <em>Handbook</em> chapters on Bessel functions. (As an even younger mathematician, in the 1940s, he worked with Alan Turing at the National Physical Laboratory in Britain.) There are three more principal editors:&nbsp;Daniel W. Lozier,&nbsp;Ronald F. Boisvert and Charles W. Clark, all of NIST, as well as a long roster of associate editors and domain experts. It&#8217;s too soon to say whether some combination of these names will eventually replace the moniker &#8220;Abramowitz and Stegun.&#8221;</p>
<p>Notes: The quotation above from David Alan Grier appears in &#8220;The Math Tables Project of the Work Projects Administration: The Reluctant Start of the Computing Era,&#8221;&nbsp;<em>IEEE Annals of the History of Computing</em>, Vol. 20, No. 3, 1998. Grier has also written a profile of Stegun in &#8220;Irene Stegun, the <em>Handbook of Mathematical Functions</em>, and the Lingering Influence of the New Deal,&#8221; <em>American Mathematical Monthly</em>, August-September 2006. Boisvert and Lozier have written a <a href="http://nvl.nist.gov/pub/nistpubs/sp958-lide/135-139.pdf">brief account</a> of the history of the <em>Handbook</em>. (Oddly, the images in this PDF file are negatives.)</p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://bit-player.org/2010/a-new-handbook/feed</wfw:commentRss>
		</item>
		<item>
		<title>Pilgrim&#8217;s Progress</title>
		<link>http://bit-player.org/2010/pilgrims-progress</link>
		<comments>http://bit-player.org/2010/pilgrims-progress#comments</comments>
		<pubDate>Sun, 09 May 2010 00:39:03 +0000</pubDate>
		<dc:creator>brian</dc:creator>
		
		<category><![CDATA[modern life]]></category>

		<guid isPermaLink="false">http://bit-player.org/?p=647</guid>
		<description><![CDATA[In the past few weeks I&#8217;ve had little time for bit-playing; I&#8217;ve been playing with atoms instead. I&#8217;ve been sorting and packing and toting atoms, then hauling them, rearranging them, offloading them, storing them. Lots and lots of atoms: maybe 10^29. Bits are so much easier to handle. My bitly possessions&#8212;some hundreds of gigabytes&#8212;fit comfortably [...]]]></description>
			<content:encoded><![CDATA[<p>In the past few weeks I&#8217;ve had little time for bit-playing; I&#8217;ve been playing with atoms instead. I&#8217;ve been sorting and packing and toting atoms, then hauling them, rearranging them, offloading them, storing them. Lots and lots of atoms: maybe 10<sup>29</sup>. Bits are so much easier to handle. My bitly possessions&mdash;some hundreds of gigabytes&mdash;fit comfortably in a shirt pocket. My atomic chattels are a bulkier burden. </p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/05/pilgrims-progress1.png" title="I dreamed, and behold I saw a Man clothed with Raggs standing in a certain place, with his face from his own House, a Book in his hand, and a great burden upon his Back." alt="I dreamed, and behold I saw a Man clothed with Raggs standing in a certain place, with his face from his own House, a Book in his hand, and a great burden upon his Back." border="0" width="450" height="454" /></p>
<p>All this atom-pushing was done in the course of moving my household from Durham, North Carolina, to Cambridge, Massachusetts. I&#8217;ve moved before, but this time the experience was unusually physical. In vacating my Durham home, I carried all my belongings out of the house and onto a truck&mdash;and I did it without helpers and without the use of carts or dollies or other wheeled implements. In other words, everything I own (except a car) I have now lifted up, cradled in my arms, carried 50 feet or more, and set down again. It took a day and a half.</p>
<p>What a goofy thing to do, eh? I even turned away offers of help. I guess I&#8217;m just a <a href="http://bit-player.org/bph-publications/Sciences-1991-03-Hayes-do-it-yourself.pdf">do-it-yourself</a> kind of guy. In a strange way the labor was worth it: The process gave me a vivid and visceral sense of how stuff accumulates over a lifetime&mdash;how my possessions possess me. Late in the afternoon of the second day, as I loaded up the last few items and shut the door of the truck, there was a bright glow of satisfaction and accomplishment. </p>
<p>Yet I never want to do it again. When I arrived in Cambridge, I did <em>not</em> refuse the generous help of a younger, stronger friend. (Thanks, Mici!)</p>
<p>The total weight of my load was roughly two and a half metric tons. At least two tons of that mass consisted of &#8220;information goods&#8221;&mdash;books, periodicals, manuscripts and proofs, file drawers full of paper documents, photographs, musical recordings in various formats, art works. And this was the residue remaining after a two-year effort to lighten my load, mainly by transforming atoms into bits. In particular, I had scanned 22 drawers full of files, converting paper into PDFs and then recycling all the cellulose.</p>
<p>Before I lug my belongings out of <em>this</em> dwelling, I vow to jettison another ton or more. If only I could figure out how to digitize my clothes or my pots and pans.</p>
<p>As for my new home, everyone knows that Cambridge is the intellectual capital of North America. But I didn&#8217;t quite realize how high the standard had become. The sign in the photo below is on the garden gate next door.</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2010/05/dogskeepgateclosed.jpg" alt="Dogs: Keep Gate Closed" border="0" width="450" height="375" /></p>
<p>The literate local canines seem to comply with this order, since the gate is always closed.</p>
]]></content:encoded>
			<wfw:commentRss>http://bit-player.org/2010/pilgrims-progress/feed</wfw:commentRss>
		</item>
	</channel>
</rss>
