Archive for the ‘problems and puzzles’ Category

Taxation without rationalization

Friday, February 24th, 2006

I am the child of a bookkeeper, and I’ve inherited the habit of double-checking receipts and balancing accounts. My friends make fun of me when I carefully note down the dime that I put in a parking meter, but lately I’ve been fretting over even smaller sums—charges that come to less than a penny. It’s all about sales tax.

For the benefit of those who live outside the U.S. (or in New Hampshire) I should explain how American sales tax works. The price marked on an item in a store excludes the tax, which is added at the cash register when you pay for your purchases. Where I live, in North Carolina, the tax rate for most merchandise is 7 percent. Thus if I buy a magazine with a cover price of $3.95, I’ll actually pay $3.95 + (0.07 × $3.95), or $4.2265. Fractions of a cent are rounded to the nearest penny, making the final price $4.23. The rounding is what I’ve been worrying about.

If rounding is done fairly, one might expect that the tax would be rounded up and rounded down with roughly equal frequency, and everything would come out even in the end. But for a long time I’ve been noticing that the sales tax on my purchases is almost always rounded up; very seldom, it seems, does my shopping basket produce a total for which the tax amount needs to be rounded down. Is this bias real, or do I just imagine that numbers are conspiring against me? As I was gathering up papers for a different kind of tax—the annual income-tax ritual—I decided to find out.

For any tax rate that’s an integer percentage, all possible rounding situations are covered by considering prices modulo $1. In other words, if the tax on $0.xy rounds in a certain way, then $1.xy will get exactly the same treatment, and likewise $2.xy, $3.xy, and so on. Thus for purposes of tax rounding, we can pretend that all prices lie between $0.00 and $0.99. In the sales-tax tables (PDF) issued by the state of North Carolina, the tax on 50 of these amounts is rounded up; in 49 cases it is rounded down; the one remaining case needs no rounding in either direction. This rounding protocol introduces a very slight upward bias, amounting to $0.00005 when averaged over all possible prices. Apart from this tiny asymmetry, the system looks like it ought to be fair if purchase prices are uniformly distributed among the 100 possibilities.
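The counts in that paragraph can be checked with a few lines of Python. This is only a sketch: it assumes an exact half cent is rounded upward, which is what the 50/49/1 split implies, and the function names are invented for the illustration. Amounts are kept as integers (cents, and hundredths of a cent) to sidestep floating-point error.

```python
RATE_PERCENT = 7  # integer tax rate, as in the North Carolina example

def rounded_tax_cents(price_cents, rate_percent=RATE_PERCENT):
    """Tax in whole cents, rounding an exact half cent upward (assumed rule)."""
    return (price_cents * rate_percent + 50) // 100

def classify(price_cents, rate_percent=RATE_PERCENT):
    """Return 'up', 'down', or 'exact' for the rounding of the tax."""
    remainder = (price_cents * rate_percent) % 100  # hundredths of a cent
    if remainder == 0:
        return "exact"
    return "up" if remainder >= 50 else "down"

counts = {"up": 0, "down": 0, "exact": 0}
bias_hundredths = 0  # sum of (rounded - exact), in hundredths of a cent
for price in range(100):  # $0.00 through $0.99
    counts[classify(price)] += 1
    bias_hundredths += rounded_tax_cents(price) * 100 - price * RATE_PERCENT

# hundredths of a cent -> dollars (divide by 10,000), then average over 100 prices
mean_bias_dollars = bias_hundredths / 10_000 / 100

print(counts)             # tallies of up / down / exact
print(mean_bias_dollars)  # average upward bias per price, in dollars
```

The only price whose tax lands exactly on a half cent is $0.50 (tax 3.5 cents), and that single case accounts for the whole asymmetry.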

The problem, of course, is that the distribution of prices is far from uniform. I took a fat envelope full of sales receipts from 2005 and recorded the prices of all items subject to the 7 percent North Carolina tax. I was able to identify 471 items. Here is the distribution of their prices, modulo $1:

bar graph of item prices modulo $1

Red bars are prices for which the tax is rounded up, blue bars represent prices whose tax rounds down; the tax on $0.00 (black bar) is exact. The source of the rounding bias that I’ve suspected is immediately obvious: Almost two-thirds of the items I purchased over the course of the year had a price (modulo $1) of 99 cents, which happens to be an amount for which the tax rounds upward. I’ve always heard that the popularity of 99-cent pricing has a psychological premise: Shaving that penny off makes things seem cheaper and therefore encourages sales. Could there be another motive as well?

Before I start accusing shopkeepers of a nefarious plot to mulct American consumers, there’s a complicating factor to consider. Sales tax is not calculated separately on each item purchased; instead, when you dump your basketful of goods at the checkout counter, the prices are totalled, and the tax is calculated only once, on the sum. (In accounting lingo, the calculation is done “per invoice” not “per item.”) It turns out that my 471 taxable purchases were grouped into 195 invoices. (On a typical trip to the store I bought 2.4 items.) When I reanalyze my receipts on this basis, the 99-cent effect is somewhat softened:

bar chart of number of invoices as a function of invoice total modulo $1

Nevertheless, the invoice totals are still strongly clustered in the region between $0.93 and $0.99, where tax amounts are all rounded upward.

With these data in hand I can now calculate the total “roundage,” which I’m going to define as N(T_rounded − T_exact), where N is the number of transactions at a given invoice amount, T_exact is the exact tax on that amount, and T_rounded is the tax after rounding to the nearest cent. With this convention, the roundage is positive when it favors the merchant and negative when it favors me.
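Here is a sketch of the roundage tally, with the sign chosen so that a positive value favors the merchant, as the convention requires. The invoice totals are made up for the illustration; they are not taken from the 2005 receipts. The function name is likewise an invention.

```python
from collections import Counter

RATE_PERCENT = 7

def roundage_hundredths(total_cents, rate_percent=RATE_PERCENT):
    """Rounded tax minus exact tax for one invoice, in hundredths of a cent."""
    exact = total_cents * rate_percent   # exact tax, hundredths of a cent
    rounded = (exact + 50) // 100 * 100  # nearest whole cent, half rounds up
    return rounded - exact

# Hypothetical invoice totals in cents (NOT the author's data).
invoices = [1299, 1299, 479, 2050, 305]

# With an integer rate only the cents part matters, so tally invoices modulo $1.
tally = Counter(total % 100 for total in invoices)

# N times the per-invoice roundage for each residue.
roundage_by_residue = {
    residue: n * roundage_hundredths(residue)
    for residue, n in sorted(tally.items())
}
total_roundage_cents = sum(roundage_by_residue.values()) / 100

print(roundage_by_residue)
print(total_roundage_cents)
```

For this toy list, the two $12.99 invoices, the $4.79 invoice and the $20.50 invoice all favor the merchant, and the $3.05 invoice favors the shopper.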

bar chart of total roundage

An interesting statistical question is whether this pattern offers any evidence of a deliberate policy of setting prices in order to maximize roundage for the merchant. Obviously the prevalence of prices just under a dollar has this effect, but since prices in that range have other possible justifications, they can’t support any firm conclusion. Are the sharp upward spikes at 79 cents and 50 cents a result of careful strategizing, or are they just random accidents? I have not tried to answer that question, and I don’t think I have enough data to do so. It should be noted that if you exclude the region of the graph between $0.93 and $0.99, the net roundage of the remaining transactions is slightly in my favor.

How much money are we talking about here? For calendar year 2005, based on the 195 transactions I was able to document, the skewed rounding of sales tax cost me 13.43 cents. This deficit is probably not the biggest hole in my pocket. On the other hand, if my case is typical, and if we extrapolate my loss to the entire U.S. population, that comes to $40 or $50 million slipping through the cracks.

Where is the insidious excess of roundage going to? That’s the question. If merchants are required to remit to the state the exact amount of tax actually collected, then the bias in roundage merely raises the effective tax rate a tad. No big deal. As far as I can tell, however, that’s not what happens—at least not here in North Carolina. The form on which sales tax is reported asks for total receipts (exclusive of tax collected) and then multiplies this amount by the appropriate percentage. In this way all of a merchant’s sales over an entire month or quarter are lumped together before any rounding is done, and any bias in the roundage of individual transactions will not be taken into account. The form does include a line for “excess collections,” and so a conscientious merchant could keep track of the roundage and report it there. But the instructions that accompany the form do not mention this possibility, and my supposition is that the excess roundage usually stays with the retailer.

We can fight back, though. Using my sample of 471 item prices, I ran a Monte Carlo simulation to estimate the average roundage as a function of the number of items purchased on each shopping trip:

roundage per year as a function of number of items per shopping trip

The model assumes that I buy exactly k items, selected at random (with replacement) from the set of 471 prices in the 2005 data set. The sales-tax roundage on this shopping basket is calculated, and then is multiplied by 471/k, to give the total expected roundage for the year. The graph shows averages of 10^6 repetitions of this process for each value of k from 1 through 25. The lesson is clear: If I bought 9 or 10 items every time I went to the store, I would come out slightly ahead in the roundage game.
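A rough sketch of a simulation along these lines follows. It is not the code behind the graph: the price list is a small made-up stand-in for the 471 recorded prices, the repetition count is cut far down so that it runs quickly, and the function names are invented. Its output should therefore not be expected to match the figure.

```python
import random

RATE_PERCENT = 7

def roundage_cents(total_cents, rate_percent=RATE_PERCENT):
    """Rounded tax minus exact tax on one basket, in cents (half rounds up)."""
    exact = total_cents * rate_percent  # hundredths of a cent
    return ((exact + 50) // 100 * 100 - exact) / 100

def expected_yearly_roundage(prices, k, repetitions, rng):
    """Average roundage per year if every shopping trip buys exactly k items."""
    n_items = len(prices)
    total = 0.0
    for _ in range(repetitions):
        basket = rng.choices(prices, k=k)  # k prices drawn with replacement
        # One basket's roundage, scaled up to the n_items / k trips in a year.
        total += roundage_cents(sum(basket)) * n_items / k
    return total / repetitions

# Hypothetical prices in cents, weighted toward x.99 (NOT the 2005 data set).
prices = [99] * 6 + [199, 249, 50, 79]

rng = random.Random(2005)  # fixed seed so that runs are repeatable
for k in range(1, 11):
    estimate = expected_yearly_roundage(prices, k, 5_000, rng)
    print(k, round(estimate, 2))
```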

However, I have a better solution to propose, one that could make the system more resistant to manipulation by either the seller or the buyer. Any integer tax rate brings the curse of commensurability: a pattern of roundage that repeats every dollar, rewarding strategies such as setting all prices to $x.99. Commerce might be more interesting with an irrational tax rate. For example, if the tax percentage were not 7 but rather the square root of 50 (equal to 7.0710678118654755…), then it would be a little harder to rig prices so that the tax consistently rounds upward. Besides, it would help with a problem discussed in my February 20th post: Pundits could no longer claim that algebra has no role in ordinary life. Every time you buy a cup of coffee, you’d have to solve the quadratic equation x^2 − 50 = 0.
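To see the effect on the $x.99 strategy, here is a small sketch comparing the two rates on prices from $0.99 to $99.99. It uses floating-point arithmetic, which can only approximate an irrational rate, and it assumes a half cent rounds upward. The function name is made up for the illustration.

```python
import math

def rounds_up(price_cents, rate_percent):
    """True if the fractional cent of the tax is at least one half."""
    exact_tax = price_cents * rate_percent / 100  # tax in cents
    return exact_tax - math.floor(exact_tax) >= 0.5

prices = [100 * dollars + 99 for dollars in range(100)]  # $0.99 ... $99.99

for label, rate in [("7", 7.0), ("sqrt(50)", math.sqrt(50))]:
    ups = sum(rounds_up(p, rate) for p in prices)
    print(f"rate {label}%: {ups} of {len(prices)} taxes round up")
```

At 7 percent every one of these prices leaves the same fraction of a cent, 0.93, so the tax always rounds up. At the square root of 50 percent the fraction drifts from one dollar amount to the next, and some of the taxes round down.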

Further notes and questions.

The Monte Carlo model discussed above chooses k items at random. If you could plan a year’s worth of shopping in advance, you could list all N items you intend to buy and search for the best possible partitioning of these goods into k-item batches. How hard a computational problem is that? The brute-force approach (trying all possible partitionings) is exponential in N, but some such problems turn out to be easier than they look. What if you remove the constraint that each batch must have exactly k items?

Playing the game from the point of view of the vendor, what pricing policy will maximize roundage gains, given a buyer who is determined (at all costs!) to minimize them? Under a fixed, integer tax rate, is there any set of prices that guarantees a win for the merchant, no matter how the shopper assembles purchases into market baskets? Setting all prices to $x.00 enforces a trivial tie. Can the seller do better than that?

Instead of an irrational tax rate, we might try giving the dollar a prime number of cents. The obvious choice is 101 (in which case we might call them not cents but centunos). How would this affect roundage calculations?

The state of Ohio has recently changed its sales-tax regulations, allowing merchants to calculate tax either on a per-invoice or a per-item basis. The per-item option foils all the market-basket strategies for manipulating roundage, and it gives the merchant the potential of earning up to an extra $0.005 per item sold. It will be interesting to see whether pricing patterns change in Ohio.
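The difference between the two methods can be shown with a toy basket. This sketch assumes a 7 percent rate and half-cent-up rounding, purely to match the examples above; it makes no claim about Ohio's actual rate or rounding rules, and the basket is hypothetical.

```python
def rounded_tax(amount_cents, rate_percent=7):
    """Tax in whole cents on one amount, half a cent rounding upward."""
    return (amount_cents * rate_percent + 50) // 100

def tax_per_invoice(prices_cents):
    """Total the prices first, then compute and round the tax once."""
    return rounded_tax(sum(prices_cents))

def tax_per_item(prices_cents):
    """Compute and round the tax on each item, then add up the results."""
    return sum(rounded_tax(p) for p in prices_cents)

basket = [199, 79, 50]  # hypothetical purchase: $1.99, $0.79, $0.50

print(tax_per_invoice(basket))  # 23 cents
print(tax_per_item(basket))     # 24 cents
```

Each of these three items has a tax that rounds up on its own, so per-item rounding collects a cent more than rounding the tax on the $3.28 total.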

Best Friends

Monday, January 30th, 2006

Among children of a certain age, everyone has a best friend—and exactly one. Ideally, the best-friend relationship is symmetric: If I am your best friend, then you are my best friend, too. But symmetry is not guaranteed, and it can happen that I like you best, but you have someone else you like better than me. Sad, but life is like that sometimes.

We can model best-friendship geometrically by letting distance—or, rather, nearness—stand for intensity of affection. Sprinkle a bunch of points at random on a plane, and then draw an arrow from each point to its nearest neighbor, which we take to be the point’s best friend. When the best friends are mutual, a bidirectional arrow links the two points. In other cases, a chain of arrows points from a to b to c and so on. Here is a best-friend graph constructed in just this way:

best-friends-diagram

There are 100 dots in the diagram, which have spontaneously formed 30 constellations, or connected clusters. Some 40 of the dots represent disconsolate children who feel an attachment for someone who’s attached to someone else. (Note that some of the dots are so close that the arrows are obscured.)
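For anyone who wants to experiment, here is a minimal sketch of the construction, assuming points drawn uniformly from the unit square. The function names and the seed are arbitrary, and the nearest-neighbor search is the plain quadratic one, which is adequate for a hundred points. The counts it prints depend on the random points and need not match the diagram.

```python
import random

def best_friends(points):
    """For each point, the index of its nearest neighbor (its best friend)."""
    friends = []
    for i, (xi, yi) in enumerate(points):
        nearest = min(
            (j for j in range(len(points)) if j != i),
            key=lambda j: (points[j][0] - xi) ** 2 + (points[j][1] - yi) ** 2,
        )
        friends.append(nearest)
    return friends

def count_unrequited(friends):
    """Number of points whose best friend prefers someone else."""
    return sum(1 for i, f in enumerate(friends) if friends[f] != i)

def count_clusters(friends):
    """Number of connected clusters, ignoring arrow direction (union-find)."""
    parent = list(range(len(friends)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, f in enumerate(friends):
        parent[find(i)] = find(f)  # merge each point with its best friend
    return sum(1 for i in range(len(parent)) if find(i) == i)

rng = random.Random(30)  # fixed seed so that runs are repeatable
points = [(rng.random(), rng.random()) for _ in range(100)]
friends = best_friends(points)
print(count_unrequited(friends), count_clusters(friends))
```

Averaging these two counts over many random point sets, and over different numbers of points, is one way to start on the questions that follow.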

Questions.

  1. In general, what proportion of the children in this model are stuck in unrequited best-friendships?
  2. How many clusters form, on average?
  3. How do these quantities vary as a function of the overall number of children?
  4. Adding a new child to the class or the neighborhood could disrupt some existing friendships: If a new point is inserted at a random position, how many existing bonds are likely to be broken or rearranged?
  5. What about the geometry and topology of the model? Would it make a difference if the points were plotted on the surface of a torus? Would the results be different in one dimension (with all the points arrayed along a line) or in three dimensions (with the points distributed throughout a volume of space)?

Note that the best-friend problem is not equivalent to the well-known stable-marriage problem (on which there is an extensive literature). In the stable-marriage situation, matchings are always two-by-two: If I like you best but you prefer someone else, then I simply have to find another partner.

For a rather different perspective on the mathematics of friendship (and also enmity) see Dynamics of Social Balance on Networks, by T. Antal, P. L. Krapivsky, and S. Redner.