Reviewing this month’s batch of incoming junk mail, I stumbled upon the following message:
In case that image is too tiny to read, here is the first word in source-code form:
28 47 34 74 33 85 42 16 43 25 5048 08124 8813 2714 34 02 25 66 50 31 855 05 3404 65 88362 00 25 72 01651 8008 36 42 77 27 81 06 04 40 72 83 02 32 47 12 24 87 33 78 03 87100 83844 18 21813 08 73634
The basic technique is anything but novel. I can remember green-and-white-striped printouts that had my name emblazoned in the same kind of two-inch-high characters. But why are the characters here formed entirely out of numbers, rather than other ASCII glyphs? And do the numbers themselves mean anything?
I think I know the answer to the first question: The spammer thought a message composed of nothing but numerals might slip through the spam filters. (In my case, at least, it didn’t work. I fished this message out of the garbage pail.)
As for the second question, my immediate guess was that the digits are the output of some simple pseudo-random number generator. That would be an easy way to produce them, and it would also allow the spammer to make each individual message unique. On taking a closer look, however, I realized there was something quite nonrandom about the numbers in the message.
Here is the full list of digits. There are exactly 900 of them. Do you see what’s missing?
284734807433341016202332628542642574418481303116432550480812488
132714721846667434022566503185505580464271163634046588362002572
016511712427000046735580083642772781060440148383627872830232471
224873301464000807803871008384418218130077346262602008225346571
155727363470732323181618223162744253246331737038301533254837881
148802160371074555632302255640217448457046416116253484658726108
147181540061231788804563557807254177278106044014838362787283023
247122487330146400080780387100838462042135220046847482422143746
770236783058460185444521283134537306537546855305024142275437615
010235002438258320577785451436776143066166025853832747551576004
831136831376228235381112678466011047530048032816623514158481030
413446024450055236762111281250031205166204213522004684748242214
374677023678305846018544452128313453730653754685530502414227543
761501023500243825832057778545143677614306616602585383274755157
600483113683137622
There’s nary a 9 in the bunch. And in other respects too the digit distribution looks slightly off-kilter:
When I tabulated all the correlations between successive digits, that too looked a little fishy, although the sample is too small for any reliable conclusions.
                      second digit
           0   1   2   3   4   5   6   7   8   9
first  0  23  12  20   9  17   9   7   7   8   0
digit  1  11  13  11  12  16   8  13   5  10   0
       2  11  11  13  15  14  15   6  14   9   0
       3  18  13  15   7  11   8  13  13  12   0
       4   8   9  12  13  12  10  22  11  18   0
       5  11   7   5  14  12  14   4  10  11   0
       6  12  10  15   6   8   7  10  10   6   0
       7   6  10   9  10  12   7   9  11  14   0
       8  12  14   8  24  13  10   0   7   6   0
       9   0   0   0   0   0   0   0   0   0   0
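For anyone who wants to repeat the tally, here is a minimal Python sketch of how the digit counts and the successive-digit table could be computed. The function names `digit_counts` and `digram_table` are my own inventions for illustration, and the sample string is just the first 63 digits of the list above rather than the full 900. It is a sketch under those assumptions, not necessarily the procedure used to produce the figures in this post.

```python
from collections import Counter

DIGITS = "0123456789"

def digit_counts(text):
    """Return a list of 10 counts, one per digit 0-9, ignoring other characters."""
    counts = Counter(ch for ch in text if ch in DIGITS)
    return [counts.get(d, 0) for d in DIGITS]

def digram_table(text):
    """Return a 10x10 table where table[a][b] counts digit a immediately followed by b."""
    digits = [int(ch) for ch in text if ch in DIGITS]
    table = [[0] * 10 for _ in range(10)]
    for first, second in zip(digits, digits[1:]):
        table[first][second] += 1
    return table

# Assumption: only the first 63 digits of the message are used as sample input.
sample = "284734807433341016202332628542642574418481303116432550480812488"

print("digit counts:", digit_counts(sample))
for first, row in enumerate(digram_table(sample)):
    print(first, " ".join(f"{n:2d}" for n in row))
```

Feeding in the whole 900-digit list instead of `sample` gives the full tabulation. Spaces and line breaks are skipped, so pairs are counted across them as if the digits formed one continuous stream.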
So what’s going on here? I think the pseudo-random generator is still a leading candidate, though it would have to be a badly implemented RNG. The absence of 9s isn’t hard to explain: We only have to suppose that the spammer was working in C and wrote the plausible-looking expression random(9), thinking that would generate integers between 0 and 9.
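The off-by-one hypothesis is easy to illustrate. The sketch below uses Python rather than C, with `random.randrange(9)` standing in for the hypothesized `random(9)` call; that substitution is my assumption, as are the helper names `buggy_digits` and `intended_digits`. Both calls share the relevant property: an upper bound of 9 yields the values 0 through 8 and never 9.

```python
import random

def buggy_digits(n, seed=0):
    """Generate n 'digits' with the suspected bug: an exclusive upper bound of 9."""
    rng = random.Random(seed)
    # randrange(9) returns integers in 0..8, so 9 can never appear.
    return [rng.randrange(9) for _ in range(n)]

def intended_digits(n, seed=0):
    """Generate n digits as presumably intended: an exclusive upper bound of 10."""
    rng = random.Random(seed)
    # randrange(10) returns integers in 0..9.
    return [rng.randrange(10) for _ in range(n)]

buggy = buggy_digits(900)
print("largest digit with bound 9:", max(buggy))
print("number of 9s with bound 9:", buggy.count(9))
print("number of 9s with bound 10:", intended_digits(900).count(9))
```

A generator like this would account for the missing 9s, but by itself it would not explain the other irregularities in the digit and digit-pair counts.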
On the other hand, maybe it isn’t random. Maybe there’s a secret message-within-the-message. Anybody see a pattern?
While I’m talking spam, I’ll update my ongoing tally of my inbox contents. I can report that September was a good, strong month for spam, with further steady growth continuing the summer-long trend. The stock market is in retreat and credit is tight, but the purveyors of replica watches are undeterred. My receipts have crossed the 5,000-per-month threshold for the first time:
And another threshold has also been left behind: This month, for the first time, more than half of my spam is written in Russian. (Based on character-set declarations, 2,858 messages out of 5,021 were in Cyrillic scripts, or about 57 percent.)
Update 2008-10-12: In response to a request in the comments, I’ve uploaded the full text (including headers) of the original email. The file is here. Incidentally, I’ve searched my spam archive for other messages like this one, without success. That in itself makes this a peculiar spam. Usually, if I get a spam once, I see dozens of copies or variants within a few days.