Recursive driveling

If there’s anything inaner than turning literature into drivel, it’s turning drivel into drivel. I’ve added a “recurse” button to the drivel generator. It feeds the output back to the input, like xeroxing a xerox.

What happens when this process is repeated many times? Before I tried the experiment, two quite different outcomes both seemed plausible. On the one hand, driveling is a recombinatory process that stirs up the text and thereby introduces novelty. Nth-order drivel can’t create any n-grams that don’t also appear in the source text, but those n-grams are jammed together in new ways. Hodgepodge words and phrases can be expected to proliferate as the recursion continues, increasing the entropy of the drivel.
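For readers who want to check the n-gram claim, here is a minimal sketch of a character-level drivel generator. It is not the code behind the generator on this page; the names `drivel` and `ngrams` are made up for illustration. It makes two assumptions: that "nth-order" means each character is chosen according to the n-1 characters before it, and that the source text is treated as circular so the generator never hits a dead end at the end of the text. Under those assumptions, every n-gram in the output is copied from somewhere in the (circular) source.

```python
import random
from collections import defaultdict

def ngrams(text, n, circular=False):
    """Return the set of n-grams in text, optionally wrapping past the end."""
    if circular:
        padded = text + text[:n - 1]
        return {padded[i:i + n] for i in range(len(text))}
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def drivel(source, order, length, rng):
    """Generate `length` characters of drivel from `source`.

    Assumption: order n means a context of n-1 characters picks the next one.
    Assumption: the source wraps around, so every context has a successor.
    """
    if order < 1 or len(source) < order:
        raise ValueError("source must be at least `order` characters long")
    k = order - 1
    padded = source + source[:k]
    followers = defaultdict(list)
    for i in range(len(source)):
        # Record which character follows each k-character context.
        followers[padded[i:i + k]].append(padded[i + k])
    start = rng.randrange(len(source))
    out = list(padded[start:start + k])  # seed with a context from the source
    while len(out) < length:
        context = "".join(out[len(out) - k:])
        out.append(rng.choice(followers[context]))
    return "".join(out[:length])

# Opening lines of "Dover Beach", used here only as sample input.
SAMPLE = (
    "The sea is calm to-night. The tide is full, the moon lies fair "
    "Upon the straits; on the French coast the light Gleams and is gone; "
    "the cliffs of England stand, Glimmering and vast, out in the tranquil bay."
)

output = drivel(SAMPLE, order=4, length=300, rng=random.Random(1))
# Every 4-gram in the output also occurs in the (circular) source.
print(ngrams(output, 4) <= ngrams(SAMPLE, 4, circular=True))
```

Storing a list of followers with repeats, rather than a set, means that `rng.choice` picks each successor in proportion to how often it follows that context in the source.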

The counterargument says that driveling is also a sampling process. With each round of recursion, some elements of the source text are left behind—and once lost they can never be recovered. In the long run, then, we should expect the recursive drivel to grow more monotonous, with an ever-smaller vocabulary. It’s like a biological population that steadily loses diversity for lack of new germ plasm.

By all means try the experiment for yourself. Here are the results I got when I repeatedly recycled the text of Matthew Arnold’s poem “Dover Beach.” In each round of recursion, I generated a thousand characters of drivel, which became the source text for the next round. In the snippets below, I show only the first line of the output for each round. The integer in front of each line is the level of recursion.
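If you would rather run the experiment offline, the loop can be sketched as below. This is a stand-in under stated assumptions, not the code behind the "recurse" button: the function names are hypothetical, "nth-order" is taken to mean a context of n-1 characters, and the text is treated as circular to avoid dead ends. The sketch also tracks the number of distinct characters in each round. Characters are only ever copied, never invented, so that count can never go up, which is the sampling argument in miniature. (The wraparound can splice a few junction n-grams into each round, so for n greater than 1 the n-gram count is not strictly monotone in this sketch.)

```python
import random
from collections import defaultdict

def drivel(source, order, length, rng):
    """Order-n drivel: a context of n-1 characters chooses the next one.

    Assumption: the source is circular, so every context has a successor.
    """
    if order < 1 or len(source) < order:
        raise ValueError("source must be at least `order` characters long")
    k = order - 1
    padded = source + source[:k]
    followers = defaultdict(list)
    for i in range(len(source)):
        followers[padded[i:i + k]].append(padded[i + k])
    start = rng.randrange(len(source))
    out = list(padded[start:start + k])
    while len(out) < length:
        out.append(rng.choice(followers["".join(out[len(out) - k:])]))
    return "".join(out[:length])

def recurse(source, order, length, rounds, seed=0):
    """Feed each round's output back in as the next round's source text."""
    rng = random.Random(seed)
    text = source
    history = []
    for _ in range(rounds):
        text = drivel(text, order, length, rng)
        history.append(text)
    return history

# Opening lines of "Dover Beach", used here only as sample input.
SAMPLE = (
    "The sea is calm to-night. The tide is full, the moon lies fair "
    "Upon the straits; on the French coast the light Gleams and is gone; "
    "the cliffs of England stand, Glimmering and vast, out in the tranquil bay."
)

for level, text in enumerate(recurse(SAMPLE, order=2, length=200, rounds=6)):
    # Recursion level, number of distinct characters, start of the output.
    print(level, len(set(text)), text[:60])
```

The sample and the 200-character rounds are much shorter than the thousand characters per round used for the results that follow, so expect the collapse to come even sooner.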

second-order drivel
0   Hating To trand is of Engles shor low, to Hat gone Fine shichocland,
1   p ful, Gles drawither lonce, Gles thdrand lon so of sh Only he again;
2   an, th Only heith re by th re Fret ebbland long Thering and is drawits
3   nly he And, thelas of dar turn, Wits of drawith turn, up fliffs ong
4   e and by herin; And, the and by th turn, we and brin; And by heit up
5   Wit up flike and, to-night up flike Fing pebblas onch’s of Fin; And, to-
6   tretretretretretretretretretretretretretretretretretretretretretretret

fourth-order drivel
0   d flight Gleams, So various, so beautiful, so beautiful, so beautiful, so
1   beautiful, so beautiful, so beautiful, so beautiful, so beautiful, so

sixth-order drivel
0   his mind then again begin, and round earth’s shore Lay like the sound a
1   rue To one another! for the world. Ah, love, let us be true To one
2   , let us be true To one another! for the world. Ah, love, let us be true
3   rue To one another! for the world. Ah, love, let us be true To one

eighth-order drivel
0   g ago Heard it on the Aegean, and it brought Into his mind the turbid
1   r the world. Ah, love, let us be true To one another! for the world. Ah,
2   ve, let us be true To one another! for the world. Ah, love, let us be true
3   world. Ah, love, let us be true To one another! for the world. Ah, love,

In every case, the process quickly reaches a fixed point—and a rather boring one at that. The banana phenomenon is doubtless a major factor in what we’re seeing here; it would be interesting to rerun the experiment with an algorithm immune to that flaw. Also important are finite-size effects. I would like to believe that the outcome would be different if we could generate infinite streams of drivel from an infinitely long source text. The trouble is, I can’t really imagine what an infinitely long source text would look like. If an endless lyric by Matthew Arnold is not trivially repetitious (“so beautiful, so beautiful, so beautiful”) then it has to be some sort of enumeration of all possible combinations of n-grams. In either case, it seems rather drivelish, even without algorithmic help.
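One way to make "fixed point" concrete is to measure the shortest repeating unit of a line of output. The helper below is a hypothetical illustration (brute force, fine for short strings), not part of the drivel generator.

```python
def smallest_period(text):
    """Return the smallest p such that text[i] == text[i + p] wherever both exist."""
    for p in range(1, len(text) + 1):
        if all(text[i] == text[i + p] for i in range(len(text) - p)):
            return p
    return 0  # only reached for the empty string

print(smallest_period("tretretretretret"))  # 3
```

A text with no repetition at all has a period equal to its own length, so a period that is tiny compared with the length of the text is a sign that the drivel has collapsed into a loop.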
