Googling the rise and fall of literary reputations
As the New York Times recently pointed out, Google’s new online book database, which allows users to chart the evolving frequency of words and short phrases over 5.2 million digitized volumes, is a wonderful toy. You can look at the increasing frequency of George Carlin’s seven dirty words, for example—not surprisingly, they’ve all become a lot more common over the past few decades—or chart the depressing ascent of “alright.” Most seductively, perhaps, you can see at a glance how literary reputations have risen or fallen over time. Take these five, for example:
It’s hard not to see that, for all the talk of the death of Freud, he’s doing surprisingly well, and even passed Shakespeare in the mid-’70s (around the same time, perhaps not coincidentally, as Woody Allen’s creative peak). Goethe experienced a rapid fall in popularity in the mid-’30s, though he had recovered nicely by the end of World War II. Tolstoy, by contrast, saw a spike sometime around the Big Three conference in Tehran, and a drop as soon as the Soviet Union detonated its first atomic bomb. And Kafka, while less popular during the satisfied ’50s, saw a sudden surge in the paranoid decades thereafter:
Obviously, it’s possible to see patterns anywhere, and I’m not claiming that these graphs reflect real historical cause and effect. But it’s fun to think about. Even more fun is to look at the relative popularity of five leading American novelists of the last half of the twentieth century:
The most interesting graph is that for Norman Mailer, who experiences a huge ascent up to 1970, when his stature as a cultural icon was at its peak (just after his run for mayor of New York). Eventually, though, his graph, like those of Gore Vidal, John Updike, Philip Roth, and Saul Bellow, follows the trajectory that we might expect of an established, serious author: a long, gradual rise followed by a period of stability, as the author enters the official canon. Compare this to a graph of four best-selling novelists of the 1970s:
For Harold Robbins, Jacqueline Susann, Irving Wallace, and Arthur Hailey—and if you don’t recognize their names, ask your parents—we see a rapid rise in popularity followed by an equally rapid decline, which is what we might expect for authors who were once hugely popular but had no lasting value.
It’ll be interesting to see what this graph will look like in fifty years for, say, Stephenie Meyer or Dan Brown, and in which category someone like Jonathan Franzen or J.K. Rowling will appear. Only time, and Google, will tell.
I suspect there are hidden methodological issues here. Searching by full name seems safe, but the site then appears to log the total number of instances of the full name (right? I don’t see any book-level filters, for instance, i.e. the number of books that mention an author), with no way of telling whether a given instance is a purely symbolic citation or a more in-depth treatment. I suppose that would require downloading the actual file of “bigrams” (the I Ching enthusiasts across the world must be in despair). On the other hand, the “yellow peril” peaked just at the start of the twentieth century and we’ve never been safer!
drewberthu
December 17, 2010 at 4:03 pm
Oh, I’m sure there are all kinds of methodological issues. But the results are interesting and they’re more or less what I expected, so they must be true, right?
nevalalee
December 17, 2010 at 4:09 pm
It’s an interesting toy. If I ever get time I’ll experiment further.
Some of the methodological stuff is explained in links on the Google n-gram page, and it looks to be essentially a count of mentions of particular names or phrases in the corpus of books Google have digitized – all kinds of issues could affect results, I guess, including how you decide what search terms to use and whether the names crop up because people are popular or merely notorious! Is William McGonagall, for example, being increasingly cited because people actually like his poetry or because he’s an increasingly commonly cited example of terrible poetry? I guess the thing about this is, it does offer enough information to start raising such questions.
Jon Vagg
December 17, 2010 at 5:53 pm
Good point. And if I were using this data for an academic study (rather than a blog post) I’d definitely want to examine the sources more closely. Still, the fact that the graphs within each of the two sets of novelists look so similar implies that if the charts are being skewed somehow, they’re all being skewed in the same way. (Although this could also just be a question of me seeing what I want to see.)
nevalalee
December 17, 2010 at 7:48 pm
Hrm… these are prettier, but harder to see scaled… http://www.chrisharrison.net/projects/trigramviz/index.html
drewberthu
January 11, 2011 at 12:59 pm
Interesting. I like the trigrams for “He married” and “She married.” Respectively, they’re “mary,” “elizabeth,” “sarah,” “margaret,” “a,” “in,” “martha,” “his,” “anna,” “the,” and “jane”; and “john,” “william,” “james,” “thomas,” “george,” “robert,” “charles,” “a,” “joseph,” and “henry.”
nevalalee
January 11, 2011 at 9:26 pm