Invisible objects

By Franco Moretti

I was a fellow at the Wissenschaftskolleg in 1999-2000; then, I was lucky enough to return for another year, in 2012-13. Thirteen years. Enough to change many things in one’s individual life. But in this case, the years also brought important changes in the intellectual discipline – broadly speaking, literary and cultural history – within which I work. As I write, such changes are known under the passe-partout name of “digital humanities”. Not the best definition, in my opinion; “computational criticism”, for one, would have been clearer; and other alternatives circulated in the past ten years or so. But when the great American funding agency – the National Endowment for the Humanities – decided to call one of its subsections “digital humanities”, the game was over. It’s the sign of a major, and dangerous, novelty: the key role played by grants – “soft money” – in the humanities as a whole. More on this towards the end.

Digital humanities, then. Literary historians, using computers to think about literature. And the first thing that happens is that the object of study changes. Not just the tool, which is obvious; the object itself. “The objects studied by contemporary historians” have this peculiarity, Krzysztof Pomian observed some time ago, that “no one has ever seen them, and no one could ever have seen them […] because they have no equivalent within lived experience”. He was thinking of things like demographic evolution and literacy rates, and it’s true, no one can have a “lived experience” of these “invisible objects”, as he also calls them; our objects are different, of course, they are literary ones, but they too have no equivalent within the usual experience of literature.

So what are they like, these objects we study in the Stanford Literary Lab, which I founded, with a handful of PhD students, in 2010? They are things like – this: figure 1. This image comes from our recent collective pamphlet, “Style at the Scale of the Sentence”, which was finished, in a whirlwind of emails and Skype meetings, during the Fall of my year at the Wiko, and which can be found at the Lab’s website. Here, let me just say that the chart shows how four types of significant narrative clauses, indicated by the red lines, can be differentiated on the basis of the relative frequency of a certain number of words, in green and gold; we spent quite a few hours trying to understand the logic behind this distribution, and others like it. These are the objects we study. Or this: figure 2. The red segments at the bottom express the declining presence of loud speaking verbs (“cried”, “roared”, “screamed”…), hence the “silencing” of the 19th-century English novel that 21-year-old Holst Katsma discovered in our database. This is what our objects are like. And, truly, no one had ever seen them, because they exist on a different scale from that at which we typically experience literature: one that is simultaneously much bigger and much smaller than the usual: three thousand novels, and a handful of words for loudness; or, as in figure 3, the eleven different literary genres represented by the colored word strips, and the twenty-odd verb forms indicated by the black vectors. No one experiences literature as a scatterplot of verb forms and genres. Reading a novel; watching a play; memorizing a sonnet: this is the lived experience of literature. Instead, here literature is de-composed into its extremes; but this radical reduction also allows us to see a relationship between the very small and the very large that would otherwise remain hidden: how crucial the passive past simple is for the rhetoric of Gothic novels, for instance, or progressive tenses for the Bildungsroman.
And it’s not just a matter of “seeing” the relationship; you can work on it: change the variables; use adjectives instead of verbs to test if they differentiate genres better; exclude function words or include them. You can conduct small experiments with historical evidence. This says something important about the new object of study: it is not something we have found somewhere (in an archive, say); it’s something we have constructed for a specific purpose; it’s not a given, it’s the result of a new practice. A new type of work that, before the advent of digital corpora and tools, was simply unimaginable.
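For readers curious about what such a small experiment might look like in practice, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the Lab's actual procedure: the two-sentence corpus is invented, the word list contains only the three verbs named above, the list of function words is a tiny placeholder, and the function names are hypothetical. It measures the relative frequency of loud speaking verbs per decade, with a switch for including or excluding function words.

```python
import re
from collections import defaultdict

# Only the three verbs named in the essay; a real list would be longer.
LOUD_VERBS = {"cried", "roared", "screamed"}

# Tiny placeholder stop list (an assumption made for this sketch).
FUNCTION_WORDS = {"the", "a", "an", "and", "of", "to", "in", "he", "she", "it"}


def tokenize(text):
    """Lowercase the text and return its runs of ASCII letters."""
    return re.findall(r"[a-z]+", text.lower())


def frequency_by_decade(corpus, targets, drop_function_words=False):
    """Share of tokens that belong to `targets`, pooled by decade.

    `corpus` is an iterable of (year, text) pairs.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for year, text in corpus:
        tokens = tokenize(text)
        if drop_function_words:
            tokens = [t for t in tokens if t not in FUNCTION_WORDS]
        decade = (year // 10) * 10
        hits[decade] += sum(1 for t in tokens if t in targets)
        totals[decade] += len(tokens)
    return {d: hits[d] / totals[d] for d in sorted(totals) if totals[d]}


# Invented two-text corpus, for illustration only.
TOY_CORPUS = [
    (1805, "He cried and she screamed in the hall"),
    (1885, "She said it was late and he answered quietly"),
]

with_function_words = frequency_by_decade(TOY_CORPUS, LOUD_VERBS)
without_function_words = frequency_by_decade(
    TOY_CORPUS, LOUD_VERBS, drop_function_words=True
)
print(with_function_words)
print(without_function_words)
```

Changing `LOUD_VERBS` to a list of adjectives, or flipping `drop_function_words`, is the kind of variation described above: the object changes with each decision about how to construct it.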

Which brings me to a question I have often been asked, and rightly so: will the humanities of the digital age lose what has so powerfully characterized them – the experience of reading a book from beginning to end? And, I don’t want to answer for the humanities in general, but for those of us in digital literary studies the answer has to be, Yes: reading a book from beginning to end loses its centrality, because it no longer constitutes the foundation of knowledge. Our objects are much bigger than a book, or much smaller than a book, and in fact usually both things at once; but they’re almost never a book. The pact with the digital has a price, which is this drastic loss of “measure”. Books are so human-sized; now that right size is gone. It’s a loss that seems to be a necessary consequence of the new approach.

Now, let me be clear, this does not mean that literary critics, let alone readers in general, shouldn’t read books any more. Reading is one of the greatest pleasures of life, it would be insane to give it up. What is at stake is not reading, it’s the continuity between the experience of reading a book and the production of knowledge. That’s the point. I read a lot of books; but when I work in the Literary Lab they’re not the basis of my work. The “lived experience” of literature no longer morphs into knowledge, as in Ricoeur’s great formula of the “hermeneutic of listening”, where understanding consists in hearing what the text has to say. In our work we don’t listen, we ask questions; and we ask them of large corpora, not of individual texts. It’s a completely different epistemology.

Do we not read at all, then? Well, not exactly. You may have noticed a crazy outlier at the top of the third chart; there, each of the strips indicates a set of two hundred narrative sentences from a novel, and that one, from the early chapters of Middlemarch, was so extreme that we of course took those two hundred sentences and read them very, very carefully. The question is: while doing so, were we reading Middlemarch? I don’t think so. The sentences came from Middlemarch, yes, but they couldn’t be “read” the way one reads a novel, because they were not continuous with each other; rather, they formed a series only on the basis of a grammatical peculiarity we wanted to investigate. No one could have ever “seen” them together while reading Middlemarch. We were “studying” Middlemarch, but not “reading” it.

The objects have changed, and the scale has changed, and the type of work, and of knowledge, and the relationship to reading. And this of course raises all sorts of further questions, many of which were indeed asked during my 2013 Dienstagskolloquium. Are the old and the new type of knowledge – in conflict? Complementary? Independent of each other? And the study of these new objects – what exactly has it achieved? So many changes … But have they really changed literary history?

Let’s keep it simple: not much. In part, we haven’t yet had much time to do our work. Plus, grant-funding agencies, by rewarding “promise-the-moon-by-tomorrow” projects, are in their own way hindering genuine research, instead of promoting it. But the main reason lies elsewhere: unlike most critical schools, the digital humanities have not been inaugurated by a major theoretical statement – “Die syntaktischen Errungenschaften der Symbolisten”, “Iskusstvo kak priëm”, Sur Racine, Orientalism – but by the increasing digitization of books, and the development of text-mining algorithms. New archives, and new computational programs: such have been the two pillars of the digital humanities up to this point, which have unquestionably enriched our idea of literary history. Enriched it; not changed. For that to happen, concepts must move to the center of the digital humanities. After the years of databases and tools – the years of hypotheses. And thirteen years from now …