Hacking the Humanities

Nearly two thousand years after his death, Pliny inspired a computer program that mimics his writing. Image by adoc-photos/Corbis

Last spring, I taught a literature seminar called “Before Wikipedia.” The subject was the history of encyclopedic writing, from ancient times to the present day. We read excerpts of Isidore of Seville’s "Etymologies" and Diderot’s "Encyclopédie" alongside works by Calvino, Sebald, and Flaubert.

The word “Wikipedia” in the course title seemed to attract an unusual preponderance of science majors for a seminar in comparative literature. There were physicists and mathematicians, a cluster of coders, an engineer, a neuroscience major. I teach at Brown, which has an open curriculum that encourages diverse course enrollments, but I’d never found myself in a room with so many young scientists patiently waiting for me to begin a lecture that I wasn’t planning to give.

In my experience, a successful seminar usually involves a mutiny quite early in the semester, when the students take over and my own voice is drowned out by the din of a crowded wheelhouse. This particular seminar’s discussions, however, began awkwardly. The silences I’ve learned to let hang in a classroom seemed unreasonably long. In the first week, I was further unnerved by an odd sound each time I’d turn to write something on the blackboard—the fluent skittering of fingers across twenty laptop keyboards, transcribing my scrawled words as though they’d be on an exam later in the week.

Our first writing assignment was on the Roman encyclopedist Pliny the Elder, whose "Natural History" is a font of Late Antiquity information on a stupefying range of subjects: the history of paper, portents provided by bees, the manufacture of purple dye, Alexander the Great’s famous dog. I asked the students to study Pliny’s writing style and try to mimic it in their own entry about a topic not treated in "The Natural History." A week later, I had a stack of finely rendered facsimiles of Pliny’s clipped prose, on such topics as “coffee and coffee shops,” “the usual and unusual attributes of the hype man,” Chick-fil-A, and Kanye West. As I read through pages of perfect mimicry and snarky pastiche, I felt relief. The “two cultures” of the sciences and humanities were not so far apart, after all, or at least could be bridged by the lingua franca of pop culture.

One student, Henry, a double major in computer science and mathematics, approached the assignment differently. Rather than trying to imitate Pliny himself, he found a text version of "The Natural History" on the Internet, analyzed its thirty-seven books using a natural-language processing toolkit, and then wrote a computer algorithm that generated English sentences using the discovered features of Pliny’s style. Here’s a sample from the passage that he submitted:

The Pharusi, originally a Persian people, are said to be very greatly infested by pirates. And it will not be doubted that one or other of two things, full moon or the moon's conjunction in summer she must retire a long way off, and is gathered by the daughters of Hesperus. Ctesias states that in India there is a fish called the platanista with a dolphin's beak and tail, but 24 ft. long. Also great creatures resembling sheep come out on to the land for an unascertained reason, and they bud best under those circumstances, as otherwise it would make only leaves. Even so it is dragged ashore by more men hauling from the beach.

Henry turned in the assignment even though the algorithm had failed to produce “any sophisticated themes or totally coherent sentences,” as he put it. His robot encyclopedist spoke in magnetic poetry phrases, which occasionally yielded uncanny reproductions of Plinian syntax but often fell flat. Optimizing the program by using something called Latent Semantic Analysis and something else called a Markov Chain Monte Carlo could be interesting, he mused, but would require more time than he was willing to devote to the project at the moment.

After reading the assignment and reëvaluating the choices I’d made in my life, I slowly wrote the letter A in my grade book next to Henry’s name. My own programming skills were somewhat rudimentary and, without his cheerful explanations of the code, I would have been lost. There were two things, though, of which I was certain. First, a machine guided by an undergraduate had taught me something new about the expository style of an ancient Roman natural historian. Second, I had to hire Henry.

During the previous months, I’d been learning a coding language while trying to develop a project about the aesthetics of classical Arabic poetry. My interests were similar to Henry’s: What could we learn about an author’s oeuvre by studying his or her tics and favorite clichés? What made a certain poem identifiably the product of a person, place, or time, from the perspective of syntax and vocabulary? After class one day, I asked Henry whether he would be interested in collaborating, though I felt sure that he had more interesting things to do with his time. Amazingly, he agreed.

We spent the rest of the semester developing an algorithm that could detect different types of rhetorical figures in a large corpus of poetry. It flew through thousands of lines of verse like a drone over a wildlife habitat, snapping pictures of similes, allusions, and metatheses. The program, like the Pliny text generator, produced both epiphanies and duds. We spent a lot of time hand-checking the results, cleaning the data, tweaking a variable and rerunning the code. There were many crash courses on unfamiliar topics. I explained the basic principles of Arabic morphology and classical poetics to Henry, which he internalized effortlessly. Meanwhile, he explained terms like Hellinger distance, quadratic time, and high-dimensional space to me, while I blinked vacantly and asked him to repeat it all just one more time.

By the end of the term, my hard drive was littered with raw text files, Python scripts, and data visualizations. I spent as much time scrolling through the pages of Stack Overflow—an online discussion forum frequented by coders and engineers—as I did sorting through the poems in our corpus. Every little victory was accompanied by two or three major setbacks. It was exhilarating.

In the past decade, digital scholarship has gone from being a quirky corner of the humanities to a mainstream phenomenon, restructuring funding landscapes and pushing tenure committees to develop new protocols for accrediting digital projects. As the stakes have grown, so has an expectation about the role that the “digital turn” might play in revivifying the humanities, effecting a synthesis with the sciences, and other weighty causes. For many of its champions, the tinkering character of the digital humanities represents a kind of artisanal inquisitiveness, a hands-on, tool-building, map-making ethos that chafes against more abstract modes of humanistic inquiry.

Our poetry project had more modest ambitions. Still, over the course of stitching together our digital monster, I found myself thinking about the future. Would my graduate students soon have to pass comprehensive exams in JavaScript and QGIS alongside classical Arabic? Would the trickling stream of humanities jobs become a swift-moving river, swollen with funding and public-private synergies? Every sunlit vision casts a dystopic shadow.

On occasion, I’d bring up these issues with Henry, but he seemed puzzled by them. Our collaboration didn’t need to offer a panacea for the humanities to be worthwhile; it just needed to solve a problem. I could hear the silence hanging in the classroom again, aware that the undergraduate with his skittering laptop keyboard was now doing the talking, while I was the one grasping for answers.