Benjamin M. Schmidt

Benjamin M. Schmidt is the Visiting Graduate Fellow at the Cultural Observatory at Harvard University, and a PhD Candidate in history at Princeton University. He writes about digital humanities at Sapping Attention.


Words Alone: Dismantling Topic Models in the Humanities

As this issue shows, there is no shortage of interest among humanists in using topic modeling. An entire genre of introductory posts has emerged encouraging humanists to try LDA.[1] So many scholars in humanities departments are turning to the tool in their research that it is sometimes described as part of the digital humanities in itself.

Code Appendix for “Words Alone: Dismantling Topic Models in the Humanities”

Topic Modeling Ships

Begin by getting the data in order. (This data is available on request.)


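# Oceans2
# Clear the workspace so the script starts from a clean environment.
rm(list = ls())
# Load the project's ICOADS parsing script and the shared map functions
# (both paths are relative to the current working directory).
source("ICOADS parsing.R")
source("../Map Functions.R")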

This step pulls in the Maury data and splits it.

Theory First

It’s easy to be reasonable about the relationship we’d like to see between digital humanities and “Theory.” Each should inform the other. After all, humanists who put big-T Theory before any empirical data foolishly close their ears to the new evidence digital can create; digital humanists who ignore theory entirely jeopardize not only their careers but the soundness of their conclusions.

