The HathiTrust Research Center (HTRC) provides research support for the growing corpus of over fourteen million volumes in the HathiTrust Digital Library (HTDL) through a suite of tools for text analysis. The size of the HTDL affords scholars the opportunity to increase the scale of their inquiry and to ask new kinds of research questions. The HTRC tools create avenues for scholars to pursue these new modes of research by allowing for “non-consumptive” text analysis with the HTDL corpus. Through demonstrations, hands-on exercises, and discussion, workshop attendees will learn about the suite of HTRC tools and how they can be used to support research and teaching.
Attendees will come away with an understanding of:
- HTRC tools that allow researchers to build custom subcollections of items from the HTDL, run HTRC-provided, off-the-shelf algorithms against them, and interpret the results;
- HathiTrust+Bookworm, an interactive visualization for studying lexical trends within material from the HTDL; and
- the Extracted Features (EF) Dataset, which provides page-level metadata and derived data for the items in a subcollection, and which a researcher can download and analyze on their own computer.
The workshop will present scenarios and example use cases in which HTRC tools are especially effective, and will demonstrate how the tools can complement one another. Attendees will learn new strategies and approaches for the complex research questions that digitized text corpora are uniquely poised to help answer across academic disciplines.