Text Analysis Tools from the HathiTrust Research Center

The HathiTrust Digital Library offers a collection of over 16 million titles digitized from libraries around the world. The HathiTrust Research Center (HTRC), a collaboration between Indiana University and the University of Illinois, facilitates non-profit and educational uses of the HathiTrust Digital Library by enabling computational analysis of public domain works and (on limited terms) in-copyright works from its collection, an approach known as "non-consumptive research".

This two-hour workshop will cover the basics of using the HathiTrust Digital Library to create collections of full-text volumes, export your custom collection's metadata, and upload that metadata for use with the HTRC text analysis algorithms. These algorithms include topic modeling, named entity recognition, tag cloud creation, and token (word) counting, and running them requires no programming knowledge. The in-development HathiTrust+Bookworm interface also allows you to create maps and other visualizations of word-use trends across 13.7 million HathiTrust volumes.

No prior experience with text analysis is required! Participants are encouraged to bring their own laptops.

Related LibGuide: Digital Scholarship @ UB by Rachel Starry

Thursday, May 2, 2019
1:00pm - 3:00pm
310 Silverman Library
North Campus
Digital Scholarship
Registration has closed.

Event Organizers

Heidi Dodson

CLIR Postdoctoral Fellow in Digital Scholarship

Rachel Starry

CLIR Postdoctoral Fellow in Social Science Data Curation


Contact: rlstarry@buffalo.edu
