Text Analysis Tools from the HathiTrust Research Center
The HathiTrust Digital Library offers a collection of over 16 million titles digitized from libraries around the world. The HathiTrust Research Center (HTRC) is a collaboration between Indiana University and the University of Illinois. The HTRC facilitates non-profit and educational uses of the HathiTrust Digital Library by enabling computational analysis of public domain works and, on limited terms, in-copyright works from its collection, an approach known as "non-consumptive research".
This two-hour workshop will cover the basics of using the HathiTrust Digital Library to create collections of full-text volumes, to export your custom collection's metadata, and to upload that metadata for use with the HTRC text analysis algorithms. These algorithms include topic modeling, named entity recognition, tag cloud creation, and token (word) counting, and running them requires no programming knowledge. The in-development HathiTrust+Bookworm interface also allows you to create maps and other visualizations of word-use trends across 13.7 million HathiTrust volumes.
No prior experience with text analysis is required! Participants are encouraged to bring their own laptops.
Related LibGuide: Digital Scholarship @ UB by Rachel Starry
- Thursday, May 2, 2019
- 1:00pm - 3:00pm
- 310 Silverman Library
- North Campus