SoIC faculty earn grant to expand Data Capsule Service of the HathiTrust Digital Library

Beth Plale

Several faculty members of the School of Informatics and Computing (SoIC), in partnership with research libraries and library schools, have been awarded a research grant from the Institute of Museum and Library Services to extend the Data Capsule service, which enables remote access to the HathiTrust Digital Library, to other collections managed by research libraries.

As the volume of digital content has expanded exponentially over the past several years, researchers and educators have recognized the potential of big data techniques to analyze, access, and organize digital scholarly collections. The Data Capsule service, which was developed for use in the HathiTrust Research Center (HTRC), creates virtual computers for users to access a restricted collection. Within HTRC, the Data Capsule service is used for non-consumptive analytics, which allows the computer to analyze the text but doesn’t allow the user to read or disseminate copyrighted content. Non-consumptive analytics include text extraction, textual analysis and information extraction, linguistic analysis, automated translation, image analysis, file manipulation, OCR correction, and indexing and search capabilities.

“Enabling greater library and archival community use of the HTRC Data Capsule service will open some very unique possibilities for use of born-digital content within many different types of libraries and archives,” says Professor of Informatics and Computing Beth Plale, who is leading the initiative. “The grant draws from years of experience of providing a similar service within HathiTrust and proposes to evaluate the needs of research libraries in other cases of restricted data requiring safeguarding the interests of right holders and protecting privacy.”

The project will partner with eight academic libraries across the country to understand current library needs and practices in providing computational access to special collections that are constrained by sensitivity or other restrictions. It will also extend the Data Capsule service to support analytical access to restricted collections across a broader range of collections and uses, study extensions of Data Capsule to cloud computing environments, identify gaps in the skills librarians need to enable secure data analytics, and provide resources that address those gaps.

The grant will be carried out under the encompassing framework of Participatory Design and involve funded partners at the University of Illinois, the University of California at Berkeley, and the University of Virginia, plus engaged partners at Lafayette College, MIT, Rutgers University, Swarthmore College, and UCLA.

“The UC Berkeley library has numerous archives that would benefit from access within the Data Capsule service,” says Erik Mitchell, Associate University Librarian at UC Berkeley. “Some of these collections are restricted by US copyright law and other restrictions. This type of research environment will be transformative for us.”

The two-year grant is for $360,000.

“The potential of the HathiTrust is vast, and this exploratory grant that builds off of tools in use in HathiTrust will help researchers obtain earlier access to collections that are similarly restricted in some form,” said Raj Acharya, dean of SoIC.

Media Contact

Ken Bikoff
Communications Specialist
Phone: (812) 856-6908