Inside CDL

UC Libraries and HathiTrust: Overview and FAQ

  1. What is the HathiTrust?
  2. What does the name HathiTrust mean?
  3. How did the project come about?
  4. What libraries are participating?
  5. What is UC’s and CDL’s role?
  6. What will UC’s Digital Preservation Program be doing?
  7. Where is the content stored?
  8. What content is in the repository?
  9. What are the benefits to UC in participating in this project?
  10. What services will the HathiTrust offer users?

1. What is the HathiTrust?

The HathiTrust is an inter-institutional repository of primarily mass-digitized books. The repository supports both preservation and access, with a range of access services encompassing permitted uses of books digitized by Google and full access to public domain content.

2. What does the name HathiTrust mean?

Hathi (pronounced hah-tee) is the Hindi word for elephant, an animal highly regarded for its memory, wisdom, and strength. Trust is a core value of research libraries and one of their greatest assets. In combination, the words convey the key benefits researchers can expect from a first-of-its-kind shared digital repository.

3. How did the project come about?

The project is an outgrowth of the University of Michigan’s local MBooks repository (which contains UM’s books digitized by Google). It was initially re-conceived as a shared repository of the Committee on Institutional Cooperation (CIC) institutions. The University of Michigan and Indiana University are the original lead partners.


4. What libraries are participating?

In addition to the 10 UC libraries and the CDL, partners include the libraries in the Committee on Institutional Cooperation (CIC). The Midwest-based CIC includes Indiana University, University of Illinois, University of Illinois at Chicago, University of Iowa, University of Michigan, Michigan State University, University of Minnesota, Northwestern University, Ohio State University, Penn State University, Purdue University and University of Wisconsin-Madison. The University of Virginia has also joined the HathiTrust; other libraries participating in mass digitization initiatives may join in the future.

5. What is UC’s and CDL’s role?

Because of the size and scale of the UC libraries’ mass digitization program, UC was invited to become the third lead partner in the project. UC has significant representation at all levels of governance. CDL coordinates UC’s participation and devotes resources to the ongoing development of access and preservation services. The Strategic Advisory Board will charge working groups, whose composition will likely depend on the type of expertise needed. UC will have opportunities to lead or participate in groups focused on topics of interest.

6. What will UC’s Digital Preservation Program be doing?

The UC Digital Preservation Program (DPP) is committed to preserving the digital assets that support UC’s research, teaching, and learning mission (e.g., UCTV, web archived content, UC library collections, eScholarship, and electronic theses and dissertations), not just mass-digitized content. The DPP will continue to work with campus partners on an evolving set of digital curation services designed to meet UC’s unique needs. The DPP will also be directly involved in creating a common preservation infrastructure for the HathiTrust.

7. Where is the content stored?

The current architecture includes two mirror sites: one at the University of Michigan and one at Indiana University. A third copy, on tape, is stored in Ann Arbor. One or both sites could be relocated later, or additional sites added, given sufficient interest and adequate funding.

8. What content is in the repository?

Books digitized by Google (including more than 1.6 million volumes at the University of Michigan) form the backbone of the repository, but Michigan’s locally digitized books are also included. Books digitized by Google from the University of Wisconsin-Madison are currently being added. UC will contribute all of its mass-digitized materials (1.6 million volumes), unifying content digitized by Google and the Internet Archive (IA). The HathiTrust is also committed to including public domain content from non-Google partners.

9. What are the benefits to UC in participating in this project?

  • Greater service to users through combined content and access to materials digitized by other institutions. This includes content that partner libraries have contributed, and will continue to contribute, that is not found elsewhere on the web or that copyright holders have specifically opened (in the case of copyright-restricted materials) for access by HathiTrust users.
  • Opportunity to provide deeper support for scholarly access to mass-digitized materials, including the ability to retrieve content in different formats (e.g., plain text, PDF, and page image), browse and facet search results, define full-text searches across selected bodies of content, and save items to targeted collections.
  • Reduced costs resulting from sharing access and preservation services with multiple partners.


10. What services will the HathiTrust offer users?

The University of Michigan’s MBooks repository forms the core of the current services, which will evolve in a more collaborative way under the new organization. Current services include:

  • Unified full-text discovery and access, with a common user experience, across content digitized by Google, the Open Content Alliance (OCA), and other institutions.
  • Consolidated full-text indexes for all content in the repository, which point to downloadable full text where available, or to local copies and Google Book Search copies.
  • Page-turner application for online viewing where permitted by copyright.
  • Services for print-disabled users.
  • Copyright management and rights clearance.
