Bowdoin College Library

HathiTrust: Content

Content Overview

As of 2023, works published before 1928 in the U.S. are usually considered to be out of copyright and in the public domain. More information about public domain.

All data are unofficial and are as of November 2020.

  1. What types of materials are in the HathiTrust digital library?
    "HathiTrust provides long-term preservation and access services to digitized content from a variety of sources, including Google, the Internet Archive, Microsoft, and in-house member institution initiatives."
  2. How many items are in HathiTrust?
    HathiTrust includes over 17,444,000 items, over 3,087,000 (≈18%) of them public domain.
    HathiTrust includes over 1,200,000 full-view (public domain) Federal government items, more than 38% of the HathiTrust public domain collection.
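The percentages quoted above can be rechecked from the rounded counts. The sketch below is only an arithmetic illustration: the variable and function names are hypothetical, and the inputs are the unofficial November 2020 figures from this page, not live HathiTrust data.

```python
# Rounded counts quoted on this page (unofficial, as of November 2020).
total_items = 17_444_000             # all items in HathiTrust
public_domain_items = 3_087_000      # public domain items
federal_full_view_items = 1_200_000  # full-view Federal government items


def share(part: int, whole: int) -> float:
    """Return `part` as a percentage of `whole`."""
    return 100 * part / whole


# Public domain items as a share of the whole collection (quoted as about 18%).
pd_share = share(public_domain_items, total_items)

# Federal full-view items as a share of the public domain items (quoted as more than 38%).
federal_share = share(federal_full_view_items, public_domain_items)

print(f"Public domain share of all items: {pd_share:.1f}%")
print(f"Federal share of public domain items: {federal_share:.1f}%")
```

Because the counts are lower bounds ("over ..."), the computed shares are approximations and not exact values.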

Institutions

The HathiTrust collection is most representative of the holdings of large research libraries in the United States. More than 98% of the items in HathiTrust were contributed by institutions in the United States. The 13 large research libraries in the pie chart below together contributed more than 90% of the items in the HathiTrust collection. The University of Michigan and the University of California alone have contributed more than 50% of the content.

Figure showing deposited volumes by original source of content. As of November 2020. Source.

Dates

As might be expected, given that works published before 1928 in the U.S. are usually considered to be out of copyright and in the public domain, over 70% of the public domain materials in HathiTrust are from before 1920.

Figure showing date ranges represented in HathiTrust public domain materials. As of November 2020. Source.

Languages

The English language is represented in about 60% of the HathiTrust public domain materials. The 11 languages in the pie chart below are together represented in about 94% of those materials.

Figure showing languages represented in HathiTrust's 3,131,760 items. As of November 2020. Source.

In addition, the following 26 languages are each represented in 1,000 or more items (in descending order of frequency): Swedish, Danish, Hebrew, Arabic, Ancient Greek, Polish, Hungarian, Czech, Modern Greek, Norwegian, Turkish, Armenian, Icelandic, Persian, Ukrainian, Yiddish, Finnish, Sanskrit, Croatian, Welsh, Catalan, Serbian, Thai, Turkish, Romanian, Bulgarian.