Celebrating Twenty-Five Years of Digital Preservation
By Stephen Abrams, Head of Digital Preservation
Harvard Library Preservation Services
This week marks the 25th anniversary of the Digital Repository Service (DRS). The DRS is Harvard Library’s centrally supported platform for long-term preservation of Harvard’s deep, broad, and unique digital collections. These materials span all genres and structural forms involved in the University’s research, teaching, and learning mission, as well as administrative records necessary for institutional continuity and operational productivity.
Harvard was an early adopter and leader in the field of digital preservation, which it first addressed in the context of the Library Digital Initiative (LDI) in 1998. Initial design of the DRS began in 1999, and the system moved into production operation on October 21, 2000. The first preserved item? A digitized photograph from the Harvard-Yenching Library’s Hedda Morrison collection.
Hedda Morrison, Buddhist Nun Kneeling at Altar, ca. 1933-1946, Harvard-Yenching Library, HM01.6811, https://nrs.harvard.edu/URN-3:FHCL:198
In the ensuing years, the DRS has grown substantially and now provides preservation stewardship for over 11.1 million items, represented by 238 million files in 109 formats, totaling 618 TB. (To give some sense of scale, if all of that material were HDTV, you could binge-watch the DRS 24/7 for 23 years.) This content was contributed by 63 curatorial units across the Library and University, and is arranged in 484 thematic collections. The most recent item preserved in the DRS? A digitized Piranesi etching from the Harvard Art Museums.
Giovanni Battista Piranesi, Remains of the temple of Concord, Harvard Art Museums, M2869.1.34.1, https://nrs.harvard.edu/URN-3:HUAM:819287
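For readers curious about the HDTV comparison above, the arithmetic can be sketched as follows. The post does not state the video bitrate behind the 23-year figure, so this sketch works backwards from the 618 TB total to the bitrate it implies. It assumes decimal terabytes and a 365.25-day year, and the function names are illustrative only.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60
BYTES_PER_TB = 10**12  # assumption: decimal terabytes, not binary tebibytes


def viewing_years(total_tb, bitrate_mbps):
    """Years of continuous playback for total_tb of video at bitrate_mbps."""
    total_bits = total_tb * BYTES_PER_TB * 8
    seconds = total_bits / (bitrate_mbps * 1_000_000)
    return seconds / SECONDS_PER_YEAR


def implied_bitrate_mbps(total_tb, years):
    """Average bitrate (Mbit/s) at which total_tb lasts exactly `years`."""
    total_bits = total_tb * BYTES_PER_TB * 8
    return total_bits / (years * SECONDS_PER_YEAR) / 1_000_000


if __name__ == "__main__":
    rate = implied_bitrate_mbps(618, 23)
    print(f"Implied average bitrate: {rate:.1f} Mbit/s")
    print(f"Viewing time at that rate: {viewing_years(618, rate):.1f} years")
```

Under these assumptions, the 23-year figure corresponds to an average of roughly 6.8 Mbit/s, which is in the range of compressed high-definition video.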
Responsibility for the DRS is shared between the Digital Preservation Services (DPS) team in HL Preservation Services (HLPS), who act as the service owner on behalf of University stakeholders, and HUIT Library Technology Services (LTS), who exercise technical and operational responsibility.
All items in the DRS are subject to proactive automated and human attention and care. Ongoing preservation monitoring and analysis by DPS ensures the long-term integrity, authenticity, accessibility, and usability of preserved resources. All files are replicated on geographically dispersed storage platforms and media types to avoid single points of technological failure. All replicas are audited periodically to ensure that they remain faithful copies of each other; when very occasional – but somewhat inevitable – discrepancies arise, damaged copies are replaced with validated data. DPS also initiates appropriate remediation when necessary: for example, the 2016 migration of audio items from the obsolete RealAudio format to equivalent MP3s, and the recent migration of image items from the obsolete Photo CD format to equivalent JPEG 2000s. When assessed according to the NDSA Levels of Preservation, a maturity model widely adopted by the international digital preservation community, the DRS conforms to the highest levels of professional best practice. DPS participates in many international membership organizations contributing to progress in the art and science, theory and practice of digital preservation, including the Digital Preservation Coalition (DPC), the National Digital Stewardship Alliance (NDSA), and the Open Preservation Foundation (OPF), all of which offer valuable advocacy, education, and technical services.
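The periodic replica audit described above is commonly implemented as a fixity check: a checksum recorded when a file is ingested is compared against a freshly computed checksum of each replica. The following is a minimal sketch of that general technique, not a description of how the DRS itself is implemented. The use of SHA-256, the function names, and the file-copy repair step are all assumptions made for illustration.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Iterable, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute a file's SHA-256 digest, reading in chunks to bound memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_damaged_replicas(replicas: Iterable[Path], expected: str) -> List[Path]:
    """Return the replicas whose digest no longer matches the recorded one."""
    return [path for path in replicas if sha256_of(path) != expected]


def repair_replicas(damaged: Iterable[Path], source: Path, expected: str) -> None:
    """Overwrite damaged replicas with a copy of a validated source replica."""
    # Never propagate a copy that has not itself been validated.
    if sha256_of(source) != expected:
        raise ValueError(f"source replica failed validation: {source}")
    for path in damaged:
        shutil.copyfile(source, path)
        # Confirm the repair actually produced a faithful copy.
        if sha256_of(path) != expected:
            raise ValueError(f"repair failed for replica: {path}")
```

In this sketch the recorded digest, rather than a majority vote among replicas, is the source of truth, so a damaged copy can be detected even if every replica has drifted in the same way.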
The DRS has undergone two major technological upgrades: an important functional expansion beginning in 2008 and support for enhanced storage infrastructure in 2021. Recognizing the limitations of supporting locally developed and maintained technology, the Library received generous funding from the University IT Capital Fund in 2023 for a multiyear project of generational modernization for the DRS, the DRS Futures project. The new DRS system, based on the LIBSAFE platform, will replace the legacy system in Spring 2026. This will provide significantly enhanced service functionality to Library stakeholders and position the DRS for its next 25 years of operation!
While the DRS remains the centerpiece of the Library’s preservation infrastructure, digital preservation concern and activity encompass other areas. The Library began initiatives in web archiving in 2006 and email archiving in 2009. While both originally relied on locally developed infrastructure, collection managers now use the hosted Archive-It service to archive over 31,000 websites, and the open-source ePADD tool, recently enhanced with DPS participation, for email archiving. More recently, recognizing the importance of preserving software along with the data that relies on it to be suitably rendered, DPS has opened investigations into local support for software preservation and emulation in cooperation with the Software Preservation Network community and the EaaSI Research Alliance. In 2024, DPS inaugurated a new service, the Digital Accessions Program (DAP), to assist in reducing collection backlogs through central support for born-digital and common physical media types. All of these activities exemplify DPS’s mission of ensuring the continuing viability of Harvard’s extensive and ever-growing – in number, size, diversity, and complexity – digital collections.
To see other highlights across the past 25 years of digital preservation at Harvard, please look to this summary timeline (or PDF). For more information about digital preservation at Harvard, please contact the Digital Preservation Services team through the DPS wiki.