Enhancing Discovery & Access of Web Archives at Harvard Library

by Tricia Patterson, Digital Preservation Analyst

A scholar walks into an online discovery platform. The scholar says to the search bar, “Can I get the archived website of Harvard’s Department of Sanskrit and Indian Studies from 2007?” The search bar looks confused; “Archived website? We don’t have any of those here – how about a Networked Resource or a Manuscript instead?” The scholar sighs in defeat, abandoning their research forever…

Okay, so maybe that’s overstating the ramifications.

But Harvard Library’s web archives collections have never been cataloged in a systematic or standardized way, leaving these unique resources without a direct pathway for discovery and without advocacy for their use. So while it may not inspire our scholars to abandon their research, neither does it elevate their experience. Consequently, the Web Archives Advisory Group (WAAG), as part of a four-year initiative to make web archiving a more sustainable programmatic activity across Harvard Library, developed recommendations for enhancing the discovery and access of these collections.

When web archiving initiatives first began at our institution in 2007, just harvesting the resources was pioneering work. Best practices for cataloging the content so that it could actually be discovered had not yet been established, so individual library units had to shoehorn the content into HOLLIS by treating it as an extension of other genre types. The lack of standards around cataloging web archives resulted in the legacy records we still have today.

A breakdown of the features a user currently sees when using HOLLIS.

In 2017, WAAG charged a Web Archives Access and Discovery Working Group to survey peer institutions’ solutions to web archives discovery and offer recommendations for what Harvard could implement to make this content more discoverable. The resulting implementation suggestions centered on adopting and standardizing cataloging practices for the genre and enhancing our discovery platform to enable more dynamic search results.

A breakdown of the features a future user will see when using HOLLIS.

As of Fall 2020, WAAG is sunsetting, concluding four years of collaborative work to transform web archiving from a project into a Library-wide programmatic activity. These implementation recommendations for enhancing web archives access and discovery will now swim upstream to the Access and Discovery Stewardship Committee for prioritization and technical implementation, so that on a fine, clear day in the near future, that researcher can walk back into that online discovery platform and get handed the 2007 Department of Sanskrit and Indian Studies website they so dearly desire.