Improving Web Archiving Practices at UNC University Libraries
The project led by Hannah Wang aims to establish a comprehensive web archiving policy at UNC University Libraries following challenges faced in preserving digital content. The current reliance on Archive-It poses limitations in storage management, prompting the need for a library-wide strategy to enhance web archiving workflows and sustainability.
Presentation Transcript
UNC Chapel Hill Libraries Web Archiving System. SAA Research Forum, August 2016. Hannah Wang (Project Leader), Holly Croft, Matt Cresson, Sangeeta Desai
Project Summary and Scope Our project seeks to generate a library-wide policy for the web archiving practices of the UNC University Libraries. The project emerged from the Carolina Digital Repository's initial request based on their difficulty with preserving an undergraduate capstone project that had a large web component. Our initial interview with Preservation Librarian Andrew Hart and Repository Librarian Julie Rudder revealed the need for a library-wide web archiving policy.
UNC Library System: A Federated System The UNC Library System consists of the various campus libraries, University Archives and Records Management (UARMS), and special collections departments (e.g., CDR, North Carolina Collection, Southern Folklife Collection). All libraries and departments are funded through the university. UARMS currently does the bulk of web archiving.
Current Web Archiving System All UNC Library departments that are archiving websites currently use Archive-It. Archive-It is a proprietary web archiving service with an annual fee that is used by approximately 75% of repositories that have web archiving systems. Archive-It is a web-served system that utilizes Heritrix as a web crawler and the Internet Archive Wayback Machine as its system for accessing the created archive.
Current Web Archiving System (continued) Because the departments employing web archiving all use the same Archive-It account, they must archive their web content in a single Archive-It storage allocation that formally belongs to the University Libraries. Problem: Once the annual storage limit is reached, all web archiving activities stop. The University cannot buy more storage space until the next fiscal year. At present, no clear policies exist for allocating the storage space or for overseeing it.
Current Archive-It Workflow for the entire UNC-Library System
Recommendations for New System Two primary recommendations: 1. Centralize authority over library web archiving activities. 2. Develop policies for accessioning and archiving different types of websites.
Archiving UNC Web Material Activity Model *CDP = Collection Development Policy
Implementation Strategy One option we are recommending is that University Libraries hire an employee who is responsible for overseeing web archiving and other digital assets: a Digital Collections Archivist/Librarian. University Libraries previously had this type of position and then reorganized, which has led to a piecemeal system that does not work well. Short of this, University Libraries must at the very least ensure that one person is the point person for web archiving and is given the necessary authority.
Implementation Strategy (Continued) UNC will need to keep increasing storage space for web archiving, a cost that will accrue under both the old and new systems. The cost to hire a Librarian 2 on a permanent basis at UNC is roughly $2,073,897.20 over 25 years. Annual salary and insurance cost = $82,955.88: $63,720 base salary + 22.04% benefits ($14,043.88) + $5,192 fixed health = $82,955.88. $82,955.88 x 25 years = $2,073,897.20.
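The staffing estimate above can be checked with a short calculation. This is a minimal sketch, not part of the original presentation: the function name `librarian_cost` and its parameter names are hypothetical, and treating $63,720 as the base salary is an assumption drawn from how the slide lays out the figures.

```python
from decimal import Decimal


def librarian_cost(base_salary, benefits_rate, fixed_health, years):
    """Return (annual_cost, total_cost) for a salaried position.

    Hypothetical helper: the annual cost is base salary, plus benefits
    as a fraction of salary, plus a fixed health contribution.
    """
    benefits = base_salary * benefits_rate
    annual = base_salary + benefits + fixed_health
    return annual, annual * years


# Figures taken from the slide; Decimal avoids binary floating-point drift.
annual, total = librarian_cost(
    base_salary=Decimal("63720"),
    benefits_rate=Decimal("0.2204"),
    fixed_health=Decimal("5192"),
    years=25,
)

print(f"Annual cost:  ${annual:,.2f}")
print(f"25-year cost: ${total:,.2f}")
```

The unrounded annual figure is $82,955.888, which the slide appears to truncate to $82,955.88. Multiplying the unrounded figure by 25 gives the slide's $2,073,897.20; multiplying the truncated figure would give $2,073,897.00 instead.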
The Exception What about the CDR, though? Webrecorder.io is a product of Rhizome that debuted at the International Internet Preservation Consortium (IIPC)'s first annual Web Archives Conference in Reykjavik, Iceland, on April 15, 2016. It's currently in beta, but will be an open source product. We don't yet know what kind of charges there will be for storage. It provides on-demand web archiving, which is perfect for the CDR.
Resources / Literature Review Archive-It, The Archive-It Life Cycle Model; #iipcWAC16; Lisa Gregory, The Practice and Perception of Web Archiving in Academic Libraries and Archives; Stanford University Libraries, Web Archiving Policy
Acknowledgments Thank you to the following people, whom we interviewed or emailed for information: Julie Rudder, Andy Hart, Nicholas Graham, Jennifer Coggins, Matthew Farrell, Lisa Gregory, Stewart Varner