About the Web Curator Tool
The Web Curator Tool is software for acquiring web material, such as websites, web pages, and other documents published on the internet. The collected web material can then be stored and preserved in a digital archive. The National Library uses the Web Curator Tool to collect New Zealand's online heritage.
The Library and the British Library developed the tool collaboratively. The first version was completed and released as open-source software in September 2006. A follow-up release was made in August 2007.
The Web Curator Tool, along with manuals, FAQ documents, source code, development documentation and further information, is available on the Web Curator Tool website.
The Web Curator Tool was a finalist in the Digital Preservation category of the United Kingdom's 2007 Conservation Awards, and was shortlisted for the 2007 Computerworld Excellence Awards.
Selective harvesting and domain harvesting
Web harvesting is the process of downloading web material. There are two basic approaches: selective harvesting and domain harvesting.
In selective harvesting, high-value web material is identified and then scoped for harvest. In domain harvesting, the aim is to harvest all the web material within an internet domain; for example, all the websites whose domain names end in '.nz'.
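The difference between the two approaches can be pictured as two scope checks applied to each URL a harvester encounters. The following is a minimal sketch for illustration only; it is not taken from the Web Curator Tool's source code, and the function names `in_domain_scope` and `in_selective_scope` and the example seed hostnames are hypothetical.

```python
from urllib.parse import urlsplit


def in_domain_scope(url: str, suffix: str = ".nz") -> bool:
    """Domain harvest: accept any URL whose hostname falls under the suffix."""
    host = urlsplit(url).hostname or ""
    # Check the hostname, not the whole URL, so a path such as
    # "/file.nz" on a foreign site is not mistaken for a .nz website.
    return host == suffix.lstrip(".") or host.endswith(suffix)


def in_selective_scope(url: str, seeds: set[str]) -> bool:
    """Selective harvest: accept only URLs on hand-picked seed hostnames."""
    host = urlsplit(url).hostname or ""
    return host in seeds


if __name__ == "__main__":
    # Hypothetical seed list chosen by a curator.
    seeds = {"www.example.co.nz"}
    for url in ("https://www.example.co.nz/about", "https://other.co.nz/"):
        print(url, in_domain_scope(url), in_selective_scope(url, seeds))
```

In this sketch a domain harvest accepts both example URLs, while the selective harvest accepts only the one on the curator's seed list. A real harvester would apply further rules (path prefixes, depth limits, exclusions) that are not shown here.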
The Web Curator Tool supports selective harvesting by allowing curators, librarians or archivists to perform several crucial tasks:
- selecting the web material to harvest
- setting a schedule for harvesting the web material
- performing the harvest, by downloading the scoped web material at the scheduled time
- reviewing the harvest results to ensure the web material has been downloaded properly
- sending the web material to a digital archive for storage.
Software development
The National Library and the British Library are jointly funding and developing the Web Curator Tool. Throughout the development process, we have collaborated with the National Library of Australia, the UK Web Archiving Consortium, and the International Internet Preservation Consortium.
Sytec Resources Ltd. was selected through a competitive tender process to design and build the Web Curator Tool software. The tool has been developed as an open-source project for the benefit of other organisations building collections of web material.
Further information
Contact the British Library about this initiative
- Contact Us
- Web Curator Tool Project

