Web harvesting is a term used by the National Library to describe the selecting, copying and archiving of websites found on the internet. The collection of New Zealand websites is covered by Legal Deposit legislation (National Library of New Zealand Act 2003, Part 4).
General information for Publishers
Most web harvesting is undertaken on a selective basis by the Alexander Turnbull Library. These websites form the New Zealand Web Archive, which is part of the Alexander Turnbull Library's published collections.
In 2008 and 2010, the National Library of New Zealand undertook special harvests of the .nz domain. These provide snapshots of the New Zealand internet at those times. The harvests are not currently available to the public.
More about the New Zealand Web Archive
More about the New Zealand Web Harvest
Technical information for Webmasters
The Library uses the Web Curator Tool to acquire copies of the publicly available pages on a website. If the Library takes a copy of your website you will see this identifier in your website access logs:
Mozilla/5.0 compatible; heritrix/1.14.1 +http://www.natlib.govt.nz/website-harvest
If our crawling is affecting your site, please contact us immediately.
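As an illustration only, a webmaster could check whether the harvester has visited by scanning an access log for the identifier quoted above. The sketch below is not a Library tool. It assumes the server writes the common "combined" log format, with the user agent as the last quoted field, and it matches on the substring "heritrix" in case the version number changes. The function names and the sample log lines are hypothetical.

```python
import re
from collections import Counter

# Substring of the identifier quoted above; matching on it alone is an
# assumption made so that a different crawler version is still detected.
HARVESTER_MARKER = "heritrix"

# Assumes the "combined" access log format (host, identity, user, time,
# request, status, size, referer, user agent). Adjust for other formats.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def harvester_hits(lines):
    """Yield (time, request) for each line whose user agent names the harvester."""
    for line in lines:
        match = LOG_LINE.match(line)
        if match and HARVESTER_MARKER in match.group("agent").lower():
            yield match.group("time"), match.group("request")


def hits_per_hour(lines):
    """Count harvester requests per hour, keyed like '10/Oct/2010:21'."""
    counts = Counter()
    for time, _request in harvester_hits(lines):
        counts[time[:14]] += 1  # day/month/year plus the hour
    return counts


# Hypothetical sample lines, for illustration only.
sample = [
    '203.0.113.5 - - [10/Oct/2010:21:15:02 +1300] "GET /index.html HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 compatible; heritrix/1.14.1 '
    '+http://www.natlib.govt.nz/website-harvest"',
    '198.51.100.7 - - [10/Oct/2010:21:16:40 +1300] "GET /about.html HTTP/1.1" '
    '200 2048 "-" "Mozilla/5.0 (Windows NT 6.1) Firefox/3.6"',
]

for time, request in harvester_hits(sample):
    print(time, request)
```

To scan a real log, pass an open file instead of the sample list, for example `hits_per_hour(open("access.log"))`, where the file name is a placeholder. A high count within a single hour may help when describing any impact to the Library.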
Most of our harvests are scheduled for evenings or weekends, with a delay between page requests so that we cause minimal disruption while web pages are copied. We have found it necessary to ignore the robots.txt protocol in order to obtain a copy that retains the look of the original website. We monitor the Web Curator Tool logs while the harvest is in progress to check for any crawler traps and avoid them if necessary. If we have technical problems downloading your website we may contact you.
More about the Web Curator Tool
Contact us
New Zealand Web Archive
Email: web.archive@dia.govt.nz
Phone: (04) 470 4564
For information about the Web Curator Tool email wct@natlib.govt.nz


