Web harvesting is a term used by the National Library to describe the selecting, copying and archiving of websites found on the internet. The collection of New Zealand websites is covered by Legal Deposit legislation (National Library of New Zealand Act 2003, Part 4).

General information for Publishers

Most web harvesting is undertaken on a selective basis by the Alexander Turnbull Library. These websites form the New Zealand Web Archive, which is part of the Alexander Turnbull Library's published collections.

In 2008 a special harvest of the .NZ domain was undertaken by the National Library of New Zealand. This provided a snapshot of the New Zealand Internet at that time. This harvest is not currently available to the public.

More about the New Zealand Web Archive

More about the New Zealand Web Harvest 2008

More about Legal Deposit

Technical information for Webmasters

The Library uses the Web Curator Tool to acquire copies of the publicly available pages on a website. If the Library takes a copy of your website, you will see this user-agent identifier in your website access logs:

Mozilla/5.0 compatible; heritrix/1.14.1 +http://www.natlib.govt.nz/website-harvest

If our crawling is affecting your site, please contact us immediately.
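To check whether the harvester has visited your site, you can search your access logs for the identifier shown above. The following is a minimal Python sketch, not a Library-provided tool. It assumes your server records the user agent on each log line (as in the common "combined" log format); the sample log lines, the IP addresses, and the function names are hypothetical and for illustration only.

```python
# Substrings taken from the identifier quoted above. Matching on substrings
# rather than the whole string tolerates small formatting differences
# (for example, added parentheses) between servers.
HARVESTER_MARKERS = ("heritrix", "natlib.govt.nz/website-harvest")


def is_library_harvester(log_line: str) -> bool:
    """Return True if the log line carries the Library's harvester identifier."""
    lowered = log_line.lower()
    return all(marker in lowered for marker in HARVESTER_MARKERS)


def harvester_requests(log_lines):
    """Return only the log lines that were produced by the harvester."""
    return [line for line in log_lines if is_library_harvester(line)]


# Hypothetical sample lines in combined log format (not real log data).
SAMPLE_LOG = [
    '203.0.113.5 - - [12/Jul/2009:20:15:01 +1200] "GET /index.html HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 compatible; heritrix/1.14.1 '
    '+http://www.natlib.govt.nz/website-harvest"',
    '198.51.100.7 - - [12/Jul/2009:20:15:03 +1200] "GET /about.html HTTP/1.1" '
    '200 2048 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]

if __name__ == "__main__":
    for line in harvester_requests(SAMPLE_LOG):
        print(line)
```

To run this against a real log, read the file's lines and pass them to harvester_requests; the path to your access log depends on your server configuration.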

Most of our harvests are scheduled for evenings or weekends, with a delay between page requests, so that we cause minimal disruption while web pages are copied. We have found it necessary to ignore the robots.txt protocol in order to obtain a copy that retains the look of the original website. We monitor the Web Curator Tool logs while a harvest is in progress to check for crawler traps and avoid them where necessary. If we have technical problems downloading your website, we may contact you.
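If you want to see how widely spaced the harvester's page requests were on your own server, you can measure the gaps between its log entries. The following Python sketch is one possible approach under stated assumptions: it assumes combined-format access logs in which the first bracketed field is a timestamp such as [12/Jul/2009:20:15:01 +1200], and the sample lines and function name are hypothetical. It does not describe how the Web Curator Tool itself schedules requests.

```python
import re
from datetime import datetime

# Assumption: the first [...] field on each line is the request timestamp,
# as in the common/combined access log formats.
TIMESTAMP = re.compile(r"\[([^\]]+)\]")
TIMESTAMP_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def request_gaps(log_lines, marker="heritrix"):
    """Return the seconds between consecutive requests whose line contains marker."""
    times = []
    for line in log_lines:
        if marker not in line.lower():
            continue  # not a harvester request
        match = TIMESTAMP.search(line)
        if match is None:
            continue  # line has no bracketed timestamp; skip it
        times.append(datetime.strptime(match.group(1), TIMESTAMP_FORMAT))
    times.sort()
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


# Hypothetical sample lines (not real log data).
SAMPLE_LOG = [
    '203.0.113.5 - - [12/Jul/2009:20:15:01 +1200] "GET / HTTP/1.1" 200 512 '
    '"-" "Mozilla/5.0 compatible; heritrix/1.14.1 '
    '+http://www.natlib.govt.nz/website-harvest"',
    '198.51.100.7 - - [12/Jul/2009:20:15:02 +1200] "GET /a HTTP/1.1" 200 100 '
    '"-" "Mozilla/5.0 (X11; Linux x86_64)"',
    '203.0.113.5 - - [12/Jul/2009:20:15:06 +1200] "GET /b HTTP/1.1" 200 256 '
    '"-" "Mozilla/5.0 compatible; heritrix/1.14.1 '
    '+http://www.natlib.govt.nz/website-harvest"',
]

if __name__ == "__main__":
    print(request_gaps(SAMPLE_LOG))
```

Consistently short gaps during busy periods would be useful detail to include if you contact the Library about the impact of a harvest.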

More about the Web Curator Tool

Contact us

New Zealand Web Archive

Email web.archive@natlib.govt.nz

Phone (04) 474 3000 x 8676

For information about the Web Curator Tool email wct@natlib.govt.nz