Update on 2008 Web Harvest, 31 October 2008
The National Library of New Zealand has a social responsibility to preserve New Zealand's social and cultural history, be it in the form of books, newspapers and photographs, or of websites, blogs and YouTube videos. An increasing amount of New Zealand's documentary heritage is only available online. New Zealanders find this content valuable and convenient, but its impermanence, lack of clear ownership, and dynamic nature pose significant challenges to our efforts to collect and preserve it.
The public benefit from the safe, long-term preservation of New Zealand's online heritage is incalculable. Our online social history and much government and institutional history will be preserved for researchers, historians, and ordinary New Zealanders. We will be able to look back on internet documents as we do the printed words left to us by previous generations.
In recognition of the increasingly central role of the internet in all areas of life, the National Library has undertaken a web harvest of the entire .NZ domain and also of a list of approximately 500 websites with New Zealand content that sit outside the .NZ domain (for example, under .com or .net).
The domain harvest finished slightly ahead of schedule on Thursday 23 October 2008. The harvester collected 105 million URLs, about 10 million every day. It harvested about 4.1 terabytes of data, which compresses down to slightly less than 3 terabytes (a terabyte is 1,024 gigabytes, or 1.05 million megabytes). This is a unique and significant undertaking that will preserve invaluable information about New Zealand life, which otherwise might be lost.
We recognise that there have been a number of issues associated with our recent harvest. It was absolutely not our intention to cause disruption to services.
We have been working very closely with the web community since these issues were flagged to us, and meetings have been scheduled with industry groups to discuss the situation.
Further information
Web Harvest update (15 October 2008)
Answers to questions about the Web Harvest (20 October 2008, updated 21 & 29 October)

