29 September 2006

European Digital Library, European Archive

Some time back I wrote about the European Digital Library. But it seems that this isn't enough: now we have the European Archive, too, which looks even more ambitious. As well as providing access to digital versions of traditional content, it aims to become a European mirror of the wonderful Internet Archive:

The European Archive is a non-profit foundation working towards universal access to all knowledge. The archive will achieve this through partnerships with libraries, museums, other collection bodies, and through building its own collections. The primary goal of collecting this knowledge is to make it as publicly accessible as possible, via the Internet and other means.


As the web has grown in importance as a publishing medium, we are behind in bringing into operation the archiving and library services that will provide enduring access to many important resources. Where some assumed web site owners would archive their own materials, this has not generally been the case. If properly archived, the Web history can provide a tremendous base for time-based analysis of the content, the topology including emerging communities and topics, trends analysis etc. as well as an invaluable source of information for the future.

The foremost effort to archive the Web has been carried on in the US by the Internet Archive, a non-profit foundation based in San Francisco. Since 1996, the Internet Archive has archived large snapshots of the surface of the web every two months.

This entire collection offers 500 terabytes of data of major significance in all domains that have been impacted by the development of the Internet, that is, almost all. This represents a large amount of data (petabytes in the coming years) to crawl, organize and give access to.

By partnering with the Internet Archive, the European Archive is laying down the foundation of a global Web archive based in Europe.

Obviously, all this raises scads of questions to do with access and copyright, but at least it's a start.
