Showing posts with label wayback machine. Show all posts

20 April 2011

How Can Your Content Live After You Die?

The current computer scene is notable for the role played by user-generated content (UGC): Facebook, Twitter, Flickr, YouTube etc. are all driven by people's urge to create and share.

Most of this is done by relatively young people; this means death is unlikely to be high on their list of preoccupations. Which also implies that they are probably not thinking about what will happen to all the content they create when they do die.

So we find ourselves in a situation where more and more content is being produced - not all of it great, by any means, but certainly characteristic of our time and important to the people that create it and their family, friends and users. Despite that rapid accumulation, no one is really trying to address the issue of what is going to happen to it all as users die.

This is quite separate from the more immediate problem of services shutting down, as is happening with Google Video. At least in these cases, you generally have the option to transfer your content to some other site. But what happens when you - the creator, the uploader, the one that is nominally responsible for that content - are no longer around to do that?

You might hope that your heirs, whoever they might be, would carry on with things. But that presupposes that you leave all your passwords with them - in your will, perhaps? There are probably also issues to do with changing over the ownership of accounts - again, something that has not needed tackling much yet.

But is it really realistic to expect your family and friends to carry on caring for your content? After all, they will probably have their own to worry about. And what happens when they die? Will they then pass on not only their own UGC, but yours too? Won't that create a huge digital ball and chain that grows as it is passed on to the unlucky recipient? Hardly a recipe for sustainability.

Doubtless at some point some sharp entrepreneur will interpret this coming need as an opportunity. Just as you can pay a company to keep your cryogenically-preserved body against the day when a cure will be found for whatever ailment you eventually die of, so there will be companies offering digital immortality for your content.

The key question - as for those cryogenic preservation companies - is: will they really be around in hundreds of years' time? Of course, that's not really a problem for those sharp entrepreneurs that have your money *now*; and there's also not much you will be able to do about it if they don't make good on their side of the bargain...

What we need are repositories where content can be stored safely with a very particular audience in mind: posterity. To a certain extent, the Internet Archive already does that, but as I know from my own blog posts, its coverage is very patchy. And that's to be expected: a single organisation cannot hope to archive the entire Internet, including its second-by-second changes.

Moreover, depending on one organisation is like putting all of the world's knowledge in the Library of Alexandria and nowhere else: after a good fire or two, you have lost everything. No, the solution is clearly to store the world's digital heritage in a distributed fashion.

We could start with national repositories, like the great deposit libraries that have a copy of every book published in their land. Net holdings might likewise be organised nationally - after all, if every country did this, the world's output would be covered.

But clearly that's not a safe option either: ideally, you want multiple backups of national material to build in redundancy. You'd also want vertical markets to be stored by relevant organisations - every architectural site by some architectural body, every fishing site by some suitable organisation. You might have even more local stores of data in local libraries, or in local universities. Obviously the more the merrier (although it would be good to have some protocol so that they could all signal their existence and what they held to each other.)
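To make that closing parenthesis a little more concrete, here is a minimal sketch in Python of what such a protocol might look like. Everything in it is my own assumption for illustration: the `Manifest` record, the `merge_manifests` and `under_replicated` helpers, and the repository and site names are all hypothetical, not part of any existing standard.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Manifest:
    """Hypothetical announcement a repository publishes about itself."""
    repository: str                             # who is holding the copies
    holdings: set = field(default_factory=set)  # sites it has archived

    def to_json(self) -> str:
        # Sets are not JSON-serialisable, so emit a sorted list instead.
        return json.dumps({"repository": self.repository,
                           "holdings": sorted(self.holdings)})

    @classmethod
    def from_json(cls, text: str) -> "Manifest":
        data = json.loads(text)
        return cls(data["repository"], set(data["holdings"]))


def merge_manifests(manifests):
    """Build an index: site -> sorted list of repositories holding a copy."""
    index = {}
    for manifest in manifests:
        for site in manifest.holdings:
            index.setdefault(site, []).append(manifest.repository)
    return {site: sorted(repos) for site, repos in index.items()}


def under_replicated(index, minimum=2):
    """Return the sites held by fewer than `minimum` repositories."""
    return sorted(site for site, repos in index.items()
                  if len(repos) < minimum)
```

Each store would publish its manifest as JSON; anyone could then merge the manifests they know about and ask `under_replicated` which sites still need another home. A real scheme would also have to deal with discovery, versions of a site over time, and trust, none of which this sketch attempts.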

Of course, none of this is going to happen, because the intellectual monopolists would be squawking their heads off about the inclusion of "their" content. This would have knock-on consequences for UGC, since, as we know, the boundary between fair use and copyright infringement is ill-defined without hugely-expensive court cases. No organisation is going to take the risk of getting it wrong given the insanely litigious nature of the content companies.

And so we must sit back and contemplate not only the inevitability of our own demise - however far off that might be - but also the inevitable destruction of all that really ace content we have created and will create. Because, you know, maintaining that 18th-century intellectual monopoly is just so much more important than preserving the unparalleled global explosion of human creativity we are currently witnessing online.

Follow me @glynmoody on Twitter or identi.ca.

14 January 2009

Censorship = Destroying the Past

This is what censorship is about: destroying our memories.

According to multiple customers of Demon Internet - now owned by Brit telecom Thus - the London-based ISP is blocking access to all sites stored in the archive. When they query the Wayback Machine, hoping to retrieve archived pages, customers are met with generic "not found" error pages. But judging from their urls, these pages are generated by a web filter based on the blacklist compiled by the Internet Watch Foundation, a government-backed organization charged with policing online pornography.

22 October 2007

Open Content Alliance - Good, but not New....

Nice story in the New York Times about libraries choosing to go with the Open Content Alliance rather than that nice Mr. Google or Mr. Microsoft:

Several major research libraries have rebuffed offers from Google and Microsoft to scan their books into computer databases, saying they are put off by restrictions these companies want to place on the new digital collections.

The research libraries, including a large consortium in the Boston area, are instead signing on with the Open Content Alliance, a nonprofit effort aimed at making their materials broadly available.

Libraries that agree to work with Google must agree to a set of terms, which include making the material unavailable to other commercial search services. Microsoft places a similar restriction on the books it converts to electronic form. The Open Content Alliance, by contrast, is making the material available to any search service.

That's all jolly well and good, but what I can't understand is that the blogosphere is going nuts about this "new" initiative:

The Internet Archive, whose main claim to fame is the Wayback Machine, designed to archive the internet's web history, has created a new project: the Open Content Alliance.

Well, no, not as such:

The Open Content Alliance (OCA) represents the collaborative efforts of a group of cultural, technology, nonprofit, and governmental organizations from around the world that will help build a permanent archive of multilingual digitized text and multimedia content. The OCA was conceived by the Internet Archive and Yahoo! in early 2005 as a way to offer broad, public access to a rich panorama of world culture.

So founded in 2005; and as its press archive shows, it's hardly been dormant since then....

Update: More details from Da Man himself, Brewster Kahle, here.

06 December 2006

Wayback: 85,898,456,616 and Counting

The Wayback Machine is one of the Internet's best-kept secrets:

A snapshot of the World Wide Web is taken every 2 months and donated to the Internet Archive by Alexa Internet. Further, librarians all over the world have helped curate deep and frequent crawls of sites that could be especially important to future researchers, historians, and scholars.

As web pages are changed or deleted every 100 days, on average, having a resource like this is important for the preservation of our emerging cultural heritage.

And even for someone like me, who uses it all the time, numbers like this still take the breath away:

The Internet Archive's Wayback Machine now has 85,898,456,616 archived web objects in it

plus

The database contains over 1.5 petabytes of data that came from the web (that is 1.5 million gigabytes) which makes it one of the largest databases of any kind.

And a cyber-pearl beyond price. (Via Open Access News.)
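For anyone who wants to check the arithmetic in those quoted figures, here is a small Python sketch. It assumes decimal units (1 petabyte = 10^15 bytes, 1 gigabyte = 10^9 bytes), and the average size per object is only a rough indication: the source says "over" 1.5 petabytes and does not state that both numbers were measured at the same moment.

```python
# Decimal (SI) units; an assumption, since the quote does not say
# whether decimal or binary units are meant.
PETABYTE = 10**15
GIGABYTE = 10**9


def petabytes_to_gigabytes(petabytes):
    """Convert petabytes to gigabytes using decimal units."""
    return petabytes * PETABYTE / GIGABYTE


ARCHIVED_OBJECTS = 85_898_456_616   # figure quoted above
DATABASE_PETABYTES = 1.5            # "over 1.5 petabytes", so a lower bound

gigabytes = petabytes_to_gigabytes(DATABASE_PETABYTES)

# Rough lower bound on the average size of one archived object, in bytes.
average_bytes = DATABASE_PETABYTES * PETABYTE / ARCHIVED_OBJECTS

print(f"{gigabytes:,.0f} gigabytes")
print(f"about {average_bytes / 1000:.1f} kilobytes per archived object")
```

The first line confirms the "1.5 million gigabytes" in the quotation; the second works out to somewhere in the region of 17 kilobytes per archived object, on the assumptions stated.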

21 November 2006

The Beginning of the End for Novell?

This is a characteristically brilliant post from Pam over at Groklaw, particularly in the way it uses the Wayback Machine to skewer Novell as it twists in the wind. It concludes:

So, here's the question I have for Novell: what happened to that promise to protect FOSS with its patent portfolio? Novell did say it. We relied upon it, and OIN is totally separate from the above promise. I mention that because some Novell guys have been saying that Novell never made any such promise or that the OIN patents fulfill the promise. Read the promise again. Novell clearly promised to use its patent portfolio, not OIN's, and Novell appears to have just bargained that patent portfolio away, giving Microsoft a clear path to now bring patent infringement claims against everyone else. Novell's character and honor is on the line. And we await your statement with interest.

But arising from this, I too have a couple of questions that are starting to loom large in my mind:

Is this the beginning of the end for Mono? If Novell continues along its current path surely everything it touches will be regarded as tainted by the free software community, and Mono is sponsored by Novell. And now that Sun has done the decent thing with Java, there is a nice little programming language just waiting for all those disappointed hackers.

The other question is even bigger: is this the end for Novell? It seems to me that there is a broad-based and massive movement growing within the free software world to ostracise Novell utterly - something that will simply kill the company. As far as I know, this has never been done before - perhaps because the free software world simply wasn't strong enough. Now it is: are we about to see it claim its first victim? (Via AC/OS.)