20 June 2011

British Library Encloses the Public Domain

There's considerable excitement about an announcement from the British Library and Google detailing a wonderful gift to the world:

The British Library and Google today announced a partnership to digitise 250,000 out-of-copyright books from the Library’s collections. Opening up access to one of the greatest collections of books in the world, this demonstrates the Library’s commitment, as stated in its 2020 Vision, to increase access to anyone who wants to do research.

Selected by the British Library and digitised by Google, both organisations will work in partnership over the coming years to deliver this content free through Google Books (http://books.google.co.uk) and the British Library’s website (www.bl.uk). Google will cover all digitisation costs.

Isn't that just swell? Vast quantities of fascinating books in the public domain are being made "available to all", as the press release trumpets:

This project will digitise a huge range of printed books, pamphlets and periodicals dated 1700 to 1870, the period that saw the French and Industrial Revolutions, The Battle of Trafalgar and the Crimean War, the invention of rail travel and of the telegraph, the beginning of UK income tax, and the end of slavery. It will include material in a variety of major European languages, and will focus on books that are not yet freely available in digital form online.

Freely available, too... But, er, exactly *how* freely available?

Once digitised, these unique items will be available for full text search, download and reading through Google Books, as well as being searchable through the Library’s website and stored in perpetuity within the Library’s digital archive.

Fab, and....?

Researchers, students and other users of the Library will be able to view historical items from anywhere in the world as well as copy, share and manipulate text for non-commercial purposes.

But hang on: these are materials that are in the public domain; public domain means that anyone can do anything with them - including commercial applications. So this condition of "non-commercial purposes" means one thing, and one thing only: although the texts themselves are public domain, the digitised texts are not (otherwise it would be impossible to impose the non-commercial clause).

In other words, far from helping to make knowledge freely accessible to all and sundry, the British Library is actually enclosing the knowledge commons that rightfully belongs to humankind as a whole, by claiming a new copyright term for the digitised versions. Call me ungrateful, but that's a gift I can do without.

Follow me @glynmoody on Twitter or identi.ca.

43 comments:

Tony Kennick said...

Can I challenge your phraseology "Encloses the Public Domain"? There is nothing in this that changes the status quo: if you have access to a copy of one of these books and want to put the effort in to digitise it, you still can. You can then do what you like with that. What they seem to be saying is you can't make a profit from their efforts (and I bet if you rock up with a commercial licensing proposition they will listen).
While there would be obvious benefits if they chose a more open licence to release these under, they aren't doing anything to trample your rights as you are heavily implying.

Ian Davis said...

At the Europeana plenary in 2010 I repeatedly asked James Crawford, Engineering Director of Google Books, whether scanned public domain books would be available under a public domain licence. He finally admitted that they would be copyright Google, but freely available.

Not the same thing at all. You're right about Google enclosing the public domain. I'm so reminded of http://en.wikipedia.org/wiki/Enclosure

glyn moody said...

@tony: yes, you're right in legal terms; but practically, that's what they're doing.

For example, for books it owns where there are very few other copies in existence - and given the riches of the BL's collection I'm sure there are many - it will be hard if not impossible to find other institutions that will allow digitisation.

So the BL effectively has a monopoly on those books. And I doubt whether it would be keen to allow anyone else to digitise them and then give away the files to anyone.

Ian Davis said...

Tony, the implication is that now they are available digitally the BL can now withdraw those books from public access on the grounds of preservation.

Also, if the only copies of public domain works are in museums or are unique, how are we expected to produce our own scans of them?

glyn moody said...

@Ian: thanks for confirming that Google are as bad as the BL. I'd assumed that from this:

http://www.google.com/support/forum/p/books/thread?tid=4088c1d3eeea8567&hl=en

glyn moody said...

@Ian: exactly - this is a slippery slope that the BL should never have gone down...

P T B! said...

I heard after they digitize the books they burn them...

glyn moody said...

@PTB: well, makes sense...

Nick Poole said...

Really interesting post and thread. I see and applaud the idea that the public domain should be a free and untrammeled space, and even that the public domain should imply 'commercial' as well as 'non-commercial' uses.

My problem is that very few people advocating this view have a practical answer to the question of how the very considerable costs of Digitisation and long-term preservation *should* be covered if these kinds of restriction are not permitted.

Given that no Government is ever simply going to fund the digitisation and unfettered re-use of the bulk of our cultural heritage, what model do you propose by which these costs should be covered?

Setting aside the usual suspicion that surrounds the motivations of large organisations, is there not an argument to say that the access which the BL and Google are providing is considerably better than the nothing which would take its place if such arrangements weren't made?

mrsean2k said...

@ian

As far as subsequently withdrawing access to original works on the back of the digitisation effort, I see no implication whatsoever in the press release you link to.


@glyn

"far from helping to make knowledge freely accessible to all and sundry"

This is flatly contradicted by the facts you present in your own article.

Could it be more open? Certainly.

So what. It's a vast improvement; on the basis of what's actually in the release, as opposed to what is imagined, far more people will have far more free access.

glyn moody said...

@Nick: you're of course right that a central issue is who pays for the digitisation.

There is an interesting parallel with Ordnance Survey data. Many people (myself included) want it all available free, for any use. The same question comes up: who will pay for its collection?

Well, let's look at the US. There, geographic information is paid for by the US government, and then given away. Why? Because it has generated a multi-billion dollar industry based around geographic information (the Americans wouldn't give away this stuff if there weren't a profit - for the US - in doing so.) More details here:

http://www.guardian.co.uk/technology/2006/mar/09/education.epublic

Similarly, imagine the businesses that could be built around free access to the BL's holdings.

The current approach specifying "non-commercial" only is a failure of imagination that is really penny-wise but pound-foolish....

glyn moody said...

mrsean2k: no, there's nothing in the press release, nor did I say there was; I was pointing out a possible, plausible danger.

It's not "freely accessible to all and sundry" - freely accessible means being able to do what you wish with it; this is looking but not touching - not my idea of free (which I use in the "free software" sense - see the rest of this blog....)

But what worries me is that accepting this accepts the principle that a major institution can take public domain material and take it out of the public domain in this way as a "quid pro quo": this is about principles.

Joe said...

I was at the wikipedia / museums conference hosted by the British Museum last year which had a very interesting collection of attendees.

One of the presentations was by the National Portrait Gallery about their dispute with Wikimedia Commons who had posted high res versions of NPG pictures taken from the NPG website.

Each section of this presentation was named for a Barbra Streisand song. NPG have heard of the Streisand effect.

They explained that Wikimedia commons has now added a disclaimer to each of these pictures noting that Wikimedia thinks they are public domain but NPG claims copyright. This disclaimer includes a link to the NPG website.

NPG has decided to live with this and not pursue the matter further. This disclaimer creates just enough uncertainty to drive serious commercial users to get a contract with NPG.

Lady Bridgeman was in the audience that day and in the discussion afterwards she discussed the Corel vs Bridgeman case, voicing the opinion that the Bridgeman gallery were badly advised and were unlucky to lose that court case.

Wikimedia's response - the day after a U.S. court decides photos of old masters are copyrightable these will disappear from Wikimedia Commons. Until then they stay.

Having this content on the web with an unfree license is a first step but it isn't the end of this road. There are lots of people thinking about these issues and your post here is a useful addition to that discussion.

glyn moody said...

@Joe: thanks for that very useful summary. I wondered what had happened to that case. It is, of course, identical to what the BL are doing, and all part of a worrying trend.

mrsean2k said...

@glyn

I understand how you're employing the phrase, but just because you wish it was so, doesn't make it the only reasonable interpretation.

There's nothing misleading about the article. Millions of people will have vastly greater access.

As it happens, I understand the concept of free (beer and speech) software and I'm an ardent supporter - with actual hard cash - but some realism, please.

If it turns out they have struck some sort of Faustian bargain that means the originals are retired, I'll revise that position.

Andrew Katz said...

Glyn

I'm struggling to see how the BL or Google can claim copyright over the resulting text (this is similar to, but clearer legally than, the hoo-hah over digitisation of images in the National Portrait Gallery a little while ago).

If Google is digitising the text, and if the text is faithful to the original (and if it isn't, then why isn't it?), then the google version of the text cannot attract copyright as it is not an original work.

We really need Google and/or the BL to be asked the question "will the text files of the scanned works be identical to the text of the originals", to which they will only be able to answer "yes" to retain any credibility, at which point they are essentially admitting that the text files are in the public domain.

mrsean2k said...

So I emailed the press office with this:

There have been a number of reports about the effort to digitise 250,000 books in partnership with Google.

Some have criticised the degree of openness of the effort, in particular the condition that access to the digitised version must be on a non-commercial basis.


Can you confirm that the opportunity to access the original copies will remain unchanged, in both ease of access and purpose, in light of the new project?

If this is not the case, can you tell me what additional restrictions will be put in place on access to original works as a direct result of the Google partnership?



And this was the response (I've removed the name only)


Thanks for your email.

The items that will be digitised will be unavailable for a few months, but will then be re-shelved and available for free in our Reading Rooms.

JonB said...

If you want to put the effort in there is absolutely nothing to stop you from making your own digital copies if you want to use them for commercial purpose.

Of course it would be a better situation if Google's digital copies were completely open, but that does nothing to change the fact that this is a quite fantastic project that will be of enormous use to many. Trying to spin this in a negative way is quite astonishing to me.

glyn moody said...

@Andrew: that's a hugely interesting comment. Are you essentially saying *if* the text is identical, it has to be in the public domain, even digital versions, and that conditions can't be attached to it? Because that's clearly a very Big Thing if you are...

glyn moody said...

@mrsean2k: thanks for that. But does that mean they will let *me* digitise the book if I ask them? I think not....

glyn moody said...

@JonB: because there is a huge point of principle here that is much bigger even than 250K books. This is about the future of digital knowledge - and of the public domain.

dare said...

You've been right on the money about a lot of things recently Glyn! I followed you via RSS for a while, but you warranted an official Twitter follow due to your outstanding writing on a variety of freedom-centric tech topics.

Thank you for your voice, I pray it remains true and strong.

Peace,
dare

mrsean2k said...

@glyn

Why speculate, why not ask?

My email was answered within about 10 minutes.

glyn moody said...

@dare: many thanks for that; such support helps keep me strong...(not sure it can guarantee the "true" bit, but hey....)

glyn moody said...

@mrsean2k: because it's incidental to the main point: that the BL is claiming new rights in digitised public domain material. That's a terrible precedent for the future in terms of getting *all* public domain analogue texts online as public domain.

Anonymous said...

If Google can claim copyright of something they scanned, then I should be able to scan any paper book and claim copyright on the digital copy, if it is previously only published on paper?

glyn moody said...

well, this only really relates to public domain materials, but according to Google's logic, it seems so

Andrew Katz said...

@glyn

Yes - I am saying that if the text file made available by Google/BL (typos and all) is identical to the text of the original book, then that text file will exhibit zero originality, and as such will be incapable of attracting copyright.

glyn moody said...

@Andrew: OK, thinking about this some more, it depends whether the scanned pages will allow the text to be copied (as Google Books does for PD books).

If it does, then presumably you're saying people would be able to copy page by page to recreate the complete text?

Hmm, would be interesting to explore this further when the books appear...

Andrew Katz said...

@glyn

Yes, cutting and pasting the text would be one way to do it.

glyn moody said...

@Andrew: thanks for those thoughts

Andrew Katz said...

@glyn
BTW, for some reason, Google Books doesn't regard "Wired Love" by Ella Cheever Thayer as a public domain work despite being published in 1880.

http://books.google.com/books?id=BjAOAAAAYAAJ&q=wired+love&dq=wired+love&pgis=1

- Andrew

glyn moody said...

@Andrew: weird...

Adrian Pohl said...

Hello,

regarding the question of claiming copyright on digitizations of Public Domain books: Recently the German lawyer Till Kreutzer, in a legal guide for German libraries on digitizing public domain material, made clear that in most cases you most probably cannot claim copyright or related rights on digitizations of public domain works. See the guide "Digitalisierung gemeinfreier Werke durch Bibliotheken" (PDF), chapter 5.

Of course these statements only hold for German law...

kg said...

The enormous BL prices of copies of rare books are more serious than this pseudo-problem of Copyfraud.

Google's claim isn't an EULA - this has been confirmed by Google.

You can take the scans of works in the PD and do whatever you want.

The Bridgeman v. Corel decision text has a long discussion of UK law. According to European law I cannot see that pure reproductions could be protected by Copyright.

glyn moody said...

@Adrian: thanks very much for that interesting link. As you say, it's German law, but the logic seems pretty universal - especially the section about automatic processes (like OCR) not giving rise to new copyright.

glyn moody said...

@kg: I don't really know about those prices - how high are they?

Assuming they are not insane (maybe they are...), that seems a better way of generating money than - say - trying to charge licensing fees for commercial use of digital texts.

As far as Google is concerned, when you say scans, do you mean the images, or the OCR'd text as well?

Niall said...

Having worked in AV archiving in the UK, my understanding is that converting an item into digital form (or migrating it to a new standard/platform) legally generates a new work. This work - even though it may be practically identical to an existing one - attracts copyright for the producer/publisher. Unless, of course, the producer waives that right.

This seems partly (if not wholly) nonsensical to me, but it's not the fault of the BL or Google that this is legally the case.

A deeper issue is the BL's position as a copyright library. If, for instance, an author thinks a work of theirs has been plagiarised, they have legal recourse to the BL's catalogue to prove that they originated the work/idea/patent. This is one of the reasons that the BL have generally avoided digitising in-copyright works. However, having BL acting as a digital publisher even of out of copyright material puts them in an ambiguous position: digital copies generate (or infringe existing) copyright willy-nilly. It's just the way the law works.

glyn moody said...

@Niall: thanks for those. Certainly, I'm not saying this is wholly BL's fault - the legal system is clearly largely to blame. I just don't see it being very helpful in trying to (a) deal with these problems (b) change them.

filmoyster said...

I'll give you a breakdown of what the British Library is currently doing. They take a public domain book and scan it in. If I want to use a photo from that book for commercial purposes from their library I will have to pay them a fee. I called for a quote and they want 350 pounds, or 550 dollars, for the use of an image they scanned in.

It is one thing to pay a current copyright holder a fee for a photo, but I think it is an outright hustle for them to charge such high fees for images in the public domain.

The other aspect is that if I were to find the same image from another book at a public library, scan it in myself, the British Library could potentially turn around and sue me for "their" image. The burden would be upon myself to provide documentation that I went to another library and made the scan myself - my own "faithful reproduction of a document in the public domain."

I think it's a racket, and a complete violation of the spirit of public domain.

BTW there are holdings at other online libraries that do not have the same restrictions. More importantly, they cite where the book resides so that you can source the information.

glyn moody said...

@filmoyster - thanks for that update. hadn't seen those figures before

Sheogorath said...

Don't worry so much, it's only a publisher's copyright, which will expire on 01/01/2037, if I have my maths correct.

glyn moody said...

@sheogorath: not sure where you get that figure from - could you explain, please?