23 November 2007

We Demand Books on Demand

One of the interesting results of the move to digital texts is a growing realisation that analogue books still have a role to play. It's also clear that analogue books serve different functions, and that these feed into their particular physical forms. So some books may be created as works of art, produced to the very highest physical standards, while others may simply be convenient analogue instantiations of digital text.

Public domain books are likely to fall into the latter class, which means that ideally there should be an easy way to turn such e-texts into physical copies. Here's one:

This is an experiment to see what the demand for reprints of public domain books would be. This free service can take any book from the Internet Archive (that is in public domain) and reprint it using Lulu.com. Prices of the books are rounded up from Lulu.com cost prices to the nearest $0.99 to cover the bandwidth and processing power that we rent from Amazon using their EC2 service. There is also a short post on my blog about it.

How Does It Work

Anyone with an email address can place a request on this page using an Internet Archive link or ID. Your request will be forwarded to our conversion server, which will convert the appropriate book to printable form and send it off to Lulu.com. When the book has been uploaded, it will be made available for immediate ordering and shipping, and you will receive a link to it via email. Currently, only soft cover books are supported in 6"x9", 6.625"x10.25" or 8"x11" trim sizes.
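The pricing rule quoted above - rounding Lulu.com cost prices up to the nearest $0.99 - is simple enough to sketch. Here is a minimal Python version (the function name and the choice to work in integer cents are my own, not taken from the service):

```python
import math

def round_up_to_99(cost_cents):
    """Round a cost (in integer cents) up to the nearest price ending in 99 cents.

    Returns the smallest n*100 + 99 that is >= cost_cents, so a $4.50
    cost (450 cents) becomes a $4.99 price (499 cents), and a $5.00
    cost becomes $5.99. Working in cents avoids floating-point surprises.
    """
    dollars = max(0, math.ceil((cost_cents - 99) / 100))
    return dollars * 100 + 99
```

The `max(0, ...)` guard simply ensures that a cost below $0.99 still maps to the $0.99 minimum rather than a negative price.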

Interesting to see Lulu.com here, confirming its important place as a mediator between the digital and analogue worlds. (Via Open Access News.)

Openness: Purity of Essence

I wrote a piece for Linux Journal recently warning that Microsoft was beginning to hijack the meaning of the phrase "open source". But the problem is much bigger than this: the other opens face similar pressures, as Peter Murray-Rust notes.

In some ways it's even more serious for fledgling movements like open access and open data: there, the real meaning has barely been established, and so defending it is harder than for open source, which has had a well-defined definition for some time. Given the importance of labels, this is a matter that needs to be addressed with some urgency before "open access" and "open data" become little more than bland marketing terms.

Thank You, FOSS

Via GigaOM, I came across a link to this love-letter to Facebook:

Thinking about it, I've rarely used a service that has brought me so much emotional satisfaction...connecting with good friends is a feel-good thing and it is this emotional value that makes Facebook hard to beat in terms of the gratification other services can provide. So much so, here I am even writing a thank you note to the service (I can't remember doing that for any service...I've written about how "cool" stuff is, or how useful some service might be...but "thank you"? Never).

Although I think that Facebook is interesting - but not unproblematic, especially its recent moves - I'd never see it in this light. But it set me wondering whether there was anything comparable for me - a place of digital belonging of the kind offered by Facebook. And I realised there was, but not one that was crystallised in a single service. Rather, I feel this same sense of "connecting with good friends" with respect to the much larger, and more diffuse free software community.

This isn't a new thing. Back in the early years of this century, when I was writing Rebel Code, I was astonished at how helpful everyone was that I spoke to in that world. That stood in stark contrast to the traditional computing milieu, where many were full of their own (false) self-importance, and rather too fixated on making lots of money.

It seems I'm not alone in this sense of hacker camaraderie:

The key thing here is that in all the details, spats, debates, differences in direction and nitty-gritty, it is easy to forget that the core ingredients in this community are enthusiastic, smart, decent people who volunteer their time and energy to make Open Source happen. As Open Source continues to explode, and as we continue to see such huge growth and success as it spreads across the world and into different industries, we all need to remember that the raw ingredients that make this happen are enthusiastic, smart, decent people, and I for one feel privileged to spend every day with these people.

To paraphrase W. H. Auden:

Thank You, Thank You, Thank You, FOSS.

Public Domain Search

One of the big advantages of open content is that there are no problems with indexing it - unlike proprietary stuff, where owners can get unreasonably jumpy at the idea. Public domain materials are the ultimate in openness, and here's a basic search engine for some of them:

major public domain sites were chosen, the most important being the US federal government sites:

* .gutenberg.org
* .fed.us
* .gov
* .mil

But there are plenty of exclusions. Also, it's a pity this is only for the US: the public domain is somewhat bigger. (Via Open Access News.)

22 November 2007

Happy Birthday Internet

Watch out, there's a meme about:


The Internet is 30 today. Exactly 30 years ago today on November 22, 1977 the first three networks were connected to become the Internet.

(Via Simon Willison's Weblog.)

Realising Virtual Worlds Through Openness

I mentioned Tim Berners-Lee below as an iconic figure. Philip Rosedale may not quite be there yet, but he stands a good chance of attaining that status if his vision works out. He's put together a useful summary of how that vision grew, and, more importantly, what Linden Lab is going to do to realise it more fully. Nice to see that at the heart of the strategy lies openness:

we need to keep opening SL up, as we’ve started talking about lately. This means formats, source code, partners, and more. We are working on turning our clear vision on this into more detailed plans. Virtual worlds, in their broadest form, will be more pervasive than the web, and that means that their systems will need to be open: extended and operated by many people and companies, not just us.

That Umair Bloke on Blogonomics 2007

Glad it's not just me that feels this way.

Tim B-L: On Moving from the WWW to the GGG

Tim Berners-Lee is an iconic figure for a reason: he's actually rather sharp. This makes his rare blog posts important and interesting - none more so than his most recent one about the Giant Global Graph (GGG):

In the long term vision, thinking in terms of the graph rather than the web is critical to us making best use of the mobile web, the zoo of wildly differing devices which will give us access to the system. Then, when I book a flight it is the flight that interests me. Not the flight page on the travel site, or the flight page on the airline site, but the URI (issued by the airlines) of the flight itself. That's what I will bookmark. And whichever device I use to look up the bookmark, phone or office wall, it will access a situation-appropriate view of an integration of everything I know about that flight from different sources. The task of booking and taking the flight will involve many interactions. And all throughout them, that task and the flight will be primary things in my awareness, the websites involved will be secondary things, and the network and the devices tertiary.

This is probably the best thing I've read about social graphs, not least because it anchors a trendy idea in several pre-existing areas of serious Webby development. (Via Simon Willison's Weblog.)

21 November 2007

Interoperability: The New Battlefield

One word is starting to crop up again and again when it comes to Microsoft: interoperability - or rather the lack of it. It was all over the recent agreement with the EU, and it also lies at the heart of the OpenDocument Foundation's moves discussed below.

And now here we have some details of the next interoperability battles:

the EU Competition Commissioner’s office, with the first case decided by the EU Court of First Instance, now has started working intensively on the second case.

The new case involves three main aspects. First, Microsoft allegedly barred providers of other text document formats access to information that would allow them to make their products fully compatible with computers running on Microsoft’s operating systems. “You may have experienced that sometimes open office documents can be received by Microsoft users, sometimes not.”

Second, for email and collaboration software Microsoft also may have privileged their own products like Outlook with regard to interfacing with Microsoft’s Exchange servers. The third, and according to Vinje, most relevant to the Internet and work done at the IGF, was the problem of growing .NET-dependency for web applications. .NET is Microsoft’s platform for web applications software development. “It is a sort of an effort to ‘proprietise’ the Internet,” said Vinje.

That's a good summary of the problems, and suggests that the Commission is learning fast; let's hope that it doesn't get duped when it comes to remedies as it did the last time, apparently fooled by Microsoft's sleights of hand over patents and licences.

Decentralise Your Data - Or Lose It

Aside from the obvious one of not trusting the UK government with personal data, the other lesson to be learned from HMG's catastrophic failure of "security" is the obverse of one of free software's key strengths: decentralisation. When you centralise, you make it easy for some twerp - or criminal - to download all your information onto a couple of discs and then lose them. A decentralised approach is not without its problems, but at least it puts a few barriers in the way of fools and knaves.

Hardware is Like Software? - Ban Hardware Patents

I won't bother demolishing this sad little piece on why software patents are so delicious and yummy, because Mike Masnick has already done that with his customary flair.

But I would like to pick on something that purports to be an argument in that piece:


One needs to understand that there is fundamentally no difference between software and hardware; each is frequently expressed in terms of the other, interchangeably describing the same thing. For example, many microprocessors are conceptualized as software through the use of hardware description languages (HDL) such as Bluespec System Verilog and VHDL. The resulting HDL software code is downloaded to special microprocessors known as FPGAs (field programmable gate arrays), which can mimic a prospective chip's design and functions for testing. Eventually, the HDL code may be physically etched into silicon. Voilà! The software becomes hardware.

Well, that's jolly interesting, isn't it? Because it means that such hardware is in fact simply an instantiation of algorithms - hard-wired, to be sure, but no different from chiselling those algorithms in granite, say. And as even the most hardened patent fan concedes, pure knowledge such as mathematics is not patentable.
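To make that point concrete, here is a toy sketch (entirely my own, not from the article) expressing the same one-bit half adder both as ordinary software and as a netlist of the gate primitives an HDL would describe. The two descriptions compute exactly the same function - which is the whole point:

```python
# A one-bit half adder expressed two ways: as plain software,
# and as a "netlist" of gate primitives of the kind an HDL
# describes and a fab eventually etches into silicon.

def half_adder_software(a, b):
    """Arithmetic view: just compute the sum and carry bits."""
    total = a + b
    return total % 2, total // 2  # (sum bit, carry bit)

# Gate primitives - the building blocks of the hardware view.
def XOR(a, b):
    return a ^ b

def AND(a, b):
    return a & b

def half_adder_netlist(a, b):
    """Structural view: wire gates together, as HDL code would."""
    return XOR(a, b), AND(a, b)

# Both descriptions denote the same function over all inputs.
for a in (0, 1):
    for b in (0, 1):
        assert half_adder_software(a, b) == half_adder_netlist(a, b)
```

Whether that function ends up as Python bytecode, FPGA configuration or etched transistors, the algorithm itself is unchanged.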

So the logical conclusion of this is not that software is patentable, but that such hardware *shouldn't* be. I'd go further: I suspect that anything formed by instantiating digital information in an analogue form - but which is not essentially analogue - should not be patentable. The only things that might be patentable are purely analogue objects - what most people would recognise as patentable things.

There is an added benefit to taking this approach, since it also solves all those conundrums about whether virtual objects - in Second Life, for example - should be patentable. Clearly, they should not, because they are simply representations of digital entities. But if you wanted to make an analogue version - and not just a hard-wiring - you could reasonably seek a patent if it fulfilled the usual conditions.

Oh, Tell Me the Truth About...the ODF Bust-Up

The recent decision by the OpenDocument Foundation to shift its energies away from ODF to CDF has naturally provoked a lot of rather exaggerated comment. I wrote a piece for LWN.net (now out from behind the paywall) exploring what exactly was going on, and found out that there are bigger issues than simply document interoperability at play.

It turns out to be all about Microsoft's Sharepoint - software that I am beginning to see as one of the most serious threats to open source today. Read it and be very afraid.

GNU PDF Project

Around ten years ago I fought a fierce battle to get people to use HTML instead of PDF files, which I saw as part of a move to close the Web by making it less transparent.

You may have noticed that I lost.

Now, even the GNU project is joining in:

The goal of the GNU PDF project is to develop and provide a free, high-quality and fully functional set of libraries and programs that implement the PDF file format, and associated technologies.

...

PDF has become the de-facto standard for documentation sharing in the industry.

Almost all enterprises use PDF documents to communicate all kinds of information: manuals, design documents, presentations, etc, even if it is originally composed with OpenOffice, LaTeX or some other word processor.

Almost all enterprises use proprietary tools to compose, read and manipulate PDF files. Thus, the workers of these enterprises are forced to use proprietary programs.


I still think HTML, suitably developed, would be a better solution. (Via LXer.)

20 November 2007

Actuate's Actual Open Source Snapshot

One of the sure signs that open source is moving into the mainstream is the number of surveys about it that are being conducted. The great thing about these is that while individually they bolster the case for open source in different areas, collectively they are almost overwhelmingly compelling.

The latest such survey comes from Actuate. It's actually an update of an earlier, more circumscribed one, and it ranges far more widely:


Following research first conducted in November 2005, exclusively targeted at financial services companies in the UK and Europe, the 2007 Actuate Open Source Software Survey broadened its scope to include research attitudes to open source systems in both North America and Germany. The 2007 survey also extended beyond financial services to include public services, manufacturing and telecommunications (telco) in the new regions and now uniquely provides a detailed local insight as well as interesting regional comparisons across the geographies and the vertical sectors within them.

The top-line result?

Half the organizations surveyed stated that open source is either the preferred option or is explicitly considered in the software procurement process. One surprising note is that one-third of the organizations surveyed are now likely to consider open source business intelligence in their evaluations. This is a huge shift from just a few years ago.

The survey is available free of charge, but registration is required.

UK Government Loses 15 Million Bank Details

This has to be about the most stupid security lapse in the history of computing:

Confidential details of 15 million child benefit recipients are on a computer disc lost by HM Revenue and Customs, the BBC understands.

Insult is added to injury:

Revenue and Customs says it does not believe the records - names, addresses and bank accounts - have fallen into the wrong hands.

Yeah? And they know that precisely how - because they're psychic, perhaps?

And then the UK government wants us to trust them with our IDs, too? If we did, how long before the odd 60 million IDs get "lost"? At least you can change your bank details - you don't have that option with your identity.

Update 1: What's really heartening is that a surprisingly large proportion of those commenting here on the BBC story spot the ID card connection....

Update 2: Better make that 25 million bank details, plus key data on all children in the UK.

Free Software and the Categorical Imperative

The Web could have been invented for butterfly minds like mine. For example, in one of Stephen O'Grady's hallmark Q&As (this one on Red Hat's cloud computing announcement) I came across a link that took me to the Wikipedia page about Immanuel Kant's categorical imperative.

I first encountered Kant when I was in my late teens - the perfect age for grappling with those big questions that look too big and daunting when we are older and more sensible. I thought then, and still think now, that his Critique of Pure Reason represents the acme of modern philosophical thought - the Choral Symphony of metaphysics.

I was therefore already familiar with the categorical imperative, not least in Auden's rather fine clerihew:


When the young Kant
Was told to kiss his aunt,
He obeyed the Categorical Must,
But only just.

But reading the excellent Wikipedia entry, with its formulation:

"Act only according to that maxim whereby you can at the same time will that it should become a universal law."

brought home to me something that - stupidly - I hadn't really grasped before about Kant's idea: its essential symmetry. Of course, it's there implicitly in the other version, which I knew:

"Act in such a way that you treat humanity, whether in your own person or in the person of any other, always at the same time as an end and never merely as a means"

but the second form lacks the extra precision of the first.

What struck me is that this is the driving force behind free software - Stallman's belief that we must share software that we find interesting or useful. And more generally, it lies at the heart of all the kinds of openness that are starting to blossom: they are all predicated on this symmetry, on the giving back as well as the taking.

So there we have it: Immanuel Kant - philosopher and proto-hacker.

Larry Sanger's Question

Larry Sanger has a question about Citizendium:

Suppose we grow to Wikipedian size. This is possible, however probable you think it might be.

Suppose, also, that, because we are of that size, we have the participation of a sizable portion of all the leading intellectuals of the world, in every field–and so, there are hundreds of thousands, if not millions, of approved articles. These are all long, complete with many links, bibliography, etc., etc.–all the subpage stuff. It’s reference utopia. Far better than Wikipedia has any hope of becoming.

Here’s the question, then. If we use a license that permits commercial reuse–CC-by-sa or GFDL–then every major media company in the world could, and probably would, use CZ content. Do you favor a license that allows CBS, Fox, the New York Times, English tabloids, Chinese propaganda sheets, Yahoo!, Google, and all sorts of giant new media companies to come, to use our content? Without compensation?

That's the question that Linus faced over a decade ago when he decided to adopt the GNU GPL instead of the earlier one that forbade any kind of money changing hands. And as Linus has said many times, choosing the GNU GPL was one of the best decisions he ever made, because it has widened support for Linux enormously, and as a result has driven its development even faster.

There's your answer, Larry....

What Can You Protect in Open Source?

Marc Fleury is a Frenchman who famously made lots of dosh when he sold his open source company JBoss to Red Hat. That puts him in a strong position to pontificate about what does and what doesn't work in the world of businesses based around free software. Try this wit and wisdom, for example:

B.D asks: "marcf, my open source project is starting to enjoy a measure of success, I am thinking of going professional with it, I am thinking about business models. How much thought should I put in protecting my Intellectual Property?"

Answer: B.D. protecting IP in OSS is extremely important. The only "private" property that exists in OSS are 1- brand 2- URL. Both are obviously related but really you need to protect your brand name, in other words REGISTER your trademarks, use them, declare they are yours and enforce the trademark, meaning protect against infringement. Other products, specifically based on your product should not include your name. Consultancies will be able to say they know and work with your "product name" but they cannot ship products using your trademark. Educate yourselves on brand IP, that is a big asset in OSS.

The URL deserves the same treatment. A successful website with traffic is a source of revenue in this day and age, either directly through ad placement or indirectly by lead generation.

It's interesting that Fleury concentrates on trademarks, rather than copyright or patents (of the latter he says: "you will have little protection against thieves that want to copy what you have done without letting you know and put it under different licenses, I have seen it done, such is the nature of the beast.") I think this indicates that trademarks can be useful, even with open source, just as copyright is necessary for licences to work. It's patents that remain the problem.

Of "IP", "Piracy" and China

As readers of this blog will know, I don't use the terms "intellectual property" or "piracy", since both are profoundly misleading and hopelessly skew the discussion. Nonetheless I can recommend a paper entitled "Intellectual Property Piracy: Perception and Reality in China, the United States, and Elsewhere", since it presents a cool analysis of the reality behind the terms, as well as some surprises.

Here's a sample of the former:

Free-rider downloading also serves an advertising function that may actually benefit music-copyright owners: Some free-rider downloaders may like “Sci-Fi Wasabi” enough to go out and spend 99¢ per song for other Cibo Matto tunes from iTunes, or even $11 for the album Stereo Type A or $19 for Pom Pom: The Essential Cibo Matto. If the downloader (or another who hears the downloaded copy) becomes a fan, hundreds of dollars in sales may result; if no download takes place, all of these potential future sales would be lost. Even if the total number of such sales represents only a tiny portion of downloads, it still exceeds the number of sales in the absence of downloading, which would be zero.


And one of the surprises is as follows:

Of the supposed $6.1 billion in losses to U.S. studios, 2.3 billion, or 38%, were lost to Internet piracy, while 3.8 billion, or 62%, were lost to hard-goods piracy. The three countries in which the losses to U.S. studios were highest were not East Asian countries, and two of them were not developing countries: Mexico, the United Kingdom, and France accounted for over $1.2 billion in lost revenues, or 25% of the non-U.S. total – and slightly less than the U.S. total of $1.3 billion. The three countries have a combined population of about 225 million, somewhat less than the United States’ 293 million, giving them a slightly higher per capita piracy rate.

(Via Salon.)

Will WIPO Wipe the Slate Clean?

So the sorry saga at WIPO is coming to an end, with the controversial Director leaving early (although I was disappointed that this was not "to spend more time with his family"). The question now is who will take over, and what new direction will WIPO take?

This handover comes at a time when many (including me) are questioning what the role of an organisation nominally about so-called "intellectual property" should be in a world increasingly looking to move on to a less proprietorial approach to knowledge. The appointment of a new head would be a good time to re-evaluate WIPO's role - and maybe even change its name.

Dealing with Disabilities

One of the problems raised with the use of ODF in Massachusetts was its lack of support for people with disabilities. That has now been sorted out, but it's probably generally true that open source has not addressed this issue as well as it could, not least because hackers tend to be young and hale, and therefore less aware of the problems faced by those who are not.

So it's good to hear that some work is being done on precisely this area:

IBM and the researchers at the University of Dundee School of Computing (UK) and the University of Miami's Miller School of Medicine are collaborating to develop open source software technology tools to accommodate the needs of older workers to help them adapt to and remain productive in the changing workplace of the 21st century.

...

One way to support maturing workers who have age-related disabilities is to find new ways to increase their comfort level and ability to use technology.

(Via Daniweb.)

I've Got a Little List

On the basis that you just can't have enough lists of open source software, here's another one.

19 November 2007

OpenSolaris CIFS Server: Colour Me Confused

The goal of this project is to provide a native, integrated CIFS implementation to support OpenSolaris as a storage operating system. The OpenSolaris CIFS Server provides support for the CIFS/SMB LM 0.12 protocol and MSRPC services in workgroup and domain mode. Substantial work has already gone into modifying and adapting the existing OpenSolaris file system interfaces, services and commands to accommodate Windows attributes and file sharing semantics. The intent is to provide ubiquitous, cross-protocol file sharing in Windows and/or Solaris environments.

Now, I may be wrong, but this all sounds very similar to Samba. So the question is, how did Sun manage to emulate the protocols? And does the agreement between Microsoft and the EU over interoperability have any bearing on this? Yours, confused of London.

Google Desperately Seeking Picasa

What on earth took them so long?

Finally, Google has integrated Picasa Web Albums into Google Image Search. Public albums can be enabled for a public search option, meaning your images will be more likely to come up in Google image results. And that’s a huge improvement, because previously images on Picasa (and Blogger, and Google Docs) were not searchable at all. The other Google applications are still missing out on all the fun, but Picasa images are now searchable. This is limited, however, to a Google image search.

What's the point of having masses of open content if you can't find it? (Via Searchblog.)

Die, TinyURL, Die!

A couple of years ago, I wrote about TinyURLs, noting:

they are a great idea: too many Internet addresses have become long snaking strings of apparently random text. But the solution - to replace this with a unique but shorter URL beginning http://tinyurl.com commits the sin of obscuring the address, an essential component of the open Web.

Well, I don't want to say "I told you so", but "I told you so":

The link shortening and redirection service TinyURL went down apparently for hours last night, rendering countless links broken across the web. Complaints have been particularly loud on Twitter, where long links are automatically turned to TinyURLs and complaining is easy to do, but the service is widely used in emails and web pages as well. The site claims to service 1.6 billion hits each month.

That post worries about having a single point of failure for the Web; that's certainly valid, but for me the malaise is deeper. Even if there were hundreds of TinyURL-like services, it wouldn't solve the problem that they subvert the open nature of the Web.

Far better for the Web to wean itself off TinyURL now and get back to proper addressing. Interestingly, blogging URLs often do that, with nicely descriptive URLs that let you form a rough idea of what you're going to view before you get there.
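The indirection being criticised here is easy to model: a shortening service is essentially just a lookup table between opaque codes and real addresses. A toy Python sketch (the class, method and domain names are my own invention, not TinyURL's actual design):

```python
import string

class ToyShortener:
    """A minimal TinyURL-style service: a counter plus a lookup table.

    The short code carries no information about the destination -
    exactly the loss of transparency complained about above - and
    if this table ever becomes unreachable, every short link it has
    issued breaks at once, whatever the state of the target pages.
    """
    ALPHABET = string.ascii_lowercase + string.digits

    def __init__(self):
        self._table = {}
        self._next_id = 0

    def shorten(self, url):
        code = self._encode(self._next_id)
        self._table[code] = url
        self._next_id += 1
        return "http://short.example/" + code

    def resolve(self, short_url):
        code = short_url.rsplit("/", 1)[-1]
        # A KeyError here is the "service down" failure mode: the
        # short link is worthless without this central table.
        return self._table[code]

    def _encode(self, n):
        """Turn a counter value into a short base-36 code."""
        digits = ""
        while True:
            n, r = divmod(n, len(self.ALPHABET))
            digits = self.ALPHABET[r] + digits
            if n == 0:
                return digits
```

Every short link depends on that one table - a single point of failure by construction, which is precisely why an outage at the service breaks links across the whole Web.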