07 December 2006

Samba Dances Towards the GNU GPLv3

According to this story, the Samba project will move to the GNU GPLv3 once the new licence is finished. That's a big win for the FSF, since Samba is undoubtedly one of the most widely used and highest-profile open source projects.

What a Waste of Energy

The official Curmudgeon of Computing, Nick Carr, stirred up a little excitement recently by pointing out that Second Life, for all its virtuality, really does use quite a lot of electricity. But before we start grabbing the digital pitchforks and descending upon Linden Lab for being such an ecologically extravagant bunch, it's probably best to put things in context.

That's exactly what this post from KnowProSE.com does. It points out that the problem is not really Second Life's, it's the Internet's - ours, in other words. And it's certainly a big problem.

But it seems to me that the solution is less a matter of finding all the energy than of reducing the amount used by computers. It's a bit like cars: it's not really hard making them more fuel efficient, but until there are incentives to do so, you carry on using the old, inefficient technologies. We need to re-engineer our thinking, not just our technologies.

06 December 2006

Wayback: 85,898,456,616 and Counting

The Wayback Machine is one of the Internet's best-kept secrets:

A snapshot of the World Wide Web is taken every 2 months and donated to the Internet Archive by Alexa Internet. Further, librarians all over the world have helped curate deep and frequent crawls of sites that could be especially important to future researchers historians and scholars.

As web pages are changed or deleted every 100 days, on average, having a resource like this is important for the preservation of our emerging cultural heritage.

And even for someone like me, who uses it all the time, numbers like this still take the breath away:

The Internet Archive's Wayback Machine now has 85,898,456,616 archived web objects in it

plus

The database contains over 1.5 petabytes of data that came from the web (that is 1.5 million gigabytes) which makes it one of the largest databases of any kind.

And a cyber-pearl beyond price. (Via Open Access News.)
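As a rough sanity check on those two figures, here is a minimal Python sketch of the arithmetic. It assumes decimal units (1 petabyte = 1,000,000 gigabytes, as the quote itself implies) and that the object count and the database size refer to the same snapshot, which the Archive's text does not actually say. Since the database is described as "over" 1.5 petabytes, the average it produces is best read as a floor.

```python
# Figures quoted above from the Internet Archive.
ARCHIVED_OBJECTS = 85_898_456_616
DATABASE_PETABYTES = 1.5  # quoted as "over 1.5 petabytes"

# Assumption: decimal (SI) units, matching the quote's own conversion.
GIGABYTES_PER_PETABYTE = 1_000_000
BYTES_PER_GIGABYTE = 1_000_000_000


def total_gigabytes(petabytes):
    """Convert petabytes to gigabytes using decimal units."""
    return petabytes * GIGABYTES_PER_PETABYTE


def average_object_bytes(petabytes, objects):
    """Average size of one archived object, in bytes."""
    return total_gigabytes(petabytes) * BYTES_PER_GIGABYTE / objects


if __name__ == "__main__":
    print(f"{total_gigabytes(DATABASE_PETABYTES):,.0f} GB in total")
    average = average_object_bytes(DATABASE_PETABYTES, ARCHIVED_OBJECTS)
    print(f"at least {average / 1000:.1f} KB per archived object, on average")
```

In other words, the typical archived object is a fairly small file, on the order of tens of kilobytes.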

Set My Libri Free

Everybody knows about Project Gutenberg, which aims to provide texts of as many public domain books as possible. One freedom that is available for such texts is to create spoken versions of them. Librivox is aiming to do just that:


LibriVox volunteers record chapters of books in the public domain, and then we release the audio files back onto the net (through a podcast, catalog, and bit torrents). We are a totally volunteer, open source, free content, public domain project, and we operate almost exclusively through Internet communications.

...

We get most of our texts from Project Gutenberg, and the Internet Archive and ibiblio.org host our audio files.

Not only that, but it offers its files in both the well-known - but proprietary - MP3 format and the less well-known but free and deliciously-named Ogg format. Another unexpected plus of the project is that it can offer several versions of the same text, allowing all kinds of interesting comparisons to be made - to say nothing of cool reworkings.

There is also a small but select group of texts in languages other than English. (Via Creative Commons.)

Google Maps Go to Azeroth

If any further proof were needed of the fading line between real and virtual, here comes a story about Google Maps moving beyond the tangible:

The fictional continent of Azeroth in the World of Warcraft now has an area that uses Google Maps API. The map, if we may add, is amazingly accurate. Accordingly, there are over 15,000 data points covering 69 resources with their exact map location in the WoW database.

I can't wait for the virtual mashups.

Gowers Now Out

The Gowers Review is now out. I've not had time to read it all yet, but there's a good summary in the Treasury's press release:

Whilst the Review concludes that the UK has a fundamentally strong IP system, it sets out important targeted reforms. The reforms aim to:

* strengthen enforcement of IP rights to protect the UK's creative industries from piracy and counterfeiting;
* provide additional support for British businesses using IP in the UK and abroad; and
* strike the right balance to encourage firms and individuals to innovate and invest in new ideas while ensuring that markets remain competitive and that future innovation is not impeded.

There's some good news in this:

To ensure the correct balance in IP rights the review recommends:

* ensuring the IP system only proscribes genuinely illegitimate activity. The Review recommends introducing a strictly limited 'private copying' exception to enable consumers to format-shift content they purchase for personal use. For example to legally transfer music from CD to their MP3 player;
* enabling access to content for libraries and education establishments - to ensure that the UK's cultural heritage can be adequately stored for preservation and accessed for learning. The Review recommends clarifying exceptions to copyright to make them fit for the digital age; and
* recommending that the European Commission does not change the status quo and retains the 50 year term of copyright protection for sound recordings and related performers' rights.

But I worry about what the following will mean in practice:

With the music industry losing as much as 20 per cent of annual turnover to piracy and counterfeiting, the Review recommends strengthening enforcement of IP rights through:

* new powers and duties for Trading Standards to take action against infringement of copyright law;
* IP crime recognised as an area for police action in the National Community Safety Plan;
* tougher penalties for online copyright infringement - with a maximum 10 years imprisonment;
* lowering the costs of litigation - by using mediation and consulting on the use of fast-track litigation. The Review acknowledges that prohibitive legal costs affect the ability of any to defend and challenge IP; and
* consulting on the use of civil damages as a deterrent for IP infringement.

If this means going after large-scale counterfeiters, well and good. But if we're talking about "tougher penalties" and "police action" for all kinds of creative uses - mashups etc. - then there are going to be big problems.

Parenthetically, here's a characteristically wise and well-written piece by Larry Lessig in today's FT about one aspect of the report. He's worried that the Gowers recommendation on not changing the status quo for sound recordings may be ignored by the UK Government to keep some of its industry chums happy:

There is not much doubt about what it will say on this proposal. There is much more doubt about whether the government will follow the report's sensible advice.

Lessig then makes his usual sensible pitch about orphan works, including the following splendid peroration:

There are some who believe that copyright terms should be perpetual. Britain did the world a great service when it resolved that debate almost 300 years ago, by establishing one of the earliest copyright regimes to limit copyright to a fixed term. It could now teach the world a second important lesson: any gift of term extension should only go to those who ask.

TheyWorkForYou.com and Open Politics

Today I received an email from a service I signed up to recently. I'd forgotten about it because it dealt with the apparently yawn-worthy subject of what my local Member of Parliament said. In fact, the service promises to deliver to me, freshly-baked, all the wit and wisdom of said Honourable Member.

Now, truth to tell, what the chap opined about the number of buses on Chelsea bridge was less than gripping. But the point is, I now know when he speaks, and what he says. Not only that, the site TheyWorkForYou.com presents a gloriously Web 2.0-ified version of Parliamentary speeches, complete with Ajaxy popups, and links to more information about MPs than you could shake an identity card at.

In short, the service turns the whole area into a data wonderland. This is what open politics should be. Thanks: YouReallyReallyDoWorkForMe.

How Cool is Coull.tv?

Today's Web user depends on search - well, I do, at least. But search/Google is really only doing words. And, as any fule kno, words is easy. Now video, that's quite a different matter.

So the appearance of any site claiming to make videos searchable is at least worth a look, so to speak. Coull.tv is one such:

coull.tv enables you to activate objects within a video - making people, objects and other items clickable. Anybody can then add tags and comments describing the video and or the objects in that video. This enables the video or parts of it to be easily found; the more popular tags and comments become, the more often coull.tv will suggest them in related searches. coull.tv will use the power of the community to help categorize and tag every element of a video. coull.tv has no pre-roll or post-roll advertising in the way of the viewing experience.

There are two important elements here. First, the fact that elements within a video can be demarcated and made clickable. But that on its own doesn't make a search engine: it just turns video flow into discrete elements. The second part of the coull.tv equation is to get users to do the difficult bit: indexing all those elements. In fact, this is probably the only way video indexing is going to be done for a long while. Automatic recognition through some kind of AI is just too difficult currently.

God knows, the last thing we need is another video sharing site; happily, coull.tv seems to be searching for something more. (Via John Battelle's Searchblog.)

05 December 2006

Dell Delivers - Even in Second Life

If you go to Dell's main site at www.dell.com, you have an unexpected option on the pop-up list of countries and regions at the bottom, as this post shows. Yes, Dell really seems to get this Second Life lark.

Mashup 2.0 and a New Data Commons

One of the defining characteristics of Web 2.0 is the ability to combine data from various sources - the mashup. And yet, in a sense, mashups so far have been purely additive: you take something and add it to something else to create a third. The two sources rarely meet in any deep way to forge some truly new information or insight, other than ones born of clever data representation (not to be sneezed at, either).

That's what makes the new Swivel service important. The Web site reveals nothing currently, but TechCrunch has some tantalising details:

the site allows users to upload data - any data - and display it to other users visually. The number of page views your website generates. Or a stock price over time. Weather data. Commodity prices. The number of Bald Eagles in Washington state. Whatever. Uploaded data can be rated, commented and bookmarked by other users, helping to sort the interesting (and accurate) wheat from the chaff. And graphs of data can be embedded into websites. So it is in fact a bit like a YouTube for Data.

But then the real fun begins. You and other users can then compare that data to other data sets to find possible correlation (or lack thereof). Compare gas prices to presidential approval ratings or UFO sightings to iPod sales. Track your page views against weather reports in Silicon Valley. See if something interesting occurs.

And better yet, Swivel will be automatically comparing your data to other data sets in the background, suggesting possible correlations to you that you may never have noticed.

This is really heavy stuff, and will allow truly new information, and new kinds of information, to emerge from the comparison of other data - something that gets stronger the more data is uploaded. And what makes me think it's going to be hugely successful is that it has a viable business model attached:

Not all data will be public. The company's business model is to provide the service for free for public data, and charge a fee for data that is kept private. Private data can still be compared by the owner to public data sets.

Which is exactly what you want: all the benefit of the public data, but none of the issues of sharing your own. Essentially, this allows limited private grazing of a new data commons, whose overall creation and care is paid for in part by that grazing. Brilliant.
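To make the background-comparison idea concrete, here is a minimal Python sketch. It is emphatically not Swivel's code, about which nothing is public: it assumes every data set is a list of numbers sampled at the same points in time, uses the plain Pearson correlation coefficient as the measure, and runs on numbers invented purely for illustration. The function names and the 0.8 threshold are my own assumptions too.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of the same length (at least 2 points)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / sqrt(var_x * var_y)


def suggest_correlations(datasets, threshold=0.8):
    """Compare every pair of data sets and return the strongly correlated ones."""
    hits = []
    for (name_a, a), (name_b, b) in combinations(datasets.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            hits.append((name_a, name_b, r))
    # Strongest relationships (positive or negative) first.
    return sorted(hits, key=lambda hit: abs(hit[2]), reverse=True)


if __name__ == "__main__":
    # Invented sample data, five observations each.
    datasets = {
        "page views": [100, 120, 140, 160, 180],
        "temperature": [10, 12, 14, 16, 18],
        "UFO sightings": [5, 1, 4, 2, 3],
    }
    for name_a, name_b, r in suggest_correlations(datasets):
        print(f"{name_a} vs {name_b}: r = {r:.2f}")
```

Comparing every pair like this grows quadratically with the number of data sets, so a real service would need something cleverer; and a high correlation is, of course, only a hint that something interesting might be going on, not proof of it.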

Update: Swivel is now up, in beta at least. Inevitably, there's not much to see yet.

All the News You Can Trust

Here's an interesting twist on the Digg idea: a site that does not merely vote stories up or down, but which rates them in terms of their reliability - quality, not mere popularity:

In recent years, the consolidation of mainstream media, combined with the rise of opinion news and the explosion of new media outlets, have created a serious problem for democracy: many people feel they can no longer trust the news media to deliver the information they need as citizens.

To address this critical issue, NewsTrust is developing an online news rating service to help people identify quality journalism - or "news you can trust." Our members rate the news online, based on journalistic quality, not just popularity. Our beta website and news feed feature the best and the worst news of the day, picked from hundreds of alternative and mainstream news sources.

This non-profit community effort tracks news media nationwide and helps citizens make informed decisions about democracy. Submitted stories and news sources are carefully researched and rated for balance, fairness and originality by panels of citizen reviewers, students and journalists. Their collective ratings, reviews and tags are then featured in our news feed, for online distribution by our members and partners.

It's a laudable idea, although I'm not sure how it will be funded in the long term, or whether it will fall victim to people with an agenda putting together a clique to skew the results. (Via OpenBusiness.)

From O(GL)LPC to O(W)LPC

An interesting story here:

Microsoft wants to make its Windows operating system available on the One Laptop per Child (OLPC) notebook computers, OLPC chairman Nicholas Negroponte said at the NetEvents conference in Hong Kong on Saturday.

...

"We put in an SD slot in the machine just for Bill. We didn't need it but the OLPC machines are at Microsoft right now, getting Windows put on them."

The SD slot is needed so that memory can be boosted sufficiently to run Windows. That probably won't be a problem in terms of cost, because memory just keeps on getting cheaper. But what's deeply ironic here is that the current price of the GNU/Linux-based OLPC system - around $140 - is utterly dwarfed by the cost of Windows. Obviously Microsoft will offer a cut-down, el cheapo version, but nonetheless the unjustifiable disparity between hardware and software costs is striking.

Microsoft's interest is understandable - it doesn't want to lose a potentially huge and impressionable market. What is less understandable is Negroponte's willingness to give up all his fine principles of empowering children, and to allow them to be shackled by closed source/DRM/Trusted Computing - for what looks like a rather pathetic and unbecoming reason:

"I have known [Microsoft chairman] Bill Gates his entire adult life. We talk, we meet one-on-one, we discuss this project," said Negroponte, according to a transcript provided to vnunet.com.

Gosh, you must be important. (Via Slashdot.)

The Great UnSuggester

This, surely, is what technology was invented for:

Unsuggester takes "people who like this also like that" and turns it on its head. It analyzes the seven million books LibraryThing members have recorded as owned or read, and comes back with books least likely to share a library with the book you suggest.

After all, who wants to know about things that will slide down your mental gullet like a proverbial oyster? What we need are intellectual chicken bones that make us choke on new ideas.
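For the curious, here is one way an unsuggester could work, as a minimal Python sketch. LibraryThing has not said how its version is built, so the scoring rule here is my assumption: for each other book, compare how often it actually shares a library with the chosen book against how often it would if ownership were independent, and return the books with the lowest ratio. The function name and the tiny set of libraries are made up for illustration.

```python
from collections import Counter


def unsuggest(libraries, book, top_n=3):
    """Return the books least likely to share a library with `book`.

    `libraries` is a list of sets of titles, one set per member.
    Score = observed co-ownership / co-ownership expected by chance.
    """
    total = len(libraries)
    target_owners = sum(1 for library in libraries if book in library)
    if target_owners == 0:
        return []  # nobody owns the book, so there is nothing to compare

    owners = Counter()  # how many libraries contain each other book
    shared = Counter()  # how many of those also contain `book`
    for library in libraries:
        has_target = book in library
        for other in library:
            if other == book:
                continue
            owners[other] += 1
            if has_target:
                shared[other] += 1

    scores = {}
    for other, count in owners.items():
        expected = target_owners * count / total
        scores[other] = shared[other] / expected

    # Lowest ratio first; ties are broken alphabetically.
    return sorted(scores, key=lambda title: (scores[title], title))[:top_n]


if __name__ == "__main__":
    # Hypothetical members and titles.
    libraries = [
        {"Philosophy A", "Philosophy B"},
        {"Philosophy A", "Philosophy B", "Cookbook"},
        {"Chick Lit A", "Chick Lit B"},
        {"Chick Lit A", "Chick Lit B", "Cookbook"},
        {"Philosophy A", "Cookbook"},
    ]
    print(unsuggest(libraries, "Philosophy A", top_n=2))
```

With seven million books, a real version would also have to cope with titles that only a handful of people own, where a ratio like this becomes very noisy.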

Free Tibet, Free Tibetan Typeface

I came across this worthy project through an article about Tibet by Paul Jones:


The Tibetan & Himalayan Digital Library project at the University of Virginia is pleased to make available the alpha release of the Unicode character based Tibetan Machine Uni OpenType font for writing Tibetan, Dzongkha and Ladakhi in dbu can script with full support for the Sanskrit combinations found in chos skad texts.

Alpha release here, people: could all Tibetan hackers please hammer the code.

04 December 2006

See Viv Run

Yes:

The European Union's telecommunications watchdog has called for regulators to take a backseat in setting standards--and allow consumers to take the lead by picking the platform that offers the services they want.

Speaking on Monday here at the ITU Telecom World 2006 conference, Viviane Reding, the EU's commissioner for information society and media, said regulators should no longer be the main force in charge of mandating standards.

...

Reding said the spectrum freed up by the switch to digital TV will offer a "once-in-a-generation opportunity" for expanded wireless services, adding that regulators must be flexible and "get out of the command-and-control system."

Now, if we could possibly make that liberated spectrum into a commons....

Climate Commons

A new one for the commons collection:

Climate commons is a networked conversation space that creates a cross-disciplinary platform for planetary ecological concerns. Twelve people who research issues relevant to the arctic and climate change contribute the progress of their investigations and reflections from October 10, 2006 through January 10, [2]007. These networked conversations can be read by and contributed to by visitors to the exhibition at the Institute of Contemporary art Boston or on the web at climate-commons.net.

It also offers some open source-y goodness:

Matt Shanley has created three main visual tools to help foster the dialog on this site.
The word count history graph uses sparklines to give you an idea of the ups and downs of site activity at a glance.
The hexagraph provides a spatial representation of the threads of a conversation. You can literally see when a conversation branch is bursting at the seams.
Category highlighting reveals common threads by illuminating the key words.

Each of these extensions will be released as an open source project in early 2007, around the time Climate commons is coming to completion.

I have to say, though, that there is something vaguely jarring about a project to do with climate change, sustainability and all that coming to "completion": shouldn't it just go on and on? (Via WorldChanging.)

Thanks - I'll Pass on that Poisoned Chalice

Good news, you might think:

Novell today announced that the Novell edition of the OpenOffice.org office productivity suite will now support the Office Open XML format, increasing interoperability between OpenOffice.org and the next generation of Microsoft Office. Novell is cooperating with Microsoft and others on a project to create bi-directional open source translators for word processing, spreadsheets and presentations between OpenOffice.org and Microsoft Office, with the word processing translator to be available first, by the end of January 2007. The translators will be made available as plug-ins to Novell's OpenOffice.org product. Novell will release the code to integrate the Open XML format into its product as open source and submit it for inclusion in the OpenOffice.org project. As a result, end users will be able to more easily share files between Microsoft Office and OpenOffice.org, as documents will better maintain consistent formats, formulas and style templates across the two office productivity suites.

Pretty cool, huh? Well, maybe not.

The code may well be released as open source, but there's the small matter of any patents it might draw upon. Given that "Novell is cooperating with Microsoft and others", there must be the fear that to produce these undeniably handy translators Novell has availed itself of some inside knowledge kindly provided by that nice Mr Ballmer.

I've no idea whether that happened or not, but if I were in the OpenOffice.org group I do know I'd be refusing the proffered chalice - just in case. (Via LWN.net.)

Time to Praise Simão Jatene?

In these dark days when everything seems to be getting worse with the environmental commons, it is rare to come across something as positive as this:


Vast tracts of rainforest in Brazil are to get a new protected status.

The segments of land in the northern Para state together cover 16.4 million hectares (63,320 sq miles), an area of land that is bigger than England.

Thousands of wildlife species inhabit the pristine forest, including jaguars, anteaters and colourful macaws.

Campaigners say the decision made by Para Governor Simao Jatene is one of the most important conservation initiatives of recent years.

If it is true, then Governor Jatene deserves to go down in the annals as a wise and great man. The only trouble is, I can't find anything confirming this wonderful news on the site of one of the organisations quoted in the story above. Instead, there is just a rather dry report on forest management.

Let's hope.

Of Kant and Cant

Sad to see the once-rigorous nation of Immanuel Kant falling for the, er, cant of the content industries in the copyright reform discussions:

But Jerzy Montag, Member of the Green Party opposition, sees this slightly differently. “The current reform draft is in some points friendly to industry and antagonistic to the interests of authors and creators,” he said. “We should give more rights to creators, but I am pessimistic here. And it makes me see red to think about how vehemently based on the current draft the CDU-SPD [Social Democratic Party] coalition wants to go after users.”

The target of Montag’s critique is a proposed change to establish criminal liability for illegal private copies. A mass complaint against 25,000 private users resulted in a clear statement of a court in Karlsruhe that it was unable to bear that load and therefore would not open proceedings in minor cases.

The German justice minister reacted with the introduction of a “bagatelle clause” into the draft proposal to limit criminal proceedings on commercial “pirates.” Yet after heavy criticism over legalising intellectual property theft from rights holders and some members of parliament, the minister withdrew the bagatelle clause (which refers to a minor case of no commercial relevance).

"Pirates", "intellectual property theft", and so on, and so on....

The Distro Xerxes Would Have Used

Here's one that famous blogger Mahmood Ahmadinejad probably prepared earlier:

Jalal Haji-Gholam-Ali who is a member of Sharif Technical University’s Advanced ICT Scientific Board and consultant of the ICT Ministry in launching the Persian Linux Project, reiterated, "Launching the Pilot Study phase of Persian Linux Project has be[en] commissioned to TCI’s Research Center."

...

Emphasizing that many of the main services of the ICT are Linux-based, he reiterated, "That ministry is determined to migrate towards Linux."

Referring to the establishment of an infrastructure Software Work Group at the Secretariat of the ICT Ministry, he said, "This work group is established aimed at facilitating the migration of the ICT Ministry towards full usage of Linux."

(Via tuxmachines.org.)

Ode to an Expiring Blue Frog

I suppose every frog has its day, but I hope that this doesn't mean the end for Azureus as we know and love it:

Azureus, maker of the popular peer-to-peer client, has revamped its software to include video publishing and distribution tools with a much slicker and user-friendly interface. To support the new platform, called Zudeo, the company has raised a $12 million second round of funding.

This space is hot; BitTorrent last week said it had raised $20 million from Accel Partners and Doll Capital Management. Much like BitTorrent, Palo Alto-based Azureus incorporated, took venture money, and came up with a business model only after the massive success of its open source software.

After all, it's well known that bloat is bad for frogs and software.

Saint Johnomics

Sir John Sulston is one of my heroes, right up there with RMS. Indeed, Sulston can reasonably be called the RMS of genomics (or maybe RMS is the Sulston of software). More than anyone else, it was Sulston who fought for and won the free availability of the human genome's digital code. Without him, I suspect that the company that once seemed set to become the Microsoft of molecular biology, Celera, would "own" the human genome, with all the appalling things that this implies.

I mention this because there was a short piece by him in the FT recently. It's an edited extract from a talk he gave; the editing and extraction are not very well done, and it certainly doesn't do justice to the man or his ideas. For that, you should read his book The Common Thread - significantly, subtitled "A Story of Science, Ethics and the Human Genome".

Great literature it ain't, but it fair bristles with the same sense of mission and moral imperatives that makes RMS's stuff such fun to read. If RMS is St IGNUcius, perhaps Sulston is St Johnomics.

Open Provenance Architecture

Interesting:

Ultimately, our aim is to conceive a computer-based representation of provenance that allows us to perform useful analysis and reasoning to support our use cases. The provenance of a piece of data will be represented in a computer system by some suitable documentation of the process that led to the data. While our applications will specify the form that such a documentation should take, we can identify several of its general properties. Documentation can be complete or partial (for instance, when the computation has not terminated yet); it can be accurate or inaccurate; it can present conflicting or consensual views of the actors involved; it can be detailed or not.

Open Science or Free Science?

The open science meme is rather in vogue at the moment. But Bill Hooker raises an interesting point (in a post that kindly links to a couple of items on this blog):

should we be calling the campaign to free up scientific information (text, data and software) "Free Science", for the same reasons Stallman insists on "Free Software"?

Interestingly, there is another parallel here:

Just as free software gained the alternative name "open source" at the Freeware Summit in 1998, so free open scholarship (FOS), as it was called until then by the main newsletter that covered it - written by Peter Suber, professor of philosophy at Earlham College - was renamed "open access" as part of the Budapest Open Access Initiative in December 2001. Suber's newsletter turned into Open Access News and became one of the earliest blogs; it remains the definitive record of the open access movement, and Suber has become its semi-official chronicler (the Eric Raymond of open access - without the guns).

Brits Get the Net - and Net Ads

I remember well worrying about business models during the heady Web 1.0 days (I know, this made me something of an oddity), because it was clear to me that the banner advertising then in vogue just wasn't going to cut it. Net advertising - it'll never catch on, I thought.

Close. Not.

The second Net boom/bubble has been largely driven by Google and its targeted ads. The knock-on effect is that Net advertising is thriving, and nowhere more so than in the UK, apparently. This article has some interesting figures on the differences between the UK and US markets, tying them in to techno-socio-economic factors.