15 November 2007

Lecture Search Engine

Given the centrality of search to the way we use the Internet, it's surprising that we're still stuck with a few file-types - essentially text, with a few tags for images and video thrown in if you're lucky. I've written before about picture searching, and now here's lecture searching:

The Lecture Browser is a web interface to video recordings of lectures and seminars that have been indexed using automatic speech recognition technology. You can search for topics, much like a regular web search engine. If any results look relevant, you can play the video starting at the relevant point and see the synchronized transcript.

Even better, the lectures this is indexing are from MIT's OpenCourseWare:

More than 200 MIT lectures are currently available on the site (web.sls.csail.mit.edu/lectures/). So far, most of the users are international students who access the lectures through MIT's OpenCourseWare (OCW) initiative, which makes curriculum materials for most MIT courses available to anyone with Internet access. Although the lecture-browsing system is still in the early development stages, a recent announcement in OCW's newsletter has drawn increased traffic to the site.

Barzilay and Glass expect the system will be most useful for OCW users and for MIT students who want to review lecture material. MIT World, a web site that provides video of significant MIT events such as lectures by speakers from MIT and around the world, is also participating in the project.

(Via Open Access News.)
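For the curious, here is a minimal sketch of how "search, then play from the relevant point" might hang together. It is not the Lecture Browser's actual code: the transcript segments, `build_index` and `search` are all hypothetical, and it assumes the speech recogniser has already produced text with a start time for each segment.

```python
from collections import defaultdict

# Hypothetical transcript: (start time in seconds, words recognised in that segment).
SEGMENTS = [
    (0.0, "welcome to the lecture on linear algebra"),
    (42.5, "today we look at eigenvalues and eigenvectors"),
    (310.0, "eigenvalues of a symmetric matrix are real"),
]

def build_index(segments):
    """Map each lower-cased word to the start times of the segments containing it."""
    index = defaultdict(list)
    for start, text in segments:
        for word in set(text.lower().split()):
            index[word].append(start)
    return index

def search(index, query):
    """Return the start times of segments that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return []
    hits = set(index.get(words[0], []))
    for word in words[1:]:
        hits &= set(index.get(word, []))
    return sorted(hits)

INDEX = build_index(SEGMENTS)
print(search(INDEX, "eigenvalues"))       # [42.5, 310.0]
print(search(INDEX, "symmetric matrix"))  # [310.0]
```

Each hit is a time offset, so a player would only need to seek the video to that second and show the matching transcript segment alongside.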

Yummy Yamli

Although my Arabic is, shall we say, jejune, there have been rare occasions when I've wanted to search for Arabic words in Arabic script. Now I can, thanks to the clever Yamli site:

Yamli allows you to type Arabic using roman characters. For example you can enter the roman letter "f" for "ف". You can also use the Arabic chat characters. For example "3" for "ع", "2" for "ء", "7" for "ح", etc ...

Type it the way you say it. Try it out !

Couldn't be simpler. Now I just have to learn a few thousand Arabic words and I'm away. (Via The Inquirer.)
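The character mapping quoted above is easy enough to sketch. What follows is only a toy, assuming a straight one-to-one substitution using the four pairs Yamli mentions; the `ROMAN_TO_ARABIC` table and the `transliterate` function are my own hypothetical names, and the real site presumably does something far cleverer to pick whole words.

```python
# Only the four pairs quoted above; any other character passes through unchanged.
ROMAN_TO_ARABIC = {
    "f": "ف",
    "3": "ع",
    "2": "ء",
    "7": "ح",
}

def transliterate(text):
    """Replace each known roman or chat character with its Arabic letter."""
    return "".join(ROMAN_TO_ARABIC.get(ch, ch) for ch in text)

print(transliterate("f"))
print(transliterate("3"))
```

Extending the table to the full alphabet would be straightforward; choosing between the several Arabic words that a single roman spelling can stand for is the hard part, and is not attempted here.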

W(h)ither Blogging?

Here's a thoughtful post:

Somehow it seemed that blogging just isn't that hot anymore. The feeling has been exacerbated by the latest slow down in news. My feeds just do not update that often these days. Can it be that the digestion phase applies to blogs just as it applies to startups? In this post we'll investigate whether the blogosphere is going through a digestion phase.

I find this particularly interesting because my impression is exactly the reverse: I find more and more interesting stuff in my feeds. Not only that, but I find this humble little blog is also attracting more attention, particularly among the PR crowd. I've noticed a distinct change in attitude among the latter - unspoken, but clearly there - from regarding blogs as vaguely interesting but not very influential, to seeing them as just as important as traditional media.

I'd go further: the blogs seem to be taking over. At a time when more and more (dead-tree) newspapers and magazines are closing down, or going purely online, and when more and more online titles are starting to run bloggers as part of the mix, it seems to me that the barycentre of digital publishing is most certainly moving deep into the heart of the blogosphere.

Of course, the acid test will be during the next downturn, but I'm optimistic. Unlike the publishing excesses of dotcom 1.0, where magazines blossomed with the manic marketing of no-hoper startups, only to wilt themselves when that, er, fertiliser dried up, blogs are predicated on lean and mean. The only ones that will suffer seriously are those that are beginning to bloat towards the condition of traditional, inefficient publications. No names, no packdrill.

Sony Reads the e-Leaves

I have to admit that I hate Sony computers. I bought a Vaio portable once, and it was awful in just about every respect - overpriced, weird backup discs, and a battery that soon died on me, with no sensibly-priced option of getting a replacement. So I don't come with many positive feelings towards its new e-book device, the PRS-505. But apparently it does a couple of things right:

the Reader is powered by Montavista Linux and uses code from projects like OpenSSL and Freetype

There again, you'd be mad not to use GNU/Linux on a system like this. Now all Sony needs to do is reduce the price by a factor of about ten and I might be vaguely interested.

Or maybe not.

Mobilising Tim Berners-Lee

Tim on mobile openness:

On the opening day of Mobile Internet World in Boston, the man credited with inventing the World Wide Web told a packed hall that the mobile Internet needs to be fully and completely the Internet, nothing more and nothing less. It needs to be free of central control, universal, and embodied in open standards.

“The Web is an open platform on which you build other things,” he said. “That’s how you get this innovation. The Web is universal: you can run it on any hardware, on any operating system, it can be used by people of different languages…It’s a sandbox where people can [play and] exercise their creativity. It’s very important to keep the Web universal as we merge the Internet with mobile.”

From Rebel Code to Codi Rebel

And now, the moment you've all been waiting for: Rebel Code....in Catalan:


Una història apassionant que ens explica com un grup d'inconformistes desafià les grans empreses i va produir una revolució inesperada en el món dels ordinadors. En 1991 un jove estudiant, Linus Torvalds, va comprar un ordinador personal i es va plantejar l'elaboració d'un nou programari. Tot va començar gairebé com un joc, com un hobby, però en pocs anys, i amb l'ajuda d'un grup d'amics i col·laboradors connectats a través de la xarxa, Torvalds desenvolupà un sistema operatiu que esdevingué un veritable repte per a Microsoft. El programari GNU/Linux és utilitzat avui per milions de persones. I la qüestió que més desassossec provoca entre els grans gegants de la informàtica és que el seu ús és lliure, no costa diners. En aquest relat ple d'anècdotes i d'històries reveladores, Glyn Moody exposa de manera clara i accessible com es va desenvolupar aquesta lluita entre David i Goliat protagonitzada per Linux, bo i situant-la en el context més ampli de la història del moviment en favor del programari lliure. Alhora mostra tot el que es pot aconseguir quan la creativitat i la cooperació intel·lectual es posen per damunt del simple benefici econòmic. Glyn Moody s'ha ocupat de Linux gairebé des del moment de la seua elaboració.

Couldn't have put it better myself.

Update: And here's a word from the translators.

Adding Some Lustre to Supercomputing

Everybody knows that GNU/Linux absolutely dominates the top 500 supercomputing listings: in the latest survey it notches up an 85% share (Windows manages 1.2%). Less well-known - to me, at least - is the fact that Lustre, an open source cluster file system, also does well:

Lustre highlights include:

The #1 fastest supercomputer in the world.

Lustre is being used on 7 out of the top 10 fastest supercomputers in the world.

Out of the top 30 fastest supercomputers in the world - Lustre can be found on 16 of them.

14 November 2007

Opening Up the Source Code of Society

Carl Malamud has done it again:


Public.Resource.Org and Fastcase, Inc. announced today that they will release a large and free archive of federal case law, including all Courts of Appeals decisions from 1950 to the present and all Supreme Court decisions since 1754. The archive will be public domain and usable by anyone for any purpose.

Great news (well, mostly for Americans, but a legal commons is a legal commons). And how about this for a quotation:

“The U.S. judiciary has allowed their entire work product to be locked up behind a cash register,” said Carl Malamud, CEO of Public.Resource.Org. “Law is the operating system of our society and today's agreement means anybody can read the source for a substantial amount of case law that was previously unavailable.”

(Via Lessig Blog.)

Sun Eyes Up GNU/Linux's Jugular

In the nicest possible way, of course:

Dell and Sun Microsystems are set to announce that Sun's Solaris and OpenSolaris operating systems will be supported in all of Dell's servers.

Dell founder and CEO Michael Dell and Sun Microsystems CEO Jonathan Schwartz plan to make the announcement during a joint appearance at the Oracle OpenWorld 2007 conference here today.

The agreement means that customers buying a Dell rack or blade server will get the option of installing Solaris or OpenSolaris. Customers picking one of these operating systems will get support from Sun's online support organization through Dell, making the experience seamless for the customer.

So while getting Dell to put GNU/Linux on its desktop machines has been the obsession of certain fanboys (oh, that would be me), that cunning Mr Schwartz has snuck up on the server side. (Via Erwin Tenhumberg.)

Yahoo! Goes Whoop! About Hadoop! (and Pig!)

Now why on earth would Yahoo be doing this?

Yahoo! Inc., a leading global Internet company, today announced that it will be the first in the industry to launch an open source program aimed at advancing the research and development of systems software for distributed computing. Yahoo!'s program is intended to leverage its leadership in Hadoop, an open source distributed computing sub-project of the Apache Software Foundation, to enable researchers to modify and evaluate the systems software running on a 4,000 processor supercomputer provided by Yahoo!. Unlike other companies and traditional supercomputing centers, which focus on providing users with computers for running applications and for coursework, Yahoo!'s program focuses on pushing the boundaries of large-scale systems software research.

Currently, academic researchers lack the hardware and software infrastructure to support Internet-scale systems software research. To date, Yahoo! has been the primary contributor to Hadoop, an open source distributed file system and parallel execution environment that enables its users to process massive amounts of data. Hadoop has been adopted by many groups and is the software of choice for supporting university coursework in Internet-scale computing. Researchers have been eager to collaborate with Yahoo! and tap the company's technical leadership in Hadoop-related systems software research and development.

As a key part of the program, Yahoo! intends to make Hadoop available in a supercomputing-class data center to the academic community for systems software research. Called the M45, Yahoo!'s supercomputing cluster, named after one of the best known open star clusters, has approximately 4,000 processors, three terabytes of memory, 1.5 petabytes of disks, and a peak performance of more than 27 trillion calculations per second (27 teraflops), placing it among the top 50 fastest supercomputers in the world.

M45 is expected to run the latest version of Hadoop and other state-of-the-art, Yahoo!-supported, open-source distributed computing software such as the Pig parallel programming language developed by Yahoo! Research, the central advanced research organization of Yahoo! Inc.

It's cool that Yahoo's backing the open source Hadoop, and doubly cool that one of the projects is called Pig. But it's also shrewd. It's becoming abundantly clear that open beats closed; Google, for all its use of open source software, is remarkably closed at its core. Enter Hadoop, running on a 4,000 processor supercomputer provided by Yahoo, with the real possibility of spawning a truly open rival to Google.... (Via Matt Asay.)
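The "parallel execution environment" half of Hadoop is best known for the map/reduce style of processing, which the press release doesn't spell out. Here is a minimal single-machine sketch of that idea in plain Python. It uses no Hadoop API at all, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical.

```python
from collections import defaultdict

def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one chunk of input."""
    return [(word, 1) for word in chunk.lower().split()]

def shuffle(pairs):
    """Group the emitted values by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Collapse each key's list of values into a single total."""
    return {key: sum(values) for key, values in groups.items()}

def word_count(chunks):
    pairs = []
    # On a real cluster each chunk would be mapped on a different machine;
    # here they are simply processed one after another.
    for chunk in chunks:
        pairs.extend(map_phase(chunk))
    return reduce_phase(shuffle(pairs))

print(word_count(["open beats closed", "open source"]))
```

The attraction is that the map step needs no knowledge of any other chunk, which is what lets the work be spread over thousands of processors.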

Facebook Goes Corporate

Here's an important straw in the wind:

Content-oriented Facebook Applications may now easily be developed using the Alfresco platform. This means that enterprise content management capabilities can be mixed with the social graph of Facebook.

The first of many.

Flickr: Happy Two Billionth!

Flickr has reached its two billionth picture: and that's just the beginning....

Documentum Opens Up (Not)

Hm, not my idea of opening up:

EMC is inviting independent software vendors, system integrators and channel partners to help develop, integrate and sell new content/ records management products and related professional services for specific vertical markets in the mid-market based on its Documentum 6 platform.

But mark my words, it will do, though probably too late, once the open source enterprise content management systems have completely redefined the market.

The University of Openess (sic)

New one on me:

The University of Openess is a self-institution for independent research, collaboration and learning. Find out more about the courses, campuses and student/teacher life at the uo in the AboutUo section.

(Via if:book.)

From Gizmo Manuals to Gitmo Manuals

Nobody reads the manual, right? Well, here's one that people probably will want to read on the fine Wikileaks site: it's for Guantánamo Bay....

Isn't openness a wonderful thing?

The disclosure highlights the internet's usefulness to whistle-blowers in anonymously propagating documents the government and others would rather conceal. The Pentagon has been resisting -- since October 2003 -- a Freedom of Information Act request from the American Civil Liberties Union seeking the very same document.

JK Rowling Misunderstands the Magic of Sharing

"It is not reasonable, or legal, for anybody, fan or otherwise, to take an author's hard work, re-organize their characters and plots, and sell them for their own commercial gain. However much an individual claims to love somebody else's work, it does not become theirs to sell."

Sorry, darling, it is not only reasonable to produce this kind of reference work, it is actually beneficial - to you, and the world. Not least because it discourages people from coming up with killer one-liners like this:

The big news from the world of Harry Potter isn't that Dumbledore is gay. It's that J.K. Rowling is greedy.

Unlocking the Value of Open Innovation

It's a truism that there are more clever people out there than in here, wherever "here" may be. So it makes sense to try to tap into that cleverness - which is precisely what open source and cognate movements attempt to do. Now it looks like business is slowly getting the hang of this:

Barrick’s Unlock the Value program is a unique opportunity for scientific problem solvers. We invite proposals for an economically viable way to recover silver from silica-encapsulated ore. For proposals judged to have merit, Barrick will:

* Fund your research
* Pay you a consulting fee
* Provide resources and expertise
* Help you develop and test your idea

For a method or technology that is successfully implemented, Barrick will pay a performance bonus of $10,000,000.

(Via Peter Murray-Rust.)

Oh, Tell Me the Truth About Mobile

I don't really understand mobile, but I do understand its importance. So the news that the British company Volantis will be releasing a big gob of code as open source was clearly nice:

Volantis Systems, which provides the Intelligent Content Adaptation software delivering mobile content to more than 250 million mobile phone users worldwide, today eliminated price as a barrier to entry for companies that would like to capitalize on Volantis solutions to deliver content to mobile users. The Volantis Mobility Server is available immediately as a free download, and in the first quarter of 2008, Volantis will release the product under the GNU General Public License (GPL), version three, in the process contributing 1.2 million lines of code, based on seven years of development, to the community.

With more consumers and corporate customers moving toward the mobile Internet, enterprises need a simple way to build Web sites for mobile devices. Volantis Mobility Server provides an inexpensive path for companies to create this content and easily distribute it to the wide variety of mobile browsers on the market.

Which is all well and good, but couldn't you just do that with a CSS stylesheet? I asked Mark Watson, co-founder and chief executive officer of Volantis. He very kindly explained to me in words of one syllable why it was a smidge more complicated than that.

The basic problem is that there is no standard 640x480 resolution on mobile devices, which come in just about every shape and size imaginable, with handset manufacturers constantly adding more as they seek to differentiate their products from the others. This means that you need to reformat your Web stuff hundreds, if not thousands of times, depending on the device. And no, Google's Android doesn't really help here, because you've still got the hardware to cope with. This is clearly a pain, and where there is pain there is always a business opportunity to reduce that pain for gain - hence the existence of Volantis.
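To give a flavour of the problem, here is a toy sketch of per-device adaptation. It has nothing to do with Volantis's actual code: the device names, the screen sizes and the 200-pixels-per-column rule are all made up for illustration.

```python
# Hypothetical device profiles: name -> (screen width, screen height) in pixels.
DEVICES = {
    "small-candybar": (128, 160),
    "mid-slider": (240, 320),
    "big-smartphone": (480, 640),
}

def adapt(device, image_width, columns):
    """Return (image width, column count) scaled down to fit the device."""
    width, _height = DEVICES[device]
    fitted_image = min(image_width, width)
    # Assume, purely for illustration, that one text column needs 200 pixels.
    fitted_columns = max(1, min(columns, width // 200))
    return fitted_image, fitted_columns

for name in DEVICES:
    print(name, adapt(name, image_width=600, columns=3))
```

Three devices and two layout decisions are manageable by hand; multiply that by hundreds of handsets and every element on a page, and the case for doing it in software becomes clear.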

So, you might ask, why is Volantis giving away its crown jewels? The usual story: it currently has a number of jolly big customers, and thinks, probably rightly, that it will make more dosh if it has thousands of smaller customers. Since the latter are unlikely to fork out large sums for software, the code is going open source, with money made on services, as per usual.

Sounds sensible to me, but what do I know?

Yikes! I've Been Blognapped

Not that I egosurf or anything, you understand, but I happened to notice that somebody was linking to me:

glyn moody wrote an interesting post today on
Here’s a quick excerpt
CARTES & IDentification 2007, Villepinte, France, November 13, 2007–Gemalto, the world leader in digital security, today announced it has attained Gold Certified Partner status in the Microsoft Partner Program. …

Except, of course, I wrote no such interesting post....

Bizarre.

13 November 2007

Android: Kitted Out with WebKit

Well spotted by GigaOM: one of the key components of Android is WebKit:

WebKit is an open source web browser engine. WebKit is also the name of the Mac OS X system framework version of the engine that's used by Safari, Dashboard, Mail, and many other OS X applications. WebKit's HTML and JavaScript code began as a branch of the KHTML and KJS libraries from KDE.

As Om notes, this is a significant vote of confidence for WebKit, and a reminder that most other browsers - even rather popular open source ones like Firefox - are behind in this particular race. Also, rather a pat on the back for KDE....

Of Bazaars and Dangerous Co-location

I often bang on about modularity in this blog, and its critical importance to creating and running open projects. Here are some more thoughts on the subject, along with many interesting ruminations on creating a Raymondian bazaar, and the state of open source companies today. It concludes by answering a key question it posed itself:

Why do so many open-source projects not have the active community of external contributors they are hoping for? Because they have been largely developed by co-located teams of hired software engineers, 100% dedicated to the project, managed and organized like any traditional software development effort. This seems to be especially true for the new crop of ‘custom build’ open-source companies, which would like to take advantage of the open-source business model. They might hope to also enjoy the advantages of the open-source development model one day, but achieving that requires a conscious effort.

Good stuff.

Sick, Sick, Sick: The Sickness Deepens

I've warned you about this bloke before:

Intellectual Ventures LLC, a low-profile investment firm run by former Microsoft Corp. executive Nathan Myhrvold, is laying plans to go global: It hopes to raise as much as $1 billion to help develop and patent inventions, many of them from universities in Asia.

The move could help the firm, formed seven years ago to purchase patents and help inventors dream up new ones, expand its already-vast store of patents. But the new push also could exacerbate concerns that Intellectual Ventures will begin launching lawsuits to pressure companies to pay for use of its intellectual property.


Mr. Myhrvold said that his firm hasn't sued anybody for patent infringement but that he can't rule it out in the future.

That's a "yes", in case you were wondering. (Via Against Monopoly.)

More on Dzongkha: Microsoft's Morals

Regular readers of this blog will know that I have a soft spot for Bhutan's Dzongkha language and its use in free software, so I was naturally intrigued by Andrew Leonard's recent post on the subject. This led me to the following, which somehow I had missed when it first came out:

Microsoft has barred the use of the Bhutanese government’s official term for the Bhutanese language, Dzongkha, in any of its products, citing that the term had affiliations with the Dalai Lama. In an internal memorandum, Microsoft employees were told not to use the term Dzongkha in any Microsoft software, language lists or promotional materials since “Doing so implies affiliation with the Dalai Lama, which is not acceptable to the government of China. In this instance, replace ‘Dzongkha’ with ‘Tibetan - Bhutan’.”

How's that for a perfect confluence: Dzongkha-Tibetan-Chinese repression-Microsoft-free software? Nothing like a little moral prostitution to boost that bottom line, eh Microsoft?

UK Government Votes for e-Voting Quagmire

The UK Government has this crazy idea about IT: that if they say "make it so" often enough, it is so. But what they fail to realise is that complex IT projects are, er, complex, and often/usually go wrong. Stamping your pretty little foot ain't gonna fix it. As a result of this institutional ignorance/stupidity/wilfulness, it looks like the government is ploughing on with the doomed e-voting idea. When will they ever learn?

Go, gOS, Go!

Recently I wrote about the Everex Green gPC TC2502, sold by Walmart. On the product page at Walmart there are some fascinating comments, including the following:

I was surprised/shocked when it booted to Linux instead. My initial thought was someone had bought the machine, put Linux on it and returned it. However once it loaded up and was "green" everywhere I realized it was the way it's supposed to be (it matched the box's color).

So I began to think I'd need to take it back, but after working with it and letting my relative work with it I was absolutely amazed at how quickly she picked up on the concepts and ideas. The large desktop icons make it very easy for her to navigate, the big search bar makes it even easier.

We cleaned off the apps I don't think she'd be interested in or ready for (facebook, stuff like that) and left her with a wonderfully simple desktop that she was hooked on.

Assuming that this isn't a really cunning GNU/Linux fanboy masquerading as a super-satisfied customer, I think this is a significant straw in the wind. For those whose computing needs really are basic - typically older, rather than younger people - this ultra-low cost, ultra-simple PC could be a really effective solution.

One, moreover, that Windows-based PCs will never match until Microsoft starts giving away its software - as, precisely, it is starting to do in places like China and Russia. Even then it will have problems because of software bloat that GNU/Linux is mercifully unaffected by.