07 December 2006

The Open Source Brain

At first sight, there's something appropriate about Paul Allen paying for the Allen Brain Atlas:

an interactive, genome-wide image database of gene expression in the mouse brain. A combination of RNA in situ hybridization data, detailed Reference Atlases and informatics analysis tools are integrated to provide a searchable digital atlas of gene expression. Together, these resources present a comprehensive online platform for exploration of the brain at the cellular and molecular level.

After all, he did work on "electronic brains", as computers were mockingly called back in those dim, dark days of early computing. And it comes as no surprise that the freely available and rather impressive 3D Brain Explorer - think Google Earth for the mouse brain - is only available for Windows XP and the Macintosh.

But dig a little deeper, and you find something rather telling about the real "brain" behind this brain:

Processing the amount of data produced during the Atlas project (approximately 1 terabyte/day) requires a fully automated data processing and analysis pipeline. A goal of informatics is to provide the infrastructure that will allow scaling of an increase in image data and complexity of image processing. The IDP was designed to be modularized and scalable to support a library of informatics algorithms and to function so that additional incorporation of informatics modules does not interrupt production systems. The system must also have the flexibility to accommodate defining multiple workflows using some or all algorithms and is iterative in its processing of gene image series. Parts of the process are computationally intensive, such as image quality assurance/quality control (QA/QC) and preprocessing, registration, and signal quantification. These tasks are scheduled and run in parallel on the server cluster.
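To make that description a little more concrete, here is a minimal sketch of a modular pipeline with selectable workflows. It is purely illustrative: the module names, the `MODULES` registry and the `run_workflow`/`run_parallel` helpers are all hypothetical, and a thread pool on one machine stands in for the real cluster scheduler, about which the quote gives no details.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the stages named in the quote; each one just
# records that it ran on the image series it was given.
def qa_qc(series):
    return series + ["qa_qc"]

def preprocess(series):
    return series + ["preprocess"]

def register(series):
    return series + ["register"]

def quantify(series):
    return series + ["quantify"]

# Registry of modules: adding an entry does not touch existing workflows.
MODULES = {
    "qa_qc": qa_qc,
    "preprocess": preprocess,
    "register": register,
    "quantify": quantify,
}

def run_workflow(series, steps):
    """Apply the named modules, in order, to one image series."""
    for name in steps:
        series = MODULES[name](series)
    return series

def run_parallel(jobs, max_workers=4):
    """Run (series, steps) jobs concurrently; results keep the job order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda job: run_workflow(*job), jobs))

jobs = [
    (["gene_A"], ["qa_qc", "preprocess", "register", "quantify"]),  # all modules
    (["gene_B"], ["qa_qc", "quantify"]),                            # a subset
]
results = run_parallel(jobs)
```

Each workflow is just a list of module names, which is one simple way to get the "some or all algorithms" flexibility the quote asks for.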

Right. And just as a matter of interest, what might that cluster be running?

The cluster consists of a total of 148 CPUs, 32 HP BL35p blades with dual AMD 2.4GHz, 4GB RAM and 21 IBM HS20 blades with dual Intel 2.8GHz Hyperthreaded, 4GB RAM, all running Fedora Linux.
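As an aside, the quoted total of 148 only seems to add up if each Hyperthreaded Intel processor is counted as two logical CPUs. That reading is my assumption, not something the quote states; the quick check below shows the sum.

```python
# Blade counts and "dual" sockets come from the quoted spec. Counting each
# Hyperthreaded Intel processor as two logical CPUs is an assumption.
hp_blades, ibm_blades = 32, 21
sockets_per_blade = 2
threads_per_intel_socket = 2  # assumed: Hyperthreading counted as 2 logical CPUs

amd_cpus = hp_blades * sockets_per_blade
intel_logical_cpus = ibm_blades * sockets_per_blade * threads_per_intel_socket
total_cpus = amd_cpus + intel_logical_cpus

print(amd_cpus, intel_logical_cpus, total_cpus)  # 64 84 148
```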

Obviously someone used their brain.


Common Allen Brain Atlas Misconceptions:

Still, as you note, the great thing about this project is that it's privately funded, so nothing was taken from the public purse. It's all extra, even if it's a little hyperbolic in its claims (must be the Microsoft influence...).

Good catch, Glyn. There's usually Linux on the backend.

It's not that I want them to post 'Linux cluster' on top, but posting Microsoft Windows on top is a bit of a misrepresentation.

Interoperability: "Can't we all just get along?"

I was surprised that they gave all the details - cynically, I would have expected this sort of thing to be glossed over...