Showing posts with label bermuda principles. Show all posts

31 May 2010

Transparency is in WikiLeaks' DNA

It is somewhat ironic that the man behind WikiLeaks, Julian Assange, is not a fan of being in the spotlight; and therefore perhaps poetic justice that he is increasingly the focus of in-depth profiles. The best one so far has just appeared in The New Yorker, and includes this memorable description:

WikiLeaks receives about thirty submissions a day, and typically posts the ones it deems credible in their raw, unedited state, with commentary alongside. Assange told me, “I want to set up a new standard: ‘scientific journalism.’ If you publish a paper on DNA, you are required, by all the good biological journals, to submit the data that has informed your research—the idea being that people will replicate it, check it, verify it. So this is something that needs to be done for journalism as well. There is an immediate power imbalance, in that readers are unable to verify what they are being told, and that leads to abuse.” Because Assange publishes his source material, he believes that WikiLeaks is free to offer its analysis, no matter how speculative.

I'm sure Sir John Sulston had no idea how far his idea of openness would be taken when he drew up the Bermuda Principles....

17 July 2006

The World's First Open Source Man

The genome – the totality of DNA found in practically every cell in our body - is a kind of computer program, stored on 23 pairs of biological DVDs, called chromosomes. Within each chromosome, there are thousands of special sub-routines known as genes. Between the genes lie stretches of the main program that calls the subroutines, as well as spacing elements to make the code more legible, and non-functional comments – doubtless deeply cool when they were first written – that have by now lost all their meaning for us.

DNA's digital code – written not in binary, but quaternary (usually represented by the initials of the four chemicals that store it: A, C, G and T) – is run in a wide range of cellular computers, using a central processing unit (known as a ribosome), and with various initial values and time-dependent inputs supplied in a special format, as proteins. The cell computer produces similarly-formatted outputs, which may act on both itself and other cells.
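To make the "quaternary" point concrete, here is a minimal sketch in Python of reading a stretch of DNA as a base-4 number, so that each letter carries exactly two bits. The assignment of letters to digits (A=0, C=1, G=2, T=3) and the function names are my own illustrative assumptions, not any standard used by the public databases.

```python
# Illustrative mapping only: which base gets which digit is an arbitrary choice.
BASE_TO_DIGIT = {"A": 0, "C": 1, "G": 2, "T": 3}
DIGIT_TO_BASE = {digit: base for base, digit in BASE_TO_DIGIT.items()}


def encode(seq: str) -> int:
    """Read a DNA string as a base-4 number, most significant base first."""
    value = 0
    for base in seq.upper():
        value = value * 4 + BASE_TO_DIGIT[base]
    return value


def decode(value: int, length: int) -> str:
    """Recover a DNA string of known length from its base-4 value."""
    bases = []
    for _ in range(length):
        value, digit = divmod(value, 4)
        bases.append(DIGIT_TO_BASE[digit])
    # Digits come out least significant first, so reverse them.
    return "".join(reversed(bases))


seq = "GATTACA"
packed = encode(seq)
print(packed)                    # 9156
print(decode(packed, len(seq)))  # GATTACA
```

Seven letters fit into fourteen bits here, which is all "quaternary rather than binary" really amounts to: two binary digits per chemical letter.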

Thanks to a far-sighted agreement known as the Bermuda Principles, the digital code that lies at the heart of life is freely available from three main databases: one each in the US, UK and Japan. As a result, the DNA that was obtained through the Human Genome Project is open source's greatest triumph.

So far, no human genome can be said to represent any single human being: that of the Human Genome Project is in fact a composite, made up of a couple of dozen anonymous donors. But soon, all that will change; for the first time, the complete genome of a single person will be placed in the public databases for anyone to download and to use, creating in effect the world's first open source man.

His name is Craig Venter, and for nearly two decades he has been simultaneously revered and reviled as one of the most innovative researchers in the world of genomics. He was the person behind the company Celera that sought to sequence the human genome before the public Human Genome Project, with the aim of patenting as much of it as possible. Fortunately, the Human Genome Project managed to stitch together the thousands of DNA fragments it had analysed – not least thanks to some serious hardware running GNU/Linux – and to put its own human genome in the public domain, thus thwarting Celera's plans to make it proprietary.

A nice twist to this story is that it turned out that Celera's DNA sequence was not, as originally claimed, another composite, but came almost entirely from one person: Craig Venter himself. So his latest project is in many ways simply the completion of this earlier attempt to become the first human with a fully-sequenced genome. The difference now, though, is that it will be in the public databases, and hence accessible by anyone.

This will have profound consequences. Aside from placing his DNA fingerprint out in the open – which will certainly be handy for any police forces that wish to investigate Venter – it means that anyone can analyse his DNA for anything. At the very least, scientists will be able to carry out tests for genetic pre-dispositions to all kinds of common and not-so-common diseases.

So it might happen that a laboratory somewhere discovers that Venter is carrying a genetic variant that has potentially serious health implications. Most of us will be able to choose whether to take such tests and hence whether to know the results, which is just as well. In the case of incurable diseases, for example, the knowledge that there is a high probability – perhaps even certainty – that you will succumb at some point in the future is not very useful unless there is a cure or at least a treatment available. Venter no longer has that choice. Whether he wants it or not, others can carry out the test and announce the result; since Venter is a scientific celebrity and a public figure, he is bound to get to hear about it one way or another.

So while his decision to sequence his genome might be seen as the ultimate act of egotism, by choosing to publish the result he will in fact be providing science with a wonderfully rich resource - the complete code of his life - and at some considerable risk, if only psychological, to himself.

09 March 2006

The Dream of Open Data

Today's Guardian has a fine piece by Charles Arthur and Michael Cross about making data paid for by the UK public freely accessible to them. But it goes beyond merely detailing the problem, and represents the launch of a campaign called "Free Our Data". It's particularly good news that the unnecessary hoarding of data is being addressed by a high-profile title like the Guardian, since a few people in the UK Government might actually read it.

It is rather ironic that at a time when nobody outside Redmond disputes the power of open source, and when open access is almost at the tipping point, open data remains something of a distant dream. Indeed, it is striking how advanced the genomics community is in this respect. As I discovered when I wrote Digital Code of Life, most scientists in this field have been routinely making their data freely available since 1996, when the Bermuda Principles were drawn up. The first of these stated:

It was agreed that all human genomic sequence information, generated by centres funded for large-scale human sequencing, should be freely available and in the public domain in order to encourage research and development and to maximise its benefit to society.

The same should really be true for all kinds of large-scale data that require governmental-scale gathering operations. Since they cannot be feasibly gathered by private companies, such data ends up as a government monopoly. But trying to exploit that monopoly by crudely over-charging for the data is counter-productive, as the Guardian article quantifies. Let's hope the campaign gathers some momentum - I'll certainly be doing my bit.

Update: There is now a Web site devoted to this campaign, including a blog.