Showing posts with label human genome project. Show all posts

20 July 2013

Public Domain Human Genome Project Generated More Research And More Commercial Activity Than Proprietary Competitor

Traditionally, there has been a blithe assumption that more innovation occurs when patents are granted than when they are not. But as Techdirt has reported, people are finally beginning to call that into question. A forthcoming paper from an economist at MIT, Heidi Williams, provides another example of where that is not the case: in the field of genomics (via @gsDetermination). 

On Techdirt.

28 April 2011

Damaging the DNA of Science

Here's a sad story, but not for the reason you might expect:

Developing therapies from human embryonic stem cells is under threat in Europe, say scientists.

In a letter to Nature, they express "profound concern" about moves at the European Court of Justice to ban patent protection for embryonic stem cell lines.

...

In their letter to Nature, the scientists argue that industry would have no incentive to invest in this area unless their innovations could be protected with patents.

This is the old FUD that unless patents are given for every possible advance, industry will never "invest". Well, even assuming that were true, scientists shouldn't be worrying about that: they are *scientists*, not managers. They are supposed to be motivated by love of knowledge, by the joy of research. Patents weren't allowed on the results of the Human Genome Project, and yet somehow that came to splendid fruition: why should stem cell research be any different?

And the idea that industry doesn't invest without patents is nonsense: that's precisely what happened in the world of software until a misguided court decision allowed programs to be patented in the US. But the introduction of patents in that field has led to a net *loss* for the industry of billions of dollars, as the book "Patent Failure" - written by two supporters of patents - explains in great detail.

The central motivation for innovation is not to get a patent, but to use that innovation to surpass rivals and win business as a result - it's a means to an end. Even if those rivals then use that same invention, they are still at a disadvantage because they are simply following in the original innovator's footsteps. And if they manage to develop the work further, then they advance the field and provide more ideas for yet more innovation - that's how things are supposed to work.

But what's really sad about this whole episode is the fact that scientists have become so corrupted by the trend towards turning knowledge into property that they can't conceive of carrying out exciting science without the nominal incentives of patents. This indicates that something bad has happened to the very DNA of science - and patented stem cell research certainly isn't going to fix it.

Follow me @glynmoody on Twitter or identi.ca.

06 October 2009

Postcodes: Royal Fail

Here's a perfect example of why intellectual commons should not be enclosed.

The UK Postcode data set is obviously crucial information for businesses and ordinary citizens - something that is clearly vital to the smooth running of everyday life. But more than that, it is geographic information that allows all kinds of innovative services to be provided by people with clever ideas and some skill.

That's exactly what happened when the Postcode database was leaked on to the Internet recently. People used that information to do all sorts of things that hadn't been done before, presumably because the company that claims to own this information, Royal Mail, was charging an exorbitant amount for access to it.

And then guess what happened? Yup, the nasties started arriving:

On Friday the 2nd October we received correspondence from the Royal Mail demanding that we close this site down (see below). One of the directors of Ernest Marples Postcodes Ltd has also been threatened personally.

We are not in a position to mount an effective legal challenge against the Royal Mail’s demands and therefore have closed the ErnestMarples.com API effective immediately.

We understand that this will cause harm and considerable inconvenience to the many people who are using or intend to use the API to power socially useful tools, such as HealthWhere, JobcentreProPlus.com and PlanningAlerts.com. For this, we apologise unreservedly.

Specifically, intellectual monopolies of a particularly stupid kind are involved:

Our client is the proprietor of extensive intellectual property rights in the Database, including copyright in both the Database and the software, and database rights.

Here's what Wikipedia has to say about these "database rights":

The Directive 96/9/EC of the European Parliament and of the Council of 11 March 1996 on the legal protection of databases is a European Union directive in the field of copyright law, made under the internal market provisions of the Treaty of Rome. It harmonizes the treatment of databases under copyright law, and creates a new sui generis right for the creators of databases which do not qualify for copyright.

Before 1996, these sui generis "database rights" did not exist; they were created in the EU because lobbyists put forward the argument that they would offer an incentive to create more databases than the Americans, whose database publishers strangely didn't seem to need this new "right" to thrive, and so make the mighty EU even mightier - at least as far as those jolly exciting databases were concerned.

Rather wisely, the EU decided afterwards to do some research in this area, comparing database creation before and after the new sui generis right was brought in, to see just how great that incentive proved to be - a unique opportunity to test the theory that underpins intellectual monopolies. Here are the results of that research:

Introduced to stimulate the production of databases in Europe, the “sui generis” protection has had no proven impact on the production of databases.

According to the Gale Directory of Databases, the number of EU-based database “entries” was 3095 in 2004 as compared to 3092 in 1998 when the first Member States had implemented the “sui generis” protection into national laws.

It is noteworthy that the number of database “entries” dropped just as most of the EU-15 had implemented the Directive into national laws in 2001. In 2001, there were 4085 EU-based “entries” while in 2004 there were only 3095.

While the evidence taken from the GDD relies on the number of database “entries” and not on the overall turnover achieved or the information supplied by means of databases, they remain the only empirical data available.

So, the official EU study finds that the sui generis protection has had no proven impact on the production of databases; in fact, the number of databases went *down* after it was introduced.
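For anyone who wants to check the arithmetic, here is a minimal Python sketch that uses only the three Gale Directory figures quoted above. The function name is my own, chosen for illustration, and the figures count database "entries", not turnover.

```python
# Figures quoted above from the EU evaluation
# (Gale Directory of Databases "entries" for EU-based databases).
entries = {1998: 3092, 2001: 4085, 2004: 3095}


def percent_change(start, end):
    """Relative change from start to end, in percent."""
    return (end - start) / start * 100.0


# Change over the whole period the right was in force.
since_1998 = percent_change(entries[1998], entries[2004])
# Change from the 2001 figure, when most of the EU-15 had implemented the Directive.
since_2001 = percent_change(entries[2001], entries[2004])

print(f"1998 to 2004: {since_1998:+.1f}%")
print(f"2001 to 2004: {since_2001:+.1f}%")
```

On these numbers the total is essentially flat between 1998 and 2004, and roughly a quarter lower in 2004 than in 2001.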

Thus these "database rights" have been shown to stifle the production of databases - negating the whole claimed point of their introduction. Moreover, the Royal Mail's bullying of a couple of people who are trying to offer useful services that would not otherwise exist shows the danger of entrusting such a critical data commons to commercial entities who then enclose it by claiming "database rights" in it: they will always be tempted to maximise their own profit, rather than the value to society as a whole.

Giving the Royal Mail a monopoly on this critical dataset - one that for all practical purposes can never be created again - is like giving a genetics company a monopoly on the human genome. That was attempted (hello, Celera) but, fortunately for us, thwarted, thanks largely to free software. Today, the human genome is an intellectual commons (well, most of it), and the Postcode data should be, too.

21 July 2009

Has Google Forgotten Celera?

One of the reasons I wrote my book Digital Code of Life was that the battle between the public Human Genome Project and the privately-funded Celera mirrored so closely the battle between free software and Microsoft - with the difference that it was our genome that was at stake, not just a bunch of bits. The fact that Celera ultimately failed in its attempt to sequence and patent vast chunks of our DNA was the happiest of endings.

It seems someone else knows the story:

Celera was the company founded by Craig Venter, and funded by Perkin Elmer, which played a large part in sequencing the human genome and was hoping to make a massively profitable business out of selling subscriptions to genome databases. The business plan unravelled within a year or two of the publication of the first human genome. With hindsight, the opponents of Celera were right. Science is making and will make much greater progress with open data sets.

Here are some rea[s]ons for thinking that Google will be making the same sort of mistake as Celera if it pursues the business model outlined in its pending settlement with the AAP and the Author's Guild....

Thought provoking stuff, well worth a read.

14 May 2009

The Common Thread: Open Data, Open Access

Sir John Sulston is one of this country's - and the world's - heroes. Already a one-time Nobel prize winner for his work on worms (well, cell death, more precisely), he stands a good chance of winning another one for his work on the human genome project. But his contribution there is even greater: he was one of the main people behind making the human genome data freely available immediately, with no strings attached - one of the first, and still biggest, wins for open data.

One knock-on effect was that this made patenting genes harder in those jurisdictions benighted enough to allow it - something that Sulston has railed against loudly. As it happens, there is currently a major court case in the US that is trying to undo some of the stupid earlier decisions in this respect: this is a biggie, so let's keep our fingers crossed.

But Sulston is not resting on his considerable laurels; he's at it again, working this time with a traditional publisher to edit a major new series of books that will be freely available online under a CC licence:


Sir John Sulston, Nobel prize winner and one of the architects of the Human Genome Project, has teamed up with Bloomsbury to edit a new series of books that will look at topics including the ethics of genetics and the cyber enhancement of humans.

The series will be the first from Bloomsbury's new venture, Bloomsbury Academic, launched late last year as part of the publisher's post-Harry Potter reinvention. Using Creative Commons licences, the intention is for titles in the imprint to be available for free online for non-commercial use, with revenue to be generated from the hard copies that will be printed via print-on-demand and short-run printing technologies.

As for the topics:

Sulston and Harris's series, Science, Ethics and Innovation, will be aimed "at a very wide market", covering subjects from "the interplay between science and society, to new technological and scientific discoveries and how they impact on our understanding of ourselves and our place in society", and the responsibility of science to the wider world. Authors they will be looking to commission will range from academics to policymakers, opinion formers, those working in commercial scientific roles, "and maybe even politicians". "They'll be non-technical books which will appeal to any intelligent person," said Harris. "The proverbial Guardian reader."

This whole area of openness is one where Sulston has been active for decades. Indeed, alongside open data and open access he is also a big supporter of free software, and hugely savvy about the ethical aspects of this movement. If you want to find out more about this extraordinary man and his amazing career, I strongly recommend his autobiography: The Common Thread.

07 July 2008

A New Institute for Science, Ethics and Innovation

One of the most remarkable men around today is Sir John Sulston. He's already won a Nobel Prize for his work on nematode worms/apoptosis, and he seems certain to share another for his work on the Human Genome Project. He really ought to get a couple for that, since he was the leader of the forces that kept the human genome free and (relatively) unpatented - think of him as the RMS of the genome (he's also a big fan of free software).

So it's great to see his passion for ethics being channelled in a new institute, which opened last Saturday:

The mission of the Institute for Science, Ethics and Innovation (iSEI) is to observe and analyse the role and moral responsibilities of science and innovation. The institute will examine the ways in which science is used in the 21st century, evaluate possible or desirable changes, and consider the forms of regulation and control of the process that are appropriate or required.

More power to his elbow.

12 October 2007

Copyright Olympics

Good to see an eminent writer getting it:

It's not just that the idea of copyrighting an entry in the English dictionary, or someone's face, haircut or name, is ridiculous. There is an issue of principle. By declaring images, titles and now words to be ownable brands, these various organisations and individuals are contributing to an increased commodification and thus privatisation of materials previously agreed to be in the public domain. For scientists, this constrains the use of public and published knowledge, up to and including the human genome. For artists, it implies that the only thing you can do with subject matter is to sell it.

(Via TechDirt.)

14 June 2007

Access to Knowledge is Dangerous

Apparently:

Although the idea of discussions on a Treaty on Access to Knowledge appears to have strong support in the African Group, Asian Group and the Group of Friends of Development, Group B is mounting a full court press against even the mere mention of “access to knowledge” in the recommendations of this PCDA as evidenced by the bracketed text.

Paragraph 10 on complementary mechanisms of stimulating innovation reads:

10. [To exchange experiences on open collaborative projects for the development of public goods such as the Human Genome Project and Open Source Software (Manalo 38)]

It is quite unfortunate that the intransigence of rich Member States and their allies is hindering true progress at WIPO whether it be on the over-arching principle of a Treaty on Access to Knowledge or examining open collaborative projects.

Dangerous stuff this knowledge: got to keep it locked down. (Via James Love.)

Update: Some movement on the first matter, it seems.

13 February 2007

Novartis Does Open Genomics

It's happening, slowly:

Novartis, the Basel, Switzerland, drug giant, has helped uncover which of the 20,000 genes identified by the Human Genome Project are likely to be associated with diabetes. But rather than hoard this information, as drug firms have traditionally done, it is making it available for free on the World Wide Web.

"It will take the entire world to interpret these data," says Novartis research head Mark Fishman. "We figure we will benefit more by having a lot of companies look at these data than by holding it secret."

The data and more information are available from the Diabetes Genetics Initiative site at the Broad Institute. (Via Slashdot.)

27 September 2006

Open Access to the Origins of Language

New Scientist reports:

Linguists are calling for an online public database, similar to the human genome project, that would allow researchers to collaboratively share different studies of language impairment.

By gathering together studies of developmental disorders that cause communication impairments – such as autism or Down’s syndrome – they hope to provide new clues about the origins of language.

Aside from the interesting nature of the project, what is striking is that the key element is not creating new knowledge, but consolidating it in a database, allowing higher-level knowledge to emerge. Clearly, for this to work in an optimal way, all the data and papers need to be open access. Whether it will be, assuming the project goes ahead, remains to be seen.

Update: Wow, the original article behind the NS story is not behind the usual paywall. So from this I can read:

We close by illustrating how systematic analyses within and between disorders, suitably informed by evolutionary theory—and ideally facilitated by the creation of an open-access database—could provide new insights into language evolution.

17 July 2006

The World's First Open Source Man

The genome – the totality of DNA found in practically every cell in our body - is a kind of computer program, stored on 23 pairs of biological DVDs, called chromosomes. Within each chromosome, there are thousands of special sub-routines known as genes. Between the genes lie stretches of the main program that calls the subroutines, as well as spacing elements to make the code more legible, and non-functional comments – doubtless deeply cool when they were first written – that have by now lost all their meaning for us.

DNA's digital code – written not in binary, but quaternary (usually represented by the initials of the four chemicals that store it: A, C, G and T) – is run in a wide range of cellular computers, using a central processing unit (known as a ribosome), and with various initial values and time-dependent inputs supplied in a special format, as proteins. The cell computer produces similarly-formatted outputs, which may act on both itself and other cells.
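To make the "quaternary" point concrete, here is a small Python sketch that reads a DNA string as a base-4 number, so that each letter carries exactly two bits. The particular mapping of A, C, G and T to the digits 0 to 3 is an arbitrary assumption for illustration (any assignment would do), and the function names are my own.

```python
# Hypothetical mapping: any assignment of the four bases to 0..3 would work.
BASE_TO_DIGIT = {"A": 0, "C": 1, "G": 2, "T": 3}
DIGIT_TO_BASE = {digit: base for base, digit in BASE_TO_DIGIT.items()}


def dna_to_int(seq):
    """Read a DNA string as a base-4 number, most significant base first."""
    value = 0
    for base in seq:
        value = value * 4 + BASE_TO_DIGIT[base]
    return value


def int_to_dna(value, length):
    """Inverse of dna_to_int; the length is needed because leading A's are zeros."""
    bases = []
    for _ in range(length):
        value, digit = divmod(value, 4)
        bases.append(DIGIT_TO_BASE[digit])
    return "".join(reversed(bases))


seq = "GATTACA"
packed = dna_to_int(seq)

# Seven bases fit in at most 2 * 7 = 14 bits.
print(packed, packed.bit_length())
print(int_to_dna(packed, len(seq)))
```

The round trip back to the original string shows that nothing is lost: the four-letter alphabet really is just a number system.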

Thanks to a far-sighted agreement known as the Bermuda Principles, the digital code that lies at the heart of life is freely available from three main databases: one each in the US, UK and Japan. As a result, the DNA that was obtained through the Human Genome Project is open source's greatest triumph.

But so far, no human genome can be said to represent any single human being: that of the Human Genome Project is in fact a composite, made up of a couple of dozen anonymous donors. But soon, all that will change; for the first time, the complete genome of a single person will be placed in the public databases for anyone to download and to use, creating in effect the world's first open source man.

His name is Craig Venter, and for nearly two decades he has been simultaneously revered and reviled as one of the most innovative researchers in the world of genomics. He was the person behind the company Celera that sought to sequence the human genome before the public Human Genome Project, with the aim of patenting as much of it as possible. Fortunately, the Human Genome Project managed to stitch together the thousands of DNA fragments it had analysed – not least thanks to some serious hardware running GNU/Linux – and to put its own human genome in the public domain, thus thwarting Celera's plans to make it proprietary.

A nice twist to this story is that it turned out that Celera's DNA sequence was not, as originally claimed, another composite, but came almost entirely from one person: Craig Venter himself. So his latest project is in many ways simply the completion of this earlier attempt to become the first human with a fully-sequenced genome. The difference now, though, is that it will be in the public databases, and hence accessible by anyone.

This will have profound consequences. Aside from placing his DNA fingerprint out in the open – which will certainly be handy for any police forces that wish to investigate Venter – it means that anyone can analyse his DNA for anything. At the very least, scientists will be able to carry out tests for genetic pre-dispositions to all kinds of common and not-so-common diseases.

So it might happen that a laboratory somewhere discovers that Venter is carrying a genetic variant that has potentially serious health implications. Most of us will be able to choose whether to take such tests and hence whether to know the results, which is just as well. In the case of incurable diseases, for example, the knowledge that there is a high probability – perhaps even certainty – that you will succumb at some point in the future is not very useful unless there is a cure or at least a treatment available. Venter no longer has that choice. Whether he wants it or not, others can carry out the test and announce the result; since Venter is a scientific celebrity and a public figure, he is bound to get to hear about it one way or another.

So while his decision to sequence his genome might be seen as the ultimate act of egotism, by choosing to publish the result he will in fact be providing science with a wonderfully rich resource - the complete code of his life - and at some considerable risk, if only psychological, to himself.

18 May 2006

And the First Shall Be Last

It is done: the last unsequenced human chromosome - which happens to be the first in terms of size and hence numbering - has finally been "completed" (to 99.4%). Even more impressive, you can actually read the full Nature report on the subject. The digital code of the human genome, of course, has always been freely available (well, since 1996).

OK, so we've got the source code of us: all we have to do is understand it. Indications are, there will be quite a few surprises.

04 April 2006

Coughing Genomic Ink

One of the favourite games of scholars working on ancient texts that have come down to us from multiple sources is to create a family tree of manuscripts. The trick is to look for groups of textual divergences - a word added here, a mis-spelling there - to spot the gradual accretions, deletions and errors wrought by incompetent, distracted or bored copyists. Once the tree has been established, it is possible to guess what the original, founding text might have looked like.

You might think that this sort of thing is on the way out; on the contrary, though, it's an extremely important technique in bioinformatics - hardly a dusty old discipline. The idea is to treat genomes deriving from a common ancestor as a kind of manuscript, written using just the four letters - A, C, G and T - found in DNA.

Then, by comparing the commonalities and divergences, it is possible to work out which manuscripts/genomes came from a common intermediary, and hence to build a family tree. As with manuscripts, it is then possible to hazard a guess at what the original text - the ancestral genome - might have looked like.
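As a rough illustration of the first step, the Python sketch below counts the divergences between every pair of "manuscripts". The four sequences are made up for the purpose, and the approach (a simple count of differing positions between already-aligned strings) is a deliberately naive stand-in of my own, not the method used in the research discussed here.

```python
from itertools import combinations

# Toy, made-up "manuscripts": aligned DNA strings of equal length.
copies = {
    "copy_1": "ACGTACGTAC",
    "copy_2": "ACGTACGTAT",
    "copy_3": "ACGAACGTTT",
    "copy_4": "TCGAACGTTT",
}


def divergences(a, b):
    """Count positions at which two aligned sequences differ (Hamming distance)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


# Distance for every unordered pair of copies.
distances = {
    (m, n): divergences(copies[m], copies[n])
    for m, n in combinations(sorted(copies), 2)
}

# List the pairs from most to least similar.
for pair, d in sorted(distances.items(), key=lambda item: item[1]):
    print(pair, d)
```

The pairs with the fewest divergences are the ones most likely to share a recent common intermediary; grouping them first, then grouping the groups, is the essence of building the tree.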

That, broadly, is the idea behind some research that David Haussler at the University of California at Santa Cruz is undertaking, and which is reported on in this month's Wired magazine (freely available thanks to the magazine's enlightened approach to publishing).

As I described in Digital Code of Life, Haussler played an important role in the closing years of the Human Genome Project:

Haussler set to work creating a program to sort through and assemble the 400,000 sequences grouped into 30,000 BACs [large-scale fragments of DNA] that had been produced by the laboratories of the Human Genome Project. But in May 2000, when one of his graduate students, Jim Kent, inquired how the programming was going, Haussler had to admit it was not going well. Kent had been a professional programmer before turning to research. His experience in writing code against deadlines, coupled with a strongly-held belief that the human genome should be freely available, led him to volunteer to create the assembly program in short order.

Kent later explained why he took on the task:

There was not a heck of a lot that the Human Genome Project could say about the genome that was more informative than 'it's got a lot of As, Cs, Gs and Ts' without an assembly. We were afraid that if we couldn't say anything informative, and thereby demonstrate 'prior art', much of the human genome would end up tied up in patents.

Using a hundred 800 MHz Pentiums - powerful machines in the year 2000 - running GNU/Linux, Kent was able to lash up a program, assemble the fragments and save the human genome for mankind.

Haussler's current research depends not just on the availability of the human genome, but also on all the other genomes that have been sequenced - the different manuscripts written in DNA that have come down to us. Using bioinformatics and even more powerful hardware than that available to Kent back in 2000, it is possible to compare and contrast these genomes, looking for tell-tale signs of common ancestors.

But the result is no mere dry academic exercise: if things go well, the DNA text that will drop out at the end will be nothing less than the genome of one of our ancient forebears. Even if Wired's breathless speculations about recreating live animals from the sequence seem rather wide of the mark - imagine trying to run a computer program recreated in a similar way - the genome on its own will be treasure enough. Certainly not bad work for those scholars who "cough in ink" in the world of open genomics.

16 March 2006

The Power of Open Genomics

The National Human Genome Research Institute (NHGRI), one of the National Institutes of Health (NIH), has announced the latest round of mega genome sequencing projects - effectively the follow-ons to the Human Genome Project. These are designed to provide a sense of genomic context, and to allow all the interesting hidden structures within the human genome to be teased out bioinformatically by comparing them with other genomes that diverged from our ancestors at various distant times.

Three more primates are getting the NHGRI treatment: the rhesus macaque, the marmoset and the orangutan. But alongside these fairly obvious choices, eight more mammals will be sequenced too. As the press release explains:

The eight new mammals to be sequenced will be chosen from the following 10 species: dolphin (Tursiops truncates), elephant shrew (Elephantulus species), flying lemur (Dermoptera species), mouse lemur (Microcebus murinus), horse (Equus caballus), llama (Llama species), mole (Cryptomys species), pika (Ochotona species), a cousin of the rabbit, kangaroo rat (Dipodomys species) and tarsier (Tarsier species), an early primate and evolutionary cousin to monkeys, apes, and humans.

If you are not quite sure whom to vote for, you might want to peruse a great page listing all the genomes currently being sequenced for the NHGRI, which provides links to a document (.doc, alas, but you can open it in OpenOffice.org) explaining why each is important (there are pix, too).

More seriously, it is worth noting that this growing list makes ever more plain the power of open genomics. Since all of the genomes will be available in public databases as soon as they are completed (and often before), this means that bioinformaticians can start crunching away with them, comparing species with species in various ways. Already, people have done the obvious things like comparing humans with chimpanzees, or mice with rats, but the possibilities are rapidly becoming extremely intriguing (tenrec and elephant, anyone?).

And beyond the simple pairing of genomes, which yields a standard square-law richness, there are even more inventive combinations involving the comparison of multiple genomes that may reveal particular aspects of the Great Digital Tree of Life, since everything may be compared with everything, without restriction. Now imagine trying to do this if genomes had been patented, and groups of them belonged to different companies, all squabbling over their "IP". The case for open genomics is proved, I think.
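To put some numbers on that richness, here is a short Python sketch. It assumes only basic counting: n genomes give n(n-1)/2 possible pairs (the "square law"), while the number of possible groups of three or more is every subset minus the empty set, the singletons and the pairs. The function names are my own.

```python
from math import comb


def pairwise_comparisons(n):
    """Number of distinct pairs among n genomes: n * (n - 1) / 2."""
    return comb(n, 2)


def multiway_comparisons(n):
    """Number of distinct groups of three or more genomes."""
    return 2**n - 1 - n - comb(n, 2)


# Hypothetical counts of freely available genomes, for illustration only.
for n in (2, 10, 20):
    print(n, pairwise_comparisons(n), multiway_comparisons(n))
```

Pairs grow quadratically, but the multi-genome combinations grow exponentially, which is why every additional open genome is worth more than the last.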

27 January 2006

Personal Genomics...but Not Yet

A new X-prize, this time for exploring inner rather than outer space, has been announced. To win the prize money, all you have to do is sequence the DNA of 100 or more people in a few weeks. That may sound a little vague, but it is many orders of magnitude faster than we can do it now (and remember, the first human genome took about 15 years and three billion dollars).

Why bother? Well, it will open up the world of personal genomics: where the particular details of your genome - not the human genome in general - will be used to aid diagnosis and help doctors make decisions about treatment.

The X-prize announcement is really tantamount to recognising that all those breathless predictions of imminent personal genomes, made by some at the time of the Human Genome Project, were rather optimistic.

I have to say that I, for one, am not too sad. Much as I'd like to Google my genome, being able to do so will also raise considerable ethical dilemmas, as I discussed in my book Digital Code of Life.

As St. Augustine nearly said: "Give me genotypability - but not yet...."

12 December 2005

Going to the Dogs

My heart leapt last week upon seeing the latest issue of Nature magazine. The front cover showed the iconic picture of Watson and Crick, with the latter pointing at their model of DNA's double helix. A rather striking addition was the boxer dog next to Crick, also gazing up at the DNA: inside the journal was a report on the first high-quality sequencing of the dog genome (a boxer, naturally).

This is big news. Think of the genome as a set of software modules that form a cell's operating system. Every change to a genome is a hack; like most hacks, most changes cause malfunctions, and the cell crashes (= dies/grows abnormally). Some, though, work, and produce slight variants of the original organism. Over time, these variations can build up to form an entirely new species. (In other words, one way of thinking about evolution is in terms of Nature's hacking).

Mostly, the changes produced by these hacks are small, or so slow as to be practically invisible. But not for dogs. Humans have been hacking the dog genome for longer than any other piece of code - about 100,000 years - and the result can be seen in the huge variety of dog breeds (some 400 of them).

Getting hold of the dog genome means that scientists have access to this first Great Historical Hack, which will tell us much about how genomic variation translates to different physical traits (known as phenotypes). Even better - for us, though not for the dogs - is that all this hacking/interbreeding has produced dogs that suffer from many of the same diseases as humans. Because particular breeds are susceptible to particular diseases, we know that there must be a strong genetic element to these diseases for dogs, and so, presumably, for humans (since our genomes are so similar). The different breeds have effectively separated out the genes that produce a predisposition to a particular disease, making it far easier to track them down than in the human code.

That tracking down will take place by comparing the genomes for different breeds, and by comparing dog genomes against those of humans, mice, apes and so on. Those comparisons are only possible because all this code is in the public domain. Had the great battle over open genomics - open source genomes - been lost at the time of the Human Genome Project, progress towards locating these genes that predispose towards major diseases would have been slowed immeasurably. Now it's just a matter of a Perl script or two.

Given this open source tradition, and the importance of the dog genome, it's a pity that the Nature paper discussing it is not freely available. Alas, for all its wonderful traditions and historic papers, Nature is still the Microsoft of the science world. The battle for open access - like that for open source - has still to be won.