Showing posts with label false positives. Show all posts

03 January 2010

Why Extending the DNA Database is Dangerous

Part of the problem with extending the DNA database is that doing so increases the likelihood of this happening:

After a seven-day trial, Jama had been convicted of raping a 40-year-old woman in the toilets at a suburban nightclub.

The only evidence linking him to the crime was a DNA sample taken from the woman's rape kit.

...

Jama had steadfastly denied the charge of rape and said he had never been to that nightclub, not on that cold Melbourne night, not ever. He repeatedly stated he was with his critically ill father on the other side of Melbourne, reading him passages from the Koran.

But the judge and the jury did not buy his alibi, despite supporting evidence from his father, brother and friend. Instead, they believed the forensic scientist who testified there was a one in 800 billion chance that the DNA belonged to someone other than the accused man.

This week Jama gave the lie to that absurdly remote statistic. After prosecutors admitted human error in the DNA testing on which the case against Jama was built, his conviction was overturned.

Prosecutors said they could not rule out contamination of the DNA sample after it emerged the same forensic medical officer who used the rape kit had taken an earlier sample from Jama in an unrelated matter. They admitted a "serious miscarriage of justice".

DNA is an important forensic tool - when used properly. But it is not foolproof, not least because contamination can lead to false positives.

The more DNA profiles that are stored on a database, the more likely it is that a match will be found because of such false positives. And such is the belief in the infallibility of DNA testing - thanks to the impressive-sounding "one in 800 billion chance that the DNA belonged to someone other than the accused man" - that it is likely to lead to more *innocent* people being convicted. The best solution is to keep the DNA database small, tight and useful.

Follow me @glynmoody on Twitter or identi.ca.

27 October 2008

More on Labour's Data Delusion

And so it goes on:


Every police force in the UK is to be equipped with mobile fingerprint scanners - handheld devices that allow police to carry out identity checks on people in the street.

The new technology, which ultimately may be able to receive pictures of suspects, is likely to be in widespread use within 18 months. Tens of thousands of sets - as compact as BlackBerry smartphones - are expected to be distributed.

The police claim the scheme, called Project Midas, will transform the speed of criminal investigations. A similar, heavier machine has been tested during limited trials with motorway patrols.

To address fears about mass surveillance and random searches, the police insist fingerprints taken by the scanners will not be stored or added to databases.

Yeah, pull the other one. The point is, given the current government's mentality that more is better, it is inevitable that these prints will be added. The irony is, this will actually make the system *less* useful.

To see why, consider what happens if there is a 1 in 100,000,000 chance of a false positive each time one of these new units compares a scanned print with a stored one. Suppose there are 1,000,000 fingerprints on the database: every check then involves 1,000,000 comparisons, so a false match can be expected after about 100 checks - bad enough. But now consider what happens when all these other fingerprints, obtained at random, are added, and the database increases to 10,000,000: a false positive will be obtained after every *10* checks on average. In other words, the more prints there are on the database, the worse the false positive rate becomes because of the unavoidable errors in biometrics.
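For anyone who wants to check the arithmetic, here is a minimal sketch in Python. It assumes the 1 in 100,000,000 figure applies to each individual comparison between a scanned print and a stored one, and that comparisons are independent; both are simplifying assumptions, and the function names are purely illustrative.

```python
import math

def checks_per_false_match(one_in_n, database_size):
    """Average number of checks before one false match is expected.

    Each check compares one scanned print against every stored print,
    so the expected number of false matches per check is
    database_size / one_in_n.
    """
    return one_in_n / database_size

def prob_false_match_within(one_in_n, database_size, checks):
    """Probability of at least one false match in the given number of checks,
    assuming every comparison is independent (an assumption, not a measurement)."""
    comparisons = database_size * checks
    # 1 - (1 - p)^comparisons, computed in a numerically stable way
    return -math.expm1(comparisons * math.log1p(-1.0 / one_in_n))

ONE_IN_N = 100_000_000  # hypothetical per-comparison false positive odds

for size in (1_000_000, 10_000_000):
    print(size, checks_per_false_match(ONE_IN_N, size))
```

With these numbers the average works out at 100 checks per false match for 1,000,000 prints and 10 for 10,000,000. Under the same assumptions, the chance of at least one false match somewhere in the first 100 checks against the smaller database is about 63%.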

This back of the envelope calculation also shows the way forward for biometric checks - of all kinds, since they are all subject to the same scaling problem. The government should aim to *reduce* the number of files it holds, but ensure that they are the ones it is most interested in or concerned about. In other words, try to cut the database down to 100,000, say, but make sure they are the *right* 100,000, not just random members of the public.
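The same calculation can be turned around to ask how big a database can get before false matches become too frequent. The sketch below is only an illustration: the 1 in 100,000,000 per-comparison figure is the hypothetical one used above, and the tolerance of one false match per 1,000 checks is an assumption picked for the example, not a figure from any real system.

```python
def max_database_size(one_in_n, min_checks_between_false_matches):
    """Largest database for which a false match is expected no more often
    than once every min_checks_between_false_matches checks.

    Expected false matches per check = database_size / one_in_n, so the
    size limit is one_in_n / min_checks_between_false_matches.
    """
    return one_in_n // min_checks_between_false_matches

ONE_IN_N = 100_000_000   # hypothetical per-comparison false positive odds
TOLERATED_GAP = 1_000    # assumed: at most one false match per 1,000 checks

print(max_database_size(ONE_IN_N, TOLERATED_GAP))
```

On those assumptions the limit comes out at 100,000 records. Tightening the tolerance by a factor of ten shrinks the usable database by the same factor.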

It's clear that the reason for Labour's data delusion is that it doesn't understand the technology that it is seeking to apply. In particular, it doesn't understand that the error rate sets a limit on the useful size of such databases. Super-duper databases are simply super stupid.