Anatomy of the meltdown of a forensic procedure

The CBS News program “60 Minutes” and the Washington Post aired an investigative report on November 16 criticizing the FBI for failing to notify relevant jurisdictions that hundreds of inmates have been jailed using a flawed forensic methodology. Despite discontinuing the use of “bullet lead” analysis in 2005 because of validity concerns, the FBI had taken no action to inform the courts that some defendants were potentially innocent and wrongfully imprisoned.

Bullet lead analysis was first used in the investigation of the assassination of JFK, and was routinely used in the 1980’s when bullets were so misshapen that ballistic evidence was unobtainable. The essential idea is that trace elements in lead vary naturally and that bullets could be “matched” as coming from the same source (i.e., the same box of bullets) by comparing the compositions of these trace elements. In the 2005 press release, the FBI stated, “One factor significantly influenced the Laboratory’s decision to no longer conduct the examination of bullet lead: neither scientists nor bullet manufacturers are able to definitively attest to the significance of an association made between bullets in the course of a bullet lead examination.”

We naturally ask, “How is it possible that a procedure could be trusted for 40 years, be invoked in 2,500 investigations, be used as testimony in about 500 of those cases, and then be discredited?” The FBI commissioned an independent review of the procedure in 2002 by the National Research Council. Their report is very fascinating to read, is very comprehensive, and was completed in 2004. A copy may be purchased at the following URL: The findings of this report convinced the FBI to discontinue the bullet lead analysis.

After browsing through this report and reading the findings and recommendations, it is clear that the FBI procedure devised in the 1960’s could not withstand public scrutiny. From my perspective, the most troubling aspect of the analysis was that it was (and is) unknown how many compositionally similar bullets were produced and where they were distributed. This means that a probability statement concerning the likelihood of a false positive (i.e., saying the bullets came from the same box when they didn’t) was impossible. Without such a statement the forensic examiner cannot state with any reliability or objectivity that the bullet found at the crime scene came from the same box as bullets found in the possession of the suspect.

The NRC also indicated that the method of computing the statistical match should be revised. From my perspective this is because the FBI’s computational procedure was not based on a statistic. It was computed using statistical ideas, but not supported with statistical distribution theory. This procedure falls into the realm of “ad-hoc analytics.” It seemed good at the time. There wasn’t a better idea. But, there was no way to determine error rates and probabilities associated with the procedure. I have seen a lot of ad-hoc statistical procedures in my day and they nearly always fail eventually because they are based on some statistical idea but they have no statistical theory that supports them. In the long run, the queen of statistics (i.e., natural variability) overwhelms all procedures that do not estimate probability models from empirical data.

I have a good friend who quoted the maxim, “Models before algorithms” often. By this he meant that you should analyze the processes that generate the data and the variability associated with the data before you build detection methodologies. I have tried to follow this rule assiduously in devising detection methodologies for Caveon Data Forensics. Without the guidance of reasonable probability models, statistical interpretations of the data are subjective and indefensible.

Dennis Maynes

Chief Scientist, Caveon Test Security

Leave a Reply