Benefits-payment cheater caught using statistics

The other day a woman in the UK was caught fabricating the existence of seven children in order to receive government benefits. She claimed to have given birth to quadruplets in 2005, to twins (who were delivered one week apart) in the same year, and then to a seventh child in 2007. None of these children existed.

http://www.dailymail.co.uk/pages/live/articles/news/news.html?in_article_id=494261&in_page_id=1770

The article starts: “Any mother who has given birth to quadruplets needs all the help she can get. So benefits staff were happy to provide support for Victoria Young in raising babies Kier, Kie, Kyla and Conrad. There was just one problem – none of them existed. …”

The benefits staff became suspicious at the seventh child and investigated. By that time, Victoria Young “had swindled more than £40,000 in benefits payments with her bogus brood of seven babies in the space of 18 months.”

It’s natural to ask how data forensics techniques could be applied to this situation. We start with models that describe the population. To test the above claims we need to know about multiple birth probabilities, fertility rates, and birth spacing statistics. I found the needed statistics at a government website: http://www.statistics.gov.uk/downloads/theme_population/FM1_32/FM1no32.pdf

In 2003, only one set of quadruplets survived birth, which puts the probability of live quadruplets at approximately 1 in 600,000 (Table 6.4 of the government report; see also Multiple Births in Wikipedia: http://en.wikipedia.org/wiki/Multiple_Births). From the table of statistics, the probability of twins is about 9,001 in 615,787, of triplets about 127 in 615,787, and of quadruplets about 3 in 615,787. If we use these values and assume that the multiplicity of each maternity is independent of the others, then we can test Victoria Young’s claims with the conditional probabilities in Table 1 (computed using standard convolution equations).

Table 1: Conditional probabilities of the number of maternities given family size

Number of    Number of children
maternities  1        2            3         4         5         6         7
1            1.00000  0.014836942  0.000209  4.95E-06  *         *         *
2                     0.985163058  0.029234  0.000629  1.59E-05  1.88E-07  2.03976E-09
3                                  0.970557  0.043201  0.001251  3.57E-05  6.89055E-07
4                                            0.956165  0.056747  0.002064  6.70445E-05
5                                                      0.941987  0.069882  0.003059676
6                                                                0.928019  0.082614512
7                                                                          0.914258077

The conditional probabilities are read down the columns. (Asterisks are used to indicate values that could not be estimated from the government statistics.) For example, the probability of three maternities given seven children is in row 3 and column 7 and is equal to 6.89055E-07. (This number is in scientific notation and indicates the value of 0.00000068905507, or one in 1,450,000.)
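As a rough illustration of the convolution step, here is a minimal Python sketch that builds a table of this shape from the counts quoted above. It is not the code used to produce Table 1. Two things are assumptions on my part: that 615,787 is the total number of maternities (so singletons take the remainder), and that each column is normalised to sum to one, which is inferred from the way the columns of Table 1 add up. The function names are made up for the example.

```python
# Counts quoted in the text. Treating TOTAL as the number of maternities
# is an assumption made for this sketch.
TOTAL = 615_787
COUNTS = {2: 9_001, 3: 127, 4: 3}  # twins, triplets, quadruplets


def birth_size_distribution():
    """Return P(k children in one maternity) for k = 0..4 (index 0 stays 0)."""
    probs = [0.0] * 5
    for k, count in COUNTS.items():
        probs[k] = count / TOTAL
    probs[1] = 1.0 - sum(probs)  # singletons take whatever is left
    return probs


def convolve(a, b):
    """Discrete convolution: distribution of the sum of two independent counts."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def conditional_table(max_children=7):
    """Return {(maternities, children): probability}, normalised per column."""
    single = birth_size_distribution()
    rows = {}
    dist = [1.0]  # zero maternities give zero children with certainty
    for m in range(1, max_children + 1):
        dist = convolve(dist, single)  # P(n children | m maternities)
        rows[m] = dist
    table = {}
    for n in range(1, max_children + 1):
        column = {m: (rows[m][n] if n < len(rows[m]) else 0.0) for m in rows}
        total = sum(column.values())
        for m, value in column.items():
            table[(m, n)] = value / total  # assumed column normalisation
    return table


table = conditional_table()
print(f"3 maternities, 7 children: {table[(3, 7)]:.3e}")
```

Under these assumptions the sketch gives zero for the cells marked with asterisks in Table 1, because a single maternity is capped at four children here.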

We see that Victoria’s initial claim of quadruplets was very extreme, even though the data show that quadruplets are delivered in the UK: its probability is about one in 200,000, the sort of value that we typically find with extreme occurrences in Caveon Data Forensics. Her claim of six children with two maternities is even more extreme, with a probability of about one in 5.3 million (1.88E-07 in Table 1). And her final claim of seven children in three or fewer maternities has a probability of one in 1.4 million.
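The "one in N" figures are the reciprocals of the Table 1 entries. A small Python sketch of that arithmetic follows; the helper name `one_in` is made up for the example. For "three or fewer maternities" it adds the column 7 entries for two and three maternities, since the single-maternity cell is not estimated.

```python
def one_in(probability):
    """Express a probability as the N in 'one chance in N'."""
    return 1.0 / probability


# Values copied from Table 1 (rows are maternities, columns are children).
quadruplets = 4.95e-06                                # row 1, column 4
six_in_two = 1.88e-07                                 # row 2, column 6
seven_in_three_or_fewer = 6.89055e-07 + 2.03976e-09   # rows 3 and 2, column 7

for label, p in [
    ("quadruplets in one maternity", quadruplets),
    ("six children in two maternities", six_in_two),
    ("seven children in three or fewer maternities", seven_in_three_or_fewer),
]:
    print(f"{label}: about one in {one_in(p):,.0f}")
```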

The claimed birth spacing is also very unusual. Victoria claimed the twins were born in September 2005, eight months after the quadruplets. The birth spacing statistics on the UK website provide only medians: 37 months between the first and second maternity and 42 months between the second and third maternity (Table 11.3 of the UK government report). We don’t have a lot of statistical information, but for the purposes of this exposition we assume the birth spacing data follow an exponential distribution (a waiting-time distribution; this assumption should be tested in practice). We take the median as an estimate of the mean. Using this estimate we find that the probability of having a second maternity within 8 months or less is about one chance in one trillion. We also find that the probability of having two maternities within 18 months or less (we need the distribution of the sum, so we add the medians together) is 1 in 10^25 (roughly one trillion squared).

We have found that it is always useful to combine the probability evidence. After all, Victoria’s motive was to acquire a large family as quickly as possible so as to maximize benefits payments. Using techniques developed at Caveon, we evaluate her final claim of seven children with three or fewer maternities in 18 months or less. The estimated probability is one chance in 10^31 (roughly one in ten billion cubed). Yes, the benefits people were justified in being suspicious. If their systems had implemented these types of probability analysis for fraud detection, they might have caught the cheater more quickly and saved the UK some embarrassment and expense.

In data forensics work we proceed just as I have illustrated above. We create population models. We assume the data conform to the models (i.e., there is no cheating). We test the anomalous data against the model and eventually compute probabilities. It is nearly always the case that the data do not conform precisely to the model, but the models provide sufficient guidance that objective statements concerning the improbability of the extreme data may be made.

Dennis Maynes

Chief Scientist, Caveon Test Security
