The Economist February 26th 2022
Graphic detail Fishy covid-19 data
More equal than others
Sometimes the numbers are simply too tidy to be believed. Irregular statistical variation has proven a powerful forensic tool for detecting possible fraud in academic research, accounting statements and election tallies. Now similar techniques are helping to find a new subgenre of faked numbers: covid-19 death tolls.
That is the conclusion of a new study to be published in Significance, a statistics magazine, by the researcher Dmitry Kobak. Mr Kobak has a penchant for such studies: he previously demonstrated fraud in Russian elections based on anomalous tallies from polling stations. His latest study examines how reported death tolls vary over time. He finds that this variance is suspiciously low in a clutch of countries, almost exclusively those without a functioning democracy or a free press.
Mr Kobak uses a test based on the “Poisson distribution”. This is named after a French statistician who first noticed that when modelling certain kinds of counts, such as the number of people who enter a railway station in an hour, the distribution takes on a specific shape with one mathematically pleasing property: the mean of the distribution is equal to its variance.
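
The mean-equals-variance property can be checked numerically. The sketch below is illustrative only: the rate of 4 arrivals per hour and the helper name `poisson_pmf` are assumptions chosen for the example, not figures from the study.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k events when the expected count is lam."""
    # Work in log space so that large k does not overflow.
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

lam = 4.0        # hypothetical average number of arrivals per hour
ks = range(200)  # the tail beyond 200 is negligible for this rate

# Mean and variance computed directly from the probability mass function.
mean = sum(k * poisson_pmf(k, lam) for k in ks)
variance = sum((k - mean) ** 2 * poisson_pmf(k, lam) for k in ks)

print(f"mean = {mean:.6f}, variance = {variance:.6f}")
```

Both values should come out at the chosen rate, up to floating-point rounding.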
This idea can be useful in modelling the number of covid deaths, but requires one extension. Unlike a typical Poisson process, the number of people who die of covid can be correlated from one day to the next: superspreader events, for example, lead to spikes in deaths. As a result, the distribution of deaths should be what statisticians call “overdispersed”, meaning the variance should be greater than the mean. Jonas Schöley, a demographer not involved with Mr Kobak’s research, says he has never in his career encountered death tallies that would fail this test.
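
A small simulation shows how a fluctuating death rate produces overdispersion. This is a sketch under stated assumptions: the gamma-distributed daily rate (a gamma-Poisson mixture) is a standard textbook device swapped in here for illustration, not Mr Kobak's model, and the mean of 20 deaths a day, the shape parameter and the function name `poisson_draw` are all hypothetical.

```python
import math
import random
import statistics

def poisson_draw(lam: float, rng: random.Random) -> int:
    """Draw one Poisson count with Knuth's multiplication method (fine for modest lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

rng = random.Random(0)
days = 20_000
mean_rate, shape = 20.0, 5.0  # hypothetical values, not taken from the study

# Constant underlying rate: plain Poisson counts.
steady = [poisson_draw(mean_rate, rng) for _ in range(days)]
# Rate varies from day to day: gamma with mean = shape * scale = mean_rate.
bursty = [poisson_draw(rng.gammavariate(shape, mean_rate / shape), rng)
          for _ in range(days)]

for name, counts in (("steady rate", steady), ("varying rate", bursty)):
    m, v = statistics.mean(counts), statistics.variance(counts)
    print(f"{name}: mean={m:.1f} variance={v:.1f} ratio={v / m:.2f}")
```

With a fixed rate the variance-to-mean ratio should sit close to 1. With a varying rate it should be well above 1 (in theory 1 + mean/shape for this mixture, or 5 with the values assumed here).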
That should make it easy to pass. And the vast majority of countries reporting data to the World Health Organisation do. This does not mean that their death tallies were necessarily accurate: undercounting still plagues many countries with insufficient testing (which is why The Economist estimates the pandemic’s death toll using excess deaths). But it does suggest that the numbers reported are not being deliberately tampered with.
Yet data from 17 countries had the opposite pattern. In many weeks, the variance of each distribution was less than the mean. This is a statistical smoking gun. “It seems reasonable to conclude that there’s no way these are independent observations,” says David Steinsaltz, a professor of statistics at the University of Oxford.
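
The weekly check described above amounts to comparing a sample variance with a sample mean. A minimal sketch follows; the function name `dispersion_ratio` and both weeks of counts are invented for illustration (they are not reported figures), and the study's actual test may differ in its details.

```python
import statistics

def dispersion_ratio(daily_counts: list[int]) -> float:
    """Sample variance divided by sample mean.

    Roughly 1 for Poisson data, above 1 if overdispersed, below 1 if underdispersed.
    """
    return statistics.variance(daily_counts) / statistics.mean(daily_counts)

# Hypothetical weeks of daily death counts, made up for this example.
noisy_week = [760, 812, 845, 701, 798, 830, 774]
smooth_week = [795, 797, 792, 799, 794, 796, 793]

for label, week in (("noisy", noisy_week), ("smooth", smooth_week)):
    ratio = dispersion_ratio(week)
    verdict = "underdispersed" if ratio < 1 else "not underdispersed"
    print(f"{label}: ratio={ratio:.3f} ({verdict})")
```

A ratio far below 1, as in the smooth week, is the pattern the article calls a statistical smoking gun.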
Imputing motives is harder. A benign explanation would be bureaucratic bottlenecks in processing death certificates. Yet there are other irregularities: the usual drop-off in weekend reporting is often absent. According to Mr Kobak, the likelier explanation is cack-handed tampering.
The Russian numbers offer an example of abnormal neatness. In August 2021 daily death tallies went no lower than 746 and no higher than 799. Russia’s invariant numbers continued into the first week of September, ranging from 792 to 799. A back-of-the-envelope calculation shows that such a low-variation week would occur by chance once every 2,747 years.
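
One way to frame a calculation of this kind is sketched below. It is not the article's or Mr Kobak's method, which is not spelled out here, so it should not be expected to reproduce the 2,747-year figure. It assumes daily counts are independent Poisson draws with a mean at the midpoint of the reported range (795.5, an assumption) and asks how likely seven such draws in a row are to land between 792 and 799.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of exactly k events, computed in log space."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

low, high = 792, 799    # range reported for the first week of September
lam = (low + high) / 2  # assumed true daily mean (not given in the article)

# Chance that a single day's count lands inside the range.
p_day = sum(poisson_pmf(k, lam) for k in range(low, high + 1))
# Chance that seven independent days all do.
p_week = p_day ** 7
# Average waiting time, in years, between such weeks.
years_between = 1 / (p_week * 365.25 / 7)

print(f"P(one day in range)    = {p_day:.3f}")
print(f"P(whole week in range) = {p_week:.2e}")
print(f"about one such week every {years_between:,.0f} years")
```

Under these assumptions a single day falls in the range only about one time in nine, so a full week doing so is vanishingly rare. A different mean, or a criterion based on the week's variance instead of its range, would change the number.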
Abnormal tallies suggest some countries reported false covid-19 data
[Chart] Reported covid-19 death tallies have an expected amount of variance. But in some countries, variance is abnormally low
Top panel: New confirmed covid-19 deaths per million people, March 2020 to January 2022 (axis ticks 0 to 20), for Brazil, United States, Russia and Turkey
Annotations: Genuine death counts vary from day to day, as people do not fall ill uniformly. Recent data from Russia are also suspiciously smooth. Turkish data during the first year of the pandemic have a dubious lack of variation.
Bottom panel: Weeks with normal variation and weeks with abnormally low variation (including "Very low*"), by country, with % abnormal
Countries, as listed: Mongolia, Cambodia, Russia, Lebanon, Turkey, Albania, Uzbekistan, Algeria, Kyrgyzstan, Syria, Egypt, Venezuela, El Salvador, Azerbaijan, Belarus, Serbia, Saudi Arabia
% abnormal values, as listed: 60, 65, 86, 86, 29, 79, 81, 67, 54, 26, 28, 92, 90, 69, 58, 62, 80
*Less than 1% of expected variation
Sources: “Underdispersion in the reported covid-19 case and death numbers may suggest data manipulations”, by D. Kobak, working paper, 2022; Our World in Data; JHU CSSE