The Year of the Bat: Lies, Damn Lies and Statistics

Last year was supposed to be the Chinese Year of the Rat but the year had hardly started and it transmogrified into the Year of the Bat.  It not only created genocidal mayhem but also drove the world schizophrenic with false narratives and bad data.  This blog focusses on the latter with particular reference to South Africa.

Fighting Covid is a war and, just like in a war, the first casualty is the truth: firstly about who is to blame for the war, and secondly about how successfully it’s being prosecuted.  We saw the Chinese disappearing various scientists, journalists and commentators who dared bring to light any narrative other than the officially sanctioned one.  More famously, we have seen Trump taking the art of lying to new highs (or is it lows).  At the end of the Year of the Bat, Russian duplicity came to light when the deputy Prime Minister admitted that, while the official death toll was 57 000, there seemed to be in excess of 180 000 deaths.  This estimate came courtesy of the Rosstat statistics agency, which stated that deaths from all causes recorded between January and November were 229 700 higher than in 2019 and that Covid was probably responsible for 81% (186 000).

As the pandemic started getting a grip, I began tracking the daily reported cases and deaths in the UK and the USA in Excel.  I was whiling away lockdown trying to do a bit of armchair research by establishing the relationship:

$D_n = M \times I_{n-d}$

where $D_n$ = daily deaths on day $n$

            $I_{n-d}$ = daily infections from $d$ days prior

            $M$ = mortality ratio

The results weren’t too bad and generally 3-4% of people died 7-10 days after infection.
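To make the relationship concrete, here is a minimal Python sketch of it (the original work was done in Excel, so this is an illustration rather than the actual spreadsheet).  The function name `predicted_deaths` and the infection counts are invented for the example; only the formula comes from the text above.

```python
def predicted_deaths(infections, mortality, lag_days):
    """Apply D_n = M * I_{n-d} to a list of daily infection counts.

    Days earlier than lag_days have no infection figure to look back to,
    so they are returned as None.
    """
    predictions = []
    for n in range(len(infections)):
        if n < lag_days:
            predictions.append(None)
        else:
            predictions.append(mortality * infections[n - lag_days])
    return predictions


# Made-up daily infection counts, purely for illustration.
daily_infections = [100, 200, 400, 800, 1600]

# M = 4% and d = 2 days are arbitrary values chosen to keep the example short.
deaths = predicted_deaths(daily_infections, mortality=0.04, lag_days=2)
print(deaths)
```

With real data, M and d are the two knobs that get adjusted until the predicted deaths line up with the reported ones.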

Then suddenly, the UK corrected their numbers with a big spike and changed the definition of who was infected.  M and d refused to behave as predicted, just like the virus.  There was also a significant difference between the US and the UK.  Eventually I gave it up.

Different countries were using different definitions for infection in the early days when testing wasn’t that prevalent.  Later on, we also found out that many people were asymptomatic or very lightly affected and so didn’t bother with testing or even seeing their doctors.

The other side of the equation was the deaths.  Apart from willfully incorrect certification by some countries, we have the problem, as with AIDS, of deciding what the person actually died of if they had a heart attack, for instance, while infected.  On top of all this, we didn’t realise how terribly it affected the aged and those with comorbidities.  Once that was understood, far more stringent measures were taken to protect them.  On top of that, many of the vulnerable had popped their clogs early on and their deaths declined while the infections continued amongst the young and the stupid who refused to die.

After about two months, I gave up.  My days as a couch scientist were over.  I had failed partially because the input data could not be trusted.

Closer to home, I started tracking South Africa’s infections and deaths with a focus on the Western Cape.  Initially, I was trying to verify that equation, but after I gave that up, I just tracked out of morbid interest.  Eventually, the rest of the country caught the bug and I became interested as to why the number of infections was stubbornly higher in the Western Cape than that for Gauteng.  Eventually, their infection numbers overtook ours to reach roughly twice ours but their deaths only ever just exceeded ours.  The second wave arrived, and their deaths have since fallen behind again.

I became suspicious of Gauteng’s numbers as their infection and death numbers were far more erratic than the Western Cape’s.  That Gauteng’s numbers could not be trusted was confirmed yesterday by a report from the SAMRC (South African Medical Research Council) published on Politicsweb, “Covid-19 II: Huge December spike in excess deaths”.  Since the beginning, they have tracked excess (natural) deaths by some statistical method, comparing 2020’s deaths to those of previous years.  Death numbers, as opposed to infections, are very solid and can’t be fudged; only the cause is up for debate.  During the first wave, the statistical anomaly can be attributed almost wholly to Covid.  During the current second wave, that is no longer quite true, as many medical treatments have been stopped, particularly those involving cancer.

Nevertheless, the excess deaths far outstrip the officially reported Covid deaths.  SAMRC calculates the excess deaths for the period 6 May to 29 December as 71778 while South Africa officially reports only 27568 or 2.6x less!

Although it is not specifically noted in the report, it is glaringly obvious to any vaguely sentient being that, in the graphs presented for the individual provinces, the numbers of Covid deaths in the Western Cape bear a far stronger resemblance to reality than those of all the other provinces.  SAMRC excess deaths have been compared to the official Covid deaths for the period 5 May to 29 December and presented in the table below.

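| Province | Excess Deaths | Official Covid Deaths | Reporting Accuracy |
|---|---|---|---|
| Eastern Cape | 22530 | 7171 | 32% |
| Free State | 5088 | 2163 | 43% |
| Gauteng | 14110 | 5424 | 38% |
| Limpopo | 2325 | 556 | 24% |
| KwaZulu-Natal | 11894 | 4134 | 35% |
| Mpumalanga | 3338 | 640 | 19% |
| North West | 3110 | 577 | 19% |
| Western Cape | 9768 | 6524 | 67% |
| SA Overall | 71778 | 27568 | 38% |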
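For anyone wanting to check the last column, the sketch below assumes that "Reporting Accuracy" is simply official Covid deaths divided by excess deaths, expressed as a percentage.  That definition is my reading of the table rather than something stated in the SAMRC report, and the helper name `reporting_accuracy` is made up for the example.

```python
# (excess deaths, official Covid deaths) per province, copied from the table.
figures = {
    "Eastern Cape": (22530, 7171),
    "Free State": (5088, 2163),
    "Gauteng": (14110, 5424),
    "Limpopo": (2325, 556),
    "KwaZulu-Natal": (11894, 4134),
    "Mpumalanga": (3338, 640),
    "North West": (3110, 577),
    "Western Cape": (9768, 6524),
    "SA Overall": (71778, 27568),
}


def reporting_accuracy(excess, official):
    """Assumed definition: official deaths as a percentage of excess deaths."""
    return 100 * official / excess


accuracy = {
    province: reporting_accuracy(excess, official)
    for province, (excess, official) in figures.items()
}

for province, pct in accuracy.items():
    print(f"{province}: {pct:.0f}%")
```

Under that assumption the rounded results agree with the percentages in the table.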

Graphs of excess deaths and reported deaths for the Western Cape (top graph) and Gauteng (lower graph) dramatically show the extent of the underreporting below.  The blue line represents the statistical excess deaths while the red represents the recorded Covid deaths.

As a case in point concerning the erratic compilation of data, I was just finishing off this blog when I checked the latest numbers for 6 December.  South Africa reported 21 832 infections compared to their 7-day average up to that point of 15 200, a jump of 44%.  Even worse, 844 deaths were recorded against a 7-day average of 325, a massive increase of about 160%.  While this is serious, it is not necessarily cause for alarm.

These statistical anomalies were caused by bad data capture over the festive season.  The Eastern Cape was the main culprit in the deaths column when it played catch up as it recorded 452 deaths on the 6th, more than 50% of the deaths for the country as a whole.  However, in the 7 days prior, it had a daily average of 83.  Thus, there was a jump of 444%!  Strangely, this catch up was not particularly reflected in the infection numbers when only 1088 were recorded against a prior daily average of 784.  Only the Western Cape plodded on with serious but consistent, and hence, believable numbers.
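The Eastern Cape percentages above are plain "how far above the prior average" calculations.  A small sketch, using only the figures quoted in that paragraph (the function name `percent_jump` is invented for the example):

```python
def percent_jump(latest, baseline):
    """Percentage by which latest exceeds baseline."""
    return 100 * (latest - baseline) / baseline


# Eastern Cape deaths on the 6th versus the prior 7-day daily average.
ec_deaths_jump = percent_jump(452, 83)

# Eastern Cape infections on the 6th versus the prior daily average.
ec_infections_jump = percent_jump(1088, 784)

# Eastern Cape's share of the 844 deaths reported nationally that day.
ec_share_of_deaths = 100 * 452 / 844

print(f"Eastern Cape deaths jump: {ec_deaths_jump:.1f}%")
print(f"Eastern Cape infections jump: {ec_infections_jump:.1f}%")
print(f"Eastern Cape share of national deaths: {ec_share_of_deaths:.1f}%")
```

The contrast between the two jumps is the point: deaths leapt while infections barely moved, which is what a backlog being captured in one go would look like.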

Having read the Politicsweb article, I decided to retry establishing the mortality equation using only the Western Cape numbers as they were shown to be reasonably accurate.  I further used the 7-day averaged infection data and compared the calculated results with the 7-day averaged deaths.
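For the curious, here is one way such a fit could be automated in Python.  This is a sketch under my own assumptions, not the spreadsheet actually used: it takes a trailing 7-day average, then does a brute-force grid search for the mortality ratio M and delay d that minimise the mean squared error.  The data is synthetic, built with M = 4% and d = 10 so that the search has a known answer to recover, and the names `rolling_mean` and `fit_lagged_model` are invented for the example.

```python
def rolling_mean(values, window=7):
    """Trailing moving average; the first window-1 days are dropped."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def fit_lagged_model(infections, deaths, lags, ratios):
    """Grid-search M and d minimising the mean squared error of D_n = M * I_{n-d}."""
    best = None
    for d in lags:
        for m in ratios:
            errors = [
                (deaths[n] - m * infections[n - d]) ** 2
                for n in range(d, len(deaths))
            ]
            mse = sum(errors) / len(errors)
            if best is None or mse < best[0]:
                best = (mse, m, d)
    return best[1], best[2]


# Synthetic series (NOT real data): infections grow steadily and deaths
# are exactly 4% of the infections reported 10 days earlier.
raw_infections = [100 + 3 * n * n for n in range(60)]
raw_deaths = [
    0.04 * raw_infections[n - 10] if n >= 10 else 0.0 for n in range(60)
]

smoothed_infections = rolling_mean(raw_infections)
smoothed_deaths = rolling_mean(raw_deaths)

best_m, best_d = fit_lagged_model(
    smoothed_infections,
    smoothed_deaths,
    lags=range(5, 15),                           # candidate delays in days
    ratios=[r / 1000 for r in range(20, 61)],    # candidate M from 2.0% to 6.0%
)
print(f"Best fit: M = {best_m:.1%}, d = {best_d} days")
```

Real data will never fit this cleanly, so the interesting output in practice is how large the leftover error is, not just the best pair of coefficients.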

The results were quite satisfying and my last hurrah showed that a roughly 4% mortality and 10 day delay was a good fit.  I was hoping for a better fit but I decided to leave it at that as the progress of the pandemic has been a movable feast (for the virus, politicians and conspiracy theorists) and applying a constant equation to an evolving issue is problematical:

  • Different sections of the population were affected at different times as we learnt who was vulnerable.
  • Treatment regimes are constantly being adapted
  • Changes to the definition of infections particularly in the early stages when tests were not widely available
  • The evolution of the virus

Can I have my Nobel prize now, please?

As a very final last hurrah, I decided to check how predictable Gauteng’s death rate is using the Western Cape coefficients of the equation.

Hmmm.  I think I’ll either have to give my Nobel prize back or I’ve just proven what the Politicsweb article was trying not to say out loud – the data ain’t to be trusted from certain provinces.  One might argue that the Western Cape is just terrible at managing their infections.  Playing around with the coefficients, I found a reasonable fit for Gauteng when the mortality rate was 2.1% compared to 3.8% in the Western Cape.  I don’t buy that Gauteng has better hospitals or that their people are tougher.  The giveaway is that their actual death data is erratic even when smoothed by 7-day averaging.  Furthermore, if you compare my calculated Gauteng province deaths/Covid deaths and the excess deaths/Covid deaths from two graphs previously, they are eerily similar.

Statistics do not kill people – directly that is.  However, with the correct input data, the statistics and inferences drawn therefrom help in understanding the pandemic.  In addition, they drive the correct allocation of resources and help in applying the correct level of the economically damaging restrictions.

3 Comments

  1. I agree with you fully Dean. In terms of excess deaths I have often wondered – if our population is 60m, and life expectancy were 60, just to make the maths easier, surely approximately 1m people die a year, or 80k per month. Our birth rate is about 1,5m per year (per GDE number admitted to schools) and we have a growing population so this number makes a degree of sense. Where are all the “missing“ deaths recorded? At this point, Covid is starting to grow significantly but up until now HIV (171k) heart disease and cancer have traditionally been much higher, and over 60k have died in the pre Covid times from respiratory disease per ourworldindata.com.
