There is a lot of pressure at present for the expansion of testing for COVID-19 to enable better understanding of the spread of the virus and possibly to help with reopening of the economy. Random population surveys have also been conducted in many countries, with a recent antibody survey in California, for example, finding 50 times more people infected than official estimates report. The WHO recognizes testing as a key part of the coronavirus response, and some countries are beginning to discuss the idea of “immunity passports”, in which people are given an antibody test and enabled to return to work if they test positive to antibodies and are well (since this indicates that they have been infected and gained immunity). The WHO advises against this approach because there is no evidence yet that people who have experienced COVID-19 and recovered are actually immune. But in addition to this virological concern, there is a larger, statistical concern about COVID-19 tests (especially antibody tests) and the consequence of widespread use of these tests as a policy guide: how reliable are they, and what are the consequences of deploying poor-quality tests?

My reader(s) may be familiar with my post on the use of Bayesian statistics to assess the impact of anti-trans bathroom laws on natal women. This study found that, since being transgender is a very low prevalence phenomenon, if we tried to actually enforce birth-gender bathroom laws almost everyone we kicked out of a women’s toilet would actually be a cis woman. This is a consequence of Bayes’ rule, which basically tells us that when a condition has very low prevalence, most of the positive results from any test for that condition will be false positives unless the test is extremely accurate. This applies to any attempt to discriminate between two classes of things (e.g. trans women vs. natal women, or coronavirus vs. no coronavirus). It is a mathematical theorem, and there is no escaping it.
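
In symbols, the claim about low prevalence can be sketched as follows. The notation is introduced here for illustration and is not from the earlier post: p is the true prevalence, Se the test's sensitivity and Sp its specificity.

```latex
\[
P(\text{infected} \mid \text{test positive})
  = \frac{Se \cdot p}{Se \cdot p + (1 - Sp)\,(1 - p)}
\]
```

When p is small, the false-positive term (1 - Sp)(1 - p) dominates the denominator unless Sp is very close to 1, which is why most positives end up being false.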

So what happens with testing for coronavirus? There are a couple of possible policies that can be enacted based on the result of testing:

  1. People testing positive are isolated from the rest of the community in special hospitals or accommodation, to be treated and managed until they recover
  2. People testing positive self-isolate, and all their potential contacts are traced and tested, self-isolating as necessary
  3. People testing negative are allowed to return to ordinary life, working and traveling as normal
  4. People testing positive to antibodies with no illness are issued an “immunity passport” and allowed to take up essential work
  5. Health workers testing negative are allowed to return to work in hospitals

Obviously, depending on the policy, mistakes in testing can have significant consequences. This is why the WHO has quite strict diagnostic criteria for the use of testing, which require multiple tests at different specified time points, with rules about test comparison and cautionary notes about low-prevalence areas[1]. Now that some antibody tests have achieved marketing status, I thought I would do a few brief calculations using Bayes’ rule to see how good they are and what the consequences will be. In particular let’s consider policy options 1, 3 and 4. I found a list of antibody tests currently being marketed or used in the USA here, and information on one PCR test, from Quantivirus. I assumed a testing program applied to a million people, and for each test under this program I calculated the following information:

  • The number of people testing positive and the number who are actually negative
  • The proportion of positive tests that are actually positive
  • The number of people testing negative and the number who are actually positive
  • The estimated prevalence of COVID-19 obtained from each of these tests
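
The four quantities in this list can be computed with a few lines of arithmetic. The sketch below is one way to do it, not necessarily the calculation used for this post; the function name `screening_outcomes` and the example inputs (90% sensitivity, 95% specificity, 1% prevalence) are hypothetical and chosen only for illustration.

```python
def screening_outcomes(sensitivity, specificity, prevalence, n_tested=1_000_000):
    """Expected results of testing n_tested people with an imperfect test."""
    infected = n_tested * prevalence
    uninfected = n_tested - infected

    true_pos = infected * sensitivity      # infected and test positive
    false_neg = infected - true_pos        # infected but test negative
    true_neg = uninfected * specificity    # uninfected and test negative
    false_pos = uninfected - true_neg      # uninfected but test positive

    positives = true_pos + false_pos
    negatives = true_neg + false_neg
    return {
        "positives": positives,
        "false_positives": false_pos,
        "ppv": true_pos / positives,                  # share of positives that are real
        "negatives": negatives,
        "false_negatives": false_neg,
        "apparent_prevalence": positives / n_tested,  # what the survey would report
    }


# Hypothetical test, not one of the products discussed in this post.
example = screening_outcomes(sensitivity=0.90, specificity=0.95, prevalence=0.01)
for name, value in example.items():
    print(f"{name}: {value:,.3f}")
```

In this made-up example only about 15% of positive results are real, and the survey would report a prevalence of nearly 6% when the truth is 1%.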

I took the number of confirmed cases in the USA on 24th April (870,000), multiplied it by 10 to include asymptomatic/untested cases, and divided by a US population of 330 million to estimate the true prevalence of coronavirus in the USA at 2.6%. Note that with 2.6% prevalence the true situation among a million people tested is roughly 26,000 cases of COVID-19 and 974,000 people negative (the results below are computed from the unrounded prevalence, about 2.64%). I then compared the estimated prevalence for each test against this. Here are the results.
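
The prevalence assumption is simple enough to check directly. This is a minimal sketch using only the three inputs stated above; the variable names are mine.

```python
confirmed_cases = 870_000      # US confirmed cases on 24th April, as stated above
undercount_factor = 10         # multiplier for asymptomatic/untested cases
us_population = 330_000_000

prevalence = confirmed_cases * undercount_factor / us_population
infected_per_million = prevalence * 1_000_000
uninfected_per_million = 1_000_000 - infected_per_million

print(f"assumed prevalence: {prevalence:.2%}")
print(f"infected per million tested: {infected_per_million:,.0f}")
print(f"uninfected per million tested: {uninfected_per_million:,.0f}")
```

The factor of 10 is the weakest link in this estimate. Changing `undercount_factor` shows how strongly everything downstream depends on it.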

Becton Dickinson/BioMedomics COVID-19 IgM/IgG Rapid Test

This test has 88.7% sensitivity and 90.6% specificity, and has been given emergency use authorization by the FDA. If used to test a million people in the context of disease prevalence of 2.6%, we would find the following results:

  • 114,906 people testing positive of whom 91,521 are actually negative
  • Only 20.4% of positive tests are actually positive
  • 885,093 people testing negative, of whom 2,979 are positive
  • An estimated coronavirus prevalence of 11.4%

This would mean that under policy 1 (isolation of all positive cases) we would probably increase prevalence by a factor of 5, since 80% of the people we put into isolation with positive cases would be negative (and would then be infected). If we followed policy 3 or 4, we would be releasing 2,979 infected people into the community to work, get on trains etc., and infect others. We would also recalculate the case fatality rate of the virus to be more than 40 times lower than the actual observed estimate, because we had observed deaths among 870,000 cases (prevalence 0.26%) but were now dividing the confirmed deaths by a prevalence of 11.4%. This would make us think the disease is not much worse than influenza, while we were spreading it to five times as many people. Not good! Curing that epidemic is going to need a lot of bleach injections.
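
The effect on the case fatality rate is back-of-envelope arithmetic. In the sketch below the number of deaths is the same in both calculations, so it cancels and only the two denominators matter. The variable names are mine, and the 11.4% figure is the estimated prevalence from the bullets above.

```python
confirmed_cases = 870_000
us_population = 330_000_000
apparent_prevalence = 0.114    # estimated prevalence from the antibody test above

confirmed_prevalence = confirmed_cases / us_population
implied_infections = apparent_prevalence * us_population

# CFR = deaths / cases. With deaths fixed, the ratio of the two CFRs is
# just the ratio of the two case counts.
cfr_dilution = implied_infections / confirmed_cases

print(f"confirmed-case prevalence: {confirmed_prevalence:.2%}")
print(f"infections implied by the test: {implied_infections:,.0f}")
print(f"fatality rate would look {cfr_dilution:.0f} times lower")
```

Under these inputs the apparent fatality rate shrinks by a factor of a little over 40, even though most of the extra "cases" are false positives.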

Cellex qSars-CoV-2 IgG/IgM Cassette Rapid Test

This test has also received emergency use authorization, and has 93.8% sensitivity and 95.6% specificity, which sounds good (very big numbers! Almost as good as Trump’s approval rating!) But if used to test 1,000,000 Americans with prevalence of 2.6% it still performs very poorly:

  • 67,569 people testing positive of whom 42,840 are actually negative
  • Only 36.6% of positive tests are actually positive
  • 932,430 people testing negative, of whom 1,635 are positive
  • An estimated coronavirus prevalence of 6.8%

This is still completely terrible. Isolating all the positive people (policy 1) would likely increase prevalence by a factor of 3, and we would allow 1,635 people to run around infecting others blithely assuming they were negative. Not a good outcome.
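
The "factor of 5" and "factor of 3" figures follow from a worst-case assumption, made explicit here: every uninfected person isolated alongside true cases goes on to catch the virus. Under that assumption the number of infections in the isolated group grows by one over the share of positives that were real. The sketch below uses the sensitivity and specificity quoted for the two tests above; the helper `ppv` and the short test labels are mine.

```python
def ppv(sensitivity, specificity, prevalence):
    """Share of positive results that are true positives (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


PREVALENCE = 870_000 * 10 / 330_000_000   # the prevalence assumed in this post

tests = [
    ("BD/BioMedomics", 0.887, 0.906),
    ("Cellex", 0.938, 0.956),
]

amplification = {}
for name, sens, spec in tests:
    share_real = ppv(sens, spec, PREVALENCE)
    # ASSUMPTION (worst case): all isolated false positives become infected.
    amplification[name] = 1 / share_real
    print(f"{name}: {share_real:.1%} of isolated people infected at entry, "
          f"up to {amplification[name]:.1f}x as many infected afterwards")
```

In practice not everyone in isolation would be infected, so these factors are upper bounds rather than forecasts.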

CTK Biotech OnSite Covid-19 IgG/IgM Rapid Test

This test has not yet received emergency use authorization, but has 96.9% sensitivity and 99.4% specificity. With this test:

  • 31,388 people test positive of whom 5,841 are actually negative
  • About 81% of tests are actually positive
  • 968,611 people test negative, of whom 817 are positive
  • An estimated coronavirus prevalence of 3.1%

This is much better – most people testing positive are actually positive, we aren’t releasing so many people into the wild to infect others, and our prevalence estimate is close to the true prevalence. But it still means a lot of people are being given incorrect information about their status, and are taking risks as a result.
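
To put the three tests side by side, the sketch below recomputes the headline figures from the sensitivity and specificity quoted in each section, using the unrounded prevalence (870,000 × 10 / 330 million). The function `summarize` and the short labels are mine. The output should land close to the bullet figures, with any last-digit differences down to rounding.

```python
PREVALENCE = 870_000 * 10 / 330_000_000
N_TESTED = 1_000_000

# (sensitivity, specificity) as quoted in the sections above
TESTS = {
    "BD/BioMedomics": (0.887, 0.906),
    "Cellex": (0.938, 0.956),
    "CTK Biotech": (0.969, 0.994),
}


def summarize(sensitivity, specificity, prevalence=PREVALENCE, n=N_TESTED):
    """Return (positives, false positives, PPV, false negatives)."""
    infected = n * prevalence
    uninfected = n - infected
    true_pos = infected * sensitivity
    false_pos = uninfected * (1 - specificity)
    false_neg = infected * (1 - sensitivity)
    positives = true_pos + false_pos
    return positives, false_pos, true_pos / positives, false_neg


print(f"{'test':<16}{'positive':>10}{'false +':>10}{'PPV':>8}{'false -':>9}{'est. prev':>11}")
for name, (sens, spec) in TESTS.items():
    positives, false_pos, share_real, false_neg = summarize(sens, spec)
    print(f"{name:<16}{positives:>10,.0f}{false_pos:>10,.0f}"
          f"{share_real:>8.1%}{false_neg:>9,.0f}{positives / N_TESTED:>11.1%}")
```

Most of the improvement between the first test and the third comes from specificity: going from 90.6% to 99.4% cuts false positives by more than a factor of ten.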

Conclusion

Even slightly inaccurate tests have terrible consequences in epidemiology. As testing expands, the ability to conduct it carefully and thoroughly (with multiple tests, sequenced tests, and clinical confirmation) drops, and the impact of even small imperfections in the testing regime grows rapidly. In the case of a highly contagious virus like COVID-19 this can be catastrophic. It will expose uninfected people to increased risk of infection through hospitalization or isolation alongside positives, and if used for immunity passports it significantly raises the risk of positive people returning to work in places where they can infect others. In comparison to widespread testing with low-quality tests, non-pharmaceutical interventions (e.g. lockdowns and social distancing) are far more effective, cheaper and less dangerous. It is very important that in our desire to reopen economies and restart our social lives we do not rush to use unreliable tests that will increase, rather than reduce, the risk to the community of social interactions. While testing early and often is a good, strong policy for this pandemic, this is only true when testing is conducted rigorously and using good quality tests, and not used recklessly to end social interventions that, while painful, are guaranteed to work.

 


fn1: It’s almost as if they know what they’re doing, and we should listen to them!