In early March, when COVID-19 was starting to spread in the UK, the government announced a strategy of “herd immunity” in which they would shield vulnerable people (such as older people and people with pre-existing conditions) from the disease, and slowly allow the rest of the country to be infected up to some proportion of the population. This policy was based on the idea that once the disease had infected a certain proportion of the population it would have naturally achieved herd immunity, and after that would die out. The basics of the strategy and its timeline are summarized here. This was an incredibly dangerous, stupid and reckless strategy, built on a fundamental failure to understand what herd immunity is and on some really bad misconceptions about the dynamics of this epidemic. Had they followed this policy almost the entire UK population would have been infected, and a great many people in the UK would have lost at least one of their grandparents. Here I want to explain why this policy is incredibly stupid, and make a desperate plea for people to stop talking about achieving herd immunity by allowing a certain proportion of the population to become infected. This idea is a terrible misunderstanding of the way infectious diseases work, and if it takes hold in the public discourse we are in big trouble next time an epidemic happens.
I will explain here what herd immunity is, and follow this with an explanation of what the UK’s “herd immunity” strategy is and why it is bad. I will call this “herd immunity” strategy “Johnson Immunity”, because it is fundamentally not herd immunity. I will then present a simple model which shows how incredibly stupid this policy is. After this I will explain what other misconceptions the government had that would have made their Johnson Immunity strategy even more dangerous. Finally I will present a technical note explaining some details about reproduction numbers (the “R” being bandied about by know-nothing journalists at the moment). There is necessarily some technical detail in here but I’ll try to keep it as simple as possible.
What is herd immunity?
Herd immunity is a fundamental concept in infectious disease epidemiology that has always been applied to vaccination programs. Herd immunity occurs when so many people in the population are immune to a disease that, if a case were to arise, it would be unlikely to infect anyone else and so would die out before it could become an epidemic. Herd immunity is linked to the concept of the Basic Reproduction Number, R0. R0 tells us the average number of cases that will be generated from a single case of a disease in a population where nobody is immune, so for example if R0 is 2 then every person who has the disease will infect 2 other people on average. Common basic reproduction numbers range from 1.3 (influenza) to about 18 (measles). The basic reproduction number of COVID-19 is probably 4.5, and definitely above 3.
There is a simple relationship between the basic reproduction number and the proportion of the population that needs to be vaccinated to ensure herd immunity. This proportion, p, is related to the basic reproduction number by the formula p=1-1/R0. For smallpox (R0~5) we need 80% of the population to be vaccinated to stop it spreading; for measles (R0~18) the formula gives about 94%, so it is safest to aim for 95%. The reason this works is that the fundamental driver of disease transmission is contact with susceptible people. If the disease has a basic reproduction number of 5, each case would normally infect 5 people; but if 4 of every 5 people the infected person meets are immune, then the person will likely infect only 1 person before they recover or die (or get isolated). For more infectious diseases we need to massively increase the number of people who are immune in order to ensure that the infection doesn’t spread.
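As a quick check of the relationship between R0 and the vaccination threshold, here is a minimal Python sketch. The function name is my own, and the R0 values are simply the ones quoted in this post.

```python
def herd_immunity_threshold(r0: float) -> float:
    """Proportion of the population that must be immune so that each
    case infects at most one other person on average (p = 1 - 1/R0)."""
    if r0 <= 1:
        # A disease with R0 <= 1 cannot sustain an epidemic anyway.
        return 0.0
    return 1.0 - 1.0 / r0


# R0 values as quoted in the text above.
for name, r0 in [("influenza", 1.3), ("smallpox", 5), ("measles", 18)]:
    print(f"{name}: R0 = {r0}, threshold = {herd_immunity_threshold(r0):.1%}")
```

This gives thresholds of roughly 23% for influenza, 80% for smallpox and 94% for measles.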
If we vaccinate the correct proportion of the population, then when the first case of a disease enters the population, its chances of meeting an infectable person will be so low that it won’t spread. Effectively, by vaccinating a proportion 1-1/R0 of the population we have reduced its effective reproduction number to 1, at which point each case will only produce 1 new case on average, and the virus will not spread fast enough to matter. This is the essence of herd immunity, but note that the theory applies when we vaccinate a population before a case enters the population.
What is Johnson Immunity?
There is a related concept to the basic reproduction number, the effective reproduction number Rt, which tells us how infectious the virus currently is. It tells us how many people each case is infecting at the current state of the epidemic. Obviously as the proportion of the population who have been infected and recovered (and become immune) increases, Rt must drop, since the chance that an infectious person will have contact with a susceptible person goes down. Eventually the proportion of the population infected will become so large that Rt will hit 1, meaning that now each case is only infecting one other case. The idea of Johnson Immunity was that we would allow the virus to spread among only the low-risk population until it naturally reached the proportion of the population required to achieve an Rt value of 1. Then, the virus would be stifled and the epidemic would begin to die. If the required proportion to achieve Rt=1 is low enough, and we can shield vulnerable people, then we can allow the virus to spread until it burns out. This idea is related to the classic charts we see of influenza season, where the number of new infections grows to a certain point and then begins to go down again, even in the absence of a vaccine.
This idea is reckless, stupid and dangerous for several reasons. The first and most serious reason is that the number of daily new infections will rise as we head towards Rt=1, and by the time we reach the point where, say, 60% of the population is infected, the number of daily cases will be huge. At this point Rt=1, so each case is only infecting 1 other case. But if we have 100,000 daily new cases at this point, then the following generation of infections will spawn 100,000 new infections, and so on. If, for example, the virus has an R0 of 2, and takes 5 days to infect the next generation, then the number of new cases doubles every 5 days (at least early on, while almost everyone is still susceptible). After a month each generation contains 64 cases, after two months about 4,100 cases, and so on. By the time we get to 30 million cases, we’ll likely be seeing 100,000 cases in one generation. So yes, now the virus is going to start to slow its spread, but the following generation will still generate 100,000 cases, and the generation after that 90,000, and so on. This is an incredible burden on the health system, and even if death rates are very low (say 0.01%) we are still going to be seeing a large number of deaths.
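The doubling arithmetic above can be checked with a few lines of Python. This is only a sketch of the early, unconstrained growth phase: it assumes a fixed 5-day generation time and ignores the slowdown as people become immune, and the function name is my own.

```python
def cases_in_generation(days: int, generation_days: int = 5, r: float = 2.0) -> float:
    """Size of the generation of cases reached after `days`, starting from
    a single case, when every case infects `r` others per generation."""
    generations = days // generation_days
    return r ** generations


print(cases_in_generation(30))  # 6 doublings  -> 64.0
print(cases_in_generation(60))  # 12 doublings -> 4096.0
```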
The second reason this idea is reckless and stupid is that it is basically allowing the disease to follow its natural course, and for any disease with an R0 above about 1.5, this means it will go on to infect most of the population even after it has achieved its Rt of 1. This happens because the number of daily cases at this point is so large that even if each case only infects 1 additional case, the disease will still spread at a horrific rate. There is an equation, called the final size equation, which links R0 to the proportion of the population that will be infected by the disease by the time it has run its course, and basically for any R0 above 2 the final size equation tells us it will infect the large majority of the population (about 80% at an R0 of 2, and about 98% at an R0 of 4) if left unchecked. In practice this means that yes, after a certain period of time the number of new cases will reach a peak and begin to go down, but by the time it finishes its downward path it will have infected nearly everyone.
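For readers who want to see the final size equation in action, here is a minimal sketch. I am assuming the standard form for a simple SIR epidemic, z = 1 - exp(-R0*z), where z is the proportion eventually infected; the function name and the solution method (fixed-point iteration) are my own choices. In this form the proportion infected approaches, but never quite reaches, 100% as R0 grows.

```python
import math


def final_size(r0: float, tol: float = 1e-12, max_iter: int = 10_000) -> float:
    """Solve z = 1 - exp(-r0 * z) for the non-zero root by fixed-point
    iteration. z is the proportion of the population ever infected."""
    if r0 <= 1:
        # Below the epidemic threshold the only root is z = 0.
        return 0.0
    z = 1.0  # starting from 1 makes the iteration settle on the non-zero root
    for _ in range(max_iter):
        z_next = 1.0 - math.exp(-r0 * z)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    return z


for r0 in (1.5, 2.0, 2.5, 4.0):
    print(f"R0 = {r0}: final proportion infected = {final_size(r0):.1%}")
```

Under this assumption the final proportions come out at roughly 58%, 80%, 89% and 98% respectively, all far above the corresponding 1-1/R0 thresholds.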
A simple model of Johnson Immunity
I built a very simple model in Excel to show how this works. I imagined a disease that lasts two days. People are infected from the previous generation on day 1, infect the next generation and then recover by the end of day 2. This means that if I introduce 1 case on day 1, it will infect R0 cases on day 2, R0*R0 cases on day 3, and so on. This is easy to model in Excel, which is why I did it. Most actual diseases have incubation periods and delayed infection, but modeling these requires more than 2 minutes’ work in a real stats program, and this is a blog post, so I didn’t bother with such nuance. Nonetheless, my simple disease shows the dynamics of infection.
I recalculated Rt each day for the disease, so that it was reduced by the proportion currently infected or immune. For example, once 100,000 people are infected and recovered, in a population of 1 million people, the value of Rt becomes 90% of the value of R0. This means that when the disease reaches its Johnson Immunity threshold the value of Rt will go below 1 and the number of cases will begin to decline. This enables us to see how the disease will look when it reaches the Johnson Immunity threshold, so we can see what horrors we are facing. I assumed no deaths and no births, so I ran the model in a closed population of 1 million people. I ran it for a disease with an R0 of 1.3, 1.7, and 2.5, to show some common possible scenarios.
Figure 1 shows the results. Here the x-axis is the number of days since the first case was introduced, and the y-axis is the number of daily new cases. The vertical lines show the day at which the proportion of the population infected, Pi, crosses the threshold 1-1/R0. I put this in on the assumption that the Johnson Immunity threshold will be close to the classical herd immunity threshold (it turns out it’s off by a day or two). The number above the line shows the final proportion of the population that will be infected for this particular value of R0.
As you can see, when R0 is 1.3 (approximately seasonal influenza), we cross the approximate Johnson Immunity threshold at 44 days after the first case, and at this point we have a daily number of cases of about 40,000 people. This disease will ultimately infect 49% of the population. Note how slowly it goes down – for about a week after we hit the Johnson Immunity threshold we are seeing 40,000 or so cases a day.
For a virus with an R0 of 1.7 the situation is drastically worse. We hit the Johnson Immunity threshold after 23 days, and at this point about 140,000 people a day are being infected. Three days later the peak is achieved, with nearly 200,000 people a day being infected, before the disease begins a rapid crash. It dies out within a week of hitting the Johnson Immunity threshold, but by the time it disappears it has infected 94.6% of the population. That means most of our grandparents!
For a disease with an R0 of 2.5 we hit the Johnson Immunity threshold at day 13, with about 140,000 cases a day, and the disease peaks two days later with 450,000 cases a day. It crashes after that, hitting 0 a day later because it has infected everyone in the population and has no one left to infect.
This shows that for any kind of R0 bigger than influenza, when you reach the Johnson Immunity threshold your disease is infecting a huge number of people every day and is completely out of control. We have shown this for a disease with an R0 of 2.5. The R0 of COVID-19 is probably bigger than 4. In a population of 60 million where we are aiming for a herd immunity threshold of 36 million we should expect to be seeing a million new cases a week at the point where we hit the Johnson Immunity threshold.
This is an incredibly stupid policy!
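For readers who would rather experiment in code than in Excel, the simple model described above can be sketched in Python as follows. This is my own reading of the description, not the original spreadsheet: one generation per day, Rt scaled down by the proportion already infected, and new cases capped at the number of people still susceptible. Because details such as the exact day on which Rt is updated are assumptions, the numbers it prints will not necessarily match Figure 1.

```python
def simulate(r0: float, population: int = 1_000_000, max_days: int = 1000):
    """Generation-per-day epidemic in a closed population.

    Returns (daily_new_cases, threshold_day, final_fraction), where
    threshold_day is the first day on which the cumulative proportion
    infected reaches 1 - 1/R0.
    """
    threshold = 1.0 - 1.0 / r0
    new_cases = 1.0      # a single case is introduced on day 1
    cumulative = 0.0     # everyone infected so far (infected or recovered)
    daily = []
    threshold_day = None

    for day in range(1, max_days + 1):
        # Cannot infect more people than are still susceptible.
        new_cases = min(new_cases, population - cumulative)
        if new_cases < 0.5:
            break  # the outbreak has died out
        cumulative += new_cases
        daily.append(new_cases)
        if threshold_day is None and cumulative / population >= threshold:
            threshold_day = day
        # Rt is R0 reduced by the proportion already infected or immune.
        rt = r0 * (1.0 - cumulative / population)
        new_cases = rt * new_cases

    return daily, threshold_day, cumulative / population


for r0 in (1.3, 1.7, 2.5):
    daily, day, final = simulate(r0)
    print(f"R0 = {r0}: threshold crossed on day {day}, "
          f"peak {max(daily):,.0f} cases/day, final proportion {final:.1%}")
```

Whatever the exact numbers, the pattern is the same as in the figure: daily cases are at or near their peak when the threshold is crossed, and the final proportion infected ends up well above 1-1/R0.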
Other misconceptions in the policy
The government stated that its Johnson Immunity threshold was about 60% of the population. From this we can infer that they thought the R0 of this disease was about 2.5. However, the actual R0 of this disease is probably bigger than 4. This means that the government was working from some very optimistic – and ultimately wrong – assumptions about the virus, which would have been catastrophic had they seen this policy through.
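The inference above just runs the herd immunity formula backwards. A minimal sketch, with function names of my own choosing and the 60 million population figure used earlier in this post:

```python
def implied_r0(threshold: float) -> float:
    """R0 implied by a herd immunity threshold p, from p = 1 - 1/R0."""
    return 1.0 / (1.0 - threshold)


def threshold_for(r0: float) -> float:
    """Herd immunity threshold p = 1 - 1/R0."""
    return 1.0 - 1.0 / r0


population = 60_000_000  # approximate UK population used in this post
print(f"R0 implied by a 60% threshold: {implied_r0(0.60):.2f}")
for r0 in (2.5, 4.0):
    p = threshold_for(r0)
    print(f"R0 = {r0}: threshold = {p:.0%}, or {p * population / 1e6:.0f} million people")
```

With an R0 of 4 the threshold rises from 60% to 75%, which is 45 million people rather than 36 million.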
Another terrible mistake the government made was to assume that rates of hospitalization for this disease would be the same as for standard pneumonia, a mistake that was apparently made by the Imperial College modeling team whose work they seem to primarily rely upon. This mistake was tragic, because there was lots of evidence coming out of China that this disease did not behave like classic pneumonia, but for some reason the British ignored Chinese data. They only changed their modeling when they were presented with Italian data on the proportion of serious cases. This is an incredibly bad mistake, and I can only see one reason for it – they either didn’t know, or didn’t care about, the situation in China. Given how bad this disease is, this is an incredible dereliction of duty. I think this may have happened because the Imperial College team have no Chinese members or connections to China, which is really a very good example of how important diversity is when you’re doing policy.
Conclusion
The government’s “herd immunity” strategy was based on a terrible misunderstanding of how infectious disease dynamics work, and was compounded by significantly underestimating the virulence and deadliness of the disease. Had they pursued the “herd immunity” strategy they would have reached a point where a million or more people were being infected every week, because the point in an epidemic’s growth where it reaches Rt=1 is usually the point where it is spreading most rapidly, and is also at its most dangerous. It was an incredibly reckless and stupid policy and it is amazing to me that anyone with any scientific background supported it, let alone the chief scientific adviser. Britain is facing its biggest crisis in generations, and is being led by people who are simply not competent to manage it in any way.
Sadly, this language of “herd immunity” has begun to spread through the pundit class and is now used routinely by people talking about the potential peak of the epidemic. It is not true herd immunity, and there is no sense in which getting to the peak of the epidemic to “immunize” the population is a good idea, because getting to the peak of the epidemic means getting to a situation where hundreds of thousands or millions of people are being infected every week.
The only solution we have for this virus is to lock down communities, test widely, and isolate anyone who tests positive. This is being done successfully in China, Vietnam, Japan, Australia and New Zealand. Any strategy based on controlled spread will be a disaster, and anyone recommending it should be removed from any decision-making position immediately.
Appendix: Brief technical note
R0 (and Rt) are very important numerical properties of an infectious disease but they are not easily calculated. They are numbers that emerge from the differential equations we use to describe the disease, and not something we know in advance. There are two ways to calculate them: empirically from data on the course of disease in individuals, or through dynamic analysis of disease models.
To estimate R0 empirically we obtain data on individuals infected with the disease, so we know when they were infected and when they recovered down to the narrowest possible time point. We then use some statistical techniques related to survival analysis to assess the rate of transmission and obtain statistical estimates for R0.
To estimate R0 from the equations describing the disease, we first establish a set of ordinary differential equations that describe the rates of change of uninfected, infected, and recovered populations. From this system of equations we can obtain a matrix called the Next Generation Matrix, which describes all the flows in and out of the disease states, and from this we can obtain the value of R0 through a method called spectral analysis (basically it is the dominant eigenvalue of this matrix). In this case we will have an equation which describes R0 in terms of the primary parameters in the differential equations, and in particular in terms of the number of daily contacts, the specific infectiousness of the disease when a contact occurs, and the recovery time. We can use this equation to fiddle with some parameters to see how R0 will change. For example, if we reduce the recovery time through treatment, will R0 drop? If we reduce the infectiousness by mask wearing, how will R0 drop? Or if we reduce the number of contacts by lockdowns, how will R0 drop? This gives us tools to assess the impact of various policies.
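To make the parameter fiddling described above concrete, here is a minimal sketch. It assumes the simplest possible case, a basic SIR model, where R0 reduces to the product of daily contacts, the probability of transmission per contact, and the number of days a person is infectious. All parameter values below are hypothetical numbers chosen for illustration, not estimates for COVID-19.

```python
def basic_reproduction_number(contacts_per_day: float,
                              transmission_prob: float,
                              infectious_days: float) -> float:
    """R0 for a simple SIR model: contacts x infectiousness x duration."""
    return contacts_per_day * transmission_prob * infectious_days


# Hypothetical baseline parameters (illustrative only).
baseline = dict(contacts_per_day=10.0, transmission_prob=0.05, infectious_days=9.0)

# Each scenario changes exactly one parameter relative to the baseline.
scenarios = {
    "baseline": baseline,
    "treatment (shorter infectious period)": {**baseline, "infectious_days": 6.0},
    "masks (lower infectiousness)": {**baseline, "transmission_prob": 0.03},
    "lockdown (fewer contacts)": {**baseline, "contacts_per_day": 2.0},
}

for name, params in scenarios.items():
    print(f"{name}: R0 = {basic_reproduction_number(**params):.2f}")
```

With these made-up numbers only the lockdown scenario pushes R0 below 1. In more realistic models with several disease states the formula is more complicated, but the same kind of what-if comparison applies.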
In the early period of a new infectious disease people try to do rough and ready calculations of R0 based on the data series of infection numbers in the first few weeks of the disease. During this period the disease is still very vulnerable to random fluctuation, and is best described as a stochastic process. It is my opinion that in this early stage all diseases look like they have an R0 of 1.5 or 2, even if they are ultimately going to explode into something far bigger. In this outbreak, I think a lot of early estimates fell into this problem, and multiple papers were published showing that R0 was 2 or so, because the disease was still in its stochastic stage. But once it breaks out and begins infecting people with its full force, it becomes deterministic and only then can we truly understand its infectious potential. I think this means that early estimates of R0 are unreliable, and the UK government was relying on these early estimates. I think Asian governments were more sensible, possibly because they were in closer contact with China or possibly because they had experience with SARS, and were much more wary about under-estimating R0. I think this epidemic shows that it is wise to err on the side of over-estimation, because once the outbreak hits its stride any policies built on low R0 estimates will be either ineffective or, as we saw here, catastrophic.
But whatever the estimate of R0, any assumption that herd immunity can be achieved by allowing controlled infection of the population is an incredibly stupid, reckless, dangerous policy, and anyone advocating it should not be allowed near government!