Did COVID-19 projections have a common sampling error?

First, our hearts go out to the victims of COVID-19. We also want to thank all the selfless individuals in the healthcare industry fighting this disease with so much courage, as well as those keeping our roads safe, keeping grocery stores open, delivering mail, and so many others in these challenging times.

We write these words from a mathematical perspective. Tailor Research is a market research company after all, not a political one. We have no political agenda whatsoever, so please keep that in mind.

We have heard many people and media outlets criticize the initial COVID-19 projections. We decided to share our view from a statistical sampling and surveying perspective. Some of the smartest minds in the world (Gates Foundation, University of Washington, Johns Hopkins University, etc.) worked on those models. This post is not meant to question their work, but to add some observations we have as a market research and survey company.

With so many people and organizations behind the COVID-19 projections, we do not imagine that each made a common sampling error known as Sampling Bias or Exclusion error, but there are several interesting clues that make us wonder.

In our opinion, projections should be formed using sound statistical techniques. Statisticians and modelers would not seriously dispute this point. When statistical studies are done correctly, they work. For example, a projection with a confidence level of 95% and a margin of error of plus or minus 3% should yield results very close to reality. When that is not the case (as with the initial COVID-19 projections), it usually means there was an error in the way the projections were designed.

We believe it is possible that the COVID-19 models excluded asymptomatic people. If true, that would have thrown off those projections greatly.

This type of statistical error is known as Sampling Bias. It occurs when a sample is collected from an intended population in a way that is biased toward one subset of the population. In other words, the modeler uses a sample that is not randomly selected and is therefore not representative of the whole population.
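To make the idea concrete, here is a minimal Python sketch of sampling bias. Every number in it (the 40% asymptomatic share, the 2% death rate among symptomatic carriers, the population and sample sizes) is a hypothetical value we picked for illustration, not an estimate from any real model. It compares a death rate estimated from a random sample of carriers with one estimated only from carriers who showed symptoms.

```python
import random

random.seed(42)  # fixed seed so the illustration is repeatable

# All rates below are hypothetical, chosen only to show the mechanism.
N_CARRIERS = 200_000           # simulated people carrying the virus
ASYMPTOMATIC_SHARE = 0.40      # assumed share of carriers with no symptoms
DEATH_RATE_SYMPTOMATIC = 0.02  # assumed death rate among symptomatic carriers

# Each carrier is a (symptomatic, died) pair; in this toy setup only
# symptomatic carriers can die.
carriers = []
for _ in range(N_CARRIERS):
    symptomatic = random.random() >= ASYMPTOMATIC_SHARE
    died = symptomatic and random.random() < DEATH_RATE_SYMPTOMATIC
    carriers.append((symptomatic, died))


def death_rate(sample):
    """Share of the sampled carriers who died."""
    return sum(died for _, died in sample) / len(sample)


SAMPLE_SIZE = 20_000

# Random sample: every carrier has the same chance of selection.
random_sample = random.sample(carriers, SAMPLE_SIZE)

# Biased sample: only carriers who showed symptoms could be tested.
symptomatic_only = [c for c in carriers if c[0]]
biased_sample = random.sample(symptomatic_only, SAMPLE_SIZE)

random_rate = death_rate(random_sample)
biased_rate = death_rate(biased_sample)

print(f"Random sample death rate:           {random_rate:.2%}")
print(f"Symptomatic-only sample death rate: {biased_rate:.2%}")
```

Under these assumptions, the symptomatic-only sample should report a death rate near 2%, while the random sample should land near the true 1.2% (60% of 2%). The biased sample overstates the rate simply because of who was left out.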

Sampling (instead of spending enormous time and money to reach the whole population) is often difficult. One reason is the set of choices that modelers must make when selecting the sample. Each choice has its advantages and disadvantages.

For example, Simple Random Sampling is easy to conduct and representative, but identifying and contacting all members of the sample can be difficult. Cluster sampling is usually cheaper and more practical to carry out, but representation can become an issue.

Another method of sampling, called Stratified Random Sampling, is ideal when certain areas of the population need to be properly represented. In other words, when it is important to ensure that each state, city, rural or urban area is represented, then Stratified Random Sampling has its advantages.

Instead of giving everyone in the United States a test (time-consuming and costly), a professional survey statistician meticulously determines what percentage of people to sample from each subgroup, or “stratum”. [Undoubtedly, the statistician would give each respondent in the sample a COVID-19 test, to determine whether they were asymptomatic and not carrying the virus, asymptomatic and carrying the virus, carrying the virus and showing signs, in the hospital, in the ICU or dead.] The statistician picks a large enough sample in each stratum to ensure the overall projections are statistically significant. If the statistician designs the sample well, the projections are accurate and fall within a confidence interval.
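As a rough illustration of the stratified approach described above, the Python sketch below splits a sample across strata in proportion to their size, draws a simple random sample inside each stratum, and combines per-stratum results into one overall estimate. The strata names, population counts, and positive-test rates are hypothetical, and proportional allocation is only one of several allocation rules a statistician might choose.

```python
import random

random.seed(7)  # fixed seed so the illustration is repeatable

# Hypothetical strata and population counts, for illustration only.
strata_populations = {
    "urban": 60_000,
    "suburban": 30_000,
    "rural": 10_000,
}


def proportional_allocation(populations, total_sample):
    """Split total_sample across strata in proportion to their size.

    Rounded allocations may not always add up exactly to total_sample.
    """
    total_population = sum(populations.values())
    return {
        name: round(total_sample * size / total_population)
        for name, size in populations.items()
    }


def draw_stratified_sample(populations, total_sample):
    """Draw a simple random sample of person IDs inside each stratum."""
    allocation = proportional_allocation(populations, total_sample)
    return {
        name: random.sample(range(populations[name]), allocation[name])
        for name in populations
    }


def stratified_estimate(stratum_rates, populations):
    """Combine per-stratum rates, weighting each by its population share."""
    total_population = sum(populations.values())
    return sum(
        stratum_rates[name] * populations[name] / total_population
        for name in populations
    )


allocation = proportional_allocation(strata_populations, 1_000)
sample = draw_stratified_sample(strata_populations, 1_000)
print(allocation)  # {'urban': 600, 'suburban': 300, 'rural': 100}

# Hypothetical positive-test rates observed in each stratum's sample.
observed_rates = {"urban": 0.05, "suburban": 0.03, "rural": 0.01}
print(f"Overall estimate: {stratified_estimate(observed_rates, strata_populations):.2%}")
```

Weighting each stratum by its share of the population is what keeps a small rural sample from being drowned out, or a large urban one from dominating, when the overall figure is computed.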

Back to Sampling Bias and whether it contributed to the COVID-19 models being off. One possible reason the models could have been off (seemingly beyond the confidence intervals) is how effective we as a nation were at social distancing. Another could be that the sample was chosen incorrectly (i.e. the model suffered from Sampling Bias). We suspect both are true.

Social distancing has without a doubt helped fight the virus. There is no disputing that fact whatsoever. It is absolutely possible that projections took social distancing into account, but not to the degree that Americans actually accomplished. In other words, if Americans were more effective at social distancing than modeled, the rate of spread in those models would have been too high, affecting all modeled parameters.

However, how do we know that social distancing was the only reason the models were off? One fact about sampling and modeling is you cannot sample just a portion of the intended population and expect accurate results. The entirety of the population must be considered.

The question is whether COVID-19 modelers sampled only those who showed signs of the virus, thereby introducing sampling bias. Generally, we know that in the United States, tests were only available for those who had symptoms. So, did the modelers include a representative subset of the population without symptoms? We believe that is unlikely based on what has been reported and said.

Dr. Fauci and Dr. Birx have stated:

1. Modelers used Italy, Spain, China, South Korea, and New York as the initial primary inputs into their models, because those were what they had to work with at that point in time (we would argue this point is not true if stratified random sampling had been used later in the modeling process).

2. The model adjusted based on the new number of tested individuals in the United States, in a way that New York, New Jersey, Italy, Spain, and others were more influential initially than later.

3. The model had a range of results, e.g. 100,000 to 200,000 projected deaths, that changed as the model was updated.

4. Finally, Dr. Fauci stated that the model could consistently adjust downward/upward beyond the ranges because “any model is only as good as the assumptions given”. Remember, he was not the architect of these models.

A sample-based model, driven by the entire U.S. population, does not need to include exogenous data (e.g. Italy, Spain, China, South Korea) except to model how the virus spreads (i.e. how contagious the virus is).

Modeling who currently has the virus can only be done with the U.S. population in mind.

The severity and effect of the virus can only be modeled with U.S. demographics in mind. Similar profiles from other countries could be carefully modeled, but to do that, one must first clearly understand the U.S. idiosyncrasies and nuances, or sampling error is introduced.

An example of how models can go wrong if not carefully planned would be to use Italy and Spain to model South Korea and the United States. The age profile of the population in Italy and Spain is much different from that in South Korea or here; therefore, death rates will naturally be higher in Italy and Spain.
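A small worked example shows why age structure matters. In the Python sketch below, both countries face exactly the same death rate within each age group, yet the overall death rate differs because the age mix differs. The age groups, rates, and population shares are hypothetical numbers chosen for illustration, not figures for any real country.

```python
# Hypothetical death rates among infected people, identical in both countries.
death_rate_by_age = {"under_50": 0.001, "50_to_69": 0.01, "70_plus": 0.08}

# Hypothetical age mixes (shares of the population, each summing to 1).
older_country = {"under_50": 0.50, "50_to_69": 0.27, "70_plus": 0.23}
younger_country = {"under_50": 0.65, "50_to_69": 0.25, "70_plus": 0.10}


def overall_death_rate(age_shares, rates):
    """Weight each age group's death rate by that group's population share."""
    return sum(age_shares[group] * rates[group] for group in rates)


older_rate = overall_death_rate(older_country, death_rate_by_age)
younger_rate = overall_death_rate(younger_country, death_rate_by_age)

print(f"Older country:   {older_rate:.2%}")
print(f"Younger country: {younger_rate:.2%}")
```

With these made-up inputs, the older country's overall rate comes out at roughly twice the younger country's, even though the virus is equally deadly within each age group. Carrying one country's overall rate over to another without adjusting for age would build that gap into the projection.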

This much is certain: if the models (Gates Foundation’s model, University of Washington’s model, etc.) did not use a sample from the entire population, and instead used only those who were showing signs of the COVID-19 virus, then those models were doomed to be inaccurate from the beginning. Those in our population who were asymptomatic would not have been represented in the model, making the projections of deaths and illness far too high. Indeed, that was the case.

The modelers likely had different scenarios or cases. Worst case: no one social distances, and this many people will die. Best case: every area of the country social distances perfectly, and many fewer will die. So yes, the model adjusts based on scenarios, but those scenarios were modeled already and the models were still off in the best-case scenario. In other words, even the model that assumed everyone social distanced yielded a much higher number than reality.

We understand that until recently, tests for asymptomatic people were not available. However, the sample must accurately represent the general population of the United States. A model cannot exclude those who show no signs of illness, or it is doomed to be wrong.

Incidentally, another advantage of stratified random sampling is that each government or locality (state, city, county, etc.) can conduct its own study, so that each of these strata informs its own community. That holds until someone from another community enters (an exogenous influence), hence the potential need for communication and monitoring across borders.

How many randomly selected people would be needed to have less than a 1% margin of error, i.e. a high likelihood of projecting the reality of the United States' circumstances? The answer is a mere 10,000 or so randomly selected people at a 95% confidence level (a sample of 1,000 would give a margin of error of roughly 3%). To put that figure into perspective, well over 2 million tests have been given already in the United States. If we used randomized sampling techniques, such as stratified random sampling, we as a nation and locally would come very close to approximating what percentage of the U.S. has the disease, what percentage are asymptomatic, and a lot more.
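For readers who want to check the arithmetic, here is a short Python sketch using the standard textbook formula for the margin of error of a proportion from a simple random sample. It assumes a 95% confidence level (z = 1.96) and the most conservative case of a 50% true proportion. It ignores stratification effects and finite-population corrections, so treat it as a back-of-the-envelope check rather than a full survey design.

```python
import math

Z_95 = 1.96  # z-score for a 95% confidence level


def margin_of_error(n, p=0.5, z=Z_95):
    """Margin of error for a proportion from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)


def required_sample_size(margin, p=0.5, z=Z_95):
    """Smallest simple random sample that achieves the given margin of error."""
    return math.ceil(z**2 * p * (1 - p) / margin**2)


for n in (1_000, 10_000):
    print(f"n = {n:>6,}: margin of error = +/-{margin_of_error(n):.2%}")

print("Sample needed for a 1% margin:", required_sample_size(0.01))
```

Under these assumptions, 10,000 people gives a margin of error just under 1%, while 1,000 people gives a margin of about 3%. Note that the required sample size depends on the desired precision, not on the size of the U.S. population.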

Again, our hearts and prayers are with those affected by this pandemic. Please stay safe and healthy during these hard times. We wish each of you the very best!