Accuracy and confidence: why we trust the data from the COVID-19 Infection Survey.


Since April 2020, the Coronavirus (COVID-19) Infection Survey (CIS) has offered weekly snapshots of the level of SARS-CoV-2 circulating in the community. Here Sarah Crofts explains how we can be confident in our results and what conclusions we can draw from the data.

Our national survey has tracked infection levels since it began in April 2020. Since then it has picked up changes in infection levels in England from 1 in 2,000 people being infected in summer 2020 to 1 in 50 at the start of 2021. The survey was expanded during 2020 to include Wales, Scotland and Northern Ireland, and government decision makers use it as a key source of data in the pandemic response.

Test accuracy and Ct values

Our survey uses a standard test which looks for the presence of genetic material from the virus in a nose and throat swab. The test detects three SARS-CoV-2 genes and also estimates how much virus is present, the viral load, through "cycle threshold" (Ct) values. The test runs through a number of cycles, each copying and amplifying the viral genetic material, until a positive result is detectable. If a high quantity of virus is present, a positive result will be identified after a low number of cycles; if only a small amount is present, more cycles are needed to detect it. Ct values are therefore used as a proxy for the quantity of virus, also known as the viral load: the higher the viral load, the lower the Ct value.
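As a rough illustration of the relationship described above: each PCR cycle approximately doubles the amount of genetic material, so a sample's viral load relative to a reference can be sketched as 2 raised to the difference between a reference Ct and the observed Ct. This is a simplified sketch under an idealised perfect-doubling assumption, not the survey's actual analysis; the function name and the reference Ct of 30 are illustrative choices.

```python
# Illustrative sketch only: under the idealised assumption that each PCR cycle
# exactly doubles the viral genetic material, a lower Ct value corresponds to
# a higher starting viral load.

def relative_viral_load(ct: float, reference_ct: float = 30.0) -> float:
    """Viral load relative to a sample that crossed the threshold at reference_ct."""
    return 2.0 ** (reference_ct - ct)

# A sample detected at Ct 20 has about 1,024 times the viral load of one
# detected at Ct 30; a sample detected at Ct 33 has about an eighth of it.
print(relative_viral_load(20))  # 1024.0
print(relative_viral_load(30))  # 1.0
print(relative_viral_load(33))  # 0.125
```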

While all survey participants are tested regardless of whether they have symptoms, we also collect information on any symptoms they may be experiencing. Comparing this information with their Ct values enables us to classify the strength of results into three tiers. A "higher" evidence result is when a positive result was determined by the detection of two or three genes, regardless of Ct value; a "moderate" evidence result is when a single gene was detected, supported by either a lower Ct value or reported symptoms; and a "lower" evidence result is when only a single gene is detected with a high Ct value.
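The three-tier classification above can be sketched as a simple decision rule. This is a hypothetical illustration: the function name, the Ct cut-off of 30 used to separate "lower" from "higher" Ct values, and the parameter names are assumptions for the example, not the survey's exact criteria.

```python
# Hypothetical sketch of the three-tier evidence classification described above.
# The Ct threshold of 30 is an assumed value for illustration only.

def evidence_tier(genes_detected: int, ct_value: float, has_symptoms: bool,
                  low_ct_threshold: float = 30.0) -> str:
    """Classify a positive result as 'higher', 'moderate' or 'lower' evidence."""
    if genes_detected >= 2:
        return "higher"    # two or three genes detected, regardless of Ct value
    if ct_value < low_ct_threshold or has_symptoms:
        return "moderate"  # single gene, supported by a lower Ct value or symptoms
    return "lower"         # single gene, high Ct value, no symptoms

print(evidence_tier(3, 35.0, False))  # higher
print(evidence_tier(1, 25.0, False))  # moderate
print(evidence_tier(1, 34.0, False))  # lower
```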

By classifying positive results in this way, we have analysed our data to see whether any trends could serve as earlier indicators of changes in infection rates.

During autumn 2020, a new pattern emerged from test results in London and the South East of England. These positive results detected the presence of two genes and had low Ct values, indicating high viral load. This pattern suggested that a new variant was becoming increasingly prevalent. Since mid-December, this new UK variant (also known as "B.1.1.7") has been the most common variant among positive cases in the UK.

From the start of 2021, we began to see lower viral loads, which made it increasingly difficult to identify which genes could be detected. This is likely to be due to people recovering from infection following the rise in cases over the winter.

False results

No test can be guaranteed to be 100% accurate. Accuracy depends on the viral load (a very low viral load may not be detected even though the virus is present) and on the swab itself (a person who has not swabbed correctly may provide a sample with no virus on it to detect). Both of these mean that a test may sometimes show a negative result when a person has the infection (a "false negative" result). Occasionally a test may show a positive result when a person does not have the infection (a "false positive" result).

The proportion of people with the infection whom the test correctly identifies as positive is known as its "sensitivity". The estimated sensitivity of the test used in our survey is around 85% to 95%, meaning that around 5% to 15% of tests on infected people could return a false negative result. A recent analysis by the Department of Health and Social Care suggests sensitivity is likely to be closer to 95%.

The proportion of people without the infection whom the test correctly identifies as negative is known as its "specificity". The specificity of our survey test is estimated to be at least 99.995%. This is the same as saying that for every 20,000 tests of people who do not have the virus, we would expect at most 1 false positive result. We know that false positives are likely to be very rare because hundreds of thousands of tests carried out over the summer months showed very low numbers of positives. This coincided with low infection levels reported by other data sources, and recorded hospital admissions were also low during this time. We know that there were real infections in the community because there was a "second wave" of infections in the autumn. Collectively, these provide reassurance that the low infections seen during the summer were an accurate measure at that time.
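The sensitivity and specificity figures quoted above translate into expected numbers of false results by simple arithmetic. This sketch is purely illustrative; the function names and example counts are assumptions for the example.

```python
# Illustrative arithmetic only: expected false results implied by the
# sensitivity and specificity figures quoted in the text.

def expected_false_negatives(infected_tested: int, sensitivity: float) -> float:
    # A test with sensitivity s misses a fraction (1 - s) of true infections.
    return infected_tested * (1.0 - sensitivity)

def expected_false_positives(uninfected_tested: int, specificity: float) -> float:
    # A test with specificity s wrongly flags a fraction (1 - s) of uninfected people.
    return uninfected_tested * (1.0 - specificity)

# At 85% sensitivity, testing 1,000 infected people misses roughly 150 of them.
print(expected_false_negatives(1_000, 0.85))
# At 99.995% specificity, testing 20,000 uninfected people yields roughly 1 false positive.
print(expected_false_positives(20_000, 0.99995))
```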

Ct values also provide useful information on possible levels of false positives. If positive tests follow consistent Ct patterns, it is likely that there has not been a sudden change in the numbers of false positive and false negative results. We monitor Ct values closely and investigate any changes to identify their cause. For example, in January we started to see a higher number of tests with a low viral load, which suggests that people were recovering from the high levels of infection over the winter.

In addition to testing for current COVID-19 infection, we also test a sample of our participants for the presence of antibodies to SARS-CoV-2, which are produced by a person's immune system in response to COVID-19 infection or a vaccine. Positive results are likely to be reliable, since the presence of antibodies is identified from a blood sample and sensitivity is estimated to be at least 98% to 99%. However, a result may sometimes be negative even if a person has recovered from a past infection or received a vaccine: within the first few weeks after vaccination or infection, a person's immune system may not yet have produced enough antibodies to be detected. This is why we take blood samples monthly from the same people, so that we can see how antibody levels vary over time.

If a person experienced only a mild infection, or was infected several months previously, their antibody levels may never have risen, or may have risen and then declined, so the test will not detect them. Some people never develop antibodies after vaccination. This does not necessarily mean they have no protection against the virus: even when antibody levels are too low to be identified by the test, other parts of the immune system also give protection. Our survey will continue to provide helpful information on what it means to develop antibodies after vaccination or infection.

Accuracy of estimates

As with all surveys, we collect information from a sample of people. While the sample is randomly selected to be representative, we need to ensure that our results reflect the population we are measuring. We carry out established statistical tests to identify whether a difference in infection levels between one time period and another (or between one group of people and another) could be due to chance in which participants were selected. These tests provide a measure of the confidence we can have that a difference is real; where that confidence is high, we describe the difference as "statistically significant". Alongside all our published estimates, we also provide an interval in which we would expect the true value to lie 95% of the time. These intervals can be used as a measure of the certainty we have around our figures: the smaller the interval, the greater the certainty.
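As a minimal sketch of the kind of interval described above, a 95% confidence interval for an estimated proportion can be computed with the normal approximation. The survey's published intervals come from more sophisticated statistical modelling, so the function below, its name, and the example figures are illustrative assumptions only.

```python
import math

# Minimal sketch: a 95% confidence interval for an estimated proportion using
# the normal approximation. Not the survey's actual method.

def proportion_ci_95(positives: int, sample_size: int) -> tuple[float, float]:
    """95% confidence interval for a proportion (normal approximation)."""
    p = positives / sample_size
    se = math.sqrt(p * (1.0 - p) / sample_size)  # standard error of the estimate
    margin = 1.96 * se  # 1.96 is the 97.5th percentile of the standard normal
    return (p - margin, p + margin)

# 200 positives in a sample of 10,000 gives a point estimate of 2%,
# with an interval of roughly 1.7% to 2.3%.
low, high = proportion_ci_95(200, 10_000)
print(f"{low:.4f} to {high:.4f}")
```

A larger sample shrinks the interval: the same 2% rate observed in 100,000 swabs gives roughly 1.9% to 2.1%, illustrating why a bigger sample yields greater certainty.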

Further work

We are continuing to provide weekly data on infection levels in the community population and to monitor emerging trends among parts of the UK and among particular groups of people. We are also monitoring infection levels in schools, antibody levels in the community population and the impact of vaccination programmes. All these sources help us continue to provide valuable, high quality data to inform the UK’s response to the pandemic.