Census 2021 – the count is done, the data is in, so what happens next?
The Census 2021 collection operation was a fantastic success with more than 97% of households in England and Wales completing their questionnaire to make sure they are represented when it comes to provision for local services. But just as big a challenge is turning the information you have provided into statistics we can all use. Ed Dunn explains what is happening now and when the results will be available.
As some of our recently published findings highlighted, the Census 2021 return rate from occupied households was over 97% for England and Wales, with most responses received online. Six months on from Census Day, the Office for National Statistics is working hard to get those numbers out.
First and foremost, our objective for the census is to count everyone once, and to count them in the right place. To achieve this, after carrying out the census collection operation, we compile, clean, complete, cross-check, ensure confidentiality and continue our consultation with users.
More than two and a half million paper questionnaires were scanned by the end of July, a major step in how we compile data. Because of the huge online response, this is a fraction of what was scanned in 2011, but it still represents a significant undertaking. This data is then carefully combined with the online responses, while ensuring that its security and confidentiality are maintained throughout. All personal data is being handled on systems securely in our control and managed to UK government security specifications. This data will only ever be used for the purpose of producing anonymised statistics.
Recycling and coding
Once data is all electronically combined, paper forms are securely destroyed. The resulting waste material is then baled and sent to UK paper mills. All of it will be recycled into soft tissue and hygiene products such as hand towels and toilet paper.
With such a large and complex data collection exercise, we take great care to clean the data by removing errors, and online collection has helped significantly. For example, we remove duplicate responses – yes, you’ve read that correctly, people do complete the census more than once! In fact, in 2011 almost 300,000 people did so.
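For readers who like to see the idea in code, removing duplicate responses can be sketched roughly as follows. This is only an illustration, not the method used for the census: the field names (`name`, `date_of_birth`, `postcode`) and the rule of keeping the first response seen are assumptions made for the example, and real record matching has to cope with misspellings and partial information that this sketch ignores.

```python
def remove_duplicates(responses):
    """Keep the first response for each person, dropping later repeats.

    `responses` is a list of dicts. The keys used here are hypothetical
    and chosen only for illustration.
    """
    seen = set()
    unique = []
    for response in responses:
        # Normalise the fields so trivial differences in case or spacing
        # do not hide a duplicate.
        key = (
            response["name"].strip().lower(),
            response["date_of_birth"],
            response["postcode"].replace(" ", "").upper(),
        )
        if key in seen:
            continue  # same person already counted once
        seen.add(key)
        unique.append(response)
    return unique
```

A person who filled in both a paper and an online form would appear twice in the input and once in the output, provided the three fields agree after normalisation.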
We also code and classify all responses so that statistics for topics like ethnicity, religion, and occupation are produced on a consistent set of categories. Even though much of this can be automated there are many, many thousands of unique responses to work through.
The census not only captures the address where a person lives but also many other addresses such as workplaces, term time addresses and previous addresses if someone has moved within the last 12 months. All address information is carefully processed to ensure geographic accuracy.
To complete the census database, we use very carefully designed and assured statistical methods to fill gaps in the data where mandatory questions have been missed or where invalid or inconsistent responses have been given. A fundamental aim of the England & Wales Census is to include everyone, so we also estimate those who did not respond to produce an estimate of the total population.
We estimate those people using the Census Coverage Survey (CCS), which independently samples around 350,000 households. We use the CCS to look for individuals and households captured in both the Census and the CCS, or in only one or the other. This allows us to estimate how well the census covered the whole population. For the mathematically minded, there is a great video on capture/recapture by Johnny Ball and the BBC (YouTube), or you can read our technical paper on dual system estimation.
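To give a flavour of the capture/recapture idea, here is a minimal sketch of the textbook two-list estimator (often called the Lincoln-Petersen estimator). It assumes the two lists are independent and that people can be matched between them perfectly. The function name and the use of simple identifiers are assumptions for the example, and the approach set out in the technical paper is likely to be considerably more refined than this.

```python
def dual_system_estimate(census_ids, survey_ids):
    """Estimate the total population of an area from two overlapping lists.

    census_ids: set of identifiers for people counted by the census
    survey_ids: set of identifiers for people counted by the coverage survey
    Both sets must refer to the same area.
    """
    matched = census_ids & survey_ids  # people captured in both lists
    if not matched:
        raise ValueError("no matched records, so the estimate is undefined")
    # The share of survey people also found in the census estimates the
    # census coverage rate; dividing the census count by that rate gives
    # the estimated total: census * survey / matched.
    return len(census_ids) * len(survey_ids) / len(matched)
```

For example, if the census counted 90 people in an area, the survey counted 20 and 10 people appear in both, the survey suggests the census reached half of the people there, giving an estimated total of 90 × 20 / 10 = 180.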
Quality assurance
We also cross-validate our estimates. We are taking this process of quality assurance and validation very seriously. Comparing estimates to a range of alternative administrative and survey sources whilst consulting with topic experts worked well in the 2011 Census and we are repeating this approach.
However, our cross-validation and quality assurance are going even further in 2021.
For the first time, we’ve asked local authorities to help us directly. We will be providing them with limited access to anonymised, provisional census estimates for their own local authority, strictly for the purposes of quality assurance prior to publication.
We’ve already asked local authorities to identify and supply alternative data sources such as Council Tax data, which will help us. Taking this extra step will further help identify any inconsistencies that may need investigation before we publish the first results. As a result of this extended quality assurance phase, the first results will be available in late Spring 2022 [please see this statement published on March 1 for our latest position].
We are really excited about extending the collaboration with local government which was such an important part of making the collection operation a success.
Consultation
Since July we have been consulting with users over our plans for outputs and analysis. The consultation recently closed, and we would like to thank everyone who took the time to respond. We received more than 300 detailed responses from a wide range of users of census data, including local authorities, charities, community groups and commerce.
There will be lots of important user insight to consider and we will publish a response to what our users have told us in early 2022. We will also update our outputs webpages to make clear how we will protect confidentiality in the final statistics and how we will make information about the results available. This will include more information about how we plan to supplement census data with other sources of information in certain areas.
So, as you can see, there is still plenty to do ahead of publication of the first results in late Spring 2022.
If you’d like more information about the areas covered in this blog, we published the 2021 Census Statistical Design last year, and the slides for a series of supporting webinars on Designing for Quality are also available.