“28 January is Data Privacy Day, and one of its aims is to raise awareness of best practice in data protection. Pete Stokes discusses how ONS makes data available for research, while protecting confidentiality at all times.”
The data revolution promises to deliver a step-change in the way data are used, and has the potential to improve and inform decision making affecting all aspects of our lives. Better use will be made of data, analysing and combining a diverse range of sources, by a growing number of data scientists, economists, statisticians and researchers, working across government, the commercial sector, academia and civil society.
Arguably the biggest challenge for this increased use of data is the need to reassure the public, who answer censuses and surveys and who provide detailed information about themselves, their families, their businesses and their lives to government, businesses and others, that their private information won’t ever be made public. We need to demonstrate that we can use that information in a way that benefits society, and serves the public good, without ever allowing them to be identified.
It wasn’t so long ago that we were able to do that by conducting a survey, using the information collected to produce some statistics and then either locking away or destroying those initial responses, so that nobody would ever see or use them. We can’t do that anymore; we need to be able to access and re-use those data in different ways, for different statistical and analytical purposes, to produce better statistics for better decisions.
My team’s job is to find a way to maximise the use of the detailed data that ONS holds, while keeping them secure at all times; to let government, academics, businesses and others use these data, while being able to assure you that you will never be identified, your private details will never become public and that the information you have given us will only ever be used in ways that clearly serve the public good. We do that in a range of ways, which can be summarised into what is commonly called the “Five Safes”: Safe people; Safe projects; Safe settings; Safe outputs; Safe data.
Most people wanting to access detailed data held by ONS for research purposes apply through the Approved Researcher scheme for permission to do so. To be considered Safe people, researchers have to: demonstrate that they have the technical skills to use the data, either through academic qualifications or practical research experience; complete our training course and pass the assessment at the end; agree to their details being published on our website; and sign an agreement promising to protect the confidentiality of your data at all times.
However, completing that process is just the first step, and it doesn’t actually give researchers access to any data. The next step is for them to request access to the specific data sources they need by putting together a project proposal. For access to data to be granted, the researchers need to demonstrate that their proposal is an appropriate and ethical use of the data, that it will deliver clear public benefits and that they will publish their results to enable use, scrutiny and further research. These applications are reviewed by a panel of experts from across ONS and, where appropriate, by the National Statistician’s Data Ethics Advisory Committee as well. If either body is not satisfied that all of our conditions are met, the application is rejected.
Once a project is approved, we’re happy for it to start, but we still need to ensure your data are safe, so we don’t give the researchers a copy. Instead, we have established a safe setting called the Virtual Microdata Laboratory (VML), where researchers can analyse data on our systems but have no access to email, the internet, printers or any other way of taking out our protected data. This system has a wide range of security controls built in, including CCTV in the secure rooms and protective monitoring software, which records every keystroke and mouse-click researchers make, to ensure nobody misuses the data.
As researchers can’t take data out of the VML, once they’ve completed their work they need my team to check their results and send them outputs they can use. To ensure that the outputs, tables and charts produced cannot identify the data subjects, two people check them, independently, to confirm that they meet the same confidentiality standards that ONS applies to all outputs published as Official Statistics. Once both are happy, the results are sent to the researcher, who can then write and publish their report.
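To give a flavour of what an output check can involve, here is a minimal sketch in Python of one common kind of disclosure check: flagging cells in a frequency table whose counts are so small that they might point to an individual. This is an illustration only, not ONS's actual checking rules; the threshold, the table and the function name are all assumptions made for the example.

```python
# Hypothetical threshold chosen for illustration; not an ONS rule.
MIN_CELL_COUNT = 10


def small_cells(table, threshold=MIN_CELL_COUNT):
    """Return the labels of non-zero cells whose count is below the threshold."""
    return sorted(label for label, count in table.items() if 0 < count < threshold)


# Made-up frequency table of the kind a researcher might ask to take out.
frequency_table = {
    "Wales / 35-44": 120,
    "Wales / 85+": 3,
    "Scotland / 35-44": 98,
}

flagged = small_cells(frequency_table)
releasable = not flagged  # release only if no cell was flagged
```

In this sketch the "Wales / 85+" cell would be flagged, so the table would go back to the researcher to be revised, for example by combining categories, before it could be released.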
Despite all of these controls, we still need to make sure that researchers do not inadvertently learn something about you during the course of their analysis. To do this, we de-identify the data, by removing names, addresses and any other details that would directly identify the data subjects, before we make them available for any analysis.
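As a simple illustration of the idea, the sketch below removes directly identifying fields from a set of records before they are analysed. It is a minimal example under stated assumptions: the field names and sample records are invented for illustration and do not reflect ONS's actual data or processes.

```python
# Hypothetical field names; a real dataset would have its own list of identifiers.
DIRECT_IDENTIFIERS = {"name", "address", "postcode"}


def deidentify(records, identifiers=DIRECT_IDENTIFIERS):
    """Return copies of the records with the directly identifying fields removed."""
    return [
        {field: value for field, value in record.items() if field not in identifiers}
        for record in records
    ]


# Invented example responses.
survey_responses = [
    {"name": "A. Example", "address": "1 Example Street", "age_band": "35-44", "region": "Wales"},
    {"name": "B. Sample", "address": "2 Sample Road", "age_band": "65-74", "region": "Scotland"},
]

deidentified = deidentify(survey_responses)
```

The original records are left untouched, and the copies passed on for analysis keep only the variables needed for research, such as age band and region.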
This framework means we can assure you that, whenever your data are used for analysis, this is only:
- completed by people who have been trained and accredited
- for research projects that deliver clear public benefits
- in a secure setting where it is impossible to remove data
- where all outputs are checked and confirmed as non-disclosive
- when the data to be used have had your name, address and any other variables that would directly identify you removed beforehand
As well as keeping any data held by ONS safe, the Five Safes are also used by other organisations that support research in a similar way, including in government and academia in the UK and internationally. Although specific rules may vary, for example about who can use data, or how a safe setting is managed, the same core principles remain. They help to ensure safe use of a diverse range of information while empowering UK research and supporting our contribution to the data revolution.
The Five Safes framework is just one part of how ONS protects your data. To find out more about how we keep data secure at all stages, through collection, processing, analysis and output production, please visit our website.
Pete Stokes is Head of Microdata Access and Exploitation.