Store and online data will bring a generational change to ONS price statistics
To aid the response to the coronavirus (COVID-19) pandemic, the ONS has introduced many new surveys and started using a wider range of data sources. Meanwhile, important transformational work in other areas, including consumer price statistics, has continued. Together, the planned improvements form the most significant change to inflation statistics in a generation and will greatly improve the detail and representativeness of the ONS measures. Tanya Flower outlines the changes and explains when the new data will appear.
Accurate measures of inflation play a vital role in business, government and everyday life. From rail fares to taxes to pensions, financial transactions in every area of our lives are regularly adjusted to reflect the change in prices over time. It is crucial to measure these changes in price as accurately as possible.
Today we’ve given more detail on our plans to introduce new data sources. The data sources we are investigating are scanner data (point-of-sale expenditure and quantity data provided directly by retailers) and web-scraped data (automated data collection from retailer websites).
These data sources will represent a substantial increase in the amount of data used for the production of consumer price statistics – from the hundreds of thousands of prices gathered in our current collection, to hundreds of millions that we are aiming to receive from these alternative data sources.
Over the last year, we have made significant progress towards achieving these plans. We are now receiving data from six prominent high street stores, covering tens of millions of prices each week, and are in discussions with several others. The scanner data we are receiving tell us exactly how much of each item is sold in each shop, so we will have much more detailed information not only on how much weight to give each item, but also on how buying patterns change when prices change.
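To illustrate how expenditure and quantity data can feed into weights, here is a minimal sketch in Python. The records, item names and figures are invented for illustration, and the Törnqvist formula is used only as one well-known example of a weighted index; it is not necessarily the method we will adopt. An average price (unit value) is calculated as expenditure divided by quantity, and each item's weight is its share of total expenditure.

```python
from math import exp, log

# Hypothetical scanner records: (item, period, expenditure in pounds, quantity sold).
# These figures are made up for illustration only.
records = [
    ("bread", "2020-01", 200.0, 200),
    ("bread", "2020-02", 220.0, 200),
    ("milk", "2020-01", 100.0, 100),
    ("milk", "2020-02", 100.0, 100),
]


def summarise(records, period):
    """Return the unit value and expenditure share of each item in one period."""
    rows = {item: (spend, qty) for item, p, spend, qty in records if p == period}
    total_spend = sum(spend for spend, _ in rows.values())
    return {
        item: {"unit_value": spend / qty, "share": spend / total_spend}
        for item, (spend, qty) in rows.items()
    }


def tornqvist(base, current):
    """Törnqvist index: geometric mean of price relatives,
    weighted by the average expenditure share across the two periods."""
    items = base.keys() & current.keys()  # only items sold in both periods
    log_index = sum(
        0.5 * (base[i]["share"] + current[i]["share"])
        * log(current[i]["unit_value"] / base[i]["unit_value"])
        for i in items
    )
    return exp(log_index)


base = summarise(records, "2020-01")
current = summarise(records, "2020-02")
index = tornqvist(base, current)
print(f"Index (base period = 1): {index:.4f}")
```

In this invented example the price of bread rises while milk is unchanged, so the index rises by less than the bread price does, in proportion to bread's share of spending.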
Since November 2018 we have also been receiving regular web-scraped data. These data cover different types of product, including clothing and technological goods such as laptops and smartphones. Unlike scanner data, no historical series are available for these data, so we will need to build up a sufficient time series of high-quality data before a final impact assessment can be completed.
In 2019 a prototype statistical production system was further developed to process web-scraped and scanner data using our new IT processing environment. Scanner data were processed through the system towards the end of 2019, showing the new system's capability to process hundreds of millions of rows of data to produce index outputs. We also published some experimental web-scraped indices using this system, showing its capability to process these data for research purposes.
We have also continued our research programme. For example, a framework has been developed to decide on the best index methods for use with new data sources.
We plan on introducing these new sources into our headline indices in 2023, to give us time to further develop and rigorously test the new systems. Throughout this period, we will also be liaising regularly with our advisory panels on consumer prices, our users, and the Office for Statistics Regulation, to ensure that our future plans for consumer price inflation measurement are appropriate for improving the quality of our statistics and meeting our ongoing user requirements.
While much work remains to be done, we have made substantial progress over the last 12 months towards our aim of transforming our consumer price statistics, and are confident that we will be able to deliver the planned introduction of alternative data sources by the beginning of 2023.