Public Health England has admitted that 16,000 confirmed coronavirus cases in the UK were missed from daily figures being reported between September 25 and October 2. The missing figures were subsequently added to the daily totals, but given the importance of these numbers for monitoring the outbreak and making key decisions, the results of the error are far-reaching.
Not only does the error lead to an underestimate of the scale of coronavirus in the UK; perhaps more important is the resulting delay in entering the details of positive cases into the NHS Test and Trace system, which is used by a team of contact tracers. Although all those who tested positive had been informed of their results, people who had been in close contact with them and were potentially at risk of exposure were not followed up immediately (ideally within 48 hours). This was a serious error. What could have caused it?
It emerged later that day that a “technical glitch” was to blame. To be more specific, the lab test results were being transferred to Excel templates. The templates hit a limit on the number of rows they could handle and then failed to update as more cases were added. The issue was resolved by breaking the data down across smaller spreadsheets, with all new cases added to the totals reported over the weekend.
The issue may have been fixed, but people’s confidence in the testing system in place in England will undoubtedly take a knock. It’s also likely that politicians and media will use this as political ammunition to argue the incompetence of government and Public Health England. Is this the right response? What should we take away from this mistake?
An avoidable mistake
We should not forget that the government and public health workers are doing an incredibly challenging and demanding job dealing with a pandemic. But this kind of mistake was avoidable. We live in a world of big data, with artificial intelligence and machine learning permeating all aspects of our lives. We have smart factories and smart cities; we have self-driving cars and machines trained to exhibit human intelligence. And yet Public Health England used Microsoft Excel as an intermediary to manage a large volume of sensitive data. And herein lies the problem.
Although Excel is popular and commonly used for analysis, it has several limitations that make it unsuitable for large amounts of data and more sophisticated analyses.
The companies that analysed the swab tests to identify who had the virus submitted their results as comma-separated text files to PHE. These were then ingested into Excel templates to be uploaded to a central system to be made available to the Test and Trace team and government. Although today’s Excel spreadsheets can handle 1,048,576 rows and 16,384 columns, developers at PHE used an older Excel file format (XLS instead of XLSX) resulting in each template being able to store only around 65,000 rows of data (or around 1,400 cases). When the limit was reached, any further cases were left off the template and therefore positive cases of coronavirus were missed in the daily reporting.
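To make the failure mode concrete, here is a minimal Python sketch of the kind of guard that would have turned a silent truncation into a loud error. The row limits are those of the two Excel formats (65,536 rows for XLS, 1,048,576 for XLSX); the function names, the exception and the sample data are hypothetical and are not taken from PHE's actual system.

```python
import csv
import io

XLS_MAX_ROWS = 65_536        # worksheet row limit of the legacy .xls format
XLSX_MAX_ROWS = 1_048_576    # worksheet row limit of the newer .xlsx format


class RowLimitExceeded(Exception):
    """Raised when a file has more rows than the target format can hold."""


def count_csv_rows(stream):
    """Count the rows in a comma-separated text stream."""
    return sum(1 for _ in csv.reader(stream))


def check_fits(row_count, max_rows=XLS_MAX_ROWS):
    """Return row_count if it fits, otherwise fail loudly instead of truncating."""
    if row_count > max_rows:
        dropped = row_count - max_rows
        raise RowLimitExceeded(
            f"{row_count} rows exceed the {max_rows}-row limit; "
            f"{dropped} rows would be silently dropped"
        )
    return row_count


if __name__ == "__main__":
    # Hypothetical lab file with 70,000 result rows (made-up data).
    sample = io.StringIO("".join(f"S{i},positive\n" for i in range(70_000)))
    try:
        check_fits(count_csv_rows(sample))
    except RowLimitExceeded as err:
        print(f"Upload rejected: {err}")
```

The point of the sketch is the design choice rather than the details: a pipeline step that cannot hold all of its input should stop and report the shortfall, not pass on a partial file as if it were complete.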
The bigger issue is that, in the data-driven and technologically advanced age in which we live, a system based on shipping around Excel templates was even deemed suitable in the first place. Data engineers have for a long time been supporting businesses with managing, transforming and serving up data, and developing methods for building efficient, robust and accurate data pipelines. Data professionals have also developed approaches to information governance, including assessing data quality and developing appropriate security protocols.
For this kind of custom application there are plenty of data management technologies that could have been used, ranging from on-site to cloud-based solutions that can scale and provide managed data storage for subsequent reporting and analysis. The Public Health England developers no doubt had some reason to transform the text files into Excel templates, presumably to fit with legacy IT systems. But avoiding Excel altogether and shipping the data from source (with appropriate cleaning and checks) into the system would have been better and reduced the number of steps in the pipeline.
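As an illustration only, the following Python sketch loads a comma-separated results file straight into a database table and reconciles the number of rows received against the number stored. It uses SQLite from the standard library purely for convenience; the table layout, the column names (specimen_id, result) and the function name are assumptions made for the example, not a description of any real PHE system.

```python
import csv
import io
import sqlite3


def load_results(csv_stream, conn):
    """Load lab results into a table and verify that no rows were lost.

    Column names are hypothetical. Returns the number of rows stored.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        "specimen_id TEXT PRIMARY KEY, result TEXT NOT NULL)"
    )
    before = conn.execute("SELECT COUNT(*) FROM results").fetchone()[0]
    received = 0
    with conn:  # one transaction: committed on success, rolled back on error
        for row in csv.DictReader(csv_stream):
            received += 1
            conn.execute(
                "INSERT INTO results (specimen_id, result) VALUES (?, ?)",
                (row["specimen_id"], row["result"]),
            )
        after = conn.execute("SELECT COUNT(*) FROM results").fetchone()[0]
        # Reconciliation check: every row received must now be in the table.
        if after - before != received:
            raise RuntimeError(
                f"received {received} rows but stored {after - before}"
            )
    return received


if __name__ == "__main__":
    # Hypothetical file with 70,000 rows, more than an XLS sheet can hold.
    lines = "".join(f"S{i},positive\n" for i in range(70_000))
    sample = io.StringIO("specimen_id,result\n" + lines)
    connection = sqlite3.connect(":memory:")
    print(load_results(sample, connection), "rows loaded")
```

A database table has no practical row ceiling at this scale, and the reconciliation step means that any mismatch between what the labs sent and what was stored aborts the load rather than going unnoticed.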
The blame game
Despite the benefits and widespread use of Excel, it is not always the right tool for the job, especially for a data-driven system with such an important function. You can’t accurately report, model or make decisions on inaccurate or poor quality data.
During this pandemic we are all on a journey of discovery. Rather than point the finger and play the blame game, we need to reflect and learn from our mistakes. From this incident, we need to work on getting the basics right – and that includes robust data management. Perhaps rather concerning are reports that Public Health England is now breaking the lab data into smaller batches to create a larger number of Excel templates. This seems a poor fix and doesn’t really get to the root of the problem – the need for a robust data management infrastructure.
It is also remarkable how quickly technology or the algorithm is blamed (especially by politicians), but herein lies another fundamental issue – accountability and taking responsibility. In the face of a pandemic we need to work together, take responsibility, and handle data appropriately.





