Hello all and welcome back to the Excel Tip of the Week! This week, we have a Basic User post – but an unusual one. Rather than looking at any particular Excel skill or tool, in this post I am going to review Public Health England's recent loss of coronavirus case records, and try to unpack exactly what happened and what part Excel had to play in the whole thing.
It was reported that around 16,000 coronavirus cases had been dropped somewhere in the track-and-trace machinery. While everyone who had tested positive was told their result, the information wasn't passed on properly and some contact tracing was delayed. The delays occurred between 25th September and 2nd October, when the issue was discovered and resolved.
In the immediate aftermath of the disclosure, there was a lot of speculation and finger-pointing swirling around. Sources indicated that some kind of spreadsheet error, possibly due to a file overrunning, was to blame – but no detailed explanation was available. The theories were that data was capped either because tests were stored in columns – of which Excel has 16,384 in the modern .xlsx format – or because results were stored in rows but in the older .xls format, which has only 65,536 rows.
Later detail confirmed that the issue arose in transferring a CSV file between the private testing centres, which actually ran the tests on the swabs, and PHE, who were loading the case details into their system in order to carry out contact tracing as well as updating other government systems. They were indeed using the older .xls format, with its 65,536 rows, and this was where the problem lay. CSV – comma-separated values – files are very simple flat text files, with just the data and commas to separate each item. They can be generated by almost any system and can be read very easily by many programs – including Excel. But an .xls worksheet only has so many rows, and any data beyond that limit simply can't be loaded. Hence when the CSV was opened in Excel, some records were lost.
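We don't have PHE's actual files or code, but the failure mode is easy to reproduce. Here's a minimal Python sketch – using made-up case data, not the real export – showing how an import capped at 65,536 rows, like a legacy .xls worksheet, silently drops everything past the limit:

```python
import csv
import io

# The legacy .xls worksheet format is capped at 65,536 rows.
XLS_ROW_LIMIT = 65536

# A toy CSV with more records than the limit (hypothetical data,
# standing in for the test-centre export).
rows = ["case_id,result"] + [f"{i},positive" for i in range(70000)]
csv_text = "\n".join(rows)

# A naive import that, like .xls-era Excel, simply stops at the row limit.
reader = csv.reader(io.StringIO(csv_text))
loaded = [row for _, row in zip(range(XLS_ROW_LIMIT), reader)]

print(f"Records in file: {len(rows) - 1}")      # 70000
print(f"Records loaded:  {len(loaded) - 1}")    # 65535 (header takes one row)
print(f"Records silently lost: {(len(rows) - 1) - (len(loaded) - 1)}")
```

Note that nothing here raises an error – the surplus records just never arrive, which is exactly why the problem went unnoticed for a week.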
What could have been done to prevent the error from occurring?
Quite a lot. First of all, the obvious question – why was this done with .xls? The file format debuted in 1987 and was superseded in Excel 2007. While .xls is more straightforward, the superior data-handling capabilities and features of later Excel versions – let alone the improved security and speed – should make the choice clear. Ministers described the issue as one of a "legacy system" that was due for replacement, but realistically Excel 2003 should not have been the default program in 2020.
There's also the fact that if you open a CSV with too many rows in Excel – even in Excel 2003 – you get a warning box telling you that the file was not completely loaded. So the PHE process must either have been automated (importing from a file with VBA doesn't trigger any such box), or done by a user who dismissed the warning without considering what it was telling them. While we don't know which, the VBA option is more likely – but there are lessons to be learned either way. If an inexperienced or careless user ignored a warning box, then of course such a critical task should not have been handled by that person – proper training and supervision were necessary. And if it was VBA, then we instead have the problem of an over-experienced user without proper training and supervision – building an automated routine without the knowledge and caution needed to anticipate this potential issue.
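Whatever PHE's actual routine looked like, the principle an automated import should follow is simple: refuse to proceed when the data won't fit, rather than truncate silently. Here's one way that guard could look – a hedged Python illustration rather than the VBA PHE would have used, with an invented `load_cases` helper:

```python
import csv
import io

XLS_ROW_LIMIT = 65536  # worksheet row limit of the legacy .xls format

def load_cases(csv_text: str, row_limit: int = XLS_ROW_LIMIT) -> list:
    """Parse a CSV export, refusing to proceed if it would overflow
    the target worksheet rather than silently dropping records."""
    records = list(csv.reader(io.StringIO(csv_text)))
    if len(records) > row_limit:
        raise ValueError(
            f"{len(records)} rows exceeds the {row_limit}-row limit; "
            "aborting import instead of truncating"
        )
    return records

# A small file imports cleanly...
ok = load_cases("case_id,result\n1,positive\n")
print(len(ok))  # 2

# ...but an oversized one fails loudly instead of losing data.
big = "case_id,result\n" + "\n".join(f"{i},positive" for i in range(70000))
try:
    load_cases(big)
except ValueError as e:
    print("Import rejected:", e)
```

A loud failure would have surfaced the problem on day one, instead of a week later.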
Of course, even the newer .xlsx format doesn't prevent a possible overflow error – although it allows 16,384 columns and 1,048,576 rows instead of 256 columns and 65,536 rows. But every file format has a limit somewhere – and even where size isn't the issue, other kinds of problems can cause import/export operations to go wrong. So the real answer here is that whenever data is ported from one system to another, some kind of integrity check should be carried out – whether it's something as simple as checking that the number of records output from one system equals the number loaded into the next, or a more sophisticated approach such as a hash or other error-detecting code. This would have substantially reduced the chance of error regardless of the software used.
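Such a check can be very lightweight. As a sketch – using an invented `file_fingerprint` helper and toy data – the sending system could publish a record count plus a hash of the file, and the receiving system could recompute both after loading:

```python
import hashlib

def file_fingerprint(data: bytes) -> tuple:
    """Record count plus SHA-256 digest: a simple integrity check for
    a CSV handed from one system to another."""
    count = sum(1 for line in data.splitlines() if line.strip())
    return count, hashlib.sha256(data).hexdigest()

# The sender computes the fingerprint of what it exported...
sent = b"case_id,result\n1,positive\n2,negative\n"
sent_fp = file_fingerprint(sent)

# ...and the receiver recomputes it on what it actually loaded.
received = b"case_id,result\n1,positive\n"  # one record dropped in transit
recv_fp = file_fingerprint(received)

print("Match" if sent_fp == recv_fp else
      f"Mismatch: sent {sent_fp[0]} records, received {recv_fp[0]}")
```

Even the count alone would have caught the PHE loss; the hash additionally catches records that arrive garbled rather than missing.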
The ultimate lessons remain clear – Excel was at the centre of this problem, but as in so many other cases, the failings weren't caused by anything unique to Excel, but by failings of training, implementation, and governance. Excel attracts a lot of attention for its many failings – see the European Spreadsheet Risk Interest Group's list of horror stories for some of them – but their prevalence is, at least in part, down to the overwhelming popularity of Excel, rather than any innate error-proneness in the software itself. Just because more people are injured in McDonald's than in indoor skiing arenas doesn't mean that McDonald's is more dangerous, just that many more people go there.
This article is brought to you by the Excel Community, where you can find additional extended articles and webinar recordings on a variety of Excel-related topics. In addition to live training events, Excel Community members have access to a full suite of online training modules from Excel with Business.