HMRC’s expanded data collection

Author: Caroline Miskin

Published: 28 Sep 2022

Caroline Miskin explains HMRC’s proposals to expand the data it collects from employers and self assessment taxpayers, raising concerns over administration, data security and possible new penalties.

Data is one of our most valuable commodities. HMRC has been ahead of the game in how it collects and collates data in its Connect system. However, the focus now seems to be changing from post-filing compliance checks to collecting more data, both from taxpayers and from third parties, at the earliest possible opportunity and for a variety of possible uses. The current proposals do not appear to reflect a coherent data strategy across central and local government or indeed for HMRC.

The purpose of a self assessment income tax return is “establishing the amounts in which a person is chargeable to income tax and capital gains tax for a year of assessment” (s8, Taxes Management Act 1970). Many of the current proposals go well beyond what HMRC needs to establish tax liabilities and ensure compliance, straying into policymaking across government. Any new data requirements must clearly comply with the existing General Data Protection Regulation principles, including purpose limitation and data minimisation.

What data is HMRC seeking to collect?

The most recent proposal, Improving the data HMRC collects from its customers, has within its scope collecting additional data from employers and self assessment taxpayers. 

The data to be collected from employers includes:

  • the occupations of employees;
  • the hours employees work; and
  • the location of an employment.

From self assessment taxpayers, the data includes:

  • the business sector of the self-employed;
  • the occupation of the self-employed;
  • the location of the self-employed business;
  • dividends paid to shareholders in owner-managed businesses; and
  • the start and end dates of self-employment (which are already collected, but not reliably, because providing them is currently optional).

Alongside that, other proposals that involve more data collection include:

Making Tax Digital

The moves to collect more data contrast with Making Tax Digital (MTD) VAT and income tax self assessment (ITSA). These have been designed so that HMRC receives exactly the same level of detail as previously. MTD ITSA will require this data to be submitted more frequently, but that data will not, at least initially, be put to any obvious good use. This creates an increased administrative burden and extra cost with little benefit to the taxpayer.

Many have suggested that a facility to provide additional information to support MTD VAT returns would be helpful (eg, a copy of an invoice for an unusually large purchase), but this has not been developed by HMRC. The three-line account relaxation is to continue and, while this is a welcome simplification, it does seem to run counter to the stated aims of MTD ITSA.

Many in the profession have suggested that simplification of the tax system should have been considered in advance of MTD ITSA and this could usefully have included a review of the data that HMRC should hold.

It is surprising that the latest consultation does not mention data on income from property. Perhaps that will follow the Office of Tax Simplification’s Review of Property Income? The design of MTD ITSA for income from property is certainly constrained by the data that is currently collected on tax returns, particularly in relation to income from jointly held property.

A careful review of the income tax self assessment return might lead to changes to the data that is collected on those returns. In some cases, more detail might facilitate HMRC compliance checks. For example:

  • reporting each private pension separately rather than providing an analysis in the white space, which, as far as I am aware, is not input into HMRC’s systems; and
  • a further breakdown of some investment income categories.

In other areas it might be possible to reduce the level of detail required. Some of the decisions about the self assessment dataset are rooted in the days of paper returns rather than the digital age. Providing a more detailed breakdown would, in some cases, not create the administration burden that it would have in the past. Other than dividends for owner-managed companies, this does not seem to have been considered by HMRC.

What use will the data be put to?

From a customer service perspective, HMRC does not have a good record when it comes to how it uses the data it does collect, largely because of legacy systems that prevent it from joining things up at a taxpayer level. The single customer record and account have the potential to significantly improve the taxpayer experience of using HMRC systems, but will take several years to develop fully. The current proposals for collecting new data do not, in my view, bring with them any significant customer service benefits.

Better data would undoubtedly have been helpful when the government was designing COVID-19 support schemes. But is it really sensible to collect data that would have been useful in the past or would support a current government policy objective (such as levelling up) that may well have changed by the time the data is available? Stable doors and horses come to mind.

The tax profession is likely to support the collection of additional data that is genuinely useful for compliance checks (such as the data that is collected through the Foreign Account Tax Compliance Act, the Common Reporting Standard (CRS) and DAC6) and that can be collected without adding a significant administration burden. I would, however, note that the one-to-many campaign letters that HMRC is currently issuing in increasing numbers do add to the administration burden and often create unnecessary worry for compliant taxpayers.

Collection of data for policymaking or use by other government departments for unspecified reasons needs much better justification.

In particular, the burden that HMRC intends to put on employers seems onerous. I don’t think I am alone in not having previously been aware of Standard Occupational Classification codes. An employee’s work location can be very fluid, so whatever requirements are introduced need to reflect modern working practices. Standard Industrial Classification codes are familiar but, as far as I am aware, are not always selected with great care.


The main concerns are likely to relate to:

  • The administration burden, particularly if HMRC collects data that it does not use effectively. This includes the impact on taxpayers having to correct incorrect data held on HMRC systems and potentially on the systems of other government departments. In some cases, the usefulness of the data is limited by the fact that it is for a calendar rather than a tax year (eg, CRS data and the new data to be provided by digital platforms).
  • Data security, especially when data is shared with other government departments, which may increase the risk of a data-related incident. HMRC’s annual report and accounts show that in 2021/22 there were 16 unauthorised disclosures reported to the Information Commissioner’s Office, affecting 10,896 taxpayers.
  • The possibility of new penalties for inaccurate data that does not have any tax impact.

These proposals would seem to take us further on the road towards HMRC becoming a data processing organisation and further away from it behaving like a tax authority. All the different initiatives need to be looked at together, with serious questions asked about what data HMRC and the wider government need and what they do with it.

About the author

Caroline Miskin, Senior Technical Manager Digital Taxation, ICAEW