
Why we urgently need a common understanding of data

Author: Tim Stafford

Published: 01 Dec 2020


Those with the access and power to decide how data is used will be the focus of a key 21st century battleground. While few know how to describe the issue, let alone regulate it, it’s imperative to take heed of the debate now – and its business implications, writes Tim Stafford.

Clive Humby, a mathematician and co-creator of Tesco’s Clubcard programme, is widely acknowledged as the first to claim “data is the new oil” in 2006. Others now argue it is more like the wind: it belongs to everyone and no one. Alix Dunn, an adviser on ethics and technology, compares it to money. And so the metaphors go on. There is no consensus because there is no historical precedent. There is nothing that perfectly describes the key role data will play – or how hard it will be to manage the power data confers.

Software review website TechJury estimates it would take an individual 181m years to download all of the internet’s data. According to Internet Live Stats, Google receives more than 4bn search requests a day – 45,000 a second – 15% of which have never been typed before, according to Google itself. 

But the internet is only one place where data is stored. Huge amounts, often far more personal, private and specific, sit on machines that only some people can access, and more is generated every day. In 2017, IBM calculated that 90% of all the world’s data had been created in the past two years – a trend that many believe will only have accelerated since.

Data shapes billions of daily decisions, large and small, made by managers, policymakers and legislators to, for example, prescribe a new drug, offer someone a job or invest in a housing development. Increasingly, those who have access to that data – and the tools to analyse and manipulate it – have considerable knowledge and influence. Or, in other words, power.

Yet despite this accumulation of power, we don’t have a rigorous way to discuss data, to regulate it or ensure a level playing field. Dunn, founder of Computer Says Maybe, a technology company that describes itself as “building the skills we need for an equitable future”, explains that “there’s not currently the legal infrastructure or the consumer understanding of how to engage with institutions with the right level of nuance to make this possible”.

Regulatory challenge

The global nature of data makes regulation even more difficult. Data can be stored anywhere in the world, moved nearly instantaneously, and used ‘non-exclusively’ (by an infinite number of people and machines simultaneously). This is at odds with a world in which politicians are seeking to strengthen borders and make stronger ownership claims on what belongs to their citizens.

Despite these challenges, the need for regulation is increasingly urgent. If IBM’s estimate for the amount of data being produced is not wildly out of date now, it soon will be. The much-discussed fourth industrial revolution (4IR) encompasses the just-as-much-discussed Internet of Things – where internet-connected machines produce, share and consume data independently. There will soon be much more data, and it will be even more fundamental to everyday life (making it even harder for anyone to live without using and producing data), so it will be an even bigger issue to deal with.

Policymakers and governments have responded to some concerns, most notably about people’s right to know how their personal data is used. The EU’s General Data Protection Regulation (GDPR) is at the forefront of this, and requires all organisations that process people’s personal data (name, email address etc) to use it only for the purpose for which the person submitted it, and then to delete it once it has been used for that purpose.

California’s Consumer Privacy Act, which became effective on 1 January this year, aims to do something similar for that state’s citizens. Beyond this, the EU, UK and US – at state and federal level – are all discussing ways of regulating facial recognition technologies but, as yet, no clear legislation has been passed. There is also, again, a lot of debate on how to build on existing legislation to regulate the use of Artificial Intelligence (AI).

So, while a patchwork of legislation exists – including the comprehensive GDPR – there isn’t yet a holistic attempt to think through the implications that access to data presents, or a legislative programme to address them.

But at some point in the next decade, data ethics experts agree, the issues raised by the power imbalances data causes will spark debates that lead to major legislative change, which will in turn alter the way our economy and society work. By then, businesses, policymakers – all of us – will have a clearer idea of the way forward.

Public battlegrounds: the use and misuse of data

Dunn proposes the metaphor of data as money because it reflects how data “requires a set of rules and regulations because it can be used for a wide variety of different tasks, some of them good, some bad”. She says: “Few people focus on regulation of money, but you do expect there to be regulatory frameworks, controls and consumer protections at the level of how money is spent and how the marketplace functions.” The problem is, she says: “Because we’re still wrapping our heads around this stuff, we focus on the bits and the bytes, instead of the institutions that use the bits and bytes to do something.”

It’s not yet clear how mainstream political conversation will shape the data debate, where the battlegrounds will be or what sides will be taken. But Dunn predicts it will hit home when people see how institutions use data to transform a particular domain – whether in education, healthcare, the labour market or simply when people find they are paying more for everyday goods than their neighbours. In the future, she says, “companies could know exactly what amount of a product you need and charge you this amount for this product”.

Again, it’s this imbalance of a minority with access to data and the tools to manipulate it, changing the world for the majority, that will push debate – possibly civilised, possibly not – into the mainstream.

Dunn highlights a public sector example from August of how this might play out. UK A-level students – angry at how technology was used to predict the grades they would have got had COVID-19 not cancelled their exams, and angry at being awarded lower grades than they thought they deserved – protested against the algorithm itself. Possibly the first recorded instance of computer code being heckled.

Tipping point

Even though the debates about data are numerous and multiplying, because the issues are so complex, we won’t know until after the fact what tipping point led us to major legislative change. However, there will be “an Archduke Ferdinand moment” that could rapidly transform the debate, says Alice Thwaite, founder of tech ethics campus, community and consultancy Hattusia.

She predicts that, somewhere in the world, an outcry (for example, a racial profiling incident leading to the mistaken identity of someone as a criminal or a scam leading to thousands of people losing money) will result in facial recognition technology and the use of personal sensitive data being banned. In turn, many countries will follow suit and this will fire a huge debate about the use of data and people’s rights.

This sudden shift in thinking poses risks for companies that process large amounts of data. Businesses must consider how they treat the data they have access to and, more fundamentally, how they change the way they work as this new world emerges. Referring to 4IR, Thwaite says “it won’t be easy, revolutions require new thinking”. She highlights how, for years, most companies had ‘global pandemic’ in their list of top 10 risks yet COVID-19 still upended everything. 

Dunn has worked with companies to help them produce new data products, and has to show managers how certain decisions (about the design of the product, when and where it’s launched, how it’s marketed, and so on) have to be taken at the most senior level. Companies “wouldn’t set up a finance department and then say, ‘Great, now no one else need talk about money.’ It’s the same with data,” adds Dunn.

One example of the potential fallout comes from a class action lawsuit against Oracle and Salesforce. Filed in Amsterdam this August, it asks for a €500 payment for each person whose personal sensitive data was allegedly used without their explicit consent (and so in contravention of GDPR). The group behind it claims it could generate $10bn of fines. Oracle and Salesforce have stated that they will defend the lawsuit, with spokespeople from both companies saying the claims are without merit.

Corporate battlegrounds: access to data and why it matters

Viktor Mayer-Schönberger, professor of internet governance and regulation at the University of Oxford, dismisses the once common ‘data is oil’ analogy. He stresses the relative value of a piece of data in a dataset can change over time, whereas all oil of a certain type has the same value. 

Mayer-Schönberger says: “Data scientists never tire of saying the first step of using data is to answer questions you already have, but a more sophisticated use is to find out what questions you should ask.” Asking new questions in this way changes the value of some existing data.

It also produces another difference between data and other commodities, says Mayer-Schönberger. “Nobody cares about ownership of data; everybody cares about whether they can access data and use it. Because that’s what creates value,” he adds. And it’s this question of access that is upending the market economy and once more shifting the balance of power. 

Traditionally, large businesses produce high volumes of a product and do so more efficiently than smaller companies. Without legislation, this leads to a self-reinforcing competitive advantage that concentrates a market into the hands of a few players. 

But, says Mayer-Schönberger: “Fortunately, for hundreds of years, we had the counter force of innovation: human ingenuity that is dispersed throughout humanity. Somebody has a great idea, takes that idea, brings it to market, and that topples the large company.”

New ideas require data to implement

The problem is that now it’s not enough just to have a new idea. In the ‘data age’, any new idea will require data and the tools to manipulate it (machine learning, AI, and so on) to shape and test prototypes, to understand how to position it in the market and to sell it – whether it’s a physical product, say, or software-as-a-service. But most of the data required is held by big tech firms. Google, for example, has a great spellchecker, but that’s because no other firm can access the data that Google can. For the first time in history, innovation won’t counter the advantage that incumbent companies have from operating at scale. 

There are numerous moves and conversations to combat this but so far little concrete action. The EU has put forward proposals that would give Europeans a right to access the data collected by the US data platforms as a condition of those platforms operating in the EU. The German Social Democratic Party has a policy on this, and the UK government has also made similar proposals. This wouldn’t contradict strong data privacy policies as it would include stipulations to only make data available in aggregate and wouldn’t allow anyone accessing the data to identify individuals from it.

These moves are summed up by EU policymakers as the need for ‘data sovereignty’. A June report by the EU’s European Institute of Innovation & Technology called for coordinated action to produce and protect this sovereignty. The report concluded that “there is currently a vicious cycle in which a very small number of (non-EU) companies have oligopolistic access to valuable user data” that they then use to cement their oligopoly. This cycle, it says, “forms a barrier to European innovation and growth, constitutes a serious threat to European sovereignty, and needs to be slowed down and eventually broken”.

Hundreds of organisations are working on ‘Gaia-X’, a Europe-based initiative to build an infrastructure for sharing data collected and produced by organisations with anyone that wants it, to boost innovation. Mayer-Schönberger argues for a future where Europe will diverge from the emerging US vs China axis to “create a counter force that requires more flow of data. It’s [essentially] a data subsidy that the government provides to SMEs, by making data accessible to them. It would enable a more open, innovation-based European tech economy to flourish”. But it would create “geopolitical tensions, where the US might say to Europe, ‘Close it down!’, and Europe might say, ‘No, we keep that open flow because it helps the industry that we have, rather than tries to replicate what you guys have’.”


What to consider when starting a business

Managers need to keep on top of two major activities when it comes to data. If you’re starting a business, these two issues should play an important role in your pre-launch planning.

1. Data as growth strategy

  • Understand what the use of data can tell you about customers, suppliers and competitors.
  • Think about how you can use the data you collect in day-to-day operations to help customers. Can you aggregate the data you have and spot trends? Can you provide anonymous benchmarks or best practice to other customers?
  • Will your staff be able to share and collectively work on data easily (especially if they are remote)? What other data can they get access to that would help them in their roles?
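As a rough illustration of the ‘anonymous benchmarks’ idea in the list above, the Python sketch below groups customer figures by sector and publishes a median only where enough customers contribute to it. Everything in it is an assumption made for the example: the records, the field names, the choice of the median and the minimum group size of three. A threshold like this lowers the chance of one customer being identifiable from a benchmark, but it does not by itself guarantee anonymity.

```python
from collections import defaultdict
from statistics import median

# Hypothetical records: (customer_id, sector, days_to_pay_invoice)
RECORDS = [
    ("c1", "retail", 30),
    ("c2", "retail", 45),
    ("c3", "retail", 38),
    ("c4", "construction", 60),
    ("c5", "construction", 75),
]

# Assumed threshold: with fewer customers than this, a benchmark
# could reveal too much about a single customer, so it is withheld.
MIN_GROUP_SIZE = 3


def sector_benchmarks(records, min_group_size=MIN_GROUP_SIZE):
    """Return the median value per sector, omitting sectors with too few customers."""
    by_sector = defaultdict(list)
    for _customer_id, sector, value in records:
        by_sector[sector].append(value)
    return {
        sector: median(values)
        for sector, values in by_sector.items()
        if len(values) >= min_group_size
    }


if __name__ == "__main__":
    # Only "retail" has enough customers to be reported.
    print(sector_benchmarks(RECORDS))
```

The customer identifiers never leave the function, and the ‘construction’ sector is withheld because only two customers contribute to it.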

2. Data compliance and risk management

  • For EU-based start-ups, GDPR is the most wide-ranging and applicable data legislation to prepare for. And for UK businesses, Brexit will not mean any immediate changes. The UK Information Commissioner’s Office says it’s “committed to maintaining the high standards of the GDPR and the government plans to incorporate it into UK law at the end of the transition period”. The GDPR will require you to understand and document what personal data you can hold, where you hold it, how long you hold it and how you will protect it. Fines for non-compliance can be hefty and put a start-up out of business.
  • Regarding the above, it also makes sense to build in a data risk management process from the start. Understand where the biggest risks lie – both from regulators and to your reputation in the marketplace – quantify them, and revisit them regularly to ensure that you are happy with the level of risk you are taking on. Ensure also that all relevant employees understand your appetite for this risk and make decisions that accord with that appetite.
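One way to picture the documentation described in the first bullet above (what personal data you hold, where, for how long and how it is protected) is a simple inventory that can be checked for records kept past their retention period. The Python sketch below is illustrative only: the class, the field names, the sample entries and the retention periods are all hypothetical, and the retention periods stand for decisions a business would make for itself rather than figures taken from the GDPR.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class DataHolding:
    """One entry in a hypothetical personal-data inventory."""
    description: str     # what personal data is held
    location: str        # where it is held
    collected_on: date   # when it was collected
    retention_days: int  # how long the business has decided to keep it
    protection: str      # how it is protected

    def delete_by(self) -> date:
        """Date on which the retention period ends."""
        return self.collected_on + timedelta(days=self.retention_days)


def overdue(holdings, today):
    """Return the holdings whose retention period has already ended."""
    return [h for h in holdings if h.delete_by() < today]


# Sample entries, invented for illustration.
INVENTORY = [
    DataHolding("newsletter email addresses", "CRM system",
                date(2020, 1, 10), 365, "encrypted at rest"),
    DataHolding("job applicant CVs", "shared drive",
                date(2019, 3, 1), 180, "access restricted to HR"),
]

if __name__ == "__main__":
    for holding in overdue(INVENTORY, today=date(2020, 12, 1)):
        print(holding.description, "was due for deletion on", holding.delete_by())
```

Reviewing a list like this on a schedule is one practical way to revisit data risks regularly, as the second bullet suggests.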

For a closer look at one implication of this broad debate, see author Nina Schick’s analysis of how the use of AI to manipulate video is upending some of our most fundamental assumptions about the world.