In this second paper on operational resilience, we look at how to develop the audit approach. Our first paper highlighted this as a complex subject that presents both challenges and opportunities for internal audit, given that many organisations are in the early stages of their resilience journeys. The topic can seem deeply technical, so when developing a framework for achieving operational resilience, organisations need to turn the underlying principles into simple and proportionate activities as far as they can.
Internal audit objectives
Internal audit should aim to assess the effectiveness of its organisation’s operational resilience arrangements. In doing so, it should first determine whether there is a good articulation and understanding of what operational resilience means to the organisation within the context of its specific industry.
This includes defining and identifying the business services which, were they to fail, could impact customers, markets and other stakeholders, together with the measures needed to ensure the business remains operational.
Developing an outline approach
The audit approach will depend on a number of factors, including:
- How management has articulated what operational resilience means for the organisation;
- The relative maturity of an organisation’s operational resilience arrangements; and
- Whether the resulting (or planned) framework aligns to aspirations.
For example, in a less mature environment, internal audit is more likely to provide value by adopting a review-and-recommend scope for its work and focusing its effort on assessing the framework’s design. Where the framework is more developed and embedded, internal audit should aim to provide greater assurance over its operating effectiveness.
Internal audit should also consider whether it will assess operational resilience as a standalone review or assess how resilience is covered within each element of a broader, more robust framework. This means adding a resilience component to the review scope in areas such as IT and cyber security, supply chain management, business continuity, disaster recovery and operational risk management. For example, internal audit may need to extend the scope of a traditional third party/outsourced management review to capture the following:
- Does the organisation understand third party business continuity risk (including concentration risks such as cloud provider and outsourcing geography concentration)?
- Has management performed due diligence over its suppliers’ continuity arrangements (including the suppliers’ own supplier dependencies, to the depth that is necessary)?
- Are there workarounds in place for supplier loss?
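As an illustration of the concentration question above, the following minimal sketch shows one way an auditor might test a dependency register for provider concentration. The register contents, the provider names and the 50% threshold are all hypothetical assumptions made for the example, not a prescribed method.

```python
from collections import defaultdict

# Hypothetical register: business service -> third parties it depends on.
SERVICE_DEPENDENCIES = {
    "payments": ["CloudCo", "CardProcessor Ltd"],
    "online banking": ["CloudCo", "TelcoNet"],
    "customer support": ["OutsourceCo", "CloudCo"],
    "payroll": ["PayrollPartner"],
}


def concentration_by_provider(dependencies):
    """Return, for each provider, the share of services that depend on it."""
    services_per_provider = defaultdict(set)
    for service, providers in dependencies.items():
        for provider in providers:
            services_per_provider[provider].add(service)
    total = len(dependencies)
    return {p: len(s) / total for p, s in services_per_provider.items()}


def flag_concentrations(dependencies, threshold=0.5):
    """List providers supporting more than `threshold` of all services.

    The 0.5 default is an assumed value chosen for illustration only.
    """
    shares = concentration_by_provider(dependencies)
    return sorted(p for p, share in shares.items() if share > threshold)


if __name__ == "__main__":
    print(flag_concentrations(SERVICE_DEPENDENCIES))  # ['CloudCo']
```

In practice the threshold, and whether concentration is measured by service count, transaction value or geography, would be a matter for management's risk appetite rather than a fixed rule.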
While the approach to be taken will be driven by an organisation’s size and complexity, understanding how management ensures an integrated approach to achieving its overall resilience goals will be critical to delivering an effective and valued audit.
Internal audit should form its own view of the risks that could impair an organisation’s operational resilience and/or cause operational disruption to guide its review scope and testing programme. If management has completed a resilience maturity assessment, internal audit should compare these results to its own view of the current position.
Many organisations are (or soon will be) looking at the lessons learned from responding to, and dealing with, the Covid-19 crisis and this will have covered (either directly or indirectly) elements of a resilience framework.
Understanding who within the management team is leading the organisation’s efforts in this area is another important part of an audit. Typically, this will be the Chief Operating Officer working closely with the Chief Information/Chief Technology Officer and the Chief Risk Officer.
Unclear responsibilities, particularly where a business service is supported by a range of people, systems, processes and third parties, are a real threat to effective resilience. In addition, the audit should assess the extent of ownership and understanding at, and escalation to, Board level.
Internal audit should consider the following questions when assessing how far management has progressed:
- Has the organisation identified its key business services?
- Has the organisation identified and documented the people, processes, technology, facilities and information that support the delivery of each important business service?
- Has the organisation defined the scenarios and supporting methodology to be used for the scenario testing and are the results current and justified?
- What does management think most threatens the resilience of the organisation?
- What areas of the organisation does management think are most mature?
- Which executive is leading and taking ownership of the operational resilience plan?
- Does the Board collectively have sufficient knowledge, skills and expertise in relation to operational resilience?
- What evaluations of past experience have been performed?
- Does management have a view as to what levels of disruption are acceptable? Does it know whether it can stay within these levels?
- Does the organisation have any strategy for responding to lessons learned and modifying its processes in order to be more resilient in the future?
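The mapping question above (people, processes, technology, facilities and information for each important business service) lends itself to a simple completeness test. The sketch below is a minimal example under assumed data: the service names, the resource entries and the dictionary layout are hypothetical, and a real mapping would more likely sit in a configuration management database or a spreadsheet.

```python
# The five resource categories named in the mapping question above.
REQUIRED_CATEGORIES = ("people", "processes", "technology", "facilities", "information")

# Hypothetical mapping of services to supporting resources.
SERVICE_MAP = {
    "payments": {
        "people": ["payments operations team"],
        "processes": ["daily settlement"],
        "technology": ["core ledger"],
        "facilities": ["primary data centre"],
        "information": ["customer account data"],
    },
    "online banking": {
        "people": ["digital support team"],
        "technology": ["web front end"],
        "facilities": [],  # category present but nothing documented
    },
}


def mapping_gaps(service_map):
    """Return {service: [categories with no documented resources]}."""
    gaps = {}
    for service, resources in service_map.items():
        # A category counts as a gap if it is absent or its list is empty.
        missing = [c for c in REQUIRED_CATEGORIES if not resources.get(c)]
        if missing:
            gaps[service] = missing
    return gaps


if __name__ == "__main__":
    for service, missing in mapping_gaps(SERVICE_MAP).items():
        print(f"{service}: missing {', '.join(missing)}")
```

A check like this only confirms that something has been recorded in each category; whether the entries are accurate and current still needs to be tested by other means.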
Looking out for red flags
When assessing risk, internal audit should consider potential red flags that could indicate weaknesses. These include:
- Lack of skills and understanding at senior levels
- Lack of substantiated analysis of key services and the required resilience levels
- Limited data and unrealistic assumptions supporting scenario analysis and testing
- Limited/incomplete register of business services
- Limited/incomplete inventory of people, processes, technology, facilities and data (especially those relevant to critical services)
- Over-reliance on end-user computing
- Gaps in the qualifications, experience or role clarity of personnel responsible for resilience arrangements (including analysis and design activities)
- Significant/unexplained fluctuations in probability assessments, disruptions and the potential impact
- Poor articulation and understanding of risk appetite and risk tolerances across the organisation
- Inflexible legacy infrastructure that is hard to fix and further complicated by adding ever more layers and systems to manage
- New regulations that increase operational resilience challenges (particularly when they relate to the risk of illegally sharing sensitive customer information).
The interconnectedness of today’s technology infrastructure is also a concern, and the risks attached to it should be acknowledged. For example, the outage RBS suffered in 2012, after a routine software upgrade went wrong, led to fines totalling £56 million. The outage affected many customer-facing activities and lasted a number of weeks.
Organisations should therefore always consider the impact of small-scale changes that are made incrementally or through minor alterations. These often attract less oversight and control than major projects, but can lead to significant issues.
Focus areas for internal audit
Further questions to consider
Business Continuity Plans
How has management developed and implemented business continuity plans that are designed to maintain the provision of a service either within impact tolerances or within current organisational capabilities? Has management identified and defined important business services and articulated the outcomes required for each?
Has management developed a range of plausible scenarios, and does it test the efficacy of its continuity plans in maintaining services within its impact tolerances? Does it consider intelligence (such as emerging risks, near misses, and previous incidents within the business and across the local and global industry) when developing the scenarios? Are the testing plans commensurate with each scenario? Are the root causes of failed tests being addressed?
While scenarios should be severe, it would be unreasonable to expect the organisation to withstand the most extreme forms of disruption.
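To make the link between scenario testing and impact tolerances concrete, the sketch below compares test outcomes with tolerances and lists the breaches. It assumes, purely for illustration, that an impact tolerance is expressed as a maximum number of hours to restore a service; the services, scenarios and figures are hypothetical, and real tolerances may use other measures.

```python
from dataclasses import dataclass


@dataclass
class ScenarioTest:
    service: str
    scenario: str
    hours_to_restore: float


# Assumed impact tolerances, expressed as maximum tolerable hours of disruption.
IMPACT_TOLERANCE_HOURS = {"payments": 4.0, "online banking": 8.0}

# Hypothetical scenario test outcomes.
TEST_RESULTS = [
    ScenarioTest("payments", "data centre loss", 3.5),
    ScenarioTest("payments", "ransomware attack", 9.0),
    ScenarioTest("online banking", "cloud provider outage", 6.0),
]


def tolerance_breaches(results, tolerances):
    """Return (service, scenario, reason) for each test outside tolerance."""
    breaches = []
    for r in results:
        limit = tolerances.get(r.service)
        if limit is None:
            # A tested service without a defined tolerance is itself a finding.
            breaches.append((r.service, r.scenario, "no tolerance set"))
        elif r.hours_to_restore > limit:
            excess = r.hours_to_restore - limit
            breaches.append((r.service, r.scenario, f"exceeded by {excess:.1f}h"))
    return breaches


if __name__ == "__main__":
    for breach in tolerance_breaches(TEST_RESULTS, IMPACT_TOLERANCE_HOURS):
        print(breach)
```

A breach in a severe scenario is not necessarily a failing in itself; the audit interest lies in whether management has identified it, understood the root cause and agreed a response.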
How has management designed its methodology and playbook for responding to disruptive events such as a cyber-attack or data protection breach?
How does management capture and assess lessons learned from scenario testing and develop strategies for closing identified gaps? This is likely to have been brought to the forefront through responses to the Covid-19 crisis. The crisis is likely to have highlighted shortcomings in the way many organisations have historically set and tested scenarios as part of their business continuity arrangements.
How has management assessed the extent of manual processes across the organisation and identified opportunities for automation? Again, this is likely to have been accelerated because of Covid-19, given the levels of remote working practices that have been introduced.
Does the organisation have appropriate arrangements in place, including the following?
- An effective and sustainable governance strategy to address operational resilience (and is this aligned to the business strategy)
- Adequate oversight and monitoring of the resilience risk appetite and investment decisions
- Sufficient and appropriate testing of its response to a disruptive event
- Relevant and adequate management information (both quantitative and qualitative) that flows up through committees to the Board. Good quality management information should enable the Board to measure and monitor the key drivers of operational resilience and maintain appropriate oversight of the business’s performance against risk appetite.
- A set of Key Risk Indicators linked to the drivers of operational resilience and operational availability
- A risk appetite statement that recognises operational disruption as a key risk and quantifies the amount of disruption that could be tolerated in the event of an incident
- A risk appetite statement that is sufficiently clear and includes metrics/limits that are subject to an annual review by the Board
- An aligned and integrated framework for the management of operational resilience within the enterprise wide risk management framework
- Allocated roles and responsibilities for managing and reporting on operational resilience, particularly those between the 1st and 2nd lines of defence.
As we noted in our first paper, achieving operational resilience is non-negotiable. It is also far-reaching and complex, and for many organisations, assessing what they need to do and how to do it is very much a work in progress.
This presents a challenge and opportunity for internal audit to provide appropriately balanced and valued advisory and assurance support as their organisations seek to embed a fit-for-purpose framework, policies and processes that will achieve the desired outcomes.