The Recovery Test Nobody Runs

May 12, 2026

IT Operations

Business Continuity Beyond the Checkbox

Three weeks after passing its ISO 22301 certification review, a regional financial services firm was hit by ransomware. Its documented recovery time objective was 48 hours. Actual recovery took nineteen days. Backup infrastructure functioned largely as designed. The gap had nothing to do with technology.

Four senior executives held conflicting interpretations of decision authority that the plan had never resolved. The communications team lacked sign-off authority to engage regulators without a board quorum, which took 72 hours to convene. In the first few days, a vendor management team discovered that three critical SaaS providers had been compromised by the same attack vector — dependencies the plan had recorded as stable without anyone having tested that assumption directly with those vendors. By day four, the organisation was not managing a ransomware recovery. It was managing a governance crisis the ransomware had exposed.

The plan existed. The organisation could not execute it. That distinction is where business continuity programs most commonly fail, and it is the distinction that standard testing almost never surfaces.

What Standard Testing Measures

The dominant model for continuity testing is technical. Systems fail over, backups restore, and recovery time actuals are compared against recovery time objectives. When infrastructure returns within the documented window, the test passes. That framing is coherent for what it measures. Technical recovery is typically the most tractable part of a serious disruption. The components that extend recovery — unclear decision rights, broken communication chains, untested dependencies, leadership teams improvising under pressure without a rehearsed framework — are not surfaced by a failover drill. They are revealed only by the kind of test most organisations never run.
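
The pass criterion described above reduces to one comparison, which is part of why it says so little about execution. A minimal sketch in Python, using the 48-hour objective and nineteen-day recovery from the opening case; the `RecoveryResult` type and the drill figures are illustrative assumptions, not taken from any real test report:

```python
from dataclasses import dataclass


@dataclass
class RecoveryResult:
    """One measured recovery, compared against its documented objective."""
    name: str
    rto_hours: float  # recovery time objective (documented)
    rta_hours: float  # recovery time actual (measured)

    @property
    def passed(self) -> bool:
        # The standard test criterion: actual within objective.
        return self.rta_hours <= self.rto_hours

    @property
    def overrun_ratio(self) -> float:
        # How many times the objective the actual recovery consumed.
        return self.rta_hours / self.rto_hours


# Hypothetical failover drill: infrastructure only, no governance layer.
drill = RecoveryResult("failover drill", rto_hours=48, rta_hours=36)

# The opening case: same objective, nineteen days to recover.
incident = RecoveryResult("ransomware incident", rto_hours=48, rta_hours=19 * 24)

for result in (drill, incident):
    status = "pass" if result.passed else "fail"
    print(f"{result.name}: {status}, {result.overrun_ratio:.1f}x objective")
```

Both results are judged against the same number, yet only the second one includes the time spent resolving decision rights, convening a quorum, and verifying vendors. A drill that measures only the first kind of elapsed time can pass indefinitely without predicting the second.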

The Organisational Failures That Plans Don’t Catch

Continuity plans are written at a point in time. Organisations change faster than documentation cycles. Systems are replaced, operating models shift, personnel move on, and supplier relationships evolve. Plans reviewed on annual cycles by the same teams that wrote them, using processes optimised for confirming existing content rather than challenging it, routinely describe organisations that no longer exist in the form documented. By the time a disruption occurs, the plan may be accurate enough to satisfy an auditor and inadequate enough to fail an operational test.

The mechanism is straightforward. When continuity investment is driven primarily by regulatory compliance, the program’s success criteria orient toward evidence production rather than capability development. Plans are written to be auditable. Exercises are designed to produce a sign-off. After-action reports document what went smoothly. What gets measured is completion. What goes unmeasured is whether the organisation could actually recover under conditions the plan was not written to anticipate.

The 2017 British Airways IT outage made this visible at scale. An accidental power surge at a data centre — a category of event for which the airline had documented recovery procedures — cascaded into a multi-day collapse affecting more than 75,000 passengers at an estimated cost of £80 million. The technical failure was recoverable in principle. What proved far harder was coordinated decision-making, customer communication, and operational sequencing across business units that had never practised working together under that kind of pressure. The event did not expose a technology weakness. It exposed the absence of practised cross-functional recovery capability that the documentation had obscured.

One month later, NotPetya provided a larger version of the same lesson. Maersk lost approximately 45,000 endpoints in minutes. Its recovery was faster than that of most comparable organisations, partly because a backup domain controller survived in a remote office in Ghana, an accident of geography rather than planned redundancy. More decisively, the company’s capacity to make and implement decisions rapidly, across geographies and functions, without the communication infrastructure that normal operations depended on, reflected a culture of distributed decision authority and operational discipline that no continuity document had created and no annual review had produced. Merck, facing the same attack, recorded losses exceeding $870 million. The difference was not one of documentation quality. It was organisational capability that had been built before the event made it necessary.

These cases share a structural pattern. The technical failure was not the decisive variable. The decisive variable was whether the organisation could restore governance — decision rights, communication, prioritisation, coordinated action — in parallel with or ahead of the technical recovery. In both cases, organisations with mature documentation but untested governance structures performed substantially worse than those with practised operational coherence. The plan was not what determined the outcome.

Compliance-led continuity programs create a specific and expensive risk: the conviction that preparedness exists because documentation does. When the primary evidence of readiness is a completed review cycle and a green status indicator on the board risk dashboard, it is rational for leadership to feel protected. What has actually been verified is the existence of a plan, not the organisation’s capacity to execute it under conditions the plan was not written against. False confidence is not a soft organisational failing. It is a measurable gap between the risk profile an organisation believes it carries and the risk it actually carries.

Dependencies and the Limits of What Plans Map

Most continuity plans document what the organisation controls directly. The gaps that most consistently extend recovery sit in the dependencies the organisation does not control and has not tested. Direct vendors are typically mapped. What is rarely mapped is what those vendors depend on, and what happens when a failure arrives from upstream rather than originating inside the organisation’s own perimeter.

In 2020, the SolarWinds breach revealed that many organisations relied on vendor-managed systems whose resilience had never been tested, assuming an availability that did not exist. Fourth-party risk, meaning the dependencies of an organisation’s own vendors, is often overlooked in continuity plans. Contracts and SLAs are reassuring documents, not proven recovery capabilities.
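
Mapping what vendors depend on is, mechanically, a graph traversal. The sketch below is a minimal Python illustration under stated assumptions: every vendor and provider name in `DEPENDS_ON` is hypothetical, and in practice the map would have to be assembled from what vendors actually disclose. It walks outward from each direct vendor and reports which indirect providers sit behind more than one of them.

```python
from collections import defaultdict

# Hypothetical dependency map: each key relies directly on the listed providers.
DEPENDS_ON = {
    "our-org": ["payroll-saas", "crm-saas", "payments-saas"],
    "payroll-saas": ["cloud-host-a", "identity-provider-x"],
    "crm-saas": ["cloud-host-a"],
    "payments-saas": ["identity-provider-x", "cloud-host-b"],
}


def indirect_dependencies(graph, root):
    """Map each indirect provider to the direct vendors that rely on it."""
    direct = graph.get(root, [])
    via = defaultdict(set)
    for vendor in direct:
        stack = list(graph.get(vendor, []))
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            # Providers that are also direct vendors are already mapped.
            if node != root and node not in direct:
                via[node].add(vendor)
            stack.extend(graph.get(node, []))
    return {provider: sorted(vendors) for provider, vendors in via.items()}


def shared_points_of_failure(graph, root):
    """Indirect providers that more than one direct vendor depends on."""
    indirect = indirect_dependencies(graph, root)
    return {p: v for p, v in indirect.items() if len(v) > 1}


for provider, vendors in sorted(shared_points_of_failure(DEPENDS_ON, "our-org").items()):
    print(f"{provider} sits behind: {', '.join(vendors)}")
```

In this invented map, two providers the organisation has no contract with each sit behind two of its three critical vendors. The traversal only shows where to look; whether those providers would actually be available in a disruption is what a test, not a map, has to establish.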

The manual workaround problem is similar. Plans often include fallback procedures for when automated systems fail. These procedures are rarely rehearsed, so they tend to fail on first use: the documented process is inaccurate, tools are inaccessible, the workload exceeds manual capacity, and no one has practical experience. Manual fallbacks need rehearsing, but most organisations treat them as mere documentation.

Recovery Is a Governance Problem

There is a persistent tendency to frame business continuity as an IT discipline. That framing reflects where visible failure points tend to sit — when organisations go down, a system has usually failed — but it mislocates where recovery actually stalls. In complex disruptions, technical recovery frequently outpaces operational recovery. Systems come back online before there is clarity about who is authorised to make which decisions. Infrastructure restores before communication protocols have been established. Backups recover before the business has agreed on which workloads to prioritise, which customer obligations to honour first, or what message to put in front of regulators.

The decision queue waiting for a governance structure is usually longer than the system restoration queue. Those decisions depend on agreement about who holds authority during a crisis, and that agreement cannot be reached under pressure by people who have never practised reaching it. Research shows that under high-pressure, low-information conditions, leaders tend to centralise decision-making, experience communication breakdowns, struggle to triage urgent demands, and default to familiar, often unsuitable processes. These are not character flaws but predictable responses to unpractised situations. Documentation doesn’t fix them; rehearsal does.

The governance gap also extends into vendor and supply chain management in ways that continuity programs rarely address adequately. COVID-19 demonstrated this systematically. Organisations that had invested seriously in internal IT resilience found their recovery capability sharply limited by the simultaneous fragility of their supply ecosystems. The continuity plans had assumed recoverable dependencies. The reality was a synchronised, multi-point failure against which no individual plan had been written and for which no coordination with third parties had been rehearsed. The technical architecture held. The operating model it served did not.

Testing That Finds Nothing Has Found Nothing

The shift from documented preparedness to demonstrated recovery capability requires testing designed to surface gaps rather than confirm that plans exist. That distinction matters because exercises designed to produce clean outcomes will produce clean outcomes. They will not produce useful information about whether the organisation can recover under conditions that diverge from the plan’s assumptions.

Credible testing introduces scenarios the plan was not written for, involves people who have not been briefed on the intended narrative, and includes decision points with no single documented answer. It tests the communication and governance layer directly. It requires the organisation to demonstrate, rather than assert, that critical vendor capabilities will be available when needed. A clean pass is not a success indicator. It is evidence that the test was not difficult enough. The value of an exercise is proportional to what it reveals, which means an exercise that surfaces nothing has revealed nothing about whether the organisation can actually recover.

Scenario realism outweighs coverage: practising three or four high-stakes scenarios builds more capacity than listing 40 disruption types. The focus should be on adaptive response, decision-making, and communication under stress. After-action learning consolidates that capability through honest evaluation of failures and assumptions. True readiness comes from real learning, not perfect exercises.

Continuity maturity is not measured by the elegance of the plan. It is measured by how well the organisation recovers when the plan meets a reality it was not written for — when the dependencies the plan assumed were stable have failed, when the people the plan relied on are unreachable, and when the governance structure the plan described has never been tested under pressure.

The question worth putting to any organisation’s senior leadership is not whether the continuity plan is current. It is whether the business could actually recover, and what evidence exists to support that answer. In most organisations, the honest answer to the second part of that question is the most useful piece of risk information the board is not receiving.