The binder was impressive. Sixty-two pages, indexed tabs, version-controlled, signed by the COO. It had survived three audits and a cyber insurance review. It listed roles, escalation paths, external contacts, and a recovery time objective of four hours. The only thing it had never survived was an actual incident. We found that out last spring during a tabletop exercise with a Hudson Valley manufacturer—250 employees, $80 million in annual revenue, three production lines running single-shift. Twenty minutes was all it took for the plan to stop making contact with reality.

The scenario was straightforward: ransomware detected on two workstations in the plant floor network at 6:45 a.m. on a Tuesday. The security team in the exercise—IT manager, operations director, HR lead, and the CFO—opened the plan and began working through it. That's when the gaps started surfacing, one after another, with the quiet inevitability of a leak finding every crack in a hull.

The plant manager wasn't in the room. He wasn't supposed to be—the IR plan didn't list him as part of the response team. But his name appeared in three places as a required decision-maker for production shutdowns. Nobody had told him. He learned he was in the communication chain about forty seconds after we pointed it out.

An IR plan that hasn't been tested isn't a plan. It's a hypothesis. And Tuesday is not when you want to find out it was wrong.

The backup vendor contact was next. The IR plan listed a support number and a named account manager. The number went to general support. The account manager had left the company. The managed backup contract itself had lapsed eleven months prior—auto-renewal had failed when the company changed its accounts payable platform and the invoice routing broke. Nobody had noticed, because backups were still running. They just weren't monitored, tested, or under any service agreement.

When we asked the team how long since they'd done a full restore test, the IT manager said six months. When we pulled the actual log records, it was fourteen months. There is always a gap between what people remember and what the records show. That gap is where incidents become disasters.

01

The Difference Between a Document and a Capability

An IR plan that lives in a document repository and gets updated for audits is not a capability. It is a compliance artifact. The distinction matters enormously, because the two require completely different things to maintain.

A compliance artifact needs to be current, accurate, and accessible. A capability needs to be practiced, tested, and load-bearing under pressure. You can maintain a document through annual review cycles. You cannot maintain a capability that way. Capabilities atrophy. Contacts change. Vendor relationships expire. People leave, and institutional memory walks out with them. The plan you wrote eighteen months ago reflects a company that no longer exists.

Compliance Artifact

Updated annually. Passes audit. Reviewed by security team. Lives in SharePoint. Nobody reads it under pressure.

Operational Capability

Tested quarterly. Roles walked. Contacts verified. Backup restores validated. The team has muscle memory, not just documentation.

The manufacturer's plan was a compliance artifact. It described what the company intended to do. It did not reflect what the company was actually capable of doing on a given Tuesday morning, with the specific people on shift, using the specific tools under contract, within the constraints of a plant that cannot afford more than a few hours of unplanned downtime before the margin math turns ugly.

This is not a failure of intent. It is a structural problem that affects most mid-size organizations without a dedicated security operations function. The plan was built once, audited periodically, and never stress-tested against the friction of real conditions. That's not negligence—it's the default state of IR planning in organizations where security is one of a dozen competing priorities and nobody owns readiness as a continuous function.

02

Why Manufacturing Is Uniquely Vulnerable

Every industry has IR challenges. Manufacturing has a set of structural conditions that make those challenges significantly harder, and most IR frameworks were not designed with those conditions in mind.

The first is OT/IT convergence. Over the past decade, plant floor systems have become progressively more networked. SCADA systems, programmable logic controllers, human-machine interfaces, and MES platforms now share network infrastructure with corporate IT environments. The security architecture almost never kept pace with that connectivity. The result is an attack surface that spans both the business network and operational technology environments—two domains with fundamentally different security tolerances, patching cycles, and recovery requirements.

The OT/IT Convergence Problem in Plain Terms

Your IT network can be taken offline, rebuilt, and restored from backup. Your production line cannot. A PLC running decades-old firmware, connected to a network that also connects to your email system, does not have a four-hour recovery window. It has a recovery window measured by how long your customers will wait before going somewhere else.

The second structural condition is production pressure. Manufacturing operates on narrow margins and tight schedules. Downtime is not an abstraction—it is a number per hour with a decimal point and a direct line to customer relationships and cash flow. When an incident occurs, the pressure to restore production is immediate and real. That pressure directly competes with the disciplined, methodical containment process that effective incident response requires. In practice, the pressure usually wins, and the response gets shortened or skipped in ways that create secondary problems.

The third condition is thin security staffing. An $80 million manufacturer with 250 employees does not have a security operations center. It has an IT manager who also handles the help desk, network infrastructure, and vendor management. That person is technically capable but structurally overwhelmed during a crisis. The IR plan assumes capabilities and bandwidth that don't exist in the building.

03

The Tabletop That Exposed Everything

Tabletop exercises work because they create consequences without creating damage. The scenario is real enough to surface genuine gaps—the confusion, the missing contacts, the process steps nobody has actually done—without burning down the house. What follows is a compressed walk-through of how the exercise unfolded and what it found.

Scenario Inject — 06:45 Tuesday

T+0:00: Plant floor supervisor calls IT manager about two workstations displaying ransom notes. IT manager is at home. Remote access to management tools takes 11 minutes to establish.
T+0:14: IR plan retrieved. First listed action: "Notify CISO." The company does not have a CISO. The role listed was a position that had been eliminated 18 months prior. Nobody updated the plan.
T+0:20: Team identifies that the plant manager must authorize production shutdown. Plant manager is not in the exercise. In a real event, he would have been on the floor with no context and no protocol for what he was being asked to decide.
T+0:31: Team attempts to contact managed backup vendor. Account manager number disconnected. Main support line requires a contract number. Contract number is in the plan document—but the contract has lapsed and the number is not in the vendor's active system.
T+0:47: Exercise facilitator asks team to demonstrate a restore from the most recent backup. Team cannot locate restore documentation. IT manager recalls doing a test restore "sometime last year." Log records indicate 14 months since last validated restore.
T+1:15: Team attempts to identify scope of infection. No network diagram with current OT/IT boundaries exists. IT manager believes the affected workstations "probably aren't connected to the line," but cannot confirm without physical inspection.
T+1:42: Exercise concludes. Four-hour RTO in the plan. Actual elapsed time to reach containment decision: still unresolved at scenario end.

None of these failures were exotic. Expired vendor contracts, outdated role assignments, untested backups, missing network documentation, absent stakeholders—these are not sophisticated security problems. They are operational maintenance failures. They happen because nobody owns the job of keeping IR readiness current as a continuous function.

The COO in the room was quiet for most of the debrief. At the end, he said something I've heard variations of in a lot of rooms: "I thought we had this covered. We paid for the plan. We passed the audit." That's the document-vs-capability gap in a sentence. Payment and passage don't produce readiness. Practice does.

The four-hour RTO looked reasonable on paper. In the room, with real people and real gaps, we couldn't make a containment decision in under two hours—and production hadn't been touched yet.

04

What Response Readiness Actually Looks Like

Auditors accept documentation. Incidents test capability. Those two standards are not the same, and organizations that conflate them are making a category error that will cost them when the event is real.

Response readiness has three components that no document alone can satisfy: people who know their role under pressure, processes that have been executed at least in rehearsal, and tools and relationships that have been verified recently enough to be trusted. Strip any one of those three and you have a plan that will fail under stress.

People Readiness

Every person in the communication chain knows they're in it. Decision authority is explicit. Escalation paths are practiced, not just written.

Process Readiness

Playbooks have been walked, not just written. Containment steps are understood by the people who will execute them, not just the person who wrote them.

Technical Readiness

Backups have been tested within 90 days. Vendor contacts are verified active. Network documentation reflects current reality. Restore procedures exist and work.

The distinction auditors miss—because most audit frameworks are not designed to test it—is temporal. A plan that was accurate and practiced twelve months ago may be neither today. People leave. Contracts expire. Networks change. Mergers happen. The audit cadence and the decay rate of IR readiness are not synchronized, and in most mid-size organizations the decay rate wins.

True readiness requires a maintenance cadence that matches the decay rate: quarterly at a minimum for contact verification and backup testing, semi-annually for tabletop exercises, and annually for full plan review incorporating any infrastructure or organizational changes. That is not a heavy lift. It is a scheduled, lightweight discipline that most organizations skip because nothing has gone wrong yet.

05

The Cost Argument: Downtime Is Not Abstract

The conversation about IR investment always comes back to cost. Quarterly exercises, backup validation, vendor management, OT/IT network documentation—these are not free. For a mid-size manufacturer operating on 8-12% EBITDA margins, every discretionary dollar is competitive. The argument for IR readiness has to be made in the same language as every other capital decision: what does it cost, and what is the risk of not doing it?

For the manufacturer in this case study, we worked through the numbers explicitly. The calculation is not complicated, but it requires intellectual honesty about what downtime actually costs when you account for all of it.

| Cost Category | Conservative Estimate | Notes |
| --- | --- | --- |
| Direct production downtime | $18,000–$24,000/hr | Based on revenue/operating hours, excluding fixed overhead absorption |
| Incident response retainer / emergency IR | $35,000–$85,000 | Emergency IR engagement without a retainer runs at premium rates; retainers average $20–40K/yr |
| Regulatory notification and legal | $15,000–$50,000 | Depends on data exposure; NY SHIELD Act notification obligations apply |
| Customer penalty clauses / SLA breach | Contractual | Manufacturing customers often have late-delivery penalties; quantify your exposure specifically |
| Cyber insurance deductible | $25,000–$100,000 | Deductibles have risen significantly; verify your current policy terms |
| Reputational / competitive impact | Unquantified | Single-source customers represent outsized risk; recovery timeline matters as much as the event itself |

A 24-hour production disruption at this manufacturer runs $430,000 to $580,000 in production losses alone. Fold in the emergency IR engagement and the regulatory and legal obligations from the table, and the conservative total lands near $480,000 to $710,000 before the insurance deductible, before any customer penalties, and before the fixed-overhead absorption the table deliberately excludes. The cost of a quarterly IR readiness program (contact verification, backup testing, tabletop exercises, maintaining an IR retainer) runs roughly $30,000 to $50,000 annually when properly scoped for a company this size.
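The arithmetic behind those ranges is worth making explicit, because the point of the exercise is to run it with your own numbers. A minimal sketch, using only the conservative figures from the table above (the function name and structure are illustrative, not any standard tool):

```python
# Hypothetical downtime-cost estimator using the conservative ranges from the
# table above. All figures are illustrative; substitute your own contract,
# policy, and production numbers.

def downtime_cost(hours, hourly_low, hourly_high, fixed_costs):
    """Return (low, high) total cost for a disruption of `hours` hours.

    fixed_costs is a list of (low, high) ranges for one-time costs such as
    an emergency IR engagement or regulatory notification.
    """
    low = hours * hourly_low + sum(lo for lo, _ in fixed_costs)
    high = hours * hourly_high + sum(hi for _, hi in fixed_costs)
    return low, high

# 24-hour disruption, production losses only
prod_low, prod_high = downtime_cost(24, 18_000, 24_000, [])
print(f"production only: ${prod_low:,} - ${prod_high:,}")   # $432,000 - $576,000

# Add emergency IR and regulatory/legal ranges from the table
low, high = downtime_cost(24, 18_000, 24_000,
                          [(35_000, 85_000),    # emergency IR engagement
                           (15_000, 50_000)])   # regulatory notification / legal
print(f"with response costs: ${low:,} - ${high:,}")         # $482,000 - $711,000
```

Deliberately excluded: the insurance deductible, contractual customer penalties, and fixed-overhead absorption, all of which push the real exposure higher.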

That is not a security spend. That is operational insurance with a calculable premium and a calculable exposure. The CFO in the room ran the math in about ninety seconds. The readiness program was approved before the debrief was over.

Your cyber insurance covers the aftermath. IR readiness compresses the duration. Those are not the same protection, and only one of them keeps your customers from calling your competitor.

One more cost consideration that doesn't get enough attention: cyber insurance carriers are paying attention to IR readiness now. Carriers who write manufacturing policies are increasingly requiring evidence of tested incident response programs—not just documented plans. Organizations without that evidence are seeing coverage terms tighten, deductibles rise, and in some cases, coverage denied for incidents where negligent readiness contributed to the extent of the damage. The market is pricing readiness. Your premium already reflects it.

06

Five Things Every Manufacturer Should Test Quarterly

This is not a comprehensive IR program. It is a minimum viable readiness cadence—the five activities that, done consistently every quarter, keep the gap between your documented plan and your actual capability from becoming dangerous. Each one takes between 30 minutes and two hours. None of them require outside consultants to execute once you've built the habit.
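A cadence like this is easy to track programmatically. The sketch below is a hypothetical staleness checker; the activity names are drawn from the failures the tabletop surfaced (expired vendor contracts, outdated roles, untested backups, missing network documentation, unverified contacts), and the 90-day thresholds are assumptions you should tune to your own environment:

```python
# Hypothetical quarterly readiness tracker. Activity names and maximum-age
# thresholds are illustrative, drawn from the gaps the tabletop surfaced;
# adjust both to your environment.
from datetime import date

# activity -> (date last completed, maximum allowed age in days)
CHECKS = {
    "verify emergency contact chain":   (date(2024, 1, 10), 90),
    "test full backup restore":         (date(2023, 11, 2), 90),
    "confirm vendor contracts active":  (date(2024, 2, 1), 90),
    "walk decision/escalation roles":   (date(2023, 6, 15), 90),
    "review OT/IT network diagram":     (date(2023, 3, 20), 90),
}

def overdue(checks, today):
    """Return the activities whose last completion is older than allowed."""
    return sorted(name for name, (last, max_age) in checks.items()
                  if (today - last).days > max_age)

for name in overdue(CHECKS, date(2024, 4, 1)):
    print("OVERDUE:", name)
```

The value is not the script; it is that "last completed" becomes a recorded date rather than a memory. Recall said six months; the logs said fourteen.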

A Note on OT-Specific Readiness

If your production environment includes networked SCADA systems, PLCs, or any operational technology connected to your IT network, add two items to your quarterly list that most generic IR frameworks omit entirely: verify that your network documentation still reflects the actual OT/IT boundary on the plant floor, and confirm that you can isolate the production network from the corporate network without triggering an uncontrolled shutdown.

The manufacturer from this case study completed their first full quarterly readiness review three weeks after the tabletop. They found two more expired vendor contracts, updated the communication chain to include the plant manager and shift supervisors, and conducted their first documented backup restore in fourteen months—which partially failed and exposed a configuration problem in their backup agent that had been silently degrading their backup integrity for six weeks.

They found it in a test. Not on a Tuesday.

That is exactly what this work is for. The plan in the binder was not the problem. The absence of a practice that keeps the plan honest was. Fix the practice. The binder takes care of itself.