1. What does IT resilience mean? #

IT resilience is the ability of an IT infrastructure to remain operational under adverse conditions, or to restore operability within a defined timeframe. The concept goes beyond classical high availability: availability protects against individual component failures. Resilience protects against scenarios where entire systems, locations, or infrastructure layers fail simultaneously.

Resilience vs. availability vs. security #

| Concept | Protection goal | Typical scenario | Example measure |
| --- | --- | --- | --- |
| Availability | Individual components fail, system keeps running | Disk failure, server failure | RAID, clustering, redundancy |
| Security | Preventing attacks | Malware infection, phishing | Firewall, EDR, patch management |
| Resilience | Becoming operational again after total failure | Ransomware encrypts everything, data center burns | Air-gap backup, DR site, recovery runbook |

The critical insight: Availability and security can fail. Resilience must not fail — it is the last safety net when all other layers have been breached.

Why IT resilience is now a board-level priority #

Three developments make IT resilience an executive responsibility:

  1. Ransomware as an existential threat: A successful attack can cause weeks to months of operational downtime. Organizations that cannot recover do not survive.

  2. Regulatory pressure: NIS2, the KRITIS umbrella law, and sector-specific regulation (such as BAIT) make resilience a legal obligation, with personal liability for management.

  3. Supply chain dependencies: A failure at a critical supplier or cloud provider can interrupt entire value chains. Resilience must be considered beyond the organization’s own boundaries.

Blog Post | 5/4/2026
What Is IT Resilience? Definition and Distinctions
IT resilience is the ability of an organization to restore its information systems and critical business processes to a functional state quickly after a disruption. It is not the ability to avoid disruptions: it is the ability to absorb them, deal with them, and recover from them. As cyberattacks grow more frequent and more sophisticated, resilience has become an existential question for every organization.

The definition can be made more precise: IT resilience is the combination of **prevention, detection, response, recovery, and continuous adaptation**. It responds to the central insight of modern IT security: not if, but when you become the target of an attack. This posture distinguishes resilience fundamentally from security alone.

The concept is frequently confused with two related but distinct terms: **high availability** and **IT security**. A clear distinction is essential.

---
Blog Post | 5/6/2026
IT Resilience vs. IT Security: Where Is the Difference?
This is the central confusion in modern IT practice. Many organizations believe that investments in IT security (firewall, EDR, multi-factor authentication) are sufficient to handle cyberattacks. The data says otherwise: according to the Veeam Ransomware Trends Report 2025, about 7 in 10 organizations were hit by at least one ransomware attack in the preceding year, despite improved defenses, and in 89% of those attacks the adversaries targeted the backup repositories.

This reveals the fundamental problem: IT security alone does not provide adequate protection. A solid understanding of the difference between security and resilience is the first step toward a more effective defensive strategy.

---

2. The five pillars of IT resilience #

IT resilience is not a single product or a single measure — it is an architectural principle resting on five pillars.

Pillar 1: Prevention #

Preventing attacks and failures where possible.

  • Patch management and vulnerability management
  • Endpoint Detection and Response (EDR)
  • Network segmentation
  • Zero-trust architecture
  • Security awareness training

Reality check: Prevention reduces risk but does not eliminate it. Attackers often remain undetected in networks for weeks, sometimes months — many attacks are only discovered after the damage has already been done.

Pillar 2: Detection #

Identifying attacks and anomalies before maximum damage occurs.

  • SIEM (Security Information and Event Management)
  • Network Detection and Response (NDR)
  • Anomaly detection in backup systems
  • Log analysis and correlation
  • 24/7 Security Operations Center (SOC)

Pillar 3: Response (incident response) #

Acting quickly and in a structured way when an incident occurs.

  • Incident response plan with defined roles and escalation steps
  • Communication plan (internal and external)
  • Forensic analysis capability
  • Coordination with authorities (BSI, state criminal police offices)
  • Documentation obligations under NIS2

Pillar 4: Recovery #

The most critical pillar: restoring systems and data within an acceptable timeframe.

  • Multi-tier backup architecture with air-gap layer
  • Documented recovery time objectives (RTO) and recovery point objectives (RPO)
  • Recovery runbooks for all critical systems
  • Regular recovery tests (quarterly)
  • Prioritized recovery sequence

Why recovery is the decisive pillar: Prevention, detection, and response can fail. Recovery is the point where it is decided whether an organization survives or not. And recovery only works when the data from which you are recovering has not also been compromised.

Pillar 5: Adaptation #

Learning from incidents and continuously improving resilience.

  • Post-incident reviews (lessons learned)
  • Adapting architecture to new threats
  • Tabletop exercises and simulations
  • Annual architecture reviews
  • Information exchange within sector CERTs and ISACs
Blog Post | 5/12/2026
The 5 Pillars of IT Resilience: A Practical Framework
A robust IT resilience strategy does not rest on a single pillar. It rests on five. Each pillar has specific technologies, processes, and responsibilities. Many organizations invest heavily in Pillar 1 (Prevention) and neglect the other four. That is a classic mistake that leads to vulnerability, and it is also a compliance gap: NIS2 (Directive (EU) 2022/2555) explicitly requires backup management, disaster recovery, and crisis management alongside preventive measures.

Here is a practical framework showing what belongs to each pillar and how to implement it.

---
Blog Post | 3/3/2026
Creating an Incident Response Plan: Template and Guide
An incident response plan (IRP) is the backbone of resilience. It is the document that prepares your organization for a cyberattack before it happens. A well-structured IRP significantly reduces response time and minimizes the extent of damage. Yet many organizations have none, or only an outdated concept that nobody has tested.

Under the NIS2 Directive (Directive (EU) 2022/2555), incident handling is an explicit risk management requirement for essential and important entities. Here is a concrete template with 8 required sections that every IR plan must have.

---

3. Cyber resilience: When prevention is not enough #

Cyber resilience is the specialization of IT resilience for cyberattacks. It addresses a specific problem: cyberattacks, in particular ransomware, are designed not only to disrupt individual systems but to destroy the entire recovery capability.

The ransomware dilemma #

Modern ransomware specifically targets backup infrastructure. This means: the classical disaster recovery plan, which assumes that backups are intact, no longer works.

The scenario that cyber resilience must solve:

  • Production systems: encrypted ✗
  • Active Directory: compromised ✗
  • Online backup: deleted ✗
  • Cloud backup: deleted via compromised IAM credentials ✗
  • Air-gap backup: intact ✓ — was physically unreachable

Cyber resilience means: even in the absolute worst case, where an attacker had domain administrator rights and went undetected for weeks, at least one recovery path remains intact.

The three principles of cyber resilience #

Principle 1: Assume breach Assume your network will be compromised. Build your recovery architecture to work even then.

Principle 2: Isolated recovery capability At least one recovery path must be physically separated from the production network — not just logically, not just through software policies, but physically unreachable.

Principle 3: Verified recoverability A backup that has never been tested is not a recovery plan — it is an assumption. Quarterly recovery tests are the minimum.

Cyber resilience architecture: Three zones #

Zone 1: Production zone
├── Servers, VMs, databases, applications
├── Network-connected systems
└── Attack surface: HIGH

Zone 2: Backup zone (network-connected)
├── Primary backup repository
├── Snapshot immutability (supplementary)
└── Attack surface: MEDIUM (credentials-reachable)

Zone 3: Isolated recovery zone (air gap)
├── Hardware air gap system
├── Only reachable during backup windows
├── No network interface when offline
└── Attack surface: MINIMAL

Zone 3 is the cyber resilience insurance: Even if Zone 1 and Zone 2 are fully compromised, data in Zone 3 remains intact.
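
The zone model can be checked against a backup inventory with a few lines of code. The following is a minimal sketch, not a reference implementation: the `BackupCopy` type, its fields, and the sample inventory are hypothetical and only illustrate the rule that a copy counts as a surviving recovery path if an attacker can reach it neither over the network nor with stolen credentials.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    name: str
    zone: int                 # 1 = production, 2 = backup, 3 = isolated recovery
    network_reachable: bool   # reachable from the production network
    shares_credentials: bool  # deletable with production or IAM admin credentials


def surviving_copies(copies):
    """Return the copies an attacker with full admin rights could not touch."""
    return [
        c for c in copies
        if not c.network_reachable and not c.shares_credentials
    ]


# Illustrative inventory (assumption) mirroring the worst-case scenario above.
inventory = [
    BackupCopy("online backup", zone=2, network_reachable=True, shares_credentials=True),
    BackupCopy("cloud backup", zone=2, network_reachable=True, shares_credentials=True),
    BackupCopy("air-gap backup", zone=3, network_reachable=False, shares_credentials=False),
]

survivors = surviving_copies(inventory)
print([c.name for c in survivors])
```

If `surviving_copies` returns an empty list for your own inventory, the architecture has no recovery path under the assume-breach premise.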

Blog Post | 5/8/2026
Cyber Resilience vs. IT Security: Why Both Are Necessary
Cyber resilience is not an alternative to IT security. It is a **specialisation of IT resilience, focused on cyberattacks**. The distinction matters, because it shows: you cannot neglect one of the two and hope the other will be sufficient.

IT resilience is broad. It covers natural disasters, hardware failures, software bugs, human error, and cyberattacks.

Cyber resilience is narrow. It deals specifically with recovery from malicious, intelligent attacks designed to destroy your backups and sabotage your recovery.

That makes cyber resilience harder: it requires more layers of defence, because the adversary reacts intelligently. The Veeam Ransomware Trends Report 2025 found that in 89% of ransomware attacks, the attackers went after the backup repositories, and on average about a third of those repositories were modified or deleted.

---
Blog Post | 5/21/2026
Assume Breach: The Design Principle That Changes Your Architecture
"Assume Breach" is not just a security slogan. It is a fundamental design principle that reshapes the entire architecture of an organization. Think it through consistently, and you have to rebuild parts of your IT.

The concept is simple: **not if, but when will your organization be attacked and compromised?**

This is not pessimism. The data is unambiguous: in the Veeam Ransomware Trends Report 2025, roughly 7 in 10 organizations reported at least one ransomware attack in the preceding year, despite improved defenses. For exposed industries (financial services, healthcare, manufacturing), the question is realistically only: when?

---
Blog Post | 5/25/2026
Isolated Recovery Environment: Building a Protected Recovery Zone
An Isolated Recovery Environment (IRE), sometimes called a cleanroom, is not a single device. It is an infrastructure zone that is completely isolated from the production network. It is the place where you restore, verify, and clean compromised systems before returning them to production.

Without an IRE, recovery in a compromised network is a gamble: the restored server gets reinfected before you can use it.

---

4. Business continuity and disaster recovery #

Business Continuity Management (BCM) #

BCM is the organizational framework within which IT resilience operates. BCM defines:

  • Critical business processes: Which processes must be restored first?
  • Maximum Tolerable Downtime (MTD): How long can a process be unavailable before the organization suffers existentially threatening damage?
  • Business Impact Analysis (BIA): What financial, operational, and reputational damage occurs per hour of downtime?

RTO and RPO: The two metrics that determine everything #

| Metric | Meaning | Example | Determined by |
| --- | --- | --- | --- |
| RTO (Recovery Time Objective) | Maximum acceptable downtime | 4 hours: ERP system | Business requirement |
| RPO (Recovery Point Objective) | Maximum acceptable data loss | 1 hour: transaction data | Backup frequency |

The most common mistake: RTO and RPO are defined but never tested against the actual backup architecture. An RTO of 4 hours is worthless if an actual restore takes 48 hours.
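
Comparing targets with measured values is simple enough to automate after every recovery test. The sketch below assumes a hypothetical `RecoveryTest` record; the field names and sample values are illustrative. It uses the 4-hour RTO and the 48-hour restore from the text above, and the 1-hour RPO from the table is applied to the ERP system purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryTest:
    system: str
    rto_hours: float                 # agreed maximum downtime
    rpo_hours: float                 # agreed maximum data loss
    measured_restore_hours: float    # duration of the last full test restore
    measured_data_loss_hours: float  # age of the newest data that could be restored


def find_gaps(test):
    """Return findings where the measured values exceed the agreed targets."""
    findings = []
    if test.measured_restore_hours > test.rto_hours:
        findings.append(
            f"{test.system}: restore took {test.measured_restore_hours} h, "
            f"RTO is {test.rto_hours} h"
        )
    if test.measured_data_loss_hours > test.rpo_hours:
        findings.append(
            f"{test.system}: lost {test.measured_data_loss_hours} h of data, "
            f"RPO is {test.rpo_hours} h"
        )
    return findings


# Illustrative values (assumption): targets are met on paper, but not in the test.
erp = RecoveryTest("ERP system", rto_hours=4, rpo_hours=1,
                   measured_restore_hours=48, measured_data_loss_hours=0.5)

for finding in find_gaps(erp):
    print(finding)
```

An empty result means the last test met both targets. It says nothing about systems that were never tested.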

Disaster recovery: The plan for the worst case #

A disaster recovery plan documents exactly how systems are restored after a total failure. It must contain the following elements:

  1. Trigger criteria: When is the plan activated?
  2. Roles and responsibilities: Who decides, who acts?
  3. Recovery sequence: Which systems first?
  4. Technical recovery steps: Step-by-step instructions per system
  5. Communication plan: Who is informed when and how?
  6. Success criteria: How do we know recovery is complete?

Critical: The plan must be available offline — printed, in a safe. If your IT infrastructure is compromised, your SharePoint folder with the plan may not be accessible either.
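
The recovery sequence (element 3) follows from technical dependencies: a system can only be restored once everything it depends on is back. A minimal sketch using Python's standard `graphlib` module is shown below. The dependency map is a hypothetical example, not a recommendation; real dependencies must come from your own runbooks.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map (assumption): each system lists the systems
# that must be restored before it.
depends_on = {
    "Active Directory": set(),
    "ERP system": {"Active Directory"},
    "Email system": {"Active Directory"},
    "File server": {"Active Directory"},
}

# static_order() yields every system after all of its prerequisites.
# It raises graphlib.CycleError if the dependencies are circular.
recovery_order = list(TopologicalSorter(depends_on).static_order())

for position, system in enumerate(recovery_order, start=1):
    print(position, system)
```

The order among systems with the same prerequisites is not fixed by the sort. That is where business priority from the BIA decides.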

Typical RTO values by backup architecture #

| System | Target RTO (typical) | RTO with online backup | RTO with hardware air gap |
| --- | --- | --- | --- |
| Active Directory | 1 – 2 hours | 1 hour | 2 – 4 hours |
| ERP system | 4 hours | 2 hours | 4 – 8 hours |
| Email system | 4 hours | 1 hour | 4 – 6 hours |
| File server (10 TB) | 8 hours | 4 hours | 6 – 10 hours |
| Full environment | 24 hours | 8 hours* | 12 – 24 hours |

*Online backup: RTO only achievable if backup was not compromised — no guarantee in a ransomware attack.

Blog Post | 6/4/2026
Creating a Business Continuity Plan: Guide for IT Leaders
A Business Continuity Plan (BCP) is not just an IT document. It is the written strategy for how an organization maintains (or quickly restores) its critical business processes when a disruption occurs. A cyberattack, a natural disaster, a building failure: the BCP covers all of it.

Many IT leaders confuse the BCP with a DR Plan (Disaster Recovery Plan). That is a mistake. The DR Plan is technical (how do we bring systems back up?). The BCP is business-oriented (which processes are critical, and how long can they be down?).

---
Blog Post | 6/1/2026
Defining RTO and RPO Correctly: A Practical Guide
RTO (Recovery Time Objective) and RPO (Recovery Point Objective) are the most critical metrics in any resilience strategy. They answer two questions:

- **RTO:** How long can my system be down?
- **RPO:** How much data loss can I tolerate?

The problem: many organizations "estimate" RTO/RPO based on gut feeling or IT tradition. That is the wrong approach. RTO/RPO must be derived from a **Business Impact Analysis (BIA)**, not the other way around. The BIA-first approach is also what the relevant standards expect: ISO 22301 builds the entire BCM system on it, and NIS2 (Directive (EU) 2022/2555) requires risk-based backup management and disaster recovery.

---

5. The role of backup architecture #

Backup as the foundation of resilience #

The backup architecture is the technical foundation of every resilience strategy. Without intact backups, there is no recovery — and without recovery, there is no resilience.

The multi-tier reference architecture #

A resilient backup architecture works in tiers — each tier addresses a different risk scenario:

Tier 1 — Primary backup (online)

  • Network-connected backup repository
  • Function: Fast recovery of individual files and VMs
  • RPO: 1 – 4 hours | RTO: < 1 hour
  • Risk: Credentials-reachable — potentially compromised by ransomware

Tier 2 — Air-gap layer (physically isolated)

  • Hardware air gap system (e.g. Silent Brick System)
  • Function: Ransomware-resistant recovery of entire systems
  • RPO: 24 hours | RTO: 4 – 8 hours
  • Risk: Minimal — physically unreachable outside backup windows

Tier 3 — Long-term archive (immutable)

  • Immutable archive (e.g. Silent Cubes)
  • Function: Audit-proof long-term retention, historical recovery
  • RPO: 7 days | RTO: 8 – 24 hours
  • Risk: Very low — written data physically immutable

Tier 4 — Geographic redundancy

  • Off-site replication to a second location
  • Function: Protection against site-wide disasters
  • RPO: 4 – 24 hours | RTO: 4 – 24 hours

Why the air-gap layer is decisive #

In a ransomware situation, Tier 1 (online backup) and Tier 4 (off-site replication) are potentially compromised, because both are network-reachable. Tier 3 (the immutable archive) protects archive data but not necessarily current backup generations.

Tier 2, the air-gap layer, is the resilience insurance: it contains current backup data that the attacker could not physically reach.
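
The effect on data loss can be made concrete with the tier values from the reference architecture. The sketch below is a simplification under stated assumptions: the `Tier` type and the `compromisable` flag are hypothetical, RPO and RTO use the upper bounds given above, and Tier 1 and Tier 4 are treated as lost once the network is compromised.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    rpo_hours: float     # upper bound from the tier description
    rto_hours: float     # upper bound from the tier description
    compromisable: bool  # assumption: an attacker on the network can alter or delete it


TIERS = [
    Tier("Tier 1: primary backup (online)", rpo_hours=4, rto_hours=1, compromisable=True),
    Tier("Tier 2: air-gap layer", rpo_hours=24, rto_hours=8, compromisable=False),
    Tier("Tier 3: long-term archive", rpo_hours=168, rto_hours=24, compromisable=False),
    Tier("Tier 4: geographic redundancy", rpo_hours=24, rto_hours=24, compromisable=True),
]


def best_surviving_tier(tiers, network_compromised):
    """Return the usable tier with the smallest data loss, or None if none is left."""
    usable = [t for t in tiers if not (network_compromised and t.compromisable)]
    return min(usable, key=lambda t: t.rpo_hours) if usable else None


normal = best_surviving_tier(TIERS, network_compromised=False)
attack = best_surviving_tier(TIERS, network_compromised=True)
print(normal.name, normal.rpo_hours)
print(attack.name, attack.rpo_hours)
```

Without Tier 2 in the list, the same attack scenario falls back to the archive tier with its 7-day RPO.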


NIS2 Directive: Resilience is no longer a recommendation #

The NIS2 Directive and the NIS2 transposition law make IT resilience a legal obligation for thousands of organizations. §30 BSIG-new specifically requires:

| NIS2 requirement (§30 BSIG-new) | Resilience measure |
| --- | --- |
| Backup management and recovery | Multi-tier backup architecture with defined RTO/RPO |
| Crisis management | DR plan with roles, escalation, communication |
| Supply chain security | Assessment of backup hardware and software vendors |
| Incident handling | Incident response plan with forensic capability |
| Business continuity | BIA, BCM plan, regular testing |

Personal liability of management #

NIS2 tightens liability: managing directors and board members are personally liable for ensuring that appropriate risk management measures are implemented. “We did not know” is not a defense: the NIS2 transposition law requires management to inform themselves regularly about the cybersecurity situation and to approve measures.

KRITIS umbrella law: Physical and IT resilience converge #

The KRITIS umbrella law extends the resilience concept to physical security. For operators of critical infrastructure, this means: IT resilience and physical resilience must be planned together. A data center requires not only ransomware protection but also protection against power failure, flooding, and physical access.

What auditors will check #

Affected entities must expect audits. Typical checkpoints in the area of resilience:

  • [ ] Is a documented data backup concept in place? (BSI CON.3)
  • [ ] Are RTO/RPO documented per system and verified through tests?
  • [ ] Is a physically separated (air-gapped) backup in place?
  • [ ] Are recovery tests conducted regularly and documented?
  • [ ] Is a disaster recovery plan with defined roles and communication paths in place?
  • [ ] Is the plan available offline (printed, in a safe)?
  • [ ] Are backup systems managed with separate administrator accounts?
  • [ ] Is management informed about the resilience measures?
Blog Post | 1/13/2026
NIS2 and IT Resilience: What the Directive Specifically Requires
Directive (EU) 2022/2555 (NIS2) applies across the EU. Member states had to transpose it by 17 October 2024; Germany, for example, did so with the NIS2UmsuCG, in force since 6 December 2025 and without a general transition period. For organisations in scope, NIS2 demands more than cybersecurity controls: Article 21 explicitly requires **IT resilience**, meaning the proven ability to keep operating and to recover after an incident.

This is not a checkbox exercise. It is governance with personal consequences for management, which must approve the measures, oversee their implementation, and attend training.

---
Blog Post | 1/21/2026
Personal Liability Under NIS2: What Executives Need to Know

This is uncomfortable but important: under NIS2, cybersecurity is explicitly a management duty, and breaching it can cost executives personally. Article 20 of Directive (EU) 2022/2555 requires the management body to approve the cybersecurity risk management measures, oversee their implementation, and attend training. Member states must ensure that management can be held liable for infringements of these duties. National implementation acts spell this out; in Germany, for example, the amended BSIG makes executives liable towards their own company for culpable breaches of these duties, and that claim targets personal assets. This article explains how the liability works across the EU and what executives can do to minimise it.

---

7. Resilience maturity: Where does your organization stand? #

The resilience maturity model #

Use this model for self-assessment. Where does your organization fit?

Level 1 — Reactive (unprepared)

  • No documented disaster recovery plan
  • Backups exist but are never tested
  • RTO/RPO not defined
  • No incident response process
  • Risk: Existentially threatening in a ransomware attack

Level 2 — Basic (partially prepared)

  • Disaster recovery plan exists but is outdated
  • Backups run, but recovery tests are rare
  • RTO/RPO defined but not verified
  • Backup systems managed with production credentials
  • Risk: Weeks to months of downtime possible

Level 3 — Defined (structured preparation)

  • Current disaster recovery plan with defined roles
  • Regular backups with occasional recovery tests
  • RTO/RPO documented and reflected in backup architecture
  • Network segmentation in place
  • Risk: Days to weeks of downtime in a ransomware attack

Level 4 — Managed (resilient)

  • Multi-tier backup architecture with air-gap layer
  • Quarterly recovery tests with documented results
  • Separate administrator accounts for backup systems
  • Incident response plan with regular exercises
  • Disaster recovery plan available offline
  • Risk: Hours to days of downtime — controllable

Level 5 — Optimized (cyber-resilient)

  • Automated disaster recovery with verified recovery
  • Cyber resilience architecture with isolated recovery zone
  • Annual tabletop exercises and red team tests
  • Continuous improvement after every incident
  • NIS2/KRITIS-compliant with full documentation
  • Risk: Controlled — recovery within defined timeframes demonstrated
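
For a quick self-assessment, the model can be condensed into a cumulative checklist: an organization reaches a level only if it meets the criteria of that level and of every level below it. The sketch below is a deliberate simplification. The criterion names and their grouping are hypothetical shorthand for the bullets above, not an official scoring scheme.

```python
# Criteria per level, condensed from the maturity model above (assumption).
CRITERIA = {
    2: {"dr_plan_exists", "backups_running", "rto_rpo_defined"},
    3: {"dr_plan_current", "rto_rpo_in_architecture", "network_segmentation"},
    4: {"air_gap_layer", "quarterly_recovery_tests",
        "separate_backup_admins", "dr_plan_offline"},
    5: {"isolated_recovery_zone", "tabletop_and_red_team",
        "post_incident_improvement"},
}


def maturity_level(fulfilled):
    """Return the highest level whose criteria, and all lower ones, are met."""
    level = 1
    for candidate in sorted(CRITERIA):
        if CRITERIA[candidate] <= fulfilled:  # subset test
            level = candidate
        else:
            break
    return level


# Illustrative organization: solid basics, but no air gap and no regular tests.
example = CRITERIA[2] | CRITERIA[3] | {"dr_plan_offline"}
print(maturity_level(example))
```

The cumulative rule is intentional: an air-gap system bought without defined RTO/RPO does not lift an organization to Level 4.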

Where most organizations stand #

Based on our experience from over 2,500 installations, most German organizations are at Level 2 or 3 — they have backups and basic processes, but no demonstrated recovery capability when a ransomware attack also hits the backup infrastructure.

The jump from Level 3 to Level 4 — introducing an air-gap layer and regular recovery tests — is the single most impactful step to increase IT resilience.

8. Building a resilient IT architecture #

The 10-point plan for IT resilience #

| No. | Measure | Priority | Timeframe |
| --- | --- | --- | --- |
| 1 | Conduct Business Impact Analysis | Critical | 2 weeks |
| 2 | Define RTO/RPO per critical system | Critical | 1 week |
| 3 | Introduce air-gap layer in backup architecture | Critical | 4 – 8 weeks |
| 4 | Create recovery runbooks for all critical systems | High | 2 – 4 weeks |
| 5 | Set up separate backup administrator accounts | High | 1 week |
| 6 | Establish quarterly recovery tests | High | Ongoing |
| 7 | Create and practice incident response plan | High | 2 – 4 weeks |
| 8 | Make DR plan available offline | Medium | 1 day |
| 9 | Document BSI CON.3 data backup concept | Medium | 2 weeks |
| 10 | Annual architecture reviews and tabletop exercises | Medium | Ongoing |

Quick wins: What you can do this week #

  1. Create a backup inventory: List all backup systems. For each: could an attacker with admin credentials delete it? If yes — it is at risk.
  2. Find out when the last recovery test was: When was the last complete restore tested? If the answer is “never” or “over a year ago”, that is a critical gap.
  3. Print the DR plan: If your DR plan only exists digitally, print the most critical sections and place them in a safe.
  4. Request a resilience assessment: Have your architecture assessed by an outside perspective — fresh eyes see gaps that routine overlooks.


IT security aims to prevent attacks. IT resilience ensures that the organization can become operational again after a successful attack. Security is a subset of resilience — resilience additionally encompasses recovery, business continuity, and adaptability.

Every organization with real ransomware risk benefits from an air-gap layer. For NIS2-affected organizations and KRITIS operators, physically isolated backup is in practice a regulatory obligation.

The cost of a resilient architecture is a fraction of an uncontrolled outage. According to Bitkom 2024, a ransomware attack causes an average of EUR 5.3 million in damage (estimate based on aggregated total damage figures). An air-gap backup solution costs a fraction of that depending on capacity — and reduces downtime from weeks to hours.

Quarterly recovery tests of critical systems are the minimum, as recommended by both BSI (CON.3.A11) and NIS2 requirements. Additionally, a complete recovery test of all critical systems with timing against RTO targets should be conducted annually.

IT resilience can be measured. The most important metrics are: (1) RTO: measured in the recovery test, not estimated; (2) RPO: actual data loss in the recovery test; (3) backup success rate: share of successful backup jobs; (4) recovery success rate: share of successful restore tests; (5) Time to Detect (TTD) and Time to Respond (TTR) for incidents.
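
Metrics (3) to (5) can be computed directly from job logs and incident records. The following is a minimal sketch with invented sample data; the function names and the record layout (lists of booleans, pairs of timestamps) are assumptions for illustration.

```python
from datetime import datetime
from statistics import mean


def success_rate(results):
    """Share of successful runs, given a list of booleans."""
    return sum(results) / len(results) if results else 0.0


def mean_hours(intervals):
    """Mean duration in hours for (start, end) datetime pairs."""
    return mean((end - start).total_seconds() / 3600 for start, end in intervals)


# Invented sample data (assumption): 100 backup jobs, 4 restore tests, 2 incidents.
backup_jobs = [True] * 97 + [False] * 3
restore_tests = [True, True, False, True]
compromise_to_detection = [
    (datetime(2026, 1, 5, 8, 0), datetime(2026, 1, 5, 20, 0)),   # 12 hours
    (datetime(2026, 3, 2, 9, 0), datetime(2026, 3, 2, 15, 0)),   # 6 hours
]

backup_success = success_rate(backup_jobs)
recovery_success = success_rate(restore_tests)
time_to_detect = mean_hours(compromise_to_detection)

print(f"backup success rate: {backup_success:.0%}")
print(f"recovery success rate: {recovery_success:.0%}")
print(f"mean time to detect: {time_to_detect:.1f} h")
```

Time to Respond can be computed the same way from pairs of detection and containment timestamps.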

§30 BSIG-new requires: backup management and recovery, crisis management, business continuity, incident handling, supply chain security, and vulnerability management. Management is personally liable for implementation. Fines: up to EUR 10m or 2% of global annual revenue.

Disclaimer

This article was written by our editorial team and edited using AI. It provides a general overview and does not constitute legal advice – we recommend seeking professional advice for your specific situation.