Incident Response Framework
An incident response framework defines the organisational structures, decision authorities, and operational sequences that govern how security incidents progress from initial detection through complete resolution. The framework establishes predictable coordination patterns that function under pressure, when ambiguity is highest and time constraints are tightest. For mission-driven organisations handling sensitive beneficiary data across multiple jurisdictions, the framework provides the scaffolding that transforms ad-hoc reactions into structured responses that preserve evidence, limit damage, and satisfy regulatory notification requirements.
- **Security incident**: An event that compromises the confidentiality, integrity, or availability of information assets, or violates security policies. Incidents range from malware infections affecting a single device to coordinated attacks exfiltrating donor databases.
- **Incident response**: The organised approach to addressing and managing the aftermath of a security incident, with goals of limiting damage, reducing recovery time, and preventing recurrence.
- **Indicator of compromise**: Technical artefact suggesting a security incident has occurred or is in progress. Examples include unexpected network connections to known malicious infrastructure, unusual authentication patterns, or file modifications outside change windows.
- **Mean time to detect**: Average elapsed time between incident occurrence and initial identification. Industry benchmarks show detection times ranging from hours for well-monitored environments to 200+ days for organisations lacking detection capabilities.
- **Mean time to respond**: Average elapsed time between incident detection and containment. This metric directly correlates with breach cost; each hour of uncontained compromise increases data exposure and recovery complexity.
- **Containment**: Actions taken to limit incident scope and prevent further damage while preserving evidence for investigation. Containment decisions balance operational disruption against continued exposure.
- **Eradication**: Removal of threat actor presence, malicious code, and compromised credentials from the environment. Incomplete eradication enables re-compromise through persistent access mechanisms.
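The two time metrics above can be computed directly from incident records. The sketch below is illustrative only; the record structure and function names are hypothetical, not part of the framework:

```python
from datetime import datetime
from statistics import mean

# Hypothetical incident records: (occurred, detected, contained) timestamps.
incidents = [
    (datetime(2024, 3, 1, 9, 0), datetime(2024, 3, 1, 14, 0), datetime(2024, 3, 2, 10, 0)),
    (datetime(2024, 5, 10, 2, 0), datetime(2024, 5, 12, 8, 0), datetime(2024, 5, 12, 20, 0)),
]

def mttd_hours(records):
    """Mean time to detect: occurrence to identification, in hours."""
    return mean((d - o).total_seconds() / 3600 for o, d, _ in records)

def mttr_hours(records):
    """Mean time to respond: detection to containment, in hours."""
    return mean((c - d).total_seconds() / 3600 for _, d, c in records)
```

Tracking these two numbers across incidents shows whether investments in monitoring (detection) and playbooks (response) are actually shortening the exposure window.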
Incident lifecycle
The incident response lifecycle progresses through six phases that provide structure without imposing rigidity. Each phase has defined objectives, decision points, and handoff criteria that signal readiness to advance. The lifecycle is iterative rather than linear; investigation findings during containment frequently trigger returns to detection and analysis as incident scope expands.
Preparation → Detection & reporting → Analysis → Containment → Eradication → Recovery → Lessons learned, with lessons learned feeding back into preparation. A direct escalation path also runs from detection & reporting to containment when immediate action is required.

Figure 1: Incident response lifecycle phases with progression and feedback paths
Preparation establishes the capability to respond before incidents occur. This phase includes team formation, contact list maintenance, playbook development, tool deployment, and regular exercises. Organisations invest in preparation during calm periods; the investment pays returns when incidents demand immediate, coordinated action. Preparation activities include configuring logging to capture forensically relevant data, establishing communication channels that function when primary systems are compromised, and documenting system architectures that responders will need during investigation.
Detection and reporting identifies potential incidents through automated alerting, user reports, or external notification. Detection mechanisms include security monitoring tools generating alerts, staff reporting suspicious emails or system behaviour, partner organisations sharing threat intelligence, and regulators or researchers disclosing vulnerabilities. The reporting path must be unambiguous; every staff member should know exactly how to report a suspected incident without navigating organisational hierarchy.
Analysis determines whether a reported event constitutes an actual incident, assesses scope and severity, and identifies appropriate response procedures. Analysis activities include correlating alerts across systems, reviewing logs for related activity, and determining which systems and data are affected. The analysis phase produces a severity classification that drives subsequent resource allocation and notification decisions. Early, provisional classification is preferable to delayed classification; initial severity can be adjusted as understanding develops.
Containment limits incident scope while preserving evidence for investigation. Containment decisions balance operational impact against continued exposure. Disconnecting a compromised server from the network halts data exfiltration but may disrupt programme operations; the containment strategy must weigh these factors against incident severity. Short-term containment provides immediate risk reduction while long-term containment enables sustained operations during extended investigation and remediation.
Eradication removes threat actor access, malicious code, and compromised credentials. Eradication requires confidence that all persistence mechanisms are identified; partial eradication enables rapid re-compromise. For sophisticated attacks, eradication and recovery proceed system-by-system with verification at each stage. Credential resets occur only after confirming the authentication infrastructure itself is clean; resetting passwords while attackers maintain access to the identity provider accomplishes nothing.
Recovery restores systems to normal operation and confirms they function correctly. Recovery includes restoring from verified-clean backups, rebuilding compromised systems, and gradually returning services to production. Monitoring intensifies during recovery to detect any signs of continued compromise or re-attack. Recovery is not complete until systems operate normally and monitoring confirms no residual threat actor presence.
Lessons learned extracts value from the incident experience through structured review. This phase documents what happened, evaluates response effectiveness, and produces improvement recommendations. The review occurs within 14 days of incident closure while details remain fresh. Improvement actions receive owners and deadlines; without accountability, lessons learned documents accumulate without producing change.
Severity classification
Severity classification determines resource allocation, notification requirements, and escalation paths. A four-level model provides sufficient granularity without creating ambiguity about level boundaries. Classification occurs during analysis and adjusts as understanding develops; initial classification errs toward higher severity because downgrading is simpler than upgrading once response activities have begun.
| Severity | Classification criteria | Response | Team | Comms | Target |
|---|---|---|---|---|---|
| 1 Critical | Active data breach; ransomware spreading; complete outage; safety risk | Immediate | Full IR + executive | Hourly updates | 4 hr containment |
| 2 High | Confirmed intrusion; sensitive data at risk; major service impact; credential theft | Within 1 hour | IR core + specialists | 4 hr updates | 24 hr containment |
| 3 Medium | Malware (contained); policy violation; minor data exposure; failed attack | Within 4 hours | IR lead + assigned staff | Daily updates | 72 hr resolution |
| 4 Low | Suspicious activity; scan/probe detected; no confirmed impact; user report (benign) | Within 24 hours | IT staff | As needed | 5 day resolution |

Figure 2: Severity levels with classification criteria and response parameters
Severity 1 (Critical) incidents involve active, ongoing harm with organisational or safety implications. Examples include ransomware actively encrypting systems, confirmed exfiltration of beneficiary protection data, complete loss of critical services during an emergency response, or any incident creating physical safety risks for staff or beneficiaries. Critical incidents command all available resources, executive involvement, and hourly status communications. The containment target is 4 hours; every hour beyond this threshold increases damage.
Severity 2 (High) incidents involve confirmed compromise with significant potential impact that has not yet fully materialised. Examples include confirmed unauthorised access to systems containing sensitive data where exfiltration is suspected but not confirmed, theft of privileged credentials, or service disruptions affecting programme delivery. High severity incidents activate the core response team with specialist support and target containment within 24 hours.
Severity 3 (Medium) incidents involve confirmed security events with limited scope or successful containment already in place. Examples include malware detected and quarantined on a single workstation, policy violations without confirmed external impact, or exposure of non-sensitive data. Medium severity incidents assign an incident lead with supporting staff and target resolution within 72 hours.
Severity 4 (Low) incidents involve suspicious activity or minor events that require investigation but show no confirmed malicious impact. Examples include reconnaissance scanning detected at the perimeter, user reports of suspicious emails that were not clicked, or anomalies that require explanation but may prove benign. Low severity incidents are handled by IT operations staff during normal working hours with resolution targeted within 5 business days.
The classification decision tree provides structured evaluation when severity is ambiguous:
1. Safety risk to people? **Yes**: Severity 1 (Critical). **No**: continue.
2. Active data exfiltration? **Yes**: Severity 1 (Critical). **No**: continue.
3. Confirmed unauthorised access? **Yes**: is sensitive data at risk? If so, Severity 2 (High); if not, Severity 3 (Medium). **No**: continue.
4. Malware or policy breach? **Yes**: Severity 3 (Medium). **No**: Severity 4 (Low).

Figure 3: Severity classification decision tree for consistent triage
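The decision tree translates directly into code, which is useful for triage tooling or tabletop exercises. This is a minimal sketch; the function and parameter names are illustrative, not part of the framework:

```python
def classify_severity(*, safety_risk: bool, active_exfiltration: bool,
                      unauthorised_access: bool, sensitive_data_at_risk: bool,
                      malware_or_policy_breach: bool) -> int:
    """Apply the severity decision tree; returns 1 (critical) to 4 (low)."""
    if safety_risk or active_exfiltration:
        return 1  # Critical: immediate full-team response
    if unauthorised_access:
        # Confirmed access: severity hinges on data sensitivity
        return 2 if sensitive_data_at_risk else 3
    if malware_or_policy_breach:
        return 3  # Medium: contained event or policy violation
    return 4      # Low: suspicious activity, no confirmed impact
```

Keyword-only arguments force the triager to answer every question explicitly, mirroring the structured evaluation the tree is meant to enforce.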
Team structure and roles
The incident response team assembles from across the organisation, bringing together technical expertise, decision authority, and communication capability. Team composition scales with incident severity; low severity incidents require only IT staff, while critical incidents activate cross-functional teams including executive leadership.
- **Executive sponsor** (Severity 1-2 only): leadership oversight of the response.
- **Incident commander**: overall authority, resource decisions, escalation; coordinates closely with the communications lead.
- **Communications lead**: internal and external communications, stakeholder management.
- Reporting to the incident commander:
  - **Technical lead**: investigation
  - **Technical analyst**: log review
  - **Legal/privacy**: regulatory obligations
  - **Data owner**: business impact
  - **Operations lead**: service restoration
- **Supporting roles**: HR (insider threats), Facilities (physical security), Vendor liaison, Partner liaison.

Figure 4: Incident response team structure with role relationships
The incident commander holds overall authority for response decisions during active incidents. This role coordinates team activities, allocates resources, makes containment decisions, and determines when to escalate or de-escalate severity. The incident commander is not necessarily the most senior person involved; the role requires incident management skills rather than organisational rank. In small organisations, the IT manager or a senior technical staff member fills this role. During Severity 1 incidents, the incident commander reports to executive leadership but retains operational authority.
The technical lead directs investigation and remediation activities. This role coordinates evidence collection, leads analysis of compromised systems, develops and executes containment strategies, and validates eradication completeness. The technical lead possesses deep technical knowledge of organisational systems and security tools. For complex incidents, multiple technical specialists report to the technical lead, dividing work across affected systems or investigation threads.
The communications lead manages all incident-related communications. This role drafts status updates, coordinates stakeholder notifications, manages media inquiries for significant incidents, and maintains the incident timeline. Separating communications from technical work ensures consistent messaging and frees technical staff to focus on investigation. The communications lead coordinates with legal and privacy advisors on regulatory notifications.
The legal and privacy advisor provides guidance on regulatory requirements, evidence preservation, and liability considerations. This role determines notification obligations, advises on law enforcement engagement, and reviews external communications. For organisations without in-house legal counsel, external legal advisors should be identified during preparation and engaged when incidents potentially trigger regulatory notification.
Data owners represent affected business functions and provide context about data sensitivity, system criticality, and acceptable downtime. During a compromise of the grants management system, the grants team data owner advises on operational impact and helps prioritise recovery sequencing.
The operations lead coordinates service restoration and manages the interface between incident response and normal operations. This role communicates with affected users, arranges workarounds during system unavailability, and executes recovery procedures.
Role assignments for each severity level establish clear expectations:
| Role | Severity 1 | Severity 2 | Severity 3 | Severity 4 |
|---|---|---|---|---|
| Executive sponsor | Active | Informed | Not engaged | Not engaged |
| Incident commander | Dedicated | Dedicated | Part-time | IT manager |
| Technical lead | Dedicated | Dedicated | Assigned | IT staff |
| Communications lead | Dedicated | Part-time | Incident commander | Not required |
| Legal/privacy | Engaged | On standby | As needed | Not required |
| Data owner | Engaged | Consulted | Informed | Not required |
Escalation and communication
Escalation paths define how incidents gain additional resources, authority, and visibility as severity increases or circumstances change. Escalation is not failure; it is the mechanism that ensures incidents receive appropriate attention. Under-escalation leaves critical incidents without necessary resources, while over-escalation wastes leadership attention and creates unnecessary alarm.
Escalation triggers include severity increases based on new information, resource requirements exceeding available capacity, decisions requiring authority beyond the incident commander’s scope, regulatory notification thresholds being crossed, or elapsed time exceeding containment targets. When any trigger condition is met, escalation occurs immediately without waiting for the next scheduled update.
Communication cadence scales with severity. Severity 1 incidents require hourly status updates to the executive sponsor and affected stakeholders. Severity 2 incidents require updates every 4 hours during active response. Severity 3 incidents require daily updates until resolution. Severity 4 incidents require communication only on significant developments or upon resolution.
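The cadence rules above are simple enough to encode in tooling that reminds the communications lead when the next update is due. A minimal sketch, with illustrative names:

```python
from datetime import datetime, timedelta

# Update cadence per severity level; None means event-driven only.
UPDATE_INTERVAL = {
    1: timedelta(hours=1),   # Sev 1: hourly updates
    2: timedelta(hours=4),   # Sev 2: every 4 hours during active response
    3: timedelta(days=1),    # Sev 3: daily until resolution
    4: None,                 # Sev 4: on significant developments or resolution
}

def next_update_due(severity: int, last_update: datetime):
    """Return when the next scheduled update is due, or None if event-driven."""
    interval = UPDATE_INTERVAL[severity]
    return None if interval is None else last_update + interval
```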
- **Severity 4**: 0-8 hr IT staff handle; 8-48 hr IT manager informed; 48-72 hr review scope; over 72 hr escalate to Severity 3.
- **Severity 3**: 0-4 hr IR lead owns; 4-8 hr IR lead updates; 8-24 hr IT manager updated; 24-48 hr IT manager engaged; 48-72 hr executive informed; over 72 hr escalate to Severity 2.
- **Severity 2**: 0-1 hr IR team assembles; 1-4 hr executive informed; 4-8 hr executive updated; 8-16 hr executive engaged; 16-24 hr board informed; over 24 hr escalate to Severity 1.
- **Severity 1**: 0-30 min full team assembled; 30 min-1 hr executive active; 1-2 hr board chair informed; 2-4 hr board updated; 4-8 hr external legal engaged; over 8 hr crisis mode.

Figure 5: Escalation matrix showing engagement levels by severity and elapsed time
Status updates follow a structured format that enables rapid comprehension. Each update includes current severity, incident summary in 2-3 sentences, current phase, actions completed since last update, actions in progress, blockers requiring resolution, and time of next scheduled update. Consistency in format allows recipients to quickly locate relevant information across multiple updates.
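The structured format can be captured as a simple template so every update carries the same fields in the same order. This sketch assumes nothing beyond the fields listed above; class and field names are illustrative:

```python
from dataclasses import dataclass, field

@dataclass
class StatusUpdate:
    severity: int
    summary: str                 # 2-3 sentence incident summary
    phase: str                   # current lifecycle phase
    completed: list = field(default_factory=list)   # since last update
    in_progress: list = field(default_factory=list)
    blockers: list = field(default_factory=list)
    next_update: str = ""

    def render(self) -> str:
        """Emit the update in a fixed order for rapid comprehension."""
        return "\n".join([
            f"SEVERITY {self.severity} | PHASE: {self.phase}",
            f"SUMMARY: {self.summary}",
            "COMPLETED: " + "; ".join(self.completed or ["none"]),
            "IN PROGRESS: " + "; ".join(self.in_progress or ["none"]),
            "BLOCKERS: " + "; ".join(self.blockers or ["none"]),
            f"NEXT UPDATE: {self.next_update}",
        ])
```

Because the field order never varies, a recipient scanning several updates can find blockers or the next update time in the same place every time.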
Out-of-band communication channels ensure incident coordination continues when primary systems are compromised. If organisational email is affected, the team communicates through a pre-established messaging platform. If network access is lost, mobile phone numbers for all response team members are maintained in printed form. Communication channel selection is determined during preparation, not during active incidents.
External coordination
Incidents frequently require coordination with parties outside the organisation. External coordination includes regulatory notification, law enforcement engagement, vendor incident response, partner notification, and in some cases media relations. Each external relationship follows distinct protocols with different triggering conditions and communication channels.
Regulatory notification requirements vary by jurisdiction, data type, and incident characteristics. GDPR requires notification to supervisory authorities within 72 hours of becoming aware of a personal data breach likely to result in risk to individuals. Other jurisdictions impose different timeframes and thresholds. The legal and privacy advisor maintains a notification requirements matrix identifying obligations for each jurisdiction where the organisation operates.
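Because the 72-hour clock starts at awareness, not at the breach itself, it is worth computing the deadline the moment an incident is confirmed. A minimal sketch of that calculation:

```python
from datetime import datetime, timedelta, timezone

GDPR_NOTIFICATION_WINDOW = timedelta(hours=72)

def notification_deadline(awareness_time: datetime) -> datetime:
    """Deadline for notifying the supervisory authority under GDPR.
    The clock starts when the organisation becomes aware of the breach,
    not when the breach occurred."""
    return awareness_time + GDPR_NOTIFICATION_WINDOW

aware = datetime(2024, 6, 3, 10, 30, tzinfo=timezone.utc)
deadline = notification_deadline(aware)
```

Other jurisdictions would add further entries to the notification requirements matrix, each with its own window and threshold.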
Law enforcement engagement is appropriate for incidents involving criminal activity, including ransomware, business email compromise resulting in financial loss, or attacks attributed to nation-state actors. Law enforcement provides threat intelligence, may assist with attribution, and in some cases can take action against threat actors. Engagement decisions consider potential benefits against operational disruption from investigation requirements. For incidents affecting multiple organisations, sector coordination bodies such as ISACs facilitate collective engagement.
Vendor coordination becomes necessary when incidents involve managed services, cloud platforms, or third-party applications. Cloud providers may have visibility into infrastructure-level indicators that complement tenant-level logs. Managed security service providers may detect threats before the organisation does. Software vendors may have knowledge of vulnerabilities being exploited. Vendor incident response coordination follows established support channels with escalation to security-specific contacts for confirmed incidents.
Partner notification applies when incidents may affect partner organisations or shared systems. Humanitarian organisations operating in consortia share systems and data; a compromise at one partner may indicate risk to others. Notification timing balances the obligation to warn partners against premature disclosure before incident scope is understood. Partner notification typically occurs once initial analysis confirms the incident and identifies potential partner impact.
Forensic preservation
Evidence preservation occurs throughout the incident lifecycle but is most critical during containment when responders interact with compromised systems. Improperly handled evidence loses evidentiary value and may be inadmissible if legal proceedings follow. Even when legal proceedings are not anticipated, preserved evidence enables complete post-incident analysis.
Preservation priorities focus on volatile evidence that disappears when systems power down or reboot. Memory contents, running processes, network connections, and logged-in sessions exist only while systems operate. When a compromised system must be taken offline for containment, volatile evidence capture occurs first. Non-volatile evidence such as disk contents, configuration files, and log archives can be collected after containment stabilises.
Chain of custody documentation tracks evidence from collection through analysis and storage. Each evidence item receives a unique identifier, documented location, collector identity, collection timestamp, and hash value confirming integrity. Transfers between parties are logged. This documentation demonstrates that evidence presented during analysis is identical to evidence originally collected.
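A custody entry of this shape can be generated at collection time. The sketch below uses SHA-256 as the integrity hash and hashes the file in chunks so large evidence images do not need to fit in memory; the record fields mirror those listed above, and the function name is illustrative:

```python
import hashlib
from datetime import datetime, timezone

def custody_record(evidence_id: str, path: str, collector: str) -> dict:
    """Create a chain-of-custody entry: identifier, location, collector,
    timestamp, and a SHA-256 hash that later proves the item is unaltered."""
    sha256 = hashlib.sha256()
    with open(path, "rb") as f:
        # Hash in 64 KiB chunks to handle large evidence files.
        for chunk in iter(lambda: f.read(65536), b""):
            sha256.update(chunk)
    return {
        "evidence_id": evidence_id,
        "location": path,
        "collector": collector,
        "collected_at": datetime.now(timezone.utc).isoformat(),
        "sha256": sha256.hexdigest(),
    }
```

Recomputing the hash at analysis time and comparing it against the recorded value demonstrates that the evidence examined is identical to the evidence collected.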
Forensic preservation procedures are detailed in the Evidence Collection task. This framework establishes the principles; procedures provide step-by-step collection instructions.
Post-incident review
The post-incident review extracts learning from incident experience and translates findings into improvement actions. Reviews occur within 14 days of incident closure, when details remain fresh and participants are available. Delayed reviews produce less accurate findings as memory fades and staff attention moves to other priorities.
Effective reviews distinguish root cause analysis from blame assignment. The question is not who made mistakes but what systemic factors enabled those mistakes. A staff member clicking a phishing link is not the root cause; the root cause may be lack of training, email filtering gaps, or process pressure that discouraged verification. Root cause analysis traces the causal chain until reaching factors the organisation can address.
Review outputs include an incident timeline reconstructing key events, analysis of what worked well in the response, identification of what could be improved, and specific improvement actions with owners and target dates. Improvement actions are tracked through normal project or change management processes; logging actions without tracking completion produces documentation without improvement.
Recurring themes across multiple incident reviews indicate systemic issues warranting investment. If three incidents in six months involve compromised credentials without MFA, the pattern indicates MFA deployment should be prioritised. Trend analysis across reviews provides evidence for resource requests and security programme prioritisation.
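Tagging each review's root-cause findings makes this trend analysis mechanical. A minimal sketch with hypothetical tag data:

```python
from collections import Counter

# Hypothetical root-cause tags from post-incident reviews over six months.
review_findings = [
    ["credentials-no-mfa", "phishing"],
    ["credentials-no-mfa", "logging-gap"],
    ["unpatched-service"],
    ["credentials-no-mfa"],
]

# Count how often each tag appears across reviews.
theme_counts = Counter(tag for review in review_findings for tag in review)

# Tags recurring three or more times indicate a systemic issue to prioritise.
recurring = [tag for tag, n in theme_counts.most_common() if n >= 3]
```

Here the repeated "credentials-no-mfa" tag is exactly the pattern described above: evidence that MFA deployment belongs at the top of the programme plan.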
Implementation considerations
For organisations with limited IT capacity
Organisations with single-person IT functions or no dedicated IT staff cannot maintain standing incident response teams. The framework adapts to resource constraints through pre-incident preparation that reduces decisions required during response.
Pre-defined escalation to external support occurs automatically at specified severity thresholds. A managed security service provider, IT support partner, or sector organisation may provide incident response assistance. The preparation phase establishes these relationships and documents engagement procedures. When Severity 1 or 2 incidents occur, external assistance activates immediately rather than waiting until internal capacity is exhausted.
Simplified severity classification uses a two-level model: incidents requiring external help versus incidents manageable with internal resources. This binary classification reduces decision complexity during high-stress situations.
Playbooks for the highest-probability incidents provide step-by-step guidance that reduces expertise requirements. An organisation experiencing ransomware follows the ransomware playbook rather than making novel decisions under pressure. Playbooks are maintained in the Procedures collection with specific response sequences for common incident types.
Communication templates prepared during calm periods enable rapid stakeholder notification during incidents. Templates include placeholders for incident-specific details while providing pre-approved language for sensitive communications.
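Such templates can be stored with explicit placeholders and filled in per incident. This sketch uses Python's standard `string.Template`; the template text and placeholder names are illustrative, not pre-approved language:

```python
from string import Template

# Hypothetical pre-drafted notification; $placeholders are filled per incident.
BREACH_NOTICE = Template(
    "On $date we identified a security incident affecting $system. "
    "We have $containment_status and are investigating the scope. "
    "Affected parties will receive a further update by $next_update."
)

notice = BREACH_NOTICE.substitute(
    date="3 June 2024",
    system="the grants management platform",
    containment_status="contained the affected systems",
    next_update="17:00 UTC",
)
```

`substitute` raises an error if any placeholder is left unfilled, which guards against sending a notice with a gap where an incident detail should be.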
For organisations with established IT functions
Organisations with dedicated IT teams and security staff implement the full framework with internal capability for most response activities.
Internal detection capability through security monitoring tools enables rapid identification of incidents that would otherwise be detected only through external notification or obvious impact. Investment in detection reduces mean time to detect, directly limiting incident damage.
Tiered response teams provide depth for sustained incidents. Initial response team members can be relieved after 8-12 hours by secondary team members, preventing exhaustion during extended response operations. Rotation schedules ensure continuous coverage without individual burnout.
Regular exercises test framework components and build team competence. Tabletop exercises walk through incident scenarios in a discussion format. Technical exercises simulate actual incidents in test environments. Full-scale exercises combine tabletop decision-making with technical execution. Organisations conduct at least quarterly tabletop exercises and annual technical exercises.
Integration with governance structures ensures incident response aligns with organisational risk management. Security incidents are reported to risk committees. Significant incidents trigger board notification. Post-incident review findings feed into security programme planning.
For organisations in high-risk contexts
Organisations operating in hostile environments face incident scenarios that include physical safety dimensions. Device confiscation by authorities, targeted surveillance, and insider threats from compromised staff require adaptations beyond standard incident response.
Safety-first escalation overrides normal severity classification when incidents create physical risk. Any incident potentially endangering staff or beneficiaries immediately becomes Severity 1 regardless of technical characteristics.
Operational security considerations may limit incident documentation. Detailed incident records are themselves sensitive if they reveal security practices to hostile actors. Organisations balance thorough documentation against documentation risks, potentially maintaining sensitive records separately or applying additional access controls.
Alternative communication channels assume primary systems are compromised or monitored. Satellite communication, pre-positioned devices, or trusted partner networks provide coordination capability when organisational infrastructure is unavailable or unsafe.
Coordination with security departments integrates information security incidents with physical security incident management. A cyber attack against staff may precede or accompany physical threats; unified incident awareness enables appropriate protective response.