Security and Monitoring
Security and monitoring tools provide visibility into infrastructure health, application performance, and security events across distributed systems. This category encompasses Security Information and Event Management (SIEM) platforms that aggregate and analyse security logs, metrics collection systems that capture time-series performance data, and log aggregation platforms that centralise application and system logs for analysis.
This page covers five open source platforms that address overlapping but distinct monitoring functions: Wazuh provides integrated SIEM and endpoint detection capabilities; OpenSearch serves as a search and analytics engine with built-in security analytics; Prometheus collects and stores metrics with alerting; Grafana visualises data from multiple sources; and Loki aggregates logs using a label-based approach. These tools are complementary rather than competing. Effective monitoring architectures combine metrics (Prometheus), logs (Loki or OpenSearch), and security events (Wazuh or OpenSearch Security Analytics), with Grafana providing unified visualisation.
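To make the "complementary rather than competing" point concrete, here is a minimal sketch of how the metrics, logs, and visualisation layers might be wired together with Docker Compose. The image names are the official ones, but the versions, ports, and volume mounts are illustrative assumptions, not a hardened deployment:

```yaml
# docker-compose.yml — illustrative wiring of a metrics + logs + dashboards stack.
# Image names are official; ports and mounted configs are assumptions for the sketch.
services:
  prometheus:
    image: prom/prometheus
    ports: ["9090:9090"]
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
  loki:
    image: grafana/loki
    ports: ["3100:3100"]
  grafana:
    image: grafana/grafana
    ports: ["3000:3000"]
    depends_on: [prometheus, loki]
```

In a production architecture each layer would be deployed and scaled independently; the compose file only shows the shape of the stack.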
Assessment methodology
Tool assessments derive from official developer documentation, published API references, release notes, and technical specifications as of 2026-01-23. Feature availability varies by deployment model and configuration. Community-reported information is excluded; only documented features are assessed.
Requirements taxonomy
This taxonomy defines evaluation criteria for security and monitoring tools. Requirements are organised by functional area and weighted by priority for mission-driven organisations operating across diverse infrastructure environments.
Functional requirements
Core capabilities that define what security and monitoring tools must do.
Log and event collection
| ID | Requirement | Description | Assessment criteria | Verification method | Typical priority |
|---|---|---|---|---|---|
| F1.1 | Agent-based collection | Lightweight agents deployed on endpoints collect logs, metrics, and security events locally before forwarding to central servers | Full: documented agent for Linux, Windows, macOS with configurable collection. Partial: limited OS support or fixed collection. None: agentless only | Review agent documentation; verify supported platforms | Essential |
| F1.2 | Agentless collection | Collection via network protocols (Syslog, SNMP, API polling) without installing software on monitored systems | Full: multiple protocols supported with documented configuration. Partial: single protocol. None: agent required | Review ingestion documentation; check protocol support | Important |
| F1.3 | Cloud service integration | Native connectors for cloud provider logs and metrics (AWS CloudWatch, Azure Monitor, GCP Cloud Logging) | Full: documented integrations with major providers. Partial: limited provider support. None: manual configuration required | Review cloud integration documentation | Important |
| F1.4 | Container and Kubernetes support | Collection from containerised workloads including automatic discovery of pods and services | Full: Kubernetes-native deployment, automatic discovery, metadata enrichment. Partial: manual container configuration. None: no container support | Review Kubernetes deployment documentation | Important |
| F1.5 | Custom log parsing | Ability to define parsing rules for non-standard log formats | Full: flexible parsing language, regex support, field extraction. Partial: limited parsing options. None: fixed formats only | Review parser configuration documentation | Essential |
Data storage and retention
| ID | Requirement | Description | Assessment criteria | Verification method | Typical priority |
|---|---|---|---|---|---|
| F2.1 | Time-series storage | Efficient storage of timestamped data points with compression and downsampling | Full: purpose-built time-series storage, documented compression ratios, configurable retention. Partial: general-purpose storage with time indexing. None: no time-series optimisation | Review storage architecture documentation | Essential |
| F2.2 | Index lifecycle management | Automated policies for data tiering, rollover, and deletion based on age or size | Full: configurable policies, warm/cold/frozen tiers, automated execution. Partial: manual index management. None: no lifecycle management | Review index management documentation | Important |
| F2.3 | Data compression | Reduction of storage requirements through compression algorithms | Full: documented compression with configurable algorithms. Partial: fixed compression. None: uncompressed storage | Review storage documentation; check compression options | Important |
| F2.4 | Horizontal scalability | Ability to distribute data across multiple nodes for capacity and performance | Full: documented sharding, replication, cluster scaling. Partial: limited scaling options. None: single-node only | Review clustering documentation | Important |
| F2.5 | Long-term archival | Support for moving older data to cost-effective storage while maintaining queryability | Full: archive tier with query capability, object storage integration. Partial: archive without query. None: no archival support | Review archival documentation | Desirable |
Query and analysis
| ID | Requirement | Description | Assessment criteria | Verification method | Typical priority |
|---|---|---|---|---|---|
| F3.1 | Query language | Expressive language for filtering, aggregating, and transforming data | Full: documented query language with functions, operators, aggregations. Partial: basic filtering only. None: no query capability | Review query language documentation; test query complexity | Essential |
| F3.2 | Full-text search | Ability to search across log content without pre-defined field extraction | Full: inverted index, relevance scoring, phrase matching. Partial: basic string matching. None: field-based queries only | Review search documentation; test search capabilities | Essential |
| F3.3 | Correlation across sources | Ability to join or correlate events from different data sources | Full: cross-source queries, correlation rules, relationship mapping. Partial: limited correlation. None: single-source queries | Review correlation documentation | Important |
| F3.4 | Saved queries and templates | Ability to store and reuse common queries | Full: named queries, parameterised templates, sharing. Partial: basic saved queries. None: no persistence | Review query management documentation | Desirable |
| F3.5 | Query performance optimisation | Features to improve query speed on large datasets | Full: query caching, result streaming, timeout controls. Partial: basic optimisation. None: no performance features | Review performance documentation | Important |
Alerting and notification
| ID | Requirement | Description | Assessment criteria | Verification method | Typical priority |
|---|---|---|---|---|---|
| F4.1 | Threshold-based alerts | Alerts triggered when metrics exceed configured thresholds | Full: configurable thresholds, multiple conditions, hysteresis. Partial: simple thresholds. None: no threshold alerting | Review alerting documentation | Essential |
| F4.2 | Anomaly detection | Automatic identification of unusual patterns without predefined thresholds | Full: machine learning-based detection, baseline calculation. Partial: statistical methods. None: threshold-only | Review anomaly detection documentation | Desirable |
| F4.3 | Alert grouping and deduplication | Reduction of alert noise through intelligent grouping of related alerts | Full: configurable grouping, deduplication windows, aggregation. Partial: basic grouping. None: individual alerts only | Review alert management documentation | Important |
| F4.4 | Notification channels | Integration with external notification systems (email, Slack, PagerDuty, webhooks) | Full: multiple channels, configurable routing, templating. Partial: limited channels. None: no external notifications | Review notification documentation; check integrations | Essential |
| F4.5 | Alert escalation | Automatic escalation of unacknowledged alerts to additional recipients | Full: escalation policies, on-call schedules, acknowledgement tracking. Partial: basic escalation. None: no escalation | Review escalation documentation | Important |
| F4.6 | Alert silencing | Ability to temporarily suppress alerts during maintenance windows | Full: scheduled silences, matcher-based silencing. Partial: global silence only. None: no silencing | Review silence documentation | Important |
Security analytics
| ID | Requirement | Description | Assessment criteria | Verification method | Typical priority |
|---|---|---|---|---|---|
| F5.1 | Detection rules | Predefined and custom rules for identifying security threats | Full: rule library (Sigma, YARA), custom rule creation, rule testing. Partial: fixed rules only. None: no detection rules | Review detection documentation; check rule formats | Essential for SIEM |
| F5.2 | Threat intelligence integration | Ability to enrich events with external threat intelligence feeds | Full: multiple feed formats, IOC matching, automated updates. Partial: manual threat data. None: no threat intelligence | Review threat intelligence documentation | Important for SIEM |
| F5.3 | Compliance reporting | Pre-built reports for regulatory frameworks (PCI DSS, HIPAA, GDPR) | Full: multiple frameworks, automated reporting, evidence collection. Partial: limited frameworks. None: no compliance reports | Review compliance documentation | Important for SIEM |
| F5.4 | File integrity monitoring | Detection of unauthorised changes to critical system files | Full: real-time monitoring, baseline comparison, change reporting. Partial: scheduled scanning. None: no FIM | Review FIM documentation | Important for SIEM |
| F5.5 | Vulnerability detection | Identification of known vulnerabilities on monitored systems | Full: CVE database integration, severity scoring, remediation guidance. Partial: basic vulnerability scanning. None: no vulnerability detection | Review vulnerability documentation | Important for SIEM |
Visualisation and dashboards
| ID | Requirement | Description | Assessment criteria | Verification method | Typical priority |
|---|---|---|---|---|---|
| F6.1 | Dashboard creation | Tools for building custom visual displays of monitoring data | Full: drag-and-drop builder, multiple visualisation types, variables. Partial: limited customisation. None: fixed dashboards | Review dashboard documentation | Essential |
| F6.2 | Pre-built dashboards | Ready-to-use dashboards for common use cases | Full: dashboard library, community contributions, one-click import. Partial: basic templates. None: build from scratch | Review available dashboards | Desirable |
| F6.3 | Multiple data sources | Ability to visualise data from different backends in unified dashboards | Full: plugin architecture, data source federation, mixed queries. Partial: single backend. None: no data source abstraction | Review data source documentation | Important |
| F6.4 | Interactive exploration | Drill-down, zoom, and filtering directly within visualisations | Full: click-through, time range selection, ad-hoc filtering. Partial: limited interactivity. None: static visualisations | Test dashboard interactivity | Important |
| F6.5 | Sharing and embedding | Ability to share dashboards externally or embed in other applications | Full: public links, embedding, snapshots. Partial: internal sharing only. None: no sharing | Review sharing documentation | Desirable |
Technical requirements
Infrastructure, architecture, and deployment considerations.
Deployment and hosting
| ID | Requirement | Description | Assessment criteria | Verification method | Typical priority |
|---|---|---|---|---|---|
| T1.1 | Self-hosted deployment | Ability to deploy on organisation-controlled infrastructure | Full: complete documentation, all features available. Partial: self-hosted with limitations. None: SaaS only | Review deployment documentation | Essential |
| T1.2 | Container deployment | Support for Docker and Kubernetes deployment | Full: official images, Helm charts, operators. Partial: community images. None: no container support | Check container registry; review orchestration docs | Important |
| T1.3 | High availability | Support for redundant deployment eliminating single points of failure | Full: documented HA architecture, automatic failover. Partial: manual failover. None: single-instance | Review HA documentation | Important |
| T1.4 | Air-gapped deployment | Operation in environments without internet connectivity | Full: offline installation, local updates. Partial: initial internet required. None: requires connectivity | Review offline deployment documentation | Context-dependent |
Scalability and performance
| ID | Requirement | Description | Assessment criteria | Verification method | Typical priority |
|---|---|---|---|---|---|
| T2.1 | Horizontal scaling | Addition of nodes to increase capacity | Full: documented scaling, load distribution. Partial: limited scaling. None: vertical only | Review scaling documentation | Important |
| T2.2 | Documented performance characteristics | Published benchmarks and sizing guidance | Full: detailed benchmarks with methodology, sizing calculator. Partial: general guidance. None: undocumented | Review performance documentation | Important |
| T2.3 | Resource efficiency | Optimised resource consumption relative to data volume | Full: documented resource requirements by scale, optimisation guides. Partial: minimum requirements only. None: undocumented | Review sizing documentation | Important |
Integration architecture
| ID | Requirement | Description | Assessment criteria | Verification method | Typical priority |
|---|---|---|---|---|---|
| T3.1 | REST API | Programmatic access for automation and integration | Full: comprehensive API, versioned, documented. Partial: limited endpoints. None: no API | Review API documentation completeness | Essential |
| T3.2 | Webhook support | Push notifications to external systems on events | Full: configurable webhooks, retry logic. Partial: limited events. None: polling only | Review webhook documentation | Important |
| T3.3 | Standard protocols | Support for industry-standard data formats and protocols | Document supported standards: OpenTelemetry, Prometheus exposition format, Syslog, Beats, OTLP | Review protocol documentation | Important |
| T3.4 | Plugin ecosystem | Extensibility through plugins or extensions | Full: documented plugin API, community plugins. Partial: limited extensibility. None: closed | Review plugin documentation; check plugin availability | Desirable |
Security requirements
Security controls and data protection capabilities.
Authentication and access control
| ID | Requirement | Description | Assessment criteria | Verification method | Typical priority |
|---|---|---|---|---|---|
| S1.1 | Multi-factor authentication | MFA support for user accounts | Full: multiple MFA methods. Partial: single method. None: password only | Review authentication documentation | Essential |
| S1.2 | Single sign-on | Federated identity via SAML or OIDC | Full: SAML 2.0 and OIDC. Partial: single protocol. None: local auth only | Review SSO documentation | Essential |
| S1.3 | Role-based access control | Granular permissions based on roles | Full: custom roles, fine-grained permissions. Partial: fixed roles. None: all-or-nothing | Review RBAC documentation | Essential |
| S1.4 | API authentication | Secure authentication for programmatic access | Full: API keys, OAuth, service accounts. Partial: single method. None: no API auth | Review API security documentation | Important |
Data protection
| ID | Requirement | Description | Assessment criteria | Verification method | Typical priority |
|---|---|---|---|---|---|
| S2.1 | Encryption at rest | Stored data encryption | Full: AES-256, documented key management. Partial: optional encryption. None: unencrypted | Review encryption documentation | Essential |
| S2.2 | Encryption in transit | TLS for all network communication | Full: TLS 1.2+ enforced. Partial: TLS optional. None: unencrypted | Review transport security documentation | Essential |
| S2.3 | Audit logging | Logging of administrative and access events | Full: comprehensive audit trail, tamper protection. Partial: basic logging. None: no audit logs | Review audit log documentation | Essential |
| S2.4 | Data masking | Ability to redact sensitive data in logs | Full: configurable masking rules. Partial: fixed patterns. None: no masking | Review data protection documentation | Important |
Operational requirements
Day-to-day administration and management.
Administration
| ID | Requirement | Description | Assessment criteria | Verification method | Typical priority |
|---|---|---|---|---|---|
| O1.1 | Web-based administration | Browser-based management interface | Full: comprehensive admin UI. Partial: limited UI. None: CLI only | Review admin interface documentation | Important |
| O1.2 | Configuration as code | Version-controlled configuration management | Full: complete config via files, GitOps support. Partial: limited options. None: UI only | Review configuration documentation | Desirable |
| O1.3 | Multi-tenancy | Logical separation for different teams or projects | Full: tenant isolation, separate permissions. Partial: workspace separation. None: single tenant | Review multi-tenancy documentation | Context-dependent |
Backup and recovery
| ID | Requirement | Description | Assessment criteria | Verification method | Typical priority |
|---|---|---|---|---|---|
| O2.1 | Backup procedures | Documented data backup methods | Full: automated backup, point-in-time recovery. Partial: manual backup. None: undocumented | Review backup documentation | Essential |
| O2.2 | Disaster recovery | Recovery procedures for catastrophic failure | Full: documented DR, tested procedures. Partial: general guidance. None: undocumented | Review DR documentation | Important |
Support and maintenance
| ID | Requirement | Description | Assessment criteria | Verification method | Typical priority |
|---|---|---|---|---|---|
| O3.1 | Documentation quality | Completeness of technical documentation | Full: comprehensive, current, searchable. Partial: incomplete. Poor: minimal | Assess documentation during evaluation | Essential |
| O3.2 | Release cadence | Frequency and predictability of updates | Full: regular releases, roadmap visibility. Partial: irregular. None: stagnant | Review release history | Important |
| O3.3 | Community health | Activity of open source community | Metrics: contributors, commit frequency, issue response time | Review repository statistics | Important |
Commercial requirements
Licensing and vendor considerations.
| ID | Requirement | Description | Assessment criteria | Verification method | Typical priority |
|---|---|---|---|---|---|
| C1.1 | Open source licence | OSI-approved open source licence permitting self-hosted deployment | Document licence type and implications (GPL, Apache, AGPL) | Review licence terms | Essential |
| C1.2 | Commercial support availability | Option for paid support from vendor or partners | Full: vendor support tiers. Partial: partner support. None: community only | Review support options | Important |
| C1.3 | Enterprise features | Additional capabilities beyond open source version | Document which features require enterprise licence | Review feature comparison | Desirable |
Rating scale
Tools are assessed against each requirement using the following scale:
| Rating | Symbol | Definition |
|---|---|---|
| Full support | ● | Requirement fully met with documented, production-ready capability |
| Partial support | ◐ | Requirement partially met; limitations documented in notes |
| Minimal support | ○ | Basic capability exists but significant gaps |
| Not supported | ✗ | Capability not available |
| Not applicable | - | Requirement not relevant to this tool’s primary function |
Additional notation:
- E: Feature available in enterprise tier only
- P: Feature requires plugin or extension
Functional capability comparison
Primary function matrix
Each tool addresses different primary functions within the monitoring stack:
| Function | Wazuh | OpenSearch | Prometheus | Grafana | Loki |
|---|---|---|---|---|---|
| Primary purpose | SIEM, endpoint security | Search, analytics, SIEM | Metrics collection | Visualisation | Log aggregation |
| Data type focus | Security events, logs | Logs, documents | Time-series metrics | None (visualisation) | Logs |
| Collection method | Agent-based | Ingestion APIs | Pull-based scraping | None | Push-based |
| Storage included | Yes (indexer) | Yes | Yes (TSDB) | No | Yes |
| Query language | Wazuh Query Language | OpenSearch DSL, SQL, PPL | PromQL | None (passes to backends) | LogQL |
Log and event collection
| Req ID | Requirement | Wazuh | OpenSearch | Prometheus | Grafana | Loki |
|---|---|---|---|---|---|---|
| F1.1 | Agent-based collection | ● | ○ | ◐ | - | ●P |
| F1.2 | Agentless collection | ● | ● | ● | - | ◐ |
| F1.3 | Cloud service integration | ● | ● | ●P | - | ●P |
| F1.4 | Container/Kubernetes support | ● | ● | ● | - | ● |
| F1.5 | Custom log parsing | ● | ● | - | - | ● |
Assessment notes:
- Wazuh F1.1: Native agent supports Linux, Windows, macOS, Solaris, AIX, HP-UX with full feature parity
- OpenSearch F1.1: Relies on external agents (Beats, Fluent Bit, OpenTelemetry Collector) for collection
- Prometheus F1.1: Node Exporter and other exporters provide agent-like functionality for metrics; not designed for log collection
- Loki F1.1: Grafana Alloy (successor to Promtail) provides agent functionality; requires separate deployment
- Prometheus F1.5: Not applicable; Prometheus collects structured metrics, not logs requiring parsing
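Prometheus's pull model (F1.1/F1.2 above) is configured through scrape jobs rather than deployed collection agents. A minimal sketch, assuming a Node Exporter on its default port 9100 and the common pod-annotation convention for Kubernetes discovery (F1.4) — target hostnames are illustrative:

```yaml
# prometheus.yml fragment — pull-based scraping of a Node Exporter,
# plus Kubernetes pod discovery. Targets and the annotation convention
# are illustrative assumptions.
scrape_configs:
  - job_name: node
    static_configs:
      - targets: ["node-host:9100"]
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # keep only pods annotated prometheus.io/scrape=true
      # (a widespread convention, not a Kubernetes built-in)
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
```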
Data storage and retention
| Req ID | Requirement | Wazuh | OpenSearch | Prometheus | Grafana | Loki |
|---|---|---|---|---|---|---|
| F2.1 | Time-series storage | ◐ | ● | ● | - | ● |
| F2.2 | Index lifecycle management | ● | ● | ● | - | ● |
| F2.3 | Data compression | ● | ● | ● | - | ● |
| F2.4 | Horizontal scalability | ● | ● | ◐ | - | ● |
| F2.5 | Long-term archival | ● | ● | ◐ | - | ● |
Assessment notes:
- Wazuh F2.1: Uses OpenSearch-based indexer optimised for security events rather than high-cardinality metrics
- Prometheus F2.4: Federation and remote write provide horizontal distribution; single-server TSDB has scaling limits
- Prometheus F2.5: Remote write to long-term storage (Thanos, Mimir, Cortex) required for extended retention
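The remote-write path noted for F2.4/F2.5 is a single configuration block on the Prometheus side; the receiving end is whatever long-term store is deployed. The endpoint URL below is a placeholder for a Thanos, Mimir, or Cortex receiver:

```yaml
# prometheus.yml fragment — ship samples to long-term storage (F2.5).
# The URL is a placeholder; authentication depends on the receiving system.
remote_write:
  - url: "https://long-term-store.example.com/api/v1/push"
    queue_config:
      max_samples_per_send: 5000   # batch size; tune to network and backend
```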
Query and analysis
| Req ID | Requirement | Wazuh | OpenSearch | Prometheus | Grafana | Loki |
|---|---|---|---|---|---|---|
| F3.1 | Query language | ● | ● | ● | ◐ | ● |
| F3.2 | Full-text search | ● | ● | ✗ | - | ◐ |
| F3.3 | Correlation across sources | ● | ● | ◐ | ● | ○ |
| F3.4 | Saved queries | ● | ● | ● | ● | ● |
| F3.5 | Query performance optimisation | ● | ● | ● | - | ● |
Assessment notes:
- Grafana F3.1: Passes queries to backend data sources; provides unified interface but not its own query language
- Prometheus F3.2: PromQL operates on metrics labels; no full-text search capability
- Loki F3.2: Label-based filtering with optional line filtering; not full inverted index like OpenSearch
- Grafana F3.3: Correlates data from multiple backends through unified dashboards and Explore interface
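The query-model differences behind F3.1/F3.2 are visible in the languages themselves. With illustrative metric, label, and filter values, a PromQL aggregation and its closest LogQL counterpart look like:

```promql
# PromQL: 5-minute request rate per service (metric and label names illustrative)
sum by (service) (rate(http_requests_total[5m]))
```

```logql
# LogQL: select streams by label first, then filter the (unindexed) line content,
# then derive a rate — full-text matching is a scan, not an index lookup
sum by (service) (rate({app="api"} |= "error" [5m]))
```

The structural similarity is deliberate, but the LogQL line filter (`|= "error"`) illustrates why Loki rates only ◐ for F3.2: content matching happens at query time against compressed chunks rather than against an inverted index.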
Alerting and notification
| Req ID | Requirement | Wazuh | OpenSearch | Prometheus | Grafana | Loki |
|---|---|---|---|---|---|---|
| F4.1 | Threshold-based alerts | ● | ● | ● | ● | ◐ |
| F4.2 | Anomaly detection | ● | ● | ✗ | ○ | ✗ |
| F4.3 | Alert grouping/deduplication | ○ | ● | ● | ● | - |
| F4.4 | Notification channels | ● | ● | ● | ● | - |
| F4.5 | Alert escalation | ○ | ◐ | ● | ● | - |
| F4.6 | Alert silencing | ○ | ● | ● | ● | - |
Assessment notes:
- Prometheus F4.1: Alertmanager handles all alert management; highly configurable threshold and multi-condition rules
- Loki F4.1: Alerting via Grafana integration; LogQL-based alert rules
- OpenSearch F4.2: ML Commons plugin provides anomaly detection
- Grafana F4.2: Basic threshold anomaly alerting; advanced ML requires external integration
- Loki alerting: Designed to route alerts through Grafana Alerting; does not provide standalone alert management
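For Prometheus (F4.1, F4.6), alert rules live in the server and silencing in Alertmanager. A sketch with illustrative metric names, thresholds, and labels:

```yaml
# rules.yml fragment — threshold alert with a hold period (F4.1).
# Expression, threshold, and labels are illustrative.
groups:
  - name: availability
    rules:
      - alert: HighErrorRate
        expr: sum(rate(http_requests_total{status=~"5.."}[5m])) > 0.05
        for: 10m            # condition must hold for 10 minutes before firing
        labels:
          severity: page
        annotations:
          summary: "5xx error rate above threshold for 10 minutes"
```

Matcher-based silencing (F4.6) is then applied against Alertmanager, for example `amtool silence add alertname=HighErrorRate --duration=2h --comment="maintenance window"`.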
Security analytics
| Req ID | Requirement | Wazuh | OpenSearch | Prometheus | Grafana | Loki |
|---|---|---|---|---|---|---|
| F5.1 | Detection rules | ● | ● | - | - | - |
| F5.2 | Threat intelligence | ● | ● | - | - | - |
| F5.3 | Compliance reporting | ● | ◐ | - | ◐ | - |
| F5.4 | File integrity monitoring | ● | ✗ | - | - | - |
| F5.5 | Vulnerability detection | ● | ✗ | - | - | - |
Assessment notes:
- Wazuh F5.1: Includes 4,000+ rules aligned with MITRE ATT&CK; supports custom rules and Sigma format
- OpenSearch F5.1: Security Analytics includes Sigma rules library; requires manual log source mapping
- Wazuh F5.3: Pre-built dashboards for PCI DSS, HIPAA, GDPR, NIST 800-53, TSC
- OpenSearch F5.3: Compliance features require custom dashboard configuration
- Grafana F5.3: Can visualise compliance data from backends but does not generate compliance reports
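Both Wazuh (partially) and OpenSearch Security Analytics consume Sigma-format detections (F5.1). A minimal Sigma rule sketch — the log source and selection field values are illustrative, and real rules would include an `id` and field mappings appropriate to the ingested logs:

```yaml
# Sigma rule sketch — detect failed SSH logins (selection values illustrative)
title: Failed SSH Authentication
status: experimental
logsource:
  product: linux
  service: sshd
detection:
  selection:
    event_outcome: failure
  condition: selection
level: medium
```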
Visualisation and dashboards
| Req ID | Requirement | Wazuh | OpenSearch | Prometheus | Grafana | Loki |
|---|---|---|---|---|---|---|
| F6.1 | Dashboard creation | ● | ● | ○ | ● | - |
| F6.2 | Pre-built dashboards | ● | ● | ○ | ● | - |
| F6.3 | Multiple data sources | ○ | ○ | ✗ | ● | - |
| F6.4 | Interactive exploration | ● | ● | ◐ | ● | - |
| F6.5 | Sharing and embedding | ● | ● | ✗ | ● | - |
Assessment notes:
- Wazuh F6.1: Dashboard built on OpenSearch Dashboards with security-specific customisations
- Prometheus F6.1: Basic expression browser only; Grafana is the intended visualisation layer
- Grafana F6.3: Primary strength; supports 150+ data source plugins
- Loki: No native visualisation; designed for use with Grafana
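Grafana's data-source breadth (F6.3) pairs with configuration-as-code (O1.2): sources can be provisioned from files rather than clicked together in the UI. A sketch assuming in-cluster DNS names for the backends:

```yaml
# Grafana provisioning fragment (conventionally placed under
# /etc/grafana/provisioning/datasources/); URLs assume in-cluster service names.
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    url: http://prometheus:9090
    isDefault: true
  - name: Loki
    type: loki
    url: http://loki:3100
```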
Technical capability comparison
Deployment options
| Tool | Self-hosted | Docker | Kubernetes | Cloud managed |
|---|---|---|---|---|
| Wazuh | ● | ● | ● | ● (Wazuh Cloud) |
| OpenSearch | ● | ● | ● | ● (AWS, Aiven) |
| Prometheus | ● | ● | ● | ◐ (Grafana Cloud) |
| Grafana | ● | ● | ● | ● (Grafana Cloud) |
| Loki | ● | ● | ● | ● (Grafana Cloud) |
Resource requirements
| Tool | Minimum RAM | Recommended RAM (production) | Storage model |
|---|---|---|---|
| Wazuh (all-in-one) | 4 GB | 16 GB | OpenSearch indices |
| OpenSearch (single node) | 4 GB | 32 GB | Lucene indices |
| Prometheus | 2 GB | 8 GB+ | Custom TSDB |
| Grafana | 512 MB | 2 GB | SQLite/PostgreSQL (metadata only) |
| Loki | 1 GB | 4 GB+ | Object storage + BoltDB/TSDB |
Sizing notes:
- Wazuh: 90 days of alerts for 100 endpoints requires approximately 50 GB storage
- OpenSearch: 1 TB raw logs compresses to approximately 300-400 GB indexed
- Prometheus: 1-2 bytes per sample; 100,000 active series with 15-second scrape interval produces approximately 35 GB per month
- Loki: Designed for minimal indexing; stores compressed log chunks with label index only
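The Prometheus figure above can be checked directly: 100,000 active series scraped every 15 seconds, at the documented 1-2 bytes per sample, lands at roughly 17-35 GB over a 30-day month.

```python
# Back-of-envelope check of the Prometheus sizing note above.
# Assumes a 30-day month; real retention maths also depends on series churn
# and write-ahead-log overhead.
active_series = 100_000
scrape_interval_s = 15
seconds_per_month = 30 * 24 * 3600

samples = active_series * seconds_per_month // scrape_interval_s
gb_low = samples * 1 / 1e9   # 1 byte per sample
gb_high = samples * 2 / 1e9  # 2 bytes per sample

print(f"{samples:,} samples -> {gb_low:.1f}-{gb_high:.1f} GB/month")
```

The upper bound (~35 GB) matches the table's figure, which implicitly assumes the 2-bytes-per-sample end of the range.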
High availability architecture
| Tool | Clustering | Replication | Automatic failover |
|---|---|---|---|
| Wazuh | ● Server cluster, indexer cluster | ● Index replication | ● |
| OpenSearch | ● Multi-node cluster | ● Configurable replicas | ● |
| Prometheus | ◐ Federation | ○ External (Thanos) | ○ External |
| Grafana | ● Multi-instance | ● Shared database | ● With load balancer |
| Loki | ● Microservices mode | ● Replication factor | ● |
Integration capabilities
| Tool | REST API | Remote write/read | OpenTelemetry | Prometheus format |
|---|---|---|---|---|
| Wazuh | ● | ○ | ◐ | ◐ |
| OpenSearch | ● | - | ● | ●P |
| Prometheus | ● | ● | ● | ● |
| Grafana | ● | - | ● | ● |
| Loki | ● | - | ● | ● (LogQL) |
Assessment notes:
- Wazuh OpenTelemetry: Receives OTLP logs via documented integration
- OpenSearch Prometheus: Requires prometheus-exporter plugin
- Loki OpenTelemetry: Native OTLP ingestion endpoint
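Loki's native OTLP ingestion means an OpenTelemetry Collector can forward logs with a stock `otlphttp` exporter rather than a Loki-specific one. A collector config sketch — the service name and port assume the in-cluster defaults used elsewhere on this page:

```yaml
# OpenTelemetry Collector fragment — forward logs to Loki's OTLP endpoint.
# Loki 3.x exposes OTLP ingestion under /otlp on its HTTP port; the hostname
# is an assumption for the sketch.
receivers:
  otlp:
    protocols:
      grpc:
      http:
exporters:
  otlphttp:
    endpoint: http://loki:3100/otlp
service:
  pipelines:
    logs:
      receivers: [otlp]
      exporters: [otlphttp]
```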
Security capability comparison
Authentication methods
| Tool | Local auth | LDAP | SAML | OIDC | API keys |
|---|---|---|---|---|---|
| Wazuh | ● | ● | ● | ● | ● |
| OpenSearch | ● | ● | ● | ● | ● |
| Prometheus | ○ | ✗ | ✗ | ✗ | ○ |
| Grafana | ● | ● | ● | ● | ● |
| Loki | ○ | - | - | - | ○ |
Assessment notes:
- Prometheus: Native HTTP basic authentication and TLS via the web configuration file (since v2.24); SSO requires a reverse proxy
- Loki: Authentication handled at gateway level or via Grafana; minimal native auth
Authorisation and access control
| Tool | RBAC | Field-level security | Multi-tenancy |
|---|---|---|---|
| Wazuh | ● | ● | ● |
| OpenSearch | ● | ● | ● |
| Prometheus | ○ | ✗ | ✗ |
| Grafana | ● | ◐ | ● |
| Loki | ◐ | ✗ | ● |
Data protection
| Tool | Encryption at rest | Encryption in transit | Audit logging |
|---|---|---|---|
| Wazuh | ● | ● | ● |
| OpenSearch | ● | ● | ● |
| Prometheus | ○ | ● | ○ |
| Grafana | ● | ● | ● |
| Loki | ● | ● | ◐ |
Assessment notes:
- Prometheus encryption at rest: Requires filesystem-level encryption; no native encryption
- Prometheus audit logging: Minimal; logs queries but limited administrative audit trail
Detailed tool assessments
Wazuh
- Type: Open source
- Licence: GPL-2.0 (server, agent); Apache-2.0 (indexer, dashboard)
- Current version: 4.14.2 (released 2026-01-14)
- Deployment options: Self-hosted (Linux), Docker, Kubernetes, Wazuh Cloud
- Source repository: https://github.com/wazuh/wazuh
- Documentation: https://documentation.wazuh.com
Overview
Wazuh provides a unified security platform combining SIEM capabilities with endpoint detection and response (EDR). The architecture comprises three central components: the Wazuh server (analysis engine), the Wazuh indexer (based on OpenSearch for data storage), and the Wazuh dashboard (based on OpenSearch Dashboards for visualisation). Agents deployed on endpoints collect security events, logs, and system inventory data.
The platform originated as a fork of OSSEC in 2015 and has evolved into a comprehensive security solution. Development is led by Wazuh, Inc., with the core platform remaining open source under GPL-2.0 licensing. Commercial support and a cloud-hosted option (Wazuh Cloud) are available.
Capability assessment for security monitoring
Wazuh excels at security-specific monitoring with integrated threat detection, compliance reporting, and endpoint visibility. The platform includes over 4,000 detection rules aligned with the MITRE ATT&CK framework, covering common attack patterns across Windows, Linux, and macOS. File integrity monitoring, rootkit detection, and vulnerability assessment provide defence-in-depth visibility.
For organisations requiring SIEM functionality, Wazuh provides an integrated solution without requiring separate log aggregation, correlation, and alerting components. The agent-based architecture enables deep endpoint visibility including process monitoring, registry changes (Windows), and kernel-level events.
Key strengths:
- Integrated security stack: Single platform provides SIEM, FIM, vulnerability detection, and compliance without additional tools
- Comprehensive agent: Cross-platform agent collects security events, inventory, and enables active response
- Compliance dashboards: Pre-built regulatory compliance dashboards (PCI DSS, HIPAA, GDPR, NIST 800-53) reduce implementation effort
- Active response: Automated threat response capabilities (blocking IPs, killing processes) based on detection rules
Key limitations:
- Resource requirements: All-in-one deployment requires substantial resources; minimum 4 GB RAM, recommended 16 GB for production
- OpenSearch dependency: Indexer is tightly coupled to OpenSearch; cannot substitute alternative backends
- Metrics monitoring gap: Designed for security events and logs; requires Prometheus or similar for infrastructure metrics
- Scaling complexity: Multi-node deployment requires careful planning; cluster configuration has learning curve
Deployment and operations
Self-hosted requirements:
- Operating system: RHEL/CentOS 7-9, Ubuntu 16.04-24.04, Debian 9-12, Amazon Linux 2
- Components: Wazuh server, Wazuh indexer, Wazuh dashboard
- Minimum (all-in-one): 4 vCPU, 8 GB RAM, 50 GB storage
- Recommended (100 agents): 8 vCPU, 16 GB RAM, 200 GB storage
- Large deployment: Separate nodes for each component

Deployment complexity: Medium. Single-node installation via the assistant script completes in under 10 minutes. Multi-node clusters require manual certificate generation and configuration.
Operational overhead: Medium. Regular maintenance includes index lifecycle management, rule updates, and agent version management. Wazuh provides automated upgrade procedures.
Upgrade path: Sequential minor version upgrades supported. Major version upgrades require following specific migration procedures. Version 4.x provides stable API compatibility.
Integration capabilities
API coverage: Comprehensive REST API (port 55000) covers agent management, rule configuration, alert retrieval, and system status. API documentation includes OpenAPI specification.
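As a sketch of the documented workflow (authenticate, then call endpoints with a bearer token), the API can be exercised from the command line. The host and credentials below are placeholders to adapt to your deployment, and endpoint paths should be checked against the API reference for your Wazuh version:

```
# Obtain a JWT from the authentication endpoint, then list active agents.
# "wazuh-server" and the wazuh:wazuh credentials are placeholders; -k skips
# certificate verification and suits only self-signed test deployments.
TOKEN=$(curl -sk -u wazuh:wazuh -X POST \
  "https://wazuh-server:55000/security/user/authenticate?raw=true")

curl -sk -H "Authorization: Bearer $TOKEN" \
  "https://wazuh-server:55000/agents?status=active"
```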
Key integrations:
| Integration | Type | Status | Documentation |
|---|---|---|---|
| AWS CloudTrail, GuardDuty | Native | Production | Documented module |
| Azure Activity Logs | Native | Production | Documented module |
| GCP Cloud Audit | Native | Production | Documented module |
| Microsoft 365 | Native | Production | Documented module |
| Slack, PagerDuty, email | Native | Production | Documented integrations |
| TheHive, MISP | Native | Production | Documented integrations |
Standards supported: Syslog (RFC 3164, RFC 5424), Windows Event Log, MITRE ATT&CK mapping, Sigma rules (partial), STIX/TAXII (threat intelligence)
Security assessment
Authentication: Local users, LDAP, Active Directory, SAML 2.0, OpenID Connect. Multi-factor authentication supported via IdP integration.
Authorisation: Role-based access control with predefined roles (administrator, read-only, agents-admin) and custom role creation. Policies define permitted API operations and data access.
Data protection: TLS encryption for all component communication. Agent-server communication encrypted with AES. Indexer data encrypted at rest via OpenSearch security plugin.
Security track record: Regular security advisories published. Vulnerability disclosure programme available. CVEs addressed through patch releases within documented timeframes.
Cost analysis
Direct costs:
- Licence: Free (GPL-2.0/Apache-2.0)
- Support: Wazuh Cloud includes support; self-hosted support available via partners
- Enterprise features: None; all features included in open source version
Infrastructure costs (self-hosted):
| Scale | Infrastructure estimate | Configuration |
|---|---|---|
| 1-50 agents | 1 VM (8 GB RAM, 100 GB) | All-in-one deployment |
| 50-500 agents | 3 VMs (16 GB RAM each) | Distributed deployment |
| 500+ agents | 5+ VMs with dedicated indexer cluster | Multi-node with HA |
Hidden costs:
- Storage growth: Security data accumulates rapidly; plan for index lifecycle management and archival
- Expertise requirement: Effective rule tuning and incident response requires security operations knowledge
Organisational fit
Best suited for:
- Organisations requiring integrated SIEM without assembling multiple components
- Environments with compliance requirements (PCI DSS, HIPAA, GDPR)
- Teams wanting endpoint visibility with minimal agent diversity
Less suitable for:
- Organisations primarily needing infrastructure metrics monitoring
- Environments already using mature SIEM solutions (Splunk, QRadar)
- Small deployments where operational overhead exceeds benefit
OpenSearch
- Type
- Open source
- Licence
- Apache-2.0
- Current version
- 3.4.0 (released 2025-12-16)
- Deployment options
- Self-hosted (Linux), Docker, Kubernetes, AWS OpenSearch Service, Aiven
- Source repository
- https://github.com/opensearch-project/OpenSearch
- Documentation
- https://docs.opensearch.org
Overview
OpenSearch is a search and analytics engine forked from Elasticsearch 7.10.2 in 2021 following Elastic’s licence change. The project is managed by the OpenSearch Software Foundation under the Linux Foundation, with Amazon Web Services as a primary contributor. OpenSearch provides distributed search, log analytics, and security analytics capabilities.
The architecture centres on a cluster of nodes that index and search data stored in indices composed of shards. OpenSearch Dashboards provides the visualisation layer. The Security Analytics plugin adds SIEM functionality including detection rules, correlation, and threat intelligence integration.
Capability assessment for security monitoring
OpenSearch serves as a flexible backend for log aggregation and security analytics. The Security Analytics plugin provides SIEM capabilities including pre-packaged Sigma detection rules, correlation engine for multi-log analysis, and threat intelligence integration. The platform excels at full-text search across large log volumes.
For security monitoring, OpenSearch provides the analytical backbone but requires external agents (Beats, Fluent Bit, OpenTelemetry Collector) for log collection. This modular approach offers flexibility but increases deployment complexity compared to integrated solutions.
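As one illustration of the collection side, a Fluent Bit pipeline can tail a log file and ship it to OpenSearch. The `opensearch` output plugin and these parameter names follow Fluent Bit's documentation, but the host, index, and credentials are placeholders:

```
# Illustrative Fluent Bit configuration: tail auth logs, forward to OpenSearch.
[INPUT]
    Name   tail
    Path   /var/log/auth.log
    Tag    auth

[OUTPUT]
    Name        opensearch
    Match       auth
    Host        opensearch-node
    Port        9200
    Index       security-logs
    tls         On
    HTTP_User   admin
    HTTP_Passwd changeme
```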
Key strengths:
- Search performance: Full-text search with relevance scoring optimised for large datasets
- Query flexibility: Multiple query languages (DSL, SQL, PPL) support diverse analysis patterns
- Correlation engine: Security Analytics links findings across log sources for complex attack detection
- Extensibility: Plugin architecture enables feature additions without core modifications
Key limitations:
- No native collection: Requires external agents for log ingestion; no built-in endpoint agent
- Security Analytics maturity: SIEM features newer than Wazuh; documentation gaps reported by community
- Resource intensity: Full-text indexing requires substantial memory; minimum 4 GB heap per node
- Operational complexity: Cluster management, shard allocation, and index lifecycle require expertise
Deployment and operations
Self-hosted requirements:
- Operating system: RHEL/CentOS 7+, Ubuntu 18.04+, Debian 10+, Amazon Linux 2
- Java: Bundled JDK (OpenJDK 21 in 3.x)
- Minimum (single node): 4 vCPU, 8 GB RAM, 50 GB SSD
- Recommended (production): 3+ nodes, 32 GB RAM each, dedicated master nodes
- Storage: SSD required for acceptable performance

Deployment complexity: Medium to High. Single-node Docker deployment straightforward. Production clusters require certificate configuration, node role assignment, and shard planning.
Operational overhead: High. Index lifecycle management, shard rebalancing, cluster health monitoring, and version upgrades require ongoing attention.
Upgrade path: Rolling upgrades supported within major versions. Major version upgrades (2.x to 3.x) require cluster restart or reindexing. Detailed migration guide provided.
Integration capabilities
API coverage: Full REST API for all operations. OpenAPI specification available. Backwards compatible with Elasticsearch 7.x API for most operations.
Key integrations:
| Integration | Type | Status | Documentation |
|---|---|---|---|
| Logstash | Native | Production | Documented |
| Beats (Filebeat, etc.) | Native | Production | Documented |
| Fluent Bit | Native | Production | Documented |
| OpenTelemetry | Native | Production | Documented |
| AWS services | Native (AWS) | Production | Documented |
| Grafana | Plugin | Production | Grafana data source |
Standards supported: OpenTelemetry (OTLP), Elasticsearch bulk API, Syslog via Logstash, Sigma rules (Security Analytics)
Security assessment
Authentication: Internal user database, LDAP/Active Directory, SAML 2.0, OpenID Connect. HTTP basic authentication for API.
Authorisation: Fine-grained access control with document-level, field-level, and index-level permissions. Predefined roles and custom role creation.
Data protection: TLS for transport and REST layers. Encryption at rest via security plugin (requires configuration). Comprehensive audit logging.
Security track record: Security advisories published via GitHub. Maintains CVE tracking. Responsive to vulnerability reports.
Cost analysis
Direct costs:
- Licence: Free (Apache-2.0)
- Support: AWS support for OpenSearch Service; community support for self-hosted; third-party support available
- Enterprise features: None; all features included in open source
Infrastructure costs (self-hosted):
| Scale | Infrastructure estimate | Configuration |
|---|---|---|
| Development | 1 VM (8 GB RAM, 100 GB SSD) | Single node |
| Small production | 3 VMs (16 GB RAM each) | 3-node cluster |
| Medium production | 6+ VMs (32 GB RAM each) | Dedicated masters, data nodes |
Hidden costs:
- Storage I/O: SSD storage required; HDD causes unacceptable latency
- Memory requirements: JVM heap sizing critical; plan 50% of RAM for heap, 50% for OS cache
Organisational fit
Best suited for:
- Organisations with existing Elasticsearch expertise migrating to open source
- Environments requiring flexible log analytics with full-text search
- Teams building custom security analytics solutions
Less suitable for:
- Organisations wanting turnkey SIEM without integration work
- Small teams without capacity for cluster operations
- Use cases focused primarily on metrics rather than logs
Prometheus
- Type
- Open source
- Licence
- Apache-2.0
- Current version
- 3.8.1 (released 2025-12-16)
- Deployment options
- Self-hosted (Linux, Windows, macOS), Docker, Kubernetes
- Source repository
- https://github.com/prometheus/prometheus
- Documentation
- https://prometheus.io/docs
Overview
Prometheus is a metrics-based monitoring system that collects time-series data through a pull model. Originally developed at SoundCloud in 2012 and donated to the Cloud Native Computing Foundation (CNCF) in 2016, Prometheus became the second CNCF graduated project after Kubernetes. The project maintains a clear focus on metrics collection, storage, and alerting.
The architecture comprises the Prometheus server (scrapes and stores metrics), Alertmanager (handles alert deduplication, grouping, and routing), and various exporters (expose metrics from systems and applications). The pull-based model enables dynamic service discovery and reduces target configuration complexity.
Capability assessment for security monitoring
Prometheus excels at infrastructure and application metrics monitoring but does not address security-specific monitoring requirements. The platform collects numeric time-series data rather than log events, making it unsuitable as a standalone SIEM solution. For security monitoring architectures, Prometheus provides the metrics layer (CPU, memory, network, application performance) while other tools handle logs and security events.
PromQL, the query language, enables sophisticated metric analysis including rate calculations, aggregations, and mathematical operations. Alertmanager provides flexible alert routing with grouping, inhibition, and silencing capabilities.
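As one illustration of PromQL driving an alert, a rule file might predict filesystem exhaustion. The metric name assumes Node Exporter is deployed; the group name, threshold, and durations are illustrative:

```yaml
# Illustrative Prometheus alerting rule (e.g. rules.yml): fire when a
# filesystem is predicted to fill within 4 hours based on the last hour's trend.
groups:
  - name: capacity
    rules:
      - alert: DiskWillFillSoon
        expr: predict_linear(node_filesystem_avail_bytes{fstype!="tmpfs"}[1h], 4 * 3600) < 0
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "Filesystem on {{ $labels.instance }} predicted to fill within 4 hours"
```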
Key strengths:
- Pull-based architecture: Service discovery eliminates manual target lists; targets expose their own metrics for scraping
- PromQL power: Expressive query language enables complex metric analysis and alert conditions
- Alertmanager sophistication: Alert grouping, deduplication, inhibition, and silence reduce noise effectively
- Ecosystem breadth: Thousands of exporters cover diverse systems; Prometheus format is industry standard
Key limitations:
- Not for logs: Cannot ingest or analyse log data; requires complementary log aggregation
- No security analytics: No detection rules, correlation, or compliance reporting
- Scaling ceiling: Single-server TSDB has cardinality and retention limits; requires federation or remote storage for scale
- No native long-term storage: Designed for recent data; extended retention requires Thanos, Mimir, or Cortex
Deployment and operations
Self-hosted requirements:
- Operating system: Linux, Windows, macOS, FreeBSD (single binary)
- Dependencies: None (statically compiled)
- Minimum: 2 vCPU, 2 GB RAM, 20 GB SSD
- Recommended: Scales with series count and retention
- Rule of thumb: 1-2 bytes per sample; plan storage accordingly

Deployment complexity: Low. Single binary with YAML configuration. Kubernetes deployment via Prometheus Operator simplifies cluster monitoring.
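The bytes-per-sample rule of thumb can be turned into a quick capacity estimate. This is a sketch only; actual usage varies with series churn and compression:

```python
def prometheus_storage_bytes(series: int, scrape_interval_s: float,
                             retention_days: float,
                             bytes_per_sample: float = 2.0) -> float:
    """Estimate TSDB disk usage from the 1-2 bytes-per-sample rule of thumb."""
    samples_per_second = series / scrape_interval_s
    retention_seconds = retention_days * 86_400
    return samples_per_second * retention_seconds * bytes_per_sample

# 100,000 active series scraped every 15 s, retained 30 days, at 2 bytes/sample:
estimate = prometheus_storage_bytes(100_000, 15, 30)
print(f"{estimate / 1e9:.0f} GB")  # roughly 35 GB
```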
Operational overhead: Low to Medium. Prometheus itself is low-maintenance. Complexity increases with federation, remote storage, and high availability requirements.
Upgrade path: Prometheus 3.0 (November 2024) was first major release in 7 years. Minor versions release every 6 weeks. LTS versions available for extended support.
Integration capabilities
API coverage: HTTP API for queries, targets, alerts, metadata. Prometheus format (exposition format) is documented standard.
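To show the shape of the exposition format (HELP/TYPE header lines followed by labelled samples), here is a hypothetical renderer; the metric names and values are illustrative:

```python
def exposition(name: str, help_text: str, metric_type: str,
               samples: list[tuple[dict, float]]) -> str:
    """Render samples in the Prometheus text exposition format."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        if labels:
            label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"

text = exposition("http_requests_total", "Total HTTP requests.", "counter",
                  [({"method": "get", "code": "200"}, 1027.0),
                   ({"method": "post", "code": "200"}, 3.0)])
print(text)
```

An exporter serves text like this on an HTTP endpoint (conventionally /metrics) for Prometheus to scrape.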
Key integrations:
| Integration | Type | Status | Documentation |
|---|---|---|---|
| Kubernetes | Native | Production | Service discovery |
| Grafana | Native | Production | Primary visualisation |
| Alertmanager | Native | Production | Bundled |
| Thanos | External | Production | Long-term storage |
| Mimir | External | Production | Scalable long-term storage |
| OpenTelemetry | Native | Production | OTLP ingestion in 3.x |
Standards supported: Prometheus exposition format, OpenMetrics, OpenTelemetry (OTLP), Remote Write protocol
Security assessment
Authentication: Basic authentication via configuration or reverse proxy. No native SSO or MFA.
Authorisation: No native RBAC. All authenticated users have full access. Requires proxy layer for access control.
Data protection: TLS for scraping and remote write. No native encryption at rest; relies on filesystem encryption.
Security track record: Security advisories via GitHub. No critical CVEs in recent history. Minimal attack surface due to simplicity.
Cost analysis
Direct costs:
- Licence: Free (Apache-2.0)
- Support: Community support; commercial support from Grafana Labs (via Mimir/Grafana Cloud)
- Enterprise features: None; all features included
Infrastructure costs (self-hosted):
| Scale | Infrastructure estimate | Configuration |
|---|---|---|
| Small (10,000 series) | 1 VM (2 GB RAM, 50 GB SSD) | Single instance |
| Medium (100,000 series) | 1 VM (8 GB RAM, 200 GB SSD) | Single instance |
| Large (1M+ series) | Multiple instances | Federation or Thanos/Mimir |
Hidden costs:
- High availability: Requires duplicate Prometheus instances or Thanos/Mimir for HA
- Long-term storage: Extended retention requires additional infrastructure (object storage, query layer)
Organisational fit
Best suited for:
- Cloud-native and Kubernetes environments
- Organisations prioritising infrastructure metrics
- Teams wanting open standards (Prometheus format widely adopted)
Less suitable for:
- Organisations needing SIEM or log analysis
- Environments requiring long-term metric retention without additional components
- Teams needing built-in visualisation (Grafana required)
Grafana
- Type
- Open source
- Licence
- AGPL-3.0
- Current version
- 12.3 (released 2025-12)
- Deployment options
- Self-hosted (Linux, Windows, macOS), Docker, Kubernetes, Grafana Cloud
- Source repository
- https://github.com/grafana/grafana
- Documentation
- https://grafana.com/docs/grafana/latest
Overview
Grafana is a visualisation and observability platform that aggregates data from multiple sources into unified dashboards. Founded in 2014 as a fork of Kibana, Grafana has evolved into the standard visualisation layer for metrics, logs, and traces. Grafana Labs maintains the open source project while offering Grafana Cloud and enterprise features.
The architecture separates visualisation from storage. Grafana connects to backend data sources (Prometheus, Loki, OpenSearch, InfluxDB, and 150+ others) via plugins. This approach enables unified observability dashboards across heterogeneous monitoring infrastructure.
Capability assessment for security monitoring
Grafana serves as the visualisation layer for security monitoring architectures rather than providing security analytics directly. Dashboards can display security events from Wazuh, logs from Loki or OpenSearch, and metrics from Prometheus in unified views. Grafana Alerting provides rule-based alerting across all connected data sources.
For security operations centres, Grafana enables custom dashboards combining security events, infrastructure metrics, and application logs. The Explore feature supports ad-hoc investigation across data sources.
Key strengths:
- Data source federation: Single interface for metrics, logs, and traces from diverse backends
- Dashboard ecosystem: Thousands of community dashboards available; one-click import
- Alerting unification: Grafana Alerting creates rules across all data sources with unified notification
- Explore investigation: Ad-hoc querying and correlation across data sources supports incident response
Key limitations:
- No data storage: Visualisation only; requires separate backends for all data
- No security analytics: No detection rules, correlation engine, or compliance reporting
- Licence consideration: AGPL-3.0 requires source disclosure for modified versions
- Complexity at scale: Large deployments require high availability configuration and database tuning
Deployment and operations
Self-hosted requirements:
- Operating system: Linux, Windows, macOS
- Database: SQLite (default), PostgreSQL, MySQL (production recommended)
- Minimum: 1 vCPU, 512 MB RAM
- Recommended: 2 vCPU, 2 GB RAM, PostgreSQL database

Deployment complexity: Low. Single binary or container. Production deployment straightforward with external database.
Operational overhead: Low. Minimal maintenance required. Dashboard and user management through web interface.
Upgrade path: Monthly minor releases; annual major releases. Upgrade guide provided for each version. Breaking changes documented.
Integration capabilities
API coverage: Comprehensive HTTP API for dashboards, data sources, users, alerting. Provisioning via configuration files.
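As an example of file-based provisioning, a data source file follows Grafana's documented schema; the names and URLs below are placeholders for your own backends:

```yaml
# Illustrative data source provisioning file (conventionally placed under
# the provisioning/datasources/ directory).
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
  - name: Loki
    type: loki
    access: proxy
    url: http://loki:3100
```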
Key integrations:
| Integration | Type | Status | Documentation |
|---|---|---|---|
| Prometheus | Native | Production | Built-in data source |
| Loki | Native | Production | Built-in data source |
| OpenSearch | Plugin | Production | Official plugin |
| InfluxDB | Native | Production | Built-in data source |
| Elasticsearch | Native | Production | Built-in data source |
| CloudWatch, Azure Monitor | Native | Production | Built-in data sources |
Standards supported: Prometheus query, LogQL, OpenSearch DSL, SQL, GraphQL (varies by data source)
Security assessment
Authentication: Internal users, LDAP, Active Directory, SAML 2.0, OAuth 2.0, Generic OAuth. Multi-factor via IdP.
Authorisation: Organisation-based multi-tenancy. Role-based access (Admin, Editor, Viewer). Dashboard and folder permissions.
Data protection: TLS for web interface. Database encryption depends on backend. Secrets management for data source credentials.
Security track record: Regular security releases. Security advisories published. Bug bounty programme via HackerOne.
Cost analysis
Direct costs:
- Licence: Free (AGPL-3.0)
- Support: Community; Grafana Cloud includes support; enterprise support available
- Enterprise features: Enhanced RBAC, reporting, SLO tracking (Grafana Enterprise licence)
Infrastructure costs (self-hosted):
| Scale | Infrastructure estimate | Configuration |
|---|---|---|
| Small team | 1 VM (2 GB RAM) | Single instance, SQLite |
| Medium organisation | 1 VM (4 GB RAM), PostgreSQL | Single instance, external DB |
| Large deployment | 2+ VMs, PostgreSQL cluster | High availability |
Hidden costs:
- Enterprise features: Some governance features require enterprise licence
- Backend infrastructure: Grafana cost is low; backend data sources are the significant expense
Organisational fit
Best suited for:
- Organisations using multiple monitoring backends needing unified visualisation
- Teams standardising on Prometheus/Loki stack
- Environments requiring custom dashboards and alerting
Less suitable for:
- Organisations needing turnkey security monitoring (requires additional backends)
- Use cases where single-vendor solution preferred
- Deployments where AGPL licence is problematic
Loki
- Type
- Open source
- Licence
- AGPL-3.0
- Current version
- 3.6.4 (released 2026-01-21)
- Deployment options
- Self-hosted (Linux), Docker, Kubernetes, Grafana Cloud
- Source repository
- https://github.com/grafana/loki
- Documentation
- https://grafana.com/docs/loki/latest
Overview
Loki is a log aggregation system designed for efficiency and cost-effectiveness. Unlike traditional log platforms that index full log content, Loki indexes only metadata (labels), storing compressed log chunks. This approach dramatically reduces storage and operational costs while maintaining query capability.
Developed by Grafana Labs and released in 2018, Loki applies Prometheus concepts to logs. The label-based model familiar from Prometheus enables correlation between metrics and logs. Log collection uses Grafana Alloy (which supersedes the now-deprecated Promtail agent) or other agents supporting Loki's push API.
Capability assessment for security monitoring
Loki provides cost-effective log aggregation but lacks security-specific analytics capabilities. For security monitoring, Loki serves as the log storage layer while alerting and analysis occur in Grafana. LogQL queries can filter security-relevant log patterns, but detection rules, correlation, and compliance reporting require external tools or Grafana features.
The label-based architecture suits environments where log structure is consistent and known. High-cardinality labels (unique request IDs, user IDs) degrade performance, requiring careful label design.
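For example, assuming logs are labelled with job and host at collection time, security-relevant patterns can be filtered and aggregated with LogQL. These queries are illustrative; the label names depend on your collection configuration:

```
{job="sshd"} |= "Failed password"

sum by (host) (count_over_time({job="sshd"} |= "Failed password" [5m]))
```

The first query streams matching lines; the second counts failed logins per host over 5-minute windows, a pattern usable in Grafana alerting. Note that the content filter scans chunks rather than using an index, illustrating the trade-off described above.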
Key strengths:
- Cost efficiency: Label-only indexing reduces storage 10-100x compared to full-text indexing
- Prometheus alignment: Same label model enables metrics-log correlation
- Operational simplicity: Simpler to operate than OpenSearch/Elasticsearch for log aggregation
- Grafana integration: Native integration with Grafana for querying and alerting
Key limitations:
- No full-text index: Queries on log content require scanning; slower than indexed search
- No security analytics: No detection rules, correlation, or compliance features
- Label cardinality constraints: High-cardinality labels cause performance degradation
- Query limitations: Complex queries slower than dedicated search engines
Deployment and operations
Self-hosted requirements:
- Operating system: Linux (recommended), Docker, Kubernetes
- Storage: Local filesystem, S3, GCS, Azure Blob, MinIO
- Minimum: 1 vCPU, 1 GB RAM, local storage
- Recommended (production): Microservices mode, object storage backend

Deployment complexity: Low (monolithic) to Medium (microservices). Single-binary deployment for small scale. Microservices mode for production scaling.
Operational overhead: Low to Medium. Simpler than OpenSearch. Compaction and retention require monitoring.
Upgrade path: Quarterly minor releases. Schema versioning (v13 current) enables forward compatibility. Helm chart simplifies Kubernetes upgrades.
Integration capabilities
API coverage: HTTP API for push (ingestion), query, and administration. LogQL for querying.
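A minimal sketch of building a push-API request body, following the documented JSON schema (a stream's label set plus nanosecond-timestamped log lines); the labels and endpoint are illustrative:

```python
import json
import time

def loki_push_payload(labels: dict, lines: list) -> bytes:
    """Build a JSON body for Loki's push endpoint (/loki/api/v1/push).
    Timestamps are nanosecond epoch values encoded as strings."""
    now_ns = str(time.time_ns())
    return json.dumps({
        "streams": [{
            "stream": labels,  # label set identifying the stream
            "values": [[now_ns, line] for line in lines],
        }]
    }).encode()

body = loki_push_payload({"job": "demo", "host": "web-1"},
                         ["user login ok", "user login failed"])
# POST this body to http://loki:3100/loki/api/v1/push with
# Content-Type: application/json (and an X-Scope-OrgID header when
# multi-tenancy is enabled).
```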
Key integrations:
| Integration | Type | Status | Documentation |
|---|---|---|---|
| Grafana | Native | Production | Built-in data source |
| Grafana Alloy | Native | Production | Primary collection agent |
| Promtail | Native | Deprecated | Legacy agent |
| OpenTelemetry | Native | Production | OTLP ingestion |
| Fluent Bit | Plugin | Production | Community plugin |
| Fluentd | Plugin | Production | Community plugin |
Standards supported: OpenTelemetry (OTLP), Prometheus label model, Syslog via agents
Security assessment
Authentication: Basic authentication, tenant ID header. Multi-tenancy via X-Scope-OrgID header.
Authorisation: Tenant-based isolation. No fine-grained RBAC within tenant. Authentication/authorisation typically handled at gateway level.
Data protection: TLS for ingestion and queries. Encryption at rest depends on storage backend (S3 server-side encryption, etc.).
Security track record: Security advisories via Grafana security portal. Regular security patches in minor releases.
Cost analysis
Direct costs:
- Licence: Free (AGPL-3.0)
- Support: Community; Grafana Cloud includes support
- Enterprise features: Grafana Enterprise Logs adds features for large deployments
Infrastructure costs (self-hosted):
| Scale | Infrastructure estimate | Configuration |
|---|---|---|
| Small (< 100 GB/day) | 1 VM (4 GB RAM), object storage | Monolithic mode |
| Medium (100 GB-1 TB/day) | 3+ VMs, object storage | Simple scalable |
| Large (> 1 TB/day) | Kubernetes, object storage | Microservices mode |
Hidden costs:
- Object storage: Primary storage cost for production deployments
- Grafana dependency: Requires Grafana for visualisation and alerting
Organisational fit
Best suited for:
- Cost-conscious deployments needing log aggregation without full-text search requirements
- Teams already using Prometheus and Grafana
- Environments with high log volume and limited query complexity
Less suitable for:
- Security operations requiring detection rules and correlation
- Use cases requiring full-text search across log content
- Organisations not using Grafana
Selection guidance
Decision framework
Use this framework to identify appropriate tools based on requirements:
- Primary need: security (SIEM) or infrastructure monitoring?
  - SIEM / security events: Wazuh (integrated) or OpenSearch (flexible); add Grafana for visualisation
  - Infrastructure metrics / logs: Prometheus (metrics), Loki (logs, cost-effective) or OpenSearch (logs, search); Grafana (visualisation)

Recommendations by organisational context
For organisations with minimal IT capacity
Recommended stack: Wazuh (all-in-one)
Wazuh provides integrated SIEM, endpoint detection, and compliance reporting in a single deployment. The installation assistant completes setup in under 10 minutes. While resource requirements are higher than individual components, operational complexity is lower than assembling multiple tools.
For infrastructure metrics, add Prometheus with Node Exporter and use Wazuh dashboard or add Grafana for visualisation.
Configuration guidance:
- Deploy Wazuh all-in-one on a VM with 8 GB RAM minimum
- Use installation assistant for automated certificate and component configuration
- Enable relevant compliance dashboards (PCI DSS, GDPR) during setup
- Configure index lifecycle to manage storage growth
For organisations with established IT capacity
Recommended stack: Prometheus + Loki + Grafana + Wazuh (or OpenSearch)
Separate concerns across purpose-built tools:
- Prometheus: Infrastructure and application metrics with Alertmanager
- Loki: Log aggregation with cost-effective storage
- Grafana: Unified visualisation and alerting
- Wazuh or OpenSearch Security Analytics: Security events and compliance
This architecture provides flexibility and allows independent scaling of each component.
Configuration guidance:
- Deploy Prometheus with appropriate retention (15-30 days typical)
- Configure Loki with object storage backend for production
- Use Grafana Alloy for unified collection (replaces separate Promtail and exporters)
- Connect Grafana to all data sources for unified dashboards
- Add Wazuh agents to endpoints requiring security monitoring
For organisations requiring full-text log search
Recommended stack: OpenSearch + Prometheus + Grafana
When log analysis requires full-text search, relevance scoring, or complex queries across log content, OpenSearch provides capabilities Loki cannot match. Security Analytics adds SIEM functionality.
Configuration guidance:
- Deploy OpenSearch with minimum 3 nodes for production
- Configure index lifecycle management from day one
- Use OpenSearch Dashboards for log exploration
- Add Grafana for metrics visualisation and unified alerting
- Consider Wazuh if integrated endpoint agents needed (uses OpenSearch backend)
Migration paths
| From | To | Complexity | Approach |
|---|---|---|---|
| Elasticsearch | OpenSearch | Low | Rolling migration; API compatible |
| ELK Stack | Wazuh | Medium | Deploy Wazuh alongside; migrate agents |
| Splunk | Wazuh + OpenSearch | High | Parallel operation; phased migration |
| Nagios | Prometheus + Grafana | Medium | Deploy alongside; migrate checks to exporters |
| Zabbix | Prometheus + Grafana | Medium | Deploy alongside; migrate templates |
| Commercial SIEM | Wazuh | High | Parallel operation; rule migration |
Resources and references
Official documentation
Relevant standards
| Standard | Description | URL |
|---|---|---|
| OpenMetrics | Prometheus exposition format standardisation | https://openmetrics.io |
| OpenTelemetry | Observability framework for traces, metrics, logs | https://opentelemetry.io |
| Sigma | Generic signature format for SIEM systems | https://github.com/SigmaHQ/sigma |
| MITRE ATT&CK | Adversary tactics and techniques knowledge base | https://attack.mitre.org |
See also
- Security Operations - Security monitoring strategy and operations
- SIEM Implementation - Implementing security information and event management
- Monitoring Strategy - Infrastructure monitoring approach
- Infrastructure Monitoring - Implementing infrastructure monitoring
- Log Management - Log collection and management procedures
- Alerting and Escalation - Alert management and escalation procedures