Log Management

Log management establishes the infrastructure and procedures for capturing machine-generated records from across your IT environment into a centralised system where they can be searched, analysed, and retained according to policy. This task covers deploying collection agents, configuring forwarding and aggregation, setting up parsing pipelines, and managing log lifecycle from ingestion through archival.

Prerequisites

Before implementing log management, confirm the following requirements are in place.

| Requirement | Detail |
| --- | --- |
| Log source inventory | Document all systems that generate logs: servers, network devices, applications, cloud services, security tools |
| Storage capacity | Calculate required storage based on log volume estimates; plan for 1.5x headroom above projected daily ingestion |
| Retention policy | Obtain approved retention periods from Data Retention and Records for each log category |
| Network access | Collection agents require outbound connectivity to the aggregation tier on TCP 514 (syslog), TCP 5044 (Beats), or TCP 6514 (syslog-TLS) |
| Administrative access | Root or administrator privileges on log sources for agent installation; administrative access to the log management platform |
| Time synchronisation | All log sources must synchronise to NTP; log correlation fails with clock skew exceeding 1 second |

Verify NTP synchronisation on Linux systems:

timedatectl status | grep -E "(synchronized|NTP)"
# Expected output:
# System clock synchronized: yes
# NTP service: active

On Windows systems:

w32tm /query /status
# Verify "Stratum" is less than 5 and "Last Successful Sync Time" is recent

Storage estimation

Calculate storage requirements before deployment. Log volume varies substantially by source type and verbosity level.

| Source type | Typical daily volume per instance | Notes |
| --- | --- | --- |
| Linux server (syslog) | 50-200 MB | Increases with service count |
| Windows server (Event Log) | 100-500 MB | Security logs dominate |
| Web server (access logs) | 1-10 GB | Scales with request volume |
| Application (structured JSON) | 200 MB-2 GB | Depends on logging level |
| Network device | 10-100 MB | Increases with flow logging |
| Container platform | 500 MB-5 GB | Per node, not per container |

For an organisation with 50 Linux servers, 20 Windows servers, 5 web servers, and 10 network devices, calculate baseline daily volume:

Linux: 50 × 100 MB = 5,000 MB
Windows: 20 × 300 MB = 6,000 MB
Web: 5 × 3 GB = 15,000 MB
Network: 10 × 50 MB = 500 MB
-----------
Daily total: 26,500 MB (approximately 26 GB)

With a 90-day retention requirement and 1.5x headroom factor:

26 GB × 90 days × 1.5 = 3,510 GB (approximately 3.5 TB)

This calculation assumes raw log storage. Compression reduces requirements by 70-90% depending on log content, but indexing for search adds 10-30% overhead. Plan for net storage of approximately 50% of raw calculated volume when using compressed, indexed storage.
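The worked example above can be scripted so the estimate is easy to rerun when the inventory changes. A minimal sketch using the example's per-instance figures (assumptions drawn from the table midpoints, not measurements); integer arithmetic rounds the final figure down slightly:

```shell
#!/bin/bash
# Baseline daily volume in MB, using the worked example's figures
linux=$(( 50 * 100 ))     # 50 Linux servers at ~100 MB/day
windows=$(( 20 * 300 ))   # 20 Windows servers at ~300 MB/day
web=$(( 5 * 3000 ))       # 5 web servers at ~3 GB/day
network=$(( 10 * 50 ))    # 10 network devices at ~50 MB/day
daily_mb=$(( linux + windows + web + network ))

retention_days=90
# 1.5x headroom expressed as *3/2 to stay in integer arithmetic
total_gb=$(( daily_mb * retention_days * 3 / 2 / 1024 ))

echo "Daily ingestion: ${daily_mb} MB"
echo "${retention_days}-day storage with headroom: ${total_gb} GB"
```

Adjust the counts and per-instance figures to your own inventory before sizing hardware.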

Procedure

Phase 1: Deploy log aggregation infrastructure

The aggregation tier receives logs from all sources, parses them into structured format, and stores them for analysis. Deploy this infrastructure before configuring collection agents.

  1. Provision the log aggregation server with resources appropriate to your volume. For the 26 GB daily example, allocate a minimum of 4 CPU cores, 16 GB RAM, and 4 TB storage. Create the server and install the base operating system (Ubuntu 22.04 LTS or later).
# Verify system resources
nproc
# Expected: 4 or higher
free -h
# Expected: 16 GB or higher total memory
df -h /var
# Expected: 4 TB or higher available
  2. Install the log aggregation platform. This procedure uses Graylog with OpenSearch as the storage backend, representing an open source option. Commercial alternatives include Splunk, Elastic Cloud, and Datadog.
# Install dependencies
sudo apt update
sudo apt install -y apt-transport-https openjdk-17-jre-headless uuid-runtime pwgen
# Add MongoDB repository (required for Graylog metadata)
curl -fsSL https://pgp.mongodb.com/server-7.0.asc | sudo gpg -o /usr/share/keyrings/mongodb-server-7.0.gpg --dearmor
echo "deb [ signed-by=/usr/share/keyrings/mongodb-server-7.0.gpg ] https://repo.mongodb.org/apt/ubuntu jammy/mongodb-org/7.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-7.0.list
# Add OpenSearch repository
curl -fsSL https://artifacts.opensearch.org/publickeys/opensearch.pgp | sudo gpg -o /usr/share/keyrings/opensearch.gpg --dearmor
echo "deb [signed-by=/usr/share/keyrings/opensearch.gpg] https://artifacts.opensearch.org/releases/bundle/opensearch/2.x/apt stable main" | sudo tee /etc/apt/sources.list.d/opensearch.list
# Add Graylog repository
wget https://packages.graylog2.org/repo/packages/graylog-5.2-repository_latest.deb
sudo dpkg -i graylog-5.2-repository_latest.deb
sudo apt update
  3. Install and configure MongoDB for Graylog metadata storage.
sudo apt install -y mongodb-org
sudo systemctl daemon-reload
sudo systemctl enable mongod
sudo systemctl start mongod
# Verify MongoDB is running
mongosh --eval "db.adminCommand('ping')"
# Expected: { ok: 1 }
  4. Install and configure OpenSearch for log storage and indexing.
sudo apt install -y opensearch
# Configure OpenSearch for single-node deployment
sudo tee /etc/opensearch/opensearch.yml << 'EOF'
cluster.name: graylog-cluster
node.name: node-1
path.data: /var/lib/opensearch
path.logs: /var/log/opensearch
network.host: 127.0.0.1
discovery.type: single-node
plugins.security.disabled: true
indices.query.bool.max_clause_count: 32768
EOF
# Set JVM heap size to 50% of available memory (max 31 GB)
sudo sed -i 's/-Xms1g/-Xms8g/g' /etc/opensearch/jvm.options
sudo sed -i 's/-Xmx1g/-Xmx8g/g' /etc/opensearch/jvm.options
sudo systemctl daemon-reload
sudo systemctl enable opensearch
sudo systemctl start opensearch
# Verify OpenSearch is running
curl -s http://localhost:9200/_cluster/health | jq .status
# Expected: "green" or "yellow" (yellow is acceptable for single-node)
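The sed commands above hard-code an 8 GB heap for the 16 GB example server. On hosts with other memory sizes, the same rule (half of physical RAM, capped at 31 GB to stay under the compressed-oops threshold) can be derived with a short sketch; the `/proc/meminfo` path and the fallback value are Linux-specific assumptions:

```shell
#!/bin/bash
# Derive an OpenSearch heap size: half of physical RAM, capped at 31 GB,
# with a floor of 1 GB. Falls back to 16 GB worth of kB if /proc/meminfo
# is unavailable (assumption for illustration).
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo 2>/dev/null || echo 16777216)
heap_gb=$(( mem_kb / 1024 / 1024 / 2 ))
if [ "$heap_gb" -lt 1 ]; then heap_gb=1; fi
if [ "$heap_gb" -gt 31 ]; then heap_gb=31; fi
echo "Set -Xms${heap_gb}g and -Xmx${heap_gb}g in /etc/opensearch/jvm.options"
```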
  5. Install and configure Graylog server.
sudo apt install -y graylog-server
# Generate password secret (minimum 64 characters)
PASSWORD_SECRET=$(pwgen -N 1 -s 96)
# Generate admin password hash
echo -n "Enter admin password: " && read -s ADMIN_PASS && echo
ADMIN_HASH=$(echo -n "$ADMIN_PASS" | sha256sum | cut -d" " -f1)
# Configure Graylog
sudo tee /etc/graylog/server/server.conf << EOF
is_leader = true
node_id_file = /etc/graylog/server/node-id
password_secret = ${PASSWORD_SECRET}
root_password_sha2 = ${ADMIN_HASH}
root_email = admin@example.org
root_timezone = UTC
bin_dir = /usr/share/graylog-server/bin
data_dir = /var/lib/graylog-server
plugin_dir = /usr/share/graylog-server/plugin
http_bind_address = 0.0.0.0:9000
elasticsearch_hosts = http://127.0.0.1:9200
mongodb_uri = mongodb://127.0.0.1:27017/graylog
EOF
sudo systemctl daemon-reload
sudo systemctl enable graylog-server
sudo systemctl start graylog-server
  6. Verify the aggregation platform is operational by accessing the web interface at http://<server-ip>:9000. Log in with username admin and the password you configured. The system status indicator in the top navigation should show green.

The aggregation infrastructure is now ready to receive logs. The following diagram shows the architecture deployed in this phase:

+------------------------------------------------------------------+
|                      LOG AGGREGATION SERVER                      |
+------------------------------------------------------------------+
|                                                                  |
|   +------------------+        +------------------+               |
|   |                  |        |                  |               |
|   |     Graylog      |        |     MongoDB      |               |
|   |      Server      +------->+    (metadata)    |               |
|   |      :9000       |        |      :27017      |               |
|   |                  |        |                  |               |
|   +--------+---------+        +------------------+               |
|            |                                                     |
|            v                                                     |
|   +--------+---------+                                           |
|   |                  |                                           |
|   |    OpenSearch    |                                           |
|   |  (log storage)   |                                           |
|   |      :9200       |                                           |
|   |                  |                                           |
|   +------------------+                                           |
|                                                                  |
+------------------------------------------------------------------+
                                 ^
                                 |
                           Incoming logs
                      (TCP 5044, 514, 12201)

Figure 1: Single-node log aggregation architecture

Phase 2: Configure log inputs

Inputs define how the aggregation platform receives logs. Configure inputs before deploying collection agents to ensure the platform is ready to accept data.

  1. Create a Beats input for receiving logs from Filebeat and Winlogbeat agents. In the Graylog web interface, navigate to System → Inputs. Select “Beats” from the input type dropdown and click “Launch new input”.

    Configure the input with these parameters:

    | Parameter | Value |
    | --- | --- |
    | Title | Beats Input |
    | Bind address | 0.0.0.0 |
    | Port | 5044 |
    | No. of worker threads | 4 |
    | TLS cert file | Leave empty for initial setup |
    | TLS key file | Leave empty for initial setup |

    Click “Save” to create the input, then click “Start” to activate it.

  2. Create a syslog input for network devices and legacy systems that cannot run collection agents.

    Select “Syslog UDP” from the input type dropdown and configure:

    | Parameter | Value |
    | --- | --- |
    | Title | Syslog UDP Input |
    | Bind address | 0.0.0.0 |
    | Port | 514 |
    | Store full message | Yes |

    Create an additional “Syslog TCP” input on port 1514 for reliable delivery from systems that support TCP syslog.

  3. Create a GELF input for applications that support structured logging directly.

    Select “GELF UDP” and configure:

    | Parameter | Value |
    | --- | --- |
    | Title | GELF UDP Input |
    | Bind address | 0.0.0.0 |
    | Port | 12201 |
    | Decompress size limit | 8388608 |
  4. Verify inputs are running by checking the input status page. Each input should show “RUNNING” status with “0 messages” initially.

# Verify ports are listening
sudo ss -tlnp | grep -E "(5044|514|1514|12201)"
# Expected: Four lines showing graylog-server listening on each port
  5. Configure firewall rules to allow log traffic from your network.
sudo ufw allow from 10.0.0.0/8 to any port 5044 proto tcp comment "Beats"
sudo ufw allow from 10.0.0.0/8 to any port 514 proto udp comment "Syslog UDP"
sudo ufw allow from 10.0.0.0/8 to any port 1514 proto tcp comment "Syslog TCP"
sudo ufw allow from 10.0.0.0/8 to any port 12201 proto udp comment "GELF"
sudo ufw reload
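With inputs and firewall rules in place, a hand-built GELF message gives a quick end-to-end check before any agents exist. A sketch of a minimal GELF 1.1 payload; the send command is commented out because it assumes the log-aggregator.example.org name resolves from your host and that nc is installed:

```shell
#!/bin/bash
# Build a minimal GELF 1.1 payload. "version", "host", and
# "short_message" are the required fields; level 6 = informational.
host=$(hostname)
payload=$(printf '{"version":"1.1","host":"%s","short_message":"gelf input test","level":6}' "$host")
echo "$payload"

# Send it to the GELF UDP input (uncomment once DNS and firewall
# rules are in place; hostname is the example used throughout):
# printf '%s' "$payload" | nc -u -w1 log-aggregator.example.org 12201
```

After sending, the message should appear in a Graylog search within a few seconds, attributed to your host.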

Phase 3: Deploy collection agents

Collection agents run on log sources and forward logs to the aggregation platform. Use Filebeat for Linux systems and Winlogbeat for Windows systems.

  1. Install Filebeat on Linux servers. Download and install from the Elastic repository (Filebeat works with Graylog despite being an Elastic product).
curl -fsSL https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg -o /usr/share/keyrings/elastic.gpg --dearmor
echo "deb [signed-by=/usr/share/keyrings/elastic.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list
sudo apt update
sudo apt install -y filebeat
  2. Configure Filebeat to collect system logs and forward to your aggregation server.
sudo tee /etc/filebeat/filebeat.yml << 'EOF'
filebeat.inputs:
  - type: log
    enabled: true
    paths:
      - /var/log/syslog
      - /var/log/auth.log
      - /var/log/kern.log
    fields:
      log_type: syslog
    fields_under_root: true
  - type: log
    enabled: true
    paths:
      - /var/log/apache2/access.log
      - /var/log/nginx/access.log
    fields:
      log_type: webserver_access
    fields_under_root: true
  - type: log
    enabled: true
    paths:
      - /var/log/apache2/error.log
      - /var/log/nginx/error.log
    fields:
      log_type: webserver_error
    fields_under_root: true

processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_cloud_metadata: ~

output.logstash:
  hosts: ["log-aggregator.example.org:5044"]

logging.level: warning
logging.to_files: true
logging.files:
  path: /var/log/filebeat
  name: filebeat
  keepfiles: 3
EOF
  3. Enable and start Filebeat.
sudo systemctl enable filebeat
sudo systemctl start filebeat
# Verify Filebeat is running and shipping logs
sudo systemctl status filebeat
sudo filebeat test output
# Expected: "logstash: log-aggregator.example.org:5044... connection..."
  4. Install Winlogbeat on Windows servers. Download Winlogbeat 8.x from elastic.co and extract to C:\Program Files\Winlogbeat.
# Run in PowerShell as Administrator
cd "C:\Program Files\Winlogbeat"
.\install-service-winlogbeat.ps1
  5. Configure Winlogbeat to collect Windows Event Logs.

    Edit C:\Program Files\Winlogbeat\winlogbeat.yml:

winlogbeat.event_logs:
  - name: Application
    ignore_older: 72h
  - name: System
    ignore_older: 72h
  - name: Security
    ignore_older: 72h
  - name: Microsoft-Windows-Sysmon/Operational
    ignore_older: 72h
  - name: Microsoft-Windows-PowerShell/Operational
    ignore_older: 72h

processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded

output.logstash:
  hosts: ["log-aggregator.example.org:5044"]

logging.level: warning
logging.to_files: true
logging.files:
  path: C:\ProgramData\Winlogbeat\logs
  name: winlogbeat
  keepfiles: 3
  6. Start the Winlogbeat service.
Start-Service winlogbeat
Get-Service winlogbeat
# Expected: Status "Running"
# Test connectivity
cd "C:\Program Files\Winlogbeat"
.\winlogbeat.exe test output
  7. Configure network devices to forward syslog to the aggregation server. The exact procedure varies by vendor. For Cisco IOS devices:
configure terminal
logging host 10.0.1.50 transport udp port 514
logging trap informational
logging source-interface Loopback0
logging on
end
write memory

For Juniper Junos:

set system syslog host 10.0.1.50 any info
set system syslog host 10.0.1.50 port 514
commit

The log collection architecture now spans your infrastructure:

+------------------+     +------------------+     +------------------+
|  Linux Servers   |     | Windows Servers  |     | Network Devices  |
|                  |     |                  |     |                  |
|  +------------+  |     |  +------------+  |     |  +-----------+   |
|  |  Filebeat  |  |     |  | Winlogbeat |  |     |  |  Syslog   |   |
|  +-----+------+  |     |  +-----+------+  |     |  +-----+-----+   |
|        |         |     |        |         |     |        |         |
+--------+---------+     +--------+---------+     +--------+---------+
         |                        |                        |
         | TCP 5044               | TCP 5044               | UDP 514
         |                        |                        |
         +------------------------+------------------------+
                                  |
                                  v
                    +-------------+-------------+
                    |                           |
                    |      Log Aggregation      |
                    |          Server           |
                    |                           |
                    |  +-------+     +-------+  |
                    |  |Graylog|     | Open  |  |
                    |  |       +---->+ Search|  |
                    |  +-------+     +-------+  |
                    |                           |
                    +---------------------------+

Figure 2: Log collection flow from sources to aggregation

Phase 4: Configure log parsing

Raw logs arrive as unstructured text. Parsing extracts fields into structured data that can be searched and analysed. Configure extractors and pipelines to parse logs from each source type.

  1. Create an extractor for syslog messages to extract timestamp, hostname, facility, and message body. In Graylog, navigate to System → Inputs, select your Syslog UDP input, and click “Manage extractors”.

    Click “Add extractor” and select “Grok pattern”:

    | Parameter | Value |
    | --- | --- |
    | Source field | message |
    | Grok pattern | %{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:source_host} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message} |
    | Store as | extracted_fields |
    | Extractor title | Syslog Parser |
  2. Create a pipeline for web server access log parsing. Navigate to System → Pipelines and create a new pipeline named “Web Access Logs”.

    Create a pipeline rule:

rule "parse_nginx_access"
when
  has_field("log_type") AND to_string($message.log_type) == "webserver_access"
then
  let parsed = grok(
    pattern: "%{IPORHOST:client_ip} - %{DATA:user_name} \\[%{HTTPDATE:timestamp}\\] \"%{WORD:method} %{URIPATHPARAM:request} HTTP/%{NUMBER:http_version}\" %{NUMBER:response_code} %{NUMBER:bytes_sent} \"%{DATA:referrer}\" \"%{DATA:user_agent}\"",
    value: to_string($message.message),
    only_named_captures: true
  );
  set_fields(parsed);
end
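Before committing the rule, the extraction can be smoke-tested offline. This sketch uses a POSIX ERE approximation of the grok pattern; grok's %{IPORHOST} and %{HTTPDATE} are stricter than the character classes below, so treat a match here as a quick sanity check rather than full validation:

```shell
#!/bin/bash
# Sample line in the combined-log format the rule above targets
line='10.0.1.100 - - [05/Jan/2026:14:30:22 +0000] "GET /api/users HTTP/1.1" 200 1532 "-" "Mozilla/5.0"'

# POSIX ERE approximation of the grok pattern (loose stand-ins for
# %{IPORHOST}, %{HTTPDATE}, and friends)
re='^([0-9.]+) - ([^ ]+) \[([^]]+)\] "([A-Z]+) ([^ ]+) HTTP/([0-9.]+)" ([0-9]+) ([0-9]+)'

if [[ $line =~ $re ]]; then
  echo "client_ip=${BASH_REMATCH[1]}"
  echo "method=${BASH_REMATCH[4]}"
  echo "response_code=${BASH_REMATCH[7]}"
else
  echo "pattern did not match" >&2
  exit 1
fi
```

If the approximation fails against a sample of your real access logs, the grok pattern will likely fail too; adjust both against the actual log format.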
  3. Create index field mappings for parsed fields. Navigate to System → Index Sets and edit your default index set. Under “Field Type Profiles”, add mappings:

    | Field | Type |
    | --- | --- |
    | client_ip | ip |
    | response_code | long |
    | bytes_sent | long |
    | syslog_pid | long |
    | timestamp | date |
  4. Connect the pipeline to the appropriate input stream. Navigate to Streams, select or create a stream for web logs (e.g., “Web Server Logs”), and connect the “Web Access Logs” pipeline.

  5. Verify parsing by examining recent messages. Navigate to Search, select the last 5 minutes, and examine a web access log message. The message should display parsed fields (client_ip, method, request, response_code) rather than raw text only.

The parsing pipeline transforms raw text into structured, queryable data:

+-------------------------------------------------------------------+
|                       LOG PARSING PIPELINE                        |
+-------------------------------------------------------------------+
|                                                                   |
|  +-----------------+     +-----------------+     +--------------+ |
|  |                 |     |                 |     |              | |
|  |     Raw Log     |     |   Grok/Regex    |     |  Structured  | |
|  |     Message     +---->+   Extractors    +---->+    Fields    | |
|  |                 |     |                 |     |              | |
|  +-----------------+     +-----------------+     +--------------+ |
|                                                                   |
|  Example transformation:                                          |
|                                                                   |
|  INPUT:                                                           |
|    "10.0.1.100 - - [05/Jan/2026:14:30:22 +0000] \"GET /api/users  |
|    HTTP/1.1\" 200 1532 \"-\" \"Mozilla/5.0\""                     |
|                                                                   |
|  OUTPUT:                                                          |
|    +------------------+----------------------+                    |
|    | client_ip        | 10.0.1.100           |                    |
|    | timestamp        | 2026-01-05T14:30:22Z |                    |
|    | method           | GET                  |                    |
|    | request          | /api/users           |                    |
|    | response_code    | 200                  |                    |
|    | bytes_sent       | 1532                 |                    |
|    | user_agent       | Mozilla/5.0          |                    |
|    +------------------+----------------------+                    |
|                                                                   |
+-------------------------------------------------------------------+

Figure 3: Log parsing extracts structured fields from raw messages

Phase 5: Configure retention and archival

Retention rules determine how long logs remain searchable and when they move to cold storage or deletion. Configure retention based on your organisation’s policy requirements and storage capacity.

  1. Configure index rotation to create new indices on a regular schedule. Navigate to System → Index Sets and edit your default index set.

    Set rotation strategy:

    | Parameter | Value |
    | --- | --- |
    | Rotation strategy | Index Time |
    | Rotation period | P1D (daily rotation) |
    | Max number of indices | 90 |

    This configuration retains 90 days of searchable logs. Indices older than 90 days are automatically deleted.

  2. Configure retention strategy for logs requiring longer retention. Create a separate index set for compliance-relevant logs (authentication, security events).

    Navigate to System → Index Sets and click “Create index set”:

    | Parameter | Value |
    | --- | --- |
    | Title | Compliance Logs |
    | Index prefix | compliance |
    | Rotation strategy | Index Time |
    | Rotation period | P1D |
    | Retention strategy | Close |
    | Max number of indices | 365 |

    The “Close” strategy keeps indices on disk but removes them from active search, reducing resource usage while maintaining data.

  3. Create a stream to route compliance-relevant logs to the compliance index set. Navigate to Streams and create:

    | Parameter | Value |
    | --- | --- |
    | Title | Compliance Logs |
    | Index Set | Compliance Logs |
    | Description | Authentication and security events for compliance retention |

    Add stream rules to capture relevant logs:

    • Field log_type equals security
    • Field EventID exists (Windows Security logs)
    • Field syslog_program equals sshd
    • Field syslog_program equals sudo
  4. Configure archival to external storage for logs requiring retention beyond active index capacity. Set up an archive configuration that exports closed indices to object storage.

# Create archive script (run from cron). Assumes a filesystem snapshot
# repository named "archive" has been registered in OpenSearch with its
# location (e.g. /mnt/archive/graylog) listed in path.repo.
sudo tee /usr/local/bin/archive-indices.sh << 'EOF'
#!/bin/bash
set -euo pipefail
DAYS_OLD=90
CUTOFF=$(date -d "-${DAYS_OLD} days" +%Y-%m-%d)

# Find graylog indices older than the retention period
INDICES=$(curl -s 'http://localhost:9200/_cat/indices?h=index,creation.date.string' \
  | awk -v d="$CUTOFF" '$2 < d {print $1}' \
  | grep -E '^graylog_' || true)

for INDEX in $INDICES; do
  # Snapshot the index into the archive repository, waiting for completion
  curl -sf -X PUT \
    "http://localhost:9200/_snapshot/archive/${INDEX}?wait_for_completion=true" \
    -H 'Content-Type: application/json' \
    -d "{\"indices\": \"${INDEX}\"}" || continue
  # Delete the index only after the snapshot succeeded
  curl -s -X DELETE "http://localhost:9200/${INDEX}"
done
EOF
sudo chmod +x /usr/local/bin/archive-indices.sh
# Schedule daily archival at 02:00
echo "0 2 * * * root /usr/local/bin/archive-indices.sh >> /var/log/archive-indices.log 2>&1" \
  | sudo tee /etc/cron.d/archive-indices
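The age filter is the riskiest part of the archive script, since a wrong comparison deletes data. It can be dry-run against canned _cat output; fixed dates make the result deterministic (ISO timestamps compare lexicographically against a YYYY-MM-DD cutoff):

```shell
#!/bin/bash
# Dry-run the script's age filter with fixed dates instead of live data
cutoff="2026-01-01"
printf 'graylog_10 2025-09-01T00:00:00.000Z\ngraylog_42 2026-02-01T00:00:00.000Z\n' \
  | awk -v d="$cutoff" '$2 < d {print $1}'
# Prints only graylog_10: it predates the cutoff; graylog_42 does not
```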
  5. Verify retention configuration by checking index status.
curl -s 'http://localhost:9200/_cat/indices?v&h=index,docs.count,store.size,creation.date.string' \
| grep graylog | sort -k4

Expected output shows indices with creation dates spanning your retention period, with oldest indices at the top of the sorted list.

The retention architecture maintains logs across active, closed, and archived tiers:

+--------------------------------------------------------------------------+
|                         LOG RETENTION LIFECYCLE                          |
+--------------------------------------------------------------------------+
|                                                                          |
|    Days 0-30       Days 31-90       Days 91-365         365+             |
|  +-----------+    +-----------+    +-----------+    +----------+         |
|  |           |    |           |    |           |    |          |         |
|  |    HOT    |    |   WARM    |    |   COLD    |    | ARCHIVE  |         |
|  |           |    |           |    |           |    |          |         |
|  | Active    +--->+ Reduced   +--->+ Closed    +--->+ Object   |         |
|  | indexing  |    | replicas  |    | indices   |    | storage  |         |
|  | Full      |    | Full      |    | Searchable|    | Restore  |         |
|  | search    |    | search    |    | on demand |    | on req   |         |
|  |           |    |           |    |           |    |          |         |
|  +-----------+    +-----------+    +-----------+    +----------+         |
|                                                                          |
|  Storage cost:   High           Medium          Low             Minimal  |
|  Query speed:    Fast           Fast            Slow            Minutes  |
|                                                                          |
+--------------------------------------------------------------------------+

Figure 4: Log retention tiers from hot storage to archive

Phase 6: Configure access controls

Restrict log access based on data sensitivity and job function. Security logs containing authentication details require tighter controls than general application logs.

  1. Create roles for different access levels. Navigate to System → Authentication → Roles.

    Create the following roles:

    | Role | Permissions |
    | --- | --- |
    | Log Reader | Read access to general logs stream; no access to compliance stream |
    | Security Analyst | Read access to all streams; search and export permissions |
    | Log Administrator | Full administrative access including configuration |
  2. Configure role permissions. For the “Log Reader” role, add permissions:

    • streams:read for general log streams
    • searches:relative and searches:absolute for search capability
    • Explicitly exclude the Compliance Logs stream
  3. Create user accounts and assign appropriate roles. Navigate to System → Authentication → Users and create accounts for each team member requiring log access.

  4. Enable audit logging for the log management platform itself. Add to /etc/graylog/server/server.conf:

audit_log_enabled = true
audit_log_dir = /var/log/graylog/audit

Restart Graylog server and verify audit logs are being generated:

sudo systemctl restart graylog-server
ls -la /var/log/graylog/audit/
# Expected: audit log files with recent timestamps

Verification

After completing all phases, verify the log management system is operating correctly.

Collection verification

Confirm logs are arriving from all configured sources:

# Check message count by source in the last hour
curl -s -u admin:password 'http://localhost:9000/api/search/universal/relative?query=*&range=3600&fields=source' \
| jq '.messages[].message.source' | sort | uniq -c | sort -rn

Expected output shows message counts from each configured log source. Missing sources indicate collection agent issues.

Parsing verification

Confirm structured fields are being extracted:

# Search for web access logs with parsed response codes
curl -s -u admin:password 'http://localhost:9000/api/search/universal/relative?query=response_code:200&range=3600' \
| jq '.total_results'

A non-zero result confirms the web access log parser is functioning.

Retention verification

Confirm index lifecycle is operating:

# List indices by age
curl -s 'http://localhost:9200/_cat/indices?v&h=index,docs.count,store.size,creation.date.string' \
| grep -E '^(graylog|compliance)' | sort -k4

Verify the oldest index date does not exceed your retention policy (90 days for general logs, 365 days for compliance logs in the example configuration).

Query performance verification

Test search performance meets operational requirements:

# Time a complex query
time curl -s -u admin:password \
'http://localhost:9000/api/search/universal/relative?query=response_code:500%20AND%20source:webserver*&range=86400' \
> /dev/null

Queries against the last 24 hours should complete in under 5 seconds for a properly sized deployment.

Troubleshooting

| Symptom | Cause | Resolution |
| --- | --- | --- |
| Filebeat shows “connection refused” | Aggregation server input not running or firewall blocking | Verify input status in Graylog; check firewall rules with sudo ufw status |
| Logs arriving but no parsed fields | Extractor pattern not matching log format | Test pattern in Graylog extractor tester; adjust regex for actual log format |
| OpenSearch cluster status “red” | Primary shard unassigned, often disk space | Check disk with df -h; review /var/log/opensearch/ for errors |
| High memory usage on aggregation server | JVM heap undersized for index volume | Increase OpenSearch heap in /etc/opensearch/jvm.options; restart service |
| Windows Event Logs not arriving | Winlogbeat service stopped or misconfigured | Check Get-Service winlogbeat; review logs in C:\ProgramData\Winlogbeat\logs |
| Network device logs missing hostname | Syslog not including hostname in message | Configure device to include hostname; use DNS lookup on aggregation server |
| Search queries timing out | Index too large or insufficient resources | Implement index lifecycle to reduce active index size; add resources |
| Duplicate log entries | Multiple collection paths or agent misconfiguration | Review agent config for overlapping paths; check for both syslog and agent collection |
| Timestamp parsing incorrect | Timezone mismatch between source and parser | Configure timezone in extractor; ensure NTP synchronisation on sources |
| Logs disappearing before retention period | Retention policy misconfigured or disk pressure | Review index set configuration; check OpenSearch watermark settings |
| High CPU during index rotation | Large indices rotating during business hours | Schedule rotation during off-hours; reduce index rotation frequency |
| GELF messages truncated | Message exceeds chunk size limit | Increase decompress_size_limit on GELF input; configure application to chunk properly |
| Permission denied in Graylog UI | User role lacks required permissions | Review role assignments; add necessary stream and search permissions |
| Archive script failing | Insufficient permissions or storage full | Check script runs as root; verify archive storage has available space |

Log source not appearing

When a configured source does not appear in log data, diagnose systematically:

# On Linux source: verify Filebeat is running and shipping
sudo systemctl status filebeat
sudo journalctl -u filebeat -n 50
# Check Filebeat registry for file positions
sudo jq -c 'select(.v != null) | {source: .v.source, offset: .v.offset}' /var/lib/filebeat/registry/filebeat/log.json
# On aggregation server: check for connection attempts
sudo tcpdump -i any port 5044 -c 10
# In Graylog: check input metrics
curl -s -u admin:password 'http://localhost:9000/api/system/metrics' | jq '.gauges | to_entries[] | select(.key | contains("input"))'

High ingestion latency

When logs appear with significant delay (more than 60 seconds from generation to searchability):

# Check OpenSearch indexing queue
curl -s 'http://localhost:9200/_nodes/stats/thread_pool' | jq '.nodes[].thread_pool.write'
# High "queue" and "rejected" values indicate indexing bottleneck
# Check Graylog processing buffer
curl -s -u admin:password 'http://localhost:9000/api/system/buffers' | jq '.buffers.process'
# utilization_percent above 80 indicates processing bottleneck
# Review journal size
ls -lh /var/lib/graylog-server/journal/
# Large journal indicates messages queuing faster than processing

Resolution depends on where the bottleneck sits: add OpenSearch nodes for indexing bottlenecks; increase processing buffer capacity or add Graylog nodes for processing bottlenecks.

See also