Log Management
Log management establishes the infrastructure and procedures for capturing machine-generated records from across your IT environment into a centralised system where they can be searched, analysed, and retained according to policy. This task covers deploying collection agents, configuring forwarding and aggregation, setting up parsing pipelines, and managing log lifecycle from ingestion through archival.
Prerequisites
Before implementing log management, confirm the following requirements are in place.
| Requirement | Detail |
|---|---|
| Log source inventory | Document all systems that generate logs: servers, network devices, applications, cloud services, security tools |
| Storage capacity | Calculate required storage based on log volume estimates; plan for 1.5x headroom above projected daily ingestion |
| Retention policy | Obtain approved retention periods from Data Retention and Records for each log category |
| Network access | Collection agents require outbound connectivity to aggregation tier on TCP 514 (syslog), TCP 5044 (Beats), or TCP 6514 (syslog-TLS) |
| Administrative access | Root or administrator privileges on log sources for agent installation; administrative access to log management platform |
| Time synchronisation | All log sources must synchronise to NTP; log correlation fails with clock skew exceeding 1 second |
Verify NTP synchronisation on Linux systems:
```bash
timedatectl status | grep -E "(synchronized|NTP)"
# Expected output:
# System clock synchronized: yes
# NTP service: active
```

On Windows systems:

```powershell
w32tm /query /status
# Verify "Stratum" is less than 5 and "Last Successful Sync Time" is recent
```

Storage estimation
Calculate storage requirements before deployment. Log volume varies substantially by source type and verbosity level.
| Source type | Typical daily volume per instance | Notes |
|---|---|---|
| Linux server (syslog) | 50-200 MB | Increases with service count |
| Windows server (Event Log) | 100-500 MB | Security logs dominate |
| Web server (access logs) | 1-10 GB | Scales with request volume |
| Application (structured JSON) | 200 MB-2 GB | Depends on logging level |
| Network device | 10-100 MB | Increases with flow logging |
| Container platform | 500 MB-5 GB | Per node, not per container |
For an organisation with 50 Linux servers, 20 Windows servers, 5 web servers, and 10 network devices, calculate baseline daily volume:
```text
Linux:    50 × 100 MB =  5,000 MB
Windows:  20 × 300 MB =  6,000 MB
Web:       5 × 3 GB   = 15,000 MB
Network:  10 × 50 MB  =    500 MB
                        ---------
Daily total:            26,500 MB (approximately 26 GB)
```

With a 90-day retention requirement and 1.5x headroom factor:

```text
26 GB × 90 days × 1.5 = 3,510 GB (approximately 3.5 TB)
```

This calculation assumes raw log storage. Compression reduces requirements by 70-90% depending on log content, but indexing for search adds 10-30% overhead. Plan for net storage of approximately 50% of raw calculated volume when using compressed, indexed storage.
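The arithmetic above is easy to get wrong by hand. A short helper script (a sketch: the fleet counts and per-instance volumes are the example figures above, not universal constants) recomputes the estimate and can be re-run as the fleet grows:

```shell
#!/bin/bash
# Estimate raw log storage (GB) from per-source daily volumes in MB.
estimate_storage_gb() {
  local retention_days=$1
  local headroom_pct=$2   # 150 means a 1.5x headroom factor
  # Linux + Windows + Web + Network, from the example fleet above: 26,500 MB/day
  local daily_mb=$(( 50*100 + 20*300 + 5*3000 + 10*50 ))
  # total = daily volume x retention x headroom, converted MB -> GB
  echo $(( daily_mb * retention_days * headroom_pct / 100 / 1024 ))
}

estimate_storage_gb 90 150   # prints 3493 (~3.5 TB, matching the estimate above)
```

Integer division rounds down slightly against the hand calculation, which rounded the daily total up to 26 GB first; either figure supports the same provisioning decision.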
Procedure
Phase 1: Deploy log aggregation infrastructure
The aggregation tier receives logs from all sources, parses them into structured format, and stores them for analysis. Deploy this infrastructure before configuring collection agents.
- Provision the log aggregation server with resources appropriate to your volume. For the 26 GB daily example, allocate a minimum of 4 CPU cores, 16 GB RAM, and 4 TB storage. Create the server and install the base operating system (Ubuntu 22.04 LTS or later).
```bash
# Verify system resources
nproc       # Expected: 4 or higher
free -h     # Expected: 16 GB or higher total memory
df -h /var  # Expected: 4 TB or higher available
```

- Install the log aggregation platform. This procedure uses Graylog with OpenSearch as the storage backend, representing an open-source option. Commercial alternatives include Splunk, Elastic Cloud, and Datadog.
```bash
# Install dependencies
sudo apt update
sudo apt install -y apt-transport-https openjdk-17-jre-headless uuid-runtime pwgen

# Add MongoDB repository (required for Graylog metadata)
curl -fsSL https://pgp.mongodb.com/server-7.0.asc | sudo gpg -o /usr/share/keyrings/mongodb-server-7.0.gpg --dearmor
echo "deb [ signed-by=/usr/share/keyrings/mongodb-server-7.0.gpg ] https://repo.mongodb.org/apt/ubuntu jammy/mongodb-org/7.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-7.0.list

# Add OpenSearch repository
curl -fsSL https://artifacts.opensearch.org/publickeys/opensearch.pgp | sudo gpg -o /usr/share/keyrings/opensearch.gpg --dearmor
echo "deb [signed-by=/usr/share/keyrings/opensearch.gpg] https://artifacts.opensearch.org/releases/bundle/opensearch/2.x/apt stable main" | sudo tee /etc/apt/sources.list.d/opensearch.list

# Add Graylog repository
wget https://packages.graylog2.org/repo/packages/graylog-5.2-repository_latest.deb
sudo dpkg -i graylog-5.2-repository_latest.deb

sudo apt update
```

- Install and configure MongoDB for Graylog metadata storage.
```bash
sudo apt install -y mongodb-org
sudo systemctl daemon-reload
sudo systemctl enable mongod
sudo systemctl start mongod

# Verify MongoDB is running
mongosh --eval "db.adminCommand('ping')"
# Expected: { ok: 1 }
```

- Install and configure OpenSearch for log storage and indexing.
```bash
sudo apt install -y opensearch

# Configure OpenSearch for single-node deployment
sudo tee /etc/opensearch/opensearch.yml << 'EOF'
cluster.name: graylog-cluster
node.name: node-1
path.data: /var/lib/opensearch
path.logs: /var/log/opensearch
network.host: 127.0.0.1
discovery.type: single-node
plugins.security.disabled: true
indices.query.bool.max_clause_count: 32768
EOF

# Set JVM heap size to 50% of available memory (max 31 GB)
sudo sed -i 's/-Xms1g/-Xms8g/g' /etc/opensearch/jvm.options
sudo sed -i 's/-Xmx1g/-Xmx8g/g' /etc/opensearch/jvm.options
```
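The 8 GB value above assumes the 16 GB example server. A small helper (a sketch, not part of the official tooling) derives the right figure for any host: half of system RAM, capped at 31 GB so the JVM keeps compressed ordinary object pointers:

```shell
#!/bin/bash
# Heap sizing rule: min(RAM / 2, 31 GB). mem_kb is MemTotal in kilobytes.
heap_gb_for() {
  local mem_kb=$1
  local heap=$(( mem_kb / 1024 / 1024 / 2 ))
  [ "$heap" -gt 31 ] && heap=31
  [ "$heap" -lt 1 ] && heap=1
  echo "$heap"
}

mem_kb=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
echo "Set -Xms$(heap_gb_for "$mem_kb")g and -Xmx$(heap_gb_for "$mem_kb")g in /etc/opensearch/jvm.options"
```

Always set `-Xms` and `-Xmx` to the same value so the heap never resizes at runtime.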
```bash
sudo systemctl daemon-reload
sudo systemctl enable opensearch
sudo systemctl start opensearch

# Verify OpenSearch is running
curl -s http://localhost:9200/_cluster/health | jq .status
# Expected: "green" or "yellow" (yellow is acceptable for single-node)
```

- Install and configure Graylog server.
```bash
sudo apt install -y graylog-server

# Generate password secret (minimum 64 characters)
PASSWORD_SECRET=$(pwgen -N 1 -s 96)

# Generate admin password hash
echo -n "Enter admin password: " && read -s ADMIN_PASS && echo
ADMIN_HASH=$(echo -n "$ADMIN_PASS" | sha256sum | cut -d" " -f1)

# Configure Graylog
sudo tee /etc/graylog/server/server.conf << EOF
is_leader = true
node_id_file = /etc/graylog/server/node-id
password_secret = ${PASSWORD_SECRET}
root_password_sha2 = ${ADMIN_HASH}
root_email = admin@example.org
root_timezone = UTC
bin_dir = /usr/share/graylog-server/bin
data_dir = /var/lib/graylog-server
plugin_dir = /usr/share/graylog-server/plugin
http_bind_address = 0.0.0.0:9000
elasticsearch_hosts = http://127.0.0.1:9200
mongodb_uri = mongodb://127.0.0.1:27017/graylog
EOF
```
```bash
sudo systemctl daemon-reload
sudo systemctl enable graylog-server
sudo systemctl start graylog-server
```

- Verify the aggregation platform is operational by accessing the web interface at `http://<server-ip>:9000`. Log in with username `admin` and the password you configured. The system status indicator in the top navigation should show green.
The aggregation infrastructure is now ready to receive logs. The following diagram shows the architecture deployed in this phase:
```text
+---------------------------------------------------+
|              LOG AGGREGATION SERVER               |
|                                                   |
|  +------------------+     +------------------+    |
|  |     Graylog      |     |     MongoDB      |    |
|  |     Server       +---->+    (metadata)    |    |
|  |     :9000        |     |     :27017       |    |
|  +--------+---------+     +------------------+    |
|           |                                       |
|           v                                       |
|  +--------+---------+                             |
|  |    OpenSearch    |                             |
|  |  (log storage)   |                             |
|  |      :9200       |                             |
|  +------------------+                             |
+---------------------------------------------------+
           ^
           | Incoming logs (TCP 5044, 514, 12201)
```

Figure 1: Single-node log aggregation architecture
Phase 2: Configure log inputs
Inputs define how the aggregation platform receives logs. Configure inputs before deploying collection agents to ensure the platform is ready to accept data.
Create a Beats input for receiving logs from Filebeat and Winlogbeat agents. In the Graylog web interface, navigate to System → Inputs. Select “Beats” from the input type dropdown and click “Launch new input”.
Configure the input with these parameters:
| Parameter | Value |
|---|---|
| Title | Beats Input |
| Bind address | 0.0.0.0 |
| Port | 5044 |
| No. of worker threads | 4 |
| TLS cert file | Leave empty for initial setup |
| TLS key file | Leave empty for initial setup |

Click “Save” to create the input, then click “Start” to activate it.
Create a syslog input for network devices and legacy systems that cannot run collection agents.
Select “Syslog UDP” from the input type dropdown and configure:
| Parameter | Value |
|---|---|
| Title | Syslog UDP Input |
| Bind address | 0.0.0.0 |
| Port | 514 |
| Store full message | Yes |

Create an additional “Syslog TCP” input on port 1514 for reliable delivery from systems that support TCP syslog.
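Before reconfiguring network devices, you can smoke-test the UDP input from any Linux host. This sketch hand-crafts an RFC 3164-style datagram and sends it with bash's built-in `/dev/udp` redirection; UDP is fire-and-forget, so a successful send only proves the packet left the host. The `127.0.0.1` target is a stand-in for your aggregation server's address:

```shell
#!/bin/bash
# Send a minimal syslog datagram: <134> encodes facility local0, severity info.
send_syslog_test() {
  local target=$1
  local msg="<134>$(date '+%b %e %T') $(hostname) input-check: syslog UDP input test"
  printf '%s\n' "$msg" > "/dev/udp/${target}/514" && echo "sent to ${target}"
}

send_syslog_test 127.0.0.1
# Then search for "input-check" in the Graylog UI over the last 5 minutes.
```

If the message does not appear in Graylog, check the input status and the firewall rules before suspecting the device configuration.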
Create a GELF input for applications that support structured logging directly.
Select “GELF UDP” and configure:
| Parameter | Value |
|---|---|
| Title | GELF UDP Input |
| Bind address | 0.0.0.0 |
| Port | 12201 |
| Decompress size limit | 8388608 |

Verify inputs are running by checking the input status page. Each input should show “RUNNING” status with “0 messages” initially.
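A similar smoke test works for the GELF input. This sketch builds a minimal GELF 1.1 payload with `jq` (which guarantees valid JSON) and pushes it over UDP; again, `127.0.0.1` stands in for the aggregation server:

```shell
#!/bin/bash
# Construct the smallest useful GELF 1.1 message: version, host, short_message,
# plus a numeric syslog-style level (6 = informational).
gelf_payload() {
  jq -cn --arg host "$(hostname)" \
    '{version: "1.1", host: $host, short_message: "GELF input test", level: 6}'
}

gelf_payload > /dev/udp/127.0.0.1/12201 && echo "GELF message sent"
```

Uncompressed single-datagram messages like this need no chunking; applications sending larger payloads must compress or chunk per the GELF spec.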
```bash
# Verify ports are listening (-t TCP, -u UDP; the syslog and GELF inputs are UDP)
sudo ss -tulnp | grep -E "(5044|514|1514|12201)"
# Expected: four lines showing graylog-server bound to each port
```

- Configure firewall rules to allow log traffic from your network.
```bash
sudo ufw allow from 10.0.0.0/8 to any port 5044 proto tcp comment "Beats"
sudo ufw allow from 10.0.0.0/8 to any port 514 proto udp comment "Syslog UDP"
sudo ufw allow from 10.0.0.0/8 to any port 1514 proto tcp comment "Syslog TCP"
sudo ufw allow from 10.0.0.0/8 to any port 12201 proto udp comment "GELF"
sudo ufw reload
```

Phase 3: Deploy collection agents
Collection agents run on log sources and forward logs to the aggregation platform. Use Filebeat for Linux systems and Winlogbeat for Windows systems.
- Install Filebeat on Linux servers. Download and install from the Elastic repository (Filebeat works with Graylog despite being an Elastic product).
```bash
curl -fsSL https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg -o /usr/share/keyrings/elastic.gpg --dearmor
echo "deb [signed-by=/usr/share/keyrings/elastic.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list
sudo apt update
sudo apt install -y filebeat
```

- Configure Filebeat to collect system logs and forward to your aggregation server.
```bash
sudo tee /etc/filebeat/filebeat.yml << 'EOF'
filebeat.inputs:
  - type: log
    enabled: true
    paths:
      - /var/log/syslog
      - /var/log/auth.log
      - /var/log/kern.log
    fields:
      log_type: syslog
    fields_under_root: true

  - type: log
    enabled: true
    paths:
      - /var/log/apache2/access.log
      - /var/log/nginx/access.log
    fields:
      log_type: webserver_access
    fields_under_root: true

  - type: log
    enabled: true
    paths:
      - /var/log/apache2/error.log
      - /var/log/nginx/error.log
    fields:
      log_type: webserver_error
    fields_under_root: true

processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_cloud_metadata: ~

output.logstash:
  hosts: ["log-aggregator.example.org:5044"]

logging.level: warning
logging.to_files: true
logging.files:
  path: /var/log/filebeat
  name: filebeat
  keepfiles: 3
EOF
```

- Enable and start Filebeat.
```bash
sudo systemctl enable filebeat
sudo systemctl start filebeat

# Verify Filebeat is running and shipping logs
sudo systemctl status filebeat
sudo filebeat test output
# Expected: "logstash: log-aggregator.example.org:5044... connection..."
```

- Install Winlogbeat on Windows servers. Download Winlogbeat 8.x from elastic.co and extract to `C:\Program Files\Winlogbeat`.
```powershell
# Run in PowerShell as Administrator
cd "C:\Program Files\Winlogbeat"
.\install-service-winlogbeat.ps1
```

Configure Winlogbeat to collect Windows Event Logs.
Edit `C:\Program Files\Winlogbeat\winlogbeat.yml`:
```yaml
winlogbeat.event_logs:
  - name: Application
    ignore_older: 72h
  - name: System
    ignore_older: 72h
  - name: Security
    ignore_older: 72h
  - name: Microsoft-Windows-Sysmon/Operational
    ignore_older: 72h
  - name: Microsoft-Windows-PowerShell/Operational
    ignore_older: 72h

processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded

output.logstash:
  hosts: ["log-aggregator.example.org:5044"]

logging.level: warning
logging.to_files: true
logging.files:
  path: C:\ProgramData\Winlogbeat\logs
  name: winlogbeat
  keepfiles: 3
```

- Start the Winlogbeat service.
```powershell
Start-Service winlogbeat
Get-Service winlogbeat
# Expected: Status "Running"

# Test connectivity
cd "C:\Program Files\Winlogbeat"
.\winlogbeat.exe test output
```

- Configure network devices to forward syslog to the aggregation server. The exact procedure varies by vendor. For Cisco IOS devices:
```text
configure terminal
 logging host 10.0.1.50 transport udp port 514
 logging trap informational
 logging source-interface Loopback0
 logging on
end
write memory
```

For Juniper Junos:

```text
set system syslog host 10.0.1.50 any info
set system syslog host 10.0.1.50 port 514
commit
```

The log collection architecture now spans your infrastructure:
```text
+----------------+     +-----------------+     +-----------------+
| Linux Servers  |     | Windows Servers |     | Network Devices |
|                |     |                 |     |                 |
| +------------+ |     | +-------------+ |     |  +-----------+  |
| |  Filebeat  | |     | | Winlogbeat  | |     |  |  Syslog   |  |
| +-----+------+ |     | +------+------+ |     |  +-----+-----+  |
+-------+--------+     +--------+--------+     +--------+--------+
        |                       |                       |
        | TCP 5044              | TCP 5044              | UDP 514
        +-----------------------+-----------------------+
                                |
                                v
                  +-------------+-------------+
                  |      Log Aggregation      |
                  |          Server           |
                  |                           |
                  |  +---------+  +--------+  |
                  |  | Graylog +->+  Open  |  |
                  |  |         |  | Search |  |
                  |  +---------+  +--------+  |
                  +---------------------------+
```

Figure 2: Log collection flow from sources to aggregation
Phase 4: Configure log parsing
Raw logs arrive as unstructured text. Parsing extracts fields into structured data that can be searched and analysed. Configure extractors and pipelines to parse logs from each source type.
Create an extractor for syslog messages to extract timestamp, hostname, facility, and message body. In Graylog, navigate to System → Inputs, select your Syslog UDP input, and click “Manage extractors”.
Click “Add extractor” and select “Grok pattern”:
| Parameter | Value |
|---|---|
| Source field | message |
| Grok pattern | `%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:source_host} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}` |
| Store as | extracted_fields |
| Extractor title | Syslog Parser |

Create a pipeline for web server access log parsing. Navigate to System → Pipelines and create a new pipeline named “Web Access Logs”.
Create a pipeline rule:
```text
rule "parse_nginx_access"
when
  has_field("log_type") AND to_string($message.log_type) == "webserver_access"
then
  let parsed = grok(
    pattern: "%{IPORHOST:client_ip} - %{DATA:user_name} \\[%{HTTPDATE:timestamp}\\] \"%{WORD:method} %{URIPATHPARAM:request} HTTP/%{NUMBER:http_version}\" %{NUMBER:response_code} %{NUMBER:bytes_sent} \"%{DATA:referrer}\" \"%{DATA:user_agent}\"",
    value: to_string($message.message),
    only_named_captures: true
  );
  set_fields(parsed);
end
```

Create index field mappings for parsed fields. Navigate to System → Index Sets and edit your default index set. Under “Field Type Profiles”, add mappings:
| Field | Type |
|---|---|
| client_ip | ip |
| response_code | long |
| bytes_sent | long |
| syslog_pid | long |
| timestamp | date |

Connect the pipeline to the appropriate input stream. Navigate to Streams, select or create a stream for web logs (e.g., “Web Server Logs”), and connect the “Web Access Logs” pipeline.
Verify parsing by examining recent messages. Navigate to Search, select the last 5 minutes, and examine a web access log message. The message should display parsed fields (client_ip, method, request, response_code) rather than raw text only.
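When parsing fails, the usual cause is the log format drifting away from the pattern. The grok pattern in the rule corresponds roughly to the extended regex below (an approximation written by hand, not a generated equivalent); checking a sample line offline with `grep -E` catches format mismatches before logs ever reach the pipeline:

```shell
#!/bin/bash
# Rough ERE equivalent of the access-log grok pattern above.
check_access_format() {
  local line=$1
  local re='^[^ ]+ - [^ ]* \[[^]]+\] "[A-Z]+ [^ ]+ HTTP/[0-9.]+" [0-9]{3} [0-9]+ "[^"]*" "[^"]*"$'
  if printf '%s' "$line" | grep -Eq "$re"; then
    echo "format matches"
  else
    echo "format mismatch"
  fi
}

check_access_format '10.0.1.100 - - [05/Jan/2026:14:30:22 +0000] "GET /api/users HTTP/1.1" 200 1532 "-" "Mozilla/5.0"'
```

A mismatch here means the web server's `log_format` directive has been customised and the grok pattern needs adjusting to match.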
The parsing pipeline transforms raw text into structured, queryable data:
```text
+---------------+      +---------------+      +---------------+
|    Raw Log    |      |  Grok/Regex   |      |  Structured   |
|    Message    +----->+  Extractors   +----->+    Fields     |
+---------------+      +---------------+      +---------------+

Example transformation:

INPUT:
  10.0.1.100 - - [05/Jan/2026:14:30:22 +0000] "GET /api/users HTTP/1.1"
  200 1532 "-" "Mozilla/5.0"

OUTPUT:
  +---------------+----------------------+
  | client_ip     | 10.0.1.100           |
  | timestamp     | 2026-01-05T14:30:22Z |
  | method        | GET                  |
  | request       | /api/users           |
  | response_code | 200                  |
  | bytes_sent    | 1532                 |
  | user_agent    | Mozilla/5.0          |
  +---------------+----------------------+
```

Figure 3: Log parsing extracts structured fields from raw messages
Phase 5: Configure retention and archival
Retention rules determine how long logs remain searchable and when they move to cold storage or deletion. Configure retention based on your organisation’s policy requirements and storage capacity.
Configure index rotation to create new indices on a regular schedule. Navigate to System → Index Sets and edit your default index set.
Set rotation strategy:
| Parameter | Value |
|---|---|
| Rotation strategy | Index Time |
| Rotation period | P1D (daily rotation) |
| Max number of indices | 90 |

This configuration retains 90 days of searchable logs. Indices older than 90 days are automatically deleted.
Configure retention strategy for logs requiring longer retention. Create a separate index set for compliance-relevant logs (authentication, security events).
Navigate to System → Index Sets and click “Create index set”:
| Parameter | Value |
|---|---|
| Title | Compliance Logs |
| Index prefix | compliance |
| Rotation strategy | Index Time |
| Rotation period | P1D |
| Retention strategy | Close |
| Max number of indices | 365 |

The “Close” strategy keeps indices on disk but removes them from active search, reducing resource usage while maintaining the data.
Create a stream to route compliance-relevant logs to the compliance index set. Navigate to Streams and create:
| Parameter | Value |
|---|---|
| Title | Compliance Logs |
| Index Set | Compliance Logs |
| Description | Authentication and security events for compliance retention |

Add stream rules to capture relevant logs:

- Field `log_type` equals `security`
- Field `EventID` exists (Windows Security logs)
- Field `syslog_program` equals `sshd`
- Field `syslog_program` equals `sudo`

Configure archival to external storage for logs requiring retention beyond active index capacity. Set up an archive configuration that exports closed indices to object storage.
```bash
# Create archive script (run from cron). Archival uses the OpenSearch
# snapshot API; /mnt/archive/graylog must be listed under path.repo
# in opensearch.yml for the filesystem repository to register.
sudo tee /usr/local/bin/archive-indices.sh << 'EOF'
#!/bin/bash
ARCHIVE_REPO="archive"
DAYS_OLD=90

# Register the snapshot repository (idempotent; succeeds if it already exists)
curl -s -X PUT "http://localhost:9200/_snapshot/${ARCHIVE_REPO}" \
  -H 'Content-Type: application/json' \
  -d '{"type": "fs", "settings": {"location": "/mnt/archive/graylog"}}'

# Find indices older than the retention period
INDICES=$(curl -s 'http://localhost:9200/_cat/indices?h=index,creation.date.string' \
  | awk -v d="$(date -d "-${DAYS_OLD} days" +%Y-%m-%d)" '$2 < d {print $1}' \
  | grep -E '^graylog_')

for INDEX in $INDICES; do
  # Snapshot the index into the archive repository
  curl -s -X PUT "http://localhost:9200/_snapshot/${ARCHIVE_REPO}/${INDEX}?wait_for_completion=true" \
    -H 'Content-Type: application/json' \
    -d "{\"indices\": \"${INDEX}\"}"

  # Delete the archived index
  curl -s -X DELETE "http://localhost:9200/${INDEX}"
done
EOF

sudo chmod +x /usr/local/bin/archive-indices.sh

# Schedule daily archival at 02:00
echo "0 2 * * * root /usr/local/bin/archive-indices.sh >> /var/log/archive-indices.log 2>&1" \
  | sudo tee /etc/cron.d/archive-indices
```

- Verify retention configuration by checking index status.
```bash
curl -s 'http://localhost:9200/_cat/indices?v&h=index,docs.count,store.size,creation.date.string' \
  | grep graylog | sort -k4
```

Expected output shows indices with creation dates spanning your retention period, with the oldest indices at the top of the sorted list.
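To turn that listing into a pass/fail check, compare each index's creation date against the policy. This sketch converts a date pair into an age in days using GNU `date` in UTC (avoiding daylight-saving off-by-one errors); the cutoff and sample date are illustrative:

```shell
#!/bin/bash
# Whole days between two dates, e.g. index creation date vs today (UTC).
days_between() {
  echo $(( ( $(date -u -d "$2" +%s) - $(date -u -d "$1" +%s) ) / 86400 ))
}

retention=90
created="2026-01-05"   # value from the creation.date.string column
age=$(days_between "$created" "$(date -u +%F)")
if [ "$age" -gt "$retention" ]; then
  echo "index exceeds ${retention}-day retention (age: ${age} days)"
else
  echo "index within retention (age: ${age} days)"
fi
```

An index older than the policy usually means rotation or the archive cron job has stalled; check the Graylog index set status and the archive log.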
The retention architecture maintains logs across active, closed, and archived tiers:
```text
  Days 0-30        Days 31-90       Days 91-365       365+
+------------+    +------------+    +------------+    +------------+
|    HOT     |    |    WARM    |    |    COLD    |    |  ARCHIVE   |
|            |    |            |    |            |    |            |
|  Active    +--->+  Reduced   +--->+  Closed    +--->+  Object    |
|  indexing  |    |  replicas  |    |  indices   |    |  storage   |
|  Full      |    |  Full      |    |  Searchable|    |  Restore   |
|  search    |    |  search    |    |  on demand |    |  on request|
+------------+    +------------+    +------------+    +------------+

Storage cost:  High            Medium           Low             Minimal
Query speed:   Fast            Fast             Slow            Minutes
```

Figure 4: Log retention tiers from hot storage to archive
Phase 6: Configure access controls
Restrict log access based on data sensitivity and job function. Security logs containing authentication details require tighter controls than general application logs.
Create roles for different access levels. Navigate to System → Authentication → Roles.
Create the following roles:
| Role | Permissions |
|---|---|
| Log Reader | Read access to general logs stream; no access to compliance stream |
| Security Analyst | Read access to all streams; search and export permissions |
| Log Administrator | Full administrative access including configuration |

Configure role permissions. For the “Log Reader” role, add permissions:
- `streams:read` for general log streams
- `searches:relative` and `searches:absolute` for search capability
- Explicitly exclude the Compliance Logs stream
Create user accounts and assign appropriate roles. Navigate to System → Authentication → Users and create accounts for each team member requiring log access.
Enable audit logging for the log management platform itself. Add to `/etc/graylog/server/server.conf`:

```text
audit_log_enabled = true
audit_log_dir = /var/log/graylog/audit
```

Restart Graylog server and verify audit logs are being generated:
```bash
sudo systemctl restart graylog-server
ls -la /var/log/graylog/audit/
# Expected: audit log files with recent timestamps
```

Verification
After completing all phases, verify the log management system is operating correctly.
Collection verification
Confirm logs are arriving from all configured sources:
```bash
# Check message count by source in the last hour
curl -s -u admin:password 'http://localhost:9000/api/search/universal/relative?query=*&range=3600&fields=source' \
  | jq '.messages[].message.source' | sort | uniq -c | sort -rn
```

Expected output shows message counts from each configured log source. Missing sources indicate collection agent issues.
Parsing verification
Confirm structured fields are being extracted:
```bash
# Search for web access logs with parsed response codes
curl -s -u admin:password 'http://localhost:9000/api/search/universal/relative?query=response_code:200&range=3600' \
  | jq '.total_results'
```

A non-zero result confirms the web access log parser is functioning.
Retention verification
Confirm index lifecycle is operating:
```bash
# List indices by age
curl -s 'http://localhost:9200/_cat/indices?v&h=index,docs.count,store.size,creation.date.string' \
  | grep -E '^(graylog|compliance)' | sort -k4
```

Verify the oldest index date does not exceed your retention policy (90 days for general logs, 365 days for compliance logs in the example configuration).
Query performance verification
Test search performance meets operational requirements:
```bash
# Time a complex query
time curl -s -u admin:password \
  'http://localhost:9000/api/search/universal/relative?query=response_code:500%20AND%20source:webserver*&range=86400' \
  > /dev/null
```

Queries against the last 24 hours should complete in under 5 seconds for a properly sized deployment.
Troubleshooting
| Symptom | Cause | Resolution |
|---|---|---|
| Filebeat shows “connection refused” | Aggregation server input not running or firewall blocking | Verify input status in Graylog; check firewall rules with sudo ufw status |
| Logs arriving but no parsed fields | Extractor pattern not matching log format | Test pattern in Graylog extractor tester; adjust regex for actual log format |
| OpenSearch cluster status “red” | Primary shard unassigned, often disk space | Check disk with df -h; review /var/log/opensearch/ for errors |
| High memory usage on aggregation server | JVM heap undersized for index volume | Increase OpenSearch heap in /etc/opensearch/jvm.options; restart service |
| Windows Event Logs not arriving | Winlogbeat service stopped or misconfigured | Check Get-Service winlogbeat; review logs in C:\ProgramData\Winlogbeat\logs |
| Network device logs missing hostname | Syslog not including hostname in message | Configure device to include hostname; use DNS lookup on aggregation server |
| Search queries timing out | Index too large or insufficient resources | Implement index lifecycle to reduce active index size; add resources |
| Duplicate log entries | Multiple collection paths or agent misconfiguration | Review agent config for overlapping paths; check for both syslog and agent collection |
| Timestamp parsing incorrect | Timezone mismatch between source and parser | Configure timezone in extractor; ensure NTP synchronisation on sources |
| Logs disappearing before retention period | Retention policy misconfigured or disk pressure | Review index set configuration; check OpenSearch watermark settings |
| High CPU during index rotation | Large indices rotating during business hours | Schedule rotation during off-hours; reduce index rotation frequency |
| GELF messages truncated | Message exceeds chunk size limit | Increase decompress_size_limit on GELF input; configure application to chunk properly |
| Permission denied in Graylog UI | User role lacks required permissions | Review role assignments; add necessary stream and search permissions |
| Archive script failing | Insufficient permissions or storage full | Check script runs as root; verify archive storage has available space |
Log source not appearing
When a configured source does not appear in log data, diagnose systematically:
```bash
# On the Linux source: verify Filebeat is running and shipping
sudo systemctl status filebeat
sudo journalctl -u filebeat -n 50

# Check the Filebeat registry for file positions
sudo cat /var/lib/filebeat/registry/filebeat/log.json | jq '.[] | {source: .source, offset: .offset}'

# On the aggregation server: check for connection attempts
sudo tcpdump -i any port 5044 -c 10

# In Graylog: check input metrics
curl -s -u admin:password 'http://localhost:9000/api/system/metrics' \
  | jq '.gauges | to_entries[] | select(.key | contains("input"))'
```

High ingestion latency
When logs appear with significant delay (more than 60 seconds from generation to searchability):

```bash
# Check the OpenSearch indexing queue
curl -s 'http://localhost:9200/_nodes/stats/thread_pool' | jq '.nodes[].thread_pool.write'
# High "queue" and "rejected" values indicate an indexing bottleneck

# Check the Graylog processing buffer
curl -s -u admin:password 'http://localhost:9000/api/system/buffers' | jq '.buffers.process'
# utilization_percent above 80 indicates a processing bottleneck

# Review journal size
ls -lh /var/lib/graylog-server/journal/
# A large journal indicates messages queuing faster than processing
```

Resolution depends on the bottleneck location: add OpenSearch nodes for indexing bottlenecks; add processing buffer capacity or Graylog nodes for processing bottlenecks.