Log Management

Log management establishes the infrastructure and procedures for capturing machine-generated records from across your IT environment into a centralised system where they can be searched, analysed, and retained according to policy. This task covers deploying collection agents, configuring forwarding and aggregation, setting up parsing pipelines, and managing log lifecycle from ingestion through archival.

Prerequisites

Before implementing log management, confirm the following requirements are in place.

| Requirement | Detail |
| --- | --- |
| Log source inventory | Document all systems that generate logs: servers, network devices, applications, cloud services, security tools |
| Storage capacity | Calculate required storage based on log volume estimates; plan for 1.5x headroom above projected daily ingestion |
| Retention policy | Obtain approved retention periods from Data Retention and Records for each log category |
| Network access | Collection agents require outbound connectivity to the aggregation tier on TCP 514 (syslog), TCP 5044 (Beats), or TCP 6514 (syslog-TLS) |
| Administrative access | Root or administrator privileges on log sources for agent installation; administrative access to the log management platform |
| Time synchronisation | All log sources must synchronise to NTP; log correlation fails with clock skew exceeding 1 second |

Verify NTP synchronisation on Linux systems:

timedatectl status | grep -E "(synchronized|NTP)"
# Expected output:
# System clock synchronized: yes
# NTP service: active

On Windows systems:

w32tm /query /status
# Verify "Stratum" is less than 5 and "Last Successful Sync Time" is recent

Storage estimation

Calculate storage requirements before deployment. Log volume varies substantially by source type and verbosity level.

| Source type | Typical daily volume per instance | Notes |
| --- | --- | --- |
| Linux server (syslog) | 50-200 MB | Increases with service count |
| Windows server (Event Log) | 100-500 MB | Security logs dominate |
| Web server (access logs) | 1-10 GB | Scales with request volume |
| Application (structured JSON) | 200 MB-2 GB | Depends on logging level |
| Network device | 10-100 MB | Increases with flow logging |
| Container platform | 500 MB-5 GB | Per node, not per container |

For an organisation with 50 Linux servers, 20 Windows servers, 5 web servers, and 10 network devices, calculate baseline daily volume:

Linux: 50 × 100 MB = 5,000 MB
Windows: 20 × 300 MB = 6,000 MB
Web: 5 × 3 GB = 15,000 MB
Network: 10 × 50 MB = 500 MB
-----------
Daily total: 26,500 MB (approximately 26 GB)

With a 90-day retention requirement and 1.5x headroom factor:

26 GB × 90 days × 1.5 = 3,510 GB (approximately 3.5 TB)

This calculation assumes raw log storage. Compression reduces requirements by 70-90% depending on log content, but indexing for search adds 10-30% overhead. Plan for net storage of approximately 50% of raw calculated volume when using compressed, indexed storage.
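The worked example above can be scripted so the estimate is easy to rerun when the inventory changes. A minimal sketch using the example's per-instance figures (assumptions drawn from the table midpoints, not measurements); integer arithmetic rounds the final figure down slightly:

```shell
#!/bin/bash
# Baseline daily volume in MB, using the worked example's figures
linux=$(( 50 * 100 ))     # 50 Linux servers at ~100 MB/day
windows=$(( 20 * 300 ))   # 20 Windows servers at ~300 MB/day
web=$(( 5 * 3000 ))       # 5 web servers at ~3 GB/day
network=$(( 10 * 50 ))    # 10 network devices at ~50 MB/day
daily_mb=$(( linux + windows + web + network ))

retention_days=90
# 1.5x headroom expressed as *3/2 to stay in integer arithmetic
total_gb=$(( daily_mb * retention_days * 3 / 2 / 1024 ))

echo "Daily ingestion: ${daily_mb} MB"
echo "${retention_days}-day storage with headroom: ${total_gb} GB"
```

Adjust the counts and per-instance figures to your own inventory before sizing hardware.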

Procedure

Phase 1: Deploy log aggregation infrastructure

The aggregation tier receives logs from all sources, parses them into structured format, and stores them for analysis. Deploy this infrastructure before configuring collection agents.

  1. Provision the log aggregation server with resources appropriate to your volume. For the 26 GB daily example, allocate a minimum of 4 CPU cores, 16 GB RAM, and 4 TB storage. Create the server and install the base operating system (Ubuntu 22.04 LTS or later).
# Verify system resources
nproc
# Expected: 4 or higher
free -h
# Expected: 16 GB or higher total memory
df -h /var
# Expected: 4 TB or higher available
  2. Install the log aggregation platform. This procedure uses Graylog with OpenSearch as the storage backend, representing an open source option. Commercial alternatives include Splunk, Elastic Cloud, and Datadog.
# Install dependencies
sudo apt update
sudo apt install -y apt-transport-https openjdk-17-jre-headless uuid-runtime pwgen
# Add MongoDB repository (required for Graylog metadata)
curl -fsSL https://pgp.mongodb.com/server-7.0.asc | sudo gpg -o /usr/share/keyrings/mongodb-server-7.0.gpg --dearmor
echo "deb [ signed-by=/usr/share/keyrings/mongodb-server-7.0.gpg ] https://repo.mongodb.org/apt/ubuntu jammy/mongodb-org/7.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-7.0.list
# Add OpenSearch repository
curl -fsSL https://artifacts.opensearch.org/publickeys/opensearch.pgp | sudo gpg -o /usr/share/keyrings/opensearch.gpg --dearmor
echo "deb [signed-by=/usr/share/keyrings/opensearch.gpg] https://artifacts.opensearch.org/releases/bundle/opensearch/2.x/apt stable main" | sudo tee /etc/apt/sources.list.d/opensearch.list
# Add Graylog repository
wget https://packages.graylog2.org/repo/packages/graylog-5.2-repository_latest.deb
sudo dpkg -i graylog-5.2-repository_latest.deb
sudo apt update
  3. Install and configure MongoDB for Graylog metadata storage.
sudo apt install -y mongodb-org
sudo systemctl daemon-reload
sudo systemctl enable mongod
sudo systemctl start mongod
# Verify MongoDB is running
mongosh --eval "db.adminCommand('ping')"
# Expected: { ok: 1 }
  4. Install and configure OpenSearch for log storage and indexing.
sudo apt install -y opensearch
# Configure OpenSearch for single-node deployment
sudo tee /etc/opensearch/opensearch.yml << 'EOF'
cluster.name: graylog-cluster
node.name: node-1
path.data: /var/lib/opensearch
path.logs: /var/log/opensearch
network.host: 127.0.0.1
discovery.type: single-node
plugins.security.disabled: true
indices.query.bool.max_clause_count: 32768
EOF
# Set JVM heap size to 50% of available memory (max 31 GB)
sudo sed -i 's/-Xms1g/-Xms8g/g' /etc/opensearch/jvm.options
sudo sed -i 's/-Xmx1g/-Xmx8g/g' /etc/opensearch/jvm.options
sudo systemctl daemon-reload
sudo systemctl enable opensearch
sudo systemctl start opensearch
# Verify OpenSearch is running
curl -s http://localhost:9200/_cluster/health | jq .status
# Expected: "green" or "yellow" (yellow is acceptable for single-node)
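The sed commands above hard-code an 8 GB heap for the 16 GB example server. On hosts with other memory sizes, the same rule (half of physical RAM, capped at 31 GB to stay under the compressed-oops threshold) can be derived with a short sketch; the `/proc/meminfo` path and the fallback value are Linux-specific assumptions:

```shell
#!/bin/bash
# Derive an OpenSearch heap size: half of physical RAM, capped at 31 GB,
# with a floor of 1 GB. Falls back to 16 GB worth of kB if /proc/meminfo
# is unavailable (assumption for illustration).
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo 2>/dev/null || echo 16777216)
heap_gb=$(( mem_kb / 1024 / 1024 / 2 ))
if [ "$heap_gb" -lt 1 ]; then heap_gb=1; fi
if [ "$heap_gb" -gt 31 ]; then heap_gb=31; fi
echo "Set -Xms${heap_gb}g and -Xmx${heap_gb}g in /etc/opensearch/jvm.options"
```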
  5. Install and configure Graylog server.
sudo apt install -y graylog-server
# Generate password secret (minimum 64 characters)
PASSWORD_SECRET=$(pwgen -N 1 -s 96)
# Generate admin password hash
echo -n "Enter admin password: " && read -s ADMIN_PASS && echo
ADMIN_HASH=$(echo -n "$ADMIN_PASS" | sha256sum | cut -d" " -f1)
# Configure Graylog
sudo tee /etc/graylog/server/server.conf << EOF
is_leader = true
node_id_file = /etc/graylog/server/node-id
password_secret = ${PASSWORD_SECRET}
root_password_sha2 = ${ADMIN_HASH}
root_email = admin@example.org
root_timezone = UTC
bin_dir = /usr/share/graylog-server/bin
data_dir = /var/lib/graylog-server
plugin_dir = /usr/share/graylog-server/plugin
http_bind_address = 0.0.0.0:9000
elasticsearch_hosts = http://127.0.0.1:9200
mongodb_uri = mongodb://127.0.0.1:27017/graylog
EOF
sudo systemctl daemon-reload
sudo systemctl enable graylog-server
sudo systemctl start graylog-server
  6. Verify the aggregation platform is operational by accessing the web interface at http://<server-ip>:9000. Log in with username admin and the password you configured. The system status indicator in the top navigation should show green.

The aggregation infrastructure is now ready to receive logs. The following diagram shows the architecture deployed in this phase:

+------------------------------------------------------------------+
|                      LOG AGGREGATION SERVER                      |
+------------------------------------------------------------------+
|                                                                  |
|   +------------------+        +------------------+               |
|   |                  |        |                  |               |
|   |     Graylog      |        |     MongoDB      |               |
|   |      Server      +------->+    (metadata)    |               |
|   |      :9000       |        |      :27017      |               |
|   |                  |        |                  |               |
|   +--------+---------+        +------------------+               |
|            |                                                     |
|            v                                                     |
|   +--------+---------+                                           |
|   |                  |                                           |
|   |    OpenSearch    |                                           |
|   |  (log storage)   |                                           |
|   |      :9200       |                                           |
|   |                  |                                           |
|   +------------------+                                           |
|                                                                  |
+------------------------------------------------------------------+
                                 ^
                                 |
                           Incoming logs
                      (TCP 5044, 514, 12201)

Figure 1: Single-node log aggregation architecture

Phase 2: Configure log inputs

Inputs define how the aggregation platform receives logs. Configure inputs before deploying collection agents to ensure the platform is ready to accept data.

  1. Create a Beats input for receiving logs from Filebeat and Winlogbeat agents. In the Graylog web interface, navigate to System → Inputs. Select “Beats” from the input type dropdown and click “Launch new input”.

    Configure the input with these parameters:

    | Parameter | Value |
    | --- | --- |
    | Title | Beats Input |
    | Bind address | 0.0.0.0 |
    | Port | 5044 |
    | No. of worker threads | 4 |
    | TLS cert file | Leave empty for initial setup |
    | TLS key file | Leave empty for initial setup |

    Click “Save” to create the input, then click “Start” to activate it.

  2. Create a syslog input for network devices and legacy systems that cannot run collection agents.

    Select “Syslog UDP” from the input type dropdown and configure:

    | Parameter | Value |
    | --- | --- |
    | Title | Syslog UDP Input |
    | Bind address | 0.0.0.0 |
    | Port | 514 |
    | Store full message | Yes |

    Create an additional “Syslog TCP” input on port 1514 for reliable delivery from systems that support TCP syslog.

  3. Create a GELF input for applications that support structured logging directly.

    Select “GELF UDP” and configure:

    | Parameter | Value |
    | --- | --- |
    | Title | GELF UDP Input |
    | Bind address | 0.0.0.0 |
    | Port | 12201 |
    | Decompress size limit | 8388608 |
  4. Verify inputs are running by checking the input status page. Each input should show “RUNNING” status with “0 messages” initially.

# Verify ports are listening
sudo ss -tlnp | grep -E "(5044|514|1514|12201)"
# Expected: Four lines showing graylog-server listening on each port
  5. Configure firewall rules to allow log traffic from your network.
sudo ufw allow from 10.0.0.0/8 to any port 5044 proto tcp comment "Beats"
sudo ufw allow from 10.0.0.0/8 to any port 514 proto udp comment "Syslog UDP"
sudo ufw allow from 10.0.0.0/8 to any port 1514 proto tcp comment "Syslog TCP"
sudo ufw allow from 10.0.0.0/8 to any port 12201 proto udp comment "GELF"
sudo ufw reload
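With inputs and firewall rules in place, a hand-built GELF message gives a quick end-to-end check before any agents exist. A sketch of a minimal GELF 1.1 payload; the send command is commented out because it assumes the log-aggregator.example.org name resolves from your host and that nc is installed:

```shell
#!/bin/bash
# Build a minimal GELF 1.1 payload. "version", "host", and
# "short_message" are the required fields; level 6 = informational.
host=$(hostname)
payload=$(printf '{"version":"1.1","host":"%s","short_message":"gelf input test","level":6}' "$host")
echo "$payload"

# Send it to the GELF UDP input (uncomment once DNS and firewall
# rules are in place; hostname is the example used throughout):
# printf '%s' "$payload" | nc -u -w1 log-aggregator.example.org 12201
```

After sending, the message should appear in a Graylog search within a few seconds, attributed to your host.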

Phase 3: Deploy collection agents

Collection agents run on log sources and forward logs to the aggregation platform. Use Filebeat for Linux systems and Winlogbeat for Windows systems.

  1. Install Filebeat on Linux servers. Download and install from the Elastic repository (Filebeat works with Graylog despite being an Elastic product).
curl -fsSL https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg -o /usr/share/keyrings/elastic.gpg --dearmor
echo "deb [signed-by=/usr/share/keyrings/elastic.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list
sudo apt update
sudo apt install -y filebeat
  2. Configure Filebeat to collect system logs and forward to your aggregation server.
sudo tee /etc/filebeat/filebeat.yml << 'EOF'
filebeat.inputs:
  - type: log
    enabled: true
    paths:
      - /var/log/syslog
      - /var/log/auth.log
      - /var/log/kern.log
    fields:
      log_type: syslog
    fields_under_root: true
  - type: log
    enabled: true
    paths:
      - /var/log/apache2/access.log
      - /var/log/nginx/access.log
    fields:
      log_type: webserver_access
    fields_under_root: true
  - type: log
    enabled: true
    paths:
      - /var/log/apache2/error.log
      - /var/log/nginx/error.log
    fields:
      log_type: webserver_error
    fields_under_root: true

processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_cloud_metadata: ~

output.logstash:
  hosts: ["log-aggregator.example.org:5044"]

logging.level: warning
logging.to_files: true
logging.files:
  path: /var/log/filebeat
  name: filebeat
  keepfiles: 3
EOF
  3. Enable and start Filebeat.
sudo systemctl enable filebeat
sudo systemctl start filebeat
# Verify Filebeat is running and shipping logs
sudo systemctl status filebeat
sudo filebeat test output
# Expected: "logstash: log-aggregator.example.org:5044... connection..."
  4. Install Winlogbeat on Windows servers. Download Winlogbeat 8.x from elastic.co and extract to C:\Program Files\Winlogbeat.
# Run in PowerShell as Administrator
cd "C:\Program Files\Winlogbeat"
.\install-service-winlogbeat.ps1
  5. Configure Winlogbeat to collect Windows Event Logs.

    Edit C:\Program Files\Winlogbeat\winlogbeat.yml:

winlogbeat.event_logs:
  - name: Application
    ignore_older: 72h
  - name: System
    ignore_older: 72h
  - name: Security
    ignore_older: 72h
  - name: Microsoft-Windows-Sysmon/Operational
    ignore_older: 72h
  - name: Microsoft-Windows-PowerShell/Operational
    ignore_older: 72h

processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded

output.logstash:
  hosts: ["log-aggregator.example.org:5044"]

logging.level: warning
logging.to_files: true
logging.files:
  path: C:\ProgramData\Winlogbeat\logs
  name: winlogbeat
  keepfiles: 3
  6. Start the Winlogbeat service.
Start-Service winlogbeat
Get-Service winlogbeat
# Expected: Status "Running"
# Test connectivity
cd "C:\Program Files\Winlogbeat"
.\winlogbeat.exe test output
  7. Configure network devices to forward syslog to the aggregation server. The exact procedure varies by vendor. For Cisco IOS devices:
configure terminal
logging host 10.0.1.50 transport udp port 514
logging trap informational
logging source-interface Loopback0
logging on
end
write memory

For Juniper Junos:

set system syslog host 10.0.1.50 any info
set system syslog host 10.0.1.50 port 514
commit

The log collection architecture now spans your infrastructure:

+------------------+     +------------------+     +------------------+
|  Linux Servers   |     | Windows Servers  |     | Network Devices  |
|                  |     |                  |     |                  |
|  +------------+  |     |  +------------+  |     |  +-----------+   |
|  |  Filebeat  |  |     |  | Winlogbeat |  |     |  |  Syslog   |   |
|  +-----+------+  |     |  +-----+------+  |     |  +-----+-----+   |
|        |         |     |        |         |     |        |         |
+--------+---------+     +--------+---------+     +--------+---------+
         |                        |                        |
         | TCP 5044               | TCP 5044               | UDP 514
         |                        |                        |
         +------------------------+------------------------+
                                  |
                                  v
                    +-------------+-------------+
                    |                           |
                    |      Log Aggregation      |
                    |          Server           |
                    |                           |
                    |  +-------+     +-------+  |
                    |  |Graylog|     | Open  |  |
                    |  |       +---->+ Search|  |
                    |  +-------+     +-------+  |
                    |                           |
                    +---------------------------+

Figure 2: Log collection flow from sources to aggregation

Phase 4: Configure log parsing

Raw logs arrive as unstructured text. Parsing extracts fields into structured data that can be searched and analysed. Configure extractors and pipelines to parse logs from each source type.

  1. Create an extractor for syslog messages to extract timestamp, hostname, facility, and message body. In Graylog, navigate to System → Inputs, select your Syslog UDP input, and click “Manage extractors”.

    Click “Add extractor” and select “Grok pattern”:

    | Parameter | Value |
    | --- | --- |
    | Source field | message |
    | Grok pattern | %{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:source_host} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message} |
    | Store as | extracted_fields |
    | Extractor title | Syslog Parser |
  2. Create a pipeline for web server access log parsing. Navigate to System → Pipelines and create a new pipeline named “Web Access Logs”.

    Create a pipeline rule:

rule "parse_nginx_access"
when
  has_field("log_type") AND to_string($message.log_type) == "webserver_access"
then
  let parsed = grok(
    pattern: "%{IPORHOST:client_ip} - %{DATA:user_name} \\[%{HTTPDATE:timestamp}\\] \"%{WORD:method} %{URIPATHPARAM:request} HTTP/%{NUMBER:http_version}\" %{NUMBER:response_code} %{NUMBER:bytes_sent} \"%{DATA:referrer}\" \"%{DATA:user_agent}\"",
    value: to_string($message.message),
    only_named_captures: true
  );
  set_fields(parsed);
end
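Before committing the rule, the extraction can be smoke-tested offline. This sketch uses a POSIX ERE approximation of the grok pattern; grok's %{IPORHOST} and %{HTTPDATE} are stricter than the character classes below, so treat a match here as a quick sanity check rather than full validation:

```shell
#!/bin/bash
# Sample line in the combined-log format the rule above targets
line='10.0.1.100 - - [05/Jan/2026:14:30:22 +0000] "GET /api/users HTTP/1.1" 200 1532 "-" "Mozilla/5.0"'

# POSIX ERE approximation of the grok pattern (loose stand-ins for
# %{IPORHOST}, %{HTTPDATE}, and friends)
re='^([0-9.]+) - ([^ ]+) \[([^]]+)\] "([A-Z]+) ([^ ]+) HTTP/([0-9.]+)" ([0-9]+) ([0-9]+)'

if [[ $line =~ $re ]]; then
  echo "client_ip=${BASH_REMATCH[1]}"
  echo "method=${BASH_REMATCH[4]}"
  echo "response_code=${BASH_REMATCH[7]}"
else
  echo "pattern did not match" >&2
  exit 1
fi
```

If the approximation fails against a sample of your real access logs, the grok pattern will likely fail too; adjust both against the actual log format.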
  3. Create index field mappings for parsed fields. Navigate to System → Index Sets and edit your default index set. Under “Field Type Profiles”, add mappings:

    | Field | Type |
    | --- | --- |
    | client_ip | ip |
    | response_code | long |
    | bytes_sent | long |
    | syslog_pid | long |
    | timestamp | date |
  4. Connect the pipeline to the appropriate input stream. Navigate to Streams, select or create a stream for web logs (e.g., “Web Server Logs”), and connect the “Web Access Logs” pipeline.

  5. Verify parsing by examining recent messages. Navigate to Search, select the last 5 minutes, and examine a web access log message. The message should display parsed fields (client_ip, method, request, response_code) rather than raw text only.

The parsing pipeline transforms raw text into structured, queryable data:

+-------------------------------------------------------------------+
|                       LOG PARSING PIPELINE                        |
+-------------------------------------------------------------------+
|                                                                   |
|  +-----------------+     +-----------------+     +--------------+ |
|  |                 |     |                 |     |              | |
|  |     Raw Log     |     |   Grok/Regex    |     |  Structured  | |
|  |     Message     +---->+   Extractors    +---->+    Fields    | |
|  |                 |     |                 |     |              | |
|  +-----------------+     +-----------------+     +--------------+ |
|                                                                   |
|  Example transformation:                                          |
|                                                                   |
|  INPUT:                                                           |
|    "10.0.1.100 - - [05/Jan/2026:14:30:22 +0000] \"GET /api/users  |
|    HTTP/1.1\" 200 1532 \"-\" \"Mozilla/5.0\""                     |
|                                                                   |
|  OUTPUT:                                                          |
|    +------------------+----------------------+                    |
|    | client_ip        | 10.0.1.100           |                    |
|    | timestamp        | 2026-01-05T14:30:22Z |                    |
|    | method           | GET                  |                    |
|    | request          | /api/users           |                    |
|    | response_code    | 200                  |                    |
|    | bytes_sent       | 1532                 |                    |
|    | user_agent       | Mozilla/5.0          |                    |
|    +------------------+----------------------+                    |
|                                                                   |
+-------------------------------------------------------------------+

Figure 3: Log parsing extracts structured fields from raw messages

Phase 5: Configure retention and archival

Retention rules determine how long logs remain searchable and when they move to cold storage or deletion. Configure retention based on your organisation’s policy requirements and storage capacity.

  1. Configure index rotation to create new indices on a regular schedule. Navigate to System → Index Sets and edit your default index set.

    Set rotation strategy:

    | Parameter | Value |
    | --- | --- |
    | Rotation strategy | Index Time |
    | Rotation period | P1D (daily rotation) |
    | Max number of indices | 90 |

    This configuration retains 90 days of searchable logs. Indices older than 90 days are automatically deleted.

  2. Configure retention strategy for logs requiring longer retention. Create a separate index set for compliance-relevant logs (authentication, security events).

    Navigate to System → Index Sets and click “Create index set”:

    | Parameter | Value |
    | --- | --- |
    | Title | Compliance Logs |
    | Index prefix | compliance |
    | Rotation strategy | Index Time |
    | Rotation period | P1D |
    | Retention strategy | Close |
    | Max number of indices | 365 |

    The “Close” strategy keeps indices on disk but removes them from active search, reducing resource usage while maintaining data.

  3. Create a stream to route compliance-relevant logs to the compliance index set. Navigate to Streams and create:

    | Parameter | Value |
    | --- | --- |
    | Title | Compliance Logs |
    | Index Set | Compliance Logs |
    | Description | Authentication and security events for compliance retention |

    Add stream rules to capture relevant logs:

    • Field log_type equals security
    • Field EventID exists (Windows Security logs)
    • Field syslog_program equals sshd
    • Field syslog_program equals sudo
  4. Configure archival to external storage for logs requiring retention beyond active index capacity. Set up an archive configuration that exports closed indices to object storage.

# Create archive script (run from cron). Assumes a filesystem snapshot
# repository named "archive" has been registered in OpenSearch with its
# location (e.g. /mnt/archive/graylog) listed in path.repo.
sudo tee /usr/local/bin/archive-indices.sh << 'EOF'
#!/bin/bash
set -euo pipefail
DAYS_OLD=90
CUTOFF=$(date -d "-${DAYS_OLD} days" +%Y-%m-%d)

# Find graylog indices older than the retention period
INDICES=$(curl -s 'http://localhost:9200/_cat/indices?h=index,creation.date.string' \
  | awk -v d="$CUTOFF" '$2 < d {print $1}' \
  | grep -E '^graylog_' || true)

for INDEX in $INDICES; do
  # Snapshot the index into the archive repository, waiting for completion
  curl -sf -X PUT \
    "http://localhost:9200/_snapshot/archive/${INDEX}?wait_for_completion=true" \
    -H 'Content-Type: application/json' \
    -d "{\"indices\": \"${INDEX}\"}" || continue
  # Delete the index only after the snapshot succeeded
  curl -s -X DELETE "http://localhost:9200/${INDEX}"
done
EOF
sudo chmod +x /usr/local/bin/archive-indices.sh
# Schedule daily archival at 02:00
echo "0 2 * * * root /usr/local/bin/archive-indices.sh >> /var/log/archive-indices.log 2>&1" \
  | sudo tee /etc/cron.d/archive-indices
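The age filter is the riskiest part of the archive script, since a wrong comparison deletes data. It can be dry-run against canned _cat output; fixed dates make the result deterministic (ISO timestamps compare lexicographically against a YYYY-MM-DD cutoff):

```shell
#!/bin/bash
# Dry-run the script's age filter with fixed dates instead of live data
cutoff="2026-01-01"
printf 'graylog_10 2025-09-01T00:00:00.000Z\ngraylog_42 2026-02-01T00:00:00.000Z\n' \
  | awk -v d="$cutoff" '$2 < d {print $1}'
# Prints only graylog_10: it predates the cutoff; graylog_42 does not
```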
  5. Verify retention configuration by checking index status.
curl -s 'http://localhost:9200/_cat/indices?v&h=index,docs.count,store.size,creation.date.string' \
| grep graylog | sort -k4

Expected output shows indices with creation dates spanning your retention period, with oldest indices at the top of the sorted list.

The retention architecture maintains logs across active, closed, and archived tiers:

+--------------------------------------------------------------------------+
|                         LOG RETENTION LIFECYCLE                          |
+--------------------------------------------------------------------------+
|                                                                          |
|    Days 0-30       Days 31-90       Days 91-365         365+             |
|  +-----------+    +-----------+    +-----------+    +----------+         |
|  |           |    |           |    |           |    |          |         |
|  |    HOT    |    |   WARM    |    |   COLD    |    | ARCHIVE  |         |
|  |           |    |           |    |           |    |          |         |
|  | Active    +--->+ Reduced   +--->+ Closed    +--->+ Object   |         |
|  | indexing  |    | replicas  |    | indices   |    | storage  |         |
|  | Full      |    | Full      |    | Searchable|    | Restore  |         |
|  | search    |    | search    |    | on demand |    | on req   |         |
|  |           |    |           |    |           |    |          |         |
|  +-----------+    +-----------+    +-----------+    +----------+         |
|                                                                          |
|  Storage cost:   High           Medium          Low             Minimal  |
|  Query speed:    Fast           Fast            Slow            Minutes  |
|                                                                          |
+--------------------------------------------------------------------------+

Figure 4: Log retention tiers from hot storage to archive

Phase 6: Configure access controls

Restrict log access based on data sensitivity and job function. Security logs containing authentication details require tighter controls than general application logs.

  1. Create roles for different access levels. Navigate to System → Authentication → Roles.

    Create the following roles:

    | Role | Permissions |
    | --- | --- |
    | Log Reader | Read access to general logs stream; no access to compliance stream |
    | Security Analyst | Read access to all streams; search and export permissions |
    | Log Administrator | Full administrative access including configuration |
  2. Configure role permissions. For the “Log Reader” role, add permissions:

    • streams:read for general log streams
    • searches:relative and searches:absolute for search capability
    • Explicitly exclude the Compliance Logs stream
  3. Create user accounts and assign appropriate roles. Navigate to System → Authentication → Users and create accounts for each team member requiring log access.

  4. Enable audit logging for the log management platform itself. Add to /etc/graylog/server/server.conf:

audit_log_enabled = true
audit_log_dir = /var/log/graylog/audit

Restart Graylog server and verify audit logs are being generated:

sudo systemctl restart graylog-server
ls -la /var/log/graylog/audit/
# Expected: audit log files with recent timestamps

Verification

After completing all phases, verify the log management system is operating correctly.

Collection verification

Confirm logs are arriving from all configured sources:

# Check message count by source in the last hour
curl -s -u admin:password 'http://localhost:9000/api/search/universal/relative?query=*&range=3600&fields=source' \
| jq '.messages[].message.source' | sort | uniq -c | sort -rn

Expected output shows message counts from each configured log source. Missing sources indicate collection agent issues.

Parsing verification

Confirm structured fields are being extracted:

# Search for web access logs with parsed response codes
curl -s -u admin:password 'http://localhost:9000/api/search/universal/relative?query=response_code:200&range=3600' \
| jq '.total_results'

A non-zero result confirms the web access log parser is functioning.

Retention verification

Confirm index lifecycle is operating:

# List indices by age
curl -s 'http://localhost:9200/_cat/indices?v&h=index,docs.count,store.size,creation.date.string' \
| grep -E '^(graylog|compliance)' | sort -k4

Verify the oldest index date does not exceed your retention policy (90 days for general logs, 365 days for compliance logs in the example configuration).

Query performance verification

Test search performance meets operational requirements:

# Time a complex query
time curl -s -u admin:password \
'http://localhost:9000/api/search/universal/relative?query=response_code:500%20AND%20source:webserver*&range=86400' \
> /dev/null

Queries against the last 24 hours should complete in under 5 seconds for a properly sized deployment.

Troubleshooting

| Symptom | Cause | Resolution |
| --- | --- | --- |
| Filebeat shows “connection refused” | Aggregation server input not running or firewall blocking | Verify input status in Graylog; check firewall rules with sudo ufw status |
| Logs arriving but no parsed fields | Extractor pattern not matching log format | Test pattern in Graylog extractor tester; adjust regex for actual log format |
| OpenSearch cluster status “red” | Primary shard unassigned, often disk space | Check disk with df -h; review /var/log/opensearch/ for errors |
| High memory usage on aggregation server | JVM heap undersized for index volume | Increase OpenSearch heap in /etc/opensearch/jvm.options; restart service |
| Windows Event Logs not arriving | Winlogbeat service stopped or misconfigured | Check Get-Service winlogbeat; review logs in C:\ProgramData\Winlogbeat\logs |
| Network device logs missing hostname | Syslog not including hostname in message | Configure device to include hostname; use DNS lookup on aggregation server |
| Search queries timing out | Index too large or insufficient resources | Implement index lifecycle to reduce active index size; add resources |
| Duplicate log entries | Multiple collection paths or agent misconfiguration | Review agent config for overlapping paths; check for both syslog and agent collection |
| Timestamp parsing incorrect | Timezone mismatch between source and parser | Configure timezone in extractor; ensure NTP synchronisation on sources |
| Logs disappearing before retention period | Retention policy misconfigured or disk pressure | Review index set configuration; check OpenSearch watermark settings |
| High CPU during index rotation | Large indices rotating during business hours | Schedule rotation during off-hours; reduce index rotation frequency |
| GELF messages truncated | Message exceeds chunk size limit | Increase decompress_size_limit on GELF input; configure application to chunk properly |
| Permission denied in Graylog UI | User role lacks required permissions | Review role assignments; add necessary stream and search permissions |
| Archive script failing | Insufficient permissions or storage full | Check script runs as root; verify archive storage has available space |

Log source not appearing

When a configured source does not appear in log data, diagnose systematically:

# On Linux source: verify Filebeat is running and shipping
sudo systemctl status filebeat
sudo journalctl -u filebeat -n 50
# Check Filebeat registry for file positions
sudo jq -c 'select(.v != null) | {source: .v.source, offset: .v.offset}' /var/lib/filebeat/registry/filebeat/log.json
# On aggregation server: check for connection attempts
sudo tcpdump -i any port 5044 -c 10
# In Graylog: check input metrics
curl -s -u admin:password 'http://localhost:9000/api/system/metrics' | jq '.gauges | to_entries[] | select(.key | contains("input"))'

High ingestion latency

When logs appear with significant delay (more than 60 seconds from generation to searchability):

# Check OpenSearch indexing queue
curl -s 'http://localhost:9200/_nodes/stats/thread_pool' | jq '.nodes[].thread_pool.write'
# High "queue" and "rejected" values indicate indexing bottleneck
# Check Graylog processing buffer
curl -s -u admin:password 'http://localhost:9000/api/system/buffers' | jq '.buffers.process'
# utilization_percent above 80 indicates processing bottleneck
# Review journal size
ls -lh /var/lib/graylog-server/journal/
# Large journal indicates messages queuing faster than processing

Resolution depends on where the bottleneck sits: add OpenSearch nodes for indexing bottlenecks; increase processing buffer capacity or add Graylog nodes for processing bottlenecks.

See also