Webhook Configuration

Webhook configuration establishes HTTP callbacks that deliver event notifications between systems in near-real-time. Configure webhooks when you need to trigger actions in response to events occurring in another system, such as receiving notifications when a form submission arrives, a payment completes, or a case status changes. Successful configuration results in reliable event delivery with proper authentication, retry handling, and failure management.

Prerequisites

| Requirement | Detail |
| --- | --- |
| Network access | Firewall rules permitting inbound HTTPS on port 443 from source IP ranges, or outbound HTTPS to destination endpoints |
| TLS certificate | Valid certificate from public CA for receiving endpoints; self-signed certificates cause delivery failures |
| Authentication credentials | API keys, shared secrets for HMAC, or OAuth client credentials depending on provider requirements |
| Endpoint URL | Publicly accessible HTTPS URL for inbound webhooks; target URL for outbound webhooks |
| Permissions | Administrative access to both sending and receiving systems |
| Logging infrastructure | Centralised logging capable of capturing webhook payloads for debugging |

Verify your endpoint is reachable before beginning configuration:

# Test endpoint accessibility from external network
curl -I https://api.example.org/webhooks/incoming
# Expected: HTTP/2 200, 401 (authentication required), or 405 (endpoint accepts POST only)

# Verify TLS certificate validity
echo | openssl s_client -servername api.example.org -connect api.example.org:443 2>/dev/null | openssl x509 -noout -dates
# Expected: notAfter date in the future

Configuring outbound webhooks

Outbound webhooks send event notifications from your system to external endpoints. The sending system initiates HTTP POST requests containing event data whenever specified triggers occur.

  1. Register the destination endpoint in your application’s webhook configuration. Provide the full URL including protocol and path:
Endpoint URL: https://partner.example.org/api/webhooks/receive

Most platforms require HTTPS endpoints. HTTP endpoints without TLS are rejected by default due to payload exposure risk.

  2. Select the events that trigger webhook delivery. Limit subscriptions to events the receiving system actually processes:
# Example event subscription configuration
webhook:
  url: "https://partner.example.org/api/webhooks/receive"
  events:
    - case.created
    - case.status_changed
    - case.closed
    # Avoid subscribing to high-volume events unless needed
    # - case.viewed (generates excessive traffic)

Each subscribed event type increases delivery volume. A case management system generating 500 cases daily with 4 status changes each produces 2,500 webhook deliveries daily when subscribed to both creation and status events.

  3. Configure authentication for the outbound request. The receiving system must verify requests originate from your application:
webhook:
  url: "https://partner.example.org/api/webhooks/receive"
  authentication:
    type: hmac_sha256
    secret: "${WEBHOOK_SECRET}"  # From environment variable
    header: "X-Signature-256"

The HMAC signature is computed over the request body using the shared secret. The receiving system computes the same signature and compares values. Mismatched signatures indicate tampering or misconfiguration.

For systems requiring bearer tokens instead of HMAC:

webhook:
  authentication:
    type: bearer
    token: "${WEBHOOK_BEARER_TOKEN}"
    header: "Authorization"

  4. Set retry configuration to handle transient delivery failures:
webhook:
  retry:
    max_attempts: 5
    initial_delay_seconds: 60
    backoff_multiplier: 2
    max_delay_seconds: 3600

With this configuration the delay doubles after each failed attempt: 1, 2, 4, 8, and 16 minutes. Measured from the initial failure, the retries therefore fire at roughly 1, 3, 7, 15, and 31 minutes. The exponential backoff prevents overwhelming recovering systems. This reading assumes max_attempts counts retries rather than total attempts; check how your platform defines it. After the fifth retry fails, approximately 31 minutes after the initial failure, the delivery is abandoned and logged.

  5. Configure timeout values appropriate for the receiving endpoint’s expected response time:
webhook:
  timeout:
    connect_seconds: 10
    read_seconds: 30

Set read timeout above the receiving system’s processing time. A webhook triggering database operations requiring 15 seconds needs at least 20 seconds read timeout to avoid premature disconnection.

  6. Enable delivery logging for troubleshooting:
webhook:
  logging:
    log_payloads: true
    log_responses: true
    retention_days: 30

Payload logging captures the full request body sent to the endpoint. Disable payload logging if webhooks contain sensitive personal data subject to retention restrictions.

  7. Test the configuration by triggering a test event:
# Most platforms provide a test delivery function
curl -X POST https://your-system.example.org/api/webhooks/test \
  -H "Authorization: Bearer ${ADMIN_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{"webhook_id": "wh_abc123", "event_type": "case.created"}'

Verify the test delivery appears in both sending system logs and receiving system logs.
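
The HMAC step above can be sketched from the sender's side. This is a minimal illustration, assuming the signature is the hex HMAC-SHA256 of the exact request bytes, prefixed with `sha256=` and sent in `X-Signature-256`. The helper names `sign_payload` and `build_signed_request` are hypothetical, and your platform may serialise or sign differently.

```python
import hashlib
import hmac
import json


def sign_payload(secret: bytes, body: bytes) -> str:
    """Return the signature header value for the exact bytes being sent."""
    digest = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def build_signed_request(secret: bytes, event: dict) -> tuple:
    """Serialise once, sign those bytes, and return (body, headers)."""
    # Serialise exactly once: signing one serialisation and sending another
    # (different key order or whitespace) makes verification fail.
    body = json.dumps(event, separators=(",", ":"), sort_keys=True).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "X-Signature-256": sign_payload(secret, body),
    }
    return body, headers


if __name__ == "__main__":
    body, headers = build_signed_request(
        b"example-secret",  # assumption: in practice read from WEBHOOK_SECRET
        {"event_type": "case.created", "data": {"case_id": "case_123"}},
    )
    print(headers["X-Signature-256"])
```

The important design point is that the bytes that are signed are the bytes that are sent; the receiver then verifies against the raw request body rather than re-serialised JSON.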

Configuring inbound webhooks

Inbound webhooks receive event notifications from external systems. Your application exposes an HTTP endpoint that external systems call when events occur.

  1. Create the webhook receiver endpoint in your application. The endpoint must accept POST requests and respond within the sender’s timeout window:
# Flask example
from flask import Flask, request, jsonify
import hmac
import hashlib

app = Flask(__name__)

@app.route('/webhooks/receive', methods=['POST'])
def receive_webhook():
    # Respond quickly - process asynchronously
    # Return 200 before heavy processing
    return jsonify({'received': True}), 200

Return HTTP 200 immediately upon receiving valid requests. Defer processing to background workers. Senders interpret slow responses as failures and retry, causing duplicate deliveries.

  2. Implement signature verification to authenticate incoming requests. Extract the signature from the header and compute the expected value:
import hmac
import hashlib
import os

def verify_signature(payload_body, signature_header):
    """Verify HMAC-SHA256 signature over the raw request body."""
    secret = os.environ['WEBHOOK_SECRET'].encode('utf-8')
    expected_signature = hmac.new(
        secret,
        payload_body,
        hashlib.sha256
    ).hexdigest()
    # Use constant-time comparison to prevent timing attacks.
    # Compare bytes: compare_digest raises TypeError on non-ASCII str input.
    return hmac.compare_digest(
        f"sha256={expected_signature}".encode('utf-8'),
        signature_header.encode('utf-8')
    )

@app.route('/webhooks/receive', methods=['POST'])
def receive_webhook():
    signature = request.headers.get('X-Signature-256')
    if not signature:
        return jsonify({'error': 'Missing signature'}), 401
    # request.data holds the raw bytes; verify these, not re-serialised JSON
    if not verify_signature(request.data, signature):
        return jsonify({'error': 'Invalid signature'}), 401
    # Signature valid - queue for processing
    queue_webhook_processing(request.json)
    return jsonify({'received': True}), 200

The constant-time comparison using hmac.compare_digest() prevents timing attacks where attackers measure response times to deduce valid signature characters.

  3. Implement idempotency handling to manage duplicate deliveries. Senders retry on timeout or network errors, potentially delivering the same event multiple times:
from functools import wraps
import redis

redis_client = redis.Redis(host='localhost', port=6379, db=0)

def idempotent_webhook(f):
    @wraps(f)
    def decorated(*args, **kwargs):
        # Extract unique event identifier (header first, then payload)
        event_id = request.headers.get('X-Event-ID')
        if not event_id:
            event_id = (request.get_json(silent=True) or {}).get('event_id')
        if not event_id:
            # No idempotency key - process anyway with warning
            app.logger.warning('Webhook received without event ID')
            return f(*args, **kwargs)
        cache_key = f"webhook:processed:{event_id}"
        # Atomically claim the event: SET NX succeeds only for the first
        # delivery, so concurrent duplicates cannot both pass the check.
        # The 24-hour TTL releases the claim if the worker crashes.
        claimed = redis_client.set(cache_key, 'processing', nx=True, ex=86400)
        if not claimed:
            app.logger.info(f'Duplicate webhook ignored: {event_id}')
            return jsonify({'received': True, 'duplicate': True}), 200
        try:
            result = f(*args, **kwargs)
            redis_client.setex(cache_key, 604800, 'completed')  # 7 days
            return result
        except Exception:
            # Release the claim so the sender's retry is processed
            redis_client.delete(cache_key)
            raise
    return decorated

@app.route('/webhooks/receive', methods=['POST'])
@idempotent_webhook
def receive_webhook():
    # Processing logic here
    pass

The 7-day retention for processed event IDs exceeds typical retry windows. A sender that abandons retries after 24 hours, for example, always finds its event ID still cached, so a late retry cannot trigger duplicate processing.

  4. Configure your web server or load balancer to handle webhook traffic. Webhook endpoints receive bursty traffic when source systems process batches:
# Nginx configuration for webhook endpoint
# Requires a matching zone in the http block; the size and rate here are example values:
#   limit_req_zone $binary_remote_addr zone=webhooks:10m rate=10r/s;
location /webhooks/ {
    # Increase timeouts for webhook processing
    proxy_read_timeout 60s;
    proxy_send_timeout 60s;
    # Limit request body size
    client_max_body_size 1m;
    # Rate limiting per source IP
    limit_req zone=webhooks burst=50 nodelay;
    proxy_pass http://webhook_backend;
}

The burst parameter lets up to 50 requests exceed the zone's configured rate. With nodelay, nginx forwards those requests immediately instead of pacing them, and rejects anything beyond the burst (with a 503 response by default). Webhook senders delivering batched events need burst capacity to avoid spurious failures.

  5. Set up a dead letter queue for webhooks that fail processing after receipt:
import json
from datetime import datetime

def queue_webhook_processing(payload):
    try:
        process_webhook(payload)
    except Exception as e:
        # Store failed webhook for manual review
        store_dead_letter(payload, str(e))
        raise

def store_dead_letter(payload, error):
    dead_letter = {
        'payload': payload,
        'error': error,
        'received_at': datetime.utcnow().isoformat(),
        'retry_count': 0
    }
    redis_client.lpush('webhook:dead_letter', json.dumps(dead_letter))
    # Alert if dead letter queue grows
    queue_length = redis_client.llen('webhook:dead_letter')
    if queue_length > 100:
        send_alert(f'Webhook dead letter queue has {queue_length} items')

The dead letter queue preserves failed webhooks for investigation and manual replay. Without dead letter handling, processing failures result in permanent data loss.

  6. Expose a health check endpoint for the webhook receiver. Some senders verify endpoint availability before attempting delivery:
@app.route('/webhooks/health', methods=['GET', 'HEAD'])
def webhook_health():
    # Check dependencies
    try:
        redis_client.ping()
        return jsonify({'status': 'healthy'}), 200
    except Exception:
        return jsonify({'status': 'unhealthy'}), 503

Some webhook providers disable integrations after consecutive failed health checks. Ensure the health endpoint reflects actual processing capability.
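
Step 1 asks the receiver to acknowledge quickly and defer work to background workers. One way this might look, using only the standard library, is sketched below. The names `enqueue_event` and `handle_event` and the in-process queue are illustrative assumptions; a production deployment would more likely use a durable queue (for example Redis or a message broker) so that accepted events survive a restart.

```python
import queue
import threading

# In-process queue; bounded so a slow worker applies backpressure
# instead of letting memory grow without limit.
event_queue = queue.Queue(maxsize=1000)
processed = []  # stands in for real side effects


def handle_event(event: dict) -> None:
    """Hypothetical slow processing step."""
    processed.append(event["event_id"])


def worker() -> None:
    """Drain the queue until a None sentinel arrives."""
    while True:
        event = event_queue.get()
        try:
            if event is None:
                return
            handle_event(event)
        finally:
            event_queue.task_done()


def enqueue_event(event: dict) -> bool:
    """Called from the request handler; returns False if the queue is full."""
    try:
        event_queue.put_nowait(event)
        return True
    except queue.Full:
        return False


if __name__ == "__main__":
    t = threading.Thread(target=worker, daemon=True)
    t.start()
    enqueue_event({"event_id": "evt_1"})
    enqueue_event({"event_id": "evt_2"})
    event_queue.put(None)  # stop the worker
    t.join()
    print(processed)
```

In a request handler, a `False` return from `enqueue_event` would typically map to a 503 response so that the sender's retry policy redelivers the event later.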

Payload validation

Webhook payloads require validation beyond signature verification. Malformed payloads cause processing errors; unexpected payloads may indicate API version changes or misconfiguration.

Implement schema validation for incoming payloads:

from jsonschema import validate, ValidationError

WEBHOOK_SCHEMAS = {
    'case.created': {
        'type': 'object',
        'required': ['event_type', 'timestamp', 'data'],
        'properties': {
            'event_type': {'type': 'string', 'const': 'case.created'},
            # Note: 'format' is only enforced when validate() is given a format_checker
            'timestamp': {'type': 'string', 'format': 'date-time'},
            'data': {
                'type': 'object',
                'required': ['case_id', 'created_by'],
                'properties': {
                    'case_id': {'type': 'string', 'pattern': '^case_[a-z0-9]+$'},
                    'created_by': {'type': 'string'},
                    'priority': {'type': 'string', 'enum': ['low', 'medium', 'high', 'critical']}
                }
            }
        }
    }
}

def validate_webhook_payload(payload):
    """Raise ValueError for unknown event types, ValidationError for bad structure."""
    event_type = payload.get('event_type')
    schema = WEBHOOK_SCHEMAS.get(event_type)
    if not schema:
        raise ValueError(f'Unknown event type: {event_type}')
    validate(instance=payload, schema=schema)

Schema validation catches payload structure changes before they cause downstream errors. When the sending system removes a required field, changes a type, or introduces an unknown event type, the validation failure provides clear diagnostic information. Fields the sender adds are accepted unless the schema sets additionalProperties to false.

Retry and failure handling

Webhook delivery fails for transient reasons (network timeouts, temporary service unavailability) and permanent reasons (invalid endpoint, authentication failure). Configure retry behaviour to handle transient failures without overwhelming failing systems.

+--------------------------------------------------------------------+
|                       WEBHOOK DELIVERY FLOW                        |
+--------------------------------------------------------------------+
|                                                                    |
|  Initial Delivery                                                  |
|       |                                                            |
|       v                                                            |
|  +----+----+     Success (2xx)      +------------------+           |
|  | Deliver +----------------------->| Complete         |           |
|  +----+----+                        +------------------+           |
|       |                                                            |
|       | Failure (timeout, 5xx, network error)                      |
|       v                                                            |
|  +----+----+                                                       |
|  | Queue   |                                                       |
|  | Retry   |                                                       |
|  +----+----+                                                       |
|       |                                                            |
|       v                                                            |
|  +----+----+     Attempt < Max      +------------------+           |
|  | Wait    +----------------------->| Retry Delivery   +----+      |
|  | Backoff |                        +--------+---------+    |      |
|  +----+----+                                 |              |      |
|       |                                      | Success      |      |
|       | Attempt >= Max                       v              |      |
|       v                                +-----+------+       |      |
|  +----+----+                           | Complete   |       |      |
|  | Dead    |                           +------------+       |      |
|  | Letter  |                                                |      |
|  +---------+                           +------------+       |      |
|                                        | Failure    |<------+      |
|                                        +------------+              |
|                                        (back to Queue Retry)       |
+--------------------------------------------------------------------+

Figure 1: Webhook delivery state transitions with retry and dead letter handling

Configure retry policies based on the failure type:

webhook:
  retry:
    # Retry on these status codes
    retryable_status_codes:
      - 408  # Request Timeout
      - 429  # Too Many Requests
      - 500  # Internal Server Error
      - 502  # Bad Gateway
      - 503  # Service Unavailable
      - 504  # Gateway Timeout
    # Do not retry on these (permanent failures)
    terminal_status_codes:
      - 400  # Bad Request - payload issue
      - 401  # Unauthorised - credential issue
      - 403  # Forbidden - permission issue
      - 404  # Not Found - endpoint removed
      - 410  # Gone - endpoint permanently removed
    # Retry timing (each delay is measured from the previous attempt)
    schedule:
      - delay_seconds: 60     # Attempt 2 at T+1m
      - delay_seconds: 300    # Attempt 3 at T+6m
      - delay_seconds: 1800   # Attempt 4 at T+36m
      - delay_seconds: 7200   # Attempt 5 at T+2h36m
      - delay_seconds: 21600  # Attempt 6 at T+8h36m

This schedule spreads five retries over roughly 8.6 hours (8 h 36 min), providing time for extended outages to resolve while avoiding excessive retry volume.
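
The offsets in the schedule comments are cumulative sums of the individual delays. A small sketch for checking a schedule before deploying it follows; the function names are illustrative and not part of any platform API. It also reproduces the exponential policy from the outbound configuration (60-second initial delay, multiplier 2, capped at 3600 seconds), assuming five retries.

```python
from itertools import accumulate


def retry_offsets(delays_seconds):
    """Cumulative seconds after the initial failure at which each retry fires."""
    return list(accumulate(delays_seconds))


def exponential_delays(initial, multiplier, retries, cap):
    """Delays for an exponential policy, each capped at `cap` seconds."""
    return [min(int(initial * multiplier ** n), cap) for n in range(retries)]


def fmt(seconds):
    """Format seconds as e.g. '2h36m'."""
    hours, rem = divmod(seconds, 3600)
    return f"{hours}h{rem // 60:02d}m"


if __name__ == "__main__":
    explicit = [60, 300, 1800, 7200, 21600]
    print([fmt(s) for s in retry_offsets(explicit)])
    # ['0h01m', '0h06m', '0h36m', '2h36m', '8h36m']

    backoff = exponential_delays(initial=60, multiplier=2, retries=5, cap=3600)
    print(backoff)                 # [60, 120, 240, 480, 960]
    print(retry_offsets(backoff))  # [60, 180, 420, 900, 1860]
```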

For high-volume webhook integrations, implement circuit breaker logic to prevent cascade failures:

from datetime import datetime, timedelta
from enum import Enum

class CircuitState(Enum):
    CLOSED = 'closed'        # Normal operation
    OPEN = 'open'            # Failing - reject requests
    HALF_OPEN = 'half_open'  # Testing recovery

class WebhookCircuitBreaker:
    def __init__(self, failure_threshold=5, recovery_timeout=300):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.failure_count = 0
        self.last_failure_time = None
        self.state = CircuitState.CLOSED

    def record_success(self):
        self.failure_count = 0
        self.state = CircuitState.CLOSED

    def record_failure(self):
        self.failure_count += 1
        self.last_failure_time = datetime.utcnow()
        if self.failure_count >= self.failure_threshold:
            self.state = CircuitState.OPEN

    def can_attempt(self):
        if self.state == CircuitState.CLOSED:
            return True
        if self.state == CircuitState.OPEN:
            # Check if recovery timeout elapsed
            if datetime.utcnow() - self.last_failure_time > timedelta(seconds=self.recovery_timeout):
                self.state = CircuitState.HALF_OPEN
                return True
            return False
        # HALF_OPEN: allow the test attempt
        return True

# Usage per destination endpoint
circuit_breakers = {}

def get_circuit_breaker(endpoint_url):
    if endpoint_url not in circuit_breakers:
        circuit_breakers[endpoint_url] = WebhookCircuitBreaker()
    return circuit_breakers[endpoint_url]

The circuit breaker prevents sending system resources from being consumed by requests to unavailable endpoints. After 5 consecutive failures, the circuit opens and rejects delivery attempts for 5 minutes before allowing a test request.
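
To exercise these transitions without waiting out the recovery timeout, a variant with an injectable clock can help. The sketch below is a condensed restatement of the breaker for illustration only; the `clock` parameter, the `Breaker` class name, and the `deliver_with_breaker` wrapper are assumptions rather than part of the class above.

```python
import time
from typing import Callable, Optional


class Breaker:
    """Condensed circuit breaker with an injectable monotonic clock."""

    def __init__(self, failure_threshold: int = 5, recovery_timeout: float = 300,
                 clock: Callable[[], float] = time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None  # None means the circuit is closed

    def can_attempt(self) -> bool:
        if self.opened_at is None:
            return True
        # Open: allow a probe only once the recovery timeout has elapsed
        return self.clock() - self.opened_at > self.recovery_timeout

    def record(self, success: bool) -> None:
        if success:
            self.failures, self.opened_at = 0, None
            return
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()


def deliver_with_breaker(breaker: Breaker, send: Callable[[], bool]) -> str:
    """Skip delivery while the circuit is open; otherwise record the outcome."""
    if not breaker.can_attempt():
        return "skipped"
    ok = send()
    breaker.record(ok)
    return "delivered" if ok else "failed"


if __name__ == "__main__":
    now = [0.0]  # fake clock, advanced manually
    breaker = Breaker(failure_threshold=5, recovery_timeout=300, clock=lambda: now[0])
    print([deliver_with_breaker(breaker, lambda: False) for _ in range(6)])
    now[0] = 301.0  # recovery timeout elapsed: a probe is allowed
    print(deliver_with_breaker(breaker, lambda: True))
```

With the fake clock, the first five calls fail, the sixth is skipped because the circuit is open, and the probe after the timeout closes the circuit again.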

Monitoring and logging

Webhook integrations require monitoring for delivery success rate, latency, and error patterns. Instrument both sending and receiving components.

import time
import requests
from prometheus_client import Counter, Histogram

# Outbound webhook metrics
webhook_deliveries = Counter(
    'webhook_deliveries_total',
    'Total webhook delivery attempts',
    ['endpoint', 'event_type', 'status']
)
webhook_latency = Histogram(
    'webhook_delivery_seconds',
    'Webhook delivery latency',
    ['endpoint'],
    buckets=[0.1, 0.5, 1.0, 2.0, 5.0, 10.0, 30.0]
)

def deliver_webhook(endpoint, event_type, payload):
    start_time = time.time()
    try:
        response = requests.post(endpoint, json=payload, timeout=30)
        # Bucket the HTTP status into a small, fixed set of label values
        if response.status_code < 400:
            status = 'success'
        elif response.status_code < 500:
            status = 'client_error'
        else:
            status = 'server_error'
        webhook_deliveries.labels(endpoint=endpoint, event_type=event_type, status=status).inc()
        return response
    except requests.Timeout:
        webhook_deliveries.labels(endpoint=endpoint, event_type=event_type, status='timeout').inc()
        raise
    except requests.ConnectionError:
        webhook_deliveries.labels(endpoint=endpoint, event_type=event_type, status='connection_error').inc()
        raise
    finally:
        # Record latency for successes and failures alike
        webhook_latency.labels(endpoint=endpoint).observe(time.time() - start_time)

Configure alerting thresholds based on expected delivery patterns:

| Metric | Warning threshold | Critical threshold |
| --- | --- | --- |
| Delivery success rate | Below 95% over 15 minutes | Below 80% over 5 minutes |
| Average latency | Above 5 seconds | Above 15 seconds |
| Dead letter queue depth | Above 50 items | Above 200 items |
| Circuit breakers open | Any endpoint open | More than 3 endpoints open |
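
The success-rate row of the table could be evaluated along these lines. This is a sketch only: the `success_rate` and `classify_success_rate` helpers and their inputs are hypothetical, and in a Prometheus-based setup the same thresholds would normally live in alerting rules rather than application code.

```python
def success_rate(delivered: int, attempted: int) -> float:
    """Fraction of successful deliveries; treat an idle window as healthy."""
    return 1.0 if attempted == 0 else delivered / attempted


def classify_success_rate(rate_5m: float, rate_15m: float) -> str:
    """Apply the table's thresholds; critical is checked first."""
    if rate_5m < 0.80:   # below 80% over 5 minutes
        return "critical"
    if rate_15m < 0.95:  # below 95% over 15 minutes
        return "warning"
    return "ok"


if __name__ == "__main__":
    print(classify_success_rate(success_rate(70, 100), success_rate(280, 300)))  # critical
    print(classify_success_rate(success_rate(90, 100), success_rate(270, 300)))  # warning
    print(classify_success_rate(success_rate(99, 100), success_rate(297, 300)))  # ok
```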

Log webhook deliveries with sufficient context for debugging:

import logging
import json
from datetime import datetime

logger = logging.getLogger('webhooks')

def log_webhook_delivery(endpoint, event_type, payload, response=None, error=None):
    log_entry = {
        'endpoint': endpoint,
        'event_type': event_type,
        'event_id': payload.get('event_id'),
        'timestamp': datetime.utcnow().isoformat(),
    }
    # Compare with None: a requests.Response is falsy for 4xx/5xx statuses,
    # so a plain truthiness check would skip exactly the failed deliveries.
    if response is not None:
        log_entry['status_code'] = response.status_code
        log_entry['response_time_ms'] = response.elapsed.total_seconds() * 1000
        # Truncate response body to prevent log bloat
        log_entry['response_body'] = response.text[:500] if response.text else None
    if error:
        log_entry['error'] = str(error)
        log_entry['error_type'] = type(error).__name__
    # Log payload only on failure
    if error or (response is not None and response.status_code >= 400):
        log_entry['payload'] = payload
    logger.info(json.dumps(log_entry))
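
Because logged payloads may carry secrets or personal data (see the payload-logging caveat in the outbound configuration), a redaction pass before serialising the log entry is worth considering. The sketch below uses a hypothetical `redact` helper and an assumed list of sensitive key names; adjust both to the fields your payloads actually contain.

```python
# Assumed key names; matched case-insensitively at any nesting depth.
SENSITIVE_KEYS = {"authorization", "x-signature-256", "secret", "token", "password"}


def redact(value, sensitive=SENSITIVE_KEYS):
    """Return a copy with the values of sensitive keys replaced."""
    if isinstance(value, dict):
        return {
            k: "[REDACTED]" if str(k).lower() in sensitive else redact(v, sensitive)
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [redact(item, sensitive) for item in value]
    return value


if __name__ == "__main__":
    entry = {"event_id": "evt_1", "payload": {"token": "abc", "data": [{"password": "x"}]}}
    print(redact(entry))
```

The function returns a new structure and leaves the original payload untouched, so it can be applied just before `json.dumps` without affecting processing.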

Testing webhook integrations

Test webhook configurations before enabling in production. Testing validates endpoint connectivity, authentication, payload handling, and retry behaviour.

  1. Use a webhook testing service to inspect payloads during development. Services like webhook.site provide temporary endpoints that display received requests:
# Configure webhook to temporary test endpoint
curl -X POST https://your-system.example.org/api/webhooks \
  -H "Authorization: Bearer ${ADMIN_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://webhook.site/unique-id",
    "events": ["case.created"],
    "enabled": true
  }'

Trigger a test event and verify the payload appears in the testing service interface with correct headers and body structure.

  2. Test signature verification by sending requests with invalid signatures:
# Send request with wrong signature
curl -X POST https://api.example.org/webhooks/receive \
  -H "Content-Type: application/json" \
  -H "X-Signature-256: sha256=invalid_signature_here" \
  -d '{"event_type": "case.created", "data": {}}'
# Expected response: 401 Unauthorized

Confirm the endpoint rejects requests with missing, malformed, or incorrect signatures.

  3. Test idempotency by sending duplicate requests:
# Send same event ID twice
# VALID_SIGNATURE must be computed over exactly the request body sent below
EVENT_ID="evt_test_$(date +%s)"
curl -X POST https://api.example.org/webhooks/receive \
  -H "Content-Type: application/json" \
  -H "X-Signature-256: ${VALID_SIGNATURE}" \
  -H "X-Event-ID: ${EVENT_ID}" \
  -d '{"event_type": "case.created", "event_id": "'"${EVENT_ID}"'", "data": {}}'
# Send again with same event ID
curl -X POST https://api.example.org/webhooks/receive \
  -H "Content-Type: application/json" \
  -H "X-Signature-256: ${VALID_SIGNATURE}" \
  -H "X-Event-ID: ${EVENT_ID}" \
  -d '{"event_type": "case.created", "event_id": "'"${EVENT_ID}"'", "data": {}}'

Verify the second request returns success but does not create duplicate records.

  4. Test timeout handling by configuring an artificially slow endpoint:
# Test endpoint that delays response
import time

@app.route('/webhooks/slow', methods=['POST'])
def slow_webhook():
    time.sleep(45)  # Exceeds typical 30-second timeout
    return jsonify({'received': True}), 200

Confirm the sending system times out appropriately and queues for retry.

  5. Validate retry behaviour by returning error responses:
# Test endpoint that fails initially then succeeds
attempt_count = {}

@app.route('/webhooks/flaky', methods=['POST'])
def flaky_webhook():
    event_id = request.json.get('event_id')
    attempt_count[event_id] = attempt_count.get(event_id, 0) + 1
    # Fail the first two attempts for each event, succeed on the third
    if attempt_count[event_id] < 3:
        return jsonify({'error': 'Temporary failure'}), 503
    return jsonify({'received': True}), 200

Verify the sending system retries and eventually succeeds on the third attempt.

Verification

After configuration, verify the complete integration path.

Confirm outbound webhook delivery:

# Check recent delivery logs
curl -X GET "https://your-system.example.org/api/webhooks/wh_abc123/deliveries?limit=10" \
  -H "Authorization: Bearer ${ADMIN_TOKEN}"
# Expected output shows successful deliveries
{
  "deliveries": [
    {
      "id": "del_xyz789",
      "event_type": "case.created",
      "status": "delivered",
      "response_code": 200,
      "response_time_ms": 245,
      "delivered_at": "2024-11-15T14:30:00Z"
    }
  ]
}

Verify inbound webhook processing:

# Check application logs for received webhooks
grep "webhook" /var/log/app/application.log | tail -20
# Expected: entries showing received and processed webhooks
{"endpoint":"/webhooks/receive","event_type":"case.created","event_id":"evt_abc123","status_code":200,"response_time_ms":45}

Confirm signature verification is enforced:

# Attempt delivery without signature
curl -X POST https://api.example.org/webhooks/receive \
  -H "Content-Type: application/json" \
  -d '{"event_type": "test"}'
# Must return 401, not 200

Verify dead letter queue is operational:

# Check dead letter queue depth
redis-cli LLEN webhook:dead_letter
# Expected: 0 if no failures, or count of failed webhooks
# Inspect failed webhook if present
redis-cli LINDEX webhook:dead_letter 0

Troubleshooting

| Symptom | Cause | Resolution |
| --- | --- | --- |
| Deliveries fail with “connection refused” | Endpoint not listening, firewall blocking | Verify endpoint is running: curl -I https://endpoint; check firewall rules permit source IPs |
| Deliveries fail with “certificate verify failed” | Invalid, expired, or self-signed TLS certificate | Install valid certificate from public CA; verify with openssl s_client -connect host:443 |
| Signature verification fails on valid requests | Secret mismatch, encoding issue, or signature computed over wrong content | Verify secrets match exactly including whitespace; confirm signature computed over raw body not parsed JSON |
| Duplicate events processed despite idempotency | Event ID not present in payload, Redis unavailable, or TTL too short | Check payload contains event ID; verify Redis connectivity; extend deduplication TTL |
| Webhook endpoint returns 504 Gateway Timeout | Processing takes longer than proxy timeout | Return 200 immediately, process asynchronously; increase proxy timeout if synchronous processing required |
| Retries not occurring after failure | Status code not in retryable list, circuit breaker open | Check retry configuration includes the status code; verify circuit breaker state |
| High latency on webhook delivery | Slow DNS resolution, TLS handshake overhead, or receiver processing time | Cache DNS; use connection pooling; ensure receiver responds before processing |
| Webhook deliveries succeed but events not processed | Receiver returns 200 before processing, then fails silently | Implement dead letter queue; add processing confirmation logs; monitor queue depth |
| Rate limiting errors (429) from receiver | Delivery volume exceeds receiver capacity | Implement sending-side rate limiting; batch events if receiver supports it; increase receiver capacity |
| Payload too large errors (413) | Event data exceeds receiver’s body size limit | Increase client_max_body_size on receiver; reduce payload size by omitting unnecessary fields |
| Events delivered out of order | Concurrent delivery, retry delays | Include sequence number in payload; implement ordering in receiver if required; accept eventual consistency |
| Memory exhaustion in receiver | Unbounded payload buffering, queue growth | Limit in-flight webhooks; stream large payloads; implement backpressure |
| Webhook secret exposed in logs | Logging configuration includes headers or secrets | Redact sensitive headers in logging; rotate compromised secrets immediately |
| Intermittent authentication failures | Secret rotation during delivery, clock skew for time-based auth | Implement grace period accepting old and new secrets during rotation; synchronise clocks with NTP |
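
The grace period suggested in the last row can be implemented by checking the signature against every currently accepted secret. A minimal sketch follows, assuming the same `sha256=<hex>` header format used in the receiver example; `expected_signature`, `verify_with_rotation`, and the list of secrets are illustrative names.

```python
import hashlib
import hmac


def expected_signature(secret: bytes, body: bytes) -> str:
    """Header value a sender holding `secret` would produce for `body`."""
    return "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_with_rotation(body: bytes, signature_header: str, secrets: list) -> bool:
    """Accept the request if any active secret produces a matching signature."""
    received = signature_header.encode("utf-8")
    matched = False
    for secret in secrets:
        candidate = expected_signature(secret, body).encode("utf-8")
        # Constant-time comparison on bytes; keep looping so timing does
        # not reveal which secret in the list matched.
        if hmac.compare_digest(candidate, received):
            matched = True
    return matched


if __name__ == "__main__":
    body = b'{"event_type":"case.created"}'
    old, new = b"old-secret", b"new-secret"
    header = expected_signature(old, body)  # sender has not rotated yet
    print(verify_with_rotation(body, header, [new, old]))  # True
    print(verify_with_rotation(body, header, [new]))       # False
```

Once the sender is confirmed to be signing with the new secret, the old one is removed from the list to end the grace period.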

See also