Data Backup and Recovery
Data backup creates copies of information at specific points in time, enabling restoration when primary data becomes unavailable due to hardware failure, corruption, accidental deletion, or security incidents. Recovery transforms those backup copies into usable data within production systems. The procedures on this page cover the operational execution of backup and recovery tasks across database systems, file storage, and cloud platforms.
Prerequisites
Before configuring or executing backup operations, verify the following requirements are satisfied.
- Retention policy: A documented retention schedule specifying how long backups must be kept for each data classification. Without this, you cannot configure retention parameters correctly. See the Backup and Recovery Standard for policy requirements.
- Recovery objectives: Defined Recovery Point Objective (RPO) and Recovery Time Objective (RTO) for each system. RPO determines backup frequency; RTO determines recovery procedure design.
- Storage allocation: Sufficient backup storage provisioned and accessible. Calculate required capacity as: (full backup size) + (daily change rate × days between full backups × number of retained cycles).
- Access credentials: Service account credentials with appropriate permissions for backup operations. Database backups require backup-specific roles; file system backups require read access to the source and write access to the destination.
- Network connectivity: A verified network path between source systems and backup storage. For cloud backup, confirm egress bandwidth and any data transfer cost implications.
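As a worked example, the capacity formula above can be applied with hypothetical figures: a 500 GB full backup, 15 GB daily change, weekly fulls, and four retained cycles. Substitute measured values for your own systems.

```shell
# Hypothetical sizing figures; replace with measured values
FULL_GB=500            # full backup size
DAILY_CHANGE_GB=15     # daily change rate
DAYS_BETWEEN_FULLS=7   # weekly full backups
RETAINED_CYCLES=4      # number of retained cycles

REQUIRED_GB=$(( FULL_GB + DAILY_CHANGE_GB * DAYS_BETWEEN_FULLS * RETAINED_CYCLES ))
echo "Provision at least ${REQUIRED_GB} GB of backup storage"   # 920 GB
```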
Verify backup tool installation and version compatibility:
```shell
# Check restic version (minimum 0.14.0 for compression support)
restic version
# Expected: restic 0.16.0 or higher

# Check PostgreSQL client tools
pg_dump --version
# Expected: pg_dump (PostgreSQL) 15.x or compatible with target server

# Verify cloud CLI authentication
aws sts get-caller-identity
# Expected: Account and ARN details for backup service account
```

Confirm storage repository is initialised and accessible:

```shell
# For restic with S3-compatible storage
export RESTIC_REPOSITORY="s3:s3.eu-west-1.amazonaws.com/orgname-backups"
export RESTIC_PASSWORD_FILE="/etc/restic/password"

restic snapshots
# Expected: List of existing snapshots or empty list for new repository
# Error "unable to open config file" indicates uninitialised repository
```

Backup Types
Three backup types serve different purposes in a backup strategy. Understanding their mechanics determines appropriate scheduling and storage allocation.
A full backup copies all data in the backup scope regardless of previous backups. Full backups are self-contained and require no other backups for recovery. The time to complete a full backup scales linearly with data volume. A 500 GB database requires 2 to 4 hours for full backup over gigabit network connections, depending on compression efficiency and storage write speed.
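That 2-to-4-hour window implies sustained throughput in the tens of MB/s. A rough sanity check, assuming the 500 GB example and a 3-hour run (integer arithmetic, so the result is approximate):

```shell
SIZE_GB=500
HOURS=3
# Sustained throughput needed to finish in the window, in MB/s (1 GB = 1024 MB)
THROUGHPUT_MBPS=$(( SIZE_GB * 1024 / (HOURS * 3600) ))
echo "${THROUGHPUT_MBPS} MB/s sustained"   # 47 MB/s
```

If your storage or network cannot sustain roughly this rate, widen the backup window or reduce the scope per job.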
An incremental backup copies only data that changed since the most recent backup of any type. Incremental backups complete faster and consume less storage than full backups, but recovery requires the most recent full backup plus all subsequent incrementals in sequence. Loss of any incremental in the chain prevents recovery of later data.
A differential backup copies all data that changed since the most recent full backup. Differential backups grow larger as time passes since the last full but require only two backup sets for recovery: the full plus the single differential. This simplifies recovery at the cost of larger backup sizes than pure incremental strategies.
```
BACKUP TYPE COMPARISON

FULL BACKUP
  Day 1: 500 GB (complete dataset)
  Day 8: 500 GB (complete dataset)
  Recovery: Single backup set required

INCREMENTAL BACKUP
  Day 1: 500 GB (full)
  Day 2: 15 GB (changes since Day 1)
  Day 3: 12 GB (changes since Day 2)
  Day 4: 18 GB (changes since Day 3)
  Recovery: Full + all incrementals in sequence

DIFFERENTIAL BACKUP
  Day 1: 500 GB (full)
  Day 2: 15 GB (changes since Day 1)
  Day 3: 27 GB (changes since Day 1)
  Day 4: 45 GB (changes since Day 1)
  Recovery: Full + single differential
```

Figure 1: Backup type comparison showing storage consumption and recovery dependencies
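Summing the Figure 1 figures makes the trade-off concrete: over the same four days, incrementals consume less storage than differentials, at the price of a longer recovery chain.

```shell
# Day 1-4 sizes from Figure 1, in GB
INCREMENTAL_TOTAL=$(( 500 + 15 + 12 + 18 ))
DIFFERENTIAL_TOTAL=$(( 500 + 15 + 27 + 45 ))
echo "Incremental: ${INCREMENTAL_TOTAL} GB, Differential: ${DIFFERENTIAL_TOTAL} GB"
```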
For organisations with limited backup windows, schedule weekly full backups with daily incrementals. This balances backup speed against recovery complexity. Where recovery speed is critical and storage costs are secondary, daily differentials simplify restoration procedures.
Procedure
Configuring PostgreSQL Database Backups
PostgreSQL backups use pg_dump for logical backups or pg_basebackup for physical backups. Logical backups create SQL statements that reconstruct the database; physical backups copy the actual data files. Logical backups offer flexibility for partial restores and version migrations. Physical backups enable point-in-time recovery when combined with write-ahead log (WAL) archiving.
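Point-in-time recovery with physical backups depends on the server archiving its WAL segments. A minimal sketch of the relevant postgresql.conf settings, assuming a local archive directory at /backup/wal (the path is illustrative; the archive_command pattern follows the PostgreSQL documentation):

```
# postgresql.conf -- prerequisites for WAL archiving and point-in-time recovery
wal_level = replica        # minimum level that supports physical backups
archive_mode = on          # changing this requires a server restart
archive_command = 'test ! -f /backup/wal/%f && cp %p /backup/wal/%f'
```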
- Create a dedicated backup user with minimal required permissions:
```sql
-- Connect as superuser
CREATE ROLE backup_user WITH LOGIN PASSWORD 'secure-password-here';
GRANT CONNECT ON DATABASE programme_db TO backup_user;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO backup_user;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO backup_user;
```

Store credentials in a .pgpass file with restricted permissions:

```shell
echo "db-server:5432:programme_db:backup_user:secure-password-here" >> ~/.pgpass
chmod 600 ~/.pgpass
```

- Execute a full logical backup with compression:

```shell
pg_dump \
  --host=db-server \
  --port=5432 \
  --username=backup_user \
  --format=custom \
  --compress=9 \
  --file=/backup/staging/programme_db_$(date +%Y%m%d_%H%M%S).dump \
  programme_db
```

Expected output: No output on success. Exit code 0 indicates completion.

Verify backup file creation:

```shell
ls -lh /backup/staging/programme_db_*.dump
# Expected: File with reasonable size (compression typically achieves 70-80% reduction)
```

- Validate backup integrity before transfer to permanent storage:

```shell
pg_restore \
  --list \
  /backup/staging/programme_db_20240115_020000.dump > /dev/null

echo "Exit code: $?"
# Expected: Exit code: 0
# Non-zero exit code indicates corrupt backup
```

- Transfer validated backup to remote storage:

```shell
# Using restic for encrypted, deduplicated backup
restic backup \
  --tag postgresql \
  --tag programme_db \
  /backup/staging/programme_db_20240115_020000.dump

# Expected output includes:
# snapshot abc12345 saved
```

- Apply retention policy to remove expired backups:

```shell
restic forget \
  --keep-daily 7 \
  --keep-weekly 4 \
  --keep-monthly 12 \
  --prune \
  --tag postgresql

# Expected: List of removed snapshots and freed space
```

- Remove local staging copy after successful remote storage:

```shell
rm /backup/staging/programme_db_20240115_020000.dump
```

Configuring File System Backups
File system backups protect documents, configuration files, and unstructured data. The procedures differ from database backups because file systems lack transactional consistency guarantees. Files may be modified during backup, creating inconsistent copies. For critical data, consider application-level quiescing or snapshot-based approaches.
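One snapshot-based approach is an LVM snapshot: freeze a point-in-time view of the volume, back up the frozen view, then discard it. A minimal sketch, assuming the data lives on a logical volume named data in volume group vg0 with free extents available; the volume names and snapshot size are illustrative, not part of the procedures above.

```shell
# Hypothetical volume layout: adjust vg0/data and the snapshot size to your system
snapshot_backup() {
  # Reserve 10G to absorb writes that occur while the backup runs
  lvcreate --size 10G --snapshot --name data_snap /dev/vg0/data || return 1
  mkdir -p /mnt/snap
  mount -o ro /dev/vg0/data_snap /mnt/snap
  # Files under /mnt/snap are frozen at snapshot time, so the copy is consistent
  restic backup --tag fileserver /mnt/snap
  umount /mnt/snap
  lvremove -f /dev/vg0/data_snap
}
```

Run the function from a shell that has RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE exported; the snapshot itself only needs enough space for writes made during the backup, not a full copy of the volume.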
- Define backup scope by creating an inclusion list:
```shell
cat > /etc/restic/backup-paths.txt << 'EOF'
/home
/var/www
/etc
/opt/applications
EOF
```

Create an exclusion list for temporary and cache files:

```shell
cat > /etc/restic/excludes.txt << 'EOF'
*.tmp
*.cache
/node_modules
/.git
/vendor
/home/*/.cache
/var/www/*/storage/logs
EOF
```

- Initialise the backup repository if not already configured:

```shell
export RESTIC_REPOSITORY="s3:s3.eu-west-1.amazonaws.com/orgname-backups"
export RESTIC_PASSWORD_FILE="/etc/restic/password"

restic init
# Expected: created restic repository at s3:...
# Skip if repository already exists
```

- Execute backup with exclusions and tagging:

```shell
restic backup \
  --files-from /etc/restic/backup-paths.txt \
  --exclude-file /etc/restic/excludes.txt \
  --tag fileserver \
  --tag $(hostname) \
  --verbose

# Expected output:
# Files: 1523 new, 47 changed, 8934 unmodified
# Dirs: 245 new, 12 changed, 1456 unmodified
# Added to the repo: 234.5 MiB
# snapshot def67890 saved
```

- Verify snapshot content:

```shell
restic ls latest --long | head -20
# Expected: File listing with permissions, sizes, and paths
```

- Check repository integrity:

```shell
restic check
# Expected: "no errors were found"
# Errors indicate repository corruption requiring investigation
```

Configuring Cloud Database Backups
Cloud-managed databases provide automated backup capabilities that differ from self-managed approaches. These procedures configure and verify cloud-native backup features rather than replacing them with external tools.
- Configure automated backups for AWS RDS PostgreSQL:
```shell
aws rds modify-db-instance \
  --db-instance-identifier programme-db-prod \
  --backup-retention-period 14 \
  --preferred-backup-window "02:00-03:00" \
  --apply-immediately

# Expected: DBInstance modification pending
```

Verify configuration:

```shell
aws rds describe-db-instances \
  --db-instance-identifier programme-db-prod \
  --query 'DBInstances[0].{Retention:BackupRetentionPeriod,Window:PreferredBackupWindow}'

# Expected:
# {
#     "Retention": 14,
#     "Window": "02:00-03:00"
# }
```

- Create a manual snapshot before major changes:

```shell
aws rds create-db-snapshot \
  --db-instance-identifier programme-db-prod \
  --db-snapshot-identifier programme-db-pre-migration-20240115

# Expected: DBSnapshot creation initiated
```

Monitor snapshot progress:

```shell
aws rds describe-db-snapshots \
  --db-snapshot-identifier programme-db-pre-migration-20240115 \
  --query 'DBSnapshots[0].Status'

# Expected: "available" when complete (typically 10-30 minutes)
```

- Configure cross-region snapshot replication for disaster recovery:

```shell
aws rds copy-db-snapshot \
  --source-db-snapshot-identifier arn:aws:rds:eu-west-1:123456789:snapshot:programme-db-pre-migration-20240115 \
  --target-db-snapshot-identifier programme-db-pre-migration-20240115-eu-west-2 \
  --source-region eu-west-1 \
  --region eu-west-2 \
  --kms-key-id alias/rds-backup-key

# Expected: Snapshot copy initiated to secondary region
```

- Enable point-in-time recovery verification:

```shell
aws rds describe-db-instances \
  --db-instance-identifier programme-db-prod \
  --query 'DBInstances[0].LatestRestorableTime'

# Expected: ISO timestamp within last 5 minutes
# Older timestamps indicate replication lag or configuration issues
```

Cloud backup costs
Cloud database snapshots incur storage costs. A 500 GB database with 14-day retention and 3% daily change rate requires approximately 710 GB of snapshot storage. Cross-region replication doubles this. Calculate expected costs before enabling extended retention.
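The 710 GB figure follows directly from the retention arithmetic:

```shell
FULL_GB=500
RETENTION_DAYS=14
CHANGE_RATE_PCT=3   # daily change rate

# Base snapshot plus incremental blocks for each retained day
SNAPSHOT_GB=$(( FULL_GB + FULL_GB * CHANGE_RATE_PCT * RETENTION_DAYS / 100 ))
echo "~${SNAPSHOT_GB} GB snapshot storage (double for cross-region copies)"
```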
Configuring MongoDB Backups
MongoDB backup procedures depend on the deployment topology. Standalone instances use mongodump; replica sets require coordination to ensure consistent backups. Sharded clusters add complexity through distributed data and configuration servers.
- Create backup user with appropriate roles:
```javascript
// Connect to admin database as administrator
use admin
db.createUser({
  user: "backup_user",
  pwd: "secure-password-here",
  roles: [
    { role: "backup", db: "admin" },
    { role: "readAnyDatabase", db: "admin" }
  ]
})
```

- Execute backup with compression and authentication:

```shell
mongodump \
  --uri="mongodb://backup_user:secure-password-here@mongo-primary:27017" \
  --authenticationDatabase=admin \
  --gzip \
  --archive=/backup/staging/beneficiary_db_$(date +%Y%m%d_%H%M%S).archive

# Expected output:
# done dumping beneficiary_db.registrations (45678 documents)
# done dumping beneficiary_db.distributions (12345 documents)
```

- For replica sets, backup from secondary to reduce primary load:

```shell
mongodump \
  --uri="mongodb://backup_user:secure-password-here@mongo-secondary:27017" \
  --authenticationDatabase=admin \
  --readPreference=secondary \
  --gzip \
  --oplog \
  --archive=/backup/staging/beneficiary_db_$(date +%Y%m%d_%H%M%S).archive

# The --oplog flag captures operations during backup for point-in-time consistency
```

- Validate backup by listing contents:

```shell
mongorestore \
  --gzip \
  --archive=/backup/staging/beneficiary_db_20240115_020000.archive \
  --dryRun

# Expected: List of collections with document counts, no actual restoration
```

Recovery Procedures
Recovery operations restore data from backups to production systems. The specific procedure depends on recovery scope (full database, specific tables, individual files) and the backup format. All recovery operations should execute first in a non-production environment to validate data integrity before production restoration.
```
RECOVERY DECISION FLOW

Recovery needed
+-- Full system recovery
|     1. Use most recent full backup
|     2. Apply incrementals in sequence
|     3. Validate and promote
+-- Partial data recovery
      1. Identify affected objects/tables
      2. Extract from backup to staging
      3. Validate and merge
```

Figure 2: Recovery procedure decision flow
Full Database Recovery
- Identify the appropriate backup snapshot for recovery:
```shell
restic snapshots --tag postgresql --tag programme_db

# Output shows available snapshots with timestamps:
# ID        Time                 Host       Tags
# abc12345  2024-01-15 02:15:00  db-server  postgresql,programme_db
# def67890  2024-01-14 02:15:00  db-server  postgresql,programme_db
```

Select the most recent snapshot before the incident. For data corruption discovered on 15 January at 14:00, use the 02:15:00 snapshot from the same day.

- Restore backup file to local staging:

```shell
restic restore abc12345 \
  --target /backup/restore \
  --include "programme_db_*.dump"

# Expected: restoring to /backup/restore
```

- Terminate existing connections to the target database:

```sql
-- Execute as superuser
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE datname = 'programme_db'
  AND pid <> pg_backend_pid();
```

- Drop and recreate the database:

```sql
DROP DATABASE programme_db;
CREATE DATABASE programme_db OWNER programme_app;
```

- Restore from backup:

```shell
pg_restore \
  --host=db-server \
  --port=5432 \
  --username=postgres \
  --dbname=programme_db \
  --verbose \
  /backup/restore/programme_db_20240115_021500.dump

# Expected: Restoration messages for each object
# Errors about existing objects indicate incomplete drop
```

- Verify restoration by checking row counts and sample data:

```sql
-- Compare against known pre-incident values if available
SELECT schemaname, relname, n_live_tup
FROM pg_stat_user_tables
ORDER BY n_live_tup DESC
LIMIT 10;

-- Spot-check recent records
SELECT * FROM registrations ORDER BY created_at DESC LIMIT 5;
```

Partial Table Recovery
When only specific tables require recovery, extract them from backup without affecting other data.
- List contents of the backup to identify available objects:
```shell
pg_restore --list /backup/restore/programme_db_20240115_021500.dump | grep -i distributions

# Output shows table and related objects:
# 1234; 1259 16385 TABLE public distributions programme_app
# 1235; 0 16385 TABLE DATA public distributions programme_app
```

- Create a table list file for selective restoration:
```shell
pg_restore --list /backup/restore/programme_db_20240115_021500.dump > /tmp/full-toc.txt

# Edit to keep only desired objects
grep -E "distributions|SEQUENCE.*distributions" /tmp/full-toc.txt > /tmp/restore-toc.txt
```

- Restore to a temporary schema to avoid overwriting current data:
pg_restore cannot remap objects into a different schema (its --schema option only selects which schema to restore from the dump), so restore the table into a scratch database first, then copy it into a staging schema in production:

```shell
# Restore the selected objects into a scratch database
createdb -h db-server -U postgres restore_scratch
pg_restore \
  -h db-server \
  -U postgres \
  --dbname=restore_scratch \
  --use-list=/tmp/restore-toc.txt \
  /backup/restore/programme_db_20240115_021500.dump

# Move the recovered table into a staging schema within the scratch database
psql -h db-server -U postgres -d restore_scratch \
  -c "CREATE SCHEMA restore_staging;" \
  -c "ALTER TABLE public.distributions SET SCHEMA restore_staging;"

# Copy the staged table into the production database
psql -h db-server -U postgres -d programme_db -c "CREATE SCHEMA restore_staging;"
pg_dump -h db-server -U postgres \
  --table=restore_staging.distributions restore_scratch \
  | psql -h db-server -U postgres -d programme_db

# Remove the scratch database once the copy succeeds
dropdb -h db-server -U postgres restore_scratch
```

- Compare restored data against production and merge as needed:
```sql
-- Count comparison
SELECT 'production' AS source, count(*) FROM public.distributions
UNION ALL
SELECT 'backup' AS source, count(*) FROM restore_staging.distributions;

-- Identify records in backup but not production (deleted/corrupted)
SELECT b.*
FROM restore_staging.distributions b
LEFT JOIN public.distributions p ON b.id = p.id
WHERE p.id IS NULL;
```

- Merge recovered data into production:

```sql
-- Insert missing records
INSERT INTO public.distributions
SELECT * FROM restore_staging.distributions b
WHERE NOT EXISTS (
  SELECT 1 FROM public.distributions p WHERE p.id = b.id
);

-- Clean up
DROP SCHEMA restore_staging CASCADE;
```

File Recovery
- List available snapshots and locate the required files:
```shell
restic snapshots --tag fileserver

restic ls latest --long | grep "important-document"
# Expected: Full path to file with timestamp
```

- Restore specific files or directories:

```shell
# Single file
restic restore latest \
  --target /tmp/restore \
  --include "/var/www/uploads/important-document.pdf"

# Directory tree
restic restore latest \
  --target /tmp/restore \
  --include "/home/shared/project-files/"

# Expected: Restores to /tmp/restore maintaining original path structure
```

- Verify restored files:

```shell
ls -la /tmp/restore/var/www/uploads/important-document.pdf
file /tmp/restore/var/www/uploads/important-document.pdf
# Expected: PDF document with expected size and type
```

- Move restored files to production location:

```shell
# Backup current version if it exists
mv /var/www/uploads/important-document.pdf \
   /var/www/uploads/important-document.pdf.corrupted

# Restore from backup
cp /tmp/restore/var/www/uploads/important-document.pdf \
   /var/www/uploads/important-document.pdf

# Restore permissions
chown www-data:www-data /var/www/uploads/important-document.pdf
chmod 644 /var/www/uploads/important-document.pdf
```

Cloud Database Point-in-Time Recovery
- Identify the target recovery time:
```shell
# Check latest restorable time
aws rds describe-db-instances \
  --db-instance-identifier programme-db-prod \
  --query 'DBInstances[0].LatestRestorableTime'

# Output: "2024-01-15T14:30:00+00:00"
# Recovery target must be between backup retention start and this time
```

- Initiate point-in-time recovery to a new instance:

```shell
aws rds restore-db-instance-to-point-in-time \
  --source-db-instance-identifier programme-db-prod \
  --target-db-instance-identifier programme-db-recovery \
  --restore-time "2024-01-15T10:00:00+00:00" \
  --db-instance-class db.t3.medium \
  --vpc-security-group-ids sg-12345678 \
  --db-subnet-group-name private-subnets

# Expected: DB instance creation initiated
```

- Monitor restoration progress:

```shell
watch -n 30 'aws rds describe-db-instances \
  --db-instance-identifier programme-db-recovery \
  --query "DBInstances[0].DBInstanceStatus"'

# Status progression: creating -> backing-up -> available
# Duration: 15-60 minutes depending on database size
```

- Validate recovered data before production cutover:

```shell
# Connect to recovery instance
psql -h programme-db-recovery.abc123.eu-west-1.rds.amazonaws.com \
     -U admin_user \
     -d programme_db
```

```sql
-- Run validation queries
SELECT count(*) FROM registrations WHERE created_at < '2024-01-15 10:00:00';
```

- Perform production cutover by renaming instances:

```shell
# Rename current production (requires brief outage)
aws rds modify-db-instance \
  --db-instance-identifier programme-db-prod \
  --new-db-instance-identifier programme-db-old \
  --apply-immediately

# Wait for rename to complete before reusing the name
sleep 300

# Rename recovery to production
aws rds modify-db-instance \
  --db-instance-identifier programme-db-recovery \
  --new-db-instance-identifier programme-db-prod \
  --apply-immediately
```

Verification
Backup verification confirms that backup data can be successfully restored. Verification should occur after every backup job and through periodic full restore tests.
Post-Backup Verification
Execute these checks after each backup completes:
```shell
# Verify restic snapshot was created
LATEST_SNAPSHOT=$(restic snapshots --json --latest 1 | jq -r '.[0].id')
if [ -z "$LATEST_SNAPSHOT" ]; then
  echo "ERROR: No snapshot found"
  exit 1
fi

# Check snapshot integrity
restic check --read-data-subset=1%
# Expected: "no errors were found"
# Checks 1% of data blocks; adjust percentage based on time available

# For PostgreSQL backups, verify dump file integrity
DUMP_FILE=$(restic ls "$LATEST_SNAPSHOT" | grep "\.dump$" | head -1)
restic dump "$LATEST_SNAPSHOT" "$DUMP_FILE" | pg_restore --list > /dev/null
echo "Dump verification exit code: $?"
# Expected: 0
```

Scheduled Restore Testing
Perform full restore tests monthly for critical systems and quarterly for standard systems.
```shell
#!/bin/bash
# restore-test.sh - Monthly restore verification script

set -e

TIMESTAMP=$(date +%Y%m%d)
TEST_DB="restore_test_${TIMESTAMP}"

# Create test database
psql -h localhost -U postgres -c "CREATE DATABASE ${TEST_DB};"

# Get most recent backup
SNAPSHOT=$(restic snapshots --json --latest 1 --tag postgresql | jq -r '.[0].id')

# Restore to staging directory
restic restore "${SNAPSHOT}" --target /tmp/restore-test

# Find dump file
DUMP=$(find /tmp/restore-test -name "*.dump" | head -1)

# Restore to test database
pg_restore -h localhost -U postgres -d "${TEST_DB}" "${DUMP}"

# Run validation queries
EXPECTED_COUNT=45678
ACTUAL_COUNT=$(psql -h localhost -U postgres -d "${TEST_DB}" -t -c \
  "SELECT count(*) FROM registrations;")

if [ "${ACTUAL_COUNT}" -lt "${EXPECTED_COUNT}" ]; then
  echo "ERROR: Row count mismatch. Expected ${EXPECTED_COUNT}, got ${ACTUAL_COUNT}"
  exit 1
fi

# Clean up
psql -h localhost -U postgres -c "DROP DATABASE ${TEST_DB};"
rm -rf /tmp/restore-test

echo "Restore test completed successfully"
```

Document restore test results including restoration time, data validation outcome, and any issues encountered.
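One lightweight way to record those results is a CSV log appended at the end of each test run. The path, field layout, and sample values here are assumptions for illustration, not part of the script above:

```shell
# Hypothetical results log; adjust path and fields to local conventions
RESULTS_LOG="${RESULTS_LOG:-restore-tests.csv}"

# Fields: UTC timestamp, system, outcome, restore duration in seconds, notes
printf '%s,%s,%s,%s,%s\n' \
  "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  "programme_db" "pass" "412" "none" >> "$RESULTS_LOG"
```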
Troubleshooting
| Symptom | Cause | Resolution |
|---|---|---|
| `pg_dump` fails with "connection refused" | PostgreSQL not accepting connections on specified host/port | Verify PostgreSQL is running with `systemctl status postgresql`; check `listen_addresses` in postgresql.conf; verify firewall rules allow connection |
| `pg_dump` fails with "permission denied for table" | Backup user lacks SELECT permission on tables | Grant permissions with `GRANT SELECT ON ALL TABLES IN SCHEMA public TO backup_user;`; check for tables in non-public schemas |
| restic reports "repository is already locked" | Previous backup process did not complete cleanly | Remove stale lock with `restic unlock`; investigate why previous process failed before proceeding |
| Backup completes but file size is unexpectedly small | Empty database, connection to wrong database, or compression issues | Verify database name and connection; run `SELECT count(*) FROM pg_tables;` to confirm table count; check for error messages in backup output |
| "unable to open config file" from restic | Repository not initialised or incorrect path | Run `restic init` to create repository; verify `RESTIC_REPOSITORY` environment variable is set correctly |
| Cloud snapshot creation fails with "insufficient capacity" | Snapshot storage quota exceeded | Delete old snapshots; request quota increase; consider enabling snapshot archiving to cheaper tier |
| Restore fails with "relation already exists" | Target database contains existing objects | Drop conflicting objects before restore, or use `--clean` flag with `pg_restore`; for partial restore, use staging schema approach |
| Restored database has missing foreign key references | Backup taken during write operations without transaction consistency | Use `--serializable-deferrable` with `pg_dump` for consistent snapshot; schedule backups during low-activity windows |
| MongoDB backup reports "not primary" | Connected to secondary in replica set when authentication requires primary | Use `--readPreference=primaryPreferred` or connect directly to primary; verify replica set configuration with `rs.status()` |
| Incremental backup takes as long as full backup | Changed block tracking not functioning; all blocks marked changed | For restic, check if cache directory exists and has correct permissions; for native tools, verify CBT configuration |
| Restore succeeds but application reports errors | Schema version mismatch between backup and application expectations | Run application migrations after restore; check for migration files created after backup timestamp |
| "authentication failed" during cloud backup | Expired or rotated credentials | Verify service account credentials; check credential expiration; regenerate access keys if needed |
Recovery From Backup Chain Failures
When an incremental backup chain is broken by a missing or corrupt intermediate backup, recovery requires returning to the most recent valid full backup. This may result in data loss for the period between the full backup and the incident.
To verify backup chain integrity:
```shell
# List all snapshots in a chain
restic snapshots --tag postgresql | sort -k2

# Attempt restore from each snapshot to verify recoverability
for SNAP in $(restic snapshots --json --tag postgresql | jq -r '.[].id'); do
  echo "Testing snapshot: ${SNAP}"
  restic ls "${SNAP}" > /dev/null 2>&1 || echo "FAILED: ${SNAP}"
done
```

If chain verification reveals a gap, document the data loss window and assess recovery options including alternative data sources, application logs, or manual reconstruction.
Automation
Automate backup operations using systemd timers or cron jobs. Systemd timers provide better logging integration and dependency management than cron.
Create the backup service unit:
```ini
[Unit]
Description=PostgreSQL Backup
After=network-online.target postgresql.service
Wants=network-online.target

[Service]
Type=oneshot
User=backup
Group=backup
EnvironmentFile=/etc/restic/environment
ExecStart=/usr/local/bin/backup-postgresql.sh
# ExecStopPost runs even when ExecStart fails and receives $EXIT_STATUS;
# ExecStartPost would be skipped on failure, so the alert would never fire
ExecStopPost=/usr/local/bin/notify-backup-status.sh postgresql

[Install]
WantedBy=multi-user.target
```

Create the timer unit:

```ini
[Unit]
Description=Daily PostgreSQL Backup

[Timer]
OnCalendar=*-*-* 02:00:00
RandomizedDelaySec=300
Persistent=true

[Install]
WantedBy=timers.target
```

Enable and start the timer:

```shell
systemctl daemon-reload
systemctl enable backup-postgresql.timer
systemctl start backup-postgresql.timer

# Verify timer is scheduled
systemctl list-timers | grep backup
```

Create monitoring alerts for backup failures:

```shell
#!/bin/bash
BACKUP_TYPE=$1
EXIT_CODE=${EXIT_STATUS:-0}

if [ "$EXIT_CODE" -ne 0 ]; then
  # Send alert via webhook, email, or monitoring system
  curl -X POST "https://alerts.example.org/webhook" \
    -H "Content-Type: application/json" \
    -d "{\"severity\": \"critical\", \"message\": \"${BACKUP_TYPE} backup failed with exit code ${EXIT_CODE}\"}"
fi
```