Data Backup and Recovery

Data backup creates copies of information at specific points in time, enabling restoration when primary data becomes unavailable due to hardware failure, corruption, accidental deletion, or security incidents. Recovery transforms those backup copies into usable data within production systems. The procedures on this page cover operational execution of backup and recovery tasks across database systems, file storage, and cloud platforms.

Prerequisites

Before configuring or executing backup operations, verify the following requirements are satisfied.

Retention policy
A documented retention schedule specifying how long backups must be kept for each data classification. Without this, you cannot configure retention parameters correctly. See Backup and Recovery Standard for policy requirements.
Recovery objectives
Defined Recovery Point Objective (RPO) and Recovery Time Objective (RTO) for each system. RPO determines backup frequency; RTO determines recovery procedure design.
Storage allocation
Sufficient backup storage provisioned and accessible. Calculate required capacity as: (full backup size) + (daily change rate × days between full backups × number of retained cycles).
Access credentials
Service account credentials with appropriate permissions for backup operations. Database backups require backup-specific roles; file system backups require read access to source and write access to destination.
Network connectivity
Verified network path between source systems and backup storage. For cloud backup, confirm egress bandwidth and any data transfer cost implications.
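As a worked example of the capacity formula above, the following sketch uses illustrative figures (a 500 GB full backup, 15 GB of daily change, weekly fulls, four retained cycles); substitute your own measurements:

```shell
# Worked example of the capacity formula above.
# All figures are illustrative assumptions, not measurements.
FULL_SIZE_GB=500        # size of one full backup
DAILY_CHANGE_GB=15      # average daily change rate
DAYS_BETWEEN_FULLS=7    # weekly full backups
RETAINED_CYCLES=4       # number of retained cycles

REQUIRED_GB=$(( FULL_SIZE_GB + DAILY_CHANGE_GB * DAYS_BETWEEN_FULLS * RETAINED_CYCLES ))
echo "Provision at least ${REQUIRED_GB} GB of backup storage"
# -> Provision at least 920 GB of backup storage
```

Provisioning below this estimate forces early deletion of retained cycles when the change rate spikes.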

Verify backup tool installation and version compatibility:

Terminal window
# Check restic version (minimum 0.14.0 for compression support)
restic version
# Expected: restic 0.14.0 or higher
# Check PostgreSQL client tools
pg_dump --version
# Expected: pg_dump (PostgreSQL) 15.x or compatible with target server
# Verify cloud CLI authentication
aws sts get-caller-identity
# Expected: Account and ARN details for backup service account
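These version checks can be scripted so a job fails fast when a tool is older than the stated minimum. A small helper (a sketch, relying on GNU `sort -V` for version ordering; the hard-coded version strings are examples, in practice parse the tool output):

```shell
# Succeed when version $1 is greater than or equal to minimum $2.
# Uses GNU sort -V (version sort); a sketch, not a hardened check.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Example with hard-coded version strings; in practice extract the
# installed version from `restic version` before comparing.
if version_ge "0.16.0" "0.14.0"; then
  echo "restic version OK"
else
  echo "restic too old" >&2
  exit 1
fi
```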

Confirm storage repository is initialised and accessible:

Terminal window
# For restic with S3-compatible storage
export RESTIC_REPOSITORY="s3:s3.eu-west-1.amazonaws.com/orgname-backups"
export RESTIC_PASSWORD_FILE="/etc/restic/password"
restic snapshots
# Expected: List of existing snapshots or empty list for new repository
# Error "unable to open config file" indicates uninitialised repository

Backup Types

Three backup types serve different purposes in a backup strategy. Understanding their mechanics determines appropriate scheduling and storage allocation.

A full backup copies all data in the backup scope regardless of previous backups. Full backups are self-contained and require no other backups for recovery. The time to complete a full backup scales linearly with data volume. A 500 GB database requires 2 to 4 hours for full backup over gigabit network connections, depending on compression efficiency and storage write speed.

An incremental backup copies only data that changed since the most recent backup of any type. Incremental backups complete faster and consume less storage than full backups, but recovery requires the most recent full backup plus all subsequent incrementals in sequence. Loss of any incremental in the chain prevents recovery of later data.

A differential backup copies all data that changed since the most recent full backup. Differential backups grow larger as time passes since the last full but require only two backup sets for recovery: the full plus the single differential. This simplifies recovery at the cost of larger backup sizes than pure incremental strategies.

+------------------------------------------------------------------+
|                      BACKUP TYPE COMPARISON                      |
+------------------------------------------------------------------+
|                                                                  |
|  FULL BACKUP                                                     |
|  +----------------------------------------------------------+    |
|  | Day 1: 500 GB (complete dataset)                         |    |
|  | Day 8: 500 GB (complete dataset)                         |    |
|  | Recovery: Single backup set required                     |    |
|  +----------------------------------------------------------+    |
|                                                                  |
|  INCREMENTAL BACKUP                                              |
|  +----------------------------------------------------------+    |
|  | Day 1: 500 GB (full)                                     |    |
|  | Day 2: 15 GB (changes since Day 1)                       |    |
|  | Day 3: 12 GB (changes since Day 2)                       |    |
|  | Day 4: 18 GB (changes since Day 3)                       |    |
|  | Recovery: Full + all incrementals in sequence            |    |
|  +----------------------------------------------------------+    |
|                                                                  |
|  DIFFERENTIAL BACKUP                                             |
|  +----------------------------------------------------------+    |
|  | Day 1: 500 GB (full)                                     |    |
|  | Day 2: 15 GB (changes since Day 1)                       |    |
|  | Day 3: 27 GB (changes since Day 1)                       |    |
|  | Day 4: 45 GB (changes since Day 1)                       |    |
|  | Recovery: Full + single differential                     |    |
|  +----------------------------------------------------------+    |
|                                                                  |
+------------------------------------------------------------------+

Figure 1: Backup type comparison showing storage consumption and recovery dependencies

For organisations with limited backup windows, schedule weekly full backups with daily incrementals. This balances backup speed against recovery complexity. Where recovery speed is critical and storage costs are secondary, daily differentials simplify restoration procedures.
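To make the trade-off concrete, this sketch totals one week of backup storage under each strategy, assuming one full on day 1, the 500 GB full from Figure 1, and a flat 15 GB daily change rate (illustrative assumptions; real change rates vary day to day):

```shell
# Compare a week of storage for incremental vs differential strategies.
# Assumes a full backup on day 1 and a flat 15 GB/day change rate.
FULL=500
CHANGE=15

INCREMENTAL=$FULL
DIFFERENTIAL=$FULL
for DAY in 2 3 4 5 6 7; do
  INCREMENTAL=$(( INCREMENTAL + CHANGE ))                # changes since yesterday
  DIFFERENTIAL=$(( DIFFERENTIAL + CHANGE * (DAY - 1) ))  # changes since the full
done

echo "Incremental week total:  ${INCREMENTAL} GB"   # 500 + 6*15  = 590
echo "Differential week total: ${DIFFERENTIAL} GB"  # 500 + 15*21 = 815
```

The differential strategy pays roughly 40% more storage in this scenario in exchange for a two-set restore.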

Procedure

Configuring PostgreSQL Database Backups

PostgreSQL backups use pg_dump for logical backups or pg_basebackup for physical backups. Logical backups create SQL statements that reconstruct the database; physical backups copy the actual data files. Logical backups offer flexibility for partial restores and version migrations. Physical backups enable point-in-time recovery when combined with write-ahead log (WAL) archiving.

  1. Create a dedicated backup user with minimal required permissions:
-- Connect as superuser
CREATE ROLE backup_user WITH LOGIN PASSWORD 'secure-password-here';
GRANT CONNECT ON DATABASE programme_db TO backup_user;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO backup_user;
ALTER DEFAULT PRIVILEGES IN SCHEMA public
GRANT SELECT ON TABLES TO backup_user;

Store credentials in a .pgpass file with restricted permissions:

Terminal window
echo "db-server:5432:programme_db:backup_user:secure-password-here" >> ~/.pgpass
chmod 600 ~/.pgpass
  2. Execute a full logical backup with compression:
Terminal window
pg_dump \
--host=db-server \
--port=5432 \
--username=backup_user \
--format=custom \
--compress=9 \
--file=/backup/staging/programme_db_$(date +%Y%m%d_%H%M%S).dump \
programme_db

Expected output: No output on success. Exit code 0 indicates completion.

Verify backup file creation:

Terminal window
ls -lh /backup/staging/programme_db_*.dump
# Expected: File with reasonable size (compression typically achieves 70-80% reduction)
  3. Validate backup integrity before transfer to permanent storage:
Terminal window
pg_restore \
--list \
/backup/staging/programme_db_20240115_020000.dump > /dev/null
echo "Exit code: $?"
# Expected: Exit code: 0
# Non-zero exit code indicates corrupt backup
  4. Transfer validated backup to remote storage:
Terminal window
# Using restic for encrypted, deduplicated backup
restic backup \
--tag postgresql \
--tag programme_db \
/backup/staging/programme_db_20240115_020000.dump
# Expected output includes:
# snapshot abc12345 saved
  5. Apply retention policy to remove expired backups:
Terminal window
restic forget \
--keep-daily 7 \
--keep-weekly 4 \
--keep-monthly 12 \
--prune \
--tag postgresql
# Expected: List of removed snapshots and freed space
  6. Remove local staging copy after successful remote storage:
Terminal window
rm /backup/staging/programme_db_20240115_020000.dump

Configuring File System Backups

File system backups protect documents, configuration files, and unstructured data. The procedures differ from database backups because file systems lack transactional consistency guarantees. Files may be modified during backup, creating inconsistent copies. For critical data, consider application-level quiescing or snapshot-based approaches.

  1. Define backup scope by creating an inclusion list:
Terminal window
cat > /etc/restic/backup-paths.txt << 'EOF'
/home
/var/www
/etc
/opt/applications
EOF

Create an exclusion list for temporary and cache files:

Terminal window
cat > /etc/restic/excludes.txt << 'EOF'
*.tmp
*.cache
node_modules
.git
vendor
/home/*/.cache
/var/www/*/storage/logs
EOF
# Note: in restic exclude patterns, a leading / anchors the pattern at
# the filesystem root; unanchored patterns match at any depth
  2. Initialise the backup repository if not already configured:
Terminal window
export RESTIC_REPOSITORY="s3:s3.eu-west-1.amazonaws.com/orgname-backups"
export RESTIC_PASSWORD_FILE="/etc/restic/password"
restic init
# Expected: created restic repository at s3:...
# Skip if repository already exists
  3. Execute backup with exclusions and tagging:
Terminal window
restic backup \
--files-from /etc/restic/backup-paths.txt \
--exclude-file /etc/restic/excludes.txt \
--tag fileserver \
--tag $(hostname) \
--verbose
# Expected output:
# Files: 1523 new, 47 changed, 8934 unmodified
# Dirs: 245 new, 12 changed, 1456 unmodified
# Added to the repo: 234.5 MiB
# snapshot def67890 saved
  4. Verify snapshot content:
Terminal window
restic ls latest --long | head -20
# Expected: File listing with permissions, sizes, and paths
  5. Check repository integrity:
Terminal window
restic check
# Expected: "no errors were found"
# Errors indicate repository corruption requiring investigation

Configuring Cloud Database Backups

Cloud-managed databases provide automated backup capabilities that differ from self-managed approaches. These procedures configure and verify cloud-native backup features rather than replacing them with external tools.

  1. Configure automated backups for AWS RDS PostgreSQL:
Terminal window
aws rds modify-db-instance \
--db-instance-identifier programme-db-prod \
--backup-retention-period 14 \
--preferred-backup-window "02:00-03:00" \
--apply-immediately
# Expected: DBInstance modification pending

Verify configuration:

Terminal window
aws rds describe-db-instances \
--db-instance-identifier programme-db-prod \
--query 'DBInstances[0].{Retention:BackupRetentionPeriod,Window:PreferredBackupWindow}'
# Expected:
# {
# "Retention": 14,
# "Window": "02:00-03:00"
# }
  2. Create a manual snapshot before major changes:
Terminal window
aws rds create-db-snapshot \
--db-instance-identifier programme-db-prod \
--db-snapshot-identifier programme-db-pre-migration-20240115
# Expected: DBSnapshot creation initiated

Monitor snapshot progress:

Terminal window
aws rds describe-db-snapshots \
--db-snapshot-identifier programme-db-pre-migration-20240115 \
--query 'DBSnapshots[0].Status'
# Expected: "available" when complete (typically 10-30 minutes)
  3. Configure cross-region snapshot replication for disaster recovery:
Terminal window
aws rds copy-db-snapshot \
--source-db-snapshot-identifier arn:aws:rds:eu-west-1:123456789:snapshot:programme-db-pre-migration-20240115 \
--target-db-snapshot-identifier programme-db-pre-migration-20240115-eu-west-2 \
--source-region eu-west-1 \
--region eu-west-2 \
--kms-key-id alias/rds-backup-key
# Expected: Snapshot copy initiated to secondary region
  4. Enable point-in-time recovery verification:
Terminal window
aws rds describe-db-instances \
--db-instance-identifier programme-db-prod \
--query 'DBInstances[0].LatestRestorableTime'
# Expected: ISO timestamp within last 5 minutes
# Older timestamps indicate replication lag or configuration issues

Cloud backup costs

Cloud database snapshots incur storage costs. A 500 GB database with 14-day retention and 3% daily change rate requires approximately 710 GB of snapshot storage. Cross-region replication doubles this. Calculate expected costs before enabling extended retention.
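The 710 GB figure above is the base size plus the retained daily deltas; a sketch of the arithmetic:

```shell
# Reproduce the snapshot storage estimate from the note above.
DB_SIZE_GB=500
RETENTION_DAYS=14
DAILY_CHANGE_PCT=3

DELTA_GB=$(( DB_SIZE_GB * DAILY_CHANGE_PCT / 100 ))        # 15 GB/day
SNAPSHOT_GB=$(( DB_SIZE_GB + DELTA_GB * RETENTION_DAYS ))  # 500 + 15*14
echo "Estimated snapshot storage: ${SNAPSHOT_GB} GB"
echo "With cross-region copy:     $(( SNAPSHOT_GB * 2 )) GB"
# -> Estimated snapshot storage: 710 GB
# -> With cross-region copy:     1420 GB
```

Multiply the result by your provider's per-GB-month snapshot rate to budget the retention setting.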

Configuring MongoDB Backups

MongoDB backup procedures depend on the deployment topology. Standalone instances use mongodump; replica sets require coordination to ensure consistent backups. Sharded clusters add complexity through distributed data and configuration servers.

  1. Create backup user with appropriate roles:
// Connect to admin database as administrator
use admin
db.createUser({
user: "backup_user",
pwd: "secure-password-here",
roles: [
{ role: "backup", db: "admin" },
{ role: "readAnyDatabase", db: "admin" }
]
})
  2. Execute backup with compression and authentication:
Terminal window
mongodump \
--uri="mongodb://backup_user:secure-password-here@mongo-primary:27017" \
--authenticationDatabase=admin \
--gzip \
--archive=/backup/staging/beneficiary_db_$(date +%Y%m%d_%H%M%S).archive
# Expected output:
# done dumping beneficiary_db.registrations (45678 documents)
# done dumping beneficiary_db.distributions (12345 documents)
  3. For replica sets, back up from a secondary to reduce primary load:
Terminal window
mongodump \
--uri="mongodb://backup_user:secure-password-here@mongo-secondary:27017" \
--authenticationDatabase=admin \
--readPreference=secondary \
--gzip \
--oplog \
--archive=/backup/staging/beneficiary_db_$(date +%Y%m%d_%H%M%S).archive
# The --oplog flag captures operations during backup for point-in-time consistency
  4. Validate backup by listing contents:
Terminal window
mongorestore \
--gzip \
--archive=/backup/staging/beneficiary_db_20240115_020000.archive \
--dryRun
# Expected: List of collections with document counts, no actual restoration

Recovery Procedures

Recovery operations restore data from backups to production systems. The specific procedure depends on recovery scope (full database, specific tables, individual files) and the backup format. All recovery operations should execute first in a non-production environment to validate data integrity before production restoration.

+------------------------------------------------------------------+
|                      RECOVERY DECISION FLOW                      |
+------------------------------------------------------------------+
|                                                                  |
|                      +-------------------+                       |
|                      |  Recovery needed  |                       |
|                      +---------+---------+                       |
|                                |                                 |
|              +-----------------+-----------------+               |
|              |                                   |               |
|              v                                   v               |
|    +---------+---------+             +---------+---------+       |
|    |    Full system    |             |   Partial data    |       |
|    |     recovery      |             |     recovery      |       |
|    +---------+---------+             +---------+---------+       |
|              |                                   |               |
|              v                                   v               |
|    +---------+---------+             +---------+---------+       |
|    |  Use most recent  |             | Identify affected |       |
|    |    full backup    |             |  objects/tables   |       |
|    +---------+---------+             +---------+---------+       |
|              |                                   |               |
|              v                                   v               |
|    +---------+---------+             +---------+---------+       |
|    | Apply incrementals|             |   Extract from    |       |
|    |    in sequence    |             | backup to staging |       |
|    +---------+---------+             +---------+---------+       |
|              |                                   |               |
|              v                                   v               |
|    +-------------------+             +-------------------+       |
|    |   Validate and    |             |   Validate and    |       |
|    |      promote      |             |       merge       |       |
|    +-------------------+             +-------------------+       |
|                                                                  |
+------------------------------------------------------------------+

Figure 2: Recovery procedure decision flow

Full Database Recovery

  1. Identify the appropriate backup snapshot for recovery:
Terminal window
restic snapshots --tag postgresql --tag programme_db
# Output shows available snapshots with timestamps:
# ID Time Host Tags
# abc12345 2024-01-15 02:15:00 db-server postgresql,programme_db
# def67890 2024-01-14 02:15:00 db-server postgresql,programme_db

Select the most recent snapshot before the incident. For data corruption discovered on 15 January at 14:00, use the 02:15:00 snapshot from the same day.
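Picking "the most recent snapshot before the incident" can be automated with jq over the JSON output of restic snapshots. A sketch against inline sample data (the IDs and times mirror the listing above; in practice pipe `restic snapshots --json --tag postgresql` in instead of the here-document; the field name `short_id` is an assumption about the JSON shape):

```shell
# Pick the newest snapshot strictly before the incident time.
# ISO 8601 timestamps in the same zone compare correctly as strings.
INCIDENT="2024-01-15T14:00:00Z"
SNAPSHOT_ID=$(jq -r --arg cutoff "$INCIDENT" \
  '[.[] | select(.time < $cutoff)] | sort_by(.time) | last | .short_id' << 'EOF'
[
  {"short_id": "def67890", "time": "2024-01-14T02:15:00Z"},
  {"short_id": "abc12345", "time": "2024-01-15T02:15:00Z"}
]
EOF
)
echo "Restore from snapshot: ${SNAPSHOT_ID}"
# -> Restore from snapshot: abc12345
```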

  2. Restore backup file to local staging:
Terminal window
restic restore abc12345 \
--target /backup/restore \
--include "programme_db_*.dump"
# Expected: restoring to /backup/restore
  3. Terminate existing connections to the target database:
-- Execute as superuser
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE datname = 'programme_db'
AND pid <> pg_backend_pid();
  4. Drop and recreate the database:
DROP DATABASE programme_db;
CREATE DATABASE programme_db OWNER programme_app;
  5. Restore from backup:
Terminal window
pg_restore \
--host=db-server \
--port=5432 \
--username=postgres \
--dbname=programme_db \
--verbose \
/backup/restore/programme_db_20240115_021500.dump
# Expected: Restoration messages for each object
# Errors about existing objects indicate incomplete drop
  6. Verify restoration by checking row counts and sample data:
-- Compare against known pre-incident values if available
SELECT schemaname, relname, n_live_tup
FROM pg_stat_user_tables
ORDER BY n_live_tup DESC
LIMIT 10;
-- Spot-check recent records
SELECT * FROM registrations
ORDER BY created_at DESC
LIMIT 5;

Partial Table Recovery

When only specific tables require recovery, extract them from backup without affecting other data.

  1. List contents of the backup to identify available objects:
Terminal window
pg_restore --list /backup/restore/programme_db_20240115_021500.dump | grep -i distributions
# Output shows table and related objects:
# 1234; 1259 16385 TABLE public distributions programme_app
# 1235; 0 16385 TABLE DATA public distributions programme_app
  2. Create a table list file for selective restoration:
Terminal window
pg_restore --list /backup/restore/programme_db_20240115_021500.dump > /tmp/full-toc.txt
# Edit to keep only desired objects
grep -E "distributions|SEQUENCE.*distributions" /tmp/full-toc.txt > /tmp/restore-toc.txt
  3. Restore the selected objects into a scratch database, then copy the table into a staging schema. pg_restore cannot remap objects into a different schema, so restoring the selection directly into the live database would overwrite current data:
Terminal window
createdb --host=db-server --username=postgres restore_scratch
pg_restore \
--host=db-server \
--username=postgres \
--dbname=restore_scratch \
--use-list=/tmp/restore-toc.txt \
/backup/restore/programme_db_20240115_021500.dump
# Copy the recovered table into a staging schema in the live database
psql -h db-server -U postgres -d programme_db \
-c "CREATE SCHEMA restore_staging;" \
-c "CREATE TABLE restore_staging.distributions (LIKE public.distributions INCLUDING ALL);"
psql -h db-server -U postgres -d restore_scratch \
-c "\copy public.distributions TO '/tmp/distributions.csv' CSV"
psql -h db-server -U postgres -d programme_db \
-c "\copy restore_staging.distributions FROM '/tmp/distributions.csv' CSV"
# Drop the restore_scratch database once the merge completes
  4. Compare restored data against production and merge as needed:
-- Count comparison
SELECT 'production' as source, count(*) FROM public.distributions
UNION ALL
SELECT 'backup' as source, count(*) FROM restore_staging.distributions;
-- Identify records in backup but not production (deleted/corrupted)
SELECT b.*
FROM restore_staging.distributions b
LEFT JOIN public.distributions p ON b.id = p.id
WHERE p.id IS NULL;
  5. Merge recovered data into production:
-- Insert missing records
INSERT INTO public.distributions
SELECT * FROM restore_staging.distributions b
WHERE NOT EXISTS (
SELECT 1 FROM public.distributions p WHERE p.id = b.id
);
-- Clean up
DROP SCHEMA restore_staging CASCADE;

File Recovery

  1. List available snapshots and locate the required files:
Terminal window
restic snapshots --tag fileserver
restic ls latest --long | grep "important-document"
# Expected: Full path to file with timestamp
  2. Restore specific files or directories:
Terminal window
# Single file
restic restore latest \
--target /tmp/restore \
--include "/var/www/uploads/important-document.pdf"
# Directory tree
restic restore latest \
--target /tmp/restore \
--include "/home/shared/project-files/"
# Expected: Restores to /tmp/restore maintaining original path structure
  3. Verify restored files:
Terminal window
ls -la /tmp/restore/var/www/uploads/important-document.pdf
file /tmp/restore/var/www/uploads/important-document.pdf
# Expected: PDF document with expected size and type
  4. Move restored files to production location:
Terminal window
# Backup current version if it exists
mv /var/www/uploads/important-document.pdf \
/var/www/uploads/important-document.pdf.corrupted
# Restore from backup
cp /tmp/restore/var/www/uploads/important-document.pdf \
/var/www/uploads/important-document.pdf
# Restore permissions
chown www-data:www-data /var/www/uploads/important-document.pdf
chmod 644 /var/www/uploads/important-document.pdf

Cloud Database Point-in-Time Recovery

  1. Identify the target recovery time:
Terminal window
# Check latest restorable time
aws rds describe-db-instances \
--db-instance-identifier programme-db-prod \
--query 'DBInstances[0].LatestRestorableTime'
# Output: "2024-01-15T14:30:00+00:00"
# Recovery target must be between backup retention start and this time
  2. Initiate point-in-time recovery to a new instance:
Terminal window
aws rds restore-db-instance-to-point-in-time \
--source-db-instance-identifier programme-db-prod \
--target-db-instance-identifier programme-db-recovery \
--restore-time "2024-01-15T10:00:00+00:00" \
--db-instance-class db.t3.medium \
--vpc-security-group-ids sg-12345678 \
--db-subnet-group-name private-subnets
# Expected: DB instance creation initiated
  3. Monitor restoration progress:
Terminal window
watch -n 30 'aws rds describe-db-instances \
--db-instance-identifier programme-db-recovery \
--query "DBInstances[0].DBInstanceStatus"'
# Status progression: creating -> backing-up -> available
# Duration: 15-60 minutes depending on database size
  4. Validate recovered data before production cutover:
Terminal window
# Connect to recovery instance
psql -h programme-db-recovery.abc123.eu-west-1.rds.amazonaws.com \
-U admin_user \
-d programme_db
# Run validation queries
SELECT count(*) FROM registrations WHERE created_at < '2024-01-15 10:00:00';
  5. Perform production cutover by renaming instances:
Terminal window
# Rename current production (requires brief outage)
aws rds modify-db-instance \
--db-instance-identifier programme-db-prod \
--new-db-instance-identifier programme-db-old \
--apply-immediately
# Wait for rename to complete
sleep 300
# Rename recovery to production
aws rds modify-db-instance \
--db-instance-identifier programme-db-recovery \
--new-db-instance-identifier programme-db-prod \
--apply-immediately

Verification

Backup verification confirms that backup data can be successfully restored. Verification should occur after every backup job and through periodic full restore tests.

Post-Backup Verification

Execute these checks after each backup completes:

Terminal window
# Verify the most recent snapshot for this job was created
LATEST_SNAPSHOT=$(restic snapshots --json --latest 1 --tag postgresql | jq -r '.[0].id')
if [ -z "$LATEST_SNAPSHOT" ]; then
echo "ERROR: No snapshot found"
exit 1
fi
# Check snapshot integrity
restic check --read-data-subset=1%
# Expected: "no errors were found"
# Checks 1% of data blocks; adjust percentage based on time available
# For PostgreSQL backups, verify dump file integrity
DUMP_FILE=$(restic ls "$LATEST_SNAPSHOT" | grep "\.dump$" | head -1)
restic dump "$LATEST_SNAPSHOT" "$DUMP_FILE" | pg_restore --list > /dev/null
echo "Dump verification exit code: $?"
# Expected: 0

Scheduled Restore Testing

Perform full restore tests monthly for critical systems and quarterly for standard systems.

#!/bin/bash
# restore-test.sh - Monthly restore verification script
set -e
TIMESTAMP=$(date +%Y%m%d)
TEST_DB="restore_test_${TIMESTAMP}"
# Create test database
psql -h localhost -U postgres -c "CREATE DATABASE ${TEST_DB};"
# Get most recent backup
SNAPSHOT=$(restic snapshots --json --latest 1 --tag postgresql | jq -r '.[0].id')
# Restore to staging directory
restic restore "${SNAPSHOT}" --target /tmp/restore-test
# Find dump file
DUMP=$(find /tmp/restore-test -name "*.dump" | head -1)
# Restore to test database
pg_restore -h localhost -U postgres -d "${TEST_DB}" "${DUMP}"
# Run validation queries
EXPECTED_COUNT=45678
ACTUAL_COUNT=$(psql -h localhost -U postgres -d "${TEST_DB}" -t -A -c \
"SELECT count(*) FROM registrations;")
if [ "${ACTUAL_COUNT}" -lt "${EXPECTED_COUNT}" ]; then
echo "ERROR: Row count mismatch. Expected ${EXPECTED_COUNT}, got ${ACTUAL_COUNT}"
exit 1
fi
# Clean up
psql -h localhost -U postgres -c "DROP DATABASE ${TEST_DB};"
rm -rf /tmp/restore-test
echo "Restore test completed successfully"

Document restore test results including restoration time, data validation outcome, and any issues encountered.
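One lightweight way to document those results is appending a record per test run; a sketch, where the log path, CSV layout, and the example values are all assumptions:

```shell
# Append one CSV record per restore test.
# Fields: date, system, restore duration in seconds, outcome.
# The default path is for illustration; set RESTORE_TEST_LOG to a
# permanent location such as /var/log/backup/restore-tests.csv.
LOG="${RESTORE_TEST_LOG:-/tmp/restore-tests.csv}"

record_result() {
  [ -f "$LOG" ] || echo "date,system,duration_s,outcome" > "$LOG"
  echo "$(date +%F),$1,$2,$3" >> "$LOG"
}

# Example entry: programme_db restored in 1840 s, validation passed
record_result programme_db 1840 pass
cat "$LOG"
```

A flat file like this gives auditors the test history and makes RTO drift visible over time.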

Troubleshooting

Symptom: pg_dump fails with "connection refused"
Cause: PostgreSQL is not accepting connections on the specified host/port.
Resolution: Verify PostgreSQL is running with systemctl status postgresql; check listen_addresses in postgresql.conf; verify firewall rules allow the connection.

Symptom: pg_dump fails with "permission denied for table"
Cause: The backup user lacks SELECT permission on one or more tables.
Resolution: Grant permissions with GRANT SELECT ON ALL TABLES IN SCHEMA public TO backup_user; check for tables in non-public schemas.

Symptom: restic reports "repository is already locked"
Cause: A previous backup process did not complete cleanly.
Resolution: Remove the stale lock with restic unlock; investigate why the previous process failed before proceeding.

Symptom: Backup completes but the file size is unexpectedly small
Cause: Empty database, connection to the wrong database, or compression issues.
Resolution: Verify the database name and connection; run SELECT count(*) FROM pg_tables; to confirm the table count; check for error messages in the backup output.

Symptom: "unable to open config file" from restic
Cause: Repository not initialised, or an incorrect path.
Resolution: Run restic init to create the repository; verify the RESTIC_REPOSITORY environment variable is set correctly.

Symptom: Cloud snapshot creation fails with "insufficient capacity"
Cause: Snapshot storage quota exceeded.
Resolution: Delete old snapshots; request a quota increase; consider enabling snapshot archiving to a cheaper tier.

Symptom: Restore fails with "relation already exists"
Cause: The target database contains existing objects.
Resolution: Drop conflicting objects before the restore, or use the --clean flag with pg_restore; for a partial restore, use the staging schema approach.

Symptom: Restored database has missing foreign key references
Cause: Backup taken during write operations without transaction consistency.
Resolution: Use --serializable-deferrable with pg_dump for a consistent snapshot; schedule backups during low-activity windows.

Symptom: MongoDB backup reports "not primary"
Cause: Connected to a secondary in a replica set when authentication requires the primary.
Resolution: Use --readPreference=primaryPreferred or connect directly to the primary; verify the replica set configuration with rs.status().

Symptom: Incremental backup takes as long as a full backup
Cause: Changed block tracking is not functioning; all blocks are marked changed.
Resolution: For restic, check that the cache directory exists and has correct permissions; for native tools, verify the CBT configuration.

Symptom: Restore succeeds but the application reports errors
Cause: Schema version mismatch between the backup and application expectations.
Resolution: Run application migrations after the restore; check for migration files created after the backup timestamp.

Symptom: "authentication failed" during cloud backup
Cause: Expired or rotated credentials.
Resolution: Verify the service account credentials; check credential expiration; regenerate access keys if needed.

Recovery From Backup Chain Failures

When an incremental backup chain becomes broken due to a missing or corrupt intermediate backup, recovery requires returning to the most recent valid full backup. This may result in data loss for the period between the full backup and the incident.

To verify backup chain integrity:

Terminal window
# List all snapshots in a chain
restic snapshots --tag postgresql | sort -k2
# Attempt restore from each snapshot to verify recoverability
for SNAP in $(restic snapshots --json --tag postgresql | jq -r '.[].id'); do
echo "Testing snapshot: ${SNAP}"
restic ls "${SNAP}" > /dev/null 2>&1 || echo "FAILED: ${SNAP}"
done

If chain verification reveals a gap, document the data loss window and assess recovery options including alternative data sources, application logs, or manual reconstruction.
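The data loss window is simply the gap between the last verified-good snapshot and the incident. A sketch using GNU date (the timestamps are examples; substitute the values from your chain verification):

```shell
# Compute the data loss window between the last good backup and the
# incident. Requires GNU date (-d). Timestamps are example values.
LAST_GOOD="2024-01-12T02:15:00Z"
INCIDENT="2024-01-15T14:00:00Z"

GAP_S=$(( $(date -d "$INCIDENT" +%s) - $(date -d "$LAST_GOOD" +%s) ))
printf 'Data loss window: %d hours %d minutes\n' \
  $(( GAP_S / 3600 )) $(( GAP_S % 3600 / 60 ))
# -> Data loss window: 83 hours 45 minutes
```

Record this window in the incident report; it frames which alternative sources (application logs, exports) must cover the gap.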

Automation

Automate backup operations using systemd timers or cron jobs. Systemd timers provide better logging integration and dependency management than cron.

Create the backup service unit:

/etc/systemd/system/backup-postgresql.service
[Unit]
Description=PostgreSQL Backup
After=network-online.target postgresql.service
Wants=network-online.target
[Service]
Type=oneshot
User=backup
Group=backup
EnvironmentFile=/etc/restic/environment
ExecStart=/usr/local/bin/backup-postgresql.sh
ExecStartPost=/usr/local/bin/notify-backup-status.sh postgresql
[Install]
WantedBy=multi-user.target

Create the timer unit:

/etc/systemd/system/backup-postgresql.timer
[Unit]
Description=Daily PostgreSQL Backup
[Timer]
OnCalendar=*-*-* 02:00:00
RandomizedDelaySec=300
Persistent=true
[Install]
WantedBy=timers.target

Enable and start the timer:

Terminal window
systemctl daemon-reload
systemctl enable backup-postgresql.timer
systemctl start backup-postgresql.timer
# Verify timer is scheduled
systemctl list-timers | grep backup
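The /usr/local/bin/backup-postgresql.sh referenced by the service unit is not shown above. A minimal sketch that chains the steps from the PostgreSQL procedure follows; the hosts and paths match the earlier examples, and the DRY_RUN convenience flag is an assumption added here so the sequence can be reviewed without touching the database:

```shell
#!/bin/bash
# Sketch of /usr/local/bin/backup-postgresql.sh. DRY_RUN=1 (the
# default in this sketch) prints each command instead of executing
# it; set DRY_RUN=0 in production.
set -euo pipefail
: "${DRY_RUN:=1}"

STAMP=$(date +%Y%m%d_%H%M%S)
DUMP="/backup/staging/programme_db_${STAMP}.dump"

run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

run pg_dump --host=db-server --username=backup_user \
  --format=custom --compress=9 --file="$DUMP" programme_db
run pg_restore --list "$DUMP"   # integrity check before upload
run restic backup --tag postgresql --tag programme_db "$DUMP"
run restic forget --keep-daily 7 --keep-weekly 4 \
  --keep-monthly 12 --prune --tag postgresql
run rm "$DUMP"
```

With set -e, any failing step aborts the run and the service reports a non-zero exit status to systemd.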

Create monitoring alerts for backup failures:

/usr/local/bin/notify-backup-status.sh
#!/bin/bash
BACKUP_TYPE=$1
EXIT_CODE=${EXIT_STATUS:-0}
if [ "$EXIT_CODE" -ne 0 ]; then
# Send alert via webhook, email, or monitoring system
curl -X POST "https://alerts.example.org/webhook" \
-H "Content-Type: application/json" \
-d "{\"severity\": \"critical\", \"message\": \"${BACKUP_TYPE} backup failed with exit code ${EXIT_CODE}\"}"
fi
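Where systemd is unavailable, a cron entry gives an equivalent (if less observable) schedule; the paths are the same assumptions as above:

```shell
# /etc/cron.d/backup-postgresql -- cron equivalent of the timer above.
# Runs daily at 02:00 as the backup user. Cron has no Persistent=
# equivalent, so runs missed while the host is down are skipped.
0 2 * * * backup /usr/local/bin/backup-postgresql.sh >> /var/log/backup/postgresql.log 2>&1
```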

See also