Cloud Database Services
Cloud database services are fully managed data storage and retrieval systems operated by cloud providers, where the provider handles provisioning, patching, backup, and infrastructure maintenance while organisations retain control over schema design, data, and access policies. These services eliminate the operational burden of database administration for organisations lacking dedicated database expertise, though they introduce dependencies on provider availability and create data portability considerations that require careful evaluation during platform selection.
Mission-driven organisations benefit from managed databases because the operational model shifts infrastructure maintenance to the provider. A self-managed PostgreSQL deployment requires operating system patching, storage management, backup scripting, replication configuration, and failover testing. The equivalent managed service handles these concerns automatically, allowing IT staff to focus on application requirements rather than infrastructure maintenance. This trade-off favours managed services for organisations where IT capacity is constrained, while organisations with established database administration skills may prefer self-managed deployments for greater control and cost predictability.
- Managed database service: A cloud-hosted database where the provider operates the underlying infrastructure, handles maintenance tasks, and exposes the database through standard connection protocols. The organisation manages data and schema; the provider manages servers, storage, and availability.
- Database engine: The software that processes queries and manages data storage. Examples include PostgreSQL, MySQL, MongoDB, and Redis. Managed services run specific engine versions with provider-managed upgrades.
- Instance class: The compute specification allocated to a database instance, determining CPU cores, memory, and network capacity. Larger instance classes support higher concurrent connections and query throughput.
Database service categories
Cloud providers organise database services into categories based on data models and access patterns. Each category optimises for specific workload characteristics, and selecting the appropriate category determines query capabilities, scaling behaviour, and operational constraints.
Relational databases
Relational databases store data in tables with defined schemas, enforce relationships through foreign keys, and support SQL for complex queries across multiple tables. The relational model suits workloads requiring transactional consistency, ad-hoc reporting, and data integrity constraints. Financial records, donor management, grant tracking, and human resources data fit the relational model because these domains involve structured records with well-defined relationships.
Managed relational services run established database engines. PostgreSQL services provide advanced features including JSON storage, full-text search, and PostGIS spatial extensions. MySQL services offer broad application compatibility and read replica scaling. MariaDB services provide MySQL-compatible functionality with open-source licensing. Microsoft SQL Server services suit organisations with existing SQL Server applications or specific Microsoft ecosystem requirements.
A grants management application illustrates relational database use. The schema includes tables for grants, donors, projects, budget lines, and expenditures. Foreign key relationships ensure expenditures reference valid budget lines, budget lines reference valid grants, and grants reference valid donors. SQL queries join these tables to produce reports: total expenditure by donor, remaining budget by project, or variance analysis across fiscal years. The relational model enforces these relationships at the database level, preventing orphaned records and maintaining referential integrity regardless of application behaviour.
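The referential integrity described above can be sketched with an in-memory SQLite database; the table and column names are illustrative, not taken from a real grants system, and a production deployment would use a managed PostgreSQL or MySQL service instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite requires opting in to FK enforcement
conn.executescript("""
    CREATE TABLE donors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE grants (
        id INTEGER PRIMARY KEY,
        donor_id INTEGER NOT NULL REFERENCES donors(id),
        title TEXT NOT NULL
    );
    CREATE TABLE expenditures (
        id INTEGER PRIMARY KEY,
        grant_id INTEGER NOT NULL REFERENCES grants(id),
        amount REAL NOT NULL
    );
""")
conn.execute("INSERT INTO donors VALUES (1, 'Example Foundation')")
conn.execute("INSERT INTO grants VALUES (1, 1, 'Water Programme')")
conn.execute("INSERT INTO expenditures VALUES (1, 1, 1200.0), (2, 1, 800.0)")

# Referential integrity: an expenditure referencing a non-existent grant is rejected
# by the database itself, regardless of application behaviour.
try:
    conn.execute("INSERT INTO expenditures VALUES (3, 99, 50.0)")
except sqlite3.IntegrityError:
    print("orphaned expenditure rejected")

# A join across the related tables: total expenditure by donor.
row = conn.execute("""
    SELECT d.name, SUM(e.amount)
    FROM donors d
    JOIN grants g ON g.donor_id = d.id
    JOIN expenditures e ON e.grant_id = g.id
    GROUP BY d.name
""").fetchone()
print(row)  # ('Example Foundation', 2000.0)
```

The foreign key constraints do the enforcement work here: the invalid insert fails at the database level rather than producing an orphaned record.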
Relational databases scale vertically by increasing instance class and horizontally through read replicas. Write operations concentrate on a primary instance, while read replicas serve reporting queries and application reads. This architecture supports read-heavy workloads effectively but constrains write throughput to a single primary instance. For most mission-driven organisation workloads, write throughput limits do not constrain operations, but applications with high-volume data collection (millions of records daily) may require alternative architectures.
Document databases
Document databases store data as self-contained documents, where each document contains all related information without requiring joins across tables. The document model suits workloads with variable schemas, nested data structures, and access patterns that retrieve complete entities rather than joining partial records.
MongoDB services dominate the document database category, providing a mature query language, indexing capabilities, and aggregation pipelines for complex analysis. The document model stores related data together: a beneficiary registration document contains demographics, programme enrolments, service history, and assessment results in a single retrievable unit. This structure eliminates joins for common access patterns but requires denormalisation that increases storage consumption and complicates updates affecting multiple documents.
Document databases scale horizontally through sharding, distributing documents across multiple servers based on a shard key. This architecture supports both read and write scaling, though query patterns must align with the sharding strategy to avoid scatter-gather operations that query all shards. Effective sharding requires understanding access patterns before selecting a shard key, and poor shard key selection creates performance problems that require data migration to resolve.
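Hash-based shard key routing can be sketched as follows; the four-shard cluster and key names are assumptions for illustration, and real document databases (for example MongoDB) handle this routing internally once a shard key is declared.

```python
import hashlib

NUM_SHARDS = 4  # assumption: a fixed four-shard cluster

def shard_for(shard_key: str) -> int:
    """Map a shard key to a shard deterministically via a stable hash.
    hashlib is used rather than built-in hash(), which is randomised per process."""
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

# A query that includes the shard key is routed to exactly one shard...
case_shard = shard_for("case-2024-00017")

# ...whereas a query without the shard key must scatter-gather across all shards.
all_shards = list(range(NUM_SHARDS))
print(case_shard, all_shards)
```

The scatter-gather cost is visible in the last two lines: a well-chosen shard key turns most queries into single-shard operations, while queries that cannot supply the key touch every shard.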
Case management systems illustrate document database strengths. Each case document contains the complete case record: intake information, assessments, service plans, case notes, and outcome tracking. Case workers retrieve and update complete cases, and the document model optimises this access pattern. Reporting across cases requires aggregation pipelines rather than SQL joins, which suits predefined reports but complicates ad-hoc analysis.
Key-value stores
Key-value stores provide simple get and put operations using unique keys to retrieve associated values. This minimal interface enables extreme performance and scaling characteristics suitable for caching, session management, and high-throughput data access where query flexibility is unnecessary.
Redis services provide in-memory key-value storage with sub-millisecond latency, supporting data structures beyond simple strings: lists, sets, sorted sets, and hashes. Redis serves as a caching layer in front of slower databases, storing frequently accessed data in memory to reduce database load and improve response times. Session storage represents another common use: web applications store user session data in Redis, enabling session sharing across application servers and fast session retrieval.
A beneficiary lookup service demonstrates key-value store application. The service caches beneficiary records by ID after initial database retrieval. Subsequent lookups retrieve cached records in under 1 millisecond rather than querying the primary database. Cache entries expire after a configured duration (300 seconds for active sessions, 3600 seconds for reference data), balancing memory consumption against database load reduction.
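The cache-aside pattern behind that lookup service can be sketched in pure Python; the `TTLCache` class below is an in-process stand-in for Redis (which provides expiry natively via the `EX` option on `SET`), and the function and key names are illustrative.

```python
import time

class TTLCache:
    """Minimal in-process stand-in for a Redis-style cache with per-entry expiry."""
    def __init__(self):
        self._store = {}  # key -> (value, expires_at)

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, time.monotonic() + ttl_seconds)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # lazy expiry on read
            return None
        return value

def lookup_beneficiary(cache, db_fetch, beneficiary_id):
    """Cache-aside: try the cache first; on a miss, query the database and populate."""
    record = cache.get(beneficiary_id)
    if record is None:
        record = db_fetch(beneficiary_id)
        cache.set(beneficiary_id, record, ttl_seconds=300)  # 300 s, per the text
    return record

cache = TTLCache()
calls = []
fetch = lambda bid: calls.append(bid) or {"id": bid, "name": "example"}
lookup_beneficiary(cache, fetch, "B-001")  # miss: hits the database
lookup_beneficiary(cache, fetch, "B-001")  # hit: served from cache
print(len(calls))  # 1 -- the database was queried once
```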
Key-value stores lack query capabilities beyond key lookup. Searching for records matching specific criteria requires maintaining secondary indexes as additional key-value mappings or using a separate query-capable database. This constraint suits workloads with known access patterns but prevents ad-hoc exploration.
Graph databases
Graph databases store entities as nodes and relationships as edges, optimising for queries that traverse connections between entities. The graph model suits workloads involving network analysis, relationship mapping, and path finding where relational joins would require recursive queries or multiple round trips.
Neo4j services provide the Cypher query language for expressing graph traversals declaratively. Queries find shortest paths, identify clusters, and analyse relationship patterns with syntax that mirrors the graph structure. Graph databases excel at queries relational databases handle poorly: finding all entities within N relationships of a starting point, identifying communities within a network, or recommending connections based on shared relationships.
Referral network analysis illustrates graph database application. Service providers form nodes, referrals form edges, and queries identify referral patterns: which providers receive referrals from which sources, average referral chain length for different service types, or providers with high centrality in the referral network. These queries execute efficiently on graph databases but require complex recursive CTEs or multiple queries on relational databases.
Graph databases store relationship metadata on edges, enabling weighted traversals and filtered paths. A referral edge includes the referral date, outcome, and service type, allowing queries to find successful referral paths for specific services or analyse referral patterns over time periods.
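The filtered traversal described above can be sketched as a breadth-first search over an in-memory adjacency list; in Neo4j the equivalent would be a declarative Cypher query, and the provider names and edge metadata here are illustrative.

```python
from collections import deque

# Adjacency list: provider -> list of (neighbour, edge metadata). Illustrative data.
referrals = {
    "clinic_a": [("shelter_b", {"service": "housing", "outcome": "completed"})],
    "shelter_b": [("clinic_c", {"service": "health", "outcome": "completed"}),
                  ("agency_d", {"service": "legal", "outcome": "dropped"})],
    "clinic_c": [],
    "agency_d": [],
}

def providers_within(start, max_hops, edge_filter=lambda meta: True):
    """Breadth-first traversal: all providers reachable within max_hops,
    following only edges accepted by edge_filter."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue
        for neighbour, meta in referrals.get(node, []):
            if neighbour not in seen and edge_filter(meta):
                seen.add(neighbour)
                frontier.append((neighbour, depth + 1))
    return seen - {start}

# All providers within two referral hops of clinic_a, completed referrals only.
reachable = providers_within("clinic_a", 2, lambda m: m["outcome"] == "completed")
print(sorted(reachable))  # ['clinic_c', 'shelter_b']
```

The edge filter is where stored relationship metadata earns its keep: the dropped referral to agency_d is excluded without touching the node data at all.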
Time-series databases
Time-series databases optimise for append-heavy workloads where data arrives continuously with timestamps and queries aggregate across time ranges. The time-series model suits monitoring data, IoT sensor readings, financial transactions, and any workload involving temporal aggregation.
InfluxDB and TimescaleDB services provide time-series capabilities through different approaches. InfluxDB uses a purpose-built storage engine optimised for time-series access patterns. TimescaleDB extends PostgreSQL with time-series optimisations, retaining SQL compatibility while improving performance for temporal queries. The PostgreSQL foundation suits organisations preferring familiar SQL semantics and existing PostgreSQL expertise.
Device monitoring demonstrates time-series database use. Field devices report connectivity status, battery level, and location coordinates every 5 minutes. Queries aggregate this data: average connectivity over the past week, devices with battery below 20% in the past 24 hours, or connectivity trends by region. Time-series databases handle millions of data points efficiently through time-partitioned storage and downsampling that reduces storage requirements for historical data while retaining recent detail.
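Time-bucketed aggregation of the kind those queries perform can be sketched as follows; TimescaleDB expresses this in SQL with its `time_bucket` function, and the device readings below are illustrative.

```python
from collections import defaultdict
from datetime import datetime

# Five-minute readings: (timestamp, device_id, battery_percent). Illustrative data.
readings = [
    (datetime(2024, 6, 1, 10, 0), "dev-1", 82),
    (datetime(2024, 6, 1, 10, 5), "dev-1", 81),
    (datetime(2024, 6, 1, 10, 55), "dev-1", 78),
    (datetime(2024, 6, 1, 11, 0), "dev-1", 77),
]

def hourly_average(rows):
    """Downsample raw readings to hourly averages per device."""
    buckets = defaultdict(list)
    for ts, device, battery in rows:
        bucket = ts.replace(minute=0, second=0, microsecond=0)  # truncate to the hour
        buckets[(bucket, device)].append(battery)
    return {k: sum(v) / len(v) for k, v in buckets.items()}

averages = hourly_average(readings)
for (bucket, device), avg in sorted(averages.items()):
    print(bucket, device, round(avg, 1))
```

This is also the essence of downsampling for historical retention: once hourly averages are stored, the raw five-minute points for older periods can be discarded.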
Workload-to-database mapping
Selecting database categories requires matching workload characteristics to database strengths. The decision involves access patterns, consistency requirements, query complexity, and scaling expectations.
Figure 1: Database category selection based on access patterns. The decision tree asks for the primary access pattern:
- Retrieve by known key: key-value (Redis, Memcached).
- Query by attributes, needing joins across normalised tables (complex reporting, referential integrity): relational (PostgreSQL, MySQL, MariaDB).
- Query by attributes, accessing complete entities with nested data and flexible schemas: document (MongoDB, DocumentDB).
- Aggregate over time ranges: time-series (InfluxDB, TimescaleDB).
- Relationship traversal queries: graph (Neo4j, Neptune).
The decision tree addresses primary access patterns, but real workloads involve multiple patterns. A programme management system requires relational storage for grants and budgets, document storage for flexible assessment forms, and caching for frequently accessed reference data. Multi-model architectures address this complexity through polyglot persistence (multiple specialised databases) or multi-model databases (single database supporting multiple paradigms). PostgreSQL with its JSON capabilities represents a pragmatic middle ground: relational structure for core data with document flexibility for semi-structured content.
High availability patterns
Managed database services provide high availability through redundant instances that maintain synchronised copies of data and assume primary responsibility when failures occur. The availability architecture determines recovery time, data loss exposure, and geographic resilience.
Single-region high availability
Single-region high availability deploys redundant database instances within one geographic region, protecting against instance and availability zone failures while remaining vulnerable to region-wide outages.
Figure 2: Single-region high availability with synchronous replication across availability zones. The primary instance in availability zone A accepts writes, serves reads, and ships transaction logs to a standby instance in availability zone B, which replicates synchronously and is promoted on primary failure; storage is replicated across both zones.
Synchronous replication ensures the standby instance contains all committed transactions before the primary acknowledges writes. Failover to the standby loses no committed data but introduces write latency (1-3 milliseconds additional) for the cross-zone replication acknowledgement. Asynchronous replication reduces write latency but risks losing transactions committed on the primary but not yet replicated when failure occurs.
Failover time determines how long applications experience unavailability during instance failure. Managed services detect failures through health checks at configurable intervals (5-30 seconds) and initiate failover automatically. The failover process promotes the standby to primary, updates DNS records, and terminates the failed instance. Total failover time ranges from 30 seconds to 3 minutes depending on database size and provider implementation.
Read replica scaling
Read replicas provide horizontal scaling for read-heavy workloads by distributing read queries across multiple instances while concentrating writes on the primary. Each read replica maintains an asynchronous copy of primary data, introducing replication lag between write and read visibility.
Figure 3: Read replica architecture separating write and read traffic. Application servers send writes to the primary instance; reporting services send reads to replicas 1-3, which replicate asynchronously from the primary with typical lag of 10-100 ms.
Replication lag creates consistency considerations for applications. A user updates their profile on the primary, then immediately retrieves their profile from a replica that has not yet received the update. The application displays stale data. Addressing this requires either read-your-writes consistency (routing reads to the primary immediately after writes) or accepting eventual consistency where recent writes may not appear in subsequent reads.
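One common way to implement read-your-writes under asynchronous replication is to route a user's reads to the primary for a short sticky window after their own writes. The sketch below assumes a window comfortably longer than typical replica lag; the class and endpoint names are illustrative.

```python
import time

STICKY_SECONDS = 1.0  # assumption: well above the 10-100 ms replica lag

class ReadRouter:
    """Route a user's reads to the primary shortly after their own writes;
    otherwise spread reads across replicas."""
    def __init__(self):
        self._last_write = {}  # user_id -> monotonic time of last write

    def record_write(self, user_id):
        self._last_write[user_id] = time.monotonic()

    def endpoint_for_read(self, user_id):
        wrote_at = self._last_write.get(user_id)
        if wrote_at is not None and time.monotonic() - wrote_at < STICKY_SECONDS:
            return "primary"   # replica may not have this user's update yet
        return "replica"       # eventual consistency is acceptable

router = ReadRouter()
router.record_write("user-42")
print(router.endpoint_for_read("user-42"))  # primary (just wrote)
print(router.endpoint_for_read("user-7"))   # replica (no recent write)
```

The cost of this design is extra load on the primary proportional to how often users read immediately after writing; the benefit is that stale-profile anomalies disappear without forcing all reads to the primary.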
Read replicas suit reporting and analytics workloads that tolerate eventual consistency and benefit from isolating analytical queries from transactional workloads. Running a complex report against a read replica prevents that query from consuming resources needed for application transactions on the primary.
Multi-region deployment
Multi-region deployment extends high availability across geographic regions, protecting against region-wide outages and enabling data locality for globally distributed applications.
Figure 4: Multi-region database deployment with asynchronous cross-region replication. The primary region (EU-WEST-1) runs the primary instance (all writes, local reads) alongside a synchronous in-region standby; the secondary region (EU-CENTRAL-1) runs a read-only replica, fed by asynchronous replication, that serves local reads and can be promoted to primary on regional failover. Global DNS with latency-based routing directs applications to the appropriate region.
Cross-region replication introduces latency that prevents synchronous replication for most workloads. A write in EU-WEST-1 synchronously replicated to EU-CENTRAL-1 adds 15-30 milliseconds per transaction, unacceptable for interactive workloads. Asynchronous replication accepts data loss exposure: transactions committed in the primary region but not yet replicated are lost if the primary region fails before replication completes.
Recovery Point Objective (RPO) quantifies acceptable data loss. An RPO of 5 minutes means losing up to 5 minutes of transactions during regional failover is acceptable. Asynchronous replication with typical lag under 1 minute supports RPO targets of 5-15 minutes. Lower RPO targets require synchronous replication with associated latency costs or alternative architectures like multi-active writes with conflict resolution.
Recovery Time Objective (RTO) quantifies acceptable downtime. Regional failover requires DNS propagation, application reconnection, and potentially manual promotion of the secondary to primary. Automated regional failover achieves RTO of 5-15 minutes. Manual failover requiring operational decision-making extends RTO to 30-60 minutes but provides human validation that failover is appropriate.
Backup and recovery
Managed database services automate backup operations while providing controls over retention, recovery granularity, and cross-region backup distribution.
Automated backups
Automated backups capture database state on configurable schedules. Full backups capture complete database contents, while incremental backups capture changes since the last backup. The combination enables recovery to any point within the retention window.
Backup windows specify when full backups execute, preferably during low-activity periods to minimise performance impact. A backup window of 02:00-04:00 UTC suits European organisations where activity is minimal overnight. Backup duration depends on database size and changed data volume: a 500 GB database with 10% daily change typically completes incremental backup within 30 minutes.
Retention periods determine how long backups remain available for recovery. A 35-day retention period enables recovery to any point within the past 35 days. Compliance requirements may mandate longer retention: 7 years for financial data under some regulations. Extended retention beyond provider limits requires exporting backups to separate storage.
Point-in-time recovery
Point-in-time recovery (PITR) enables restoration to any second within the retention window by combining full backups with transaction logs. The recovery process restores the nearest full backup, then replays transaction logs up to the specified recovery point.
Consider a scenario where an administrator accidentally deletes beneficiary records at 14:23:15 UTC. Point-in-time recovery restores the database to 14:23:14 UTC, before the deletion occurred. The recovery creates a new database instance; it does not modify the existing instance. Applications then switch to the recovered instance after validation.
PITR recovery time depends on database size and the time span between the base backup and recovery point. Recovering a 200 GB database to a point 4 hours after the base backup requires restoring 200 GB then replaying 4 hours of transaction logs. Estimated recovery time: 45-90 minutes.
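The recovery-time estimate above can be expressed as a rough formula: restore time for the base backup plus replay time for the intervening transaction logs. The throughput figures below are illustrative assumptions, not provider guarantees, and should be replaced with numbers measured from recovery tests.

```python
def estimate_pitr_minutes(db_gb, hours_since_base_backup,
                          restore_gb_per_min=4.0, replay_speedup=8.0):
    """Rough PITR duration: full restore plus transaction-log replay.
    restore_gb_per_min and replay_speedup are assumed rates for illustration;
    replay_speedup is how much faster logs replay than real time."""
    restore_minutes = db_gb / restore_gb_per_min
    replay_minutes = hours_since_base_backup * 60 / replay_speedup
    return restore_minutes + replay_minutes

# The 200 GB / 4-hour scenario from the text, under these assumed rates:
print(round(estimate_pitr_minutes(200, 4)))  # 80 minutes, within the 45-90 range
```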
Cross-region backup replication
Cross-region backup replication copies backups to a secondary region, enabling recovery even if the primary region becomes unavailable. This capability addresses scenarios where a regional outage prevents accessing both the database and its backups.
Backup replication introduces storage costs in the secondary region and potential data residency considerations. Backups replicated to a different jurisdiction may fall under different regulatory requirements. Verify that secondary region backup storage complies with applicable data protection regulations before enabling cross-region replication.
Security configuration
Database security involves encryption for data protection, access control for authorisation, and auditing for accountability. Managed services provide these capabilities through configuration rather than requiring custom implementation.
Encryption
Encryption at rest protects stored data using keys managed by the cloud provider or customer-managed keys in a key management service. Provider-managed keys simplify operations: the provider handles key rotation and storage. Customer-managed keys provide greater control and enable key revocation that renders data unreadable, useful for compliance scenarios requiring cryptographic data destruction.
Encryption in transit protects data moving between applications and databases using TLS. Managed services provide TLS certificates and endpoints; applications must configure connection strings to require TLS and validate certificates. A PostgreSQL connection string with TLS enforcement:
postgresql://user:password@hostname:5432/database?sslmode=verify-full&sslrootcert=/path/to/ca-certificate.crt

The sslmode=verify-full parameter ensures the connection validates the server certificate against the certificate authority, preventing man-in-the-middle attacks. Weaker settings such as sslmode=require encrypt traffic but do not validate server identity.
Access control
Database access control operates at multiple levels. Network access control determines which systems can establish connections. Authentication verifies identity when connections are established. Authorisation determines which operations authenticated users can perform.
Network access control uses security groups or firewall rules to restrict connection sources. A production database permits connections only from application servers in specific network segments. Administrative access routes through a bastion host or privileged access management system rather than direct connection from administrator workstations.
Authentication uses database credentials (username and password) or IAM integration where cloud identity tokens authenticate database connections. IAM authentication eliminates password management for database access, tying database permissions to cloud identity. This approach suits organisations with mature IAM practices but requires application support for token-based database authentication.
Authorisation grants permissions at database, schema, table, and column levels. The principle of least privilege applies: application accounts receive only permissions required for application functionality. A reporting application receives SELECT permissions on specific tables, not full database access. Administrative accounts with elevated permissions require separate authentication and access logging.
Audit logging
Audit logging records database activities for security monitoring and compliance evidence. Log categories include authentication events (successful and failed logins), data definition events (schema changes), and data manipulation events (queries and modifications).
Audit log volume requires attention. Logging all SELECT queries on a high-traffic database produces terabytes of logs daily. Selective logging balances visibility against storage and processing costs: log all authentication events and DDL statements, sample or filter DML statements based on risk.
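A selective logging policy of that shape can be sketched as a filter function; the event categories, table names, and 1% sampling rate below are illustrative assumptions, not a recommendation for any particular compliance regime.

```python
import random

SENSITIVE_TABLES = {"beneficiaries", "payroll"}  # illustrative set

def should_log(event):
    """Selective audit policy sketch: always keep security-relevant events,
    sample routine data manipulation."""
    if event["category"] in ("authentication", "ddl"):
        return True                      # all logins and schema changes
    if event.get("table") in SENSITIVE_TABLES:
        return True                      # full logging for sensitive tables
    return random.random() < 0.01        # 1% sample of routine DML

print(should_log({"category": "authentication", "outcome": "failed"}))  # True
print(should_log({"category": "dml", "table": "beneficiaries"}))        # True
```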
Audit logs integrate with security information and event management (SIEM) systems for alerting and correlation. Failed authentication attempts exceeding a threshold trigger alerts. Queries against sensitive tables outside normal application patterns warrant investigation. Schema modifications require change management correlation.
Performance considerations
Database performance depends on instance sizing, query efficiency, indexing, and connection management. Managed services provide monitoring and recommendations; effective use requires understanding performance fundamentals.
Instance sizing
Instance sizing balances cost against performance headroom. Under-provisioned instances exhibit high CPU utilisation, memory pressure, and increased query latency under load. Over-provisioned instances waste budget on unused capacity.
Sizing begins with workload characterisation. A donor management database supporting 50 concurrent users with mixed transactional and reporting queries requires different resources than a data warehouse running nightly batch loads. Baseline metrics from monitoring inform sizing: average and peak CPU utilisation, memory consumption, storage IOPS, and network throughput.
A worked example: current production shows 45% average CPU utilisation with peaks reaching 85% during month-end reporting. Memory utilisation averages 70% with minimal swap activity. The current instance class provides 8 vCPUs and 32 GB RAM. This profile suggests adequate sizing: peaks remain below 90% with headroom for growth. If peaks regularly exceed 90% or baseline exceeds 70%, scaling up reduces latency and prevents resource exhaustion.
Vertical scaling (larger instance class) addresses CPU and memory constraints. Horizontal scaling (read replicas) addresses read throughput constraints. Storage scaling addresses IOPS and capacity constraints independently of compute scaling on services that separate these resources.
Query optimisation
Query execution plans reveal how the database processes queries and where optimisation opportunities exist. The EXPLAIN command (PostgreSQL, MySQL) or equivalent displays the execution plan including table scans, index usage, and join methods.
A slow query scanning a full table benefits from an index on the filtered columns. Consider a query finding donors who contributed in the past year:
SELECT * FROM donors WHERE last_donation_date > '2024-01-01';

Without an index on last_donation_date, the database scans every row in the donors table. With 500,000 donor records, this scan takes 3-5 seconds. An index on last_donation_date enables the database to locate matching rows directly, reducing execution time to 50-100 milliseconds.
Indexing involves trade-offs. Each index consumes storage and slows write operations (INSERT, UPDATE, DELETE) because the database must maintain index consistency. Index columns involved in WHERE clauses, JOIN conditions, and ORDER BY clauses of frequent queries. Avoid indexing columns with low selectivity (few distinct values) or columns rarely used in query predicates.
Connection management
Database connections consume server resources. Each connection allocates memory for session state, query parsing, and result buffering. Excessive connections exhaust available memory and connection slots, causing connection failures for new requests.
Connection pooling addresses this constraint by maintaining a pool of database connections shared across application threads. Rather than each application thread establishing its own database connection, threads borrow connections from the pool, execute queries, and return connections for reuse.
A connection pool configuration for a web application:
minimum_pool_size: 5
maximum_pool_size: 20
connection_timeout_seconds: 30
idle_timeout_seconds: 300

The pool maintains 5-20 connections based on demand. When all 20 connections are in use, new requests wait up to 30 seconds for an available connection. Idle connections exceeding 300 seconds are closed to release resources.
Pool sizing depends on application concurrency and query duration. An application handling 100 concurrent requests where each request executes queries for 50 milliseconds requires approximately 5 connections for steady-state throughput (100 requests × 0.05 seconds = 5 connection-seconds per second). Peak load handling and query variation suggest sizing pools at 2-4x steady-state requirements.
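The arithmetic above is Little's law (concurrent connections ≈ arrival rate × service time) with a headroom multiplier, and can be captured in a small helper; the function name and default headroom factor are illustrative.

```python
import math

def steady_state_pool_size(requests_per_second, avg_query_seconds, headroom_factor=3.0):
    """Little's law: steady-state connections = arrival rate x service time.
    headroom_factor (an assumption within the 2-4x range from the text)
    covers peak load and query-duration variation."""
    steady_state = requests_per_second * avg_query_seconds
    return math.ceil(steady_state * headroom_factor)

# The worked example from the text: 100 requests/s at 50 ms each gives
# 5 steady-state connections; a 3x headroom factor suggests a pool of 15.
print(steady_state_pool_size(100, 0.05))  # 15
```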
Migration approaches
Migrating from self-managed databases to managed services requires planning for schema compatibility, data transfer, and application cutover with minimal disruption.
Assessment
Assessment identifies compatibility gaps between source and target platforms. Database engines differ in SQL dialect, data types, and feature availability. A PostgreSQL database using specific extensions may require extension availability verification on the target managed service. MySQL databases using storage engines other than InnoDB may require conversion.
Assessment tools provided by cloud platforms analyse source databases and report compatibility issues. Run assessments against non-production database copies to avoid production impact. Assessment reports categorise issues by severity: blockers requiring resolution before migration, warnings requiring attention but not blocking, and informational items for awareness.
Schema compatibility represents the primary assessment focus. Application queries must execute correctly on the target platform. Testing application functionality against a migrated schema copy validates compatibility before production migration.
Data transfer methods
Data transfer methods balance migration speed against downtime requirements.
Offline migration involves stopping the source database, exporting data, transferring the export, and importing to the target. This approach guarantees consistency but requires downtime proportional to database size. A 100 GB database might export in 30-60 minutes, transfer in 15-30 minutes (depending on network bandwidth), and import in 30-60 minutes: total downtime of roughly 1.5 to 2.5 hours once post-import verification is included.
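Because the steps run serially, total downtime is simply their sum, which makes the estimate easy to rework for other database sizes (function name is illustrative):

```python
def offline_migration_downtime(export_min, transfer_min, import_min):
    """Offline migration downtime is the sum of its serial steps, in minutes."""
    return export_min + transfer_min + import_min

# Figures from the 100 GB example above (minutes):
best = offline_migration_downtime(30, 15, 30)    # 75 minutes
worst = offline_migration_downtime(60, 30, 60)   # 150 minutes
print(best, worst)
```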
Online migration uses replication to synchronise source and target databases while the source remains operational. Initial bulk data transfer occurs during normal operations. Ongoing changes replicate continuously, maintaining target synchronisation. Cutover involves brief downtime (minutes rather than hours) to stop the source, verify replication completion, and redirect applications to the target.
Database migration services from cloud providers orchestrate online migrations with monitoring, validation, and rollback capabilities. These services handle replication configuration, ongoing synchronisation, and cutover coordination for supported database engines.
Cutover planning
Cutover planning addresses application connection changes, rollback procedures, and validation criteria.
Connection string updates must propagate to all application instances. Configuration management systems, environment variables, or secrets managers centralise connection strings for coordinated updates. Hardcoded connection strings in application code require deployment coordination.
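A sketch of centralising the connection string in an environment variable, so cutover becomes a configuration change rather than a code deployment. The variable name `DATABASE_URL` and the hostnames are illustrative conventions, not requirements:

```python
import os

def database_url(default="postgresql://localhost:5432/appdb"):
    """Read the connection string from the environment; the default
    value and variable name here are illustrative only."""
    return os.environ.get("DATABASE_URL", default)

# At cutover, operations update the variable (or the secrets manager
# backing it) and restart application instances:
os.environ["DATABASE_URL"] = "postgresql://new-primary.example.org:5432/appdb"
print(database_url())
```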
Rollback procedures enable reverting to the source database if issues emerge post-cutover. Maintaining the source database in read-only mode during initial post-cutover operation enables rollback without data loss. Rollback windows depend on acceptable data divergence: once the target database receives writes not present on the source, rollback requires data reconciliation.
Validation confirms successful migration through functional testing, performance comparison, and data integrity verification. Functional tests exercise critical application workflows. Performance comparison ensures query latency and throughput meet expectations. Data integrity verification confirms record counts and checksums match between source and target.
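Data integrity verification can be sketched as comparing a row count plus an order-independent checksum per table. Here two in-memory SQLite databases stand in for source and target; the helper name and table are illustrative:

```python
import hashlib
import sqlite3

def table_fingerprint(conn, table):
    """Row count plus an order-independent checksum of all rows.
    Matching fingerprints on source and target indicate identical data."""
    rows = conn.execute(f"SELECT * FROM {table}").fetchall()
    digest = hashlib.sha256()
    for row in sorted(repr(r) for r in rows):   # sort for order independence
        digest.update(row.encode())
    return len(rows), digest.hexdigest()

source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
for db in (source, target):
    db.execute("CREATE TABLE beneficiaries (id INTEGER, name TEXT)")
    db.executemany("INSERT INTO beneficiaries VALUES (?, ?)",
                   [(1, "Amina"), (2, "Joseph")])

# After migration, fingerprints taken on both sides should match:
print(table_fingerprint(source, "beneficiaries") ==
      table_fingerprint(target, "beneficiaries"))
```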
Global distribution
Global distribution extends database presence across geographic regions for latency reduction, data locality compliance, and resilience against regional outages.
Active-passive global distribution
Active-passive distribution concentrates writes in a primary region while read replicas in secondary regions serve local read traffic. Applications in secondary regions read from local replicas with low latency while writes route to the primary region with cross-region latency.
This architecture suits workloads with read-heavy access patterns where write latency tolerance exists. A beneficiary lookup service in East Africa reads from a local replica with 5-10 millisecond latency rather than crossing to a European primary with 150-200 millisecond latency. Beneficiary updates route to the European primary, with replication updating the African replica within seconds.
Active-passive distribution does not provide write availability during primary region outages. Failover to a secondary region requires promoting a read replica to primary, updating application configuration, and accepting any data loss from unreplicated transactions.
Active-active global distribution
Active-active distribution accepts writes in multiple regions simultaneously, reducing write latency for globally distributed users while introducing conflict resolution complexity.
+-------------------------------+       +-------------------------------+
|           REGION A            |       |           REGION B            |
|           (Europe)            |       |           (Africa)            |
+-------------------------------+       +-------------------------------+
|                               |       |                               |
|  +-------------------------+  |       |  +-------------------------+  |
|  |    PRIMARY (LOCAL)      |  |       |  |    PRIMARY (LOCAL)      |  |
|  |                         |  |       |  |                         |  |
|  | - Accepts local writes +--+ <---> +--+ - Accepts local writes  |  |
|  | - Serves local reads    |  |       |  | - Serves local reads    |  |
|  | - Conflict resolution   |  |       |  | - Conflict resolution   |  |
|  |                         |  |       |  |                         |  |
|  +-------------------------+  |       |  +-------------------------+  |
|                               |       |                               |
|  Applications connect to      |       |  Applications connect to      |
|  local primary for reads      |       |  local primary for reads      |
|  and writes                   |       |  and writes                   |
|                               |       |                               |
+-------------------------------+       +-------------------------------+
                |                                       |
                +-------------------+-------------------+
                                    |
           Bidirectional replication with conflict resolution

Figure 5: Active-active global distribution with bidirectional replication
Conflict resolution addresses concurrent writes to the same record in different regions. Resolution strategies include last-write-wins (most recent timestamp prevails), application-defined merge logic, or conflict flagging for manual resolution. Last-write-wins risks data loss when conflicting writes both contain valid information. Application-defined merging requires conflict-aware application design. Manual resolution introduces operational overhead.
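The two automatic strategies can be sketched as follows. The record shape, field names, and helper functions are illustrative, not a specific database's conflict API:

```python
def last_write_wins(a, b):
    """Keep whichever version carries the later timestamp; the loser's
    edits are discarded entirely, which is where data loss can occur."""
    return a if a["updated_at"] >= b["updated_at"] else b

def field_level_merge(base, a, b):
    """Merge two versions field by field against a common ancestor.
    Succeeds when each side changed different fields; flags a conflict
    for manual review otherwise."""
    merged = dict(base)
    for key in base:
        if key == "updated_at":              # metadata, not a data field
            merged[key] = max(a[key], b[key])
            continue
        changed_a = a[key] != base[key]
        changed_b = b[key] != base[key]
        if changed_a and changed_b and a[key] != b[key]:
            raise ValueError(f"conflict on field {key!r}: manual review needed")
        merged[key] = a[key] if changed_a else b[key]
    return merged

base     = {"name": "A. Diallo",      "phone": "100", "updated_at": 1}
region_a = {"name": "Aminata Diallo", "phone": "100", "updated_at": 2}
region_b = {"name": "A. Diallo",      "phone": "200", "updated_at": 3}

winner = last_write_wins(region_a, region_b)   # region_b wins; name edit lost
merged = field_level_merge(base, region_a, region_b)  # keeps both edits
print(merged)
```

The example shows the trade-off directly: last-write-wins silently drops Region A's name correction, while the field-level merge preserves both edits because they touched different fields.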
Active-active suits workloads where write locality provides significant benefit and conflict frequency is low. Organisations with regional offices operating independently on locally-relevant data experience few conflicts. Shared datasets modified simultaneously across regions require careful schema design to minimise conflict potential.
Offline and sync patterns
Field operations with intermittent connectivity require databases that function offline and synchronise when connectivity returns. This capability enables data collection and service delivery regardless of network availability.
Offline-capable databases
Offline-capable databases store data locally on devices and synchronise with central databases when connected. Mobile applications using these databases function without network connectivity, queueing changes for later synchronisation.
PouchDB provides offline-capable document storage that synchronises with CouchDB or compatible servers. The application interacts with the local PouchDB database regardless of connectivity state. When connectivity exists, PouchDB replicates changes bidirectionally with the server. Applications receive change notifications to update user interfaces as remote data arrives.
SQLite provides offline relational storage on mobile and desktop devices. Synchronisation requires application logic or third-party synchronisation frameworks because SQLite lacks built-in replication. The application explicitly manages change tracking, conflict detection, and server communication.
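A sketch of the kind of application-managed change tracking the text describes, using SQLite triggers to populate a change log that sync logic later reads. Table, trigger, and column names are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE beneficiaries (id INTEGER PRIMARY KEY, name TEXT);
    -- Change log the sync logic reads to find rows needing upload
    CREATE TABLE change_log (
        table_name TEXT, row_id INTEGER, op TEXT, changed_at REAL
    );
    CREATE TRIGGER log_insert AFTER INSERT ON beneficiaries BEGIN
        INSERT INTO change_log VALUES ('beneficiaries', NEW.id, 'insert',
                                       strftime('%s','now'));
    END;
    CREATE TRIGGER log_update AFTER UPDATE ON beneficiaries BEGIN
        INSERT INTO change_log VALUES ('beneficiaries', NEW.id, 'update',
                                       strftime('%s','now'));
    END;
""")

conn.execute("INSERT INTO beneficiaries (id, name) VALUES (1, 'Amina')")
conn.execute("UPDATE beneficiaries SET name = 'Aminata' WHERE id = 1")

def pending_changes(conn, since=0):
    """Changes recorded since the last successful sync (a watermark)."""
    return conn.execute(
        "SELECT table_name, row_id, op FROM change_log "
        "WHERE changed_at >= ? ORDER BY rowid", (since,)).fetchall()

print(pending_changes(conn))
```

After a successful upload, the application advances the watermark (or clears the log) so the same changes are not re-sent.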
Synchronisation architecture
Synchronisation architecture addresses change propagation, conflict handling, and bandwidth efficiency.
+-------------------------------------------------------------------+
|                         CENTRAL DATABASE                          |
|                     (Cloud-hosted PostgreSQL)                     |
+-------------------------------------------------------------------+
|                                                                   |
|  +-------------------+  +-------------------+  +----------------+ |
|  |   Beneficiaries   |  |    Assessments    |  | Reference Data | |
|  |  (master records) |  |  (master records) |  |   (read-only)  | |
|  +-------------------+  +-------------------+  +----------------+ |
|                                                                   |
+----------------------------------+--------------------------------+
                                   |
                    +--------------+--------------+
                    |                             |
                    v                             v
          +---------+----------+       +----------+---------+
          |    SYNC SERVICE    |       |    SYNC SERVICE    |
          |   (Regional hub)   |       |   (Regional hub)   |
          +--------------------+       +--------------------+
                    |                             |
            +-------+-------+             +-------+-------+
            |               |             |               |
            v               v             v               v
      +-----+-----+   +-----+-----+ +-----+-----+   +-----+-----+
      |   FIELD   |   |   FIELD   | |   FIELD   |   |   FIELD   |
      |   DEVICE  |   |   DEVICE  | |   DEVICE  |   |   DEVICE  |
      |           |   |           | |           |   |           |
      |  PouchDB/ |   |  PouchDB/ | |  PouchDB/ |   |  PouchDB/ |
      |  SQLite   |   |  SQLite   | |  SQLite   |   |  SQLite   |
      +-----------+   +-----------+ +-----------+   +-----------+

Sync frequency:       On connectivity (opportunistic)
                      + Daily minimum (mandatory)

Conflict resolution:  Last-write-wins for simple fields
                      Manual review for complex records

Figure 6: Offline synchronisation architecture with regional hubs and field devices
Regional sync hubs reduce bandwidth requirements for field devices by providing geographically closer synchronisation endpoints. Field devices synchronise with regional hubs rather than directly with central databases, reducing latency and improving synchronisation reliability on constrained connections.
Conflict resolution in synchronisation mirrors active-active database challenges. Two field workers updating the same beneficiary record while offline create conflicting versions. Resolution options include last-write-wins (automatic, potential data loss), field-level merging (automatic, requires non-overlapping field edits), or conflict flagging (manual, ensures data review).
Bandwidth efficiency matters for metered connections. Delta synchronisation transfers only changed records rather than complete datasets. Compression reduces transfer size. Prioritisation ensures critical data synchronises before optional data when connectivity windows are brief.
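Delta selection, priority ordering, and compression can be combined in one payload-building step. The record shape and the `priority` field are assumptions for illustration:

```python
import json
import zlib

def delta_payload(records, last_sync):
    """Build a sync payload containing only records changed since the
    last sync, ordered so critical data transfers first, compressed
    for metered connections."""
    changed = [r for r in records if r["updated_at"] > last_sync]
    changed.sort(key=lambda r: r["priority"])   # critical (0) before optional
    return zlib.compress(json.dumps(changed).encode())

records = [
    {"id": 1, "priority": 1, "updated_at": 100, "note": "optional"},
    {"id": 2, "priority": 0, "updated_at": 120, "note": "critical"},
    {"id": 3, "priority": 0, "updated_at": 50,  "note": "unchanged"},
]
payload = delta_payload(records, last_sync=90)
restored = json.loads(zlib.decompress(payload))
print([r["id"] for r in restored])   # [2, 1]: only changed rows, critical first
```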
Implementation considerations
For organisations with limited IT capacity
Managed database services eliminate operational burden that self-managed databases impose. Select services with minimal configuration requirements and comprehensive automation.
Start with relational databases for general-purpose workloads. PostgreSQL services support structured data, JSON documents, full-text search, and spatial queries within a single database engine, reducing the number of services to manage. Add specialised databases only when PostgreSQL limitations demonstrably constrain functionality.
Accept provider defaults for backup schedules, retention periods, and maintenance windows initially. These defaults suit most workloads. Customise when specific requirements emerge rather than optimising prematurely.
Monitor service dashboards provided by cloud platforms rather than deploying separate monitoring infrastructure. Platform dashboards display key metrics, provide alerting capabilities, and require no additional infrastructure or expertise.
Budget for managed service costs rather than self-managing to save licensing fees. The operational cost of self-management (time spent on patching, backup management, troubleshooting, and capacity planning) exceeds managed service premiums for organisations with constrained IT capacity.
For organisations with established IT functions
Established IT functions can optimise managed service configurations and consider hybrid approaches combining managed and self-managed databases.
Implement infrastructure-as-code for database provisioning. Terraform, Pulumi, or provider-specific tools (CloudFormation, ARM templates) codify database configurations, enabling consistent deployments across environments and change tracking through version control.
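As a hedged illustration only, a Terraform fragment for one provider's managed PostgreSQL offering (AWS RDS) might look like the following. The identifier, sizes, and values are placeholders, not a recommended configuration:

```hcl
# Illustrative managed PostgreSQL instance; values are placeholders
resource "aws_db_instance" "app" {
  identifier              = "app-db"
  engine                  = "postgres"
  engine_version          = "15"             # provider-managed minor upgrades
  instance_class          = "db.t4g.medium"  # instance class: CPU and memory
  allocated_storage       = 50               # GB
  backup_retention_period = 7                # days of automated backups
  skip_final_snapshot     = false            # snapshot before deletion
}
```

Codified this way, the same definition deploys identically to staging and production, and every configuration change passes through version control review.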
Integrate database monitoring with existing observability platforms. Export metrics to Prometheus, logs to centralised log management, and traces to distributed tracing systems. Unified observability surfaces database issues alongside application and infrastructure problems.
Evaluate self-managed databases for workloads where managed services impose unacceptable constraints: specific version requirements, custom extensions, extreme performance tuning, or cost optimisation at scale. Self-management requires database administration skills and operational procedures for patching, backup, and failover.
Implement database change management processes. Schema migrations through version-controlled scripts, deployment pipelines, and rollback procedures ensure database changes follow the same rigour as application deployments.
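A minimal migration runner along these lines, sketched against SQLite with inline scripts standing in for version-controlled migration files:

```python
import sqlite3

# Version-controlled migration scripts (inline here; normally files in a repo)
MIGRATIONS = [
    (1, "CREATE TABLE beneficiaries (id INTEGER PRIMARY KEY, name TEXT)"),
    (2, "ALTER TABLE beneficiaries ADD COLUMN region TEXT"),
]

def migrate(conn):
    """Apply pending migrations in order, recording each version in
    schema_migrations so reruns are idempotent."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations (version INTEGER PRIMARY KEY)")
    applied = {v for (v,) in conn.execute("SELECT version FROM schema_migrations")}
    for version, sql in MIGRATIONS:
        if version not in applied:
            conn.execute(sql)
            conn.execute("INSERT INTO schema_migrations VALUES (?)", (version,))

conn = sqlite3.connect(":memory:")
migrate(conn)
migrate(conn)   # second run applies nothing: already recorded
cols = [row[1] for row in conn.execute("PRAGMA table_info(beneficiaries)")]
print(cols)     # ['id', 'name', 'region']
```

The `schema_migrations` table gives every environment a recorded, comparable schema version, which is the property that makes database changes auditable in the same way as application deployments.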
For field operations
Field operations require offline capability and bandwidth-efficient synchronisation.
Select databases supporting offline operation for field-facing applications. PouchDB with CouchDB backend provides mature offline synchronisation for document workloads. SQLite with custom synchronisation logic suits relational requirements but demands development investment.
Design synchronisation for constrained connectivity. Prioritise essential data for synchronisation during brief connectivity windows. Implement background synchronisation that operates during any connectivity rather than requiring user-initiated sync. Provide visibility into synchronisation status so users understand data freshness.
Plan for conflict resolution before deployment. Define which record version prevails when conflicts occur, or implement conflict queues for manual resolution. Test conflict scenarios during development because field conditions inevitably produce conflicts.
Consider regional database deployments to reduce latency for field users. A database replica in-region provides lower latency than cross-continental connections, improving both interactive use and synchronisation performance.