Cloud Compute and Storage
Cloud compute and storage services provide on-demand processing capacity and data persistence without requiring organisations to procure, deploy, or maintain physical hardware. These services span a spectrum from virtual machines that replicate traditional server environments to serverless functions that execute code in response to events without any infrastructure management. Storage services similarly range from block devices attached to virtual machines through to object storage systems designed for web-scale data retention. Understanding the mechanisms underlying each service category enables informed selection based on workload characteristics, operational capacity, and cost constraints.
- Virtual Machine (VM)
- An emulated computer system providing CPU, memory, storage, and networking as a discrete unit. Cloud VMs run on shared physical infrastructure with resource isolation enforced by hypervisors.
- Instance Type
- A predefined combination of vCPU count, memory, storage, and networking capacity. Providers publish instance type catalogues with specific resource allocations and hourly prices.
- Container
- A lightweight execution environment sharing the host operating system kernel while isolating application processes, libraries, and configuration.
- Serverless Function
- Code executed by the cloud provider in response to triggers, with no persistent infrastructure. Billing occurs per invocation and execution duration.
- Object Storage
- A storage system addressing data as discrete objects with unique identifiers and metadata, accessed via HTTP APIs rather than filesystem protocols.
- Block Storage
- Storage presented as raw volumes that operating systems format with filesystems. Block storage attaches to compute instances as virtual disks.
- IOPS
- Input/Output Operations Per Second. A measure of storage performance indicating how many read or write operations complete each second.
- Throughput
- The volume of data transferred per unit time, measured in megabytes per second (MB/s) or gigabytes per second (GB/s).
Compute service landscape
Cloud compute services organise into three categories distinguished by the level of infrastructure abstraction and the operational responsibility retained by the organisation. Virtual machines provide the lowest abstraction, exposing complete operating systems that staff configure and maintain. Container services raise abstraction by managing the container runtime and orchestration layer while organisations supply container images. Serverless functions abstract infrastructure entirely, accepting code that the provider executes on demand.
+------------------------------------------------------------------+
|                     COMPUTE SERVICE SPECTRUM                      |
+------------------------------------------------------------------+
|                                                                  |
|  CONTROL                                                MANAGED  |
|  <------------------------------------------------------------>  |
|                                                                  |
|  +----------------+    +----------------+    +----------------+  |
|  |                |    |                |    |                |  |
|  | VIRTUAL        |    | CONTAINER      |    | SERVERLESS     |  |
|  | MACHINES       |    | SERVICES       |    | FUNCTIONS      |  |
|  |                |    |                |    |                |  |
|  | - Full OS      |    | - Container    |    | - Code only    |  |
|  | - Any stack    |    |   runtime      |    | - Event        |  |
|  | - You patch    |    | - You build    |    |   triggered    |  |
|  | - You scale    |    |   images       |    | - Auto-scale   |  |
|  |                |    | - Managed      |    | - Per-invoke   |  |
|  |                |    |   orchestrate  |    |   billing      |  |
|  +----------------+    +----------------+    +----------------+  |
|                                                                  |
|                          RESPONSIBILITY                          |
|  +----------------+    +----------------+    +----------------+  |
|  | OS patching    |    | Image updates  |    | Code updates   |  |
|  | Runtime setup  |    | Scaling rules  |    | Dependencies   |  |
|  | Scaling        |    | Health checks  |    |                |  |
|  | Networking     |    |                |    |                |  |
|  | Load balancing |    |                |    |                |  |
|  +----------------+    +----------------+    +----------------+  |
|                                                                  |
+------------------------------------------------------------------+
Figure 1: Compute service categories arranged by abstraction level and operational responsibility
The selection between categories depends on workload requirements and organisational capacity. Workloads requiring specific operating system configurations, kernel modules, or hardware access necessitate virtual machines. Applications packaged as containers benefit from container services that handle orchestration complexity. Event-driven workloads with intermittent execution patterns suit serverless functions. Organisations with limited IT capacity gain operational efficiency from higher-abstraction services at the cost of reduced control over execution environments.
Virtual machine patterns
Virtual machines in cloud environments function identically to physical servers from the operating system perspective. The hypervisor presents virtualised CPU cores, memory allocations, and storage devices that the guest operating system manages without awareness of the underlying shared infrastructure. This abstraction enables organisations to migrate existing workloads without modification while gaining cloud benefits of rapid provisioning and elastic scaling.
Instance types determine the resource allocation available to each virtual machine. Providers structure their catalogues around workload profiles: general-purpose instances balance CPU and memory for typical applications; compute-optimised instances provide high vCPU-to-memory ratios for processing-intensive workloads; memory-optimised instances reverse this ratio for in-memory databases and caching; storage-optimised instances include high-performance local storage for data-intensive operations.
A practical selection process begins with understanding the workload’s resource consumption. A web application server handling 500 concurrent users with an average response time of 200 milliseconds requires approximately 4 vCPUs and 8 GB memory based on measured utilisation. Selecting a general-purpose instance with these specifications provides a starting point. Performance monitoring during operation reveals whether the instance is under-provisioned (sustained CPU above 80%, memory pressure events) or over-provisioned (CPU consistently below 30%, large memory buffers unused), enabling adjustment through instance type changes without workload modification.
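The utilisation bands above can be captured in a small check. This is an illustrative sketch, not a provider API: the function name is hypothetical and the thresholds are the 80%/30% figures from the text; real monitoring would evaluate percentiles over a rolling window rather than a single average.

```python
def sizing_verdict(avg_cpu_pct: float, memory_pressure: bool) -> str:
    """Classify an instance against the utilisation bands described above.

    Sustained CPU above 80% or memory pressure events suggest
    under-provisioning; CPU consistently below 30% suggests
    over-provisioning. Anything between counts as right-sized.
    """
    if avg_cpu_pct > 80 or memory_pressure:
        return "under-provisioned: scale up or out"
    if avg_cpu_pct < 30:
        return "over-provisioned: consider a smaller instance"
    return "right-sized"

print(sizing_verdict(85, False))  # under-provisioned: scale up or out
print(sizing_verdict(22, False))  # over-provisioned: consider a smaller instance
print(sizing_verdict(55, False))  # right-sized
```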
Right-sizing requires ongoing attention because workload characteristics change over time. An initial deployment supporting 200 staff members grows to support 500 after programme expansion. The original 2 vCPU instance that performed adequately becomes a bottleneck. Cloud platforms provide utilisation metrics that expose these patterns: CPU utilisation averaging 85% over 14 days indicates the need for vertical scaling to a larger instance or horizontal scaling to multiple instances behind a load balancer.
Horizontal scaling distributes load across multiple instances, improving both capacity and resilience. A load balancer directs incoming requests to healthy instances, automatically removing failed instances from rotation. This pattern requires applications to manage state externally (in databases or caches) rather than in local memory, as any instance might handle any request. Stateless application design enables horizontal scaling; stateful applications constrain scaling to vertical approaches.
+------------------------------------------------------------------+
|                    HORIZONTAL SCALING PATTERN                    |
+------------------------------------------------------------------+
|                                                                  |
|                          +----------------+                      |
|                          |      LOAD      |                      |
|                          |    BALANCER    |                      |
|                          +-------+--------+                      |
|                                  |                               |
|             +--------------------+--------------------+          |
|             |                    |                    |          |
|             v                    v                    v          |
|      +------+------+      +------+------+      +------+------+   |
|      |             |      |             |      |             |   |
|      |  Instance   |      |  Instance   |      |  Instance   |   |
|      |  (web app)  |      |  (web app)  |      |  (web app)  |   |
|      |             |      |             |      |             |   |
|      +------+------+      +------+------+      +------+------+   |
|             |                    |                    |          |
|             +--------------------+--------------------+          |
|                                  |                               |
|                          +-------v--------+                      |
|                          |  SHARED STATE  |                      |
|                          |  (database,    |                      |
|                          |   cache)       |                      |
|                          +----------------+                      |
|                                                                  |
+------------------------------------------------------------------+
Figure 2: Horizontal scaling with load balancer and externalised state
Auto-scaling automates horizontal scaling by adding or removing instances based on metrics. A scaling policy specifies the trigger metric (CPU utilisation, request count, queue depth), threshold values, and scaling actions. An example policy adds one instance when average CPU exceeds 70% for 5 minutes and removes one instance when average CPU falls below 30% for 15 minutes. The asymmetric timing prevents thrashing: quick scale-out responds to load increases while slow scale-in avoids premature capacity reduction during temporary lulls.
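The example policy can be sketched as a decision function over per-minute CPU samples. This is a simplified illustration of the asymmetric-timing idea, not any provider's scaling API; the class name and field names are assumptions.

```python
from dataclasses import dataclass

@dataclass
class ScalingPolicy:
    # Thresholds and windows from the example policy in the text.
    scale_out_cpu: float = 70.0   # % average CPU that triggers scale-out
    scale_out_minutes: int = 5    # short window: respond quickly to load
    scale_in_cpu: float = 30.0    # % average CPU that triggers scale-in
    scale_in_minutes: int = 15    # long window: avoid thrashing on lulls

    def decide(self, cpu_samples: list) -> int:
        """Return +1 (add instance), -1 (remove), or 0 for per-minute samples."""
        if len(cpu_samples) >= self.scale_out_minutes and all(
            s > self.scale_out_cpu
            for s in cpu_samples[-self.scale_out_minutes:]
        ):
            return +1
        if len(cpu_samples) >= self.scale_in_minutes and all(
            s < self.scale_in_cpu
            for s in cpu_samples[-self.scale_in_minutes:]
        ):
            return -1
        return 0

policy = ScalingPolicy()
print(policy.decide([75, 80, 78, 82, 77]))  # 1: five minutes above 70%
print(policy.decide([25] * 15))             # -1: fifteen minutes below 30%
print(policy.decide([25] * 10))             # 0: lull too short to scale in
```

The asymmetry lives entirely in the two window lengths: a ten-minute quiet period that would justify removal under a symmetric policy returns 0 here.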
Container services
Container services manage the execution of containerised workloads without requiring organisations to operate the underlying container orchestration infrastructure. The provider runs the control plane components that schedule containers, manage networking, and handle service discovery. Organisations interact through APIs to deploy container images, define scaling rules, and configure networking policies.
Managed Kubernetes services dominate this category. Kubernetes has established itself as the standard container orchestration platform, and all major cloud providers offer managed variants. The managed service handles control plane availability, Kubernetes version upgrades, and integration with cloud-native networking and storage. Organisations retain responsibility for worker node configuration, container image creation, and application-level concerns.
+-------------------------------------------------------------------+
|              MANAGED KUBERNETES RESPONSIBILITY SPLIT              |
+-------------------------------------------------------------------+
|                                                                   |
|   PROVIDER MANAGES               ORGANISATION MANAGES             |
|   +-------------------------+    +-------------------------+      |
|   |                         |    |                         |      |
|   | - API server HA         |    | - Worker node sizing    |      |
|   | - etcd cluster          |    | - Container images      |      |
|   | - Controller manager    |    | - Deployment configs    |      |
|   | - Scheduler             |    | - Scaling policies      |      |
|   | - Version upgrades      |    | - Network policies      |      |
|   | - Security patches      |    | - Storage claims        |      |
|   | - Control plane SLA     |    | - Application code      |      |
|   |                         |    | - Secrets management    |      |
|   +-------------------------+    +-------------------------+      |
|                                                                   |
+-------------------------------------------------------------------+
Figure 3: Responsibility division in managed Kubernetes services
Simpler container services exist for organisations without Kubernetes expertise or workloads that do not require its orchestration capabilities. These services accept container images and run them with basic scaling and networking, abstracting the orchestration layer entirely. A single container running a web application scales horizontally based on HTTP request count without any Kubernetes concepts appearing in the configuration. The trade-off is reduced control over scheduling, networking, and resource management compared to full Kubernetes access.
Container services charge based on the compute resources allocated to running containers. A container requesting 1 vCPU and 2 GB memory running for 720 hours (one month) incurs charges equivalent to a similarly-sized virtual machine. The operational efficiency gain comes from density: multiple containers share worker nodes, and the orchestration system bin-packs them efficiently, achieving higher utilisation than dedicating a virtual machine to each workload.
Organisations already operating containerised workloads on-premises face a migration decision. Lift-and-shift to managed Kubernetes minimises application changes but requires adapting Kubernetes configurations to provider-specific storage classes, ingress controllers, and networking models. This migration path suits organisations with Kubernetes expertise seeking to eliminate control plane operational burden. Organisations without container experience should evaluate whether container adoption serves their needs before adding container orchestration complexity.
Serverless computing
Serverless functions execute code in response to events without any persistent infrastructure. The provider receives code, allocates execution environments on demand, runs the code when triggers fire, and terminates environments after execution completes. Organisations pay per invocation (typically $0.20 per million invocations) plus execution duration (approximately $0.00001667 per GB-second), with no charge during idle periods.
The cold start problem affects serverless function performance. When no execution environment exists for a function, the provider must initialise one before executing the code. This initialisation adds latency ranging from 100 milliseconds for lightweight runtimes to several seconds for functions with large dependency packages. Subsequent invocations within a provider-determined window reuse the warm environment, executing with minimal latency. Applications requiring consistent low latency must account for cold starts through provisioned concurrency (maintaining warm environments at additional cost) or architectural patterns that tolerate occasional latency spikes.
Serverless functions suit event-driven workloads with variable demand. Processing uploaded files, responding to webhook notifications, handling scheduled tasks, and transforming data streams all exhibit intermittent execution patterns where traditional compute resources would sit idle between events. A function processing donation receipts fires when receipts arrive, processes each receipt in under one second, then incurs no cost until the next receipt. The equivalent always-on virtual machine would cost approximately $50 per month regardless of receipt volume.
+-------------------------------------------------------------------+
|                  EVENT-DRIVEN SERVERLESS PATTERN                  |
+-------------------------------------------------------------------+
|                                                                   |
|  +----------+     +----------+     +----------+     +----------+  |
|  |          |     |          |     |          |     |          |  |
|  |  EVENT   |---->|  QUEUE   |---->| FUNCTION |---->|  OUTPUT  |  |
|  |  SOURCE  |     |          |     |          |     |          |  |
|  +----------+     +----------+     +----------+     +----------+  |
|                                                                   |
|  Examples:                                                        |
|                                                                   |
|  +----------+     +----------+     +----------+     +----------+  |
|  | File     |     | Object   |     | Process  |     | Database |  |
|  | upload   |---->| storage  |---->| image,   |---->| record + |  |
|  |          |     | event    |     | extract  |     | thumbnail|  |
|  +----------+     +----------+     +----------+     +----------+  |
|                                                                   |
|  +----------+     +----------+     +----------+     +----------+  |
|  | HTTP     |     | API      |     | Validate |     | Response |  |
|  | request  |---->| Gateway  |---->| + route  |---->| JSON     |  |
|  |          |     |          |     |          |     |          |  |
|  +----------+     +----------+     +----------+     +----------+  |
|                                                                   |
|  +----------+     +----------+     +----------+     +----------+  |
|  | Schedule |     | Timer    |     | Generate |     | Email    |  |
|  | (daily)  |---->| trigger  |---->| report   |---->| send     |  |
|  |          |     |          |     |          |     |          |  |
|  +----------+     +----------+     +----------+     +----------+  |
|                                                                   |
+-------------------------------------------------------------------+
Figure 4: Event-driven serverless architecture patterns with concrete examples
Serverless functions impose constraints that exclude certain workloads. Maximum execution duration limits (15 minutes on most platforms) prevent long-running processes. Memory limits (up to 10 GB) constrain memory-intensive operations. The stateless execution model requires external storage for any data persisting between invocations. Applications needing persistent connections, background processing, or execution beyond platform limits must instead deploy to containers or virtual machines.
A worked example illustrates serverless cost calculation. A function processing beneficiary registration forms executes 50,000 times per month. Each execution consumes 512 MB memory and completes in 800 milliseconds (0.8 seconds). The invocation cost is 50,000 / 1,000,000 × $0.20 = $0.01. The duration cost is 50,000 × 0.8 × 0.5 GB × $0.00001667 = $0.33. Total monthly cost is $0.34. The equivalent always-on virtual machine capable of handling this load costs approximately $15 per month, making serverless economically advantageous for intermittent workloads of this scale.
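The arithmetic in this example can be wrapped in a small helper, using the per-invocation and per-GB-second rates quoted earlier. The function name and defaults are illustrative, not any provider's billing API.

```python
def serverless_monthly_cost(invocations: int, seconds: float, memory_gb: float,
                            per_million: float = 0.20,
                            per_gb_second: float = 0.00001667) -> float:
    """Monthly cost = invocation charge + duration charge (rates from the text)."""
    invocation_cost = invocations / 1_000_000 * per_million
    duration_cost = invocations * seconds * memory_gb * per_gb_second
    return round(invocation_cost + duration_cost, 2)

# Worked example above: 50,000 invocations, 800 ms each, 512 MB memory.
print(serverless_monthly_cost(50_000, 0.8, 0.5))  # 0.34
```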
Storage service categories
Cloud storage services divide into three categories based on access patterns and data organisation. Object storage addresses data as discrete objects accessed via HTTP APIs, optimised for web-scale data volumes with high durability requirements. Block storage presents volumes as raw devices that attach to compute instances, providing the performance characteristics required for operating system disks and databases. File storage exposes traditional filesystem interfaces (NFS, SMB) for applications requiring shared file access across multiple compute instances.
Object storage systems store data with 99.999999999% (eleven nines) durability by replicating objects across multiple facilities. Each object receives a unique identifier and associates with user-defined metadata. Access occurs through HTTP operations: PUT to store, GET to retrieve, DELETE to remove. This interface suits unstructured data (documents, images, videos, backups) that applications access as complete units rather than modifying in place.
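The PUT/GET/DELETE interface can be sketched as a minimal in-memory store. Real object storage exposes these as HTTP operations against a bucket; the class below is purely illustrative.

```python
class ObjectStore:
    """In-memory sketch of the object storage interface described above."""

    def __init__(self):
        self._objects = {}

    def put(self, key: str, data: bytes, metadata=None):
        # Store the object whole under its unique identifier, with
        # optional user-defined metadata alongside it.
        self._objects[key] = (data, metadata or {})

    def get(self, key: str) -> bytes:
        # Objects are retrieved as complete units, never modified in place.
        return self._objects[key][0]

    def delete(self, key: str):
        self._objects.pop(key, None)

store = ObjectStore()
store.put("reports/2024/annual.pdf", b"%PDF-...", {"programme": "health"})
print(store.get("reports/2024/annual.pdf"))  # b'%PDF-...'
```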
Object storage pricing combines storage volume, request count, and data transfer. A typical structure charges $0.023 per GB per month for storage, $0.005 per 1,000 GET requests, and $0.09 per GB for data transferred out of the provider’s network. An organisation storing 5 TB of programme documentation with 100,000 monthly downloads averaging 2 MB each incurs: storage cost of 5,000 × $0.023 = $115; request cost of 100,000 / 1,000 × $0.005 = $0.50; transfer cost of 200 GB × $0.09 = $18. Total monthly cost is $133.50.
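A helper makes the three cost components explicit, using the example rates above. The function and its defaults are illustrative, not any provider's published price list.

```python
def object_storage_monthly_cost(storage_gb: float, get_requests: int,
                                egress_gb: float,
                                storage_rate: float = 0.023,
                                get_rate_per_1000: float = 0.005,
                                egress_rate: float = 0.09) -> float:
    """Storage + request + transfer charges at the rates quoted in the text."""
    storage = storage_gb * storage_rate
    requests = get_requests / 1000 * get_rate_per_1000
    egress = egress_gb * egress_rate
    return round(storage + requests + egress, 2)

# 5 TB stored, 100,000 downloads averaging 2 MB each (200 GB egress).
print(object_storage_monthly_cost(5000, 100_000, 200))  # 133.5
```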
Block storage attaches to virtual machines as volumes that appear as local disks. The operating system formats these volumes with filesystems (ext4, XFS, NTFS) and mounts them at specified paths. Block storage performance varies by tier: standard tiers provide baseline IOPS suitable for general workloads; provisioned IOPS tiers guarantee specific performance levels for databases and other latency-sensitive applications.
+-------------------------------------------------------------------+
|                     STORAGE TIER ARCHITECTURE                     |
+-------------------------------------------------------------------+
|                                                                   |
|  HOT TIER                         PERFORMANCE                     |
|  +---------------------------+    +---------------------------+   |
|  | Block Storage (SSD)       |    | 3000+ IOPS baseline       |   |
|  | - OS disks                |    | Up to 64000 provisioned   |   |
|  | - Databases               |    | Sub-millisecond latency   |   |
|  | - Active workloads        |    | $0.10 per GB/month        |   |
|  +---------------------------+    +---------------------------+   |
|                                                                   |
|  WARM TIER                        BALANCED                        |
|  +---------------------------+    +---------------------------+   |
|  | Object Storage (standard) |    | HTTP access               |   |
|  | - Active documents        |    | 11 nines durability       |   |
|  | - Recent backups          |    | Millisecond latency       |   |
|  | - Frequently accessed     |    | $0.023 per GB/month       |   |
|  +---------------------------+    +---------------------------+   |
|                                                                   |
|  COOL TIER                        INFREQUENT ACCESS               |
|  +---------------------------+    +---------------------------+   |
|  | Object Storage (IA)       |    | Higher retrieval cost     |   |
|  | - Older documents         |    | 30-day minimum storage    |   |
|  | - Compliance archives     |    | Same durability           |   |
|  | - Infrequent access       |    | $0.0125 per GB/month      |   |
|  +---------------------------+    +---------------------------+   |
|                                                                   |
|  COLD TIER                        ARCHIVE                         |
|  +---------------------------+    +---------------------------+   |
|  | Archive Storage           |    | Hours to retrieve         |   |
|  | - Long-term retention     |    | 90-180 day minimum        |   |
|  | - Regulatory compliance   |    | Highest retrieval cost    |   |
|  | - Disaster recovery       |    | $0.004 per GB/month       |   |
|  +---------------------------+    +---------------------------+   |
|                                                                   |
+-------------------------------------------------------------------+
Figure 5: Storage tier characteristics and pricing comparison
File storage provides shared filesystem access for workloads requiring multiple compute instances to read and write the same files. Traditional applications expecting mounted filesystems work without modification. Performance scales with provisioned capacity or throughput, with managed services handling the underlying infrastructure. Use cases include shared configuration files, content management systems, and legacy applications designed for network file shares.
Data lifecycle management
Data lifecycle management automates the movement of data between storage tiers based on age and access patterns. A lifecycle policy defines rules that transition objects from expensive high-performance tiers to cost-effective archive tiers as data ages and access frequency decreases. This automation reduces storage costs without requiring manual intervention or application changes.
A lifecycle policy for programme documentation might specify: objects remain in standard storage for 90 days; objects transition to infrequent-access storage after 90 days; objects transition to archive storage after 365 days; objects delete after 2,555 days (7 years) to comply with retention requirements. The policy executes automatically, moving each object through tiers based on its creation date.
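The policy can be modelled as an ordered set of age thresholds. Providers express the same rules as policy documents attached to a bucket, so the structure below is only a sketch of the logic, with thresholds taken from the example.

```python
# Rules checked from oldest threshold down; first match wins.
LIFECYCLE_RULES = [
    (2555, "delete"),           # 7 years: expire for retention compliance
    (365, "archive"),           # over a year old: archive tier
    (90, "infrequent-access"),  # over 90 days: infrequent-access tier
    (0, "standard"),            # fresh objects stay in standard storage
]

def tier_for_age(age_days: int) -> str:
    """Return the storage tier an object of the given age should occupy."""
    for threshold, tier in LIFECYCLE_RULES:
        if age_days >= threshold:
            return tier
    return "standard"

print(tier_for_age(30))    # standard
print(tier_for_age(120))   # infrequent-access
print(tier_for_age(400))   # archive
print(tier_for_age(3000))  # delete
```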
The cost impact of lifecycle policies compounds significantly at scale. An organisation storing 50 TB of data with average age distribution of 20% current (under 90 days), 30% recent (90-365 days), and 50% archival (over 365 days) without lifecycle policies pays: 50,000 GB × $0.023 = $1,150 per month. With lifecycle policies placing data in appropriate tiers: 10,000 × $0.023 + 15,000 × $0.0125 + 25,000 × $0.004 = $230 + $187.50 + $100 = $517.50 per month. Annual savings exceed $7,500.
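The comparison is easy to recompute with the per-tier rates from the descriptions above; the helper is illustrative.

```python
def tiered_monthly_cost(gb_by_tier: dict) -> float:
    """Sum monthly storage cost across tiers at the rates quoted above."""
    rates = {"standard": 0.023, "infrequent": 0.0125, "archive": 0.004}
    return sum(gb * rates[tier] for tier, gb in gb_by_tier.items())

# 50 TB with everything left in standard storage versus the same data
# distributed 20% / 30% / 50% across the tiers by age.
flat = tiered_monthly_cost({"standard": 50_000})
tiered = tiered_monthly_cost({"standard": 10_000,
                              "infrequent": 15_000,
                              "archive": 25_000})
print(round(flat, 2))                  # 1150.0
print(round(tiered, 2))                # 517.5
print(round((flat - tiered) * 12, 2))  # 7590.0 saved annually
```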
Lifecycle policies require understanding of access patterns before implementation. Moving frequently-accessed data to archive tiers incurs retrieval costs and delays that degrade application performance. Analysis of actual access patterns through storage access logs informs appropriate transition timing. The 90-day threshold in the example above assumes programme documentation receives active use during project implementation, occasional reference during reporting periods, and rare access thereafter. Different data types warrant different policies based on their actual usage patterns.
Backup and resilience in cloud
Cloud storage systems provide built-in durability through replication, but durability differs from backup. Replication protects against hardware failure and maintains data availability; backup protects against logical errors (accidental deletion, data corruption, malicious modification) by preserving point-in-time copies. Both mechanisms serve distinct purposes in data protection strategy.
Snapshots capture the state of block storage volumes at a specific moment. The snapshot stores only changed blocks since the previous snapshot (incremental), minimising storage consumption while enabling restoration to any captured point in time. A daily snapshot schedule with 30-day retention provides recovery points for the preceding month. Snapshots replicate to separate infrastructure from the source volume, protecting against facility-level failures affecting the original.
Object storage versioning maintains previous versions of objects when updates occur. Each PUT operation to an existing object creates a new version rather than overwriting. The previous version remains accessible via its version identifier. Versioning protects against accidental overwrites and deletions: a deleted object’s versions persist until explicitly purged. Combining versioning with lifecycle policies that expire old versions after defined periods balances protection against storage cost accumulation.
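Versioning behaviour can be sketched in a few lines: each PUT appends rather than overwrites, and earlier versions stay retrievable by identifier. This in-memory class is illustrative only; real services return opaque version identifiers rather than list indices.

```python
class VersionedBucket:
    """Sketch of object versioning: PUT appends a new version."""

    def __init__(self):
        self._versions = {}

    def put(self, key: str, data: bytes) -> int:
        # A PUT to an existing key creates a new version instead of
        # overwriting; the previous versions remain accessible.
        self._versions.setdefault(key, []).append(data)
        return len(self._versions[key]) - 1  # version identifier

    def get(self, key: str, version: int = -1) -> bytes:
        return self._versions[key][version]  # latest by default

bucket = VersionedBucket()
v0 = bucket.put("budget.xlsx", b"draft")
bucket.put("budget.xlsx", b"final")
print(bucket.get("budget.xlsx"))      # b'final'
print(bucket.get("budget.xlsx", v0))  # b'draft'
```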
Cross-region replication copies data to geographically separate regions, protecting against regional outages affecting all infrastructure in a single location. Replication occurs asynchronously: a write to the primary region propagates to the secondary region within seconds to minutes depending on data volume and network conditions. Applications requiring zero data loss must account for the replication lag in their recovery point objectives. Cross-region replication doubles storage costs and incurs data transfer charges between regions.
Backup strategy for cloud workloads combines these mechanisms based on recovery requirements. A representative configuration for a programme management system includes: database snapshots every 6 hours with 14-day retention (2 GB database, 336 GB snapshot storage at $7.73/month); object storage versioning enabled with 90-day version lifecycle (50 GB active data, approximately 75 GB version storage at $1.73/month); cross-region replication for critical configuration and identity data (5 GB replicated at $0.12/month storage plus $0.45/month transfer). Total backup infrastructure cost is $10.03 per month for comprehensive protection of a system storing 57 GB of active data.
Pricing models and commitment
Cloud compute pricing operates across three models offering different trade-offs between flexibility and cost. On-demand pricing charges per-second or per-hour usage with no commitment, enabling workloads to start and stop without penalty. Reserved capacity provides discounts of 30-72% in exchange for one-year or three-year commitments to specific instance types or usage volumes. Spot instances offer discounts of 60-90% for interruptible capacity that the provider can reclaim with two minutes’ notice.
On-demand pricing suits variable workloads, development environments, and initial deployments where usage patterns remain uncertain. The flexibility to scale to zero during idle periods and scale rapidly during demand peaks outweighs the higher per-unit cost. A development environment running 8 hours per day for 22 workdays monthly consumes 176 hours. At $0.10 per hour on-demand, monthly cost is $17.60. The equivalent reserved pricing at $0.06 per hour (730 hours committed) costs $43.80, making on-demand more economical for environments not running continuously.
Reserved capacity suits steady-state production workloads with predictable requirements. A web application requiring two instances continuously justifies reserved capacity. Two instances at $0.10 per hour on-demand cost $146 monthly (730 hours × 2 × $0.10). The same instances with one-year reserved pricing at $0.065 per hour cost $94.90 monthly, saving $613 annually. Three-year reservations reduce costs further but constrain flexibility: technology changes or workload migrations before the term ends strand the commitment.
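The break-even point between the two models follows from the rates: a reservation bills for every hour of the month whether or not the instance runs, so it wins once actual usage exceeds reserved_rate × 730 / on_demand_rate hours. A quick check, using the rates from the examples above (helper name is illustrative):

```python
def breakeven_hours(on_demand_rate: float, reserved_rate: float,
                    hours_in_month: int = 730) -> float:
    """Monthly usage hours above which a reservation beats on-demand.

    The reservation charges reserved_rate for all hours_in_month hours,
    so it is cheaper once usage * on_demand_rate exceeds that total.
    """
    return reserved_rate * hours_in_month / on_demand_rate

# Dev environment: 176 hours/month is well below break-even, so on-demand wins.
print(round(breakeven_hours(0.10, 0.06)))      # 438
# Production: an always-on instance (730 hours) is well above break-even.
print(round(breakeven_hours(0.10, 0.065), 1))  # 474.5
```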
Reserved capacity planning requires forecasting resource needs across the commitment period. Under-committing leaves savings on the table; over-committing wastes money on unused reservations. A conservative approach reserves 60-70% of baseline capacity, covering predictable steady-state load with on-demand pricing handling peaks and growth. Quarterly review of utilisation metrics against reservations enables adjustment at renewal.
Spot instances suit fault-tolerant batch workloads that can checkpoint progress and resume after interruption. Data processing pipelines, rendering jobs, and scientific computing benefit from spot pricing. A batch job running 1,000 instance-hours monthly costs $100 on-demand at $0.10 per hour. The same job on spot instances at $0.015 per hour costs $15, saving $1,020 annually. The application must handle interruptions gracefully: saving intermediate results, using queue-based architectures, and running across multiple spot pools to reduce simultaneous interruption probability.
Worked example of blended pricing: An organisation runs a programme management system with web servers, application servers, and batch processing. Analysis shows steady-state requirements of 4 instances (2 web, 2 application) running continuously and batch processing consuming 500 instance-hours monthly. Pricing strategy: reserve 4 instances at $0.065/hour ($189.80/month); run batch on spot at $0.015/hour ($7.50/month). Total monthly compute cost is $197.30 compared to $342 for all on-demand (4 × 730 × $0.10 plus 500 × $0.10), saving approximately $1,736 annually. The batch workload’s interruptibility enables spot usage while production systems receive reserved pricing for predictable costs and capacity guarantees.
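The blended arithmetic can be recomputed directly (730-hour month, rates from this example; the helper name is illustrative):

```python
HOURS = 730  # hours in the billing month used throughout this section

def blended_monthly_cost(reserved_instances: int, reserved_rate: float,
                         spot_hours: float, spot_rate: float) -> float:
    """Reserved instances billed for the full month plus spot batch hours."""
    return reserved_instances * HOURS * reserved_rate + spot_hours * spot_rate

blended = blended_monthly_cost(4, 0.065, 500, 0.015)
all_on_demand = 4 * HOURS * 0.10 + 500 * 0.10
print(round(blended, 2))                          # 197.3
print(round(all_on_demand, 2))                    # 342.0
print(round((all_on_demand - blended) * 12, 2))   # 1736.4 saved annually
```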
Field and edge considerations
Workloads serving field locations face latency constraints that affect service selection and architecture. A field office in Nairobi accessing a cloud application hosted in a European region experiences 150-200 milliseconds round-trip latency due to network distance. This latency affects interactive applications perceptibly: each API call adds visible delay to user interfaces. Applications requiring multiple sequential API calls compound latency, potentially creating unusable experiences.
Mitigating field latency involves architectural patterns that reduce round-trips and move processing closer to users. API design should enable single requests to return complete data sets rather than requiring multiple calls to assemble responses. A beneficiary lookup returning the complete beneficiary record with related programme enrolments in one response performs better than separate calls for beneficiary data and each programme enrolment.
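The round-trip cost of a chatty API versus a batched one can be estimated directly, assuming a hypothetical 175 millisecond average round trip within the 150-200 millisecond range quoted above (both function names are illustrative):

```python
# Hypothetical average round trip for Nairobi-to-Europe traffic.
RTT_MS = 175

def chatty_lookup_ms(enrolment_count: int) -> int:
    """One call for the beneficiary plus one call per programme enrolment."""
    return (1 + enrolment_count) * RTT_MS

def batched_lookup_ms() -> int:
    """A single call returning the beneficiary with enrolments embedded."""
    return RTT_MS

print(chatty_lookup_ms(5))  # 1050 ms spent waiting on the network
print(batched_lookup_ms())  # 175 ms for the same data
```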
Content delivery networks cache static assets at edge locations near users. A field worker in Nairobi loading a web application retrieves JavaScript, CSS, and images from an edge location in Nairobi (20-30 milliseconds) rather than from Europe (150-200 milliseconds). CDN configuration requires identifying cacheable content (static assets, infrequently-changing API responses) and setting appropriate cache durations.
Regional deployment places application infrastructure in regions closer to user concentrations. An organisation with substantial operations in East Africa benefits from deploying application instances in a South Africa or Kenya region, reducing latency to 30-60 milliseconds for users in the surrounding area. Multi-region deployment adds operational complexity: data synchronisation between regions, region-specific configurations, and failover procedures. The latency benefit must justify this complexity.
Edge computing extends cloud services to locations outside traditional cloud regions. Providers offer edge locations with compute capacity in cities lacking full cloud regions. Edge locations suit latency-sensitive components while central regions handle data storage and batch processing. A mobile data collection application might process form validation at an edge location (low latency for field workers) while synchronising completed forms to a central database (latency-tolerant background process).
For satellite-connected field sites, latency mitigation has limits. Geostationary satellite links impose 600+ millisecond round-trip times regardless of cloud region location due to the physical distance to satellites. These environments require offline-capable applications that function without continuous connectivity, synchronising when connections permit. Cloud services support offline patterns through queuing services that buffer operations during disconnection and process them upon reconnection.
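The buffer-and-replay pattern can be sketched in a few lines. Cloud queue services play the same role with durable server-side storage, so this in-memory class is only illustrative of the control flow.

```python
from collections import deque

class OfflineQueue:
    """Buffer operations while disconnected; replay them on reconnect."""

    def __init__(self, send):
        self._send = send        # callable that delivers one operation
        self._pending = deque()

    def submit(self, op, online: bool):
        if online:
            self._send(op)
        else:
            self._pending.append(op)  # hold until connectivity returns

    def reconnect(self):
        """Drain buffered operations in the order they were submitted."""
        while self._pending:
            self._send(self._pending.popleft())

delivered = []
q = OfflineQueue(delivered.append)
q.submit("form-1", online=True)
q.submit("form-2", online=False)  # satellite link down: buffered locally
q.submit("form-3", online=False)
q.reconnect()
print(delivered)  # ['form-1', 'form-2', 'form-3']
```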
Implementation considerations
For organisations with limited IT capacity
Organisations with minimal dedicated IT capacity benefit from higher-abstraction services that reduce operational burden. Serverless functions and managed container services eliminate infrastructure management entirely. A single IT person can deploy and operate applications on these platforms without expertise in server administration, security patching, or capacity management.
Start with serverless functions for new automation requirements. A scheduled function generating weekly reports, a webhook handler processing form submissions, or an event handler transforming uploaded files each requires only application code. The provider manages execution environments, scaling, and availability. Costs scale to zero during idle periods, aligning expenses with actual usage.
For applications requiring persistent processes, managed container services with simplified interfaces (not full Kubernetes) provide a middle ground. These services accept container images and handle scaling, load balancing, and health monitoring. The operational surface area is smaller than virtual machines: update the container image to deploy changes rather than configuring servers.
Virtual machines remain appropriate when workloads require specific operating system configurations or when migrating existing applications without modification. In these cases, providers offer managed services that handle patching and backup, reducing operational burden. A managed database service eliminates database administration tasks that would otherwise consume IT capacity.
For organisations with established IT functions
Established IT teams can leverage lower-abstraction services for greater control while implementing automation to manage operational complexity. Virtual machines with infrastructure-as-code define environments reproducibly, enabling consistent deployments and disaster recovery. Configuration management tools maintain server state, applying security patches and configuration changes across fleets.
Managed Kubernetes suits organisations with container expertise seeking to eliminate control plane operations. The team retains control over workload configuration, scaling policies, and networking while the provider ensures Kubernetes availability. This model enables sophisticated deployment patterns (blue-green, canary) and fine-grained resource management.
Implement tagging standards enabling cost attribution to programmes, projects, or departments. Tagging all resources with consistent metadata supports cost analysis and showback reporting. The landing zone design should enforce tagging through policies that reject untagged resource creation.
Establish reserved capacity programmes based on utilisation analysis. Quarterly review of compute utilisation identifies resources suitable for reserved pricing. Centralised reservation purchasing achieves better discounts than fragmented purchases and enables capacity sharing across accounts within the organisation.
For organisations operating in constrained environments
Field-heavy operations require architecture accommodating intermittent connectivity and high latency. Design applications for offline operation with synchronisation when connected. Queue-based architectures decouple components, enabling continued operation during connectivity gaps.
Deploy regional infrastructure where user concentrations justify the complexity. A regional deployment serving field offices across East Africa provides better experience than a single European deployment while remaining simpler than full multi-region active-active architecture.
Evaluate data residency requirements affecting cloud region selection. Donor requirements, data protection regulations, and organisational policies constrain where data can reside. Select cloud providers and regions satisfying all applicable requirements before beginning architecture design.