
Storage Architecture

Storage architecture determines how data persists across an organisation’s infrastructure, encompassing the physical and logical structures that hold information, the protocols that provide access, and the patterns that ensure availability and performance. For organisations operating distributed offices with varying connectivity and resource constraints, storage decisions shape everything from daily file access to disaster recovery capabilities.

Block storage
Storage presented as raw volumes divided into fixed-size blocks, each addressable independently. The storage system has no knowledge of file structure; the consuming system’s filesystem imposes organisation. Block storage provides the lowest latency and highest throughput for structured workloads.
File storage
Storage presented as a hierarchical filesystem with directories and files. The storage system manages file structure and metadata, presenting a familiar interface to clients. File storage suits shared access and unstructured data.
Object storage
Storage presenting data as discrete objects identified by unique keys within flat namespaces. Each object contains data, metadata, and a unique identifier. Object storage scales horizontally and suits large unstructured datasets accessed via HTTP APIs.
IOPS
Input/output operations per second, measuring how many read or write operations storage can complete per second. Critical for database workloads with many small transactions.
Throughput
Data transfer rate, measured in megabytes or gigabytes per second. Critical for streaming workloads like video or large file transfers.
Latency
Time elapsed between requesting data and receiving the first byte, measured in milliseconds or microseconds. Critical for interactive applications and real-time systems.

Storage types and access protocols

The three fundamental storage types serve distinct purposes, and understanding their characteristics guides architecture decisions. Block storage provides raw capacity that a host system formats with its chosen filesystem. File storage provides shared access to a common filesystem managed by the storage system itself. Object storage provides scalable capacity for large datasets accessed programmatically rather than mounted as drives.

Block storage excels where applications require direct disk access with predictable, low-latency performance. Databases write transaction logs sequentially and query data randomly; block storage handles both patterns efficiently. Virtual machine hypervisors store disk images as block devices, benefiting from the thin provisioning and snapshot capabilities that modern block storage provides. The consuming host sees block storage as a local disk, formatting it with ext4, XFS, NTFS, or another filesystem appropriate to the operating system.

File storage suits scenarios where multiple systems require simultaneous access to the same files. Shared project folders, home directories, and collaborative document stores work naturally as file storage. The storage system handles file locking, permission enforcement, and metadata management. Clients mount file shares using standard protocols and interact with familiar filesystem semantics.

Object storage addresses scale and durability requirements that exceed traditional filesystem capabilities. A single namespace can hold billions of objects totalling petabytes. Objects are immutable once written; updates create new versions rather than modifying in place. The HTTP-based access pattern integrates readily with web applications and distributed systems. Object storage sacrifices the random-access patterns of block storage and the hierarchical organisation of file storage in exchange for massive scale and geographic distribution.

Access protocols determine how clients communicate with storage systems. The choice of protocol affects performance, compatibility, security, and operational complexity.

Block protocols connect hosts to block storage:

iSCSI (Internet Small Computer Systems Interface) encapsulates SCSI commands within TCP/IP packets, enabling block storage access over standard Ethernet networks. iSCSI requires no specialised hardware beyond standard network infrastructure, making it accessible to organisations without dedicated storage networks. Performance depends on network quality; dedicated VLANs or separate physical networks prevent contention with general traffic. Typical iSCSI deployments achieve 1-10 Gbps throughput depending on network infrastructure.

Fibre Channel provides dedicated storage networking with guaranteed bandwidth and low latency. Fibre Channel requires specialised host bus adapters (HBAs), switches, and cabling separate from IP networks. The protocol delivers consistent sub-millisecond latency and throughput from 8 Gbps to 128 Gbps per port. The cost and complexity of Fibre Channel infrastructure limits its use to organisations with demanding performance requirements and dedicated storage administration capacity.

NVMe over Fabrics (NVMe-oF) extends the NVMe protocol designed for local solid-state storage across network fabrics. NVMe-oF achieves latencies under 100 microseconds, approaching local storage performance. Implementations exist for RDMA networks, Fibre Channel, and TCP. The protocol suits workloads where microseconds matter, such as high-frequency transaction processing or real-time analytics.

File protocols provide network filesystem access:

NFS (Network File System) originated in Unix environments and provides file access over IP networks. Earlier versions were stateless; NFSv4 introduced a stateful protocol along with security improvements including Kerberos authentication and encryption. NFS performs well for Unix and Linux clients accessing shared storage. Configuration is straightforward, and the protocol integrates with standard Unix permission models.

SMB (Server Message Block), an early dialect of which was known as CIFS, provides file sharing for Windows environments and is supported on Linux and macOS. SMB3 introduced encryption, multichannel aggregation, and continuous availability features. SMB integrates with Active Directory for authentication and authorisation. Most Windows-based organisations standardise on SMB for user file shares.

Object protocols provide programmatic access:

S3 (Simple Storage Service) defines the de facto standard API for object storage, originally developed by AWS and now implemented by numerous vendors and open-source projects. Applications interact with S3 through HTTP requests using access keys for authentication. The protocol’s simplicity and ubiquity make it the default choice for object storage integration.

Swift provides object storage access for OpenStack deployments. While less prevalent than S3, Swift remains relevant in private cloud environments built on OpenStack infrastructure.

Architecture patterns

Storage architecture patterns describe how storage capacity connects to the systems that use it. The three primary patterns differ in connectivity, sharing capability, performance characteristics, and operational requirements.

+-------------------------------------------------------------+
|                STORAGE ARCHITECTURE PATTERNS                |
+-------------------------------------------------------------+
DIRECT-ATTACHED STORAGE (DAS)
+------------+ +------------+ +------------+
| Server 1 | | Server 2 | | Server 3 |
| | | | | |
| +--------+ | | +--------+ | | +--------+ |
| | Disk | | | | Disk | | | | Disk | |
| | Array | | | | Array | | | | Array | |
| +--------+ | | +--------+ | | +--------+ |
+------------+ +------------+ +------------+
Isolated Isolated Isolated
NETWORK-ATTACHED STORAGE (NAS)
+------------+ +------------+ +------------+
| Server 1 | | Server 2 | | Server 3 |
+-----+------+ +-----+------+ +-----+------+
| | |
+--------+--------+--------+--------+
| |
+-----v-----+ +-----v-----+
| NAS | | NAS |
| Filer 1 | | Filer 2 |
| (NFS/SMB) | | (NFS/SMB) |
+-----------+ +-----------+
Shared file access
STORAGE AREA NETWORK (SAN)
+------------+ +------------+ +------------+
| Server 1 | | Server 2 | | Server 3 |
| (HBA/iSCSI)| | (HBA/iSCSI)| | (HBA/iSCSI)|
+-----+------+ +-----+------+ +-----+------+
| | |
+--------+--------+--------+--------+
|
+-----v-----------------+
| SAN Fabric |
| (FC switches/Ethernet)|
+-----+-----------------+
|
+-----v-----+
| SAN |
| Storage |
| Array |
+-----------+
Block-level access

Figure 1: Comparison of DAS, NAS, and SAN architecture patterns

Direct-attached storage connects storage devices directly to individual servers without network intermediation. Internal drives, external drive enclosures, and directly connected RAID arrays constitute DAS. Each server’s storage remains isolated; sharing data between servers requires copying across the network or application-level coordination. DAS provides the simplest architecture with the lowest latency, as no network stack intervenes between server and storage. The pattern suits single-server deployments, workstations, and scenarios where storage sharing is unnecessary. Limitations emerge when multiple servers need access to common data or when storage capacity must scale independently of compute resources.

Network-attached storage presents file-level storage over standard IP networks. NAS devices, also called filers, run specialised operating systems optimised for file serving. Clients mount NAS shares using NFS or SMB protocols and access files as though stored locally. Multiple servers can access the same files simultaneously, with the NAS system managing concurrent access and locking. NAS suits shared file access patterns: user home directories, departmental file shares, web server content, and media repositories. Performance depends on network bandwidth and NAS device capabilities. Enterprise NAS systems support 10-100 Gbps aggregate throughput; entry-level devices provide 1-2 Gbps.

Storage area networks provide block-level access over dedicated networks. Servers see SAN storage as local disks despite the network separation. SAN architecture enables storage pooling, where capacity from multiple arrays appears as a unified pool allocable to any connected server. Storage administrators provision logical units (LUNs) from the pool and present them to specific servers. SAN suits database servers, virtualisation hosts, and applications requiring block storage with the flexibility to migrate workloads between servers. The dedicated network isolates storage traffic from general network congestion, ensuring consistent performance. SAN complexity and cost exceed those of NAS, making it appropriate for organisations with dedicated storage administration capacity and performance-critical workloads.

The choice between patterns depends on workload requirements, budget constraints, and operational capacity. Organisations with limited IT staff benefit from NAS simplicity for file sharing needs. Those running virtualisation platforms or databases requiring shared block storage may justify SAN investment. Many environments combine patterns: NAS for user files and unstructured data, SAN or DAS for performance-critical applications.

Software-defined storage

Software-defined storage decouples storage services from underlying hardware, implementing storage logic in software running on commodity servers. Rather than purchasing proprietary storage arrays, organisations deploy storage software on standard servers with attached disks. This approach reduces hardware costs, eliminates vendor lock-in, and enables storage that scales horizontally by adding nodes.

Distributed storage systems spread data across multiple nodes, providing redundancy through replication or erasure coding rather than hardware RAID. Ceph, GlusterFS, and MinIO exemplify open-source distributed storage. These systems aggregate capacity from many servers into unified pools, automatically distributing data to survive node failures. A three-node Ceph cluster with triple replication tolerates any single node failure without data loss or service interruption. Adding nodes increases both capacity and performance, as the system distributes load across all participants.
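The capacity cost of the redundancy scheme is worth quantifying when comparing replication with erasure coding; a quick sketch (the three-node raw capacity and the 4+2 erasure profile are illustrative choices, not tied to any specific cluster in the text):

```python
def usable_capacity_tb(raw_tb, data_chunks, parity_chunks):
    """Usable capacity once redundancy overhead is removed.

    Replication is the special case data_chunks=1, parity_chunks=copies-1;
    erasure coding stripes data_chunks of data plus parity_chunks of parity.
    """
    return raw_tb * data_chunks / (data_chunks + parity_chunks)

raw = 300  # e.g. three nodes of 100 TB each
triple_replica = usable_capacity_tb(raw, 1, 2)  # 100 TB usable, survives 2 failures
erasure_4_2 = usable_capacity_tb(raw, 4, 2)     # 200 TB usable, survives 2 failures
```

Erasure coding doubles usable capacity at the same failure tolerance here, which is why large clusters favour it for cold data, at the cost of higher CPU and rebuild traffic.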

Hyper-converged infrastructure integrates compute, storage, and networking in each node, with software aggregating resources across the cluster. Rather than separate server, storage, and network tiers, hyper-converged systems deploy identical nodes that contribute all resource types. Proxmox VE with Ceph, for example, provides virtualisation and distributed storage in a single platform. This model simplifies architecture and procurement: capacity grows by adding identical nodes rather than separately scaling compute and storage tiers.

Software-defined storage requires operational investment. Distributed storage systems demand understanding of failure domains, rebalancing behaviour, and performance tuning. Organisations without Linux systems administration expertise may find the operational burden exceeds hardware cost savings. Managed or commercial variants reduce operational complexity at higher cost.

+------------------------------------------------------------------+
|                 SOFTWARE-DEFINED STORAGE CLUSTER                 |
+------------------------------------------------------------------+
+------------------+ +------------------+ +------------------+
| Node 1 | | Node 2 | | Node 3 |
| | | | | |
| +------+------+ | | +------+------+ | | +------+------+ |
| | SSD | SSD | | | | SSD | SSD | | | | SSD | SSD | |
| +------+------+ | | +------+------+ | | +------+------+ |
| | HDD | HDD | | | | HDD | HDD | | | | HDD | HDD | |
| | HDD | HDD | | | | HDD | HDD | | | | HDD | HDD | |
| +------+------+ | | +------+------+ | | +------+------+ |
| | | | | |
| +-------------+ | | +-------------+ | | +-------------+ |
| | Storage | | | | Storage | | | | Storage | |
| | Software | | | | Software | | | | Software | |
| | (Ceph OSD) | | | | (Ceph OSD) | | | | (Ceph OSD) | |
| +-------------+ | | +-------------+ | | +-------------+ |
+--------+---------+ +--------+---------+ +--------+---------+
| | |
+-----------+-----------+-----------+-----------+
|
+------v------+
| Cluster |
| Network |
| (10+ Gbps) |
+------+------+
|
+----------------+----------------+
| | |
+---v---+ +----v----+ +----v----+
| Block | | File | | Object |
| (RBD) | | (CephFS)| | (RGW) |
+-------+ +---------+ +---------+
| | |
v v v
VMs/DBs Shared files Applications

Figure 2: Software-defined storage cluster presenting multiple access methods

Performance characteristics

Storage performance manifests through three primary metrics: IOPS, throughput, and latency. Different workloads stress different metrics, and optimising for one may compromise another.

IOPS-intensive workloads generate many small read and write operations. Transactional databases exemplify this pattern: each query may touch dozens of small data blocks scattered across storage. A busy database server may require 10,000-50,000 IOPS. Solid-state drives deliver 10,000-100,000 IOPS depending on interface and quality, while spinning disks provide 100-200 IOPS. IOPS capacity determines how many concurrent transactions storage can support.

Throughput-intensive workloads transfer large sequential data volumes. Video editing, backup operations, and data analytics read or write gigabytes in sustained streams. A 4K video editing workload requires 400-800 MB/s sustained throughput. Storage throughput depends on drive capabilities, interface bandwidth, and network capacity. NVMe SSDs deliver 3,000-7,000 MB/s; SATA SSDs provide 500-550 MB/s; spinning disks achieve 100-200 MB/s sequential.

Latency-sensitive workloads require rapid response to individual operations. Real-time applications, interactive services, and databases with strict response time requirements fall into this category. NVMe storage provides sub-millisecond latency; SATA SSDs deliver 0.1-0.5 ms; spinning disks incur 5-15 ms average seek time. Network latency adds to storage latency: iSCSI over Ethernet adds 0.5-2 ms; Fibre Channel adds 0.1-0.5 ms.

Storage performance planning begins with workload characterisation. Existing systems reveal I/O patterns through monitoring. New deployments require estimation based on application requirements and user counts. A rule of thumb for planning: provision 5 IOPS per user for general file services, 50-100 IOPS per user for VDI workloads, and measure actual database requirements during testing.
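The rule of thumb translates into a simple planning estimate; a minimal sketch (per-user figures are those from the text; the mixed-workload helper and 30% burst headroom are assumptions):

```python
# IOPS per user by workload, per the planning rule of thumb above
IOPS_PER_USER = {"file_services": 5, "vdi_low": 50, "vdi_high": 100}

def estimate_iops(workloads, headroom=1.3):
    """Sum per-workload IOPS and apply a headroom factor for bursts."""
    base = sum(IOPS_PER_USER[kind] * users for kind, users in workloads.items())
    return int(base * headroom)

# 200 general file-service users plus 30 VDI seats at the high-end estimate
required = estimate_iops({"file_services": 200, "vdi_high": 30})  # 5,200 IOPS
```

Database workloads are deliberately excluded: as the text notes, those should be measured during testing rather than estimated per user.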

Media selection determines achievable performance:

Media type           | IOPS (random 4K)  | Throughput (MB/s) | Latency    | Cost per TB | Use case
---------------------|-------------------|-------------------|------------|-------------|------------------------------------------
Enterprise NVMe SSD  | 100,000-1,000,000 | 3,000-7,000       | 50-100 μs  | £200-400    | Tier 1: databases, high-performance apps
Enterprise SATA SSD  | 50,000-100,000    | 400-550           | 100-500 μs | £100-200    | Tier 2: virtualisation, general apps
10K RPM SAS HDD      | 150-200           | 150-200           | 4-8 ms     | £40-80      | Tier 3: warm data, moderate access
7.2K RPM NL-SAS HDD  | 80-100            | 100-150           | 8-15 ms    | £20-40      | Tier 4: cold data, archival, backup

Table 1: Storage media performance and cost characteristics

Data tiering and lifecycle

Data tiering places information on storage media matching its access frequency and performance requirements. Active data resides on fast, expensive storage; ageing data migrates to slower, cheaper media. Automated tiering moves data between tiers based on access patterns without administrator intervention.

A three-tier architecture serves most organisations. The hot tier uses NVMe or high-performance SSD for actively accessed data: current financial records, active project files, production databases. The warm tier uses SATA SSD or fast HDD for less frequently accessed data: completed projects, historical reports, secondary database replicas. The cold tier uses high-capacity HDD or object storage for rarely accessed data: archived records, backup copies, compliance retention.

The economic argument for tiering follows from media cost differences. Storing 100 TB entirely on NVMe SSD costs £20,000-40,000 in media alone. Distributing that data with 10 TB hot (NVMe), 30 TB warm (SATA SSD), and 60 TB cold (HDD) costs £6,200-12,400 at the same media prices, a reduction of roughly 70%. Tiering requires operational investment in classification, policies, and migration mechanisms, but the cost savings justify this investment at scale.
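Using the per-terabyte prices from Table 1 directly, the comparison can be recomputed for any tier split (the 10/30/60 TB allocation is the example's; totals follow the table's low/high bounds):

```python
# (low, high) cost per TB in GBP, from Table 1
PRICES = {"nvme": (200, 400), "sata_ssd": (100, 200), "hdd": (20, 40)}

def tier_cost(allocation_tb):
    """Total (low, high) media cost for a capacity split across tiers."""
    low = sum(tb * PRICES[tier][0] for tier, tb in allocation_tb.items())
    high = sum(tb * PRICES[tier][1] for tier, tb in allocation_tb.items())
    return low, high

flat = tier_cost({"nvme": 100})                              # all-NVMe baseline
tiered = tier_cost({"nvme": 10, "sata_ssd": 30, "hdd": 60})  # three-tier split
saving = 1 - tiered[0] / flat[0]                             # ≈ 0.69 at the low end
```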

+------------------------------------------------------------------+
|                    DATA TIERING ARCHITECTURE                     |
+------------------------------------------------------------------+
Data Age/Access Frequency
High access +------------------------------------------+
(hot) | HOT TIER |
| NVMe SSD: 10 TB |
| Active databases, current projects |
| 100,000+ IOPS |
+--------------------+---------------------+
|
| Automated migration
| (access < 10/month)
v
Medium access +------------------------------------------+
(warm) | WARM TIER |
| SATA SSD: 30 TB |
| Recent archives, reference data |
| 50,000 IOPS |
+--------------------+---------------------+
|
| Automated migration
| (access < 1/quarter)
v
Low access +------------------------------------------+
(cold) | COLD TIER |
| HDD: 60 TB |
| Compliance archives, old backups |
| 1,000 IOPS |
+------------------------------------------+
Infrequent +------------------------------------------+
(archive) | ARCHIVE TIER |
| Object storage/tape: 200+ TB |
| Long-term retention, disaster recovery |
+------------------------------------------+

Figure 3: Four-tier storage architecture with automated data migration

Policy-based tiering automates data movement. Policies specify criteria: files unaccessed for 90 days migrate from hot to warm; files unaccessed for one year migrate from warm to cold. Storage systems track access timestamps and execute migrations during low-activity periods. Transparent tiering presents a unified namespace to users regardless of physical location; access to cold data simply incurs higher latency during recall.
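A policy engine of this kind reduces to scanning access times and relocating files past an age threshold; a minimal sketch using local directories as stand-in tiers (directory layout and the 90-day default are illustrative):

```python
import shutil
import time
from pathlib import Path

def migrate_cold_files(hot_dir, warm_dir, max_age_days=90):
    """Move files unaccessed for max_age_days from hot to warm storage."""
    cutoff = time.time() - max_age_days * 86400
    moved = []
    for path in Path(hot_dir).iterdir():
        # Compare last-access time against the policy cutoff
        if path.is_file() and path.stat().st_atime < cutoff:
            shutil.move(str(path), str(Path(warm_dir) / path.name))
            moved.append(path.name)
    return moved
```

A production system would additionally preserve ownership and permissions, leave a stub or link behind so recall stays transparent to users, and schedule runs during low-activity periods, as described above.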

Manual tiering remains viable for smaller deployments. Administrators create separate shares for active and archive data, with users or scheduled jobs moving files between them. This approach requires user discipline but avoids automated tiering complexity.

Storage for virtualisation and containers

Virtualisation platforms require storage with specific characteristics. Virtual machine disk images consume substantial capacity: a modest VM with 100 GB disk, multiplied by 50 VMs, demands 5 TB storage. Hypervisors benefit from storage features including thin provisioning, snapshots, and cloning.

Thin provisioning allocates physical capacity only as VMs write data, rather than reserving the full provisioned size. A 100 GB thin-provisioned disk consuming 20 GB of actual data uses only 20 GB of physical storage. This oversubscription model improves capacity efficiency but requires monitoring to prevent exhaustion. Storage administrators set alerts when physical utilisation reaches 80% of provisioned capacity.
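The 80% alerting rule can be expressed as a small check run against pool statistics (thresholds follow the text; the reporting format is an assumption):

```python
def thin_provisioning_status(physical_tb, used_tb, provisioned_tb, alert_at=0.80):
    """Report physical utilisation and oversubscription for a thin pool."""
    utilisation = used_tb / physical_tb              # fraction of real capacity consumed
    oversubscription = provisioned_tb / physical_tb  # how far logical exceeds physical
    return {
        "utilisation_pct": round(utilisation * 100, 1),
        "oversubscription": round(oversubscription, 2),
        "alert": utilisation >= alert_at,
    }

# 50 TB physical pool, 42 TB actually written, 80 TB provisioned to VMs
status = thin_provisioning_status(50, 42, 80)  # 84% used, 1.6x oversubscribed: alert
```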

Snapshots capture point-in-time images of storage volumes. Before applying patches or making configuration changes, administrators snapshot affected VMs. If problems occur, reverting to the snapshot restores the previous state within minutes. Snapshots consume storage proportional to changes since creation; a VM changing 5 GB daily accumulates 35 GB of snapshot data weekly. Retention policies prevent snapshot accumulation from consuming excessive capacity.

Cloning creates independent copies of volumes, enabling rapid provisioning. Deploying a new VM from a template involves cloning the template’s disk rather than copying gigabytes of data. Linked clones share read-only base images, further reducing storage consumption for similar VMs.

Shared storage enables virtualisation features that standalone hosts cannot provide. Live migration moves running VMs between hosts without downtime; both hosts must access the VM’s storage simultaneously during the transfer. High availability restarts failed VMs on surviving hosts, requiring those hosts to access the failed VM’s storage. These capabilities demand either SAN block storage accessible from all hosts or distributed storage systems that replicate data across the cluster.

Container platforms interact with storage differently than virtualisation. Containers are ephemeral by default; data written inside a container disappears when the container stops. Persistent volumes provide durable storage that survives container lifecycle events. Container orchestrators like Kubernetes abstract storage through the Container Storage Interface (CSI), enabling consistent volume provisioning regardless of underlying storage type.
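In Kubernetes, that abstraction surfaces as a PersistentVolumeClaim; a representative request (the claim name and storage class are assumptions specific to a given cluster):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data              # hypothetical claim name
spec:
  accessModes:
    - ReadWriteOnce           # single-node mount, typical for block-backed volumes
  storageClassName: ceph-rbd  # assumed class; maps to a CSI driver on the cluster
  resources:
    requests:
      storage: 20Gi
```

The CSI driver behind the storage class provisions a matching volume from the underlying system, so the application manifest stays identical whether the backing store is Ceph, a SAN, or cloud block storage.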

Container storage patterns differ from VM patterns. Individual containers use less storage, but they exist in far greater numbers. A Kubernetes cluster might run 500 containers where a VMware cluster runs 50 VMs. Storage systems must handle frequent small volume operations efficiently. Distributed storage systems designed for cloud-native workloads suit container platforms better than traditional SANs optimised for fewer, larger volumes.

Replication and mirroring

Storage replication maintains copies of data on separate storage systems, providing protection against storage system failure and enabling disaster recovery. Replication operates independently of backup; where backup creates point-in-time copies for recovery from data corruption or deletion, replication maintains current copies for rapid failover when primary storage fails.

Synchronous replication writes data to both primary and secondary storage before acknowledging the write to the application. The application experiences latency equal to the slower of the two paths. Synchronous replication guarantees zero data loss (RPO = 0) because no acknowledged write exists only on failed storage. The latency penalty limits synchronous replication to distances where round-trip time remains acceptable, practically under 100 km for latency-sensitive applications.

Asynchronous replication acknowledges writes after committing to primary storage, then transmits changes to secondary storage in the background. Applications experience only primary storage latency. The replication lag between primary and secondary creates a window of potential data loss; if primary fails, transactions committed after the last replicated point are lost. Asynchronous replication tolerates any distance, making it suitable for disaster recovery sites in different regions.

+------------------------------------------------------------------+
|                   REPLICATION TOPOLOGY OPTIONS                   |
+------------------------------------------------------------------+
ONE-TO-ONE (Simple DR)
+-------------+ +-------------+
| Primary | ------ sync/async ----> | Secondary |
| Site A | | Site B |
+-------------+ +-------------+
ONE-TO-MANY (Multiple DR targets)
+-------------+
+--> | Site B |
+-------------+ | +-------------+
| Primary | --------+
| Site A | | +-------------+
+-------------+ +--> | Site C |
+-------------+
CASCADED (Geographic distribution)
+-------------+ +-------------+ +-------------+
| Primary | ----> | Regional | ----> | Remote |
| Site A | sync | Site B | async | Site C |
+-------------+ +-------------+ +-------------+
ACTIVE-ACTIVE (Bidirectional)
+-------------+ +-------------+
| Site A | <----- sync/async ----> | Site B |
| (active) | | (active) |
+-------------+ +-------------+
| |
v v
Local users Local users

Figure 4: Storage replication topology patterns

Replication topology describes the relationship between storage systems. One-to-one replication protects primary storage with a single replica. One-to-many replication distributes copies to multiple targets, enabling both local high availability and remote disaster recovery from the same primary. Cascaded replication chains sites together, reducing primary site bandwidth requirements for distant targets. Active-active replication enables reads and writes at both sites, with conflict resolution handling simultaneous updates.

Replication bandwidth requirements depend on data change rate and desired RPO. An application writing 100 GB daily requires 9.3 Mbps sustained bandwidth for synchronous replication (100 GB / 86,400 seconds × 8 bits). Bursts during peak activity may require significantly higher instantaneous bandwidth. Asynchronous replication tolerates lower bandwidth by queuing changes during peaks, but the queue must drain before it exhausts buffer space.
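That arithmetic generalises into a small sizing helper (the 100 GB/day figure is the text's example; the business-hours window is an assumption illustrating peak demand):

```python
def replication_bandwidth_mbps(daily_change_gb, window_hours=24):
    """Sustained bandwidth (Mbps) needed to replicate a daily change volume."""
    bits = daily_change_gb * 1e9 * 8  # change volume in bits
    seconds = window_hours * 3600     # window over which changes are transmitted
    return bits / seconds / 1e6

sustained = replication_bandwidth_mbps(100)  # ≈ 9.3 Mbps spread over 24 hours
# If the same changes are generated within an 8-hour working day, synchronous
# replication must keep pace with that shorter window instead
peak = replication_bandwidth_mbps(100, window_hours=8)  # ≈ 27.8 Mbps
```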

Capacity planning

Capacity planning forecasts storage requirements and schedules procurement to maintain adequate headroom. Reactive approaches risk outages when storage exhausts unexpectedly; proactive planning maintains safety margins without excessive overprovisioning.

Current state assessment establishes the planning baseline. Document total provisioned capacity, allocated capacity, and actual utilisation across all storage systems. Calculate the difference between allocated and utilised to understand thin provisioning exposure. Identify growth trends by comparing utilisation over the past 12 months.

Growth projection extrapolates future requirements. Historical growth rate provides the simplest projection: if storage grew 25% last year, assume 25% growth next year. Business factors may accelerate or decelerate growth: new programmes, office openings, or digital transformation initiatives increase requirements; system retirements or data archival reduce them. Apply growth rate to current utilisation rather than allocated capacity to avoid compounding thin provisioning assumptions.

Headroom policy defines acceptable utilisation thresholds. Storage performance degrades as capacity approaches exhaustion; most systems should maintain at least 20% free capacity. Procurement lead times affect required headroom: if acquiring new storage requires 90 days, plan to initiate procurement before reaching 80% utilisation.

A worked example illustrates the methodology. An organisation operates 50 TB of SAN storage, currently 35 TB allocated with 28 TB actual utilisation (56% of total, 80% of allocated). Historical growth averaged 30% annually. Projecting forward: 28 TB × 1.30 = 36.4 TB utilisation in 12 months. At an 80% maximum utilisation threshold, the organisation needs 36.4 / 0.80 = 45.5 TB total capacity, so the current 50 TB suffices for the next year. However, if growth continues, year two requires 36.4 × 1.30 / 0.80 = 59 TB, exceeding current capacity. Utilisation crosses the 80% threshold on existing capacity (40 TB of the 50 TB) roughly 16 months out, so with a 90-day procurement lead time, procurement should begin approximately 13 months from now.
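The projection can be scripted so it reruns as monitoring data changes; a minimal sketch using the worked example's figures (the threshold and lead time parameters are the planning assumptions stated above):

```python
import math

def capacity_plan(current_tb, growth_rate, total_tb,
                  max_util=0.80, lead_time_months=3):
    """Project compound utilisation growth and the procurement start point."""
    year1 = current_tb * (1 + growth_rate)   # projected utilisation after 1 year
    year2 = year1 * (1 + growth_rate)        # after 2 years
    # Capacity required to keep each projection under the utilisation ceiling
    needed_year1 = year1 / max_util
    needed_year2 = year2 / max_util
    # Months until utilisation reaches the ceiling on existing capacity
    threshold_tb = total_tb * max_util
    months = 12 * math.log(threshold_tb / current_tb) / math.log(1 + growth_rate)
    return {
        "year1_util_tb": round(year1, 1),
        "needed_year1_tb": round(needed_year1, 1),
        "needed_year2_tb": round(needed_year2, 1),
        "start_procurement_months": round(months - lead_time_months, 1),
    }

plan = capacity_plan(current_tb=28, growth_rate=0.30, total_tb=50)
# year-one utilisation 36.4 TB; begin procurement roughly 13 months out
```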

Monitor these metrics continuously:

Metric                  | Alert threshold    | Critical threshold  | Action
------------------------|--------------------|---------------------|--------------------------------
Total utilisation       | 70%                | 85%                 | Plan expansion
Thin provisioning ratio | 150%               | 200%                | Review allocation
Growth rate deviation   | +50% from baseline | +100% from baseline | Investigate cause
IOPS utilisation        | 70%                | 85%                 | Plan performance tier expansion
Snapshot consumption    | 15% of capacity    | 25% of capacity     | Review retention policies

Table 2: Storage capacity monitoring thresholds

Field office storage considerations

Field offices present unique storage challenges. Limited bandwidth constrains synchronisation with headquarters. Unreliable power threatens data integrity. Physical security may be compromised. Equipment support is distant. Storage architecture must address these constraints while maintaining data availability and protection.

Local storage with replication suits offices generating data that must remain accessible during connectivity outages. A local NAS device stores working files, with replication to headquarters during connectivity windows. Users experience local-speed access regardless of WAN conditions. Replication priority rules ensure critical data synchronises first when bandwidth is limited.

Caching appliances address bandwidth constraints for read-heavy workloads. The appliance stores frequently accessed files locally while fetching less common files from central storage on demand. Users perceive fast access to common files without duplicating entire datasets. Caching suits offices primarily consuming rather than producing data.

Offline synchronisation handles intermittent connectivity scenarios. When connectivity exists, systems synchronise bidirectionally. During outages, work continues against local copies. Conflict resolution handles cases where the same file changed in multiple locations. Tools like Syncthing provide open-source file synchronisation with conflict detection.

Storage devices for field deployment require environmental resilience. Consumer-grade NAS devices fail in temperature extremes, high humidity, and dusty conditions. Industrial or ruggedised devices withstand harsher environments. See Ruggedised and Environmental for equipment specifications.

Power protection is mandatory for field storage. Sudden power loss corrupts filesystems and damages drives. Uninterruptible power supplies provide runtime for graceful shutdown. Battery-backed write caches prevent data loss from writes in progress during power failure. Size UPS capacity for both storage device load and expected outage duration in the deployment context.

Technology options

Open-source storage solutions provide capable alternatives to commercial products across all storage patterns.

TrueNAS (formerly FreeNAS) delivers enterprise NAS capabilities on commodity hardware. ZFS provides built-in data integrity checking, snapshots, replication, and compression. TrueNAS SCALE adds Kubernetes support and distributed storage capabilities. The platform suits organisations with Linux administration skills seeking NAS without licensing costs.

Ceph provides distributed storage delivering block, file, and object interfaces from a unified cluster. Ceph scales horizontally by adding nodes, handling multi-petabyte deployments. The learning curve is steep, and operational complexity is significant, but large deployments justify the investment. Ceph integrates with Proxmox VE for hyper-converged virtualisation.

MinIO offers S3-compatible object storage with straightforward deployment. Single-node MinIO suits development and small production workloads; distributed MinIO scales for enterprise requirements. Close S3 API compatibility simplifies application integration.

GlusterFS aggregates storage across commodity servers into scalable file storage. GlusterFS suits large file storage requirements with simpler operations than Ceph, though it lacks Ceph’s block storage capabilities.

Commercial storage vendors offer nonprofit programmes reducing acquisition costs. Evaluate total cost including support, training, and operational overhead rather than licensing alone. Organisations without Linux systems administration expertise may find commercial solutions more economical despite higher licensing costs.

| Solution | Type | Strengths | Operational requirements |
| --- | --- | --- | --- |
| TrueNAS Core | File/block (ZFS) | Data integrity, snapshots, web UI | Moderate: FreeBSD administration |
| TrueNAS SCALE | File/block/object | Linux-based, Kubernetes support | Moderate: Linux administration |
| Ceph | Block/file/object | Horizontal scaling, resilience | High: distributed systems expertise |
| MinIO | Object (S3) | S3 compatibility, simple deployment | Low: container deployment |
| GlusterFS | File | Horizontal scaling, simple model | Moderate: Linux administration |

Table 3: Open-source storage solution comparison

Implementation considerations

For organisations with limited IT capacity

Single-server deployments with DAS provide the simplest architecture. Internal drives or a directly attached RAID array require no storage networking expertise. ZFS on Linux or TrueNAS on dedicated hardware adds enterprise features (snapshots, compression, integrity checking) without significant complexity. This approach serves organisations with under 5 TB of data and single-server infrastructure.
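When choosing a drive layout for such a deployment, the trade-off between usable capacity and fault tolerance is simple arithmetic. This sketch covers common ZFS vdev layouts and deliberately ignores filesystem overhead, so real usable space is somewhat lower.

```python
def usable_tb(layout: str, drives: int, drive_tb: float) -> tuple[float, int]:
    """Approximate usable capacity (TB) and number of drive failures
    tolerated for common ZFS vdev layouts. Ignores metadata and
    reservation overhead."""
    if layout == "mirror":
        return drive_tb, drives - 1          # every drive holds a full copy
    if layout == "raidz1":
        return (drives - 1) * drive_tb, 1    # one drive's worth of parity
    if layout == "raidz2":
        return (drives - 2) * drive_tb, 2    # two drives' worth of parity
    raise ValueError(f"unknown layout: {layout}")

# A two-drive mirror of 4 TB drives: 4 TB usable, survives one failure
# A four-drive raidz1 of 4 TB drives: 12 TB usable, survives one failure
```

For the small deployments described here, a simple mirror is usually the right answer: it rebuilds fastest and is easiest to reason about during a failure.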

When shared storage becomes necessary, NAS appliances provide file sharing without SAN complexity. Entry-level TrueNAS or Synology hardware handles 10-20 users with minimal administration. Prioritise reliability features (mirrored drives, battery-backed cache) over raw capacity. Budget for annual drive replacements based on manufacturer specifications.

Cloud storage for file sharing eliminates on-premises storage entirely for organisations with reliable connectivity. Services like Nextcloud (self-hosted) or SharePoint/Google Drive (commercial) move storage operations to platforms with built-in redundancy. Evaluate bandwidth requirements: 50 users each accessing 100 MB daily generate 5 GB of daily transfer, which averages under 0.5 Mbps over 24 hours but approaches 4 Mbps sustained if concentrated into a roughly three-hour peak window.
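The bandwidth estimate above generalises to any user count and usage pattern. The peak-window length is the key assumption, since office traffic rarely spreads evenly over 24 hours.

```python
def sustained_mbps(users: int, mb_per_user_per_day: float,
                   peak_window_hours: float) -> float:
    """Sustained link rate (Mbps) if each day's transfers concentrate
    into a peak window. peak_window_hours is an assumption about
    usage patterns, not a measured value."""
    bits_per_day = users * mb_per_user_per_day * 1e6 * 8
    return bits_per_day / (peak_window_hours * 3600) / 1e6

# 50 users x 100 MB/day concentrated into a 3-hour peak window
rate = sustained_mbps(50, 100, 3)    # ~3.7 Mbps
# The same traffic spread over 24 hours averages ~0.46 Mbps
```

Measure actual usage once deployed; real workloads are often burstier than any single-window model suggests.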

For organisations with established IT functions

Shared storage enables virtualisation features and accommodates growth. iSCSI over dedicated VLANs provides SAN capability without Fibre Channel investment. Ensure network infrastructure supports storage traffic: 10 GbE minimum for production workloads, dedicated switches or VLANs preventing contention.
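The "10 GbE minimum" guidance can be sanity-checked against an expected workload: aggregate IOPS times block size gives the bandwidth the storage network must carry. The workload figures below are illustrative assumptions.

```python
def link_utilisation(iops: int, block_kib: int, link_gbps: float) -> float:
    """Fraction of link capacity consumed by a workload of `iops`
    operations per second, each transferring `block_kib` KiB.
    Ignores protocol overhead, so real utilisation is slightly higher."""
    bits_per_sec = iops * block_kib * 1024 * 8
    return bits_per_sec / (link_gbps * 1e9)

# 20,000 IOPS of 64 KiB I/O on a 10 GbE link
u = link_utilisation(20_000, 64, 10)   # ~1.05 -> link saturated
# The same workload on 1 GbE would need more than ten links
```

Note how quickly large-block workloads saturate a link: IOPS-heavy database traffic with 4-8 KiB blocks fits comfortably where backup or streaming traffic does not.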

Software-defined storage reduces hardware costs at the expense of operational complexity. Ceph or TrueNAS SCALE on commodity servers provides scalable storage with replication. Budget for the operational learning curve; allocate training time before production deployment. Start with non-critical workloads and expand as operational confidence develops.

Capacity planning formalisation prevents surprises. Establish quarterly capacity reviews, procurement thresholds, and growth forecasting. Document storage allocation policies to prevent sprawl. Implement chargeback or showback to make storage consumption visible to consuming departments.
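A quarterly review reduces to one question: how many quarters until utilisation crosses the procurement threshold? This sketch assumes compound quarterly growth; the growth rate and 80% threshold are placeholder assumptions to replace with observed figures.

```python
def quarters_until_threshold(used_tb: float, total_tb: float,
                             quarterly_growth: float,
                             threshold: float = 0.8) -> int:
    """Quarters until utilisation crosses the procurement threshold
    (default 80% full), assuming compound quarterly growth."""
    quarters = 0
    while used_tb / total_tb < threshold:
        used_tb *= 1 + quarterly_growth
        quarters += 1
        if quarters > 100:   # growth too slow to ever matter
            break
    return quarters

# 12 TB used of 24 TB, growing 10% per quarter:
q = quarters_until_threshold(12, 24, 0.10)   # crosses 80% in 5 quarters
```

When the answer drops below the procurement lead time (hardware quotes, approval, shipping), it is time to order, which is exactly what a formal threshold makes visible.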

For field deployments

Prioritise resilience over features. Simple architectures fail less frequently. A mirrored pair of drives in a ruggedised enclosure survives better than sophisticated storage systems sensitive to environmental conditions.

Plan for disconnected operation. Local storage must function indefinitely without headquarters connectivity. Synchronisation catches up when connectivity returns. Design conflict resolution procedures before deployment rather than discovering conflicts in production.

Document recovery procedures thoroughly. Field sites may lack technical staff. Step-by-step guides for common failures (drive replacement, filesystem recovery, replication restart) enable non-specialists to restore service without remote assistance.

See also