Voice and Telephony
Voice telephony systems transmit real-time audio communication between endpoints using either circuit-switched networks (traditional PSTN) or packet-switched networks (VoIP). For mission-driven organisations operating across headquarters, regional offices, and field locations, voice infrastructure decisions affect communication costs, reliability during emergencies, and integration with broader collaboration tools. The transition from legacy private branch exchange (PBX) systems to IP-based and cloud telephony introduces architectural choices that balance control, cost, and operational complexity.
- VoIP (Voice over Internet Protocol): Transmission of voice communication as digital packets over IP networks rather than dedicated circuit-switched telephone lines. VoIP converts analogue audio to digital data, packetises it, and transmits it using standard IP routing.
- PBX (Private Branch Exchange): A private telephone switching system that connects internal extensions to each other and provides access to external telephone networks. Traditional PBX systems use circuit-switched technology; IP PBX systems route calls over IP networks.
- SIP (Session Initiation Protocol): A signalling protocol for establishing, modifying, and terminating multimedia sessions including voice calls. SIP handles call setup, negotiation, and teardown but does not carry the actual voice data.
- RTP (Real-time Transport Protocol): A protocol for delivering audio and video over IP networks with mechanisms for timing reconstruction, loss detection, and payload identification. RTP carries the actual voice data during a call.
- PSTN (Public Switched Telephone Network): The global circuit-switched telephone network providing traditional landline telephone service. PSTN connections remain necessary for reaching subscribers without internet connectivity and for emergency services in most jurisdictions.
- SIP Trunk: A virtual connection between an IP PBX and the PSTN via a service provider, replacing physical telephone lines with IP-based connectivity. SIP trunks carry voice calls over the internet to and from the traditional telephone network.
- Codec: An algorithm that encodes and decodes audio for transmission. Voice codecs compress audio to reduce bandwidth requirements while maintaining acceptable quality. Different codecs trade bandwidth consumption against audio fidelity.
Voice technology architecture
Voice systems have evolved through three distinct generations, each remaining relevant in different organisational contexts. Traditional circuit-switched PBX systems connect to the PSTN via physical trunk lines and require dedicated wiring to each telephone handset. These systems provide reliable voice quality independent of data network conditions but incur significant costs for inter-office calls and lack integration with modern collaboration tools.
IP PBX systems replace circuit-switched technology with packet-based voice transmission over standard IP networks. An IP PBX maintains the call control intelligence on-premises but routes voice traffic alongside data traffic on existing network infrastructure. This consolidation eliminates separate voice cabling, enables software-based feature development, and reduces long-distance costs by routing inter-office calls over IP connections rather than PSTN trunks.
Cloud telephony moves call control entirely to service provider infrastructure, delivering voice capability as a service rather than a system to operate. Organisations connect endpoints directly to cloud platforms without maintaining PBX hardware or software. Cloud models reduce capital expenditure and operational burden but introduce dependencies on internet connectivity and service provider availability.
+-------------------------------------------------------------------+
|                    VOICE TECHNOLOGY EVOLUTION                     |
+-------------------------------------------------------------------+
|                                                                   |
|  TRADITIONAL PBX            IP PBX                CLOUD PBX       |
|                                                                   |
|  +-------------+        +-------------+        +-------------+    |
|  |     PBX     |        |   IP PBX    |        |    Cloud    |    |
|  |  Hardware   |        |   Server    |        |   Service   |    |
|  +------+------+        +------+------+        +------+------+    |
|         |                      |                      |           |
|  +------+------+        +------+------+        +------+------+    |
|  |  Analogue   |        | IP Network  |        |  Internet   |    |
|  |   Wiring    |        |             |        |             |    |
|  +------+------+        +------+------+        +------+------+    |
|         |                      |                      |           |
|  +------+------+        +------+------+        +------+------+    |
|  |  Analogue   |        |  IP Phones  |        | IP Phones/  |    |
|  |   Phones    |        | Softphones  |        | Softphones  |    |
|  +-------------+        +-------------+        +-------------+    |
|                                                                   |
|  PSTN Trunks            SIP Trunks             Provider PSTN      |
|  (physical lines)       (IP-based)             Gateway            |
|                                                                   |
|  Capital: High          Capital: Medium        Capital: Low       |
|  Operational: Low       Operational: Medium    Operational: Low   |
|  Control: Full          Control: Full          Control: Limited   |
+-------------------------------------------------------------------+
The choice between these architectures depends on existing infrastructure, IT operational capacity, connectivity reliability, and control requirements. Organisations with stable headquarters connectivity and limited IT staff benefit from cloud telephony’s reduced operational burden. Organisations requiring call routing customisation, integration with specialised systems, or operation during internet outages maintain on-premises IP PBX systems. Many organisations adopt hybrid approaches, operating IP PBX systems at headquarters while using cloud services for remote offices or mobile workers.
VoIP fundamentals
VoIP transmission involves three distinct phases: signalling to establish calls, media transport to carry voice data, and codec processing to encode and decode audio. Understanding these mechanisms enables informed decisions about network requirements, quality management, and troubleshooting.
Signalling with SIP
SIP operates as a request-response protocol similar in structure to HTTP. When a user initiates a call, their endpoint sends an INVITE request to the destination endpoint or a SIP proxy server that routes the request. The INVITE contains a Session Description Protocol (SDP) payload describing the calling endpoint’s media capabilities: supported codecs, IP address, and UDP port for receiving voice data.
The called endpoint responds with its own SDP payload indicating compatible media parameters. This offer-answer exchange negotiates which codec to use and where to send voice packets. After the called party answers, the endpoints begin exchanging RTP packets at the addresses and ports agreed in the SDP. In the simplest case this media flows directly between the endpoints, bypassing the signalling servers entirely, so SIP servers handle only call setup and teardown, not the actual voice traffic.
Caller                    SIP Server                    Callee
   |                           |                           |
   |----(1) INVITE + SDP------>|                           |
   |                           |----(2) INVITE + SDP------>|
   |                           |                           |
   |                           |<---(3) 180 Ringing--------|
   |<---(4) 180 Ringing--------|                           |
   |                           |                           |
   |                           |<---(5) 200 OK + SDP-------|
   |<---(6) 200 OK + SDP-------|                           |
   |                           |                           |
   |----(7) ACK--------------->|                           |
   |                           |----(8) ACK--------------->|
   |                           |                           |
   |<===================RTP Media Stream==================>|
   |                           |                           |
   |----(9) BYE--------------->|                           |
   |                           |----(10) BYE-------------->|
   |                           |                           |
   |                           |<---(11) 200 OK------------|
   |<---(12) 200 OK------------|                           |
   |                           |                           |
SIP uses port 5060 for unencrypted signalling and port 5061 for TLS-encrypted signalling. Production deployments should use TLS exclusively, as unencrypted SIP exposes call metadata and authentication credentials to network observers. SIP authentication typically uses digest authentication, where the server challenges the client with a nonce value and the client responds with a hash incorporating its password.
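The digest exchange can be illustrated with a minimal Python sketch. It assumes the basic MD5 form of digest authentication without the optional `qop` parameter; the username, realm, password, URI, and nonce are hypothetical values chosen for illustration, and real endpoints handle further parameters that are omitted here.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the lowercase hex MD5 digest of a UTF-8 string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username: str, realm: str, password: str,
                    method: str, uri: str, nonce: str) -> str:
    """Compute a basic digest response (MD5, no qop)."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # who is asking, and their secret
    ha2 = md5_hex(f"{method}:{uri}")                 # which request is being authorised
    return md5_hex(f"{ha1}:{nonce}:{ha2}")           # tied to the server's nonce


# Hypothetical values for illustration only.
response = digest_response(
    username="1001",
    realm="pbx.example.org",
    password="s3cret",
    method="REGISTER",
    uri="sip:pbx.example.org",
    nonce="abc123",
)
print(response)
```

The password itself never crosses the network, but the username, realm, and nonce do, which is one reason the text above recommends TLS for signalling.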
Media transport with RTP
RTP carries the actual voice data after SIP establishes the call. Each RTP packet contains a sequence number for reordering, a timestamp for playback timing, a payload type identifier indicating the codec, and the encoded audio data. RTP uses UDP rather than TCP because voice communication prioritises timely delivery over reliable delivery. A retransmitted packet arriving 200ms late provides no value for real-time conversation; the moment for playing that audio has passed.
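The header fields listed above occupy a fixed 12-byte layout at the start of every RTP packet. The following Python sketch packs and parses that fixed header; the helper names are hypothetical, and it assumes no padding, no header extension, and no contributing-source list.

```python
import struct

# Fixed RTP header: two flag bytes, 16-bit sequence number,
# 32-bit timestamp, 32-bit SSRC, all in network byte order.
RTP_HEADER = struct.Struct("!BBHII")


def pack_rtp_header(payload_type: int, sequence: int, timestamp: int,
                    ssrc: int, marker: bool = False) -> bytes:
    byte0 = 2 << 6  # version 2; padding, extension and CSRC count all zero
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return RTP_HEADER.pack(byte0, byte1, sequence & 0xFFFF,
                           timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def parse_rtp_header(packet: bytes) -> dict:
    byte0, byte1, sequence, timestamp, ssrc = RTP_HEADER.unpack(packet[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


# Payload type 0 is G.711 mu-law; 20ms of 8 kHz audio is 160 samples,
# so the timestamp advances by 160 per packet.
header = pack_rtp_header(payload_type=0, sequence=1, timestamp=160, ssrc=0x1234ABCD)
fields = parse_rtp_header(header + b"\x00" * 160)  # header plus a silent payload
print(fields)
```

A receiver uses `sequence` to detect loss and reordering and `timestamp` to schedule playback, which is why both survive even though UDP itself offers neither.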
A companion protocol, RTCP (RTP Control Protocol), runs alongside RTP to exchange statistics between endpoints. RTCP reports include packet loss rates, jitter measurements, and round-trip time calculations. These statistics enable endpoints to adapt quality dynamically and provide diagnostic information for troubleshooting.
RTP port numbers are negotiated dynamically through the SDP exchange during SIP setup, typically drawn from a configured range (commonly 10000-20000). This dynamic port allocation creates challenges for firewalls and NAT devices that must allow RTP traffic without knowing in advance which ports will be used. The solutions involve SIP Application Layer Gateways (ALGs) that inspect signalling to open appropriate ports, Session Border Controllers that proxy both signalling and media, or ICE (Interactive Connectivity Establishment) negotiation for NAT traversal.
Audio codecs
Codecs determine the bandwidth consumption and audio quality of voice calls. The G.711 codec, developed for traditional telephony, provides toll-quality audio at 64 kbps per direction with no compression delay. G.711 exists in two variants: μ-law used in North America and Japan, and A-law used elsewhere. Because G.711 requires consistent 64 kbps bandwidth, it suits office environments with reliable connectivity but wastes capacity in bandwidth-constrained field contexts.
The G.729 codec compresses voice to 8 kbps through predictive coding algorithms. This eightfold bandwidth reduction enables voice calls over limited connections but introduces 15ms of algorithmic delay and degrades audio quality during non-voice sounds like music or DTMF tones. G.729 historically required licensing fees for commercial use; its patents expired in 2017, so current implementations can generally be used without royalties.
The Opus codec, developed as an open standard, adapts dynamically from 6 kbps to 510 kbps based on available bandwidth and packet loss conditions. Opus provides superior quality to older codecs at equivalent bitrates and handles music and voice equally well. As a royalty-free codec, Opus has become the default for WebRTC-based communication and modern softphones.
| Codec | Bitrate (kbps) | Sample Rate (kHz) | Frame Size (ms) | Licensing |
|---|---|---|---|---|
| G.711 | 64 | 8 | 20 | Royalty-free |
| G.729 | 8 | 8 | 10 | Royalty-free (patents expired 2017) |
| G.722 | 64 | 16 | 20 | Royalty-free |
| Opus | 6-510 | 8-48 | 2.5-60 | Royalty-free |
| iLBC | 15.2/13.33 | 8 | 20/30 | Royalty-free |
Codec selection for a deployment involves matching bandwidth constraints to quality requirements. Field offices with satellite connectivity benefit from G.729 or low-bitrate Opus configurations. Headquarters with abundant bandwidth can use G.711 for toll quality with the lowest processing overhead, or a wideband codec such as G.722 or Opus for higher fidelity. Modern IP PBX systems negotiate codecs dynamically, selecting the most-preferred codec that both endpoints support.
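Dynamic negotiation can be pictured as intersecting two ordered codec lists. The sketch below is a simplification, not how any particular PBX implements it: `negotiate_codec` and the codec lists are hypothetical, and real negotiation operates on SDP payload types and parameters rather than bare names.

```python
from typing import Optional, Sequence


def negotiate_codec(local_preference: Sequence[str],
                    remote_offer: Sequence[str]) -> Optional[str]:
    """Return the first locally preferred codec that the remote side also offers."""
    offered = {name.lower() for name in remote_offer}
    for codec in local_preference:
        if codec.lower() in offered:
            return codec
    return None  # no common codec: the call cannot carry media


# Hypothetical preference lists (PCMA is G.711 A-law).
hq_preference = ["opus", "G722", "PCMA", "G729"]
field_phone_offer = ["G729", "PCMA"]

chosen = negotiate_codec(hq_preference, field_phone_offer)
print(chosen)  # PCMA
```

Ordering the local list differently per site is one way to prefer low-bitrate codecs on constrained links while keeping higher-fidelity codecs first at headquarters.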
IP PBX architecture
An IP PBX system consists of server software providing call control, endpoints registering to the server, and trunk connections to external networks. The call control server maintains a registration database mapping extensions to their current IP addresses, processes dial plan rules to route calls, and provides supplementary services like voicemail, call queuing, and conferencing.
Server components
The registration service accepts connections from endpoints and records their IP addresses, enabling the PBX to route incoming calls to the correct destination. Endpoints register periodically (commonly every 60-300 seconds) to confirm their continued availability and update their addresses if changed. Registration uses SIP REGISTER requests authenticated against a user database.
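A registration database of this kind is essentially a table of extension-to-contact bindings with expiry times. The following Python sketch illustrates the idea under stated assumptions: the `Registrar` class and its methods are hypothetical, authentication is omitted, and timestamps are passed in explicitly so the behaviour is easy to follow.

```python
import time
from typing import Dict, Optional, Tuple


class Registrar:
    """Toy registration table: extension -> (contact URI, expiry time)."""

    def __init__(self) -> None:
        self._bindings: Dict[str, Tuple[str, float]] = {}

    def register(self, extension: str, contact: str, expires: int,
                 now: Optional[float] = None) -> None:
        """Store or refresh a binding that lapses after `expires` seconds."""
        now = time.time() if now is None else now
        self._bindings[extension] = (contact, now + expires)

    def lookup(self, extension: str, now: Optional[float] = None) -> Optional[str]:
        """Return the current contact, or None if absent or expired."""
        now = time.time() if now is None else now
        binding = self._bindings.get(extension)
        if binding is None:
            return None
        contact, expiry = binding
        if now >= expiry:
            del self._bindings[extension]  # stale binding: endpoint stopped refreshing
            return None
        return contact


registrar = Registrar()
registrar.register("101", "sip:101@10.0.0.5:5060", expires=120, now=0)
print(registrar.lookup("101", now=60))   # still registered
print(registrar.lookup("101", now=121))  # expired without a refresh
```

Short expiry intervals mean a phone that loses power or connectivity disappears from the table quickly, at the cost of more frequent REGISTER traffic.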
The dial plan engine processes dialled numbers through pattern matching rules that determine how to route each call. A dial plan distinguishes internal extensions from external numbers, routes emergency calls to local services, selects appropriate trunks for geographic regions, and enforces dialling restrictions. For example, a dial plan might route extensions 100-199 to internal users, 9 followed by digits to external lines, and 112/999 directly to emergency services.
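The example dial plan above can be sketched as an ordered list of patterns. This is an illustrative Python model rather than the syntax of any real PBX; the `DIAL_PLAN` table, the `route` function, and the destination labels are hypothetical.

```python
import re
from typing import Tuple

# Rules are evaluated top to bottom, so emergency numbers are listed first:
# 112 would otherwise match the internal range and 999 the external prefix.
DIAL_PLAN = [
    (re.compile(r"^(112|999)$"), "emergency"),
    (re.compile(r"^1\d{2}$"), "internal"),   # extensions 100-199
    (re.compile(r"^9(\d+)$"), "external"),   # 9 followed by the outside number
]


def route(dialled: str) -> Tuple[str, str]:
    """Return (destination, number to send) for a dialled string."""
    for pattern, destination in DIAL_PLAN:
        match = pattern.match(dialled)
        if match:
            # Strip the access digit before handing external calls to a trunk.
            number = match.group(1) if destination == "external" else dialled
            return destination, number
    return "reject", dialled


for digits in ["150", "902079460000", "112", "999", "4000"]:
    print(digits, "->", route(digits))
```

Rule ordering is the design point worth noting: placing the emergency pattern first guarantees that those calls are never captured by a broader rule.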
+--------------------------------------------------------------------+
|                        IP PBX ARCHITECTURE                         |
+--------------------------------------------------------------------+
|                                                                    |
|  +------------------------+         +------------------------+     |
|  |       ENDPOINTS        |         |         TRUNKS         |     |
|  |                        |         |                        |     |
|  |  IP Phones             |         |  SIP Trunk Provider A  |     |
|  |  Softphones            |         |  SIP Trunk Provider B  |     |
|  |  Mobile Apps           |         |  PSTN Gateway          |     |
|  |  Conference Rooms      |         |  Inter-office SIP      |     |
|  +----------+-------------+         +----------+-------------+     |
|             |                                  |                   |
|             v                                  v                   |
|  +-----------------------------------------------------------+     |
|  |                       IP PBX SERVER                       |     |
|  |                                                           |     |
|  |  +----------------+  +----------------+  +-------------+  |     |
|  |  | Registration   |  | Dial Plan      |  | Feature     |  |     |
|  |  | Service        |  | Engine         |  | Services    |  |     |
|  |  |                |  |                |  |             |  |     |
|  |  | Extension DB   |  | Pattern Match  |  | Voicemail   |  |     |
|  |  | Authentication |  | Trunk Select   |  | IVR         |  |     |
|  |  | Presence       |  | Least Cost     |  | Queues      |  |     |
|  |  +----------------+  +----------------+  | Conference  |  |     |
|  |                                          | Recording   |  |     |
|  |  +----------------+  +----------------+  +-------------+  |     |
|  |  | CDR Database   |  | Configuration  |                   |     |
|  |  |                |  | Management     |                   |     |
|  |  | Call Records   |  | Web Interface  |                   |     |
|  |  | Billing Data   |  | API Access     |                   |     |
|  |  +----------------+  +----------------+                   |     |
|  +-----------------------------------------------------------+     |
|                                                                    |
+--------------------------------------------------------------------+
Feature services extend basic call routing with voicemail (recording messages when calls go unanswered), Interactive Voice Response (presenting menu options to callers), call queuing (distributing incoming calls to available agents), conferencing (mixing audio from multiple participants), and call recording (capturing conversations for training or compliance). These features operate as applications within the PBX software rather than separate systems.
Call Detail Records (CDRs) log every call’s source, destination, start time, duration, and disposition. CDR data enables cost allocation, usage reporting, and capacity planning. A 50-person office generating 200 calls daily produces approximately 6,000 CDR entries monthly, requiring nominal database storage but enabling valuable usage analysis.
High availability
Voice systems require high availability because telephony interruptions immediately affect operations and emergency contact capability. IP PBX high availability uses either active-passive clustering or distributed architecture depending on scale and requirements.
Active-passive clustering runs two PBX servers with synchronised configuration and shared storage. The active server handles all calls while the passive server monitors its health. Upon detecting failure, the passive server assumes the active role and takes over the shared IP address. Endpoints re-register to the new active server, typically restoring service within 30-60 seconds. In-progress calls drop during failover because call state exists only on the failed server.
Distributed architecture deploys multiple PBX nodes that each handle registrations and calls independently while sharing configuration through database replication. Load balancers distribute endpoint registrations across nodes. If one node fails, endpoints register to surviving nodes with no central coordination required. This architecture supports active-active operation where all nodes serve calls simultaneously, providing both availability and scalability.
For organisations with multiple sites, distributed architecture enables local call processing even when inter-site connectivity fails. Each site maintains a local PBX node handling registrations and internal calls. External trunks at each site provide PSTN access independent of headquarters connectivity. Configuration replication ensures all sites share dial plans and extension directories, but each site operates autonomously during network partitions.
Trunk connectivity
Trunk connections link IP PBX systems to external networks, enabling calls to reach the PSTN, mobile networks, and other organisations. Modern deployments use SIP trunking almost exclusively, though PSTN gateways remain necessary for locations without reliable internet connectivity or specific regulatory requirements.
SIP trunking
A SIP trunk provides virtual connectivity to a service provider that terminates calls to the PSTN or routes them to other networks. The IP PBX sends outbound calls to the provider’s Session Border Controller, which authenticates the connection, validates the calling number, and routes the call to the destination network. Inbound calls arrive from the provider directed to numbers assigned to the organisation.
SIP trunk capacity scales dynamically without physical line installation. Where traditional trunks required ordering additional circuits weeks in advance, SIP trunk capacity increases through provider configuration changes. Pricing models vary: per-minute rates work well for low-volume usage, while unlimited calling bundles suit organisations with predictable high volumes. A 50-person office using approximately 5,000 outbound minutes monthly might pay $150-300 with per-minute rates or $200-400 for unlimited service, depending on provider and geographic calling patterns.
+------------------------------------------------------------------+
|                      SIP TRUNKING TOPOLOGY                       |
+------------------------------------------------------------------+
|                                                                  |
|                           ORGANISATION                           |
|                                                                  |
|    +------------------------+     +------------------------+     |
|    |      HEADQUARTERS      |     |     BRANCH OFFICE      |     |
|    |                        |     |                        |     |
|    |  +----------------+    |     |    +----------------+  |     |
|    |  |     IP PBX     |    |     |    |     IP PBX     |  |     |
|    |  +-------+--------+    |     |    +-------+--------+  |     |
|    |          |             |     |            |           |     |
|    +----------|-------------+     +------------|-----------+     |
|               |                                |                 |
|               +---------------+----------------+                 |
|                               |                                  |
|                        WAN / Internet                            |
|                               |                                  |
|               +---------------+----------------+                 |
|               |                                |                 |
|    +----------|-------------+     +------------|-----------+     |
|    |          v             |     |            v           |     |
|    |  +----------------+    |     |    +----------------+  |     |
|    |  |   SIP Trunk    |    |     |    |   SIP Trunk    |  |     |
|    |  |   Provider A   |    |     |    |   Provider B   |  |     |
|    |  |   (Primary)    |    |     |    |   (Failover)   |  |     |
|    |  +-------+--------+    |     |    +-------+--------+  |     |
|    |          |             |     |            |           |     |
|    +----------|-------------+     +------------|-----------+     |
|               |                                |                 |
|               +---------------+----------------+                 |
|                               |                                  |
|                        +------v------+                           |
|                        |    PSTN     |                           |
|                        |   Network   |                           |
|                        +-------------+                           |
|                                                                  |
+------------------------------------------------------------------+
Geographic number portability allows organisations to retain existing telephone numbers when migrating to SIP trunking. The porting process transfers number ownership from the previous carrier to the SIP trunk provider, maintaining continuity for callers. Porting timelines range from 5 to 30 business days depending on jurisdiction and carrier cooperation. During transition, call forwarding from old numbers to temporary SIP trunk numbers maintains reachability.
Provider selection criteria include geographic coverage (local numbers and termination in operating regions), call quality (provider network quality and peering arrangements), pricing structure (setup fees, per-minute rates, monthly commitments), regulatory compliance (number provisioning rules, emergency services requirements), and technical compatibility (supported codecs, concurrent call capacity, security options).
Redundancy and failover
SIP trunk failures leave organisations unable to make or receive external calls, making redundancy essential for business continuity. Primary-secondary configurations route outbound calls to the primary provider and fail over to the secondary when the primary becomes unreachable. The IP PBX monitors trunk health through SIP OPTIONS polling or call failure detection, switching routing automatically upon failure.
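The primary-secondary logic can be sketched as a small health model. This is an illustrative Python outline only: the `Trunk` class, the `FAILURE_THRESHOLD` of three consecutive unanswered probes, and the function names are assumptions, and actually sending SIP OPTIONS requests is left out.

```python
from dataclasses import dataclass
from typing import List, Optional

FAILURE_THRESHOLD = 3  # assumed: consecutive unanswered probes before failover


@dataclass
class Trunk:
    name: str
    priority: int            # lower number is preferred
    failed_probes: int = 0   # consecutive unanswered OPTIONS probes


def record_probe(trunk: Trunk, answered: bool) -> None:
    """Update health after one probe; any answer resets the failure count."""
    trunk.failed_probes = 0 if answered else trunk.failed_probes + 1


def select_trunk(trunks: List[Trunk]) -> Optional[Trunk]:
    """Pick the most-preferred trunk that is still considered healthy."""
    healthy = [t for t in trunks if t.failed_probes < FAILURE_THRESHOLD]
    return min(healthy, key=lambda t: t.priority) if healthy else None


primary = Trunk("provider-a", priority=1)
secondary = Trunk("provider-b", priority=2)
trunks = [primary, secondary]

for _ in range(FAILURE_THRESHOLD):
    record_probe(primary, answered=False)   # primary stops responding
print(select_trunk(trunks).name)            # provider-b

record_probe(primary, answered=True)        # primary recovers
print(select_trunk(trunks).name)            # provider-a
```

Requiring several consecutive failures before switching avoids flapping between providers on a single lost probe, at the cost of a slightly slower failover.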
Inbound redundancy requires additional measures because external callers dial specific numbers hosted by specific providers. Solutions include routing numbers through SIP trunk providers that support geographic redundancy, implementing carrier-level failover that redirects numbers during outages, or maintaining parallel numbers on different providers with instructions to try alternative numbers if the primary fails.
Geographic distribution of trunk connections protects against regional internet outages. An organisation with headquarters in London and operations in Nairobi benefits from SIP trunk connections in both locations. Local trunks at each site provide PSTN access independent of intercontinental connectivity, and local number presence enables callers in each region to reach offices without international dialling.
Cloud telephony
Cloud telephony services provide voice communication without on-premises PBX infrastructure. The service provider operates all call control systems, trunk connections, and feature servers, delivering telephony as a service accessible through internet-connected endpoints.
Service models
Hosted PBX services replicate traditional PBX functionality in provider infrastructure. Each organisation receives a dedicated logical instance with customisable dial plans, extensions, and features. Configuration occurs through web portals rather than server administration. Hosted PBX suits organisations wanting PBX capabilities without operational burden, accepting reduced customisation flexibility.
UCaaS (Unified Communications as a Service) platforms integrate voice with video, messaging, and collaboration features. Platforms such as Microsoft Teams, Zoom, and RingCentral offer telephony as one component of broader communication suites. UCaaS telephony integrates tightly with other platform features: voicemails appear in messaging interfaces, call status updates presence indicators, and contacts synchronise across applications. The integration benefits suit organisations already committed to these platforms; others may find the bundled pricing unfavourable for voice-only needs.
CPaaS (Communications Platform as a Service) provides voice capability through APIs rather than user-facing applications. Developers integrate voice calling into custom applications using provider APIs for call initiation, routing, and recording. CPaaS suits organisations building communication features into programme management systems, beneficiary feedback platforms, or custom workflows rather than deploying standard telephone systems.
Network requirements
Cloud telephony requires consistently low-latency internet connectivity. Voice quality degrades noticeably above 150ms one-way latency and becomes unusable above 300ms. Packet loss above 1% causes audible degradation; above 3% renders calls difficult. Unlike data applications that tolerate variable performance, voice communication requires predictable quality during every call.
Bandwidth requirements for cloud telephony depend on codec selection and concurrent call count. A G.711 call consumes roughly 85 kbps in each direction including protocol overhead (IP, UDP, RTP, and link-layer headers at 20ms packetisation). G.729 reduces this to roughly 32 kbps. An office supporting 20 concurrent calls with G.711 therefore requires about 1.7 Mbps in each direction dedicated to voice traffic. QoS configuration prioritising voice packets over other traffic prevents bandwidth contention from affecting call quality.
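The per-call figures can be approximated from first principles. The sketch below assumes 20ms packetisation, IPv4, and plain Ethernet framing without VLAN tags or preamble; the function and constant names are hypothetical, and other link types carry different overhead, so the results land near rather than exactly on the rounded figures quoted above.

```python
IP_UDP_RTP_OVERHEAD = 40  # bytes per packet: 20 IPv4 + 8 UDP + 12 RTP
ETHERNET_OVERHEAD = 18    # bytes per packet: header plus frame check (assumed link type)


def call_bandwidth_kbps(codec_kbps: float, ptime_ms: int = 20,
                        l2_overhead: int = ETHERNET_OVERHEAD) -> float:
    """Approximate one-direction bandwidth for a single call, in kbps."""
    payload_bytes = codec_kbps * 1000 / 8 * ptime_ms / 1000
    packet_bytes = payload_bytes + IP_UDP_RTP_OVERHEAD + l2_overhead
    packets_per_second = 1000 / ptime_ms
    return packet_bytes * 8 * packets_per_second / 1000


g711 = call_bandwidth_kbps(64)  # about 87 kbps
g729 = call_bandwidth_kbps(8)   # about 31 kbps
print(f"G.711: {g711:.1f} kbps, G.729: {g729:.1f} kbps")
print(f"20 G.711 calls: {20 * g711 / 1000:.2f} Mbps per direction")
```

The calculation shows why low-bitrate codecs save less than their headline ratio suggests: with G.729 the 58 bytes of headers outweigh the 20-byte voice payload in every packet.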
Firewall configuration for cloud telephony involves permitting outbound connections to provider services and allowing return traffic on dynamically negotiated RTP ports. Many providers also support HTTPS-based connectivity that traverses firewalls cleanly, encapsulating signalling and media in standard web traffic. This encapsulation simplifies firewall management but makes it harder for network equipment to identify voice flows and apply QoS marking to them.
Voice quality management
Voice quality depends on network performance characteristics that differ from those affecting data applications. Users tolerate multi-second web page loads but notice 200ms of voice delay. Quality management involves measuring the factors affecting voice perception, configuring networks to protect voice traffic, and diagnosing quality problems when they occur.
Quality metrics
Latency measures the time for audio to travel from speaker to listener. One-way latency below 150ms feels like normal conversation. Latency between 150ms and 300ms creates awkward conversational rhythm with speakers talking over each other. Above 300ms, conversation becomes difficult and users compensate by speaking in turns rather than naturally. End-to-end latency accumulates from codec processing (typically 20-40ms), network transmission, jitter buffer delay, and any protocol processing.
Jitter measures variation in packet arrival times. A packet transmitted every 20ms might arrive at 18ms, 23ms, 19ms, and 25ms intervals due to varying network paths and queuing. Endpoints compensate using jitter buffers that hold packets before playback, absorbing arrival time variation. A 60ms jitter buffer tolerates 60ms of arrival time variation but adds 60ms to end-to-end latency. High jitter forces larger buffers, which increases latency; when jitter exceeds the buffer's depth, late packets are discarded and the buffer underruns, producing audio gaps.
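Endpoints commonly report jitter as a smoothed running estimate rather than a raw spread. The sketch below follows the interarrival jitter estimator described in the RTP specification (each new deviation contributes one sixteenth of its difference from the running value); the function name and the millisecond timestamps are illustrative assumptions.

```python
from typing import Iterable, Optional, Tuple


def interarrival_jitter(packets: Iterable[Tuple[float, float]]) -> float:
    """Smoothed jitter estimate from (sent_ms, received_ms) pairs.

    Only differences between transit times matter, so the sender and
    receiver clocks do not need to be synchronised.
    """
    jitter = 0.0
    previous_transit: Optional[float] = None
    for sent_ms, received_ms in packets:
        transit = received_ms - sent_ms
        if previous_transit is not None:
            deviation = abs(transit - previous_transit)
            jitter += (deviation - jitter) / 16.0  # exponential smoothing
        previous_transit = transit
    return jitter


# Packets sent every 20ms; the third one is delayed by an extra 4ms.
samples = [(0, 10), (20, 30), (40, 54)]
print(interarrival_jitter(samples))  # 0.25
```

The heavy smoothing means a single late packet barely moves the estimate, while sustained variation pushes it up steadily, which suits adaptive jitter buffers that should not overreact to isolated spikes.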
Packet loss occurs when congested routers discard packets, when link errors corrupt them, or when packets arrive too late for the jitter buffer to play them. Voice codecs employ concealment techniques that interpolate missing audio based on surrounding packets, making losses below 1% largely inaudible. Losses between 1% and 3% cause occasional clicks or brief gaps. Above 3%, concealment fails to maintain intelligibility. Unlike TCP applications where retransmission handles loss, real-time voice cannot wait for retransmitted packets.
The Mean Opinion Score (MOS) provides a single metric summarising voice quality on a 1-5 scale. A MOS of 4.0-4.5 represents toll quality comparable to traditional telephone service. MOS of 3.5-4.0 remains acceptable for business communication. Below 3.5, quality complaints increase significantly. Estimated MOS calculations derive from measurable factors (latency, jitter, loss, codec) without requiring subjective testing.
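Estimated MOS is usually derived in two steps: impairments are first combined into a transmission rating (the R-factor, on a 0-100 scale), which is then mapped to MOS. The Python sketch below shows only the second step, using the conversion from the ITU-T E-model; how latency, jitter, loss, and codec reduce R is deliberately left out, and the function name is hypothetical.

```python
def r_factor_to_mos(r: float) -> float:
    """Map an E-model transmission rating R (0-100) to an estimated MOS."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6


# 93.2 is the E-model's default rating with no additional impairments.
for r in (93.2, 80, 70, 50):
    print(f"R = {r:5.1f} -> MOS {r_factor_to_mos(r):.2f}")
```

Under this mapping an R of about 80 corresponds to a MOS of roughly 4.0 and an R of about 70 to roughly 3.6, which lines up with the toll-quality and acceptable-for-business bands described above.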
+------------------------------------------------------------------+
|                      VOICE QUALITY FACTORS                       |
+------------------------------------------------------------------+
|                                                                  |
|    +------------------------+                                    |
|    |      AUDIO SOURCE      |                                    |
|    |      (Microphone)      |                                    |
|    +-----------+------------+                                    |
|                |                                                 |
|                v                                                 |
|    +-----------+------------+                                    |
|    |     CODEC ENCODING     |----> Delay: 20-40ms                |
|    |     (Compression)      |      Quality: Codec-dependent      |
|    +-----------+------------+                                    |
|                |                                                 |
|                v                                                 |
|    +-----------+------------+                                    |
|    |    NETWORK TRANSIT     |----> Delay: 5-200ms+               |
|    |      (IP Routing)      |      Loss: 0-5%+                   |
|    +-----------+------------+      Jitter: 0-100ms+              |
|                |                                                 |
|                v                                                 |
|    +-----------+------------+                                    |
|    |     JITTER BUFFER      |----> Delay: 20-100ms               |
|    |      (Smoothing)       |      Compensates for jitter        |
|    +-----------+------------+                                    |
|                |                                                 |
|                v                                                 |
|    +-----------+------------+                                    |
|    |     CODEC DECODING     |----> Delay: 10-20ms                |
|    |    (Decompression)     |      Loss concealment              |
|    +-----------+------------+                                    |
|                |                                                 |
|                v                                                 |
|    +-----------+------------+                                    |
|    |      AUDIO OUTPUT      |                                    |
|    |       (Speaker)        |                                    |
|    +------------------------+                                    |
|                                                                  |
|    TOTAL ONE-WAY LATENCY: 55-360ms+ (target: <150ms)             |
|                                                                  |
+------------------------------------------------------------------+
QoS implementation
Quality of Service mechanisms prioritise voice traffic over less time-sensitive applications. Without QoS, a large file download can consume available bandwidth, forcing voice packets to queue and introducing delay and loss. QoS ensures voice packets receive preferential treatment during congestion.
DSCP (Differentiated Services Code Point) marking tags packets with priority indicators that network equipment respects. Voice signalling (SIP) receives DSCP value CS3 (24); voice media (RTP) receives EF (Expedited Forwarding, 46). Endpoints mark packets at transmission; routers and switches honour these markings when making queuing decisions. The marking persists only within networks configured to respect DSCP; ISP networks often ignore or reset markings.
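At the socket level, an endpoint marks packets by setting the IP type-of-service byte, whose upper six bits carry the DSCP value. The Python sketch below shows the conversion and a marked UDP socket; the helper names are hypothetical, and whether the marking takes effect depends on the operating system (some platforms ignore or restrict it) and on the network honouring it.

```python
import socket

DSCP_EF = 46   # Expedited Forwarding, used for voice media (RTP)
DSCP_CS3 = 24  # Class Selector 3, used for voice signalling (SIP)


def dscp_to_tos(dscp: int) -> int:
    """Shift a 6-bit DSCP value into the upper bits of the 8-bit TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0-63")
    return dscp << 2


def open_marked_udp_socket(dscp: int) -> socket.socket:
    """Create an IPv4 UDP socket whose outgoing packets carry the given DSCP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock


print(dscp_to_tos(DSCP_EF))   # 184 (0xB8)
print(dscp_to_tos(DSCP_CS3))  # 96 (0x60)
```

Packet captures display the full TOS byte, so EF-marked media appears as 0xB8 rather than 46, which is worth remembering when verifying markings during troubleshooting.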
Traffic shaping allocates guaranteed bandwidth to voice traffic. A WAN link configured with 500 kbps reserved for voice ensures that 15 concurrent G.729 calls (at roughly 32 kbps each) always have sufficient bandwidth regardless of other traffic. Shaping typically occurs at network edges where traffic exits to WAN links.
Local network QoS requires switch configuration to prioritise voice VLANs and wireless access points configured for WMM (Wi-Fi Multimedia) prioritisation. Voice phones often connect to dedicated voice VLANs separated from data traffic, simplifying QoS policy application. Wireless deployments face greater QoS challenges because Wi-Fi contention introduces variable delays that wired networks avoid.
Field deployment considerations
Voice communication over constrained connections requires adaptation to bandwidth limitations, high latency, and unreliable connectivity. Field offices connected via geostationary satellite experience round-trip latency of roughly 500-600ms (at least 240ms one-way) because of the distance to geostationary orbit, well above the 150ms one-way target. LEO satellite services reduce latency to 40-100ms but introduce variable quality during satellite handovers.
Codec selection significantly affects voice feasibility over limited connections. G.729 at 8 kbps enables voice calls over connections as slow as 32 kbps with quality sufficient for operational communication. Opus at 6 kbps further reduces requirements for extremely constrained links. These low-bitrate codecs sacrifice audio fidelity for accessibility.
Local call processing at field sites preserves internal voice communication during WAN outages. A local IP PBX or survivable gateway at the field site handles registrations and routes internal calls when headquarters connectivity fails. External calling during outages requires local trunk connections, either SIP trunks where internet remains available or PSTN gateways where local telephone service exists.
Echo becomes problematic on high-latency links: the longer the round trip, the more distinct the delayed echo sounds, and acoustic echo cancellation struggles with round-trip times above 100ms. Hardware endpoints with dedicated echo cancellation generally perform better than software-based solutions. Headsets remove the acoustic coupling between speaker and microphone, largely avoiding acoustic echo.
Integration with unified communications
Voice telephony increasingly operates as one component within unified communications platforms rather than as standalone infrastructure. Integration involves connecting IP PBX systems to collaboration platforms, synchronising presence information, enabling click-to-dial functionality, and providing consistent user experience across communication modes.
Microsoft Teams telephony integration occurs through Direct Routing, which connects an organisation’s SIP trunk to Teams via a Session Border Controller. Calls to and from PSTN numbers route through the SBC, which bridges between the SIP trunk and Teams. This approach preserves existing trunk relationships and number assignments while adding Teams as an endpoint. Alternatively, Microsoft Calling Plans provide telephony directly from Microsoft without external trunks, simplifying architecture but limiting geographic availability and control.
Open source IP PBX systems integrate with collaboration platforms through APIs and presence protocols. Asterisk and FreePBX publish call state through AMI (Asterisk Manager Interface), enabling external systems to display who is on calls, initiate calls from contact directories, and route calls based on presence information. Custom integrations connect voice systems to programme management applications, CRM platforms, and helpdesk systems.
Presence synchronisation ensures that availability shown in messaging applications reflects telephony status. A user on a voice call appears busy across all communication platforms, preventing simultaneous call attempts. XMPP (Extensible Messaging and Presence Protocol) provides a standard mechanism for presence exchange between systems, though many platforms use proprietary presence APIs requiring specific integration development.
Technology options
Open source platforms
Asterisk provides the most widely deployed open source PBX platform. Written in C for performance, Asterisk implements SIP, IAX2 (Inter-Asterisk eXchange), and legacy telephony protocols. The dial plan language enables sophisticated call routing logic. Asterisk requires Linux administration skills for deployment and lacks a built-in web interface, though FreePBX and other distributions add graphical configuration. Organisations with Linux expertise and requirements for customisation or integration find Asterisk capable of matching commercial PBX features.
FreePBX layers a web administration interface over Asterisk, making configuration accessible without dial plan scripting. Commercial modules extend functionality with conference management, endpoint provisioning, and call centre features. Sangoma, the company maintaining FreePBX, offers commercial support and hardware. FreePBX suits organisations wanting open source economics with reduced operational complexity.
Kamailio functions as a SIP proxy and registrar rather than a full PBX. Kamailio scales to handle millions of registrations and routes calls between endpoints or to media servers. Organisations building large-scale voice platforms or requiring SIP routing intelligence beyond PBX dial plans deploy Kamailio as the SIP layer with Asterisk handling media processing.
FusionPBX provides multi-tenant PBX capability built on the FreeSWITCH media server. The multi-tenant architecture suits service providers or organisations wanting to host separate PBX instances for different business units. FusionPBX offers a web interface comparable to FreePBX with different underlying architecture.
Commercial platforms with nonprofit programmes
3CX offers a software PBX with integrated video conferencing and team messaging. The free tier supports up to 10 users with core PBX functionality. Commercial licences scale with concurrent call capacity. 3CX runs on Linux, Windows, or as a cloud service. Nonprofit pricing reduces commercial licence costs by approximately 50%.
RingCentral provides UCaaS including voice, video, messaging, and contact centre. Nonprofit pricing through TechSoup and similar programmes reduces per-user costs. RingCentral suits organisations preferring cloud delivery without on-premises infrastructure and wanting integrated unified communications.
Microsoft Teams Phone integrates telephony with Teams collaboration. Nonprofit licensing through Microsoft 365 nonprofit programmes includes Teams; telephony add-ons require additional licensing. Teams Phone suits organisations already using Teams extensively, providing single-platform communication. Direct Routing integration enables use of existing SIP trunks rather than Microsoft Calling Plans.
Zoom Phone adds telephony to Zoom’s video conferencing platform. Per-user pricing with unlimited domestic calling suits organisations with predictable calling patterns. Nonprofit pricing through TechSoup reduces costs. Zoom Phone suits organisations already invested in Zoom for video conferencing.
| Platform | Type | Minimum Scale | Nonprofit Programme | Self-Hosted Option |
|---|---|---|---|---|
| Asterisk | Open source | Any | N/A | Yes |
| FreePBX | Open source | Any | N/A | Yes |
| 3CX | Commercial | Any (free tier up to 10 users) | Yes (~50% discount) | Yes |
| RingCentral | Cloud | 1 user | Yes (TechSoup) | No |
| Teams Phone | Cloud | 1 user | Yes (MS Nonprofit) | No |
| Zoom Phone | Cloud | 1 user | Yes (TechSoup) | No |
Implementation considerations
For organisations with minimal IT capacity
Cloud telephony services eliminate PBX administration while providing enterprise voice features. Microsoft Teams Phone or Zoom Phone suits organisations already using these platforms for other communication. Setup involves configuring user accounts, porting existing numbers, and deploying softphones or desk phones. Per-user cloud pricing creates predictable costs without capital expenditure. The trade-offs are reduced control over call routing customisation, dependency on internet connectivity for all calls, and ongoing subscription costs that may exceed those of self-hosted solutions over time.
For organisations requiring telephony independent of internet connectivity, mobile phones with organisational SIM cards provide simpler voice communication than maintaining PBX infrastructure. Group calling plans from mobile carriers offer unlimited internal calling and predictable external call costs.
For organisations with established IT capacity
On-premises IP PBX systems provide maximum control and customisation. FreePBX on modest hardware supports 50 to 100 extensions with voicemail, conferencing, and call queuing. SIP trunk connections from multiple providers enable cost optimisation and redundancy. Integration with Active Directory or other identity systems synchronises user provisioning. This approach requires Linux administration skills for ongoing maintenance, security patching, and troubleshooting.
Hybrid architectures combine on-premises control at headquarters with cloud services for distributed locations. The headquarters IP PBX provides feature-rich telephony with full customisation. Branch offices and remote workers use cloud telephony or softphones connecting to headquarters. SIP trunking at both headquarters and cloud platforms provides redundancy and local presence.
For field deployments
Field sites with reliable internet connectivity benefit from softphones or IP phones connecting to the headquarters PBX or to cloud services. Bandwidth requirements remain modest (around 40 kbps per call with efficient codecs), but voice is sensitive to latency, so sites need reasonably direct internet paths.
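The per-call bandwidth figure follows from the codec bitrate plus the fixed header cost of every RTP packet. The sketch below works through that arithmetic for each direction of a call, assuming IPv4, 20 ms packetisation, and no header compression; the function name and the 24 kbps codec rate are illustrative assumptions, while the G.711 (64 kbps) and G.729 (8 kbps) rates are the standard ones.

```python
# Fixed per-packet headers: IPv4 (20 bytes) + UDP (8 bytes) + RTP (12 bytes).
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12


def call_bandwidth_kbps(
    codec_bitrate_kbps: float,
    packet_interval_ms: float = 20.0,
    link_overhead_bytes: int = 0,
) -> float:
    """Estimate one-direction bandwidth of a single call in kbps.

    link_overhead_bytes covers layer 2 framing (for example 18 bytes for
    Ethernet) and defaults to 0, giving the IP-layer figure.
    """
    packets_per_second = 1000.0 / packet_interval_ms
    # Bytes of encoded audio carried in each packet.
    payload_bytes = codec_bitrate_kbps * 1000 / 8 * packet_interval_ms / 1000
    packet_bytes = payload_bytes + IP_UDP_RTP_HEADER_BYTES + link_overhead_bytes
    return packet_bytes * 8 * packets_per_second / 1000


for name, bitrate in [("G.711", 64), ("G.729", 8), ("24 kbps codec", 24)]:
    print(f"{name}: {call_bandwidth_kbps(bitrate):.1f} kbps at the IP layer")
```

With these assumptions, header overhead is a fixed 16 kbps per direction, which is why a low-bitrate codec still needs noticeably more than its nominal rate. Multiplying the result by the expected number of concurrent calls gives a first estimate of the voice capacity a site link needs.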
Field sites with intermittent connectivity require local survivability. A small IP PBX or survivable gateway at the site handles local calls when WAN connectivity fails. Local SIP trunk connections or PSTN gateways provide external calling capability independent of headquarters. Configuration replication ensures dial plans remain synchronised during connected periods.
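The survivability behaviour can be expressed as a simple routing decision made by the site's local PBX or gateway. This is a sketch of the logic only: the route labels, the function name, and the way WAN health is passed in are hypothetical, and a real deployment would implement the equivalent in the PBX dial plan or gateway configuration.

```python
def select_route(dialled: str, wan_up: bool, local_extensions: set[str]) -> str:
    """Choose where a call from a field site should be sent.

    While the WAN is up, headquarters handles all call routing. When it
    fails, local extensions are connected on site and everything else
    leaves through the site's own trunk or PSTN gateway.
    """
    if wan_up:
        return "headquarters"
    if dialled in local_extensions:
        return "local-pbx"
    return "local-trunk"


# Hypothetical site with three extensions.
site_extensions = {"201", "202", "203"}
print(select_route("202", wan_up=True, local_extensions=site_extensions))
print(select_route("202", wan_up=False, local_extensions=site_extensions))
print(select_route("+15550100", wan_up=False, local_extensions=site_extensions))
```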
Satellite-connected sites face latency challenges that affect conversation quality regardless of bandwidth. Hardware phones with strong echo cancellation perform better than softphones. Setting expectations with users about conversation rhythm helps adoption. For critical communication, dedicated satellite phones separate from data connectivity provide reliable voice channels.
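The reason bandwidth cannot fix satellite latency is that the delay comes from distance. As a rough illustration, assuming a geostationary satellite at an altitude of about 35,786 km positioned directly overhead (the best case), the propagation delay alone can be estimated as follows. The constant and function names are chosen for this sketch, and real links add processing, queuing, and codec delay on top.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458
GEO_ALTITUDE_KM = 35_786.0  # assumption: geostationary orbit, satellite overhead


def one_way_delay_ms(altitude_km: float = GEO_ALTITUDE_KM) -> float:
    """Propagation delay for ground -> satellite -> ground, in milliseconds."""
    path_km = 2 * altitude_km  # uplink plus downlink
    return path_km / SPEED_OF_LIGHT_KM_S * 1000


delay = one_way_delay_ms()
print(f"One-way propagation delay: {delay:.0f} ms")
print(f"Round trip (question and reply): {2 * delay:.0f} ms")
```

Under these assumptions the one-way delay is already well above the 150 ms one-way figure that ITU-T G.114 associates with comfortable conversation, which is why users notice pauses and talk over each other on such links.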