Delivering Ultra-Reliable and Available Communications from the Cloud is a Process
August 18, 2016
In my last post, I described the various attributes that can impact your business communications either positively (with the right vendor) or negatively (with the wrong vendor). Now, I’ll answer that $64,000 question with a $1M answer:
- Management Philosophy: PanTerra's management comes from a telecom heritage that includes Bell Laboratories, the preeminent creator of the original analog telephone network, which is still in use today and is regarded as one of the most reliable networks in the world. Many of the architectural, design and operational philosophies of that network have been updated and applied to the ultra-reliable IP communications infrastructure that PanTerra has built. While we pride ourselves on delivering innovative cloud services that make our customers more competitive, we are equally, if not more, proud of our relentless pursuit of "the perfect" cloud infrastructure, one able to deliver unlimited scalability with virtually 100% reliability and availability.
- Operational Methodology: Our management philosophy matches our operational methodology. In other words, we practice what we preach. It starts with rigorous operating procedures and processes for all our services, which are designed to prevent human error from impacting operational stability. Error detection and recovery procedures are reviewed and practiced weekly and monthly. Live 24/7 monitoring of all critical components, including carrier bandwidth, registrations, concurrent media processes, call and quality metrics, hardware metrics and network health, ensures that if an anomaly does occur, action can be taken swiftly and effectively. Because PanTerra is both the technology provider and the service provider, operations personnel have direct access to development engineers at any time to address escalated issues immediately.
- Software: The service is only as reliable as the underlying software. Developing ultra-reliable service software requires special coding skills and years of experience. PanTerra has over 400 person-years of development in the WorldSmart solution and uses a continuous improvement methodology to identify bugs early and throughout the life of a feature. PanTerra also maintains four separate pre-production networks to test and QA software before it is released to production.
- Data Center: PanTerra operates completely redundant, hardened and secure data centers that are SAS 70 compliant. Redundancy extends beyond server and network hardware and includes power systems (with dual backup generators), carrier networks, and dual HVAC systems. All data centers must have multiple tier 1 carrier connections as well. Equally important are the locations of PanTerra's data centers. PanTerra does extensive analysis of a data center location to make sure it has a very low disaster event index (the probability of a natural disaster impacting it). Thus PanTerra will not locate any data center in an earthquake, hurricane, tornado or flood zone. This site selection is one of many reasons PanTerra's operations will never be impacted by hurricanes, tornadoes, flooding or earthquakes.
- Connectivity: This is one of the most susceptible and least controllable (by PanTerra) components in the system, especially if the customer installs a self-managed WorldSmart cloud service as opposed to a fully managed SentraCloud solution. In the fully managed case, PanTerra installs its own bandwidth, MPLS SmartBand, over which it has more control. In the self-managed case, the customer manages and maintains bandwidth connectivity. In either case, a single connection is a single point of failure. PanTerra's solution can offset such a failure with its disaster recovery re-routing capability, but implementing multiple connections is the preferred method for eliminating the single point of failure. PanTerra also monitors the connections in real time, including bandwidth utilization, registrations and QoS scores. Any anomalies detected are reviewed and addressed promptly.
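To see why multiple connections matter, here is a rough back-of-the-envelope sketch in Python. It assumes each link fails independently of the others, which real carrier links sharing a conduit or provider may not, and the 99% per-link figure is purely illustrative, not a PanTerra measurement. The function name `combined_availability` is made up for this example.

```python
from math import prod

def combined_availability(link_availabilities):
    """Availability of several links used in parallel.

    Assumes failures are independent: service is down only
    when every link is down at the same time.
    """
    all_links_down = prod(1 - a for a in link_availabilities)
    return 1 - all_links_down

# Illustrative numbers only: each link is assumed to be up 99% of the time.
single = combined_availability([0.99])
dual = combined_availability([0.99, 0.99])

print(f"single link: {single:.4%}")
print(f"dual links:  {dual:.4%}")
```

Under these assumptions, adding a second independent link cuts the expected outage fraction from 1 in 100 to 1 in 10,000. If the two links share a failure cause, the real gain will be smaller.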
PanTerra understands what really contributes to the reliability and availability of a cloud-based communications solution and has developed a complete approach to address each and every component in the system. PanTerra understands that the solution is only as good as its weakest link, so we are constantly driving to improve the reliability of all components in the system. The results speak for themselves: PanTerra has maintained virtually 99.999% uptime for its customers over the past 4 years. While we may not be perfect, we are committed to our quest for that perfect reliability number in the sky!
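For a sense of scale, an uptime percentage can be converted into a yearly downtime budget with simple arithmetic. The Python sketch below is illustrative only: the helper name `downtime_minutes_per_year` is hypothetical, and it assumes an average year of 365.25 days.

```python
# Average year length in minutes (365.25 days is an assumption for this sketch).
MINUTES_PER_YEAR = 365.25 * 24 * 60

def downtime_minutes_per_year(uptime_percent):
    """Minutes of allowed downtime per year at a given uptime percentage."""
    return (1 - uptime_percent / 100) * MINUTES_PER_YEAR

for uptime in (99.9, 99.99, 99.999):
    minutes = downtime_minutes_per_year(uptime)
    print(f"{uptime}% uptime allows about {minutes:.1f} minutes of downtime per year")
```

At 99.999%, the budget works out to a little over five minutes of downtime per year, compared with almost nine hours at 99.9%.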