Real-World Statistics on Managing Cloud Outage Risks

Shraddha Nair
Updated: Apr 19, 2024

TL;DR

Key Takeaways

As companies increasingly rely on cloud providers for operations (because of flexibility, accessibility, and scaling), they are exposed to the risk of downtime. The cloud is reliable and reduces the cost of running in-house servers and applications — but it isn’t perfect. And when it is down, everything comes to a grinding halt, especially for unprepared businesses. Fortunately, risk managers now have access to responsive downtime insurance products, among other solutions, that can effectively transfer cloud downtime exposures and overcome cloud outage risks.

Acknowledging the World’s Reliance on the Cloud

From individuals to businesses, the cloud has transformed how we store, access, and share data — from the peace of mind provided by cloud storage with automatic backups and synchronization across devices to facilitating real-time collaboration for teams across the globe.

In a recent survey, Parametrix, Founder Shield’s digital insurance partner and an insurance carrier providing coverage for cloud outages, found that 95% of corporate decision-makers believe their business depends on the cloud, both for day-to-day operations and for the ability to scale rapidly.

It’s hard to imagine today’s digital world without it: The cloud has reshaped how people consume entertainment (through streaming services like Netflix), how healthcare providers securely manage patient records, and how teachers and students share a seamless digital learning environment.

And there are countless examples of why companies should upgrade to the cloud, like Southwest Airlines’ system breakdown in December 2022. A major winter storm overwhelmed its flight scheduling system, which hadn’t changed much since the 1990s. As a result, the airline took a $325 million hit to revenue in the first quarter of 2023, highlighting the need for critical upgrades to the airline’s infrastructure.

However, even though many companies have moved to the cloud, it doesn’t matter how much redundancy engineers build into the cloud-based systems; downtime will continue. This is due to software, hardware, and infrastructure issues — like connectivity errors, equipment failure, cyberattacks, and insufficient resources for increased traffic — and human error, like misconfiguration during maintenance.

And these outages are even more likely to hit many businesses at once because two-thirds of the global supply of cloud services comes from just three providers: Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP).

Understanding Cloud Outage Risks

In a Parametrix survey of 324 US-based corporate decision-makers, 31% of respondents said eight hours of cloud downtime during business hours would be catastrophic. Around 50% said a downtime event upsets customers, increases churn, and causes lost revenue and sales.

These outages are not uncommon either: Parametrix monitored nearly 1,200 performance interruptions and disruptions in 2022, four times the number reported by the providers.

For example, despite 99.9% uptime and a “steady cadence of security enhancements,” software company Atlassian experienced an outage in April 2022 that lasted up to 14 days for a subset of customers. It was not caused by a cyberattack but by human error and a communication gap during the acquisition and integration of an Atlassian app for Jira Service Management and Jira Software. Then, in June 2022, a Cloudflare outage caused widespread chaos, affecting big names like Discord, DoorDash, Coinbase, and NordVPN.
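To put the 99.9% uptime figure above in perspective, here is a rough back-of-the-envelope sketch of how much downtime that percentage allows and how a 14-day outage compares. It assumes a 30-day month and a 365-day year; the function names are illustrative, and real SLAs define their own measurement windows.

```python
# Rough downtime budget implied by an uptime percentage.
# Assumption: 30-day month, 365-day year; actual SLAs define their own windows.

def allowed_downtime_hours(uptime_percent: float, period_days: float) -> float:
    """Hours of downtime permitted in a period at the given uptime percentage."""
    return period_days * 24 * (1 - uptime_percent / 100)


def effective_uptime_percent(outage_days: float, period_days: float = 365) -> float:
    """Uptime percentage over a period that contains a single outage."""
    return 100 * (1 - outage_days / period_days)


monthly_budget = allowed_downtime_hours(99.9, 30)    # roughly 0.72 hours (about 43 minutes)
yearly_budget = allowed_downtime_hours(99.9, 365)    # roughly 8.76 hours
outage_hours = 14 * 24                               # 336 hours for a 14-day outage
uptime_after_outage = effective_uptime_percent(14)   # roughly 96.2% over one year

print(f"99.9% uptime allows about {monthly_budget * 60:.0f} minutes per month")
print(f"99.9% uptime allows about {yearly_budget:.2f} hours per year")
print(f"A 14-day outage is {outage_hours} hours, or {uptime_after_outage:.1f}% yearly uptime")
```

Under these assumptions, a customer affected for the full 14 days would have used up many years' worth of a 99.9% downtime budget in a single incident.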

So, what are the kinds of impacts businesses can expect after outages like these?

Financial impact

According to Uptime Institute, the percentage of outages costing companies more than $1 million has increased from 11% to 15% since 2019. Much of that cost arises because outages shut down critical sales channels, preventing customers from making purchases. Legal fees, fines, and penalties also add up quickly, often falling between $1 million and $5 million.

Reputational impact

If potential customers see negative sentiment on social media about a company’s outage, it can drive them to look elsewhere. In early 2023, Microsoft experienced two major cloud outages in two weeks. Numerous Azure cloud services became inaccessible, including Outlook, Microsoft Teams, SharePoint Online, and OneDrive for Business. The outages were quickly covered by CNN, Reuters, and TechCrunch, exposing Microsoft’s shortcomings.

Legal impact

Businesses that rely on cloud services often have contracts containing service-level agreements (SLAs) that outline uptime guarantees and compensation for downtime, particularly in regulated industries like SaaS and fintech. Outages that exceed agreed-upon thresholds can lead to breach of contract, and companies may need to pay service credits or hard cash to customers.
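As a minimal sketch of how an SLA threshold can translate into service credits, the example below converts downtime into a monthly uptime percentage and looks up a credit. The tier table, the 30-day month, and the function names are hypothetical assumptions for illustration only; they are not the terms of any provider or contract.

```python
# Hypothetical SLA credit schedule: (minimum monthly uptime %, credit as % of monthly fee).
# These tiers are illustrative assumptions, not any provider's actual terms.
CREDIT_TIERS = [
    (99.9, 0),    # SLA met: no credit owed
    (99.0, 10),
    (95.0, 25),
    (0.0, 100),
]


def monthly_uptime_percent(downtime_minutes: float, days_in_month: int = 30) -> float:
    """Uptime percentage for a month with the given total downtime."""
    total_minutes = days_in_month * 24 * 60
    return 100 * (1 - downtime_minutes / total_minutes)


def service_credit(monthly_fee: float, downtime_minutes: float, days_in_month: int = 30) -> float:
    """Credit owed to the customer under the hypothetical tier table above."""
    uptime = monthly_uptime_percent(downtime_minutes, days_in_month)
    for min_uptime, credit_percent in CREDIT_TIERS:
        if uptime >= min_uptime:
            return monthly_fee * credit_percent / 100
    # Downtime longer than the month itself: refund the full fee.
    return monthly_fee


# An eight-hour outage (480 minutes) on a $10,000 monthly contract.
credit = service_credit(10_000, 480)
print(f"Credit owed: ${credit:,.2f}")
```

With these assumed tiers, eight hours of downtime in a month drops uptime to just under 98.9%, which lands in the 25% credit tier.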

Additionally, outages may trigger regulatory investigations, penalties, and legal action if sensitive information is compromised or lost. Companies may also face legal challenges related to negligence or failure to implement proper backup and recovery measures.

Operational risk

During downtime, cloud-based systems, servers, and networks are often offline or unavailable, so employees are involuntarily unproductive and unable to conduct business or service clients. The lack of internal or external communications can lead to a drastic drop in productivity.

Solutions to Managing Cloud Outage Risks

Leading companies and risk managers have already been mapping and understanding the potential impact of cloud outages on their businesses. But it is time all companies, large and small, recognize, mitigate, and transfer risk by:

Having downtime insurance: During the downtime of a public cloud provider, having insurance can mean hourly compensation for companies to cover any losses and expenses incurred. For example, Parametrix uses proprietary technology to monitor the availability and performance of top cloud service providers and assess each interruption’s impact on users. This allows Parametrix to support better comparison, indexing, and pricing, and to structure risk transfer products that react to actual disruptions and protect companies against lost revenues, customer compensation, and SLA liabilities.
Adopting a multi-cloud environment: If a cloud platform faces an outage, multi-cloud redundancy means that resources are duplicated on multiple clouds to at least provide fail-over to a second cloud platform. Turning to a multi-cloud strategy allows companies to work with several providers to minimize lags and delays and support an optimal user experience. Nevertheless, this approach can incur significant costs. Alternatively, businesses can opt for a hybrid cloud strategy, which combines the scalability advantages of a multi-tenant cloud with the data security and control offered by single-tenant clouds.
Performing regular backups: When downtime occurs, your backed-up data is your last resort. By securely backing up essential data, you can swiftly recover and resume normal functioning, even when disaster strikes. Make sure your IT department is testing your backups regularly to ensure your data recovery works.
Following best cybersecurity practices: Companies may feel the expenditure on redundancy and disaster recovery technologies isn’t warranted. But they should still keep devices up to date, monitor networks and devices, and train employees to reduce human error and increase their awareness of malware and ransomware. Having a disaster recovery plan, along with a site reliability engineering team that can locate and fix the problem quickly, is vital.
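The multi-cloud fail-over idea from the list above can be sketched roughly as follows: try providers in priority order and route to the first one that passes a health check. The endpoint URLs, the `pick_endpoint` helper, and the simulated health check are all hypothetical; a real setup would probe each provider's actual health endpoint and would usually sit behind DNS or a load balancer.

```python
from typing import Callable, Optional, Sequence


def pick_endpoint(
    endpoints: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first healthy endpoint in priority order, or None if all are down."""
    for endpoint in endpoints:
        try:
            if is_healthy(endpoint):
                return endpoint
        except Exception:
            # Treat a failing health check as "unhealthy" and try the next provider.
            continue
    return None


# Hypothetical endpoints on two different cloud platforms, in priority order.
ENDPOINTS = ["https://primary.example.com", "https://secondary.example.com"]


def simulated_check(endpoint: str) -> bool:
    # Stand-in for a real probe; simulates an outage at the primary provider.
    return "primary" not in endpoint


active = pick_endpoint(ENDPOINTS, simulated_check)
print(active)  # prints "https://secondary.example.com"
```

Passing the health check in as a function keeps the routing decision separate from how each provider is probed, which also makes the fail-over logic easy to test without a real outage.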

Understanding the details of what coverage your company needs to face a cloud outage can be confusing. Founder Shield specializes in knowing the risks your industry faces to make sure you have adequate protection. Feel free to reach out to us, and we’ll walk you through the process of finding the right policy.

Want to know more about downtime insurance? Please contact us at info@foundershield.com or create an account here to get started on a quote.