Splunk says the hidden costs of downtime are $400 billion

Image by Gerd Altmann from Pixabay

What is the hidden cost of downtime? According to a new report from Splunk and Oxford Economics, it costs the Global 2,000 companies as much as US$400 billion annually. The details are revealed in a global report titled “The Hidden Costs of Downtime.”

Putting the $400 billion into perspective, the report authors say, “It equates to 9 per cent of profits when digital environments fail unexpectedly.” If that is the cost to the Global 2,000, the wider cost to the global economy will be several times higher.

Gary Steele, President, Go-to-Market at Cisco & GM, Splunk (image credit - LinkedIn/Gary Steele)

Gary Steele, President of Go-to-Market, Cisco & GM, Splunk, said, “Disruption in business is unavoidable. When digital systems fail unexpectedly, companies not only lose substantial revenue and risk facing regulatory fines, they also lose customer trust and reputation.

“How an organization reacts, adapts and evolves to disruption is what sets it apart as a leader. A foundational building block for a resilient enterprise is a unified approach to security and observability to quickly detect and fix problems across their entire digital footprint.”

What is unplanned downtime and its costs?

The authors define unplanned downtime as “any service degradation or outage of a business system” and say it “can be a frustrating inconvenience or even a life-threatening scenario for customers. For companies, downtime inflicts real financial damage in the form of regulatory fines, lost revenue, overtime wages, and much more.”

The key thing to remember is that downtime costs more than the cash that leaves an organisation's pocket. It comprises several costs, some of which are harder to quantify.

In their analysis, the report authors say, “the consequences of downtime go beyond immediate financial costs and take a lasting toll on a company’s shareholder value, brand reputation, innovation velocity and customer trust.”

They break costs into two groups: direct and hidden costs.

Direct costs

These are costs that a business can quantify from its accounts and other data. They can be directly related to dealing with the cause(s) of the downtime.

Examples of direct costs are lost revenue ($49M), regulatory fines ($22M), missed SLA penalties ($16M) and settlement/legal costs ($15M). Other notable direct costs identified in the report include PR/Investor Relations ($13M), overtime wages ($11M), Cyber Insurance premiums ($10M), and recovering from backups ($9M).

The report also puts the cost of paying extortion at $8M, a surprisingly low number. Making this even more interesting, the report claims that 67% of CFOs advise management to pay the ransom.
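To put those itemised figures next to the headline number, here is a rough back-of-the-envelope sketch. It assumes the direct costs quoted above are per-company averages that can be compared with the US$400 billion spread evenly across the Global 2,000. The article does not confirm that basis, and the variable and function names are purely illustrative.

```python
# Direct costs quoted in the article, in millions of US dollars.
# Assumption: these are per-company averages (the article does not say).
DIRECT_COSTS_M = {
    "lost revenue": 49,
    "regulatory fines": 22,
    "missed SLA penalties": 16,
    "settlement/legal costs": 15,
    "PR/investor relations": 13,
    "overtime wages": 11,
    "cyber insurance premiums": 10,
    "recovering from backups": 9,
    "extortion payments": 8,
}

TOTAL_COST_M = 400_000  # US$400 billion expressed in millions
COMPANY_COUNT = 2_000   # the Global 2,000


def itemised_total(costs):
    """Sum the itemised direct costs (in $M)."""
    return sum(costs.values())


def average_per_company(total, companies):
    """Spread the headline total evenly across all companies (in $M)."""
    return total / companies


itemised = itemised_total(DIRECT_COSTS_M)
average = average_per_company(TOTAL_COST_M, COMPANY_COUNT)

print(f"Itemised direct costs: ${itemised}M")
print(f"Average cost per company: ${average:.0f}M")
print(f"Itemised share of the average: {itemised / average:.1%}")
```

Under those assumptions, the itemised direct costs add up to $153M against an average of $200M per company, which suggests the categories listed here account for most, but not all, of the headline figure.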

Hidden costs

These are not always obvious. Some may not have a clearly identifiable impact on the business, but the fact they are incurred makes them an issue. The report does not put a direct value on any of these costs. Instead, it relies on what respondents felt were costs that occurred due to downtime but which they couldn’t quantify.

Examples of hidden costs identified by respondents include diminished shareholder value (28%), stock price drops (1-9%), stagnant developer productivity (64%), delayed time-to-market (74%) and other costs.

However, there is an even more interesting part of the hidden costs, which the report authors mapped against the road to recovery. From the start of the downtime, they say it takes 60 days for brand health to recover. Revenue can take up to 75 days to recover, and the stock price can take as much as 79 days.

Downtime impacts the value of a business

With businesses moving online, they need to rethink how they monitor for downtime. The report states that “41% of technology executives admit customers are ‘often’ or ‘always’ the first to detect downtime.”

Not only do customers detect it, but in a social media-driven world, they are quick to make their displeasure known. 40% of CMOs reveal that downtime impacts average customer lifetime value (CLV), and another 40% say it damages reseller and partner relationships.

But there is a deeper impact on a business identified in the report: the impact on marketing teams. 92% of CMOs said it puts marketing teams at a disadvantage. That disadvantage costs significant sums to overcome. Marketing teams increase advertising to promote brand trust (67%). The same number said they pivoted marketing teams to crisis management, and 61% diverted their budget to crisis management.

What are the causes and possible solutions to downtime?

The report says that the number one cause of downtime is human error. Over half of the respondents put it as often or very often to blame. It also seems that organisations struggle to identify and remediate human error. The average time to detect an incident caused by human error is 17 to 18 hours. Shockingly, it can take 67 to 76 hours to recover from such incidents.

The top reasons given for the causes of downtime were:

  1. Cybersecurity-related human error
  2. ITOps-related human error
  3. Software failure
  4. Malware attack
  5. Hardware failure
  6. Phishing attack
  7. Third-party software outage

Finding the root cause of an error is essential. Without that, the error will occur again with the same likely consequences. In some industries, such as the auto industry, replacing a part is more practical than finding out what caused the failure. In technology, that is not an option.

Surprisingly, just 63% of respondents claim to always fix the root cause of a downtime incident. Even then, the report authors question how reliably organisations can find the root cause and remediate it to prevent a recurrence. They believe that the complexity of modern hybrid environments makes this difficult.

Of concern to all boards is that 54% of respondents said they sometimes intentionally leave the root causes of downtime unfixed. It means that problems can and probably will reoccur.

How do we fix this?

The second half of the report addresses this. The key message is to invest in smart technology to become resilient.

For those who think all they do is pour money into a cybersecurity black hole, the report says otherwise. 84% of CFOs say they get a solid ROI from security, ITOps and engineering investments. The report does not clarify whether that ROI is directly related to reduced downtime or whether the level of investment can be shown to reduce downtime losses.

Unsurprisingly, generative AI is considered part of the solution, with 65% of respondents using gen AI tools to address downtime. 75% of those claim to see a considerable benefit. The report’s authors say that domain-specific chat experiences show the most benefit.

Become a resilience leader

Those individuals who make their organisations more resilient are called Resilience Leaders in the report. It says they can fix problems much faster than those who are not resilient. Ransomware attacks are fixed 84% faster, deployment issues 57% faster and cybersecurity-related human issues 44% faster.

All of that translates into money and reduced direct costs. Lost revenue is reduced by $17M, regulatory fines by $10M, SLA payments by $10M, and ransomware payments by $7M. It should be noted that these are savings per incident.
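As a quick illustration of how those savings stack up, the sketch below totals the four figures quoted above and shows each category's share of the total. It uses only the numbers in the article, in millions of US dollars. The names are illustrative, and it does not compare against the earlier direct-cost figures because the article does not say they share the same basis.

```python
# Savings attributed to Resilience Leaders, in millions of US dollars,
# as quoted in the article (described there as savings per incident).
SAVINGS_M = {
    "lost revenue": 17,
    "regulatory fines": 10,
    "SLA payments": 10,
    "ransomware payments": 7,
}


def total_savings(savings):
    """Sum the quoted savings (in $M)."""
    return sum(savings.values())


def share_of_savings(savings):
    """Return each category's fraction of the total savings."""
    total = total_savings(savings)
    return {name: value / total for name, value in savings.items()}


for name, share in share_of_savings(SAVINGS_M).items():
    print(f"{name}: ${SAVINGS_M[name]}M ({share:.0%} of the total)")
print(f"Total: ${total_savings(SAVINGS_M)}M")
```

The four categories sum to $44M, with lost revenue the largest single contributor.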

Indirect costs are also slashed significantly. There is less impact on innovation, shareholder value and customer churn. Most importantly for marketing and other teams, the reputational impact is, at worst, moderate. That means less time and money spent repairing the damage and more on bringing in new business.

The report finishes with some pro tips for strengthening resilience. They are all obvious steps that should already be in place, so the question is, why are they not?

Enterprise Times: What does this mean?

There is much to absorb from this report, on both the negative and the positive side. However, the biggest takeaway is to become a resilience leader rather than a follower. Be forward-looking, take advantage of new technologies, and don’t be afraid to invest in new technologies.

In many ways, there is nothing new here. All three are good strategies for any business of any size, and one would expect them to be standard approaches across the Global 2,000. That they are being called out here suggests we need to revisit how we train those who lead our organisations.

If $400 billion is being lost by the Global 2,000, think of the money being lost across an even wider group of businesses. That money, of course, isn’t just lost. Much of it is absorbed by other businesses that help fix and remediate the impact of downtime.

However, the authors don’t define what percentage of that money still feeds into the global economy to support jobs. There could be a follow-up piece of research to see what is actually lost and what the impact would be on the wider ecosystem if there were no downtime.

