Downtime, outages and failures: Understand your true costs
- April 11, 2019
- Written by: Gad Cohen
This content is brought to you by Evolven. Evolven Change Analytics is a unique AIOps solution that tracks and analyzes all actual changes to the enterprise cloud environment. Evolven helps leading companies reduce the number of incidents, reduce problem resolution time and eliminate unauthorized changes. Learn more
When it comes to mission-critical applications or data center performance quality, companies are willing to invest heavily. Unfortunately, these investments are not always fully delivered.
Against system failure
Despite the effort invested in infrastructure resilience, many IT organizations continue to struggle with database, hardware and software failures lasting from a few minutes to several days, completely shutting down the business and causing large losses.
Expected downtime
The world of IT outages can seem strange at times.
Despite the variety of advanced solutions and the growing amount of data being collected by leading enterprise software vendors and IT departments - from ERP to CRM and more - outages remain a valid and serious threat to the industry.
On the other hand, IT outages have somehow become an inherently accepted, even expected, part of business life.
That's counterintuitive...
IT downtime review
While IT professionals face downtime from time to time and then focus their full attention on overcoming it, the business organization as a whole suffers "financial pain" from the impact, which is usually quite significant.
In the past, we've taken a closer look at the various ways IT downtime can impact business outcomes (you can read more about this here: Cost and scope of unplanned outages). In doing so, we considered different aspects, from direct sales losses and damage to reputation to indirect effects such as reduced productivity.
Now, I want to return to the topic and examine how organizations should address and assess threats to their IT operations, including systems, applications and data, by looking at robust (and established) benchmarks that represent the potential costs behind downtime and disruption.
System failures:
Measuring the failures of big brands
When should the industry start measuring the financial impact of major brand disruptions like the one that recently occurred at Facebook, the one that reached hundreds of thousands of Lloyds Bank customers, or the Jetstar failure that caused hundreds of flight delays?
In other words, at what point is an outage "significant enough" that a cost analysis becomes valuable for the industry to learn from and predict the impact of future outage incidents?
Well, apparently at some point the disruption creates an impact that, PR-wise, cannot be ignored. This is the point of no return, and it is followed by estimates of the financial impact.
The cost of downtime varies significantly between industries. The size of the affected company is of course a critical but not the only important factor. The role of IT systems in the company is also essential.
Defining a numeric value behind an IT outage means pre-defining its impact on various organizational and business aspects so the entire industry can learn and optimize accordingly.
A failure of a critical application can result in two different types of losses:
- Application service outage: The impact of downtime varies by application and organization;
- Data Loss: The potential loss of data due to a system failure can have significant legal and financial implications.
Well, I'm sure you'll agree that today's data centers should never go down; applications must remain available 24/7, and internal (let alone external) end users around the world must be confident that data centers are always available (for critical data and application availability).
Well, reality bites. This is not the case in the back office (i.e. within the data center). No organization enjoys 100% uptime. Should you try to achieve 100%? Of course. However, you also need to develop a deep understanding of the impact of downtime and ways to minimize it.
Worst outage nightmare in history? What probably happened to you...
Some past outages have turned into PR disasters, like the legendary Virgin Blue disaster in 2010 or the recent one that struck Facebook.
Why? The massive impact probably had something to do with it.
As a reminder, Virgin Blue's outage prevented passengers from boarding flights for 11 days (!!), resulting in negative press, a damaged reputation and millions in losses.
More specifically, Virgin Blue's reservation management company, Navitaire, eventually paid Virgin Blue more than $20 million in compensation (Navitaire's booking decision gives Virgin $20 million in compo).
There are many other incidents that still attract media attention. Here's a recent USA Today article on the Wells Fargo outage, which prevented customers from accessing their accounts for many hours.
It's safe to say that anyone in IT would agree that failures or disruptions are VERY bad for business. They are undesirable, very damaging financially and must be combated with all available means.
Configuration errors are critical
The IT Process Institute's Visible Operations Handbook has reported in the past that "80% of unplanned outages are due to poorly planned changes made by administrators ('operations staff') or developers" (Visible Operations).
Enterprise Management Associates reported that 60% of availability and performance failures are due to misconfigurations.
How much does it cost?
Web application downtime can cost organizations $5,600 per minute and up to $300,000 per hour (according to a 2014 Gartner analysis).
Average cost per hour of downtime for enterprise servers, worldwide, 2017-2018:
Source: Politically
Application maintenance costs are increasing at 20% annually. But that can't solve all your problems. Previous industry studies have found that at least a quarter of the downtime surveyed is due to configuration errors. (How much will you spend on app downtime this year?).
How common is downtime or disruption?
Granted, downtime can be a financial nightmare. That part is clear. But if you want to properly assess the risk potential of business disruption, the immediate question should be, "How likely is that?"
Source: Data Center Knowledge
Admittedly, failures are too common to rely on the thought, "I probably won't have a major failure." The question now is how to calculate the risk specific to your business.
Production and app downtime costs clarified
Unplanned outages are resolved by IT. However, as I mentioned earlier, these outages ultimately impact the entire organization.
An important part of a complete downtime risk assessment process is estimating how much money you will lose per hour (or minute, or whatever time interval you choose) due to the downtime.
For organizations that rely solely on the ability of data centers to provide IT and network services to customers, such as telcos or e-commerce companies, downtime can be particularly costly, with the highest cost for a single event reaching $1 million (more than $11,000 per minute) according to expert estimates.
In a USA Today survey of 200 data center managers, more than 80% said their downtime costs exceeded $50,000 per hour. More than 25% reported downtime costs of more than $500,000 per hour (!!).
According to another survey, while companies cannot achieve zero downtime, one in ten companies indicated that their availability should be greater than 99.999%.
Source: SearchCIO, TechTarget
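To put an availability target such as 99.999% into perspective, it helps to convert it into a yearly downtime budget. The following is a minimal sketch, assuming a 365-day year and round-the-clock service; the function name and the list of targets are illustrative and do not come from the survey.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years are ignored


def allowed_downtime_minutes(availability_pct: float) -> float:
    """Return the yearly downtime budget (in minutes) for an availability target."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)


if __name__ == "__main__":
    # Illustrative targets only; pick the ones that match your own SLAs.
    for target in (99.0, 99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% availability allows {minutes:.2f} minutes of downtime per year")
```

Under these assumptions, "five nines" leaves a budget of roughly five minutes per year, which shows how demanding that target is.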
To get a solid understanding of the impact of production and release downtime, let's take a look at how the consequences of downtime manifest themselves.
Downtime costs: per year or per incident?
A 2017 study found that 46% of 400 IT decision makers experienced more than four hours of IT-related downtime in 12 months; 23% said they incur costs between $12,000 and more than $1 million per hour.
More than 35% admitted they are unsure of the cost of a business interruption.
If you ask Delta Airlines, which had to cancel 280 flights due to disruptions in 2017, the losses from a single disruption incident could reach more than 150 million US dollars.
A few years ago, Dun & Bradstreet reported that 59% of Fortune 500 companies experience at least 1.6 hours of downtime per week.
If you take an average Fortune 500 company (or any company with at least 10,000 employees) and assume that it pays staff an average of $56 an hour, then (assuming every employee is idled for the full 1.6 hours) the labor cost of downtime for a company this size would be $896,000 per week, which equates to over $46 million per year (Assessing the financial impact of downtime).
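The arithmetic behind that estimate can be reproduced in a few lines. This is only a sketch: it uses the figures quoted above (10,000 employees, $56 per hour, 1.6 hours of downtime per week) and assumes, for simplicity, that every employee is fully idle during downtime and that the year has 52 weeks. The function name is hypothetical.

```python
def weekly_labor_cost(employees: int, hourly_cost: float, downtime_hours: float) -> float:
    """Labor cost of downtime per week, assuming all employees are idle."""
    return employees * hourly_cost * downtime_hours


if __name__ == "__main__":
    weekly = weekly_labor_cost(employees=10_000, hourly_cost=56.0, downtime_hours=1.6)
    annual = weekly * 52  # assumption: the same downtime repeats every week
    print(f"weekly: ${weekly:,.0f}")
    print(f"annual: ${annual:,.0f}")
```

Replacing the three inputs with your own headcount, loaded hourly cost and observed downtime gives a first-order labor figure; it deliberately ignores lost revenue and reputation effects.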
The reality is of course more complicated, since many parameters have to be taken into account, such as the time of the event (weekday or weekend? day or night?) and much more. However, understanding the cost of downtime will go a long way in assessing your potential risk and the return on investment from tools that can help minimize the impact of downtime.
Could the industry learn from the past and minimize collateral damage during an outage?
How have things changed since the past?
So we already know that downtime and outages are still happening today and that the industry is not yet able to eliminate them. But how have costs changed over time? Are these incidents less harmful today?
In 2010, a poll by Coleman Parkes found that IT downtime costs companies a total of more than 127 million hours per year in employee productivity, an average of 545 hours per company.
In 2009, the average cost of downtime varied significantly by industry, from about $90,000 per hour in the media industry to about $6.48 million per hour for major online brokers (How to quantify downtime).
According to a survey of IT managers over the years, companies are increasingly aware of the direct financial cost of computer failures. Research has found that one in five businesses is losing $12,000 an hour due to system downtime (How to quantify downtime).
As mentioned above, a subsequent analysis by Gartner in 2014 found average costs of $5,600 per minute and more than $300,000 per hour.
As early as 2004, a conservative estimate by Gartner put the cost of computer network downtime at $42,000 per hour. As a result, a company with 175 hours of downtime per year can lose more than $7 million per year. However, the cost of each disruption affects every business differently, so it's important to know how to calculate the exact financial impact (How to quantify downtime).
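As a quick check of these figures, the sketch below annualizes an hourly cost and converts a per-minute cost to a per-hour cost. The inputs are the numbers quoted in this article ($42,000 per hour, 175 hours per year, $5,600 per minute); the function names are hypothetical.

```python
def annual_downtime_cost(cost_per_hour: float, downtime_hours_per_year: float) -> float:
    """Yearly loss for a given hourly cost and yearly downtime."""
    return cost_per_hour * downtime_hours_per_year


def per_minute_to_per_hour(cost_per_minute: float) -> float:
    """Convert a per-minute downtime cost to a per-hour cost."""
    return cost_per_minute * 60


if __name__ == "__main__":
    print(f"${annual_downtime_cost(42_000, 175):,.0f} per year")
    print(f"${per_minute_to_per_hour(5_600):,.0f} per hour")
```

The first result is consistent with the "more than $7 million per year" claim, and the second with Gartner's "more than $300,000 per hour".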
It makes sense to think that the cost of disruption will only increase over time (since we now rely more on data systems). Here's how to understand why past data can be multiplied by a significant number to reflect current reality...
Every minute counts
More than a decade ago, the average cost of data center downtime across all industries was estimated at approximately $5,600 per minute (Unplanned IT outages cost more than $5,000 per minute), an estimate that, according to Gartner, remained the same until 2014. The earlier Ponemon Institute study referenced above calculated the minimum, median, mean, and maximum cost per minute of unplanned outages, based on information from 41 data centers. The highest cost of an unplanned outage was over $11,000 per minute.
On average, the cost of an unplanned outage is likely to be over $5,000 per minute.
It just becomes more meaningful
A 2013 study saw an increase of more than 41% over the previous averages described above and an average cost of more than $7,900 per minute.
Likewise, a 2015 ITIC survey clearly showed that the cost per hour (compared to 2008 data) increased by 25-30%.
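The 41% figure can be sanity-checked against the two per-minute averages quoted in this section (about $5,600 earlier and about $7,900 in the 2013 study). A minimal sketch, with a hypothetical helper name:

```python
def percent_increase(old: float, new: float) -> float:
    """Relative increase from old to new, expressed in percent."""
    if old == 0:
        raise ValueError("old value must be non-zero")
    return (new - old) / old * 100


if __name__ == "__main__":
    print(f"{percent_increase(5_600, 7_900):.1f}% increase in cost per minute")
```

The same helper can be applied to your own year-over-year downtime cost estimates.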
Impact of downtime per year
A previous Gartner analysis calculated that downtime can average 87 hours per year. Obviously this is the sum of many interruptions from a few minutes to several hours (The average large enterprise experiences 87 hours of network downtime per year).
How have things changed?
A later survey, from 2011, found that while the industry has been successful in addressing the downtime epidemic and reducing its incidence, we are still seeing significant downtime and huge revenue losses (Source: resulted in more than 3 million (apparently WhatsApp users) switching to Telegram).
The impact on reputation and loyalty
How much is your company's reputation worth? This can be extremely difficult to assess, as can the long-term impact of a damaged reputation and its impact on sales and profitability.
In this case, downtime costs include customer losses (both short- and long-term) and other tangible items that reflect the cost of reputation degradation, such as the stock price, marketing time (crisis management and brand recovery), and the media budget needed to rebuild and polish the organization's profile.
What parameters should affect its calculation?
When attempting to estimate the cost of downtime, there are obvious direct costs (e.g., lost business during downtime). However, there are also many indirect costs to consider, such as personnel expenses or the above-mentioned reputation problems.
Personnel costs include the cost of exhausting “war room” efforts aimed at getting IT systems up and running again, the cost of falling behind on all other scheduled tasks, the cost of overtime (if applicable) and more. Add to this the value of data loss, emergency maintenance fees (especially if the outage occurs outside of business hours), and additional repair costs that can persist long after service is restored.
It goes without saying that you should consider these costs when estimating the impact of downtime, as they are often very high; but even a rough estimate can be extremely helpful in understanding the risks and deciding what level of technology to rely on to combat them.
There's also the impact of lost sales. To get an accurate estimate of total lost sales, the figure needs to be increased to reflect the true lifetime value of customers who permanently switch to a competitor. Take, for example, the Facebook (and WhatsApp) outage mentioned above (Unconscious Costs: Denying the true cost of network downtime): what is the revenue loss due to these users seeing fewer billable ad impressions?
Stocks fell 25%
Even if it is difficult to quantify so many parameters, they are still substantial and meaningful. For example, when Amazon.com was offline for several hours in its early days, its stock dropped by 25% in a single day (Unconscious Costs: Denying the true cost of network downtime)!
In the Amazon cloud outage, for example, the company struggled to bring its cloud services back online. As a result, many customers questioned the reliability of its cloud and Amazon's communications surrounding the outage. Other customers felt they should be compensated for downtime as part of their SLA.
I know you're curious: SLA-wise, Amazon's EC2 SLA wasn't breached despite the nearly four-day outage (Seven lessons from the Amazon outage).
The cost of downtime: Calculate it yourself
How much will you lose due to unexpected server or business application downtime?
According to various sources, the easiest way to calculate potential lost revenue during an outage is to use this equation:
LOSS OF INCOME | = | (GR/TH) x I x H |
GR | = | annual gross income |
TH | = | total annual working time (hours) |
I | = | percentage impact |
H | = | number of hours of downtime |
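The equation (GR/TH) x I x H translates directly into code. The sketch below is one possible implementation; the parameter names are my own, the impact is passed as a fraction between 0 and 1, and the example company (a $50 million yearly gross income, 2,000 working hours per year, a 50% impact and a 4-hour outage) is entirely hypothetical.

```python
def lost_revenue(
    gross_annual_revenue: float,  # GR
    total_annual_hours: float,    # TH
    impact: float,                # I, as a fraction between 0 and 1
    downtime_hours: float,        # H
) -> float:
    """Estimate lost revenue as (GR / TH) x I x H."""
    if total_annual_hours <= 0:
        raise ValueError("total_annual_hours must be positive")
    if not 0 <= impact <= 1:
        raise ValueError("impact must be a fraction between 0 and 1")
    return gross_annual_revenue / total_annual_hours * impact * downtime_hours


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    loss = lost_revenue(
        gross_annual_revenue=50_000_000,
        total_annual_hours=2_000,
        impact=0.5,
        downtime_hours=4,
    )
    print(f"estimated lost revenue: ${loss:,.0f}")
```

The impact factor is the hardest input to pin down: it should reflect how much of the hourly revenue actually depends on the affected system, so it is worth running the calculation for a low and a high estimate.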
How to minimize the risk of disruptions and downtime?
Downtime and failures are catastrophic, but they don't have to be overly shocking. By using solutions that focus on getting to the root of the problem, failures can be prevented before they happen.
Evolven Change Analytics has developed a unique AIOps solution that targets changes that are the true cause of performance incidents. Evolven helps enterprise IT and cloud operations teams prevent and remediate incidents before problems arise.
Contact us to see how we are helping leading companies reduce incidents and MTTR.
FAQs
What is the real cost of downtime? ›
For the Fortune 1000, the average total cost of unplanned application downtime per year is $1.25 billion to $2.5 billion. The average hourly cost of an infrastructure failure is $100,000 per hour.
What is downtime? What are the costs associated with downtime? ›Downtime cost is defined as any profit that a company loses when its equipment or network stops functioning. The cost of downtime implies not only direct financial loss but can also impact your company in at least four other ways.
What is true downtime cost analysis? ›TDC is a methodology of analyzing all cost factors associated with downtime, and using this information for cost justification and day to day management decisions. Most likely, this data is already being collected in your facility, and need only be consolidated and organized according to the TDC guidelines.
What is downtime failure? ›In industrial environments, downtime may refer to failures in production equipment. This type of downtime is often measured as downtime per work shift or downtime per a 12- or 24-hour period. Downtime duration is the period of time when a system fails to perform its primary function.
What are the three types of downtime? ›Common categories of downtime include excessive tool changeover, excessive job changeover, lack of operator, and unplanned machine maintenance.
What is the real cost concept? ›The real cost is a cost as measured by the physical labor and materials consumed in production. For example, real costs would include, but not be limited to, production, market analysis, distribution, and advertising.
What are the main causes of downtime? ›This can be due to several reasons including hardware or software failure, human error, malicious attacks or natural disasters. Since unplanned downtime is unexpected and occurs without a warning, preventing it can be a challenge.
What are the two types of downtime? ›Downtime falls into two categories: planned and unplanned. Planned downtime is notable because it offers advanced warning and gives users a chance to prepare. Planned downtime is usually done for upgrades or maintenance to the network infrastructure.
How do you explain downtime? ›a time during a regular working period when an employee is not actively productive. an interval during which a machine is not productive, as during repair, malfunction, maintenance.
What are the two major considerations when calculating the cost of downtime? ›Calculating Downtime Cost
The duration of the downtime and the cost incurred per minute you're offline are the two variables that most affect the financial impact of an outage.
What are the different types of downtime? ›
- Defects.
- Overproduction.
- Waiting.
- Not-Utilizing Talent.
- Transporting.
- Inventory.
- Motion Waste.
- Excess Processing.
What is a cost breakdown analysis? ›A cost breakdown analysis refers to the process of identifying the factors that determine the price of a product. It's also known as should-cost analysis, as it essentially pinpoints all the elements within a product's price, resulting in what the product should cost.
What is the difference between downtime and breakdown? ›Downtime can be planned or unplanned activity but the breakdown is entirely an unplanned activity. A planned event such as scheduled downtime is cost-effective compared to an unplanned event such as a sudden breakdown. Planned downtime does not delay production whereas breakdown time can cause delays in production.
What is the difference between failure and breakdown? ›Breakdown is the result of failure and the effect that failure has over the failure developing period. For example, if the temperature of your electric motor remains too high, it can cause the shaft to snap, creating a breakdown.
How do you manage downtime? ›- Know the best windows of time for planned downtime based on your company's production cycle. ...
- Prioritize all your assets and know which should be handled first. ...
- Implement clear guidelines and well-defined standard operating procedures (SOPs) for each repeated operation.
To get a quick estimate of your company's probable downtime costs, use the following formula, based on the size of your business and the number of minutes your most recent incident lasted: Downtime cost = minutes of downtime x cost-per-minute.
What is the industry standard for downtime? ›World Class Standards For Downtime
Aim for unscheduled downtime to be 10% or less.
Costs are broadly classified into four types: fixed cost, variable cost, direct cost, and indirect cost.
What is real cost vs actual cost? ›Actual cost refers to the real cost of manufacturing a product, which can be calculated after it has been produced. While standard cost is an estimate of the expected cost, actual cost is what was actually spent to produce the product.
What is real cost also known as? ›Real costs are also termed as social costs because the society faces a number of difficulties during the production process Money cost – cost of production measured in terms of money is called the 'Money Cost' Money cost is the monetary expenditure made by the producer for hiring various factors of Production.
What are the consequences of downtime? ›
Consequences of unplanned downtime
Lost productivity and revenue: Every minute of downtime can result in lost productivity and revenue, affecting a business's bottom line. Decreased customer satisfaction: Unplanned downtime can lead to delayed deliveries, canceled orders, and frustrated customers.
What is the importance of downtime? ›Downtime restores attention and motivation, fosters creativity, improves work efficiency and is essential for peak performance. Think about the word recreation for a second and break it apart.
What is a high cost of downtime? ›How Much Does Downtime Cost a Company? The average cost of downtime is significant. Each minute costs an average of $9,000, according to the Ponemon Institute, bringing the downtime cost per hour to over $500,000.
What are the two different cost approaches? ›There are two main methods of using the cost approach: the replication method and replacement method. The replication method assumes that a replica of the property is built using the same materials with the same pricing.
What are the two cost systems? ›The two basic cost accounting systems include the job order costing system and the process costing system. Job order costing focuses on custom products, while process costing focuses on standardized or mass-produced products.
What is downtime in accounting? ›Downtime is the period during which equipment is not operational. This situation is caused by such factors as maintenance, setup for a job, broken equipment, or missing inputs, such as raw materials or qualified operators.
What is the meaning of down time in accounting? ›A period during which an equipment or machine is not functional or cannot work. It may be due to technical failure, machine adjustment, maintenance, or non-availability of inputs such as materials, labor or power.
What are some downtime activities? ›- Volunteer. There are only a few things that feel better than genuinely making a contribution and helping other people. ...
- Write down everything you're grateful for. ...
- Meditate. ...
- Do something creative. ...
- Spend time in nature. ...
- Organize your space. ...
- Go over and personalize your devices' settings. ...
- Go for Inbox Zero.
The cost breakdown means breaking the costs into various components, such as labor, materials, overhead, and other expenses. This information can then determine where cost savings can be made or compare the costs of different projects.
How do you find the breakdown cost? ›
To get that cost breakdown, you'll need to include your expenses to sell the product, such as marketing and sales. You might also want to assign company overhead (e.g., insurance, phones, copy machine lease, utilities) to the cost of the item, suggests project-management solutions provider actiTIME.
How do you calculate cost breakdown? ›- Analyze your Work Breakdown Structure. Before you can identify your costs, you must first determine what your project entails. ...
- Estimate the cost of work. ...
- Estimate the cost of materials. ...
- Build contingency into your CBS. ...
- Sense-check.
For example, in the auto industry, downtime can cost up to $50,000 per minute. That's $3 million per hour. The true downtime cost includes a variety of wasted business support costs and lost business opportunity costs because resources were needed to resolve a downtime incident that probably didn't need to happen.
How much does downtime cost in healthcare? ›The cost of information technology (IT) downtime for the health care industry is similar to other enterprises, with most recent studies citing ranges between $5,300 and $9,000 per minute.
How do you calculate downtime cost per hour? ›The cost per hour of downtime is calculated by adding labor costs per hour to the revenue lost per hour.
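That rule of thumb can be sketched as follows, together with a helper that scales the hourly figure to the length of a specific incident. All names and numbers are hypothetical placeholders, and the model deliberately leaves out indirect costs such as reputation damage.

```python
def downtime_cost_per_hour(labor_cost_per_hour: float, revenue_lost_per_hour: float) -> float:
    """Hourly downtime cost as labor cost plus lost revenue."""
    return labor_cost_per_hour + revenue_lost_per_hour


def incident_cost(cost_per_hour: float, downtime_minutes: float) -> float:
    """Scale an hourly downtime cost to an incident measured in minutes."""
    return cost_per_hour * downtime_minutes / 60


if __name__ == "__main__":
    # Hypothetical inputs: 20 idle employees at $50/hour, $9,000/hour in lost sales.
    hourly = downtime_cost_per_hour(labor_cost_per_hour=20 * 50, revenue_lost_per_hour=9_000)
    print(f"hourly cost: ${hourly:,.0f}")
    print(f"90-minute incident: ${incident_cost(hourly, 90):,.0f}")
```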
What are downtime metrics? ›The most well-known downtime metric is Mean Time to Repair (MTTR). The MTTR metric reflects the average time it takes to troubleshoot and repair a failed piece of equipment.
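As an illustration, MTTR can be computed from a list of incident start and end times. This is a minimal sketch that treats the whole outage window as repair time; the incident data below is made up, and real tooling may define the start and end of a repair differently.

```python
from datetime import datetime


def mean_time_to_repair(incidents: list[tuple[datetime, datetime]]) -> float:
    """Return MTTR in hours for (failure_start, service_restored) pairs."""
    if not incidents:
        raise ValueError("at least one incident is required")
    total_seconds = sum((end - start).total_seconds() for start, end in incidents)
    return total_seconds / len(incidents) / 3600


if __name__ == "__main__":
    # Hypothetical incident log: outages of 1, 2 and 3 hours.
    log = [
        (datetime(2019, 1, 5, 9, 0), datetime(2019, 1, 5, 10, 0)),
        (datetime(2019, 2, 11, 14, 0), datetime(2019, 2, 11, 16, 0)),
        (datetime(2019, 3, 20, 22, 0), datetime(2019, 3, 21, 1, 0)),
    ]
    print(f"MTTR: {mean_time_to_repair(log):.1f} hours")
```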
What is a downtime policy? ›The downtime policy ensures that systems be taken offline to maintain and improve system performance, safeguard data, or to respond to emergency situations.
Why do we need downtime procedures in healthcare? ›Downtime preparedness is essential to ensure patient safety and continuity of care when electronic health records are completely inaccessible.
What are some other costs of downtime? ›
...
But it also includes less tangible costs, like:
- Reputation loss.
- Employee salaries - incurred with little work being completed.
- Building expenses (a/c, light, security, etc.) during non-working hours.
- Track Downtime. Before jumping into the steps of reducing downtime, it is critical to track it. ...
- Monitor Production. Having a system to monitor production can also help reduce downtime. ...
- Create a Preventative Maintenance Schedule. ...
- Provide Operator Decision Support. ...
- Perform DMAIC Analysis.
1. Divide your total revenue by the planned operating time to get your daily revenue. 2. Assess by how much your daily revenue goes down if the chosen piece of equipment stops working for 1 hour.