Major outages of the past: lessons learned for digital resilience
In the blameless culture of teal organizations with flat structures, we don’t look for someone to blame. We focus on causes and consequences instead, and on ways to resolve them. The CrowdStrike outage is certainly neither the first nor the last of its kind, so what can we learn from the history of incidents?
Network Solutions Inc.
In 1997, a misconfigured database at Network Solutions Inc. took down a million websites ending in .com or .net. The damage was comparatively small because so little of the world was online back then, yet some companies still lost their businesses, unable to reach their customers by email or sell products and services online.
Entire country
You truly understand that the internet is just a set of cables and a series of tubes once you learn how a 75-year-old woman took 2.9 million Armenians offline in 2011. The entire nation’s connection relied on a single fiber-optic cable running through Georgia. Arrested for slicing that very cable with a spade, the woman said, “I have no idea what the internet is.”
DYN Inc.
On October 21st, 2016, a series of massive distributed denial-of-service (DDoS) attacks hit DYN, a major domain name system (DNS) provider. Tech giants that relied on DYN, including Twitter, Spotify, Netflix, Airbnb, Amazon, and the PlayStation Network, became unreachable for users. The root cause was Mirai, malicious software that infects everyday devices such as printers, cameras, and even baby monitors. According to Cover Link reports, organizations spend an average of $2.5 million recovering from a DDoS attack.
British Airways
On May 28th, 2017, an engineer in a data center close to London Heathrow Airport accidentally disconnected a power supply, grounding nearly 1,000 British Airways flights. The 75,000 passengers crowding the terminal were unable to access the booking system, and baggage handling became a nightmare. The outage is estimated to have cost British Airways $102 million in lost revenue, in addition to mandatory refunds and compensation for passengers.
City in Alaska
In 2019, 100,000 residents of the Matanuska-Susitna community in Alaska were sent back in time. People had to go back to typewriters, and it took 10 weeks to bring the workstations in all institutions back online.
Amazon Web Services
Even the tech giant we all rely on so heavily is not immune to outages. An outage in 2017 started with human error: a simple typo in a command, entered by an engineer trying to fix a billing issue, took the cloud down for hours and cost companies over $150 million. In December 2020, AWS experienced three outages due to power failures. The far-reaching financial damage is estimated at no less than a billion dollars in economic loss.
Google
December 14th, 2020, proved that when Google sneezes, the internet catches a cold. For 45 minutes, key services like Gmail, YouTube, and Google Drive went dark after the authentication system crashed for lack of storage. The YouTube outage alone cost Google $1.7 million in ad revenue.
Fastly
On June 8th, 2021, an outage hit a company you might not recognize by name despite its crucial role in delivering content across the web. A seemingly harmless tweak, a customer making a routine change to their settings, triggered a dormant bug in Fastly’s software. The bug disrupted the entire ecosystem, and websites like the New York Times, BBC, CNN, and the UK government’s portal went dark for nearly an hour. Thankfully, Fastly’s team saved the day and got 95% of the network back online within 49 minutes. Their prompt response prevented even bigger financial problems, though the impact is still estimated at $150 million, borne by digital platforms that missed crucial updates and e-commerce stores that lost sales.
Facebook
One Monday in October 2021, a routine maintenance job at Facebook went wrong. That meant no Facebook, Messenger, Instagram, or WhatsApp, not to mention all the apps that rely on Facebook logins. According to Bloomberg, Facebook lost $47.3 billion in market value during the downtime. That’s a hit that even a tech giant like Facebook feels. Mark Zuckerberg took a personal financial blow as well, losing an estimated $6 billion of his wealth.
Rogers Communications
In 2022, 11 million Canadians got a preview of last week’s CrowdStrike debacle when Rogers Communications, one of the country’s major telecom providers, suffered an outage. Emergency services couldn’t accept phone calls, hospitals cancelled appointments, and businesses across the country couldn’t accept debit card transactions. Students missed their exams, and R&B star The Weeknd was forced to postpone a concert.
Spotify and Discord
The Spotify and Discord outage on March 8, 2022, started early in the afternoon Eastern time with minor issues such as unstable support pages and trouble logging in. After half an hour, it became impossible to send a message or connect to either platform. The two hours of silence were caused by a malfunctioning component in the Google Cloud system.
Instagram and Twitter
July 14th, 2022, was an unfortunate day for social media users. First, Twitter went down for 40 minutes; then fate came for Instagram. Users had difficulty accessing feeds, sending messages, and launching the app. Ironically, Instagram users reported the outage on Twitter. Even when Instagram came back, the troubles weren’t over: by the time the platform recovered, many large accounts had lost millions of followers.
Political events
Outages aren’t always shocking incidents with purely technical causes. Sometimes they are planned actions taken for political reasons. In July 2024, Bangladesh faced a near-total internet blackout when the government shut down access in response to violent clashes during protests. At least 150 people were killed in the clashes, and the online cutoff, accompanied by a curfew, made it impossible for citizens to learn the truth about ongoing events.
India is arguably the global leader in implementing internet shutdowns to control unrest. However, this tactic is prevalent worldwide, with at least 83 countries, including Iran, Russia, Algeria, Senegal, Tanzania, Cameroon, and Venezuela, having utilized it.
The internet is never safe as long as it depends on cables lying on the bottom of the ocean. In 1964, long before Tim Berners-Lee invented the World Wide Web, the United Nations Environment Programme reported problems with phones and telegraphs because sharks, barracudas, and other fish were biting ocean cables, not only leaving teeth marks but also penetrating the insulation and letting in seawater that grounded the power conductors.
How do you make your infrastructure resilient? Build systems that are secure by design, diversify risks, and have a reliable tech partner by your side. Feel like joining forces with Lerpal to be ready for whatever comes? Contact us via Lerpal.com and book a meeting to secure growth despite all the challenges!