AWS Outage Knocks Canva, Amazon, Duolingo, and Major Apps Offline Globally

The recent AWS outage caused widespread disruption, taking major global services such as Canva, Snap, and Duolingo offline all at once.

Picture this: you have your phone in your hand and you are about to check a friend’s Snapchat or log in to review your fitness goals, and all you see is that tiny wheel that won’t stop spinning. For millions of people around the world, this was the unfortunate reality on Monday. The cause was a technical problem at Amazon Web Services (AWS), Amazon’s cloud computing division, whose servers host a huge portion of the internet.

The problems started deep in the AWS US-EAST-1 region, one of its core infrastructure areas. AWS may not be as well known as Google or Apple, but it brought in $108 billion in revenue last year. It is the backbone of many apps, websites, and business tools. Because of this, a failure in one region can trigger a technical domino effect around the globe in a matter of seconds. The scale of this AWS outage was unprecedented.

As reports started coming in, services like Snapchat were hit with major disruptions, drawing more than 5,000 user complaints on Downdetector in seconds. It was a day when logging in was impossible. Did you try to check your language practice streak on Duolingo? You failed. Were you trying to build a new design on Canva? Access was denied. Did you try to check a notification from your Ring doorbell? That failed as well.

The outage hit everyday services and entertainment alike, including gaming on Roblox and Fortnite. Life360, used for tracking the location of family and friends, and the finance service Coinbase were also affected. In what some saw as a clear sign of the company ‘dogfooding’ its own services, Amazon properties such as the e-commerce giant Amazon.com and Amazon Prime Video had difficulties too. Amazon has one of the most advanced and reliable digital infrastructures in the world, and it was still caught up in the widespread AWS outage.

AWS (Amazon Web Services) acknowledged the problems that multiple services were facing because of the outage. The company said the affected services were experiencing “increased error rates and latencies,” and that its engineers were working to “mitigate the impact and investigate the root cause.”

The disruption initially involved two of the foundational services that AWS infrastructure provides:

  1. Amazon DynamoDB: A NoSQL database service that handles primary data storage, retrieval, and management for many applications.
  2. Amazon Elastic Compute Cloud (EC2): A service that provides scalable computing capacity in the form of virtual machines running in AWS data centers.

When basic server and storage functions start to fail, every service that relies on them also starts to fail, from simple mobile games to advanced enterprise software like Xero to government services like HMRC. And this is exactly what customers reported: widespread service access failures stemming from the AWS outage. It is a reminder that even the latest and greatest machines can have breakdowns of human proportions. The complexity of the cloud made the AWS outage difficult to resolve quickly.
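To make the domino effect concrete, here is a minimal sketch of how a failure in one foundational service can spread to everything built on top of it. The service names and the dependency map are hypothetical and chosen only for illustration; they do not describe how any real app or AWS itself is wired.

```python
from collections import deque

# Hypothetical dependency map (an assumption for illustration only):
# each service lists the services it directly relies on.
DEPENDS_ON = {
    "database": [],
    "compute": [],
    "login-api": ["database", "compute"],
    "mobile-game": ["login-api"],
    "accounting-app": ["login-api", "database"],
    "static-site": [],
}


def affected_services(failed, depends_on):
    """Return every service that goes down, directly or indirectly,
    when the services in `failed` stop working."""
    # Invert the map so we can look up "who relies on this service?"
    dependents = {name: [] for name in depends_on}
    for service, deps in depends_on.items():
        for dep in deps:
            dependents[dep].append(service)

    # Breadth-first walk outward from the failed services.
    down = set(failed)
    queue = deque(failed)
    while queue:
        current = queue.popleft()
        for dependent in dependents[current]:
            if dependent not in down:
                down.add(dependent)
                queue.append(dependent)
    return down


print(sorted(affected_services(["database"], DEPENDS_ON)))
# ['accounting-app', 'database', 'login-api', 'mobile-game']
```

In this toy map, losing only the database takes down three other services, while the one service with no dependencies keeps running. Real cloud dependency chains are far longer and less visible, which is part of why an outage like this is hard to contain.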

The outages at services such as MyFitnessPal, Wordle, PlayStation, and Pokémon Go also affected people’s everyday lives. For companies, an outage can stall revenue and damage the brand. For consumers, it can disrupt an important engagement or a work assignment and bring on a feeling of powerlessness.

Picture a small business owner using an AWS-backed service to complete sales on a Monday, only to have the entire operation shut down. Or a student tackling a cloud-based assignment who suddenly loses access to the most important section of their work. It is as if all of that work is constantly being shifted toward the core of the cloud, with nowhere else to turn when that core fails. The severity of this AWS outage was felt globally.

This event showed the whole industry the value of building both redundancy and decentralization into its systems. Since AWS is the most widely used cloud service provider in the world, its technical stability is a matter of public digital health. When one of its server farms sneezes, the whole world catches a cold. AWS engineers are still working on the redundancy and decentralization problem while fully restoring service and diagnosing the initial failure. Providing real-time updates on the state of the system to stranded users has also proved difficult worldwide. The entire event has served as a powerful lesson about our interconnected world and about the weight of responsibility carried by the people who run these systems. This significant AWS outage will be studied for years.
