How a 'fat-finger typo by Amazon employee' took down thousands of websites

Amazon has blamed a typo for a massive cloud-computing outage that caused problems for thousands of websites and apps.

The web giant has apologised for the five-hour outage affecting parts of Amazon Web Services, which took down websites including Slack, Trello and Medium.

Amazon has now revealed exactly what went wrong, explaining that an incorrectly typed command, a so-called fat-finger typo, entered during routine debugging of its billing system was to blame.

The employee only “intended to remove a small number of servers”, but instead the typo caused unprecedented performance problems for thousands of companies that rely on Amazon’s cloud-computing service.

“Unfortunately, one of the inputs to the command was entered incorrectly and a larger set of servers was removed than intended,” the Seattle company explained.

Amazon said its Simple Storage Service (S3) had “experienced massive growth” over the past few years, adding that “the process of restarting these services and running the necessary safety checks to validate the integrity of the metadata took longer than expected”.

The company, founded by Jeff Bezos, said it was “making several changes” to prevent a similar incident from happening again.

“We want to apologise for the impact this event caused for our customers,” the company said.

“We know how critical this service is to our customers, their applications and end users, and their businesses. We will do everything we can to learn from this event and use it to improve our availability even further.”

