Amazon Web Services blamed human error for a four-hour outage that impacted a number of high-profile customers earlier this week.
Amazon Web Services said Thursday that an employee trying to debug its billing system entered a command incorrectly, causing the four-hour outage that disrupted service to Amazon S3 clients around the world earlier this week.
In a postmortem detailing the events that led to Tuesday’s high-profile disruption, AWS apologized for the impact on customers and outlined the changes it is making to prevent the problem from occurring again.
The outage began at 9:37 a.m. PT, when a bad command was entered during debugging of the S3 billing system in the provider’s Virginia data center, which serves the eastern region of the United States.
An employee, using “an established playbook,” intended to pull down a small number of servers that hosted subsystems for the billing process. Instead, the accidental command resulted in a far broader swath of servers being taken offline, including one subsystem necessary to serve specific requests for data storage functions, and another allocating new storage, AWS said in the postmortem.