1 Dec 2011

Amazon cloud servers go down

Posted by iwgcr

Five years ago, Amazon began offering computing resources to businesses from its network of sophisticated data centers. Today, the company is the early leader in the fast-growing business of cloud computing.

The outages began on April 21, 2011, knocking hundreds of sites off the web, and persisted through April 24, when Amazon began reporting progress in restoring the affected storage volumes.

“Due to the scale of the power disruption, a large number of EBS servers lost power and require manual operations before volumes can be restored,” Amazon said at 11:04 PM on Sunday night. “Restoring these volumes requires that we make an extra copy of all data, which has consumed most spare capacity and slowed our recovery process. We’ve been able to restore EC2 instances without attached EBS volumes, as well as some EC2 instances with attached EBS volumes.”

Some websites were still offline on the 25th. Affected services included Foursquare, a location-based social networking site; Quora, a question-and-answer service; Reddit, a news-sharing site; and BigDoor, which makes game tools for Web publishers.

Steve Lohr wrote in The New York Times: “Amazon’s Trouble Raises Cloud Computing Doubts”

Consequently, the Amazon Web Services developer forums are full of angry messages such as:

“I have to question the ability of your department regarding the handling this type of issue. We located our service on EC2 because we trust Amazon’s reputation, but your recent event caused our web service to go offline resulting in loss of profit which is not acceptable.
We ask for a stable system or we will consider an alternative solution away from Amazon”

Affected users will surely agree with Longbottom, who stressed that businesses using cloud services need to plan for service failures themselves:

“An SLA in itself is worth nothing. When something like this happens, you can go back and say ‘Well the agreement says that you now have to give us three months of service free of charge – but hang on we nearly went out of business while this was all going on’.”

Amazon Web Services publishes information on service availability. Performance issues and service disruptions are reported at the link below: Status Amazon

Edit: Summary of the Amazon EC2 and Amazon RDS Service Disruption in the US East Region

Date        Service  Duration  Critical Data Lost
2011-04-25  Amazon   4 days    No