The cloud came tumbling down today for many startups and sites built on the Amazon EC2 API. There's a perception that moving storage and scaling to the cloud means we no longer have to deal with outages. Today made it pretty clear that's not the case, as popular sites like Heroku and Foursquare nosedived.
Scaling expert Clay Loveless bluntly wrote that where there are clouds, it sometimes rains:
The buzz around Infrastructure as a Service has shifted from 2006 “IF YOU USE IT YOU ARE CRAZY” tones to “YOU ARE CRAZY IF YOU DON’T USE IT”, and with that shift has come this pleasant sensation that cloud infrastructure never goes down.
Snap out of it.
ANYTHING TECHNICAL has the potential to just up and crap the bed at ANY TIME. A service built without anticipating failure deserves the downtime it experiences.
That said, a number of high-profile services were affected by the outage. All Things Digital has a partial list of companies.
The promise of the cloud, though, is that it can make scaling easier. As Network World points out, Amazon claims availability zones help avoid outages:
"By launching instances in separate Availability Zones, you can protect your applications from failure of a single location," Amazon says in pitching its Elastic Compute Cloud service.
Customers who build applications in just one availability zone are more likely to suffer outages. But what happens when multiple availability zones go dark at the same time? We found out today when an outage forced websites such as Foursquare, Reddit, Quora and Hootsuite offline.
For those looking for a silver lining in this cloud: here's a reminder to test for potential errors. And if that sounds familiar, maybe we've said it before: Foursquare outage a chance to test for errors.
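To make "anticipate failure and test for it" a little more concrete, here is a minimal Python sketch. It is not how any of the affected sites work. The names zone_a, zone_b, call_with_failover and degraded_page are all hypothetical stand-ins, and the "zones" are plain functions rather than real EC2 calls. The idea is simply to try a second location when the first fails, and to have a planned answer for the day every location is dark at once.

```python
from typing import Callable, Sequence


class AllBackendsDown(Exception):
    """Raised when every configured backend has failed."""


def call_with_failover(backends: Sequence[Callable[[], str]]) -> str:
    """Try each backend in order and return the first successful result."""
    errors = []
    for backend in backends:
        try:
            return backend()
        except Exception as exc:  # a real service would catch narrower errors
            errors.append(exc)
    raise AllBackendsDown(f"{len(errors)} backend(s) failed")


# Hypothetical stand-ins for the same service deployed in two zones.
def zone_a() -> str:
    raise ConnectionError("simulated outage in zone A")


def zone_b() -> str:
    return "served from zone B"


def degraded_page() -> str:
    """Fallback response for when nothing is reachable (assumed wording)."""
    return "We're having trouble right now; please try again shortly."


def handle_request(backends: Sequence[Callable[[], str]]) -> str:
    """Serve from the first healthy backend, or degrade instead of crashing."""
    try:
        return call_with_failover(backends)
    except AllBackendsDown:
        return degraded_page()


if __name__ == "__main__":
    # One zone down: the request is still served.
    print(handle_request([zone_a, zone_b]))
    # Every zone down at once: the user gets the degraded page.
    print(handle_request([zone_a, zone_a]))
```

The second call is the case today's outage exposed: spreading across locations only helps if the code also has a planned answer for when all of them fail together, and simulating that failure is the only way to know the answer works.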