The Cloud Does Not Auto-Validate Your Work

Kevin Farnham
Nov. 26 2008, 02:02AM EST

ReadWriteWeb's Rick Turoczy reported in "Dark Side of the Cloud" on the recent incident that resulted in a loss of data for Ylastic, a company that facilitates management of Amazon's AWS environments for businesses.

A few days ago, something went amiss with Ylastic's Elastic Block Stores (EBS) on AWS. Application instances were hung. It's unclear how it happened, whether it was a Ylastic issue, an AWS issue, or other, but ultimately data was lost, and Ylastic was forced to revert to a previous data snapshot. Unfortunately, the most recent valid data snapshot was 7 weeks old. Ouch. As Ylastic reported:

"Some time in the last month or so, our EBS snapshotting of this stuck volume seems to have stopped working correctly.... We have gone back and run through all the snapshots, and the last good snapshot that we have is from October 1."

The "cloud" provides an incredible opportunity for start-ups, in terms of purchasing computing, database, and data storage facilities at usage-based pricing. Amazon Web Services is about as stable as you can get when it comes to highly reliable, high-volume computational, database, and data infrastructure. But as with any IT infrastructure, be it in-house or in the cloud, it's the customer's responsibility to monitor processing results and notice if a processing anomaly occurs. AWS monitors status in these terms: "This program is running, as we promised it would. It output this data; yes, we have the data the app has produced..." In other words, AWS is similar to an electric utility. They keep the systems running, the processing chugging along, the storage of output intact. If a customer's application has a problem, only the customer can notice that. If the problem occurs only intermittently, and the customer isn't monitoring the processing closely -- that's when big problems occur. Use of the cloud means you are outsourcing processing and data storage to a contractor (for example, Amazon Web Services); you are not outsourcing the job of checking that the results are correct. As Ylastic learned:

Our first outage and lesson learned - test those EBS snapshots religiously...
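One way to act on that lesson is to track when the last snapshot that actually passed a restore test was taken, and to raise an alarm when it gets too old. The following is only a minimal sketch under stated assumptions: the Snapshot record, its "verified" flag, and the one-day threshold are hypothetical names and values chosen for illustration, not part of any AWS API, and the dates in the usage lines are made up.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Optional


@dataclass
class Snapshot:
    # Hypothetical record; in practice it would be filled in from your
    # provider's API plus the outcome of your own restore test.
    snapshot_id: str
    taken_at: datetime
    verified: bool  # True only if a test restore of this snapshot succeeded


def latest_verified(snapshots: Iterable[Snapshot]) -> Optional[Snapshot]:
    """Return the most recent snapshot that passed verification, if any."""
    good = [s for s in snapshots if s.verified]
    return max(good, key=lambda s: s.taken_at) if good else None


def is_stale(snapshots: Iterable[Snapshot], now: datetime,
             max_age: timedelta = timedelta(days=1)) -> bool:
    """True when there is no verified snapshot newer than max_age."""
    newest = latest_verified(snapshots)
    return newest is None or now - newest.taken_at > max_age


# Illustrative data only: snapshots keep being taken, but none has
# passed a restore test since October 1.
history = [
    Snapshot("snap-a", datetime(2008, 10, 1), verified=True),
    Snapshot("snap-b", datetime(2008, 11, 18), verified=False),
]
print(is_stale(history, now=datetime(2008, 11, 19)))  # True
```

The point of the design is that a snapshot only counts once it has been restored and checked. A job that merely reports "snapshot taken" would have looked healthy in this scenario.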

ZDNet's Phil Wainewright points out these fundamental principles in his "Back up your online data. Now.", which highlights how Digital Railroad, a photo archiving and commerce site used by over 1,500 professional photographers, shut down without warning after running into financial trouble. Despite pleas from customers, their creditor decided to "have all information erased from the storage devices and then sell the equipment at auction." Phil asks the operative question:

Does this example mean we should all stop using cloud providers and go back to the ‘good old days’ of running our own software and servers? Of course not. You’re more likely to lose everything to a disk failure on your own machines than you are to a business failure of a third-party provider. But it’s still essential in either case to have a back-up strategy.

So, have you monitored the cloud processing you accomplished today? And, did you back up your transactions and data? And verify that those back-ups are valid? If not, now's a good time to check...
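For the last of those questions, one simple check is to compare a cryptographic digest of the original file with a digest of the copy read back from the back-up location. This is a sketch, not a complete back-up strategy: it assumes both files are reachable as local paths, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_matches(original: Path, backup: Path) -> bool:
    """True when the back-up copy is byte-for-byte identical to the original."""
    return sha256_of(original) == sha256_of(backup)
```

A matching digest only shows that the copy is faithful. If the original was already corrupt when it was copied, the back-up will match it perfectly, which is why an occasional full restore test is still worth running.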

Comments (3)

maisa

Hi Kevin!

Interesting. It's a very unfortunate situation for those who lost their trusted data in the cloud, but I do agree with what you say about data responsibility. Even the safest offer from a very reliable company to keep your data and assume total responsibility for it shouldn't take away your personal control over your own data. If it's important, how can you trust only the cloud?

I think everybody can benefit a lot from Cloud Computing in general - either through its available apps, interactivity options or cutting costs when it comes to online storage - and they should, without losing control of what's really meaningful to them.

Great article! I'll post a link reference to it on my blog.

And if you get the time, let me know your impressions on icloud (http://icloud.com/maisa/)

Maisa

[...] This is a typical example of one of the drawbacks of Cloud computing. Though Cloud computing is an ideal choice for startups with low capital, it can turn into a nuisance if problems aren’t noticed by the end customer. A Cloud computing platform does nothing but host the service. Any problems with the service must be noticed and handled by the service provider. [...]

I covered this same event (http://www.transparentuptime.com/2008/11/transparency-case-study-courtes...) from a slightly different angle. On the point you make that trust is THE key for companies to move to the cloud, what companies must do beyond simply monitoring the cloud platforms and having a backup plan...is to provide transparency during and after these kinds of events to their own customers. It actually works at a number of levels. In this case, AWS must be transparent to Ylastic, and Ylastic must be transparent to their customers. If AWS doesn't provide any information on what may be wrong or when an issue is going to be resolved, Ylastic will be in a very tough spot. Then Ylastic must provide as much information as possible directly to their customers (in this event they communicated through Twitter).

Without a heavy dose of transparency, trust is impossible.