Monday 15 December 2008

Redundant Array of Inexpensive Clouds - Pt I

Storagezilla was quick to turn a Twitter conversation into a PR opportunity for EMC this week. Have a read. As one of the originators of this conversation, I'd intended to blog on it but was slightly beaten to print. Never mind, I've got more content to add to the discussion.

The original question was whether IT departments with purely DAS environments should consider going straight to cloud storage rather than implement traditional NAS or SAN.

For me, the answer at the moment is a resounding no: cloud computing is still far too unreliable to commit production or operational data to it. However, that's not to say the cloud can't be used for anything.

First of all, remember that every storage environment has a working set of data, and that this working set forms only a small part of the total data held across an enterprise. Most data is created and very quickly becomes inactive, whether it's structured data, email, unstructured files and so on.
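As a rough illustration of the point, here's a sketch that estimates how much of a file tree is inactive by last-modified time. The 90-day threshold and the demo figures are my own assumptions, not anything from a real sizing exercise; a real assessment would look at access patterns across the whole estate.

```python
import os
import tempfile
import time

INACTIVE_AFTER_DAYS = 90  # assumed threshold; tune per environment


def inactive_fraction(root, now=None):
    """Return (inactive_bytes, total_bytes) for files under root,
    treating anything not modified in INACTIVE_AFTER_DAYS as inactive."""
    now = now or time.time()
    cutoff = now - INACTIVE_AFTER_DAYS * 86400
    inactive = total = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            st = os.stat(os.path.join(dirpath, name))
            total += st.st_size
            if st.st_mtime < cutoff:
                inactive += st.st_size
    return inactive, total


# Demo: one "hot" file and one file last touched a year ago.
with tempfile.TemporaryDirectory() as root:
    hot = os.path.join(root, "report.doc")
    cold = os.path.join(root, "archive.dat")
    with open(hot, "wb") as f:
        f.write(b"x" * 100)
    with open(cold, "wb") as f:
        f.write(b"x" * 900)
    year_ago = time.time() - 365 * 86400
    os.utime(cold, (year_ago, year_ago))  # backdate the cold file
    inactive, total = inactive_fraction(root)
    print(inactive, total)  # 900 1000
```

Even this crude mtime-based cut tends to show the bulk of capacity sitting in files nobody has touched for months.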

In some organisations, inactive data is retained, sometimes indefinitely, especially if it relates to content deemed "too hard" to process or legally sensitive. This inactive data is the perfect candidate for migration into the cloud, for a number of reasons:

  • It gets the data out of expensive datacentres, where the cost of maintaining that data is not just about the cost of the storage hardware, but also the whole TCO relating to data retention; power/cooling/floorspace, backup, technology refresh and so on.
  • It moves the data into a location where the cost of maintenance is simple to calculate, since cloud providers charge a flat rate per GB per month.
  • It puts the data in a place where cloud providers could offer value-added services.
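The second point above is easy to sketch in numbers. All of the per-GB figures below are invented for illustration (the real TCO breakdown will vary wildly by datacentre and provider), but the shape of the comparison holds:

```python
# Hypothetical $/GB/month figures for illustration only; substitute your own.
onprem = {
    "hardware": 0.20,       # amortised purchase + technology refresh
    "power_cooling": 0.05,
    "floorspace": 0.03,
    "backup": 0.07,
    "admin": 0.10,
}
cloud_rate = 0.15           # provider's assumed flat charge per GB/month

archive_gb = 50_000         # inactive data eligible for migration

onprem_monthly = sum(onprem.values()) * archive_gb
cloud_monthly = cloud_rate * archive_gb
saving = onprem_monthly - cloud_monthly

print(f"on-prem: ${onprem_monthly:,.0f}/month")
print(f"cloud:   ${cloud_monthly:,.0f}/month")
print(f"saving:  ${saving:,.0f}/month")
```

The cloud side of the ledger is a single line item, which is exactly why the cost of maintenance becomes simple to calculate once the data has moved.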

Now, by value-added services I'm referring to a number of things. There's the possibility of offering simple services like automated virus scanning, content conversion and so on. There's also the option for the cloud providers to offer more advanced services.

Imagine you have terabytes of unstructured content that's been too difficult to process; perhaps there's copyrighted material in there, perhaps there's commercially useful data. Whatever it is, you don't have the time or the inclination to manage it, so up to now the data has been left, moved to cheaper storage and simply dumped in the storage landfill. Enter the cloud providers. For a fee, they will take this data off your hands and pick over it like parasites, removing illegal content, deleting irrelevant data and returning to you the gems that you should be re-using.

The cloud providers are in a perfect position to do this, as they get to see *lots* of data and can build models of that content which allow them to automate the analysis process.
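One very simple form this automation could take is fingerprinting: a provider that has seen a piece of content before can flag every future copy by hash alone. The fingerprint list and labels below are entirely hypothetical; real analysis would involve much richer models than exact-match hashing.

```python
import hashlib

# Hypothetical fingerprints of content the provider has already classified
# (e.g. known copyrighted files). Invented for illustration.
KNOWN_FINGERPRINTS = {
    hashlib.sha256(b"leaked-album.mp3 contents").hexdigest(): "copyrighted",
}


def classify(blob):
    """Label a blob if its hash matches known content, else leave it unclassified."""
    digest = hashlib.sha256(blob).hexdigest()
    return KNOWN_FINGERPRINTS.get(digest, "unclassified")


print(classify(b"leaked-album.mp3 contents"))  # copyrighted
print(classify(b"quarterly figures"))          # unclassified
```

Because every customer's landfill passes through the same pipeline, the fingerprint list compounds in value: data one customer paid to have analysed makes the next customer's analysis cheaper.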

Now, if data is pushed into the cloud, you will want to (a) guarantee the security of that data and (b) standardise access to these providers. More on both in the next two posts.
