Tuesday 23 December 2008

Storage Predictions for 2009

It's the end of another year and of course time for the obligatory posts on predictions in the industry for the next 12 months. True to form, I've spent some time thinking and here are my top 5 ruminations for the coming year.

  • EMC join SPC. EMC will finally have the epiphany we've all been dreaming of and embrace the world of generalised benchmarking. DMX will prove to be underpowered (due to the lack of SATA drives and the discovery that SSD is in fact the slowest disk technology) and be outperformed by Tony Asaro and his new Drobo appliance.
  • HDS Win At Web Awards. HDS embrace new media in a game changing way and Hu wins first prize at the Weblog Awards. Barry Burke is so impressed, he defects to HDS from EMC, becoming www.thestoragedefeatist.com.
  • XIV Conquers the World. IBM releases XIV-1000, by sending Moshe Yanai back in time from 2032, the time when Atmos became self-aware and took over storage for the entire world. EMC counter, sending StorageZilla (now aged 32) back to defeat Moshe in a monumental battle pitting SATA against SSD.
  • JWT joins Star Trek. Jon Toigo bows out of storage to follow a career in acting. His first role, as the father of William Riker in Star Trek XI, is critically acclaimed as a work of genius.
  • Convoy II: SWCSA is Released. The life of Marc Farley is brought to the big screen as he attempts to outwit the authorities and drive from the west to east coast using nothing but his blogging skills. Farley is portrayed on screen by Kris Kristofferson, reprising his role from the original cult movie.

Check back next year to see how many of these predictions did in fact come true.

Thursday 18 December 2008

Do You Really Need a SAN - Of Course You Do!

The wacky boys at Forrester have a great new article posted relating to the requirement to have a Storage Area Network. Here's a link to their post. Tony Asaro and Hu Yoshida have both posted on the subject already, but I couldn't resist adding my 2 cents.


SANs evolved for very good reasons: the need to consolidate storage, the need to provide additional connectivity to storage arrays, and the need to remove the close coupling of storage to server (remember the 25m limit on SCSI cabling?). SANs, and most notably Fibre Channel, enable all of that (as does iSCSI, for the record).

Some of the Forrester objections to deploying SAN include:

Low Capacity Utilization - I did some work a couple of weeks ago at a client with a purely DAS environment. They had 30% utilisation on their disks and, even after excluding boot drives, still only 30% utilisation. I've never seen a SAN array only 30% full, unless it was a new array onto which data was still being deployed.

Inability to Prioritize Application Performance - Hmm, this seems a bit odd. DMX has Dynamic Cache Partitioning and Optimiser; USP has I/O prioritisation; Compellent can dynamically move data between tiers of storage; 3PAR has similar features which allow performance to be tweaked dynamically. There are also equivalent options in the fabric, especially with Cisco equipment. DAS has no such benefit; in fact, if you have performance issues on a single server then you're potentially in a world of pain to fix them.

Long Provisioning Times - this is not a technology issue but one of process. I can provision terabytes of storage in minutes; however, I have to guarantee that within a shared environment I don't take down another production environment - that's the nature of shared systems. In addition, users think they can just demand more and more storage without any consequences. Storage resources are finite - even more so in a non-scalable DAS solution. With sensible process, SAN storage can be turned around in hours - not the case for DAS, unless you intend to keep spare disks onsite (at a price).

Soaring Costs - again, another conundrum. If you focus on pure hardware costs then SANs are inevitably more expensive; however, a full TCO analysis for storage is very rarely done. Don't forget SAN also includes iSCSI, which can be implemented across any IP hardware - hardly expensive.

And then there are the areas where SAN easily wins over DAS.

Disaster Recovery. SAN-based replication is essential in large environments where the requirement to manually recover each server would be totally impractical. Imagine trying to recover 100+ database servers where each server requires the DBA to log in and perform forward recovery of shipped logs - all in a 2 hour recovery window.

Storage Tiering. SANs allow easy access to multiple storage tiers, either within the same array or across multiple arrays. Without SAN, tiering would be wasteful, as most servers would not be able to utilise multiple tiers fully.

SANs also provide high scalability and availability, simply not achievable with DAS.

There was a reason we moved to SAN. SAN has delivered, despite what Forrester say. However, like any technology, SANs need to be managed correctly. With sensible planning, standards and process, SANs knock DAS into a cocked hat.

Tuesday 16 December 2008

Redundant Array of Inexpensive Clouds - Pt II

In my previous post I started the discussion on how cloud storage could actually be useful to organisations and not be simply for consumer use.

Standards

One of the big issues that will arise is the subject of standards. To my knowledge, there is no standard so far which determines how cloud storage should be accessed and how objects should be stored. Looking at the two main infrastructure providers, Amazon and Nirvanix, the following services are offered:

Amazon
S3 (Simple Storage Service) - storage of data objects up to 5GB in size. These objects are basically files with metadata and can be accessed via the HTTP or BitTorrent protocols. The application programming interface (API) uses REST/SOAP (which are standard) but follows Amazon's own conventions for the functions used to store and retrieve data.

Elastic Block Store (EBS) - this feature offers block-level storage to Amazon EC2 instances (Elastic Compute Cloud) to store persistent data outside of the compute instance itself. Data is accessed at the block level, with snapshots of EBS volumes stored back into S3.
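
As a flavour of what programming to the S3 object API looks like, here's a minimal sketch using the boto3 Python SDK (which post-dates this post); the bucket name, key and metadata values are invented for illustration.

    import boto3

    s3 = boto3.client("s3")  # credentials are picked up from the environment

    # Store an object - effectively a file plus user-defined metadata
    with open("report.pdf", "rb") as f:
        s3.put_object(
            Bucket="example-archive-bucket",          # hypothetical bucket
            Key="projects/2008/report.pdf",
            Body=f,
            Metadata={"owner": "storage-team", "retention": "7y"},
        )

    # Retrieve it again
    obj = s3.get_object(Bucket="example-archive-bucket",
                        Key="projects/2008/report.pdf")
    data = obj["Body"].read()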

Nirvanix

Storage Delivery Network (SDN) - provides file-based access to store and retrieve data on Nirvanix's Internet Media File System. Access is via HTTP(S) using standard REST/SOAP protocols but follows Nirvanix's proprietary API. Nirvanix also offer access to files with their CloudNAS and FTP Proxy services.

The protocols from both Amazon and Nirvanix follow standard access methods (i.e. REST/SOAP), but the APIs themselves are proprietary in nature. This means the terminology is different, the command structures are different, the method of storing and retrieving objects is different and the metadata format for referencing those objects is different.

Lack of standards is a problem. Without a consistent method for storing and retrieving data, it will become necessary to program to each service provider implementation, effectively causing lock-in to that solution or creating significant overhead for development.

What about availability? Some customers may choose not to use one service provider in isolation, in order to improve the availability of data. Unfortunately this means programming to two (or potentially more) interfaces and investing time to standardise data access to those features available in both products.

What's required is middleware to sit between the service providers and the customer. The middleware would provide a set of standardised services, allowing data to be stored in either cloud, or both, depending on the requirement. This is where RAIC comes in:

RAIC-0 - data is striped across multiple Cloud Storage infrastructure providers. No redundancy is provided, however data can be stored selectively based on cost or performance.

RAIC-1 - data is replicated across multiple Cloud Storage infrastructure providers. Redundancy is provided by multiple copies (as many as required by the customer) and data can be retrieved using the cheapest or fastest service provider.
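
To make the idea concrete, here's a minimal sketch of what the middleware's RAIC-0 and RAIC-1 behaviour might look like. The CloudProvider interface and everything behind it are hypothetical; each real service (S3, SDN and so on) would need its own adapter written to that provider's proprietary API.

    import zlib

    class CloudProvider:
        """Hypothetical adapter interface each provider-specific module implements."""
        def put(self, key, data):
            raise NotImplementedError
        def get(self, key):
            raise NotImplementedError

    class Raic0:
        """RAIC-0: each object lives on exactly one provider (no redundancy).
        A stable hash picks the provider here; a cost or performance policy
        could replace it."""
        def __init__(self, providers):
            self.providers = providers
        def _pick(self, key):
            return self.providers[zlib.crc32(key.encode("utf-8")) % len(self.providers)]
        def put(self, key, data):
            self._pick(key).put(key, data)
        def get(self, key):
            return self._pick(key).get(key)

    class Raic1:
        """RAIC-1: every object is replicated to all configured providers."""
        def __init__(self, providers):
            self.providers = providers
        def put(self, key, data):
            for p in self.providers:      # redundancy = number of providers
                p.put(key, data)
        def get(self, key):
            for p in self.providers:      # order providers by cost or speed as policy
                try:
                    return p.get(key)
                except Exception:
                    continue              # fall back to the next provider
            raise IOError("object %r unavailable on all providers" % key)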


Now, there are already service providers out there offering services that store data on Amazon S3 and Nirvanix SDN - companies like FreeDrive and JungleDisk. However, these companies are providing cloud storage as a service rather than offering a tool which integrates the datacentre directly with S3 and SDN.
I'm proposing middleware which sits on the customer's infrastructure and provides the bridge between the internal systems and the infrastructure providers. How this middleware should work, I haven't formulated yet. Perhaps it sits on a server, perhaps it is integrated into a NAS appliance or a fabric device. I guess it depends on the data itself.
At this stage there are only two cloud storage infrastructure providers (CSIPs), however the barriers to entry in the market are low: just get yourself some kit and an API and off you go. I envisage that we'll see lots of companies entering the CSIP space (EMC have already set out their stall by offering Atmos as a product; they just need to offer it as a service via Decho) and if that's the case, then competition will be fierce. As the number of offerings grows, the ability to differentiate and to access multiple suppliers becomes critical. When costs are forced down and access becomes transparent, then we'll truly have usable cloud storage.

Monday 15 December 2008

HDS Play Catch Up


Second post today!


So HDS have announced solid state disks. Here's the post. They're only 11 months behind EMC, and once the drives actually become available it will be nearly a full 12 months "late".

It's interesting to note that HP haven't (yet) made an equivalent announcement for the XP. I imagine it will follow in the fullness of time, although it's odd, as Hitachi product announcements tend to be released by HDS and HP at the same time.

I wonder what HDS really think about this announcement? Part of the press release says:

"Flash-based SSDs in the USP V will help differentiate Hitachi Data Systems high-end offerings when deployed in combination with the company’s virtualization, thin provisioning and integrated management features."

Well, duh! As no other vendor (excluding HP, obviously) offers virtualisation, clearly the USP will always be differentiated, regardless of SSD support.

None of the HDS bloggers have co-ordinated a post with the announcement so there's no depth behind the press release, for instance to explain exactly how SSD and virtualisation create a differentiator - Tony??


Redundant Array of Inexpensive Clouds - Pt I

Storagezilla was quick to turn a Twitter conversation into a PR opportunity for EMC this week. Have a read. As one of the originators of this conversation, I'd intended to blog on it but was slightly beaten to print. Never mind, I've got more content to add to the discussion.

The original question was whether IT departments with purely DAS environments should consider going straight to cloud storage rather than implement traditional NAS or SAN.

For me the answer at the moment is a resounding no. Cloud computing is far too unreliable to commit production/operational data to it. However that's not to say the cloud can't be used for some things.

First of all, consideration needs to be given to the fact that all storage environments have a working set of data and that this forms only a small part of the overall quantity of data deployed across an enterprise. Most data is created and very quickly becomes inactive. This includes structured data, email, unstructured files and so on.

In some organisations, inactive data is retained - sometimes indefinitely, especially if it relates to content deemed "too hard" to process or legally sensitive. This inactive data is the perfect candidate for migration into the cloud, for a number of reasons:

  • It gets the data out of expensive datacentres, where the cost of maintaining that data is not just about the cost of the storage hardware, but also the whole TCO relating to data retention; power/cooling/floorspace, backup, technology refresh and so on.
  • It moves the data into a location where the cost of maintenance is simple to calculate as the cloud providers simply charge per GB per month.
  • It puts the data in a place where cloud providers could offer value added services.

Now, by value added services, I'm referring to a number of things. There's the possibility to offer simple services like automated virus scanning, content conversion and so on. There's also the option for the cloud providers to offer more advanced services.

Imagine you've terabytes of unstructured content that's been too difficult to process; perhaps there's copyrighted material in there, perhaps there's commercially useful data. Whatever it is, you don't have the time or the inclination to manage it, so up to now the data has been left, moved to cheaper storage and simply dumped in the storage landfill. Enter the cloud providers. For a fee, they will take this data off your hands and pick over it like parasites, removing illegal content, deleting irrelevant data and returning to you the gems in the rough that you should be re-using.

The cloud guys are in a perfect position to do it as they get to see *lots* of data and can build models of the content which allow them to automate the analysis process.

Now, if data is pushed into the cloud, you will (a) want to guarantee the security of the data and (b) want to standardise access to these providers. More on this in the next two posts.

Thursday 11 December 2008

2V Or Not 2V (vendors that is)


Over on ITToolbox the age old subject of dual versus single vendor strategy has raised its head again.


This time, the consensus, apart from yours truly, was that a single vendor strategy was best - mostly because it is easier to implement.


I'm still of the opinion that a correctly executed dual-vendor strategy works well and can be achieved without the headache people think is involved. Here are some pointers as a recap.



  1. Standardise Your Services. I've seen many sites where a particular vendor is chosen over another for certain services - for instance using EMC for remotely replicated storage and HDS for non-replicated. If you want a real dual-vendor environment, each platform should offer the same services (barring genuine exceptions).

  2. Standardise Your Support Matrix. Here's another issue: using one vendor for Windows and another for Unix because of things like driver or multi-pathing support.

  3. Standardise Your Configuration. Keep things consistent: create a design which you treat as common between vendors. For instance, in an enterprise array, create a standard model which defines front-end port/cache/disk ratios and set a "module" size - this may be 8 FEPs/100 hosts/100GB. This becomes your purchasing unit when requesting quotes (see the sketch after this list).

  4. Standardise Your Provisioning. Lots gets said about having to train staff twice or maintain two teams. This just isn't necessary. What is important is to document how storage is selected and provisioned (port choice, masking, LUN size etc).

  5. Standardise Your Offering. Give your customers no reason to question where their storage comes from. All they care about is availability and performance.
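
As a rough illustration of the "module" idea in point 3, here's a minimal sketch that turns a request into a number of standard purchasing units. The figures are the example ratios from the post; real values would come from your own sizing standards.

    import math

    # Example "module" from point 3 - substitute your own standard ratios
    MODULE = {"front_end_ports": 8, "hosts": 100, "capacity_gb": 100}

    def modules_required(hosts_needed, capacity_gb_needed):
        """Round a request up to whole standard purchasing units."""
        by_hosts = math.ceil(hosts_needed / MODULE["hosts"])
        by_capacity = math.ceil(capacity_gb_needed / MODULE["capacity_gb"])
        return max(by_hosts, by_capacity)

    # e.g. a request for 250 hosts and 180GB would be quoted as 3 modules
    print(modules_required(250, 180))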

OK, there are some problems with dual-vendoring.



  1. Implementing a common tool set. No-one really fully supports multi-vendor provisioning. You will have to use more than one tool. Accept it. You can mitigate the problem, however, by sensible scripting where necessary. This includes creating scripts which handle failover/replication on both HDS and EMC equipment (see the sketch after this list). It can be done but needs to be thought through.

  2. Migration. Moving data from one platform to another will be problematic cross-vendor. However there are tools out there to do it - host and fabric based (even some array-based tools). Migration techniques need to be given serious thought before you spread data far and wide.

  3. Functionality. Not all vendors are the same and functionality is an issue. For instance, until recently the "big-boys" didn't do thin provisioning. You may have to compromise on your functionality or accept a limited amount of "one vendor only" functions.
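
On the scripting point in problem 1, a wrapper along these lines is one way to hide the vendor CLIs behind a single verb. The command names (symrdf for EMC SRDF, horctakeover for HDS TrueCopy/CCI) are the usual tools, but the exact flags and group names here are placeholders - check them against your Solutions Enabler and CCI versions before use.

    import subprocess

    VENDOR_FAILOVER = {
        # vendor: command template taking the replication/device group name;
        # flags are placeholders to adapt to your environment
        "emc": ["symrdf", "-g", "{group}", "failover", "-noprompt"],
        "hds": ["horctakeover", "-g", "{group}"],
    }

    def failover(vendor, group):
        """Run the vendor-appropriate failover command for a replication group."""
        cmd = [part.format(group=group) for part in VENDOR_FAILOVER[vendor]]
        subprocess.run(cmd, check=True)

    # Operators call failover("emc", "PROD_DG") or failover("hds", "PROD_DG")
    # without needing to remember which array sits underneath.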

Dual vendor is not for everyone. Size and complexity of environment will determine whether you feel comfortable with investing the time to manage dual (or even multi) vendors. However it can work and save you a shed load of cash into the bargain.

Wednesday 10 December 2008

Storage Waterfall Revisited


A while back I presented a diagram showing how storage is lost throughout the provisioning process.


I've added a few more items to the diagram and here's version 2. The additions show reasons why storage is lost at various points in the cycle - for example, disks not in use, hot spares, not using all the remaining space on the disk and so on.


If anyone has additional reasons I've missed, then please let me know.


The next step is to look at ways of retrieving this storage and improving efficiency.

Tuesday 9 December 2008

All I Want For Christmas...

In the words of Mariah Carey, I don't want a lot for Christmas, I've got everything I need, but possibly not everything I want. Here's my Crimble list for this year (in no particular order):

  1. MacBook Pro - I guess I should see what all the fuss is about. I've never been an Apple fan (I guess it's a Marmite thing, you love them or hate them). Obviously I'll make sure I have VMware Fusion to run some decent Windows apps.
  2. Sony Ericsson Bluetooth Headphones. Can't get a pair of these in the UK, despite trying and having a confirmed order cancelled. I already have the iPod Bluetooth broadcaster, so I just need something to send the audio to!
  3. Seagate FreeAgent Go. You can *never* have too much personal storage, and it's hard to turn down brushed metal and a docking station. Preferred colour: green.
  4. Jaguar XK60. Slightly off-track, but desirable nonetheless. I don't actually care about the colour (although if push comes to shove I probably would). I expect this is the least likely item to be in my stocking on Christmas morning (unless it's a model one).

What's on your list?

Monday 8 December 2008

Testing Out IT Principles - With Children

I was driving my youngest son back from Beavers this evening and we were talking computers and games (he's 6). He reminded me that some of his games won't run on Vista, which is why I'd installed XP as a dual boot on the kids' machine. He asked me if I'd played "Age of Mythology" when I was young. I had trouble explaining to him the concept of an audio cassette tape and how my Sinclair Spectrum took 15 minutes to load games. I tried LPs as a starting point to explain cassettes and he said "oh yes, one of our teachers brought one in once". To him it was a piece of ancient history.

The interesting point we reached was me trying to explain why I'd even upgraded the kids' PC to Vista in the first place. I couldn't come up with a good reason at all (so I made up some lame excuse about future support).

Perhaps we should explain all of our upgrade/purchase decisions to our children and try and justify it with them - it might help us to understand it ourselves!

Thursday 4 December 2008

Betamax

In case you don't always go back and look at comments (and let's face it, it's not easy to track them), have a look at Tony's comment on my post from yesterday. It's good to see a bit of banter going on, and that's what HDS could do with.

HDS have hardly developed a blogging prowess; it's more a case of "oh well, better do that" than taking a lead in new media.

Look at EMC.

There's geeky leaky Storagezilla with his uber-technical posts and sneaky advance notice of EMC technology.

Next, The Storage Anarchist, with his acerbic character and product assassinations of the competition.

And who can forget Chuck, EMC's futurologist with his head in the cloud.

There are others of course, filling in the technical detail. Apologies if I haven't mentioned you by name. EMC have certainly grabbed Web 2.0 and given it a good shake.

Sadly HDS don't seem to have the same enthusiasm for marketing to their customers. Blog posts are few and far between from the small slew of bloggers they have to date. Content is shallow and that's a big problem.

We *all* know USP is faster than DMX. Anyone who's had the products in their lab knows exactly what I'm talking about. Unfortunately, unless HDS make a song and dance about it, they're going to be the Betamax of the Enterprise storage world.

Tony, keep the posts coming! Give us some real substance to beat up the competition with!!

Wednesday 3 December 2008

2 Days, 2 Bod Posts

For the second time in two days I find myself drawn to comment on a Storagebod related post.

The subject today is Tony Asaro's rant about one of StorageBod's recent posts denigrating virtualisation.

Now let's get things clear - I like HDS's implementation of virtualisation. I've deployed it, I'd recommend it again, but some of Tony's comments are way off base.

"The cost per GB for your DMX FATA and SATA drives is much higher than using other tiered storage solutions." - yes, but UVM ain't free - there's a licence charge. When you virtualise an array, you're not paying for just JBOD, you're paying for extra stuff like the controllers. Also on the USP array you have to reserve out ports for virtualisation; if you connect the storage together through a fabric then you'll be taking up fabric ports too. The point is, the cost of HDS virtualisation means there's a break even point in the TBs of storage - from my experience, that was a big number.

"Storagebod does not want to have applications span multiple storage systems but other IT professionals are open to doing this. And storage virtualization is a powerful technology to enable this. That is the point of any virtualization technology - to overcome the physical limitations of IT infrastructure." - there are very good reasons for not spanning arrays with applications, like ensuring replicated storage is consistent, for instance. Whilst virtualisation allows a virtual array to grow in size to almost limitless amounts (247PB in USP V) it also means there's a concentration of risk; multiple components to fail, multiple places where data can be pinned in cache when things go wrong. In fact, placing data on lower availability arrays will increase risk.

"That may be true for Storagebod but that is not my experience in most data centers. We are shifting from transactional data being the “big space-hogs” to unstructured data consuming the lion’s share." - this may be true, but USP LUN-based virtualisation isn't going to help here. Overlaying file-level granularity data migration onto LUN-based arrays would require a particularly complicated scheme for ensuring data for migration was concentrated onto exactly the right LUNs so they could be moved to another tier. Anyway, why put unstructured data on expensive enterprise arrays?

I think we all expected Tony would have something better to talk about than technology HDS brought to the market 4+ years ago. We need to hear something new, something game-changing (oh and not me-too stuff like HDS putting SSDs into their arrays).

Tomorrow I *promise* I'll talk about something else.

Tuesday 2 December 2008

The SRM Conundrum

Martin (Storagebod) has an interesting post today. Rather than post a long reply, I've chosen to steal his thunder and post specifically on the subject of SRM tools.

Apart from when I worked in the mainframe storage arena, I've always struggled with SRM tools. Just for reference, the mainframe was great - SMS did the job, although there were a few shortcomings, like the lack of quota tools. In the open world, things are so, so different. I think the reason open systems are a problem is that, although standards exist, the technology implementations are all different.

Look back at my recent post; there are two fundamental issues at play here. First of all, each vendor has a different implementation of the technology - EMC/HDS/IBM/3PAR/Pillar/EqualLogic, the list goes on. Why are they different? Because there has to be something to create a USP, a differentiator. Sure, the front-end technology might be consistent - each vendor will implement LUNs and the fibre channel standards - but in reality the back-end implementation will be different, as each manufacturer competes on features and functionality. The same applies to the switch vendors, the NAS vendors, and so on.

SMI-S was meant to address these problems but never would, as it basically dumbs each vendor down to a single set of features and doesn't address platform-specific functionality. Try using IBM and HDS arrays from ECC (in fact, try managing EMC arrays like Clariion from ECC) and you'll fall at the first hurdle. I won't even suggest trying to use any other product like HiCommand...

Some software vendors have tried to do cross-platform SRM. Think of Creekpath. It failed miserably to offer cross-platform support because (as Martin rightly states) they never understood how storage admins did their work.

The answer to the lack of an SRM tool would be for an independent to develop one. However there's one major barrier to entry and that's the vendors themselves. All the major vendors make a tidy profit (Martin's cash cow) from their SRM tools - software without which you could do *nothing* but for which you are obliged to pay. Why would those vendors give up that monopoly position?

I've been working on a tool for some months (see here) which will provide cross-platform reporting, but full SRM is another step again. Without full vendor support - and by that I mean full knowledge of the APIs and interfaces to their products, not just the standard SMI-S providers, plus advance notice of and access to new features - developing an SRM tool will be impossible.

However if anyone is prepared to pony up the cash, I'm still up for it!!

Monday 1 December 2008

Home Storage Management #1



My first-pass cleanup has focused on my laptop, which is my main work device.

I've already mentioned I segment data from applications by having a separate partition, in my case labelled L:\ for local. I also use offline files to map most of my data from a personal network share on my main file server.


The Offline Files feature enables files from network file servers to be cached locally on a desktop or laptop for access when the PC is not connected to the network. As I travel a lot, Offline Files is essential for me and my local cache is quite large. However, like a lot of people, I choose to sync the whole of my network drive.

Using Treesize, I browsed the Offline Files cache, which is located by default in the CSC directory under the systemroot folder - in my case C:\Windows\CSC (CSC stands for Client Side Caching). A nice feature of Treesize is its ability to traverse the Offline Files folder directly, as if it were a standard file system. That quickly allowed me to sort the offline files by size and type and immediately highlight some issues. I found:


  1. A directory called BackupToDiskTest which I'd used to test a backup product in 2005 (12GB of unwanted data).
  2. A large number of ISO files for software installation, which I moved to an archive directory on the main server.
  3. 2.7GB of home movie AVI files, subsequently moved to the main server.

Obviously I've been lazy in dumping everything into my own directory, including data which I don't need offline. Now, I didn't delete all of these files, but I did save space on my laptop drive, which is pretty limited at just over 103GB.
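
For anyone without Treesize, a rough stand-in for this exercise is a short script that walks a folder and lists the biggest files first - a sketch only, with the path and report size as assumptions to change for your own setup.

    import os

    def largest_files(root, top=20):
        """Walk a folder and print the 'top' largest files, biggest first."""
        sizes = []
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    sizes.append((os.path.getsize(path), path))
                except OSError:
                    pass   # skip files we can't stat (in use, no access)
        for size, path in sorted(sizes, reverse=True)[:top]:
            print("%8.1f MB  %s" % (size / 2**20, path))

    # e.g. largest_files(r"C:\Windows\CSC") - scanning the CSC cache
    # typically needs elevated rights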



Rescanning the C:\ drive, I now found "System Volume Information". This is an area of disk used by Windows to store recovery information in the event that you need to restore your laptop to a previous known "good configuration". In my case, Windows was using 12.6GB of storage to retain my previous configuration details. Now, touch wood, I've never bothered to use the restore feature of Windows. I keep my machines pretty tidy and don't install a lot of test or junk software. The last restore point appeared to have been created by my virus scanner so I felt confident to delete the restore information. I did this by simply unchecking, applying and rechecking the drive letter in Control Panel -> System -> System Protection.

I also found a few other bits and pieces - some content in BBC iPlayer that had expired and could be deleted; 3.5GB of temp files in my local profile; another 5GB of home movie WMVs on my L: drive which I moved to the server.

So at the end of pass #1, things stand as follows:


Laptop C:\ Drive - capacity 103GB - allocated reduced from 75.4GB to 63.8GB (15%)

Laptop L:\ Drive - capacity 38.7GB - allocated reduced from 34.85GB to 24.1GB (31%)
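
For the record, those percentages are simply (before - after) / before:

    # Quick check of the quoted savings percentages
    for before, after in [(75.4, 63.8), (34.85, 24.1)]:
        print(round((before - after) / before * 100))   # prints 15, then 31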

I'm pleased with the savings, however there's a lot more to do. Each cleanup highlights new issues, and I don't believe Offline Files has yet released the space for all of the files I moved. In money terms, the recovered space doesn't equate to anything of value; however, it does mean that as I move to consider online backups, only the relevant data will be backed up - and that does translate into money.