Wednesday 25 April 2007

What's your favorite fruit? EMC versus HDS

Nigel has posted the age-old question: which is best, EMC or HDS? For those who watch Harry Hill, there's only one way to sort it out - fiiiiight!

But seriously, I have been working with both EMC and HDS for the last six years on large-scale deployments, and you can bet I have my opinion - Nigel, more opinion than I can fit in a comment on your site, so forgive me for hijacking your post.

Firstly, the USP and DMX have fundamentally the same architecture: front-end adapter ports and processors, centralised and replicated cache, and disks on back-end directors. All components are connected to each other, providing a "shared everything" configuration. Both arrays use hard disk drives from the same manufacturers, which have similar performance characteristics, and both offer multiple drive types, the DMX3 including 500GB drives.

From a scalability perspective, the (current) USP scales to more front-end ports but can't scale to the capacity of the DMX3. Personally, I think the DMX3 scaling is irrelevant - who in their right mind would put 2,400 drives into a single array (especially with only 64 FC ports)? The USP offers 4Gb/s FC ports; I'm not sure whether the DMX3 does too. The USP scales to 192 ports, the DMX3 only 64 (or 80 if you lose some back-end directors).

The way DMX3 and USP disks are laid out is different. The USP groups disks into array groups depending on the RAID type - for instance, a 6+2 RAID group has eight drives. It's then up to you how you carve out the LUNs - they're completely customisable to whatever size you choose. Although a configuration file can be loaded (like an EMC binfile), it's rarely used; LUNs are usually created by the user through Storage Navigator, the web interface to the USP SVP. LUN numbering is also user-configured, so it's possible to carve all LUNs consecutively from the same RAID group - not desirable if you then assign those LUNs sequentially to the same host.

EMC splits physical drives into hypers, which are then recombined to create LUNs - two hypers for a RAID1 LUN, four for a RAID5 LUN. The hypers are selected from different (and usually opposing) back-end FC loops to provide resiliency and performance. It is possible for users to create LUNs on EMC arrays (using Solutions Enabler), but it's usually not done; customers tend to get EMC to create new LUNs via a binfile change, which replaces the LUN mapping with a new configuration. This can be a pain, as it has to go through EMC validation and the configuration has to be locked against further changes until EMC implements the binfile.
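
To make the LUN numbering point concrete, here's a minimal sketch (Python, with entirely hypothetical array group names and sizes) of the kind of round-robin placement I'd want, so that consecutively numbered LUNs handed to one host don't all come from the same array group:

    # Sketch: spread consecutively numbered LUNs across array groups, so a host
    # given sequential LUN numbers doesn't hammer a single RAID group.
    # Array group names, LUN count and size are hypothetical.

    from itertools import cycle

    array_groups = ["1-1", "1-2", "2-1", "2-2"]   # hypothetical 6+2 array groups
    luns_required = 16
    lun_size_gb = 50

    allocation = []
    for lun_id, group in zip(range(luns_required), cycle(array_groups)):
        allocation.append({"lun": f"{lun_id:04X}", "array_group": group, "size_gb": lun_size_gb})

    for entry in allocation:
        print(entry["lun"], entry["array_group"], entry["size_gb"])

The real carving still happens in Storage Navigator (or via a loaded configuration file); the sketch just shows the placement policy I'd want the numbering to follow.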

For me, the main difference is how features such as synchronous replication are managed. With EMC, each LUN has a personality even before it is assigned to a host or storage port - it may be a source LUN for SRDF (an R1) or a target LUN (an R2). Replication is defined from LUN to LUN, irrespective of how the LUNs are then assigned out. HDS, on the other hand, only allows replication to be established once LUNs are presented on a storage port, and the pairing is based on the position of the LUN on the port. This isn't easy to manage and I think it's prone to error.
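
To illustrate why I think the port-position approach is prone to error, here's a rough sketch (hypothetical device names and port contents, not any real array) showing how a position-based pairing can silently diverge from the device-to-device pairing you intended:

    # Sketch: device-to-device pairing (EMC-style) vs port/position pairing
    # (HDS-style). All device and port contents below are hypothetical.

    intended_pairs = {"DEV_001": "DEV_101", "DEV_002": "DEV_102"}  # source -> target by identity

    # What is actually mapped on the ports today (position-based view).
    source_port = ["DEV_001", "DEV_002"]   # positions 0 and 1 on the source port
    target_port = ["DEV_102", "DEV_101"]   # positions 0 and 1 on the target port (note: swapped!)

    # Position-based pairing just lines the two ports up, index by index.
    position_pairs = dict(zip(source_port, target_port))

    for src, tgt in position_pairs.items():
        if intended_pairs.get(src) != tgt:
            print(f"WARNING: {src} would replicate to {tgt}, intended target was {intended_pairs.get(src)}")

Pairing by device identity survives LUNs being re-mapped; pairing by position is only as good as the port layout on the day you set it up.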

Now we come to software, and EMC wipe the floor with HDS at this point. Solutions Enabler, the tool used to interact with the DMX, is slick, simple to operate and (usually) works with a consistent syntax. The logic to ensure replication and point-in-time commands don't conflict or lose data is very good, and it takes a certain amount of effort to screw data up. Solutions Enabler is a CLI, so it's quick to install and lightweight. There's a GUI version (SMC) and then the full-blown ECC.
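
As an example of why a decent CLI matters day to day: most of what I do with Solutions Enabler ends up wrapped in scripts. Here's a minimal sketch that runs a query before taking any replication action - it assumes Solutions Enabler is installed and on the path, and the device group name is a made-up example, not a real configuration:

    # Sketch: wrap a SYMCLI query in a script before taking any replication action.
    # Assumes Solutions Enabler is installed and on the PATH, and that the device
    # group named below exists - both are assumptions for illustration.

    import subprocess
    import sys

    DEVICE_GROUP = "WEB_DG"   # hypothetical device group name

    result = subprocess.run(
        ["symrdf", "-g", DEVICE_GROUP, "query"],
        capture_output=True,
        text=True,
    )

    if result.returncode != 0:
        print(f"Query of {DEVICE_GROUP} failed, not proceeding:\n{result.stderr}")
        sys.exit(1)

    print(result.stdout)   # eyeball the pair states before doing anything destructive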

HDS's software still leaves a lot to be desired. Tools such as Tuning Manager and Device Manager are still cumbersome. There is CLIEX, which provides some functionality via the command line, but none of it is as slick as EMC's. Anyone who uses CCI (especially the earlier versions) will know how fraught with danger its commands can be.

For reliability, I can only comment on my own experience: I've found HDS marginally more reliable than EMC, but that's not to say the DMX isn't reliable.

Overall, I'd choose HDS for hardware. I can configure it more easily, it scales better and, as Hu mentions almost weekly, it supports virtualisation (more on that in a moment). If I was dependent on a complex replication configuration, then I'd choose EMC.

One feature I've not mentioned so far is virtualisation. The HDS USP and NSC55 offer the ability to connect external arrays and present them as HDS storage. There are lots of benefits to this - migration, cost savings and so on - I don't need to list them all. Virtualisation is a great feature, but it is *not* free and you have to look at the cost benefit of using it - or beat up your HDS salesman to give it to you for free. Another useful HDS feature is partitioning: an array can be partitioned to look like up to 32 separate arrays, which is great if you want to segment cache, ports and array groups for performance isolation or security.

There are lots of other things I could talk about but I think if I go on much further I will start rambling...

Tuesday 24 April 2007

Optimisation tools

Large disk arrays can suffer from an imbalance of data across their RAID/parity groups. This is inevitable, even if you plan your LUN allocation, as data profiles change over time and storage is allocated and de-allocated.

So, tools are available to address this - think of EMC Optimizer, HDS Cruise Control and Volume Migrator.

I've put a poll up on the blog to see what people think - I have my own views and I'll save them until after the vote closes next week.

Goodbye ASNP

It's all over. ASNP is no more. Not really a surprise, as it stood for nothing useful. With 2,500 members it could have been so much more; however, I don't think it will be missed.

Kryptonite Discovered!

Totally off topic but fantastic nonetheless! Kryptonite Discovered

Monday 23 April 2007

Hurrah for EMC

Hurrah! EMC has implemented SMI-S v1.2 in ControlCenter and the DMX/Clariion (although the reference on the SNIA website seems to relate to SMI-S v1.1). Actually, it seems that you need ECC v6.0 (not out yet, and likely to be a mother of an upgrade from the current version), and I'd imagine the array support has been achieved using Solutions Enabler.

So, a quick poll: how many of you out there are using ECC to manage IBM DSxxx or HDS USP arrays? How many of you are using HSSM to manage EMC arrays? How many of you are using IBM TPC to manage anything other than DSxxx arrays?

Simulator Update

Following a few comments on the previous simulator post, it doesn't look like there are any more simulators out there for general use.

If anyone does know - feel free to comment!

Wednesday 18 April 2007

The Power Question

I've seen a lot of discussion (and I think a bit of a theme at SNW) on power consumption in datacentres. Obviously the subjects of global warming and increased energy prices have put this at the centre of attention. But I don't think there's an issue when a datacentre is first built; the problem comes along as it fills up with equipment. Invariably, new equipment (especially in the storage world) comes in denser and requires more power per rack or square metre. So, as equipment is swapped out and replaced, the original calculations of how much power per square metre is needed are no longer accurate, and the balance tips from "have we got the space for new equipment?" to "can we power the new kit up?"
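
To illustrate the drift with some made-up but plausible numbers: the floor is sized for one power figure per rack position, and each equipment refresh eats further into it:

    # Sketch: how a fixed power budget per rack position becomes the constraint
    # as kit is refreshed. All figures are illustrative, not measurements.

    design_budget_kw_per_rack = 4.0              # what the floor/UPS was sized for
    generations_kw_per_rack = [3.5, 6.0, 9.0]    # successive equipment refreshes

    for generation, demand in enumerate(generations_kw_per_rack, start=1):
        racks_supportable = design_budget_kw_per_rack / demand
        print(f"Generation {generation}: {demand} kW/rack -> "
              f"{min(racks_supportable, 1.0):.0%} of floor positions usable at full density")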

I don't see how this problem will be solved, as datacentre planners will always cater for the power and cooling of today's products, not the mythical power demands of future products. Datacentres will therefore have a finite life, after which you may as well start again.

Here's a practical example: there is a manufacturer of highly dense storage arrays (which don't need to be powered up all the time) who can't deploy into a number of London datacentres I know of, because the product's density would put the array through the floor. Those datacentres were never designed to take products of that weight...

Tuesday 17 April 2007

Another Great Idea

I've another great idea for a software product (I have these ideas from time to time, but converting them into reality always proves difficult).

So: museum environments for backups. They're going to be a major headache going forward, even more than they are today, as there are increasing demands on the timely keeping and retrieval of backups. What's needed is a product which can understand legacy backup products and do two things: (a) extract a copy of the backup catalog into a single database, based on a standard schema for backup data, and (b) read the content of backup media directly, without the need to use the old backup product.
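
To make (a) a little more concrete, here's a minimal sketch of what a common, product-neutral catalog might look like, using SQLite; the table and column names are entirely my own invention, not any vendor's schema:

    # Sketch: a single, product-neutral backup catalog in SQLite.
    # Table and column names are hypothetical - the point is the common schema,
    # not the specifics.

    import sqlite3

    conn = sqlite3.connect("museum_catalog.db")
    conn.executescript("""
    CREATE TABLE IF NOT EXISTS backup_image (
        image_id     INTEGER PRIMARY KEY,
        source_tool  TEXT NOT NULL,      -- e.g. a legacy Legato or NetBackup version
        client_host  TEXT NOT NULL,
        started_at   TEXT NOT NULL,      -- ISO 8601 timestamp
        media_label  TEXT NOT NULL
    );
    CREATE TABLE IF NOT EXISTS backup_file (
        image_id     INTEGER REFERENCES backup_image(image_id),
        file_path    TEXT NOT NULL,
        size_bytes   INTEGER,
        modified_at  TEXT
    );
    """)

    # Each legacy product would need its own importer that maps its catalog into
    # these two tables; restores then search one place, at file level.
    conn.commit()
    conn.close()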

This may seem like backup software companies giving away their IP, but I don't think it is. I was in a discussion recently where EMC would not give support on (admittedly very) old versions of Legato, especially with respect to merging catalogs from multiple platforms. This leads to costly and risky options requiring the retention of legacy hardware (subject to failure), legacy software (no longer supported) and legacy media (prone to failure). The lack of a single catalog also precludes the ability to easily identify and manage backups and backup images when multiple backup systems exist.

I wonder if any of the vendors out there would be happy to let me have copies and information on the defunct versions of their backup products?

Footnote: I am aware of the backup management products out there, like Bocada; to my knowledge none of them actually merges catalogs into a file-level view or offers direct media restore.

Storage as a commodity

I just read a comment over at Zerowait regarding Netapp and proprietary hardware. It reminded me of something I was thinking about recently on the commoditisation of storage.

There's nothing worse to my mind than a storage vendor who has no competition. Inevitably, in some organisations that situation can exist when a single supplier is chosen to supply (for example) switches, SAN or NAS. The difficulty, though, is how to avoid that situation. Most vendors would love to lock you into their proprietary tools and, relating back to the article linked above, Netapp is one I see trying that more than anyone. They have a bewildering array of interlinked product options; once you're hooked (especially where you use a feature to retain long-term backups via snapshots/vaults), you're sucked into a dependency on their products which just isn't healthy.

What's the solution? Well, for me I like to commoditise storage functionality. Pick out those features which all vendors support and only use the proprietary features where absolutely necessary. At least then you can maintain multiple vendors all on the hook for your next piece of business.

Of course implementing commoditised storage is more difficult than just picking a few common product features. However as far as your users are concerned, a LUN is a LUN and NAS storage is NAS storage, with a few caveats on things like driver levels for HBAs and so on.

I've previously posted a modular storage comparison sheet. As an example, here are some of the features that almost all support:

  • RAID 5 protection
  • Consistent LUN size
  • Dual pathing
  • Active/passive failover
  • Remote replication
  • Fibre Channel presentation
  • SNMP alerting
  • Online code upgrades
  • Hot-swappable components

Before I get lots of comments saying "hold on, not all modular products are the same" - remember, I'm not saying that. What I am saying is that having a consistent set of requirements allows you to maintain a shortlist of vendors who can all enjoy the healthy competition of bidding for business. So, time to draw up a NAS spreadsheet...
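
Here's a trivial sketch of the shortlisting idea; the vendor names and capability sets are placeholders, not an assessment of any real product:

    # Sketch: keep a shortlist of vendors who meet the commodity feature set.
    # Vendor names and their capabilities below are placeholders only.

    requirements = {
        "raid5", "dual_pathing", "remote_replication",
        "fc_presentation", "snmp_alerting", "online_code_upgrade",
    }

    vendors = {
        "Vendor A": {"raid5", "dual_pathing", "remote_replication",
                     "fc_presentation", "snmp_alerting", "online_code_upgrade"},
        "Vendor B": {"raid5", "dual_pathing", "fc_presentation", "snmp_alerting"},
    }

    # Only vendors meeting every requirement stay on the shortlist.
    shortlist = [name for name, features in vendors.items() if requirements <= features]
    print("Shortlisted:", shortlist)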

Friday 13 April 2007

AoE/FCoE/iSCSI

Robin Harris discusses AoE from Coraid. I looked at this last year (reminder), as I saw it as a great way to get an FC/iSCSI-style solution at low cost. However, before everyone rips out their FC SANs and runs to put an Ethernet solution in place, take one step back and consider the issues. Fibre Channel is successful because it works and because it is reliable. FC switches have features such as non-blocking architectures, QoS, preferred paths and so on which help to reduce or eliminate throughput and performance issues. Would ATA over Ethernet (or, for that matter, FC over Ethernet, as it seems to be a topic of the moment) provide that level of point-to-point bandwidth guarantee between switches?

Consider also your monitoring tools. Both Brocade and Cisco offer features to redirect traffic (e.g. SPAN ports) so you can easily analyse SAN traffic without putting TAPs in place. Will AoE and FCoE offer that?

Consider security. Will AoE provide the same level of security as FC?

Without a doubt, you get what you pay for; however, you should only pay for what you *need*. If you are running a mission-critical application, FC is still the best option - interoperability is more widely tested, diagnostic tools are mature and the technology is reliable. I do think there's a place for AoE, iSCSI and FCoE, but use them in the wrong place and what you save in cost you may pay for later in downtime.

Wednesday 11 April 2007

Where are all the simulators

I love the Netapp simulator (well, apart from the annoying issues with creating and deleting disks) and I use it all the time. It is great for testing ideas, testing scripts and generally refreshing my knowledge of commands before having to touch real equipment. I use it with VMware (as I have probably mentioned before) and I can knock up a new environment in a few minutes by cloning an existing machine. Netapp gain a huge advantage by offering the tool, as it enables customers who can't or won't put in test equipment to do this work while protecting their production environments.

So, where are all the other simulators? Is it just that I don't know they exist, or do most vendors not provide them? For the same reasons as above, if there were simulators for EMC DMX, HDS USP, and Cisco and Brocade/McDATA switches, there would be a huge opportunity for people to test and develop scripts, test upgrades and do other useful work.

Would anyone else like a simulator? Can the vendors tell me why they don't produce them?

Monday 9 April 2007

Improving efficiency

Hu posted an interesting view of storage utilisation here. His view is that virtualisation on the USP would improve on the average 30% utilisation. I have to say that I disagree. There are lots of reasons why storage remains unused in the enterprise:


  1. Inefficient administration by Storage Admins, SAs and DBAs (lost/orphan disks etc.)
  2. Deliberate overallocation by SAs and DBAs (in an attempt to manage change control versus growth)
  3. In-process tasks (frame/host migrations, delayed decommissions)
  4. Data distribution for performance management
  5. Storage Admin "buffers" to manage growth.

Many of these issues are process-driven rather than due to the inadequacies of the technology. Now, "true" virtualisation may address some of the problems listed above, especially those technologies which provide thin provisioning or other overallocation methods. There are plenty of technologies on the market already offering this style of allocation; however, there are obvious shortcomings that could hold it back, most notably performance and reporting.

Performance is seen as a problem due to the overhead of having to work out where blocks have been virtualised to. In addition, I/O must be written to the virtualisation device and rewritten to the physical storage, creating a "dual write" scenario, and data distributed across multiple devices may not provide a consistent performance profile, leading to uneven response times. In fact, these scenarios already exist in today's enterprise storage: data is written to cache and destaged at a later time; arrays may contain more than one disk size and type; data is mapped across physical disks in a RAID configuration.

Reporting is an issue because most people like to know where their data resides. There are good reasons for this: if hardware fails or a disaster scenario is invoked, it is important to know what data is pinned where - is it still sitting in cache? If a RAID group fails, what data did it contain and from which volumes? In addition, overallocation creates its own problems. Predicting how virtual volumes will grow their storage usage is tricky, and you most certainly don't want to get caught out refusing I/O write requests for production disks. This issue was very obvious with the early implementation of thin provisioning on Iceberg (a StorageTek storage subsystem), where reporting failed to cope with volumes that had been copied with snapshots.
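
On the overallocation point, this is the sort of projection I'd want the reporting to give me (a sketch with made-up pool figures):

    # Sketch: project when an overallocated (thin) pool runs out of physical
    # space. All figures are made up for illustration.

    pool_physical_tb = 50.0        # physical capacity behind the pool
    pool_allocated_tb = 120.0      # what hosts think they have (overallocated)
    pool_used_tb = 32.0            # physically consumed today
    growth_tb_per_week = 1.5       # observed consumption growth

    weeks_to_full = (pool_physical_tb - pool_used_tb) / growth_tb_per_week
    print(f"Overallocation ratio: {pool_allocated_tb / pool_physical_tb:.1f}x")
    print(f"At current growth the pool is full in ~{weeks_to_full:.0f} weeks")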

Now, if HDS were to add thin provisioning to the USP, how good would that be....

Saturday 7 April 2007

Distributed Backup

Following on from a previous post on RAID and backup, I've been doing some more thinking on how to back up consumer data from a PC workstation. I reckon I've got over 200GB of data on my server, which was previously on my main workstation. I dabbled with the Linksys NSLU2 but I hated it; I was really nervous about (a) the possibility of losing access to the device, (b) the way it couldn't cope with the volume of files and seemed to lose track of what I had allocated, and (c) how I would recover the data from the USB drives I used if the device eventually packed up. In fact, I got rid of the NSLU2 when it did lose track of my data. I was lucky to find a Windows read-only driver capable of reading the NSLU2 format, and I got my data back.

Getting back to the question in hand, how would I back up 200GB? I guess I could fork out a few grand for an LTO drive and tapes, but that's not cost-effective. I could do disk-to-disk copies (which I do), but D2D isn't as portable as tape and is much more expensive if I intend to maintain multiple copies. I should mention that I've automatically discounted DVD and HD DVD/Blu-ray due to lack of capacity and cost (the same applies to the other latest optical formats too).

I could use one of the many network backup services on offer. About 10 years ago, I looked at the feasibility of setting up one of these services for the storage company I worked for at the time. It was almost feasible: Freeserve was doing "free" dial-up internet (you paid just the cost of the calls) and companies such as Energis were selling virtual dial-up modems on very good terms. However, the model failed because the cost to the customer just didn't stack up, given the length of time it took to copy files out to the backup service.

I think network backup services *could* be the best answer to safeguarding your PC/workstation data. However, the existing services have issues for me: basically, I don't trust someone else with my data, which could include bank details, confidential letters and files. Even if I encrypt my data as it is transmitted to the backup service, they still hold *all* of my data and, with enough compute power, could crack my encryption key.

If anyone has examples of services which could provide 100% security, I'd be interested to know.

So, here's my idea: a distributed backup service. We all have plenty of free space on those 500GB drives we've installed, so why not distribute your backups amongst other users in a peer-to-peer fashion? There are two main drawbacks to my mind: first, how can I guarantee my data will always be available for me to access (PCs may be powered off), and second, how can I guarantee security?

Existing P2P services work by finding as many servers/PCs as possible which hold the data you want to download. Many may not be online; many may be online but running slowly. By locating multiple copies of the required data, the hope is that one or more will be online and available for download.

The same approach can be applied to backups: split files up, distribute the fragments to P2P servers and index where they are. The fragments would need to be encrypted in some way to guarantee anonymity, while still allowing fragments common to files on your machine and others to be shared. You then maintain an index to rebuild the backup data; all that needs to be backed up locally is the index, which could easily fit onto something like a CD-ROM. All your data could then be recovered from anywhere using just that small index.
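
Here's a very rough sketch of the fragment-and-index idea (Python, standard library only); the hashing stands in for the real encryption and de-duplication, which would need proper design:

    # Sketch: split a file into fragments, identify each by its hash, and keep a
    # small index that is all you'd need to rebuild the file from the P2P cloud.
    # Hashing stands in for real encryption/anonymisation here - an assumption,
    # not a design.

    import hashlib
    import json

    FRAGMENT_SIZE = 256 * 1024   # 256KB fragments (arbitrary choice)

    def build_index(path):
        index = {"file": path, "fragments": []}
        with open(path, "rb") as f:
            while True:
                chunk = f.read(FRAGMENT_SIZE)
                if not chunk:
                    break
                fragment_id = hashlib.sha256(chunk).hexdigest()
                # In the real thing the chunk would be encrypted and pushed to
                # several peers here; identical chunks share one fragment_id,
                # which is where the de-duplication comes from.
                index["fragments"].append(fragment_id)
        return index

    if __name__ == "__main__":
        idx = build_index("example.dat")      # hypothetical local file
        print(json.dumps(idx, indent=2))      # this small index is all you keep locally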

There are a lot of unanswered questions: how would the data be encrypted? How would the fragments be created and "de-duplicated"? How would fragments be distributed across the P2P members to ensure availability? And how would the fragments be constructed so that the original files couldn't be reconstructed by anyone else?

Still, it's only a concept at this stage. But using the internet and all that unused disk space out there could prove a winner.

Wednesday 4 April 2007

Giving RAID the thumbs up

Just read Robin Harris' post at his new blog location, http://blogs.zdnet.com/storage/?p=116, and his comment on another blog discussing RAID. He quotes a VAR who has tracked disk failures and thinks RAID is an expensive luxury for desktops.

It's interesting to see the failure rates quoted, anywhere from 1-3%, which on the face of it seems low. However, when it's *your* disk that has failed and the data is irretrievable, there's cold comfort to be had in failure rate statistics. I run RAID1 on my server, with two 500GB SATA drives. Backing up that volume of data on a regular basis is a nightmare without investing in a very expensive backup solution like LTO, and it's a real disappointment to see that tape hasn't kept pace with disk in terms of the capacity/cost ratio.

So, I'm sticking with RAID. I augment it with disk-to-disk backups because, yes, you do have to cater for the d'oh factor of user errors and even dodgy software that corrupts files, but RAID works for me and that's all I need to worry about.