Thursday, 21 December 2006

Understanding Statistics

I've been reading a few IDC press releases today. The most interesting (if any ever are) was that relating to Q3 2006 revenue figures for the top vendors. It goes like this:

Top 5 Vendors, Worldwide External Disk Storage Systems Factory Revenue 3Q2006 (millions)

EMC: $927 (21.4%)
HP: $760 (17.6%)
IBM: $591 (13.7%)
Dell: $347 (8.0%)
Hitachi: $340 (7.9%)

So EMC comes out on top, followed by the other usual suspects. EMC gained market share from all other vendors including "The Others" (makes me think of Lost - who are those "others"?). However, IDC also quote the following:

Top 5 Vendors, Worldwide Total Disk Storage Systems Factory Revenue 3Q2006 (millions)

HP: $1406 (22.7%)
IBM: $1250 (20.2%)
EMC: $927 (15.0%)
Dell: $507 (8.2%)
Hitachi: $348 (5.6%)

So what does this mean? IDC defines a Disk Storage System as at least 3 disk drives and the associated cables etc, to connect them to a server. This could mean 3 disks with a RAID controller in a server. Clearly EMC don't ship anything other than external disks as their figures are the same in each list. HP make only about 54% of their disk revenue from external systems; the rest presumably are disks shipped with servers. IBM make even less as a percentage (about 47%), and Dell's non-external revenue is about $160m. The intriguing one was Hitachi - what Disk Storage Systems do they sell (other than external) which made $8m of revenue? The source of my data can be found here:

What does this tell me? It says there's a hell of a lot of DAS/JBOD stuff still being shipped out there - about 30% of total revenue.
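For anyone who wants to check my sums, here is a quick Python sketch of the arithmetic. The figures are copied straight from the two lists above; the variable names are just my own, and the market sizes are backed out from EMC's revenue and quoted share, so they are only as accurate as IDC's rounding.

```python
# Revenue in USD millions, copied from the two IDC lists above (3Q2006).
external = {"EMC": 927, "HP": 760, "IBM": 591, "Dell": 347, "Hitachi": 340}
total = {"HP": 1406, "IBM": 1250, "EMC": 927, "Dell": 507, "Hitachi": 348}

# Revenue from anything that is not an external system, per vendor.
non_external = {v: total[v] - external[v] for v in total}

# External revenue as a percentage of each vendor's total disk revenue.
external_pct = {v: 100 * external[v] / total[v] for v in total}

# Assumption: estimate each market's size from EMC's revenue and quoted share.
external_market = external["EMC"] / 0.214
total_market = total["EMC"] / 0.150
non_external_share = 1 - external_market / total_market

for v in total:
    print(f"{v:8s} non-external ${non_external[v]:>4}m, "
          f"external {external_pct[v]:.0f}% of total")
print(f"Non-external share of the whole market: {non_external_share:.0%}")
```

The last line is where the "about 30%" comes from: the gap between the two implied market totals, as a fraction of the larger one.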

Now, if EMC were to buy Dell or the other way around, between them they could (just) pip HP to the post. Are EMC and Dell merging? I don't know, but I don't mind starting a rumour...

Oh, another interesting IDC article I found referred to how big storage virtualisation is about to become. I've been saying the word "virtualisation" for about 12 months in meetings just to get people to even listen to the concept, even when the meeting has nothing to do with the subject. I'm starting to feel vindicated.

Wednesday, 20 December 2006

Modular Storage Products

I’ve read a lot of posts recently on various storage-related websites asking for comparisons of modular storage products. By that I’m referring to “dual controller architecture” products such as the HDS AMS, EMC Clariion and HP EVA. The questions come up time and time again, usually comparing IBM to EMC or HP, with little comparison to HDS but lots of people recommending HDS.

So, to be more objective, I’ve started compiling a features comparison of the various models from HDS, IBM, EMC and HP. Before anyone starts, I know there are other vendors out there – 3PAR, Pillar and others come to mind. At some stage, I’ll drag in some comparisons to them too, but to begin with this is simply the “big boys”. The spreadsheet attached is my first attempt. It has a few gaps where I couldn’t determine the comparable data, mainly on whether iSCSI or NAS is a supported option and the obvious problem of performance throughput.

So, from a purely physical perspective, these arrays are pretty simple to compare. EMC and HDS give the highest disk capacity options; EMC, HDS and IBM offer the same maximum levels of cache. Only HDS offers RAID6 (at the moment). Most vendors offer a range of disk drives and speeds. Most products offer 4Gb/s front-end connections and there are various options for 2/4Gb/s speeds at the back end.

Choosing a vendor on physical specifications alone is simple using the spreadsheet. However there are plenty of other factors not included here. First, there’s performance. Only IBM (from what I can find) offers their arrays to scrutiny by the Storage Performance Council. Without a consistent testing method, any other figures offered by vendors are completely subjective.

Next, there’s the thorny subject of feature sets. All vendors offer variable LUN sizes, some kind of failover (I think most are active/passive), multiple O/S support, replication and remote copy functionality and so on. Comparing these isn’t simple, though, as the implementation of what should be common features can vary widely.

Lastly, there’s reliability and the bugs and gotchas that all products have and which the manufacturers don’t document. I’ll pick an example or two: do FC front-end ports share a multiprocessor? If so, what impact does load on one port have on the other shared port? What downtime is required to do maintenance, such as code upgrades? What is the level of SNMP or other management/alerting software?

The last set of issues would prove more difficult to track so I’m working on a consistent set of requirements from a product. In the meantime, I hope the spreadsheet is useful and if anyone can fill the gaps or wants to suggest other comparable mid-range/modular products, let me know.

You can download the spreadsheet here:

Tuesday, 19 December 2006


Lots of people are talking about how we need a new way to protect our data and that RAID has had it. Agreed, going RAID6 gives some benefits (i.e. puts off the inevitable failure by a factor again); however, the single problem to my mind with RAID today is the need to read all the other disks when a real failure occurs. Dave over at NetApp once calculated the risk of re-reading all those disks in terms of the chance of a hard failure.
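I don't have Dave's actual numbers to hand, so this is not his calculation, just a minimal sketch of the sort of sum involved. It assumes read errors are independent of each other, and the error rate (1 unrecoverable error per 10^14 bits read), the drive size and the drive count are all illustrative assumptions of mine rather than figures from any particular array.

```python
import math


def rebuild_read_error_probability(surviving_drives, drive_bytes, errors_per_bit):
    """Chance of at least one unrecoverable read error while re-reading
    every surviving drive in full, assuming independent errors."""
    bits_read = surviving_drives * drive_bytes * 8
    # 1 - (1 - p)^n, written with log1p/expm1 so a tiny p is not rounded away.
    return -math.expm1(bits_read * math.log1p(-errors_per_bit))


# Hypothetical example: a 7+1 group of 500GB drives loses one drive,
# so the other 7 must be read end to end.
p = rebuild_read_error_probability(
    surviving_drives=7,
    drive_bytes=500 * 10**9,
    errors_per_bit=1e-14,  # assumed spec-sheet style figure
)
print(f"Chance of hitting a read error during the rebuild: {p:.1%}")
```

The point is less the exact number than the shape of the formula: the risk grows with every extra bit the controller has to re-read, which is exactly why bigger drives and wider groups make rebuilds scarier.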

The problem is, the drive is not involved in the rebuild process - it dumbly responds to the request from the controller to re-read all the data. What we need are more intelligent drives combined with more intelligent controllers; for example, why not have multiple interfaces to a single HDD? Use a hybrid drive with more onboard memory to cache reads while the heads are moving to obtain real data requests. Store that data in memory on the drive to be used for drive rebuilds. Secondly, why do we need to store all the data for all rebuilds across all drives? Why with a disk array of 16 drives can't we run multiple instances of 6+2 RAID across different sections of the drives?
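To make that last idea a bit more concrete, here is a rough Python sketch of one way it could be laid out. This is purely my own illustration, not how any shipping array works: the rotation scheme and the function names are assumptions. Each 6+2 group takes one section from 8 of the 16 drives, and each new group starts one drive further along.

```python
def build_layout(num_drives=16, group_width=8, num_groups=16):
    """Return one list of member drives per 6+2 group.

    Assumed scheme: group g uses drives g, g+1, ... g+7 (wrapping round),
    so every drive ends up holding one section from 8 different groups.
    """
    return [
        [(group + i) % num_drives for i in range(group_width)]
        for group in range(num_groups)
    ]


def rebuild_reads(layout, failed_drive):
    """Count, per surviving drive, how many sections must be read to
    rebuild the groups that included the failed drive."""
    reads = {}
    for members in layout:
        if failed_drive in members:
            for drive in members:
                if drive != failed_drive:
                    reads[drive] = reads.get(drive, 0) + 1
    return reads


layout = build_layout()
reads = rebuild_reads(layout, failed_drive=0)
print(f"Drives helping with the rebuild: {len(reads)}")
print(f"Most sections read from any one drive: {max(reads.values())}")
```

With a plain 6+2 group the same 7 drives have to be read from end to end. In this sketch the work is shared by more drives and no single drive is read in full, although the simple rotation does not spread the load evenly.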

I'd love to be the person who patents the next version of RAID....

Tuesday, 5 December 2006

How low can they go!

I love this picture. Toshiba announced today that they are producing a new 1.8" disk drive using perpendicular recording techniques. This drive has a capacity of 100GB!

It will be used in portable devices such as music players; it's only 54 x 71 x 8 mm in size, weighs 59g and can transfer data at 100MB/s using an ATA interface.

I thought I'd compare this to some technology I used to use many years ago - the 3380 disk drive.

The model shown on the right is the 3380 CJ2 with a massive 1.26GB per unit and an access time similar to the Toshiba device. However the transfer rate was only about 3MB/s.

I couldn't find any dimensions for the 3380, but from the picture of the lovely lady, I'd estimate it is 1700 x 850 x 500mm which means 23,500 of the Tosh drives could fit in the same space!
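If you want to check the estimate, the sum is just one volume divided by the other. The 3380 dimensions below are my guess from the picture, as stated above, and the sketch ignores any packing, cabling or cooling, so treat it as a back-of-an-envelope figure.

```python
def volume_mm3(dimensions_mm):
    """Volume of a box from its (length, width, height) in millimetres."""
    length, width, height = dimensions_mm
    return length * width * height


toshiba_mm = (54, 71, 8)          # from the Toshiba announcement
ibm_3380_mm = (1700, 850, 500)    # estimated from the picture, not official

ratio = volume_mm3(ibm_3380_mm) / volume_mm3(toshiba_mm)
drives_that_fit = int(ratio)

# Capacity of that many 100GB drives, in the same footprint as 1.26GB.
capacity_gb = drives_that_fit * 100

print(f"Toshiba drives per 3380-sized box: {drives_that_fit:,}")
print(f"Capacity in the same space: {capacity_gb:,}GB versus 1.26GB")
```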

Where will we be in the next 20 years? I suspect we'll see more hybrid drives, with NAND memory used to increase HDD cache, then more pure NAND drives (there are already some 32GB drives announced). Exciting times...

Monday, 4 December 2006

Is the Revolution Over?

I noticed over the weekend that's website was down. Well it's back up today, but the forums have disappeared. I wonder, is the revolution over for JWT and pals?