Friday, 4 July 2008

5TB drives

I just read this on The Register. 5TB drives! Can you imagine it! The HDD manufacturers continue to push the envelope even further.

Now I have a concern about drives getting to this size, and that's the ability to get data on and off the drive itself. With 73/146/300GB drives, the ratio of capacity to response time is still within a tolerance that allows adequate random access throughput. But with larger drives the number of concurrent accesses will increase, and if response time doesn't decrease then very large HDDs will start to behave like sequential devices.

I think I need an illustration to make my point. Imagine a 73GB drive is receiving 200 random I/Os per second, each with an average 5ms response time. Scale the capacity up to a 5TB drive and that's about 69 times the capacity. To serve the same I/O rate per GB, the scaled-up drive would have to cope with about 13,800 I/Os a second and provide an average response time of around 0.07ms!
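The arithmetic can be checked with a few lines of Python. This is only a sketch of the scaling argument, assuming 5TB means 5000GB and that I/O demand grows in direct proportion to capacity; the variable names are mine. The exact ratio comes out slightly under 69, so the results land a little below the rounded figures above.

```python
# Figures taken from the illustration above.
small_gb = 73             # capacity of today's drive
small_iops = 200          # random I/Os per second it receives
small_response_ms = 5.0   # average response time per I/O

# Assumption: 5TB is 5000GB (decimal units).
large_gb = 5000

# Assumption: I/O demand scales linearly with capacity.
scale = large_gb / small_gb
scaled_iops = small_iops * scale
scaled_response_ms = small_response_ms / scale

print(f"capacity ratio:         {scale:.1f}x")
print(f"required I/Os a second: {scaled_iops:.0f}")
print(f"required response time: {scaled_response_ms:.3f}ms")
```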

Admittedly, it is unlikely that 5TB drives will be expected to perform like today's 73GB drives, but the example serves to illustrate that we can't expect to simply consolidate and shrink the number of drives installed in an array. We need something more.

I think we need a more innovative approach to the design of the drive interface. This may simply be shed loads of cache to improve the overall average response time, or perhaps multiple virtual interfaces per drive, or independently mobile read/write heads which don't all need to be on the same cylinder at the same time. It could even be drives that dynamically reallocate their data to make reads and writes quicker (for example, by putting frequently read or written blocks in the same physical area of the drive).
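To get a feel for how much work "shed loads of cache" would have to do, here is a rough sketch using the usual weighted-average model of response time. The 0.01ms cache hit time is purely an assumption of mine for illustration, and the model ignores queueing and write behaviour entirely.

```python
def average_response_ms(hit_ratio, cache_ms, disk_ms):
    """Weighted average of cache hits and mechanical disk accesses."""
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * disk_ms


def required_hit_ratio(target_ms, cache_ms, disk_ms):
    """Hit ratio needed for the weighted average to equal target_ms."""
    return (disk_ms - target_ms) / (disk_ms - cache_ms)


disk_ms = 5.0                    # mechanical response time from the example
cache_ms = 0.01                  # ASSUMPTION: time to serve a cache hit
target_ms = 5.0 / (5000 / 73)    # the scaled target, roughly 0.07ms

needed = required_hit_ratio(target_ms, cache_ms, disk_ms)
print(f"cache hit ratio needed: {needed:.1%}")
```

Under those assumptions nearly every I/O would have to be a cache hit, which suggests cache on its own is unlikely to be the whole answer for truly random workloads.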

Who knows what the solution is, but rest assured something needs to happen to make 5TB drives useful devices.

6 comments:

Alex said...

Have you read this article?

http://blogs.sun.com/ahl/entry/hybrid_storage_pools_in_cacm

It seems relevant to your performance concerns

Alex

BarryWhyte said...

Chris,

I agree completely. I've heard rumours of one vendor looking at putting 4 SAS ports on a single device... maybe multi-ported drives, separate 'segmented caches' within the drive and more parallelism will come. In fact it needs to if we are to keep anything other than archive / backup data on such drives.

Chris M Evans said...

Alex, thanks for the link, it says similar things to my thinking - plus it gives me another RSS feed to add to my ever growing list!

Chris M Evans said...

Barry

I wonder if that means the HDD of 20 years' time will look as different from today's drives as today's do from the Winchester drives used in the 50's and 60's?

Pete Steg said...

SSD is the solution. Disk is the new Tape, and Flash is the new Disk.

Give it some time, though. What we see on the market today from STEC and others is a good start, but the technology has a ways to go to complement performance drives in the mainstream.

Chris M Evans said...

Pete, so all the "secret sauce" that goes into DMX/USP type arrays could then be completely negated, as a lot of that effort is put into coping with the disadvantages of mechanical disks. However, I don't think that disk is ready to usurp tape's portability and innate ability to exist without needing power. Interesting times...