I've decided to move the blog over to WordPress and there's a new direct URL too: http://www.thestoragearchitect.com. Please check me out in the new location. There's a new feed as well: http://thestoragearchitect.com/feed/ - the FeedBurner feed stays the same and redirects. Please update your bookmarks!
Monday, 9 February 2009
Thursday, 5 February 2009
A quick check on Twitter this morning shows me they're up to message number 1,179,118,180 or just over the 1.1 billion mark. That's a pretty big number - or so it seems, but in the context of data storage devices, it's not that big. Let me explain...
Assume Twitter messages are all the full 140 characters long. Assuming all messages are being retained, that means the whole of Twitter is approximately 153GB in size. OK, so there will be data structures needed to store that data, plus space for all the user details; however, I doubt whether the whole of Twitter exceeds 400GB. That fits comfortably on my Seagate FreeAgent Go!
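As a rough sketch of that arithmetic, the sum looks like this. The one-byte-per-character figure is an assumption made for illustration; the post only gives the message count and the 140-character limit.

```python
# Back-of-the-envelope sizing for the Twitter archive described above.
MESSAGE_COUNT = 1_179_118_180   # message number quoted in the post
MAX_CHARS = 140                 # assume every message is full length
BYTES_PER_CHAR = 1              # assumption: one byte per character

raw_bytes = MESSAGE_COUNT * MAX_CHARS * BYTES_PER_CHAR
raw_gib = raw_bytes / 1024**3   # binary gigabytes (2^30 bytes)
raw_gb = raw_bytes / 1000**3    # decimal gigabytes (10^9 bytes)

print(f"{raw_bytes:,} bytes")
print(f"{raw_gib:.1f} GB (binary), {raw_gb:.1f} GB (decimal)")
```

This gives roughly 153.7GB in binary units (the 153GB figure above) or about 165.1GB in the decimal units that drive manufacturers quote. Either way it fits on one portable drive.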
If every message ever sent on Twitter can be stored on a single portable hard drive, then what on earth are we storing on the millions of hard drives that get sold each year?
I suspect the answer is simply that we don't know. The focus in data storage is to provide the facility to store more and more data, rather than rationalise what we do have. For example, a quick sweep of my hard drives (which I'm trying to do regularly) showed half a dozen copies of the WinZip installer and the Adobe Acrobat installer, plus various other software products that are regularly updated, such as the 2.2.1 update of the iPhone software at 246MB!
What we need are (a) common-sense standards for how we store our data (I'm working on those), and (b) better search and indexing functionality that can make decisions based on the content of files, like the automated deletion of defunct software installers.
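To make the "sweep" idea concrete, here is a minimal sketch of a content-based duplicate finder. It isn't the tool I used. The function names `file_digest` and `find_duplicates` are made up for this example, and it only reports duplicates rather than deleting anything.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Group files under root by content; keep only groups of 2+ files."""
    by_hash = defaultdict(list)
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.isfile(path) and not os.path.islink(path):
                by_hash[file_digest(path)].append(path)
    return {h: sorted(paths) for h, paths in by_hash.items() if len(paths) > 1}
```

Calling `find_duplicates` on a download folder returns a dictionary mapping each content hash to the list of identical files, so six copies of the same installer show up as one group whatever their filenames. Hashing every file is slow on large drives; grouping by file size first would be a sensible refinement.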
There's also one other angle, and that's when network speeds become so fast that storing a download is irrelevant. Then our data can all be cloud-based and data cleansing becomes a value-add service and someone else's problem!
Wednesday, 4 February 2009
Seagate announced this week the release of their new Constellation hard drives. Compared to the Savvio range (which are high-performance, small form-factor drives), these drives are aimed at lower tier archiving solutions and will scale to 2TB.
Wednesday, 28 January 2009
It looks like the open storage management project Aperi has finally been put to rest. See this link.
- It doesn't rely on generic standards for reporting, but gets the full detail on each platform.
- It uses element managers or management console/CLIs to retrieve data.
- It doesn't need additional servers or effort to deploy or manage.
- It normalises all data to provide a simple consistent framework for capacity reporting.
Now reporting is good, but management is hard by comparison. Reporting on hardware doesn't necessarily break it, but SRM software which changes the array could. Management software therefore needs to know exactly how to interact with an array, and that requires decent API access.
Vendors aren't going to give this out to each other, so here's a proposal:
Vendors fund a single organisation to develop a unified global SRM tool. They provide API access under licence which doesn't permit sharing of that API with competitors. As the product is licensed to end users, each vendor gets paid a fee per array per GB managed so they have some financial recompense for putting skin into the game.
Monday, 26 January 2009
Saturday, 24 January 2009
Thanks to Hu Yoshida for the reference to a previous post of mine which mentioned using virtualisation (USP, SVC, take your pick) for performing data migrations. As Hu rightly points out, the USP, USP-V, NSC55 and USP-VM can all be used to virtualise other arrays and migrate data into the USP as part of a new deployment. However nothing is ever as straightforward as it seems. This post will discuss the considerations in using a USP to virtualise and migrate data into a USP array from external sources.
In summary, here are the points that must be considered when using USP virtualisation for migration:
- Configuring the external array to the USP requires licensing Universal Volume Manager.
- UVM is not free!
- Storage ports on the USP have to be reserved for connecting to the external storage.
- LUN sizes from the source array have to be retained.
- LUN sizes aren't guaranteed to be exactly the same as the source array.
- Once "externalised", LUNs are replicated into the USP using ShadowImage/TSM/VM.
- A host outage may be required to re-zone and present the new LUNs to the host.
- If the source array is replicated, this adds additional complication.
Friday, 23 January 2009
Over the last few weeks I've been using a product called Dropbox. This nifty little tool lets you sync up your files from anywhere and across multiple platforms. It's a perfect example of Cloud Storage in action.
Why This is Good
Wednesday, 21 January 2009
Monday, 19 January 2009
I've only just picked up my MacBook for the day; too much real work to do!
- Office 2008 for Mac - chargeable
- Office 2007 for Windows under Fusion (or other)
- Symmetrix 3430 - 96 drives, 0.84TB
- Symmetrix 5500 - 128 drives, 1.1TB
- Symmetrix 8830 - 384 drives, 69.5TB
- DMX3000 - 576 drives, 76.5TB
- DMX-4 - 1920 drives, 1054TB
Note: these figures are indicative only!
DMX-3 and DMX-4 introduced arrays which scale to petabytes (1000TB) of available raw capacity. At some point, these petabyte arrays will need to be replaced and will represent a unique challenge to today's storage managers. Here's why.
Doing The Maths
From my experience, storage migrations from array to array can be complex and time consuming. Issues include:
- Identifying all hosts for migration
- Identifying all owners for storage
- Negotiating migration windows
- Gap analysis on driver, firmware, O/S, patch levels
- Change Control
- Migration Planning
- Migration Execution
With all of the above work to do, it's not surprising that, realistically, around 10 servers per week is a good estimate of the capability of a single FTE (Full Time Equivalent, e.g. a storage guy). Some organisations may find this figure can be pushed higher, but I'm talking about one person, day in day out, performing this work, so I'll stick with my 10/week figure.
Assume an array has 250 hosts, each of an average 500GB, then this equates to about 125TB of data and almost 6 months' effort for our single FTE! In addition, the weekly migration schedule requires moving on average 5TB of data. If the target array differs from the source (e.g. a new vendor, different LUN size) then the migration task can be time consuming and complex to execute.
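The figures above can be reproduced with a few lines of arithmetic. This is only a sketch of the estimate: it assumes decimal units (1TB = 1000GB) and an average month of 52/12 weeks, neither of which is stated in the text.

```python
# Migration effort for one FTE, using the figures quoted above.
HOSTS = 250
AVG_GB_PER_HOST = 500
HOSTS_PER_WEEK = 10          # one person's sustained migration rate
WEEKS_PER_MONTH = 52 / 12    # assumption: average month length

total_tb = HOSTS * AVG_GB_PER_HOST / 1000         # data on the array
weeks = HOSTS / HOSTS_PER_WEEK                    # elapsed effort
months = weeks / WEEKS_PER_MONTH
tb_per_week = HOSTS_PER_WEEK * AVG_GB_PER_HOST / 1000

print(f"{total_tb:.0f}TB to move over {weeks:.0f} weeks "
      f"(about {months:.1f} months), {tb_per_week:.0f}TB per week")
```

Changing `HOSTS_PER_WEEK` shows how sensitive the timescale is to the migration rate: doubling it to 20 halves the elapsed time but also means moving 10TB every week.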
Look at the following diagram. It shows the lifecycle of physical storage in an array over time. Initially the array is deployed and storage configured. Over the lifetime of the array, more storage is added and presented to hosts until either the array reaches a maximum physical capacity or an acceptable capacity threshold. This remains until migrations start to take place to another array. Up to the point migrations take place, storage is added and paid for as required; however, once migrations start, there is no refund from the vendor for the unused resources (those represented in green). They have been purchased but remain unused until the entire array is decommissioned. If the decommissioning process is lengthy then the amount of unused resources becomes high, especially on petabyte arrays. Imagine a typical 4-year lifecycle; up to 1 year could be spent moving hosts to new arrays - at significant cost in terms of manpower and impact to the business.
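One way to put a number on that wasted capacity is to total up the storage that sits idle each week while the migration runs. The sketch below is a simplified model, not a measurement: it assumes the array is full when migration starts, that data leaves at a constant weekly rate, and that freed capacity is never reused. The function name `idle_capacity_profile` is hypothetical.

```python
def idle_capacity_profile(total_tb, tb_per_week):
    """Return the idle (already migrated-off) capacity in TB at the end of each week."""
    profile = []
    migrated = 0.0
    while migrated < total_tb:
        # Capacity freed by migration stays purchased but unused.
        migrated = min(total_tb, migrated + tb_per_week)
        profile.append(migrated)
    return profile


# Example using the 125TB array migrated at 5TB per week.
profile = idle_capacity_profile(total_tb=125, tb_per_week=5)
idle_tb_weeks = sum(profile)
average_idle_tb = idle_tb_weeks / len(profile)

print(f"{len(profile)} weeks, {idle_tb_weeks:.0f} TB-weeks idle, "
      f"average {average_idle_tb:.0f}TB unused")
```

Under these assumptions, on average about half the array is paid for but unused throughout the migration, so the cost scales with both array size and migration duration.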
So how should we adapt migration processes to handle the issue of migrating these monster arrays?
- Establish Standards. This is an age old issue but one that comes up time and time again. Get your standards right. These include consistent LUN sizes, naming standards and support matrix (compatibility) standards.
- Consider Virtualisation. Products including SVC, USP, InVista (EMC) and iNSP (Incipient) all allow the storage layer to be virtualised. This can assist in the migration process.
- Keep Accurate Records. This may seem a bit obvious but it is amazing the number of sites who don't know how to contact the owner of some of the servers connected to their storage.
- Talk to Your Customers. Migrations inevitably result in server changes and potentially an outage. Knowing your customer and keeping them in the loop regarding change planning saves a significant amount of hassle.
Sunday, 18 January 2009
So, second day with my MacBook and I've started to look at application transparency between Mac and Windows.
- Exchange Email
Saturday, 17 January 2009
For those who don't follow me on Twitter, today I "upgraded" my laptop to a shiny new MacBook. If you are interested, it's the 2.4GHz version with 4GB of RAM. Enough of the specifications, how am I finding it so far?
Tuesday, 13 January 2009
Eric Savitz over at Tech Trader has an interesting article today.
Demand at Seagate is down and consolidation of the industry is expected. However as recently as March last year EMC was telling us how storage growth just keeps on spiralling upwards.
So what's happening? Are we becoming inherently more efficient at storing our data all of a sudden, now that a credit crunch is upon us? Somehow I don't think so.
Demand ebbs and flows as finances dictate the ability to purchase new equipment, but growth remains steady. Technology is replaced constantly, but just as you or I might hold on to our car for another year or so before replacement, so will IT departments, preferring to pay maintenance on existing kit rather than rip and replace to the latest and greatest. I can see two consequences from this:
- More time and effort will need to be devoted to using current resources more efficiently.
- Migration to new hardware will need to be even more slick and quick to reduce the overhead of migration wastage.
I'll discuss these subjects in more detail this week.
EMC have finally announced that they will be following the industry trend and cutting staff. Approximately 7% (2400) of the workforce will go. The cuts are widely reported (here for instance) and at their earliest were forecast by Stephen Foskett in his December post.
Have a look at this list of tech layoffs. Those storage related are Seagate, Dell, EMC, WD, Pillar Data, Sun, SanDisk, HP. Not on the list are Quantum and COPAN.
Do you know of any others?
2009 will be the year of rationalisation and optimisation. The only prediction to make for the next 12 months is that end-users will be looking to do more with less.
Wednesday, 7 January 2009
In my two previous articles I discussed Cloud Storage and the concept of using middleware to store multiple copies of data across different service providers. In this final part, I'd like to discuss the whole issue of security.
Tuesday, 6 January 2009
I like to use the Christmas holidays as an excuse for a good old-fashioned cleanout. This invariably means burning (shredding takes too long and we don't have a hamster) old paperwork and junking lots of defunct technology.
- Two 3.5" floppy drives - I don't actually have any floppy media, so the drives are no longer useful
- Philips DVDRW208 - one of my earliest DVD writers
- Toshiba DVD-RAM SD-W1101
- Creative 52x CD Drive CD5233E
- Pioneer CD-ROM DR-U06S
- Exabyte EXB-4200CT DAT drive
- Seagate STT320000A DAT drive
The early DVD writers were a pain to get working with different media types. The quality of the media sure made a difference. I never got on with DVD-RAM, especially with the cartridge loading format; and as for the DAT drives...
At the time this technology seemed new and cutting edge. Now it seems so old hat. I wonder what I'll be throwing out next year!
Monday, 5 January 2009
- Server failure
- Catastrophic array failure
- Software bug
- Site failure
- User stupidity
If data is the lifeblood of your organisation then you *must* replicate it onto another online copy or at least onto a backup and have multiple copies in multiple locations.
If anyone out there is not sure they're protecting their data properly - then give me a call!