Tuesday 22 April 2008

Techdirt

I recently joined Techdirt; here's my first post if you're interested!

http://thefutureofstorage.com/archives/23

Tuesday 15 April 2008

Drobo Update

I’ve had my Drobo for a few months now. For those of you not familiar with the technology, the Drobo is a storage device from a company called Data Robotics. Follow the link above to their website for full details.

I’d been looking for a decent home/home office storage device for some time. RAID support was a must and initially I thought I wanted NAS because my solution at the time was to keep a server running continuously. The server performs other tasks and I was using it for file serving too.

Previously I had taken the plunge with the Linksys NSLU2, which runs a modified version of Linux. Unfortunately, at the time the device only supported ext3 filesystems, and as I loaded it with more data, responses became erratic: I found the exported filesystems going read-only and losing content. Luckily for me, the problem seemed to be the device rather than the actual data on disk, and I was able to recover everything using a little software utility which allowed me to read ext3 volumes on Windows. This experience scared me and it was time to look for something else.

The Drobo hit the spot for a number of reasons: firstly, it is a dedicated device which takes SATA II drives; secondly, it has a USB connection, allowing me to plug it directly into my existing server; and thirdly (and at the time most importantly), Data Robotics had just released a NAS head which could be used with the standard Drobo, or removed without affecting the format of data on the device itself.

So, as I said, I’ve had it for a few months and what is there to say about it? Well, not a lot. It works – and so far has worked flawlessly. But there are a few things of note.

Firstly, I have a BIOS incompatibility issue; when my server reboots with the Drobo plugged into a USB port, it hangs the server. I haven’t bothered to resolve this yet; weighing up the relative merits of living with the problem versus upgrading the BIOS on my server, I fall squarely on the side of accepting the workaround of unplugging the drive at boot time and plugging it back in as the system comes up. If I were using a standalone PC, I would obviously have fixed the problem.

Second, I was interested to see that despite my system having two 1TB drives and RAID protection, the X: drive I’d created reported back a 2TB file system. Was RAID on or not? Well, yes it was; the Drobo presents a 2TB file system regardless of the drives you have installed. It’s virtualisation in action! As you use up the physical storage available, you are prompted to add or swap drives to meet demand. I like this feature as it’s a painless way to upgrade your storage over time, and as terabyte drives drop in price (I’ve seen them at £99 recently), it helps smooth out the cost of upgrade because drive sizes can be mixed and matched.
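
As an aside, the "big logical volume, smaller physical pool" behaviour is easy to picture with a little code. The sketch below is purely illustrative and is not Data Robotics' actual algorithm; the assumption that the largest drive is effectively held back for protection is mine.

```python
# Illustrative thin-provisioning sketch: the host always sees a fixed logical
# volume, while usable physical capacity depends on the drives installed.
# NOT Data Robotics' algorithm; the protection model here is an assumption.

TB = 10**12

class ThinVolume:
    def __init__(self, logical_bytes=2 * TB):
        self.logical_bytes = logical_bytes  # what the host's X: drive reports
        self.drives = []                    # installed drive sizes, in bytes
        self.used_bytes = 0                 # data actually written

    def add_drive(self, size_bytes):
        self.drives.append(size_bytes)

    @property
    def usable_bytes(self):
        # Assume the capacity of the largest drive is held back for redundancy,
        # as single-drive-failure protection schemes typically require.
        return sum(self.drives) - max(self.drives) if self.drives else 0

    def write(self, size_bytes):
        if self.used_bytes + size_bytes > self.usable_bytes:
            raise RuntimeError("Time to add or swap a drive")
        self.used_bytes += size_bytes

vol = ThinVolume()
vol.add_drive(1 * TB)
vol.add_drive(1 * TB)
print(f"Reported: {vol.logical_bytes / TB:.0f}TB, usable: {vol.usable_bytes / TB:.0f}TB")
```

With two 1TB drives the host still sees a 2TB volume but only around 1TB of protected capacity is actually usable, which is exactly why the device prompts for more drives as you fill it.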

Last, there’s the issue of firmware upgrades. Version 1.1.1 of the firmware is available and it was a simple task to install; however, the new code doesn’t take effect without rebooting the Drobo, and that requires closing all the files on the server that are open against the Drobo. This is not a major problem, though, and wouldn’t be a problem at all on a standalone PC.

All in all, the Drobo looks good and does the job. Having 1TB of new capacity has encouraged me to spend time moving my data over in a controlled and structured fashion. The process will take months (a subject I will return to), but in the meantime I have bags of spare capacity and an easy upgrade path for both additional capacity and NAS connectivity.

Now, if anyone out there would like me to review their NAS product, then I’d be only too happy….

Monday 14 April 2008

FCoE

Fibre Channel over Ethernet has been back on my radar recently, especially as it was touted again at Storage Networking World in Orlando last week. Unfortunately I wasn’t there and didn’t see for myself, although I was in Orlando the week before on vacation. I can imagine if I’d extended or moved the holiday to include SNW that I’d be none too popular with Mrs E and my sons.

Anyhoo, I looked back over my blog and found I first briefly mentioned FCoE back in April 2007, a whole 12 months ago. Now, we know 12 months is a long time in the storage world (in which time iSCSI will have claimed another 3000% market share, EMC will have purchased another 50,000 storage companies of various and dubious value, HDS will have released nothing and IBM will have developed 2 or 3 new technologies which won’t see the light of day until I’m dead and buried). I would expect, then, that FCoE should have moved on somewhat, and it appears it almost has. Products are being touted; for example, Emulex with the LP21000 CNA card (not an HBA card, please note the new acronym) and Cisco with their Nexus 5000 switch (plus others).

At this stage I don’t believe the FCoE protocol has been fully ratified as a standard. I have been spending some time wading through the FC-BB-5 project documentation on the T11 website, which covers FCoE, to understand in more detail exactly how the protocol works and how it compares to native fibre channel, iSCSI, iFCP and FCIP. In the words of Cilla, here’s a quick reminder on storage protocols in case you’d forgotten.

Fibre channel and the Fibre Channel Protocol (FCP) provide a lossless, frame-based transmission protocol for moving data between a host (initiator) and a storage device (target); FCP implements SCSI over fibre channel. To date, fibre channel has been implemented on dedicated hardware from vendors including Cisco and McDATA/Brocade. iSCSI uses TCP/IP to exchange data between a host and a storage device using the SCSI protocol; it therefore carries the overhead of TCP/IP but works across lossy networks and long distances. iFCP and FCIP are two implementations which encapsulate FCP in TCP/IP packets: FCIP extends an existing fibre channel SAN, whereas iFCP allows data to be routed between fibre channel SANs.

FCoE will sit alongside fibre channel and allow the transmission of FCP packets at the Ethernet layer, removing the need for TCP/IP (and effectively allowing TCP/IP and FCP packets to exist on the same Ethernet network).
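
To make the comparison easier to visualise, here is a rough sketch of the layering each approach implies. It is a conceptual summary only; the actual frame and packet formats are defined in the respective standards (FC-BB-5 in the case of FCoE).

```python
# Conceptual encapsulation stacks for the protocols discussed above,
# from the SCSI command set down to the wire. Deliberately simplified.
stacks = {
    "Native FC": ["SCSI", "FCP", "Fibre Channel"],
    "iSCSI":     ["SCSI", "iSCSI", "TCP", "IP", "Ethernet"],
    "FCIP":      ["SCSI", "FCP", "FCIP", "TCP", "IP", "Ethernet"],  # extends an FC SAN
    "iFCP":      ["SCSI", "FCP", "iFCP", "TCP", "IP", "Ethernet"],  # routes between FC SANs
    "FCoE":      ["SCSI", "FCP", "FCoE", "Ethernet"],               # no TCP/IP in the path
}

for name, layers in stacks.items():
    print(f"{name:9} {' -> '.join(layers)}")
```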

So hurrah, we have another storage protocol available in our armoury and the storage vendors are telling us that this is good because we can converge our IP and storage networks into one and save a few hundred dollars per server on HBA cards and SAN ports. But is it all good? Years back, I looked at using IP over fibre channel as a way to remove network interface cards from servers. The aim was to remove the NICs used for backup and put that traffic across the SAN using IPFC. I never did it. Not because I couldn’t; I’m sure technically it would have worked, but rather because the idea scared the willies out of “the management” for two reasons (a) we had no idea of the impact of two traffic types going over the same physical network and (b) the Network Team would have “sent the boys round” to sort us out.

Will this be any different with FCoE? Will anyone really be 100% happy mixing traffic? Will the politics allow the Network teams to own SAN traffic entirely? Let’s face it, in large environments I currently advocate separating host, tape and replication traffic onto separate fibre channel fabrics. I can’t imagine reversing my position and going back to a single consolidated network.

So then, is FCoE going to be better in smaller environments where consolidation is more practical? Well, if that’s the case, then surely that makes FCoE just another niche alternative to FC, just like iSCSI.

It’s early days yet. There are a million-and-one questions which need to be answered, not least of which are how FCoE will interoperate with standard FC, how drivers will interact with the existing storage protocol stack on a server, and how performance/throughput will be managed. Some of these issues have already been addressed, but this blog entry is already far too long and rambling to include a discussion of them, so I will save them for another time.

Thursday 10 April 2008

May The Force Be With You

Just had to share this with everyone if you haven't already seen it...

On a PC (or server) with Internet connectivity, type "telnet towel.blinkenlights.nl" (without the quotes).

Assuming you have telnet and your firewall allows it, sit back and enjoy!
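
If you don't have a telnet client to hand (some systems no longer ship one by default), a few lines of Python should do much the same job, since the server simply streams the animation over port 23. A quick sketch:

```python
# Minimal stand-in for a telnet client: connect to port 23 and print whatever
# the server streams back (an ASCII-art rendition of Star Wars, in this case).
import socket

with socket.create_connection(("towel.blinkenlights.nl", 23)) as sock:
    while True:
        data = sock.recv(4096)
        if not data:
            break
        print(data.decode("ascii", errors="ignore"), end="", flush=True)
```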

When Storage Planning Goes Bad

I was chatting to colleagues today and we were reflecting on an installation which had just completed and needed an additional storage tranche installed. Ironically, the initial disk installation on the new array hadn't been fully implemented because the vendor "forgot" to install the full quota of cache in the array. Although this was a simple gotcha, it reminded me of others I've encountered along the way in my career, including:

  • An engineer was testing the Halon system in the newly completed computer room extension at my first site. Unfortunately he'd forgotten to turn the key to "test" before pressing the fire button and let off the Halon across the whole of the datacentre, with the equipment up and running and operators in the room mounting tapes. Needless to say, they were out of there like a rat up a drainpipe!
  • During a recent delivery of storage arrays, one array literally fell off the back of the lorry. It had to be shipped back for repair...
  • An array installation I managed in one site was mis-cabled by both the electricians and the vendor. When it was powered up, it exploded...
  • On a delivery of equipment, the vendor arrived at the loading bay at the datacentre. As the loading bay door was opened, it jammed and broke, stopping just too low for the arrays being delivered to be pushed underneath. The vendor had to return the following day after the broken door had been repaired.
  • A tape drive upgrade on a StorageTek library I worked on took 12 hours and around 6 staff to complete. Half way through the upgrade, we hit a go/no-go point and checked both the MVS and VM installations to ensure the new drives worked. The MVS-connected drives were fine; the VM drives had a "minor problem", so we proceeded, anticipating that the VM problem would be resolved. The following day we discovered the VM problem was not correctable and had to purchase additional drives at considerable cost.
  • After loaning out some disk space to a "temporary" project, we had a hardware failure 3 months later. It turned out that the team had forgotten to ask for backups of their data, and 3 months of work by a dozen people was lost.


Fortunately, most of the above were not life-threatening (except the first, in which I was not involved directly). However, one of these problems did result in data loss (albeit in a development environment). It shows how often the unexpected and unplanned can happen and mess up the best-laid plans.


Care to share any of your stories?

Tuesday 1 April 2008

Multi-vendor Storage

Reading Chuck’s blog during my vacation, I paused at his comment that multi-vendor environments are on the decline due to their complexity and the hassle of dealing with multiple vendors.

I have to say that, firstly, I don’t believe this and, secondly, that companies with large storage environments would be mad not to consider a multi-vendor setup.

The reason people have problems with multi-vendor environments is that they don’t spend time turning their storage into a commodity. EMC and HDS each recommend their own LUN sizes; each will sell you their own management solutions; each has its own support matrices.

But these things can and should be standardised. It is a simple task to define and migrate to consistent LUN sizes, regardless of vendor hardware. Software tools can be simplified; most people choose to use the command line or the basic configuration tools rather than the bloated EMC tools, so no problem there. In addition, scripting can be developed for failover and PIT/snapshot management, making their use generic across vendors. Finally, driver/firmware/HBA/fabric standards can all be established to converge on a common set across all storage vendors.
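
To show what "generic across vendors" might look like in practice, here is a minimal Python sketch of a wrapper layer. The class names and operations are hypothetical; each concrete class would translate the generic call into the relevant vendor's CLI or API, which I've deliberately left out.

```python
# A sketch of the "storage as a commodity" approach: hide each vendor's
# tooling behind one interface so day-to-day operations are scripted the
# same way whichever array sits underneath. All names here are hypothetical.
from abc import ABC, abstractmethod

STANDARD_LUN_GB = 50  # one agreed LUN size, whatever each vendor recommends


class StorageArray(ABC):
    """Vendor-neutral operations that the rest of our scripts rely on."""

    @abstractmethod
    def create_lun(self, name, size_gb=STANDARD_LUN_GB): ...

    @abstractmethod
    def create_snapshot(self, lun, snap_name): ...

    @abstractmethod
    def failover(self, replication_group): ...


class VendorAArray(StorageArray):
    # Each concrete class translates the generic call into that vendor's
    # CLI or API invocation (omitted here).
    def create_lun(self, name, size_gb=STANDARD_LUN_GB): ...
    def create_snapshot(self, lun, snap_name): ...
    def failover(self, replication_group): ...


class VendorBArray(StorageArray):
    def create_lun(self, name, size_gb=STANDARD_LUN_GB): ...
    def create_snapshot(self, lun, snap_name): ...
    def failover(self, replication_group): ...


def nightly_snapshots(arrays, luns):
    """Calling scripts never need to know which vendor they are talking to."""
    for array in arrays:
        for lun in luns:
            array.create_snapshot(lun, lun + "-nightly")
```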

Once storage has been established as a commodity, any new purchases can come from any of the vendors in your multi-vendor strategy.

Oh, and one last thought: do you really believe HDS, IBM and EMC would give you the absolute best price if they know theirs is the only product you can use across most of your server farm? Competition within the storage market is a false premise; moving to another hardware platform to replace an existing one takes months (in some cases years). Vendors know that customers’ threats to move everything to another platform are hollow unless you have a true multi-vendor strategy.