Friday, 7 September 2007

Using Virtualisation for DR

It's good to see virtualisation and the various products being discussed again at length. Here's an idea I had some time ago for implementing remote replication using virtualisation. I'd be interested to know whether it is possible (so far no-one from HDS can tell me whether USP/UVM can do this, but read on).

The virtualisation products make a virtue of allowing heterogeneous environments to be presented as a unified storage infrastructure. This can even mean carving the LUNs/LDEVs presented from an array into constituent parts to make logical virtual volumes at the virtualisation layer. Whilst this can be done, it isn't a requirement; in fact HDS sell USP virtualisation on the basis that you can virtualise an existing array through the USP without destroying the data, then use the USP to move the data to another physical LUN. Presumably the same 1:1 mapping can be achieved on Invista and SVC (I see no reason why it wouldn't be). Now, since the virtualisation layer simply acts as a host (in the USP's case a Windows one; I'm not sure what the others emulate), it is possible (though not usually desirable) to present storage which is being virtualised to both the virtual device and a local host, by using multiple paths from the external array.
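To make the two mapping modes concrete, here's a toy sketch (purely illustrative — the block sizes and function names are mine, not any vendor's) contrasting carving an external LDEV into smaller virtual volumes with the 1:1 passthrough that lets you virtualise an existing array without destroying its data:

```python
# Hypothetical sketch of the two mapping modes a virtualisation layer offers.
# A volume is modelled as (start_block, length) extents on the external LDEV.

def carve(ldev_blocks, vvol_size):
    """Split one external LDEV into constituent virtual volumes."""
    return [(start, min(vvol_size, ldev_blocks - start))
            for start in range(0, ldev_blocks, vvol_size)]

def passthrough(ldev_blocks):
    """1:1 mapping: the virtual volume is exactly the external LDEV,
    so data already on the array survives being virtualised."""
    return [(0, ldev_blocks)]

# A 100-block LDEV carved into 30-block virtual volumes...
print(carve(100, 30))    # [(0, 30), (30, 30), (60, 30), (90, 10)]
# ...versus presented unchanged through the virtualisation engine.
print(passthrough(100))  # [(0, 100)]
```

The point is that only the passthrough case preserves the on-disk layout, which is why it's the mode that matters for the DR idea below.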

If the virtualisation engine is placed in one site and the external storage in another, then the external storage could be configured to be accessed in the remote site by a DR server. See example 1.

Obviously this doesn't gain much over a standard solution using TC/SRDF, other than perhaps the ability to write asynchronously to the remote storage, making use of the cache in the virtualisation engine to provide good response times. So the second picture shows, using a USP as an example, a three-datacentre configuration where two local USPs replicate between each other, but the secondary LUNs in the "local DR site" are actually located on external storage in a remote datacentre. This configuration gives failover between the local site pair and also access to a third copy of data in a remote site (although technically, the third copy doesn't actually exist).

Why do this? Well, if you have two closely sited locations with existing USPs, where you want to retain synchronous replication but don't want to pay for a third copy of data, you get a poor man's 3DC solution without paying for that third data copy.

Clearly there are some drawbacks: you are dependent on comms links to access the secondary copy of data, and in a DR scenario performance may be poor. In addition, as the DR USP has to cache writes, the latency of writing to the remote external copy may mean it cannot destage them quickly enough to prevent cache overrun.
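The cache overrun risk is just arithmetic: if writes arrive faster than the latency-limited link can destage them, the backlog grows until the cache fills. A back-of-envelope sketch (illustrative numbers only, not vendor figures):

```python
# How long until the write cache overruns, given a sustained inbound write
# rate and a destage rate capped by the long-haul link's latency/bandwidth?

def seconds_to_overrun(cache_gb, write_mb_s, destage_mb_s):
    """Return seconds until the cache fills, or None if destage keeps up."""
    backlog_rate = write_mb_s - destage_mb_s   # MB/s accumulating in cache
    if backlog_rate <= 0:
        return None                            # destage keeps pace forever
    return cache_gb * 1024 / backlog_rate      # cache size in MB / fill rate

# e.g. 64 GB of cache, 200 MB/s of inbound writes, but the remote link
# only sustains 120 MB/s of destage: overrun in under 14 minutes.
print(seconds_to_overrun(64, 200, 120))  # 819.2
print(seconds_to_overrun(64, 100, 120))  # None - destage keeps up
```

In other words, the design only holds while the sustained write rate stays below what the remote link can absorb; bursts are fine, a sustained excess is not.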

I think there's one technical question which determines whether this solution is feasible: how do virtualisation devices destage cached I/O to their external disks? I see two options: either they destage using an algorithm which minimises the amount of disk activity, or they destage in an order which ensures the integrity of the data on the external disk should the virtualisation hardware itself fail. I would hope the answer is the latter, because if the virtualisation engine suffered some kind of hardware failure, I would want the data on disk to retain write-order integrity. If that is the case, then the designs presented here should mean the remote copy of data would still be valid after the loss of both local sites, albeit slightly out of date, as an async copy would be.
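The difference between the two destage policies can be shown with a toy example (pure illustration; no claim about what any vendor actually implements). Cached writes carry the sequence in which the host issued them, and the engine "fails" after only some have reached disk:

```python
# Toy contrast of destage policies when the engine dies mid-destage.
# Each cached write is (block_address, host_write_sequence_number).
cached = [("blk9", 1), ("blk2", 2), ("blk5", 3)]

def destage(policy, survive=2):
    """Return the host sequence numbers that reached disk before failure."""
    if policy == "efficient":
        order = sorted(cached)        # reorder by block address to cut seeks
    else:                             # "ordered": preserve the host's
        order = list(cached)          # original write sequence
    return [seq for _, seq in order[:survive]]

print(destage("efficient"))  # [2, 3] - write 1 lost: no write-order integrity
print(destage("ordered"))    # [1, 2] - a clean prefix: crash consistent
```

The "efficient" policy leaves a hole in the middle of the write history, which is exactly the state a database cannot recover from; the "ordered" policy leaves an older but consistent image.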

Can IBM/HDS/EMC answer the question of integrity?


Ron said...

While your scenario is technically possible, I don't know that I would trust it. Too many "big yellow cable pullers" out there that could really mess up your day.

With virtualization you don't have consistency groups, so I'm not sure how they would guarantee a "crash consistent" copy of your data.

I know we've asked before about how cache is destaged and what happens to the data if there are "issues". I have yet to see a technical answer.

BarryWhyte said...

Needless to say, yes: SVC's GlobalMirror and MetroMirror functions ensure consistency of the remote image by ensuring the I/O is submitted to the remote site before it is cached locally. The exact behavior depends on whether it's synchronous or asynchronous.

This is quite a complex area to explain and I'll post a more in-depth answer over on my blog as I think I'll need some pictures to explain it properly.

BarryWhyte said...

Note, however, that what you propose in fig 1 is not possible with SVC. The backend storage can only be presented to one cluster; we do not permit sharing the same disks between SVC and a host. You can, however, configure a 'split controller', where some LUNs are presented to SVC and some directly to a host - just not the same LUNs. The risk of data miscompares is too great to support such a configuration. I suspect this would be the case for other virtualizers.

Mark said...

Invista is stateless.

No I/O is cached by the Data Path Controllers; there is no store-and-forward operation.

Nigel said...


Sorry that I don't actually represent any of the vendors you are asking. But I do have an opinion and some experience with UVM... and I am fairly confident that UVM does NOT destage with application write sequencing in mind.

I "understand" that the same destage algorithms are used for both internal and external disk.

After all, UVM is not sold as a DR or distance-replication technology, so I would be surprised to see HP or HDS rubber-stamp your proposed solution. I wouldn't buy it from you just because of your comment "...although technically, the third copy doesn't actually exist...". Even my wife, with her limited storage knowledge, would raise an eyebrow at that :-D