Wednesday 4 July 2007

Performance - Part II

Next on the performance list: Sidefile. Sidefile is only relevant if you are using asynchronous replication. Sidefile cache holds write I/O requests that have been committed locally until a remote array in a TrueCopy pair confirms them. Both the local and remote arrays use sidefile cache to store replication recordsets, which must be processed in sequence order to maintain consistency. The benefit of Sidefile cache is that it minimises the effect of replication latency on TrueCopy write I/O to a local array. Sidefile usage rises and falls with write activity; however, if (for instance) replication runs across a shared IP network, other IP traffic could increase latency and push up the amount of sidefile cache used.

HDS recommends not letting Sidefile cache usage rise above 10%; however, there are a number of parameters which can be set to control sidefile usage. Probably the most serious is Pending Update Data Rate (defaulting to 50%), which if breached causes primary array I/O delay and eventually TrueCopy pair suspension. There are also two other parameters, I/O Delay Start and I/O Delay Increase. Breaching these thresholds causes I/O delay, although it isn't clear how much of an impact this has on a host.
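
To make the escalation easier to picture, here is a minimal Python sketch that maps a sidefile usage percentage to the stage it has reached. It is not array code or any HDS tool; the function name and stage labels are made up for illustration. The 10% and 50% figures are the ones quoted above, while treating 30% as the I/O Delay Start default is an assumption, and I/O Delay Increase is left out because I don't have a default for it.

```python
# Hypothetical illustration only: names and stage labels are invented.
ALERT_PCT = 10           # HDS recommended ceiling for sidefile usage
IO_DELAY_START_PCT = 30  # ASSUMED default for I/O Delay Start
PENDING_UPDATE_PCT = 50  # Pending Update Data Rate default

def sidefile_stage(used_pct):
    """Return a label for the stage a given sidefile usage (%) has reached."""
    if used_pct >= PENDING_UPDATE_PCT:
        return "I/O delay, eventual pair suspension"
    if used_pct >= IO_DELAY_START_PCT:
        return "I/O delay"
    if used_pct >= ALERT_PCT:
        return "above recommended level"
    return "normal"

for pct in (5, 12, 35, 55):
    print(f"{pct}% -> {sidefile_stage(pct)}")
```

The gap between the first and second stages is the interesting part: on these assumed values, an alert at 10% fires well before the array itself starts to react.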

Now, I don't know where the 10% threshold comes from when sidefile controls by default start at 30%. A simple calculation: an array with 20 active ports each averaging 5MB/s of writes produces 100MB/s in total. With 48GB of cache, 10% is about 4.8GB, so if nothing is being drained to the remote array it takes less than 1 minute (roughly 49 seconds) to reach 10% sidefile usage, which is easily possible if replication goes across a congested network. I imagine HDS recommend 10% so that sidefile problems are alerted as early as possible.
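
The arithmetic can be checked with a few lines of Python. This is a back-of-the-envelope sketch, not a model of how the array manages cache: it assumes the whole 48GB is the base for the percentage, that 1GB is 1024MB, and that the sidefile starts empty. The function name and the `drain_mb_s` parameter (the rate at which recordsets are confirmed by the remote array) are my own additions for illustration.

```python
def seconds_to_fill(cache_gb, threshold_pct, write_mb_s, drain_mb_s=0.0):
    """Seconds for the sidefile to grow from empty to threshold_pct of cache.

    Assumes a constant write rate and a constant drain rate to the remote
    array. Returns infinity if the link keeps up with the writes.
    """
    net_mb_s = write_mb_s - drain_mb_s
    if net_mb_s <= 0:
        return float("inf")
    threshold_mb = cache_gb * 1024 * threshold_pct / 100
    return threshold_mb / net_mb_s

write_rate = 20 * 5  # 20 ports at 5MB/s each = 100MB/s

# Worst case: replication link completely stalled (drain rate of zero).
for pct in (10, 30, 50):
    secs = seconds_to_fill(48, pct, write_rate)
    print(f"{pct}% of 48GB reached after about {secs:.0f} seconds")
```

With the link completely stalled, 10% arrives in about 49 seconds, 30% in roughly two and a half minutes, and 50% in about four minutes. Passing a non-zero `drain_mb_s` shows how much longer it takes when the link is merely congested rather than down.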
