In my previous post covering LeftHand's Virtual Storage Appliance, I discussed deploying a VSA guest under VMware. This post discusses performance of the VSA itself.
Tuesday, 28 October 2008
Deciding how to measure a virtual storage appliance's performance wasn't particularly difficult. VMware provides performance monitoring through the Virtual Infrastructure Client and gives some nice pretty graphs to work with. So from the appliance (called VSA1 in my configuration) I can see CPU, disk, memory and network throughput.
The tricky part comes in determining what to test. Initially I configured an RDM LUN from my Clariion array and ran the tests against that. Performance was poor, and when I checked the Clariion I found it was running degraded with a single SP and therefore no write cache. On top of that, I had used a test Windows 2003 VM on the same VMware server - D'oh! That clearly wasn't going to give fair results, as the iSCSI I/O would be going straight through the hypervisor, and VSA1 and the test W2K3 box would potentially contend for hardware resources.
So, on to test plan 2, using another VMware server with a single W2K3 guest, talking to VSA1 on the initial VMware hardware. So far so good - separate hardware for each component and a proper gigabit network in between. To run the tests I decided to use Iometer. It's free, easy to use and you can perform a variety of tests with sequential and random I/O at different block sizes.
The first test was for 4K blocks, 50% sequential reads/writes, to an internal VMFS LUN on a SATA drive. The following two graphics show the VMware throughput; CPU wasn't maxed out and sat at an average of 80%. Data throughput averaged around 7MB/s for reads and only 704KB/s for writes.
I'm not sure why write performance is so poor compared to reads; however, I suspect there's a bit of caching going on somewhere. That's evident from the network graph, where the write traffic matches the network traffic. The read traffic doesn't add up, though: VSA1 shows more read traffic than expected given the figures from Iometer, which indicate around 700KB/s for both reads and writes.
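To put those numbers in terms of I/O operations rather than bandwidth, here's a quick back-of-the-envelope sketch in Python. It takes the VMware averages quoted above (roughly 7MB/s of reads and 704KB/s of writes at 4K blocks) and assumes 1MB = 1024KB; the function name is my own, not anything from Iometer or VMware.

```python
def iops(throughput_kb_s: float, block_size_kb: float) -> float:
    """Convert a throughput figure into I/O operations per second."""
    return throughput_kb_s / block_size_kb


BLOCK_KB = 4                 # 4K blocks, as in the first test
read_kb_s = 7 * 1024         # ~7MB/s of reads, assuming 1MB = 1024KB
write_kb_s = 704             # 704KB/s of writes

read_iops = iops(read_kb_s, BLOCK_KB)
write_iops = iops(write_kb_s, BLOCK_KB)
read_write_ratio = read_kb_s / write_kb_s

print(f"reads:  {read_iops:.0f} IOPS")
print(f"writes: {write_iops:.0f} IOPS")
print(f"reads are about {read_write_ratio:.1f}x the write throughput")
```

On those averages the VSA is seeing roughly ten times more read bandwidth than write bandwidth, which is the gap that the Iometer figures don't account for.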
I performed a few other tests, including one against a thin-provisioned LUN. That showed a CPU increase for the same throughput - no surprise there. There's also a significant decrease in throughput when using 32KB blocks compared to 4KB and 512 bytes.
So, here's the $64,000 question - what kind of throughput can I expect per GHz of CPU and per GB of memory? Remember, there's no supplied hardware here from LeftHand, just the software. With a 2TB limit per VSA, perhaps performance isn't that much of an issue, but it would be good to know if there's a formula to use. Throughput versus CPU versus memory is the only indicator I can see that could be used to compare future virtual SANs against each other, and when you're paying for the hardware, it's a good thing to know!
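I don't have a formula from LeftHand, but as a rough sketch of what such a comparison might look like, the Python below normalises a measured throughput by the CPU actually consumed and by the memory given to the appliance. The 7MB/s and 80% figures come from the first test above; the 2.0GHz clock, single vCPU and 1GB of memory are placeholder values for illustration, not my actual VSA configuration.

```python
from dataclasses import dataclass


@dataclass
class VsaRun:
    """One benchmark run against a virtual storage appliance."""
    name: str
    throughput_mb_s: float   # measured throughput in MB/s
    clock_ghz: float         # clock speed per vCPU (placeholder value)
    vcpus: int               # vCPUs assigned to the appliance (placeholder)
    cpu_utilisation: float   # average utilisation during the run, 0.0 to 1.0
    memory_gb: float         # memory assigned to the appliance (placeholder)

    def ghz_consumed(self) -> float:
        """CPU actually used: clock speed x vCPUs x average utilisation."""
        return self.clock_ghz * self.vcpus * self.cpu_utilisation

    def mb_s_per_ghz(self) -> float:
        return self.throughput_mb_s / self.ghz_consumed()

    def mb_s_per_gb(self) -> float:
        return self.throughput_mb_s / self.memory_gb


# Read throughput and CPU utilisation from the first test; the rest is assumed.
run = VsaRun("4K blocks, 50% sequential", 7.0, 2.0, 1, 0.80, 1.0)
print(f"{run.name}: {run.mb_s_per_ghz():.2f} MB/s per GHz, "
      f"{run.mb_s_per_gb():.2f} MB/s per GB")
```

Running the same calculation for each virtual SAN on the same hardware, with the same Iometer profile, would at least give a like-for-like number to compare.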