Disk performance reporting

Disk performance reporting and trending is probably the most difficult part of performance troubleshooting and capacity planning. Over the years I have used several counters, only to learn that none of them gives a concise and accurate answer on disk performance. Ever tried to use % Idle Time or % Disk Time? Or Avg. Disk Queue Length, for that matter? In the last few years I have standardized on Avg. Disk sec/Read, Avg. Disk sec/Write, and Avg. Disk sec/Transfer as my single indicator of disk responsiveness. Generally I take 20 ms to 30 ms as a threshold that should not be exceeded on average.

However, even these counters are prone to errors (I don’t know when the OS team will change the perf counter architecture, but it cannot come too soon). First of all, you must be aware of the following issues that arise with virtualized Windows 2003 servers with more than one core:

But there are issues waiting to hit you on Windows 2008 as well. Look at the following table, which reports Avg. Disk sec/Transfer on a Windows 2008 server:


As you can see, among “normal” values (0,xxx; the comma is the decimal separator in Italy), there are clearly bad ones. The server is obviously not taking 93 or 3139 seconds (a little less than 1 hour) to execute an I/O on average. These bad values can wreak havoc on your reports: the average of the above values is 285 seconds, which clearly cannot be right.

I haven’t found a root cause for this behavior. I can observe it on several Windows 2008 servers, predominantly Hyper-V hosts. It may be hardware related or just a bug in the perf counter (IMHO both); in any case your reports are doomed.

The only thing I can advise is to change your SQL query to filter out obviously bad values. For example, filtering out response times above 2 seconds changes my average over the period to 0.029 seconds, or 29 ms, which denotes a fairly busy storage subsystem.
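To make the idea concrete, here is a minimal sketch of such a filter, using Python's built-in sqlite3 module as a stand-in for the real reporting database. The table name `perf_samples`, the column `sample_value`, and the small sample values are all hypothetical and made up for illustration; only the two bogus spikes (93 and 3139 seconds) and the 2-second cutoff come from the text above. Your own schema and query will differ.

```python
import sqlite3

# Cutoff from the text: samples above 2 seconds are treated as bogus.
MAX_PLAUSIBLE_SEC = 2.0


def average_response_time(conn, max_plausible_sec=None):
    """Return the mean sample value, optionally ignoring implausible samples."""
    sql = "SELECT AVG(sample_value) FROM perf_samples"
    params = ()
    if max_plausible_sec is not None:
        sql += " WHERE sample_value <= ?"
        params = (max_plausible_sec,)
    return conn.execute(sql, params).fetchone()[0]


# Hypothetical table: one row per Avg. Disk sec/Transfer sample, in seconds.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE perf_samples (sample_value REAL)")

# Made-up data: four plausible samples plus the two bad values quoted above.
samples = [0.020, 0.030, 0.025, 0.035, 93.0, 3139.0]
conn.executemany("INSERT INTO perf_samples VALUES (?)", [(s,) for s in samples])

raw = average_response_time(conn)
filtered = average_response_time(conn, MAX_PLAUSIBLE_SEC)
print(f"raw average:      {raw:.3f} s")
print(f"filtered average: {filtered:.4f} s")
```

With this made-up data, two bad samples out of six are enough to push the raw average into the hundreds of seconds, while the filtered average stays in the tens of milliseconds. The cutoff is a judgment call: it should sit well above anything your storage could plausibly deliver, so that a genuinely slow disk is not filtered out along with the garbage.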

If I manage to find more info on this issue I’ll keep you posted. In the meantime, take your reports on disk response time with a grain of salt.

– Daniele

This posting is provided "AS IS" with no warranties, and confers no rights.
