OpsMgr Disk I/O Capacity Planning

One of the toughest tasks in Operations Manager capacity planning is sizing the storage subsystem. We have good guidance in the OpsMgr design guide (Operations Manager 2007 R2 Design Guide), and we can even use an Excel sizer (Operations Manager 2007 R2 Sizing Helper). Since I am involved in OpsMgr planning pretty often, I started collecting my own real-world metrics, and in this post I share my findings. This is my reference article for disk capacity planning, so I'll keep updating it.

For a general introduction to the topic, you can read Jonathan's post: http://blogs.technet.com/b/jonathanalmquist/archive/2009/04/06/how-can-i-gauge-operations-manager-database-performance.aspx

Since the OpsMgr data warehouse doesn't offer any percentile measure, I based my conclusions on a 30-day observation period with hourly aggregation, recording the highest hourly average for every single disk. I feel this maximum of the hourly averages is a good indication of what we really need from the storage subsystem.
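To make the "maximum of the hourly averages" measure concrete, here is a minimal Python sketch. It is not the query I ran against the data warehouse: the function name `max_hourly_average` and the sample values are made up for illustration, and it assumes the raw samples are available as (timestamp, IOPS) pairs for a single disk.

```python
from collections import defaultdict
from datetime import datetime


def max_hourly_average(samples):
    """Return the highest hourly mean from (timestamp, iops) samples."""
    buckets = defaultdict(list)
    for ts, value in samples:
        # Truncate each timestamp to the start of its hour.
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(value)
    # Average each hour, then keep the largest average.
    return max(sum(v) / len(v) for v in buckets.values())


# Made-up samples for one disk, for illustration only.
samples = [
    (datetime(2011, 6, 1, 9, 5), 100.0),
    (datetime(2011, 6, 1, 9, 35), 200.0),
    (datetime(2011, 6, 1, 10, 5), 400.0),
    (datetime(2011, 6, 1, 10, 35), 300.0),
]
peak = max_hourly_average(samples)
print(peak)
```

Compared with the absolute maximum of the raw samples, this measure smooths out very short spikes while still capturing the busiest hour of the observation period.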

I focused my analysis on Windows Server 2008 R2, since this is the platform Operations Manager 2012 will require for the product infrastructure. The observed environment manages about 650 agents and makes extensive use of dedicated gateways. The RMS has no gateways or agents directly connected; all the gateways are managed by a dedicated Management Server. Obviously your mileage may vary, especially on the database side. There are simply too many factors to take into account: number of consoles, console polling frequency, level of alert noise, number of management packs, number and frequency of reports, …

| Workload | IOPS | % Read | % Write | Notes |
|---|---|---|---|---|
| Base OS | 100 | 80 | 20 | |
| Live DB | 450 | 50 | 50 | |
| Data Warehouse | 1100 | 70 | 30 | |
| Management Server | 200 | 5 | 95 | I tried to subtract the OS load, since the agent cache is on the system disk |
| Management Server with OS | 300 | 20 | 80 | |
| RMS | 160 | 5 | 95 | I tried to subtract the OS load, since the agent cache is on the system disk |
| RMS with OS | 260 | 35 | 65 | |
| Gateway 20 agents with OS | 200 | 40 | 60 | |
| Gateway 80 agents with OS | 270 | 35 | 65 | |
| Gateway 100 agents with OS | 300 | 30 | 70 | |
| Gateway 120 agents with OS | 270 | 20 | 80 | |
| Gateway 150 agents with OS | 390 | 20 | 80 | |
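When talking to the storage team, it often helps to turn the percentages into absolute read and write IOPS. The following Python sketch does that for a few rows of the table above; the helper name `split_iops` is just an illustrative choice.

```python
# (workload, total IOPS, % read) copied from the table above.
workloads = [
    ("Live DB", 450, 50),
    ("Data Warehouse", 1100, 70),
    ("Management Server with OS", 300, 20),
]


def split_iops(total, read_pct):
    """Split a total IOPS figure into (read, write) IOPS."""
    read = total * read_pct / 100
    return read, total - read


for name, total, read_pct in workloads:
    read, write = split_iops(total, read_pct)
    print(f"{name}: {read:.0f} read IOPS, {write:.0f} write IOPS")
```

For example, the Data Warehouse row splits into 770 read IOPS and 330 write IOPS.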


From the table above, with the interesting exception of the average for the gateways in the 120-agent range, we can infer that IOPS demand grows linearly with the number of agents. Once a gateway has more than 50 to 60 agents, it's safe to assume about 2 IOPS per agent for gateway servers.
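One way to check the per-agent figure against the table is sketched below in Python. Note the assumption: it subtracts the "Base OS" figure of 100 IOPS from each "with OS" gateway row before dividing by the number of agents. That is only one plausible reading of the data, and the helper name `per_agent_iops` is hypothetical.

```python
BASE_OS_IOPS = 100  # "Base OS" row; subtracting it is an assumption

# (agents, IOPS with OS) for the gateway rows of the table.
gateways = [(20, 200), (80, 270), (100, 300), (120, 270), (150, 390)]


def per_agent_iops(agents, total_iops, base=BASE_OS_IOPS):
    """IOPS per agent once the assumed OS baseline is removed."""
    return (total_iops - base) / agents


for agents, total in gateways:
    ratio = per_agent_iops(agents, total)
    print(f"{agents:>3} agents: {ratio:.2f} IOPS per agent")
```

Under that assumption the 80, 100 and 150 agent gateways come out close to 2 IOPS per agent, while the 20 agent gateway is noticeably higher and the 120 agent one is lower.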

Hope this can be of help in your capacity planning effort.

– Daniele

This posting is provided "AS IS" with no warranties, and confers no rights.
