Class properties that get updated frequently are a WORST PRACTICE, not only for the RMS


One of our customers complained that the HealthService was consuming too much CPU on his servers, so I started to investigate the cause of the problem.

The following picture shows the HealthService performance graph taken from Process Explorer:

HealthService Performance Graph

I found that the high CPU usage was caused by a discovery rule that updates a property, LastRun, every time it runs (it was scheduled to run every 5 minutes). As outlined in the blog post “WORST PRACTICE: Class properties that get updated frequently”, this is a bad idea because it has a performance impact on the RMS. In my tests I noticed a performance impact on the agent HealthService that executes the discovery as well. Every time the discovery rule updates a class property, the RMS seems to force the agent to reload its configuration, and the following event is logged in the RMS event log:
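To illustrate why a property like LastRun is a problem, here is a toy model in Python. It is not how OpsMgr detects changes internally; it only assumes that the submitted discovery data is compared between runs in some way. The property names `DisplayName` and `Version`, the value `MyApp`, and both helper functions are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


def discovery_payload(include_last_run: bool, now: datetime) -> dict:
    """Build a toy discovery result for one class instance (hypothetical properties)."""
    payload = {"DisplayName": "MyApp", "Version": "1.0"}
    if include_last_run:
        # Volatile property: its value is different on every run.
        payload["LastRun"] = now.isoformat()
    return payload


def fingerprint(payload: dict) -> str:
    """Hash the payload so that two runs can be compared cheaply."""
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


# Two consecutive runs, 5 minutes apart (the schedule mentioned in this post).
run1 = datetime(2009, 1, 17, 22, 20, tzinfo=timezone.utc)
run2 = datetime(2009, 1, 17, 22, 25, tzinfo=timezone.utc)

stable_changed = (
    fingerprint(discovery_payload(False, run1))
    != fingerprint(discovery_payload(False, run2))
)
volatile_changed = (
    fingerprint(discovery_payload(True, run1))
    != fingerprint(discovery_payload(True, run2))
)

print("stable discovery changed:  ", stable_changed)
print("volatile discovery changed:", volatile_changed)
```

In this model a discovery without LastRun looks identical from one run to the next, while the one with LastRun looks like a change on every run. That matches the behaviour I observed, where each run was followed by a configuration update.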

Event Source: OpsMgr Config Service | Event ID: 29102 | Date:  17/01/2009 | Time:  22.22.38 | Computer: RMS
Description:Configuration state of OpsMgr Health Service “{xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}” running on “<agentfqdn>” may be out of date. It should contact OpsMgr Config Service to synchronize its configuration state.

AGENT requests an updated configuration:

Event Source: OpsMgr Connector | Event ID: 21024 | Date:  17/01/2009 | Time:  22.22.40 | Computer: AGENT
Description:OpsMgr’s configuration may be out-of-date for management group <MGNAME>, and has requested updated configuration from the Configuration Service. The current(out-of-date) state cookie is “XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX”

RMS receives the request:

Event Source: OpsMgr Config Service | Event ID: 29103 | Date:  17/01/2009 | Time:  22.22.43 | Computer: RMS
Description:OpsMgr Health Service “{xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}” running on “<agentfqdn>” has contacted OpsMgr Config Service to synchronize its configuration state.  The configuration state cookie for the OpsMgr Health Service running on “<agentfqdn>” is “YY YY YY YY YY YY YY YY YY YY YY YY YY YY YY YY YY YY YY YY”.

AGENT receives the new configuration:

Event Source: OpsMgr Connector  | Event ID: 21025 | Date:  17/01/2009 | Time:  22.22.49 | Computer: AGENT
Description:OpsMgr has received new configuration for management group <MGNAME> from the Configuration Service.  The new state cookie is “YY YY YY YY YY YY YY YY YY YY YY YY YY YY YY YY YY YY YY YY”.

AGENT loads the new configuration:

Event Source: HealthService | Event ID: 1210 | Date:  17/01/2009 | Time:  22.23.08 | Computer: AGENT
Description:New configuration became active. Management group “<MGNAME>”, configuration id:”YY YY YY YY YY YY YY YY YY YY YY YY YY YY YY YY YY YY YY YY”
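The five events above form one complete reload cycle. As a small sketch, the timestamps can be put into a Python list to see how long each step took. The event IDs and times are copied from the events above; the step labels and helper names are my own.

```python
from datetime import datetime

# (step label, event ID, time) taken from the five events listed above.
EVENTS = [
    ("RMS flags the agent configuration as out of date", 29102, "22.22.38"),
    ("AGENT requests an updated configuration", 21024, "22.22.40"),
    ("RMS receives the request", 29103, "22.22.43"),
    ("AGENT receives the new configuration", 21025, "22.22.49"),
    ("AGENT loads the new configuration", 1210, "22.23.08"),
]


def parse_time(text: str) -> datetime:
    """Parse the HH.MM.SS time format used in the event log excerpts."""
    return datetime.strptime(text, "%H.%M.%S")


def step_durations(events) -> list[int]:
    """Return the seconds elapsed between each pair of consecutive events."""
    times = [parse_time(t) for _, _, t in events]
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


durations = step_durations(EVENTS)
total_seconds = sum(durations)

for (label, event_id, _), seconds in zip(EVENTS[1:], durations):
    print(f"{seconds:>3}s until event {event_id}: {label}")
print(f"Whole cycle: {total_seconds}s")
```

For this cycle, the longest gap is between receiving the new configuration (21025) and activating it (1210).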

The configuration reload is a CPU-intensive operation: for example, the agent needs to parse all the management packs (I saw this with Filemon). After I removed the LastRun property from the discovery, CPU usage decreased to an acceptable value.
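A rough back-of-the-envelope calculation shows how often this happens. The sketch below assumes that every discovery run triggers exactly one reload, which is what I observed but not something I can guarantee in general. The 30 seconds is the wall-clock length of the cycle in the events above (22.22.38 to 22.23.08), not measured CPU time.

```python
def reloads_per_day(interval_minutes: int) -> int:
    """Number of discovery runs (and, by assumption, reloads) in 24 hours."""
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    return (24 * 60) // interval_minutes


# Wall-clock length of one reload cycle, from the event timestamps above.
# Assumption: every cycle takes about this long.
RELOAD_CYCLE_SECONDS = 30

runs = reloads_per_day(5)  # the discovery was scheduled every 5 minutes
busy_seconds = runs * RELOAD_CYCLE_SECONDS
busy_fraction = busy_seconds / (24 * 60 * 60)

print(f"{runs} reloads per day")
print(f"{busy_fraction:.0%} of the day spent inside a reload cycle")
```

Under these assumptions the agent is in the middle of a configuration reload for a noticeable part of every day, on every server that runs the discovery.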

If you want to try this in your lab, here is a sample MP with a discovery that runs every 2 minutes to simulate this behaviour:
Discovery.HighCPU.xml (remove the .doc extension, which was used only to upload the MP). DON’T USE IT IN A PRODUCTION ENVIRONMENT.
