Things to make and do for agent health


I have read several posts about issues that agents still face, even with R2. Yes, R2 still has issues (clustering, anyone?), but before pointing your finger at OpsMgr, consider that a monitoring agent uses interfaces that are not normally exercised, and this can expose “new” bugs in OS or application components. In those cases it is not an agent issue: the bug was already there and only surfaces after the agent is installed. Bottom line: monitoring is never free, not even agentless monitoring.

So, before blaming OpsMgr, work through our recommended fix list.

Last Updated: January 11th, 2012

Please also take a look at: Agent Health Tips and Fixes for System Center Operations Manager 2007

Check your antivirus exclusions:

Recommendations for antivirus exclusions that relate to MOM 2005 and to Operations Manager 2007

Virus scanning recommendations for Enterprise computers that are running currently supported versions of Windows

On every OS

  • KB 981574 resolves a handle leak in .NET Framework 2.0 that, in our experience, affects MonitoringHost

    On Windows 2008 R2 with SP1

    • KB 2470949 – The RegQueryValueEx function returns a very large incorrect value for the “Avg. Disk sec/Transfer” performance counter in Windows Server 2008 R2 or in Windows 7
    • KB 2618982 – FIX: Memory leak in Rhs.exe after you configure the IIS 7.5 W3SVC service in a Windows Server 2008 R2 SP1 failover cluster (needed to avoid a memory leak in IIS 7.5 monitoring)

    On Windows 2008 R2 (I advise SP1)

    • KB 981314 fixes a nasty memory leak in WMI
    • KB 981936 fixes EDB corruption, the culprit behind many mysterious agent failures
    • KB 2470949 – The RegQueryValueEx function returns a very large incorrect value for the “Avg. Disk sec/Transfer” performance counter in Windows Server 2008 R2 or in Windows 7

    On Windows Server 2008

    • Service Pack 2
    • KB 968967 resolves high CPU utilization related to MSXML
    • KB 981936 fixes EDB corruption, the culprit behind many mysterious agent failures
    • KB 2458331 addresses notification faults experienced after clearing the event log
    • KB 2495300 – Invalid “Avg. Disk sec/Transfer” value returned by the RegQueryValueEx function in Windows Server 2008 or in Windows Vista

    On Windows Server 2003

    • Service Pack 2
    • Windows Script Host 5.7
    • KB 931320 addresses another WMI issue on Windows 2003
    • KB 932370 addresses an issue with the processor count in WMI (used by OpsMgr discovery)
    • KB 933061 fixes several issues in WMI on Windows 2003; it is a great help with WMI issues even if it won’t resolve them all
    • KB 943071 addresses an issue with the event provider in managed code and WMI on Windows 2003
    • KB 955360 addresses an issue with Windows Script Host 5.7
    • KB 956523 addresses a memory leak in WMI on Windows 2003
    • KB 968967 resolves high CPU utilization related to MSXML
    • KB 981263 fixes EDB corruption, the culprit behind many mysterious agent failures
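
If you want to check a server against the OS lists above, something like the following could be a starting point. It is only a minimal sketch: the KB numbers are copied from this post, but the names `RECOMMENDED`, `EVERY_OS`, `normalize` and `missing_hotfixes` are made up for illustration, and the list of installed hotfixes is assumed to come from whatever inventory you already have (for example the output of PowerShell's `Get-HotFix`). A KB reported as missing may simply have been superseded by a later update or service pack, so treat the result as a hint, not a verdict.

```python
# Recommended hotfixes per OS, copied from the lists in this post.
# The dictionary keys are illustrative labels, not official OS identifiers.
RECOMMENDED = {
    "Windows Server 2008 R2 SP1": ["KB2470949", "KB2618982"],
    "Windows Server 2008 R2": ["KB981314", "KB981936", "KB2470949"],
    "Windows Server 2008": ["KB968967", "KB981936", "KB2458331", "KB2495300"],
    "Windows Server 2003": [
        "KB931320", "KB932370", "KB933061", "KB943071",
        "KB955360", "KB956523", "KB968967", "KB981263",
    ],
}

# Listed under "On every OS" in this post.
EVERY_OS = ["KB981574"]


def normalize(kb):
    """Turn 'KB 981574', 'kb981574' or '981574' into 'KB981574'."""
    digits = "".join(ch for ch in str(kb) if ch.isdigit())
    return "KB" + digits


def missing_hotfixes(os_name, installed):
    """Return the recommended KBs for os_name that are not in installed.

    installed is any iterable of hotfix IDs (hypothetical input, e.g.
    collected from an inventory tool). Unknown OS names only get the
    "every OS" list.
    """
    wanted = EVERY_OS + RECOMMENDED.get(os_name, [])
    have = {normalize(kb) for kb in installed}
    return [kb for kb in wanted if kb not in have]


if __name__ == "__main__":
    # Example: a 2008 R2 box that already has the two WMI/EDB fixes.
    print(missing_hotfixes("Windows Server 2008 R2", ["KB981314", "KB 981936"]))
    # prints ['KB981574', 'KB2470949']
```

The lookup is deliberately dumb: it only compares IDs, so service packs and the Windows Script Host 5.7 prerequisite still have to be checked by hand.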

    On BizTalk 2009 servers (very noisy MP, by the way)

    On SQL Servers (monitored)

  • SQL Server 2005: Service Pack 3 + CU9
  • SQL Server 2008: Service Pack 1 + CU9 or Service Pack 2 + CU1
  • SQL Server 2008 R2: CU3 or Service Pack 1

I’ll try to keep this post up to date with any new fixes we consider useful for agent health.

    – Daniele

    This posting is provided “AS IS” with no warranties, and confers no rights.