There’s some fuss about the issues System Center agents run into once deployed. No one is perfect, and System Center agents definitely have bugs, but before pointing your finger at them, consider that a monitoring agent uses interfaces that are not normally exercised, and this can surface “new” bugs in OS or application components. In those cases it is not an agent issue: the bug simply shows itself only after the agent is installed. Bottom line: monitoring is never free, even when it is agentless.
This post used to be an Operations Manager agent reference only, but I’ve now expanded its scope to include the whole of System Center 2012.
This is my recommended fix list.
March 4th, 2014 – added KB 2919394 for Windows Server 2012 R2
January 10th, 2014 – updated KB reference for Operations Manager agents, added SQL 2012 minimum CU level
December 6th, 2013 – updated antivirus exclusions and added reference to the curah! page
November 28th, 2013 – added KB 2878378
May 13th, 2013 – added KB2600907 and KB2600217
April 26th, 2013 – added KB2692929 and KB2812950, reviewed service pack and CU requirements for SQL monitoring
March 26th, 2013 – added Windows 2012 January 2013 CU for VMM
February 18th, 2013 – added first fix for Windows Server 2012
January 25th, 2013 – included fixes related to all System Center agents
January 11th, 2013 – removed support for Windows Server 2008 R2 RTM, clarified the fixes apply to System Center 2012 as well
July 6th, 2012 – added a couple of WMI fixes for Windows 2008 R2 SP1 from the AskPerf blog.
July 1st, 2012 – Added KB 2613988 for better WMI performance counter reliability
June 25th, 2012 – Added KB 982018 for Windows 2008 R2 SP1 and KB 2703157 or security bulletin MS12-037 for all versions of Windows 2008 R2
Please also take a look at: System Center 2012 Operations Manager: Recommended agent operating system fixes and updates, and at my curated content here.
Check your antivirus exclusions:
With SC 2012 Operations Manager the installation paths have changed. If your AV software requires executables to be specified with their full path, or does not allow process-based exclusions, the following paths need to be added (assuming you used the default installation paths):
%ProgramFiles%\System Center Operations Manager\Agent
%ProgramFiles%\System Center 2012\Operations Manager
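As a rough illustration, the two default paths above can be expanded into full paths for an AV product that insists on them. This is only a minimal Python sketch: the helper name `expand_exclusions` and the sample `C:\Program Files` value are assumptions made for the example, not part of any System Center tooling.

```python
# Default install paths listed above for SC 2012 Operations Manager.
DEFAULT_EXCLUSIONS = [
    r"%ProgramFiles%\System Center Operations Manager\Agent",
    r"%ProgramFiles%\System Center 2012\Operations Manager",
]


def expand_exclusions(program_files, templates=DEFAULT_EXCLUSIONS):
    """Replace the %ProgramFiles% placeholder with a concrete folder.

    program_files is passed in explicitly (hypothetical helper), so the
    sketch can also be run away from the monitored server.
    """
    base = program_files.rstrip("\\")
    return [t.replace("%ProgramFiles%", base) for t in templates]


if __name__ == "__main__":
    # Sample value only; use the real %ProgramFiles% of the target server.
    for path in expand_exclusions(r"C:\Program Files"):
        print(path)
```

If you did not use the default installation paths, pass your own list as `templates`.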
OS independent and System Center infrastructure systems
- KB 981574 – resolves a handle leak in .NET Framework 2.0 that affects MonitoringHost (in our experience)
- KB 2484832 – FIX: ArgumentOutOfRangeException exception when you resize or scroll a window that has a DataGrid control in a .NET Framework 4.0-based WPF application. Useful on systems with admin console installed.
- KB2600217 – Reliability Update 2 for the .NET Framework 4
- KB2600907 – FIX: A .NET Framework 3.0-based WCF service may crash with a System.ServiceModel.CommunicationException exception if the service uses the netTcpBinding binding
On Windows Server 2012 R2
- KB 2919394 – Windows RT 8.1, Windows 8.1, and Windows Server 2012 R2 update rollup: February 2014. This solves an issue with agents going gray (see Kevin’s post http://blogs.technet.com/b/kevinholman/archive/2014/03/03/agents-on-windows-2012-r2-domain-controllers-can-stop-responding-or-heart-beating.aspx)
On Windows Server 2012
- KB 2790831 – Handle leak in WmiPrvSE.exe process on a Windows 8-based or Windows Server 2012-based computer. This update resolves a handle leak in WMI.
- KB 2785094 – Windows 8 and Windows Server 2012 cumulative update: January 2013. This cumulative update solves storage-related issues, among others, and is especially useful for VMM 2012.
On Windows 2008 R2
- Service Pack 1
- KB 982018 – An update that improves the compatibility of Windows 7 and Windows Server 2008 R2 with Advanced Format Disks is available (fixes EDB corruptions, the culprit behind many mysterious agent failures)
- KB 2465990 – “0x80041002 (WBEM_E_NOT_FOUND)” error occurs when you try to open a WMI namespace on a computer that is running Windows 7 or Windows Server 2008 R2
- KB 2470949 – The RegQueryValueEx function returns a very large incorrect value for the “Avg. Disk sec/Transfer” performance counter in Windows Server 2008 R2 or in Windows 7
- KB 2512889 – Windows Remote Management service leaks memory when it handles certificate authentication requests in Windows 7 or in Windows Server 2008 R2
- KB 2547244 – The WMI service and the WMI providers stop responding when you use WMI performance classes to monitor performance on a computer that is running Windows 7 or Windows Server 2008 R2
- KB 2608408 – The BITS Compact Server service randomly stops responding under high stress in Windows Server 2008 R2 or in Windows Server 2008 SP2
- KB 2613988 – Changes to performance counters are not updated for at least 15 minutes when you use WMI to query performance counter values in Windows 7 or in Windows Server 2008 R2
- KB 2617858 – Unexpectedly slow startup or logon process in Windows Server 2008 R2 or in Windows 7
- KB 2618982 – FIX: Memory leak in Rhs.exe after you configure the IIS 7.5 W3SVC service in a Windows Server 2008 R2 SP1 failover cluster (Needed to avoid memory leak in IIS 7.5 monitoring)
- KB 2692929 – “0x80041001” error when the Win32_Environment WMI class is queried by multiple requestors in Windows 7 or in Windows Server 2008 R2
- KB 2703157 – You encounter a memory leak issue when an application calls the WinHttpGetProxyForUrl function in Windows 7 or in Windows Server 2008 R2, OR KB 2699988 – MS12-037: Cumulative Security Update for Internet Explorer: June 12, 2012. Either one fixes a memory leak in WinHttpGetProxyForUrl that, in my experience, sometimes affects Management Servers (it depends on the workflows being run).
- KB 2705357 – The WMI process stops sending events to WMI clients from a Windows 7-based or Windows Server 2008 R2-based server
- KB 2878378 – SCOM 2012 or SCOM 2007 R2 throws a “Heartbeat Failure” message and then goes into a greyed out state in Windows Server 2008 R2 SP1
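To compare a server against the Windows 2008 R2 list above, something along these lines may help. It is a minimal Python sketch, not a supported tool: the names `RECOMMENDED_2008_R2`, `normalize` and `missing_fixes` are made up for the example, and it assumes you have already collected the installed hotfix IDs yourself (for instance from the output of `Get-HotFix` or `wmic qfe`). Service Pack 1 is not checked here, since it is not a KB entry in the list.

```python
# Recommended fixes for Windows Server 2008 R2 SP1, copied from the list above.
# Each entry is a set of acceptable alternatives: any one of them satisfies
# the entry (KB 2703157 or the MS12-037 update KB 2699988).
RECOMMENDED_2008_R2 = [
    {"KB982018"}, {"KB2465990"}, {"KB2470949"}, {"KB2512889"},
    {"KB2547244"}, {"KB2608408"}, {"KB2613988"}, {"KB2617858"},
    {"KB2618982"}, {"KB2692929"}, {"KB2703157", "KB2699988"},
    {"KB2705357"}, {"KB2878378"},
]


def normalize(kb):
    """Turn 'kb 982018', '982018' or 'KB982018' into 'KB982018'."""
    digits = "".join(ch for ch in str(kb) if ch.isdigit())
    return "KB" + digits


def missing_fixes(installed, recommended=RECOMMENDED_2008_R2):
    """Return the recommended entries for which no alternative is installed."""
    have = {normalize(kb) for kb in installed}
    return [sorted(alts) for alts in recommended if not (alts & have)]


if __name__ == "__main__":
    # Hypothetical input: IDs gathered from the server beforehand.
    for entry in missing_fixes(["KB982018", "KB2699988"]):
        print(" or ".join(entry))
```

Treat the result as a first pass only: a fix that was rolled into a later update may not show up under its own KB number, so a "missing" entry is a prompt to investigate rather than proof that the fix is absent.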
On Windows Server 2008
- Service Pack 2
- KB 968967 – resolves high CPU utilization related to MSXML
- KB 2458331 – addresses notification faults experienced after clearing the event log
- KB 2495300 – Invalid “Avg. Disk sec/Transfer” value returned by the RegQueryValueEx function in Windows Server 2008 or in Windows Vista
- KB 2553708 – A hotfix rollup that improves Windows Vista and Windows Server 2008 compatibility with Advanced Format disks (fixes EDB corruptions, the culprit behind many mysterious agent failures)
- KB 2812950 – The MOMPerfSnapshotHelper.exe process crashes or experiences high CPU usage in Windows Server 2008 SP2
On Windows Server 2003
- Service Pack 2
- Windows Script Host 5.7
- KB 931320 – another issue with WMI on Windows 2003
- KB 932370 – addresses an issue with the processor count in WMI (used by OpsMgr discovery and agent CPU usage collection)
- KB 933061 – fixes several issues in WMI on Windows 2003; it is of great help with WMI issues even if it won’t resolve them all
- KB 943071 – issue with the event provider in managed code and WMI on Windows 2003
- KB 955360 – issue with Windows Script Host 5.7
- KB 956523 – addresses a memory leak in WMI on Windows 2003
- KB 968967 – resolves high CPU utilization related to MSXML
- KB 981263 – fixes EDB corruptions, the culprit behind many mysterious agent failures
On BizTalk 2009 servers (very noisy MP btw)
- You *must* follow what’s reported here http://msdn.microsoft.com/en-us/library/ee290753(BTS.10).aspx. You’ll find specific fixes to apply in addition to the ones I report in this short list.
- You *must* apply Cumulative Update 5 (http://support.microsoft.com/kb/2649852); it includes a fix for a WMI-related issue where BizTalk could hang when monitored by OpsMgr.
On SQL Servers (monitored)
- SQL Server 2005: Service Pack 4 + CU2
- SQL Server 2008: Service Pack 2 + CU4
- SQL Server 2008 R2: Service Pack 1 + CU4
- SQL Server 2012: CU6
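The minimum levels above lend themselves to a simple check. What follows is a minimal Python sketch under stated assumptions: the names `MINIMUM_LEVEL` and `meets_minimum` are invented for the example, SQL Server 2012 is recorded as service pack 0 because the list gives only a CU for it, and a later service pack is assumed to cover the CUs of the earlier one, which you should verify for your own builds.

```python
# Minimum (service pack, cumulative update) per monitored SQL Server version,
# taken from the list above. SQL Server 2012 has no service pack listed, so it
# is recorded as service pack 0 (an assumption of this sketch).
MINIMUM_LEVEL = {
    "2005": (4, 2),
    "2008": (2, 4),
    "2008 R2": (1, 4),
    "2012": (0, 6),
}


def meets_minimum(version, service_pack, cumulative_update):
    """True when the instance is at or above the listed minimum.

    Tuples compare the service pack first, so a later service pack counts as
    newer whatever its CU number (assumption: it rolls up the earlier CUs).
    """
    return (service_pack, cumulative_update) >= MINIMUM_LEVEL[version]


if __name__ == "__main__":
    # Hypothetical instance: SQL Server 2008 R2 SP1 with CU3 applied.
    print(meets_minimum("2008 R2", 1, 3))
```

How you obtain the service pack and CU numbers for each instance is left out on purpose, since it depends on your inventory tooling.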
I’ll try to keep this post up to date with any new fix we consider useful for agent health.
This posting is provided “AS IS” with no warranties, and confers no rights.