In one of my latest posts (Tired of WMI Probe Failed Execution-) I stated that the WMI fix cited in KB 933061 has very positive effects on monitor reliability. I must say that WMI errors have not disappeared altogether, but they have decreased by two orders of magnitude. In my quest for the perfect monitoring agent I also noticed a significant decrease in CPU utilization by HealthService. As you can see from the following screenshots, this is a huge improvement.
So the fix is recommended not only for monitor reliability, but also for decreasing the monitoring impact. Not all environments will record the same improvement: if you have a quiet HealthService you won't notice any difference, but if you have tons of 21025 events (see the event saga in previous and future posts) on your RMS, then you'll benefit from the fix.
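To judge whether your RMS falls into the "tons of 21025" category, it helps to count the events per hour. What follows is only a minimal sketch: it assumes you have already exported the Operations Manager event log into (timestamp, event id) pairs, and that record shape, the function name count_events_per_hour and the sample values are all made up for illustration.

```python
from collections import Counter
from datetime import datetime


def count_events_per_hour(records, event_id=21025):
    """Count how many times event_id occurs in each one-hour bucket.

    records is an iterable of (iso_timestamp, event_id) pairs. This shape is
    an assumption for the example, not an Operations Manager export format.
    """
    per_hour = Counter()
    for timestamp, eid in records:
        if int(eid) != event_id:
            continue  # ignore every other event id
        # Truncate the timestamp to the start of its hour.
        hour = datetime.fromisoformat(timestamp).replace(
            minute=0, second=0, microsecond=0
        )
        per_hour[hour] += 1
    return per_hour


# Invented sample data, only to show the call.
sample = [
    ("2008-03-01T10:05:00", 21025),
    ("2008-03-01T10:20:00", 21025),
    ("2008-03-01T10:40:00", 1210),
    ("2008-03-01T11:02:00", 21025),
]

counts = count_events_per_hour(sample)
for hour, total in sorted(counts.items()):
    print(hour.isoformat(), total)
```

A handful of events per day is a quiet RMS. Several per hour, every hour, is the situation where I would expect the fix to make a visible difference.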
The reason is hidden in monitor state, rollups and internal tasks. The sequence is more or less the following. WMI is not reliable, so monitors based on WMI probes are not always up to date or, worse, are in an unknown health state. WMI errors also affect the discovery process: some properties and objects are not discovered at every iteration, and there are cases where they flip-flop between iterations. These changes in discovered entities fire 21025 on the agents and, worse, on the RMS. At every 21025 on the RMS (caused by these and other reasons, see the wrap-up post on 21025) the RMS fires a recalculation of every monitor involved in a rollup whose state it doesn't know. The internal task reaches the agent, the agent reloads its workflows (a huge CPU spike, as you can see from the previous screenshots) and tries to recalculate the monitor state. Alas, WMI is unreliable and the agent is not able to recalculate it. At the next 21025 on the RMS the entire process restarts. If 21025 events on the RMS are frequent, the recalculation basically never ends. The fix makes the agent more reliable in calculating monitor state and, as you can see, HealthService becomes quieter.
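The loop described above can be put into a toy model. This is not how the RMS or the agent is implemented. It is a deliberately simplified sketch that assumes a single monitor and treats WMI as either always working or always failing, and the function name count_workflow_reloads is hypothetical.

```python
def count_workflow_reloads(num_21025_events, wmi_reliable):
    """Toy model: count agent workflow reloads for one monitor.

    Simplifying assumptions (not taken from the product): one monitor,
    and WMI either always works or always fails.
    """
    state_known = False  # the RMS starts out not knowing the monitor state
    reloads = 0
    for _ in range(num_21025_events):
        if state_known:
            # The RMS only recalculates monitors whose state it doesn't know.
            continue
        # The internal task reaches the agent and the workflows are
        # reloaded: this is the CPU spike.
        reloads += 1
        # The recalculation succeeds only if the WMI probe works.
        state_known = wmi_reliable
    return reloads


print(count_workflow_reloads(10, wmi_reliable=False))  # every 21025 costs a reload
print(count_workflow_reloads(10, wmi_reliable=True))   # only the first one does
```

With unreliable WMI every single 21025 costs a workflow reload, while with reliable WMI only the first one does. A real environment sits somewhere in between, since the fix reduces WMI errors rather than eliminating them.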
For more troubleshooting tips on WMI failures I would suggest the following post: Getting lots of Script Failed To Run alerts? WMI Probe Failed Execution? Backward Compatibility Script Error?
This posting is provided "AS IS" with no warranties, and confers no rights.