The downside of on demand detections


Very few management packs implement on demand detections for monitors. On demand detections are the mechanism behind the “Recalculate Health” and “Reset Health” actions.


Alas, the two cannot be implemented together: I can have a single on demand detection (and in this case I can use Reset Health) *or* multiple on demand detections (one for every possible monitor state, so two or three), and in this case I get “Recalculate Health”. Moreover, cookdown doesn’t work for on demand detections, so for every single monitor target all the on demand detections are run. Bad, very bad. How bad?

Consider this example: you need to monitor 100 SQL databases on a single SQL instance. You want a three-state monitor for, let’s say, lock count on every single DB. It is possible to write a monitor targeted at each DB (100) whose data source runs just once per SQL instance. In a single run the data source returns the data items for all the databases, and then filtering evaluates the state of every single database. This is possible thanks to the cookdown logic built into the HealthService process. But if you want on demand detections for the same monitor, you’ll find your data provider (let’s say a script) is run number of databases (100) * number of on demand detections (3) times, that is 300 cscript.exe processes launched at once.
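The arithmetic above can be sketched in a few lines of Python. This is only a back-of-the-envelope model of the counts described in this post, not HealthService code; the function name `script_runs` and its parameters are made up for illustration.

```python
def script_runs(databases: int, on_demand_detections: int, cookdown: bool) -> int:
    """Count script launches for one SQL instance (illustrative model only)."""
    if cookdown:
        # With cookdown, one run returns data items for every database.
        return 1
    # Without cookdown, every target runs every on demand detection.
    return databases * on_demand_detections


regular = script_runs(databases=100, on_demand_detections=3, cookdown=True)
on_demand = script_runs(databases=100, on_demand_detections=3, cookdown=False)
print(f"regular detection: {regular} run, on demand: {on_demand} runs")
```

With the numbers from the example, the regular detection costs a single script run while the on demand detections cost 300.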

This is bad design, and it makes on demand detections very dangerous. Keep in mind that on demand detections are run at every monitor initialization: when the MP is deployed, when the monitor is modified by an override, when the agent exits maintenance mode, and so on, not just when the user hits the recalculate button. I hope it will be fixed in R2 (stay tuned, it will be one of the first things I’ll try once the RC is out).


– Daniele

This posting is provided "AS IS" with no warranties, and confers no rights.

  1. #1 by RogerM on March 13, 2009 - 9:02 am

    Hi!
    This is very interesting, where have you found information on how the “cookdown” logic for regular detections works?

    What “breaks” this behavior?
    If I override the frequency for one DB, then surely, it must run independently.

    To use your example, you could write a script that returns the value for one DB or for all DBs.

    How does the HealthService distinguish the two?

    As for On-demand detections, they should be “on-demand”, not be run when the monitor initializes.

    That must be a design error.

    Regards
    RogerM

    • #2 by Daniele Grandini on March 14, 2009 - 12:46 pm

      Hi Roger, this should be a post of its own. I don’t have any reference at hand (I think Boris or Marius have something on their blogs, see the blogroll). Anyway, in two words: if you have workflows (monitors, discoveries, rules, etc.) that use the same modules (say a data source) and those data sources have the same signature, the HealthService runs them just once and passes the resulting XML documents to all those entities. By the way, this is why in multihoming discovery is so slow for an added MG (but that’s another story). So for scripts, cookdown works if scheduling and parameters are exactly the same (this is the signature). You’re right, overriding for a specific instance breaks cookdown, but just for that instance. Hope to have time in the future to drill into this.
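The signature idea in the reply above can be modeled roughly as follows. This is a toy sketch under the assumption that a signature is simply the module, its schedule, and its parameters; it is not how the HealthService is actually implemented, and `LockCount.vbs`, `SQL01`, and `count_executions` are hypothetical names.

```python
from collections import defaultdict


def count_executions(workflows):
    """Count distinct data source runs for (target, module, interval, params) tuples."""
    groups = defaultdict(list)
    for target, module, interval, params in workflows:
        # Assumed signature: same module, same schedule, same parameters.
        signature = (module, interval, tuple(sorted(params.items())))
        groups[signature].append(target)
    # Each distinct signature is executed once and shared by its targets.
    return len(groups)


# 100 databases sharing one script, one schedule, and one parameter set.
workflows = [(f"DB{i}", "LockCount.vbs", 300, {"instance": "SQL01"}) for i in range(100)]
shared = count_executions(workflows)

# Overriding the frequency for a single database changes only its signature.
workflows[7] = ("DB7", "LockCount.vbs", 60, {"instance": "SQL01"})
overridden = count_executions(workflows)

print(shared, overridden)
```

In this model the override adds one extra run for the overridden database, while the other 99 still share a single run.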

  1. Now that SCOM 2007 R2 RC is out… « Quaue Nocent Docent
