Fighting against ACS queries


As many of you know the Audit Collection Service (ACS) is a OpsMgr component aimed at collecting events from security event logs. Many design and capacity planning guides focus on event insertion rates, but this is just one side of the coin, once events are collected they are pretty useless if you’re not able to query them.

This installment is about efficiently query the ACS database and related caveats. I will present several execution scenarios and a few optimizations possible on the run without changing the ACS database schema.

My testing bed has been a production environment with a 300 GB ACS database hosted by an 8 cores 8 gigs SQL 2008 SP1 x64 box running on Windows Server 2003 x64 with an 8 spindles RAID1+0 array. The number of online partitions (days) were 61.

Every execution time has been averaged on 3 runs, the base selection time range was 24 hours with a potential target of 16 millions rows.

ID Rows returned select * time select count(*) time View Notes
1 2393K 5’ 37” 10” dbo.dvAll5_GUID Selecting on about 30 different event ids using a IN clause
2 2393K 35’ 17” 17” AdtServer.dvAll5  
3 600K 1’ 05” 1” dbo.dvAll5_GUID Selecting on a single event id
4 2393K 3’26” 12” AdtServer.dvHeader Selecting on about 30 different event ids using a IN clause
5 926 2’41” n.a. AdtServer.dvHeader Same selection filtered by user name

 

From the above table we can infer a few conclusions:

  1. the time difference between select * and select count(*) is, for the most part, due to the fully normalized db schema. The more fields are returned the more joins need to be resolved, so it is generally a bad idea to use a select *, you should return just the fields that you need. As a corollary using just the Header views is better than the All5 or All views, but this is a well documented fact so I won’t argue more.
  2. The huge time difference between targeting the single partition view (i.e. queries 1 and 3) or the generic full view (i.e. 2) is due the fact that SQL need to check every single partition (61 in my case) for relevant data. This implies that your query execution time is not only influenced by the selectivity of your where clause, but by the number of  online partitions as well, regardless of the time range selected. I noticed that the execution time is influenced in a more than linear way (even if I cannot say exponential) by the number of the partitions. With a little add the the ACS db schema I’ve been able to reduce execution time of 2 to 5’07" or exactly inline with 1. Unfortunately I’ve been asked by the product team not to share this hack since they have to verify it won’t have any side effects. To give the right credit the idea for the mod aroused after a insightful discussion with a colleague of mine, Alessandro Nostini, on database partitioning for a software project of ours.
  3. The execution times have significance only for pure SQL query, if you use these queries in a SQL Server Reporting Services report you must take into account some more factors, not least the number of rows returned. If you’re curious about SRS optimizations and troubleshooting I found the following posts really interesting:
    1. <http://blogs.msdn.com/robertbruckner/archive/2009/01/05/executionlog2-view.aspx>
    2. Pet Peeve: Slow Reports <http://blogs.msdn.com/deanka/archive/2009/01/13/pet-peeve-slow-reports.aspx>

I add the queries I used for your reference, as you can see we use a lookup database to avoid to code event ids inside the sql statements, the lookup will just return a list of event ids

1. –16M rows partition
select COUNT(*) from dbo.dvAll5_19d13be1_6975_4cad_8830_b4487a6c5427
where EventId in (select EventId from ProgelACSLookup.dbo.SecurityEventLookup where Category in (‘Logon’, ‘Logoff’) and Type in (8,16))
AND CollectionTime between ‘2009-11-29’ and ‘2009-11-30’
— 10"

select * from dbo.dvAll5_19d13be1_6975_4cad_8830_b4487a6c5427
where EventId in (select EventId from ProgelACSLookup.dbo.SecurityEventLookup where Category in (‘Logon’, ‘Logoff’) and Type in (8,16))
AND CollectionTime between ‘2009-11-29’ and ‘2009-11-30’
— 5’37"

2. – using the full view

select COUNT(*) from AdtServer.dvAll5
where EventId in (select EventId from ProgelACSLookup.dbo.SecurityEventLookup where Category in (‘Logon’, ‘Logoff’) and Type in (8,16))
AND CollectionTime between ‘2009-11-29’ and ‘2009-11-30’
— 17"

select * from AdtServer.dvAll5
where EventId in (select EventId from ProgelACSLookup.dbo.SecurityEventLookup where Category in (‘Logon’, ‘Logoff’) and Type in (8,16))
AND CollectionTime between ‘2009-11-29’ and ‘2009-11-30’
— 35’17"

3. — single eventid
select * from dbo.dvAll5_19d13be1_6975_4cad_8830_b4487a6c5427
where EventId in (4624)
AND CollectionTime between ‘2009-11-29’ and ‘2009-11-30’
— 1′

— from header only
declare @user nvarchar(50)
Set @user=’myadmin’
select * from AdtServer.dvHeader ACS
where EventId in (select EventId from ProgelACSLookup.dbo.SecurityEventLookup where Category in (‘Logon’, ‘Logoff’) and Type in (8,16))
AND CollectionTime between ‘2009-11-29’ and ‘2009-11-30’
AND     (ACS.EventID IN (528, 538, 540,529,531) AND ACS.HeaderUser Like @User AND ACS.HeaderUser not like ‘%$%’)
        OR
        (ACS.EventID IN (4624,4634,4625) AND ACS.TargetUser Like @User AND ACS.TargetUser not like ‘%$%’)
        OR
        (ACS.EventID >27000 AND ACS.PrimaryUser Like @User)
— 2’55"

 

4. — simplified

select * from AdtServer.dvHeader ACS
where EventId in (select EventId from ProgelACSLookup.dbo.SecurityEventLookup where Category in (‘Logon’, ‘Logoff’) and Type in (8,16))
AND CollectionTime between @startdate and @endDate
AND     (ACS.HeaderUser = @user     OR
        ACS.TargetUser = @user
        OR
        ACS.PrimaryUser = @user)

— 3’26"

declare @query nvarchar(4000)
Set @query=N’
select * from AdtServer.dvHeader ACS
where EventId in (select EventId from ProgelACSLookup.dbo.SecurityEventLookup where Category in (”Logon”, ”Logoff”) and Type in (8,16))
AND CollectionTime between ”2009-11-30” and ”2009-12-02”
AND     (ACS.HeaderUser = N”myadmin”     OR
        ACS.TargetUser = N”myadmin”
        OR
        ACS.PrimaryUser = N”myadmin”)’

execute (@query)

— 3’26"

– Daniele

This posting is provided "AS IS" with no warranties, and confers no rights.

Advertisements
  1. Leave a comment

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: