So we eventually completed our SC 2012 – Operations Manager upgrade and we have a lot of tales to tell and share.
In this installment I will try to outline changes in Sites and Gateways.
Introduction
Just a little background before proceeding further.
Gateways are special purpose management servers: they are a contact point for agents, but do not talk directly with the SQL backend in fact they need to report to a standard management server. Gateways are useful with untrusted forests and for bandwidth optimizations for remote locations.
Sites are a construct OM inherited from System Center Essentials and are basically a way to group resources managed by a one or more gateways. They are an handy way to organize managed objects in different locations (both geographical or logical), plus every alert gets stamped with the generating site. Sites are useful for console scoping and alert management.
To create a site (without using the SDK) you need to create a gateway, more a site can only be created at gateway approval time, this the command line to create MySite site associated with dummygw.dom.lab gateway:
MICROSOFT.ENTERPRISEMANAGEMENT.GATEWAYAPPROVALTOOL.EXE /ManagementServerName=ms1.dom.lab /GatewayName=dummygw.dom.lab /SiteName=MySite
Sites are slightly changed in OM2012, nothing you can get in the UI, but the code has been reviewed and a new relationship of type Microsoft.SystemCenter.SiteContext has been added to track site memberships.
The issue
This change has an important implication (probably a bug, but not confirmed yet): you must avoid orphaned sites. Orphaned sites are sites without any gateway associated with. If you set yourself in this situation you will mess the entire sites relationships and all managed objects will be set as members of the orphaned site.
You can detect you’re in this situation checking the groups hierarchy and the site property of your alerts. The group hierarchy in a normal situation is like this:
While if you’re having the issue it will have the orphaned site as the root element:
This is where things get interesting, aren’t they?
Avoiding orphaned sites
The first thing to think about is how to avoid an orphaned site, this is pretty easy don’t (ever) delete a gateway associated with a site from Operations Console.
This implies you have to delete the gateway and its associated site with the same tool you used to create them:
MICROSOFT.ENTERPRISEMANAGEMENT.GATEWAYAPPROVALTOOL.EXE /ManagementServerName=ms1.dom.lab /GatewayName=dummygw.dom.lab /SiteName=MySite /Action=DELETE
And… you’ll be surprised (maybe not if you’re a fellow QND reader) it doesn’t work:
This is not a new bug, in fact the exact same issue was present in OM2007R2 sigh (Gateway decomissioning post updated). The recovery from this specific error is similar, but need some updates due to the fact the DB schema has changed.
Before going on you just remember the following database updates and queries
ARE TOTALLY UNSUPPORTED IF YOU DO IT, YOU DO IT AT YOUR OWN RISK
The following update will remove the relationship HealthServiceShoudManageEntity between the gateway and site, so that the command line can be executed successfully:
SQL
declare @nodeHS nvarchar(255)
Declare @discoverySourceId uniqueidentifier
Declare @RelationshipId uniqueidentifier
Set @nodeHS=N’Microsoft.SystemCenter.HealthService:dummygw.dom.lab’
SELECT @discoverySourceId=DSR.DiscoverySourceId, @RelationshipId=DSR.RelationshipId
–, RGV.SourceObjectFullName, RGV.TargetObjectFullName, RGV.RelationshipId
FROM dbo.RelationshipGenericView RGV
inner join dbo.DiscoverySourceToRelationship DSR on DSR.RelationshipId=RGV.Id
WHERE ((RGV.[RelationshipId] = dbo.fn_ManagedTypeId_MicrosoftSystemCenterHealthServiceShouldManageEntity()) AND (((RGV.[IsDeleted] = 0))))
AND (RGV.[SourceObjectId]
IN (select BaseManagedEntityId from BaseManagedEntity where FullName =@NodeHS))
declare @utc datetime
Set @utc = GETUTCDATE()
exec dbo.p_RemoveRelationshipFromDiscoverySourceScope @RelationshipID=@RelationshipId ,
@DiscoverySourceId=@discoverySourceId , @TimeGenerated=@Utc
Do this only if you’re removing the last gateway associated with a site, if not the site is removed and this time the gateway is orphaned. If you have more gateways associated with site you can delete them from UI, except for the last one.
Get out of troubles
Now, let’s face the case where you deleted the gateway from the Console and you’re in troubles. Here again some SQL queries will bring the day home. First thing to know is which is the orphaned site, obviously you can get it in Console in the groups view, but chances are that you can have multiple orphaned sites, so better have a comprehensive list.
The sites list is in the following tables: MT_Site if you upgraded from OM2007, MT_Microsoft$SystemCenter$Site if you have a clean OM2012 installation. In the rest of the post I will use MT_Site in my queries ‘cause it is shorter J. I tested the query in both scenarios fresh install and upgraded DB.
The sites not listed by the following query are orphaned
SQL
– find orphaned sites (i.e. sites not managed by an healthservice) – sites not returned by this query are orphaned
select DISTINCT R.LastModified, R.TimeAdded, R.IsDeleted,
RT.RelationshipTypeName, RT.RelationshipTypeTableName, ME1.DisplayName As ‘Source’, ME.DisplayName As ‘Target’
from Relationship R
inner join RelationshipType RT on RT.RelationshipTypeId=R.RelationshipTypeId
inner join BaseManagedEntity ME on ME.BaseManagedEntityId=R.TargetEntityId
inner join BaseManagedEntity ME1 on ME1.BaseManagedEntityId=R.SourceEntityId
where R.TargetEntityId in (Select BaseManagedEntityId From MT_Site)
AND R.RelationshipTypeId=dbo.fn_ManagedTypeId_MicrosoftSystemCenterHealthServiceShouldManageEntity()
AND R.IsDeleted=0
Order By Source
Now that we have a list of orphaned sites, we have basically two options:
1. Add a dummy gateway for each site and then remove the gateway following the steps I outlined
2. Delete the site from the Operations Manager database
Option 1 should be clear by now, to implement option 2 we need to unlink the site from the discovery rule, this way Operations Manager “knows” it must remove the managed entity (the site itself) and replicate the removal to the data warehouse. To do this we must use the p_RemoveEntityFromDiscoverySourceScope stored procedure and pass the TypedManagedEntityId and DiscoverySOurceId for our site. Incidentally the DiscoverySourceId is sdk generated so you won’t find any actual discovery rule associated (Discovery table join on DiscoveryId field), this is appears to be a constant ’85AB926D-6E0F-4B36-A951-77CCD4399681′, getting the TypeManagedEntityId is then just a matter of proper table joining
SQL
declare @TypedManagedEntityId uniqueidentifier
select @TypedManagedEntityId=TME.TypedManagedEntityId from MT_Site S
inner join BaseManagedEntity ME on ME.BaseManagedEntityId=S.BaseManagedEntityId
inner join dbo.TypedManagedEntity TME on TME.BaseManagedEntityId=S.BaseManagedEntityId
inner join dbo.DiscoverySourceToTypedManagedEntity DSTM on DSTM.TypedManagedEntityId=TME.TypedManagedEntityId
inner join dbo.DiscoverySource DS on DS.DiscoverySourceId=DSTM.DiscoverySourceId
where DS.DiscoverySourceType=2 AND DS.DiscoverySourceId=’85AB926D-6E0F-4B36-A951-77CCD4399681′ AND ME.FullName=‘Microsoft.SystemCenter.Site:MySite’
declare @utc datetime
Set @utc = GETUTCDATE()
exec dbo.p_RemoveEntityFromDiscoverySourceScope @TypedManagedEntityId=@TypedManagedEntityId
, @DiscoverySourceId=’85AB926D-6E0F-4B36-A951-77CCD4399681′
, @TimeGenerated = @utc
I strongly suggest you execute the first SELECT statement with SELECT * to check for the returned data.
One last bug
Once you get rid of all the orphaned sites you’ll see your groups hierarchy returning flat (almost), but you’ll have an incorrect assignment for alerts for about two days. The statement above marks the objects for deletion (IsDeleted=1), but it doesn’t physically remove them. The internal grooming process will take care of it, right now it hard deletes object deleted from more than 48 hours. Sadly the p_InsertAlert stored procedura has a bug when it relates the alert to the site, in fact it doesn’t take into consideration the IsDeleted flag. The stored procedure should filter out logically deleted items, but it does not.
SQL
CREATE PROCEDURE [dbo].[p_AlertInsert]
(
@AlertName nvarchar(255),
@AlertDescription nvarchar(2000),
@BaseManagedEntityId uniqueidentifier,
@ProblemId uniqueidentifier,
@IsMonitorAlert bit,
@RuleId uniqueidentifier,
@ResolutionState tinyint,
@Priority tinyint,
@Severity tinyint,
@Owner nvarchar(255),
@ResolvedBy
– Figure out the site this alert came from.
SELECT @SiteName = SITE.[DisplayName]
FROM dbo.[BaseManagedEntity] BME
JOIN dbo.[Relationship] R
ON BME.[TopLevelHostEntityId] = R.[TargetEntityId]
JOIN dbo.[BaseManagedEntity] SITE
ON SITE.[BaseManagedEntityId] = R.[SourceEntityId]
WHERE BME.[BaseManagedEntityId] = @BaseManagedEntityId
AND R.[RelationshipTypeId] = dbo.fn_ManagedTypeId_MicrosoftSystemCenterSiteMembership()
– MISSING AND R.IsDeleted=0
Supporting queries and other info
Troubleshooting the issue I used several supporting queries, I list them all for future reference
SQL – Return Site Context relationships
select R.LastModified, R.TimeAdded, R.IsDeleted, R.RelationshipId,
RT.RelationshipTypeName, RT.RelationshipTypeTableName, ME1.DisplayName, ME.DisplayName, SC.SiteContextOwnerId_2A841CAE_93E8_4BDD_5491_9E4AAF8B5C9A
from dbo.MT_Microsoft$SystemCenter$SiteContext SC
inner join Relationship R on R.RelationshipId=SC.RelationshipId
inner join RelationshipType RT on RT.RelationshipTypeId=R.RelationshipTypeId
inner join BaseManagedEntity ME on ME.BaseManagedEntityId=R.TargetEntityId
inner join BaseManagedEntity ME1 on ME1.BaseManagedEntityId=R.SourceEntityId
SQL – Returns all relationships involving sites
– All site relationships involving sites (table name could change)
select DISTINCT R.LastModified, R.TimeAdded, R.IsDeleted,
RT.RelationshipTypeName, RT.RelationshipTypeTableName, ME1.DisplayName As ‘Source’, ME.DisplayName As ‘Target’
from Relationship R
inner join RelationshipType RT on RT.RelationshipTypeId=R.RelationshipTypeId
inner join BaseManagedEntity ME on ME.BaseManagedEntityId=R.TargetEntityId
inner join BaseManagedEntity ME1 on ME1.BaseManagedEntityId=R.SourceEntityId
where R.TargetEntityId in (Select BaseManagedEntityId From MT_Site)
OR R.SourceEntityId in (Select BaseManagedEntityId From MT_Site)
Order By Source
The process that creates the SiteContext relationship is “SC DAL – EDP”
#1 by DDog on July 24, 2012 - 6:54 pm
YOU ROCK! This was plaguing me and I couldn’t find ANYTHING on the web regarding this issue. This fixed the issue. Thank you!!! I’ve added an article to our internal knowledgebase in case this happens again. What a bug!
#2 by Daniele Grandini on July 26, 2012 - 3:51 pm
glad to be helpful :-)