SCOM 2012 Availability report bug – Bars display as dark grey Up (Monitoring unavailable)

Recently I came across an issue with SCOM 2012 availability reports which causes the bars at the top level to display incorrectly.

[Screenshot: availability report drill-down]

This is caused by a bug that creates a duplicate entry in the HealthServiceOutage table. The duplicate has an outage start time but no outage end time, which leads to an incorrect availability calculation for the affected objects.

The following SQL query will allow you to identify if you are affected by this issue:

Step 1:

SELECT * FROM HealthServiceOutage HS1 JOIN HealthServiceOutage HS2
ON HS1.StartDateTime = HS2.StartDateTime
AND HS1.ManagedEntityRowId = HS2.ManagedEntityRowId
WHERE HS2.EndDateTime IS NULL AND HS1.HealthServiceOutageRowId <> HS2.HealthServiceOutageRowId

If this query returns any records, make a note of the StartDateTime values in the duplicate rows. These dates will be used again later to correct the problem.
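To make the logic of the Step 1 self-join easier to see, here is a minimal sketch that runs the same join against a toy in-memory SQLite table. The table only has the four columns named in the query, and the sample rows, `make_demo_db` and `find_open_duplicates` are my own illustrative assumptions, not part of SCOM or its data warehouse schema.

```python
import sqlite3


def make_demo_db():
    """Build a toy HealthServiceOutage table (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE HealthServiceOutage (
               HealthServiceOutageRowId INTEGER PRIMARY KEY,
               ManagedEntityRowId INTEGER,
               StartDateTime TEXT,
               EndDateTime TEXT)"""
    )
    conn.executemany(
        "INSERT INTO HealthServiceOutage VALUES (?, ?, ?, ?)",
        [
            (1, 10, "2012-01-21 08:00:00", "2012-01-21 09:30:00"),  # closed outage
            (2, 10, "2012-01-21 08:00:00", None),  # duplicate with no end time
            (3, 11, "2012-02-01 12:00:00", None),  # open outage, no duplicate
        ],
    )
    return conn


def find_open_duplicates(conn):
    """Return (open_row_id, matching_row_id, start) using the Step 1 join."""
    return conn.execute(
        """SELECT HS2.HealthServiceOutageRowId,
                  HS1.HealthServiceOutageRowId,
                  HS2.StartDateTime
           FROM HealthServiceOutage HS1
           JOIN HealthServiceOutage HS2
             ON HS1.StartDateTime = HS2.StartDateTime
            AND HS1.ManagedEntityRowId = HS2.ManagedEntityRowId
           WHERE HS2.EndDateTime IS NULL
             AND HS1.HealthServiceOutageRowId <> HS2.HealthServiceOutageRowId
           ORDER BY HS2.HealthServiceOutageRowId"""
    ).fetchall()


if __name__ == "__main__":
    for open_id, match_id, start in find_open_duplicates(make_demo_db()):
        print(f"row {open_id} duplicates row {match_id}, started {start}")
```

In this toy data only row 2 is reported: it shares an entity and start time with row 1 but has no end time. Row 3 is also open, but it has no twin, so the join ignores it.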

This issue is addressed in UR3 for SCOM 2012 SP1, but if you are not planning to roll that out in the near future, a private fix is available from Microsoft which corrects the relevant stored procedure. As this is an acknowledged known issue, Microsoft will not charge for a support case raised to address it.

Once you have applied the fix you will need to use the following queries to add an outage end time to the duplicate entries and then re-aggregate the affected data.

As always, before performing any database update operations, make sure you take a full backup of the OperationsManager and OperationsManagerDW databases.

Step 2:

This query will update the EndDateTime value from NULL to a valid timestamp.

UPDATE HS2
SET HS2.EndDateTime = HS1.EndDateTime
FROM HealthServiceOutage HS1 JOIN HealthServiceOutage HS2
ON HS1.StartDateTime = HS2.StartDateTime
AND HS1.ManagedEntityRowId = HS2.ManagedEntityRowId
WHERE HS2.EndDateTime IS NULL AND HS1.HealthServiceOutageRowId <> HS2.HealthServiceOutageRowId

Once Step 2 has finished running you should re-run the query in Step 1 to make sure that there are no additional affected rows.
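As a rough illustration of what Step 2 does and how the follow-up check behaves, the sketch below backfills the missing end time on a toy SQLite table and then counts the remaining duplicates. The four-column table, the sample rows and the function names are hypothetical. SQLite does not need the T-SQL `UPDATE ... FROM` form here, so the sketch uses a correlated subquery instead, and it additionally requires the matching row to have an end time, which the original query does not.

```python
import sqlite3

# Rows that share entity and start time with the row being updated
# and already carry an end time (extra filter, see note above).
MATCH = """
    SELECT HS1.EndDateTime
    FROM HealthServiceOutage AS HS1
    WHERE HS1.StartDateTime = HealthServiceOutage.StartDateTime
      AND HS1.ManagedEntityRowId = HealthServiceOutage.ManagedEntityRowId
      AND HS1.HealthServiceOutageRowId <> HealthServiceOutage.HealthServiceOutageRowId
      AND HS1.EndDateTime IS NOT NULL
"""


def make_demo_db():
    """Build a toy HealthServiceOutage table (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE HealthServiceOutage (
               HealthServiceOutageRowId INTEGER PRIMARY KEY,
               ManagedEntityRowId INTEGER,
               StartDateTime TEXT,
               EndDateTime TEXT)"""
    )
    conn.executemany(
        "INSERT INTO HealthServiceOutage VALUES (?, ?, ?, ?)",
        [
            (1, 10, "2012-01-21 08:00:00", "2012-01-21 09:30:00"),
            (2, 10, "2012-01-21 08:00:00", None),  # duplicate to be fixed
            (3, 11, "2012-02-01 12:00:00", None),  # genuine open outage
        ],
    )
    return conn


def backfill_end_times(conn):
    """Copy the end time from the matching row; return rows changed."""
    cursor = conn.execute(
        f"""UPDATE HealthServiceOutage
            SET EndDateTime = ({MATCH})
            WHERE EndDateTime IS NULL AND EXISTS ({MATCH})"""
    )
    return cursor.rowcount


def count_open_duplicates(conn):
    """Re-run the Step 1 join and return how many rows it still finds."""
    return conn.execute(
        """SELECT COUNT(*)
           FROM HealthServiceOutage HS1
           JOIN HealthServiceOutage HS2
             ON HS1.StartDateTime = HS2.StartDateTime
            AND HS1.ManagedEntityRowId = HS2.ManagedEntityRowId
           WHERE HS2.EndDateTime IS NULL
             AND HS1.HealthServiceOutageRowId <> HS2.HealthServiceOutageRowId"""
    ).fetchone()[0]
```

In the toy data, one row is updated and the Step 1 check then finds nothing, while the genuine open outage on row 3 is left untouched.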

Step 3:

This query will set the DirtyInd value for all the rows in the specified time range from 0 to 1, making them eligible for re-aggregation. The start date will be the StartDateTime value noted in Step 1 (the earliest one, if there are several); the end date should be today's date.

-- Example range (21 January 2012 to 13 March 2012); substitute your own dates.
UPDATE StandardDatasetAggregationHistory
SET DirtyInd = 1
WHERE DatasetId = (SELECT DatasetId FROM StandardDataset WHERE SchemaName = 'State')
AND AggregationDateTime >= '2012-01-21T00:00:00'
AND AggregationDateTime < '2012-03-13T00:00:00'
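If the range between the first affected date and today is long, one option is to flag it in smaller windows so that each pass leaves fewer rows to re-aggregate. The helper below is only a sketch of that idea: `aggregation_windows` and `as_sql_literal` are hypothetical names, and the window size is an arbitrary choice. The dates are formatted as ISO 8601 with a `T` separator, which SQL Server reads the same way regardless of the session's language or date format settings.

```python
from datetime import datetime, timedelta


def aggregation_windows(start, end, days=7):
    """Split [start, end) into consecutive windows of at most `days` days."""
    if days <= 0:
        raise ValueError("days must be positive")
    windows = []
    cursor = start
    while cursor < end:
        upper = min(cursor + timedelta(days=days), end)
        windows.append((cursor, upper))
        cursor = upper
    return windows


def as_sql_literal(moment):
    """Format a datetime as an unambiguous ISO 8601 string."""
    return moment.strftime("%Y-%m-%dT%H:%M:%S")


if __name__ == "__main__":
    # Example range from the query above; replace with your own dates.
    first = datetime(2012, 1, 21)
    last = datetime(2012, 3, 13)
    for lower, upper in aggregation_windows(first, last, days=30):
        print(
            "AggregationDateTime >= '%s' AND AggregationDateTime < '%s'"
            % (as_sql_literal(lower), as_sql_literal(upper))
        )
```

Each printed line is a pair of bounds that could be pasted into the Step 3 query in place of the two date conditions.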

Step 4:

Disable the Standard Data Set Maintenance rule for the State data set ONLY, then run the query below to manually re-aggregate the State data.

-- Look up the State data set once, then run maintenance repeatedly.
DECLARE @DataSet uniqueidentifier
SET @DataSet = (SELECT DatasetId FROM StandardDataset WHERE SchemaName = 'State')

DECLARE @i int
SET @i = 1
WHILE (@i <= 500)
BEGIN
    EXEC StandardDatasetMaintenance @DataSet
    SET @i = @i + 1
    -- pause 5 seconds between runs
    WAITFOR DELAY '00:00:05'
END

Note: This query may need to be run multiple times, depending on the amount of data that needs to be aggregated.

Step 5:

Once this query reports a count of fewer than 5 rows, Step 4 can be stopped and the Standard Data Set Maintenance rule can be re-enabled.

SELECT COUNT(*) FROM StandardDatasetAggregationHistory
WHERE DatasetId = (SELECT DatasetId FROM StandardDataset WHERE SchemaName = 'State')
AND DirtyInd = 1

 

In my case there were 741 rows that needed to be re-aggregated. On average each row took between 5 and 10 minutes, which came to about 105 hours in total, although your mileage may vary depending on the power of your SQL server and how busy your environment is.
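For a rough idea of how long your own re-aggregation might take, the arithmetic above can be sketched as follows. The 5 and 10 minute figures are simply the per-row times I observed, so treat the result as a ballpark only; the function names are made up for this example.

```python
def estimate_hours(rows, minutes_per_row):
    """Total hours needed to re-aggregate `rows` at a given per-row time."""
    return rows * minutes_per_row / 60


def estimate_range_hours(rows, low_minutes=5, high_minutes=10):
    """Best-case and worst-case hours for the observed 5-10 minute range."""
    return (
        estimate_hours(rows, low_minutes),
        estimate_hours(rows, high_minutes),
    )


if __name__ == "__main__":
    low, high = estimate_range_hours(741)
    print(f"741 rows: between {low} and {high} hours")
```

For 741 rows this gives a range of 61.75 to 123.5 hours, so the 105 hours I saw works out to roughly 8.5 minutes per row. Plug in the count from Step 5 to estimate your own run.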

6 thoughts on “SCOM 2012 Availability report bug – Bars display as dark grey Up (Monitoring unavailable)”

  1. David

    We are getting a blue bar (UP Planned Maintenance) in our Availability Reports. I have applied UR2 to our SCOM 2012 SP1 environment, so I'm not sure why we are having this issue. I ran the script in Step 1 and I’ve got 2665 rows affected. If I were to perform this procedure, going on your estimate of 105 hours for 741 rows, ours could take up to 300-400 hours to run.
    Have you got any more information on the Microsoft fix for this problem?
    Thanks
    David

  2. Warren Kahn Post author

    Hi David,

    This issue relates to the grey bar, UP (Monitoring unavailable). The blue bar may be another issue. However, if you are seeing results when running the query in Step 1, then you definitely had this particular problem before applying UR2.

    The 105 hour estimate will vary depending on the strength of your SQL server(s).

    How many days of data are you looking at? You could adjust the query in Step 3 to use smaller increments and make the re-aggregation more manageable.

    I dealt with Microsoft on this issue a couple of months ago. They may have altered the solution since then, but you would have to speak to them to find out more.

    Regards,
    Warren

  3. John Curtiss

    Warren,
    any reason I wouldn’t want to run steps 2 and 3 automatically on a regular basis? I seem to be having grey objects in availability reports pretty regularly (long before, and still after installing sp1 ur3), and I have to run these steps manually. I don’t ever actually do step 4 or 5, I just let the standard SCOM aggregation rule take care of aggregating the data. I usually only have a few hundred rows to aggregate, and by the next day the availability reports are usually back to being correct.

    1. Warren Kahn Post author

      Hi John,

      The issue causing the report to render incorrectly is due to a duplicate entry in the HealthServiceOutage table which has an outage start time but not an outage end time. This is fixed in UR3 and after all the data is re-aggregated you should not be experiencing this issue again. You may want to contact Microsoft if you are still experiencing this issue.

      It shouldn’t be necessary to use steps 2 and 3 more than once if the cause has been fixed.
