Tag Archives: #Troubleshooting

The case of the missing override checkbox

One of my customers contacted me about a rather strange situation: whenever they tried to apply an override, they were unable to because the checkbox was missing.

As it turns out, the solution, while unusual, is quite simple: the screen resolution on their server was too high. In this case, for this particular version of Operations Manager, the resolution was higher than 1600×900.

Once we dropped the resolution to 1600×900, the checkboxes reappeared.

XPost SCOM 1801: Caveat when upgrading from 2012R2/2016

Just a quick cross-post from my colleague Robert Bird; the original posting is here.

Be aware that when upgrading to SCOM 1801, you need to install the SCOM console on the SSRS instance first in order to upgrade reporting successfully; otherwise the upgrade will fail.

SCOM 2016: The importance of backing up your SSRS encryption key

Definitely not something you want to see on your second day back at work:

The report server cannot decrypt the symmetric key used to access sensitive or encrypted data in a report server database. You must either restore a backup key or delete all encrypted content.

Luckily, my customer had a recent backup of the SSRS key, and it was a simple matter to restore the key and, with it, SCOM reporting functionality.

You may be asking, “How do I back up my SSRS encryption key?” No worries, it is a simple procedure, and I recommend making it part of a monthly set of tasks.

First, open Reporting Services Configuration Manager.
Then navigate to Encryption Keys on the left-hand side.
Then click Backup; you will be prompted for a file name and a password.

Restoring the key is similar: select Restore on the same page, locate the key file and provide the password.
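
If you would rather script the monthly backup than click through the GUI, Reporting Services also ships with the rskeymgmt command-line utility, where -e extracts (backs up) the key and -a applies (restores) it. Below is a minimal Python sketch that only builds the command line. The function name, the file path and the password in the usage note are hypothetical, and it assumes rskeymgmt is on the PATH of the report server.

```python
from typing import List


def rskeymgmt_args(action: str, key_file: str, password: str) -> List[str]:
    """Build an rskeymgmt argument list.

    action: "backup" maps to -e (extract), "restore" maps to -a (apply).
    key_file and password are whatever you would type into the GUI prompt.
    """
    flags = {"backup": "-e", "restore": "-a"}
    if action not in flags:
        raise ValueError(f"unknown action: {action!r}")
    return ["rskeymgmt", flags[action], "-f", key_file, "-p", password]
```

For example, rskeymgmt_args("backup", r"D:\Backup\ssrs.snk", "MyPassword") returns a list that could be handed to subprocess.run on the report server. The utility may still ask for confirmation, so test it interactively before scheduling it.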

SCOM 2016: Error while connecting to management server: The client has been disconnected from the server

When I tried to install a second management server in a SCOM 2016 management group, the wizard would return to the SQL details screen a few moments after I entered the OpsDB details.

A brief investigation revealed the following error in the installation log file (OpsMgrSetupWizard.txt, which is your best friend for troubleshooting SCOM installations):

Info:Error while connecting to management server: The client has been disconnected from the server. Please call ManagementGroup.Reconnect() to reestablish the connection.
Error: :Couldn’t connect to mgt server stack: : Threw Exception.Type: Microsoft.EnterpriseManagement.Common.ServerDisconnectedException, Exception Error Code: 0x80131500, Exception.Message: The client has been disconnected from the server. Please call ManagementGroup.Reconnect() to reestablish the connection.

I first confirmed that the Data Access service was in fact running on the original management server and that a connection could be made to the SDK.

It turned out that the two servers’ clocks were out of sync by more than 5 minutes, causing a Kerberos time skew. After I corrected the server time, the second management server installed with no issues.
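
As a quick sanity check before running the installer, you can compare the clocks of the two servers against the 5-minute tolerance. The sketch below is only an illustration: the function name is hypothetical, it assumes the default Kerberos tolerance of 5 minutes, and how you collect the two timestamps (for example by reading the time on each server) is left to you.

```python
from datetime import datetime, timedelta

# Assumed default Kerberos clock-skew tolerance.
MAX_SKEW = timedelta(minutes=5)


def exceeds_kerberos_skew(time_a: datetime, time_b: datetime,
                          max_skew: timedelta = MAX_SKEW) -> bool:
    """Return True when the two clocks differ by more than max_skew."""
    return abs(time_a - time_b) > max_skew
```

Both timestamps should be taken in the same time zone (UTC is the safest choice), otherwise the comparison is meaningless.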

SCOM and deprecating SHA1 certificates: what you need to know!

With the recent deprecation of SHA1 certificates in favour of the more secure SHA2, it’s important to know that SCOM uses SHA1 to manage workloads for cross-platform monitoring (Unix / Linux).

Have no fear: the MS SCOM team have released an article on how to replace your existing SHA-1 certificates with the newer SHA256 certificates.

It is important to note that you will need to update your 2012 R2 environments to UR12 and your 2016 environments to UR2 in order to use the new SHA256 certificates by default.

The article is available here

Warning! October Windows updates causing SCOM console crash on all Windows versions

There is an issue with the October cumulative updates (KB3194798, KB3192392, KB3185330 & KB3185331) for all Windows versions which is causing the SCOM console to crash.

Thank you to Dirk Brinkman for blogging about this issue.

The Product Group is aware of this issue and is working on a fix. Unfortunately I do not have an ETA for it. You will find an announcement on the SCOM Team blog (https://blogs.technet.microsoft.com/momteam/) once the fix for this issue is available.
All credits and thanks to my colleague Mihai Sarbulescu for finding this issue!

Update from Dirk Brinkman: the product group released a hotfix for this issue: https://support.microsoft.com/en-us/kb/3200006.

SCOM: SQL Dashboards workaround for slow performance

Microsoft has finally officially recommended a workaround that some of us have been using for some time to keep the SQL dashboards in a usable state.

Dashboards may work slowly if used rarely

Issue: When used rarely or after a long break, the dashboards may work rather slowly due to large amounts of the collected data to be processed; especially, it is related to large environments (2000+ objects).

Resolution: Below is a “warming up” script, which may be used to create an SQL job to run on some schedule. Before scheduling it as an SQL job, please test how long these queries will be executing (if you will schedule it to run too often or execution time is too long, that may kill the performance). If you have dashboards with thousands of objects to load, then time to load the content will be 10+ seconds anyway. It was tested with 600 000 objects, and the dashboard loading time was 1-2 minutes.

USE [OperationsManagerDW]

EXECUTE [sdk].[Microsoft_SQLServer_Visualization_Library_UpdateLastValues]

EXECUTE [sdk].[Microsoft_SQLServer_Visualization_Library_UpdateHierarchy]

It is also worth noting that the following versions of SQL Server Management Pack are considered as deprecated and suspended:

  • 1.314.35
  • 1.400.0
  • 3.173.0
  • 3.173.1
  • 4.0.0
  • 4.1.0
  • 5.1.0
  • 5.4.0
  • 6.0.0
  • 6.2.0
  • 6.3.0

XPost: Warning Base OS MP version 6.0.7303.0!!!

Kevin Holman updated his MP post with the following warning:

***WARNING***  There are some significant issues in this release of the Base OS MP, I do not recommend applying this one until an updated version comes out.

Issues:

  • Cluster Disks on Server 2008R2 clusters are no longer discovered as cluster disks.
  • Cluster Disks on Server 2008 clusters are not discovered as logical disks.
  • Quorum (or small size) disks on clusters that ARE discovered as Cluster disks, do not monitor for free space correctly.
  • Cluster shared volumes are discovered twice, once as a Cluster Shared Volume instance, and once as a Logical disk instance, with the latter likely caused by enabling mounted disk discovery.
  • On Hyper-V servers, I discover an extra disk, which has no properties:

So best to hold off on this one, folks. This of course comes back to some big questions about MP quality control, as we’ve had many issues with the recent SQL MP releases and now this.

SCOM: Monitoring a Fortigate Firewall

A while ago I had a request from one of my clients to monitor their new Fortigate firewalls. As there is no existing management pack for this, it required a bit of custom work.

First, on the firewall, you’ll need to configure SNMP as well as which trap notifications will be sent.

Then discover the Fortigate using the standard network monitoring discovery.

This is the address for the Fortigate MIB file contents which you will need in order to map OIDs for the next part.

In SCOM, create an SNMP Trap alerting rule targeting the Node class.

For now, leave the OID properties filter empty.

This catch-all rule will be used to identify any OIDs in the future that may be missing from your specific alerting rules.

Now, using the MIB list provided earlier, map each alert ticked in the Fortigate configuration to the relevant OID and create a specific alerting rule for it. For example, 1.3.6.1.4.1.12356.101.4.4.2.1.2 is the OID for High Processor Usage, so to generate an alert for high CPU on the Fortigate you will need a rule with this specific OID in the filter.

Repeat for each OID that you need to monitor, and use the catch-all rule to identify anything you may have missed.
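
To make the logic of the specific rules plus the catch-all concrete, here is a small Python sketch of the same idea outside of SCOM. It is not how SCOM evaluates rules, just an illustration: the table and function names are hypothetical, and the only OID listed is the one mentioned above; you would add the others from the Fortigate MIB.

```python
from typing import Dict

# OID -> alert name. Only the OID from the text is listed here;
# extend the table from the Fortigate MIB for each trap you enabled.
OID_ALERTS: Dict[str, str] = {
    "1.3.6.1.4.1.12356.101.4.4.2.1.2": "High Processor Usage",
}


def classify_trap(oid: str, known: Dict[str, str] = OID_ALERTS) -> str:
    """Return the alert name for a mapped OID, or a catch-all message."""
    return known.get(oid, f"Unmapped trap OID: {oid}")
```

Anything that comes back as "Unmapped trap OID" is the equivalent of the catch-all rule firing, and is a candidate for a new specific rule.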

XPost: Event 18054 errors in the SQL application log

Here is a great post by Kevin Holman addressing an issue you would come across if you have had to move your SCOM databases or recover them to a new SQL server.

Sample error:

Log Name:      Application
Source:        MSSQL$I01
Date:          10/23/2010 5:40:14 PM
Event ID:      18054
Task Category: Server
Level:         Error
Keywords:      Classic
User:          OPSMGR\msaa
Computer:      SQLDB1.opsmgr.net
Description:
Error 777980007, severity 16, state 1 was raised, but no message with that error number was found in sys.messages. If error is larger than 50000, make sure the user-defined message is added using sp_addmessage.

In essence, as Kevin explains, this happens because these messages are created in the master database during installation. They will then be missing after a database move or a recovery to a new SQL Server.
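
If you are triaging a batch of exported events, a short script can pull the error number out of each description and flag the ones above 50000, which is the range the event text itself points to for user-defined messages. This is a rough Python sketch: the function name is hypothetical and the pattern is based only on the sample description above.

```python
import re
from typing import Optional

# Matches the start of the 18054 description shown in the sample event.
_RAISED = re.compile(r"Error (\d+), severity (\d+), state (\d+) was raised")


def missing_user_message(description: str) -> Optional[int]:
    """Return the error number if it is above 50000, else None."""
    match = _RAISED.search(description)
    if match is None:
        return None
    number = int(match.group(1))
    return number if number > 50000 else None
```

The returned numbers are the ones to look for in sys.messages on the new SQL Server before re-adding anything.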

Kevin has also provided a script to re-add these messages for you here. CAUTION: it is for SCOM 2012 R2 only.
