Sometimes it becomes necessary to work out which gateways are paired to which management servers. Unfortunately, this is a configuration that often slips under the radar when documenting a management group.
Luckily, there is a simple way to figure this out without having to log on to each server and trawl through the registry.
Get-SCOMGatewayManagementServer | where {$_.Name -eq "< GATEWAY SERVER >"} | Get-SCOMParentManagementServer
Note: this command has changed slightly from earlier versions of SCOM.
One of my customers contacted me regarding a rather strange situation: whenever they tried to apply an override, they were unable to because the checkbox was missing.
As it turns out, the solution, while unusual, is quite simple: the screen resolution on their server was too high. In this case, for this particular version of Operations Manager, the resolution was higher than 1600×900.
Once we dropped the resolution to 1600×900, the checkboxes reappeared.
Just a quick cross-post from my colleague Robert Bird; original posting here.
Be aware that when upgrading to SCOM 1801, you need to install the SCOM console on the SSRS instance first in order to upgrade reporting successfully; otherwise the upgrade will fail.
Definitely not something you want to see on your second day back at work:
The report server cannot decrypt the symmetric key used to access sensitive or encrypted data in a report server database. You must either restore a backup key or delete all encrypted content.
Luckily, my customer had a recent backup of the SSRS key, and it was a simple matter to restore the key and with it SCOM reporting functionality.
You may be asking, “How do I back up my SSRS encryption key?” No worries, it is a simple procedure, and I recommend making it part of a monthly set of tasks.
First, open your SQL Reporting Services Configuration Manager.
Then navigate to Encryption Keys on the left-hand side.
Then click Backup; you will be prompted for a file name and a password.
Restoring the key is similar: select Restore from the same page, locate the key file, and provide the password.
When trying to install a second management server in a SCOM 2016 management group, after entering the OpsDB details, a few moments later the wizard would return to the SQL details screen.
A brief investigation revealed the following error in the installation log file (OpsMgrSetupWizard.txt; this is your best friend for troubleshooting SCOM installations):
Info:Error while connecting to management server: The client has been disconnected from the server. Please call ManagementGroup.Reconnect() to reestablish the connection. Error: :Couldn’t connect to mgt server stack: : Threw Exception.Type: Microsoft.EnterpriseManagement.Common.ServerDisconnectedException, Exception Error Code: 0x80131500, Exception.Message: The client has been disconnected from the server. Please call ManagementGroup.Reconnect() to reestablish the connection.
I first confirmed that the Data Access service was in fact running on the original management server and that a connection could be made to the SDK.
It turns out that the two servers' clocks were out of sync by more than 5 minutes, causing a Kerberos time skew. After correcting the server time, the second management server installed with no issues.
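The check behind this fix is simple arithmetic: compare the two servers' clocks and see whether the difference exceeds the 5-minute Kerberos tolerance. A minimal sketch, assuming you have already collected a timestamp from each server at the same moment; the function names and the sample readings are hypothetical and only illustrate the comparison:

```python
from datetime import datetime, timedelta, timezone

# Kerberos tolerates a clock difference of up to 5 minutes by default.
MAX_SKEW = timedelta(minutes=5)


def clock_skew(time_a, time_b):
    """Return the absolute difference between two timezone-aware timestamps."""
    return abs(time_a - time_b)


def exceeds_kerberos_skew(time_a, time_b, max_skew=MAX_SKEW):
    """Return True when the two clocks differ by more than the allowed skew."""
    return clock_skew(time_a, time_b) > max_skew


# Hypothetical readings taken from the two management servers at the same moment.
first_ms = datetime(2017, 1, 10, 9, 0, 0, tzinfo=timezone.utc)
second_ms = datetime(2017, 1, 10, 9, 6, 30, tzinfo=timezone.utc)

print("Skew:", clock_skew(first_ms, second_ms))
print("Too large for Kerberos:", exceeds_kerberos_skew(first_ms, second_ms))
```

Using timezone-aware timestamps matters here, because two servers reporting local time in different time zones would otherwise look hours apart even when their clocks agree.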
With the recent deprecation of SHA1 certificates in favour of the more secure SHA2, it’s important to know that SCOM uses SHA1 to manage workloads for cross-platform monitoring (Unix / Linux).
Have no fear, the MS SCOM team have released an article on how to replace your existing SHA-1 certificates with the newer SHA256 certificates.
It is important to note that you will need to update your 2012 R2 environments to UR12 and your 2016 environments to UR2 in order to use the new SHA256 certificates by default.
There is an issue with the October cumulative updates (KB3194798, KB3192392, KB3185330 & KB3185331) for all Windows versions which is causing the SCOM console to crash.
The Product Group is aware of this issue and is working on a fix. Unfortunately, I do not have an ETA for it. You will find an announcement on the SCOM Team blog (https://blogs.technet.microsoft.com/momteam/) once the fix for this issue is available. All credit and thanks to my colleague Mihai Sarbulescu for finding this issue!
Tim Culham has been promising version 2 of his health check script for a while now, and let me tell you, it was worth the wait. It offers a great overview of the health of a SCOM management group on a single page. Get it here.
Great stuff Tim.
Features:
A Data Volume graph where you can instantly see the amount of Alerts, Events, Performance Data and State Changes over the last 7 days
The Health State of your SCOM Agents
A Graph of your Alert Statistics – how many Open or Closed Alerts
Your Management Server Health, Versions, Server Uptime & the number of Workflows they are running. Now updated to include Gateways! Also CPU, Disk and Memory Graphs for each!
Any Open Warning or Critical Alerts for your Gateways and Management Servers.
The Top 5 Alerts by Repeat Count (use this to identify recurring problems)
The Top 5 Events by Computer (see which computers are the noisiest)
Your Operational Database & Data Warehouse Servers…how much space they are using, free space, file sizes and locations.
Your Operational Database and Data Warehouse Backups. Make sure they are running.
Are the Databases being groomed? You’ll be able to tell instantly!
Find out what is using all of the space in your Operational Database & Data Warehouse Databases.
The Reporting Server and Web Console Server URLs and if they are OK.
The Status of your Scheduled Reports
If there are any Overrides in the ‘Default Management Pack’
What Discoveries ran in the last 24 Hours and what Properties were changed?
Identify agents that are lower than the highest version installed. Use this to see which agents should be upgraded to the most current version
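The last check in the list, finding agents that lag behind the highest installed version, comes down to a numeric version comparison. The sketch below is not taken from Tim's script; it is a minimal illustration, assuming you have already exported agent names and version strings, and the agent names and version values shown are hypothetical sample data:

```python
def parse_version(version):
    """Turn a dotted version string such as '7.1.10184.0' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def agents_below_highest(agent_versions):
    """Return {agent: version} for every agent older than the highest version seen."""
    if not agent_versions:
        return {}
    highest = max(parse_version(v) for v in agent_versions.values())
    return {
        name: version
        for name, version in agent_versions.items()
        if parse_version(version) < highest
    }


# Hypothetical sample data: agent name -> reported agent version.
agents = {
    "web01.contoso.local": "8.0.10918.0",
    "sql01.contoso.local": "7.1.10184.0",
    "app01.contoso.local": "7.1.9999.0",
}

for name, version in sorted(agents_below_highest(agents).items()):
    print(f"{name} is running {version} and should be upgraded")
```

Comparing tuples of integers rather than raw strings avoids a common trap: as plain text, "7.1.9999.0" sorts after "7.1.10184.0", even though it is the older build.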
Microsoft has finally officially recommended a workaround that some of us have been using for some time to keep the SQL dashboards in a usable state.
Dashboards may work slowly if used rarely
Issue: When used rarely or after a long break, the dashboards may work rather slowly due to large amounts of the collected data to be processed; especially, it is related to large environments (2000+ objects).
Resolution: Below is a “warming up” script, which may be used to create an SQL job to run on some schedule. Before scheduling it as an SQL job, please test how long these queries will be executing (if you will schedule it to run too often or execution time is too long, that may kill the performance). If you have dashboards with thousands of objects to load, then time to load the content will be 10+ seconds anyway. It was tested with 600 000 objects, and the dashboard loading time was 1-2 minutes.
Kevin Holman updated his MP post with the following warning:
***WARNING*** There are some significant issues in this release of the Base OS MP, I do not recommend applying this one until an updated version comes out.
Issues:
Cluster Disks on Server 2008R2 clusters are no longer discovered as cluster disks.
Cluster Disks on Server 2008 clusters are not discovered as logical disks.
Quorum (or small size) disks on clusters that ARE discovered as Cluster disks, do not monitor for free space correctly.
Cluster shared volumes are discovered twice, once as a Cluster Shared Volume instance, and once as a Logical disk instance, with the latter likely caused by enabling mounted disk discovery.
On Hyper-V servers, I discover an extra disk, which has no properties:
So best to hold off on this one, folks. This of course comes back to some big questions about MP quality control, as we’ve had many issues with the recent SQL MP releases and now this.