Category Archives: SCOM 2007

SCOM: Agent error Keyset does not exist

An issue to be aware of when you package the SCOM agent with your server build image: when the server is built, a certificate is generated for the agent to use, and this certificate resides in the Operations Manager certificate store. If the server is then renamed because it had a temporary build name, you will see the error below in the Operations Manager event log.

Event: 7022
Source: HealthService

The Health Service has downloaded secure configuration for management group <MG Name>, and processing the configuration failed with error code Keyset does not exist(0x80090016).

Re-installing the agent will fix this issue, but there is a simpler solution from Gerrie Louw: open the Certificates MMC, navigate to the Operations Manager store, delete the certificate, and then restart the HealthService.

These symptoms can occur with all versions of the SCOM / MMA agent when the agent is packaged with a server image.


SCOM: Management pack updates

Microsoft recently released updates to several management packs, including the following:

Windows Base OS MP – updated to version 6.0.7230.0 and is available for download here.

Fixed issues:

  • Bug fixed: Microsoft.Windows.Server.LogicalDiskDiscovery.Module.Type.vbs script does not discover logical disks with large disk size
  • Update to support two configurable threshold values (waiters and timeouts) for triggering alert ‘MAX concurrent API Reached’

Jimmy Harper has also written a nice summary of the “Max Concurrent API Monitor” fix included in the new version of the Windows Server Operating System Management Pack.

One change that may require some attention is the update to the “Max Concurrent API Monitor”.

In the previous version of the MP (6.0.7061.0), the Max Concurrent API monitor ran a script that looked at three Netlogon semaphore counters (Waiters, Holders, and Timeouts). If any of them was greater than 0 and less than 4 GB, an alert was generated. Some customers were seeing false alerts from the monitor because some of the counters (especially Semaphore Holders) were greater than zero during non-problematic conditions.

In the new version of the Management Pack, the Max Concurrent API Monitor has the following changes:

  • The Semaphore Holders counter has been removed from the criteria that generates the alert
  • The Alert will only be generated if Semaphore Waiters or Timeouts exceed a defined threshold (instead of 0). These thresholds can be configured as needed via Overrides. The default thresholds are:
      • Semaphore Waiters = 50
      • Semaphore Timeouts = 2000
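
To make the difference between the two versions concrete, here is a minimal Python sketch of the alert criteria as described above. It is not the management pack's actual script; the function names are hypothetical, "4 GB" is assumed to mean 4 * 1024^3, and "exceed" is assumed to mean strictly greater than the threshold.

```python
# Illustrative only: mirrors the alert criteria described in the text,
# not the real script shipped in the management pack.
FOUR_GB = 4 * 1024 ** 3  # assumed meaning of "4 GB" in the old check


def old_monitor_alerts(waiters: int, holders: int, timeouts: int) -> bool:
    """6.0.7061.0 behaviour: alert if any counter is > 0 and < 4 GB."""
    return any(0 < value < FOUR_GB for value in (waiters, holders, timeouts))


def new_monitor_alerts(waiters: int, timeouts: int,
                       waiters_threshold: int = 50,
                       timeouts_threshold: int = 2000) -> bool:
    """New behaviour: Holders is ignored; alert only above a threshold.

    The defaults match the default thresholds listed above and stand in
    for values that would normally be changed via overrides.
    """
    return waiters > waiters_threshold or timeouts > timeouts_threshold


if __name__ == "__main__":
    # A few semaphore holders and nothing else: a false alert in the old MP.
    print(old_monitor_alerts(waiters=0, holders=3, timeouts=0))
    print(new_monitor_alerts(waiters=0, timeouts=0))
```

With a non-zero Holders value and no waiters or timeouts, the old criteria raise an alert while the new criteria stay quiet, which is the false-alert case the update addresses.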

More information on the MaxConcurrentAPI setting can be found here.

 

Active Directory – updated to version 6.0.8293.0 and is available for download here.

Fixed issues:

  • Issue fixed: AD-Trust Monitor does not come back to healthy state
  • Issue fixed: AD_Database_and_Log.vbs does not support using ‘.’ as decimal sign for non-English account.

Cluster – updated to version 6.0.7230.0 and is available for download here.

Fixed issues:

  • Cluster 2008 MP does not collect certain performance metrics.

DNS  – updated to version 7.1.10259.0 and is available for download here.

Fixed issues:

  • DNSMetrics2012R2Probe script can cause high CPU in MonitoringHost.exe

DHCP – updated to version 6.0.7230.0 and is available for download here.

Fixed issues:

  • Various, see guide for full details


SCOM: New and updated SQL Management Packs 6.5.1.0

Microsoft has recently released updated management packs for SQL Server; the new version is 6.5.1.0. The list of new features and fixes is available at each of the download links.

These have been updated:

System Center Management Pack for SQL Server 2014: Download here 
System Center Management Pack for SQL Server (2005/2008/2012): Download here

These are new:

System Center Management Pack for SQL Server 2014 Reporting Services (Native Mode):  Download here
System Center Management Pack for SQL Server 2012 Reporting Services (Native Mode): Download here
System Center Management Pack for SQL Server 2008 Reporting Services (Native Mode): Download here

As always, test thoroughly before deploying into your production environments.


SCOM: Caution with management packs in large environments

Kevin Holman recently published a great article about the inherent pitfalls of importing a management pack into an environment without understanding the intended scope, scalability, and any known/common issues.

Specifically, he discusses the Dell Hardware Management Pack (Detailed Edition), which has a low scalability limit of 300 agents.

The lesson to learn here is: be careful when importing MPs. A badly written MP, or an MP designed for small environments, might wreak havoc in larger ones. Sometimes the recovery from this can be long and quite painful. An MP that tests out fine in your Dev SCOM environment might have issues that won’t be seen until it moves into production. You should always monitor for changes to a production SCOM deployment after a new MP is brought in, to ensure that you don’t see a negative impact. Check the management server event logs, MS CPU performance, database size, and disk/CPU performance to see if there is a big change from your established baselines.

 

Go here for the full article; it is definitely worth the read.


SCOM *nix Monitoring: The WinRM client cannot process the request because the server name cannot be resolved

All Linux monitored servers in a critical state is not an ideal way to start a Monday morning. Especially when none of the servers are actually experiencing an issue.

The issue at hand:
All of the Linux servers generated a heartbeat failure at the same time. Looking through the health explorer revealed the following error:

 The WinRM client cannot process the request because the server name cannot be resolved.

Testing WinRM with the following command yielded the same error, even though a DNS lookup resolved the server name successfully.

winrm enumerate http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/SCX_OperatingSystem?__cimnamespace=root/scx -username:username -password:password -remote:https://servername:1270/wsman -auth:basic -skipCACheck -encoding:utf-8 -format:#pretty

The Solution:

WinRM uses the Windows (WinHTTP) proxy settings to resolve host names, so I checked the proxy settings on the Management Server using the following command.

netsh winhttp show proxy

This showed that my proxy was set correctly, but the bypass list for excluded servers had been replaced with a single server. Using the command below, I was able to amend the bypass list to include all of the local domain servers.

netsh winhttp set proxy proxy-server="http=<proxy FQDN>" bypass-list="*<Domain Suffix>"

Once that was completed, the WinRM test returned the correct data and the servers started to turn green again.
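
As a rough illustration of why a bypass list containing a single server breaks every other host, here is a small Python sketch that checks host names against a semicolon-separated wildcard list. The function name and the example host names are hypothetical, and the matching only approximates how WinHTTP evaluates its bypass list (special entries such as <local> are not handled).

```python
from fnmatch import fnmatchcase


def is_bypassed(hostname: str, bypass_list: str) -> bool:
    """Return True if hostname matches any wildcard entry in bypass_list.

    bypass_list is a semicolon-separated string such as "*.corp.example.com".
    Matching is case-insensitive. This is an approximation for illustration,
    not WinHTTP's own matching logic.
    """
    host = hostname.lower()
    patterns = [p.strip().lower() for p in bypass_list.split(";") if p.strip()]
    return any(fnmatchcase(host, pattern) for pattern in patterns)


if __name__ == "__main__":
    linux_hosts = ["linux01.corp.example.com", "linux02.corp.example.com"]
    broken_list = "singleserver.corp.example.com"  # list replaced by one server
    fixed_list = "*.corp.example.com"              # whole domain suffix

    for host in linux_hosts:
        print(host, is_bypassed(host, broken_list), is_bypassed(host, fixed_list))
```

With the broken list none of the Linux hosts match, so requests to them would go via the proxy; with the domain-suffix wildcard they are all excluded again.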

 


SCOM: Alerting when a user is added or removed from a Distribution Group in AD

Although this is similar to alerting when a user is added to a security group, there are a few things that need to be changed.

Target: Windows Domain Controllers
Log: Security
Event ID: 5136
EventDescription contains: “Name of Distribution Group”

User added
EventDescription contains: %%14674

User removed
EventDescription contains: %%14675
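
The criteria above can be sketched as a small Python function, purely to show how the three conditions combine. The function name and the sample description text are hypothetical; in SCOM the equivalent logic lives in the rule's event expression, not in a script like this.

```python
from typing import Optional

ADDED_TOKEN = "%%14674"    # marks "user added" per the criteria above
REMOVED_TOKEN = "%%14675"  # marks "user removed" per the criteria above


def classify_membership_change(event_id: int, description: str,
                               group_name: str) -> Optional[str]:
    """Return "added", "removed", or None if the event does not qualify."""
    # Only Security event 5136 that mentions the distribution group counts.
    if event_id != 5136 or group_name.lower() not in description.lower():
        return None
    if ADDED_TOKEN in description:
        return "added"
    if REMOVED_TOKEN in description:
        return "removed"
    return None


if __name__ == "__main__":
    # Hypothetical, heavily shortened event description for illustration.
    sample = "Object: CN=All Staff,OU=Groups,DC=example,DC=com Operation: %%14674"
    print(classify_membership_change(5136, sample, "All Staff"))
```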


SCOM: New Veeam Management Pack™ v7 adds support for Hyper-V

Veeam has recently announced version 7 of their Veeam Management Pack for System Center. The most exciting new feature is the inclusion of support for Microsoft Hyper-V, which has been a long-awaited addition.

It is currently in public beta, which does not include support for VMware and requires Microsoft System Center 2012 or later and Microsoft Hyper-V 2012 or later.

Click here for more information and to sign up for the beta.
Press release available here.

“New in version 7:

  • Support for Microsoft Hyper-V — now you get the same deep visibility with monitoring, reporting and capacity planning that you’d expect from Veeam for both Hyper-V and vSphere infrastructures.
  • Unique dashboards for vSphere and Hyper-V — see what’s happening in your environment in real time. Get in-context views with heatmaps for visualizing multiple metrics.
  • Visibility of Hyper-V backups in System Center — monitoring, reporting and capacity planning for Veeam Backup & Replication for Hyper-V.”


SCOM: Some or all Cluster resources are not discovered by the OpsMgr agent

Here is an article from the OpsMgr Engineering Blog detailing an issue with SCOM discovering Cluster resources.

“If a Cluster has orphaned object entries in ClusterHive registry key, the System Center Operations Manager agent may not discover some or all Cluster resources. This can occur with System Center Operations Manager 2007 (OpsMgr 2007) or System Center 2012 Operations Manager (OpsMgr 2012).”

 

The article details a fix which involves locating and then removing the orphaned objects:

“First, use the Failover Cluster PowerShell commands Get-ClusterResource and Get-ClusterGroup to get the list of Resources and Groups. Then, using the output, check for Resources/Groups that appear as Offline and verify if these can be seen in the Failover Cluster Console. Verify with the Cluster administrator whether these are still valid, then assuming they are not and you’ve identified which ones that are orphaned, delete them using these commands from an elevated CMD Prompt (Run As Administrator):

For orphaned resources: Cluster RES "<RESOURCE_NAME>" /DELETE

For orphaned groups: Cluster GROUP "<GROUP_NAME>" /DELETE

Once this is done the missing cluster resources should now be discovered.”
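
The verification step in the quoted procedure is the important part, so here is a minimal Python sketch that only emits delete commands for offline objects the cluster administrator has explicitly confirmed as orphaned. The function name and the example names are hypothetical, and the sketch only builds the command strings quoted above; it does not run them or query the cluster.

```python
from typing import Iterable, List, Set


def build_delete_commands(offline_resources: Iterable[str],
                          offline_groups: Iterable[str],
                          confirmed_orphans: Set[str]) -> List[str]:
    """Build Cluster delete commands for confirmed orphans only.

    offline_resources / offline_groups: names seen as Offline in the
    Get-ClusterResource / Get-ClusterGroup output (collected by hand here).
    confirmed_orphans: names the cluster administrator confirmed as invalid.
    """
    commands = []
    for name in offline_resources:
        if name in confirmed_orphans:
            commands.append(f'Cluster RES "{name}" /DELETE')
    for name in offline_groups:
        if name in confirmed_orphans:
            commands.append(f'Cluster GROUP "{name}" /DELETE')
    return commands


if __name__ == "__main__":
    for command in build_delete_commands(
            ["Old Disk", "SQL IP Address"], ["Old Group"],
            confirmed_orphans={"Old Disk", "Old Group"}):
        print(command)
```

Anything that is offline but not confirmed as orphaned (for example a resource that is merely stopped for maintenance) is left alone.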


SCOM: New SQL 2014 MP

Microsoft has released version 6.5.0.0 of the SQL MP, which enables the discovery and monitoring of SQL Server 2014 Database Engines, Databases, SQL Server Agents and other related components. This Management Pack is designed to be run by Operations Manager 2007 R2 (except dashboards), Operations Manager 2012 or Operations Manager 2012 R2.

For more information, including a list of new features and improvements, go here.


SCOM: Monitoring Site Availability

One of my customers still running SCOM 2007 R2 asked if it would be possible to monitor the availability of their 3 sites so that they could track any outages during the month. The below will also work with SCOM 2012.
Each site has a primary and backup connection and the site is only unavailable if both links fail.

First, with the network team’s assistance, I identified and discovered the Routers and Interfaces responsible for the line connections into each of the sites. Then I created a distributed application for each specific site:

Site mon 1

Scope your container to Device

Site mon 2

Search for the site routers discovered earlier and add them to the container.

Site mon 3

Save and open Configure Health Rollup

Site mon 4

Now, because the site is only unavailable in the event of both links failing, set the Health Rollup to “Best Health State of any Member”. This will ensure that the DA stays green if even 1 Router is still contactable.
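
To show why the “Best Health State of any Member” rollup gives the right answer for a site with redundant links, here is a small Python sketch of the idea. It is not how SCOM calculates health or availability internally; the function names and sample data are hypothetical, and it assumes that only a critical rollup state counts as downtime.

```python
from typing import List, Sequence

# Lower number = better health.
SEVERITY = {"healthy": 0, "warning": 1, "critical": 2}


def best_state_rollup(member_states: Sequence[str]) -> str:
    """Return the best (least severe) state among the members."""
    return min(member_states, key=lambda state: SEVERITY[state])


def availability_percent(samples: List[Sequence[str]]) -> float:
    """Percentage of sample intervals in which the site was available.

    Each sample holds the states of the site's routers for one interval.
    Assumption: the site is down only when the rollup state is critical.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    up = sum(1 for states in samples if best_state_rollup(states) != "critical")
    return 100.0 * up / len(samples)


if __name__ == "__main__":
    # (primary router, backup router) per interval
    month = [
        ("healthy", "healthy"),
        ("critical", "healthy"),   # primary link down, site still reachable
        ("critical", "critical"),  # both links down, site unavailable
        ("healthy", "critical"),
    ]
    print(best_state_rollup(("critical", "healthy")))
    print(availability_percent(month))
```

A worst-state rollup would count three of the four sample intervals as outages, whereas the best-state rollup counts only the interval where both routers were down.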

Site mon 5

Below is what you will see in your Distributed Application view.

Site mon 6

Here is a sample of an availability report using the DA for a particular site.

Site mon 7

Happy SCOMing.

If you have another way of achieving this result, I’d be interested to hear about it. Drop me a mail or leave a comment.
