When an agent manages a large number of objects, you may find that not all of them are discovered or, if they are, that some of them remain in a Not Monitored state. This can be caused by a couple of things.
If you find this error in your OpsMgr event log: “The health service has removed some items from the send queue for management group since it exceeded the maximum allowed size of 15 megabytes”
then the registry values below need to be adjusted:
- Set HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\HealthService\Parameters\Persistence Version Store Maximum to 5120 (80 MB, as the value is counted in 16 KB pages). Default = 60 MB
- Set HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\HealthService\Parameters\Management Groups\<MG Name>\maximumQueueSizeKb to 102400 (100 MB, as the value is in KB). Default = 15360 (15 MB)
- Set HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\Modules\Global\PowerShell\ScriptLimit\QueueMinutes to 120 minutes
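The first two settings are quoted in megabytes but stored in other units, so it is easy to enter the wrong number. The following is a minimal sketch of the conversion that also prints matching reg.exe commands. It assumes that Persistence Version Store Maximum is counted in 16 KB pages (which is consistent with 5120 = 80 MB) and that both values are REG_DWORD. The helper names and the "MyMG" management group are hypothetical placeholders. QueueMinutes is left out because it is already expressed in minutes.

```python
# Sketch only: the page size and the REG_DWORD type are assumptions.
PAGE_KB = 16  # assumed unit of "Persistence Version Store Maximum"

PARAMETERS_KEY = r"HKLM\SYSTEM\CurrentControlSet\services\HealthService\Parameters"


def mb_to_pages(megabytes, page_kb=PAGE_KB):
    """Convert megabytes to a count of page_kb-sized pages."""
    return megabytes * 1024 // page_kb


def mb_to_kb(megabytes):
    """Convert megabytes to kilobytes."""
    return megabytes * 1024


def reg_add(key, value_name, data):
    """Build a reg.exe command line that writes a DWORD value."""
    return f'reg add "{key}" /v "{value_name}" /t REG_DWORD /d {data} /f'


def queue_commands(management_group):
    """Return reg.exe commands for the two size-based settings."""
    mg_key = PARAMETERS_KEY + "\\Management Groups\\" + management_group
    return [
        reg_add(PARAMETERS_KEY, "Persistence Version Store Maximum", mb_to_pages(80)),
        reg_add(mg_key, "maximumQueueSizeKb", mb_to_kb(100)),
    ]


if __name__ == "__main__":
    # "MyMG" is a placeholder; substitute your own management group name.
    for command in queue_commands("MyMG"):
        print(command)
```

Review the printed commands before running them in an elevated prompt on the agent, and export the keys first so that the change can be rolled back.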
However, if you find this error: “In memory container (hash table System.Health.EntityStateChangeData) had to drop data because it reached max limit. Possible data loss.”
then the following registry value needs to be adjusted:
- Set HKLM\System\CurrentControlSet\Services\HealthService\Parameters\State Queue Items. The default value is 1024; depending on the server load, double it to 2048 or, if the error continues to occur, to 4096
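The escalation described above (1024, then 2048, then 4096) can be sketched as a small helper that doubles the current value and stops at the highest figure mentioned here. The function names are hypothetical, and the REG_DWORD type is an assumption rather than something confirmed above.

```python
# Sketch only: doubles "State Queue Items" up to the ceiling suggested above.
DEFAULT_ITEMS = 1024  # default quoted in the text
MAX_ITEMS = 4096      # highest value suggested in the text

PARAMETERS_KEY = r"HKLM\System\CurrentControlSet\Services\HealthService\Parameters"


def next_state_queue_items(current, ceiling=MAX_ITEMS):
    """Return double the current value, never exceeding the ceiling."""
    return min(current * 2, ceiling)


def state_queue_command(items):
    """Build a reg.exe command line for the new value (DWORD type assumed)."""
    return (
        f'reg add "{PARAMETERS_KEY}" /v "State Queue Items" '
        f"/t REG_DWORD /d {items} /f"
    )


if __name__ == "__main__":
    # First step up from the default; call again if the error persists.
    first_step = next_state_queue_items(DEFAULT_ITEMS)
    print(state_queue_command(first_step))
```

Raising the value one step at a time keeps the queue no larger than the server load actually requires.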
I have come across instances where both of these errors occur. After the adjustments were made and the health service restarted, all objects were discovered and monitored correctly.