Windows checks failing, but testing = ok
Hello
Install version, Professional Edition
It has been working for about 1-2 months. Today I had to add some more devices and monitors, taking my count from 74 to 96 monitors.
All of a sudden all my Windows checks fail across the board. Some have come back up, but most Windows checks are now red. If you "test" the Windows health monitor it returns status OK, but it still shows as down on the hosts.
I have restarted the machine 2 times and increased the memory by 1 GB (now 4 GB in total). The service "Monitoring_thread2.exe" grabs up to 1.9 GB of memory every time it runs, then the usage just drops off.
Some of the error messages presented on the alert page:
"Not enough storage is available to complete this operation" - I have 31 GB of hard drive space on the machine and should have a spare 1 GB of memory now.
"The RPC server is unavailable" - yet the test returns OK, and ping and bandwidth monitoring are fine.
"The authentication service is unknown" - same as above, the test returns OK.
Another message, not related to the Windows checks, comes from a bandwidth checker that had been working fine: "The method "_addr_loopback" is not supported by this Transport Domain"
This machine has 4 cores and 4 GB of memory, runs Windows Server 2008 R2 Standard 64-bit, and only runs this software (except at 2am, when it does something else for an hour).
As I have been writing this out, all the Windows checks are failing again. I run a second server monitoring tool that is reporting everything is fine. Could I get some assistance as to what might be causing this?
This discussion has been closed.
Comments
Could you confirm whether this is a resource issue on the machine? It seems to happen with anything over 85 monitors, and I cannot find a minimum spec on the website.
What version are you running? Can you provide a debug as well?
To give you some more details: I pushed the monitors up to 95 and 5 of the windowshealth checks started to fail again. I deleted one host (with a windowshealth check) and now I am back to all green. It seems I have hit a limit at 93.
I will try to get some logs from a debug.
With respect to the errors:
"Not enough storage is available to complete this operation": this is a Windows kernel error and is not related to the actual storage available on the hard drive.
"The RPC server is unavailable": this is a Windows security issue. The following article might help: http://wiki.serverscheck.com/index.php/Windows_Errors
The same applies to "The authentication service is unknown".
If TEST SETTINGS works fine, then the failures are most probably the result of incorrect settings of the ServersCheck Monitoring service. See the service account settings instructions: http://wiki.serverscheck.com/index.php/Configuring_Service_for_Windows_Checks
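To help separate a network problem from a service account problem for the "RPC server is unavailable" error, a rough sketch like the one below could be run from the monitoring machine. It only checks whether a TCP connection to the RPC endpoint mapper (port 135) on each monitored host succeeds. This is not part of ServersCheck; the function names are made up for illustration, and the host names you pass in are your own. A reachable port does not prove the check will pass, since the credentials of the monitoring service still matter.

```python
import socket

# TCP port used by the Windows RPC endpoint mapper.
RPC_ENDPOINT_MAPPER_PORT = 135


def tcp_port_open(host, port=RPC_ENDPOINT_MAPPER_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def report_rpc_reachability(hosts, port=RPC_ENDPOINT_MAPPER_PORT, timeout=3.0):
    """Map each host name (hypothetical input) to True/False reachability."""
    return {host: tcp_port_open(host, port, timeout) for host in hosts}
```

Calling `report_rpc_reachability([...])` with the names of the hosts that show red gives a quick yes/no per host. If every host is reachable while the checks still fail, that would point more towards the service account settings than towards a firewall.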
"The method "_addr_loopback" is not supported by this Transport Domain" => this indicates a problem with communicating with your remote system over the SNMP protocol with the remote host not supporting it or an issue with a firewall in between blocking the UDP communications