SLA Definition
I'm new to this product and was wondering if there is something that explains how the SLA is calculated. How much time is figured into the SLA, and is it configurable? Thanks.
Brad
This discussion has been closed.
Comments
The data for those checks can be reset at any given time you would like. All the info is stored in the .dat files in your data subdirectory and by simply deleting them (or those that are relevant) you reset the counters.
Is it possible to know the SLA between a couple of dates?
It is very important for me to know the SLA for a given period, for example May 1-10 or September 3 to December 19. Is that possible?
When I make SLA trend analysis reports I get different results
than when I make them from the start panel.
Look:
first, the start panel
and this from the SLA trend analysis reports
and the value in the start panel
Thanks, and sorry for my English.
What is the meaning of the blank spaces in the last graph?
In terms of gaps: ServersCheck expects a value at least once every 5 minutes; if it does not get one, a gap is shown (cf. the RRD specs).
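As a rough illustration of the 5-minute rule described above, this Python sketch flags the places where two consecutive check timestamps are too far apart. The function name `find_gaps` and the sample timestamps are made up for illustration; only the 5-minute limit comes from this thread, and this is not how ServersCheck or RRDtool implement it internally.

```python
from datetime import datetime, timedelta

# Maximum allowed spacing between two stored values (5 minutes, per the reply above).
MAX_SPACING = timedelta(minutes=5)

def find_gaps(timestamps, max_spacing=MAX_SPACING):
    """Return (start, end) pairs where consecutive samples are further apart than max_spacing."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_spacing]

# Hypothetical sample data: the jump from 10:03 to 10:12 exceeds 5 minutes.
samples = [
    datetime(2006, 5, 12, 10, 0),
    datetime(2006, 5, 12, 10, 3),
    datetime(2006, 5, 12, 10, 12),
    datetime(2006, 5, 12, 10, 15),
]
for start, end in find_gaps(samples):
    print(f"gap from {start} to {end}")
```

Running the same check over the timestamps of a rule's own log would show whether the checks are really arriving often enough for the graph to be drawn.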
How can I get the % SLA between a couple of dates?
And in terms of gaps, I waited to take the next graph and the result is:
The computer has been up and checking every 30 seconds for about 4 hours.
We need a how-to for making the rrdtool command generate the graphs between 2 dates, with the % SLA, for one of our devices,
like this one, but for the dates that we need.
Please and thanks!
4-8-15-16-23-42
E.g. when it has a DOWN? status for the last 5 minutes and is retrying before going DOWN, then no data is stored and this results in the gaps.
I don't know what edition you are using but beware that the Professional edition performs one check after the other and that this could mean that the minimum interval is not respected.
The monitoring rule has 0 retries; if the server doesn't respond, it is DOWN the first time, without any retrying.
But the graphs have long gaps that make no sense to us.
Anyhow, in this graph the LAST KNOWN SLA: 57,89% (checks OK - checks not OK) is correct. Can you please explain how to obtain the "KNOWN SLA: xxx%" between two dates? With the graph manager I can obtain the drawing but not this %.
Thanks.
And the gaps persist after the server I'm monitoring is UP.
...and when I try to make the CSV file...
the value and time fields are not separated by ";" or any other symbol.
As to why the SLA graph is empty, it is not possible to tell based upon the info I have.
Is your system remotely accessible?
1/ Do you have empty SLA graphs for all rules?
2/ How many rules do you have defined?
3/ What is the interval for all rules?
4/ Where the SLA graphs are empty, is the value graph empty too?
..............................................................................
1/ Only for the rules that have 1 or more DOWN
2/ 29 rules defined
3/ 30 seconds for 24 rules, with a 15-second interval when down; 120 seconds for 5 rules, with a 15-second interval when down
4/ The value graph does not seem to be empty. Look:
Without being able to replicate or access your system, it is not possible for us to debug.
The csv values are correct. Do a view source and you will see 2 fields being returned:
time;value
with each starting at a new line. A browser does not render it correctly; however save it as a csv file and then open it with Excel and you will see the result.
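For anyone who would rather read the export with a script than with Excel, here is a minimal Python sketch that parses the `time;value` lines. The timestamp layout is taken from the values posted in this thread; the function name `parse_export` is made up, and the sketch assumes English day and month names, so it may need adjusting on a system with another locale.

```python
import csv
import io
from datetime import datetime

# Timestamp layout as it appears in the export, e.g. "Fri May 12 10:27:00 2006".
TIME_FORMAT = "%a %b %d %H:%M:%S %Y"

def parse_export(text):
    """Parse 'time;value' lines into a list of (datetime, float) tuples."""
    rows = []
    for record in csv.reader(io.StringIO(text), delimiter=";"):
        if len(record) != 2:
            continue  # skip blank or malformed lines
        try:
            when = datetime.strptime(record[0].strip(), TIME_FORMAT)
            value = float(record[1])
        except ValueError:
            continue  # skip a header line such as "time;value"
        rows.append((when, value))
    return rows

sample = """time;value
Fri May 12 10:27:00 2006;0
Fri May 12 10:30:00 2006;0
"""
for when, value in parse_export(sample):
    print(when.isoformat(), value)
```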
I reinstalled ServersCheck...
I created a new rule...
The second time I shut down the server, nothing more was drawn in the graph.
And about the CSV, you are right, now I can get the values, but...
we don't understand why the value is always "0" and why the time interval repeats.
Thanks
From what it seems, the check is performed every 15 minutes. In order to draw the SLA, it needs to be performed at least once every 5 minutes.
Can you please reply with the content of the log file (in the checklogs subfolder) for the aci0019 rule?
and the log
Why is this check all the time down?
Is there no way to get remote access to your system? It would make it much easier in assisting you.
Another option is that you temporarily stop all other rules and then kill s-graphs.exe.
Then let it run for, let's say, 20 minutes.
It will have a whole set of .graphs files in the graphs/queue folder.
Open each of them (oldest first) and execute the command in there from the command prompt in the serverscheck_databases directory, with rrdtool.exe in front.
See if it throws any errors.
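If there are many queued files, the manual steps above could be scripted along these lines. This is only a sketch under assumptions: it supposes that each .graphs file holds the arguments of a single rrdtool command as plain text, which is not confirmed in this thread, and the function names `build_commands` and `run_commands` are made up.

```python
import subprocess
from pathlib import Path

def build_commands(queue_dir, tool="rrdtool.exe"):
    """Return (file name, command line) pairs for every .graphs file, oldest first."""
    files = sorted(Path(queue_dir).glob("*.graphs"), key=lambda p: p.stat().st_mtime)
    commands = []
    for path in files:
        # Assumption: the file content is the argument string of one rrdtool command.
        args = path.read_text().strip()
        if args:
            commands.append((path.name, f"{tool} {args}"))
    return commands

def run_commands(queue_dir, database_dir):
    """Run each queued command from database_dir and print anything rrdtool reports."""
    for name, command in build_commands(queue_dir):
        result = subprocess.run(command, shell=True, cwd=database_dir,
                                capture_output=True, text=True)
        print(name, "exit code", result.returncode)
        if result.stderr:
            print(result.stderr.strip())
```

Calling `run_commands` with the full paths of the graphs/queue folder and the serverscheck_databases directory would then print the exit code and any error text for each file, which is the information worth posting back here.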
One thing I am thinking of (just a guess) is that you might not have the system with US decimals set...
We changed the regional settings and tried to stop the process, but we couldn't stop it.
We deleted all rules, restarted the server and created a new rule; then we stopped the monitoring machine and restarted it a few minutes ago.
See the log:
Fri May 12 10:41:17 2006 DOWN - RTT:.0. - Connection to host timed out
Fri May 12 10:44:17 2006 DOWN - RTT:.0. - Connection to host timed out
Fri May 12 10:47:18 2006 DOWN - RTT:.. - Connection to host timed out
Fri May 12 10:50:17 2006 DOWN - RTT:.. - Connection to host timed out
Fri May 12 10:53:19 2006 DOWN - RTT:.. - Connection to host timed out
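Since the question of an SLA figure between two dates keeps coming up, here is a rough Python sketch that computes one from check log lines like the ones above. It is not the product's own calculation: it simply assumes the SLA is the share of checks in the period whose status is not DOWN. The function name `sla_between` and the "OK" line in the sample are hypothetical; only the DOWN lines are taken from the log.

```python
from datetime import datetime

TIME_FORMAT = "%a %b %d %H:%M:%S %Y"
STAMP_LENGTH = 24  # length of a timestamp such as "Fri May 12 10:41:17 2006"

def sla_between(lines, start, end):
    """Percentage of checks in [start, end] whose status is not DOWN, or None if there are none."""
    total = ok = 0
    for line in lines:
        stamp, rest = line[:STAMP_LENGTH], line[STAMP_LENGTH:].split()
        if not rest:
            continue
        try:
            when = datetime.strptime(stamp, TIME_FORMAT)
        except ValueError:
            continue  # not a check line
        if start <= when <= end:
            total += 1
            if rest[0] != "DOWN":
                ok += 1
    return 100.0 * ok / total if total else None

log = [
    "Fri May 12 10:41:17 2006 DOWN - RTT:.0. - Connection to host timed out",
    "Fri May 12 10:44:17 2006 DOWN - RTT:.0. - Connection to host timed out",
    "Fri May 12 10:47:18 2006 OK - hypothetical successful check",  # assumed format
]
print(sla_between(log, datetime(2006, 5, 12, 10, 40), datetime(2006, 5, 12, 10, 50)))
```

The result may differ from the figure shown in the start panel or in the trend reports, because those are built from the stored RRD data rather than from the raw log.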
OK, I think this is working now, because the machine is powered on now and the log is OK.
Well, I deleted all the "ServersCheck_Monitoringoutputgraphs" GIFs, but when the system regenerated them the graph looked like this:
I think something is wrong in the database, because the graph is not OK.
Do you want me to send you the RRD files?
Fri May 12 10:27:00 2006;0
Fri May 12 10:30:00 2006;0
Fri May 12 10:33:00 2006;0
Fri May 12 10:36:00 2006;0
Fri May 12 10:39:00 2006;0
Fri May 12 10:42:00 2006;0
Fri May 12 10:45:00 2006;0
Fri May 12 10:48:00 2006;0
Fri May 12 10:51:00 2006;0
Fri May 12 10:54:00 2006;0
Fri May 12 10:57:00 2006;0
Fri May 12 11:00:00 2006;0
Fri May 12 11:03:00 2006;0
Fri May 12 11:06:00 2006;0
Fri May 12 11:09:00 2006;0
Fri May 12 11:12:00 2006;0
Fri May 12 11:15:00 2006;0
Fri May 12 11:18:00 2006;0
Fri May 12 11:21:00 2006;0
Fri May 12 11:24:00 2006;0
Fri May 12 11:27:00 2006;0
Fri May 12 11:30:00 2006;0
0 every time?