The Dude has an Outages tab, where we can see:
Status: Outage state
Time: When the outage started
Duration: How long the outage lasted
Service: The failed service
To interpret this information correctly, we need to be aware of the polling settings: Probe Interval, Probe Timeout, Probe Down Count
We can use the following variables inside notifications: [Service.TimeSinceChanged], [Service.TimeLastUp], [Service.TimeLastDown]
Service [Probe.Name] on [Device.Name] is now [Service.Status] at [TimeAndDate] - Service.TimeSinceChanged: [Service.TimeSinceChanged] - Service.TimeLastUp: [Service.TimeLastUp] - Service.TimeLastDown: [Service.TimeLastDown] - Service.TimeUp: [Service.TimeUp] - Service.TimeDown: [Service.TimeDown]
Polling:
Probe Interval = 10 seconds
Probe Timeout = 5 seconds
Probe Down Count = 3
We assume that polling starts at 00:00:00:
1. Poll 1 at 00:00:10 with a timeout of 5 seconds - Failed - Down count = 1, counting started at 00:00:10
2. Poll 2 at 00:00:20 with a timeout of 5 seconds - Failed - Down count = 2, increased at 00:00:20
3. Poll 3 at 00:00:30 with a timeout of 5 seconds - Failed - Down count = 3, threshold reached at 00:00:30
The notification is sent according to the "Notification: Delay" setting. If it is 00:00:00, you will receive the message that the service is down at 00:00:30.
But in our example the service was already down at Poll 1, at 00:00:10.
The variable [Service.TimeSinceChanged] will show us the service inaccessibility time minus one Probe Interval: 00:00:30 - 00:00:10 = 00:00:20
In this case, the Duration we see under the Outages tab shows only the service down time as determined by the polling settings.
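The timeline above can be reproduced with a small Python sketch. It is only an illustration of the arithmetic in this example, not how The Dude works internally: the function name poll_timeline is made up, and it assumes polls run exactly every Probe Interval and that every poll fails from the first one onward.

```python
from datetime import timedelta


def poll_timeline(probe_interval_s, probe_down_count):
    """Return (poll number, time since polling start) for each failed poll.

    Hypothetical helper: assumes a stable outage (no flapping) and
    polls spaced exactly probe_interval_s seconds apart.
    """
    return [
        (n, timedelta(seconds=n * probe_interval_s))
        for n in range(1, probe_down_count + 1)
    ]


# Settings from the example: Probe Interval = 10 s, Probe Down Count = 3
timeline = poll_timeline(probe_interval_s=10, probe_down_count=3)

first_failed = timeline[0][1]        # Poll 1: down count starts
threshold_reached = timeline[-1][1]  # Poll 3: service is reported down

# The value the example derives for [Service.TimeSinceChanged]
time_since_changed = threshold_reached - first_failed

for n, t in timeline:
    print(f"Poll {n} at {t} - Failed - Down count = {n}")
print("Time since changed:", time_since_changed)
```

With the example settings this prints polls at 0:00:10, 0:00:20 and 0:00:30, and a difference of 0:00:20, matching the numbers above.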
If we want to calculate the real service down time, we can use the following formulas:
Service down time = [Service.TimeSinceChanged] + Probe Interval + Outage Duration
or
Service down time = (Probe Interval * Probe Down Count) + Outage Duration
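As a quick check, the two formulas above can be compared in Python. This is a minimal sketch: the function names are made up for illustration, the 120 second Outage Duration is a hypothetical value, and the 20 second [Service.TimeSinceChanged] is the value from the example at the moment the service is reported down.

```python
def downtime_from_time_since_changed(time_since_changed_s, probe_interval_s,
                                     outage_duration_s):
    """First formula: [Service.TimeSinceChanged] + Probe Interval + Outage Duration."""
    return time_since_changed_s + probe_interval_s + outage_duration_s


def downtime_from_polling_settings(probe_interval_s, probe_down_count,
                                   outage_duration_s):
    """Second formula: (Probe Interval * Probe Down Count) + Outage Duration."""
    return probe_interval_s * probe_down_count + outage_duration_s


# Example settings: Probe Interval = 10 s, Probe Down Count = 3
outage_duration_s = 120  # hypothetical Duration read from the Outages tab

a = downtime_from_time_since_changed(20, 10, outage_duration_s)
b = downtime_from_polling_settings(10, 3, outage_duration_s)

print(a, b)  # 150 150
```

Both formulas give the same result here because, in the example, [Service.TimeSinceChanged] + Probe Interval = 20 + 10 = 30 seconds, which equals Probe Interval * Probe Down Count.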
1. The Probe Timeout value doesn't affect the Probe Interval; it is only used for service verification.
2. The Outages duration shows the service down time without considering Probe Interval and Probe Down Count.
3. Probe Interval and Probe Down Count only affect the point at which the down state is considered to start.
Note: In the presented example, we assume that the service is not flapping and stays down.
Information used:
1. The_Dude_v6/Services
2. The_Dude_v6/Notifications
3. Help me on Dude Probe Interval
Thank you!