eriitguy
Member Candidate
Topic Author
Posts: 197
Joined: Thu Jan 26, 2017 1:16 pm

Dude Outage time and Service down time

Tue Nov 14, 2017 10:50 am

Hello!

The Dude has an Outages tab, where we can see:
Dude-Outages-duration-and-Service-down-duration-01.png
Status: Outage state
Time: When the outage started
Duration: How long the outage lasted
Service: The failed service

To interpret this information correctly we need to be aware of the polling settings: Probe Interval, Probe Timeout, Probe Down Count.
We can use the following variables inside notifications: [Service.TimeSinceChanged], [Service.TimeLastUp], [Service.TimeLastDown]. A notification template could look like this:
Service [Probe.Name] on [Device.Name] is now [Service.Status] at [TimeAndDate] - Service.TimeSinceChanged: [Service.TimeSinceChanged] - Service.TimeLastUp: [Service.TimeLastUp] - Service.TimeLastDown:[Service.TimeLastDown] - Service.TimeUp: [Service.TimeUp] - Service.TimeDown: [Service.TimeDown]
For example, if we have:
Polling:
Probe Interval = 10 seconds
Probe Timeout = 5 seconds
Probe Down Count = 3

We assume that polling starts at 00:00:00:
1. Poll 1 at 00:00:10 with a 5-second timeout - Failed - Down count = 1 (counting started at 00:00:10)
2. Poll 2 at 00:00:20 with a 5-second timeout - Failed - Down count = 2 (increased at 00:00:20)
3. Poll 3 at 00:00:30 with a 5-second timeout - Failed - Down count = 3 (threshold reached at 00:00:30)

The notification is sent according to the "Notification: Delay" setting. If it is 00:00:00, you will receive the message that the service is down at 00:00:30.
But in our example the service went down at Poll 1, at 00:00:10.
The variable [Service.TimeSinceChanged] will show the service inaccessibility time minus one Probe Interval: 00:00:30 - 00:00:10 = 00:00:20
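
The timeline above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in this post, not of how The Dude works internally; the function name and its parameters are made up for the example, and it assumes every poll fails (no flapping) and that the first poll happens one Probe Interval after polling starts.

```python
from datetime import timedelta


def down_detection_timeline(probe_interval_s, probe_down_count):
    """Return (first_failed_poll, threshold_reached) in seconds from polling start.

    Hypothetical helper: assumes the first poll runs one Probe Interval after
    polling starts and that every poll fails.
    """
    first_failed = probe_interval_s
    # Each further failed poll adds one Probe Interval until the Down Count is reached.
    threshold = first_failed + (probe_down_count - 1) * probe_interval_s
    return first_failed, threshold


first_failed, threshold = down_detection_timeline(probe_interval_s=10, probe_down_count=3)
print(timedelta(seconds=first_failed))              # 0:00:10
print(timedelta(seconds=threshold))                 # 0:00:30
print(timedelta(seconds=threshold - first_failed))  # 0:00:20
```

The last value corresponds to what [Service.TimeSinceChanged] shows in the example at the moment the down notification is sent.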

In this case, the Duration we see under the Outages tab shows only the service down time as registered by The Dude, which depends on the polling settings.
If we want to calculate the real service down time, we can use the following formulas:
Service down time = [Service.TimeSinceChanged] + Probe Interval + Outage Duration
or
Service down time = (Probe Interval * Probe Down Count) + Outage Duration
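
As a quick check, both formulas can be written as small Python functions. The function names are invented for this example, all values are in seconds, and the Outage Duration of 120 seconds is a made-up figure. The [Service.TimeSinceChanged] value of 20 seconds is the one from the example above, taken at the moment the down notification is sent.

```python
def down_time_from_variable(time_since_changed_s, probe_interval_s, outage_duration_s):
    # Service down time = [Service.TimeSinceChanged] + Probe Interval + Outage Duration
    return time_since_changed_s + probe_interval_s + outage_duration_s


def down_time_from_settings(probe_interval_s, probe_down_count, outage_duration_s):
    # Service down time = (Probe Interval * Probe Down Count) + Outage Duration
    return probe_interval_s * probe_down_count + outage_duration_s


outage_duration_s = 120  # hypothetical Duration read from the Outages tab

print(down_time_from_variable(20, 10, outage_duration_s))  # 150
print(down_time_from_settings(10, 3, outage_duration_s))   # 150
```

With the example settings both formulas give the same result, because 20 + 10 equals 10 * 3.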
Conclusions
1. The Probe Timeout value doesn't affect the Probe Interval; it is only used for service verification.
2. The Outages duration shows the service down time without taking Probe Interval and Probe Down Count into account.
3. Probe Interval and Probe Down Count only determine the point at which the outage is considered to start.

Note: In the presented example, we assume that the service is not flapping and stays down.

Used information
1. The_Dude_v6/Services
2. The_Dude_v6/Notifications
3. Help me on Dude Probe Interval


Thank you!
 
simogere
Frequent Visitor
Posts: 56
Joined: Fri May 24, 2013 11:54 am

Re: Dude Outage time and Service down time

Mon Mar 04, 2024 2:53 am

Hi @eriitguy, I ran some tests and this does not seem to be correct for the ping probe:

For example, if we have:

Polling:
Probe Interval = 10 seconds
Probe Timeout = 5 seconds
Probe Down Count = 3

Ping probe:
Retry count = 3
Retry interval = 1s

You will receive a message that the service is down at 00:00:23:

Poll 1 = poll starts at 00:00:00, fails at 00:00:03 due to the ping probe settings
Poll 2 = poll starts at 00:00:10, fails at 00:00:13 due to the ping probe settings
Poll 3 = poll starts at 00:00:20, fails at 00:00:23 due to the ping probe settings -> Notification

With the default settings, you will receive a message that the service is down at 00:02:03.
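
The timing observed in these tests can be sketched in Python as follows. This only models the numbers reported here and is not taken from The Dude documentation; the function name is made up. It assumes the first poll starts at 00:00:00, each poll fails after Retry count * Retry interval, and the notification goes out when the last required poll fails. The "default" values in the second call (Probe Interval 30 s, Probe Down Count 5, same ping settings) are an assumption that happens to reproduce the 00:02:03 figure.

```python
from datetime import timedelta


def ping_down_notification_s(probe_interval_s, probe_down_count,
                             retry_count, retry_interval_s):
    """Seconds from the first poll until the down notification (hypothetical model)."""
    # A single poll is declared failed after all ping retries have been used up.
    poll_fails_after_s = retry_count * retry_interval_s
    # The last required poll starts (Down Count - 1) intervals after the first one.
    last_poll_start_s = (probe_down_count - 1) * probe_interval_s
    return last_poll_start_s + poll_fails_after_s


# Settings from the test above.
print(timedelta(seconds=ping_down_notification_s(10, 3, 3, 1)))  # 0:00:23

# Assumed default settings: Probe Interval 30 s, Probe Down Count 5.
print(timedelta(seconds=ping_down_notification_s(30, 5, 3, 1)))  # 0:02:03
```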
