[dev.icinga.com #3586] State based escalation and RECOVERY #1207
Comments
Updated by mfriedrich on 2013-01-29 19:03:31 +00:00
Updated by prism1 on 2013-01-30 17:04:57 +00:00
After spending a bit more time on it, the patch is clearly missing a few things. In its current state, the new patch only works if you use exactly one of the state-based notification pairs in each serviceescalation/hostescalation. Using two (last_warning_notification and last_critical_notification, for example) would break the logic. I guess more time needs to be spent making sure all of this works :)
Updated by mfriedrich on 2013-02-21 19:21:18 +00:00
Updated by mfriedrich on 2014-07-19 13:01:29 +00:00
Any status on this? If not, I'll close it.
Updated by prism1 on 2014-07-21 14:21:56 +00:00
We have a better patch than this one, but it does not fix the problem completely. Any idea whether this is fixed in Icinga 2? If not, I will find someone on my team to fix it in Icinga 2 instead of working on Icinga 1.x.
Patrick
Updated by mfriedrich on 2014-07-21 14:34:20 +00:00
Escalations in Icinga 2 are just a notification with a defined begin and end time. Their behavior and configuration differ somewhat from what is possible with Icinga 1.x. For an example, check http://docs.icinga.org/icinga2/latest/doc/module/icinga2/toc#!/icinga2/latest/doc/module/icinga2/chapter/monitoring-basics#notification-escalations
Updated by berk on 2015-05-18 12:17:39 +00:00
Updated by mfriedrich on 2015-08-04 19:03:54 +00:00
Anything on the "better" patch?
This issue has been migrated from Redmine: https://dev.icinga.com/issues/3586
Created by prism1 on 2013-01-29 18:57:42 +00:00
Assignee: prism1
Status: Assigned
Target Version: Backlog
Last Update: 2015-08-04 19:03:54 +00:00 (in Redmine)
When state-based escalation is enabled and multiple escalation levels are defined, RECOVERY notifications are sent to ALL escalation levels, including levels the escalation never reached.
For example:
define serviceescalation{
    service_description             *
    hostgroup_name                  hg-priority2
    notification_interval           5
    contact_groups                  cg-unix-pager
    escalation_options              c,r
    first_critical_notification     1
    last_critical_notification      0
    }
define serviceescalation{
    service_description             *
    hostgroup_name                  hg-priority2
    notification_interval           5
    contact_groups                  cg-unix-pager-247
    escalation_options              c,r
    first_critical_notification     2
    last_critical_notification      0
    }
define serviceescalation{
    service_description             *
    hostgroup_name                  hg-priority2
    notification_interval           5
    contact_groups                  cg-unix-senior
    escalation_options              c,r
    first_critical_notification     3
    last_critical_notification      0
    }
If the escalation only reaches level 2, the recovery is still sent to ALL contact groups (cg-unix-pager, cg-unix-pager-247, cg-unix-senior), although cg-unix-senior was never notified of the problem.
Looking at the code, I found that in the loop over all escalation levels for a particular service,
no condition returns FALSE when the state is STATE_OK.
Here is a link to a patch that adds such a condition (on the first_ and last_ directives). Our tests show that the behaviour is now
correct, but it definitely needs more review, as I am not familiar with the code and may have missed something.
http://pastebin.com/wEcyRnP7
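To illustrate the kind of check described above, here is a minimal sketch in C. It is not the pastebin patch and not the actual Icinga 1.x source: the struct, the enum, the function name escalation_applies and the critical_sent counter are all hypothetical, and only the critical/recovery case (escalation_options c,r) is modelled. The point is simply that the STATE_OK branch goes through the same first_/last_ range test as the CRITICAL branch instead of falling through to TRUE.

```c
#include <stdbool.h>

/* Hypothetical, simplified types; NOT the real Icinga 1.x structures. */
enum svc_state { SVC_OK = 0, SVC_WARNING = 1, SVC_CRITICAL = 2, SVC_UNKNOWN = 3 };

struct escalation {
    int first_critical_notification; /* level becomes active at this count */
    int last_critical_notification;  /* 0 means "no upper bound" */
};

/*
 * Returns true if this escalation level should receive the notification.
 * critical_sent is an assumed counter: the number of CRITICAL notifications
 * sent for the current problem (for a recovery, the count reached before
 * the service went back to OK).
 */
bool escalation_applies(const struct escalation *esc,
                        enum svc_state state, int critical_sent)
{
    /* This sketch models escalation_options c,r only. */
    if (state != SVC_CRITICAL && state != SVC_OK)
        return false;

    /* Level was never reached: reject, also for recoveries (SVC_OK). */
    if (critical_sent < esc->first_critical_notification)
        return false;

    /* Level has already ended, unless last_ is 0 (unbounded). */
    if (esc->last_critical_notification != 0 &&
        critical_sent > esc->last_critical_notification)
        return false;

    return true;
}
```

With the three escalations from the example (first_critical_notification 1, 2 and 3, last_critical_notification 0) and a recovery after two CRITICAL notifications, this sketch accepts the first two levels and rejects the third, so cg-unix-senior would not get the RECOVERY.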
Side note: the documentation lists "first_notification" and "last_notification" as required for escalation objects. But when you use the state-based first_/last_ directives (for example first_critical_notification and last_critical_notification), you don't have to set them; the state-based directives work without "first_notification" and "last_notification".