[dev.icinga.com #11320] Volatile transitions from HARD NOT-OK->NOT-OK do not trigger notifications #4006

Closed
icinga-migration opened this issue Mar 7, 2016 · 6 comments

@icinga-migration

This issue has been migrated from Redmine: https://dev.icinga.com/issues/11320

Created by dmpcore on 2016-03-07 08:59:09 +00:00

Assignee: mfriedrich
Status: Resolved (closed on 2016-03-11 12:25:03 +00:00)
Target Version: 2.4.4
Last Update: 2016-03-24 09:37:57 +00:00 (in Redmine)

Icinga Version: 2.1.0
Backport?: Already backported
Include in Changelog: 1

Hello.

This ticket comes from monitoring-portal:
https://monitoring-portal.org/index.php?thread/35426-notifications-on-volatile-services-not-working/

I'm trying to configure a volatile service so that it sends a notification every time the check returns a CRITICAL state, rather than only at the notification interval. But it isn't working properly.

I have attached the configuration file and the debug log from a test I ran, which covers several cases:

  • When the state changes from OK to CRITICAL, a notification is triggered (works properly)
  • When the state changes from CRITICAL to OK, no notification is triggered (works properly)
  • When the state remains CRITICAL, no notification is triggered, but one should be

And this is the runtime object configuration:

[root@maxpges01 conf.d]# /apps/icinga/sbin/icinga2 object list --name "localserver*"
Object 'localserver' of type 'Host':
% declared in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 3:1-3:25
* __name = "localserver"
* action_url = ""
* address = "127.0.0.1"
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 6:3-6:23
* address6 = ""
* check_command = "hostalive"
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 7:3-7:29
* check_interval = 600
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 10:3-10:22
* check_period = "24x7"
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 9:3-9:23
* command_endpoint = ""
* display_name = "localserver"
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 5:3-5:30
* enable_active_checks = true
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 13:3-13:26
* enable_event_handler = true
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 15:3-15:26
* enable_flapping = false
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 16:3-16:21
* enable_notifications = true
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 12:3-12:26
* enable_passive_checks = true
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 14:3-14:27
* enable_perfdata = true
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 17:3-17:21
* event_command = ""
* flapping_threshold = 30
* groups = [ ]
* icon_image = ""
* icon_image_alt = ""
* max_check_attempts = 3
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 8:3-8:24
* name = "localserver"
* notes = ""
* notes_url = ""
* retry_interval = 100
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 11:3-11:22
* templates = [ "localserver", "base-log" ]
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 3:1-3:25
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 21:1-21:24
* type = "Host"
* vars
  * notification_group = "group-mail-24x7"
  % = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 18:3-18:45
* volatile = false
* zone = ""


Object 'localserver!LOG' of type 'Service':
% declared in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 26:1-26:19
* __name = "localserver!LOG"
* action_url = ""
* check_command = "check_nrpe"
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 28:3-28:30
* check_interval = 300
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 38:3-38:22
* check_period = "24x7"
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 37:3-37:23
* command_endpoint = ""
* display_name = "LOG"
* enable_active_checks = true
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 41:3-41:26
* enable_event_handler = true
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 43:3-43:26
* enable_flapping = false
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 44:3-44:21
* enable_notifications = true
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 40:3-40:26
* enable_passive_checks = true
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 42:3-42:27
* enable_perfdata = true
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 45:3-45:21
* event_command = ""
* flapping_threshold = 30
* groups = [ ]
* host_name = "localserver"
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 26:1-26:19
* icon_image = ""
* icon_image_alt = ""
* max_check_attempts = 1
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 36:3-36:24
* name = "LOG"
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 26:1-26:19
* notes = ""
* notes_url = ""
* retry_interval = 60
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 39:3-39:21
* templates = [ "LOG", "volatile-service" ]
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 26:1-26:19
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 35:1-35:35
* type = "Service"
* vars
  * address = "127.0.0.1"
  % = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 29:3-29:29
  * notification_group = "group-mail-24x7"
  % = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 31:3-31:45
  * remote_nrpe_command = "check_log"
  % = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 30:3-30:40
  * states = [ 4, 8 ]
  % = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 48:3-48:37
  * types = [ 32, 16, 8, 128, 256, 1, 2, 4 ]
  % = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 47:3-47:124
* volatile = true
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 46:3-46:14
* zone = ""


Object 'localserver!LOG!notification-group-mail-24x7' of type 'Notification':
% declared in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 78:1-78:60
* __name = "localserver!LOG!notification-group-mail-24x7"
* command = "mail-service-notification"
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 72:3-72:39
* command_endpoint = ""
* host_name = "localserver"
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 78:1-78:60
* interval = 7200
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 75:3-75:17
* name = "notification-group-mail-24x7"
* period = "24x7"
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 81:3-81:17
* service_name = "LOG"
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 78:1-78:60
* states = [ 4, 8 ]
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 73:3-73:23
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 83:5-83:32
* templates = [ "notification-group-mail-24x7", "template-mail-service-notification" ]
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 78:1-78:60
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 71:1-71:58
* times = null
* type = "Notification"
* types = [ 32, 16, 8, 128, 256, 1, 2, 4 ]
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 74:3-74:129
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 86:5-86:30
* user_groups = [ "group-mail-24x7" ]
% = modified in '/apps/icinga/etc/icinga2/conf.d/test.conf', lines 80:3-80:37
* users = null
* vars = null
* zone = ""
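
For readers who want to interpret the numeric `states` and `types` values in the listing above, here is a small Python sketch that maps them back to names. The bit values in both tables are an assumption based on how Icinga 2 is generally understood to encode its state and type filters; they are not taken from this ticket, so please verify them against your version. `decode` is a hypothetical helper.

```python
# Assumed bit values for Icinga 2 state and type filters (verify for your version).
STATE_FILTERS = {
    1: "OK", 2: "Warning", 4: "Critical", 8: "Unknown", 16: "Up", 32: "Down",
}
TYPE_FILTERS = {
    1: "DowntimeStart", 2: "DowntimeEnd", 4: "DowntimeRemoved", 8: "Custom",
    16: "Acknowledgement", 32: "Problem", 64: "Recovery",
    128: "FlappingStart", 256: "FlappingEnd",
}


def decode(values, table):
    """Map numeric filter values to names, keeping the input order."""
    return [table.get(v, "unknown({})".format(v)) for v in values]


# Values copied from the Notification object in the listing above.
states = decode([4, 8], STATE_FILTERS)
types = decode([32, 16, 8, 128, 256, 1, 2, 4], TYPE_FILTERS)
print(states)
print(types)
```

If the assumed tables are right, the notification is limited to the Critical and Unknown states, and the type list does not include Recovery, which would match the observation that CRITICAL to OK sends nothing.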

dnsmichi says it could be a bug with NOT-OK -> NOT-OK transitions in volatile objects.

Best regards.
Diego

Attachments

Changesets

2016-03-11 12:19:03 +00:00 by mfriedrich 3e050bd

Fix: Volatile transitions from HARD NOT-OK->NOT-OK do not trigger notifications

fixes #11320

2016-03-11 14:56:43 +00:00 by mfriedrich 7ad7e28

Fix: Volatile transitions from HARD NOT-OK->NOT-OK do not trigger notifications

fixes #11320
@icinga-migration

Updated by mfriedrich on 2016-03-07 13:58:46 +00:00

  • Parent Id set to 11310

@icinga-migration

Updated by mfriedrich on 2016-03-09 10:57:38 +00:00

  • Subject changed from Transitions from NOT-OK->NOT-OK and volatile checkable objects to Volatile transitions from NOT-OK->NOT-OK do not trigger notifications
  • Category changed from Notifications to libicinga
  • Status changed from New to Assigned
  • Assigned to set to mfriedrich

I'll look into that, thanks for the report.

Kind regards,
Michael

@icinga-migration

Updated by mfriedrich on 2016-03-11 12:20:12 +00:00

  • Subject changed from Volatile transitions from NOT-OK->NOT-OK do not trigger notifications to Volatile transitions from HARD NOT-OK->NOT-OK do not trigger notifications
  • Target Version set to 2.4.4

Found it. The current code only sends notifications on a HARD state change, meaning a transition from SOFT to HARD. The documentation for volatile services, however, says that a notification is sent for every new check result in a HARD state.

I also re-ordered the notification sending code a bit while at it.
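
To make the difference concrete, the following is a toy Python model of the decision described above. It is not the libicinga code: `CheckState`, `is_hard_change` and `sends_problem_notification` are hypothetical names, the definition of a hard change is a simplification, and re-notifications driven by the notification `interval` are ignored.

```python
from dataclasses import dataclass


@dataclass
class CheckState:
    state: str       # "OK", "WARNING", "CRITICAL" or "UNKNOWN"
    state_type: str  # "SOFT" or "HARD"


def is_hard_change(old, new):
    """Simplified: HARD was reached from SOFT, or the HARD state value changed."""
    if new.state_type != "HARD":
        return False
    return old.state_type == "SOFT" or old.state != new.state


def sends_problem_notification(old, new, volatile, fixed):
    """Toy model of the behavior described in this comment."""
    if new.state == "OK":
        return False  # recoveries are out of scope for this sketch
    if is_hard_change(old, new):
        return True
    # With the fix, a volatile checkable notifies on every HARD NOT-OK result.
    return fixed and volatile and new.state_type == "HARD"


hard_ok = CheckState("OK", "HARD")
hard_crit = CheckState("CRITICAL", "HARD")

# HARD CRITICAL -> HARD CRITICAL on a volatile service, before and after the fix.
print(sends_problem_notification(hard_crit, hard_crit, volatile=True, fixed=False))
print(sends_problem_notification(hard_crit, hard_crit, volatile=True, fixed=True))
```

In this model the reported case (a volatile service staying in HARD CRITICAL) only produces a notification once `fixed` is true, while a non-volatile service in the same situation stays silent either way.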

Fixed:

[2016-03-11 13:15:51 +0100] information/ExternalCommandListener: Executing external command: [1457698551] PROCESS_SERVICE_CHECK_RESULT;11320-host;11320-service-volatile;2;crit to crit
[2016-03-11 13:15:51 +0100] notice/ExternalCommandProcessor: Processing passive check result for service '11320-service-volatile'
[2016-03-11 13:15:51 +0100] debug/DbEvents: add checkable check history for '11320-host!11320-service-volatile'
[2016-03-11 13:15:51 +0100] debug/DbEvents: add state change history for '11320-host!11320-service-volatile'
[2016-03-11 13:15:51 +0100] notice/Checkable: State Change: Checkable 11320-host!11320-service-volatile hard state change from CRITICAL to CRITICAL detected. Checkable is volatile.
[2016-03-11 13:15:51 +0100] information/Checkable: Checking for configured notifications for object '11320-host!11320-service-volatile'
[2016-03-11 13:15:51 +0100] debug/Checkable: Checkable '11320-host!11320-service-volatile' has 1 notification(s).
[2016-03-11 13:15:51 +0100] notice/Notification: Attempting to send notifications for notification object '11320-host!11320-service-volatile!11320-service-notification'.
[2016-03-11 13:15:51 +0100] information/Notification: Sending notification '11320-host!11320-service-volatile!11320-service-notification' for user '11320-user'
[2016-03-11 13:15:51 +0100] debug/DbEvents: add notification history for '11320-host!11320-service-volatile'
[2016-03-11 13:15:51 +0100] notice/Process: Running command 'sh' '-c' 'echo "`date +%s`: Volatile notification for service '11320-service-volatile' with state 'CRITICAL' and type 'PROBLEM'." >> /tmp/i2.volatile': PID 19988
[2016-03-11 13:15:51 +0100] debug/DbEvents: add contact notification history for service '11320-host!11320-service-volatile' and user '11320-user'.
[2016-03-11 13:15:51 +0100] debug/DbEvents: add log entry history for '11320-host!11320-service-volatile'
[2016-03-11 13:15:51 +0100] information/Notification: Completed sending notification '11320-host!11320-service-volatile!11320-service-notification' for checkable '11320-host!11320-service-volatile'
[2016-03-11 13:15:51 +0100] notice/Process: PID 19988 ('sh' '-c' 'echo "`date +%s`: Volatile notification for service '11320-service-volatile' with state 'CRITICAL' and type 'PROBLEM'." >> /tmp/i2.volatile') terminated with exit code 0
[2016-03-11 13:15:52 +0100] debug/IdoMysqlConnection: Query: INSERT INTO icinga_externalcommands (command_args, command_name, command_type, endpoint_object_id, entry_time, instance_id) VALUES ('11320-host;11320-service-volatile;2;crit to crit', 'PROCESS_SERVICE_CHECK_RESULT', '30', 89, FROM_UNIXTIME(1457698551), 1)
[2016-03-11 13:15:52 +0100] debug/IdoMysqlConnection: Query: INSERT INTO icinga_logentries (endpoint_object_id, entry_time, entry_time_usec, instance_id, logentry_data, logentry_time, logentry_type, object_id) VALUES (89, FROM_UNIXTIME(1457698551), '736773', 1, 'SERVICE NOTIFICATION: 11320-user;11320-host;11320-service-volatile;PROBLEM (CRITICAL);dummy;crit to crit', FROM_UNIXTIME(1457698551), '524288', 14446)
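
The test above was driven by a passive check result submitted as an external command. A minimal Python sketch for building such a command line, in the format shown in the first log line, could look like this. `passive_result_command` is a hypothetical helper, and the object names are the test objects from the log.

```python
import time


def passive_result_command(host, service, return_code, output, timestamp=None):
    """Build a PROCESS_SERVICE_CHECK_RESULT line in the format seen in the log."""
    if timestamp is None:
        timestamp = int(time.time())
    return "[{}] PROCESS_SERVICE_CHECK_RESULT;{};{};{};{}".format(
        timestamp, host, service, return_code, output
    )


# Return code 2 is CRITICAL; the timestamp is the one from the log above.
line = passive_result_command(
    "11320-host", "11320-service-volatile", 2, "crit to crit", 1457698551
)
print(line)
```

Submitting the same line repeatedly to the ExternalCommandListener's command pipe (its location depends on the installation) should reproduce the CRITICAL to CRITICAL case for a volatile service.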

@icinga-migration

Updated by mfriedrich on 2016-03-11 12:25:03 +00:00

  • Status changed from Assigned to Resolved
  • Done % changed from 0 to 100

Applied in changeset 3e050bd.

@icinga-migration

Updated by mfriedrich on 2016-03-11 14:56:51 +00:00

  • Backport? changed from Not yet backported to Already backported

@icinga-migration

Updated by mfriedrich on 2016-03-24 09:37:58 +00:00

  • Parent Id deleted 11310

@icinga-migration icinga-migration added bug Something isn't working libicinga labels Jan 17, 2017
@icinga-migration icinga-migration added this to the 2.4.4 milestone Jan 17, 2017