[dev.icinga.com #12929] Notification are sent without taking the filter into account #4734

Closed
icinga-migration opened this issue Oct 14, 2016 · 8 comments
Labels
area/notifications Notification events bug Something isn't working

Comments

@icinga-migration

This issue has been migrated from Redmine: https://dev.icinga.com/issues/12929

Created by BFalco on 2016-10-14 14:53:15 +00:00

Assignee: BFalco
Status: Feedback
Target Version: (none)
Last Update: 2017-01-02 13:49:29 +00:00 (in Redmine)

Icinga Version: 2.5.4
Backport?: Not yet backported
Include in Changelog: 1

I'm running Icinga 2 in a distributed monitoring setup with one master and several Icinga satellites.
The configuration is replicated to the satellites via the top-down schema (zones.d).
Both the master and the satellites have the notification feature enabled.
The scripts needed for sending e-mails or SMS are available only on the master.

Notifications are sent at intervals:
One e-mail is sent after 10 minutes.
One SMS is sent after 20 minutes.

Problem 1:
On some occasions, an SMS notification was sent instantly for a service which entered the UNKNOWN state, skipping the 20-minute delay and bypassing the state filter.
The UNKNOWN state was caused by a check plugin which didn't receive an answer from the host before the timeout.

Problem 2:
For some hosts which were down for more than 30 minutes, only the first notification (e-mail) was sent.
After renaming the hosts in Icinga's configuration, notifications started working normally for these hosts, and the problem appeared for other hosts in a different zone.

Configuration example:

/**
 * MAIL NOTIFICATION TEMPLATE-----------------------------------------------------------------------------------------
 */
template Notification "mail-host-notification" {
    command = "mail-host-notification"
    states = [ Up, Down ]
    types = [ Problem, Acknowledgement, Recovery, Custom, FlappingStart, FlappingEnd, DowntimeStart, DowntimeEnd, DowntimeRemoved ]
}
template Notification "mail-service-notification" {
    command = "mail-service-notification"
    states = [ OK, Warning, Critical, Unknown ]
    types = [ Problem, Acknowledgement, Recovery, Custom, FlappingStart, FlappingEnd, DowntimeStart, DowntimeEnd, DowntimeRemoved ]
}

/**
 * SMS NOTIFICATION TEMPLATE-----------------------------------------------------------------------------------------
 */
template Notification "sms-host-notification" {
    command = "sms-host-notification"
    states = [ Down ]
    types = [ Problem ]
}
template Notification "sms-service-notification" {
    command = "sms-service-notification"
    states = [ Critical ]
    types = [ Problem ]
}

/**
 * HOST---------------------------------------------------------------------------
 */

apply Notification "mail_24x7" to Host {
  import "mail-host-notification"
  user_groups = ["default_group"]
  assign where host.vars.notification.mail == "mail_24x7"
  interval = 0
  times.begin = 10m
  times.end = 15m
  period = "24x7"
}
apply Notification "sms_24x7" to Host {
  import "sms-host-notification"
  user_groups = ["default_group"]
  assign where host.vars.notification.sms == "sms_24x7"
  interval = 0
  times.begin = 20m
  times.end = 25m
  period = "24x7"
}

/**
 * SERVICE---------------------------------------------------------------------------
 */

apply Notification "mail_24x7" to Service {
  import "mail-service-notification"
  user_groups = ["default_group"]
  assign where host.vars.notification.sms == "sms_24x7"
  interval = 0
  times.begin = 20m
  times.end = 25m
  period = "24x7"
}
apply Notification "sms_24x7" to Service {
  import "sms-service-notification"
  user_groups = ["default_group"]
  assign where host.vars.notification.sms == "sms_24x7"
  interval = 0
  times.begin = 20m
  times.end = 25m
  period = "24x7"
}

icinga2 - The Icinga 2 network monitoring daemon (version: v2.5.4)

Copyright (c) 2012-2016 Icinga Development Team (https://www.icinga.org/)
License GPLv2+: GNU GPL version 2 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Application information:
  Installation root: /usr
  Sysconf directory: /etc
  Run directory: /run
  Local state directory: /var
  Package data directory: /usr/share/icinga2
  State path: /var/lib/icinga2/icinga2.state
  Modified attributes path: /var/lib/icinga2/modified-attributes.conf
  Objects path: /var/cache/icinga2/icinga2.debug
  Vars path: /var/cache/icinga2/icinga2.vars
  PID path: /run/icinga2/icinga2.pid

System information:
  Platform: CentOS Linux
  Platform version: 7 (Core)
  Kernel: Linux
  Kernel version: 3.10.0-327.22.2.el7.x86_64
  Architecture: x86_64


@icinga-migration
Author

Updated by mfriedrich on 2016-12-07 22:48:04 +00:00

  • Status changed from New to Feedback
  • Assigned to set to BFalco

Corresponding debug log with the notification sending in progress would help :)

@icinga-migration
Author

Updated by BFalco on 2016-12-13 17:26:23 +00:00

  • File added debug_anon.txt

mfriedrich wrote:

Corresponding debug log with the notification sending in progress would help :)

Hello mfriedrich,

thank you for your reply. Attached to this post you will find an excerpt of the debug log from the Icinga2 master.
It contains the debug log for one of the hosts with the described problem.
The referenced notification objects are applied the same way as the objects in the opening post.

The host went into hard down state at 13:39:25. The first notification was correctly sent at 13:49:26.
None of the following notifications were sent.
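
To make the expected timing concrete, here is a minimal Python sketch of when each escalation window would open and close for the hard state change reported above. It assumes the times.begin and times.end values of the host notifications in the opening post and treats both as offsets from the hard state change; the date is a placeholder, since only the time of day is given, and `window_for` is a hypothetical helper, not Icinga 2 code.

```python
from datetime import datetime, timedelta

# Placeholder date: the post only states the time of day (13:39:25).
hard_down = datetime(2000, 1, 1, 13, 39, 25)

# times.begin / times.end as configured for the host notifications
# in the opening post (assumed to apply to this host as well).
windows = {
    "mail_24x7": (timedelta(minutes=10), timedelta(minutes=15)),
    "sms_24x7": (timedelta(minutes=20), timedelta(minutes=25)),
}


def window_for(state_change, begin, end):
    """Return (opens, closes) for a notification time window.

    Simplified model: both offsets are counted from the hard state change.
    """
    return state_change + begin, state_change + end


for name, (begin, end) in windows.items():
    opens, closes = window_for(hard_down, begin, end)
    print(f"{name}: {opens:%H:%M:%S} to {closes:%H:%M:%S}")
```

Under these assumptions the mail window opens at 13:49:25, which matches the first notification seen at 13:49:26, and the SMS window would open at 13:59:25.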

@icinga-migration
Author

Updated by BFalco on 2017-01-02 13:49:29 +00:00

  • File added exampleConfig.conf

Hello and happy New Year,
I've managed to reproduce the behaviour of Icinga 2 discarding escalation notifications with the attached example configuration.
The behaviour appears after icinga2 has been reloaded multiple times, e.g. after installing a new icinga2 version or after having icinga2 re-parse the config files (systemctl reload icinga2).
I’ve tested it with version 2.6.0.
[root@CentOs configuration]# icinga2 --version
icinga2 - The Icinga 2 network monitoring daemon (version: v2.6.0)

Copyright (c) 2012-2016 Icinga Development Team (https://www.icinga.org/)
License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl2.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Application information:
  Installation root: /usr
  Sysconf directory: /etc
  Run directory: /run
  Local state directory: /var
  Package data directory: /usr/share/icinga2
  State path: /var/lib/icinga2/icinga2.state
  Modified attributes path: /var/lib/icinga2/modified-attributes.conf
  Objects path: /var/cache/icinga2/icinga2.debug
  Vars path: /var/cache/icinga2/icinga2.vars
  PID path: /run/icinga2/icinga2.pid

System information:
  Platform: CentOS Linux
  Platform version: 7 (Core)
  Kernel: Linux
  Kernel version: 3.10.0-514.2.2.el7.x86_64
  Architecture: x86_64

Build information:
  Compiler: GNU 4.8.5
  Build host: unknown

Steps for reproducing the behaviour:

  1. Include the configuration attached to this post in icinga2. Please make sure that the host is reachable and that the entry 'interval = 0' is active in the notifications, then let icinga2 check the host.
  2. Change the host address in the config to 1.2.3.4 and reload the configuration.
    -> Notifications and escalations will work as intended.
  3. Change the host address back to 8.8.8.8.
    -> The recovery notification will be sent as intended.
  4. Repeat step 2. Only the first notification (mail_24x7) will be sent; the following escalations are discarded.
    Setting the period to 24x7 for all notifications resulted in the same behaviour.

Workaround:
Define a time window for notifications with the times dictionary.
Set interval to a value greater than the time defined in times.end.

@icinga-migration icinga-migration added needs feedback We'll only proceed once we hear from you again bug Something isn't working area/notifications Notification events labels Jan 17, 2017
@MyMindMyBodyMySoul

Hello,
keeping this up to date.
The described behaviour can be reproduced with Icinga 2.6.1.
debug_log-01022017.txt

@gunnarbeutner gunnarbeutner changed the title [dev.icinga.com #12929] Notification are sent whitout taking the filter into account [dev.icinga.com #12929] Notification are sent without taking the filter into account Feb 7, 2017
@dnsmichi
Contributor

#4995 might be related. What happens if you add Recovery and Up to sms notification filters?

As for the debug log: that's a lot of information, but where should I be looking? "sms_24x7" doesn't appear anywhere in it. Please extract the corresponding notification log lines.

@MyMindMyBodyMySoul

I added Up and Recovery to the sms notification.

The Icinga 2 notification behaviour after multiple reloads is as follows:
1st notification (mail_24x7) was sent correctly
2nd notification (sms_24x5) was sent correctly
3rd notification (mail-ticket_24x5) not sent/"forgotten"
Recovery notifications for mail_24x7 and sms_24x5 were sent correctly

debug_clean.txt
exampleConfig_30_03_2017.txt

The debug log from my last post does not contain an sms_24x7 message. I just changed the period of the notification object sms_8x5 to 24x7, as I was testing the configuration outside the 8x5 time period.
Extracting the log lines for sms_8x5 is futile as icinga2 completely forgot this notification exists.

@dnsmichi
Contributor

dnsmichi commented Jun 8, 2017

Problem is not notified immediately, but delayed.

[2017-03-30 20:37:55 +0200] notice/Notification: Attempting to send  notifications for notification object 'example_host!mail-ticket_24x5'.
[2017-03-30 20:37:55 +0200] notice/Notification: Not sending  notifications for notification object 'example_host!mail-ticket_24x5': before specified begin time (3 minutes)
[2017-03-30 20:48:13 +0200] debug/Notification: Type 'Recovery', TypeFilter: Problem (FType=64, TypeFilter=32)
[2017-03-30 20:48:13 +0200] notice/Notification: Not sending  notifications for notification object 'example_host!mail-ticket_24x5': type 'Recovery' does not match type filter: Problem.
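
The FType/TypeFilter pair in the debug line reads like a bitmask comparison. As a rough illustration only, assuming the type filter really is evaluated as a bitwise AND, the check can be sketched in Python as follows. The two numeric values are the ones from the log line; `type_matches` is a hypothetical helper and other notification types are left out.

```python
# Bit values as printed in the debug line above:
# FType=64 for Recovery, TypeFilter=32 for Problem.
PROBLEM = 32
RECOVERY = 64


def type_matches(ftype: int, type_filter: int) -> bool:
    """Return True when the event's type bit is set in the filter mask."""
    return (ftype & type_filter) != 0


problem_only = PROBLEM                   # types = [ Problem ]
problem_and_recovery = PROBLEM | RECOVERY  # types = [ Problem, Recovery ]

print(type_matches(RECOVERY, problem_only))          # False
print(type_matches(RECOVERY, problem_and_recovery))  # True
```

With a filter that only contains Problem, a Recovery event is rejected, which is what the last log line reports.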

In addition, the current state of the notification object would be interesting: before the first notification attempt, after it, and again after the recovery event.

curl -k -s -u root:icinga 'https://localhost:5665/v1/objects/notifications/example_host!mail-ticket_24x5'

More info on the Icinga 2 API can be found in the docs, in case you haven't enabled it before.
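
If the state has to be captured several times, the same query could be scripted. The following is a minimal Python sketch using only the standard library; it mirrors the curl call above, including the root:icinga credentials and the disabled certificate check (`-k`). The helper names `build_request` and `fetch_attrs` are hypothetical, and the `results[0]["attrs"]` response layout is an assumption that should be checked against the actual API output.

```python
import base64
import json
import ssl
import urllib.parse
import urllib.request


def build_request(host, port, user, password, notification_name):
    """Build the GET request for a single notification object."""
    # Keep '!' unescaped so the URL is identical to the curl example.
    quoted = urllib.parse.quote(notification_name, safe="!")
    url = f"https://{host}:{port}/v1/objects/notifications/{quoted}"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    headers = {"Authorization": f"Basic {token}", "Accept": "application/json"}
    return urllib.request.Request(url, headers=headers)


def fetch_attrs(request):
    """Send the request and return the object's attributes.

    Certificate verification is disabled, like 'curl -k'.
    Assumes the response has the shape {"results": [{"attrs": {...}}]}.
    """
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=ctx) as resp:
        return json.load(resp)["results"][0]["attrs"]


req = build_request("localhost", 5665, "root", "icinga",
                    "example_host!mail-ticket_24x5")
print(req.full_url)
# To query a running instance: print(json.dumps(fetch_attrs(req), indent=2))
```

Running `fetch_attrs(req)` before the first notification attempt, after it, and after the recovery would give three snapshots that can be compared side by side.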

@Crunsher Crunsher removed the needs feedback We'll only proceed once we hear from you again label Oct 5, 2017
@dnsmichi
Contributor

Looks like a configuration issue to me, and no further details provided.

5 participants
@dnsmichi @Crunsher @MyMindMyBodyMySoul @icinga-migration and others