[dev.icinga.com #13393] notifications being sent on ack to a group that wasn't sent the initial down page #4820

icinga-migration · 2016-12-03T02:26:32Z

This issue has been migrated from Redmine: https://dev.icinga.com/issues/13393

Created by djalden on 2016-12-03 02:26:32 +00:00

Assignee: (none)
Status: Closed (closed on 2016-12-07 14:50:26 +00:00)
Target Version: (none)
Last Update: 2016-12-07 14:50:26 +00:00 (in Redmine)

Icinga Version: 2.5.4
Backport?: Not yet backported
Include in Changelog: 1

Hi,
Our site is set up so that all outages send an email. In addition, during business hours the responsible group is paged immediately and the on-call person is paged 30 minutes later (just in case the responsible group is asleep :). Outside of business hours this is reversed: the person on call is paged immediately and the responsible group is paged 30 minutes later. We just had the following happen (outside of normal business hours):

A service went down (ssh in this case)
The person on call got paged
The person on call ack'ed the service outage
The person on call AND the responsible group got paged with the ack info

The group should not have been paged, right? Here's some (hopefully) relevant info:

% icinga2 object list --name ups1
Object 'ups1' of type 'Host':
  [...]
  * vars
    [...]
    * notify
      * info
        * action = "email"
        * delay = 0
        * interval = 1800
        * users = [ "info" ]
        * window = "24x7"
      * op1
        * action = "sms"
        * delay = 1800
        * interval = 1200
        * users = [ "oncall" ]
        * window = "BusinessHours"
      * op2
        * action = "sms"
        * delay = 0
        * interval = 1200
        * users = [ "oncall" ]
        * window = "7to10SS-exclude-BusinessHours"
      * op3
        * action = "sms"
        * delay = 0
        * groups = [ "network" ]
        * interval = 1200
        * window = "BusinessHours"
      * op4
        * action = "sms"
        * delay = 1800
        * groups = [ "network" ]
        * interval = 1200
        * window = "7to10SS-exclude-BusinessHours"
      * op5
        * action = "email"
        * delay = 0
        * groups = [ "network" ]
        * interval = 7200
        * window = "BusinessHours"

Here's the (again, hopefully relevant) parts of icinga2.log:

information/Checkable: Checking for configured notifications for object 'ups1!SSH'
information/Notification: Sending 'Problem' notification 'ups1!SSH!service_notify_info for user 'info'
information/Notification: Sending 'Problem' notification 'ups1!SSH!service_notify_op2 for user 'oncall'
information/Notification: Completed sending 'Problem' notification 'ups1!SSH!service_notify_info' for checkable 'ups1!SSH' and user 'info'.
information/Notification: Completed sending 'Problem' notification 'ups1!SSH!service_notify_op2' for checkable 'ups1!SSH' and user 'oncall'.

[...]

information/ExternalCommandListener: Executing external command: [1480721459] ACKNOWLEDGE_SVC_PROBLEM;ups1;SSH;2;1;0;me;Checking on this.
information/ConfigItem: Committing config item(s).
information/ConfigItem: Instantiated 1 Comment.
information/ConfigItem: Triggering Start signal for config items
information/ConfigItem: Activated all objects.
information/Checkable: Checking for configured notifications for object 'ups1!SSH'
information/Notification: Sending 'Acknowledgement' notification 'ups1!SSH!service_notify_op4 for user 'Bob'
information/Notification: Sending 'Acknowledgement' notification 'ups1!SSH!service_notify_op4 for user 'Sue'
information/Notification: Sending 'Acknowledgement' notification 'ups1!SSH!service_notify_op4 for user 'Tom'
information/Notification: Sending 'Acknowledgement' notification 'ups1!SSH!service_notify_op4 for user 'Larry'
information/Notification: Sending 'Acknowledgement' notification 'ups1!SSH!service_notify_info for user 'info'
information/Notification: Sending 'Acknowledgement' notification 'ups1!SSH!service_notify_op2 for user 'oncall'
information/Notification: Completed sending 'Acknowledgement' notification 'ups1!SSH!service_notify_op4' for checkable 'ups1!SSH' and user 'Bob'.
information/Notification: Completed sending 'Acknowledgement' notification 'ups1!SSH!service_notify_info' for checkable 'ups1!SSH' and user 'info'.
information/Notification: Completed sending 'Acknowledgement' notification 'ups1!SSH!service_notify_op4' for checkable 'ups1!SSH' and user 'Tom'.
information/Notification: Completed sending 'Acknowledgement' notification 'ups1!SSH!service_notify_op4' for checkable 'ups1!SSH' and user 'Sue'.
information/Notification: Completed sending 'Acknowledgement' notification 'ups1!SSH!service_notify_op4' for checkable 'ups1!SSH' and user 'Larry'.
information/Notification: Completed sending 'Acknowledgement' notification 'ups1!SSH!service_notify_op2' for checkable 'ups1!SSH' and user 'oncall'.

icinga-migration · 2016-12-05T09:35:40Z

Updated by mfriedrich on 2016-12-05 09:35:40 +00:00

Status changed from New to Feedback
Assigned to set to djalden

Acknowledgement notifications are sent to all users/user groups specified for this notification unless you remove the Acknowledgement type from its type filter. So that works by design. Is that what you're wondering about?

icinga-migration · 2016-12-05T13:20:27Z

Updated by tgelf on 2016-12-05 13:20:27 +00:00

There are no timestamps in the log, but I guess what he expects would be:

a) to inform only people who got (or would right now get) the problem notification about the fact that the problem has been acknowledged
b) and/or not to bother people with a configured 30min notification delay with an Ack during that 30mins

Cheers,
Thomas

icinga-migration · 2016-12-05T15:14:44Z

Updated by djalden on 2016-12-05 15:14:44 +00:00

Thomas is correct (sorry for editing too much out and removing the timestamps). The Ack went out within 5 minutes of the original notification, which is well under 30 minutes, so I was only expecting the oncall person to receive the ack since the responsible group isn't supposed to receive any notifications within the first 30 minutes (delay = 1800).

...dave

icinga-migration · 2016-12-07T14:50:26Z

Updated by mfriedrich on 2016-12-07 14:50:26 +00:00

Status changed from Feedback to Closed
Assigned to deleted ~~djalden~~

Hi,

The times.{begin,end} filters are checked only for the "Problem" notification type. Other types, such as Acknowledgement or DowntimeStart, are sent immediately when the event occurs. To suppress them, filter those types in the notification/user objects.
Re-notifications for the Problem type start at the specified times.begin (even if that is the first notification you receive because the earlier ones were suppressed).

Furthermore, anything except the "Problem" notification type triggers a notification event independently of the times.{begin,end} or interval=0 filters. That's how it is designed to work and, I'm afraid, not a bug. I'm therefore closing this issue.

Kind regards,
Michael
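
The rule described in the comment above can be modeled in a few lines. This is only a sketch of the described behavior, not Icinga 2 source code: `NotificationRule`, `should_send`, and the `begin` and `types` fields are hypothetical names standing in for a notification's times.begin delay and its type filter, and times.end and the time period window are left out for brevity.

```python
from dataclasses import dataclass

# Hypothetical model of the behavior described in the thread; not Icinga 2 code.
ALL_TYPES = frozenset({"Problem", "Acknowledgement", "Recovery", "DowntimeStart"})


@dataclass(frozen=True)
class NotificationRule:
    name: str
    begin: int = 0                  # seconds; stands in for times.begin
    types: frozenset = ALL_TYPES    # stands in for the notification's type filter


def should_send(rule, event_type, seconds_since_problem):
    """Return True if the rule would notify for this event."""
    # The type filter applies to every notification type.
    if event_type not in rule.types:
        return False
    # The delay is only checked for Problem notifications.
    if event_type == "Problem":
        return seconds_since_problem >= rule.begin
    # Everything else (Acknowledgement, DowntimeStart, ...) is sent immediately.
    return True


# The two out-of-hours rules from the report: on-call at once, group after 30 minutes.
op2 = NotificationRule("op2", begin=0)
op4 = NotificationRule("op4", begin=1800)
# Same as op4, but with Acknowledgement removed from the type filter.
op4_filtered = NotificationRule("op4", begin=1800,
                                types=frozenset({"Problem", "Recovery"}))

print(should_send(op2, "Problem", 0))                     # True
print(should_send(op4, "Problem", 0))                     # False
print(should_send(op4, "Acknowledgement", 300))           # True
print(should_send(op4_filtered, "Acknowledgement", 300))  # False
```

With the acknowledgement arriving about five minutes (300 seconds) after the outage, the unfiltered `op4` rule still fires, which matches the log in the report. Removing Acknowledgement from the rule's types is the filtering approach the comment suggests for keeping the delayed group quiet.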

icinga-migration closed this as completed Dec 7, 2016

icinga-migration added the bug and area/notifications labels on Jan 17, 2017
