[dev.icinga.com #13393] notifications being sent on ack to a group that wasn't sent the initial down page #4820

Closed
icinga-migration opened this issue Dec 3, 2016 · 4 comments
Labels
area/notifications Notification events bug Something isn't working

Comments

@icinga-migration

This issue has been migrated from Redmine: https://dev.icinga.com/issues/13393

Created by djalden on 2016-12-03 02:26:32 +00:00

Assignee: (none)
Status: Closed (closed on 2016-12-07 14:50:26 +00:00)
Target Version: (none)
Last Update: 2016-12-07 14:50:26 +00:00 (in Redmine)

Icinga Version: 2.5.4
Backport?: Not yet backported
Include in Changelog: 1

Hi,
Our site is set up so that all outages send an email. In addition, during business hours the responsible group is paged immediately and the on-call person is paged 30 minutes later (just in case the responsible group is asleep :)). This reverses outside of business hours, so that the on-call person is paged immediately and the responsible group is paged 30 minutes later. We just had the following happen (outside of normal business hours):

  1. A service went down (ssh in this case)
  2. The person on call got paged
  3. The person on call ack'ed the service outage
  4. The person on call AND the responsible group got paged with the ack info

The group should not have been paged, right? Here's some (hopefully) relevant info:

% icinga2 object list --name ups1
Object 'ups1' of type 'Host':
  [...]
  * vars
    [...]
    * notify
      * info
        * action = "email"
        * delay = 0
        * interval = 1800
        * users = [ "info" ]
        * window = "24x7"
      * op1
        * action = "sms"
        * delay = 1800
        * interval = 1200
        * users = [ "oncall" ]
        * window = "BusinessHours"
      * op2
        * action = "sms"
        * delay = 0
        * interval = 1200
        * users = [ "oncall" ]
        * window = "7to10SS-exclude-BusinessHours"
      * op3
        * action = "sms"
        * delay = 0
        * groups = [ "network" ]
        * interval = 1200
        * window = "BusinessHours"
      * op4
        * action = "sms"
        * delay = 1800
        * groups = [ "network" ]
        * interval = 1200
        * window = "7to10SS-exclude-BusinessHours"
      * op5
        * action = "email"
        * delay = 0
        * groups = [ "network" ]
        * interval = 7200
        * window = "BusinessHours"

Here are the (again, hopefully relevant) parts of icinga2.log:

information/Checkable: Checking for configured notifications for object 'ups1!SSH'
information/Notification: Sending 'Problem' notification 'ups1!SSH!service_notify_info for user 'info'
information/Notification: Sending 'Problem' notification 'ups1!SSH!service_notify_op2 for user 'oncall'
information/Notification: Completed sending 'Problem' notification 'ups1!SSH!service_notify_info' for checkable 'ups1!SSH' and user 'info'.
information/Notification: Completed sending 'Problem' notification 'ups1!SSH!service_notify_op2' for checkable 'ups1!SSH' and user 'oncall'.

[...]

information/ExternalCommandListener: Executing external command: [1480721459] ACKNOWLEDGE_SVC_PROBLEM;ups1;SSH;2;1;0;me;Checking on this.
information/ConfigItem: Committing config item(s).
information/ConfigItem: Instantiated 1 Comment.
information/ConfigItem: Triggering Start signal for config items
information/ConfigItem: Activated all objects.
information/Checkable: Checking for configured notifications for object 'ups1!SSH'
information/Notification: Sending 'Acknowledgement' notification 'ups1!SSH!service_notify_op4 for user 'Bob'
information/Notification: Sending 'Acknowledgement' notification 'ups1!SSH!service_notify_op4 for user 'Sue'
information/Notification: Sending 'Acknowledgement' notification 'ups1!SSH!service_notify_op4 for user 'Tom'
information/Notification: Sending 'Acknowledgement' notification 'ups1!SSH!service_notify_op4 for user 'Larry'
information/Notification: Sending 'Acknowledgement' notification 'ups1!SSH!service_notify_info for user 'info'
information/Notification: Sending 'Acknowledgement' notification 'ups1!SSH!service_notify_op2 for user 'oncall'
information/Notification: Completed sending 'Acknowledgement' notification 'ups1!SSH!service_notify_op4' for checkable 'ups1!SSH' and user 'Bob'.
information/Notification: Completed sending 'Acknowledgement' notification 'ups1!SSH!service_notify_info' for checkable 'ups1!SSH' and user 'info'.
information/Notification: Completed sending 'Acknowledgement' notification 'ups1!SSH!service_notify_op4' for checkable 'ups1!SSH' and user 'Tom'.
information/Notification: Completed sending 'Acknowledgement' notification 'ups1!SSH!service_notify_op4' for checkable 'ups1!SSH' and user 'Sue'.
information/Notification: Completed sending 'Acknowledgement' notification 'ups1!SSH!service_notify_op4' for checkable 'ups1!SSH' and user 'Larry'.
information/Notification: Completed sending 'Acknowledgement' notification 'ups1!SSH!service_notify_op2' for checkable 'ups1!SSH' and user 'oncall'.
@icinga-migration

Updated by mfriedrich on 2016-12-05 09:35:40 +00:00

  • Status changed from New to Feedback
  • Assignee set to djalden

Acknowledgement notifications are sent to all users/user groups specified for this notification unless you remove the filter type. So that works by design. Is that what you're wondering about?

@icinga-migration

Updated by tgelf on 2016-12-05 13:20:27 +00:00

There are no timestamps in the log, but I guess what he expects would be:

a) to inform only people who got (or would right now get) the problem notification about the fact that the problem has been acknowledged
b) and/or not to bother people with a configured 30min notification delay with an Ack during that 30mins

Cheers,
Thomas

@icinga-migration

Updated by djalden on 2016-12-05 15:14:44 +00:00

Thomas is correct (sorry for editing too much out and removing the timestamps). The Ack went out within 5 minutes of the original notification, well under 30 minutes, so I expected only the oncall person to receive it: the responsible group isn't supposed to receive any notifications within the first 30 minutes (delay = 1800).

...dave

@icinga-migration

Updated by mfriedrich on 2016-12-07 14:50:26 +00:00

  • Status changed from Feedback to Closed
  • Assignee djalden removed

Hi,

times.{begin,end} filters are checked only for the "Problem" notification type. Other types, such as Acknowledgement or DowntimeStart, are sent immediately when the event occurs. Those types can be filtered in the notification/user objects instead.
Re-notifications for the Problem type start at the specified times.begin (even if that is the first notification you receive because the earlier one was suppressed).
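
As a rough sketch of what such a type filter could look like (this is not taken from the reporter's configuration; the notification name, the command name "sms-service-notification" and the assign rule are placeholders, and the values merely mirror the op4 entry from the report):

```
// Hypothetical Notification object for the delayed group page (op4 above).
apply Notification "sms-network-delayed" to Service {
  command = "sms-service-notification"   // placeholder command name
  user_groups = [ "network" ]
  period = "7to10SS-exclude-BusinessHours"
  interval = 20m

  // The delay is only evaluated for Problem notifications.
  times = {
    begin = 30m
  }

  // Acknowledgement is left out of the list, so an ack does not
  // trigger this notification at all.
  types = [ Problem, Recovery ]

  assign where host.vars.notify          // placeholder assign rule
}

// Alternatively, the same filter can be set on individual users.
object User "Bob" {
  groups = [ "network" ]
  types = [ Problem, Recovery ]
}
```

With a filter like this the group is never notified about acknowledgements, not even after the 30 minutes have passed. Whether that trade-off is acceptable depends on the site.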

Furthermore, any notification type other than "Problem" triggers a notification event independently of the times.{begin,end} or interval=0 filters. That is how it is designed to work and, I'm afraid, not a bug. I'm therefore closing this issue.

Kind regards,
Michael

@icinga-migration icinga-migration added bug Something isn't working area/notifications Notification events labels Jan 17, 2017