[dev.icinga.com #12402] Notification resent, even if interval = 0 #4461

Closed
icinga-migration opened this issue Aug 11, 2016 · 8 comments
Labels
area/notifications Notification events bug Something isn't working
Comments

@icinga-migration

This issue has been migrated from Redmine: https://dev.icinga.com/issues/12402

Created by bsheqa on 2016-08-11 10:16:19 +00:00

Assignee: mfriedrich
Status: Resolved (closed on 2016-08-16 11:49:06 +00:00)
Target Version: 2.5.0
Last Update: 2016-08-16 11:49:06 +00:00 (in Redmine)

Icinga Version: v2.4.10-632-ge09fb88
Backport?: Not yet backported
Include in Changelog: 1

Following scenario:

  • a master-master setup
  • a notification with interval = 0

When the service fails, one notification is sent from master1.
After that, if master1 fails, the notification is resent from master2.
I don't think it's a timing issue: even if master1 fails 10 minutes after the notification, master2 still sends out another one.

Changesets

2016-08-12 12:49:29 +00:00 by mfriedrich e28f30a

Enhance log messages for {,reminder} notifications

refs #12402

2016-08-15 16:32:51 +00:00 by mfriedrich d909c09

Add an explicit flag for disabling reminder notifications

refs #12402

2016-08-16 11:48:09 +00:00 by mfriedrich 832b5be

Remove debug output in NotificationComponent

refs #12402

Relations:

@icinga-migration

Updated by mfriedrich on 2016-08-12 12:19:21 +00:00

  • Category set to Notifications
  • Status changed from New to Assigned
  • Assigned to set to mfriedrich
  • Target Version set to 2.5.0

@icinga-migration

Updated by mfriedrich on 2016-08-15 16:08:27 +00:00

Ok, it is a reminder notification which triggers the resending on master4 if master3 is stopped.

[2016-08-15 17:55:22 +0200] notice/NotificationComponent: Attempting to send reminder notification for object 'test-icinga-master3.test.netways.de!notification dummy'
[2016-08-15 17:55:22 +0200] notice/Notification: Attempting to send notifications for notification object 'test-icinga-master3.test.netways.de!notification dummy!notification-object-1'.
[2016-08-15 17:55:22 +0200] debug/Notification: Type 'Problem', TypeFilter: Acknowledgement, Custom, DowntimeEnd, DowntimeRemoved, DowntimeStart, FlappingEnd, FlappingStart, Problem and Recovery (FType=32, TypeFilter=511)
[2016-08-15 17:55:22 +0200] debug/Notification: State 'Critical', StateFilter: Critical, OK, Unknown and Warning (FState=4, StateFilter=15)
[2016-08-15 17:55:22 +0200] debug/Notification: User notification, Type 'Problem', TypeFilter: Acknowledgement, Custom, DowntimeEnd, DowntimeRemoved, DowntimeStart, FlappingEnd, FlappingStart, Problem and Recovery (FType=32, TypeFilter=511)
[2016-08-15 17:55:22 +0200] debug/Notification: User notification, State 'Critical', StateFilter: Critical, Down, OK, Unknown, Up and Warning (FState=4, StateFilter=-1)
[2016-08-15 17:55:22 +0200] information/Notification: Sending 'Problem' notification 'test-icinga-master3.test.netways.de!notification dummy!notification-object-1 for user 'mfriedrich'
[2016-08-15 17:55:22 +0200] notice/ApiListener: Relaying 'event::NotificationSentAllUsers' message
[2016-08-15 17:55:22 +0200] notice/Process: Running command '/etc/icinga2/scripts/mail-service-notification.sh': PID 12740
[2016-08-15 17:55:22 +0200] information/Notification: Completed sending 'Problem' notification 'test-icinga-master3.test.netways.de!notification dummy!notification-object-1' for checkable 'test-icinga-master3.test.netways.de!notification dummy' and user 'mfriedrich'.
[2016-08-15 17:55:22 +0200] notice/ApiListener: Relaying 'event::NotificationSentUser' message
[2016-08-15 17:55:22 +0200] notice/Process: PID 12740 ('/etc/icinga2/scripts/mail-service-notification.sh') terminated with exit code 0

@icinga-migration

Updated by mfriedrich on 2016-08-15 16:39:02 +00:00

For some reason the condition

if (notification->GetInterval() <= 0 && notification->GetLastProblemNotification() > checkable->GetLastHardStateChange())

fails. My guess is that GetLastHardStateChange() is too new (either not yet synced from the failing master node, or updated by checks running on master4 before the notifications are evaluated). The condition then evaluates to false, so the reminder is not skipped and the notification is sent again.

Since we rely heavily on timestamps here, I've looked into other ways of ensuring that only one notification is sent. One approach, also used in 1.x, is a dedicated flag that tells the current node as well as the other HA cluster nodes that no more notifications are needed. The flag is also exposed via the API and helps determine whether a notification object is going to send further notifications.

I've pushed a fix to git master which requires further notification tests.
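
To illustrate the difference between the two approaches, here is a minimal sketch. It is not the actual Icinga 2 code: the struct, its field names and both helper functions are hypothetical stand-ins, and only the timestamp comparison is taken from the condition quoted above.

```cpp
// Hypothetical, simplified stand-in for the state involved in the decision.
// None of these names are taken from the Icinga 2 source.
struct NotificationState {
    double interval;                 // notification interval in seconds; 0 means no reminders
    double lastProblemNotification;  // Unix timestamp of the last problem notification
    double lastHardStateChange;      // Unix timestamp of the checkable's last hard state change
    bool noMoreNotifications;        // assumed explicit flag, synced between the HA nodes
};

// Timestamp-based check, mirroring the condition quoted above.
// Returns true if the reminder should be skipped.
bool SkipReminderByTimestamps(const NotificationState& s) {
    return s.interval <= 0 && s.lastProblemNotification > s.lastHardStateChange;
}

// Flag-based check: does not depend on which node updated which timestamp last.
// Returns true if the reminder should be skipped.
bool SkipReminderByFlag(const NotificationState& s) {
    return s.interval <= 0 && s.noMoreNotifications;
}
```

With LastProblemNotification = 1471275464 and LastHardStateChange = 1471276522 (the values observed on master4), the timestamp-based check returns false and the reminder goes out, while the flag-based check returns true as long as the flag has been set and synced.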

@icinga-migration

Updated by mfriedrich on 2016-08-15 16:40:00 +00:00

LastProblemNotification

<1> => DateTime(1471275464).to_string()
"2016-08-15 17:37:44 +0200"

LastHardStateChange

<2> => DateTime(1471276522).to_string()
"2016-08-15 17:55:22 +0200"

@icinga-migration

Updated by mfriedrich on 2016-08-16 07:36:59 +00:00

Yep, that fixes it. I'll remove the debug output soon since it will be logged every 5s.

master4

[2016-08-16 09:35:14 +0200] debug/NotificationComponent: Skipping reminder notification 'test-icinga-master3.test.netways.de!notification dummy!notification-object-1'. Interval is 0 and no more notifications are required.

@icinga-migration

Updated by mfriedrich on 2016-08-16 11:49:06 +00:00

  • Status changed from Assigned to Resolved
  • Done % changed from 0 to 100

Last missing critical bug for 2.5.

@icinga-migration

Updated by mfriedrich on 2016-08-16 12:47:16 +00:00

  • Relates set to 11590

@icinga-migration

Updated by mfriedrich on 2016-08-16 12:47:59 +00:00

  • Relates set to 11562

@icinga-migration icinga-migration added bug Something isn't working area/notifications Notification events labels Jan 17, 2017
@icinga-migration icinga-migration added this to the 2.5.0 milestone Jan 17, 2017