[dev.icinga.com #11248] Active checks are executed even though passive results are submitted #3985

Closed
icinga-migration opened this issue Feb 27, 2016 · 14 comments
Labels
blocker Blocks a release or needs immediate attention bug Something isn't working
Milestone

Comments

@icinga-migration

This issue has been migrated from Redmine: https://dev.icinga.com/issues/11248

Created by julianbrost on 2016-02-27 18:24:55 +00:00

Assignee: mfriedrich
Status: Resolved (closed on 2016-03-11 08:35:00 +00:00)
Target Version: 2.4.4
Last Update: 2016-03-24 09:37:51 +00:00 (in Redmine)

Icinga Version: 2.4.3
Backport?: Already backported
Include in Changelog: 1

If a service is defined using the following template and passive check results are submitted every 5 minutes, Icinga still runs the active check every 17 minutes and sets the status to unknown, causing the service to flap all the time.

template Service "passive-service" {
  check_command = "passive"
  check_interval = 17m
}
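
For readers reproducing this, one common way to submit passive results is an external command line of the form `[timestamp] PROCESS_SERVICE_CHECK_RESULT;host;service;return_code;output`. The report does not say which submission method was used, so the following is only a sketch under that assumption. The helper name passive_result_command and all argument values are hypothetical.

```cpp
#include <ctime>
#include <string>

// Hypothetical helper: formats one external-command line that reports a
// passive service check result. Host, service and output are placeholders
// supplied by the caller; nothing here is taken from Icinga 2's sources.
inline std::string passive_result_command(std::time_t now,
                                          const std::string& host,
                                          const std::string& service,
                                          int return_code,
                                          const std::string& output) {
    return "[" + std::to_string(static_cast<long long>(now)) + "] " +
           "PROCESS_SERVICE_CHECK_RESULT;" + host + ";" + service + ";" +
           std::to_string(return_code) + ";" + output;
}
```

The resulting line would be written to the command pipe provided by the external command feature. The pipe's location depends on the installation and is not given in this report.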

The next check time gets improperly updated:

current time                          last check                            next check
Sat Feb 27 14:41:10 CET 2016          Sat Feb 27 14:35:32 CET 2016          Sat Feb 27 14:41:20 CET 2016
Sat Feb 27 14:41:11 CET 2016          Sat Feb 27 14:35:32 CET 2016          Sat Feb 27 14:41:20 CET 2016
Sat Feb 27 14:41:12 CET 2016          Sat Feb 27 14:35:32 CET 2016          Sat Feb 27 14:41:20 CET 2016
Sat Feb 27 14:41:13 CET 2016          Sat Feb 27 14:41:13 CET 2016          Sat Feb 27 14:41:20 CET 2016  <-- (1)
Sat Feb 27 14:41:14 CET 2016          Sat Feb 27 14:41:13 CET 2016          Sat Feb 27 14:41:20 CET 2016
Sat Feb 27 14:41:16 CET 2016          Sat Feb 27 14:41:13 CET 2016          Sat Feb 27 14:41:20 CET 2016
Sat Feb 27 14:41:17 CET 2016          Sat Feb 27 14:41:13 CET 2016          Sat Feb 27 14:41:20 CET 2016
Sat Feb 27 14:41:18 CET 2016          Sat Feb 27 14:41:13 CET 2016          Sat Feb 27 14:41:20 CET 2016
Sat Feb 27 14:41:19 CET 2016          Sat Feb 27 14:41:13 CET 2016          Sat Feb 27 14:41:20 CET 2016
Sat Feb 27 14:41:20 CET 2016          Sat Feb 27 14:41:20 CET 2016          Sat Feb 27 14:58:20 CET 2016  <-- (2)
Sat Feb 27 14:41:21 CET 2016          Sat Feb 27 14:41:20 CET 2016          Sat Feb 27 14:58:20 CET 2016
Sat Feb 27 14:41:22 CET 2016          Sat Feb 27 14:41:20 CET 2016          Sat Feb 27 14:58:20 CET 2016

At (1), passive check results were submitted but the next check time wasn't updated. At (2), Icinga executed the scheduled check even though passive results had been submitted a few seconds earlier; in this case the next check time was updated.
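
The timeline above can be replayed with a toy model. This is a minimal C++ sketch of the behaviour the report expects, assuming that every processed result (active or passive) should move the next check one full check_interval ahead. ToySchedule, process_result, active_check_due and at are hypothetical names used for illustration only; they are not part of Icinga 2's code base.

```cpp
#include <chrono>

// Toy model of a checkable's schedule. NOT Icinga 2's actual classes.
// All times are stored as seconds since midnight to keep the example small.
struct ToySchedule {
    std::chrono::seconds check_interval{0};
    std::chrono::seconds last_check{0};
    std::chrono::seconds next_check{0};
};

// Assumed expected behaviour: any processed result pushes the next check
// one full check_interval into the future, counted from the result time.
inline void process_result(ToySchedule& s, std::chrono::seconds now) {
    s.last_check = now;
    s.next_check = now + s.check_interval;
}

// An active check is due only once the current time reaches next_check.
inline bool active_check_due(const ToySchedule& s, std::chrono::seconds now) {
    return now >= s.next_check;
}

// Helper: converts a wall-clock time of day to seconds since midnight.
inline std::chrono::seconds at(int h, int m, int sec) {
    return std::chrono::hours(h) + std::chrono::minutes(m) +
           std::chrono::seconds(sec);
}
```

With check_interval set to 17 minutes, calling process_result at 14:41:13 moves next_check to 14:58:13 in this model, so the active check scheduled for 14:41:20 would no longer be due. In the observed output, next_check stayed at 14:41:20 after the passive result at (1).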

This bug seems to have been introduced by 9ca7245. I've attached a patch that fixes this issue.

Attachments

Changesets

2016-03-05 17:15:03 +00:00 by mfriedrich b8e3d61

Revert "Properly set the next check time for active and passive checks"

This reverts commit 2a11b27972e4325bf80e9abc9017eab7dd03e712.

This patch does not properly work and breaks the check_interval setting
for passive checks. Requires a proper patch.

refs #11248
refs #11257
refs #11273

(the old issue)
refs #7287

2016-03-05 17:16:49 +00:00 by mfriedrich ef532f2

Revert "Fix check scheduling w/ retry_interval"

This reverts commit a51e647cc760bd5f7c4de6182961a477478c11a9.

This patch causes trouble with check results received
1) passively 2) throughout the cluster. A proper patch
for setting the retry_interval on NOT-OK state changes
is required.

refs #11248
refs #11257
refs #11273

(the old issue)
refs #7287

2016-03-11 14:55:03 +00:00 by mfriedrich 8344f74

Revert "Properly set the next check time for active and passive checks"

This reverts commit 2a11b27972e4325bf80e9abc9017eab7dd03e712.

This patch does not properly work and breaks the check_interval setting
for passive checks. Requires a proper patch.

refs #11248
refs #11257
refs #11273

(the old issue)
refs #7287

2016-03-11 14:55:14 +00:00 by mfriedrich f99feab

Revert "Fix check scheduling w/ retry_interval"

This reverts commit a51e647cc760bd5f7c4de6182961a477478c11a9.

This patch causes trouble with check results received
1) passively 2) throughout the cluster. A proper patch
for setting the retry_interval on NOT-OK state changes
is required.

refs #11248
refs #11257
refs #11273

(the old issue)
refs #7287
@icinga-migration
Author

Updated by julianbrost on 2016-02-27 18:38:14 +00:00

Michael Friedrich pointed out that passive checks are enabled by default, so the "if" condition proposed in my patch would be triggered almost all the time. I guess one would want to add

&& cr->getActive()

to the condition, but I haven't tested that yet.

@icinga-migration
Author

Updated by mfriedrich on 2016-02-27 19:04:55 +00:00

  • Status changed from New to Assigned
  • Assigned to set to mfriedrich

@icinga-migration
Author

Updated by mfriedrich on 2016-02-29 09:00:10 +00:00

  • Relates set to 11257

@icinga-migration
Author

Updated by mfriedrich on 2016-03-04 15:27:37 +00:00

  • Relates set to 11273

@icinga-migration
Author

Updated by mfriedrich on 2016-03-04 15:30:36 +00:00

  • Parent Id set to 11310

@icinga-migration
Author

Updated by mfriedrich on 2016-03-04 15:33:12 +00:00

  • Relates deleted 11257

@icinga-migration
Author

Updated by mfriedrich on 2016-03-04 15:33:36 +00:00

  • Relates deleted 11273

@icinga-migration
Author

Updated by mfriedrich on 2016-03-05 17:39:59 +00:00

I've reverted 2 commits which might be causing trouble here. Can you please re-test the current git master?

@icinga-migration
Author

Updated by mfriedrich on 2016-03-09 10:42:18 +00:00

  • Target Version set to 2.4.4

In terms of cr->GetActive(): good idea, we discussed that today independently of this issue. A proper fix for the retry_interval should take that into account with #11336.

In the meantime I'll assign this issue to 2.4.4. It would be great if you could do further tests so that this ticket can be resolved for the targeted release.

@icinga-migration
Author

Updated by mfriedrich on 2016-03-09 10:44:08 +00:00

  • Priority changed from Normal to High

@icinga-migration
Author

Updated by julianbrost on 2016-03-10 13:26:17 +00:00

Seems to be fixed. I can't reproduce the problem with version 68449c2.

@icinga-migration
Author

Updated by mfriedrich on 2016-03-11 08:35:00 +00:00

  • Status changed from Assigned to Resolved
  • Done % changed from 0 to 100

Thanks.

@icinga-migration
Author

Updated by mfriedrich on 2016-03-11 14:55:23 +00:00

  • Backport? changed from Not yet backported to Already backported

@icinga-migration
Author

Updated by mfriedrich on 2016-03-24 09:37:51 +00:00

  • Parent Id deleted 11310

@icinga-migration icinga-migration added blocker Blocks a release or needs immediate attention bug Something isn't working libicinga labels Jan 17, 2017
@icinga-migration icinga-migration added this to the 2.4.4 milestone Jan 17, 2017