[dev.icinga.com #14009] icinga2 merges different service checks together #4922

Closed
icinga-migration opened this issue Jan 13, 2017 · 5 comments
Labels
area/configuration DSL, parser, compiler, error handling area/distributed Distributed monitoring (master, satellites, clients) area/documentation End-user or developer help

Comments

@icinga-migration

This issue has been migrated from Redmine: https://dev.icinga.com/issues/14009

Created by bebehei on 2017-01-13 20:47:30 +00:00

Assignee: (none)
Status: New
Target Version: (none)
Last Update: 2017-01-14 13:03:19 +00:00 (in Redmine)

Icinga Version: 2.6.0
Backport?: Not yet backported
Include in Changelog: 1

Hi,

We use icinga2 on Ubuntu 14.04/16.04 with deb packages from packages.icinga2.org.
All nodes have the same packages and are connected to an icinga2-master, which syncs the CheckCommand definitions to all client-nodes.
All checks are initiated from master.

After the latest upgrade [1], some of the services in our installation started flapping.
They flap twice a minute, although the check_interval is set to 1m and the warn/crit values defined on the master are not reached.
All of our services are self-defined, but only those with common names are flapping.

We have now been able to track it down:

The deb packages deploy default configuration files to /etc/icinga2/conf.d/, which define some common checks (swap, load, ...).
We previously deleted these files with the script that configures our client nodes. But after the current upgrade, apt installed these files again.

As a result, we now have a service named "swap" defined on the icinga2 client and another service named "swap" defined on the icinga2 master. Same name, but different objects.

For some reason, the check result of the client-defined "swap" service gets reported back to the icinga2 master, too.
But its CheckCommand and warn/crit values may differ from those of the master-defined service.
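
A quick way to spot such duplicates is to compare the service names defined on both sides. The following is only a rough sketch: the regular expression is a simplification and not a real parser for the Icinga 2 configuration language, it ignores which host a service belongs to, and the two sample configurations (including the `host.vars.client_endpoint` custom variable) are made up for illustration.

```python
import re

# Simplified pattern: matches `apply Service "name"` and `object Service "name"`.
# This is NOT a full parser for the Icinga 2 config language.
SERVICE_RE = re.compile(r'\b(?:apply|object)\s+Service\s+"([^"]+)"')


def service_names(config_text):
    """Return the set of service names found in a config snippet."""
    return set(SERVICE_RE.findall(config_text))


# Hypothetical content of the client's conf.d/services.conf
client_conf = '''
apply Service "swap" {
  check_command = "swap"
  assign where host.name == NodeName
}
apply Service "load" {
  check_command = "load"
  assign where host.name == NodeName
}
'''

# Hypothetical service definitions on the master
master_conf = '''
apply Service "swap" {
  check_command = "swap"
  command_endpoint = host.vars.client_endpoint
  assign where host.vars.client_endpoint
}
apply Service "disk-data" {
  check_command = "disk"
  command_endpoint = host.vars.client_endpoint
  assign where host.vars.client_endpoint
}
'''

# Names defined on both sides are candidates for mixed-up check results.
collisions = sorted(service_names(client_conf) & service_names(master_conf))
print(collisions)
```

Any name reported here is defined both locally on the client and on the master, which is the situation described above.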

How to reproduce:

  1. Install two machines and connect them together as client/master.
  2. On the client, edit /etc/icinga2/conf.d/services.conf and change the "load" service to execute the "procs" check instead of the "load" check.
  3. On the master, add a service called "load" that uses the "load" check and apply it to your client.
  4. See screenshot.
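
The effect of these steps can be illustrated with a toy model. This is not Icinga 2 code: the `Master` class, `process_check_result`, the host name `client1`, and the state values are all hypothetical. The only assumption taken from the report is that results are matched by host and service name, regardless of which object produced them.

```python
from dataclasses import dataclass, field


@dataclass
class Master:
    """Toy stand-in for a master that files results by "host!service" name only."""
    history: dict = field(default_factory=dict)

    def process_check_result(self, host, service, state, source):
        # The source of the result is recorded but not used for matching.
        key = f"{host}!{service}"
        self.history.setdefault(key, []).append((source, state))


def count_state_changes(results):
    """Count how often consecutive results differ in state."""
    states = [state for _, state in results]
    return sum(1 for a, b in zip(states, states[1:]) if a != b)


master = Master()
for minute in range(3):
    # Result of the check defined on the master ("load" check, thresholds not reached)
    master.process_check_result("client1", "load", "OK", "master-defined load check")
    # Result of the client's local "load" service, which really runs "procs"
    master.process_check_result("client1", "load", "CRITICAL", "client-defined procs check")

changes = count_state_changes(master.history["client1!load"])
print(changes)
```

With two sources reporting under the same name once per minute each, the stored state changes with almost every result, which matches the flapping seen twice a minute.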

[1] Important: I only know that it started with the upgrade. I have not yet been able to tie it to a specific icinga2 version.

Attachments


Relations:

@icinga-migration

Updated by mfriedrich on 2017-01-14 12:41:41 +00:00

That's a known issue when using the command endpoint, which is why we recommend completely disabling the "conf.d" inclusion on your client. The docs should guide you there as well.

@icinga-migration

Updated by bebehei on 2017-01-14 12:48:19 +00:00

Hi, thanks for the fast answer.

I know that the fix on my side is to empty conf.d on the clients. I will do this, of course. But I consider this only a quick fix.

To be honest, I would like to see this bug fixed in icinga2 itself.

@icinga-migration

Updated by mfriedrich on 2017-01-14 13:03:19 +00:00

I don't see an easy way to fix this without breaking other things inside the cluster design. My guess (and it really is only a guess) is that these problems will be handled once we finally remove the deprecated "node update-config" bottom-up mode. That mode requires locally configured objects to send their check results back to the master, which happily accepts them if the object has the same name. I'm not sure how to make the master aware that such an object update does not belong in its zone.

I'm leaning more toward making conf.d/ inclusion optional via the setup wizard. But again, that involves removing the bottom-up mode. We promised to keep that mode for a year or two major releases, which means 2.9 could finally remove it. Meanwhile, I strongly suggest editing icinga2.conf and removing the conf.d include directive.
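
For anyone who manages clients with a script, the suggested edit could be automated roughly as follows. This is a minimal sketch, assuming the client's icinga2.conf contains the stock `include_recursive "conf.d"` line; the helper name `disable_conf_d` and the sample text are made up for illustration, and the function only transforms a string, so reading and writing the actual file is left to the caller.

```python
import re

# Matches an active (not yet commented) `include_recursive "conf.d"` line.
INCLUDE_RE = re.compile(r'^([ \t]*)(include_recursive[ \t]+"conf\.d".*)$', re.MULTILINE)


def disable_conf_d(config_text):
    """Comment out the conf.d include while keeping the line for reference."""
    return INCLUDE_RE.sub(r'\1// \2', config_text)


# Hypothetical excerpt of a client's icinga2.conf
sample = 'include "constants.conf"\ninclude_recursive "conf.d"\n'

result = disable_conf_d(sample)
print(result)
```

Lines that are already commented out do not match the pattern, so running the helper again after a package upgrade leaves an already patched file unchanged.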

@icinga-migration

Updated by mfriedrich on 2017-01-14 13:04:49 +00:00

  • Relates set to 13257

@icinga-migration icinga-migration added the bug Something isn't working label Jan 17, 2017
@dnsmichi dnsmichi added the area/distributed Distributed monitoring (master, satellites, clients) label Jan 19, 2017
@gunnarbeutner gunnarbeutner added area/configuration DSL, parser, compiler, error handling area/documentation End-user or developer help and removed bug Something isn't working labels Feb 7, 2017
@dnsmichi

I've added that to the troubleshooting docs recently: #5487. The rest will be resolved once we purge away the bottom up mode and make conf.d inclusion optional (defaults to no for client setups then).
