[dev.icinga.com #13445] concurrent_checks in CheckerComponent not working when using command_endpoint #4841
Comments
The problem still exists. Is there anything we can do to help with debugging?
We are running a similar Icinga 2 topology, and the "concurrent_checks" parameter doesn't work:
On all nodes: Build information: Config
Example of service:
We're using concurrent_checks = 50 on the master and the satellites. Every reload of the master pushes load averages above 100 on the satellites (VPS with 6 vCPUs and 4 GB RAM), no matter which value concurrent_checks is set to. -pb
Update: All nodes run r2.7.0-1 now and the problem still exists. Some time after every reload of the master, I can see over 400 checks per second on the satellites. If we try to run all checks (about 12000), it ends with: Remote Icinga instance 'satelliteX' is not connected to 'master', and the master node starts sending false-positive notifications. In production, we had to disable some checks to avoid overloading the satellite nodes. Content of checker.conf on all nodes: /etc/icinga2/features-enabled/checker.conf
This issue has been migrated from Redmine: https://dev.icinga.com/issues/13445
Created by thisismyname on 2016-12-07 08:52:37 +00:00
Assignee: (none)
Status: New
Target Version: (none)
Last Update: 2016-12-07 08:52:37 +00:00 (in Redmine)
It seems that the limit on the number of checks that can run simultaneously, which is set with the "concurrent_checks" parameter, doesn't work when using command_endpoint.
You can find a sketch of the setup attached.
The problem is the high load that occurs when we reload the master on the satellite server (mon3), which executes checks on network devices.
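For readers unfamiliar with the two settings discussed here, the following is a minimal sketch of how they might look in an Icinga 2 configuration. It is not the reporter's actual configuration: the host name `network-device-1` and the check command `ping4` are hypothetical, and only the endpoint name `mon3` and the value 50 are taken from this thread.

```
// Assumed location: /etc/icinga2/features-enabled/checker.conf
// Intended to cap the number of checks running at the same time.
object CheckerComponent "checker" {
  concurrent_checks = 50
}

// Hypothetical service, for illustration only: the check is scheduled
// on the master but executed on the satellite named in command_endpoint.
object Service "ping4" {
  host_name = "network-device-1"   // hypothetical host object
  check_command = "ping4"          // hypothetical choice of check command
  command_endpoint = "mon3"        // satellite that runs the check
}
```

The behavior reported here is that the limit in the first object does not appear to apply to checks dispatched through `command_endpoint`, as in the second object.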
Attachments