[dev.icinga.com #13971] cgroup: fork rejected by pids controller in /system.slice/icinga2.service #4918
Comments
Updated by Skap1981 on 2017-01-12 14:12:54 +00:00
Additional Info: package libboost_system1_54_0: If I try to update Icinga2 to 2.6, I get a dependency error:
Updated by mfriedrich on 2017-01-12 14:17:18 +00:00
This sounds like a resource limit introduced by systemd or the cgroups. https://lwn.net/Articles/663873/ Can you verify that SLES12SP2 does not accidentally enable this feature? The chrono dependency was introduced by SP2. See #13671 for details.
Updated by Skap1981 on 2017-01-12 14:42:35 +00:00 mfriedrich wrote:
Not accidentally, I guess. Now I have to figure out how to handle the cgroup.
OK, SLES 12 SDK. Thanks, I'm able to update. First I have to resolve the cgroup problem.
Updated by Skap1981 on 2017-01-12 21:45:15 +00:00
Problem solved. I set DefaultTasksMax to infinity. Not the best way, but until I get a better solution, this works. Rescheduling 1500 services works fine without any crash.
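For readers hitting the same limit, here is a minimal sketch of what this workaround might look like. The comment does not say where the value was set; this assumes the global manager configuration in `/etc/systemd/system.conf`, so the file path is an assumption. Note that it lifts the limit for every unit on the host, which is presumably why it is called "not the best way".

```ini
# /etc/systemd/system.conf  (assumed location; the comment above does not name the file)
[Manager]
# Removes the default per-unit task limit for ALL units, not only Icinga 2.
DefaultTasksMax=infinity
```

A change to the manager configuration typically takes effect only after the manager is re-executed (`systemctl daemon-reexec`) or the host is rebooted, and the affected service usually needs a restart as well.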
Updated by mfriedrich on 2017-01-13 07:37:05 +00:00
Hmm, so yet again it seems to be systemd-related. Do you think that this should be added to the troubleshooting section in the docs?
Updated by Skap1981 on 2017-01-13 07:49:15 +00:00 mfriedrich wrote:
Yes, it is systemd-related. The parameter DefaultTasksMax is set to 512 by default, but in large Icinga2 environments 512 tasks can be a limiting factor. A hint in the troubleshooting section could be helpful.
Updated by mfriedrich on 2017-01-13 08:19:29 +00:00
Ok, thanks, I'll have a look.
Should we perhaps update our default systemd unit file to set that parameter to a reasonable value?
I am having the same problem, and yes, it seems that your unit file must be modified to set TasksMax= to a reasonable value. Also, keep in mind that this feature was introduced in systemd version 226. Please read my bug report for another piece of software with the same problem: https://bugs.schedmd.com/show_bug.cgi?id=3526
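As a narrower alternative to changing the global default, the limit can be set for the Icinga 2 service alone through a drop-in file. The sketch below is an illustration, not the project's shipped unit file: the drop-in file name `limits.conf` is hypothetical, and `infinity` simply mirrors the workaround reported earlier in this thread. A finite number would also work, but the thread does not say what a "reasonable value" would be.

```ini
# /etc/systemd/system/icinga2.service.d/limits.conf  (file name is illustrative)
[Service]
# Overrides the task limit for icinga2.service only.
# Requires a systemd version that supports TasksMax= (this thread cites 226).
TasksMax=infinity
```

After adding the file, `systemctl daemon-reload` followed by `systemctl restart icinga2` should apply it, and `systemctl show icinga2 -p TasksMax` can be used to check the effective value.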
Thanks for the insights, much appreciated. So we would need some sort of distribution-specific unit files (and let external packagers know about it). Or we'll enhance CMake to generate the service file, like your proposed patch for autotools in the linked ticket. Meanwhile, I'll add a note to the troubleshooting docs to give others a quick workaround until we resolve this issue.
Add troubleshooting hints for cgroup fork errors refs #4918
Users might open issues, but hopefully will find this one first.
There is a PR which raises the limit: systemd/systemd#3753 but I doubt that this has hit SLES already. https://github.com/systemd/systemd/blob/dd050decb6ad131ebdeabb71c4f9ecb4733269c0/NEWS#L60
This solves the problem with systemd >= 226 and fork errors with Icinga 2. Seen on SLES 12 SP2. fixes #4918
I'm setting this to 2.7.1 as this affects many users/customers.
This issue has been migrated from Redmine: https://dev.icinga.com/issues/13971
Created by Skap1981 on 2017-01-12 13:11:08 +00:00
Assignee: mfriedrich
Status: Assigned
Target Version: (none)
Last Update: 2017-01-13 08:19:29 +00:00 (in Redmine)
After updating from SLES12 SP1 to SLES12 SP2, Icinga2 crashes after a short time.
In messages I see the following entries:
2017-01-12T11:55:40.742685+01:00 mgtmon035 kernel: [65567.582895] cgroup: fork rejected by pids controller in /system.slice/icinga2.service
2017-01-12T11:55:43.246611+01:00 mgtmon035 kernel: [65570.086553] icinga2[124779]: segfault at 7fff0001a2df ip 00007ffff43f9d44 sp 00007ffff7ebcab0 error 4 in libc-2.22.so[7ffff433f000+19a000]
2017-01-12T11:55:45.129162+01:00 mgtmon035 systemd[1]: icinga2.service: Main process exited, code=killed, status=6/ABRT
2017-01-12T11:55:45.583138+01:00 mgtmon035 systemd[1]: icinga2.service: Unit entered failed state.
2017-01-12T11:55:45.583354+01:00 mgtmon035 systemd[1]: icinga2.service: Failed with result 'signal'.
icinga2.log shows many critical errors:
[2017-01-12 11:55:41 +0100] critical/checker: Exception occured while checking 'HOSTxxx!Service_xxx': Error: Function call 'fork' failed with error code 11, 'Resource temporarily unavailable'
Stack trace attached.
Attachments