[dev.icinga.com #13567] SIGPIPE shutdown on config reload #4867
Comments
Updated by seferovic on 2016-12-14 15:02:28 +00:00 Hi... I can confirm and reproduce this behavior... Icinga Director "kills" my master instance on deployment (I deployed a UserGroup object :) )
|
Updated by mfrosch on 2016-12-14 15:42:22 +00:00 What's the systemd output / status after that exit? Icinga 2 is doing a restart internally, meaning a new process comes up and replaces the old process in sync. The full log of one of my servers is:
The re-connection could be faster possibly, but the service restarts properly... |
Updated by mfrosch on 2016-12-14 15:42:30 +00:00
|
Updated by mfriedrich on 2016-12-14 15:48:35 +00:00
Works on CentOS 7 with Systemd, as well as Debian Wheezy in a Docker container with SystemVinit too.
Any specific OS version/release you're using this on? |
Updated by Markus on 2016-12-14 15:50:36 +00:00 I posted the last line of the log. There is no more icinga process running. It is like you would stop icinga manually. Status of icinga is
|
Updated by mfriedrich on 2016-12-14 15:51:18 +00:00 Please add the entire output of "icinga2 --version" (both of you). |
Updated by Markus on 2016-12-14 15:55:29 +00:00 Version infos are already in my first post. |
Updated by mfriedrich on 2016-12-14 15:58:49 +00:00 Ah, yes. So you are using SystemVinit instead of the Systemd default on Debian Jessie, right? |
Updated by Markus on 2016-12-14 16:02:22 +00:00 Yes, sysvinit is used on both systems. |
Updated by mfrosch on 2016-12-14 16:09:19 +00:00 Tried it with the bare initscript on Jessie:
The satellite's log ends with:
|
Updated by mfrosch on 2016-12-14 16:10:58 +00:00
|
Updated by rwaffen on 2016-12-14 16:17:32 +00:00 can confirm this on CentOS 6.8 ... with initV ... service icinga2 reload stops the icinga2 daemon
|
Updated by mfrosch on 2016-12-14 16:55:14 +00:00
To sum up the problem: when the icinga2 daemon is started in the foreground, or by sysvinit, and a reload is issued, Icinga shuts down and the new process does not take over. Tested on Debian Jessie:
Interesting parts of the strace:
This does not happen when:
|
Updated by seferovic on 2016-12-14 18:48:52 +00:00 Sorry for not posting the info before...
If you need anything else (mfrosch has a pretty good analysis) I am willing to provide feedback... |
Updated by seferovic on 2016-12-14 23:34:08 +00:00 Interesting behavior.. if I clean out the conf.d directory and only leave api-users.conf and app.conf .. the reload is working as expected. |
Updated by mfriedrich on 2016-12-15 10:52:15 +00:00
|
Updated by gbeutner on 2016-12-15 10:58:02 +00:00 The problem is that the reading side of the pipes that are passed to the new Icinga child process for stdout/stderr is closed once the old Icinga process dies. This problem has been around for a while; however, we didn't notice it because we were ignoring SIGPIPE by default. This is no longer the case, at least until after the config validation is finished. Only then do we reset SIGPIPE to SIG_IGN. The result is that, depending on the length of the log output for the config validation, we might end up write()-ing to a pipe whose reading side has been closed. The SIGPIPE signal we get from that causes the new Icinga process to die. There are two possible fixes:
|
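The failure mode described in the comment above can be reproduced outside Icinga with a plain pipe. The following is a minimal sketch, not Icinga's code: `yes` is a stand-in for the new process writing its config validation output, `head -n 1` is a stand-in for the old process that exits and closes its end of the pipe, and `writer_status` is a helper name made up for this illustration.

```shell
#!/bin/sh
# Print the exit status of a writer whose reader goes away after one line.
# fd 3 carries the status out of the pipeline so that it can be captured.
writer_status() {
    { { yes "validation log line" 2>/dev/null; echo "$?" >&3; } \
        | head -n 1 >/dev/null; } 3>&1
}

# Writer runs with whatever SIGPIPE disposition this shell was started with.
default_status=$(writer_status)

# Writer runs with SIGPIPE explicitly ignored (inherited from the subshell).
ignored_status=$(trap '' PIPE; writer_status)

echo "inherited disposition: writer exit status $default_status"
echo "SIGPIPE ignored:       writer exit status $ignored_status"
```

If the shell was started with SIGPIPE at its default disposition, the first status is 141 (128 + 13), meaning the writer was killed by the signal. With the signal ignored, write() fails with EPIPE instead and the writer exits with an ordinary error status.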
Updated by mfriedrich on 2016-12-15 11:07:07 +00:00 Systemd ignores SIGPIPE by default, that's probably the reason why you only experience that with SystemVinit or running in foreground. https://lists.freedesktop.org/archives/systemd-devel/2014-August/021982.html |
Updated by mfrosch on 2016-12-15 11:10:29 +00:00 Quick fix for sysVinit: please insert into the initscript or default/sysconfig:
This ignores SIGPIPE for all spawned child processes and mimics the behavior of systemd.exec. |
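A rough sketch of how an ignored signal propagates from a wrapper script to its children, assuming a POSIX shell. This is a hypothetical excerpt, not the shipped icinga2 init script, and the child `sh -c` command is only a stand-in for the daemon. POSIX specifies the signal name without the `SIG` prefix, so `PIPE` (or the number 13 on Linux) is the spelling least likely to be rejected; whether `SIGPIPE` is accepted depends on the shell.

```shell
#!/bin/sh
# Hypothetical excerpt: assumed to sit near the top of an init script,
# before the daemon is started or reloaded.
trap '' PIPE    # ignore SIGPIPE; children inherit the ignored disposition

# Stand-in for the daemon: a child shell that sends SIGPIPE to itself
# and prints a marker only if it is still alive afterwards.
result=$(sh -c 'kill -s PIPE $$; echo survived')

echo "child after SIGPIPE: ${result:-killed}"
```

An ignored disposition survives both fork and exec, which is why a single `trap` in the wrapper is enough to cover the process it starts.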
Updated by mfriedrich on 2016-12-15 11:51:18 +00:00 The quick&dirty fix is available in git master & inside the snapshot packages (specifically Centos6 and Debian8 for you, tested CentOS7 over here). You can also reproduce the issue using the "icinga2x-cluster" Vagrant box using an old snapshot. The current revision will not have this behaviour as it already pulls in the latest snapshot packages. |
Updated by Markus on 2016-12-15 20:27:34 +00:00 mfrosch wrote:
I've tested the quick fix in initscript. Reload is working now. |
Updated by mfriedrich on 2016-12-16 08:41:35 +00:00
|
Updated by denny on 2016-12-16 10:20:08 +00:00 hi, I have the same problem:
===
The problem starts with the latest 2.5.x version. I disabled feature after feature to see when reload works again. Reload works if I disable the api feature.
|
Updated by mfriedrich on 2016-12-16 10:21:45 +00:00
Does the workaround work for you? I'd suggest switching to Systemd on Debian Jessie as well, but that probably results in a flame war. |
Updated by denny on 2016-12-16 10:37:14 +00:00 hi, I replaced line 158 (trap 'status=2;' INT # handle intr here) with trap "" SIGPIPE. It doesn't change the behavior. If I put it anywhere else: === === |
Updated by MarcoKl on 2016-12-17 08:31:02 +00:00 Hello,
at the beginning of the icinga2 init script. |
Updated by denny on 2016-12-18 16:22:15 +00:00 hi, thanks @marcokl works for me too :-) |
Updated by rnekkanti on 2016-12-19 20:38:54 +00:00 Hi.., I am running Ubuntu 14.04 with icinga2 2.6.0-1. I have tried updating the init script by adding trap "" 13 at the start of the init script and in line 158 but icinga2 does not seem to restart after deploying configs through director. It is just shutting down. [2016-12-19 20:35:21 +0000] information/ApiListener: New client connection from [127.0.0.1]:60518 (no client certificate) |
Updated by tgelf on 2016-12-21 10:38:34 +00:00
|
Updated by tgelf on 2016-12-21 10:38:41 +00:00
|
Updated by tgelf on 2016-12-21 10:38:47 +00:00
|
Updated by spillerm on 2016-12-21 10:56:11 +00:00 Hi, adding trap "" SIGPIPE at the beginning of init script solves it for me using Ubuntu 14.04.5 LTS. Cheers, Marianne |
Updated by tgelf on 2016-12-21 10:57:37 +00:00 spillerm wrote:
Good to hear that!
Well... sorry for the bug. |
Updated by rnekkanti on 2016-12-21 17:23:56 +00:00 Adding trap "" SIGPIPE to the init script does not seem to work when deploying configs through director. Icinga2 is still failing to restart. [2016-12-21 17:19:26 +0000] information/ApiListener: New client connection from [127.0.0.1]:35286 (no client certificate) |
Updated by Nagodar on 2016-12-22 08:18:30 +00:00 Same bug I reported here I think: https://dev.icinga.com/issues/13317 |
Updated by mfrosch on 2016-12-22 10:49:54 +00:00
|
Updated by plarivee on 2016-12-22 16:38:27 +00:00 Had same issue, testing adding seems already more stable. Will report back later |
Updated by rnekkanti on 2016-12-22 18:21:32 +00:00 sig "" 13 on top of the icinga2 init script seems to fix the issue. |
Updated by plarivee on 2016-12-22 19:21:53 +00:00 I have been hammering it for the last hour with multiple deploys and it did not stop.
seems to be the way in Ubuntu 14.04 for now. |
Updated by marcof on 2016-12-23 08:39:10 +00:00 We installed the 2.6.0-2 version on Ubuntu Trusty and Xenial and RHEL7, this fixed the issue. one strange thing though: "icinga2 --version" still shows 2.6.0-1 but the bug is gone. |
Updated by mfrosch on 2016-12-25 10:20:02 +00:00 marcof wrote:
icinga2 - |
Updated by fugstrolch on 2017-01-02 11:44:48 +00:00 FYI: we have some CentOS 5 clients with Icinga2 V2.0.6. The Icinga2 process on these servers stopped when we changed our master's "global-templates" config. Adding
seems to solve this problem.
|
Updated by mfriedrich on 2017-01-09 15:20:48 +00:00
|
Updated by mfriedrich on 2017-01-11 16:43:19 +00:00
|
Updated by gbeutner on 2017-01-12 09:55:02 +00:00
Applied in changeset 751ca67. |
Updated by mfriedrich on 2017-01-12 09:55:38 +00:00
|
Hi, on Ubuntu 14.04.5 LTS I have been experiencing the same problem for about a week, and it is driving me crazy. In the init script I've found: # Block/ignore SIGPIPE inside Icinga2 But the problem is still there. |
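When the init script already contains the trap and a reload still fails, it may help to check which SIGPIPE disposition a running process actually inherited. The following is a Linux-only sketch that assumes `/proc` is mounted; `ignores_sigpipe` is a helper name made up here and is not part of icinga2.

```shell
#!/bin/sh
# Linux-only sketch: succeed if the process with the given PID ignores SIGPIPE.
ignores_sigpipe() {
    # SigIgn is a hexadecimal bitmask; signal N is represented by bit N-1.
    mask=$(awk '/^SigIgn:/ { print $2 }' "/proc/$1/status" 2>/dev/null)
    [ -n "$mask" ] || return 2
    # Keep only the last four hex digits; SIGPIPE (13) is bit 12 = 0x1000.
    low=${mask#"${mask%????}"}
    [ $(( 0x$low & 0x1000 )) -ne 0 ]
}

if ignores_sigpipe "$$"; then
    echo "this shell ignores SIGPIPE"
else
    echo "this shell does not ignore SIGPIPE (or /proc is unavailable)"
fi
```

For a daemon, pass its PID instead of `$$`, for example the value printed by `pidof icinga2`. Note that this only shows the disposition at the moment of the check; a process can change it later, as the developer comment in this thread describes for the period of config validation.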
This issue has been migrated from Redmine: https://dev.icinga.com/issues/13567
Created by Markus on 2016-12-14 14:05:27 +00:00
Assignee: gbeutner
Status: Resolved (closed on 2017-01-12 09:55:02 +00:00)
Target Version: 2.6.1
Last Update: 2017-01-12 09:55:38 +00:00 (in Redmine)
Today we updated our icinga2 from v2.5.4 to the latest 2.6.0-1.
We have master <-> satellite config with one master, one satellite in a remote site and multiple clients connected.
On every config reload icinga2 does a shutdown but does not start up again. When we start icinga2 again everything works as it did with version 2.5.4. We have not made any config changes during the update.
The same happens when we do a restart (not reload!) after config changes on the satellite. The satellite receives a reload signal via API call and shuts down.
Any help is welcome.
Markus
icinga.log on master
icinga.log on satellite
icinga.log on satellite after config change
Version information of master system:
Version information of satellite system:
Attachments
Changesets
2016-12-15 10:47:07 +00:00 by mfriedrich fb8f410
2017-01-12 09:50:43 +00:00 by gbeutner 751ca67
Relations: