[dev.icinga.com #6450] ipmi-sensors segfault due to stack size #1674
Comments
Updated by mfriedrich on 2014-06-10 12:34:07 +00:00
What does the executed command look like in the logs ('notice' severity)? This doesn't sound like a bug to me, but rather a (plugin) configuration issue. |
Updated by mfriedrich on 2014-06-10 12:34:45 +00:00
|
Updated by mfriedrich on 2014-06-10 12:34:58 +00:00
|
Updated by dennisp on 2014-06-10 12:40:03 +00:00
[2014-06-10 14:39:19 +0200] notice/Process: Running command '/usr/lib/nagios/plugins/check_ipmi_sensor', '-H', '192.168.12.100', '-T', 'temperature', '-U', 'monitoring', '-P', 'xxx', '-L', 'user': PID 25529 |
Updated by dennisp on 2014-06-10 12:44:28 +00:00
When I run the command directly in the shell, it works. When Icinga 2 runs it, I get the error from my report. |
Updated by mfriedrich on 2014-06-10 13:00:00 +00:00
I'm not sure why you're adding the new variable like so. Obviously your macro value is resolved to null. Try
or direct access
Oh, and omit the commas at line end. You'll only need them as array separators. |
Updated by dennisp on 2014-06-10 13:16:22 +00:00
I tried what you wrote and I get the same problem:
template Host "idrac7-server" {
apply Service "MEMORY" {
check_command = "check_ipmi_dell",
assign where "idrac7-server" in host.templates,
[2014-06-10 15:12:38 +0200] notice/Process: Running command '/usr/lib/nagios/plugins/check_ipmi_sensor', '-H', '192.168.x.xx', '-T', 'Memory', '-U', 'monitoring', '-P', 'xxx', '-L', 'user': PID 32152
long_output = 'Sensor Type(s) Memory Status: \\n FreeIPMI returned an empty header map (first line) FreeIPMI could not find any sensors for the given sensor type (option \'-T\').', |
Updated by mfriedrich on 2014-06-10 14:25:54 +00:00
Hm. Ok. Then everything is working as expected in regards of Icinga 2 executing the command from your configuration. I'd rather check whether the sensor type "Memory" really exists. Since this now really sounds like a configuration or plugin problem, please continue on the mailing lists or in the forums, where other users can read along and help as well. Please provide your manual tests and their output there too; most likely your tests and configs do not match. |
Updated by WhoCares on 2014-07-09 13:54:06 +00:00
Since I ran into the very same problem today, I'd like to point out that I don't think that "everything is working as expected in regards of Icinga 2 executing the command from your configuration".
The initial problem is a direct result of the binary "ipmimonitoring" or "ipmi-sensors" segfaulting when it is called from the check_ipmi_sensor script while running from within Icinga 2. Thus, an empty output is returned to the check_ipmi_sensor script, which in turn leads to the errors listed above. When check_ipmi_sensor is run from the command line, everything is fine.
So I believe that there's something in the environment set up by Icinga 2 that doesn't play well with the "ipmi-sensors" binary. Probably just some memory limit that may need to be raised, but I haven't had time to dig into the code and take a look.
Here's an excerpt from my syslog:
Are there any settings or known limitations when running check_commands? Or any major differences compared to running under plain old "/bin/sh"? |
Updated by gvegidy on 2014-07-09 14:09:59 +00:00
Do you have SELinux enabled on your system? If so, please try permissive mode or have a look at the audit log. I remember having a similar problem where the called program did not correctly handle the permission-denied response given by SELinux. |
Updated by WhoCares on 2014-07-10 05:50:58 +00:00
Nope, no SELinux on that system. Standard Debian Wheezy with the backports repo active and in use. |
Updated by dennisp on 2014-07-10 07:45:22 +00:00
Good to hear I am not alone. Same here: no SELinux or AppArmor. |
Updated by WhoCares on 2014-07-10 07:48:44 +00:00
Same segfaults? |
Updated by dennisp on 2014-07-10 07:50:39 +00:00
Yes, on Ubuntu 14 LTS. Running them from the command line, everything is fine. |
Updated by mfriedrich on 2014-07-10 07:59:06 +00:00
WhoCares wrote:
According to the original check output, it rather looks like an issue with the plugin itself.
I'm not sure how this check plugin handles "uninitialized values" that might cause the ipmi binary to segfault. Segfaults should never happen, so I would rather dive into debugging the plugin and its calls to the binary.
I'd like to see your manual tests, including users and environment.
It sounds like one needs a gdb wrapper around the ipmi-sensors call to get a full backtrace and an idea of what's wrong here. |
Updated by WhoCares on 2014-07-10 08:12:52 +00:00
dnsmichi wrote:
You got it backwards ;) Let me explain:
or
to determine the version of FreeIPMI installed on the system. On Debian 7/ Ubuntu 14 LTS the latter will be used. Here's the whole function from the perl script where it is failing:
Now, the segfault of the external call on the 4th line will result in the variable `@ipmi_version_output` being empty which in turn leads to the error in line 6 where the regex splitting of $ipmi_version is normally about to happen. At least that's what I determined when actually debugging the perl stuff.
You're mixing up cause and result here. It's not the script that sends an uninitialized output to the binary. It's the binary that doesn't send anything back (due to the segfault) which then causes the "uninitialized variable" error message.
You're very welcome. If you want me to, I'll set up a test system with ssh access so you can have a look for yourself. The plugin itself is pretty straightforward. It could be simplified a fair bit, since it still checks for FreeIPMI versions from the stone age, but apart from that there's nothing spectacular or fancy in there.
|
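The cause-and-effect chain described above (the binary dies from a signal, its output is empty, and the caller then fails while parsing) can be made visible by checking the exit status before touching the output. The following is only a minimal sketch in Python, not the plugin's actual Perl code; the function name `run_and_check` is hypothetical.

```python
import signal
import subprocess


def run_and_check(argv):
    """Run argv and return its stdout lines; raise if it was killed by a signal.

    Hypothetical helper for illustration, not part of check_ipmi_sensor.
    """
    proc = subprocess.run(argv, capture_output=True, text=True)
    if proc.returncode < 0:
        # subprocess reports "killed by signal N" as returncode == -N.
        try:
            sig = signal.Signals(-proc.returncode).name
        except ValueError:
            sig = f"signal {-proc.returncode}"
        raise RuntimeError(f"{argv[0]} was killed by {sig}; there is no output to parse")
    return proc.stdout.splitlines()
```

With a check like this, a crash of the external binary surfaces as "killed by SIGSEGV" instead of as an "uninitialized value" warning further down in the parsing code.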
Updated by WhoCares on 2014-07-10 08:40:42 +00:00
Ok, here we go. First the various config snippets:
Command definition:
Service Application:
Example Host Definition:
This should result in a command like:
Running said command directly on the command line gives me:
Doesn't matter whether I run this as
Now using the Icinga 2 config as given above, I see this in the
So I think we can now at least agree that the "check_ipmi_sensor" script is not the problem. |
Updated by mfriedrich on 2014-07-10 08:49:14 +00:00
Ok, thanks. That sounds like a similar issue to #6588. Could you test the current snapshot builds, where a fixed stack size is already applied? |
Updated by WhoCares on 2014-07-10 09:00:35 +00:00
Updated Icinga 2 to this:
But still got that:
|
Updated by gbeutner on 2014-07-11 07:52:04 +00:00
|
Updated by WhoCares on 2014-07-11 07:54:21 +00:00
Good timing ;) I just updated to |
Updated by gbeutner on 2014-07-11 09:20:51 +00:00
|
Updated by gbeutner on 2014-07-11 09:21:29 +00:00
Required changes:
|
Updated by tobiasvdk on 2014-07-17 14:26:08 +00:00
I also get these segfaults. I'm running Icinga v2.0.1-11-g263f198 with 247 IPMI checks on Debian 7.6. Although these segfaults are occurring, (some of?) these checks are working, because I see valid (OK) output in the web UI. |
Updated by mfriedrich on 2014-07-17 14:35:32 +00:00
You could try setting the stack size manually and then calling the ipmi-sensors binary. Posting your results and the value that works will help find a better solution.
|
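As a rough illustration of "setting the stack size manually", the sketch below starts a child process with a specific soft stack limit (`RLIMIT_STACK`), which is the same limit that `ulimit -s` changes in a shell. It is an assumption-laden example and not Icinga 2 code: the function name `run_with_stack_limit` is hypothetical, and the limit is given in KiB to match the `ulimit -s` convention.

```python
import resource
import subprocess


def run_with_stack_limit(argv, stack_kib):
    """Run argv with its soft RLIMIT_STACK set to stack_kib KiB (hypothetical helper)."""

    def set_limit():
        # Runs in the child between fork() and exec(); the hard limit is left untouched.
        _, hard = resource.getrlimit(resource.RLIMIT_STACK)
        soft = stack_kib * 1024
        if hard != resource.RLIM_INFINITY:
            soft = min(soft, hard)  # the soft limit may not exceed the hard limit
        resource.setrlimit(resource.RLIMIT_STACK, (soft, hard))

    return subprocess.run(argv, capture_output=True, text=True, preexec_fn=set_limit)
```

Calling it with the plugin's binary and a few different values, then comparing the return codes, is one way to collect the results asked for here.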
Updated by tobiasvdk on 2014-07-18 09:31:13 +00:00
dnsmichi wrote:
It's the same situation as dennisp's: running the command in the shell works without a segfault. Having only one check works.
I will try to figure out how many checks I can have configured... |
Updated by WhoCares on 2014-07-18 09:57:59 +00:00
Just ran this:
And came to this:
This was directly from the command line. If time permits I'm going to try from within Icinga later today. |
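A test like this can also be automated. The sketch below tries a list of candidate stack sizes in ascending order and reports the first one at which a command exits cleanly. It is a hedged example only: the function name `smallest_working_stack` and the candidate values are assumptions, and it is not the command that was actually run here.

```python
import resource
import subprocess


def smallest_working_stack(argv, candidates_kib):
    """Return the smallest stack size in KiB at which argv exits with status 0, else None."""
    for kib in sorted(candidates_kib):

        def set_limit(kib=kib):
            # Executed in the child before exec(); only the soft limit is changed.
            _, hard = resource.getrlimit(resource.RLIMIT_STACK)
            soft = kib * 1024
            if hard != resource.RLIM_INFINITY:
                soft = min(soft, hard)
            resource.setrlimit(resource.RLIMIT_STACK, (soft, hard))

        try:
            proc = subprocess.run(argv, capture_output=True, preexec_fn=set_limit)
        except (OSError, subprocess.SubprocessError):
            continue  # the child could not even be started with this limit
        if proc.returncode == 0:
            return kib  # a negative returncode would mean "killed by a signal"
    return None
```

A call such as `smallest_working_stack(["ipmi-sensors", "--version"], [512, 1024, 2048, 4096])` would then give a single number to report back, assuming the binary is on the PATH.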
Updated by tobiasvdk on 2014-07-18 12:28:17 +00:00
|
Updated by WhoCares on 2014-07-18 12:33:06 +00:00
Strange. Makes me wonder why it is working for me at 2048K and for you at 4096K. Here's what I have:
|
Updated by tobiasvdk on 2014-07-18 12:38:20 +00:00
The "strange" thing is that, although these segfaults occur, the Icinga checks (or only some of them, randomly?) return correct output, as seen in the web UI. |
Updated by WhoCares on 2014-07-18 12:40:39 +00:00
I would think it's either random or pure luck on your side. It never returned anything useful for me with > 75 checks running at any given time. |
Updated by tobiasvdk on 2014-07-18 12:46:20 +00:00
WhoCares wrote:
|
Updated by WhoCares on 2014-07-18 12:56:39 +00:00
Thought as much. So my manually built 1.4.4 seems to reduce the needed stack size. |
Updated by gbeutner on 2014-07-21 08:22:47 +00:00
|
Updated by gbeutner on 2014-07-21 11:35:16 +00:00
Applied in changeset 5dcf1a7. |
Updated by gbeutner on 2014-07-21 11:36:03 +00:00
Please recheck if my latest patch fixes this issue. |
Updated by WhoCares on 2014-07-21 11:59:42 +00:00
Works fine for me, no more segfaults. |
Updated by tobiasvdk on 2014-07-28 07:15:34 +00:00
Same for me ... works fine. |
This issue has been migrated from Redmine: https://dev.icinga.com/issues/6450
Created by dennisp on 2014-06-10 12:27:36 +00:00
Assignee: gbeutner
Status: Resolved (closed on 2014-07-21 11:35:16 +00:00)
Target Version: 2.0.2
Last Update: 2014-07-28 07:15:34 +00:00 (in Redmine)
I added a command with:
The command does not work as it should. The debug log shows:
When I add -vvv to the check ipmi command, the correct arguments are shown, and the command works when I execute it directly in the shell.
What can I do to get this to work?
Changesets
2014-07-21 11:33:01 +00:00 by gbeutner 5dcf1a7
Relations: