[dev.icinga.com #13655] Crash - Error: parse error: premature EOF #4882
Comments
Updated by erikandersonmx on 2016-12-16 21:02:14 +00:00 Full debug log https://gist.github.com/erikanderson/4cb950e2ee6888ea98a49e47ba02ca6a |
Updated by erikandersonmx on 2016-12-16 21:59:38 +00:00 from kern.log Dec 16 14:55:36 sd-prod-icinga1 kernel: [16255.139654] icinga2[31719]: segfault at 7f3800005587 ip 00007f38e50a8814 sp 00007f38c9ea8340 error 4 in libc-2.19.so[7f38e4fe7000+1ba000] |
Updated by erikandersonmx on 2016-12-16 23:01:05 +00:00 This seems to be the same issue: https://dev.icinga.com/issues/13173 |
Updated by erikandersonmx on 2016-12-19 23:35:32 +00:00 I think the root cause of this was the check_rabbitmq plugin from here: https://github.com/CaptPhunkosis/check_rabbitmq We have a large number of queues, so my theory is that when the plugin reported a recovery it sent a large list of queues that were OK and overloaded something within icinga. We haven't had this issue in the past, before 2.6.0, but I don't have any hard data on that. I think this can be closed. |
Updated by gvde on 2016-12-21 08:01:59 +00:00 We see the exact same problem on our server since the update last week: 5 crashes within 7 days, two of them this morning within two hours. erikandersonmx wrote:
We don't use this plugin. I doubt it's related to that.
I don't think so. IMHO it has nothing to do with that plugin. And even if it were related to the plugin: no plugin should ever be able to break the icinga2 server, regardless of what it returns. This needs to be fixed. |
Updated by gvde on 2016-12-21 08:14:44 +00:00 Here is my full crash report:
Caught unhandled exception.
Application information:
System information:
Build information:
Error: parse error: premature EOF
(0) libbase.so.2.6.0: (+0xa4a25) [0x7f22a980ca25] |
Updated by fallback on 2016-12-22 09:56:41 +00:00 I got a similar problem. Unfortunately there is no other error/warning/suspicious message before the crash. Everything went smooth till then.
After this, the host (satellite) loses the connection to its master.
|
Updated by n0braist on 2016-12-22 11:05:58 +00:00 Hi, same for me, but at the moment only on SLES in an HA cluster environment: permanent crashes after a few minutes. example1:
example2:
|
Updated by n0braist on 2016-12-22 11:22:56 +00:00 Little correction: also on non-clustered single-node machines. |
Updated by erikandersonmx on 2016-12-22 18:50:25 +00:00 Thanks for the confirmation that it is most likely not that one specific plugin. I just had a similar crash on a node that had an updated version of the check, so there is something else going on here. I'm also going to see if I can get the previous version of icinga2 somewhere (it's been nuked from the PPA, of course) so I can revert for the holidays. |
Updated by erikandersonmx on 2016-12-22 20:14:15 +00:00 Based on the error I am looking at this change to see how it might impact this: https://dev.icinga.com/projects/i2/repository/revisions/14ea2596c5cc9a2622f3ade2819cff86cc2eec71/diff/lib/base/json.cpp |
Updated by erikandersonmx on 2016-12-22 20:18:42 +00:00 This is the issue for that change that is referenced in the changelog: https://dev.icinga.com/issues/12538 |
Updated by erikandersonmx on 2016-12-22 20:31:12 +00:00 Nm, that is most likely not the issue; I dug into the for-loop change. From what I have seen so far, it seems to be happening right after a notification is sent, so I am going to zero in on changes made to notifications. |
Updated by erikandersonmx on 2016-12-22 20:35:52 +00:00 These are the changes made for notifications, and they seem like prime candidates for the source of this: https://dev.icinga.com/issues/12718 |
Updated by n0braist on 2016-12-22 21:06:47 +00:00 In my case it seems to be a problem during notification, or more precisely when different notifications for one problem are defined at the same time (multiple notification definitions). On the other hand, a hot candidate is hipsaint (HipChat), which is called in a notification, but I cannot confirm that. Still investigating, but only later, once everything is running again.... |
Updated by fallback on 2016-12-23 06:29:36 +00:00 I'm currently running with the default mail notification scripts and one custom SMS notification script written in bash. I think it has nothing to do with hipsaint/hipchat. More likely it has to do, as @erikandersonmx said, with the fact that multiple different notifications are defined. |
Updated by gvde on 2016-12-23 07:35:32 +00:00 n0braist wrote:
I don't know how it's organized on SLES, but are your notification scripts (i.e. mail-host-notification.sh, etc.) in /usr/lib/nagios/ or are they in /etc/icinga2/scripts like on CentOS? |
Updated by dominicpratt on 2016-12-23 07:48:57 +00:00 We're experiencing the same error on a Debian Jessie system without HipChat or anything like that; we're only using the standard notification scripts in /etc/icinga2/scripts. |
Updated by n0braist on 2016-12-23 08:08:44 +00:00 Hi, my scripts are in /usr/lib/nagios/ |
Updated by greatexpectations on 2016-12-24 09:47:56 +00:00 Hi, as a temporary workaround on RHEL/CentOS 7 (or any other distribution that uses systemd), you may want to modify the systemd unit file for icinga2 so that the service is automatically restarted in case of a crash.
Then add the Restart=always line in the [Service] section, e.g. as in the sketch below. Kind regards |
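A minimal sketch of that [Service] addition, assuming the stock icinga2.service unit; only the Restart line is new, everything else in the unit stays as shipped:

[Service]
# keep the existing directives from the icinga2 unit as they are;
# the added line makes systemd restart the daemon whenever it exits abnormally
Restart=always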
Updated by gvde on 2016-12-24 10:04:00 +00:00 greatexpectations wrote:
For all the recent crashes, that won't work on our CentOS 7: even though there is a crash report for icinga2, the icinga2 process itself is still running and the status of icinga2.service is still active. icinga2 doesn't check anything anymore, but the process is still running. Thus, systemd won't notice that icinga2 isn't operating correctly anymore, and there won't be any restart... |
Updated by gvde on 2016-12-24 10:12:12 +00:00 greatexpectations wrote:
And on a side note: you don't want to modify the original icinga2.service file on CentOS 7, i.e. the one in /usr/lib/systemd/system/, ever. All files in that directory come from the RPM packages and are not to be modified; any change will be overwritten during the next update. Modifications of systemd services go into /etc/systemd/system, e.g. you create a directory icinga2.service.d and put the configuration into a conf file inside that directory (see the sketch below)... |
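A sketch of that drop-in approach, assuming the stock CentOS 7 package layout (the file name restart.conf is illustrative, not from the packages):

# /etc/systemd/system/icinga2.service.d/restart.conf
[Service]
Restart=always

# afterwards, make systemd pick up the drop-in:
systemctl daemon-reload
systemctl restart icinga2.service

This keeps /usr/lib/systemd/system/icinga2.service untouched, so package updates won't overwrite the change.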
Updated by fallback on 2016-12-29 09:04:20 +00:00 gvde wrote:
This is exactly the same behaviour on my "Ubuntu 16.04.1 LTS". |
Updated by Marax on 2016-12-30 08:12:31 +00:00 I noticed here that the crashes occur when a plugin with extensive output changes its state. Icinga2 tries to send a notification and crashes. The mail contains the extended output of the plugin under "Additional Info". For example...
***** Icinga *****
Notification Type: PROBLEM
Service: Interface Usage
Date/Time: 2016-12-28 09:37:33 +0100
Additional Info: CRITICAL - interface GigabitEthernet0/24 (alias Gi0/24) usage is in:95.51% (955074379.75bit/s) out:0.61% (6077712.50bit/s), interface Vlan1 (alias Vl1) usage is in:0.00% (5821.50bit/s) out:0.00% (14859.00bit/s), interface GigabitEthernet0/1 (alias Gi0/1) usage is in:0.00% (0.00bit/s) out:0.00% (0.00bit/s) (down), interface GigabitEthernet0/2 (alias Gi0/2) usage is in:0.00% (0.00bit/s) out:0.00% (0.00bit/s) (down), interface GigabitEthernet0/3 (alias Gi0/3) usage is in:0.00% (0.00bit/s) out:0.00% (0.00bit/s) (down), interface GigabitEthernet0/4 (alias Gi0/4) usage is in:0.00% (374.25bit/s) out:0.00% (2084.00bit/s), interface GigabitEthernet0/5 (alias Gi0/5) usage is in:0.00% (0.00bit/s) out:0.00% (0.00bit/s) (down), interface GigabitEthernet0/6 (alias Gi0/6) usage is in:0.00% (0.00bit/s) out:0.00% (0.00bit/s) (down), interface GigabitEthernet0/7 (alias Gi0/7) usage is in:0.00% (0.00bit/s) out:0.00% (0.00bit/s) (down), interface GigabitEthernet0/8 (alias Gi0/8) usage is in:0.00% (0.00bit/s) out:0.00% (1517.50bit/s), interface GigabitEthernet0/9 (alias Gi0/9) usage is in:0.00% (0.00bit/s) out:0.00% (0.00bit/s) (down), interface GigabitEthernet0/10 (alias Gi0/10) usage is in:0.00% (0.00bit/s) out:0.00% (0.00bit/s) (down), interface GigabitEthernet0/11 (alias Gi0/11) usage is in:0.00% (0.00bit/s) out:0.00% (0.00bit/s) (down), interface GigabitEthernet0/12 (alias Gi0/12) usage is in:0.00% (0.00bit/s) out:0.00% (0.00bit/s) (down), interface GigabitEthernet0/13 (alias Gi0/13) usage is in:0.00% (0.00bit/s) out:0.00% (0.00bit/s) (down), interface GigabitEthernet0/14 (alias Gi0/14) usage is in:0.00% (0.00bit/s) out:0.00% (0.00bit/s) (down), interface GigabitEthernet0/15 (alias Gi0/15) usage is in:0.00% (0.00bit/s) out:0.00% (0.00bit/s) (down), interface GigabitEthernet0/16 (alias Gi0/16) usage is in:0.00% (0.00bit/s) out:0.00% (0.00bit/s) (down), interface GigabitEthernet0/17 (alias Gi0/17) usage is in:0.00% (0.00bit/s) out:0.00% (0.00bit/s) (down), interface GigabitEthernet0/18 (alias Gi0/18) usage is in:0.00% (0.00bit/s) out:0.00% (0.00bit/s) (down), interface GigabitEthernet0/19 (alias Gi0/19) usage is in:0.39% (3866826.75bit/s) out:62.95% (629520783.75bit/s), interface GigabitEthernet0/20 (alias Gi0/20) usage is in:0.22% (2195255.50bit/s) out:32.56% (325550413.50bit/s), interface GigabitEthernet0/21 (alias Gi0/21) usage is in:0.00% (0.00bit/s) out:0.00% (0.00bit/s) (down), interface GigabitEthernet0/22 (alias Gi0/22) usage is in:0.00% (0.00bit/s) out:0.00% (0.00bit/s) (down), interface GigabitEthernet0/23 (alias Gi0/23) usage is in:0.00% (0.00bit/s) out:0.00% (0.00bit/s) (down), interface Null0 usage is in:0.00% (0.00bit/s) out:0.00% (0.00bit/s)
Comment: []
Can someone confirm this from their experience? |
Updated by fallback on 2016-12-30 08:55:43 +00:00 In my experience it's not related to the plugin's output. It even keeps crashing although no service or host notification was sent at the time. |
Updated by Marax on 2016-12-30 10:15:46 +00:00 Maybe it's not necessarily the plugin output. We don't run Icinga2 in cluster mode here, but perhaps the bug also affects the communication between cluster nodes. However, for testing purposes I have switched off notifications for the affected services. Let's see if it runs until next year ;-) |
Updated by flex on 2017-01-04 13:06:07 +00:00 Marax wrote:
I can confirm this: if we remove a check whose output is too long, everything is OK. |
Updated by fugstrolch on 2017-01-05 13:04:05 +00:00 (Sorry for my English :) ) We have an HA cluster with two Icinga2 masters. Since the installation of v2.6.0 they have crashed every 2-3 days. MAYBE: the "plugin-notification-command" (which isn't necessary since v2.6.0?) has problems with extended plugin outputs? Can someone confirm this? The solution would then be: don't use "import plugin-notification-command" anymore (?) |
Updated by Marax on 2017-01-05 18:36:16 +00:00 flex wrote:
Since switching off notifications for services with extensive output, there have been no crashes anymore since last year ;-) It has something to do with passing a lot of data to external scripts, perhaps the resulting runtime. It does not have to be a notification; I think that all external scripts are affected. |
Updated by lucasmopdx on 2017-01-05 22:02:34 +00:00 Is there any resolution available here? I've tried removing the import plugin-notification-command. I can't make my notification messages any shorter. This is a pretty serious bug and the packages for 2.5.4 appear to have been removed from the repository (at least for Ubuntu Trusty), so downgrading is also a difficult option. |
Updated by lucasmopdx on 2017-01-05 22:30:33 +00:00 Here's a backtrace, if useful:
pvtinfo=pvtinfo@entry=0x7f1bee7297c0 <typeinfo for boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<std::invalid_argument> >>,
current_function=current_function@entry=0x7f1bee4ccc40 <icinga::JsonDecode(icinga::String const&)::__PRETTY_FUNCTION__> "icinga::Value icinga::JsonDecode(const icinga::String&)",
|
Updated by lucasmopdx on 2017-01-05 22:42:16 +00:00 The issue appears to be here: https://github.com/Icinga/icinga2/blob/v2.6.0/lib/base/process.cpp#L278 Basically, it allocates a buffer that's a maximum of 4096 bytes for a message. If the message exceeds 4096 bytes, it's just not read past that length? (I didn't look at the code that sends this) |
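A small standalone sketch (plain C++ with POSIX calls, not Icinga 2 code) of the failure mode described above: a single read() into a fixed 4096-byte buffer drops everything past that size, so a JSON message longer than the buffer loses its closing brace and a later decode reports a premature EOF. All names below are illustrative; a read loop that continues until EOF would avoid the truncation.

#include <cstdio>
#include <string>
#include <unistd.h>

int main() {
    int fds[2];
    if (pipe(fds) != 0)
        return 1;

    // Simulate a status message larger than one buffer (~8 KB of JSON).
    std::string json = "{\"output\":\"" + std::string(8000, 'x') + "\"}";
    if (write(fds[1], json.data(), json.size()) != static_cast<ssize_t>(json.size()))
        return 1;
    close(fds[1]);

    // A single read into a 4096-byte buffer: everything past 4096 bytes is lost.
    char buf[4096];
    ssize_t n = read(fds[0], buf, sizeof(buf));
    close(fds[0]);

    std::string received(buf, n > 0 ? static_cast<size_t>(n) : 0);
    std::printf("sent %zu bytes, received %zd bytes\n", json.size(), n);
    std::printf("received message %s with '}'\n",
                !received.empty() && received.back() == '}' ? "ends" : "does NOT end");
    // Handing the truncated string to a JSON parser reports a premature EOF,
    // because the document is cut off in the middle of a string value.
    return 0;
}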
Updated by Marax on 2017-01-06 07:37:28 +00:00 lucasmopdx wrote:
I have no idea about programming, but is there possibly a relationship with issue #13567? Or are there two bugs in the same file? |
Updated by lucasmopdx on 2017-01-06 21:30:49 +00:00 Marax wrote:
These are two separate bugs in the same file. |
Updated by lucasmopdx on 2017-01-06 21:35:15 +00:00 Added a PR on GitHub to fix this issue: Note that I was not able to test it, as we worked around the problem by (painfully) downgrading to icinga2 2.5.4. |
Updated by mfriedrich on 2017-01-09 15:16:48 +00:00
|
Updated by mfriedrich on 2017-01-09 15:18:26 +00:00
|
Updated by n0braist on 2017-01-09 15:33:21 +00:00 One piece of information on the length: I saw the behavior when the output length is longer than 5120 bytes. |
Updated by mfriedrich on 2017-01-09 15:40:37 +00:00
|
Updated by erikandersonmx on 2017-01-09 16:52:52 +00:00 lucasmopdx wrote:
Awesome find! |
Updated by mfriedrich on 2017-01-09 16:59:11 +00:00
|
Updated by mfriedrich on 2017-01-11 13:08:26 +00:00
|
Updated by snallygaster on 2017-01-12 16:09:51 +00:00 Just want to check: is everyone experiencing this issue running SELinux in enforcing mode? It's not a good test for a production environment, but since running my test server with SELinux in permissive mode I haven't had a recurrence of the issue. |
Updated by gvde on 2017-01-12 16:18:38 +00:00 snallygaster wrote:
We are running the production server in permissive mode only, as there are a lot of AVC hits, and we still see the crash from time to time... So at least for us it's not related to SELinux AVC denials... |
Updated by gvde on 2017-01-13 09:18:50 +00:00 n0braist wrote:
I can confirm that it seems to be related to the length of the plugin output: it crashes when the output is longer than 4223 bytes and does not crash when it's less than 2933 bytes. |
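A hypothetical way to reproduce output above those sizes (the object names and the shell command are made up for illustration, not taken from anyone's configuration): a check that always returns CRITICAL with roughly 6 KB of output, so a state change and a notification get triggered. A matching Host object and a Notification object (not shown) still have to apply to the service.

object CheckCommand "long-output" {
  // emit about 6000 bytes of plugin output, then exit 2 (CRITICAL)
  command = [ "/bin/sh", "-c", "yes | head -c 6000; exit 2" ]
}

object Service "long-output-test" {
  host_name = "localhost"        // assumes a Host object named "localhost" exists
  check_command = "long-output"
}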
Updated by gbeutner on 2017-01-16 07:18:33 +00:00
The only thing I'm not quite happy with in regard to the PR is the fact that the ProcessHandler function is throwing exceptions. There's nobody to catch those exceptions, which would definitely cause the child process to crash. I'll see if I can get that updated myself and then merge the PR into the master branch. |
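To illustrate the concern in a standalone way (this is not the Icinga 2 ProcessHandler, just a sketch with made-up names): an exception that escapes a worker with no handler above it reaches std::terminate() and aborts the whole process, whereas a top-level catch lets the process report the bad message and exit, or carry on, in a controlled way.

#include <cstdio>
#include <cstdlib>
#include <stdexcept>

// Stand-in for a handler that may throw on malformed input (illustrative only).
static void HandleMessage(bool malformed) {
    if (malformed)
        throw std::invalid_argument("parse error: premature EOF");
    std::puts("message handled");
}

int main() {
    // Without this try/catch the exception would propagate out of main,
    // reach std::terminate() and abort the (child) process.
    try {
        HandleMessage(true);
    } catch (const std::exception& ex) {
        std::fprintf(stderr, "discarding malformed message: %s\n", ex.what());
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}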
Updated by mfriedrich on 2017-01-16 10:29:08 +00:00 Test case for many arguments creating a crash:
|
Updated by Anonymous on 2017-01-16 13:10:03 +00:00
Applied in changeset 06064e7. |
This issue has been migrated from Redmine: https://dev.icinga.com/issues/13655
Created by erikandersonmx on 2016-12-16 20:40:28 +00:00
Assignee: gbeutner
Status: Resolved (closed on 2017-01-16 13:10:03 +00:00)
Target Version: 2.6.1
Last Update: 2017-01-16 13:10:03 +00:00 (in Redmine)
Not sure where to look from here:
error: parse error: premature EOF
{arguments
(right here) ------^
(0) libbase.so.2.6.0: (+0xc9148) [0x7f60bab36148]
(1) libbase.so.2.6.0: (+0xc91f9) [0x7f60bab361f9]
(2) libbase.so.2.6.0: icinga::JsonDecode(icinga::String const&) (+0x3ce) [0x7f60baad26ae]
(3) libbase.so.2.6.0: (+0x85f82) [0x7f60baaf2f82]
(4) libbase.so.2.6.0: (+0x87640) [0x7f60baaf4640]
(5) /usr/lib/x86_64-linux-gnu/icinga2/sbin/icinga2() [0x409088]
(6) /usr/lib/x86_64-linux-gnu/icinga2/sbin/icinga2() [0x406fa9]
(7) libc.so.6: __libc_start_main (+0xf5) [0x7f60b9c43f45]
(8) /usr/lib/x86_64-linux-gnu/icinga2/sbin/icinga2() [0x4070e0]
Changesets
2017-01-16 13:04:45 +00:00 by (unknown) 06064e7
2017-01-16 13:05:18 +00:00 by gbeutner 82bab6b
2017-01-16 13:05:36 +00:00 by gbeutner c31d024
2017-01-16 13:15:39 +00:00 by (unknown) 9fa3f3b
2017-01-16 13:15:42 +00:00 by gbeutner 060e20f
2017-01-16 13:15:42 +00:00 by gbeutner ff07cee
Relations: