[dev.icinga.com #9312] reload timeout with " Reload failed for Icinga host/service/network monitoring system" #3034

Closed
icinga-migration opened this issue May 25, 2015 · 10 comments
Labels
area/configuration DSL, parser, compiler, error handling bug Something isn't working

Comments

@icinga-migration

This issue has been migrated from Redmine: https://dev.icinga.com/issues/9312

Created by PowellEB on 2015-05-25 11:57:25 +00:00

Assignee: (none)
Status: Closed (closed on 2016-03-09 13:03:57 +00:00)
Target Version: (none)
Last Update: 2016-03-09 13:03:57 +00:00 (in Redmine)

Icinga Version: 2.3.4
Backport?: Not yet backported
Include in Changelog: 1

We are scaling up our icinga2 server in preparation for going
live in production. It currently has 3100 hosts and 11000 services.

A config reload now takes over 4 minutes and then crashes with:
"..... Reload failed for Icinga host/service/network monitoring system"

A restart works fine.

Issues #7306 and #7368 look like the same problem as ours.

Is there any way to increase the reload timeout?


Relations:

@icinga-migration
Author

Updated by mfriedrich on 2015-05-25 11:59:01 +00:00

  • Status changed from New to Feedback
  • Assigned to set to PowellEB

No, there isn't. But I'd prefer fixing the real bug, so could you please attach the gdb backtrace of that crash?

@icinga-migration
Author

Updated by PowellEB on 2015-05-25 12:10:34 +00:00

Thank you for the quick response. I am happy to do a backtrace.
Can you provide steps so that I capture exactly what is needed?
The OS is CentOS.

Best Regards / Mit freundlichen Grüßen,
Eric

@icinga-migration
Author

Updated by PowellEB on 2015-05-27 02:07:48 +00:00

dnsmichi,
I am still not clear on the steps for getting a backtrace of the reload.
When we run gdb against "systemctl reload icinga2", gdb goes off and looks at all system-level processes.

One of our sysadmins and I watched the logs over and over and found:

(1) The messages log shows the reload command being issued.

(2) 90 seconds later, the messages log and "systemctl status icinga2.service" show:

May 25 14:01:08 fitc09v205 systemd: icinga2.service: control process exited, code=exited status=11
May 25 14:01:08 fitc09v205 systemd: Reload failed for Icinga host/service/network monitoring system.

(3)
Now it gets really interesting: even after the "Reload failed" message, multiple processes are still doing the reload.

8 minutes later, watching the processes, a new icinga2 PID is up, reloaded from the old PID.
We verified in the web interface that a config change did occur.

So it looks like the reload does occur even though the console gives an error message; the total time is just about 9 minutes.

If you can provide steps for capturing a backtrace of all of this, I am happy to comply. For now, we have been using https://github.com/Icinga/icinga2/blob/master/doc/21-debug.md as a guide for gdb, but with no luck.

Best Regards / Mit freundlichen Grüßen,
Eric

@icinga-migration
Author

Updated by mfriedrich on 2015-05-27 07:52:40 +00:00

Attaching to systemctl isn't what you want. You'll need to attach gdb directly to the icinga2 process (find it with e.g. ps aux) using "-p" and its PID. Both the parent process and the forked child process are of interest here, so fire up two terminals, one for each. You may then fetch the backtrace as usual.
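
A minimal sketch of the attach step described above, assuming gdb and the icinga2 debug symbols are installed. The helper name bt_cmd is made up for illustration and is not part of Icinga 2. It only prints the gdb command for a given PID, so nothing is attached until you run the printed line yourself.

```shell
# bt_cmd is a hypothetical helper: it prints (does not run) a gdb command
# that attaches to the given PID, dumps a full backtrace of every thread,
# and then exits again.
bt_cmd() {
  printf 'gdb -p %s -batch -ex "thread apply all bt full"\n' "$1"
}

# Example, run as root in two separate terminals during a reload.
# Assumption: the oldest icinga2 process is the parent and the newest one
# is the forked child; check with "ps aux" before relying on that.
#   eval "$(bt_cmd "$(pgrep -o icinga2)")"   # terminal 1: parent
#   eval "$(bt_cmd "$(pgrep -n icinga2)")"   # terminal 2: child
```

Redirecting each command's output to a file gives something that can be attached to the issue directly.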

@icinga-migration
Author

Updated by itbess on 2015-07-08 17:22:46 +00:00

This happens in gdb when the reload error occurs.

~]# gdb program 23482
......
......
Program received signal SIGHUP, Hangup.
[Switching to Thread 0x7f9508a0d700 (LWP 17200)]
vfork () at ../sysdeps/unix/sysv/linux/x86_64/vfork.S:44
44 pushq %rdi

http://pastebin.com/PVA7gRpu

[root@fitc09v205 ~]# gdb program 12196 
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-64.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
...
program: No such file or directory.
Attaching to process 12196
Reading symbols from /usr/sbin/icinga2...Reading symbols from /usr/lib/debug/usr/sbin/icinga2.debug...done.
done.
Reading symbols from /lib64/libboost_thread-mt.so.1.53.0...Reading symbols from /usr/lib/debug/usr/lib64/libboost_thread-mt.so.1.53.0.debug...done.
done.
Loaded symbols for /lib64/libboost_thread-mt.so.1.53.0
Reading symbols from /lib64/libboost_system-mt.so.1.53.0...Reading symbols from /usr/lib/debug/usr/lib64/libboost_system-mt.so.1.53.0.debug...done.
done.
Loaded symbols for /lib64/libboost_system-mt.so.1.53.0
Reading symbols from /lib64/libboost_program_options-mt.so.1.53.0...Reading symbols from /usr/lib/debug/usr/lib64/libboost_program_options-mt.so.1.53.0.debug...done.
done.
Loaded symbols for /lib64/libboost_program_options-mt.so.1.53.0
Reading symbols from /lib64/libboost_regex-mt.so.1.53.0...Reading symbols from /usr/lib/debug/usr/lib64/libboost_regex-mt.so.1.53.0.debug...done.
done.
Loaded symbols for /lib64/libboost_regex-mt.so.1.53.0
Reading symbols from /usr/lib64/icinga2/libbase.so...Reading symbols from /usr/lib/debug/usr/lib64/icinga2/libbase.so.debug...done.
done.
Loaded symbols for /usr/lib64/icinga2/libbase.so
Reading symbols from /usr/lib64/icinga2/libconfig.so...Reading symbols from /usr/lib/debug/usr/lib64/icinga2/libconfig.so.debug...done.
done.
Loaded symbols for /usr/lib64/icinga2/libconfig.so
Reading symbols from /usr/lib64/icinga2/libcli.so...Reading symbols from /usr/lib/debug/usr/lib64/icinga2/libcli.so.debug...done.
done.
Loaded symbols for /usr/lib64/icinga2/libcli.so
Reading symbols from /usr/lib64/icinga2/libremote.so...Reading symbols from /usr/lib/debug/usr/lib64/icinga2/libremote.so.debug...done.
done.
Loaded symbols for /usr/lib64/icinga2/libremote.so
Reading symbols from /lib64/libdl.so.2...Reading symbols from /usr/lib/debug/usr/lib64/libdl-2.17.so.debug...done.
done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/libssl.so.10...Reading symbols from /usr/lib/debug/usr/lib64/libssl.so.1.0.1e.debug...done.
done.
Loaded symbols for /lib64/libssl.so.10
Reading symbols from /lib64/libcrypto.so.10...Reading symbols from /usr/lib/debug/usr/lib64/libcrypto.so.1.0.1e.debug...done.
done.
Loaded symbols for /lib64/libcrypto.so.10
Reading symbols from /usr/lib64/icinga2/libyajl.so.2...Reading symbols from /usr/lib/debug/usr/lib64/icinga2/libyajl.so.2.1.0.debug...done.
done.
Loaded symbols for /usr/lib64/icinga2/libyajl.so.2
Reading symbols from /usr/lib64/icinga2/libmmatch.so...Reading symbols from /usr/lib/debug/usr/lib64/icinga2/libmmatch.so.debug...done.
done.
Loaded symbols for /usr/lib64/icinga2/libmmatch.so
Reading symbols from /usr/lib64/icinga2/libsocketpair.so...Reading symbols from /usr/lib/debug/usr/lib64/icinga2/libsocketpair.so.debug...done.
done.
Loaded symbols for /usr/lib64/icinga2/libsocketpair.so
Reading symbols from /usr/lib64/icinga2/libexecvpe.so...Reading symbols from /usr/lib/debug/usr/lib64/icinga2/libexecvpe.so.debug...done.
done.
Loaded symbols for /usr/lib64/icinga2/libexecvpe.so
Reading symbols from /lib64/libstdc++.so.6...Reading symbols from /usr/lib/debug/usr/lib64/libstdc++.so.6.0.19.debug...done.
done.
Loaded symbols for /lib64/libstdc++.so.6
Reading symbols from /lib64/libm.so.6...Reading symbols from /usr/lib/debug/usr/lib64/libm-2.17.so.debug...done.
done.
Loaded symbols for /lib64/libm.so.6
Reading symbols from /lib64/libgcc_s.so.1...Reading symbols from /usr/lib/debug/usr/lib64/libgcc_s-4.8.3-20140911.so.1.debug...done.
done.
Loaded symbols for /lib64/libgcc_s.so.1
Reading symbols from /lib64/libc.so.6...Reading symbols from /usr/lib/debug/usr/lib64/libc-2.17.so.debug...done.
done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/librt.so.1...Reading symbols from /usr/lib/debug/usr/lib64/librt-2.17.so.debug...done.
done.
Loaded symbols for /lib64/librt.so.1
Reading symbols from /lib64/libpthread.so.0...Reading symbols from /usr/lib/debug/usr/lib64/libpthread-2.17.so.debug...done.
done.
[New LWP 31925]
[New LWP 31923]
[New LWP 23732]
[New LWP 23551]
[New LWP 5227]
[New LWP 5024]
[New LWP 5023]
[New LWP 5007]
[New LWP 5006]
[New LWP 5005]
[New LWP 5004]
[New LWP 12228]
[New LWP 12227]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Loaded symbols for /lib64/libpthread.so.0
Reading symbols from /lib64/libicuuc.so.50...Reading symbols from /usr/lib/debug/usr/lib64/libicuuc.so.50.1.2.debug...done.
done.
Loaded symbols for /lib64/libicuuc.so.50
Reading symbols from /lib64/libicui18n.so.50...Reading symbols from /usr/lib/debug/usr/lib64/libicui18n.so.50.1.2.debug...done.
done.
Loaded symbols for /lib64/libicui18n.so.50
Reading symbols from /lib64/libicudata.so.50...Reading symbols from /usr/lib/debug/usr/lib64/libicudata.so.50.1.2.debug...(no debugging symbols found)...done.
(no debugging symbols found)...done.
Loaded symbols for /lib64/libicudata.so.50
Reading symbols from /lib64/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/usr/lib64/ld-2.17.so.debug...done.
done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /lib64/libgssapi_krb5.so.2...Reading symbols from /usr/lib/debug/usr/lib64/libgssapi_krb5.so.2.2.debug...done.
done.
Loaded symbols for /lib64/libgssapi_krb5.so.2
Reading symbols from /lib64/libkrb5.so.3...Reading symbols from /usr/lib/debug/usr/lib64/libkrb5.so.3.3.debug...done.
done.
Loaded symbols for /lib64/libkrb5.so.3
Reading symbols from /lib64/libcom_err.so.2...Reading symbols from /usr/lib/debug/usr/lib64/libcom_err.so.2.1.debug...done.
done.
Loaded symbols for /lib64/libcom_err.so.2
Reading symbols from /lib64/libk5crypto.so.3...Reading symbols from /usr/lib/debug/usr/lib64/libk5crypto.so.3.1.debug...done.
done.
Loaded symbols for /lib64/libk5crypto.so.3
Reading symbols from /lib64/libz.so.1...Reading symbols from /usr/lib/debug/usr/lib64/libz.so.1.2.7.debug...done.
done.
Loaded symbols for /lib64/libz.so.1
Reading symbols from /lib64/libkrb5support.so.0...Reading symbols from /usr/lib/debug/usr/lib64/libkrb5support.so.0.1.debug...done.
done.
Loaded symbols for /lib64/libkrb5support.so.0
Reading symbols from /lib64/libkeyutils.so.1...Reading symbols from /usr/lib/debug/usr/lib64/libkeyutils.so.1.5.debug...done.
done.
Loaded symbols for /lib64/libkeyutils.so.1
done.
Loaded symbols for /lib64/libresolv.so.2
Reading symbols from /lib64/libselinux.so.1...Reading symbols from /usr/lib/debug/usr/lib64/libselinux.so.1.debug...done.
done.
Loaded symbols for /lib64/libselinux.so.1
Reading symbols from /lib64/libpcre.so.1...Reading symbols from /usr/lib/debug/usr/lib64/libpcre.so.1.2.0.debug...done.
done.
Loaded symbols for /lib64/libpcre.so.1
Reading symbols from /lib64/liblzma.so.5...Reading symbols from /usr/lib/debug/usr/lib64/liblzma.so.5.0.99.debug...done.
done.
Loaded symbols for /lib64/liblzma.so.5
Reading symbols from /lib64/libnss_files.so.2...Reading symbols from /usr/lib/debug/usr/lib64/libnss_files-2.17.so.debug...done.
done.
Loaded symbols for /lib64/libnss_files.so.2
Reading symbols from /usr/lib64/icinga2/libicinga.so...Reading symbols from /usr/lib/debug/usr/lib64/icinga2/libicinga.so.debug...done.
done.
Loaded symbols for /usr/lib64/icinga2/libicinga.so
Reading symbols from /lib64/libnss_dns.so.2...Reading symbols from /usr/lib/debug/usr/lib64/libnss_dns-2.17.so.debug...done.
done.
Loaded symbols for /lib64/libnss_dns.so.2
Reading symbols from /lib64/libnss_myhostname.so.2...Reading symbols from /usr/lib/debug/usr/lib64/libnss_myhostname.so.2.debug...done.
done.
Loaded symbols for /lib64/libnss_myhostname.so.2
Reading symbols from /usr/lib64/icinga2/liblivestatus.so...Reading symbols from /usr/lib/debug/usr/lib64/icinga2/liblivestatus.so.debug...done.
done.
Loaded symbols for /usr/lib64/icinga2/liblivestatus.so
Reading symbols from /usr/lib64/icinga2/libchecker.so...Reading symbols from /usr/lib/debug/usr/lib64/icinga2/libchecker.so.debug...done.
done.
Loaded symbols for /usr/lib64/icinga2/libchecker.so
Reading symbols from /usr/lib64/icinga2/libmethods.so...Reading symbols from /usr/lib/debug/usr/lib64/icinga2/libmethods.so.debug...done.
done.
Loaded symbols for /usr/lib64/icinga2/libmethods.so
Reading symbols from /usr/lib64/icinga2/libnotification.so...Reading symbols from /usr/lib/debug/usr/lib64/icinga2/libnotification.so.debug...done.
done.
Loaded symbols for /usr/lib64/icinga2/libnotification.so
Reading symbols from /usr/lib64/icinga2/libcompat.so...Reading symbols from /usr/lib/debug/usr/lib64/icinga2/libcompat.so.debug...done.
done.
Loaded symbols for /usr/lib64/icinga2/libcompat.so
Reading symbols from /usr/lib64/icinga2/libdb_ido_mysql.so...Reading symbols from /usr/lib/debug/usr/lib64/icinga2/libdb_ido_mysql.so.debug...done.
done.
Loaded symbols for /usr/lib64/icinga2/libdb_ido_mysql.so
Reading symbols from /usr/lib64/mysql/libmysqlclient.so.18...Reading symbols from /usr/lib/debug/usr/lib64/mysql/libmysqlclient.so.18.0.0.debug...done.
done.
Loaded symbols for /usr/lib64/mysql/libmysqlclient.so.18
Reading symbols from /usr/lib64/icinga2/libdb_ido.so...Reading symbols from /usr/lib/debug/usr/lib64/icinga2/libdb_ido.so.debug...done.
done.
Loaded symbols for /usr/lib64/icinga2/libdb_ido.so
Reading symbols from /usr/lib64/icinga2/libperfdata.so...Reading symbols from /usr/lib/debug/usr/lib64/icinga2/libperfdata.so.debug...done.
done.
Loaded symbols for /usr/lib64/icinga2/libperfdata.so
0x00002b8c1f4e848d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
81      T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
(gdb) cont
Continuing.
[New Thread 0x2b8c233ae700 (LWP 17006)]
[New Thread 0x2b8c3ba85700 (LWP 17007)]
[New Thread 0x2b8c3bc86700 (LWP 17008)]
Detaching after fork from child process 17009.
[New Thread 0x2b8c239b1700 (LWP 17010)]
Detaching after fork from child process 17011.
Detaching after fork from child process 17012.
Detaching after fork from child process 17015.
Detaching after fork from child process 17016.
Detaching after fork from child process 17014.
Detaching after fork from child process 17013.
Detaching after fork from child process 17019.
Detaching after fork from child process 17020.
Detaching after fork from child process 17021.
Detaching after fork from child process 17025.
Program received signal SIGHUP, Hangup.
[Switching to Thread 0x7f9508a0d700 (LWP 17200)]
vfork () at ../sysdeps/unix/sysv/linux/x86_64/vfork.S:44
44 pushq %rdi

@icinga-migration
Author

Updated by mfriedrich on 2015-08-21 19:47:25 +00:00

Please don't use external paste sites, but attach files here directly.

@icinga-migration
Author

Updated by mfriedrich on 2015-09-04 09:22:28 +00:00

  • Status changed from Feedback to New
  • Assigned to deleted PowellEB

@icinga-migration
Author

Updated by PowellEB on 2015-09-27 15:17:37 +00:00

Michael,
One of my sysadmins did some more gdb traces, but nothing points to a problem in icinga2.
This issue has not shown up on the other distros we have tested, only on CentOS (as with the other poster).

Finally, thinking through this again: what is the problem?
During a reload of icinga2, the service does not start fast enough and times out, then falls back to safe-reload.
The key is the timeout.

The timeout is not in icinga2; it is a system parameter (CentOS).

systemctl show icinga2.service -p TimeoutStartUSec
displays the start timeout that is set for the service.

On CentOS this is controlled by /etc/systemd/system.conf and /etc/systemd/user.conf.

On our system, we set DefaultTimeoutStartSec=240s in /etc/systemd/system.conf.
Now a service has up to 4 minutes to start.

So far we have had no problems with reloads or with full stops and starts.
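
For anyone who would rather not change the global default, a per-unit drop-in is one possible alternative. This is only a sketch and not what was done on the system described above. It assumes that the start timeout is the limit being hit during the reload, as the workaround above suggests. The helper name write_timeout_dropin and the file name timeout.conf are made up for illustration.

```shell
# Hypothetical helper: write a systemd drop-in that raises the start
# timeout for a single unit instead of changing the system-wide default.
#   $1 = unit name, e.g. icinga2.service
#   $2 = timeout in seconds
#   $3 = unit directory (optional, defaults to /etc/systemd/system)
write_timeout_dropin() {
  unit="$1"
  seconds="$2"
  root="${3:-/etc/systemd/system}"
  dir="$root/$unit.d"
  mkdir -p "$dir"
  printf '[Service]\nTimeoutStartSec=%s\n' "$seconds" > "$dir/timeout.conf"
}

# Usage (as root), followed by a systemd reload so the drop-in is read:
#   write_timeout_dropin icinga2.service 240
#   systemctl daemon-reload
#   systemctl show icinga2.service -p TimeoutStartUSec
```

The last command is the same check mentioned above and should then report the new value for icinga2 only, leaving other services at the system default.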

@icinga-migration
Author

Updated by mfriedrich on 2015-09-28 08:20:53 +00:00

  • Relates set to 10226

@icinga-migration
Author

Updated by mfriedrich on 2016-03-09 13:03:57 +00:00

  • Status changed from New to Closed

I'm closing this as a duplicate of #10226, which has been fixed and released with 2.4.3.

@icinga-migration icinga-migration added bug Something isn't working area/configuration DSL, parser, compiler, error handling labels Jan 17, 2017