[dev.icinga.com #12937] Icinga2 crashes after setting a downtime #4738

Closed
icinga-migration opened this issue Oct 17, 2016 · 1 comment
Labels
area/distributed Distributed monitoring (master, satellites, clients) bug Something isn't working

Comments

@icinga-migration

This issue has been migrated from Redmine: https://dev.icinga.com/issues/12937

Created by Skap1981 on 2016-10-17 09:35:34 +00:00

Assignee: (none)
Status: Closed (closed on 2016-12-07 22:46:03 +00:00)
Target Version: (none)
Last Update: 2016-12-07 22:46:03 +00:00 (in Redmine)

Icinga Version: 2.5.4
Backport?: Not yet backported
Include in Changelog: 1

When we set a downtime shortly (~10 minutes) after restarting Icinga2, Icinga2 crashes with a segfault:

2016-10-17T10:18:39.187384+02:00 mgtmon043 kernel: [3437538.504009] icinga2[34908]: segfault at 28 ip 00007ffff7198b02 sp 00007ffff09d7d80 error 4 in libbase.so[7ffff7043000+1e7000]
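For anyone triaging a similar crash without opening the core dump, the kernel line above already carries some information. The following is a minimal sketch, not part of the original report, that parses the line and computes the offset of the faulting instruction inside `libbase.so`. The function name `parse_segfault` and the regular expression are my own; the interpretation of the error bits assumes the usual x86 page-fault error code layout (bit 0 = page present, bit 1 = write, bit 2 = user mode).

```python
import re

# Kernel log line copied from the report (timestamp and host stripped).
LINE = (
    "icinga2[34908]: segfault at 28 ip 00007ffff7198b02 "
    "sp 00007ffff09d7d80 error 4 in libbase.so[7ffff7043000+1e7000]"
)

# All numeric fields in this message are hexadecimal.
PATTERN = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<lib>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Extract fault address, library offset and error bits from a segfault line."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a kernel segfault line")
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    size = int(m.group("size"), 16)
    err = int(m.group("err"), 16)
    offset = ip - base  # position of the faulting instruction inside the mapping
    return {
        "fault_addr": int(m.group("addr"), 16),
        "library": m.group("lib"),
        "offset": offset,
        "in_mapping": 0 <= offset < size,
        # Assumed x86 page-fault error code bits:
        "page_present": bool(err & 1),
        "write": bool(err & 2),
        "user_mode": bool(err & 4),
    }


info = parse_segfault(LINE)
print(hex(info["offset"]), hex(info["fault_addr"]), info)
```

Under those assumptions, the line describes a user-mode read of an unmapped page at address 0x28, which is typical of accessing a member through a null pointer, at offset 0x155b02 inside `libbase.so`. With debug symbols for exactly this build, that offset could be passed to a tool such as `addr2line -e <path to libbase.so> 0x155b02`; the path depends on the installation and is not given in the report.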

Icinga2 runs in the following setup:

OS: SLES 12 SP1
Icinga2: 2.5.4
Icinga Classic UI 1.13.3
Icingaweb2 2.3.4

master-zone with 2 instances
checker-zone with 2 instances

When we set the downtime in the Classic UI, it is not synced from master 1 to master 2.
When we set the downtime in Icingaweb2, two of the four Icinga2 instances crash with a segfault (1x master, 1x checker), though not always the same instances.
During the subsequent restart, several instances crash again.

This morning we had to reboot one instance to get icinga2 running again.

Core-Dump at https://1drv.ms/u/s!ADdiB4ZWXQKNiPVf (size 32MB)

@icinga-migration

Updated by mfriedrich on 2016-12-07 22:46:03 +00:00

  • Status changed from New to Closed

Setting the downtime triggers another bug in sending the downtime start notification, which in turn causes the sent notification to be synced throughout the HA cluster. That sync is what triggers the SIGSEGV. I've already fixed both bugs in git master, and tests look good for 2.6.

@icinga-migration icinga-migration added bug Something isn't working area/distributed Distributed monitoring (master, satellites, clients) labels Jan 17, 2017