[dev.icinga.com #11045] Icinga service crashing on Windows #3880

Closed
icinga-migration opened this issue Jan 27, 2016 · 11 comments
Labels
bug Something isn't working

Comments

@icinga-migration

This issue has been migrated from Redmine: https://dev.icinga.com/issues/11045

Created by minibbjd on 2016-01-27 19:20:20 +00:00

Assignee: minibbjd
Status: Closed (closed on 2016-03-07 08:23:33 +00:00)
Target Version: (none)
Last Update: 2016-05-02 13:35:57 +00:00 (in Redmine)

Icinga Version: 2.4.1
Backport?: Not yet backported
Include in Changelog: 1

On multiple servers my Windows Icinga agent is crashing, with the same line logged several times

[2016-01-27 17:58:18 W. Europe Standard Time] warning/Process: Killing process group 16764 ('"C:\Program Files (x86)\ICINGA2/sbin/check_users.exe"') after timeout of 60 seconds

in the icinga2.log (on some servers it happens with check_users.exe, on others with other Windows plugins), followed by this crash report:

Application information:
  Application version: v2.4.1
  Installation root: C:\Program Files (x86)\ICINGA2
  Sysconf directory: C:\Program Files (x86)\ICINGA2\etc
  Run directory: C:\Program Files (x86)\ICINGA2\var\run
  Local state directory: C:\Program Files (x86)\ICINGA2\var
  Package data directory: C:\Program Files (x86)\ICINGA2\share\icinga2
  State path: C:\Program Files (x86)\ICINGA2\var/lib/icinga2/icinga2.state
  Modified attributes path: C:\Program Files (x86)\ICINGA2\var/lib/icinga2/modified-attributes.conf
  Objects path: C:\Program Files (x86)\ICINGA2\var/cache/icinga2/icinga2.debug
  Vars path: C:\Program Files (x86)\ICINGA2\var/cache/icinga2/icinga2.vars
  PID path: C:\Program Files (x86)\ICINGA2\var\run/icinga2/icinga2.pid

System information:
  Platform: Windows
  Platform version: 7 SP1 (Server)
  Kernel: Windows
  Kernel version: 6.1
  Architecture: x86_64
Caught unhandled SEH exception.
Current time: 2016-01-27 17:58:18 W. Europe Standard Time

Stacktrace:

    (0): (unknown function)
    (1): icinga::String::GetLength+144
    (2): icinga::String::swap+1208
    (3): icinga::String::swap+104599
    (4): icinga::Socket::SocketPair+643
    (5): icinga::ThreadPool::ManagerThreadProc+1336
    (6): icinga::posix_error::what+7870
    (7): _get_flsindex+111
    (8): _get_flsindex+83
    (9): BaseThreadInitThunk+18
    (10): RtlInitializeExceptionChain+99
    (11): RtlInitializeExceptionChain+54
***
* This would indicate a runtime problem or configuration error. If you believe this is a bug in Icinga 2
* please submit a bug report at https://dev.icinga.org/ and include this stack trace as well as any other
* information that might be useful in order to reproduce this problem.
***
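
Since the timeout warning reportedly names different plugins on different servers, a small tally of those warnings per plugin can help compare hosts. The following is a minimal sketch, not part of Icinga: the function name `count_timeouts` is hypothetical, and the pattern is inferred solely from the single log line quoted above, so it assumes the executable path is always double-quoted in that way.

```python
import re
from collections import Counter

# Pattern inferred from the one warning line quoted in this report (an assumption):
#   warning/Process: Killing process group <pid> ('"<exe path>"...') after timeout of <n> seconds
TIMEOUT_RE = re.compile(
    r"""warning/Process: Killing process group \d+ \('"(?P<exe>[^"]+)".*?\) """
    r"""after timeout of (?P<secs>\d+) seconds"""
)


def count_timeouts(log_lines):
    """Count timeout-kill warnings per plugin executable name."""
    counts = Counter()
    for line in log_lines:
        match = TIMEOUT_RE.search(line)
        if match is None:
            continue
        # The logged path mixes "\" and "/", so split on both separators.
        plugin = re.split(r"[\\/]", match.group("exe"))[-1]
        counts[plugin] += 1
    return counts
```

Feeding it the lines of icinga2.log (for example `count_timeouts(open(path, encoding="utf-8", errors="replace"))`, where `path` points at the log file) returns a `Counter` keyed by plugin file name.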

Subtasks:

@icinga-migration

Updated by mfriedrich on 2016-01-29 09:19:24 +00:00

  • Category set to libbase
  • Status changed from New to Assigned
  • Assigned to set to jflach

@jean

Please try to reproduce the issue. Thanks.

@icinga-migration

Updated by minibbjd on 2016-01-29 16:27:54 +00:00

Some more info on this. I can reproduce the problem by starting from a working setup in which a Windows host with the Windows agent installed is defined. From there, do the following:

  • Duplicate the host entry by copying the "object Host" section. Leave all attributes as they are, including the IP address, just change the name to something else. Also duplicate the endpoint and zone definition for this host, again renaming it.
  • The checker side now has two host entries with the same IP.
  • Restart Icinga and you will see that on the remote Windows host the Icinga2.exe process piles up TCP/IP connections. After some time, the process will die.
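
To watch the connection pile-up described in the last step, one option is to tally TCP states from `netstat -an` output captured on the Windows host. This is a rough sketch under stated assumptions: `count_states` is a hypothetical helper, port 5665 is Icinga 2's default and may differ in a given setup, and the column layout (Proto, Local Address, Foreign Address, State) is the one Windows `netstat` prints for TCP rows.

```python
from collections import Counter

ICINGA_PORT = 5665  # Icinga 2 default port; an assumption, adjust if configured otherwise


def count_states(netstat_output, port=ICINGA_PORT):
    """Tally TCP connection states for rows whose local or remote port matches."""
    counts = Counter()
    suffix = ":%d" % port
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, blank lines and UDP rows (which carry no State column).
        if len(fields) < 4 or fields[0].upper() != "TCP":
            continue
        local, remote, state = fields[1], fields[2], fields[3]
        if local.endswith(suffix) or remote.endswith(suffix):
            counts[state] += 1
    return counts
```

Running it repeatedly on fresh `netstat -an` text (saved to a file or captured with `subprocess.run(["netstat", "-an"], capture_output=True, text=True).stdout`) should show whether the count for any state keeps growing after the restart.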

This might look like a simple misconfiguration, but in fact if you have servers with more than one IP address or MS clusters where resource groups have their own IP, you can easily get into this situation. It might also be an easy way to bring down Windows Agents in a malicious way.

This post shows how I ended up there (trying to check MS cluster resources): http://www.monitoring-portal.org/wbb/index.php?page=Thread&threadID=35155

@icinga-migration

Updated by jflach on 2016-02-08 14:43:58 +00:00

I was not able to reproduce this following your steps.
Can you provide a minimal config example that causes this crash?

@icinga-migration

Updated by jflach on 2016-02-08 14:44:20 +00:00

  • Status changed from Assigned to Feedback
  • Assigned to changed from jflach to minibbjd

@icinga-migration

Updated by jflach on 2016-02-09 12:56:55 +00:00

Also, please retest with a master snapshot. We had a (possibly) similar problem a short while ago; its fix will be in version 2.4.2.

@icinga-migration

Updated by minibbjd on 2016-02-09 12:59:28 +00:00

Is there a compiled version for Windows available of the master snapshot? I won't be able to set up a build environment.

@icinga-migration

Updated by jflach on 2016-02-10 11:25:01 +00:00

Yes, we provide snapshots for Windows (http://packages.icinga.org/windows/).

@icinga-migration

Updated by mfriedrich on 2016-02-24 20:12:42 +00:00

2.4.3 is also available now.

@icinga-migration

Updated by minibbjd on 2016-02-25 15:51:47 +00:00

Seems to be fixed in 2.4.3.

@icinga-migration

Updated by gbeutner on 2016-03-07 08:23:33 +00:00

  • Status changed from Feedback to Closed

@icinga-migration

Updated by geotek on 2016-05-02 13:35:57 +00:00

Had the exact same issue on a brand new Win2012 R2 server with Icinga2-v2.4.7-x86_64.msi. It was caused by the local DNS server returning a wrong address for "localhost". After putting "localhost" in the hosts file, Icinga stopped crashing. Funny thing was that Icinga crashed only when running as a service, not when running from the command line.
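
A quick way to check for the "localhost" misresolution described above is to list what the name resolves to and flag anything that is not a loopback address. The snippet below is only a diagnostic sketch: `non_loopback_addresses` is a hypothetical helper, it uses Python's resolver rather than whatever lookup Icinga performs internally, and the `resolve` parameter exists purely so the lookup can be swapped out.

```python
import ipaddress
import socket


def non_loopback_addresses(hostname="localhost", resolve=socket.getaddrinfo):
    """Return the addresses `hostname` resolves to that are not loopback."""
    bad = set()
    for info in resolve(hostname, None):
        # Each entry is (family, type, proto, canonname, sockaddr);
        # sockaddr[0] is the textual IP address.
        addr = info[4][0]
        # Drop an IPv6 zone suffix such as "%12" before parsing.
        ip = ipaddress.ip_address(addr.split("%", 1)[0])
        if not ip.is_loopback:
            bad.add(addr)
    return sorted(bad)
```

An empty list from `non_loopback_addresses()` means every resolved address is loopback (127.0.0.0/8 or ::1); any entry in the list points at the kind of wrong answer reported here.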

icinga-migration added the bug and libbase labels on Jan 17, 2017