[dev.icinga.com #10075] Race condition in CreatePipeOverlapped #3370

Closed
icinga-migration opened this issue Sep 1, 2015 · 16 comments

@icinga-migration

This issue has been migrated from Redmine: https://dev.icinga.com/issues/10075

Created by Anonymous on 2015-09-01 16:16:57 +00:00

Assignee: gbeutner
Status: Resolved (closed on 2016-08-10 10:15:05 +00:00)
Target Version: 2.5.0
Last Update: 2016-11-09 14:59:56 +00:00 (in Redmine)

Icinga Version: 2.4.0
Backport?: Not yet backported
Include in Changelog: 1

The error message for some checks is meaningless:

[2015-09-01 14:42:48 W. Europe Daylight Time] critical/checker: Exception occured while checking 'tfc-pc.labdomain.net!procs': Error: Unknown exception
[2015-09-01 14:46:51 W. Europe Daylight Time] information/ConfigObject: Dumping program state to file 'C:\Program Files (x86)\ICINGA2\var/lib/icinga2/icinga2.state'
[2015-09-01 14:51:51 W. Europe Daylight Time] information/ConfigObject: Dumping program state to file 'C:\Program Files (x86)\ICINGA2\var/lib/icinga2/icinga2.state'
[2015-09-01 14:56:51 W. Europe Daylight Time] information/ConfigObject: Dumping program state to file 'C:\Program Files (x86)\ICINGA2\var/lib/icinga2/icinga2.state'
[2015-09-01 15:01:51 W. Europe Daylight Time] information/ConfigObject: Dumping program state to file 'C:\Program Files (x86)\ICINGA2\var/lib/icinga2/icinga2.state'
[2015-09-01 15:06:51 W. Europe Daylight Time] information/ConfigObject: Dumping program state to file 'C:\Program Files (x86)\ICINGA2\var/lib/icinga2/icinga2.state'
[2015-09-01 15:11:51 W. Europe Daylight Time] information/ConfigObject: Dumping program state to file 'C:\Program Files (x86)\ICINGA2\var/lib/icinga2/icinga2.state'
[2015-09-01 15:15:55 W. Europe Daylight Time] critical/checker: Exception occured while checking 'tfc-pc.labdomain.net!IIS Site aaa': Error: Unknown exception
[2015-09-01 15:16:51 W. Europe Daylight Time] information/ConfigObject: Dumping program state to file 'C:\Program Files (x86)\ICINGA2\var/lib/icinga2/icinga2.state'
[2015-09-01 15:21:51 W. Europe Daylight Time] information/ConfigObject: Dumping program state to file 'C:\Program Files (x86)\ICINGA2\var/lib/icinga2/icinga2.state'
[2015-09-01 15:26:51 W. Europe Daylight Time] information/ConfigObject: Dumping program state to file 'C:\Program Files (x86)\ICINGA2\var/lib/icinga2/icinga2.state'
[2015-09-01 15:31:51 W. Europe Daylight Time] information/ConfigObject: Dumping program state to file 'C:\Program Files (x86)\ICINGA2\var/lib/icinga2/icinga2.state'
[2015-09-01 15:36:51 W. Europe Daylight Time] information/ConfigObject: Dumping program state to file 'C:\Program Files (x86)\ICINGA2\var/lib/icinga2/icinga2.state'
[2015-09-01 15:41:51 W. Europe Daylight Time] information/ConfigObject: Dumping program state to file 'C:\Program Files (x86)\ICINGA2\var/lib/icinga2/icinga2.state'
[2015-09-01 15:46:51 W. Europe Daylight Time] information/ConfigObject: Dumping program state to file 'C:\Program Files (x86)\ICINGA2\var/lib/icinga2/icinga2.state'
[2015-09-01 15:51:51 W. Europe Daylight Time] information/ConfigObject: Dumping program state to file 'C:\Program Files (x86)\ICINGA2\var/lib/icinga2/icinga2.state'
[2015-09-01 15:56:51 W. Europe Daylight Time] information/ConfigObject: Dumping program state to file 'C:\Program Files (x86)\ICINGA2\var/lib/icinga2/icinga2.state'
[2015-09-01 16:00:48 W. Europe Daylight Time] critical/checker: Exception occured while checking 'tfc-pc.labdomain.net!procs': Error: Unknown exception
[2015-09-01 16:01:51 W. Europe Daylight Time] information/ConfigObject: Dumping program state to file 'C:\Program Files (x86)\ICINGA2\var/lib/icinga2/icinga2.state'
[2015-09-01 16:06:51 W. Europe Daylight Time] information/ConfigObject: Dumping program state to file 'C:\Program Files (x86)\ICINGA2\var/lib/icinga2/icinga2.state'
[2015-09-01 16:08:48 W. Europe Daylight Time] critical/checker: Exception occured while checking 'tfc-pc.labdomain.net!disk C:': Error: Unknown exception
[2015-09-01 16:11:51 W. Europe Daylight Time] information/ConfigObject: Dumping program state to file 'C:\Program Files (x86)\ICINGA2\var/lib/icinga2/icinga2.state'
[2015-09-01 16:16:51 W. Europe Daylight Time] information/ConfigObject: Dumping program state to file 'C:\Program Files (x86)\ICINGA2\var/lib/icinga2/icinga2.state'
[2015-09-01 16:21:51 W. Europe Daylight Time] information/ConfigObject: Dumping program state to file 'C:\Program Files (x86)\ICINGA2\var/lib/icinga2/icinga2.state'
[2015-09-01 16:25:48 W. Europe Daylight Time] critical/checker: Exception occured while checking 'tfc-pc.labdomain.net!load': Error: Unknown exception
[2015-09-01 16:26:51 W. Europe Daylight Time] information/ConfigObject: Dumping program state to file 'C:\Program Files (x86)\ICINGA2\var/lib/icinga2/icinga2.state'
[2015-09-01 16:31:51 W. Europe Daylight Time] information/ConfigObject: Dumping program state to file 'C:\Program Files (x86)\ICINGA2\var/lib/icinga2/icinga2.state'
[2015-09-01 16:36:48 W. Europe Daylight Time] critical/checker: Exception occured while checking 'tfc-pc.labdomain.net!procs': Error: Unknown exception
[2015-09-01 16:36:51 W. Europe Daylight Time] information/ConfigObject: Dumping program state to file 'C:\Program Files (x86)\ICINGA2\var/lib/icinga2/icinga2.state'
[2015-09-01 16:41:51 W. Europe Daylight Time] information/ConfigObject: Dumping program state to file 'C:\Program Files (x86)\ICINGA2\var/lib/icinga2/icinga2.state'
[2015-09-01 16:42:48 W. Europe Daylight Time] critical/checker: Exception occured while checking 'tfc-pc.labdomain.net!procs': Error: Unknown exception

I've tried raising the log level to debug, but it gives no more detail: it reports starting the check, then the error message "Error: Unknown exception". As far as I can remember, I have seen this error since v2.3.4.
Note that most of the exceptions occur at hh:MM:48, even after a daemon restart.

Changesets

2016-08-08 10:51:20 +00:00 by (unknown) 1cd8a25

Add the "exception" check command

refs #10075

2016-08-10 10:12:56 +00:00 by gbeutner 37bd5ad

Fix race condition in CreatePipeOverlapped

fixes #10075

2016-08-11 07:48:01 +00:00 by gbeutner 132ee6c

Use InterlockedIncrement instead of a mutex in CreatePipeOverlapped

refs #10075
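
The thread does not show the code behind these changesets, so the following is only a sketch of the general pattern they suggest. It assumes the shared state in CreatePipeOverlapped is a counter used to build a unique pipe name, which is a guess. `std::atomic` stands in for the Windows `InterlockedIncrement` call so the example is portable, and `NextPipeIndex`, `MakePipeName`, `CountDistinctNames` and the name format are all hypothetical.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdio>
#include <mutex>
#include <set>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a counter shared by all threads that create pipes.
static std::atomic<long> g_pipeIndex{0};

// Atomically bumps the counter and returns the new value, which is what
// InterlockedIncrement does on Windows. A plain "++counter" on a non-atomic
// variable could hand the same value to two threads.
long NextPipeIndex()
{
    return g_pipeIndex.fetch_add(1) + 1;
}

// Builds a pipe name from a process id and an index (format is an assumption).
std::string MakePipeName(unsigned long pid, long index)
{
    char buf[64];
    std::snprintf(buf, sizeof(buf), "\\\\.\\pipe\\example.%08lx.%08lx",
                  pid, static_cast<unsigned long>(index));
    return buf;
}

// Generates names from several threads at once and returns how many distinct
// names were produced. With an atomic counter every name is unique.
std::size_t CountDistinctNames(int threads, int perThread)
{
    std::set<std::string> names;
    std::mutex namesMutex; // protects the set only, not the counter
    std::vector<std::thread> workers;

    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < perThread; ++i) {
                std::string name = MakePipeName(1234, NextPipeIndex());
                std::lock_guard<std::mutex> lock(namesMutex);
                names.insert(name);
            }
        });
    }

    for (auto& worker : workers)
        worker.join();

    return names.size();
}
```

Under these assumptions, `CountDistinctNames(8, 1000)` yields 8000 distinct names. Guarding the increment with a mutex would give the same guarantee; an atomic increment simply avoids the lock for a single counter.
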
@icinga-migration

Updated by mfriedrich on 2015-09-04 08:03:43 +00:00

  • Status changed from New to Feedback
  • Assigned to set to __

Please attach the relevant config objects, e.g. service, host, checkcommand.

Is this a single instance setup, or are these checks running inside a cluster?

@icinga-migration

Updated by mfriedrich on 2015-09-24 20:03:39 +00:00

Ping?

@icinga-migration

Updated by mfriedrich on 2015-10-15 12:55:37 +00:00

Can you please build Icinga 2 with Visual Studio 2013 and attach the debugger to Icinga 2?

@icinga-migration

Updated by gbeutner on 2015-11-14 18:30:40 +00:00

  • Status changed from Feedback to New
  • Assigned to deleted

I'm able to reproduce this here.

@icinga-migration

Updated by gbeutner on 2015-11-14 18:30:47 +00:00

  • Icinga Version changed from v2.3.0-516-g6fff339 to 2

@icinga-migration

Updated by mfriedrich on 2016-02-24 23:25:41 +00:00

  • Status changed from New to Assigned
  • Assigned to set to gbeutner

Then please fix it :)

@icinga-migration

Updated by mfriedrich on 2016-03-04 15:50:09 +00:00

  • Parent Id set to 11310

@icinga-migration

Updated by rafael.voss on 2016-07-05 15:23:07 +00:00

I have the same problem on my Windows servers (2012 R2):

Exception occured while checking 'FQDNfromHOST!load': Error: Unknown exception

I have this problem with all checks (default load, process, etc.) as well as my own checks. The check suddenly goes to UNKNOWN, but the next retry check is OK again.

The checks are executed locally on the server. On servers where the check is triggered from the satellite, I don't have this problem.

@icinga-migration

Updated by BrandOuellette on 2016-07-13 05:32:24 +00:00

Same issue on Windows Server 2012 R2 Standard running Icinga Version 2.4.10

Returns "Exception occured while checking '...': Error: Unknown exception" periodically for any/all checks without running the actual service check command.

Please make this a priority to fix, as 'Unknown' Check Results fill up the event log.

@icinga-migration

Updated by TheFlyingCorpse on 2016-07-14 18:52:17 +00:00

Some input: I went over my environment, where this occurs on Icinga 2 agents running 2.3.11, 2.4.3, 2.4.4 and 2.4.10. It ONLY happens with locally defined checks (the default ones). Checks issued from upstream (master) do not have this issue.

@icinga-migration

Updated by gbeutner on 2016-08-08 09:58:21 +00:00

It looks like both exception_detail::get_boost_exception() and exception_detail::get_std_exception() returned NULL for the exception, which is unusual because all of our exceptions derive from both boost::exception and std::exception.
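
To illustrate how a generic "Unknown exception" message can arise when an exception cannot be identified as a known base type, here is a minimal sketch using only the standard library. It is not Icinga's or Boost's actual diagnostic code; `DescribeException`, `RunAndDescribe` and `OpaqueFailure` are hypothetical names introduced for the example.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical helper: turns a captured exception into a log-style message.
std::string DescribeException(std::exception_ptr eptr)
{
    try {
        if (eptr)
            std::rethrow_exception(eptr);
        return "No exception";
    } catch (const std::exception& ex) {
        // The type is recognised, so its message can be reported.
        return std::string("Error: ") + ex.what();
    } catch (...) {
        // Nothing is known about the type, so only a generic text is left.
        return "Error: Unknown exception";
    }
}

// A failure type that does not derive from std::exception (illustrative only).
struct OpaqueFailure {
    int code;
};

// Runs fn and describes whatever it throws; returns "OK" if nothing is thrown.
template <typename Fn>
std::string RunAndDescribe(Fn fn)
{
    try {
        fn();
    } catch (...) {
        return DescribeException(std::current_exception());
    }
    return "OK";
}
```

In this sketch, a thrown `std::runtime_error("pipe creation failed")` is reported as "Error: pipe creation failed", while a thrown `OpaqueFailure` falls through to "Error: Unknown exception", because the handler has no type information to work with.
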

@icinga-migration

Updated by gbeutner on 2016-08-08 12:33:30 +00:00

Well, I'm no longer able to reproduce this (with either VS2013 or VS2015).

@icinga-migration

Updated by gbeutner on 2016-08-10 10:15:05 +00:00

  • Status changed from Assigned to Resolved
  • Done % changed from 0 to 100

Applied in changeset 37bd5ad.

@icinga-migration

Updated by gbeutner on 2016-08-10 10:15:42 +00:00

  • Subject changed from Daemon does not give a meaningful error message on checkable->ExecuteCheck() to Race condition in CreatePipeOverlapped
  • Category changed from Checker to libbase
  • Target Version set to 2.5.0

@icinga-migration

Updated by TheFlyingCorpse on 2016-08-12 08:02:07 +00:00

Can confirm that both the 1st and the 2nd fix resolve this.

@icinga-migration

Updated by mfriedrich on 2016-11-09 14:59:56 +00:00

  • Parent Id deleted 11310

@icinga-migration icinga-migration added bug Something isn't working libbase labels Jan 17, 2017
@icinga-migration icinga-migration added this to the 2.5.0 milestone Jan 17, 2017