[dev.icinga.com #10058] Wrong calculation for host compat state "UNREACHABLE" in DB IDO #3352
Comments
Updated by tgelf on 2015-09-01 08:04:09 +00:00
Seems that I'm missing permissions to link issues, so here are the related ones I found:
Updated by mfrosch on 2015-09-01 12:51:27 +00:00
Updated by mfrosch on 2015-09-01 12:51:41 +00:00
Updated by mfrosch on 2015-09-01 12:51:49 +00:00
Updated by mfrosch on 2015-09-01 12:52:03 +00:00
Updated by mfriedrich on 2015-09-03 16:09:45 +00:00
I'm now removing this from backlog as I want to discuss this further, and keep it open for suggestions and possible fixes. As far as I understand this issue, Icinga 2 actually sets a host being UP to UNREACHABLE if one of its parent objects is DOWN. Am I right about that?
The other relevant issues - multi-parent dependencies - target a different problem. They want to have DOWN hosts marked as UNREACHABLE only if all parent objects are DOWN. That's not the issue here as far as I am concerned.
Updated by tgelf on 2015-09-03 16:14:30 +00:00
dnsmichi wrote:
Correct, at least that's what I read from #10049.
Well... that one might be subject to further discussion, afair that's not how 1.x used to work - or is it? Nonetheless you're right, that's not what this specific issue was all about. Thanks,
Updated by mfriedrich on 2015-09-03 16:28:16 +00:00
tgelf wrote:
I'll try to dig up a test case for that, it still makes no sense that this actually happens.
As you've remarked earlier, there's a difference between dependencies and host parents in 1.x. AFAIK host parents behave as logical AND, see my comment in https://dev.icinga.org/issues/6871#note-23
Nonetheless I'll try to look into this next to 2.4 issues.
Updated by mfriedrich on 2015-09-04 11:15:05 +00:00
Tests
Steps
Problem
It's only a matter of external interfaces here. The inner core parts of Icinga 2 do not know about the state "UNREACHABLE". We've added that state for convenience reasons to DB IDO etc., but probably should not have done so. Icinga Web 2 already provides the column "reachable", which should be taken into account to better show the user that this host is UP but the dependency chain caused it to become "unreachable". That's a different topic, though, and not part of this issue.
Code
hostdbobject.cpp
statusdatawriter.cpp
Certainly more.
Proposed Fix
Eliminate all occurrences of "UNREACHABLE" and of the hardcoded 2 as state, and move them into a central CompatUtility class method where we change the way this compat state is calculated.
Updated by mfriedrich on 2015-09-04 11:30:04 +00:00
Applied in changeset 50cd694.
Updated by mfriedrich on 2015-09-04 11:48:23 +00:00
This issue has been migrated from Redmine: https://dev.icinga.com/issues/10058
Created by tgelf on 2015-09-01 08:02:39 +00:00
Assignee: mfriedrich
Status: Resolved (closed on 2015-09-04 11:30:04 +00:00)
Target Version: 2.3.10
Last Update: 2015-09-04 11:48:22 +00:00 (in Redmine)
Bugs reporting erroneous state when multiple "parents" are involved are popping up from time to time; they used to be stalled or rejected. Background:
IMO we have to live with the fact that, to support legacy backends, we have to continue exporting one single state field as we used to. This means showing no reachability information for services, but I guess most people will not miss that feature. My conclusion is that all we need to do to fix this is to slightly adjust state calculation. What all bug reporters stumbled upon are erroneous UNREACHABLE states when the object was in fact OK/UP.
So to make them happy we do not need to completely change the current behaviour. Logic for legacy state output writers should be as simple as:
A node depending on multiple parents is to be considered UP as long as its check plugin is telling me so. If it is DOWN, then we need to find out whether it is reachable or not. If any of its parents are failing we consider this a network outage and set the (now virtual) state to UNREACHABLE. That would restore the former behaviour and make everybody happy I guess. At least, I hope so ;)
Cheers,
Thomas
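For illustration, the rule described above could be sketched roughly as follows. This is only a minimal sketch under assumptions, not the code from the actual changeset: HostNode, AnyParentFailing and GetCompatHostState are hypothetical names, the host model is heavily simplified, and only direct parents are considered. The numeric values follow the legacy 1.x convention (0 = UP, 1 = DOWN, 2 = UNREACHABLE).

```cpp
#include <cassert>
#include <vector>

// Legacy host state codes as exported to 1.x-style backends.
enum class CompatHostState : int { Up = 0, Down = 1, Unreachable = 2 };

// Hypothetical, simplified host model (NOT Icinga 2's real Host class).
struct HostNode {
    bool checkUp;                          // what the host's own check plugin reports
    std::vector<const HostNode*> parents;  // direct parents only
};

// True if at least one direct parent fails its own check.
bool AnyParentFailing(const HostNode& host) {
    for (const HostNode* parent : host.parents) {
        if (parent != nullptr && !parent->checkUp)
            return true;
    }
    return false;
}

// Single central place that computes the legacy ("compat") state,
// so that output writers do not hardcode the value 2 themselves.
CompatHostState GetCompatHostState(const HostNode& host) {
    // A host whose own check succeeds is always UP, whatever its parents do.
    if (host.checkUp)
        return CompatHostState::Up;

    // Only a failing host can be UNREACHABLE: a failing parent is
    // treated as a network outage, otherwise the host is simply DOWN.
    return AnyParentFailing(host) ? CompatHostState::Unreachable
                                  : CompatHostState::Down;
}

// Small usage example: one healthy and one failing parent.
void CompatStateExample() {
    HostNode routerA{true, {}};
    HostNode routerB{false, {}};
    HostNode server{true, {&routerA, &routerB}};

    // The server's own check is fine, so it stays UP despite routerB.
    assert(GetCompatHostState(server) == CompatHostState::Up);

    // Once the server's own check fails, the failing parent makes it UNREACHABLE.
    server.checkUp = false;
    assert(GetCompatHostState(server) == CompatHostState::Unreachable);
}
```

The point of the sketch is the order of the checks: the host's own check result is looked at first, and the parents are consulted only once the host itself is failing, so an UP host can never be exported as UNREACHABLE.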
Attachments
Changesets
2015-09-04 11:24:41 +00:00 by mfriedrich 50cd694
2015-09-04 11:25:18 +00:00 by mfriedrich 0a43e81
Relations: