This repository has been archived by the owner on Jan 15, 2019. It is now read-only.

[dev.icinga.com #10114] Service (or host) checks should allow SOFT status for OK as well #1561

Closed
icinga-migration opened this issue Sep 7, 2015 · 4 comments

Comments

@icinga-migration

This issue has been migrated from Redmine: https://dev.icinga.com/issues/10114

Created by leo9641 on 2015-09-07 15:10:23 +00:00

Assignee: (none)
Status: New
Target Version: Backlog
Last Update: 2015-10-26 08:22:20 +00:00 (in Redmine)


Hi icinga team!

Please consider this request too:

NagiosEnterprises/nagioscore#46

@icinga-migration

Updated by mfriedrich on 2015-09-07 15:32:42 +00:00

  • Subject changed from [feature-request] Service (or host) checks should allow SOFT status for OK as well to Service (or host) checks should allow SOFT status for OK as well
  • Category set to Notifications

From a quick read, the feature request is to delay recovery notifications for some reason. I'm not really sure I understand the problem itself: how would a soft recovery that requires additional checks in SOFT-OK then result in a HARD-OK that triggers the recovery notification? That sounds pretty weird to me.

You should probably come up with some diagrams to illustrate the timing and intervals, including all configuration attributes that influence the state machine.

Note: I would consider this for Icinga 2 only. We won't implement such (breaking) changes in 1.x.

@icinga-migration

Updated by leo9641 on 2015-10-06 11:11:54 +00:00

dnsmichi wrote:

From a quick read, the feature request is to delay recovery notifications for some reason. I'm not really sure I understand the problem itself: how would a soft recovery that requires additional checks in SOFT-OK then result in a HARD-OK that triggers the recovery notification? That sounds pretty weird to me.

You should probably come up with some diagrams to illustrate the timing and intervals, including all configuration attributes that influence the state machine.

Note: I would consider this for Icinga 2 only. We won't implement such (breaking) changes in 1.x.

I wrote a simple wrapper for this feature (only for the gw-host check; it sets the UP state for a gw-host only after maxhostattempts successful retries in a row):
https://gist.github.com/lvasiliev/6c847511e53509c8db51

    # 'check-gw-alive-extadm' command definition
    define command{
        command_name    check-gw-alive-extadm
        command_line    $USER3$/extadm/soft_recovery.py --hostname=$HOSTNAME$ --lasthoststate=$LASTHOSTSTATE$ --hostattempt=$HOSTATTEMPT$ --maxhostattempts=$MAXHOSTATTEMPTS$ $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
    }
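
For readers who do not want to open the gist, here is a minimal sketch of how such a wrapper could work. It is not the code from the gist: the consecutive-OK counter, the per-host state file, and the names `decide` and `run_wrapped` are assumptions made for illustration. The idea is to run the wrapped check, and while the host was last seen as DOWN or UNREACHABLE, keep reporting CRITICAL until the wrapped check has succeeded a required number of times in a row.

```python
import json
import subprocess
from pathlib import Path

OK, CRITICAL = 0, 2  # standard plugin exit codes


def decide(check_rc, last_host_state, ok_streak, required_oks):
    """Return (exit_code, new_ok_streak) for one wrapped check result.

    Hypothetical logic: a failing check passes through and resets the
    streak; a passing check on a host that is already UP passes through;
    a passing check on a non-UP host is held back as CRITICAL until
    `required_oks` successes have been seen in a row.
    """
    if check_rc != OK:
        return check_rc, 0
    if last_host_state == "UP":
        return OK, 0
    streak = ok_streak + 1
    if streak >= required_oks:
        return OK, 0
    return CRITICAL, streak


def run_wrapped(hostname, last_host_state, required_oks, command, state_dir):
    """Run `command`, apply decide(), and persist the streak per host.

    `state_dir` is an assumed location for a small JSON file that
    remembers the streak between check executions.
    """
    state_file = Path(state_dir) / f"{hostname}.json"
    try:
        ok_streak = json.loads(state_file.read_text())["ok_streak"]
    except (OSError, ValueError, KeyError, TypeError):
        ok_streak = 0  # no usable state yet: start a fresh streak
    check_rc = subprocess.run(command).returncode
    exit_code, ok_streak = decide(check_rc, last_host_state, ok_streak, required_oks)
    state_file.write_text(json.dumps({"ok_streak": ok_streak}))
    return exit_code
```

In this sketch, `last_host_state` would come from `--lasthoststate=$LASTHOSTSTATE$` and `required_oks` from `--maxhostattempts=$MAXHOSTATTEMPTS$`, with `command` being the trailing `check_ping` invocation. A real wrapper would also need to forward the plugin output and exit with the returned code.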

The transition to the UP state has to be slowed down for gw hosts, because child hosts depend on them (parents -> gw-host).

Template for gw-host (only timing):
define host{
    name                  generic-router-extadm
    check_interval        2                        ; Switches are checked every 2 minutes
    retry_interval        1                        ; Schedule host check retries at 1 minute intervals
    max_check_attempts    5                        ; Check each switch 5 times (max)
    check_command         check-gw-alive-extadm    ; Default command to check if routers are "alive"
}

Template for hosts (only timing):
define host{
    name                  freebsd-server-extadm    ; The name of this host template
    check_interval        4                        ; Actively check the host every 4 minutes
    retry_interval        1                        ; Schedule host check retries at 1 minute intervals
    max_check_attempts    10                       ; Check each FreeBSD host 10 times (max)
}

I want the transition from DOWN to UP for gw hosts to be slower (in case of an unstable network or packet loss). During this period, child hosts are in the UNREACHABLE state and do not send notifications.
Sometimes a gw-host quickly returns to UP from the DOWN state, but checks of the child hosts still return a non-OK state (WARNING, CRITICAL), and after max_check_attempts the child host goes DOWN. Then the gw-host goes DOWN again...

I use the option soft_state_dependencies=1.
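
To illustrate the flapping described above, here is a small simulation sketch. It is a simplified model, not Icinga's state machine: failures take effect immediately (retries on the way down are ignored), and the function names and the sample ping sequence are made up for illustration. It compares how often the reported gateway state flips when one OK is enough to recover versus when several consecutive OKs are required.

```python
def reported_states(results, required_oks):
    """Map raw check results (True = OK) to reported UP/DOWN states.

    Simplified model: any failure reports DOWN at once; recovery to UP
    needs `required_oks` consecutive OK results.
    """
    state, streak, out = "UP", 0, []
    for ok in results:
        if not ok:
            state, streak = "DOWN", 0
        elif state == "DOWN":
            streak += 1
            if streak >= required_oks:
                state, streak = "UP", 0
        out.append(state)
    return out


def flips(states):
    """Count transitions between consecutive reported states."""
    return sum(a != b for a, b in zip(states, states[1:]))


# Made-up sequence: a lossy link that alternates before stabilising.
pings = [True, False, True, False, True, False, True, True, True, True]

immediate = flips(reported_states(pings, required_oks=1))  # 6 flips
delayed = flips(reported_states(pings, required_oks=3))    # 2 flips
```

With the delayed recovery, the gateway stays DOWN through the unstable phase, so dependent hosts remain UNREACHABLE instead of being rechecked against a gateway that is about to fail again.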

@icinga-migration

Updated by mfriedrich on 2015-10-26 08:22:14 +00:00

Ok. If someone comes up with a patch which does not break the existing behaviour we might have a look into it.

@icinga-migration

Updated by mfriedrich on 2015-10-26 08:22:21 +00:00

  • Target Version set to Backlog

@icinga-migration icinga-migration added this to the Backlog milestone Jan 17, 2017
@dnsmichi dnsmichi removed this from the Backlog milestone Feb 2, 2017
@dnsmichi dnsmichi closed this as completed Feb 2, 2017