[dev.icinga.com #11532] Hard state after max_check_attempts + 1? #4093

Closed
icinga-migration opened this issue Apr 5, 2016 · 3 comments
Labels
bug Something isn't working

Comments

@icinga-migration

This issue has been migrated from Redmine: https://dev.icinga.com/issues/11532

Created by mnardin on 2016-04-05 16:40:21 +00:00

Assignee: (none)
Status: Rejected (closed on 2016-05-03 12:12:26 +00:00)
Target Version: (none)
Last Update: 2016-05-03 12:29:44 +00:00 (in Redmine)

Icinga Version: 2.4.4
Backport?: Not yet backported
Include in Changelog: 1

Hi,
I was looking into another issue and I've noticed the following behavior:

{"check_result":{"active":true,"check_source":"icingas01-d","command":["/usr/lib64/nagios/plugins/check_ftp","-H","10.139.0.63","-M","warn","-r","crit","-t","10"],"execution_end":1459872244.4687559605,"execution_start":1459872244.4592659473,"exit_status":2.0,"output":"connect to address 10.139.0.63 and port 21: Connection refused","performance_data":[],"schedule_end":1459872244.4688179493,"schedule_start":1459872259.8600001335,"state":2.0,"type":"CheckResult","vars_after":{"attempt":1.0,"reachable":true,"state":2.0,"state_type":0.0},"vars_before":{"attempt":1.0,"reachable":true,"state":0.0,"state_type":1.0}},"host":"testhost","service":"ftp","timestamp":1459872244.4718899727,"type":"CheckResult"}
{"check_result":{"active":true,"check_source":"icingas01-d","command":["/usr/lib64/nagios/plugins/check_ftp","-H","10.139.0.63","-M","warn","-r","crit","-t","10"],"execution_end":1459872259.874890089,"execution_start":1459872259.8616371155,"exit_status":2.0,"output":"connect to address 10.139.0.63 and port 21: Connection refused","performance_data":[],"schedule_end":1459872259.8756198883,"schedule_start":1459872319.8599998951,"state":2.0,"type":"CheckResult","vars_after":{"attempt":2.0,"reachable":true,"state":2.0,"state_type":0.0},"vars_before":{"attempt":1.0,"reachable":true,"state":2.0,"state_type":0.0}},"host":"testhost","service":"ftp","timestamp":1459872259.879117012,"type":"CheckResult"}
{"check_result":{"active":true,"check_source":"icingas01-d","command":["/usr/lib64/nagios/plugins/check_ftp","-H","10.139.0.63","-M","warn","-r","crit","-t","10"],"execution_end":1459872319.875617981,"execution_start":1459872319.8608028889,"exit_status":2.0,"output":"connect to address 10.139.0.63 and port 21: Connection refused","performance_data":[],"schedule_end":1459872319.8757081032,"schedule_start":1459872379.8599998951,"state":2.0,"type":"CheckResult","vars_after":{"attempt":3.0,"reachable":true,"state":2.0,"state_type":0.0},"vars_before":{"attempt":2.0,"reachable":true,"state":2.0,"state_type":0.0}},"host":"testhost","service":"ftp","timestamp":1459872319.8784070015,"type":"CheckResult"}
{"check_result":{"active":true,"check_source":"icingas01-d","command":["/usr/lib64/nagios/plugins/check_ftp","-H","10.139.0.63","-M","warn","-r","crit","-t","10"],"execution_end":1459872379.8700211048,"execution_start":1459872379.8605520725,"exit_status":2.0,"output":"connect to address 10.139.0.63 and port 21: Connection refused","performance_data":[],"schedule_end":1459872379.8701059818,"schedule_start":1459872439.8600001335,"state":2.0,"type":"CheckResult","vars_after":{"attempt":1.0,"reachable":true,"state":2.0,"state_type":1.0},"vars_before":{"attempt":3.0,"reachable":true,"state":2.0,"state_type":0.0}},"host":"testhost","service":"ftp","timestamp":1459872379.8725450039,"type":"CheckResult"}

If you look at the last line, you will see attempt=3 and state_type=0 in vars_before. After this check, vars_after shows attempt=1 and state_type=1.
In my opinion this is already attempt 4, or am I completely wrong?
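
To make the transition easier to read, here is a minimal Python sketch that condenses one CheckResult event into a single line. The function name `summarize` and the trimmed `last_event` string are illustrative only; the event is a shortened copy of the last log line above (only the fields used are kept), and the mapping 0 = SOFT, 1 = HARD is the way the state_type values are read in this thread.

```python
import json

# state_type values as they appear in the events above: 0 = SOFT, 1 = HARD
STATE_TYPES = {0: "SOFT", 1: "HARD"}


def summarize(line):
    """Return a one-line summary of a CheckResult event given as a JSON string."""
    event = json.loads(line)
    result = event["check_result"]

    def fmt(check_vars):
        # The events carry numbers as floats (e.g. 3.0), so convert to int first.
        attempt = int(check_vars["attempt"])
        state_type = STATE_TYPES[int(check_vars["state_type"])]
        return "attempt {} {}".format(attempt, state_type)

    return "{}!{}: {} -> {}".format(
        event["host"],
        event["service"],
        fmt(result["vars_before"]),
        fmt(result["vars_after"]),
    )


# Trimmed copy of the last event above; fields not used here are left out.
last_event = (
    '{"check_result":{"vars_after":{"attempt":1.0,"state":2.0,"state_type":1.0},'
    '"vars_before":{"attempt":3.0,"state":2.0,"state_type":0.0}},'
    '"host":"testhost","service":"ftp","type":"CheckResult"}'
)

print(summarize(last_event))
```

Applied to the four events in order, this shows the attempt counter going 1, 2, 3 in SOFT state before the fourth result switches to HARD with the counter back at 1.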

These are the properties of the object:

icinga2 object list --name 'testhost!ftp'
Object 'testhost!ftp' of type 'Service':
  % declared in '/var/lib/icinga2/api/packages/director/icingam01-p-1459867433-1/zones.d/ris-global/custom-objects/services/generic-services.conf', lines 254:1-254:19
  * __name = "testhost!ftp"
  * action_url = ""
  * check_command = "ftp"
    % = modified in '/var/lib/icinga2/api/packages/director/icingam01-p-1459867433-1/zones.d/ris-global/custom-objects/services/generic-services.conf', lines 257:5-257:25
  * check_interval = 180
    % = modified in '/var/lib/icinga2/api/packages/director/icingam01-p-1459867433-1/zones.d/director-global/service_templates.conf', lines 2:5-2:23
  * check_period = ""
  * command_endpoint = ""
  * display_name = "FTP"
    % = modified in '/var/lib/icinga2/api/packages/director/icingam01-p-1459867433-1/zones.d/ris-global/custom-objects/services/generic-services.conf', lines 256:5-256:24
  * enable_active_checks = true
    % = modified in '/var/lib/icinga2/api/packages/director/icingam01-p-1459867433-1/zones.d/director-global/service_templates.conf', lines 5:5-5:31
  * enable_event_handler = true
    % = modified in '/var/lib/icinga2/api/packages/director/icingam01-p-1459867433-1/zones.d/director-global/service_templates.conf', lines 7:5-7:31
  * enable_flapping = false
  * enable_notifications = true
    % = modified in '/var/lib/icinga2/api/packages/director/icingam01-p-1459867433-1/zones.d/director-global/service_templates.conf', lines 4:5-4:31
  * enable_passive_checks = true
    % = modified in '/var/lib/icinga2/api/packages/director/icingam01-p-1459867433-1/zones.d/director-global/service_templates.conf', lines 6:5-6:32
  * enable_perfdata = true
    % = modified in '/var/lib/icinga2/api/packages/director/icingam01-p-1459867433-1/zones.d/director-global/service_templates.conf', lines 8:5-8:26
  * event_command = ""
  * flapping_threshold = 30
  * groups = [ ]
  * host_name = "testhost"
    % = modified in '/var/lib/icinga2/api/packages/director/icingam01-p-1459867433-1/zones.d/ris-global/custom-objects/services/generic-services.conf', lines 254:1-254:19
  * icon_image = ""
  * icon_image_alt = ""
  * max_check_attempts = 3
  * name = "ftp"
    % = modified in '/var/lib/icinga2/api/packages/director/icingam01-p-1459867433-1/zones.d/ris-global/custom-objects/services/generic-services.conf', lines 254:1-254:19
  * notes = ""
  * notes_url = ""
  * package = "director"
    % = modified in '/var/lib/icinga2/api/packages/director/icingam01-p-1459867433-1/zones.d/ris-global/custom-objects/services/generic-services.conf', lines 254:1-254:19
  * retry_interval = 60
    % = modified in '/var/lib/icinga2/api/packages/director/icingam01-p-1459867433-1/zones.d/director-global/service_templates.conf', lines 3:5-3:23
  * templates = [ "ftp", "standard-service" ]
    % = modified in '/var/lib/icinga2/api/packages/director/icingam01-p-1459867433-1/zones.d/ris-global/custom-objects/services/generic-services.conf', lines 254:1-254:19
    % = modified in '/var/lib/icinga2/api/packages/director/icingam01-p-1459867433-1/zones.d/director-global/service_templates.conf', lines 1:0-1:34
  * type = "Service"
  * vars = null
  * volatile = false
    % = modified in '/var/lib/icinga2/api/packages/director/icingam01-p-1459867433-1/zones.d/director-global/service_templates.conf', lines 9:5-9:20
  * zone = "DR"
    % = modified in '/var/lib/icinga2/api/packages/director/icingam01-p-1459867433-1/zones.d/ris-global/custom-objects/services/generic-services.conf', lines 254:1-254:19

Mirko


Relations:

@icinga-migration

Updated by mfriedrich on 2016-05-03 12:12:26 +00:00

  • Status changed from New to Rejected

The state machine works as follows:

  • once a service/host changes its state from OK to a NOT-OK state, the check attempt is set to 1 and the retry_interval is used instead of the check_interval
  • depending on the max_check_attempts setting, the counter is increased on every NOT-OK check result
  • once attempt == max_check_attempts is reached, the object still remains in a SOFT state
  • if the next check result is still NOT-OK, the state type changes to HARD and the counter is reset to 1. The retry_interval stays in effect
  • if the check result turns into an OK state, it is a HARD recovery and the counter is reset to 1. The interval changes back from the retry_interval to the check_interval

With max_check_attempts = 3, the check may fail 3 times; the 4th failure turns it into a HARD state and triggers notifications. That works the same way in the existing Icinga 1 world. I'm therefore closing this issue.
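
The rules above can be sketched as a small Python simulation. This is only an illustration of the description in this comment, not Icinga's actual source code: the class `Checkable`, the function `simulate`, and the constants are hypothetical names, and the sketch ignores intervals, reachability, and changes between different NOT-OK states.

```python
from dataclasses import dataclass

OK = 0              # service state OK; any other value counts as NOT-OK here
SOFT, HARD = 0, 1   # state_type values as they appear in the check results


@dataclass
class Checkable:
    max_check_attempts: int = 3
    state: int = OK
    state_type: int = HARD
    attempt: int = 1

    def process(self, new_state):
        """Apply one check result following the rules described above."""
        if new_state == OK:
            # Recovery: described above as a HARD recovery with the counter reset to 1.
            self.state, self.state_type, self.attempt = OK, HARD, 1
            return
        if self.state == OK:
            # First NOT-OK result: SOFT state, attempt 1.
            self.state_type, self.attempt = SOFT, 1
        elif self.state_type == SOFT:
            if self.attempt >= self.max_check_attempts:
                # The result after attempt == max_check_attempts switches to HARD.
                self.state_type, self.attempt = HARD, 1
            else:
                self.attempt += 1
        # Already HARD and still NOT-OK: nothing changes.
        self.state = new_state


def simulate(results, max_check_attempts=3):
    """Return (attempt, state_type) after each check result in `results`."""
    checkable = Checkable(max_check_attempts=max_check_attempts)
    history = []
    for state in results:
        checkable.process(state)
        history.append((checkable.attempt, checkable.state_type))
    return history


# Four CRITICAL (2) results in a row, as in the reported events.
print(simulate([2, 2, 2, 2]))
```

With four consecutive CRITICAL results and max_check_attempts = 3, the sketch yields the same attempt and state_type sequence as the vars_after fields in the reported events: three SOFT results followed by a HARD one with the counter back at 1.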

@icinga-migration

Updated by mnardin on 2016-05-03 12:29:44 +00:00

Thank you. This helped me a lot to understand how this is designed to work.
I was totally unaware of the following:

if the next check result is still NOT-OK, the state type changes to HARD and the counter is reset to 1. the retry_interval stays in effect

Best regards
Mirko

@icinga-migration

Updated by mfrosch on 2016-06-07 10:27:03 +00:00

  • Relates set to 11898

icinga-migration added the bug label on Jan 17, 2017