This repository has been archived by the owner on Jan 15, 2019. It is now read-only.

[dev.icinga.com #9157] Only timestamp and "Return code of 255 is out of bounds" in icinga.log #1552

Closed
icinga-migration opened this issue Apr 22, 2015 · 7 comments

@icinga-migration

This issue has been migrated from Redmine: https://dev.icinga.com/issues/9157

Created by icilib0815 on 2015-04-22 12:28:29 +00:00

Assignee: mfriedrich
Status: Resolved (closed on 2015-08-04 18:17:28 +00:00)
Target Version: 1.14
Last Update: 2015-08-04 18:17:28 +00:00 (in Redmine)

Icinga Version: 1.13.1
OS Version: SLES-11.3

check_nrpe (2.15) returns 255 when the target host is not reachable (I have also opened a bug report there; earlier versions returned 2 in this case).
Every time check_nrpe returns 255, I get only a timestamp and "Return code of 255 is out of bounds" in icinga.log, but no information about the host or service. I had to turn on debugging to find out which checks were to blame.
So I think it is a bug in Icinga's logging, because the host and service are missing in this case.
e.g.
icinga.debug:
[1429607307.472741] [016.1] [pid=18124] HOST: lbsf06, SERVICE: NRPE, CHECK TYPE: Active, OPTIONS: 0, SCHEDULED: Yes, RESCHEDULE: Yes, EXITED OK: Yes, RETURN CODE: 255, OUTPUT: connect to address a.b.c.d port 5666: No route to host\nconnect to host a.b.c.d port 5666: No route to host
[1429607307.472861] [016.1] [pid=18124] Service is in a non-OK state!
[1429607307.472874] [016.1] [pid=18124] Host is currently DOWN/UNREACHABLE.
[1429607307.472886] [016.1] [pid=18124] Assuming host is in same state as before...
[1429607307.472911] [032.0] [pid=18124] **** Host Notification Attempt **** Host: 'lbsf06', Type: NORMAL, Options: 0, Current State: 1, Last Notification: Thu Jan 1 01:00:00 1970
[1429607307.472927] [001.0] [pid=18124] check_host_notification_viability()
[1429607307.472939] [001.0] [pid=18124] check_time_against_period()
[1429607307.472958] [032.1] [pid=18124] This host problem has already been acknowledged, so we won't send a notification out!
[1429607307.472971] [032.0] [pid=18124] Notification viability test failed. No notification will be sent out.
[1429607307.472983] [016.1] [pid=18124] Current/Max Attempt(s): 1/3
[1429607307.472994] [016.1] [pid=18124] Host isn't UP, so we won't retry the service check...
[1429607307.473014] [016.1] [pid=18124] Rescheduling next check of service at Tue Apr 21 11:18:18 2015
[1429607307.473026] [001.0] [pid=18124] get_next_valid_time()
[1429607307.473037] [001.0] [pid=18124] check_time_against_period()
[1429607307.473054] [001.0] [pid=18124] schedule_service_check()
[1429607307.473074] [016.0] [pid=18124] Scheduling a non-forced, active check of service 'NRPE' on host 'lbsf06' @ Tue Apr 21 11:18:18 2015

And corresponding entry in icinga.log:
[1429607307] Return code of 255 is out of bounds

Attachments

Changesets

2015-08-04 18:16:30 +00:00 by mfriedrich 82bff8e

Fix missing object context for 'out of bounds' warnings in logs

The host and service code did not match either, added the same
behavior to both.

fixes #9157

Relations:

@icinga-migration

Updated by mfriedrich on 2015-04-26 07:27:26 +00:00

  • Status changed from New to Feedback
  • Assigned to set to icilib0815

That kind of output logging should never appear, as the check alerts happen using a defined parsable pattern ("SERVICE ALERT: ..."). Can you provide a little more (log) context for what is happening in detail? I am not able to reproduce this one.

@icinga-migration

Updated by icilib0815 on 2015-04-27 12:16:16 +00:00

The service NRPE for host lbsf06 is a simple "/usr/lib/nagios/plugins/check_nrpe -H $HOSTADDRESS$"
And the host lbsf06 is switched off. The host check is in state DOWN.

At first I thought it was a problem with the defined service dependency, but even when I remove the service dependency, I get the same log entries in icinga.log.
The dependency is actually working: I get the weird log entry only for the service NRPE and not for the dependent services, which are also all checks via check_nrpe.

The service dependency:
define servicedependency{
host_name lbsf06
service_description NRPE
dependent_host_name lbsf06
dependent_service_description CPU Load, Prozesse, Swap, Partition root, Partition home, Partition boot , Bonding IF, Cron, Syslogd, ntp time
notification_failure_criteria u,w,c
execution_failure_criteria u,w,c
}

@icinga-migration

Updated by berk on 2015-05-18 12:18:18 +00:00

  • Target Version set to Backlog

@icinga-migration

Updated by mfriedrich on 2015-08-04 17:48:08 +00:00

It seems there's a mismatch between how hosts and services handle out-of-bounds return codes.

hosts

                /* make sure the return code is within bounds */
                else if (queued_check_result->return_code < 0 || queued_check_result->return_code > 3) {

                        logit(NSLOG_RUNTIME_WARNING, TRUE, "Warning: Return code of %d for check of host '%s' was out of bounds.%s\n", queued_check_result->return_code, temp_host->name, (queued_check_result->return_code == 126 || queued_check_result->return_code == 127) ? " Make sure the plugin you're trying to run actually exists." : "");

                        my_free(temp_host->plugin_output);
                        my_free(temp_host->long_plugin_output);
                        my_free(temp_host->perf_data);

                        asprintf(&temp_host->plugin_output, "(Return code of %d is out of bounds%s)", queued_check_result->return_code, (queued_check_result->return_code == 126 || queued_check_result->return_code == 127) ? " - plugin may be missing" : "");

                        result = STATE_CRITICAL;
                }

services

        /* make sure the return code is within bounds */
        else if (queued_check_result->return_code < 0 || queued_check_result->return_code > 3) {

                if (queued_check_result->return_code == 126) {
                        asprintf(&temp_service->plugin_output, "The command defined for service %s is not an executable\n", queued_check_result->service_description);
                } else if (queued_check_result->return_code == 127) {
                        asprintf(&temp_service->plugin_output, "The command defined for service %s does not exist\n", queued_check_result->service_description);
                } else {
                        asprintf(&temp_service->plugin_output, "Return code of %d is out of bounds", queued_check_result->return_code);
                }
                logit(NSLOG_RUNTIME_WARNING, TRUE, "%s", temp_service->plugin_output);

                temp_service->current_state = STATE_CRITICAL;
        }
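
For illustration only, here is a minimal sketch of how both code paths could share one message format that always carries the object context. It is not the content of changeset 82bff8e. The helper name format_out_of_bounds and its parameters are hypothetical and do not exist in Icinga; the message wording for the host case is copied from the host snippet above, and the service wording is an assumption modelled on it.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not part of Icinga): formats the warning for an
 * out-of-bounds plugin return code so that the log line always names the
 * host and, for service checks, the service as well.
 * Pass NULL as service_description for a host check.
 * Returns the value of snprintf().
 */
static int format_out_of_bounds(char *buf, size_t len, int return_code,
                                const char *host_name,
                                const char *service_description)
{
    /* 126 and 127 are what a shell reports for "not executable" and
     * "command not found", so hint at a missing plugin in those cases. */
    const char *hint = (return_code == 126 || return_code == 127)
        ? " Make sure the plugin you're trying to run actually exists."
        : "";

    if (service_description != NULL) {
        /* Service wording is an assumption, mirroring the host message. */
        return snprintf(buf, len,
            "Warning: Return code of %d for check of service '%s' on host '%s' was out of bounds.%s",
            return_code, service_description, host_name, hint);
    }

    return snprintf(buf, len,
        "Warning: Return code of %d for check of host '%s' was out of bounds.%s",
        return_code, host_name, hint);
}
```

Called with 255, "lbsf06" and "NRPE", the helper produces a line that names both the service and the host, which is the information missing from the icinga.log entry reported above. The real code would pass the resulting text to logit() with NSLOG_RUNTIME_WARNING, as both snippets already do.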

@icinga-migration

Updated by mfriedrich on 2015-08-04 18:16:10 +00:00

  • File added Auswahl_064.png
  • File added 9157.cfg
  • Status changed from Feedback to Assigned
  • Assigned to changed from icilib0815 to mfriedrich
  • Target Version changed from Backlog to 1.14

Test config attached.

Auswahl_064.png

@icinga-migration

Updated by mfriedrich on 2015-08-04 18:17:28 +00:00

  • Status changed from Assigned to Resolved
  • Done % changed from 0 to 100

Applied in changeset 82bff8e.

@icinga-migration

Updated by mfriedrich on 2016-04-07 21:17:31 +00:00

  • Relates set to 11546
