This repository has been archived by the owner on Jan 15, 2019. It is now read-only.

[dev.icinga.com #4809] no broker event is created when old/stale downtimes are wiped from the core data #1353

Closed
icinga-migration opened this issue Oct 7, 2013 · 7 comments

Comments

@icinga-migration

This issue has been migrated from Redmine: https://dev.icinga.com/issues/4809

Created by mfrosch on 2013-10-07 11:01:09 +00:00

Assignee: (none)
Status: Rejected (closed on 2015-02-15 01:08:29 +00:00)
Target Version: (none)
Last Update: 2015-02-15 01:08:29 +00:00 (in Redmine)

Icinga Version: 1.9.3
OS Version: all

Sometimes Icinga doesn't catch the end of a downtime, e.g. when Icinga is down at that moment.

The downtime itself stays inside the core and status.dat for a while, along with its info comment.

After some time the core seems to clean up that data, but no broker event is generated, so idomod doesn't know the downtime is gone.

Considering this for 1.10, if there is enough time to track it down.

The bug was initially opened against Icinga Web (#3822); see also #4808 about not clearing that data on startup.



@icinga-migration

Updated by mfriedrich on 2013-10-07 20:01:21 +00:00

likely related but not a fix imho: https://github.com/dnsmichi/nagioscore/commit/b81d8280c801ac18e49838a541d049b0c201b736

@icinga-migration

Updated by mfriedrich on 2013-10-07 20:49:33 +00:00

The event for expiring a downtime (EVENT_EXPIRE_DOWNTIME) is only scheduled for flexible downtimes, which may never trigger and therefore never become active within their start-end window. For fixed downtimes it does not make much sense, as they get removed at their end_time anyway.
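
As a rough illustration of that scheduling decision, here is a hypothetical, simplified model (the real core works on its scheduled_downtime list and event queue; the struct and function names below are made up for this sketch):

```c
#include <assert.h>
#include <time.h>

/* Hypothetical, simplified model of the decision described above: an
 * EVENT_EXPIRE_DOWNTIME timed event only makes sense for flexible
 * downtimes, which may never trigger and so can outlive their
 * start-end window; fixed downtimes are removed at end_time anyway. */
typedef struct {
    int fixed;          /* 1 = fixed downtime, 0 = flexible */
    time_t start_time;
    time_t end_time;
    int is_in_effect;   /* has the downtime actually triggered? */
} downtime_model;

/* return 1 if an expire event should be scheduled for this downtime */
static int needs_expire_event(const downtime_model *dt) {
    /* a flexible downtime that never triggered must be expired
     * explicitly once its window has passed */
    return dt->fixed == 0;
}
```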

Deleting a downtime happens in delete_service_downtime(), which removes the downtime from the in-memory list and also triggers a NEB callback via broker_downtime_data() with NEBTYPE_DOWNTIME_DELETE. The status.dat update isn't done immediately but is left to the aggregated status update.

An error may of course happen there, and its return code is not respected in delete_{host,service}_downtime() ... the status update calls are pretty useless after all.

/******************************************************************/
/********************** DELETION FUNCTIONS ************************/
/******************************************************************/

/* deletes a scheduled host downtime entry */
int xdddefault_delete_host_downtime(unsigned long downtime_id) {
        int result;

        result = xdddefault_delete_downtime(HOST_DOWNTIME, downtime_id);

        return result;
}


/* deletes a scheduled service downtime entry */
int xdddefault_delete_service_downtime(unsigned long downtime_id) {
        int result;

        result = xdddefault_delete_downtime(SERVICE_DOWNTIME, downtime_id);

        return result;
}


/* deletes a scheduled host or service downtime entry */
int xdddefault_delete_downtime(int type, unsigned long downtime_id) {

        /* rewrite the downtime file (downtime was already removed from memory) */
        xdddefault_save_downtime_data();

        return OK;
}



/******************************************************************/
/****************** DOWNTIME OUTPUT FUNCTIONS *********************/
/******************************************************************/

/* writes downtime data to file */
int xdddefault_save_downtime_data(void) {

        /* don't update the status file now (too inefficient), let aggregated status updates do it */
        return OK;
}
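
For context, a minimal sketch of the deletion path described above. The function names mirror the core, but the bodies are illustrative stand-ins, and the NEBTYPE constant value here is made up (the real one lives in the core's broker headers):

```c
#include <assert.h>

/* Illustrative stand-in values; not the real header definitions. */
#define NEBTYPE_DOWNTIME_DELETE 1
#define OK 0

static int last_broker_event = -1;

/* stand-in for the core's broker_downtime_data(): in the real core this
 * forwards the event to loaded NEB modules such as idomod */
static void broker_downtime_data(int type, unsigned long downtime_id) {
    (void)downtime_id;
    last_broker_event = type;
}

/* sketch of delete_service_downtime(): remove the downtime from the
 * in-memory list (elided here), notify broker modules so idomod learns
 * the downtime is gone, and leave the status.dat rewrite to the
 * aggregated status update */
static int delete_service_downtime(unsigned long downtime_id) {
    broker_downtime_data(NEBTYPE_DOWNTIME_DELETE, downtime_id);
    return OK;
}
```

The point of the sketch is the ordering: the broker callback is the only immediate side effect of a deletion, so if a downtime vanishes without passing through this path, idomod never hears about it.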

would be interesting to get a reproducible sample downtime, as well as (debug) logs for that.

@icinga-migration

Updated by mfriedrich on 2013-10-08 08:38:50 +00:00

  • Target Version changed from 1.10 to 1.11

@icinga-migration

Updated by mfriedrich on 2013-10-08 17:40:06 +00:00

IMHO the other issue should take care of the general wipe/insert problem. This one here is special, and I am not sure whether those events can be triggered accurately given the information available after reading retention.dat.

@icinga-migration

Updated by mfriedrich on 2014-01-25 16:23:24 +00:00

  • Status changed from Assigned to Feedback
  • Target Version deleted 1.11

@icinga-migration

Updated by mfriedrich on 2014-01-27 19:24:00 +00:00

maybe helps

naemon/naemon-core@4fa9af5

@icinga-migration

Updated by mfriedrich on 2015-02-15 01:08:29 +00:00

  • Status changed from Feedback to Rejected
  • Assigned to deleted mfrosch

I'm unable to reproduce this. During startup, old stale downtimes are deleted, and the event broker is triggered for every deletion.
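
The behavior described here can be sketched roughly as follows (a hypothetical, simplified model of the startup cleanup; the real core walks its scheduled_downtime list and fires the NEB delete callback shown earlier in this thread):

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical model of the startup cleanup described above: every
 * stale downtime (end_time in the past) is deleted, and exactly one
 * broker notification is fired per deletion. */
typedef struct dt_entry {
    unsigned long id;
    time_t end_time;
    struct dt_entry *next;
} dt_entry;

static int broker_events_fired = 0;

static void notify_broker_delete(unsigned long id) {
    (void)id;
    broker_events_fired++;   /* stand-in for the NEB delete callback */
}

/* delete all entries whose end_time lies before 'now'; returns the new
 * list head (no free(): entries are caller-owned in this sketch) */
static dt_entry *wipe_stale_downtimes(dt_entry *head, time_t now) {
    dt_entry **pp = &head;
    while (*pp != NULL) {
        if ((*pp)->end_time < now) {
            notify_broker_delete((*pp)->id);
            *pp = (*pp)->next;   /* unlink the stale entry */
        } else {
            pp = &(*pp)->next;
        }
    }
    return head;
}
```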
