[dev.icinga.com #1918] make first_notification_delay depend on the first !OK hard state change and don't reset timer for new hard states which would replace it #742

icinga-migration · 2011-09-17T11:59:14Z

This issue has been migrated from Redmine: https://dev.icinga.com/issues/1918

Created by mfriedrich on 2011-09-17 11:59:14 +00:00

Assignee: mfriedrich
Status: Resolved (closed on 2011-11-14 08:20:39 +00:00)
Target Version: 1.6
Last Update: 2011-12-03 11:30:15 +00:00 (in Redmine)

Requires a deeper analysis of the logic itself.

-------- Original Message --------
Subject:    [Nagios-devel] Suggestion to first_notification_delay
Date:   Fri, 2 Sep 2011 17:15:40 -0300
From:   Rogerio F Cunha 
Reply-To:   Nagios Developers List 
To:     nagios-devel@lists.sourceforge.net


I have tested those modifications on the Nagios 3.2.3 source and I think it works better than the original code. The patch keeps the first notification greater than the last state change, whenever the state change happens.

Rogerio Cunha.

--- base/notifications.c    2010-08-04 23:43:53.000000000 -0300
+++ base2/notifications.c.4    2011-09-02 16:53:40.139155621 -0300
@@ -504,13 +504,9 @@

         /* determine the time to use of the first problem point */
         first_problem_time=svc->last_time_ok; /* not accurate, but its the earliest time we could use in the comparison */
-        if((svc->last_time_warning < first_problem_time) && (svc->last_time_warning > svc->last_time_ok))
-            first_problem_time=svc->last_time_warning;
-        if((svc->last_time_unknown < first_problem_time) && (svc->last_time_unknown > svc->last_time_ok))
-            first_problem_time=svc->last_time_unknown;
-        if((svc->last_time_critical < first_problem_time) && (svc->last_time_critical > svc->last_time_ok))
-            first_problem_time=svc->last_time_critical;
-   
+        if((svc->last_hard_state_change > svc->last_time_ok)) {
+            first_problem_time=svc->last_hard_state_change;
+        }
         if(current_time < (time_t)((first_problem_time==(time_t)0L)?program_start:first_problem_time + (svc->first_notification_delay*interval_length))){
             log_debug_info(DEBUGL_NOTIFICATIONS,1,"Not enough time has elapsed since the service changed to a non-OK state, so we should not notify about this problem yet\n");
             return ERROR;
@@ -1393,11 +1389,8 @@

         /* determine the time to use of the first problem point */
         first_problem_time=hst->last_time_up; /* not accurate, but its the earliest time we could use in the comparison */
-        if((hst->last_time_down < first_problem_time) && (hst->last_time_down > hst->last_time_up))
-            first_problem_time=hst->last_time_down;
-        if((hst->last_time_unreachable < first_problem_time) && (hst->last_time_unreachable > hst->last_time_unreachable))
-            first_problem_time=hst->last_time_unreachable;
-   
+        if((hst->last_hard_state_change > first_problem_time))
+            first_problem_time=hst->last_hard_state_change;
         if(current_time < (time_t)((first_problem_time==(time_t)0L)?program_start:first_problem_time + (hst->first_notification_delay*interval_length))){
             log_debug_info(DEBUGL_NOTIFICATIONS,1,"Not enough time has elapsed since the host changed to a non-UP state (or since program start), so we shouldn't notify about this problem yet.\n");
             return ERROR;

-------- Original Message --------
Subject:    Re: [Nagios-devel] Suggestion to first_notification_delay
Date:   Tue, 06 Sep 2011 23:21:54 +0200
From:   Andreas Ericsson 
Reply-To:   Nagios Developers List 
To:     Nagios Developers List 


On 09/02/2011 10:15 PM, Rogerio F Cunha wrote:
> I have tested those modifications on Nagios 3.2.3 source and I think it
> works better than the original code. The patch keeps the first notification
> greater than the last state change, whenever the state change happens.
> 

It took me quite a while to grok what you're saying, although the code
speaks clearly enough for itself. To clarify for others who might not
have the stamina to apply the MUA-mangled patch and read the code for
themselves, the patch intends to make first_notification_delay depend
only on the first non-OK hard state change and not reset the timer for
new hard states that replace the one that initially triggered the
first_notification_delay.

Nice patch. I'll take this when I get back from the US on Wednesday
next week. If you haven't seen anything around Friday next week, feel
free to remind me.

Thanks.

-------- Original Message --------
Subject:    Re: [Nagios-devel] Suggestion to first_notification_delay
Date:   Thu, 15 Sep 2011 09:40:23 +0200
From:   Matthieu Kermagoret 
Reply-To:   Nagios Developers List 
To:     Nagios Developers List 


On Tue, Sep 6, 2011 at 11:21 PM, Andreas Ericsson  wrote:
> It took me quite a while to grok what you're saying, although the code
> speaks clearly enough for itself. To clarify for others who might not
> have the stamina to apply the MUA-mangled patch and read the code for
> themselves, the patch intends to make first_notification_delay depend
> only on the first non-OK hard state change and not reset the timer for
> new hard states that replace the one that initially triggered the
> first_notification_delay.
>

From what I understand from this patch, your explanations are the
expected behavior but not what really happens with the code. Indeed
the patch bases the first_notification_delay on the last hard state
change, not the first non-OK hard state change. Both might be the same
but only if the hard state has not changed during the first
notification delay (from critical to warning for example).

Best regards,


-------- Original Message --------
Subject:    Re: [Nagios-devel] Suggestion to first_notification_delay
Date:   Thu, 15 Sep 2011 10:49:58 +0200
From:   Andreas Ericsson 
Reply-To:   Nagios Developers List 
To:     Nagios Developers List 


On 09/15/2011 09:40 AM, Matthieu Kermagoret wrote:
> On Tue, Sep 6, 2011 at 11:21 PM, Andreas Ericsson  wrote:
>> It took me quite a while to grok what you're saying, although the code
>> speaks clearly enough for itself. To clarify for others who might not
>> have the stamina to apply the MUA-mangled patch and read the code for
>> themselves, the patch intends to make first_notification_delay depend
>> only on the first non-OK hard state change and not reset the timer for
>> new hard states that replace the one that initially triggered the
>> first_notification_delay.
>>
> 
>  From what I understand from this patch, your explanations are the
> expected behavior but not what really happens with the code. Indeed
> the patch bases the first_notification_delay on the last hard state
> change, not the first non-OK hard state change. Both might be the same
> but only if the hard state has not changed during the first
> notification delay (from critical to warning for example).
> 

Exactly. So if a service with first_notification_delay of 300 seconds
goes offline at 21:00, the first check resulting in non-OK state after
21:05 should generate the alert no matter if the state has changed
since then or not. Unless, of course, the service has recovered and
then gotten a new non-OK hard state in that time.
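
The difference discussed in this thread can be illustrated with a minimal C sketch. It is not the code from `base/notifications.c`: the struct `problem_times`, the field `first_problem_hard`, and both function names are hypothetical and introduced only for illustration, and the `program_start` fallback from the original check is left out. The first function takes the reference time the way the posted patch does (last hard state change), the second takes it the way described above (first non-OK hard state change since the last OK).

```c
#include <assert.h>
#include <time.h>

/* Hypothetical, simplified stand-in for the timestamps kept per service. */
struct problem_times {
    time_t last_time_ok;           /* last time the service was seen OK */
    time_t first_problem_hard;     /* assumed field: first non-OK hard state change since last OK */
    time_t last_hard_state_change; /* most recent hard state change of any kind */
};

/* Returns 1 once delay_seconds have passed since reference, else 0. */
static int delay_elapsed(time_t reference, time_t now, long delay_seconds)
{
    assert(delay_seconds >= 0);
    return now >= reference + (time_t)delay_seconds;
}

/* Behaviour of the patch as posted: the timer restarts on every new
 * hard state, e.g. when CRITICAL is replaced by WARNING. */
int may_notify_last_change(const struct problem_times *t, time_t now, long delay_seconds)
{
    time_t reference = t->last_time_ok;
    if (t->last_hard_state_change > t->last_time_ok)
        reference = t->last_hard_state_change;
    return delay_elapsed(reference, now, delay_seconds);
}

/* Behaviour described in the thread: the timer starts at the first
 * non-OK hard state and is only reset by a recovery, which moves
 * last_time_ok past the stale first_problem_hard value. */
int may_notify_first_problem(const struct problem_times *t, time_t now, long delay_seconds)
{
    time_t reference = t->last_time_ok;
    if (t->first_problem_hard > t->last_time_ok)
        reference = t->first_problem_hard;
    return delay_elapsed(reference, now, delay_seconds);
}
```

With times written as seconds since midnight, take a service that goes CRITICAL at 21:00 (75600), changes to WARNING at 21:03 (75780), and has a delay of 300 seconds. At 21:05 (75900) `may_notify_first_problem` allows the notification, while `may_notify_last_change` holds it back until 21:08 (76080).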

Changesets

2011-11-09 12:08:54 +00:00 by mfriedrich 6697aa0

* core: make first_notification_delay depend on the first !OK hard state change and don't reset timer for new hard states which would replace it #1918

refs #1918
refs #2048

Relations:

relates #1918

icinga-migration · 2011-11-07T18:55:54Z

Updated by mfriedrich on 2011-11-07 18:55:54 +00:00

Status changed from New to Assigned
Assigned to set to mfriedrich

icinga-migration · 2011-11-09T13:56:32Z

Updated by mfriedrich on 2011-11-09 13:56:32 +00:00

Target Version set to 1.6
Done % changed from 0 to 50

icinga-migration · 2011-11-11T16:55:50Z

Updated by mfriedrich on 2011-11-11 16:55:50 +00:00

Status changed from Assigned to Feedback
Done % changed from 50 to 100

icinga-migration · 2011-11-14T08:20:39Z

Updated by mfriedrich on 2011-11-14 08:20:39 +00:00

Status changed from Feedback to Resolved

icinga-migration closed this as completed Nov 14, 2011

icinga-migration added bug Notifications labels Jan 17, 2017

icinga-migration added this to the 1.6 milestone Jan 17, 2017
