This repository has been archived by the owner on Jan 15, 2019. It is now read-only.

[dev.icinga.com #1918] make first_notification_delay depend on the first !OK hard state change and don't reset timer for new hard states which would replace it #742

Closed
icinga-migration opened this issue Sep 17, 2011 · 4 comments

Comments

@icinga-migration

This issue has been migrated from Redmine: https://dev.icinga.com/issues/1918

Created by mfriedrich on 2011-09-17 11:59:14 +00:00

Assignee: mfriedrich
Status: Resolved (closed on 2011-11-14 08:20:39 +00:00)
Target Version: 1.6
Last Update: 2011-12-03 11:30:15 +00:00 (in Redmine)


Requires a deeper analysis of the logic itself.

-------- Original Message --------
Subject:    [Nagios-devel] Suggestion to first_notification_delay
Date:   Fri, 2 Sep 2011 17:15:40 -0300
From:   Rogerio F Cunha 
Reply-To:   Nagios Developers List 
To:     nagios-devel@lists.sourceforge.net


I have tested those modifications on the Nagios 3.2.3 source and I think they work better than the original code. The patch keeps the first notification later than the last state change, whenever the state change happens.

Rogerio Cunha.

--- base/notifications.c    2010-08-04 23:43:53.000000000 -0300
+++ base2/notifications.c.4    2011-09-02 16:53:40.139155621 -0300
@@ -504,13 +504,9 @@

         /* determine the time to use of the first problem point */
         first_problem_time=svc->last_time_ok; /* not accurate, but its the earliest time we could use in the comparison */
-        if((svc->last_time_warning < first_problem_time) && (svc->last_time_warning > svc->last_time_ok))
-            first_problem_time=svc->last_time_warning;
-        if((svc->last_time_unknown < first_problem_time) && (svc->last_time_unknown > svc->last_time_ok))
-            first_problem_time=svc->last_time_unknown;
-        if((svc->last_time_critical < first_problem_time) && (svc->last_time_critical > svc->last_time_ok))
-            first_problem_time=svc->last_time_critical;
-   
+        if((svc->last_hard_state_change > svc->last_time_ok)) {
+            first_problem_time=svc->last_hard_state_change;
+        }
         if(current_time < (time_t)((first_problem_time==(time_t)0L)?program_start:first_problem_time + (svc->first_notification_delay*interval_length))){
             log_debug_info(DEBUGL_NOTIFICATIONS,1,"Not enough time has elapsed since the service changed to a non-OK state, so we should not notify about this problem yet\n");
             return ERROR;
@@ -1393,11 +1389,8 @@

         /* determine the time to use of the first problem point */
         first_problem_time=hst->last_time_up; /* not accurate, but its the earliest time we could use in the comparison */
-        if((hst->last_time_down < first_problem_time) && (hst->last_time_down > hst->last_time_up))
-            first_problem_time=hst->last_time_down;
-        if((hst->last_time_unreachable < first_problem_time) && (hst->last_time_unreachable > hst->last_time_unreachable))
-            first_problem_time=hst->last_time_unreachable;
-   
+        if((hst->last_hard_state_change > first_problem_time))
+            first_problem_time=hst->last_hard_state_change;
         if(current_time < (time_t)((first_problem_time==(time_t)0L)?program_start:first_problem_time + (hst->first_notification_delay*interval_length))){
             log_debug_info(DEBUGL_NOTIFICATIONS,1,"Not enough time has elapsed since the host changed to a non-UP state (or since program start), so we shouldn't notify about this problem yet.\n");
             return ERROR;
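
The threshold expression used in both hunks can be read in isolation. Below is a minimal sketch, assuming a hypothetical helper named notification_threshold that is not part of the Nagios or Icinga source; it only mirrors the expression from the patch. Because the conditional operator binds more loosely than '+', the delay is added only in the else branch, so when no problem time is recorded the threshold is program_start itself.

```c
#include <time.h>

/* Hypothetical helper (not in the Nagios/Icinga source) that mirrors the
 * threshold expression from the patch:
 *
 *   (first_problem_time == 0) ? program_start
 *                             : first_problem_time + delay * interval_length
 *
 * The delay is added only when a problem time is recorded. */
static time_t notification_threshold(time_t first_problem_time,
                                     time_t program_start,
                                     double first_notification_delay,
                                     int interval_length)
{
    return (time_t)((first_problem_time == (time_t)0L)
                        ? program_start
                        : first_problem_time
                              + (first_notification_delay * interval_length));
}

/* Returns 1 when the notification must still be held back, matching the
 * "current_time < threshold" test in the patch. */
static int delay_still_running(time_t current_time, time_t threshold)
{
    return current_time < threshold;
}
```

With an interval_length of 60 and a first_notification_delay of 5, a problem time of 1000 gives a threshold of 1300, and a check at 1299 is still held back.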

-------- Original Message --------
Subject:    Re: [Nagios-devel] Suggestion to first_notification_delay
Date:   Tue, 06 Sep 2011 23:21:54 +0200
From:   Andreas Ericsson 
Reply-To:   Nagios Developers List 
To:     Nagios Developers List 


On 09/02/2011 10:15 PM, Rogerio F Cunha wrote:
> I have tested those modifications on the Nagios 3.2.3 source and I think
> they work better than the original code. The patch keeps the first
> notification later than the last state change, whenever the state change happens.
> 

It took me quite a while to grok what you're saying, although the code
speaks clearly enough for itself. To clarify for others who might not
have the stamina to apply the MUA-mangled patch and read the code for
themselves, the patch intends to make first_notification_delay depend
only on the first non-OK hard state change and not reset the timer for
new hard states that replace the one that initially triggered the
first_notification_delay.

Nice patch. I'll take this when I get back from the US on Wednesday
next week. If you haven't seen anything by around Friday next week, feel
free to remind me.

Thanks.

-------- Original Message --------
Subject:    Re: [Nagios-devel] Suggestion to first_notification_delay
Date:   Thu, 15 Sep 2011 09:40:23 +0200
From:   Matthieu Kermagoret 
Reply-To:   Nagios Developers List 
To:     Nagios Developers List 


On Tue, Sep 6, 2011 at 11:21 PM, Andreas Ericsson  wrote:
> It took me quite a while to grok what you're saying, although the code
> speaks clearly enough for itself. To clarify for others who might not
> have the stamina to apply the MUA-mangled patch and read the code for
> themselves, the patch intends to make first_notification_delay depend
> only on the first non-OK hard state change and not reset the timer for
> new hard states that replace the one that initially triggered the
> first_notification_delay.
>

From what I understand of this patch, your explanation describes the
expected behavior but not what the code actually does. The patch bases
the first_notification_delay on the last hard state change, not the
first non-OK hard state change. The two are the same only if the hard
state has not changed during the first notification delay (from
critical to warning, for example).

Best regards,


-------- Original Message --------
Subject:    Re: [Nagios-devel] Suggestion to first_notification_delay
Date:   Thu, 15 Sep 2011 10:49:58 +0200
From:   Andreas Ericsson 
Reply-To:   Nagios Developers List 
To:     Nagios Developers List 


On 09/15/2011 09:40 AM, Matthieu Kermagoret wrote:
> On Tue, Sep 6, 2011 at 11:21 PM, Andreas Ericsson  wrote:
>> It took me quite a while to grok what you're saying, although the code
>> speaks clearly enough for itself. To clarify for others who might not
>> have the stamina to apply the MUA-mangled patch and read the code for
>> themselves, the patch intends to make first_notification_delay depend
>> only on the first non-OK hard state change and not reset the timer for
>> new hard states that replace the one that initially triggered the
>> first_notification_delay.
>>
> 
> From what I understand of this patch, your explanation describes the
> expected behavior but not what the code actually does. The patch bases
> the first_notification_delay on the last hard state change, not the
> first non-OK hard state change. The two are the same only if the hard
> state has not changed during the first notification delay (from
> critical to warning, for example).
> 

Exactly. So if a service with first_notification_delay of 300 seconds
goes offline at 21:00, the first check resulting in non-OK state after
21:05 should generate the alert no matter if the state has changed
since then or not. Unless, of course, the service has recovered and
then gotten a new non-OK hard state in that time.
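
The difference discussed in this thread can be illustrated with a small sketch. It assumes a hypothetical delay_tracker struct and helper functions that are not taken from the Nagios or Icinga source, and it is not the change that was eventually committed. It only contrasts a timer anchored on the last hard state change with one anchored on the first non-OK hard state change.

```c
#include <stdbool.h>
#include <time.h>

enum { STATE_OK = 0, STATE_WARNING = 1, STATE_CRITICAL = 2 };

/* Hypothetical tracker for illustration; not the real service struct. */
struct delay_tracker {
    int    last_hard_state;
    time_t last_hard_state_change; /* moves on every hard state change */
    time_t first_problem_time;     /* set on OK -> non-OK only; 0 while OK */
};

/* Record a new hard state observed at time 'now'. */
static void record_hard_state(struct delay_tracker *t, int new_state, time_t now)
{
    if (new_state == t->last_hard_state)
        return;                          /* no hard state change */

    if (new_state == STATE_OK)
        t->first_problem_time = 0;       /* recovery clears the timer */
    else if (t->last_hard_state == STATE_OK)
        t->first_problem_time = now;     /* first non-OK hard state starts it */
    /* non-OK -> different non-OK: first_problem_time is left untouched */

    t->last_hard_state = new_state;
    t->last_hard_state_change = now;
}

/* True once 'delay_seconds' have passed since 'start'. */
static bool delay_elapsed(time_t start, time_t now, time_t delay_seconds)
{
    return now >= start + delay_seconds;
}
```

Taking the 21:00 example with a 300 second delay: if the service goes CRITICAL at 21:00 and changes to WARNING at 21:03, a check at 21:05 passes delay_elapsed() when started from first_problem_time, but not when started from last_hard_state_change, which would hold the alert until 21:08.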

Changesets

2011-11-09 12:08:54 +00:00 by mfriedrich 6697aa0

* core: make first_notification_delay depend on the first !OK hard state change and don't reset timer for new hard states which would replace it #1918

refs #1918
refs #2048

Relations:

@icinga-migration

Updated by mfriedrich on 2011-11-07 18:55:54 +00:00

  • Status changed from New to Assigned
  • Assigned to set to mfriedrich

@icinga-migration

Updated by mfriedrich on 2011-11-09 13:56:32 +00:00

  • Target Version set to 1.6
  • Done % changed from 0 to 50

@icinga-migration

Updated by mfriedrich on 2011-11-11 16:55:50 +00:00

  • Status changed from Assigned to Feedback
  • Done % changed from 50 to 100

@icinga-migration

Updated by mfriedrich on 2011-11-14 08:20:39 +00:00

  • Status changed from Feedback to Resolved
