[dev.icinga.com #11012] Scheduled downtimes are reset after restart #3864

Closed
icinga-migration opened this issue Jan 22, 2016 · 27 comments
Labels: area/notifications, bug

Comments

@icinga-migration

This issue has been migrated from Redmine: https://dev.icinga.com/issues/11012

Created by vaisov on 2016-01-22 11:25:42 +00:00

Assignee: mfriedrich
Status: Closed (closed on 2016-12-08 00:06:47 +00:00)
Target Version: (none)
Last Update: 2017-01-05 14:27:12 +00:00 (in Redmine)

Icinga Version: 2.4.1
Backport?: Not yet backported
Include in Changelog: 1

I have a two-master setup with a couple of satellites working as checkers.
Whenever I reload icinga2 with /etc/init.d/icinga2 reload, I start to get notifications for services which had downtimes scheduled before the reload.
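A minimal way to reproduce is to schedule a downtime over the REST API, reload, and watch whether notifications for the downtimed service resume. The following is only a sketch; it assumes the API feature is enabled on port 5665, an API user "root" with password "icinga", and hypothetical host/service names:

# schedule a one-hour downtime for a service (hypothetical names; adjust to your setup)
NOW=$(date +%s)
curl -k -s -u root:icinga -H 'Accept: application/json' \
  -X POST 'https://localhost:5665/v1/actions/schedule-downtime' \
  -d "{ \"type\": \"Service\", \"filter\": \"host.name==\\\"example-host\\\" && service.name==\\\"ping4\\\"\", \"author\": \"vaisov\", \"comment\": \"maintenance\", \"start_time\": $NOW, \"end_time\": $((NOW + 3600)), \"fixed\": true }"

# reload the daemon; the downtime should still suppress notifications afterwards
/etc/init.d/icinga2 reload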


Relations:

@icinga-migration

Updated by mfriedrich on 2016-01-22 16:00:37 +00:00

  • Status changed from New to Feedback
  • Assigned to set to vaisov

Do you happen to have any logs and/or configs for easily reproducing the problem?

@icinga-migration

Updated by vaisov on 2016-01-25 08:17:31 +00:00

Not sure if I can attach the whole configuration. I'd say a simple master-master-satellites setup would be enough to reproduce the issue. If you need any particular part of the config, please let me know.

The problem is: master loses downtimes after reload/restart.

@icinga-migration

Updated by mfriedrich on 2016-01-25 08:59:14 +00:00

  • Status changed from Feedback to Assigned
  • Assigned to changed from vaisov to mfriedrich

Ok, thanks. I'll have a look into that.

Kind regards,
Michael

@icinga-migration

Updated by vaisov on 2016-02-08 10:30:19 +00:00

Is there any update on this?

@icinga-migration

Updated by jeunito on 2016-02-25 02:21:02 +00:00

+1. Getting this issue too.

@icinga-migration

Updated by mfriedrich on 2016-03-04 15:39:28 +00:00

  • Parent Id set to 11312

@icinga-migration

Updated by mfriedrich on 2016-03-09 15:34:00 +00:00

  • Relates set to 11173

@icinga-migration

Updated by mfriedrich on 2016-03-15 15:14:26 +00:00

  • Relates set to 11382

@icinga-migration

Updated by rhillmann on 2016-05-19 09:57:02 +00:00

+1. Since updating from 2.3.11 to 2.4.8 this is really annoying; the downtimes get reset completely.

@icinga-migration

Updated by ziaunys on 2016-07-22 22:36:09 +00:00

I'm also having this issue. +1

@icinga-migration

Updated by pstiffel on 2016-07-26 12:04:50 +00:00

V2.4.10 here, still experiencing this very annoying issue in a two-master cluster scenario. After a reload, notifications for some services which are either in downtime or have notifications disabled are sent out.
I can't find a pattern for which services the notifications are sent out (or which notification object fires).
Example: I have a service with notifications disabled. There are three different notification objects attached to this service (mail, sms, ticketsystemintegration) and only the ticketsystemintegration fires a notification...

@icinga-migration

Updated by mfriedrich on 2016-08-05 13:57:23 +00:00

Testing the current snapshot packages for 2.5 with additional cluster notification fixes. I'm not able to reproduce the issue here.

Can you please test the current snapshot packages and check whether this solves the problem for you?

test config

object Host "11012-host" {
  check_command = "dummy"
}

apply Service "11012-service-" for (i in range(10)) {
  check_command = "dummy"
  check_interval = 30s
  retry_interval = 30s

  vars.dummy_state = 2
  vars.dummy_text = {{
    var downtimes = []
    for (d in get_objects(Downtime)) {
      var host = macro("$host.name$")
      var service = macro("$service.name$")

      if (d.host_name != host && d.service_name != service) {
        continue;
      }

      downtimes.add([d.name, DateTime(d.trigger_time).to_string()])
    }

    var str = "Service downtimes: "

    for (dt in downtimes) {
      str += dt[0] + ":" + dt[1] + "\n"
    }

    return str

  }}

  assign where match("11012*", host.name)
}

object User "mif" {
  email = "michael.friedrich@netways.de"
}

apply Notification "11012-notification" to Service {
  import "mail-service-notification"

  users = [ "mif" ]

  interval = 5s

  assign where match("11012*", host.name)
}
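To check whether downtimes survive a reload with this setup, one option is to list the downtime objects over the REST API before and after the reload and compare. This is a sketch, assuming the API is enabled on port 5665 and an API user "root" with password "icinga":

# list all downtime objects known to the running daemon
curl -k -s -u root:icinga 'https://localhost:5665/v1/objects/downtimes'

# reload, then repeat the query above and compare the results
/etc/init.d/icinga2 reload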

@icinga-migration

Updated by rhillmann on 2016-08-29 09:45:34 +00:00

can anyone confirm this is working with 2.5?

@icinga-migration

Updated by vaisov on 2016-08-29 10:53:04 +00:00

rhillmann wrote:

can anyone confirm this is working with 2.5?

Just checked with 2.5.3. Downtimes survive neither a reload nor a restart.

@icinga-migration

Updated by phsc on 2016-08-29 12:49:33 +00:00

vaisov wrote:

rhillmann wrote:
> can anyone confirm this is working with 2.5?

Just checked with 2.5.3. Downtimes survive neither a reload nor a restart.

I recently updated to 2.5.3 as well and am encountering the same issue.

@icinga-migration

Updated by gvde on 2016-10-18 06:43:32 +00:00

I can confirm that this problem is still there in 2.5.4. I only have a single icinga2 server, running on CentOS 7, installed from the Icinga stable repositories. No cluster. I don't think it has anything to do with clustering.

Any downtime which is active before the time of a reload/restart won't be active afterwards.

It affects all downtimes: downtimes set up through Icinga Web 2 as well as ScheduledDowntime objects configured in files.

I have noticed that, when queried through the API, the host object has downtime_depth = 1.0 during the downtime. After the reload it goes back to 0.0.
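For reference, a sketch of such a query, assuming a hypothetical host "example-host" and an API user "root" with password "icinga":

# fetch only the downtime_depth attribute of a single host
curl -k -s -u root:icinga 'https://localhost:5665/v1/objects/hosts/example-host?attrs=downtime_depth'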

It's really annoying, as we basically cannot use downtimes. We automatically generate host groups for our VM pool every hour, and if any host group changes the server internally does a reload, thus forgetting all downtimes...

@icinga-migration

Updated by rhillmann on 2016-10-18 07:44:52 +00:00

After updating from 2.4.10 to 2.5.4 it now seems to work for me, but I have enabled accept_config = true as well (probably this affects the issue). It seems downtimes are synced via the REST API package.

@icinga-migration

Updated by mfriedrich on 2016-11-09 14:58:46 +00:00

  • Parent Id deleted 11312

@icinga-migration

Updated by mfriedrich on 2016-12-08 00:06:47 +00:00

  • Status changed from Assigned to Closed

Probably related to the HA cluster not syncing the downtimes. Please re-test with the snapshot packages and/or 2.6. I haven't seen this during the release tests for 2.6.

@icinga-migration

Updated by vaisov on 2017-01-05 09:34:41 +00:00

Still happening for me after updating both masters and all satellites to 2.6.0

@icinga-migration

Updated by gvde on 2017-01-05 10:47:08 +00:00

vaisov wrote:

Still happening for me after updating both masters and all satellites to 2.6.0

I had the same problem with a standalone server. But it was due to a missing include.conf file for the _api directory. See #13725

For a master/satellite setup the directories are different, but check with "icinga2 daemon -C" to see which directories are included, find the directory where icinga2 stores the downtimes, and check whether it is included or not...

@icinga-migration

Updated by vaisov on 2017-01-05 11:13:00 +00:00

"icinga2 daemon -C" doesn't show any directories included.

In include.conf I have the following include on both masters:

include "*/include.conf"

@icinga-migration

Updated by gvde on 2017-01-05 11:58:02 +00:00

vaisov wrote:

"icinga2 daemon -C" doesn't show any directories included.

Sorry. I didn't check the exact command which lists the files included. You need log level notice or debug to see which files are read:

icinga2 daemon -C -x debug

Look for the _api package, e.g.

notice/ConfigCompiler: Compiling config file: /var/lib/icinga2/api/packages/_api/include.conf

In the file system, search for the downtime file which you have created, e.g. with "fgrep -r", and check whether those files/directories are included or not...
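A sketch of such a search, assuming the default state directory under /var/lib/icinga2 and a hypothetical host name "example-host" in the downtime:

# find the files on disk that hold the downtime, then compare their directories
# against the ones listed in the debug output above
fgrep -rl 'example-host' /var/lib/icinga2/api/packages/_api/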

@icinga-migration

Updated by vaisov on 2017-01-05 13:29:51 +00:00

Looks like you were right. I didn't have /var/lib/icinga2/api/packages/_api/ops-monmaster1-1447844624-1/include.conf on main master.
Now everything's fine. I think this file should be recreated upon reload/restart.

@icinga-migration

Updated by gvde on 2017-01-05 14:27:12 +00:00

vaisov wrote:

Looks like you were right. I didn't have /var/lib/icinga2/api/packages/_api/ops-monmaster1-1447844624-1/include.conf on main master.
Now everything's fine. I think this file should be recreated upon reload/restart.

Great. Could you please post a "me too" at #13725? I opened #13725 for that exact problem, and it's probably good if the developers know that it's not an isolated problem.

@icinga-migration

Updated by mfriedrich on 2017-01-09 15:44:29 +00:00

  • Relates set to 13725

@icinga-migration

Updated by mfriedrich on 2017-01-09 15:44:39 +00:00

  • Relates set to 10638

icinga-migration added the bug and area/notifications labels on Jan 17, 2017