[dev.icinga.com #10085] cluster check requesting host/port attributes #3375

Closed
icinga-migration opened this issue Sep 3, 2015 · 16 comments
Labels: area/distributed - Distributed monitoring (master, satellites, clients); bug - Something isn't working

Comments

@icinga-migration

This issue has been migrated from Redmine: https://dev.icinga.com/issues/10085

Created by henti on 2015-09-03 11:18:57 +00:00

Assignee: (none)
Status: Closed (closed on 2016-03-18 17:29:00 +00:00)
Target Version: (none)
Last Update: 2016-03-18 17:29:00 +00:00 (in Redmine)

Icinga Version: 2.3.9
Backport?: Not yet backported
Include in Changelog: 1

We run a MoM <- Master <- Client setup: a Master of Masters (MoM) server in our office handles all our dependencies and notifications, and Masters in the regions (DMZs) are what the clients connect to.
Clients are configured using puppet.

The Master is configured in a cluster with the MoM. Both the Master and the MoM have Icingaweb2 for dashboards. Bug 9262 impacted us: statuses were not updated on the MoM when they changed on the Master, so the dashboards were out of sync. With the release of 2.3.9, we set up the configuration again. All instances are 2.3.9.

The dashboard on the Master is working as expected. I have 6 clients connected, all working.
The dashboard on the MoM shows 4 clients as expected, with matching services.
Two clients show as not connected, with a log lag in excess of 16000 days. All services are pending.

I've done a full reset of state on both clients and reconnected them, same situation.

I connected a second Master to the MoM with 5 clients. Same situation. All clients show on the Master.
Two clients show as not connected, with a log lag in excess of 16000 days. All services are pending.

The log files show the following:

[2015-09-03 11:44:09 +0200] debug/ApiListener: Not connecting to Endpoint 'stg-qua-za-app01.int.domain.com' because the host/port attributes are missing.

This is consistent for all clients not working as described above.
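
To see exactly which endpoints are affected, the debug log can be reduced to a unique list of endpoint names. The following is a minimal sketch, not part of the original report: the function name is made up for illustration, and the log path in the usage note assumes the default location of the debuglog feature. Note that, as far as the message wording goes, it only states that the local node will not initiate a connection to that endpoint; it does not by itself show that the endpoint cannot connect in.

```sh
#!/bin/sh
# Print each endpoint named in a "Not connecting to Endpoint ... because the
# host/port attributes are missing" message, once per endpoint.
# Reads log lines on stdin. (Hypothetical helper name.)
endpoints_missing_host_port() {
    sed -n "s/.*Not connecting to Endpoint '\([^']*\)' because the host\/port attributes are missing.*/\1/p" |
        sort -u
}

# Demo: the log line quoted above, fed in twice to show de-duplication.
endpoints_missing_host_port <<'EOF'
[2015-09-03 11:44:09 +0200] debug/ApiListener: Not connecting to Endpoint 'stg-qua-za-app01.int.domain.com' because the host/port attributes are missing.
[2015-09-03 11:44:09 +0200] debug/ApiListener: Not connecting to Endpoint 'stg-qua-za-app01.int.domain.com' because the host/port attributes are missing.
EOF
# prints: stg-qua-za-app01.int.domain.com
```

Against a real log this would be run as `endpoints_missing_host_port < /var/log/icinga2/debug.log` (path assumed; adjust to the local installation).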

The endpoint configs are generated using update-config, so all the configs are the same.

Initially I thought adding the host and port attributes to the Endpoint would resolve the issue, but that only changes the host status from Down to Up, while all services stay pending, so it doesn't resolve the problem.

@icinga-migration

Updated by mfriedrich on 2015-09-03 14:38:57 +00:00

  • Status changed from New to Feedback
  • Assigned to set to henti

The topic is misleading imho. I don't really understand how that's supposed to influence your satellites not executing checks.

@icinga-migration

Updated by henti on 2015-09-04 05:36:09 +00:00

dnsmichi wrote:

The topic is misleading imho. I don't really understand how that's supposed to influence your satellites not executing checks.

Morning dnsmichi:

Disabled features: command compatlog debuglog gelf graphite icingastatus livestatus notification opentsdb perfdata statusdata syslog
Enabled features: api checker mainlog

There is no other debug information I can see. The checks are running and are displayed on the dashboard of the Master the client is connected to; the host information is just not being passed to the clustered MoM connected to that Master.

H

@icinga-migration

Updated by mfriedrich on 2015-09-04 09:05:07 +00:00

I really have a hard time following your descriptions. Please always add the corresponding configurations and everything else that allows us to easily understand and reproduce the issue. For now I'd just guess it's a configuration problem.

@icinga-migration

Updated by henti on 2015-09-04 13:03:31 +00:00

dnsmichi wrote:

I really have a hard time following your descriptions. Please always add the corresponding configurations and everything else that allows us to easily understand and reproduce the issue. For now I'd just guess it's a configuration problem.

Hi dnsmichi,

I'm sorry my description is not very clear. I'm not really sure how else to explain it.

All configs are here : http://pastie.org/private/ngltkahb6oeczykjb5bgq

Henti

@icinga-migration

Updated by mfriedrich on 2015-09-04 13:10:45 +00:00

Don't use such external pastie urls. You may just put the text here, including proper formatting.

=========== Master of Masters ===========
root@prd-qua-za-mon:/etc/icinga2# cat zones.conf
// Master Config
object Endpoint "prd-qua-za-mon.dc.domain.com" {
  host = "prd-qua-za-mon.dc.domain.com"
  log_duration = 2h
}

/*object Zone "prd-qua-za-mon.dc.domain.com" {
  endpoints = [ "prd-qua-za-mon.dc.domain.com" ]
}*/

object Zone "master" {
  endpoints = [ "prd-qua-za-mon.dc.domain.com" ]
}

root@prd-qua-za-mon:/etc/icinga2# cat repository.d/zones/*
object Zone "stg-qua-za-app01.int.domain.com" {
    endpoints = [ "stg-qua-za-app01.int.domain.com" ]
    parent = "master"
}

object Zone "stg-qua-za-aux01.int.domain.com" {
    endpoints = [ "stg-qua-za-aux01.int.domain.com" ]
    parent = "master"
}

object Zone "stg-qua-za-db01.int.domain.com" {
    endpoints = [ "stg-qua-za-db01.int.domain.com" ]
    parent = "master"
}

object Zone "stg-qua-za-db02.int.domain.com" {
    endpoints = [ "stg-qua-za-db02.int.domain.com" ]
    parent = "master"
}

object Zone "stg-qua-za-db03.int.domain.com" {
    endpoints = [ "stg-qua-za-db03.int.domain.com" ]
    parent = "master"
}

object Zone "stg-qua-za-mis01.int.domain.com" {
    endpoints = [ "stg-qua-za-mis01.int.domain.com" ]
    parent = "master"
}

object Zone "stg-qua-za-prox01.int.domain.com" {
    endpoints = [ "stg-qua-za-prox01.int.domain.com" ]
    parent = "master"
}

root@prd-qua-za-mon:/etc/icinga2# cat repository.d/hosts/stg-qua-za-* 
object Host "stg-qua-za-app01.int.domain.com" {
    import "satellite-host"
    check_command = "cluster-zone"
}

object Host "stg-qua-za-aux01.int.domain.com" {
    import "satellite-host"
    check_command = "cluster-zone"
}

object Host "stg-qua-za-db01.int.domain.com" {
    import "satellite-host"
    check_command = "dummy"
    zone = "stg-qua-za-aux01.int.domain.com"
}

object Host "stg-qua-za-db02.int.domain.com" {
    import "satellite-host"
    check_command = "cluster-zone"
}

object Host "stg-qua-za-db03.int.domain.com" {
    import "satellite-host"
    check_command = "dummy"
    zone = "stg-qua-za-aux01.int.domain.com"
}

object Host "stg-qua-za-mis01.int.domain.com" {
    import "satellite-host"
    check_command = "dummy"
    zone = "stg-qua-za-aux01.int.domain.com"
}

object Host "stg-qua-za-prox01.int.domain.com" {
    import "satellite-host"
    check_command = "dummy"
    zone = "stg-qua-za-aux01.int.domain.com"
}

=========== Master Service in Region ===========

root@stg-qua-za-aux01:/etc/icinga2# cat zones.conf 
object Endpoint "prd-qua-za-mon.dc.domain.com" {
        host = "prd-qua-za-mon.dc.domain.com"
}

object Zone "master" {
        endpoints = [ "prd-qua-za-mon.dc.domain.com" ]
}

object Endpoint "stg-qua-za-aux01.int.domain.com" {
}

object Zone "stg-qua-za-aux01.int.domain.com" {
        //this is the local node = "stg-qua-za-aux01.int.domain.com"
        endpoints = [ "stg-qua-za-aux01.int.domain.com" ]
        parent = "master"
}

root@stg-qua-za-aux01:~/icinga2backup-forhenti# cat repository.d/zones/* 
object Zone "re-ase.re.domain.com" {
    endpoints = [ "re-ase.re.domain.com" ]
    parent = "master"
}

object Zone "stg-qua-za-app01.int.domain.com" {
    endpoints = [ "stg-qua-za-app01.int.domain.com" ]
    parent = "master"
}

object Zone "stg-qua-za-db01.int.domain.com" {
    endpoints = [ "stg-qua-za-db01.int.domain.com" ]
    parent = "master"
}

object Zone "stg-qua-za-db02.int.domain.com" {
    endpoints = [ "stg-qua-za-db02.int.domain.com" ]
    parent = "master"
}

object Zone "stg-qua-za-db03.int.domain.com" {
    endpoints = [ "stg-qua-za-db03.int.domain.com" ]
    parent = "master"
}

object Zone "stg-qua-za-mis01.int.domain.com" {
    endpoints = [ "stg-qua-za-mis01.int.domain.com" ]
    parent = "master"
}

object Zone "stg-qua-za-prox01.int.domain.com" {
    endpoints = [ "stg-qua-za-prox01.int.domain.com" ]
    parent = "master"
}

root@stg-qua-za-aux01:~/icinga2backup-forhenti# cat repository.d/hosts/* 
object Host "re-ase.re.domain.com" {
    import "satellite-host"
    check_command = "cluster-zone"
}

object Host "stg-qua-za-app01.int.domain.com" {
    import "satellite-host"
    check_command = "cluster-zone"
}

object Host "stg-qua-za-db01.int.domain.com" {
    import "satellite-host"
    check_command = "cluster-zone"
}

object Host "stg-qua-za-db02.int.domain.com" {
    import "satellite-host"
    check_command = "cluster-zone"
}

object Host "stg-qua-za-db03.int.domain.com" {
    import "satellite-host"
    check_command = "cluster-zone"
}

object Host "stg-qua-za-mis01.int.domain.com" {
    import "satellite-host"
    check_command = "cluster-zone"
}

object Host "stg-qua-za-prox01.int.domain.com" {
    import "satellite-host"
    check_command = "cluster-zone"
}

=========== Clients ===========

root@stg-qua-za-app01:/etc/icinga2# cat zones.conf
/*
 * Generated by Icinga 2 node setup commands
 * on 2015-09-03 10:40:16 +0200
 */

object Endpoint "stg-qua-za-aux01.int.domain.com" {
        host = "stg-qua-za-aux01.int.domain.com"
}

object Zone "master" {
        endpoints = [ "stg-qua-za-aux01.int.domain.com" ]
}

object Endpoint "stg-qua-za-app01.int.domain.com" {
}

object Zone "stg-qua-za-app01.int.domain.com" {
        //this is the local node = "stg-qua-za-app01.int.domain.com"
        endpoints = [ "stg-qua-za-app01.int.domain.com" ]
        parent = "master"
}

root@stg-qua-za-db01:/etc/icinga2# cat zones.conf
/*
 * Generated by Icinga 2 node setup commands
 * on 2015-09-03 10:39:54 +0200
 */

object Endpoint "stg-qua-za-aux01.int.domain.com" {
        host = "stg-qua-za-aux01.int.domain.com"
}

object Zone "master" {
        endpoints = [ "stg-qua-za-aux01.int.domain.com" ]
}

object Endpoint "stg-qua-za-db01.int.domain.com" {
}

object Zone "stg-qua-za-db01.int.domain.com" {
        //this is the local node = "stg-qua-za-db01.int.domain.com"
        endpoints = [ "stg-qua-za-db01.int.domain.com" ]
        parent = "master"
}

root@stg-qua-za-db02:/etc/icinga2# cat zones.conf
/*
 * Generated by Icinga 2 node setup commands
 * on 2015-09-03 10:39:50 +0200
 */

object Endpoint "stg-qua-za-aux01.int.domain.com" {
        host = "stg-qua-za-aux01.int.domain.com"
}

object Zone "master" {
        endpoints = [ "stg-qua-za-aux01.int.domain.com" ]
}

object Endpoint "stg-qua-za-db02.int.domain.com" {
}

object Zone "stg-qua-za-db02.int.domain.com" {
        //this is the local node = "stg-qua-za-db02.int.domain.com"
        endpoints = [ "stg-qua-za-db02.int.domain.com" ]
        parent = "master"
}

root@stg-qua-za-db03:/etc/icinga2# cat zones.conf
/*
 * Generated by Icinga 2 node setup commands
 * on 2015-09-03 10:39:41 +0200
 */

object Endpoint "stg-qua-za-aux01.int.domain.com" {
        host = "stg-qua-za-aux01.int.domain.com"
}

object Zone "master" {
        endpoints = [ "stg-qua-za-aux01.int.domain.com" ]
}

object Endpoint "stg-qua-za-db03.int.domain.com" {
}

object Zone "stg-qua-za-db03.int.domain.com" {
        //this is the local node = "stg-qua-za-db03.int.domain.com"
        endpoints = [ "stg-qua-za-db03.int.domain.com" ]
        parent = "master"
}

root@stg-qua-za-mis01:/etc/icinga2# cat zones.conf
/*
 * Generated by Icinga 2 node setup commands
 * on 2015-09-03 10:39:51 +0200
 */

object Endpoint "stg-qua-za-aux01.int.domain.com" {
        host = "stg-qua-za-aux01.int.domain.com"
}

object Zone "master" {
        endpoints = [ "stg-qua-za-aux01.int.domain.com" ]
}

object Endpoint "stg-qua-za-mis01.int.domain.com" {
}

object Zone "stg-qua-za-mis01.int.domain.com" {
        //this is the local node = "stg-qua-za-mis01.int.domain.com"
        endpoints = [ "stg-qua-za-mis01.int.domain.com" ]
        parent = "master"
}


root@stg-qua-za-prox01:/etc/icinga2# cat zones.conf
/*
 * Generated by Icinga 2 node setup commands
 * on 2015-09-03 10:39:53 +0200
 */

object Endpoint "stg-qua-za-aux01.int.domain.com" {
        host = "stg-qua-za-aux01.int.domain.com"
}

object Zone "master" {
        endpoints = [ "stg-qua-za-aux01.int.domain.com" ]
}

object Endpoint "stg-qua-za-prox01.int.domain.com" {
}

object Zone "stg-qua-za-prox01.int.domain.com" {
        //this is the local node = "stg-qua-za-prox01.int.domain.com"
        endpoints = [ "stg-qua-za-prox01.int.domain.com" ]
        parent = "master"
}

@icinga-migration

Updated by henti on 2015-09-17 13:36:05 +00:00

dnsmichi wrote:

Don't use such external pastie urls. You may just put the text here, including proper formatting.

[...]

I think I have found the problem.

I found that my icinga2 node list output lists two hosts with the same name: one under the Master node and one under its own node. Using the "last seen" data, I've confirmed that both hosts connect to the MoM and the Master. This can only mean that somebody has cloned these hosts and likely renamed them while keeping the Icinga configs the same. I'm not sure why it's showing the host as down with a log lag of 60000+ days.

This does pose a problem. The hostnames in Icinga 2 are the same and DNS is pointing to the correct server, so I cannot use the host name to find the incorrect one. I cannot tcpdump the traffic, as it's encrypted. What other way can I use to find the duplicated host that is connecting directly to the MoM, so that I can disable its config?

Shouldn't the config check also fail when this happens?

Regards
Henti

@icinga-migration

Updated by henti on 2015-09-30 06:09:59 +00:00

henti wrote:

dnsmichi wrote:
> Don't use such external pastie urls. You may just put the text here, including proper formatting.
>
> [...]

I think I have found the problem.

I found that my icinga2 node list output lists two hosts with the same name: one under the Master node and one under its own node. Using the "last seen" data, I've confirmed that both hosts connect to the MoM and the Master. This can only mean that somebody has cloned these hosts and likely renamed them while keeping the Icinga configs the same. I'm not sure why it's showing the host as down with a log lag of 60000+ days.

This does pose a problem. The hostnames in Icinga 2 are the same and DNS is pointing to the correct server, so I cannot use the host name to find the incorrect one. I cannot tcpdump the traffic, as it's encrypted. What other way can I use to find the duplicated host that is connecting directly to the MoM, so that I can disable its config?

Shouldn't the config check also fail when this happens?

Good morning.

Some more information.

I've stopped the icinga2 service on the final endpoint to try and identify where the additional host comes from. This is what I've seen.

Master of Master : prd-qua-za-mon.dc.domain.com

Node 'stg-qua-za-mis01.int.domain.com' (last seen: Wed Sep 30 07:46:18 2015)
    * Host 'stg-qua-za-mis01.int.domain.com'
--
Node 'stg-qua-za-aux01.int.domain.com' (last seen: Wed Sep 30 07:57:01 2015)
    * Host 'stg-qua-za-mis01.int.domain.com'

Master : stg-qua-za-aux01.int.domain.com

Node 'stg-qua-za-mis01.int.domain.com' (last seen: Wed Sep 30 07:46:18 2015)
    * Host 'stg-qua-za-mis01.int.domain.com'

It appears that stg-qua-za-mis01.int.domain.com connects to both Icinga 2 servers. However, its zones.conf is configured to connect only to stg-qua-za-aux01.int.domain.com, and I cannot see any traffic between stg-qua-za-mis01.int.domain.com and prd-qua-za-mon.dc.domain.com, so the information must come from the stg-qua-za-aux01.int.domain.com server.
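
Spotting a host that is listed under more than one node can be automated rather than done by eye. The sketch below is an illustration only: the function name is hypothetical, and it assumes the `icinga2 node list` output has the `Node '...'` / `* Host '...'` shape shown in the excerpt above.

```sh
#!/bin/sh
# Report every host that appears under more than one node in
# "icinga2 node list"-style output read from stdin.
# (Hypothetical helper; assumes the Node/Host line layout quoted above.)
find_duplicate_hosts() {
    awk -F"'" '
        /^Node / { node = $2; next }          # remember the current node
        /^[ \t]*\* Host / {
            host = $2
            count[host]++
            if (host in nodes)
                nodes[host] = nodes[host] ", " node
            else
                nodes[host] = node
        }
        END {
            for (h in count)
                if (count[h] > 1)
                    print h ": " nodes[h]
        }
    '
}

# Demo with the MoM excerpt quoted above.
find_duplicate_hosts <<'EOF'
Node 'stg-qua-za-mis01.int.domain.com' (last seen: Wed Sep 30 07:46:18 2015)
    * Host 'stg-qua-za-mis01.int.domain.com'
--
Node 'stg-qua-za-aux01.int.domain.com' (last seen: Wed Sep 30 07:57:01 2015)
    * Host 'stg-qua-za-mis01.int.domain.com'
EOF
# prints: stg-qua-za-mis01.int.domain.com: stg-qua-za-mis01.int.domain.com, stg-qua-za-aux01.int.domain.com
```

On a live system the input would come from `icinga2 node list | find_duplicate_hosts`. The output names the nodes that claim the host, which narrows down where the second copy is reported from.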

I'm really at a loss here. More object information :

= Master of Masters =
root@prd-qua-za-mon:/etc/icinga2# icinga2 object list --name stg-qua-za-mis01.int.domain.com

Object 'stg-qua-za-mis01.int.domain.com' of type 'Endpoint':
  % declared in '/etc/icinga2/repository.d/endpoints/stg-qua-za-mis01.int.domain.com.conf', lines 1:0-1:38
  * __name = "stg-qua-za-mis01.int.domain.com"
  * host = ""
  * log_duration = 60
    % = modified in '/etc/icinga2/repository.d/endpoints/stg-qua-za-mis01.int.domain.com.conf', lines 2:2-2:18
  * name = "stg-qua-za-mis01.int.domain.com"
  * port = "5665"
  * templates = [ "stg-qua-za-mis01.int.domain.com" ]
    % = modified in '/etc/icinga2/repository.d/endpoints/stg-qua-za-mis01.int.domain.com.conf', lines 1:0-1:38
  * type = "Endpoint"
  * zone = ""

Object 'stg-qua-za-mis01.int.domain.com' of type 'Host':
  % declared in '/etc/icinga2/repository.d/hosts/stg-qua-za-mis01.int.domain.com.conf', lines 1:0-1:34
  * __name = "stg-qua-za-mis01.int.domain.com"
  * action_url = ""
  * address = ""
  * address6 = ""
  * check_command = "cluster-zone"
    % = modified in '/etc/icinga2/conf.d/satellite.conf', lines 13:2-13:28
    % = modified in '/etc/icinga2/repository.d/hosts/stg-qua-za-mis01.int.domain.com.conf', lines 3:2-3:31
  * check_interval = 300
  * check_period = ""
  * command_endpoint = ""
  * display_name = "stg-qua-za-mis01.int.domain.com"
  * enable_active_checks = true
  * enable_event_handler = true
  * enable_flapping = true
    % = modified in '/etc/icinga2/conf.d/satellite.conf', lines 12:2-12:20
  * enable_notifications = true
  * enable_passive_checks = true
  * enable_perfdata = true
  * event_command = ""
  * flapping_threshold = 30
  * groups = [ ]
  * icon_image = ""
  * icon_image_alt = ""
  * max_check_attempts = 3
  * name = "stg-qua-za-mis01.int.domain.com"
  * notes = ""
  * notes_url = ""
  * retry_interval = 60
  * templates = [ "stg-qua-za-mis01.int.domain.com", "satellite-host" ]
    % = modified in '/etc/icinga2/repository.d/hosts/stg-qua-za-mis01.int.domain.com.conf', lines 1:0-1:34
    % = modified in '/etc/icinga2/conf.d/satellite.conf', lines 11:1-11:30
  * type = "Host"
  * vars = null
  * volatile = false
  * zone = ""

Object 'stg-qua-za-mis01.int.domain.com' of type 'Zone':
  % declared in '/etc/icinga2/repository.d/zones/stg-qua-za-mis01.int.domain.com.conf', lines 1:0-1:34
  * __name = "stg-qua-za-mis01.int.domain.com"
  * endpoints = [ "stg-qua-za-mis01.int.domain.com" ]
    % = modified in '/etc/icinga2/repository.d/zones/stg-qua-za-mis01.int.domain.com.conf', lines 2:2-2:40
  * global = false
  * name = "stg-qua-za-mis01.int.domain.com"
  * parent = "master"
    % = modified in '/etc/icinga2/repository.d/zones/stg-qua-za-mis01.int.domain.com.conf', lines 3:2-3:18
  * templates = [ "stg-qua-za-mis01.int.domain.com" ]
    % = modified in '/etc/icinga2/repository.d/zones/stg-qua-za-mis01.int.domain.com.conf', lines 1:0-1:34
  * type = "Zone"
  * zone = ""

= Master Service in Region =

root@stg-qua-za-aux01:/etc/icinga2# icinga2 object list --name stg-qua-za-mis01.int.domain.com

Object 'stg-qua-za-mis01.int.domain.com' of type 'Endpoint':
  % declared in '/etc/icinga2/repository.d/endpoints/stg-qua-za-mis01.int.domain.com.conf', lines 1:0-1:38
  * __name = "stg-qua-za-mis01.int.domain.com"
  * host = ""
  * log_duration = 60
    % = modified in '/etc/icinga2/repository.d/endpoints/stg-qua-za-mis01.int.domain.com.conf', lines 2:2-2:18
  * name = "stg-qua-za-mis01.int.domain.com"
  * port = "5665"
  * templates = [ "stg-qua-za-mis01.int.domain.com" ]
    % = modified in '/etc/icinga2/repository.d/endpoints/stg-qua-za-mis01.int.domain.com.conf', lines 1:0-1:38
  * type = "Endpoint"
  * zone = ""

Object 'stg-qua-za-mis01.int.domain.com' of type 'Host':
  % declared in '/etc/icinga2/repository.d/hosts/stg-qua-za-mis01.int.domain.com.conf', lines 1:0-1:34
  * __name = "stg-qua-za-mis01.int.domain.com"
  * action_url = ""
  * address = "stg-qua-za-mis01.int.domain.com"
    % = modified in '/etc/icinga2/repository.d/hosts/stg-qua-za-mis01.int.domain.com.conf', lines 3:2-3:34
  * address6 = ""
  * check_command = "hostalive"
    % = modified in '/etc/icinga2/conf.d/satellite.conf', lines 13:3-13:29
  * check_interval = 300
  * check_period = ""
  * command_endpoint = ""
  * display_name = "stg-qua-za-mis01.int.domain.com"
  * enable_active_checks = true
  * enable_event_handler = true
  * enable_flapping = true
    % = modified in '/etc/icinga2/conf.d/satellite.conf', lines 12:3-12:21
  * enable_notifications = true
  * enable_passive_checks = true
  * enable_perfdata = true
  * event_command = ""
  * flapping_threshold = 30
  * groups = [ ]
  * icon_image = ""
  * icon_image_alt = ""
  * max_check_attempts = 3
  * name = "stg-qua-za-mis01.int.domain.com"
  * notes = ""
  * notes_url = ""
  * retry_interval = 60
  * templates = [ "stg-qua-za-mis01.int.domain.com", "satellite-host" ]
    % = modified in '/etc/icinga2/repository.d/hosts/stg-qua-za-mis01.int.domain.com.conf', lines 1:0-1:34
    % = modified in '/etc/icinga2/conf.d/satellite.conf', lines 11:1-11:30
  * type = "Host"
  * vars = null
  * volatile = false
  * zone = ""

Object 'stg-qua-za-mis01.int.domain.com' of type 'Zone':
  % declared in '/etc/icinga2/repository.d/zones/stg-qua-za-mis01.int.domain.com.conf', lines 1:0-1:34
  * __name = "stg-qua-za-mis01.int.domain.com"
  * endpoints = [ "stg-qua-za-mis01.int.domain.com" ]
    % = modified in '/etc/icinga2/repository.d/zones/stg-qua-za-mis01.int.domain.com.conf', lines 2:2-2:40
  * global = false
  * name = "stg-qua-za-mis01.int.domain.com"
  * parent = "master"
    % = modified in '/etc/icinga2/repository.d/zones/stg-qua-za-mis01.int.domain.com.conf', lines 3:2-3:18
  * templates = [ "stg-qua-za-mis01.int.domain.com" ]
    % = modified in '/etc/icinga2/repository.d/zones/stg-qua-za-mis01.int.domain.com.conf', lines 1:0-1:34
  * type = "Zone"
  * zone = ""

@icinga-migration

Updated by henti on 2015-10-05 09:04:42 +00:00

henti wrote:

[...]

Further to this I found the following.

The repository.d host file generated by update-config on the Master of Masters contains the following:

object Host "stg-qua-za-mis01.int.domain.com" {
  import "satellite-host"
  check_command = "cluster-zone"
}

Whereas other machines that are working correctly have the following:

object Host "stg-qua-za-db01.int.domain.com" { 
  import "satellite-host"
  check_command = "dummy"
  zone = "stg-qua-za-aux01.int.domain.com"
}

This seems to indicate that stg-qua-za-db01.int.domain.com is connected to the MoM via the AUX server as configured, while stg-qua-za-mis01.int.domain.com is connected directly. However, the AUX server also reports that stg-qua-za-mis01.int.domain.com is connected to it.
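
The two generated variants differ in whether the Host object carries a `zone` attribute, so the affected hosts can be listed by looking for generated files without one. This is a rough sketch under that assumption; the function name is made up, and the directory in the usage comment is the one shown in the object listings in this thread.

```sh
#!/bin/sh
# List the *.conf files in a directory that contain no "zone = ..." line.
# A line such as check_command = "cluster-zone" does not count, because the
# pattern only matches "zone" at the start of an (indented) line.
# (Hypothetical helper name.)
hosts_without_zone() {
    dir=$1
    for f in "$dir"/*.conf; do
        [ -e "$f" ] || continue               # directory had no .conf files
        if ! grep -q -E '^[[:space:]]*zone[[:space:]]*=' "$f"; then
            printf '%s\n' "$f"
        fi
    done
}

# Usage on the MoM:
#   hosts_without_zone /etc/icinga2/repository.d/hosts
```

Files printed by this helper are the ones generated with the direct association; whether that association is wrong for a given host still has to be judged against the intended topology.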

H

@icinga-migration

Updated by henti on 2015-10-05 09:06:13 +00:00

henti wrote:

henti wrote:
>
> [...]

Further to this I found the following.

The repository.d host file generated by update-config on the Master of Masters contains the following:

[...]

Whereas other machines that are working correctly have the following:

[...]

This seems to indicate that stg-qua-za-db01.int.domain.com is connected to the MoM via the AUX server as configured, while stg-qua-za-mis01.int.domain.com is connected directly. However, the AUX server also reports that stg-qua-za-mis01.int.domain.com is connected to it.

H

This seems to happen only with newly generated host conf files; older files are correct. When I removed a correct file that had existed and worked in the past and generated a new one, the new file contained the direct association.

H

@icinga-migration

Updated by mjbrooks on 2015-10-05 09:25:32 +00:00

  • Assigned to changed from henti to mfriedrich

Hello @dnsmichi

@henti pinged me on IRC; he was wondering if you'd seen his feedback and was concerned. I can't seem to change the status back to "open", so I'm dropping it back in your lap and leaving it as "feedback" (sorry).

@icinga-migration

Updated by mfriedrich on 2015-10-05 10:17:41 +00:00

  • Category set to Cluster
  • Status changed from Feedback to New
  • Assigned to deleted mfriedrich

We'll take care of that after our trip to Portland.

@icinga-migration

Updated by henti on 2015-11-16 09:14:59 +00:00

dnsmichi wrote:

We'll take care of that after our trip to Portland.

Any update on this bug ?

Regards
Henti

@icinga-migration

Updated by mfriedrich on 2015-12-03 11:55:17 +00:00

No, not yet. It requires time to read, analyse and test in order to reproduce your issue. We are currently involved in other projects and/or issues.

Kind regards,
Michael

@icinga-migration

Updated by vishnu on 2015-12-07 10:08:23 +00:00

dnsmichi wrote:

No, not yet. It requires time to read, analyse and test in order to reproduce your issue. We are currently involved in other projects and/or issues.

Kind regards,
Michael

Exact Same Problem

I am facing the exact same issue. My environment has the following components:

  • master.company.com - master Icinga host in the main network, which has an icingaweb2 instance
  • satelliteA.zoneA.company.com - satellite host in the subdomain zoneA having its own icingaweb2 instance
    -- clientA.zoneA.company.com - client host in the subdomain zoneA with satelliteA as parent
    -- client1.zoneA.company.com - client host in the subdomain zoneA with satelliteA as parent
  • satelliteB.zoneB.company.com - satellite host in the subdomain zoneB having its own icingaweb2 instance
    -- clientB.zoneB.company.com - client host in the subdomain zoneB with satelliteB as parent
    -- client2.zoneB.company.com - client host in the subdomain zoneB with satelliteB as parent

satelliteA's side is working fine - I am able to see clientA and client1 in master's icingaweb2 as well as in satelliteA's icingaweb2.
The problem is that satelliteB's side is not working. I am able to see clientB and client2 only in satelliteB's icingaweb2, not in master's icingaweb2.

The error message in icingaweb2: Zone clientB.zoneB.company.com is not connected. Log lag: 16776 days, 8 hours, 46 minutes and 28 seconds
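
A lag of 16776 days is almost exactly the distance from 1970-01-01 to the date of this comment, which suggests the lag is being counted from a zero (unset) timestamp rather than from a real last-contact time. That reading is an inference from the arithmetic, not something confirmed in this thread. The sketch below adds the reported lag to the Unix epoch; the function name is hypothetical and the `-d @SECONDS` form assumes GNU date.

```sh
#!/bin/sh
# Add a reported log lag (days, hours, minutes, seconds) to the Unix epoch
# and print the resulting UTC timestamp. Requires GNU date.
# (Hypothetical helper name.)
epoch_plus_lag() {
    days=$1 hours=$2 minutes=$3 seconds=$4
    total=$(( days * 86400 + hours * 3600 + minutes * 60 + seconds ))
    date -u -d "@$total" '+%Y-%m-%d %H:%M:%S UTC'
}

# The lag from the icingaweb2 message above.
epoch_plus_lag 16776 8 46 28
# prints: 2015-12-07 08:46:28 UTC
```

The result falls on the day this comment was posted, a little before its 10:08 UTC timestamp, which is consistent with the message having been captured shortly beforehand.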

I could not find any problem in icinga2.log or debug.log.

It should not be a problem with the configuration, because exactly the same configuration works fine on the other side (satelliteA and its clients). Please look into this.

Thanks,
Vishnu

@icinga-migration

Updated by mfriedrich on 2016-02-24 23:23:51 +00:00

  • Status changed from New to Feedback
  • Assigned to set to henti

Not sure how this relates to the rest of the history of this issue. I suspect that henti's setup is overly complicated and some endpoints are missing the required connection information, either for connecting from the master to the client or vice versa. There has been a problem with older versions opening multiple connections in both directions. A different but related issue concerned endpoints with wrong "host" connection information, where the presented CN of the connected node was not checked.

Please test that again with 2.4.3.

@vishnu
Your problem sounds different. If it persists, join the community channels and provide more details. I'd guess it is a configuration issue, not necessarily a bug.

@icinga-migration

Updated by mfriedrich on 2016-03-18 17:29:00 +00:00

  • Status changed from Feedback to Closed
  • Assigned to deleted henti

We are not able to reproduce the issue here and therefore believe this problem has been fixed in recent releases.

icinga-migration added the bug and area/distributed labels on Jan 17, 2017