[dev.icinga.com #13369] Director deploys only on first of two HA-Cluster Nodes #635

Closed
icinga-migration opened this issue Nov 30, 2016 · 4 comments


icinga-migration commented Nov 30, 2016

This issue has been migrated from Redmine: https://dev.icinga.com/issues/13369

Created by Finn on 2016-11-30 12:06:00 +00:00

Assignee: (none)
Status: New
Target Version: (none)
Last Update: 2016-12-06 13:36:42 +00:00 (in Redmine)


In our HA cluster we configure our hosts with the Director, but it only deploys the configuration to the first node of the HA cluster.

The zones.conf looks as follows:

object Endpoint "node1.example.com" {
host = "XXX"
}

object Endpoint "node2.example.com" {
host = "XXX"
}

object Zone "ZONE1" {
endpoints = [ "node1.example.com", "node2.example.com" ]
// global = true
}

object Zone "director-global" {
global = true
}

Update by bsheqa on 09.08.2022: Mask IP addresses and hostnames


Updated by tgelf on 2016-12-06 12:51:18 +00:00

Yep, it intentionally works this way. Deployment goes to one node, and replication ships it to the others. Once you deploy to another node, both would believe they are the config master and would no longer accept config for the involved zones from each other. So this would break replication.

We want to solve this in a future Director release by letting it remove its own config package from the other node when failing back. However, this has lower priority right now and needs intensive testing first. It involves tricky trade-offs, such as a) not failing back but staying sticky on the node the deployments migrated to, versus b) potentially not being allowed to remove config packages that are currently active.

This might eventually also involve changes to the core; I'm not sure about that, however. So, given all this: please do not expect this feature very soon. Sure, it will arrive - but at least not within this year ;-) In the meantime, please deploy to a single node. If you experience a disaster state where you need to deploy to the other one, that can easily be accomplished by going to the Endpoint definitions in Director, removing the API user from one of them and assigning it to the other one.

Just, when the other node comes back, please make sure to completely remove /var/lib/icinga2/api/packages/director from one of them. Otherwise they will no longer accept each other's replicated config.
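
For reference, that cleanup boils down to a few commands on the node that should give up its stale copy - a minimal sketch, assuming systemd and the default package path quoted above; run it on one node only:

# run on ONE node only, after the cluster has rejoined
systemctl stop icinga2
rm -rf /var/lib/icinga2/api/packages/director
systemctl start icinga2

After the restart that node should again accept the replicated director config from the node you keep deploying to.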

Cheers,
Thomas


Updated by Finn on 2016-12-06 13:08:24 +00:00

Is there no other way to provide the director configuration files to both nodes?
Deploying to the second node only when needed doesn't seem like an option, since we are opting for high availability and need the redundancy - for example on weekends, when no one is around to redeploy, or when the first node goes down silently.

Both nodes are connected to the same databases (core, Icingaweb2 and the Director). Icingaweb2 is installed on both nodes as well and is configured for failover with ucarp.

Anyways, thanks for the answer. I'm looking forward to that feature :).

Cheers,
Finn


Updated by tgelf on 2016-12-06 13:33:18 +00:00

Finn wrote:

Is there no other way to provide the director configuration files to both nodes?

Well, you could trick Director into using your floating IP. Just, please be aware of the fact that Icinga 2 itself asks you to always have just one config master. So this is not really a limitation of Director itself. And please note that if you do so, it's your job to clean up /var/lib/icinga2 on one of them once the cluster joins again. Nothing complicated, just don't forget about it - or your cluster nodes will diverge.

Some parts of this documentation might also interest you.

Deploying to the second node only when needed doesn't seem like an option, since we are opting for high availability and need the redundancy - for example on weekends, when no one is around to redeploy, or when the first node goes down silently.

Makes sense to me, that's why we'll sooner or later implement that feature. Still, how many times did you really run into trouble because of not being able to deploy a new monitoring configuration on a weekend while one monitoring node was dead? And how many times did you kill an application because of a cluster doing weird (intelligent) things?

Both nodes are connected to the same databases (core, Icingaweb2 and the Director). Icingaweb2 is installed on both nodes as well and is configured for failover with ucarp.

Unrelated to this issue, just a suggestion: you could also add Redis or similar to the mix and configure PHP to store its session information there - that would allow you to load balance the web front-end.
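
Something like this is enough to get started - a minimal sketch, assuming the phpredis extension is installed, Redis listens on 127.0.0.1:6379 and your distribution reads extra ini snippets from /etc/php.d (all of that is an assumption, adjust for your setup):

# distribution-specific path; the two settings could also go straight into php.ini
cat >> /etc/php.d/redis-session.ini <<'EOF'
session.save_handler = redis
session.save_path = "tcp://127.0.0.1:6379"
EOF
# afterwards restart PHP-FPM / the web server so the new settings take effect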

Anyways, thanks for the answer. I'm looking forward to that feature :).

You're welcome!


Updated by Finn on 2016-12-06 13:36:43 +00:00

Cool, I will try that and look into Redis.
Thanks!
