[dev.icinga.com #9798] Cluster Endpoints are not able to reconnect FIN_WAIT1 #3207

Closed
icinga-migration opened this issue Jul 31, 2015 · 5 comments
Labels: area/distributed (Distributed monitoring: master, satellites, clients), bug (Something isn't working)

@icinga-migration
This issue has been migrated from Redmine: https://dev.icinga.com/issues/9798

Created by rhillmann on 2015-07-31 08:05:38 +00:00

Assignee: (none)
Status: Closed (closed on 2016-02-24 23:45:41 +00:00)
Target Version: (none)
Last Update: 2016-02-24 23:45:41 +00:00 (in Redmine)

Icinga Version: 2.3.8
Backport?: Not yet backported
Include in Changelog: 1

It seems that the endpoints are sometimes unable to reconnect. The ClientAPI connection does not clean up open connections, which get stuck in FIN_WAIT1.
Below are the affected endpoints that are stuck in FIN_WAIT1. Normally they should time out, but nothing happens.

tcp 0 0 0.0.0.0:5665 0.0.0.0:* LISTEN
tcp 0 0 10.112.32.18:59704 10.112.32.214:5665 ESTABLISHED
tcp 0 114836 10.112.32.18:56976 10.113.32.21:5665 FIN_WAIT1
tcp 0 0 10.112.32.18:5665 10.104.16.16:46314 ESTABLISHED
tcp 0 0 10.112.32.18:40809 10.96.0.97:5665 ESTABLISHED
tcp 0 0 10.112.32.18:57064 10.113.32.21:5665 ESTABLISHED
tcp 0 75026 10.112.32.18:5665 10.104.16.15:58097 FIN_WAIT1
tcp 0 0 10.112.32.18:51302 10.96.0.96:5665 ESTABLISHED
tcp 0 0 10.112.32.18:33674 10.104.16.15:5665 ESTABLISHED
tcp 0 1666 10.112.32.18:5665 10.113.48.114:52865 ESTABLISHED
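
To spot these sockets without reading the listing by eye, something like the following could be used. It is only a rough sketch: it assumes the lines above are `netstat`-style output with the columns Proto, Recv-Q, Send-Q, Local Address, Foreign Address and State, and the names `Socket`, `parse_netstat` and `stuck_fin_wait1` are made up for illustration. It flags sockets that are in FIN_WAIT1 while still holding unsent data in the send queue.

```python
from typing import List, NamedTuple


class Socket(NamedTuple):
    proto: str
    recv_q: int
    send_q: int
    local: str
    remote: str
    state: str


def parse_netstat(text: str) -> List[Socket]:
    """Parse netstat-style TCP lines (assumed six whitespace-separated columns)."""
    sockets = []
    for line in text.splitlines():
        fields = line.split()
        # Skip headers, blank lines, and anything that is not a TCP row.
        if len(fields) != 6 or not fields[0].startswith("tcp"):
            continue
        proto, recv_q, send_q, local, remote, state = fields
        sockets.append(Socket(proto, int(recv_q), int(send_q), local, remote, state))
    return sockets


def stuck_fin_wait1(sockets: List[Socket]) -> List[Socket]:
    """Return FIN_WAIT1 sockets that still have bytes queued for sending."""
    return [s for s in sockets if s.state == "FIN_WAIT1" and s.send_q > 0]


# A few lines taken from the listing above.
SAMPLE = """\
tcp 0 0 0.0.0.0:5665 0.0.0.0:* LISTEN
tcp 0 114836 10.112.32.18:56976 10.113.32.21:5665 FIN_WAIT1
tcp 0 0 10.112.32.18:57064 10.113.32.21:5665 ESTABLISHED
tcp 0 75026 10.112.32.18:5665 10.104.16.15:58097 FIN_WAIT1
"""

for s in stuck_fin_wait1(parse_netstat(SAMPLE)):
    print(f"{s.local} -> {s.remote}: {s.send_q} bytes still queued")
```

Running this periodically and comparing the results would show whether the same local/remote pair stays in FIN_WAIT1 across runs, which is the symptom described here.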


Relations:

@icinga-migration

Updated by rhillmann on 2015-08-03 07:13:15 +00:00

Here is another example with two servers: the master has one FIN_WAIT1 and one ESTABLISHED connection to the worker, but the worker shows two ESTABLISHED connections from the master:

MASTER / addr:10.113.48.114
tcp 0 0 0.0.0.0:5665 0.0.0.0:* LISTEN
tcp 0 0 10.113.48.114:53862 10.113.32.21:5665 ESTABLISHED
tcp 392806 0 10.113.48.114:56710 10.112.32.18:5665 ESTABLISHED
tcp 0 0 10.113.48.114:49789 10.104.16.16:5665 ESTABLISHED
tcp 0 730 10.113.48.114:57063 10.96.0.96:5665 ESTABLISHED
tcp 0 8514 10.113.48.114:5665 10.96.0.97:55214 ESTABLISHED
tcp 285286 0 10.113.48.114:5665 10.112.32.214:40980 ESTABLISHED
tcp 0 0 10.113.48.114:5665 10.104.16.15:51791 ESTABLISHED
tcp 0 209388 10.113.48.114:53795 10.113.32.21:5665 FIN_WAIT1

WORKER / addr:10.113.32.21
tcp 0 0 0.0.0.0:5665 0.0.0.0:* LISTEN
tcp 283392 0 10.113.32.21:5665 10.112.32.18:60874 ESTABLISHED
tcp 247 0 10.113.32.21:5665 10.96.0.96:56552 ESTABLISHED
tcp 247 0 10.113.32.21:5665 10.112.32.18:60989 ESTABLISHED
tcp 275016 0 10.113.32.21:5665 10.96.0.96:54797 ESTABLISHED
tcp 247 0 10.113.32.21:5665 10.113.48.114:53862 ESTABLISHED
tcp 280692 0 10.113.32.21:5665 10.104.16.15:54264 ESTABLISHED
tcp 247 0 10.113.32.21:5665 10.104.16.15:57000 ESTABLISHED
tcp 303790 0 10.113.32.21:5665 10.112.32.214:54955 ESTABLISHED
tcp 247 0 10.113.32.21:5665 10.112.32.214:57396 ESTABLISHED
tcp 291094 0 10.113.32.21:5665 10.113.48.114:53795 ESTABLISHED
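
The disagreement between the two hosts can also be found mechanically by pairing each connection on one side with its mirror image on the other. The following is a sketch under the same assumption that both listings are `netstat`-style output; `parse` and `state_mismatches` are hypothetical helper names, and only the relevant lines from the two listings above are included.

```python
def parse(text):
    """Map (local, remote) -> state for each non-listening TCP line."""
    conns = {}
    for line in text.splitlines():
        f = line.split()
        if len(f) == 6 and f[0].startswith("tcp") and f[5] != "LISTEN":
            conns[(f[3], f[4])] = f[5]
    return conns


def state_mismatches(side_a, side_b):
    """Return connections whose state differs between the two hosts.

    A connection seen on side A as (local, remote) appears on side B
    with the address pair swapped, i.e. as (remote, local).
    """
    a, b = parse(side_a), parse(side_b)
    mismatches = []
    for (local, remote), state_a in a.items():
        state_b = b.get((remote, local))
        if state_b is not None and state_b != state_a:
            mismatches.append((local, remote, state_a, state_b))
    return mismatches


# Lines copied from the MASTER and WORKER listings above.
MASTER = """\
tcp 0 0 10.113.48.114:53862 10.113.32.21:5665 ESTABLISHED
tcp 0 209388 10.113.48.114:53795 10.113.32.21:5665 FIN_WAIT1
"""

WORKER = """\
tcp 247 0 10.113.32.21:5665 10.113.48.114:53862 ESTABLISHED
tcp 291094 0 10.113.32.21:5665 10.113.48.114:53795 ESTABLISHED
"""

for local, remote, state_a, state_b in state_mismatches(MASTER, WORKER):
    print(f"{local} <-> {remote}: master sees {state_a}, worker sees {state_b}")
```

For the data above this reports only the connection from port 53795, where the master is in FIN_WAIT1 while the worker still considers the connection ESTABLISHED.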

@icinga-migration

Updated by rhillmann on 2015-08-24 09:09:49 +00:00

I have fixed this behavior on Ubuntu by setting the orphan retries to 5, but this is more of a workaround than a real fix for the problem.
sysctl -w net.ipv4.tcp_orphan_retries=5
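
Note that a value set with `sysctl -w` does not survive a reboot. A small sketch for checking the current value and producing the line for a persistent setting might look like this. The function names are hypothetical; the `/proc` path is the standard location for this sysctl on Linux, and the generated line is meant for `/etc/sysctl.conf` or a file under `/etc/sysctl.d/` (which file to use is left to the admin).

```python
from pathlib import Path

# Standard procfs location of net.ipv4.tcp_orphan_retries on Linux.
PROC_PATH = Path("/proc/sys/net/ipv4/tcp_orphan_retries")


def parse_sysctl_value(text):
    """Convert the raw file contents to an int, or None if not numeric."""
    try:
        return int(text.strip())
    except ValueError:
        return None


def read_orphan_retries(path=PROC_PATH):
    """Return the current setting, or None if the file cannot be read."""
    try:
        return parse_sysctl_value(Path(path).read_text())
    except OSError:
        return None


def sysctl_conf_line(value):
    """Build the configuration line that makes the setting persistent."""
    if value < 0:
        raise ValueError("tcp_orphan_retries must not be negative")
    return f"net.ipv4.tcp_orphan_retries = {value}"


print("current value:", read_orphan_retries())
print("persistent setting:", sysctl_conf_line(5))
```

After adding the line to a sysctl configuration file, `sysctl -p <file>` (or `sysctl --system`) reloads it without a reboot.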

@icinga-migration

Updated by mfrosch on 2015-08-31 14:24:46 +00:00

  • Relates set to 10002

@icinga-migration

Updated by mfrosch on 2015-08-31 14:28:12 +00:00

  • Relates set to 9976

@icinga-migration

Updated by mfriedrich on 2016-02-24 23:45:41 +00:00

  • Status changed from New to Closed

This should be fixed in the most recent release.

@icinga-migration added the bug and area/distributed labels on Jan 17, 2017