[dev.icinga.com #10931] Exception stack trace on icinga2 client when the master reloads the configuration #3816

Closed
icinga-migration opened this issue Jan 5, 2016 · 13 comments
Labels
area/distributed Distributed monitoring (master, satellites, clients) bug Something isn't working

@icinga-migration

This issue has been migrated from Redmine: https://dev.icinga.com/issues/10931

Created by seferovic on 2016-01-05 13:04:53 +00:00

Assignee: seferovic
Status: Resolved (closed on 2016-02-23 09:59:36 +00:00)
Target Version: 2.4.2
Last Update: 2016-02-23 09:59:53 +00:00 (in Redmine)

Icinga Version: 2.4.1
Backport?: Already backported
Include in Changelog: 1

In my test environment I have a master server which sends commands to icinga2 agents. When the master server reloads, the connection to the agent is dropped (?!) and the icinga2 agent logs the following:

[2016-01-05 13:53:11 +0100] warning/JsonRpcConnection: Error while reading JSON-RPC message for identity 'master.monitoring': Error: std::exception

        (0) libbase.so: void boost::throw_exception(icinga::openssl_error const&) (+0x97) [0x7fbde1817a57]
        (1) libbase.so: void boost::exception_detail::throw_exception_(icinga::openssl_error const&, char const*, char const*, int) (+0x40) [0x7fbde1817b00]
        (2) libbase.so: icinga::TlsStream::HandleError() const (+0xbc) [0x7fbde17b646c]
        (3) libbase.so: icinga::TlsStream::Read(void*, unsigned long, bool) (+0x7e) [0x7fbde17b65de]
        (4) libbase.so: icinga::StreamReadContext::FillFromStream(boost::intrusive_ptr const&, bool) (+0x55) [0x7fbde17c56e5]
        (5) libbase.so: icinga::NetString::ReadStringFromStream(boost::intrusive_ptr const&, icinga::String*, icinga::StreamReadContext&, bool) (+0xce) [0x7fbde17d3dde]
        (6) libremote.so: icinga::JsonRpc::ReadMessage(boost::intrusive_ptr const&, boost::intrusive_ptr*, icinga::StreamReadContext&, bool) (+0x3d) [0x7fbde0e5d4ad]
        (7) libremote.so: icinga::JsonRpcConnection::ProcessMessage() (+0x65) [0x7fbde0e81555]
        (8) libremote.so: icinga::JsonRpcConnection::DataAvailableHandler() (+0x38) [0x7fbde0e9de68]
        (9) libbase.so: boost::signals2::detail::signal_impl const&), boost::signals2::optional_last_value, int, std::less, boost::function const&)>, boost::function const&)>, boost::signals2::mutex>::operator()(boost::intrusive_ptr const&) (+0x1bb) [0x7fbde184a0ab]
        (10) libbase.so: icinga::Stream::SignalDataAvailable() (+0x27) [0x7fbde17faa17]
        (11) libbase.so: icinga::TlsStream::OnEvent(int) (+0x3af) [0x7fbde17fae9f]
        (12) libbase.so: icinga::SocketEvents::ThreadProc() (+0x23a) [0x7fbde17f7aca]
        (13) libboost_thread.so.1.53.0:  (+0xc5c3) [0x7fbde222c5c3]
        (14) libpthread.so.0:  (+0x7a51) [0x7fbdde99aa51]
        (15) libc.so.6: clone (+0x6d) [0x7fbddeea093d]

I am not sure if this should be considered a bug, but I don't expect to see such exceptions in the main log just because the master reloaded the config... but I might be wrong, so please excuse the newbie ;)

Thx!



@icinga-migration

Updated by mfriedrich on 2016-01-11 13:13:14 +00:00

  • Priority changed from Normal to Low

It is fairly normal that a reload of the Icinga 2 master closes the TCP connections, causing the client to log an error message. It might be helpful to hide the stack trace in such cases, though it sometimes contains valuable information.
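
One way to picture the trade-off described above (keep the error message, but only print the stack trace when the failure is unexpected) is the following minimal sketch. It is not Icinga 2 code: `PeerDisconnected`, `FormatReadError` and the `verbose` flag are hypothetical names introduced for illustration, and the sketch assumes an expected disconnect can be told apart from other read errors by its exception type.

```cpp
#include <cassert>
#include <exception>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical exception type marking "the peer closed the connection".
// Icinga 2 itself throws icinga::openssl_error here; this is only a stand-in.
struct PeerDisconnected : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Builds the log text for a failed read. The diagnostics (for example a
// stack trace) are appended only for unexpected errors, or when the caller
// explicitly asks for them via `verbose`.
std::string FormatReadError(const std::string& identity,
                            const std::exception& ex,
                            const std::string& diagnostics,
                            bool verbose = false)
{
    assert(!identity.empty());

    std::ostringstream msg;
    msg << "Error while reading JSON-RPC message for identity '"
        << identity << "': " << ex.what();

    // An expected disconnect is recognised by its dynamic type.
    const bool expected = dynamic_cast<const PeerDisconnected*>(&ex) != nullptr;

    if (!expected || verbose)
        msg << "\n" << diagnostics;

    return msg.str();
}
```

With this shape, a reload of the master would still produce a single warning line on the client, while the trace remains available for other failures or when verbose output is requested.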

@icinga-migration

Updated by tgelf on 2016-01-20 12:24:15 +00:00

  • Priority changed from Low to Normal

Might be related to this code snippet (or something similar):

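void TlsStream::CloseInternal(bool inDestructor)
{
    // On the first close outside the destructor, mark EOF and wake up readers.
    if (!m_Eof && !inDestructor) {
        m_Eof = true;
        SignalDataAvailable();
    }
    // ... (rest of the function not quoted in this comment)
}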

Icinga should not try to process available data when there is none, especially once it has already decided to destroy the socket. I'll reset the priority to Normal; I guess this could be easy to fix. These exceptions are pretty disturbing because you get a lot of them in larger setups, and they are confusing when you are looking for problems in your logs.

Cheers,
Thomas
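
The idea in the comment above (do not try to read once the stream is known to be at EOF) can be sketched as follows. This is a simplified illustration, not the actual fix that went into Icinga 2: `SketchStream` and `SketchConnection` are hypothetical stand-ins for `TlsStream` and `JsonRpcConnection`, and the sketch assumes the close notification is kept so that the reader still learns about the disconnect.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a stream that wakes up a reader callback.
class SketchStream {
public:
    std::function<void()> onDataAvailable;

    // Simulates bytes arriving from the peer.
    void Feed(const std::string& data) {
        assert(!m_Eof && "cannot feed a closed stream");
        m_Buffer += data;
        Notify();
    }

    // Marks EOF once and wakes the reader, mirroring the quoted snippet.
    void Close() {
        if (m_Eof)
            return;
        m_Eof = true;
        Notify();
    }

    // True only when the stream is closed and nothing is left to read.
    bool IsEof() const { return m_Eof && m_Buffer.empty(); }

    // Reading at EOF is an error, like the exception in the report.
    std::string Read() {
        if (IsEof())
            throw std::runtime_error("read on closed stream");
        std::string out;
        out.swap(m_Buffer);
        return out;
    }

private:
    void Notify() {
        if (onDataAvailable)
            onDataAvailable();
    }

    std::string m_Buffer;
    bool m_Eof = false;
};

// Hypothetical reader: checks for EOF before attempting to read.
struct SketchConnection {
    SketchStream& stream;
    std::string received;
    int errors = 0;
    bool disconnected = false;

    explicit SketchConnection(SketchStream& s) : stream(s) {
        stream.onDataAvailable = [this] { OnData(); };
    }

    void OnData() {
        if (stream.IsEof()) {      // nothing to process: clean disconnect
            disconnected = true;
            return;
        }
        try {
            received += stream.Read();
        } catch (const std::exception&) {
            ++errors;              // only genuine read failures end up here
        }
    }
};
```

In this sketch, feeding data and then closing the stream leaves `errors` at zero and sets `disconnected`, so a reload on the other side would be handled as an ordinary disconnect rather than as an exception.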

@icinga-migration

Updated by mfriedrich on 2016-01-25 10:32:05 +00:00

  • Target Version set to 2.4.2

@icinga-migration

Updated by ziaunys on 2016-01-26 18:57:22 +00:00

I'm not sure if this is related, but I constantly see this exception in the logs of all my Icinga2 (2.4.1) agents:
https://gist.github.com/Ziaunys/22fcce4a90d1fd002eab

I don't know if this actually causes problems because I don't understand the error.

@icinga-migration

Updated by mfriedrich on 2016-01-27 13:19:59 +00:00

  • Relates set to 11006

@icinga-migration

Updated by gbeutner on 2016-02-04 12:02:10 +00:00

Can you retest this with the latest snapshot?

@icinga-migration

Updated by mfriedrich on 2016-02-04 15:21:52 +00:00

I haven't seen this so far with the latest snapshot packages.

@icinga-migration

Updated by mfriedrich on 2016-02-04 15:22:13 +00:00

  • Status changed from New to Feedback
  • Assigned to set to seferovic

@icinga-migration

Updated by gbeutner on 2016-02-10 07:03:57 +00:00

I'm fairly certain this is fixed in the master branch.

@icinga-migration

Updated by seferovic on 2016-02-11 12:05:41 +00:00

Sorry for the late feedback. I just installed the latest snapshot and unfortunately ran into an "endless restart loop" due to configuration changes. This is probably closed, but I will review it once more after resolving the actual problem at hand.

@icinga-migration

Updated by gbeutner on 2016-02-23 09:59:36 +00:00

  • Status changed from Feedback to Resolved

@icinga-migration

Updated by gbeutner on 2016-02-23 09:59:53 +00:00

  • Backport? changed from Not yet backported to Already backported

@icinga-migration icinga-migration added bug Something isn't working area/distributed Distributed monitoring (master, satellites, clients) labels Jan 17, 2017
@icinga-migration icinga-migration added this to the 2.4.2 milestone Jan 17, 2017
@ricky33000

Running 'apt install chrony' on the master, satellite and client hosts solved the issue "Error while reading JSON-RPC message for identity 'example.com': Error: stream truncated" for me.
It seems that the JSON-RPC connection needs correct NTP time synchronization.
