[dev.icinga.com #12722] GelfWriter with enable_send_perfdata breaks checks #4666

Closed
icinga-migration opened this issue Sep 14, 2016 · 5 comments · Fixed by #5262
Labels: area/graylog (Events to Graylog), area/metrics (General metrics handling), bug (Something isn't working)

@icinga-migration

This issue has been migrated from Redmine: https://dev.icinga.com/issues/12722

Created by akrus on 2016-09-14 08:22:30 +00:00

Assignee: mariussturm
Status: Assigned
Target Version: (none)
Last Update: 2016-09-27 13:51:41 +00:00 (in Redmine)

Icinga Version: 2.5.4
Backport?: Not yet backported
Include in Changelog: 1

Hi,

We've upgraded to the 2.5 branch of Icinga and turned on enable_send_perfdata in GelfWriter. This broke many checks (e.g. check_ssh when it timed out). The error log messages are as follows:

[2016-09-14 11:13:04 +0300] warning/GraphiteWriter: Ignoring invalid perfdata value: time=0,015933s;;;0,000000;10,000000
Context:
        (0) Processing check result for 'hostname!SSH'

[2016-09-14 11:13:04 +0300] debug/GelfWriter: GELF Processing check result for 'hostname!SSH'
[2016-09-14 11:13:04 +0300] critical/ThreadPool: Exception thrown in event handler:
Error: Operator - cannot be applied to values of type 'Service' and 'String'

        (0) libbase.so: void boost::throw_exception >(boost::exception_detail::error_info_injector const&) (+0xe8) [0x2b17f5210068]
        (1) libbase.so: void boost::exception_detail::throw_exception_(std::invalid_argument const&, char const*, char const*, int) (+0x59) [0x2b17f5210119]
        (2) libbase.so: icinga::operator-(icinga::Value const&, icinga::Value const&) (+0x5ca) [0x2b17f51d3dea]
        (3) libperfdata.so: icinga::GelfWriter::CheckResultHandler(boost::intrusive_ptr const&, boost::intrusive_ptr const&) (+0x16cb) [0x2b1800616abb]
        (4) libicinga.so: boost::signals2::detail::signal_impl const&, boost::intrusive_ptr const&, boost::intrusive_ptr const&), boost::signals2::optional_last_value, int, std::less, boost::function const&, boost::intrusive_ptr const&, boost::intrusive_ptr const&)>, boost::function const&, boost::intrusive_ptr const&, boost::intrusive_ptr const&)>, boost::signals2::mutex>::operator()(boost::intrusive_ptr const&, boost::intrusive_ptr const&, boost::intrusive_ptr const&) (+0x202) [0x2b17fb92eba2]
        (5) libicinga.so: icinga::Checkable::ProcessCheckResult(boost::intrusive_ptr const&, boost::intrusive_ptr const&) (+0xf86) [0x2b17fb8a5d96]
        (6) libmethods.so: icinga::PluginCheckTask::ProcessFinishedHandler(boost::intrusive_ptr const&, boost::intrusive_ptr const&, icinga::Value const&, icinga::ProcessResult const&) (+0x4ac) [0x2b17fbc5827c]
        (7) libicinga.so: boost::detail::function::void_function_obj_invoker1, boost::_bi::list2, boost::arg<1> > >, void, icinga::ProcessResult const&>::invoke(boost::detail::function::function_buffer&, icinga::ProcessResult const&) (+0x23) [0x2b17fb8c4ac3]
        (8) libbase.so: boost::detail::function::void_function_obj_invoker0, boost::_bi::list1 > >, void>::invoke(boost::detail::function::function_buffer&) (+0x20) [0x2b17f520ed20]
        (9) libbase.so: icinga::ThreadPool::WorkerThread::ThreadProc(icinga::ThreadPool::Queue&) (+0x326) [0x2b17f51f3bb6]
        (10) libboost_thread.so.1.54.0:  (+0xba4a) [0x2b17f4824a4a]
        (11) libpthread.so.0:  (+0x8184) [0x2b17f4ea9184]
        (12) libc.so.6: clone (+0x6d) [0x2b17f60b637d]
@icinga-migration
Author

Updated by mfriedrich on 2016-09-27 13:51:41 +00:00

  • Status changed from New to Assigned
  • Assigned to set to mariussturm

@marius

Can you please have a look?

@icinga-migration icinga-migration added bug Something isn't working area/metrics General metrics handling labels Jan 17, 2017
@dnsmichi
Contributor

The code parts introduced in d739675 are fairly broken, similar to what's going on inside the LogstashWriter issue #4054.

Seems #3237 wasn't properly tested at all.

@dnsmichi
Contributor

Note:

double ts = cr->GetExecutionEnd();

without a sanity check that cr is not null, this might crash as well.
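
A minimal sketch of the kind of guard meant here, assuming C++17. `CheckResult` below is a hypothetical stand-in for the Icinga class, `std::shared_ptr` is swapped in for `boost::intrusive_ptr` to keep the example self-contained, and `SafeExecutionEnd` is an illustrative helper name, not something taken from the Icinga code base or the eventual fix.

```cpp
#include <memory>
#include <optional>

// Hypothetical stand-in for icinga::CheckResult; only the accessor
// quoted above is modelled.
struct CheckResult {
    double execution_end = 0.0;
    double GetExecutionEnd() const { return execution_end; }
};

// Returns the execution end timestamp, or std::nullopt when no check
// result is available, instead of dereferencing a null pointer.
std::optional<double> SafeExecutionEnd(const std::shared_ptr<CheckResult>& cr)
{
    if (!cr)
        return std::nullopt;

    return cr->GetExecutionEnd();
}
```

A caller can then skip sending the message, or fall back to the current time, when the result is empty.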

@Al2Klimov Al2Klimov mentioned this issue May 12, 2017
@dnsmichi
Contributor

Review

The proposed fix in #5253 does not entirely fix the problem. If the performance data is of type PerfdataValue, it is not sent at all.

There are various code locations where cr may be undefined, which can lead to crashes.

I'll prepare a separate PR for fixing this.
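
To illustrate the PerfdataValue point from the review: a perfdata entry can arrive either as a raw string that still needs parsing or as an already parsed object, and both cases need to end up in the outgoing message. The following is only a sketch under assumptions, not the code from the PR. `PerfdataValue`, `ParsePerfdata` and `CollectFields` are simplified, hypothetical stand-ins, and `std::variant` (C++17) replaces Icinga's own `Value` type.

```cpp
#include <cctype>
#include <map>
#include <stdexcept>
#include <string>
#include <variant>
#include <vector>

// Hypothetical, simplified stand-in for icinga::PerfdataValue.
struct PerfdataValue {
    std::string label;
    double value = 0.0;
};

// An entry is either a raw "label=value[unit];..." string or a parsed object.
using PerfdataEntry = std::variant<std::string, PerfdataValue>;

// Minimal parser: accepts "label=<number>[unit]" followed by optional
// ";"-separated thresholds, which are ignored. Returns false on bad input.
bool ParsePerfdata(const std::string& raw, PerfdataValue& out)
{
    const auto eq = raw.find('=');
    if (eq == std::string::npos || eq == 0)
        return false;

    // Only look at the value field, i.e. everything up to the first ';'.
    const std::string field = raw.substr(eq + 1, raw.find(';', eq) - eq - 1);

    std::size_t used = 0;
    double value = 0.0;
    try {
        value = std::stod(field, &used);
    } catch (const std::exception&) {
        return false;
    }

    // Whatever follows the number must look like a unit ("s", "ms", "%").
    for (std::size_t i = used; i < field.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(field[i]);
        if (!std::isalpha(c) && c != '%')
            return false;
    }

    out.label = raw.substr(0, eq);
    out.value = value;
    return true;
}

// Collects GELF additional fields ("_" prefix) from both kinds of entries.
std::map<std::string, double> CollectFields(const std::vector<PerfdataEntry>& perfdata)
{
    std::map<std::string, double> fields;

    for (const auto& entry : perfdata) {
        PerfdataValue pdv;

        if (const auto* parsed = std::get_if<PerfdataValue>(&entry))
            pdv = *parsed;  // already parsed: use as-is instead of dropping it
        else if (!ParsePerfdata(std::get<std::string>(entry), pdv))
            continue;       // unparsable string: skip this entry only

        fields["_" + pdv.label] = pdv.value;
    }

    return fields;
}
```

Skipping a single bad entry rather than throwing keeps one malformed value from aborting the whole check result handler.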

Tests

The Docker setup described at

http://docs.graylog.org/en/2.2/pages/installation/docker.html#settings

is the easiest way to get a running Graylog instance. Modify docker-compose.yml and add a port mapping for 12201.

vim docker-compose.yml

version: '2'
services:
  mongo:
    image: "mongo:3"
  elasticsearch:
    image: "elasticsearch:2"
    command: "elasticsearch -Des.cluster.name='graylog'"
  graylog:
    image: graylog2/server:2.2.1-1
    environment:
      GRAYLOG_PASSWORD_SECRET: somepasswordpepper
      GRAYLOG_ROOT_PASSWORD_SHA2: 8c6976e5b5410415bde908bd4dee15dfb167a9c873fc4bb8a81f6f2ab448a918
      GRAYLOG_WEB_ENDPOINT_URI: http://127.0.0.1:9000/api
    depends_on:
      - mongo
      - elasticsearch
    ports:
      - "9000:9000"
      - "12201:12201"

docker-compose up

Navigate to http://localhost:9000/system/inputs and add a GELF TCP input listening on port 12201.

icinga2 feature enable gelf

vim features-enabled/gelf.conf

library "perfdata"

object GelfWriter "gelf" {
  host = "127.0.0.1"
  port = 12201
  enable_send_perfdata = true
}
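
For checking the input independently of Icinga, it can help to know roughly what a GELF TCP frame looks like: a JSON object with the required `version`, `host` and `short_message` fields, additional fields prefixed with `_`, and a null byte as the frame delimiter. The sketch below only builds such a frame. `BuildGelfFrame` and `JsonEscape` are hypothetical helper names, the escaping is deliberately minimal, and the field layout is an assumption for illustration, not necessarily what GelfWriter emits.

```cpp
#include <locale>
#include <map>
#include <sstream>
#include <string>

// Escapes quotes, backslashes and newlines for use inside a JSON string.
// Other control characters are not handled in this sketch.
std::string JsonEscape(const std::string& in)
{
    std::string out;
    for (char c : in) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            default:   out += c;
        }
    }
    return out;
}

// Builds one GELF 1.1 message with perfdata as "_name" fields and
// appends the null byte that delimits frames on a GELF TCP input.
// Perfdata names are assumed to contain only characters valid in GELF keys.
std::string BuildGelfFrame(const std::string& host,
                           const std::string& shortMessage,
                           const std::map<std::string, double>& perfdata)
{
    std::ostringstream msg;
    // Force "." as decimal separator regardless of the process locale.
    msg.imbue(std::locale::classic());

    msg << "{\"version\":\"1.1\""
        << ",\"host\":\"" << JsonEscape(host) << "\""
        << ",\"short_message\":\"" << JsonEscape(shortMessage) << "\"";

    for (const auto& kv : perfdata)
        msg << ",\"_" << JsonEscape(kv.first) << "\":" << kv.second;

    msg << "}";

    std::string frame = msg.str();
    frame.push_back('\0');
    return frame;
}
```

Writing the returned bytes to a TCP socket connected to 127.0.0.1:12201 should make the message show up in the Graylog search, provided the input was created as described above.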

@dnsmichi dnsmichi self-assigned this May 15, 2017
@dnsmichi dnsmichi added this to the 2.7.0 milestone May 15, 2017
@dnsmichi dnsmichi added the area/graylog Events to Graylog label May 15, 2017
@dnsmichi
Contributor

Verified working, PR coming soon.

[Screenshot taken 2017-05-15 at 13:45:37]

dnsmichi pushed a commit that referenced this issue May 15, 2017
Includes fixes for possible crashes on empty check results.

fixes #4666
dnsmichi pushed a commit that referenced this issue May 15, 2017
Fix performance data processing in GelfWriter feature

fixes #4666