
[dev.icinga.com #9461] New Graphite schema #3090

Closed
icinga-migration opened this issue Jun 19, 2015 · 25 comments
Labels: area/graphite, blocker, enhancement
Milestone: 2.4.0

@icinga-migration

This issue has been migrated from Redmine: https://dev.icinga.com/issues/9461

Created by tgelf on 2015-06-19 16:18:08 +00:00

Assignee: mfriedrich
Status: Resolved (closed on 2015-09-06 09:25:03 +00:00)
Target Version: 2.4.0
Last Update: 2015-11-06 17:09:04 +00:00 (in Redmine)

Backport?: No
Include in Changelog: 1

The current Graphite writer implementation creates a fairly flat tree while enriching perfdata values with a lot of additional data. We currently encounter the following problems:

Data overhead

There is (much) more additional data than "real" performance data. For many people this means burning resources for data they don't need. This can be solved with additional config flags.

I suggest distinguishing between enable_send_thresholds and enable_send_metadata in the Graphite writer config.

Template assignment

It would be great if we could ship default templates for various checks. This is currently hard or even impossible, as we have no way to figure out what data to expect behind the generated Graphite paths. The service name gives no hint, the command name is not available, and all the extra data on the same level is noisy and hard to filter out efficiently.

I'd opt for a completely new default structure to solve this issue. To avoid trouble with existing trees, I propose using a completely new global prefix, icinga2 instead of icinga. Reason: people do not read changelogs. Anyone who does not like the new structure can delete the new tree and keep filling the legacy one, either by downgrading or by toggling config knobs.

Proposed tree structure

Prefix for hosts:

icinga2.$host.name$.host.$host.check_command$

Prefix for services:

icinga2.$host.name$.services.$service.name$.$service.check_command$

Data should be written as follows:

.perfdata.<perfdata-label>.value

With enable_send_thresholds = true (default should be false) Icinga would add

.perfdata.<perfdata-label>.min
.perfdata.<perfdata-label>.max
.perfdata.<perfdata-label>.warn
.perfdata.<perfdata-label>.crit

With enable_send_metadata = true (should also default to false) it would add

.metadata.execution_time
.metadata.latency
.metadata.state
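
A minimal Python sketch of how the proposed tree could be assembled. It is an illustration only, not the GraphiteWriter implementation: the function name `metric_paths` and the shape of its input dictionaries are hypothetical, and host, service and command names are assumed to be escaped already.

```python
def metric_paths(host, check_command, perfdata, service=None,
                 metadata=None, send_thresholds=False, send_metadata=False):
    """Return {graphite_path: value} following the proposed tree.

    perfdata maps a label to a dict holding "value" and, optionally,
    "min"/"max"/"warn"/"crit" (a hypothetical input shape).
    """
    if service is None:
        prefix = f"icinga2.{host}.host.{check_command}"
    else:
        prefix = f"icinga2.{host}.services.{service}.{check_command}"

    paths = {}
    for label, fields in perfdata.items():
        base = f"{prefix}.perfdata.{label}"
        paths[f"{base}.value"] = fields["value"]
        # Thresholds are only written when explicitly enabled.
        if send_thresholds:
            for key in ("min", "max", "warn", "crit"):
                if fields.get(key) is not None:
                    paths[f"{base}.{key}"] = fields[key]

    # Metadata lives next to "perfdata", below the same prefix.
    if send_metadata and metadata:
        for key in ("execution_time", "latency", "state"):
            if key in metadata:
                paths[f"{prefix}.metadata.{key}"] = metadata[key]
    return paths
```

With both flags left at their proposed default of false, only the `.value` paths are produced, which is the point of the "data overhead" argument above.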

Cheers,
Thomas


Tests

Docker Container:

sudo docker run -d --name graphite --restart=always -p 9090:80 -p 2003:2003 hopsoft/graphite-statsd

Attachments

Changesets

2015-06-19 16:34:04 +00:00 by (unknown) 0bcb0f5

Change Graphite layout (wip)

This changes the entire tree, but with the prefix "icinga2"
not to conflict with existing installations.

refs #9461

2015-06-22 16:15:39 +00:00 by (unknown) 66a10d6

Change Graphite layout (wip)

This changes the entire tree, but with the prefix "icinga2"
not to conflict with existing installations.

refs #9461

2015-08-03 16:03:34 +00:00 by (unknown) 04fccda

Change Graphite layout

This changes the entire tree, but with the prefix "icinga2"
not to conflict with existing installations.

refs #9461

2015-08-03 16:03:34 +00:00 by (unknown) 489514e

Implement enable_legacy_mode for GraphiteWriter

Disabled by default.

refs #9461

2015-09-04 13:43:21 +00:00 by mfriedrich 6b3337c

Change Graphite layout

This changes the entire tree, but with the prefix "icinga2"
not to conflict with existing installations.

refs #9461

2015-09-04 13:43:21 +00:00 by mfriedrich e687c51

Implement enable_legacy_mode for GraphiteWriter

Disabled by default.

refs #9461

2015-09-04 15:33:52 +00:00 by mfriedrich acfb8b2

Update documentation for Graphite tree changes

refs #9461

2015-09-06 08:32:48 +00:00 by mfriedrich c37902f

Change Graphite layout

This changes the entire tree, but with the prefix "icinga2"
not to conflict with existing installations.

refs #9461

2015-09-06 08:32:49 +00:00 by mfriedrich 5b3a47e

Implement enable_legacy_mode for GraphiteWriter

Disabled by default.

refs #9461

2015-09-06 08:32:49 +00:00 by mfriedrich b6a57f8

Update documentation for Graphite tree changes

refs #9461

2015-09-06 08:32:49 +00:00 by mfriedrich 76a14ce

Fix wrong metric labels in legacy mode

refs #9461

2015-09-06 09:10:49 +00:00 by mfriedrich b10cb8a

Implement a better Graphite tree schema

This changes the entire tree, but with the prefix "icinga2"
not to conflict with existing installations. Includes
enable_legacy_mode and detailed documentation.

fixes #9461
fixes #8149

2015-10-20 06:06:25 +00:00 by (unknown) b77c9ed

Remove unnecessary default values

refs #9461
refs #8149

Relations:

@icinga-migration

Updated by tgelf on 2015-06-19 16:18:57 +00:00

NB: This would be the perfect occasion to also fix #8149

@icinga-migration

Updated by tgelf on 2015-06-19 16:19:11 +00:00

  • Relates set to 8149

@icinga-migration

Updated by tgelf on 2015-06-19 16:21:03 +00:00

Forgot to mention: dots in the perfdata label should NOT be escaped; this would allow for even deeper structures.

@icinga-migration

Updated by mfriedrich on 2015-06-19 16:33:50 +00:00

  • File added icinga2_graphite_new_tree_01.png
  • File added icinga2_graphite_new_tree_02.png

This looks like this:

icinga2_graphite_new_tree_01.png

icinga2_graphite_new_tree_02.png

@icinga-migration

Updated by mfriedrich on 2015-06-19 16:39:39 +00:00

  • File added icinga2_graphite_new_tree_03.png

And with enable_send_metadata and enable_send_thresholds enabled:

icinga2_graphite_new_tree_03.png

@icinga-migration

Updated by mfriedrich on 2015-06-22 07:26:00 +00:00

  • Category set to Graphite
  • Status changed from New to Assigned
  • Assigned to set to mfriedrich
  • Estimated Hours set to 8

TODOs:

  • Decide whether to enable or disable enable_* attributes by default
  • Add "legacy" attribute for old behaviour - allows you to run 2 graphite writer instances (one old, one new for example)
  • Documentation

@icinga-migration

Updated by dgoetz on 2015-06-22 11:50:09 +00:00

Can we also get range support for the thresholds, as discussed in https://dev.icinga.org/issues/5043?
This would need something like warn_lower, warn_upper, crit_lower and crit_upper.

Furthermore, will enable_send_thresholds and enable_send_metadata be parameters of the writer? Could these options also be set on a per-service or per-command basis?
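
To illustrate the warn_lower/warn_upper/crit_lower/crit_upper request, here is a hedged Python sketch that splits a threshold range into its bounds. It assumes the range syntax from the monitoring plugin development guidelines (`10`, `10:`, `~:10`, `10:20`, `@10:20`); the function names are hypothetical and nothing here reflects how Icinga 2 parses or exports thresholds.

```python
import math


def parse_range(spec):
    """Split a plugin-style range into (lower, upper, inside).

    "10" means 0..10, "10:" means 10..infinity, "~:10" means
    -infinity..10, and a leading "@" inverts the alert logic.
    """
    inside = spec.startswith("@")
    if inside:
        spec = spec[1:]
    if ":" in spec:
        start, end = spec.split(":", 1)
    else:
        start, end = "0", spec
    lower = -math.inf if start == "~" else float(start or 0)
    upper = math.inf if end == "" else float(end)
    return lower, upper, inside


def threshold_fields(warn=None, crit=None):
    """Build the suggested *_lower/*_upper fields, skipping infinite bounds."""
    fields = {}
    for name, spec in (("warn", warn), ("crit", crit)):
        if spec is None:
            continue
        lower, upper, _inside = parse_range(spec)
        if math.isfinite(lower):
            fields[f"{name}_lower"] = lower
        if math.isfinite(upper):
            fields[f"{name}_upper"] = upper
    return fields
```

Infinite bounds are skipped because Graphite stores numeric datapoints, so an open-ended range simply yields one field instead of two.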

@icinga-migration

Updated by mfriedrich on 2015-06-22 12:29:27 +00:00

  • Relates set to 5043

@icinga-migration

Updated by mfriedrich on 2015-06-23 07:21:18 +00:00

dirk wrote:

> Can we also get range support for the thresholds, as discussed in https://dev.icinga.org/issues/5043?
> This would need something like warn_lower, warn_upper, crit_lower and crit_upper.

#5043 has implementation and design issues; I will comment on that one separately. Please create a follow-up feature request once #5043 is fully implemented, so that it does not block this implementation. We can still add the new data fields below "perfdata", next to "value", once parsed data sources are available.

> Furthermore, will enable_send_thresholds and enable_send_metadata be parameters of the writer? Could these options also be set on a per-service or per-command basis?

I see a use case similar to the {host,service}_format_template attributes: set them on a global basis, so they are either enabled or disabled everywhere. Cluttering the service or checkcommand object types with additional parameters could lead to confusion along the lines of "why does this service have these items, and others don't", which is impossible to determine from Graphite's point of view.

If you still think this should be taken into account, please discuss it in a separate issue. If such options are added, they can be added later on without blocking this issue.

@icinga-migration

Updated by mfriedrich on 2015-06-26 09:06:57 +00:00

  • Target Version set to Backlog

@icinga-migration

Updated by gbeutner on 2015-06-30 06:54:08 +00:00

  • Relates set to 9515

@icinga-migration

Updated by espenfjo on 2015-07-05 14:50:33 +00:00

It would be very nice if this could become a general cleanup of the perfdata structure. After looking into how best to create a logical structure for tag-based graphing systems (OpenTSDB and KairosDB), it feels like Icinga is lacking quite a bit here.

I have been experimenting with several new variables that can optionally be set for each service/host, e.g.:

apply Service "ping4" {
  import "generic-service"
  vars.kairosdb_metric_name = "ping"
  vars.kairosdb_tags = ["protocol=ipv4"]
  check_command = "ping4"
  assign where host.address
}

This will create a metric named `ping` in KairosDB with the following tags:

tag | description
--- | ---
host | Host name that is being probed
source | Icinga instance probing the host
protocol | Custom tag defined in the service, here set to ipv4 for the ping4 check
target | The `ping4` probe creates three different target tags: `meta`, `rta` and `pl`. `rta` and `pl` are the sub checks of the probe; `meta` is the internal check statistic mentioned above.

I would think that, if generalised, one would be able to use a similar structure for every^Wmost graphing systems.

@icinga-migration

Updated by mfriedrich on 2015-07-06 15:16:50 +00:00

Perfdata is just a specific value class being parsed from the string values a host/service check provides. There's not much one could do about structure here.

I also think that OpenTSDB and InfluxDB (or collectd) take a different, tag-based approach to their tree structure, which is not necessarily what one would do in Graphite. That is why discussions about such trees should be kept in separate issues, as each writer will prefer its own schema.

@icinga-migration

Updated by mfriedrich on 2015-07-17 20:32:19 +00:00

  • Relates set to 9667

@icinga-migration

Updated by mfriedrich on 2015-08-03 15:46:16 +00:00

  • Relates deleted 9515

@icinga-migration

Updated by mfriedrich on 2015-08-03 15:46:26 +00:00

  • Duplicated set to 9515

@icinga-migration

Updated by mfriedrich on 2015-08-27 16:17:00 +00:00

  • Priority changed from Normal to High

@icinga-migration

Updated by mfriedrich on 2015-09-06 08:30:27 +00:00

  • Target Version changed from Backlog to 2.4.0

@icinga-migration

Updated by mfriedrich on 2015-09-06 08:34:55 +00:00

  • File added Auswahl_081.png

To enable the old mode, add the following to your graphite.conf:

object GraphiteWriter "graphite" {
  host = "127.0.0.1"
  port = 2003
  host_name_template = "icinga.$host.name$"
  service_name_template = "icinga.$host.name$.$service.name$"
  enable_legacy_mode = true
}

Auswahl_081.png

@icinga-migration

Updated by mfriedrich on 2015-09-06 08:55:09 +00:00

  • Description updated

I'm using this nice Docker container for tests.

sudo docker run -d --name graphite --restart=always -p 9090:80 -p 2003:2003 hopsoft/graphite-statsd
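
For a quick smoke test against that container, a small Python sketch along these lines may help. It assumes port 2003 is Carbon's plaintext listener, which expects one `path value timestamp` line per metric; the function names and the example metric path are made up for illustration.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Format one metric as a Carbon plaintext line: 'path value timestamp'."""
    if timestamp is None:
        timestamp = time.time()
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(path, value, host="127.0.0.1", port=2003, timestamp=None):
    """Send a single metric to the (assumed) Carbon plaintext port."""
    line = format_metric(path, value, timestamp)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


# Example call (hypothetical path, requires the container to be running):
# send_metric("icinga2.test-host.host.hostalive.perfdata.rta.value", 0.005)
```

The metric should then show up in the Graphite web interface, which the command above publishes on port 9090.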

@icinga-migration

Updated by mfriedrich on 2015-09-06 09:04:47 +00:00

  • File added Auswahl_082.png
  • File added Auswahl_083.png
  • File added Auswahl_084.png

New Schema

Again, without legacy mode.

Auswahl_082.png

Dashes are not escaped

object Host "dash-test-01" {
  import NodeName
  address = "127.0.0.1"
}

Auswahl_084.png

Perfdata Labels not escaped

[2015-09-06 10:57:19 +0200] information/ExternalCommandListener: Executing external command: [1441529839] PROCESS_SERVICE_CHECK_RESULT;google.com;my-ping4;0;perfdata test|df.root.mbytes=1024 df.root.mbytes_used=512 df.root.mbytes_free=512 

Auswahl_083.png
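
The effect of leaving perfdata labels unescaped can be sketched in Python as follows. This is a simplified illustration, not the writer's code: `perfdata_to_paths` is a hypothetical helper, the prefix is passed in ready-made, and quoted labels containing spaces are not handled.

```python
import re


def perfdata_to_paths(prefix, perfdata):
    """Map 'label=value[unit][;warn;crit;min;max]' tokens to Graphite paths.

    Dots in the label are kept as-is, so each dot adds a tree level.
    """
    paths = {}
    for token in perfdata.split():
        label, _, rest = token.partition("=")
        raw_value = rest.split(";", 1)[0]
        # Keep the leading number and drop any unit suffix such as "ms".
        match = re.match(r"-?\d+(?:\.\d+)?", raw_value)
        if not label or match is None:
            continue
        paths[f"{prefix}.perfdata.{label}.value"] = float(match.group())
    return paths
```

Fed with the perfdata string from the log line above, the three labels end up as `mbytes`, `mbytes_used` and `mbytes_free` nodes below a shared `df.root` branch instead of three flat siblings.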

@icinga-migration

Updated by mfriedrich on 2015-09-06 09:25:03 +00:00

  • Status changed from Assigned to Resolved
  • Done % changed from 0 to 100

Applied in changeset b10cb8a.

@icinga-migration

Updated by mfriedrich on 2015-09-07 14:17:31 +00:00

  • Backport? changed from TBD to No

@icinga-migration

Updated by mfriedrich on 2015-09-30 12:43:25 +00:00

  • Subject changed from Graphite writer tree proposal to New GraphiteWriter tree

@icinga-migration

Updated by gbeutner on 2015-11-06 17:09:04 +00:00

  • Subject changed from New GraphiteWriter tree to New Graphite schema

@icinga-migration added the blocker, enhancement and area/graphite labels on Jan 17, 2017
@icinga-migration added this to the 2.4.0 milestone on Jan 17, 2017