
[dev.icinga.com #7185] Stats grouping by columns in Livestatus missing #1970

Closed
icinga-migration opened this issue Sep 11, 2014 · 15 comments
Labels
area/livestatus Legacy interface bug Something isn't working

Comments


This issue has been migrated from Redmine: https://dev.icinga.com/issues/7185

Created by tgelf on 2014-09-11 08:00:21 +00:00

Assignee: (none)
Status: Closed (closed on 2016-12-07 22:01:33 +00:00)
Target Version: (none)
Last Update: 2016-12-07 22:01:33 +00:00 (in Redmine)

Icinga Version: 2.1.0

Test query:

GET services
Columns: host_name
Stats: state = 0
Stats: state > 0

Expected result:

localhost;5;2
localhost1;3;1
localhost2;4;0

Got:

localhost;11;3

This is the related Livestatus documentation snippet about grouping:

In such situations you can add the Columns: header to your query. There is a simple and yet mighty notion behind it: You specify a list of columns of your table. The stats are computed and displayed separately for each different combination of values of these columns.

Cheers,
Thomas



Updated by mfriedrich on 2014-09-16 09:14:51 +00:00

  • Category set to Livestatus
  • Project changed from 34 to Icinga 2


Updated by mfriedrich on 2015-02-16 14:13:22 +00:00

That's reproducible, though I currently have no idea how to fix the aggregator to deal with these column combinations.

In livestatusquery.cpp:530, in the else branch:

if (m_Columns.size() > 0) {
  //pass that to the aggregator
}
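
For illustration only, a minimal self-contained sketch (not the actual Icinga 2 code; all names are hypothetical) of what passing the column values to the aggregator could mean: keep one set of stats counters per distinct combination of the requested column values instead of a single global set.

#include <iostream>
#include <map>
#include <string>
#include <vector>

// Hypothetical row type: one service with the attributes we group and stat on.
struct ServiceRow {
    std::string host_name;
    int state;
};

int main() {
    std::vector<ServiceRow> rows = {
        { "localhost", 0 }, { "localhost", 2 }, { "localhost1", 0 }, { "localhost2", 1 }
    };

    // Group key = values of the "Columns:" header (here only host_name);
    // value = one counter per "Stats:" line (state = 0, state > 0).
    std::map<std::vector<std::string>, std::vector<long>> groups;

    for (const auto& row : rows) {
        std::vector<std::string> key = { row.host_name };
        auto& stats = groups[key];
        if (stats.empty())
            stats.assign(2, 0);
        if (row.state == 0) stats[0]++;
        if (row.state > 0)  stats[1]++;
    }

    // One output line per group: the column values first, then the stats.
    for (const auto& [key, stats] : groups) {
        for (const auto& col : key)
            std::cout << col << ";";
        std::cout << stats[0] << ";" << stats[1] << "\n";
    }

    return 0;
}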


Updated by mfriedrich on 2015-02-16 14:14:26 +00:00

  • Subject changed from Grouping in Livestatus seems to be missing to Stats grouping by columns in Livestatus missing


Updated by LaMi on 2015-04-28 15:25:23 +00:00

I wanted to push this issue. In my opinion this is not just a missing feature, because it breaks NagVis, or at least makes NagVis show wrong information.
State counts of some hosts are shown for other hosts, while those hosts get no data.

NagVis makes use of these kinds of queries to keep the number of queries low. They are used to get the summary state of multiple hosts and their services at once.

For users who use NagVis this may be a show stopper.

An example query used by NagVis:

GET services
Filter: host_name = SRV1
Filter: host_name = SRV2
Filter: host_name = SRV3
Filter: host_name = SRV4
Filter: host_name = SRV5
Or: 5
Stats: has_been_checked = 0
Stats: last_hard_state = 0
Stats: has_been_checked != 0
Stats: scheduled_downtime_depth = 0
Stats: host_scheduled_downtime_depth = 0
Stats: staleness < 1.5
StatsAnd: 5
Stats: last_hard_state = 0
Stats: has_been_checked != 0
Stats: scheduled_downtime_depth = 0
Stats: host_scheduled_downtime_depth = 0
Stats: staleness >= 1.5
StatsAnd: 5
Stats: last_hard_state = 0
Stats: has_been_checked != 0
Stats: scheduled_downtime_depth > 0
Stats: host_scheduled_downtime_depth > 0
StatsOr: 2
StatsAnd: 3
Stats: last_hard_state = 1
Stats: acknowledged = 0
Stats: host_acknowledged = 0
Stats: scheduled_downtime_depth = 0
Stats: host_scheduled_downtime_depth = 0
Stats: staleness < 1.5
StatsAnd: 6
Stats: last_hard_state = 1
Stats: acknowledged = 0
Stats: host_acknowledged = 0
Stats: scheduled_downtime_depth = 0
Stats: host_scheduled_downtime_depth = 0
Stats: staleness >= 1.5
StatsAnd: 6
Stats: last_hard_state = 1
Stats: acknowledged = 1
Stats: host_acknowledged = 1
StatsOr: 2
StatsAnd: 2
Stats: last_hard_state = 1
Stats: scheduled_downtime_depth > 0
Stats: host_scheduled_downtime_depth > 0
StatsOr: 2
StatsAnd: 2
Stats: last_hard_state = 2
Stats: acknowledged = 0
Stats: host_acknowledged = 0
Stats: scheduled_downtime_depth = 0
Stats: host_scheduled_downtime_depth = 0
Stats: staleness < 1.5
StatsAnd: 6
Stats: last_hard_state = 2
Stats: acknowledged = 0
Stats: host_acknowledged = 0
Stats: scheduled_downtime_depth = 0
Stats: host_scheduled_downtime_depth = 0
Stats: staleness >= 1.5
StatsAnd: 6
Stats: last_hard_state = 2
Stats: acknowledged = 1
Stats: host_acknowledged = 1
StatsOr: 2
StatsAnd: 2
Stats: last_hard_state = 2
Stats: scheduled_downtime_depth > 0
Stats: host_scheduled_downtime_depth > 0
StatsOr: 2
StatsAnd: 2
Stats: last_hard_state = 3
Stats: acknowledged = 0
Stats: host_acknowledged = 0
Stats: scheduled_downtime_depth = 0
Stats: host_scheduled_downtime_depth = 0
Stats: staleness < 1.5
StatsAnd: 6
Stats: last_hard_state = 3
Stats: acknowledged = 0
Stats: host_acknowledged = 0
Stats: scheduled_downtime_depth = 0
Stats: host_scheduled_downtime_depth = 0
Stats: staleness >= 1.5
StatsAnd: 6
Stats: last_hard_state = 3
Stats: acknowledged = 1
Stats: host_acknowledged = 1
StatsOr: 2
StatsAnd: 2
Stats: last_hard_state = 3
Stats: scheduled_downtime_depth > 0
Stats: host_scheduled_downtime_depth > 0
StatsOr: 2
StatsAnd: 2
Columns: host_name host_alias

[
   ["SRV1","SRV1",0.0,67.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
]


Updated by dnotivol on 2015-04-29 11:22:27 +00:00

  • File added livestatus.cmd

I'm having the same problem with Livestatus. When running the NagVis queries, I get different results in Livestatus for Icinga 1 and Icinga 2.

I guess this will postpone my migration to Icinga 2 (although I have everything set up, and I really want to use all the new features in production). This makes the NagVis response unpredictable, because it shows a random number of services for each host.

The output from Icinga 1's Livestatus:

200         465
[["SERVER-001","SERVER-001",0,0,7,0,0,0,0,0,0,0,0,0,0,0,0,0],
["SERVER-002","SERVER-002",0,0,5,0,0,0,0,0,0,0,0,0,0,0,0,0],
["SERVER-003","SERVER-003",0,0,10,0,0,0,0,0,0,0,0,0,0,0,0,0],
["SERVER-004","SERVER-004",0,0,10,0,0,0,0,0,0,0,0,0,0,0,0,0],
["SERVER-005","SERVER-005",0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0],
["SERVER-006","SERVER-006",0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0]]

The output from Icinga 2's Livestatus:

200         110
[["SERVER-003","SERVER-003",0.0,34.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]]

I'm attaching the query launched by NagVis (I think it's the same one that LaMi posted).


Updated by mfriedrich on 2015-05-08 16:31:00 +00:00

  • Status changed from New to Assigned
  • Assigned to set to mfriedrich


Updated by mfriedrich on 2015-06-23 13:26:15 +00:00

  • Target Version set to Backlog


Updated by mfriedrich on 2016-01-29 14:46:30 +00:00

  • Status changed from Assigned to New
  • Assigned to deleted mfriedrich

I currently have neither the time nor the resources to look into Livestatus-specific work involving multiple days of development. If anyone else is capable of sending in a patch, I'll gladly review and merge it.


Updated by mfriedrich on 2016-02-03 17:07:19 +00:00

The problem is located in this section:

/* add aggregated stats */

One would need to (see the sketch below):

  1. selectively check the groups inside the results loop
  2. find a way to merge these groups inside the aggregation calls, rather than the entire result set
  3. print these grouped result sets
  4. fix the "Columns" filter only taking the first selected column
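
A rough, hypothetical sketch of how steps 1-3 could fit together, assuming a simplified row and aggregator model (this is not the Icinga 2 implementation, just an outline of the approach): instead of feeding every row into one shared set of aggregators, clone the configured aggregators once per distinct column-value combination and print one result line per group.

#include <functional>
#include <iostream>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for the real query machinery.
struct Row {
    std::map<std::string, std::string> attrs;
};

// One counter per "Stats:" definition, applied only to rows matching its filter.
struct CountAggregator {
    std::function<bool(const Row&)> filter;
    long count = 0;
    void Apply(const Row& row) { if (filter(row)) count++; }
};

int main() {
    std::vector<std::string> columns = { "host_name" };

    // The "Stats:" definitions, expressed as filter predicates.
    std::vector<std::function<bool(const Row&)>> statFilters = {
        [](const Row& r) { return r.attrs.at("state") == "0"; },
        [](const Row& r) { return r.attrs.at("state") != "0"; }
    };

    std::vector<Row> result = {
        { { { "host_name", "localhost" },  { "state", "0" } } },
        { { { "host_name", "localhost" },  { "state", "2" } } },
        { { { "host_name", "localhost1" }, { "state", "0" } } }
    };

    // Steps 1+2: group rows inside the results loop and keep one aggregator set per group.
    std::map<std::vector<std::string>, std::vector<CountAggregator>> grouped;

    for (const auto& row : result) {
        std::vector<std::string> key;
        for (const auto& col : columns)
            key.push_back(row.attrs.at(col));

        auto& aggs = grouped[key];
        if (aggs.empty())
            for (const auto& f : statFilters)
                aggs.push_back(CountAggregator{ f });

        for (auto& agg : aggs)
            agg.Apply(row);
    }

    // Step 3: print one line per group, column values first, then the stats.
    for (const auto& [key, aggs] : grouped) {
        for (const auto& col : key)
            std::cout << col << ";";
        for (size_t i = 0; i < aggs.size(); i++)
            std::cout << aggs[i].count << (i + 1 < aggs.size() ? ";" : "\n");
    }

    return 0;
}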


Updated by mfriedrich on 2016-03-18 14:56:56 +00:00

  • Relates set to 10904


Updated by gbeutner on 2016-03-29 07:01:13 +00:00

  • Duplicated set to 11454


Updated by byb39 on 2016-07-21 01:51:42 +00:00

Hi,

I am using NagVis 1.9b8 with Icinga 2.3.3, and this issue seems to be resolved when using IDO as a backend. It may be the same for Livestatus. I was getting the same issue with random hosts/services when using NagVis 1.85 stable.

Best Regards,


Updated by mfriedrich on 2016-07-21 06:18:18 +00:00

DB IDO uses a different NagVis code path to fetch the data. One thing recently changed there: I've sent them a patch fixing inactive DB objects. That should make the IDO backend the preferred one in combination with Icinga and Icinga Web 2.

Nonetheless, the Livestatus bug still exists and isn't easy to tackle.


Updated by velin on 2016-09-08 07:55:07 +00:00

That should make the IDO backend the preferred one in combination with Icinga and Icinga Web 2.
Nonetheless the Livestatus bug still exists, and isn't easy to tackle.

Hi,

I'm using Icinga (and Icinga Web 2) with PostgreSQL. NagVis seems to support only MySQL DB IDO and Livestatus, so this bug is quite important for environments where PostgreSQL is used.


Updated by mfriedrich on 2016-12-07 22:01:33 +00:00

  • Status changed from New to Closed
  • Target Version deleted Backlog

To be honest, I will not fix it, and neither will anyone else. Livestatus isn't our primary protocol inside the Icinga stack, and neither is Compat, for instance. I'm closing this, unfortunately, as wontfix.

If you really want NagVis to work with Icinga 2/Icinga Web 2, wrap your head around writing either an Icinga 2 or an Icinga Web 2 API backend, or extend the MySQL backend for PostgreSQL usage and contribute it upstream.
