[dev.icinga.com #2178] Very long database query time for Servicegroups cronk #616
Comments
Updated by ralfk on 2011-12-12 13:54:09 +00:00 Even though I close the "Loading" Servicegroups cronk the associated MySQL queries are still active and need to be manually killed using e.g. mysqladmin. That's bad. Isn't there a method to kill the database session/process when the hanging cronk is closed? |
Updated by mhein on 2012-01-20 11:37:37 +00:00
|
Updated by mhein on 2012-02-10 09:23:14 +00:00
|
Updated by jmosshammer on 2012-02-21 12:31:52 +00:00
Hi, can you try to add those indexes to your database and tell me if the problem is solved afterwards?: CREATE INDEX idx_all on icinga_servicegroups (servicegroup_id,instance_id,servicegroup_object_id,config_type,alias); |
Updated by ralfk on 2012-02-23 12:55:08 +00:00 I implemented the indexes, but unfortunately it did not improve it. Same behavior as before. Maybe the 2nd table icinga_servicegroup_members with 3006 rows would be a candidate to be significantly improved by indexing. The other two tables have only 48 and 346 rows. |
Updated by jmosshammer on 2012-02-23 13:22:16 +00:00
Thx for testing!
|
Updated by ralfk on 2012-02-24 07:15:04 +00:00
See attached query result (tab separated). |
Updated by jmosshammer on 2012-02-24 09:01:56 +00:00 Sorry, i think i forgot one index in my last comment: DROP INDEX idx_all on icinga_servicegroups; On my machine row 12 (which should be the bottleneck) changed from 'all' to 'index' by adding this index. |
Updated by ralfk on 2012-02-24 10:52:36 +00:00
I deleted the indexes and created the one on icinga_service_contactgroups. No improvement. See attached explain query output. Just an idea: Since the query contains the instance, maybe the slow performance is related to bug #2179 ? |
Updated by jmosshammer on 2012-02-24 11:03:33 +00:00 Hi,
Could you please readd those two indices and see if the performance is better: CREATE INDEX idx_all on icinga_servicegroup_members (servicegroup_member_id,instance_id,servicegroup_id,service_object_id); |
Updated by ralfk on 2012-02-24 11:16:31 +00:00 Did that. No improvement. Would it maybe help if you would get a copy of the icinga DB? If so, only on a separate and secure channel, since the DB contains some internal info that I don't want to become public. The DB is (unzipped) 1.3 GB big ;-) |
Updated by jmosshammer on 2012-02-24 11:21:39 +00:00 Ok, I hoped we could fix it the easy way ;) I'm now going to setup a test configuration for that - i.e. with a lot of servicegroups and contactgroups. If think this should be sufficient to reproduce the error. |
Updated by ralfk on 2012-02-24 11:36:40 +00:00 OK. In general it would make sense that you have a big DB for testing :-) I feel that the new GUI has still some room for improvement concerning performance. Some cronks which even work are not really fast (compared to old GUI). Since the new GUI can't still not replace the old one (at least for my requirements: A user shall only see his hosts and services configured as contact) I don't have much experience with the performance of the new GUI. If needed functionality is given I will more often use the new GUI and you will get more feedback. |
Updated by jmosshammer on 2012-02-27 10:39:48 +00:00
Hi, I've done some analyzing, - the problem lies in the grouping and counting fields (which can be a little bit tricky in mysql - altough pgsql shows the same issue). As this will require some deeper changes, indices and additional query rewrites i'm going to postpone it to 1.7. |
Updated by ralfk on 2012-02-27 11:00:56 +00:00 OK. Thanks for your effort. As soon as I can help testing give me a note here. |
Updated by jmosshammer on 2012-03-01 18:38:43 +00:00
Applied in changeset f80da0a. |
Updated by jmosshammer on 2012-03-07 08:41:47 +00:00 Hi, please try the jmosshammer/templateDQL branch. I've rewritten the database backend code for the templates, servicegroups (and especially notifications) are now in an acceptable speed when I add all possible credentials: https://git.icinga.org/?p=icinga-web.git;a=commit;h=97f27b513774c20a0bbe32058cac94b169f4b7a0 Please let me know if you encountered any issues in #2368, thanks! |
Updated by ralfk on 2012-05-30 14:40:30 +00:00 I updated to 1.7.0 today (sorry for being so late). In general the Servicegroup cronk does work now. Great. But the refresh (query) still takes 20-30 seconds for 8 service groups with about 1600 service checks (which is maybe 50% of the overall amount of groups and checks). There are still many "copy to tmp table" MySQL processes. So, there is still room for improvement for users which only should see their service groups and service checks according to the setting "Only show items that contain a contact with this name in their contactgroup definitions". |
Updated by tgelf on 2012-05-31 11:26:45 +00:00 @jmosshammer: as soon as you are creating an index involving the autoincremental primary key (such as service_contactgroup_id in icinga_service_contactgroups) it is going to be pretty useless: it's cardinality will match the table row count. Please also note that MySQL is incapable of combining mulitiple indexes efficiently, it usually chooses one index for each involved table. What we are talking about here are so-called m:n relations. The only useful columns such tables (e.g. icinga_xy_members) are those referencing related objects. This means that the best index will involve just two columns on each table looking like the ones you named before. As object_id is unique throughout all instances, the instance_id is not important for your joins. Unfortunately we have already had such indexes before Icinga 1.5 (ido), they have been dropped. Please see issue #2618 for farther details. |
Updated by ralfk on 2012-10-22 13:33:16 +00:00 Just for info: After upgrade to 1.8.0 the same servicegroup cronk query takes about 4 seconds. So, it has been significantly improved from 1.7.2 to 1.8.0. |
Updated by mfriedrich on 2012-10-22 13:38:28 +00:00
within #2618 there were serveral badly deleted indexes re-added, which should have helped in that case as well. |
This issue has been migrated from Redmine: https://dev.icinga.com/issues/2178
Created by ralfk on 2011-12-12 13:28:03 +00:00
Assignee: jmosshammer
Status: Resolved (closed on 2012-03-01 18:38:43 +00:00)
Target Version: 1.7
Last Update: 2012-10-22 13:38:27 +00:00 (in Redmine)
When I start the Servicegroup cronk, a database query is initiated that takes a very long time (or even lasts forever). Two queries are started initially and after 300 seconds of reload time another query is started, and so on. The query is:
On a smaller system (i.e. with much less services etc.) the same query is fast. But on my main system it blocks my system due to heavy load and multiple queries at same time (MySQL "copy to tmp table"). The InnoDB buffer size is 256 MB (which could of course be higher, but other applications and the rest of Icinga work smoothly).
Important: I am pretty sure that this only happens under following conditions:
If the user is member of all items the servicegroups are displayed quickly (1). If the definitions option (2) is disabled then it's also quick.
My system:
Icinga 1.6.1, Icinga Web 1.6.0, Linux 64 bit, MySQL 5.1.53
No of hosts: 350
No of services: 2500
No of hostgroups: 46
No of servicegroups: 47
Attachments
Changesets
2012-03-01 16:51:21 +00:00 by jmosshammer f80da0a
2012-04-17 08:25:38 +00:00 by jmosshammer 4d28ddb11637dbff8c6a32cdcacb81dd984d7a82
2012-04-17 08:40:21 +00:00 by jmosshammer e7e5859
Relations:
The text was updated successfully, but these errors were encountered: