New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[dev.icinga.com #12147] IDO module starts threads before daemonize #4363
Comments
Updated by mfriedrich on 2016-07-13 17:37:48 +00:00
|
Updated by mfriedrich on 2016-07-13 17:39:15 +00:00 It would be helpful if you could provide a full gdb backtrace :) |
Updated by elippmann on 2016-07-14 07:42:26 +00:00
|
Updated by mfriedrich on 2016-07-14 08:58:21 +00:00
|
Updated by mfriedrich on 2016-07-20 08:28:51 +00:00 Can you please test the latest snapshot packages as well? :) |
Updated by tim_1740 on 2016-07-25 15:38:27 +00:00 How can I get the latest snapshot packages ? |
Updated by mfriedrich on 2016-07-25 15:41:03 +00:00 You'll need to enable the snapshot repository (and disable the release one). Then you may install the icinga2 snapshot packages. More instructions over here: http://packages.icinga.org/epel/ |
Updated by tim_1740 on 2016-07-26 08:50:34 +00:00 I tested with the snapshot version but it seems to be the same problem. (It activated all objects, but never update database) Then I tried with a smaller environment and I had this error, but it seems to be a different problem :
Regards, |
Updated by mfriedrich on 2016-07-28 15:15:21 +00:00 The snapshot packages require an updated schema from the 2.5.0.sql file. The schema is compatible to previous versions (2.4.x) so upgrading and downgrading doesn't hurt. |
Updated by tim_1740 on 2016-08-01 15:27:06 +00:00 There is a problem with the upgrade sql. Something with default values : But I think that all this is off topic, because with my complete environment, icinga don't even try to update the database. |
Updated by mfriedrich on 2016-08-01 16:05:29 +00:00
I'm seeing a similar issue with MySQL only. Patched debug output for analysing the CanExecuteQuery() calls and query queue reordering.
All of these items are not present in a fresh DB IDO installation, GetOrCreateByObject() still gets called anyways. |
Updated by mfriedrich on 2016-08-02 12:28:45 +00:00
Ok figured it out. My "categories" attribute was not set accordingly which caused multiple re-enqueues due to a recent change in the code. Although the overall insert works fine. Could you share your current IdoMysqlConnection object configuration (remove sensitive information such as host, user or passwords). |
Updated by tim_1740 on 2016-08-02 14:41:02 +00:00 This is my configuration (/etc/icinga2/features-available/ido-mysql.conf) : /**** The db_ido_mysql library implements IDO functionality for MySQL. */ library "db_ido_mysql" object IdoMysqlConnection "ido-mysql" { |
Updated by mfriedrich on 2016-08-02 14:55:43 +00:00 Drop "categories" and try again please. |
Updated by tim_1740 on 2016-08-02 15:52:20 +00:00
It's the same without categories. Daemon logs : [2016-08-02 17:40:59 +0200] information/cli: Icinga application loader (version: v2.4.10-552-gf2bacb2) Debug logs attached. (You can see that nothing is update in database) |
Updated by mfriedrich on 2016-08-02 15:59:37 +00:00 Please attach a gdb backtrace from the running process.
|
Updated by tim_1740 on 2016-08-03 09:22:55 +00:00
This is my gdb_bt.log files (attached). But I don't know if it's normal : when I run the gdb command, it lasts only two seconds. With this messages : warning: the debug information found in "/usr/lib/debug//usr/lib64/icinga2/sbin/icinga2.debug" does not match "/usr/lib64/icinga2/sbin/icinga2" (CRC mismatch). Then I did it 3 times : just after start, during start and after object activation. Is it enought for you ? |
Updated by mfriedrich on 2016-08-03 09:37:29 +00:00 The first two backtraces are coming from the time when the configuration objects are compiled and then committed. The last one seems to be coming from a running process. I guess your initial startup is taking quite a long time. Probably it is best to generate a full backtrace from the point in time when IDO Mysql says that it successfully connected to the database, version 1.14.x. |
Updated by tim_1740 on 2016-08-03 09:46:15 +00:00 You mean when I see this message : Because this message never appears with my complete environment. (as you can see in my debug logs) |
Updated by mfriedrich on 2016-08-03 10:01:22 +00:00 Strange. Can you start the instance where IDO is running on in foreground ("icinga2 daemon") and check its output? |
Updated by tim_1740 on 2016-08-03 12:46:19 +00:00
icinga2 - The Icinga 2 network monitoring daemon (version: v2.4.10-579-g1ff6939) Copyright © 2012-2016 Icinga Development Team (https://www.icinga.org/) Application information: System information: I just switch my icinga configuration with less objects and it works fine : Daemon outputs : /usr/sbin/icinga2 daemon -d I don't know if it's important, but as you can see, the line : information/DbConnection: Resuming IDO connection: ido-mysql In debug logs all is ok. (attached) And gdb backtrace if needed. (attached) |
Updated by tim_1740 on 2016-08-12 09:44:23 +00:00 Do you have any news for me? Regards, |
Updated by mfriedrich on 2016-08-12 18:09:57 +00:00 Please try the most recent snapshot packages (from today). Then please run icinga2 with strace.
Furthermore you should see in the debug logs which threads have been started for workqueues (that's our current idea). Please post your findings then here again. Thanks. |
Updated by tim_1740 on 2016-08-16 13:30:05 +00:00
I installed the last snapshot :
But when I run it with strace as you specified, it doesn't work. It's lock on something like that : You can find the complete strace attached if needed. Then I run it without strace, the debug logs are attached too. Regards, |
Updated by gbeutner on 2016-08-17 07:37:19 +00:00
|
Updated by tim_1740 on 2016-08-17 11:39:07 +00:00
I just tried with the new snapshot (version: v2.4.10-699-gb26b3a4). It seems to be better, because the line "information/DbConnection: Resuming IDO connection: ido-mysql" is not present in daemon ouputs. But I have critical/IdoMysqlConnection erros in debug logs. (see attached) Regards, |
Updated by mfriedrich on 2016-08-17 11:49:12 +00:00 Update your IDO schema.
|
Updated by tim_1740 on 2016-08-17 12:29:59 +00:00 I already did it : information/IdoMysqlConnection: MySQL IDO instance id: 1 (schema version: '1.14.1') |
Updated by gbeutner on 2016-08-17 13:28:52 +00:00 You're missing some of the schema updates which were added to the pre-release schema update file after you did your update. Just try to execute the updates again using 'mysql --force ... < /usr/share/icinga2-ido-mysql/schema/upgrade/2.5.0.sql'. |
Updated by tim_1740 on 2016-08-17 15:26:27 +00:00 Yeah it works ! My 150000 services, 3800 hosts are insert in database. Do you think I can use this snapshot for production ? Thank you for all ! |
Updated by mfriedrich on 2016-08-17 15:29:41 +00:00
Presumingly yes, as there are no more open issues except for documentation. We'll keep on testing the current master further until the release next week. |
Updated by gbeutner on 2016-08-17 15:41:00 +00:00 I'd be interested in some benchmark times for this. Can you grep for "Finished reconnecting to" in your icinga2.log? :) |
Updated by gbeutner on 2016-08-22 11:45:43 +00:00
|
This issue has been migrated from Redmine: https://dev.icinga.com/issues/12147
Created by tim_1740 on 2016-07-13 12:35:29 +00:00
Assignee: tim_1740
Status: Resolved (closed on 2016-08-17 15:29:41 +00:00)
Target Version: 2.5.0
Last Update: 2016-08-22 11:45:43 +00:00 (in Redmine)
Hello,
We are migrating from icinga1 to icinga2, but we are facing a problem.
Our server seems to be powerfull enough (8Go RAM and 12 CPU).
With 27000 services and 663 hosts, it works.
Daemon logs :
[2016-07-13 11:22:29 +0200] information/ConfigItem: Committing config items
[2016-07-13 11:22:36 +0200] information/ConfigItem: Instantiated 2 FileLoggers.
[2016-07-13 11:22:36 +0200] information/ConfigItem: Instantiated 1 Zone.
[2016-07-13 11:22:36 +0200] information/ConfigItem: Instantiated 1 Endpoint.
[2016-07-13 11:22:36 +0200] information/ConfigItem: Instantiated 142 CheckCommands.
[2016-07-13 11:22:36 +0200] information/ConfigItem: Instantiated 2 NotificationCommands.
[2016-07-13 11:22:36 +0200] information/ConfigItem: Instantiated 1 IcingaApplication.
[2016-07-13 11:22:36 +0200] information/ConfigItem: Instantiated 663 Hosts.
[2016-07-13 11:22:36 +0200] information/ConfigItem: Instantiated 3 HostGroups.
[2016-07-13 11:22:36 +0200] information/ConfigItem: Instantiated 1 User.
[2016-07-13 11:22:36 +0200] information/ConfigItem: Instantiated 1 UserGroup.
[2016-07-13 11:22:36 +0200] information/ConfigItem: Instantiated 27402 Services.
[2016-07-13 11:22:36 +0200] information/ConfigItem: Instantiated 3 TimePeriods.
[2016-07-13 11:22:36 +0200] information/ConfigItem: Instantiated 7 ServiceGroups.
[2016-07-13 11:22:36 +0200] information/ConfigItem: Instantiated 1 ExternalCommandListener.
[2016-07-13 11:22:36 +0200] information/ConfigItem: Instantiated 1 IdoMysqlConnection.
[2016-07-13 11:22:36 +0200] information/ScriptGlobal: Dumping variables to file '/var/cache/icinga2/icinga2.var s'
[2016-07-13 11:22:36 +0200] information/ConfigObject: Restoring program state from file '/var/lib/icinga2/icing a2.state'
[2016-07-13 11:22:37 +0200] information/ConfigObject: Restored 28231 objects. Loaded 0 new objects without stat e.
[2016-07-13 11:22:37 +0200] information/ConfigItem: Triggering Start signal for config items
[2016-07-13 11:22:38 +0200] information/ConfigItem: Activated all objects.
Debug logs :
[2016-07-13 11:22:37 +0200] debug/TimePeriod: ---
[2016-07-13 11:22:38 +0200] information/ConfigItem: Activated all objects.
[2016-07-13 11:22:38 +0200] information/DbConnection: Resuming IDO connection: ido-mysql
[2016-07-13 11:22:38 +0200] debug/IdoMysqlConnection: Query: SELECT @@global.max_allowed_packet AS max_allowed_packet
[2016-07-13 11:22:38 +0200] debug/IdoMysqlConnection: Query: SELECT version FROM icinga_dbversion WHERE name='idoutils'
[2016-07-13 11:22:38 +0200] debug/IdoMysqlConnection: Query: SELECT instance_id FROM icinga_instances WHERE instance_name = '*****'
[2016-07-13 11:22:38 +0200] information/IdoMysqlConnection: MySQL IDO instance id: 1 (schema version: '1.14.0')
[2016-07-13 11:22:38 +0200] debug/IdoMysqlConnection: Query: SET SESSION TIME_ZONE='+00:00'
[2016-07-13 11:22:38 +0200] debug/IdoMysqlConnection: Query: BEGIN
[2016-07-13 11:22:38 +0200] notice/DbConnection: Updating programstatus table.
[2016-07-13 11:22:38 +0200] debug/IdoMysqlConnection: Query: COMMIT
[2016-07-13 11:22:38 +0200] debug/IdoMysqlConnection: Query: BEGIN
[2016-07-13 11:22:38 +0200] debug/IdoMysqlConnection: Query: INSERT INTO icinga_conninfo (instance_id, connect_time, last_checkin_time, agent_name, agent_version, connect_type, data_start_time) VALUES (1, NOW (), NOW(), 'icinga2 db_ido_mysql', 'v2.4.10', 'INITIAL', NOW ())
[2016-07-13 11:22:38 +0200] debug/IdoMysqlConnection: Query: DELETE FROM icinga_contact_addresses WHERE instance_id = 1
... Databases updates ...
As you can see, objects are all activated and then the database is updated with this objects.
But when we have a larger scope (3800 hosts, 175000 services), the database is never updated.
Daemon logs :
[2016-07-13 11:46:52 +0200] information/ConfigItem: Committing config items
[2016-07-13 11:47:22 +0200] information/DbConnection: Resuming IDO connection: ido-mysql (* not present in previous logs *)
[2016-07-13 11:49:04 +0200] information/ConfigItem: Instantiated 2 FileLoggers.
[2016-07-13 11:49:04 +0200] information/ConfigItem: Instantiated 1 Zone.
[2016-07-13 11:49:04 +0200] information/ConfigItem: Instantiated 1 Endpoint.
[2016-07-13 11:49:04 +0200] information/ConfigItem: Instantiated 142 CheckCommands.
[2016-07-13 11:49:04 +0200] information/ConfigItem: Instantiated 2 NotificationCommands.
[2016-07-13 11:49:04 +0200] information/ConfigItem: Instantiated 1 IcingaApplication.
[2016-07-13 11:49:04 +0200] information/ConfigItem: Instantiated 3822 Hosts.
[2016-07-13 11:49:04 +0200] information/ConfigItem: Instantiated 69 HostGroups.
[2016-07-13 11:49:04 +0200] information/ConfigItem: Instantiated 1 User.
[2016-07-13 11:49:04 +0200] information/ConfigItem: Instantiated 1 UserGroup.
[2016-07-13 11:49:04 +0200] information/ConfigItem: Instantiated 175563 Services.
[2016-07-13 11:49:04 +0200] information/ConfigItem: Instantiated 3 TimePeriods.
[2016-07-13 11:49:04 +0200] information/ConfigItem: Instantiated 21 ServiceGroups.
[2016-07-13 11:49:04 +0200] information/ConfigItem: Instantiated 1 ExternalCommandListener.
[2016-07-13 11:49:04 +0200] information/ConfigItem: Instantiated 1 IdoMysqlConnection.
[2016-07-13 11:49:04 +0200] information/ScriptGlobal: Dumping variables to file '/var/cache/icinga2/icinga2.vars'
[2016-07-13 11:49:04 +0200] information/ConfigObject: Restoring program state from file '/var/lib/icinga2/icinga2.state'
[2016-07-13 11:49:05 +0200] information/ConfigObject: Restored 28231 objects. Loaded 151400 new objects without state.
[2016-07-13 11:49:05 +0200] information/ConfigItem: Triggering Start signal for config items
[2016-07-13 11:49:12 +0200] information/ConfigItem: Activated all objects.
Debug logs :
[2016-07-13 11:49:05 +0200] debug/TimePeriod: ---
[2016-07-13 11:49:12 +0200] notice/WorkQueue: #2 tasks: 16
[2016-07-13 11:49:12 +0200] information/ConfigItem: Activated all objects.
[2016-07-13 11:49:12 +0200] debug/IcingaApplication: In IcingaApplication::Main()
[2016-07-13 11:49:12 +0200] information/ConfigCompiler: Compiling config file: /var/lib/icinga2/modified-attributes.conf
[2016-07-13 11:49:13 +0200] notice/DbConnection: Updating programstatus table.
[2016-07-13 11:49:20 +0200] notice/ThreadPool: Pool #1: Pending tasks: 0; Average latency: 0ms; Threads: 4; Pool utilization: 0.0282933%
[2016-07-13 11:49:22 +0200] notice/DbConnection: Cleanup (downtimehistory): 172800 now: 1468403362 old: 1.46823e+09
[2016-07-13 11:49:22 +0200] notice/DbConnection: Cleanup (logentries): 2.6784e+06 now: 1468403362 old: 1.46572e+09
[2016-07-13 11:49:22 +0200] notice/WorkQueue: #2 tasks: 45
[2016-07-13 11:49:23 +0200] notice/DbConnection: Updating programstatus table.
[2016-07-13 11:49:32 +0200] notice/WorkQueue: #2 tasks: 72
[2016-07-13 11:49:33 +0200] notice/DbConnection: Updating programstatus table.
[2016-07-13 11:49:35 +0200] notice/ThreadPool: Pool #1: Pending tasks: 0; Average latency: 0ms; Threads: 4; Pool utilization: 0.118823%
[2016-07-13 11:49:42 +0200] notice/WorkQueue: #2 tasks: 99
[2016-07-13 11:49:43 +0200] notice/DbConnection: Updating programstatus table.
[2016-07-13 11:49:50 +0200] notice/ThreadPool: Pool #1: Pending tasks: 0; Average latency: 0ms; Threads: 4; Pool utilization: 0.0333518%
As you can see, objects are activated but the database is never update with them. It loops on WorkQueue/ThreadPool/DbConnection messages, and there is a minimal CPU activity. Nothing append. And new services/hosts/objects are not send into database.
Did you already seen this problem ? What could I try ?
Thank you in advance,
Regards,
Tim
Attachments
Changesets
2016-07-13 12:44:53 +00:00 by gbeutner 1bd235b
2016-07-15 07:40:39 +00:00 by gbeutner 959e250
2016-08-02 12:27:20 +00:00 by mfriedrich 53930e3
Relations:
The text was updated successfully, but these errors were encountered: