[dev.icinga.com #3728] num_rows_affected broken in libdbi/mariadb, returned ids are 0 #1216
Comments
Updated by fmbiete on 2013-02-22 09:15:03 +00:00 I'm seeing the same in icinga_host_contactgroups:
|
Updated by mfriedrich on 2013-03-04 18:11:05 +00:00 i don't have mariadb around, but with my dev setup this does not happen
so, does it happen with mysql too? |
Updated by mfriedrich on 2013-03-04 18:11:32 +00:00
|
Updated by fmbiete on 2013-03-07 19:00:47 +00:00 I will try with MySQL and report back. But it will be pretty difficult, since I will need to use 5.6 (I'm using group commits in binary logs, and binary log checksums). Regards, |
Updated by mfriedrich on 2013-03-08 11:52:02 +00:00 ofc. i need to figure out how and if i may test mariadb (or even mysql 5.6). still, it could be an issue with the queries invoked. can you extract those manually when the config dump happens, and re-run them against your mariadb sql client? there might be some warnings reported which the libdbi driver silently ignores. |
Updated by mfriedrich on 2013-03-08 11:52:34 +00:00 extracting queries = from the debug log. |
Updated by fmbiete on 2013-03-09 19:14:38 +00:00
Just a little bit around a bad insert (I think that will be enough, but I have more... the debug log file is huge). Look at these things:
That select appears 2 times, with the same values...
Selecting from the client returns a row.
This returns 14541. And here we have an insert creating a bad data row: host_id == 0. Why??
|
Updated by mfriedrich on 2013-03-09 19:22:33 +00:00 so it affects host_contactgroups as well, not only service_contactgroups, as your analysis differs from the original report. any other tables affected as well? |
Updated by fmbiete on 2013-03-09 19:31:31 +00:00 It doesn't happen only with host_contactgroups. Maybe data [ 1 ] gets corrupted somehow or not assigned??
|
Updated by fmbiete on 2013-03-10 09:15:01 +00:00
With this test MySQL-DBI returns non 0 when updating a row, against MariaDB.
|
Updated by mfriedrich on 2013-03-13 23:42:51 +00:00 nice test. and yes, it's likely a bug difficult to nail down. though, the debug logs are very detailed on verbosity=2 and level=-1 ... can you post an excerpt from a complete function call to "ido2db_handle_servicedefinition() start" til end, and marking the important sections, such as the fields required. codewise, within dbhandlers.c, ignore all #ifdef USE_ORACLE sections. e.g servicedefinition the service_id which is 0 in your example
is returned here as reference.
within dbqueries.c you'll see that the param name is generic "id".
there are 2 ways of fetching the id.
likely your issue is related to fetching the service id here, and not the one level down inserts afterwards. (and other first class objects, like timeperiods, contactgroups, servicegroups). |
Updated by fmbiete on 2013-03-15 19:06:55 +00:00 Here is the fragment debug log.
|
Updated by fmbiete on 2013-03-15 19:19:07 +00:00 Look at this dbqueries.c file, ido2db_query_insert_or_update_servicedefinition_definition_add function, line 7177
You get the service_id into the service_id variable not the *id. In that case *id will not be set?? A fast look into that function and I don't see any place where the variable service_id is read. |
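The observation above can be illustrated with a minimal C sketch. It is not the actual dbqueries.c code: the function names `lookup_existing_id`, `add_definition_buggy` and `add_definition_fixed` are hypothetical, and the value 14541 is only borrowed from the earlier debug excerpt as an example. It shows how a result stored in a local variable never reaches the caller through the `*id` out-parameter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the SELECT that looks up an existing row id. */
unsigned long lookup_existing_id(void)
{
    return 14541UL; /* example value only */
}

/* Buggy shape: the result lands in a local variable, *id is never written. */
int add_definition_buggy(unsigned long *id)
{
    unsigned long service_id;

    assert(id != NULL);
    service_id = lookup_existing_id();
    (void)service_id; /* read nowhere, so the caller never sees it */
    return 0;
}

/* Fixed shape: the result is written through the out-parameter. */
int add_definition_fixed(unsigned long *id)
{
    assert(id != NULL);
    *id = lookup_existing_id();
    return 0;
}
```

A caller that initialises its variable to 0 and passes its address keeps the 0 after the buggy variant, which would then end up in the dependent inserts, and gets the looked-up id after the fixed variant.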
Updated by fmbiete on 2013-03-16 04:43:32 +00:00
I modified all the dbqueries.c functions (git version - file attached). I have deleted the _id variable and its assignment, using *id as needed. I don't see inserts with _id = 0 (it's still starting... it takes ages to do it) |
Updated by fmbiete on 2013-03-16 05:27:43 +00:00 After the first start the data seems fine. I don't see any orphaned rows. |
Updated by fmbiete on 2013-03-16 14:57:23 +00:00 After restarting a few more times, I don't have rows with zeroed ids. Also icinga_web shows every change done in icinga core (before, if a service was deleted it would still be shown in icinga_web but not in icinga_cgi). |
Updated by mfriedrich on 2013-03-17 15:36:56 +00:00 that change is incorrect on the algorithm itself. there is a reason why this value is ignored in that exact location, and e.g. contact_id is read, but ignored at that stage. the flow for mysql without the insert on duplicate key update madness is like
using your attempt and actually getting current numbers would mean, that your libdbi implementation does actually return the wrong value on get_num_rows_affected and gets some id after the select again - a location you normally should not hit. likely this is the error, and my not finished fallback may be completed by changing all the (.*)_id to *id in the insert tree, fixing some libdbi fuckup on special mysql version apis. if you got a clean patch to apply against next, feel free to share. |
Updated by fmbiete on 2013-03-18 14:08:47 +00:00 The attached dbqueries.c is the latest git version with my changes. You can diff against the git tree to review the changes and apply them. I have changed all the (.*)_id to *id, and removed the unused variables (maybe it should help compile with -Wunused-but-set-variable). In the original code, if you hit the select/insert part, *id would not receive any value in the function. It wasn't assigned to anything in that flow of code. That would cause an insert with 0 in the (.*)_id later. |
Updated by mfriedrich on 2013-03-19 13:39:43 +00:00
since i cannot test it, i will add something to git and ping back when i got time for it. |
Updated by mfriedrich on 2013-04-06 22:23:08 +00:00
i can't work with full files, i need diffs - attached. |
Updated by mfriedrich on 2013-04-06 22:36:32 +00:00 so this is really a problem with libdbi and its mysql driver, or likewise, with the mysql api not returning the proper values on an UPDATE query. i had that problem once, when i used an older libdbi implementation with a newer mysql version, but right on, i cannot reproduce it (with packages). so, the safety SELECT which would only be run before the INSERT query, now makes sure that the wrong affected rows from the UPDATE query does not hit yet another INSERT statement, causing a unique constraint violation. saving the required id over there is a lucky shot for fixing the problem (even - workaround it) on the idoutils part. but the real problem lies somewhere between libdbi and mysql, where icinga cannot provide any fix, and probably you should investigate on your own and report an issue to the libdbi developers as well. |
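The workaround described in this comment can be sketched as follows. This is only an illustration under stated assumptions, not the idoutils implementation: `struct fake_db` and the `fake_*` functions are hypothetical in-memory stand-ins for the database and the libdbi driver, and the id lookup on the normal path is simplified. The sketch runs the UPDATE first, falls back to the safety SELECT when no affected rows are reported, saves the id found there, and only issues the INSERT when the row really does not exist.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory stand-in for one table row plus driver behaviour. */
struct fake_db {
    int row_exists;          /* 1 if the row is already in the table */
    unsigned long row_id;    /* primary key of that row */
    int misreport_affected;  /* 1 simulates a driver reporting 0 rows on UPDATE */
    unsigned long next_id;   /* id handed out by the next INSERT */
    int inserts;             /* number of INSERTs issued so far */
};

/* UPDATE: returns the affected-row count as the (possibly wrong) driver reports it. */
unsigned long long fake_update(struct fake_db *db)
{
    if (!db->row_exists)
        return 0;
    return db->misreport_affected ? 0 : 1;
}

/* Safety SELECT: returns 1 and stores the id if the row exists, else 0. */
int fake_select_id(const struct fake_db *db, unsigned long *id)
{
    if (!db->row_exists)
        return 0;
    *id = db->row_id;
    return 1;
}

/* INSERT: creates the row and returns its new id. */
unsigned long fake_insert(struct fake_db *db)
{
    db->row_exists = 1;
    db->row_id = db->next_id++;
    db->inserts++;
    return db->row_id;
}

/* UPDATE, then safety SELECT, then INSERT; *id is set on every successful path. */
int insert_or_update(struct fake_db *db, unsigned long *id)
{
    assert(db != NULL && id != NULL);

    if (fake_update(db) > 0) {
        /* Normal path (simplified assumption): read the id back. */
        return fake_select_id(db, id) ? 0 : -1;
    }
    if (fake_select_id(db, id)) {
        /* Row exists although 0 affected rows were reported: keep the id, skip INSERT. */
        return 0;
    }
    *id = fake_insert(db);
    return 0;
}
```

With `misreport_affected` set and the row present, the function returns the existing id and issues no INSERT, so no duplicate row is attempted and no 0 id leaks into dependent tables. The cost is the extra SELECT on every such call, which is the performance concern raised in the follow-up comment.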
Updated by mfriedrich on 2013-04-06 22:39:24 +00:00 i forgot - you should indeed fix the libdbi irritation on your server, as it causes ido2db to fire yet another SELECT query, and may influence the performance badly. |
Updated by mfriedrich on 2013-04-06 22:41:47 +00:00
i'm only fixing the affected ids, and leaving the rest alone, for future reference, if any. under normal circumstances, those code regions are never hit during UPDATE queries. |
Updated by mfriedrich on 2013-04-06 22:46:10 +00:00
you may test it in git 'next'. |
Updated by fmbiete on 2013-04-09 17:29:44 +00:00 Thanks!! I will try ASAP and report back. |
Updated by fmbiete on 2013-04-10 12:20:12 +00:00
Hello, I've installed libdbi 0.9.0, libdbi-drivers 0.9.0 and replaced mysql-libs with mariadb-libs. EDIT: oops, sorry I missed the 'next' branch. Thank you very much |
Updated by mfriedrich on 2013-04-10 16:56:40 +00:00
thanks for testing. i was waiting for the edit, as your diff shows clearly a different head ;-) |
Updated by mfriedrich on 2014-12-08 14:38:03 +00:00
|
This issue has been migrated from Redmine: https://dev.icinga.com/issues/3728
Created by fmbiete on 2013-02-22 08:40:00 +00:00
Assignee: mfriedrich
Status: Resolved (closed on 2013-04-10 16:56:40 +00:00)
Target Version: 1.9
Last Update: 2014-12-08 14:38:03 +00:00 (in Redmine)
Hello,
I'm seeing too many rows in icinga_service_contactgroups where service_id = 0. That's a wrong service id, isn't it?
Enabling debug for ido2db shows inserts with that service_id.
If you need more info let me know.
Regards,
Francisco Miguel
Attachments
Changesets
2013-04-06 22:43:01 +00:00 by (unknown) 0fb5110
2013-04-10 18:54:26 +00:00 by (unknown) 6dd703d