Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[dev.icinga.com #13427] Improve error handling when responding to non-HTML requests #2635

Closed
icinga-migration opened this issue Dec 6, 2016 · 18 comments · Fixed by #3444
Closed
Labels
area/monitoring Affects the monitoring module bug Something isn't working
Milestone

Comments

@icinga-migration
Copy link

This issue has been migrated from Redmine: https://dev.icinga.com/issues/13427

Created by medicmomcilo on 2016-12-06 15:43:52 +00:00

Assignee: medicmomcilo
Status: Feedback
Target Version: Backlog
Last Update: 2016-12-26 00:08:41 +00:00 (in Redmine)


Hi friends!

Recently in our infrastructure Nagstamon suddenly stopped working.
After some hours of troubleshooting we identified that this is due to JSON output is not being provided:
/monitoring/list/services?service_state>0&service_state<=3&service_state_type=1&addColumns=service_last_check&format=json

Our assumption is that because of number of criticals it stopped, this can also be confirmed with narrowing down the filter.

You can test this with inclusion of services in OK state if you don't have enough testing criticals. URL for that:
/monitoring/list/services?service_state>=0&service_state<=3&service_state_type=1&addColumns=service_last_check&format=json

Do let me know if there is anything else I can provide to help you investigate this issue further.

Kind regards,
Momcilo.

Attachments

@icinga-migration
Copy link
Author

Updated by elippmann on 2016-12-06 16:11:14 +00:00

  • Status changed from New to Feedback
  • Assigned to set to medicmomcilo

Hi,

The JSON output is not limited actually. What do you mean w/ the "JSON output is not being provided"? Which version of Web 2 are you using?

Best regards,
Eric

@icinga-migration
Copy link
Author

Updated by medicmomcilo on 2016-12-06 16:32:03 +00:00

Hi Eric,

Version of Web2 is 2.3.4+fix-1~ppa1404+1 on Ubuntu 14.04
What I meant to say is that when clicking on JSON new blank window is open without data.

When tried via curl I get nothing as well, however status is 200:

< HTTP/1.1 200 OK
< Date: Tue, 06 Dec 2016 16:29:05 GMT
< Server: Apache/2.4.7 (Ubuntu)
< Strict-Transport-Security: max-age=15768000; includeSubDomains; preload
< X-Powered-By: PHP/5.5.9-1ubuntu4.20
< Expires: Thu, 19 Nov 1981 08:52:00 GMT
< Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
< Pragma: no-cache
< Content-Length: 0
< Content-Type: application/json
< 
* Curl_http_done: called premature == 0
* Connection #0 to host icingaweb2.local left intact

I just installed brand new VM with latest vanilla Icinga Web2 and I get the same behaviour.
You should be able to test this if you try getting JSON for more than ~400 services:

/monitoring/list/services?service_state>=0&service_state<=3&service_state_type=1&addColumns=service_last_check&format=json

Kind regards,
Momcilo.

@icinga-migration
Copy link
Author

Updated by medicmomcilo on 2016-12-07 08:02:05 +00:00

Hi Eric,

Just as a side note, CSV export works perfectly.
Also, if '&format=json' is removed, all data is displayed properly in Web2 interface.

This should rule out any database issues or issues with other components.

Kind regards,
Momcilo.

@icinga-migration
Copy link
Author

Updated by tgelf on 2016-12-07 08:15:19 +00:00

  • Tracker changed from Bug to Support

Hi Momcilo,

you should check your webserver (error) logs. What probably happens is that your PHP is using more memory than it was granted. To solve this you should raise the memory_limit in your php.ini to a reasonable memory, allowing you to prepare all the requested data in memory. I guess your Ubuntu is running with very conservative defaults. Furthermore you might also consider upgrading your box. 16.04 LTS uses PHP 7, and that one consumes far less memory. But don't worry, we still support (and test) everything from PHP 5.3 to PHP 7.

medicmomcilo wrote:

Just as a side note, CSV export works perfectly.
Also, if '&format=json' is removed, all data is displayed properly in Web2 interface.

This should rule out any database issues or issues with other components.

Well, one main difference is that the web frontend shows a limited amount of lines while the requested JSON ships all of them. CSV works differently, as it iterates over the result set. That way it is able to send data line by line while fetching rows from the DB.

This won't work with JSON for various reasons. You cannot really stream it unless combined with other technology. So it has to be built in memory. To be able to do so, we first fetch all the data into a PHP data structure. So the whole thing is in memory at least twice at that moment.

Cheers,
Thomas

@icinga-migration
Copy link
Author

Updated by tgelf on 2016-12-07 08:17:38 +00:00

medicmomcilo wrote:

When tried via curl I get nothing as well, however status is 200:
[...]

Well, at least this is strange and speaks against my theory. It should be 500 when running out of memory. And 400 services isn't that much...

@icinga-migration
Copy link
Author

Updated by medicmomcilo on 2016-12-07 13:24:59 +00:00

Hi Thomas,

Thanks for your replies!

Unfortunately there is nothing in logs. I checked apache2 access_log error_log and icingaweb2.log which is in DEBUG logging.
Also, I tested setting 2.5GiB instead of 256MiB in /etc/php.ini for memory_limit, but this made no difference.

If there is anything I can further provide, do let me know.

Kind regards,
Momcilo.

@icinga-migration
Copy link
Author

Updated by aklimov on 2016-12-08 15:45:03 +00:00

  • File added patch_file.patch

Hi Momcilo,

I've tried to reproduce this with the following setup:

  • Ubuntu 14.04
  • Apache w/ mod_php
  • Icinga Web 2 (2.3.4+fix-1~ppa1404+1, from ppa:formorer/icinga)
  • fed by IDO/MySQL from Icinga2
  • > 10000 Services

When I fetch ALL of them as JSON with PHP's memory limit left as is... easy going.

If you're still getting this error, please test the following:

# cd /usr/share/icingaweb2
# patch -p0 PATCH_FILE      # with PATCH_FILE being my attached file

This will explicitly set the content length (to something > 0).

After testing you can undo this by:

# cd /usr/share/icingaweb2
# patch -Rp0 PATCH_FILE      # with PATCH_FILE being my attached file

Cheers,
Alex

@icinga-migration
Copy link
Author

Updated by medicmomcilo on 2016-12-08 16:28:08 +00:00

Hi Alex,

Thank you for your idea.
Unfortunately, I see no difference in output of curl:

< HTTP/1.1 200 OK
< Date: Thu, 08 Dec 2016 16:17:18 GMT
< Server: Apache/2.4.7 (Ubuntu)
< Strict-Transport-Security: max-age=15768000; includeSubDomains; preload
< X-Powered-By: PHP/5.5.9-1ubuntu4.20
< Expires: Thu, 19 Nov 1981 08:52:00 GMT
< Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
< Pragma: no-cache
< Content-Length: 0
< Content-Type: application/json
< 
* Curl_http_done: called premature == 0
* Connection #0 to host icingaweb2.local left intact

This is how handleFormatRequest procedure looks now:

...
    protected function handleFormatRequest($query)
    {
        if ($this->_getParam('format') === 'sql') {
            echo '

'
                . htmlspecialchars(wordwrap($query->dump()))
                . '

';
exit;
}
if ($this->_getParam('format') = 'json'
|| $this->_request->getHeader('Accept') = 'application/json') {
$output = json_encode($query~~getQuery()~~>fetchAll());
header('Content-type: application/json');
header('Content-Length: ' . strlen($output));
echo $output;
exit;
}
if ($this->_getParam('format') = 'csv'
|| $this->_request->getHeader('Accept') = 'text/csv') {
Csv::fromQuery($query)->dump();
exit;
}
}
...

Thank you for your confirmation that it works on your end. Since it isn't a bug, now I'm completely confused how it behaves the same way on brand new install.

Is there any other way it can be related to database, or other components even though it can produce csv output?

Kind regards,
Momcilo.

@icinga-migration
Copy link
Author

Updated by medicmomcilo on 2016-12-13 11:18:32 +00:00

Hi all!

Finally I was able to identify the issue here.
We were getting weird trailing output from one of our checks:

0000000: 5052 4f43 5320 4352 4954 4943 414c 3a20  PROCS CRITICAL: 
0000010: 3020 7072 6f63 6573 7365 7320 7769 7468  0 processes with
0000020: 2061 7267 7320 272f 6f70 742f 6261 7265   args '/opt/bare
0000030: 6f73 2d66 642f 7362 696e 2f62 6172 656f  os-fd/sbin/bareo
0000040: 732d 6664 272c 2063 6f6d 6d61 6e64 206e  s-fd', command n
0000050: 616d 6520 2762 6172 656f 732d 6664 270a  ame 'bareos-fd'.
0000060: b084 697c 957f 0a                        ..i|...

This is causing json_encode function in php to return 'false' indicating error in encoding.
This behaviour is only observed on old obsoleted hosts with outdated nagios_plugins installed.

We decided to stop monitoring this process as fixing this may require fixing non-maintained code.

You can go ahead and close this issue if you like or you can introduce some sort of handling for this.
Perhaps a single check of output of json_encode to see if it returned error and then display "JSON encode error"

Thank you all for your help and feedback, it really helped me to start thinking differently.

Keep up the good work friends ;)

Kind regards,
Momcilo.

@icinga-migration
Copy link
Author

Updated by jmeyer on 2016-12-13 14:54:06 +00:00

  • Subject changed from JSON output has a limit to Improve error handling when responding to non-HTML requests
  • Status changed from Feedback to New
  • Assigned to deleted medicmomcilo
  • Target Version set to Backlog

Hi,

many thanks for sharing your solution with us!

Best regards,
Johannes

@icinga-migration
Copy link
Author

Updated by jmeyer on 2016-12-13 14:54:16 +00:00

  • Tracker changed from Support to Bug

@icinga-migration
Copy link
Author

Updated by aklimov on 2016-12-14 12:38:55 +00:00

  • File added detect_error.patch
  • Status changed from New to Feedback
  • Assigned to set to medicmomcilo

Hi Momcilo,

I'm happy for you that you've found a solution for that, but that doesn't fix the problem.
"Garbage" in plugin outputs isn't nice to have, but IMO Icinga Web 2 should do its best while serving that output.

Please could you help us with that and:

  1. re-add your weird check as it was before you discovered that it causes the problem (and let Icinga do a check before going on)
  2. apply detect_error.patch (designed for 2.3.4) as I already described for the other patch file
  3. do your curl-request again and provide us the response
  4. tell your PHP version

Cheers,
Alex

@icinga-migration
Copy link
Author

Updated by medicmomcilo on 2016-12-26 00:08:41 +00:00

Hi Alex,

Sorry for the late reply.

I've looked at your patch and I am really uncomfortable sharing entire array in public.
That is why I pasted binary output (written in HEX) of the check in my previous post.

If you prefer base64 output of the check, here it is:

UFJPQ1MgQ1JJVElDQUw6IDAgcHJvY2Vzc2VzIHdpdGggYXJncyAnL29wdC9iYXJlb3MtZmQvc2Jp
bi9iYXJlb3MtZmQnLCBjb21tYW5kIG5hbWUgJ2JhcmVvcy1mZCcKsIRpfJV/Cg==

PHP version on the server is: 5.5.9+dfsg-1ubuntu4.20 on Ubuntu 14.04.5

Do let me know if there is anything else I can do assist in handling these types of errors.

Kind regards,
Momcilo.

@icinga-migration icinga-migration added feedback bug Something isn't working area/monitoring Affects the monitoring module labels Jan 17, 2017
@icinga-migration icinga-migration added this to the Backlog milestone Jan 17, 2017
@nilmerg nilmerg removed the feedback label Feb 2, 2017
@nilmerg nilmerg self-assigned this Feb 2, 2017
@nilmerg nilmerg modified the milestones: 2.5.0, Backlog Feb 2, 2017
nilmerg added a commit that referenced this issue Feb 3, 2017
...and add some logging to ease tracking errors.

refs #2635
@nilmerg
Copy link
Member

nilmerg commented Feb 3, 2017

Need to pause here now. Until now only JsonResponse has been adjusted but there are other usages of json_encode() which need to be handled this way.

@lippserd lippserd removed this from the 2.5.0 milestone Nov 15, 2017
@Al2Klimov Al2Klimov assigned Al2Klimov and unassigned Al2Klimov Apr 27, 2018
@Al2Klimov
Copy link
Member

@nilmerg Souldn't we adjust our Json::encode() and just use it everywhere?

@nilmerg
Copy link
Member

nilmerg commented Apr 27, 2018

@Al2Klimov Sure. But we should first verify that the behaviour the referenced commit tries to fix is still valid for PHP 5.6+. (a23b246#diff-79d5402cca911979dfd9b5217b0264fdR200)

Scratch that, it's the other way round. That's for <=5.4. We've raised the minimum to 5.6. There invalid JSON should throw an error without the mentioned option.

@Al2Klimov
Copy link
Member

@nilmerg I've taken a look at your change's effect and I'm not quite sure that it's the way(tm). Any non-UTF8 string gets NULL. What about this:

On failure:
  For each string (recursively):
    If string is not valid UTF8:
      string = latin1_2_utf8(string)

@Al2Klimov Al2Klimov self-assigned this Apr 27, 2018
@Al2Klimov
Copy link
Member

As discussed w/ @nilmerg (offline): ?s are better.

Al2Klimov added a commit that referenced this issue Apr 27, 2018
@Al2Klimov Al2Klimov removed their assignment Apr 27, 2018
Al2Klimov added a commit that referenced this issue May 14, 2018
Al2Klimov added a commit that referenced this issue May 14, 2018
Al2Klimov added a commit that referenced this issue Jun 20, 2018
Al2Klimov added a commit that referenced this issue Jun 20, 2018
@lippserd lippserd added this to the 2.6.0 milestone Jun 21, 2018
Al2Klimov added a commit that referenced this issue Jun 22, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/monitoring Affects the monitoring module bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants