Bug 1478410 - RESTEASY002020: Unhandled asynchronous exception, sending back 500
Status: ASSIGNED
Product: OpenShift Container Platform
Classification: Red Hat
Component: Metrics
Version: 3.4.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.4.z
Assigned To: John Sanda
QA Contact: Junqi Zhao
Depends On:
Blocks:
Reported: 2017-08-04 09:54 EDT by Javier Ramirez
Modified: 2017-10-11 23:15 EDT

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
Logs after increasing heap size (65.58 KB, application/zip), 2017-08-04 09:54 EDT, Javier Ramirez


External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
JBoss Issue Tracker | HWKMETRICS-692 | Major | Resolved | Tags queries unnecessarily enrich with full Metric<T> data | 2017-10-17 09:56 EDT
JBoss Issue Tracker | HWKMETRICS-733 | Critical | Resolved | Log REST API requests | 2017-10-17 09:56 EDT

Description Javier Ramirez 2017-08-04 09:54:40 EDT
Created attachment 1309070
Logs after increasing heap size

Description of problem:

A dedicated cluster customer reported seeing the message "An error occurred getting metrics" in their OpenShift console. I looked at the logs for the hawkular-metrics pod and saw numerous instances of this error:
11:40:06,667 ERROR [org.jboss.resteasy.resteasy_jaxrs.i18n] (default task-161) RESTEASY002020: Unhandled asynchronous exception, sending back 500: org.jboss.resteasy.spi.UnhandledException: RESTEASY003770: Response is committed, can't handle exception

See the attached logs for the full traceback and the related Cassandra/Heapster logs.
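
To get a rough sense of how often the error occurs, the log can be summarized with standard tools. The following is only a sketch: it assumes the pod log has been saved to a file and that each line starts with an HH:MM:SS timestamp, as in the excerpt above. The function names and the log file name are illustrative and not taken from this report.

```shell
#!/bin/sh
# Count the lines in a saved log file that contain RESTEASY002020.
# Usage: count_resteasy_errors <logfile>
count_resteasy_errors() {
    # grep -c prints the number of matching lines. It exits non-zero when
    # the count is 0, so "|| true" keeps this usable under "set -e".
    grep -c 'RESTEASY002020' "$1" || true
}

# Print "HH:MM count" for each minute in which the error was logged.
# Assumes every log line begins with an HH:MM:SS timestamp.
# Usage: resteasy_errors_per_minute <logfile>
resteasy_errors_per_minute() {
    grep 'RESTEASY002020' "$1" | cut -c1-5 | sort | uniq -c | awk '{print $2, $1}'
}

# Example (the pod name is a placeholder):
#   oc logs <hawkular-metrics-pod> -n openshift-infra > hawkular-metrics.log
#   count_resteasy_errors hawkular-metrics.log
#   resteasy_errors_per_minute hawkular-metrics.log
```

A per-minute breakdown makes it easier to see whether the errors cluster around periods of heavy request load.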

Version-Release number of selected component (if applicable):

oc v3.4.1.18
kubernetes v1.4.0+776c994
Comment 10 Matt Wringe 2017-09-06 11:45:23 EDT
It looks like the problem here is that the system is currently overloaded and can't keep up with the requests.

Has anyone tried to scale up the number of Cassandra instances to 2? Hawkular Metrics can also be scaled to 2 instances.

Note: for Cassandra, you cannot just scale up its RC. You need to either set the deployer parameter for the number of Cassandra nodes (e.g. CASSANDRA_NODES) or use the template to deploy another instance.

To use the template (assuming you are using a persistent volume and want a 100Gi persistent volume), you can run the following:

$ oc process hawkular-cassandra-node-pv \
-v IMAGE_VERSION=3.4.1 \
-v PV_SIZE=100Gi \
-v NODE=2

If you are not using a persistent volume, then the name of the template is just 'hawkular-cassandra-node-emptydir' and you don't need to set a PV_SIZE option.
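
The two template variants can be compared side by side with a small helper that only prints the command rather than running it. This is a sketch, not part of the original comment: the function name and argument order are made up for illustration, and the template and parameter names are the ones quoted above.

```shell
#!/bin/sh
# Print (do not run) the "oc process" command for adding a Cassandra node.
# Usage: cassandra_node_cmd <image_version> <node_number> [pv_size]
# The function name and argument order are hypothetical.
cassandra_node_cmd() {
    image_version=$1
    node=$2
    pv_size=${3:-}
    if [ -n "$pv_size" ]; then
        # Persistent volume variant: PV_SIZE is passed to the -pv template.
        printf 'oc process hawkular-cassandra-node-pv -v IMAGE_VERSION=%s -v PV_SIZE=%s -v NODE=%s\n' \
            "$image_version" "$pv_size" "$node"
    else
        # emptyDir variant: different template name, no PV_SIZE parameter.
        printf 'oc process hawkular-cassandra-node-emptydir -v IMAGE_VERSION=%s -v NODE=%s\n' \
            "$image_version" "$node"
    fi
}

# Example:
#   cassandra_node_cmd 3.4.1 2 100Gi   # persistent volume
#   cassandra_node_cmd 3.4.1 2         # emptyDir
```

Note that `oc process` only renders the template into a list of objects. To actually create them, its output is normally piped to `oc create -f -`; whether that step was intended here is an assumption, since the comment shows only the `oc process` call.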

The output of 'oc get pods -o yaml -n openshift-infra' is also usually required as an attachment for metrics bugzillas.
Comment 13 Ruben Romero Montes 2017-09-21 08:03:56 EDT
As suggested, the customer has changed the setup to 2 Cassandra instances and 2 containers for Hawkular. I have attached the output of "oc get pods -o yaml -n openshift-infra".

However, in the log of the first Hawkular container they see the following error again:

05:58:13,317 ERROR [org.jboss.resteasy.resteasy_jaxrs.i18n] (default task-205) RESTEASY002020: Unhandled asynchronous exception, sending back 500: org.jboss.resteasy.spi.UnhandledException: RESTEASY003770: Response is committed, can't handle exception
Comment 16 Ruben Romero Montes 2017-10-06 05:27:54 EDT
Can you update this BZ once images including HWKMETRICS-733 are available?
I see the ticket is resolved, but I don't know whether it has already been publicly released.
Comment 17 John Sanda 2017-10-11 23:15:56 EDT
(In reply to Ruben Romero Montes from comment #16)
> Can you update this BZ once images including HWKMETRICS-733 are available?
> I see ticket is resolved but I don't know whether it is already publicly
> released.

The Jira tickets are resolved because the changes have been merged into the upstream branches and pushed out into upstream hawkular-metrics releases. This ticket is still set to ASSIGNED, though, since new images are not yet available. We will update this ticket when those images are available.

Thanks
