Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1631480

Summary: Metrics Not Available (queue has reached its max size 256)
Product: OpenShift Container Platform
Reporter: Freddy E. Montero <fmontero>
Component: Hawkular
Assignee: John Sanda <jsanda>
Status: CLOSED NOTABUG
QA Contact: Junqi Zhao <juzhao>
Severity: high
Docs Contact:
Priority: unspecified
Version: 3.9.0
CC: aos-bugs, elalance, erjones, fmontero, ggore, hcisneir, shishika
Target Milestone: ---   
Target Release: 3.9.z   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-01-28 20:18:49 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description              Flags
Hawkular Cassandra Logs  none
Hawkular Heapster Logs   none
Hawkular Metrics Logs    none

Description Freddy E. Montero 2018-09-20 16:44:25 UTC
Description of problem:
Metrics are not available.

An error occurred getting metrics for container prom-proxy from https://<URL>/hawkular/metrics.

Failed to perform operation due to an error: All host(s) tried for query failed (tried: hawkular-cassandra/100.124.250.166:9042 (com.datastax.driver.core.exceptions.BusyPoolException: [hawkular-cassandra/100.124.250.166] Pool is busy (no available connection and the queue has reached its max size 256))) 

Version-Release number of selected component (if applicable):
docker-registry.default.svc.cluster.local:5000/openshift3/metrics-cassandra                          v3.9                40e40a284b7e        2 months ago        822 MB
docker-registry.default.svc.cluster.local:5000/openshift3/metrics-heapster                           v3.9                8e0cfccc95ea        2 months ago        281 MB
docker-registry.default.svc.cluster.local:5000/openshift3/metrics-hawkular-metrics                   v3.9                0a2c9e157200        3 months ago        1.3 GB

Comment 1 Freddy E. Montero 2018-09-20 16:51:54 UTC
Created attachment 1485232
Hawkular Cassandra Logs

Comment 2 Freddy E. Montero 2018-09-20 16:52:22 UTC
Created attachment 1485233
Hawkular Heapster Logs

Comment 3 Freddy E. Montero 2018-09-20 16:52:51 UTC
Created attachment 1485234
Hawkular Metrics Logs

Comment 4 Freddy E. Montero 2018-09-20 16:55:42 UTC
The issue goes away when the hawkular-metrics pod is deleted with
oc delete pod hawkular-metrics-l457p
and a new pod gets created.

Comment 5 John Sanda 2018-09-20 21:29:27 UTC
The BusyPoolException means that the Cassandra driver, which is used by hawkular-metrics, is overloaded. Sometimes that might mean we just need to scale up hawkular-metrics. In this case though, Cassandra appears to be the problem. 
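
To illustrate how this error can arise, here is a minimal Python sketch of a per-host connection pool with a bounded wait queue. It is a toy model, not the DataStax driver's actual implementation: the class names `HostPool` and `BusyPoolError`, the parameters, and the promotion logic are all assumptions made for illustration. Only the limit of 256 and the wording of the message come from the error reported above.

```python
from collections import deque


class BusyPoolError(Exception):
    """Illustrative stand-in for the driver's BusyPoolException."""


class HostPool:
    """Toy model of a per-host connection pool with a bounded wait queue."""

    def __init__(self, connections, requests_per_connection, max_queue_size=256):
        # Total number of requests that can be in flight at the same time.
        self.capacity = connections * requests_per_connection
        self.max_queue_size = max_queue_size
        self.in_flight = 0
        self.waiting = deque()

    def submit(self, request):
        """Send the request, queue it, or fail if the queue is already full."""
        if self.in_flight < self.capacity:
            self.in_flight += 1
            return "sent"
        if len(self.waiting) < self.max_queue_size:
            self.waiting.append(request)
            return "queued"
        raise BusyPoolError(
            "Pool is busy (no available connection and the queue "
            f"has reached its max size {self.max_queue_size})"
        )

    def complete_one(self):
        """A response arrived: free a slot and promote one waiting request."""
        if self.in_flight == 0:
            return
        self.in_flight -= 1
        if self.waiting:
            self.waiting.popleft()
            self.in_flight += 1


if __name__ == "__main__":
    # Hypothetical tiny limits so the overflow is easy to see.
    pool = HostPool(connections=1, requests_per_connection=2, max_queue_size=3)
    for i in range(5):
        print(i, pool.submit(i))
    try:
        pool.submit(5)
    except BusyPoolError as exc:
        print("rejected:", exc)
```

In this model the error only appears when responses come back more slowly than requests arrive for long enough to fill the queue, which is consistent with a slow or overloaded Cassandra node on the other end.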

The logs show that Cassandra is dropping both read and write requests. Can I get the output of

$ oc -n openshift-infra get pods -o yaml
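
The pod listing can be long, so a short script may help when reading it. The following Python sketch assumes the same listing is fetched with `-o json` instead of `-o yaml` and reduces it to one row per container. Which fields matter for this bug is not stated in the thread; restart counts and resource limits are a guess, and the function name `summarize_pods` and the sample pod below are hypothetical.

```python
import json


def summarize_pods(pod_list):
    """Return (pod, container, restarts, memory limit, cpu limit) rows."""
    rows = []
    for pod in pod_list.get("items", []):
        pod_name = pod["metadata"]["name"]
        # Map container name -> restart count from the pod status, if present.
        restarts = {
            status["name"]: status.get("restartCount", 0)
            for status in pod.get("status", {}).get("containerStatuses", [])
        }
        for container in pod["spec"]["containers"]:
            limits = container.get("resources", {}).get("limits", {})
            rows.append((
                pod_name,
                container["name"],
                restarts.get(container["name"], 0),
                limits.get("memory"),
                limits.get("cpu"),
            ))
    return rows


if __name__ == "__main__":
    # Made-up sample in the shape of a pod list; not data from this bug.
    sample = json.loads("""
    {"items": [{
        "metadata": {"name": "example-pod"},
        "spec": {"containers": [
            {"name": "example-container",
             "resources": {"limits": {"memory": "2G"}}}]},
        "status": {"containerStatuses": [
            {"name": "example-container", "restartCount": 4}]}
    }]}
    """)
    for row in summarize_pods(sample):
        print(*row, sep="\t")
```

To run it against a real cluster, the sample could be replaced with `json.load(sys.stdin)` and the script fed from `oc -n openshift-infra get pods -o json`.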