Bug 1477738

Summary: Stable metrics pods failing to provide metrics in webUI
Product: OpenShift Container Platform
Reporter: Eric Jones <erjones>
Component: Installer
Assignee: Scott Dodson <sdodson>
Status: CLOSED NOTABUG
QA Contact: Johnny Liu <jialiu>
Severity: high
Priority: high
Version: unspecified
CC: aos-bugs, erich, erjones, jokerman, mmccomas, mwringe, pweil
Target Milestone: ---
Keywords: UpcomingRelease
Target Release: 3.2.1
Hardware: Unspecified
OS: Unspecified
Last Closed: 2017-09-20 15:44:12 UTC
Type: Bug

Description Eric Jones 2017-08-02 18:50:37 UTC
Description of problem:
The customer redeployed certificates in the cluster and began to experience issues with metrics. After some troubleshooting, we deleted metrics and redeployed. Since the redeploy, the pods are completely stable, but the metrics tab in the web UI shows an error message.

Inspecting this page in the browser's developer console, specifically the Network tab, shows 403 errors for some metrics calls, even though the hawkular-metrics URL is fully routable.

Version-Release number of selected component (if applicable):
3.2.0

Additional info:
Attaching pod logs, pod yaml, rc yaml, service yaml, route yaml, and several screenshots shortly.

Comment 2 Matt Wringe 2017-08-03 15:59:59 UTC
Can we please get this updated to 3.2.1?

Comment 7 Matt Wringe 2017-08-08 18:53:43 UTC
The error message in the Hawkular Metrics log indicates that the CA certificate used to sign the Master API is not valid:

"unable to find valid certification path to requested target"

Can you run the following commands:

Create and enter a debug pod:

'oc debug ${HAWKULAR_METRICS_POD_NAME}'

Once inside the pod, run the following commands:

'curl -v --cacert /var/run/secrets/kubernetes.io/serviceaccount/ca.crt $MASTER_URL'

'cat /var/run/secrets/kubernetes.io/serviceaccount/ca.crt'

That should help us get a handle on what is happening here and whether the certificates for the master are properly signed.
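
If the raw curl output is hard to read, the same check can be sketched with openssl. This is only an illustration, not something taken from the metrics deployer: the function names master_hostport, show_ca and verify_master are made up for this example, and it assumes openssl is installed wherever it is run (the debug pod image may not ship it).

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not part of oc or OpenShift.

# CA bundle mounted into the pod (path taken from the commands above).
SA_CA=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt

# Reduce a URL such as https://master.example.com:8443/api to host:port.
# Falls back to port 443 when the URL carries no explicit port.
master_hostport() {
    hostport=${1#*://}        # drop the scheme
    hostport=${hostport%%/*}  # drop any path
    case $hostport in
        *:*) printf '%s\n' "$hostport" ;;
        *)   printf '%s:443\n' "$hostport" ;;
    esac
}

# Print subject, issuer and expiry of the first certificate in the bundle.
show_ca() {
    openssl x509 -in "${1:-$SA_CA}" -noout -subject -issuer -enddate
}

# Connect to the master and report how its certificate verifies against
# the bundle; only the "Verify return code" line of the output is kept.
verify_master() {
    openssl s_client -connect "$(master_hostport "$1")" \
        -CAfile "${2:-$SA_CA}" </dev/null 2>&1 | grep 'Verify return code'
}
```

Running 'show_ca' followed by 'verify_master "$MASTER_URL"' inside the pod would show which CA the pod trusts and whether the master's certificate chains to it. A non-zero verify return code would be consistent with the "unable to find valid certification path" error.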

Comment 14 Eric Jones 2017-09-20 15:44:12 UTC
Closing this bug as NOTABUG. We have identified the underlying issue and are working towards a resolution in https://bugzilla.redhat.com/show_bug.cgi?id=1479930