1623987 – After running the redeploy-certificates.yml playbook in OCP 3.9, Prometheus stops working


Keywords:
Status:	CLOSED DEFERRED
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Monitoring
Sub Component:
Version:	3.9.0
Hardware:	Unspecified
OS:	Linux
Priority:	unspecified
Severity:	high
Target Milestone:	---
Target Release:	3.9.z
Assignee:	Frederic Branczyk
QA Contact:	Junqi Zhao
Docs Contact:
URL:
Whiteboard:
Depends On:	1592303 1596233 1596557 1667981
Blocks:

Reported:	2018-08-30 15:32 UTC by oarribas
Modified:	2022-03-13 15:29 UTC
CC List:	19 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:	1596557
Environment:
Last Closed:	2019-11-20 19:04:57 UTC
Target Upstream Version:
Embargoed:

Attachments
503 error for prometheus route (61.72 KB, image/png) 2019-03-01 08:20 UTC, Junqi Zhao
pods logs (67.26 KB, text/plain) 2019-03-04 03:32 UTC, Junqi Zhao

Comment 6 Junqi Zhao 2019-03-01 07:54:25 UTC

After running the redeploy-certificates.yml playbook in OCP 3.9, the Prometheus pods run normally, but all routes return a 503 (Service Unavailable) error.

# oc -n openshift-metrics get pod
NAME                             READY     STATUS    RESTARTS   AGE
prometheus-0                     6/6       Running   6          2h
prometheus-node-exporter-pff22   1/1       Running   1          2h
prometheus-node-exporter-stg8f   1/1       Running   2          2h

# oc -n openshift-metrics get route
NAME           HOST/PORT                                                     PATH      SERVICES       PORT      TERMINATION   WILDCARD
alertmanager   alertmanager-openshift-metrics.apps.0301-xvz.qe.rhcloud.com             alertmanager   <all>     reencrypt     None
alerts         alerts-openshift-metrics.apps.0301-xvz.qe.rhcloud.com                   alerts         <all>     reencrypt     None
prometheus     prometheus-openshift-metrics.apps.0301-xvz.qe.rhcloud.com               prometheus     <all>     reencrypt     None
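To reproduce the 503 from the command line, the route listing above can be turned into URLs and probed. This is only a sketch: the helper name route_urls is hypothetical, it assumes the column layout shown above (empty PATH column, so the host is the second field), and the probing loop assumes curl is installed and a logged-in oc session.

```shell
#!/bin/sh
# Hypothetical helper: read `oc get route` output on stdin and print one
# https:// URL per route. Assumes the host is the second whitespace-separated
# field, as in the listing above, and skips the header line.
route_urls() {
    awk 'NR > 1 && NF >= 2 { print "https://" $2 }'
}

# Usage (not run here; needs cluster access and curl):
#   oc -n openshift-metrics get route | route_urls | while read -r url; do
#       printf '%s %s\n' "$url" "$(curl -k -s -o /dev/null -w '%{http_code}' "$url")"
#   done
```

A 503 status printed next to each URL would match the behaviour described in this comment.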


Changing TERMINATION from reencrypt to passthrough makes it possible to log in through all routes.
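The comment does not say how TERMINATION was changed. One possible way, sketched here as an assumption, is a patch of spec.tls.termination on each route with oc patch. The script below only prints the commands so they can be reviewed first; the helper name passthrough_patch_cmd is hypothetical, and the patch may be rejected if a route carries certificate fields that passthrough does not allow.

```shell
#!/bin/sh
# Sketch only: print (do not run) one `oc patch` command per route.
# Namespace and route names are taken from the listing in this comment.
NAMESPACE="openshift-metrics"
ROUTES="alertmanager alerts prometheus"

# Hypothetical helper: $1 is the route name.
passthrough_patch_cmd() {
    printf "oc -n %s patch route %s -p '%s'\n" \
        "$NAMESPACE" "$1" '{"spec":{"tls":{"termination":"passthrough"}}}'
}

for r in $ROUTES; do
    passthrough_patch_cmd "$r"
done
```

With passthrough the router no longer validates the pod's serving certificate, so this works around the 503 rather than fixing the certificates themselves.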

Images:
oauth-proxy-v3.9.70-1
prometheus-v3.9.70-1
prometheus-alert-buffer-v3.9.70-1
prometheus-alertmanager-v3.9.70-1
prometheus-node-exporter-v3.9.71-1

openshift-ansible version: v3.9.70

Comment 7 Junqi Zhao 2019-03-01 08:20:37 UTC

Created attachment 1539749
503 error for prometheus route

Comment 8 Junqi Zhao 2019-03-04 03:23:37 UTC

Tested with prometheus v3.9.71 and openshift-ansible v3.9.71; it is still not possible to log in through any of the routes.

The alerts-proxy, alert-buffer, and prom-proxy containers report this error:
2019/03/04 03:12:40 server.go:2753: http: TLS handshake error from 10.129.0.1:35806: remote error: tls: unknown certificate authority
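To see how often a container logs this error, its log can be filtered for the message. A minimal sketch, assuming a logged-in oc session; the helper name count_unknown_ca is hypothetical, and the pod and container names in the usage comment come from this report.

```shell
#!/bin/sh
# Hypothetical helper: count log lines on stdin that contain the
# "unknown certificate authority" handshake error quoted above.
count_unknown_ca() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # the count is 0, so "|| true" keeps the function's status at success.
    grep -c 'tls: unknown certificate authority' || true
}

# Usage (not run here; needs cluster access):
#   oc -n openshift-metrics logs prometheus-0 -c prom-proxy | count_unknown_ca
```

A non-zero count after the playbook run suggests that the connecting client (here 10.129.0.1) does not trust the CA that signed the container's serving certificate, which would be consistent with the reencrypt routes failing while passthrough works.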

Comment 9 Junqi Zhao 2019-03-04 03:32:48 UTC

Created attachment 1540442
pods logs

Comment 15 Stephen Cuppett 2019-11-20 19:04:57 UTC

OCP 3.6-3.10 is no longer on full support [1]. Marking CLOSED DEFERRED. If you have a customer case with a support exception or have reproduced on 3.11+, please reopen and include those details. When reopening, please set the Target Release to the appropriate version where needed.

[1]: https://access.redhat.com/support/policy/updates/openshift

Comment 16 Stephen Cuppett 2019-11-20 19:06:35 UTC

OCP 3.6-3.10 is no longer on full support [1]. Marking CLOSED DEFERRED. If you have a customer case with a support exception or have reproduced on 3.11+, please reopen and include those details. When reopening, please set the Target Release to the appropriate version where needed.

[1]: https://access.redhat.com/support/policy/updates/openshift
