Bug 1651899

Summary: Readiness probe failed for grafana pod
Product: OpenShift Container Platform
Component: Monitoring
Version: 4.1.0
Target Release: 4.1.0
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Keywords: Regression
Reporter: Junqi Zhao <juzhao>
Assignee: Frederic Branczyk <fbranczy>
QA Contact: Junqi Zhao <juzhao>
Hardware: Unspecified
OS: Unspecified
Type: Bug
Last Closed: 2019-06-04 10:41:02 UTC
Attachments:
  Readiness probe failed for grafana pod (attachment 1507562)

Description Junqi Zhao 2018-11-21 07:03:03 UTC
Created attachment 1507562 [details]
Readiness probe failed for grafana pod

Description of problem:
This bug is cloned from https://jira.coreos.com/browse/MON-475; filing it again here so the QE team can track the monitoring issue in Bugzilla.

After deploying cluster monitoring with the Next-Gen installer, the readiness probe fails for the grafana pod.

Describing the pod shows the error "Readiness probe failed: Get http://10.129.0.13:3000/api/health: net/http: HTTP/1.x transport connection broken: malformed HTTP response "\x15\x03\x01\x00\x02\x02""

It seems the readiness probe should use HTTPS rather than HTTP: the bytes \x15\x03\x01 at the start of the "malformed" response are the beginning of a TLS alert record, i.e. the grafana-proxy sidecar serves TLS on port 3000 while the probe speaks plain HTTP.
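
For reference, a minimal sketch of what the corrected probe could look like on the proxy container (the exact container layout of the operator-generated grafana deployment is an assumption here; the point is only the scheme):

  # hypothetical fragment of the grafana deployment spec, not the actual manifest
  readinessProbe:
    httpGet:
      path: /api/health
      port: 3000
      scheme: HTTPS   # grafana-proxy terminates TLS on :3000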


#oc -n openshift-monitoring get all
NAME                                              READY     STATUS    RESTARTS   AGE
pod/cluster-monitoring-operator-8fbbc8d47-mzl8k   1/1       Running   0          3h
pod/grafana-56567d86b-g5crx                       1/2       Running   0          3h
pod/prometheus-operator-57ddb7f5bb-ql6bw          1/1       Running   0          3h

NAME                                  TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
service/cluster-monitoring-operator   ClusterIP   None             <none>        8080/TCP   3h
service/grafana                       ClusterIP   172.30.231.107   <none>        3000/TCP   3h
service/prometheus-operator           ClusterIP   None             <none>        8080/TCP   3h

NAME                                          DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/cluster-monitoring-operator   1         1         1            1           4h
deployment.apps/grafana                       1         1         1            0           3h
deployment.apps/prometheus-operator           1         1         1            1           3h

NAME                                                    DESIRED   CURRENT   READY     AGE
replicaset.apps/cluster-monitoring-operator-8fbbc8d47   1         1         1         3h
replicaset.apps/grafana-56567d86b                       1         1         0         3h
replicaset.apps/prometheus-operator-57ddb7f5bb          1         1         1         3h

NAME                               HOST/PORT                                                   PATH      SERVICES   PORT      TERMINATION   WILDCARD
route.route.openshift.io/grafana   grafana-openshift-monitoring.apps.1121-1n5.qe.rhcloud.com             grafana    https     reencrypt     None

#oc -n openshift-monitoring describe pod grafana-56567d86b-g5crx

**********snipped*********

Events:
  Type     Reason     Age                 From                                   Message
  ----     ------     ----                ----                                   -------
  Warning  Unhealthy  2m (x1141 over 3h)  kubelet, ip-172-18-4-140.ec2.internal  Readiness probe failed: Get http://10.129.0.13:3000/api/health: net/http: HTTP/1.x transport connection broken: malformed HTTP response "\x15\x03\x01\x00\x02\x02"

 

#oc -n openshift-monitoring logs grafana-56567d86b-g5crx -c grafana-proxy

2018/11/21 03:27:16 oauthproxy.go:238: Cookie settings: name:_oauth_proxy secure(https):true httponly:true expiry:168h0m0s domain:<default> refresh:disabled
2018/11/21 03:27:16 http.go:96: HTTPS: listening on [::]:3000
2018/11/21 03:27:17 server.go:2753: http: TLS handshake error from 10.129.0.1:42980: tls: first record does not look like a TLS handshake
2018/11/21 03:27:27 server.go:2753: http: TLS handshake error from 10.129.0.1:43018: tls: first record does not look like a TLS handshake
2018/11/21 03:27:37 server.go:2753: http: TLS handshake error from 10.129.0.1:43052: tls: first record does not look like a TLS handshake
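
The handshake errors line up with the probe speaking plain HTTP to a TLS listener. As a quick manual check from any pod with access to the pod network (pod IP taken from the events above; assuming curl is available):

  $ curl -ks https://10.129.0.13:3000/api/health   # completes the TLS handshake and gets an HTTP response
  $ curl -s  http://10.129.0.13:3000/api/health    # plain HTTP against the TLS listener fails / returns garbage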

 

BTW: alertmanager-main and prometheus-k8s are not created either.

Version-Release number of selected component (if applicable):
quay.io/openshift/origin-cluster-monitoring-operator:v4.0
grafana/grafana:5.2.4
openshift/oauth-proxy:v1.1.0
quay.io/coreos/configmap-reload:v0.0.1
quay.io/coreos/prometheus-operator:v0.25.0

How reproducible:
Always

Steps to Reproduce:
1. Deploy cluster monitoring with the Next-Gen installer

Actual results:
Readiness probe failed for grafana pod

Expected results:
Readiness probe should pass for the grafana pod

Additional info:

Comment 1 Junqi Zhao 2018-12-13 06:09:39 UTC
All containers of the grafana pod are up now.

$ oc -n openshift-monitoring get pod | grep grafana
grafana-58456d859d-hcmj2                       2/2       Running   0          48m

used images
docker.io/grafana/grafana:5.2.4
docker.io/openshift/oauth-proxy:v1.1.0

$ oc version
oc v4.0.0-alpha.0+9d2874f-759
kubernetes v1.11.0+9d2874f
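
If useful for verification, the probe configuration the pod actually runs with can be checked on the deployment (just a generic oc/grep sketch):

  $ oc -n openshift-monitoring get deployment grafana -o yaml | grep -B2 -A5 readinessProbe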

Comment 4 errata-xmlrpc 2019-06-04 10:41:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758