Have similar bug with v3.7.

[anli@upg_slave_qeos10 37]$ oc describe pod hawkular-metrics-mwpsh
Name:           hawkular-metrics-mwpsh
Namespace:      openshift-infra
Node:           172.16.120.107/172.16.120.107
Start Time:     Mon, 06 Aug 2018 06:34:20 -0400
Labels:         metrics-infra=hawkular-metrics
                name=hawkular-metrics
Annotations:    kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicationController","namespace":"openshift-infra","name":"hawkular-metrics","uid":"2bd612cb-9964-11e8-a...
                openshift.io/scc=restricted
Status:         Running
IP:             10.129.0.98
Controlled By:  ReplicationController/hawkular-metrics
Containers:
  hawkular-metrics:
    Container ID:  docker://2af10703f80eef65e27ba45c6c35447cddeb634c509d431ccfd75c59344aa113
    Image:         registry.access.stage.redhat.com/openshift3/metrics-hawkular-metrics:v3.7
    Image ID:      docker-pullable://registry.access.stage.redhat.com/openshift3/metrics-hawkular-metrics@sha256:0c804ff20cb23c1d340d8b8d50fa3e4808a7f46e91d603960c0a2087f545da75
    Ports:         8080/TCP, 8443/TCP, 8888/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP
    Command:
      /opt/hawkular/scripts/hawkular-metrics-wrapper.sh
      -b
      0.0.0.0
      -Dhawkular.metrics.cassandra.nodes=hawkular-cassandra
      -Dhawkular.metrics.cassandra.use-ssl
      -Dhawkular.metrics.openshift.auth-methods=openshift-oauth,htpasswd
      -Dhawkular.metrics.openshift.htpasswd-file=/hawkular-account/hawkular-metrics.htpasswd
      -Dhawkular.metrics.allowed-cors-access-control-allow-headers=authorization
      -Dhawkular.metrics.default-ttl=7
      -Dhawkular.metrics.admin-tenant=_hawkular_admin
      -Dhawkular-alerts.cassandra-nodes=hawkular-cassandra
      -Dhawkular-alerts.cassandra-use-ssl
      -Dhawkular.alerts.openshift.auth-methods=openshift-oauth,htpasswd
      -Dhawkular.alerts.openshift.htpasswd-file=/hawkular-account/hawkular-metrics.htpasswd
      -Dhawkular.alerts.allowed-cors-access-control-allow-headers=authorization
      -Dorg.apache.tomcat.util.buf.UDecoder.ALLOW_ENCODED_SLASH=true
      -Dorg.apache.catalina.connector.CoyoteAdapter.ALLOW_BACKSLASH=true
      -Dcom.datastax.driver.FORCE_NIO=true
      -DKUBERNETES_MASTER_URL=https://kubernetes.default.svc
      -DUSER_WRITE_ACCESS=False
      -Dhawkular.metrics.jmx-reporting-enabled
    State:          Running
      Started:      Mon, 06 Aug 2018 06:50:31 -0400
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Mon, 06 Aug 2018 06:44:31 -0400
      Finished:     Mon, 06 Aug 2018 06:50:31 -0400
    Ready:          False
    Restart Count:  2
    Limits:
      memory:  2500M
    Requests:
      memory:  1500M
    Liveness:   exec [/opt/hawkular/scripts/hawkular-metrics-liveness.py] delay=0s timeout=1s period=10s #success=1 #failure=3
    Readiness:  exec [/opt/hawkular/scripts/hawkular-metrics-readiness.py] delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      POD_NAMESPACE:                  openshift-infra (v1:metadata.namespace)
      MASTER_URL:                     https://kubernetes.default.svc
      JGROUPS_PASSWORD:               QZEXRDtDeIQE2PBmi
      TRUSTSTORE_AUTHORITIES:         /hawkular-metrics-certs/tls.truststore.crt
      ENABLE_PROMETHEUS_ENDPOINT:     True
      OPENSHIFT_KUBE_PING_NAMESPACE:  openshift-infra (v1:metadata.namespace)
      OPENSHIFT_KUBE_PING_LABELS:     metrics-infra=hawkular-metrics,name=hawkular-metrics
      STARTUP_TIMEOUT:                500
    Mounts:
      /hawkular-account from hawkular-metrics-account (rw)
      /hawkular-metrics-certs from hawkular-metrics-certs (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from hawkular-token-skj24 (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          False
  PodScheduled   True
Volumes:
  hawkular-metrics-certs:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  hawkular-metrics-certs
    Optional:    false
  hawkular-metrics-account:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  hawkular-metrics-account
    Optional:    false
  hawkular-token-skj24:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  hawkular-token-skj24
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     <none>
Events:
  Type     Reason                 Age                From                     Message
  ----     ------                 ----               ----                     -------
  Normal   Scheduled              16m                default-scheduler        Successfully assigned hawkular-metrics-mwpsh to 172.16.120.107
  Normal   SuccessfulMountVolume  16m                kubelet, 172.16.120.107  MountVolume.SetUp succeeded for volume "hawkular-metrics-certs"
  Normal   SuccessfulMountVolume  16m                kubelet, 172.16.120.107  MountVolume.SetUp succeeded for volume "hawkular-token-skj24"
  Normal   SuccessfulMountVolume  16m                kubelet, 172.16.120.107  MountVolume.SetUp succeeded for volume "hawkular-metrics-account"
  Normal   Pulling                16m                kubelet, 172.16.120.107  pulling image "registry.access.stage.redhat.com/openshift3/metrics-hawkular-metrics:v3.7"
  Normal   Pulled                 12m                kubelet, 172.16.120.107  Successfully pulled image "registry.access.stage.redhat.com/openshift3/metrics-hawkular-metrics:v3.7"
  Normal   Started                12m                kubelet, 172.16.120.107  Started container
  Normal   Created                12m                kubelet, 172.16.120.107  Created container
  Warning  Unhealthy              12m                kubelet, 172.16.120.107  Liveness probe failed: Failed to access the status endpoint : <urlopen error [Errno 111] Connection refused>.
Traceback (most recent call last):
  File "/opt/hawkular/scripts/hawkular-metrics-liveness.py", line 48, in <module>
    if int(uptime) < int(timeout):
ValueError: invalid literal for int() with base 10: ''
  Warning  Unhealthy              11m (x2 over 12m)  kubelet, 172.16.120.107  Readiness probe failed: Failed to access the status endpoint : <urlopen error [Errno 111] Connection refused>. This may be due to Hawkular Metrics not being ready yet. Will try again.
  Warning  Unhealthy              11m                kubelet, 172.16.120.107  Readiness probe failed: Failed to access the status endpoint : HTTP Error 404: Not Found. This may be due to Hawkular Metrics not being ready yet. Will try again.
  Warning  Unhealthy              6m (x3 over 6m)    kubelet, 172.16.120.107  Liveness probe failed: The MetricsService is in a FAILED state. Aborting
  Warning  Unhealthy              1m (x57 over 11m)  kubelet, 172.16.120.107  Readiness probe failed: The MetricService is not yet in the STARTED state [STARTING]. We need to wait until its in the STARTED state.

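
The ValueError in the first liveness failure above suggests the probe script calls int() on an empty uptime string when the status endpoint cannot be reached. A minimal sketch of a more defensive comparison follows. The helper names parse_seconds and within_startup_window are hypothetical and are not taken from hawkular-metrics-liveness.py; only the `int(uptime) < int(timeout)` comparison comes from the traceback.

```python
def parse_seconds(raw, default=0):
    """Convert a possibly empty or non-numeric value to whole seconds.

    Hypothetical helper: returns `default` instead of raising when the
    value cannot be parsed (for example the empty string in the traceback).
    """
    try:
        return int(str(raw).strip())
    except (TypeError, ValueError):
        return default


def within_startup_window(uptime, timeout):
    """Return True while the uptime is still below the startup timeout.

    Assumption: an unreadable uptime is treated as 0, i.e. the container
    is considered to be still starting rather than crashing the probe.
    """
    return parse_seconds(uptime) < parse_seconds(timeout)
```

With this guard, an empty uptime compared against the STARTUP_TIMEOUT of 500 from the pod environment is treated as still within the startup window rather than raising.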
*** This bug has been marked as a duplicate of bug 1612648 ***