Description of problem:

In a fresh 3.11.43 cluster installation, the hawkular-metrics pod does not come up and goes into a crash loop (CrashLoopBackOff).

# oc get pods
NAME                            READY     STATUS             RESTARTS   AGE
hawkular-cassandra-1-r6ckd      1/1       Running            1          6d
hawkular-metrics-94hxj          0/1       CrashLoopBackOff   2803       6d
hawkular-metrics-schema-d65rc   0/1       Completed          0          6d
heapster-l6wcg                  0/1       Running            1064       6d

Events:
  Type     Reason     Age                   From          Message
  ----     ------     ----                  ----          -------
  Normal   Killing    1h (x2801 over 6d)    kubelet, xyz  Killing container with id docker://hawkular-metrics:Container failed liveness probe.. Container will be killed and recreated.
  Warning  Unhealthy  34m (x15716 over 6d)  kubelet, xyz  Readiness probe failed: Failed to access the status endpoint : <urlopen error [Errno 111] Connection refused>. This may be due to Hawkular Metrics not being ready yet. Will try again.
  Warning  Unhealthy  15m (x8463 over 6d)   kubelet, xyz  Liveness probe failed: Failed to access the status endpoint : <urlopen error [Errno 111] Connection refused>.
Traceback (most recent call last):
  File "/opt/hawkular/scripts/hawkular-metrics-liveness.py", line 48, in <module>
    if int(uptime) < int(timeout):

# oc -n openshift-infra get job
NAME                      DESIRED   SUCCESSFUL   AGE
hawkular-metrics-schema   1         1            6d

# oc get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM                                 STORAGECLASS              REASON    AGE
pvc-e33d4076-14d8-11e9-8146-005056a108a0   25G        RWO            Delete           Bound     openshift-infra/metrics-cassandra-1   glusterfs-storage-block             6d

# oc get pvc
NAME                  STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS              AGE
metrics-cassandra-1   Bound     pvc-e33d4076-14d8-11e9-8146-005056a108a0   25G        RWO            glusterfs-storage-block   6d

Workaround: Increase the readiness probe.
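
One possible way to apply the workaround is sketched below. The report does not say which probe field was increased or by how much, so the field (initial delay) and the value of 300 seconds are assumptions for illustration only, as is the replication controller name hawkular-metrics (inferred from the pod name). The script only prints the command so it can be reviewed before it is run against a cluster.

```shell
#!/bin/sh
# Sketch only: the probe field and its value are assumptions, not taken from this report.
NAMESPACE=openshift-infra     # namespace used in the report's oc commands
RC_NAME=hawkular-metrics      # assumed replication controller name (inferred from the pod name)
READINESS_DELAY=300           # hypothetical initial delay, in seconds

# Print the oc command instead of running it, so it can be reviewed first.
build_probe_cmd() {
    printf 'oc -n %s set probe rc/%s --readiness --initial-delay-seconds=%s\n' \
        "$NAMESPACE" "$RC_NAME" "$READINESS_DELAY"
}

build_probe_cmd
```

After running the printed command, the existing pod would still carry the old probe settings. Deleting it (for example, oc -n openshift-infra delete pod hawkular-metrics-94hxj) should let the replication controller recreate it with the new values.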
PR containing this enhancement: https://github.com/openshift/openshift-ansible/pull/11216
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:3139