Bug 1643948

Summary:

Cluster console doesn't display the real value of Crashlooping Pods (it displays 0)

Product:

OpenShift Container Platform

Reporter:

Alberto Gonzalez de Dios <algonzal>

Component:

Management Console

Assignee:

Samuel Padgett <spadgett>

Status:

CLOSED ERRATA

QA Contact:

Yadan Pei <yapei>

Severity:

low

Docs Contact:

Priority:

unspecified

Version:

3.11.0

CC:

aos-bugs, jokerman, mmccomas, rsandu, smunilla, spadgett, yapei

Target Milestone:

---

Target Release:

3.11.z

Hardware:

Unspecified

OS:

Unspecified

Whiteboard:

Fixed In Version:

Doc Type:

Bug Fix

Doc Text:

Previously, the cluster console in OpenShift 3.11 always showed the value "0" for the Crashlooping Pods count on the cluster status page, even when there were crashlooping pods. The problem has been fixed, and the value now accurately reflects the number of crashlooping pods in the selected projects.

Story Points:

---

Clone Of:

Environment:

Last Closed:

2018-11-20 03:11:52 UTC

Type:

Bug

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

---

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Attachments:

Description	Flags
Cluster console Crashlooping Pod counter	none

Description Alberto Gonzalez de Dios 2018-10-29 14:17:20 UTC

Created attachment 1498583
Cluster console Crashlooping Pod counter

Description of problem: 
The Crashlooping Pods counter in the cluster console doesn't display the real value. It displays "0" instead of the actual number of pods in the "CrashLoopBackOff" state.


Version-Release number of selected component (if applicable):
OpenShift 3.11


How reproducible:
Create a new app, restart a pod many times until it enters the "CrashLoopBackOff" state, and check the OpenShift cluster console. Instead of showing a Crashlooping Pods value of "1", it always displays "0".


Steps to Reproduce:
1. Create a new project test:
oc new-project test
2. Create a new test app:
oc new-app https://github.com/openshift/sti-ruby.git --context-dir=2.0/test/puma-test-app
3. Get POD Container ID:
docker ps -a | grep ruby | grep ose-pod | grep Up
4. Kill POD Container ID with SIGTERM (I used SIGHUP):
docker kill --signal=SIGHUP CONTAINER-ID
5. Repeat 3 and 4 until POD status changes to "CrashLoopBackOff"
watch -n 5 'docker kill --signal=SIGHUP $(docker ps -a | grep ruby | grep ose-pod | grep Up | awk "{print \$1}")'
oc get pods | grep Crash
6. Check Cluster Console (make sure Project is the new one, "test")
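
To cross-check the number the console should report, the pods in the "CrashLoopBackOff" state can be counted from the CLI. This is a minimal sketch, assuming the default `oc get pods` column layout (NAME, READY, STATUS, RESTARTS, AGE). The function name count_crashlooping is hypothetical and not part of oc.

```shell
# Hypothetical helper: reads `oc get pods --no-headers` output on stdin and
# prints how many pods have STATUS (third column) equal to CrashLoopBackOff.
count_crashlooping() {
  awk '$3 == "CrashLoopBackOff" { n++ } END { print n + 0 }'
}

# Usage (requires a logged-in oc session; "test" is the project created in step 1):
#   oc get pods -n test --no-headers | count_crashlooping
```

The printed number is the value the Crashlooping Pods counter would be expected to converge to for the selected project.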


Actual results:
The Crashlooping Pods number in the cluster console remains "0" instead of "1"


Expected results:
Crashlooping Pods number in Cluster Console should be "1"

Comment 1 Samuel Padgett 2018-10-30 11:16:51 UTC

Fixed by https://github.com/openshift/console/pull/716

Comment 5 Yadan Pei 2018-11-05 06:30:00 UTC

1. Create dummy pods.

2. Check the status on the cluster console, on both the Pods page and the Home -> Status page.

Crashlooping Pods are NOT showing on the Status page; see the recording in the attachment.

Comment 6 Yadan Pei 2018-11-05 06:31:04 UTC

apiVersion: v1
kind: Pod
metadata:
  name: dummy-pod
spec:
  containers:
    - name: dummy-pod
      image: ubuntu
  restartPolicy: Always

Comment 9 Yadan Pei 2018-11-05 06:44:06 UTC

Verified the bug on OpenShift v3.11.38

Comment 10 Samuel Padgett 2018-11-05 13:43:54 UTC

(In reply to Yadan Pei from comment #5)
> 1. Create dummy pods.
>
> 2. Check the status on the cluster console, on both the Pods page and the Home -> Status page.
>
> Crashlooping Pods are NOT showing on the Status page; see the recording in the attachment.

We're querying Prometheus for pods with 5 container restarts within the last 5 minutes. It might take a few minutes to update (as you've found).
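
The check described above can be approximated with a PromQL expression over container restart counters. This is a rough sketch, assuming the kube-state-metrics counter `kube_pod_container_status_restarts_total` is being scraped. The exact query the console uses is not quoted in this bug, and the helper name crashloop_query and the `PROM_URL` variable are hypothetical.

```shell
# Hypothetical helper: prints a PromQL expression that counts containers in the
# given namespace whose restart counter grew by at least 5 over the last 5 minutes.
# This is an assumption about the shape of the query, not the console's actual code.
crashloop_query() {
  ns="$1"
  printf 'count(increase(kube_pod_container_status_restarts_total{namespace="%s"}[5m]) >= 5)' "$ns"
}

# Usage sketch against the Prometheus HTTP API (authentication omitted;
# PROM_URL is an assumed variable pointing at the Prometheus route):
#   curl -sG "$PROM_URL/api/v1/query" --data-urlencode "query=$(crashloop_query test)"
```

Because the expression looks at a 5-minute window, a pod that has only just started crashlooping would not be counted until enough restarts accumulate, which is consistent with the delay noted above.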

Comment 11 Yadan Pei 2018-11-06 02:14:56 UTC

Thanks for the info Sam

Comment 14 errata-xmlrpc 2018-11-20 03:11:52 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3537