+++ This bug was initially created as a clone of Bug #1999288 +++

Description of problem:

The following sig-scheduling OCP tests are failing due to incorrect CPU and memory calculations by the script:

[sig-scheduling] SchedulerPriorities [Serial] Pod should avoid nodes that have avoidPod annotation [Suite:openshift/conformance/serial] [Suite:k8s]
[sig-scheduling] SchedulerPriorities [Serial] Pod should be preferably scheduled to nodes pod can tolerate [Suite:openshift/conformance/serial] [Suite:k8s]
[sig-scheduling] SchedulerPriorities [Serial] PodTopologySpread Scoring validates pod should be preferably scheduled to node which makes the matching pods more evenly distributed [Suite:openshift/conformance/serial] [Suite:k8s]

Version-Release number of selected component (if applicable):
OCP 4.8

How reproducible:
Always

Steps to Reproduce:
1. Create an OpenShift 4.8 cluster.
2. Run the sig-scheduling tests mentioned above.
3. Observe the CPU and memory logs of the pods in the cluster in another window - the actual CPU and memory consumption of the pods is much lower than what the script calculates.

Actual results:
The test fails because the script only checks the CPU and memory values of the tigera-operator pod, and also calculates these values incorrectly.
The following logs show the incorrect CPU and memory calculations of the script (the "Pod for on the node" line is emitted once per pod on the node, but always names the same tigera-operator pod):

```
Aug 25 14:55:20.299: INFO: ComputeCPUMemFraction for node: 10.5.149.223
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
[... the identical line repeats once per pod on the node ...]
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Node: 10.5.149.223, totalRequestedCPUResource: 4300, cpuAllocatableMil: 3910, cpuFraction: 1
Aug 25 14:55:20.299: INFO: Node: 10.5.149.223, totalRequestedMemResource: 1866465280, memAllocatableVal: 13808427008, memFraction: 0.13516856618923007
```

Expected results:
The test must calculate the CPU and memory of every pod in the cluster correctly.

Additional info:
If we create a namespace "zzz" and the following pod in it, then the test passes:

```
kubectl apply -f - <<EOF
---
apiVersion: v1
kind: Pod
metadata:
  name: zzz
  namespace: zzz
spec:
  containers:
  - name: zzz
    image: us.icr.io/armada-master/pause:3.2
    resources:
      requests:
        cpu: 100m
        memory: 100Mi
EOF

1 pass, 0 skip (1m47s)
+ [[ 0 -eq 0 ]]
+ echo 'SUCCESS: PVG ocp_conformance.sh was successful. Test results are available in directory /tmp/ocp-conformance-z8w.'
SUCCESS: PVG ocp_conformance.sh was successful. Test results are available in directory /tmp/ocp-conformance-z8w.
vagrant@verify-cluster:~/kubernetes-e2e-test-cases/tests$
```

--- Additional comment from Jan Chaloupka on 2021-09-01 11:01:27 UTC ---

> 3. Observe the CPU and memory logs of the pods in the cluster in another window - the actual CPU and memory consumption values of the pods are much less than what is being calculated by the script.

What are the expected actual values?
Which command do you use to see the actual values?

--- Additional comment from on 2021-09-01 15:36:47 UTC ---

(In reply to Jan Chaloupka from comment #1)

> What are the expected actual values? Which command do you use to see the actual values?

You can use the following command to view the actual CPU and memory values in another window:

watch -n 3 oc adm top pods --namespace="namespace_of_the_pod_that_is_being_verified"

For example:

watch -n 3 oc adm top pods --namespace=tigera-operator

During the test, these were the actual values of the tigera-operator pod, not the ones that were being displayed/calculated by the test:

```
Every 3.0s: oc adm top pods --namespace=tigera-...

NAME                               CPU(cores)   MEMORY(bytes)
tigera-operator-667cd558f7-szmrj   3m           77Mi

NAME                               CPU(cores)   MEMORY(bytes)
tigera-operator-667cd558f7-szmrj   2m           97Mi

➜ ~ oc adm top pods --namespace=tigera-operator
NAME                               CPU(cores)   MEMORY(bytes)
tigera-operator-667cd558f7-szmrj   4m           97Mi
➜ ~ oc adm top pods --namespace=tigera-operator
NAME                               CPU(cores)   MEMORY(bytes)
tigera-operator-667cd558f7-szmrj   2m           97Mi
➜ ~ oc adm top pods --namespace=tigera-operator
NAME                               CPU(cores)   MEMORY(bytes)
tigera-operator-667cd558f7-szmrj   2m           97Mi
➜ ~ oc adm top pods --namespace=tigera-operator
NAME                               CPU(cores)   MEMORY(bytes)
tigera-operator-667cd558f7-szmrj   2m           97Mi
➜ ~ oc adm top pods --namespace=tigera-operator
NAME                               CPU(cores)   MEMORY(bytes)
tigera-operator-667cd558f7-szmrj   2m           97Mi
➜ ~ oc adm top pods --namespace=tigera-operator
NAME                               CPU(cores)   MEMORY(bytes)
tigera-operator-667cd558f7-szmrj   2m           97Mi
➜ ~ oc adm top pods --namespace=tigera-operator
NAME                               CPU(cores)   MEMORY(bytes)
tigera-operator-667cd558f7-szmrj   3m           97Mi
➜ ~ oc adm top pods --namespace=tigera-operator
NAME                               CPU(cores)   MEMORY(bytes)
tigera-operator-667cd558f7-szmrj   4m           97Mi
➜ ~ oc adm top pods --namespace=tigera-operator
NAME                               CPU(cores)   MEMORY(bytes)
tigera-operator-667cd558f7-szmrj   9m           90Mi
```

The test displayed a higher CPU value than what was actually consumed by the pod.

--- Additional comment from Jan Chaloupka on 2021-09-02 16:26:40 UTC ---

Are you referring to createBalancedPodForNodes?

oc adm top pods displays current usage of resources based on what cadvisor provides, whereas createBalancedPodForNodes relies only on the resource requests provided by pods. So the difference you reported is expected.

Can you share links of the failed tests?

--- Additional comment from on 2021-09-09 04:18:08 UTC ---

(In reply to Jan Chaloupka from comment #3)
> Are you referring to createBalancedPodForNodes?
>
> oc adm top pods displays current usage of resources based on what cadvisor
> provides. Whereas createBalancedPodForNodes relies only on the resource
> requests provided by pods. So the difference you reported is expected.
>
> Can you share links of the failed tests?

Here are the tests that are failing:

[sig-scheduling] SchedulerPriorities [Serial] Pod should avoid nodes that have avoidPod annotation [Suite:openshift/conformance/serial] [Suite:k8s]
[sig-scheduling] SchedulerPriorities [Serial] Pod should be preferably scheduled to nodes pod can tolerate [Suite:openshift/conformance/serial] [Suite:k8s]
[sig-scheduling] SchedulerPriorities [Serial] PodTopologySpread Scoring validates pod should be preferably scheduled to node which makes the matching pods more evenly distributed [Suite:openshift/conformance/serial] [Suite:k8s]

Link to the test: https://github.com/openshift/origin/blob/release-4.8/vendor/k8s.io/kubernetes/test/e2e/scheduling/priorities.go

--- Additional comment from Jan Chaloupka on 2021-09-09 07:01:53 UTC ---

Apologies, I meant CI runs of the failed tests, from https://prow.ci.openshift.org/.
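The distinction drawn above - live usage versus declared requests - explains why `oc adm top pods` can never match the test's numbers: the e2e helper sums the CPU/memory *requests* from each pod spec and divides by the node's allocatable capacity. A minimal Go sketch of that requests-based fraction, using simplified stand-in types rather than the actual upstream helper's signatures:

```go
package main

import "fmt"

// pod holds only the scheduling-relevant request values, mirroring
// what a requests-based helper reads from pod specs (not cadvisor).
type pod struct {
	name     string
	cpuMilli int64 // requested CPU in millicores
	memBytes int64 // requested memory in bytes
}

// cpuMemFraction sums pod requests and divides by the node's
// allocatable capacity - declared intent, not measured usage.
func cpuMemFraction(pods []pod, cpuAllocMilli, memAllocBytes int64) (float64, float64) {
	var cpu, mem int64
	for _, p := range pods {
		cpu += p.cpuMilli
		mem += p.memBytes
	}
	return float64(cpu) / float64(cpuAllocMilli), float64(mem) / float64(memAllocBytes)
}

func main() {
	// Values echo the log above: a 100m/40Mi request against a node
	// with 3910m CPU and ~13.8 GB memory allocatable.
	pods := []pod{{"tigera-operator", 100, 41943040}}
	cpuFrac, memFrac := cpuMemFraction(pods, 3910, 13808427008)
	fmt.Printf("cpuFraction: %v memFraction: %v\n", cpuFrac, memFrac)
}
```

This is why a pod requesting 100m can show up as "Cpu: 100" in the test log while `oc adm top` reports only 2m-9m of actual consumption.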
--- Additional comment from Richard Theis on 2021-09-13 21:22:06 UTC ---

We are running these tests on Red Hat OpenShift on IBM Cloud clusters via IBM Cloud CI. There are no failed test runs in https://prow.ci.openshift.org/ related to this bugzilla, but the lack of test failures in OpenShift CI does not mean that this is not a valid test problem. I suspect that the last pod found on OpenShift clusters run in CI allows the test to pass. I believe the previous comments show how to reproduce the problem. If not, please let us know. Thanks.

--- Additional comment from Jan Chaloupka on 2021-09-14 08:46:45 UTC ---

I am asking for the test failures from https://prow.ci.openshift.org/ so I can see the entire failure logs and also have proof so we can alter the test upstream if needed. It's hard to convince upstream to merge any change without the failure logs in this case.

Checking https://search.ci.openshift.org/ for the last 14 days:

- [sig-scheduling] SchedulerPriorities [Serial] Pod should avoid nodes that have avoidPod annotation [Suite:openshift/conformance/serial] [Suite:k8s]
  No results found
- [sig-scheduling] SchedulerPriorities [Serial] Pod should be preferably scheduled to nodes pod can tolerate [Suite:openshift/conformance/serial] [Suite:k8s]
  A few tests failed due to overall cluster reasons (NS not created, error creating a pod, ...)
- [sig-scheduling] SchedulerPriorities [Serial] PodTopologySpread Scoring validates pod should be preferably scheduled to node which makes the matching pods more evenly distributed [Suite:openshift/conformance/serial] [Suite:k8s]
  A few tests failed due to overall cluster reasons (NS not created, no node available for scheduling, ...)

smitha.subbarao, can you share the entire test run including the failures?

> We are running these tests on Red Hat OpenShift on IBM Cloud clusters via IBM Cloud CI.

Is it a part of a CI system? Assuming the 4.8 version of OpenShift (as reported), are there other versions where the test fails as well?

--- Additional comment from Richard Theis on 2021-09-14 11:10:57 UTC ---

I think that we have provided enough details for a fix to be provided, but we can provide the full logs from our test run if that would help. And there is no failure in https://prow.ci.openshift.org/; this failure is seen in IBM's CI system, only on OpenShift version 4.8. Smitha, can you please provide the full test failure logs?

--- Additional comment from on 2021-09-14 13:17:37 UTC ---

This file contains the full test failure log of the following OCP 4.8 tests:

"[sig-scheduling] SchedulerPriorities [Serial] Pod should avoid nodes that have avoidPod annotation [Suite:openshift/conformance/serial] [Suite:k8s]"
"[sig-scheduling] SchedulerPriorities [Serial] Pod should be preferably scheduled to nodes pod can tolerate [Suite:openshift/conformance/serial] [Suite:k8s]"
"[sig-scheduling] SchedulerPriorities [Serial] PodTopologySpread Scoring validates pod should be preferably scheduled to node which makes the matching pods more evenly distributed [Suite:openshift/conformance/serial] [Suite:k8s]"

--- Additional comment from Jan Chaloupka on 2021-09-15 11:26:45 UTC ---

--- Additional comment from Jan Chaloupka on 2021-09-15 12:17:36 UTC ---

Can you share more insight about how you run the tests?

Checking the logs, all the "Pod for on the node: " lines report exactly the same pod, "tigera-operator-7d896c66cd-klhq5" (quite strange). Occurrences:

- for 10.5.149.223: 32 occurrences
- for 10.5.149.234: 39 occurrences
- for 10.5.149.237: 43 occurrences

Checking cpu fractions:

- for 10.5.149.223: 0.8439897698209718
- for 10.5.149.234: 1
- for 10.5.149.237: 1

Meaning both 10.5.149.234 and 10.5.149.237 are saturated, so the filler pods will fail to be scheduled (at least on 10.5.149.234 and 10.5.149.237) since there is no CPU resource left. Thus the test must fail.

Questions:

- How saturated is your cluster before running the test suite (i.e. the resource consumption of each node)?
- How do you create the tigera-operator pod(s)?
- Does every tigera-operator have its own NS? Or is there only a single replica of the operator? Or does each node have its own replica of the operator? In which NS does the operator live?
- Do you run the test over a real cluster or over a mock/fake cluster (i.e. with a fake clientset)?
- Can you run `oc get pods -A` every second during the test run (to see how many tigera pods are in Terminated/Running state) while running only those 3 tests?
- Can you provide all kube-scheduler logs (3 files, assuming there are 3 master nodes)?

--- Additional comment from Richard Theis on 2021-09-15 17:17:48 UTC ---

Exactly... the "Pod for on the node: " lines report exactly the same pod "tigera-operator-7d896c66cd-klhq5" (quite strange). This is the test bug in my opinion. The test is incorrectly calculating CPU and memory because it is only using the last pod found in the cluster. This bugzilla's description shows how we can manipulate the cluster to yield either a test failure or a success.

--- Additional comment from on 2021-09-20 20:00:29 UTC ---

Resource consumption of each node before the test is shown below (the test is conducted using an actual ROKS cluster). The `oc get pods -A` logs will be added in a following comment.

```
➜ amd64 git:(release-4.8) kubectl describe nodes | grep 'Name:\| cpu\| memory'
Name:               10.5.149.170
  cpu:      4
  memory:   16260860Ki
  cpu:      3910m
  memory:   13484796Ki
  cpu       1246m (31%)      1800m (46%)
  memory    3751443Ki (27%)  2036000Ki (15%)
Name:               10.5.149.191
  cpu:      4
  memory:   16260856Ki
  cpu:      3910m
  memory:   13484792Ki
  cpu       1218m (31%)      600m (15%)
  memory    2928147Ki (21%)  3952928Ki (29%)
Name:               10.5.149.196
  cpu:      4
  memory:   16260852Ki
  cpu:      3910m
  memory:   13484788Ki
  cpu       1354m (34%)      600m (15%)
  memory    3567123Ki (26%)  826572800 (5%)
```

To reiterate Richard's response, the test keeps referring to the tigera-operator pod because it seems to check the last pod found in the cluster.
The steps to manipulate the cluster to successfully pass the test are below (same as the ones in the description):

1. Create a namespace "zzz".
2. Create the following pod in the "zzz" namespace and re-run the test - the test will pass.

```
kubectl apply -f - <<EOF
---
apiVersion: v1
kind: Pod
metadata:
  name: zzz
  namespace: zzz
spec:
  containers:
  - name: zzz
    image: us.icr.io/armada-master/pause:3.2
    resources:
      requests:
        cpu: 100m
        memory: 100Mi
EOF

1 pass, 0 skip (1m47s)
+ [[ 0 -eq 0 ]]
+ echo 'SUCCESS: PVG ocp_conformance.sh was successful. Test results are available in directory /tmp/ocp-conformance-z8w.'
SUCCESS: PVG ocp_conformance.sh was successful. Test results are available in directory /tmp/ocp-conformance-z8w.
vagrant@verify-cluster:~/kubernetes-e2e-test-cases/tests$
```

--- Additional comment from on 2021-09-20 22:46:19 UTC ---

There is only 1 tigera-operator pod running throughout the test, tigera-operator-7d896c66cd-qlbwt:

```
➜ amd64 git:(release-4.8) oc get pods -A
NAMESPACE                                          NAME                                                      READY   STATUS      RESTARTS   AGE
calico-system                                      calico-kube-controllers-d78c469ff-jjvpj                   1/1     Running     0          7d7h
calico-system                                      calico-node-92xqf                                         1/1     Running     0          7d7h
calico-system                                      calico-node-9zxlr                                         1/1     Running     0          7d7h
calico-system                                      calico-node-lrttq                                         1/1     Running     0          7d7h
calico-system                                      calico-typha-75bbbcf6df-9wgd6                             1/1     Running     0          7d7h
calico-system                                      calico-typha-75bbbcf6df-v98vn                             1/1     Running     0          7d7h
calico-system                                      calico-typha-75bbbcf6df-vlqz8                             1/1     Running     0          7d7h
e2e-sched-priority-2483                            aa311e35-cedc-4326-b6cd-3ac2d809626b-0                    0/1     Pending     0          5m14s
ibm-system                                         ibm-cloud-provider-ip-169-60-45-162-5dc8b94d6d-hcftw      1/1     Running     0          8h
ibm-system                                         ibm-cloud-provider-ip-169-60-45-162-5dc8b94d6d-xpw7f      1/1     Running     0          8h
kube-system                                        ibm-file-plugin-699bf5596-dwc4r                           1/1     Running     0          8h
kube-system                                        ibm-keepalived-watcher-64mgg                              1/1     Running     0          8h
kube-system                                        ibm-keepalived-watcher-7vl2x                              1/1     Running     0          8h
kube-system                                        ibm-keepalived-watcher-9z4mh                              1/1     Running     0          8h
kube-system                                        ibm-master-proxy-static-10.5.149.170                      2/2     Running     0          7d7h
kube-system                                        ibm-master-proxy-static-10.5.149.191                      2/2     Running     0          7d7h
kube-system                                        ibm-master-proxy-static-10.5.149.196                      2/2     Running     0          7d7h
kube-system                                        ibm-storage-metrics-agent-5dc6c457c7-spspn                1/1     Running     0          4h12m
kube-system                                        ibm-storage-watcher-856bcd698b-j8wzx                      1/1     Running     0          8h
kube-system                                        ibmcloud-block-storage-driver-4hwj5                       1/1     Running     0          8h
kube-system                                        ibmcloud-block-storage-driver-mptq6                       1/1     Running     0          8h
kube-system                                        ibmcloud-block-storage-driver-nds5x                       1/1     Running     0          8h
kube-system                                        ibmcloud-block-storage-plugin-649688f859-6pzcc            1/1     Running     0          8h
kube-system                                        vpn-56c795f968-92n5f                                      1/1     Running     0          7d7h
openshift-cluster-node-tuning-operator             cluster-node-tuning-operator-7b764df77c-qh9k2             1/1     Running     0          8h
openshift-cluster-node-tuning-operator             tuned-jjmfq                                               1/1     Running     0          8h
openshift-cluster-node-tuning-operator             tuned-msbxb                                               1/1     Running     0          8h
openshift-cluster-node-tuning-operator             tuned-rlwsk                                               1/1     Running     0          8h
openshift-cluster-samples-operator                 cluster-samples-operator-59f699dcbf-sz76r                 2/2     Running     0          8h
openshift-cluster-storage-operator                 cluster-storage-operator-78c6bfb7b4-d5qrp                 1/1     Running     1          8h
openshift-cluster-storage-operator                 csi-snapshot-controller-cb6558866-4x2lp                   1/1     Running     1          8h
openshift-cluster-storage-operator                 csi-snapshot-controller-cb6558866-zgc4j                   1/1     Running     1          8h
openshift-cluster-storage-operator                 csi-snapshot-controller-operator-7b4c9b4ffc-w96lf         1/1     Running     1          8h
openshift-cluster-storage-operator                 csi-snapshot-webhook-687d7ddb94-6thcn                     1/1     Running     0          8h
openshift-cluster-storage-operator                 csi-snapshot-webhook-687d7ddb94-d4bg2                     1/1     Running     0          8h
openshift-console-operator                         console-operator-5588c56b5b-ql56x                         1/1     Running     1          8h
openshift-console                                  console-5c5b64c998-br9rq                                  1/1     Running     0          8h
openshift-console                                  console-5c5b64c998-jwctb                                  1/1     Running     0          8h
openshift-console                                  downloads-8b49bb4c5-dj7d9                                 1/1     Running     0          8h
openshift-console                                  downloads-8b49bb4c5-k9wcd                                 1/1     Running     0          8h
openshift-dns-operator                             dns-operator-74cd5949f5-lxhwt                             2/2     Running     0          8h
openshift-dns                                      dns-default-d99dg                                         2/2     Running     0          8h
openshift-dns                                      dns-default-x85rk                                         2/2     Running     0          8h
openshift-dns                                      dns-default-z4pjg                                         2/2     Running     0          8h
openshift-dns                                      node-resolver-m9mlz                                       1/1     Running     0          8h
openshift-dns                                      node-resolver-md5v2                                       1/1     Running     0          8h
openshift-dns                                      node-resolver-nj2mc                                       1/1     Running     0          8h
openshift-image-registry                           cluster-image-registry-operator-75d5684d7c-8nf47          1/1     Running     1          8h
openshift-image-registry                           image-pruner-27198720-4zt76                               0/1     Completed   0          2d22h
openshift-image-registry                           image-pruner-27200160-82m5m                               0/1     Completed   0          46h
openshift-image-registry                           image-pruner-27201600-r6clj                               0/1     Completed   0          22h
openshift-image-registry                           image-registry-868f5d4b5c-pft2z                           1/1     Running     0          8h
openshift-image-registry                           node-ca-cxggw                                             1/1     Running     0          8h
openshift-image-registry                           node-ca-nqldr                                             1/1     Running     0          8h
openshift-image-registry                           node-ca-w4qll                                             1/1     Running     0          8h
openshift-image-registry                           registry-pvc-permissions-gsg9b                            0/1     Completed   0          8h
openshift-ingress-canary                           ingress-canary-2dlp9                                      1/1     Running     0          8h
openshift-ingress-canary                           ingress-canary-75krd                                      1/1     Running     0          8h
openshift-ingress-canary                           ingress-canary-wk8tx                                      1/1     Running     0          8h
openshift-ingress-operator                         ingress-operator-76f5b96d7c-dh9fn                         2/2     Running     0          8h
openshift-ingress                                  router-default-77c7f8cb7d-2px27                           1/1     Running     0          8h
openshift-ingress                                  router-default-77c7f8cb7d-cwr96                           1/1     Running     0          8h
openshift-kube-proxy                               openshift-kube-proxy-dzz98                                2/2     Running     0          8h
openshift-kube-proxy                               openshift-kube-proxy-gg6gs                                2/2     Running     0          8h
openshift-kube-proxy                               openshift-kube-proxy-swttg                                2/2     Running     0          8h
openshift-kube-storage-version-migrator-operator   kube-storage-version-migrator-operator-6879c94bfc-rmmz8   1/1     Running     1          8h
openshift-kube-storage-version-migrator            migrator-7d5cdcd9cc-klwf6                                 1/1     Running     0          8h
openshift-marketplace                              certified-operators-jnps6                                 1/1     Running     0          12h
openshift-marketplace                              community-operators-zptdk                                 1/1     Running     0          3h50m
openshift-marketplace                              marketplace-operator-7c69549b9f-dg6t6                     1/1     Running     0          8h
openshift-marketplace                              redhat-marketplace-jk66g                                  1/1     Running     0          12h
openshift-marketplace                              redhat-operators-7vndn                                    1/1     Running     0          5h43m
openshift-monitoring                               alertmanager-main-0                                       5/5     Running     0          8h
openshift-monitoring                               alertmanager-main-1                                       5/5     Running     0          8h
openshift-monitoring                               alertmanager-main-2                                       5/5     Running     0          8h
openshift-monitoring                               cluster-monitoring-operator-7b5f987df8-j2vpk              2/2     Running     0          8h
openshift-monitoring                               grafana-5c98cd844-tcnwt                                   2/2     Running     0          8h
openshift-monitoring                               kube-state-metrics-7485cb5695-zf848                       3/3     Running     0          8h
openshift-monitoring                               node-exporter-fs554                                       2/2     Running     0          8h
openshift-monitoring                               node-exporter-lq957                                       2/2     Running     0          8h
openshift-monitoring                               node-exporter-ww6sh                                       2/2     Running     0          8h
openshift-monitoring                               openshift-state-metrics-65c6597c7-zcfvp                   3/3     Running     0          8h
openshift-monitoring                               prometheus-adapter-7586b977cb-cv44c                       1/1     Running     0          8h
openshift-monitoring                               prometheus-adapter-7586b977cb-vpjfv                       1/1     Running     0          8h
openshift-monitoring                               prometheus-k8s-0                                          7/7     Running     1          8h
openshift-monitoring                               prometheus-k8s-1                                          7/7     Running     1          8h
openshift-monitoring                               prometheus-operator-599d68ffbf-wvg5w                      2/2     Running     0          8h
openshift-monitoring                               telemeter-client-767f4f8d6b-7649d                         3/3     Running     0          8h
openshift-monitoring                               thanos-querier-84bcffdd-h7dj6                             5/5     Running     0          8h
openshift-monitoring                               thanos-querier-84bcffdd-ndznd                             5/5     Running     0          8h
openshift-multus                                   multus-57tn4                                              1/1     Running     0          8h
openshift-multus                                   multus-additional-cni-plugins-dbvq2                       1/1     Running     0          8h
openshift-multus                                   multus-additional-cni-plugins-fkxg5                       1/1     Running     0          8h
openshift-multus                                   multus-additional-cni-plugins-wlzq8                       1/1     Running     0          8h
openshift-multus                                   multus-admission-controller-n7qcx                         2/2     Running     0          8h
openshift-multus                                   multus-admission-controller-v9vx6                         2/2     Running     0          8h
openshift-multus                                   multus-admission-controller-vlfn7                         2/2     Running     0          8h
openshift-multus                                   multus-n6dsg                                              1/1     Running     0          8h
openshift-multus                                   multus-p8bpq                                              1/1     Running     0          8h
openshift-multus                                   network-metrics-daemon-25jh8                              2/2     Running     0          8h
openshift-multus                                   network-metrics-daemon-jjpgw                              2/2     Running     0          8h
openshift-multus                                   network-metrics-daemon-tv555                              2/2     Running     0          8h
openshift-network-diagnostics                      network-check-source-6ccd7c5589-glnkg                     1/1     Running     0          8h
openshift-network-diagnostics                      network-check-target-5qf9j                                1/1     Running     0          8h
openshift-network-diagnostics                      network-check-target-sjmvb                                1/1     Running     0          8h
openshift-network-diagnostics                      network-check-target-thbzf                                1/1     Running     0          8h
openshift-network-operator                         network-operator-85544fbdbc-4nb5h                         1/1     Running     1          8h
openshift-operator-lifecycle-manager               catalog-operator-7bbb999f99-492vz                         1/1     Running     0          8h
openshift-operator-lifecycle-manager               olm-operator-7bfd55d5c7-swmzn                             1/1     Running     0          8h
openshift-operator-lifecycle-manager               packageserver-c8d74b46d-6j6sn                             1/1     Running     0          8h
openshift-operator-lifecycle-manager               packageserver-c8d74b46d-9j4gz                             1/1     Running     0          8h
openshift-roks-metrics                             metrics-5fb9d747f7-6mjh5                                  1/1     Running     0          8h
openshift-roks-metrics                             push-gateway-57868bfdb9-d5lq2                             1/1     Running     0          8h
openshift-service-ca-operator                      service-ca-operator-7f994cb49b-shkgm                      1/1     Running     1          8h
openshift-service-ca                               service-ca-847c7856dc-7tmwz                               1/1     Running     1          8h
tigera-operator                                    tigera-operator-7d896c66cd-qlbwt                          1/1     Running     4          7d7h
```

--- Additional comment from Jan Chaloupka on 2021-09-23 11:16:47 UTC ---

Thank you for all the provided data. Refactoring done in https://github.com/kubernetes/kubernetes/pull/100762 incorrectly constructs the list of pods. Opened a fix upstream in https://github.com/kubernetes/kubernetes/pull/105205.
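The failure signature - a per-pod log line that always names the same pod - is characteristic of a well-known Go pitfall: storing the address of the `range` loop variable, so every stored pointer aliases one variable that holds the last element once the loop ends. Whether the refactor in PR 100762 hit exactly this pattern is best confirmed in the linked PRs; the sketch below (with hypothetical `Pod` type and collect helpers) only illustrates the pattern and the conventional fix:

```go
package main

import "fmt"

// Pod is a stand-in for the pod objects the e2e helper iterates over.
type Pod struct{ Name string }

// buggyCollect appends the address of the range variable p. Before
// Go 1.22 a single p was reused across iterations, so every stored
// pointer ended up referring to the last pod in the slice - which
// would log the same pod name once per pod, as seen in this report.
func buggyCollect(pods []Pod) []*Pod {
	var out []*Pod
	for _, p := range pods {
		out = append(out, &p)
	}
	return out
}

// fixedCollect takes the address of the slice element itself, which
// is distinct for every index regardless of Go version.
func fixedCollect(pods []Pod) []*Pod {
	var out []*Pod
	for i := range pods {
		out = append(out, &pods[i])
	}
	return out
}

func main() {
	pods := []Pod{{"dns-default-d99dg"}, {"router-default-2px27"}, {"tigera-operator-7d896c66cd-qlbwt"}}
	for _, p := range fixedCollect(pods) {
		fmt.Println("Pod on the node:", p.Name)
	}
}
```

Go 1.22 changed the loop-variable semantics so each iteration gets a fresh variable, but indexing the slice (or copying the element explicitly) is the version-independent fix.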
This bug hasn't had any activity in the last 30 days. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. As such, we're marking this bug as "LifecycleStale" and decreasing the severity/priority. If you have further information on the current state of the bug, please update it, otherwise this bug can be closed in about 7 days. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant. Additionally, you can add LifecycleFrozen into Whiteboard if you think this bug should never be marked as stale. Please consult with bug assignee before you do that.
Still waiting for the rebase
The fix got pulled into 4.9 through https://github.com/openshift/kubernetes/pull/1048
Resolution of this ticket depends on resolution of the same issue in higher versions. I am waiting for higher version fixes to merge so the PR in https://github.com/openshift/origin/pull/26697 gets all the required permissions.
The LifecycleStale keyword was removed because the bug moved to QE. The bug assignee was notified.
After discussing with dev on how to validate this bug, I have learned that "Given this is impossible to see in the junit.xml (since it shows logs of the failed tests only by default) and given the bug was not about failing the test, you can move it to VERIFIED directly." Based on the above, moving the bug to the verified state.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.9.18 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:0279