Bug 1999285

Summary: Scheduling conformance tests are failing due to incorrect CPU and memory calculations in the test script
Product: OpenShift Container Platform
Component: kube-scheduler
Version: 4.8
Reporter: smitha.subbarao
Assignee: Jan Chaloupka <jchaloup>
QA Contact: RamaKasturi <knarra>
CC: aos-bugs, mfojtik
Severity: medium
Priority: medium
Status: CLOSED DUPLICATE
Hardware: Unspecified
OS: Unspecified
Type: Bug
Last Closed: 2021-09-15 11:26:45 UTC

Description smitha.subbarao 2021-08-30 19:17:52 UTC
Description of problem:
The following sig-scheduling conformance tests keep failing because the test script miscalculates the CPU and memory requests of the pods on each node:

[sig-scheduling] SchedulerPriorities [Serial] Pod should avoid nodes that have avoidPod annotation [Suite:openshift/conformance/serial] [Suite:k8s]
[sig-scheduling] SchedulerPriorities [Serial] Pod should be preferably scheduled to nodes pod can tolerate [Suite:openshift/conformance/serial] [Suite:k8s]
[sig-scheduling] SchedulerPriorities [Serial] PodTopologySpread Scoring validates pod should be preferably scheduled to node which makes the matching pods more evenly distributed [Suite:openshift/conformance/serial] [Suite:k8s]


Version-Release number of selected component (if applicable):
OCP 4.8


How reproducible:
Always

Steps to Reproduce:
1. Create an OpenShift 4.8 cluster.
2. Run the sig-scheduling tests listed above.
3. In another window, observe the per-pod CPU and memory values the test logs. The calculated values are far higher than what the pods actually request (a cross-check sketch follows this list).
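
For comparison, the per-node request totals can be recomputed independently. Below is a minimal client-go sketch (a hypothetical helper, not part of the conformance suite; the node name is taken from the logs further down) that lists every non-terminated pod on the node and sums the container CPU and memory requests:

```
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Load kubeconfig from the default location (~/.kube/config).
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	// Non-terminated pods on the node under test, across all namespaces.
	pods, err := clientset.CoreV1().Pods("").List(context.TODO(), metav1.ListOptions{
		FieldSelector: "spec.nodeName=10.5.149.223,status.phase!=Succeeded,status.phase!=Failed",
	})
	if err != nil {
		panic(err)
	}

	var totalCPU, totalMem int64
	for _, p := range pods.Items {
		var cpu, mem int64
		for _, c := range p.Spec.Containers {
			cpu += c.Resources.Requests.Cpu().MilliValue()
			mem += c.Resources.Requests.Memory().Value()
		}
		fmt.Printf("%s/%s cpu=%dm mem=%d\n", p.Namespace, p.Name, cpu, mem)
		totalCPU += cpu
		totalMem += mem
	}
	fmt.Printf("node total: cpu=%dm mem=%d\n", totalCPU, totalMem)
}
```

This sums only explicit spec.Containers requests, while the e2e helper substitutes small non-zero defaults for pods that set no requests, so some divergence is expected; it should not, however, be anywhere near the gap reported below.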

Actual results:
The test fails. Instead of checking every pod in the cluster, the script repeatedly checks the same tigera-operator pod's CPU and memory requests, and the totals it derives from them are wrong.

Logs are shown below:
```
Aug 25 14:55:20.299: INFO: ComputeCPUMemFraction for node: 10.5.149.223
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Pod for on the node: tigera-operator-7d896c66cd-klhq5, Cpu: 100, Mem: 41943040
Aug 25 14:55:20.299: INFO: Node: 10.5.149.223, totalRequestedCPUResource: 4300, cpuAllocatableMil: 3910, cpuFraction: 1
Aug 25 14:55:20.299: INFO: Node: 10.5.149.223, totalRequestedMemResource: 1866465280, memAllocatableVal: 13808427008, memFraction: 0.13516856618923007
```
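
For reference, the totals in the last two lines: totalRequestedCPUResource is 4300 millicores against 3910 allocatable, so the requests already exceed allocatable CPU and the reported cpuFraction is 1; memFraction is 1866465280 / 13808427008 ≈ 0.135. A CPU total of 4300 is also consistent with the same 100-millicore pod being counted dozens of times. The following self-contained Go sketch (a hypothetical illustration, not the actual test/e2e/scheduling source) shows the kind of iteration bug that would produce exactly this log pattern:

```
package main

import "fmt"

// pod is a toy stand-in for a Kubernetes pod's resource requests.
type pod struct {
	name     string
	cpuMilli int64 // requested CPU in millicores
	memBytes int64 // requested memory in bytes
}

func main() {
	podsOnNode := []pod{
		{"tigera-operator-7d896c66cd-klhq5", 100, 41943040},
		{"some-other-pod-1", 250, 83886080}, // hypothetical
		{"some-other-pod-2", 150, 52428800}, // hypothetical
	}

	// BUG: the loop runs once per pod, but every iteration reads `stale`
	// instead of the loop's own element, so one pod's requests are logged
	// and summed len(podsOnNode) times -- the pattern seen in the log above.
	stale := podsOnNode[0]
	var totalCPU, totalMem int64
	for range podsOnNode {
		fmt.Printf("Pod for on the node: %s, Cpu: %d, Mem: %d\n",
			stale.name, stale.cpuMilli, stale.memBytes)
		totalCPU += stale.cpuMilli
		totalMem += stale.memBytes
	}

	// FIX: use the loop variable so each pod contributes its own requests.
	totalCPU, totalMem = 0, 0
	for _, p := range podsOnNode {
		totalCPU += p.cpuMilli
		totalMem += p.memBytes
	}
	fmt.Printf("totalRequestedCPUResource: %d, totalRequestedMemResource: %d\n",
		totalCPU, totalMem)
}
```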

Expected results:
The test checks every pod's CPU and memory values using correct calculations, and passes.


Additional info:


If we create a namespace "zzz" and the following pod in it, then the test passes:

```
kubectl apply -f - <<EOF
---
apiVersion: v1
kind: Pod
metadata:
  name: zzz
  namespace: zzz
spec:
  containers:
  - name: zzz
    image: us.icr.io/armada-master/pause:3.2
    resources:
      requests:
        cpu: 100m
        memory: 100Mi
EOF
```

The run then reports success:

```
1 pass, 0 skip (1m47s)
+ [[ 0 -eq 0 ]]
+ echo 'SUCCESS: PVG ocp_conformance.sh was successful. Test results are available in directory /tmp/ocp-conformance-z8w.'
SUCCESS: PVG ocp_conformance.sh was successful. Test results are available in directory /tmp/ocp-conformance-z8w.
```
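
A plausible but unconfirmed reading of this workaround: the extra pod carries explicit 100m/100Mi requests and lives in a namespace that sorts last, which appears to change which pod the test's loop repeatedly reads, letting the balancing step compute usable fractions. The actual root cause is tracked in bug 1999288 (see below).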

Comment 1 Jan Chaloupka 2021-09-15 11:26:45 UTC

*** This bug has been marked as a duplicate of bug 1999288 ***