Description of problem:

`[sig-scheduling] SchedulerPriorities [Serial] Pod should avoid nodes that have avoidPod annotation [Suite:openshift/conformance/serial] [Suite:k8s]` fails after 2m34s:

```
fail [k8s.io/kubernetes/test/e2e/framework/util.go:1355]: Expected
    <string>: ci-op-jjgjm1z5-09f59-jcx9s-worker-westus-jcmtc
not to equal
    <string>: ci-op-jjgjm1z5-09f59-jcx9s-worker-westus-jcmtc
```

https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-azure-serial-4.3/208

Related jobs: https://testgrid.k8s.io/redhat-openshift-release-4.3-informing-ocp#release-openshift-ocp-installer-e2e-azure-serial-4.3&show-stale-tests=
From `[sig-scheduling] SchedulerPriorities [Serial] Pod should avoid nodes that have avoidPod annotation` logs:

```
STEP: Scaling ReplicationController scheduler-priority-avoid-pod in namespace e2e-sched-priority-9709 to 2
STEP: Verify the pods should not scheduled to the node: ci-op-jjgjm1z5-09f59-jcx9s-worker-westus-jcmtc
Oct 23 02:07:21.488: INFO: Expect to be unequal: ci-op-jjgjm1z5-09f59-jcx9s-worker-westus-jcmtc and ci-op-jjgjm1z5-09f59-jcx9s-worker-westus-jcmtc
```

From the kube-scheduler logs:

```
I1023 02:07:11.102791 1 scheduler.go:667] pod e2e-sched-priority-9709/scheduler-priority-avoid-pod-k58k8 is bound successfully on node "ci-op-jjgjm1z5-09f59-jcx9s-worker-westus-shrs4", 6 nodes evaluated, 2 nodes were found feasible. Bound node resource: "Capacity: CPU<2>|Memory<8141556Ki>|Pods<250>|StorageEphemeral<133682156Ki>; Allocatable: CPU<1500m>|Memory<7527156Ki>|Pods<250>|StorageEphemeral<123201474766>.".
I1023 02:07:11.129010 1 scheduler.go:667] pod e2e-sched-priority-9709/scheduler-priority-avoid-pod-22s5k is bound successfully on node "ci-op-jjgjm1z5-09f59-jcx9s-worker-westus-jcmtc", 6 nodes evaluated, 1 nodes were found feasible. Bound node resource: "Capacity: CPU<2>|Memory<8141556Ki>|Pods<250>|StorageEphemeral<133682156Ki>; Allocatable: CPU<1500m>|Memory<7527156Ki>|Pods<250>|StorageEphemeral<123201474766>.".
```

What's happening here:

1. The "Pod should avoid nodes that have avoidPod annotation" test verifies that the scheduler does not schedule pods to a node that has the "scheduler.alpha.kubernetes.io/preferAvoidPods" annotation set.
2. It creates an RC that will produce the pods used for testing.
3. The test picks node "ci-op-jjgjm1z5-09f59-jcx9s-worker-westus-jcmtc" and annotates it so that the about-to-be-produced pods are expected not to be scheduled there.
4. The RC is scaled up and two new pods are created.
5. The first pod, "e2e-sched-priority-9709/scheduler-priority-avoid-pod-k58k8", is scheduled and bound, avoiding node "ci-op-jjgjm1z5-09f59-jcx9s-worker-westus-jcmtc" as expected.
6. The second pod, "e2e-sched-priority-9709/scheduler-priority-avoid-pod-22s5k", is scheduled and bound to node "ci-op-jjgjm1z5-09f59-jcx9s-worker-westus-jcmtc", which is unexpected.
7. The test fails, since the second pod is not supposed to be scheduled to node "ci-op-jjgjm1z5-09f59-jcx9s-worker-westus-jcmtc".

As the scheduler logs show, when the scheduler tries to schedule the second pod, only one feasible node is left (the one to be avoided), so it has no choice but to bind the pod to that node. Under the hood, the "scheduler.alpha.kubernetes.io/preferAvoidPods" annotation is taken into account only when node priorities are computed. When only one node is left, priorities have no effect. That is why node "ci-op-jjgjm1z5-09f59-jcx9s-worker-westus-jcmtc" is not excluded from the list of target nodes for the second pod.
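The filter-vs-score distinction above is the crux. A minimal sketch (illustrative only, not the real kube-scheduler code; node names shortened from the logs) of why a scoring-only signal such as the preferAvoidPods annotation can never exclude the last feasible node:

```python
# Sketch: filtering decides feasibility, scoring only ranks what survived
# filtering. A scoring signal cannot remove the sole remaining candidate.

def schedule(feasible_nodes, avoid_annotated):
    """Pick a node for a pod after the filter phase has run."""
    if not feasible_nodes:
        return None  # pod stays pending
    # Scoring: nodes carrying the avoid annotation score 0, others 10
    # (mirroring the old 0..10 priority range).
    scored = [(0 if n in avoid_annotated else 10, n) for n in feasible_nodes]
    scored.sort(reverse=True)
    return scored[0][1]

# With two feasible nodes, the annotated one is avoided...
assert schedule(["shrs4", "jcmtc"], {"jcmtc"}) == "shrs4"
# ...but when it is the only feasible node, it is still chosen.
assert schedule(["jcmtc"], {"jcmtc"}) == "jcmtc"
```

This matches the logs: the first pod saw 2 feasible nodes and avoided "jcmtc"; the second saw only 1 and landed on it.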
Checking another part of the logs:

```
Oct 23 02:06:32.708: INFO: ComputeCPUMemFraction for node: ci-op-jjgjm1z5-09f59-jcx9s-worker-westus-jcmtc
Oct 23 02:06:32.816: INFO: Pod for on the node: tuned-87gz6, Cpu: 10, Mem: 52428800
Oct 23 02:06:32.816: INFO: Pod for on the node: dns-default-7nb87, Cpu: 110, Mem: 283115520
Oct 23 02:06:32.816: INFO: Pod for on the node: node-ca-gkpn5, Cpu: 10, Mem: 10485760
Oct 23 02:06:32.816: INFO: Pod for on the node: machine-config-daemon-2n4sr, Cpu: 40, Mem: 104857600
Oct 23 02:06:32.816: INFO: Pod for on the node: node-exporter-klgq5, Cpu: 110, Mem: 230686720
Oct 23 02:06:32.816: INFO: Pod for on the node: multus-5dtkw, Cpu: 10, Mem: 157286400
Oct 23 02:06:32.816: INFO: Pod for on the node: ovs-5t74s, Cpu: 200, Mem: 419430400
Oct 23 02:06:32.816: INFO: Pod for on the node: sdn-b2mml, Cpu: 100, Mem: 209715200
Oct 23 02:06:32.816: INFO: Node: ci-op-jjgjm1z5-09f59-jcx9s-worker-westus-jcmtc, totalRequestedCPUResource: 690, cpuAllocatableMil: 1500, cpuFraction: 0.46
Oct 23 02:06:32.816: INFO: Node: ci-op-jjgjm1z5-09f59-jcx9s-worker-westus-jcmtc, totalRequestedMemResource: 1572864000, memAllocatableVal: 7707807744, memFraction: 0.20406113544079596
Oct 23 02:06:32.816: INFO: ComputeCPUMemFraction for node: ci-op-jjgjm1z5-09f59-jcx9s-worker-westus-shrs4
Oct 23 02:06:32.927: INFO: Pod for on the node: tuned-x8hjb, Cpu: 10, Mem: 52428800
Oct 23 02:06:32.927: INFO: Pod for on the node: dns-default-h74fz, Cpu: 110, Mem: 283115520
Oct 23 02:06:32.927: INFO: Pod for on the node: node-ca-h4mpb, Cpu: 10, Mem: 10485760
Oct 23 02:06:32.927: INFO: Pod for on the node: router-default-64f8bcb478-v6sxh, Cpu: 100, Mem: 268435456
Oct 23 02:06:32.927: INFO: Pod for on the node: machine-config-daemon-27fm4, Cpu: 40, Mem: 104857600
Oct 23 02:06:32.927: INFO: Pod for on the node: alertmanager-main-0, Cpu: 300, Mem: 445644800
Oct 23 02:06:32.927: INFO: Pod for on the node: alertmanager-main-1, Cpu: 300, Mem: 445644800
Oct 23 02:06:32.927: INFO: Pod for on the node: grafana-65cd8c7fff-qwhf5, Cpu: 200, Mem: 314572800
Oct 23 02:06:32.927: INFO: Pod for on the node: node-exporter-vhfmc, Cpu: 110, Mem: 230686720
Oct 23 02:06:32.927: INFO: Pod for on the node: prometheus-adapter-5f5d9cfd57-gc5bz, Cpu: 10, Mem: 20971520
Oct 23 02:06:32.927: INFO: Pod for on the node: prometheus-k8s-1, Cpu: 530, Mem: 1398800384
Oct 23 02:06:32.927: INFO: Pod for on the node: telemeter-client-756d9dddc4-xhwcz, Cpu: 210, Mem: 440401920
Oct 23 02:06:32.927: INFO: Pod for on the node: multus-rcvkr, Cpu: 10, Mem: 157286400
Oct 23 02:06:32.927: INFO: Pod for on the node: ovs-6gxnm, Cpu: 200, Mem: 419430400
Oct 23 02:06:32.927: INFO: Pod for on the node: sdn-4d2kn, Cpu: 100, Mem: 209715200
Oct 23 02:06:32.927: INFO: Node: ci-op-jjgjm1z5-09f59-jcx9s-worker-westus-shrs4, totalRequestedCPUResource: 2340, cpuAllocatableMil: 1500, cpuFraction: 1
Oct 23 02:06:32.927: INFO: Node: ci-op-jjgjm1z5-09f59-jcx9s-worker-westus-shrs4, totalRequestedMemResource: 4907335680, memAllocatableVal: 7707807744, memFraction: 0.6366707425752834
Oct 23 02:06:32.927: INFO: ComputeCPUMemFraction for node: ci-op-jjgjm1z5-09f59-jcx9s-worker-westus-tfbnm
Oct 23 02:06:33.033: INFO: Pod for on the node: tuned-272r5, Cpu: 10, Mem: 52428800
Oct 23 02:06:33.033: INFO: Pod for on the node: dns-default-t6h9n, Cpu: 110, Mem: 283115520
Oct 23 02:06:33.033: INFO: Pod for on the node: image-registry-66cf94d4cd-2bwb2, Cpu: 100, Mem: 268435456
Oct 23 02:06:33.033: INFO: Pod for on the node: node-ca-xvk4j, Cpu: 10, Mem: 10485760
Oct 23 02:06:33.033: INFO: Pod for on the node: router-default-64f8bcb478-gfcft, Cpu: 100, Mem: 268435456
Oct 23 02:06:33.033: INFO: Pod for on the node: machine-config-daemon-gz48t, Cpu: 40, Mem: 104857600
Oct 23 02:06:33.033: INFO: Pod for on the node: certified-operators-6d6bbf5f54-4n7vv, Cpu: 10, Mem: 104857600
Oct 23 02:06:33.033: INFO: Pod for on the node: community-operators-6bc445f7db-d2b9f, Cpu: 10, Mem: 104857600
Oct 23 02:06:33.033: INFO: Pod for on the node: redhat-operators-79644f5cf-tr85m, Cpu: 10, Mem: 104857600
Oct 23 02:06:33.033: INFO: Pod for on the node: alertmanager-main-2, Cpu: 300, Mem: 445644800
Oct 23 02:06:33.033: INFO: Pod for on the node: kube-state-metrics-5db5d576f6-ncrx7, Cpu: 30, Mem: 125829120
Oct 23 02:06:33.033: INFO: Pod for on the node: node-exporter-p4fsk, Cpu: 110, Mem: 230686720
Oct 23 02:06:33.033: INFO: Pod for on the node: openshift-state-metrics-64bf9548b6-245k7, Cpu: 120, Mem: 199229440
Oct 23 02:06:33.033: INFO: Pod for on the node: prometheus-adapter-5f5d9cfd57-tvkcg, Cpu: 10, Mem: 20971520
Oct 23 02:06:33.033: INFO: Pod for on the node: prometheus-k8s-0, Cpu: 530, Mem: 1398800384
Oct 23 02:06:33.033: INFO: Pod for on the node: multus-7zh95, Cpu: 10, Mem: 157286400
Oct 23 02:06:33.033: INFO: Pod for on the node: ovs-z4p7x, Cpu: 200, Mem: 419430400
Oct 23 02:06:33.033: INFO: Pod for on the node: sdn-ck9mw, Cpu: 100, Mem: 209715200
Oct 23 02:06:33.033: INFO: Node: ci-op-jjgjm1z5-09f59-jcx9s-worker-westus-tfbnm, totalRequestedCPUResource: 1910, cpuAllocatableMil: 1500, cpuFraction: 1
Oct 23 02:06:33.033: INFO: Node: ci-op-jjgjm1z5-09f59-jcx9s-worker-westus-tfbnm, totalRequestedMemResource: 4614782976, memAllocatableVal: 7707807744, memFraction: 0.5987153713832953
```

The interesting lines:

```
Oct 23 02:06:32.927: INFO: Node: ci-op-jjgjm1z5-09f59-jcx9s-worker-westus-shrs4, totalRequestedCPUResource: 2340, cpuAllocatableMil: 1500, cpuFraction: 1
Oct 23 02:06:33.033: INFO: Node: ci-op-jjgjm1z5-09f59-jcx9s-worker-westus-tfbnm, totalRequestedCPUResource: 1910, cpuAllocatableMil: 1500, cpuFraction: 1
```

Both nodes have more CPU requested than is allocatable; in other words, both nodes are running more pods than they should be.
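For reference, the cpuFraction values quoted above are just the ratio of summed pod CPU requests to allocatable CPU, capped at 1. A sketch (the function name here is illustrative; the real helper lives in the k8s e2e scheduling suite):

```python
# cpuFraction as reported in the logs: requested/allocatable, capped at 1,
# so any overcommitted node shows up as exactly 1.

def cpu_fraction(total_requested_milli, allocatable_milli):
    return min(total_requested_milli / allocatable_milli, 1.0)

# shrs4: 2340m requested against 1500m allocatable -> reported as 1
assert cpu_fraction(2340, 1500) == 1.0
# jcmtc: 690m requested against 1500m allocatable -> 0.46
assert round(cpu_fraction(690, 1500), 2) == 0.46
```

The cap is why the logs cannot show how far over 100% a node is; the raw totalRequestedCPUResource value carries that information.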
After the nodes are resource balanced (which in this case means no node has any CPU left) we get:

```
Oct 23 02:07:03.866: INFO: Node: ci-op-jjgjm1z5-09f59-jcx9s-worker-westus-jcmtc, totalRequestedCPUResource: 1500, cpuAllocatableMil: 1500, cpuFraction: 1
Oct 23 02:07:04.041: INFO: Node: ci-op-jjgjm1z5-09f59-jcx9s-worker-westus-shrs4, totalRequestedCPUResource: 2340, cpuAllocatableMil: 1500, cpuFraction: 1
Oct 23 02:07:04.151: INFO: Node: ci-op-jjgjm1z5-09f59-jcx9s-worker-westus-tfbnm, totalRequestedCPUResource: 1910, cpuAllocatableMil: 1500, cpuFraction: 1
```

Clearly, no node has any CPU left for scheduling. Yet the scheduler was still able to bind pods to "ci-op-jjgjm1z5-09f59-jcx9s-worker-westus-shrs4" and "ci-op-jjgjm1z5-09f59-jcx9s-worker-westus-jcmtc". The scheduler's nodeInfo cache might be outdated and not account for all pods scheduled on a particular node. Given that the cache is updated by event-handler reactions, some of the pod Add events could have been dropped due to the enormous number of events generated at a single point in time. Though this is just speculation.

I was not able to find how much CPU is requested in the cases where the test goes green, but the expected scheduler output should look like [1]:

```
I1022 22:25:41.424393 1 scheduler.go:667] pod e2e-sched-priority-7740/scheduler-priority-avoid-pod-nvvsq is bound successfully on node "ci-op-x376v89r-09f59-4pq2c-worker-centralus2-pvv9c", 6 nodes evaluated, 2 nodes were found feasible. Bound node resource: "Capacity: CPU<2>|Memory<8140500Ki>|Pods<250>|StorageEphemeral<133682156Ki>; Allocatable: CPU<1500m>|Memory<7526100Ki>|Pods<250>|StorageEphemeral<123201474766>.".
I1022 22:25:41.443855 1 scheduler.go:667] pod e2e-sched-priority-7740/scheduler-priority-avoid-pod-zg26l is bound successfully on node "ci-op-x376v89r-09f59-4pq2c-worker-centralus2-pvv9c", 6 nodes evaluated, 2 nodes were found feasible. Bound node resource: "Capacity: CPU<2>|Memory<8140500Ki>|Pods<250>|StorageEphemeral<133682156Ki>; Allocatable: CPU<1500m>|Memory<7526100Ki>|Pods<250>|StorageEphemeral<123201474766>.".
```

In any case, even when the scheduler binds only a single pod to "ci-op-jjgjm1z5-09f59-jcx9s-worker-westus-shrs4", that node has run out of free CPU. The test assumes there is still sufficient CPU and memory left on each node after they are resource balanced, which is not the case here.

[1] From https://storage.googleapis.com/origin-ci-test/logs/release-openshift-ocp-installer-e2e-azure-serial-4.3/207/artifacts/e2e-azure-serial/pods/openshift-kube-scheduler_openshift-kube-scheduler-ci-op-x376v89r-09f59-4pq2c-master-2_scheduler.log
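The stale-cache speculation can be sketched as follows (illustrative only, not the real kube-scheduler cache code): the CPU fit check compares the cached requested total plus the incoming pod's request against allocatable, so if some pod Add events were missed, the cached total undercounts reality and an overcommitted node still looks feasible.

```python
# Sketch of a CPU fit check against a possibly stale nodeInfo cache.
# All values in millicores; names are illustrative.

def fits_cpu(cached_requested_milli, pod_request_milli, allocatable_milli):
    """Node passes the filter if cached requests + pod request fit."""
    return cached_requested_milli + pod_request_milli <= allocatable_milli

# Real total on the node is 1500m (fully committed), so nothing fits...
assert not fits_cpu(1500, 100, 1500)
# ...but if the cache missed 200m worth of pod Add events, the same
# node still appears feasible and the pod gets bound anyway.
assert fits_cpu(1300, 100, 1500)
```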
Also, from https://gcsweb-ci.svc.ci.openshift.org/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-serial-4.1/818/artifacts/e2e-aws-serial/junit/ we can see the following computations:

```
May 17 17:17:08.075: INFO: ComputeCpuMemFraction for node: ip-10-0-128-108.ec2.internal
May 17 17:17:08.201: INFO: Pod for on the node: tuned-r2drh, Cpu: 100, Mem: 209715200
May 17 17:17:08.201: INFO: Pod for on the node: dns-default-rr5fz, Cpu: 110, Mem: 283115520
May 17 17:17:08.201: INFO: Pod for on the node: image-registry-669565b768-ng6gd, Cpu: 100, Mem: 268435456
May 17 17:17:08.201: INFO: Pod for on the node: node-ca-vhqwd, Cpu: 100, Mem: 209715200
May 17 17:17:08.201: INFO: Pod for on the node: router-default-54d85f8787-f8pxb, Cpu: 100, Mem: 268435456
May 17 17:17:08.201: INFO: Pod for on the node: machine-config-daemon-x666j, Cpu: 20, Mem: 52428800
May 17 17:17:08.201: INFO: Pod for on the node: certified-operators-579dfd4c79-xt5js, Cpu: 100, Mem: 209715200
May 17 17:17:08.201: INFO: Pod for on the node: community-operators-67448849fd-9289n, Cpu: 100, Mem: 209715200
May 17 17:17:08.201: INFO: Pod for on the node: redhat-operators-5f6c48b6dd-8w297, Cpu: 100, Mem: 209715200
May 17 17:17:08.201: INFO: Pod for on the node: alertmanager-main-0, Cpu: 300, Mem: 629145600
May 17 17:17:08.201: INFO: Pod for on the node: alertmanager-main-2, Cpu: 300, Mem: 629145600
May 17 17:17:08.201: INFO: Pod for on the node: kube-state-metrics-dcb5d99cc-cfpjl, Cpu: 300, Mem: 629145600
May 17 17:17:08.201: INFO: Pod for on the node: node-exporter-rzxbd, Cpu: 110, Mem: 230686720
May 17 17:17:08.201: INFO: Pod for on the node: prometheus-adapter-85476c4bbd-c2jpn, Cpu: 100, Mem: 209715200
May 17 17:17:08.201: INFO: Pod for on the node: prometheus-k8s-0, Cpu: 600, Mem: 1258291200
May 17 17:17:08.201: INFO: Pod for on the node: telemeter-client-76c555c857-65hbv, Cpu: 210, Mem: 440401920
May 17 17:17:08.201: INFO: Pod for on the node: multus-lgvtp, Cpu: 100, Mem: 209715200
May 17 17:17:08.201: INFO: Pod for on the node: ovs-fwr8d, Cpu: 200, Mem: 419430400
May 17 17:17:08.201: INFO: Pod for on the node: sdn-cvknz, Cpu: 100, Mem: 209715200
May 17 17:17:08.201: INFO: Node: ip-10-0-128-108.ec2.internal, totalRequestedCpuResource: 1650, cpuAllocatableMil: 1500, cpuFraction: 1
May 17 17:17:08.201: INFO: Node: ip-10-0-128-108.ec2.internal, totalRequestedMemResource: 3535798272, memAllocatableVal: 7729664000, memFraction: 0.4574323375505067
May 17 17:17:08.201: INFO: ComputeCpuMemFraction for node: ip-10-0-131-112.ec2.internal
May 17 17:17:08.317: INFO: Pod for on the node: tuned-vnkwf, Cpu: 100, Mem: 209715200
May 17 17:17:08.317: INFO: Pod for on the node: dns-default-l5486, Cpu: 110, Mem: 283115520
May 17 17:17:08.317: INFO: Pod for on the node: node-ca-7nkmq, Cpu: 100, Mem: 209715200
May 17 17:17:08.317: INFO: Pod for on the node: machine-config-daemon-rdjdz, Cpu: 20, Mem: 52428800
May 17 17:17:08.317: INFO: Pod for on the node: node-exporter-tf28q, Cpu: 110, Mem: 230686720
May 17 17:17:08.317: INFO: Pod for on the node: multus-ff4rj, Cpu: 100, Mem: 209715200
May 17 17:17:08.317: INFO: Pod for on the node: ovs-5lqcp, Cpu: 200, Mem: 419430400
May 17 17:17:08.317: INFO: Pod for on the node: sdn-plsrf, Cpu: 100, Mem: 209715200
May 17 17:17:08.317: INFO: Node: ip-10-0-131-112.ec2.internal, totalRequestedCpuResource: 640, cpuAllocatableMil: 1500, cpuFraction: 0.4266666666666667
May 17 17:17:08.317: INFO: Node: ip-10-0-131-112.ec2.internal, totalRequestedMemResource: 1300234240, memAllocatableVal: 7729655808, memFraction: 0.16821373063652978
May 17 17:17:08.317: INFO: ComputeCpuMemFraction for node: ip-10-0-145-122.ec2.internal
May 17 17:17:08.431: INFO: Pod for on the node: tuned-8gcc9, Cpu: 100, Mem: 209715200
May 17 17:17:08.431: INFO: Pod for on the node: dns-default-j9qxd, Cpu: 110, Mem: 283115520
May 17 17:17:08.431: INFO: Pod for on the node: node-ca-ncgsr, Cpu: 100, Mem: 209715200
May 17 17:17:08.431: INFO: Pod for on the node: router-default-54d85f8787-jlv84, Cpu: 100, Mem: 268435456
May 17 17:17:08.431: INFO: Pod for on the node: machine-config-daemon-6vpmt, Cpu: 20, Mem: 52428800
May 17 17:17:08.431: INFO: Pod for on the node: alertmanager-main-1, Cpu: 300, Mem: 629145600
May 17 17:17:08.431: INFO: Pod for on the node: grafana-d77564f95-rhtsn, Cpu: 200, Mem: 314572800
May 17 17:17:08.431: INFO: Pod for on the node: node-exporter-hgvd5, Cpu: 110, Mem: 230686720
May 17 17:17:08.431: INFO: Pod for on the node: prometheus-adapter-85476c4bbd-d8ktz, Cpu: 100, Mem: 209715200
May 17 17:17:08.431: INFO: Pod for on the node: prometheus-k8s-1, Cpu: 600, Mem: 1258291200
May 17 17:17:08.431: INFO: Pod for on the node: prometheus-operator-6d494fb54-kknqt, Cpu: 100, Mem: 209715200
May 17 17:17:08.431: INFO: Pod for on the node: multus-v8t4s, Cpu: 100, Mem: 209715200
May 17 17:17:08.431: INFO: Pod for on the node: ovs-hcm8t, Cpu: 200, Mem: 419430400
May 17 17:17:08.431: INFO: Pod for on the node: sdn-cdj5q, Cpu: 100, Mem: 209715200
May 17 17:17:08.431: INFO: Node: ip-10-0-145-122.ec2.internal, totalRequestedCpuResource: 1240, cpuAllocatableMil: 1500, cpuFraction: 0.8266666666666667
May 17 17:17:08.431: INFO: Node: ip-10-0-145-122.ec2.internal, totalRequestedMemResource: 2512388096, memAllocatableVal: 7729664000, memFraction: 0.32503199311121417
```

Again, no CPU resource is left:

```
May 17 17:17:08.201: INFO: Node: ip-10-0-128-108.ec2.internal, totalRequestedCpuResource: 1650, cpuAllocatableMil: 1500, cpuFraction: 1
```

Fortunately, there are two more nodes with sufficient CPU, so the test would most likely pass.
*** Bug 1765736 has been marked as a duplicate of this bug. ***
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-vsphere-upi-serial-4.2/152
Thanks, Hongkai Liu, for the artifacts. This one actually failed for a different reason:

```
Nov 21 16:51:22.815: INFO: Waiting for running...
Nov 21 17:01:22.903: INFO: Unexpected error occurred: Error waiting for 1 pods to be running - probably a timeout: Timeout while waiting for pods with labels "startPodsID=2616505a-0c7f-11ea-851d-0a58ac102307" to be running
[AfterEach] [sig-scheduling] SchedulerPriorities [Serial]
/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/test/e2e/framework/framework.go:150
```

It's failing while balancing the nodes, which is not related to this flake.
The test flakes due to insufficient CPU resources. There is not much we can do about it right now, unless we restrict the requested CPU of the core pods deployed in a vanilla cluster (see https://github.com/openshift/origin/issues/24247). Moving to 4.4 to give more time to address #24247.
Can't find any failing `Pod should avoid nodes that have avoidPod annotation` test under https://search.svc.ci.openshift.org/?search=validates+resource+limits+of+pods+that+are+allowed+to+run&maxAge=336h&context=2&type=all for the last two weeks. Closing for now until the test starts to flake/fail again.
Wrong link, sorry: https://search.svc.ci.openshift.org/?search=Pod+should+avoid+nodes+that+have+avoidPod+annotation&maxAge=336h&context=2&type=all

- build-log from build release-openshift-ocp-installer-e2e-azure-serial-4.4 #392 (9 hours ago):

  ```
  I0114 02:10:11.384975 1 scheduler.go:751] pod e2e-sched-priority-6202/scheduler-priority-avoid-pod-55x27 is bound successfully on node "ci-op-pdmlmqdf-0ba00-lcf4l-worker-westus-54srq", 6 nodes evaluated, 1 nodes were found feasible.
  ```

- build-log from build release-openshift-ocp-installer-e2e-azure-fips-serial-4.3 #1 (13 days ago):

  ```
  I0101 01:37:12.280041 1 scheduler.go:667] pod e2e-sched-priority-1186/scheduler-priority-avoid-pod-r8whc is bound successfully on node "ci-op-9tzd46x9-6e04f-cwsrj-worker-centralus1-64txk", 6 nodes evaluated, 1 nodes were found feasible.
  ```

Let's keep it open and revisit in the next 14 days.
This happened on 4.1 too: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-serial-4.1/1362
Checking the failing jobs from the last 6 days:

```
I0225 10:07:54.409310 1 scheduler.go:751] pod e2e-sched-priority-3913/scheduler-priority-avoid-pod-7szvs is bound successfully on node "ci-op-91hqv97v-0ba00-wwk6q-worker-centralus1-x6xm7", 6 nodes evaluated, 1 nodes were found feasible.
I0225 10:07:54.442035 1 scheduler.go:751] pod e2e-sched-priority-3913/scheduler-priority-avoid-pod-5hql8 is bound successfully on node "ci-op-91hqv97v-0ba00-wwk6q-worker-centralus1-x6xm7", 6 nodes evaluated, 1 nodes were found
```

```
I0225 08:06:58.819539 1 scheduler.go:751] pod e2e-sched-priority-364/scheduler-priority-avoid-pod-xjvm2 is bound successfully on node "ci-op-z28ltq2w-0ba00-5tqcn-worker-eastus23-h2kr6", 6 nodes evaluated, 2 nodes were found feasible.
I0225 08:06:58.849374 1 scheduler.go:751] pod e2e-sched-priority-364/scheduler-priority-avoid-pod-46wg9 is bound successfully on node "ci-op-z28ltq2w-0ba00-5tqcn-worker-eastus21-dh52m", 6 nodes evaluated, 1 nodes were found
```

```
I0225 07:38:08.988402 1 scheduler.go:751] pod e2e-sched-priority-2242/scheduler-priority-avoid-pod-gjrqh is bound successfully on node "ci-op-1j8nrg3s-0ba00-7hnmx-worker-centralus1-mhzxn", 6 nodes evaluated, 1 nodes were found feasible.
I0225 07:38:09.018183 1 factory.go:453] Unable to schedule e2e-sched-priority-2242/scheduler-priority-avoid-pod-2985r: no fit: 0/6 nodes are available: 3 Insufficient cpu, 3 node(s) had taints that the pod didn't tolerate.; waiting
I0225 07:38:09.046326 1 factory.go:453] Unable to schedule e2e-sched-priority-2242/scheduler-priority-avoid-pod-2985r: no fit: 0/6 nodes are available: 3 Insufficient cpu, 3 node(s) had taints that the pod didn't tolerate.; waiting
I0225 07:39:09.168416 1 factory.go:453] Unable to schedule e2e-sched-priority-2242/scheduler-priority-avoid-pod-2985r: no fit: 0/6 nodes are available: 3 Insufficient cpu, 3 node(s) had taints that the pod didn't tolerate.; waiting
I0225 07:40:39.169323 1 factory.go:453] Unable to schedule e2e-sched-priority-2242/scheduler-priority-avoid-pod-2985r: no fit: 0/6 nodes are available: 3 Insufficient cpu, 3 node(s) had taints that the pod didn't tolerate.; waiting
I0225 07:42:09.170004 1 factory.go:453] Unable to schedule e2e-sched-priority-2242/scheduler-priority-avoid-pod-2985r: no fit: 0/6 nodes are available: 3 Insufficient cpu, 3 node(s) had taints that the pod didn't tolerate.; waiting
I0225 07:43:39.170557 1 factory.go:453] Unable to schedule e2e-sched-priority-2242/scheduler-priority-avoid-pod-2985r: no fit: 0/6 nodes are available: 3 Insufficient cpu, 3 node(s) had taints that the pod didn't tolerate.; waiting
I0225 07:45:09.171512 1 factory.go:453] Unable to schedule e2e-sched-priority-2242/scheduler-priority-avoid-pod-2985r: no fit: 0/6 nodes are available: 3 Insufficient cpu, 3 node(s) had taints that the pod didn't tolerate.; waiting
I0225 07:46:39.172203 1 factory.go:453] Unable to schedule e2e-sched-priority-2242/scheduler-priority-avoid-pod-2985r: no fit: 0/6 nodes are available: 3 Insufficient cpu, 3 node(s) had taints that the pod didn't tolerate.; waiting
I0225 07:48:09.172787 1 factory.go:453] Unable to schedule e2e-sched-priority-2242/scheduler-priority-avoid-pod-2985r: no fit: 0/6 nodes are available: 3 Insufficient cpu, 3 node(s) had taints that the pod didn't tolerate.; waiting
```

```
I0225 05:21:02.838452 1 scheduler.go:751] pod e2e-sched-priority-1402/scheduler-priority-avoid-pod-gbvvn is bound successfully on node "ci-op-53797g79-0ba00-c25tb-worker-centralus1-r5cft", 6 nodes evaluated, 1 nodes were found feasible.
I0225 05:21:02.869585 1 scheduler.go:751] pod e2e-sched-priority-1402/scheduler-priority-avoid-pod-5xn4r is bound successfully on node "ci-op-53797g79-0ba00-c25tb-worker-centralus1-r5cft", 6 nodes evaluated, 1 nodes were found feasible.
```

```
I0225 03:23:02.135507 1 scheduler.go:751] pod e2e-sched-priority-6667/scheduler-priority-avoid-pod-gmz6c is bound successfully on node "ci-op-1y6q8dy2-0ba00-9clsl-worker-centralus3-265qr", 6 nodes evaluated, 2 nodes were found feasible.
I0225 03:23:02.174187 1 scheduler.go:751] pod e2e-sched-priority-6667/scheduler-priority-avoid-pod-4b74q is bound successfully on node "ci-op-1y6q8dy2-0ba00-9clsl-worker-centralus1-cvd26", 6 nodes evaluated, 1 nodes were found feasible.
```

Still the same reason as in the previous cases: only one node is left, so priorities are not taken into account.
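As an aside, the "0/6 nodes are available: ..." messages above are a tally of per-node filter failures; a rough sketch of how such a message is assembled (illustrative only, not the actual kube-scheduler code):

```python
# Sketch: aggregate per-node filter-failure reasons into a "no fit" message
# shaped like the factory.go log lines quoted above.
from collections import Counter

def no_fit_message(node_failures):
    """node_failures maps node name -> the filter reason it failed with."""
    counts = Counter(node_failures.values())
    reasons = ", ".join(f"{n} {r}" for r, n in sorted(counts.items()))
    return f"0/{len(node_failures)} nodes are available: {reasons}."

failures = {
    "worker-1": "Insufficient cpu",
    "worker-2": "Insufficient cpu",
    "worker-3": "Insufficient cpu",
    "master-1": "node(s) had taints that the pod didn't tolerate",
    "master-2": "node(s) had taints that the pod didn't tolerate",
    "master-3": "node(s) had taints that the pod didn't tolerate",
}
assert no_fit_message(failures) == (
    "0/6 nodes are available: 3 Insufficient cpu, "
    "3 node(s) had taints that the pod didn't tolerate."
)
```

So in the e2e-sched-priority-2242 run, all three workers failed the CPU fit filter and only the tainted masters remained, leaving the pod pending.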
Seeing this in 4.4 azure as well: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-azure-serial-4.4/1280
This is the top flake on our Azure 4.4 release-informing job; raising severity. https://testgrid.k8s.io/redhat-openshift-ocp-release-4.4-informing#release-openshift-ocp-installer-e2e-azure-serial-4.4&sort-by-flakiness=
this is a top flake on our 4.4 release informing e2e-azure-serial job.
Checking the last failed job, https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-azure-serial-4.4/1283, the cause of the failure is exactly the same.

junit.xml logs:

```
STEP: Verify the pods should not scheduled to the node: ci-op-2dw9ts2x-0ba00-gblxl-worker-westus-d5pjv
...
Mar 11 05:22:34.139: INFO: Deleting ReplicationController scheduler-priority-avoid-pod took: 84.031989ms
Mar 11 05:22:34.340: INFO: Terminating ReplicationController scheduler-priority-avoid-pod pods took: 200.390071ms
```

kube-scheduler logs:

```
I0311 05:22:28.321088 1 scheduler.go:751] pod e2e-sched-priority-2647/scheduler-priority-avoid-pod-tf55c is bound successfully on node "ci-op-2dw9ts2x-0ba00-gblxl-worker-westus-d5pjv", 6 nodes evaluated, 1 nodes were found feasible.
I0311 05:22:28.362868 1 scheduler.go:751] pod e2e-sched-priority-2647/scheduler-priority-avoid-pod-8lmcs is bound successfully on node "ci-op-2dw9ts2x-0ba00-gblxl-worker-westus-d5pjv", 6 nodes evaluated, 1 nodes were found
```

So only a single node with sufficient capacity is left for scheduling. There's nothing to be done on the scheduling side of the tests; the Azure worker nodes simply do not have sufficient resources to run all the tests.
This is not a release blocker, moving to 4.5
> There's nothing to be done on the scheduling side of the tests. Azure worker nodes just do not have sufficient resources to run all the tests.

Please request that the Azure CI cluster node size/count be increased, disable this test on Azure, or pursue another solution. (Given this is happening in the "serial" suite, I'm surprised it's a pure resource issue... this should be the only test running at the time it runs.)
Based on https://testgrid.k8s.io/redhat-openshift-ocp-release-4.4-informing#release-openshift-ocp-installer-e2e-azure-serial-4.4, the "Pod should avoid nodes that have avoidPod annotation" test is now flaking very rarely. I recall we merged some PRs reducing resource requests for core components, e.g. the scheduler, apiserver, etc. The same holds for https://testgrid.k8s.io/redhat-openshift-ocp-release-4.5-informing#release-openshift-ocp-installer-e2e-azure-serial-4.5. I can't say the same about https://testgrid.k8s.io/redhat-openshift-ocp-release-4.3-informing#release-openshift-ocp-installer-e2e-azure-serial-4.3; we need to wait until https://github.com/openshift/cluster-kube-scheduler-operator/pull/239 and the other PRs from https://bugzilla.redhat.com/show_bug.cgi?id=1822770 get merged.
Once the PRs in https://bugzilla.redhat.com/show_bug.cgi?id=1822770 merge, 4.3 can be verified as well. Switching to MODIFIED so 4.5 and 4.4 can be verified, as the target release is set to 4.5.
Verified the bug in the 4.4 & 4.5 CI and did not see any failures for about 4-5 days; hence, moving this to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2409