Description of problem:

https://prow.k8s.io/view/gcs/origin-ci-test/logs/canary-openshift-ocp-installer-e2e-azure-serial-4.2/21
https://prow.k8s.io/view/gcs/origin-ci-test/logs/canary-openshift-ocp-installer-e2e-azure-serial-4.2/19
... etc.

fail [k8s.io/kubernetes/test/e2e/scheduling/preemption.go:318]: Unexpected error:
    <*errors.errorString | 0xc0002733f0>: {
        s: "timed out waiting for the condition",
    }
    timed out waiting for the condition
occurred

Version-Release number of selected component (if applicable):
4.2 jobs

How reproducible:
Quite often in the last week
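For context, the quoted error is the generic timeout returned by the wait helpers in k8s.io/apimachinery: any e2e poll loop whose condition never comes true within its deadline surfaces exactly this "timed out waiting for the condition" string. A minimal sketch of that pattern, assuming a hypothetical podIsRunning condition rather than the test's actual check:

// Minimal sketch (podIsRunning is a hypothetical placeholder, not the
// preemption test's real condition) of the wait.Poll* loop shape used by
// the e2e framework; on timeout the returned error is wait.ErrWaitTimeout,
// whose message is "timed out waiting for the condition".
package main

import (
	"fmt"
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
)

func main() {
	// Poll every 2s for up to 1 minute while waiting for the expected state.
	err := wait.PollImmediate(2*time.Second, 1*time.Minute, func() (bool, error) {
		return podIsRunning(), nil // hypothetical condition check
	})
	if err != nil {
		// On timeout this prints: timed out waiting for the condition
		fmt.Println(err)
	}
}

// podIsRunning stands in for whatever condition the test waits on.
func podIsRunning() bool {
	return false
}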
I don't see this too often, lowering the priority and moving out of 4.2
It seems the resources are insufficient.

Aug 26 01:02:10.523: INFO: At 2019-08-26 00:57:05 +0000 UTC - event for pod0-sched-preemption-medium-priority: {default-scheduler } Scheduled: Successfully assigned e2e-sched-preemption-6106/pod0-sched-preemption-medium-priority to ci-op-q0jd3q58-3a8ca-kvjb9-worker-centralus1-c7s2t
Aug 26 01:02:10.523: INFO: At 2019-08-26 00:57:05 +0000 UTC - event for pod1-sched-preemption-low-priority: {default-scheduler } FailedScheduling: 0/6 nodes are available: 2 Insufficient cpu, 5 node(s) didn't match node selector.
Aug 26 01:02:10.523: INFO: At 2019-08-26 00:57:05 +0000 UTC - event for pod2-sched-preemption-low-priority: {default-scheduler } Scheduled: Successfully assigned e2e-sched-preemption-6106/pod2-sched-preemption-low-priority to ci-op-q0jd3q58-3a8ca-kvjb9-worker-centralus3-bk246
Aug 26 01:02:10.523: INFO: At 2019-08-26 00:57:07 +0000 UTC - event for pod0-sched-preemption-medium-priority: {kubelet ci-op-q0jd3q58-3a8ca-kvjb9-worker-centralus1-c7s2t} Pulled: Container image "k8s.gcr.io/pause:3.1" already present on machine
Aug 26 01:02:10.523: INFO: At 2019-08-26 00:57:07 +0000 UTC - event for pod0-sched-preemption-medium-priority: {kubelet ci-op-q0jd3q58-3a8ca-kvjb9-worker-centralus1-c7s2t} Created: Created container pod0-sched-preemption-medium-priority
Aug 26 01:02:10.523: INFO: At 2019-08-26 00:57:07 +0000 UTC - event for pod0-sched-preemption-medium-priority: {kubelet ci-op-q0jd3q58-3a8ca-kvjb9-worker-centralus1-c7s2t} Started: Started container pod0-sched-preemption-medium-priority
Aug 26 01:02:10.523: INFO: At 2019-08-26 00:57:08 +0000 UTC - event for pod2-sched-preemption-low-priority: {kubelet ci-op-q0jd3q58-3a8ca-kvjb9-worker-centralus3-bk246} Pulled: Container image "k8s.gcr.io/pause:3.1" already present on machine
Aug 26 01:02:10.523: INFO: At 2019-08-26 00:57:08 +0000 UTC - event for pod2-sched-preemption-low-priority: {kubelet ci-op-q0jd3q58-3a8ca-kvjb9-worker-centralus3-bk246} Created: Created container pod2-sched-preemption-low-priority
Aug 26 01:02:10.523: INFO: At 2019-08-26 00:57:08 +0000 UTC - event for pod2-sched-preemption-low-priority: {kubelet ci-op-q0jd3q58-3a8ca-kvjb9-worker-centralus3-bk246} Started: Started container pod2-sched-preemption-low-priority
Aug 26 01:02:10.564: INFO: POD  NODE  PHASE  GRACE  CONDITIONS
Aug 26 01:02:10.564: INFO: pod0-sched-preemption-medium-priority  ci-op-q0jd3q58-3a8ca-kvjb9-worker-centralus1-c7s2t  Running  [{Initialized True 0001-01-01 00:00:00 +0000 UTC 2019-08-26 00:57:05 +0000 UTC } {Ready True 0001-01-01 00:00:00 +0000 UTC 2019-08-26 00:57:08 +0000 UTC } {ContainersReady True 0001-01-01 00:00:00 +0000 UTC 2019-08-26 00:57:08 +0000 UTC } {PodScheduled True 0001-01-01 00:00:00 +0000 UTC 2019-08-26 00:57:05 +0000 UTC }]
Aug 26 01:02:10.564: INFO: pod1-sched-preemption-low-priority  Pending  [{PodScheduled False 0001-01-01 00:00:00 +0000 UTC 2019-08-26 00:57:05 +0000 UTC Unschedulable 0/6 nodes are available: 2 Insufficient cpu, 5 node(s) didn't match node selector.}]
Aug 26 01:02:10.564: INFO: pod2-sched-preemption-low-priority  ci-op-q0jd3q58-3a8ca-kvjb9-worker-centralus3-bk246  Running  [{Initialized True 0001-01-01 00:00:00 +0000 UTC 2019-08-26 00:57:05 +0000 UTC } {Ready True 0001-01-01 00:00:00 +0000 UTC 2019-08-26 00:57:08 +0000 UTC } {ContainersReady True 0001-01-01 00:00:00 +0000 UTC 2019-08-26 00:57:08 +0000 UTC } {PodScheduled True 0001-01-01 00:00:00 +0000 UTC 2019-08-26 00:57:05 +0000 UTC }]
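The FailedScheduling event ("0/6 nodes are available: 2 Insufficient cpu, 5 node(s) didn't match node selector.") suggests the two worker nodes the test targets are already full on CPU requests. A minimal sketch, assuming a reachable kubeconfig at ~/.kube/config and the standard OpenShift worker node label, of how one could confirm this by comparing allocatable CPU against summed pod CPU requests per worker node (this is not part of the test itself, just a diagnostic):

// Minimal sketch (assumptions: kubeconfig at ~/.kube/config, worker nodes
// labeled node-role.kubernetes.io/worker). For each worker node it compares
// allocatable CPU with the summed CPU requests of pods bound to it, which is
// the comparison behind the scheduler's "Insufficient cpu" message.
package main

import (
	"context"
	"fmt"
	"path/filepath"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
	"k8s.io/client-go/util/homedir"
)

func main() {
	kubeconfig := filepath.Join(homedir.HomeDir(), ".kube", "config")
	config, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(config)

	nodes, err := client.CoreV1().Nodes().List(context.TODO(), metav1.ListOptions{
		LabelSelector: "node-role.kubernetes.io/worker",
	})
	if err != nil {
		panic(err)
	}

	for _, node := range nodes.Items {
		// Sum CPU requests of every non-terminated pod already bound to this node.
		pods, err := client.CoreV1().Pods("").List(context.TODO(), metav1.ListOptions{
			FieldSelector: "spec.nodeName=" + node.Name + ",status.phase!=Succeeded,status.phase!=Failed",
		})
		if err != nil {
			panic(err)
		}
		requested := resource.NewQuantity(0, resource.DecimalSI)
		for _, pod := range pods.Items {
			for _, c := range pod.Spec.Containers {
				if req, ok := c.Resources.Requests[corev1.ResourceCPU]; ok {
					requested.Add(req)
				}
			}
		}
		fmt.Printf("%s: allocatable cpu=%s, requested cpu=%s\n",
			node.Name, node.Status.Allocatable.Cpu().String(), requested.String())
	}
}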
Hit same issue in https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/canary-openshift-ocp-installer-e2e-azure-serial-4.2/89
I think this may be addressed by https://github.com/kubernetes/kubernetes/pull/76663, which I'm rebasing and updating in the hope of picking it into origin.
Disregard that; we actually removed this test upstream, so I opened a PR to pick that removal into origin here: https://github.com/openshift/origin/pull/23728
This test is now run against 4.2 nightly as part of azure-serial, and is now being considered a blocking failure for 4.2 (branching for 4.3). Moving this back to 4.2 and moving to urgent. Please reach out to me or nstielau if you think this should not be the case.
*** Bug 1748150 has been marked as a duplicate of this bug. ***
This test has been removed (in favor of a duplicate integration test), and should no longer be run. Can you please confirm that the test is no longer run?
(In reply to Mike Dame from comment #9)
> This test has been removed (in favor of a duplicate integration test), and
> should no longer be run. Can you please confirm that the test is no longer
> run?

Yes, I cannot find this test in the latest azure-serial run anymore:
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/canary-openshift-ocp-installer-e2e-azure-serial-4.2/127

How should we deal with this bug?
ok, close it based on comments above, thx
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:2922