Bug 1915945
| Field | Value |
| --- | --- |
| Summary | [sig-scheduling] SchedulerPreemption [Serial] validates basic preemption works [Conformance] |
| Product | OpenShift Container Platform |
| Component | kube-scheduler |
| Version | 4.7 |
| Target Release | 4.7.0 |
| Hardware | Unspecified |
| OS | Unspecified |
| Status | CLOSED ERRATA |
| Severity | medium |
| Priority | medium |
| Reporter | jamo luhrsen <jluhrsen> |
| Assignee | Jan Chaloupka <jchaloup> |
| QA Contact | RamaKasturi <knarra> |
| CC | aos-bugs, maszulik, mfojtik, wking |
| Keywords | UpcomingSprint |
| Type | Bug |
| Doc Type | No Doc Update |
| Last Closed | 2021-02-24 15:53:00 UTC |
Description (jamo luhrsen, 2021-01-13 19:18:54 UTC)
I was able to poke around the master nodes after I caught a failure. Again, I lost the cluster before figuring out too much, but I did see the snippet below (which I don't find in the logs from CI, so hoping it's a little bit helpful) in this file:

pods/openshift-kube-scheduler_openshift-kube-scheduler-ip-10-0-132-85.us-west-2.compute.internal_59db6637-ae92-4357-b607-9312ce4dab44/kube-scheduler/0.log

```
2021-01-13T21:35:07.403973257+00:00 stderr F I0113 21:35:07.403873 1 scheduler.go:592] "Successfully bound pod to node" pod="e2e-sched-preemption-3378/pod0-sched-preemption-low-priority" node="ip-10-0-158-73.us-west-2.compute.internal" evaluatedNodes=6 feasibleNodes=1
2021-01-13T21:35:07.672966782+00:00 stderr F I0113 21:35:07.672915 1 scheduler.go:592] "Successfully bound pod to node" pod="e2e-sched-preemption-3378/pod1-sched-preemption-medium-priority" node="ip-10-0-175-169.us-west-2.compute.internal" evaluatedNodes=6 feasibleNodes=1
2021-01-13T21:35:07.884861457+00:00 stderr F I0113 21:35:07.884261 1 scheduler.go:592] "Successfully bound pod to node" pod="e2e-sched-preemption-3378/pod2-sched-preemption-medium-priority" node="ip-10-0-202-167.us-west-2.compute.internal" evaluatedNodes=6 feasibleNodes=1
2021-01-13T21:35:14.378479856+00:00 stderr F I0113 21:35:14.378416 1 factory.go:321] "Unable to schedule pod; no fit; waiting" pod="e2e-sched-preemption-3378/preemptor-pod" err="0/6 nodes are available: 3 Insufficient scheduling.k8s.io/foo, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate."
2021-01-13T21:35:14.387474415+00:00 stderr F I0113 21:35:14.387419 1 factory.go:321] "Unable to schedule pod; no fit; waiting" pod="e2e-sched-preemption-3378/preemptor-pod" err="0/6 nodes are available: 3 Insufficient scheduling.k8s.io/foo, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate."
2021-01-13T21:36:17.579211762+00:00 stderr F E0113 21:36:17.579164 1 framework.go:777] "Failed running Bind plugin" err="pods \"preemptor-pod\" is forbidden: unable to create new content in namespace e2e-sched-preemption-3378 because it is being terminated" plugin="DefaultBinder" pod="e2e-sched-preemption-3378/preemptor-pod"
2021-01-13T21:36:17.579351098+00:00 stderr F I0113 21:36:17.579310 1 scheduler.go:420] Failed to bind pod: e2e-sched-preemption-3378/preemptor-pod
2021-01-13T21:36:17.579393569+00:00 stderr F E0113 21:36:17.579377 1 factory.go:337] "Error scheduling pod; retrying" err="binding rejected: running Bind plugin \"DefaultBinder\": pods \"preemptor-pod\" is forbidden: unable to create new content in namespace e2e-sched-preemption-3378 because it is being terminated" pod="e2e-sched-preemption-3378/preemptor-pod"
2021-01-13T21:36:20.361326967+00:00 stderr F E0113 21:36:20.361207 1 event_broadcaster.go:253] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"preemptor-pod.1659e851eed18ccf", GenerateName:"", Namespace:"e2e-sched-preemption-3378", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, EventTime:v1.MicroTime{Time:time.Time{wall:0xbff7f7d5155c2ae8, ext:16359491835897, loc:(*time.Location)(0x2ceed00)}}, Series:(*v1.EventSeries)(nil), ReportingController:"default-scheduler", ReportingInstance:"default-scheduler-ip-10-0-132-85", Action:"Scheduling", Reason:"FailedScheduling", Regarding:v1.ObjectReference{Kind:"Pod", Namespace:"e2e-sched-preemption-3378", Name:"preemptor-pod", UID:"fad0c735-b398-4dfc-a2d5-8774b1db4474", APIVersion:"v1", ResourceVersion:"127277", FieldPath:""}, Related:(*v1.ObjectReference)(nil), Note:"skip schedule deleting pod: e2e-sched-preemption-3378/preemptor-pod", Type:"Warning", DeprecatedSource:v1.EventSource{Component:"", Host:""}, DeprecatedFirstTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeprecatedLastTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeprecatedCount:0}': 'events.events.k8s.io "preemptor-pod.1659e851eed18ccf" is forbidden: unable to create new content in namespace e2e-sched-preemption-3378 because it is being terminated' (will not retry!)
```

These are the logs from the test case iteration that failed. You can see from the timestamps that roughly one minute passes with messages about 0 nodes being available, and then finally the 'Error scheduling pod' message, which I believe comes because the test case has given up and is tearing things down: it deletes the namespace, and on the next retry to find a node the scheduler notices the namespace is terminating and gives up.
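For context, the scenario those logs describe roughly boils down to the sketch below: a high-priority "preemptor" pod requests one unit of the extended resource scheduling.k8s.io/foo, which the lower-priority pods already hold, so it can only run after one of them is preempted, and the test waits about a minute for that to happen. This is only an illustrative, simplified sketch; the namespace, image, PriorityClass name, and wiring are placeholders, not the actual upstream e2e test code.

```go
// Simplified, hypothetical sketch of the preemption scenario seen in the logs.
// Not the upstream test code; names and images are placeholders.
package main

import (
	"context"
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	cs := kubernetes.NewForConfigOrDie(cfg)
	ns := "e2e-sched-preemption-demo" // hypothetical namespace

	// Preemptor pod: high priority, requesting one unit of the extended
	// resource scheduling.k8s.io/foo that the low/medium-priority pods
	// already consume, so the scheduler has to preempt one of them first.
	preemptor := &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "preemptor-pod"},
		Spec: corev1.PodSpec{
			PriorityClassName: "high-priority", // assumed PriorityClass created beforehand
			Containers: []corev1.Container{{
				Name:  "pause",
				Image: "registry.k8s.io/pause:3.9", // placeholder image
				Resources: corev1.ResourceRequirements{
					Requests: corev1.ResourceList{
						"scheduling.k8s.io/foo": resource.MustParse("1"),
					},
					Limits: corev1.ResourceList{
						"scheduling.k8s.io/foo": resource.MustParse("1"),
					},
				},
			}},
		},
	}
	if _, err := cs.CoreV1().Pods(ns).Create(context.TODO(), preemptor, metav1.CreateOptions{}); err != nil {
		panic(err)
	}

	// The test then waits for the preemptor pod to be running. With a
	// 1-minute wait, a slow preemption plus pod startup overruns the
	// timeout; the framework deletes the namespace, and the scheduler's
	// deferred bind fails with "unable to create new content in namespace
	// ... because it is being terminated", as seen in the logs above.
	err = wait.PollImmediate(2*time.Second, 1*time.Minute, func() (bool, error) {
		p, err := cs.CoreV1().Pods(ns).Get(context.TODO(), "preemptor-pod", metav1.GetOptions{})
		if err != nil {
			return false, err
		}
		return p.Status.Phase == corev1.PodRunning, nil
	})
	fmt.Println("preemptor running:", err == nil)
}
```

Bumping the wait in the last step from one to two minutes is essentially the timeout change discussed in the comments below.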
Checking the recent failures, every test times out after 1 minute of waiting for the preemptor pod to be scheduled, with similar kube-scheduler logs like these:

```
I0115 04:44:43.912012 1 factory.go:321] "Unable to schedule pod; no fit; waiting" pod="e2e-sched-preemption-930/preemptor-pod" err="0/6 nodes are available: 3 Insufficient scheduling.k8s.io/foo, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate."
I0115 04:44:43.920822 1 factory.go:321] "Unable to schedule pod; no fit; waiting" pod="e2e-sched-preemption-930/preemptor-pod" err="0/6 nodes are available: 3 Insufficient scheduling.k8s.io/foo, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate."
...
E0115 04:45:49.610424 1 framework.go:777] "Failed running Bind plugin" err="pods \"preemptor-pod\" is forbidden: unable to create new content in namespace e2e-sched-preemption-930 because it is being terminated" plugin="DefaultBinder" pod="e2e-sched-preemption-930/preemptor-pod"
I0115 04:45:49.610463 1 scheduler.go:420] Failed to bind pod: e2e-sched-preemption-930/preemptor-pod
```

- https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-serial-4.7/1350446580682985472
- https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-serial-4.7/1350355743152279552
- https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-serial-4.7/1349924268133060608

It seems it takes slightly over a minute to bind the preemptor-pod, and then some more time to actually run it. In some rare cases the preemptor pod does get scheduled (https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-serial-4.7/1349872296105349120):

```
I0115 02:23:50.826264 1 factory.go:321] "Unable to schedule pod; no fit; waiting" pod="e2e-sched-preemption-3163/preemptor-pod" err="0/6 nodes are available: 3 Insufficient scheduling.k8s.io/foo, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate."
I0115 02:23:50.841743 1 factory.go:321] "Unable to schedule pod; no fit; waiting" pod="e2e-sched-preemption-3163/preemptor-pod" err="0/6 nodes are available: 3 Insufficient scheduling.k8s.io/foo, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate."
I0115 02:24:49.538649 1 scheduler.go:592] "Successfully bound pod to node" pod="e2e-sched-preemption-3163/preemptor-pod" node="ip-10-0-146-177.us-west-1.compute.internal" evaluatedNodes=6 feasibleNodes=1
```

Even then it's pretty tight, so the test still does not wait long enough for the pod to actually get running.

I don't see this really happening in 4.6, but it is there in 4.7 and 4.8, which makes me think it's potentially a real regression in the performance of bringing up a pod. Also, there is this bz (https://bugzilla.redhat.com/show_bug.cgi?id=1915085) sort of calling out a similar thing that could explain this problem.

Not that increasing the timeout from 1m to 2m would be wrong, because if a pod was really taking 5m to come up we'd still be failing these test cases. Just FYI.

(In reply to jamo luhrsen from comment #3)
> I don't see this really happening in 4.6, but it is there in 4.7 and 4.8
> which makes me think it's potentially a real regression in performance to
> bring up a pod. Also, there is this bz
> (https://bugzilla.redhat.com/show_bug.cgi?id=1915085) sort of calling out
> a similar thing that could explain this problem.

BTW, the upstream issue (https://github.com/kubernetes/kubernetes/issues/98142) noted in that bz does have some chatter and a proposed hack/fix that could help. Maybe we can wait for that issue to be resolved before increasing the test timeout?

> Not that increasing the timeout from 1m to 2m would be wrong, because if a
> pod was really taking 5m to come up we'd still be failing these test cases.
>
> just fyi

The upstream PR has already merged in o/k.
The change would land in the next rebase anyway.

Thanks for pointing out the other bug. It seems like it would be better to have a separate test in the kubelet test suite to measure kubelet/runc performance.

From the link below, I see that the test has not been failing for about 7 days, so moving the bug to the verified state.

https://search.ci.openshift.org/?search=validates+basic+preemption+works&maxAge=168h&context=1&type=junit&name=&maxMatches=5&maxBytes=20971520&groupBy=job

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633