Bug 1852857 - [ci][sig-scheduling] SchedulerPriorities [Serial] Pod should be scheduled to node that don't match the PodAntiAffinity terms
Summary: [ci][sig-scheduling] SchedulerPriorities [Serial] Pod should be scheduled to ...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-scheduler
Version: 4.3.z
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.3.z
Assignee: Mike Dame
QA Contact: zhou ying
URL:
Whiteboard:
Depends On: 1861850
Blocks:
 
Reported: 2020-07-01 13:05 UTC by Gabe Montero
Modified: 2020-09-09 16:24 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1861850
Environment:
Last Closed: 2020-09-09 16:24:42 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Github openshift origin pull 25344 0 None closed [release-4.3] Bug 1852857: UPSTREAM: 90740: Update createBalancedPodForNodes function in scheduler e2e 2021-01-14 16:01:36 UTC
Red Hat Product Errata RHBA-2020:3457 0 None None None 2020-09-09 16:24:52 UTC

Description Gabe Montero 2020-07-01 13:05:32 UTC
Description of problem:

This seems to be specific to 4.3.z. Since at least Saturday, June 27, I have seen this test fail consistently in PR https://github.com/openshift/origin/pull/25215

Version-Release number of selected component (if applicable):

I see this test fail in other releases, but those failures are not exactly the same as the one I see in 4.3.z.

In 4.3.z, it fails with:

fail [k8s.io/kubernetes/test/e2e/scheduling/priorities.go:150]: Unexpected error:
    <*errors.errorString | 0xc0002901a0>: {
        s: "timed out waiting for the condition",
    }
    timed out waiting for the condition
occurred

And I see this in the event dump:

Jun 30 19:13:28.917: INFO: pod-with-pod-antiaffinity                                                           Pending         [{PodScheduled False 0001-01-01 00:00:00 +0000 UTC 2020-06-30 19:08:28 +0000 UTC Unschedulable 0/6 nodes are available: 1 Insufficient memory, 3 Insufficient cpu, 3 node(s) had taints that the pod didn't tolerate.}]
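
To make the scheduler message above easier to read, here is a minimal sketch that splits it into per-reason counts. It is not part of the e2e test or of any Kubernetes tooling: the function name `parse_unschedulable` is hypothetical, and the parsing assumes the message keeps the "N/M nodes are available: <count> <reason>, ..." shape seen in this event dump.

```python
import re


def parse_unschedulable(message):
    """Split an 'Unschedulable' message into (available, total, reasons).

    Hypothetical helper: assumes the message format shown in this bug's
    event dump, which may differ in other Kubernetes versions.
    """
    head = re.match(r"(\d+)/(\d+) nodes are available: (.*)", message)
    if head is None:
        raise ValueError("unrecognized scheduler message")
    available, total = int(head.group(1)), int(head.group(2))
    reasons = {}
    # Reasons are comma-separated, each one of the form "<count> <text>".
    for part in head.group(3).rstrip(".").split(", "):
        count, text = part.split(" ", 1)
        reasons[text] = int(count)
    return available, total, reasons


message = (
    "0/6 nodes are available: 1 Insufficient memory, 3 Insufficient cpu, "
    "3 node(s) had taints that the pod didn't tolerate."
)
available, total, reasons = parse_unschedulable(message)
for text, count in reasons.items():
    print(f"{count} node(s): {text}")
```

In this message every listed reason is a resource shortfall or a taint; none of them mentions the anti-affinity terms.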

I did see https://bugzilla.redhat.com/show_bug.cgi?id=1749246, but the file changed in the associated PR does not exist in the 4.3 branch of openshift/origin, so I opted for a separate bug.

How reproducible:

Happening consistently in the e2e-aws-serial job in 4.3.z openshift/origin PRs

Comment 9 Mike Dame 2020-07-29 18:12:43 UTC
Opened a 4.4 bug/PR and a PR to backport this to 4.3

Comment 10 Mike Dame 2020-08-21 13:29:04 UTC
I'm adding UpcomingSprint, because I was occupied by fixing bugs with higher priority/severity, developing new features with higher priority, or developing new features to improve stability at a macro level. I will revisit this bug in a future sprint.

Comment 15 errata-xmlrpc 2020-09-09 16:24:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.3.35 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:3457

