Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1886920

Summary: [sig-node] pods should never transition back to pending
Product: OpenShift Container Platform
Reporter: Jing Zhang <jingzhan>
Component: Node
Assignee: Ryan Phillips <rphillips>
Node sub component: Kubelet
QA Contact: Sunil Choudhary <schoudha>
Status: CLOSED DUPLICATE
Docs Contact:
Severity: medium
Priority: unspecified
CC: aos-bugs, astoycos, bparees, deads, emoss, fabian, jokerman, tsweeney, wking
Version: 4.7
Keywords: Reopened
Target Milestone: ---
Target Release: 4.7.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment: [sig-node] pods should never transition back to pending
Last Closed: 2021-01-08 16:06:05 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Jing Zhang 2020-10-09 18:04:55 UTC
test:
[sig-node] pods should never transition back to pending 

is failing frequently in CI, see search results:
https://search.ci.openshift.org/?maxAge=168h&context=1&type=bug%2Bjunit&name=&maxMatches=5&maxBytes=20971520&groupBy=job&search=%5C%5Bsig-node%5C%5D+pods+should+never+transition+back+to+pending


FIXME: Replace this paragraph with a particular job URI from the search results to ground discussion.  A given test may fail for several reasons, and this bug should be scoped to one of those reasons.  Ideally you'd pick a job showing the most-common reason, but since that's hard to determine, you may also choose to pick a job at random.  Release-gating jobs (release-openshift-...) should be preferred over presubmits (pull-ci-...) because they are closer to the released product and less likely to have in-flight code changes that complicate analysis.

FIXME: Provide a snippet of the test failure or error from the job log

Comment 1 Ryan Phillips 2020-10-12 17:05:50 UTC
This has been fixed in a different PR, and we are seeing testgrid go green. We might still see this in older .y stream jobs.

https://testgrid.k8s.io/redhat-openshift-ocp-release-4.6-blocking#release-openshift-ocp-installer-e2e-aws-4.6

4.7: https://bugzilla.redhat.com/show_bug.cgi?id=1884035
4.6: https://bugzilla.redhat.com/show_bug.cgi?id=1886247

Comment 2 Seth Jennings 2020-10-14 13:41:58 UTC
This is a different issue from the one originally reported and fixed.  The test now covers a number of different situations in which pods transition to Pending.

https://github.com/openshift/openshift-tests/blob/292dfd1dc2d170bd8b5f2d4dfb2414ef657ff22b/pkg/monitor/pod.go#L87 fixed

https://github.com/openshift/openshift-tests/blob/292dfd1dc2d170bd8b5f2d4dfb2414ef657ff22b/pkg/monitor/pod.go#L93 is the case we see now

The second issue is less severe as it only impacts static pods, but we should still figure it out.
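
For readers unfamiliar with what the test asserts, here is a rough sketch of the general idea: walk the sequence of phases observed for one pod and flag any step that goes from a non-Pending phase back to Pending. This is an illustration only and is not the logic in pkg/monitor/pod.go; the function name and the list-of-phases input are assumptions made for the example.

```python
from typing import Iterable, List, Tuple


def illegal_pending_transitions(phases: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (index, previous_phase) for every observation where a pod
    moves from a non-Pending phase back to Pending.

    Hypothetical helper for illustration; the real monitor works on pod
    objects from the API server, not on a plain list of phase strings.
    """
    hits: List[Tuple[int, str]] = []
    previous = None
    for index, phase in enumerate(phases):
        # A pod may start in Pending and stay there; only a return to
        # Pending after some other phase is flagged.
        if phase == "Pending" and previous is not None and previous != "Pending":
            hits.append((index, previous))
        previous = phase
    return hits
```

A pod observed as Pending, Running, Pending, Running would be flagged once, at the third observation; a pod that goes Pending, Running, Succeeded would not be flagged at all.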

Comment 10 Ben Parees 2020-12-21 16:30:12 UTC
better search url for future investigators:

https://search.ci.openshift.org/?search=illegally+transitioned+to+Pending&maxAge=168h&context=1&type=junit&name=&maxMatches=5&maxBytes=20971520&groupBy=job

this is still pretty common (it causes 6% of our job failures according to ci-search) and is still failing even in 4.7.
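
If you want to vary the window or the search string, the URL above can be rebuilt from its query parameters. A small sketch using only the Python standard library; the parameter names and values are copied from the URL in this comment, and the helper name is made up for the example.

```python
from urllib.parse import urlencode

CI_SEARCH_BASE = "https://search.ci.openshift.org/"


def ci_search_url(search: str, max_age: str = "168h") -> str:
    """Build a ci-search URL with the same parameters as the link above.

    Hypothetical helper; only the parameters visible in that link are used.
    """
    params = {
        "search": search,
        "maxAge": max_age,
        "context": "1",
        "type": "junit",
        "name": "",
        "maxMatches": "5",
        "maxBytes": "20971520",
        "groupBy": "job",
    }
    # urlencode encodes spaces as '+', matching the form of the original link.
    return CI_SEARCH_BASE + "?" + urlencode(params)


print(ci_search_url("illegally transitioned to Pending"))
```

Passing a different max_age (in the same format as the 168h used above) changes only the maxAge parameter.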

Comment 11 Ryan Phillips 2021-01-08 16:06:05 UTC
This is a duplicate of 1882750 that we are looking into.

*** This bug has been marked as a duplicate of bug 1882750 ***