Bug 1886920 - [sig-node] pods should never transition back to pending
Summary: [sig-node] pods should never transition back to pending
Keywords:
Status: CLOSED DUPLICATE of bug 1882750
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.7.0
Assignee: Ryan Phillips
QA Contact: Sunil Choudhary
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-10-09 18:04 UTC by Jing Zhang
Modified: 2021-01-08 16:06 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
[sig-node] pods should never transition back to pending
Last Closed: 2021-01-08 16:06:05 UTC
Target Upstream Version:
Embargoed:



Description Jing Zhang 2020-10-09 18:04:55 UTC
test:
[sig-node] pods should never transition back to pending 

is failing frequently in CI, see search results:
https://search.ci.openshift.org/?maxAge=168h&context=1&type=bug%2Bjunit&name=&maxMatches=5&maxBytes=20971520&groupBy=job&search=%5C%5Bsig-node%5C%5D+pods+should+never+transition+back+to+pending


FIXME: Replace this paragraph with a particular job URI from the search results to ground discussion.  A given test may fail for several reasons, and this bug should be scoped to one of those reasons.  Ideally you'd pick a job showing the most-common reason, but since that's hard to determine, you may also choose to pick a job at random.  Release-gating jobs (release-openshift-...) should be preferred over presubmits (pull-ci-...) because they are closer to the released product and less likely to have in-flight code changes that complicate analysis.

FIXME: Provide a snippet of the test failure or error from the job log

Comment 1 Ryan Phillips 2020-10-12 17:05:50 UTC
This has been fixed in a different PR, and we are seeing testgrid go green. We might still see this in older .y-stream jobs.

https://testgrid.k8s.io/redhat-openshift-ocp-release-4.6-blocking#release-openshift-ocp-installer-e2e-aws-4.6

4.7: https://bugzilla.redhat.com/show_bug.cgi?id=1884035
4.6: https://bugzilla.redhat.com/show_bug.cgi?id=1886247

Comment 2 Seth Jennings 2020-10-14 13:41:58 UTC
This is a different issue from the one originally reported and fixed.  The test now covers a number of different situations in which pods transition to Pending.

https://github.com/openshift/openshift-tests/blob/292dfd1dc2d170bd8b5f2d4dfb2414ef657ff22b/pkg/monitor/pod.go#L87 fixed

https://github.com/openshift/openshift-tests/blob/292dfd1dc2d170bd8b5f2d4dfb2414ef657ff22b/pkg/monitor/pod.go#L93 is the case we see now

The second issue is less severe as it only impacts static pods, but we should still figure it out.
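
For readers who want a concrete picture of the invariant under discussion, here is a minimal Python sketch of a check that flags a pod moving back to Pending under the same UID. It is not the code in pkg/monitor/pod.go linked above; `PodSnapshot`, `check_transition`, and the `is_static` flag are hypothetical names introduced for illustration, and the rule it applies (any non-Pending phase followed by Pending on the same UID is a violation) is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PodSnapshot:
    """Hypothetical, simplified stand-in for one observation of a pod."""
    uid: str
    phase: str               # e.g. "Pending", "Running", "Succeeded", "Failed"
    is_static: bool = False  # assumption: the caller knows this is a static pod


def check_transition(old: Optional[PodSnapshot], new: PodSnapshot) -> Optional[str]:
    """Return a violation message if `new` moved back to Pending, else None."""
    if old is None:
        # First observation: a freshly created pod may legitimately be Pending.
        return None
    if old.uid != new.uid:
        # A different UID is a different pod object, not a transition.
        return None
    if new.phase != "Pending" or old.phase == "Pending":
        # Either the pod did not land in Pending, or it never left it.
        return None
    kind = "static pod" if new.is_static else "pod"
    return f"{kind} illegally transitioned to Pending ({old.phase} -> Pending, same UID)"
```

Feeding it a Running snapshot followed by a Pending snapshot with the same UID returns a message; a pod going from Pending to Running returns None. Separating static pods in the message mirrors the distinction drawn in this comment between the fixed case and the static-pod case that remains.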

Comment 10 Ben Parees 2020-12-21 16:30:12 UTC
better search url for future investigators:

https://search.ci.openshift.org/?search=illegally+transitioned+to+Pending&maxAge=168h&context=1&type=junit&name=&maxMatches=5&maxBytes=20971520&groupBy=job

this is still pretty common (it causes 6% of our job failures according to ci-search) and is still failing even in 4.7.
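
For anyone who wants to vary the query (a different search phrase or time window), the following Python sketch rebuilds the search URL above from its parameters. The parameter names and values are copied from that URL; the helper name `ci_search_url` is made up for illustration, and nothing here is an official ci-search client.

```python
from urllib.parse import urlencode

# Base address taken from the search URL quoted in this comment.
CI_SEARCH_BASE = "https://search.ci.openshift.org/"


def ci_search_url(search: str, max_age: str = "168h") -> str:
    """Build a ci-search URL using the same parameters as the URL above."""
    params = {
        "search": search,
        "maxAge": max_age,       # look-back window; 168h is one week
        "context": "1",
        "type": "junit",
        "name": "",
        "maxMatches": "5",
        "maxBytes": "20971520",
        "groupBy": "job",
    }
    # urlencode escapes spaces as '+', matching the form of the original URL.
    return CI_SEARCH_BASE + "?" + urlencode(params)


print(ci_search_url("illegally transitioned to Pending"))
```

Called with the phrase "illegally transitioned to Pending" and the default window, this yields the same URL as the one given in this comment.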

Comment 11 Ryan Phillips 2021-01-08 16:06:05 UTC
This is a duplicate of 1882750 that we are looking into.

*** This bug has been marked as a duplicate of bug 1882750 ***

