Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1726934

Summary:	Pod phase seems to break its invariants again
Product:	OpenShift Container Platform	Reporter:	Tomáš Nožička <tnozicka>
Component:	Node	Assignee:	Ryan Phillips <rphillips>
Status:	CLOSED DUPLICATE	QA Contact:	Sunil Choudhary <schoudha>
Severity:	medium	Docs Contact:
Priority:	medium
Version:	4.1.z	CC:	aos-bugs, ccoleman, erich, gblomqui, jokerman, mmccomas, pthomas, rgudimet, sttts
Target Milestone:	---
Target Release:	4.5.0
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2020-04-06 21:07:39 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Tomáš Nožička 2019-07-04 06:19:25 UTC

Jul 03 06:53:08.682 W ns/openshift-monitoring pod/node-exporter-zh4jj node/ip-10-0-136-122.ec2.internal invariant violation (bug): pod should not transition Running->Pending even when terminated


https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade-rollback-4.1/86

Comment 1 Stefan Schimanski 2019-07-26 15:38:25 UTC

Also seen in https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade-4.1/208

Comment 2 Seth Jennings 2019-07-29 19:44:14 UTC

I do wonder if this is happening when the node reboots.

When the kubelet goes down on a reboot, the pods stay running, so nothing is reported to the apiserver when the pods later go down on node shutdown.

Maybe the kubelet comes back up, gets the list of pods, and moves them from Running->Pending since the pod is indeed not running. It then attempts to start the pod, which transitions back to Running.

I'm not entirely sure that this is illegal in that case.

A quick check shows that this definitely doesn't happen every time. I rebooted a node and ran a watch on pods. When the kubelet came back, all the pod statuses were updated once, but the state remained Running.

Comment 3 Tomáš Nožička 2019-08-12 13:50:17 UTC

Also seen in https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-4.2/343

Aug 12 09:45:18.278 W ns/openshift-machine-config-operator pod/machine-config-daemon-skpkw node/ip-10-0-140-14.ec2.internal invariant violation (bug): pod should not transition Running->Pending even when terminated

Comment 4 Seth Jennings 2019-08-12 21:55:43 UTC

I tried to recreate this a few ways without success. The pod lifecycle has never been formalized, but it seems that Running -> Pending should not be allowed. However, it is not explicitly disallowed, and it seems like it wouldn't cause an issue, unlike the Failed -> Succeeded transitions we've seen in the past.
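
For illustration, a minimal sketch of the kind of check that could flag the transitions discussed above, given the sequence of phases observed for one pod. This is not the code of the CI monitor that emitted the warning. The function name `find_violations` and the set `SUSPECT_TRANSITIONS` are hypothetical, and the set contains only the two transitions named in this bug, since the pod lifecycle does not formally define which transitions are illegal.

```python
from typing import Iterable, List, Tuple

# Pod phases defined by the Kubernetes API.
PHASES = {"Pending", "Running", "Succeeded", "Failed", "Unknown"}

# Assumption: only the two transitions mentioned in this bug are treated
# as suspect. This is not an official or complete rule set.
SUSPECT_TRANSITIONS = {
    ("Running", "Pending"),
    ("Failed", "Succeeded"),
}


def find_violations(phases: Iterable[str]) -> List[Tuple[int, str, str]]:
    """Return (index, old_phase, new_phase) for each suspect transition.

    `phases` is the ordered list of phases observed for a single pod,
    for example collected from a watch on that pod.
    """
    violations = []
    previous = None
    for index, phase in enumerate(phases):
        if phase not in PHASES:
            raise ValueError(f"unknown pod phase: {phase!r}")
        if previous is not None and (previous, phase) in SUSPECT_TRANSITIONS:
            violations.append((index, previous, phase))
        previous = phase
    return violations


if __name__ == "__main__":
    # A pod that drops back to Pending after a node reboot, then restarts.
    observed = ["Pending", "Running", "Pending", "Running"]
    for index, old, new in find_violations(observed):
        print(f"update {index}: pod should not transition {old}->{new}")
```

Repeated updates that leave the phase unchanged, such as the single status refresh seen after the manual reboot in comment 2, are not flagged by this sketch.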

Comment 8 Ryan Phillips 2020-04-06 21:07:39 UTC


*** This bug has been marked as a duplicate of bug 1810652 ***