2081732 – "[sig-node] static pods should start after being created" doesn't capture "... because static pod is ready" event

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Test Framework
Sub Component:
Version:	4.11
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	low
Target Milestone:	---
Target Release:	4.11.0
Assignee:	Ken Zhang
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2022-05-04 13:46 UTC by Riccardo Ravaioli
Modified:	2024-04-30 18:04 UTC
CC List:	2 users
Fixed In Version:
Doc Type:
Doc Text:
Clone Of:
Environment:
Last Closed:	2024-04-30 18:04:53 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Priority	Status	Summary	Last Updated
Github	openshift origin pull 27143	None	open	Bug 2081732: Dump debug message whenever static pod test fails	2022-05-19 18:43:29 UTC
Github	openshift origin pull 27160	None	open	Bug 2081732: Static pod test error	2022-05-23 16:28:10 UTC
Github	openshift origin pull 27195	None	open	Bug 2081732: remove incorrect namespace check in static pod test	2022-06-02 12:58:30 UTC

Description Riccardo Ravaioli 2022-05-04 13:46:36 UTC

Description of problem:

The test "[sig-node] static pods should start after being created" parses events and looks for static pods that failed to be created, searching for events like:

"static pod lifecycle failure - static pod: \"openshift-kube-scheduler\" in namespace: \"openshift-kube-scheduler\" for revision: 7 on node: \"ci-op-j83m46vy-5cb9e-xs2c9-master-0\" didn't show up, waited: 2m30s"

Since these static pods might take a little longer to come up, it then looks for events like:

"Updated node \"ci-op-j83m46vy-5cb9e-xs2c9-master-0\" from revision 0 to 7 because static pod is ready"
If it finds such an event, the test shouldn't fail.
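
The matching described above can be sketched roughly as follows. This is only an illustration, not the actual test code from openshift/origin: the function name `unresolved_failures`, the regular expressions, and the rule that a failure is excused when a "static pod is ready" message exists for the same node and the same target revision are all assumptions made for this sketch. The patterns are derived solely from the two sample messages quoted above.

```python
import re

# Patterns derived from the two sample messages quoted in this report.
# \\? tolerates JSON-escaped quotes (\") as they appear in a raw event dump.
FAILURE_RE = re.compile(
    r'static pod lifecycle failure - static pod: \\?"(?P<pod>[^"\\]+)\\?" '
    r'in namespace: \\?"(?P<namespace>[^"\\]+)\\?" '
    r'for revision: (?P<revision>\d+) '
    r'on node: \\?"(?P<node>[^"\\]+)\\?" '
    r"didn't show up"
)
READY_RE = re.compile(
    r'Updated node \\?"(?P<node>[^"\\]+)\\?" '
    r'from revision \d+ to (?P<revision>\d+) because static pod is ready'
)


def unresolved_failures(messages):
    """Return failure messages not covered by a 'static pod is ready' message.

    Assumption for this sketch: a failure is covered when a ready message
    names the same node and the same target revision. The ready message text
    carries no namespace, so the namespace is not part of the key.
    """
    ready = set()
    failures = []
    for msg in messages:
        m = READY_RE.search(msg)
        if m:
            ready.add((m["node"], int(m["revision"])))
            continue
        m = FAILURE_RE.search(msg)
        if m:
            failures.append(((m["node"], int(m["revision"])), msg))
    # Only failures with no matching ready message should fail the test.
    return [msg for key, msg in failures if key not in ready]
```

With the two sample messages above as input, the function returns an empty list. If the ready message is missing from the captured events, the failure message is returned, which corresponds to the failing runs described in this report.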

I looked at two failed runs of this test case, and it seems that the "... because static pod is ready" event is never captured, causing the test to fail when it should not.
I provided a detailed analysis here: https://issues.redhat.com/browse/SDN-2994

Version-Release number of selected component (if applicable):
4.11

How reproducible:
I found and analyzed failed runs through testgrid: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.11-informing#periodic-ci-openshift-release-master-ci-4.11-e2e-azure-ovn


Expected results:
The test shouldn't fail

Comment 4 Ken Zhang 2022-06-13 15:23:12 UTC

Test has been fixed.

Comment 5 Rory Thrasher 2024-04-30 18:04:53 UTC

OCP is no longer using Bugzilla and this bug appears to have been left in an orphaned state. If the bug is still relevant, please open a new issue in the OCPBUGS Jira project: https://issues.redhat.com/projects/OCPBUGS/summary
