Bug 1773108

Summary:	All 4.3 metal CI jobs are failing: Due to "provisioning time limit exceeded; the Packet team will investigate"
Product:	OpenShift Container Platform	Reporter:	Greg Sheremeta <gshereme>
Component:	Installer	Assignee:	Scott Dodson <sdodson>
Installer sub component:	openshift-installer	QA Contact:	David Sanz <dsanzmor>
Status:	CLOSED DUPLICATE	Docs Contact:
Severity:	urgent
Priority:	urgent	CC:	ccoleman, dgoodwin, mifiedle, sdodson
Version:	unspecified	Keywords:	TestBlocker
Target Milestone:	---
Target Release:	4.4.0
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:
Clones:	1776011		Environment:
Last Closed:	2019-12-13 13:59:02 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:	1775388, 1779755, 1782546
Bug Blocks:	1776011

Description Greg Sheremeta 2019-11-15 22:13:15 UTC

Description of problem:

Metal jobs are failing with the error "provisioning time limit exceeded; the Packet team will investigate".

Example:
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-metal-4.2/183

Comment 2 Scott Dodson 2019-11-20 13:53:48 UTC

I met with Zac and Golden at Packet yesterday to discuss this. They informed me that this particular message indicates that the device was provisioned but the host OS never reached a running state, so this is most likely a failure in the PXE / Ignition process. There is definitely something to look into here.

In all cases where a running OS became available, NetworkManager-wait-online.service is in a failed state because the second interface is not properly configured. This service now blocks other services, which introduces a 300-second delay in the boot process; with hosts rebooting multiple times, this has the potential to cause job failure. I'm attempting to configure only the first interface to see if that produces better results.
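
As a rough illustration of why repeated reboots matter, the sketch below adds up the delay under the assumption that every boot waits out the full 300 seconds. The function names, the boot counts, and the time budget are hypothetical and are not taken from the CI jobs or from Packet.

```python
# Illustrative arithmetic only: the boot counts and the time budget used with
# these helpers are assumed values, not figures from the CI jobs or from Packet.

WAIT_ONLINE_DELAY_SECONDS = 300  # per-boot delay reported in the comment above


def added_boot_delay(boots: int, delay_per_boot: int = WAIT_ONLINE_DELAY_SECONDS) -> int:
    """Return the total seconds added when every boot waits out the full delay."""
    if boots < 0:
        raise ValueError("boots must be non-negative")
    return boots * delay_per_boot


def exceeds_budget(boots: int, baseline_seconds: int, budget_seconds: int) -> bool:
    """Check whether baseline time plus the accumulated delay exceeds the budget."""
    return baseline_seconds + added_boot_delay(boots) > budget_seconds


if __name__ == "__main__":
    # Show how the added delay grows with the number of boots.
    for boots in (1, 2, 3):
        print(boots, "boot(s) add", added_boot_delay(boots), "seconds")
```

Under these assumptions the added delay grows linearly with the number of boots, so a host that reboots three times spends an extra 15 minutes waiting before any other work starts.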

Comment 3 Scott Dodson 2019-11-24 15:41:57 UTC

*** Bug 1772212 has been marked as a duplicate of this bug. ***

Comment 5 Clayton Coleman 2019-12-02 18:00:09 UTC

We cannot ship without this in 4.3, marking appropriately.

Comment 6 Scott Dodson 2019-12-13 13:59:02 UTC


*** This bug has been marked as a duplicate of bug 1775388 ***