1455455 – [RFE] PXE less provisioning - Add delay to discovery image boot for slow DHCP networks


Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Satellite
Classification:	Red Hat
Component:	Discovery Image
Sub Component:
Version:	Unspecified
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	medium
Target Milestone:	Unspecified
Assignee:	Lukas Zapletal
QA Contact:	Lukáš Hellebrandt
Docs Contact:
URL:	http://projects.theforeman.org/issues...
Whiteboard:
Depends On:
Blocks:

Reported:	2017-05-25 08:57 UTC by Roman Bobek
Modified:	2021-06-10 12:24 UTC
CC List:	6 users
Fixed In Version:	foreman-discovery-image-3.4.1-1
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2018-02-21 12:39:11 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Foreman Issue Tracker	19984	0	Normal	Closed	Add sleep step to pxe-less for unattended mode	2020-01-02 09:49:55 UTC
Red Hat Product Errata	RHSA-2018:0336	0	normal	SHIPPED_LIVE	Important: Satellite 6.3 security, bug fix, and enhancement update	2018-02-21 22:43:42 UTC

Description Roman Bobek 2017-05-25 08:57:04 UTC

Description of problem:
Booting the PXE-less discovery image fails with a network initialization error because the DHCP network is slow.

How reproducible:
On the customer's side.

Steps to Reproduce:
1. create a discovery image
2. boot the server
3. wait for error

Actual results:
Boot fails with network initialization error.

Expected results:
Discovery image will boot.

Additional info:
Adding sleep(15) at the top of the configure_network function works around the issue. According to lzap, we should add a delay feature to PXE-less initialization, the same as we have in PXE initialization.
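
The workaround above could be sketched roughly as follows. This is only an illustration, not the discovery image's actual code: the kernel command line key `example.delay`, the helper `boot_delay`, and the wrapper `configure_network_with_delay` are all hypothetical names, and the 15-second default simply mirrors the value from the report.

```ruby
# Hypothetical sketch; not the actual discovery image code.
DEFAULT_DELAY = 15 # seconds, the value used in the reported workaround

# Extract a delay in seconds from a kernel command line string.
# The key "example.delay" is invented for illustration only.
def boot_delay(cmdline, default = DEFAULT_DELAY)
  match = cmdline.match(/(?:^|\s)example\.delay=(\d+)(?:\s|$)/)
  match ? match[1].to_i : default
end

# Sleep for the configured delay before network configuration starts.
# The sleeper is injectable so the function can be exercised without waiting.
def configure_network_with_delay(cmdline, sleeper: method(:sleep))
  delay = boot_delay(cmdline)
  sleeper.call(delay) if delay > 0
  # ... the real network configuration would follow here ...
  delay
end
```

With this shape, a slow-DHCP site could raise the delay from the boot prompt, and `example.delay=0` would skip the wait entirely.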

Comment 1 Brad Buckingham 2017-05-26 15:15:26 UTC

Lukas, thoughts on this one?

Comment 2 Lukas Zapletal 2017-05-29 08:15:43 UTC

Roman, I can indeed implement the fix, but I cannot reproduce the issue. Can you send me full logs from the discovered node when you encounter this kind of error? The script is called discovery-debug; please attach its output to this BZ.

Comment 3 Lukas Zapletal 2017-06-05 11:20:47 UTC

I am asking because I don't understand: in PXE-less mode there is plenty of time if you go screen by screen. Do you mean semi-automatic or fully automatic (unattended) mode, where you enter all the details on the kernel command line? What were your settings? I need to know before I can add the delay.

Comment 4 Roman Bobek 2017-06-07 07:29:29 UTC

(In reply to Lukas Zapletal from comment #3)
> I am asking because I don't understand, in PXE-less mode there is plenty of
> time if you go screen by screen. Do you mean semi-automatic or
> full-automatic (unattended) mode when you enter all the details on the
> kernel command line? What was your setting then? I need to know before I can
> put the delay.

I have asked for the logs, but unfortunately the customer does not have the hardware to reproduce this issue.

I have asked the customer for more details about the boot mode and settings and am waiting for the reply.

Comment 5 Lukas Zapletal 2017-06-07 10:26:58 UTC

The error message from the attached image is triggered by this statement:

    nmcli connection down primary

This happens when the primary connection is brought down before it has been brought up, so the error is expected in this case. It is a purely cosmetic issue and has no effect.

I filed this upstream: http://projects.theforeman.org/issues/19950

We will eventually fix this; no backport or QA is needed for this one. Closing.

Comment 6 Lukas Zapletal 2017-06-12 13:40:55 UTC

Sorry, this was a different issue. Re-opening:

http://projects.theforeman.org/issues/19984

Comment 8 Satellite Program 2017-06-13 20:12:34 UTC

Moving this bug to POST for triage into Satellite 6 since the upstream issue http://projects.theforeman.org/issues/19984 has been resolved.

Comment 9 Lukáš Hellebrandt 2017-12-21 10:15:14 UTC

Verified with foreman-discovery-image-3.4.1-3.iso.

There is no clear reproducer for the issue, so I did the following:
1) Used FDI to discover a host, which worked correctly
2) Verified that the change was actually present on the ISO in /usr/lib64/ruby/vendor_ruby/discovery/menu.rb
3) Confirmed that the error "nmcli connection down" is not present in the journalctl output

Note: waiting for a magic constant of "10 seconds" is fishy; it would be better to wait in a loop.
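
As a rough illustration of the "wait in a loop" idea, a bounded polling helper might look like the sketch below. It is not taken from the discovery image; `wait_until` and its parameters are hypothetical, and the readiness check is left to the caller as a block.

```ruby
# Hypothetical sketch of polling with a timeout instead of a fixed sleep.
# Calls the given block repeatedly until it returns a truthy value or the
# timeout (in seconds) expires. Returns true on success, false on timeout.
def wait_until(timeout: 30, interval: 1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if yield
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep interval
  end
end
```

A caller would pass its own readiness check, for example `wait_until(timeout: 60) { network_ready? }`, where `network_ready?` is a placeholder for whatever test fits the environment. Fast networks then proceed immediately, while slow DHCP networks get up to the full timeout.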

Comment 12 errata-xmlrpc 2018-02-21 12:39:11 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:0336
