Bug 1267030 - ipxe timeout when performing introspection through Intel i350 NIC
Summary: ipxe timeout when performing introspection through Intel i350 NIC
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: ipxe
Version: 7.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Lucas Alvares Gomes
QA Contact: Raviv Bar-Tal
URL:
Whiteboard:
Duplicates: 1290569 (view as bug list)
Depends On: 1298313
Blocks: 1290569 1300702 1300704
Reported: 2015-09-28 21:02 UTC by Vincent S. Cojot
Modified: 2019-08-15 05:33 UTC (History)

Fixed In Version: ipxe-20150821-1.git4e03af8e.el7
Doc Type: Rebase: Bug Fixes and Enhancements
Doc Text:
Clone Of:
: 1290569 1300702 (view as bug list)
Environment:
Last Closed: 2016-11-04 00:36:34 UTC
Target Upstream Version:
Embargoed:


Attachments
ipxe timeout (59.29 KB, image/jpeg)
2015-09-28 21:08 UTC, Vincent S. Cojot
Screencast showing tcpdump of client's MAC on hypervisor and client console (1.20 MB, application/octet-stream)
2015-10-05 20:04 UTC, Vincent S. Cojot


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2016:2214 0 normal SHIPPED_LIVE ipxe bug fix and enhancement update 2016-11-03 13:24:33 UTC

Description Vincent S. Cojot 2015-09-28 21:02:55 UTC
Description of problem:

On a few Dell R420 servers with both Broadcom and Intel NICs, ipxe works fine when netbooting from the Broadcom NIC but times out when netbooting from the Intel NIC.


Version-Release number of selected component (if applicable):

$ rpm -qf /usr/share/instack-undercloud/ipxe/post-install.d/88-setup-ipxe
instack-undercloud-2.1.2-23.el7ost.noarch

$ rpm -qf /usr/share/ipxe/undionly.kpxe
ipxe-bootimgs-20130517-6.gitc4bce43.el7.noarch


How reproducible:

Always (we tried flashing the firmware, to no avail).

Steps to Reproduce:
1. Set the MAC to that of the Intel NIC in instackenv.json.
2. Start introspection.


Actual results:

ipxe times out on the Intel NIC but works on the Broadcom NIC (inside the same VLAN and on the same switch).

Expected results:

Introspection should finish fine.

Additional info:

- From the node itself, on a pre-installed RHEL 7.0, 'dhclient' takes only a few seconds on the Broadcom NIC and close to 30 seconds on the Intel NICs.

- We also found a workaround: updating the iPXE payload (replacing the undionly.kpxe binary with the latest build available from ipxe.org).
On the instack machine:
# curl -o /tftpboot/undionly.kpxe http://boot.ipxe.org/undionly.kpxe
# chmod 744 /tftpboot/undionly.kpxe
# chown ironic:ironic /tftpboot/undionly.kpxe
# chcon system_u:object_r:tftpdir_t:s0 /tftpboot/undionly.kpxe
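
A minimal sketch of how the payload swap in this workaround could be made reversible, so the shipped undionly.kpxe can be restored later. The function name replace_payload and the .orig backup suffix are hypothetical and not part of instack or ipxe. The ownership and SELinux steps (chown, chcon) are left to be run by hand because they depend on the host.

```shell
#!/bin/sh
# Hypothetical helper: replace a PXE payload while keeping a backup.
# Usage: replace_payload NEW_FILE TARGET
replace_payload() {
    new="$1"
    target="$2"

    # Refuse to install a missing or empty download.
    [ -s "$new" ] || { echo "new payload missing or empty: $new" >&2; return 1; }

    # Keep only the first backup, so reruns do not overwrite the original.
    if [ -e "$target" ] && [ ! -e "$target.orig" ]; then
        cp -p "$target" "$target.orig" || return 1
    fi

    cp "$new" "$target" || return 1
    # Same mode as used in the manual workaround.
    chmod 744 "$target"
}
```

Assuming the new binary was downloaded to the current directory, it could be called as `replace_payload ./undionly.kpxe /tftpboot/undionly.kpxe`, followed by the chown and chcon commands from the workaround.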

Comment 2 Vincent S. Cojot 2015-09-28 21:08:11 UTC
Created attachment 1078072
ipxe timeout

Comment 3 Dmitry Tantsur 2015-10-01 12:36:09 UTC
Hi! So, if you can confirm that newer iPXE firmware works for you, then updating ipxe-bootimgs to something newer than May 2013 (which is what we have, judging by the RPM version) is probably the only thing we can do. Mike, do you think we could retarget this bug to the ipxe-bootimgs package?

Comment 4 Mike Burns 2015-10-01 13:04:55 UTC
In this case, we're limited to what is shipped in RHEL. Adding Miroslav, who seems to own ipxe in RHEL.

Comment 5 Miroslav Rezanina 2015-10-02 07:55:29 UTC
Hi Mike,
we can try to rebase ipxe in 7.3 if no proper patch is found.

Comment 6 Mike Burns 2015-10-02 11:23:07 UTC
Great, moving this to RHEL, then.

Comment 8 Vincent S. Cojot 2015-10-05 20:03:22 UTC
Hi everyone,
I don't think this issue is related to OOO. The ipxe payload update is merely a workaround for the issue we ran into. We discovered that it works better (it does not time out) if we use the more recent ipxe payload.
In any case:
1) we're still looking into the base issue (DHCP timeout with Intel NICs and Nortel switches)
2) the ipxe payloads in RHEL7.x need an update (IMHO).

For the curious, here is a small screencast captured on my desktop, showing:

1) tcpdump for the client's MAC on the hypervisor hosting the instack VM.
2) the client machine's console. Notice the delay in obtaining the first lease through PXE and the timeout with the default iPXE payload (the newer payload worked around that issue and allowed us to successfully introspect and deploy).

Kind regards,

Vincent

Comment 9 Vincent S. Cojot 2015-10-05 20:04:14 UTC
Created attachment 1080064
Screencast showing tcpdump of client's MAC on hypervisor and client console

Comment 10 Lukas Zapletal 2015-10-15 09:37:25 UTC
Satellite 6 customers hit this as well, please rebase.

Comment 19 Gonéri Le Bouder 2015-11-26 14:26:43 UTC
Enabling PortFast (STP) on the switch fixes the issue.

Comment 20 Mike Burns 2016-01-13 14:39:17 UTC
*** Bug 1290569 has been marked as a duplicate of this bug. ***

Comment 27 Chris Dearborn 2016-02-19 17:28:19 UTC
FYI, at Dell, we are not seeing timeout issues when PXE booting from Intel NICs.

Comment 29 Dan Yocum 2016-04-21 14:47:47 UTC
I can verify that the Dell R630 and R730xd systems with Intel X520 i350 NICs are booting properly using the following ROMs:

ipxe-bootimgs-20160127-1.git6366fa7a.el7.noarch

NB: the git hash should match the ipxe version hash displayed when chainloading.
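
One possible way to get the hash to compare against the chainloading banner is to pull it out of the package name. This is a sketch that assumes the `<name>-<date>-<release>.git<hash>.<dist>` naming seen in the package names quoted in this bug. The function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers: extract fields from an ipxe package name such as
# ipxe-bootimgs-20160127-1.git6366fa7a.el7.noarch

# Print the git hash that follows ".git" in the release field.
ipxe_pkg_githash() {
    printf '%s\n' "$1" | sed -n 's/.*\.git\([0-9a-f][0-9a-f]*\)\..*/\1/p'
}

# Print the 8-digit snapshot date used as the package version.
ipxe_pkg_date() {
    printf '%s\n' "$1" | sed -n 's/.*-\([0-9]\{8\}\)-.*/\1/p'
}
```

On an installed system this could be used as `ipxe_pkg_githash "$(rpm -q ipxe-bootimgs)"`. Both helpers print nothing if the name does not follow the assumed pattern.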

Comment 30 Chao Yang 2016-08-23 10:45:19 UTC
Hi Raviv,

Would you please verify this bug as it is ON_QA now? Thanks!

Comment 31 Raviv Bar-Tal 2016-09-11 12:39:01 UTC
The problem is solved by the new ROMs, and there are no new failure reports related to it. This was verified with Udi, the owner of bug https://bugzilla.redhat.com/show_bug.cgi?id=1301694, and as Dan wrote in comment #29.

Comment 34 errata-xmlrpc 2016-11-04 00:36:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2214.html

