Bug 1184694 - grub2 network dysfunctional under iPXE oprom(in OVMF guest)
Summary: grub2 network dysfunctional under iPXE oprom(in OVMF guest)
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: grub2
Version: 7.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Peter Jones
QA Contact: Release Test Team
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-01-22 03:29 UTC by Xiaoqing Wei
Modified: 2015-01-22 22:18 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-01-22 22:18:41 UTC
Target Upstream Version:
Embargoed:


Attachments
grub2-could-not-auto-config-IP.png (19.22 KB, image/png)
2015-01-22 03:32 UTC, Xiaoqing Wei
iPXE-dhcp-correct.png (21.08 KB, image/png)
2015-01-22 03:34 UTC, Xiaoqing Wei
iPXE-direct-load-success.png (25.60 KB, image/png)
2015-01-22 03:38 UTC, Xiaoqing Wei
shim-fail-to-load-grub.png (22.86 KB, image/png)
2015-01-22 04:35 UTC, Xiaoqing Wei
tcpdump-virtio-oprom.txt (310.06 KB, text/plain)
2015-01-22 04:43 UTC, Xiaoqing Wei
tcpdump-virtio-builtin.txt.bz2 (1.19 MB, application/x-bzip)
2015-01-22 04:49 UTC, Xiaoqing Wei
tcpdump-virtio-oprom-grubx64.txt.bz2 (379.61 KB, application/x-bzip)
2015-01-22 04:51 UTC, Xiaoqing Wei
tcpdump-iPXE-direct-load-vmlinuz-initrd.img.txt.bz2 (1015.93 KB, application/x-bzip)
2015-01-22 04:52 UTC, Xiaoqing Wei
drop iPXE's own broken EFI_LOAD_FILE_PROTOCOL implementation (7.09 KB, patch)
2015-01-22 12:53 UTC, Laszlo Ersek

Description Xiaoqing Wei 2015-01-22 03:29:55 UTC
Description of problem:

grub2 networking is dysfunctional under the iPXE oprom (in an OVMF guest).

This could be a bug in another component: anything involved in booting a Linux kernel from a TFTP server in an OVMF (UEFI) guest with the iPXE oprom. I am not 100% sure which one is the root cause, so I am filing this bz against grub2 because:
1) the current failure message comes from grub2
2) iPXE can obtain an IP address and transfer data (shim.efi or grubx64.efi, depending on dhcpd.conf) over tftp
3) building a standalone grub2 works around this issue (vmlinuz/initrd.img embedded in grubx64.efi, which avoids using the grub2 network stack)


If I am filing this against the wrong component, please help to change it :-)
Version-Release number of selected component (if applicable):
on the DHCP/TFTP server side:
grub2-efi-2.02-0.16.el7.x86_64 (downgrading to grub2-efi-2.02-0.2.10.el7.x86_64 did not help)
grub2-efi-modules (same version as grub2-efi)
shim-0.7-8.el7_0.x86_64 (downgrading to shim-0.7-5.2.el7.x86_64 did not help)
tftp-server-5.2-11.el7.x86_64
dhcp-common-4.2.5-27.el7.x86_64
dhcp-libs-4.2.5-27.el7.x86_64
dhcp-4.2.5-27.el7.x86_64

DHCP/TFTP client side:
ipxe-roms-qemu-20130517-6.gitc4bce43.el7.noarch (installed on host)

Physical host:
qemu-kvm-rhev-2.1.2-20.el7.x86_64
bridge-utils-1.5-9.el7.x86_64
kernel-3.10.0-221.el7.x86_64


I don't think the Linux kernel is involved here, but the kernels come from the ISOs below:
RHEL-7.0-20140507.0-Server-x86_64-dvd1.iso
RHEL-6.6-20140926.0-Server-x86_64-dvd1.iso

How reproducible:
100%

Steps to Reproduce:
1. Set up an isolated Linux bridge (with no physical NIC attached) and define two VMs in libvirt: one as the PXE/DHCP/TFTP server, and the other as the PXE client, with its NIC configured as the first boot device.
-----------ovmf def
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.1.0'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE.fd</loader>
    <nvram template='/usr/share/OVMF/OVMF_VARS.fd'>/var/lib/libvirt/qemu/nvram/uefi-rhel6_VARS.fd</nvram>
    <bootmenu enable='yes'/>
  </os>
-----------
=========== nic as first booting priority
    <interface type='bridge'>
      <mac address='52:54:00:96:79:19'/>
      <source bridge='uefi-pxe'/>
      <target dev='vnet1'/>
      <model type='virtio'/>
      <boot order='1'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
=============

Illustrated as:

pxe server vm(vnet0) <-> Linux bridge(named uefi-pxe) <-> (vnet1)pxe client VM(using OVMF, and iPXE oprom)

                                                        |
                                                      tcpdump here


inside the pxe server vm:
cat /etc/dhcp/dhcpd.conf 
option space PXE;
option PXE.mtftp-ip    code 1 = ip-address;
option PXE.mtftp-cport code 2 = unsigned integer 16;
option PXE.mtftp-sport code 3 = unsigned integer 16;
option PXE.mtftp-tmout code 4 = unsigned integer 8;
option PXE.mtftp-delay code 5 = unsigned integer 8;

subnet 192.168.0.0 netmask 255.255.255.0 {
  range 192.168.0.2 192.168.0.250;

  class "pxeclients" {
    match if substring (option vendor-class-identifier, 0, 9) = "PXEClient";
    next-server 192.168.0.1;
    if option arch = 00:07 {
      filename "shim.efi";
#      filename "non-secboot/non-secbootx64-with-rhel66ga-vmlinuz-initrd-with-updated-grub.cfg.efi";
# boot to standalone all-in-one(vmlinuz initrd grub.cfg inside) works
#      filename "grubx64.efi";
# boot to grubx64.efi directly wont help
      } else if option arch = 00:09 {
      filename "shim.efi";
      }
    }
  }
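
The selection logic in the dhcpd.conf above can be sketched in a few lines of Python. This is only an illustration of what the "pxeclients" class does, not how dhcpd itself is implemented. The function name `select_boot_file` and the table `BOOT_FILES` are hypothetical, and the sketch assumes that `arch` has been declared elsewhere as the DHCP client architecture option (code 93), since that declaration is not shown in the config.

```python
from typing import Optional

# Boot file handed out per client architecture value, mirroring the
# "if option arch = 00:07 ... else if option arch = 00:09" branches above.
# Both branches currently point at shim.efi.
BOOT_FILES = {0x0007: "shim.efi", 0x0009: "shim.efi"}


def select_boot_file(vendor_class: str, arch: int) -> Optional[str]:
    """Return the filename the "pxeclients" class would offer, or None."""
    # Equivalent of: substring(option vendor-class-identifier, 0, 9) = "PXEClient"
    if vendor_class[:9] != "PXEClient":
        return None
    # Architectures without a branch in the config get no filename.
    return BOOT_FILES.get(arch)
```

For example, a client announcing a vendor class that starts with "PXEClient" and architecture 00:07 would be offered shim.efi, while a non-PXE client gets nothing from this class.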


# grep -i disable /etc/xinetd.d/tftp 
	disable			= no

# pwd
/var/lib/tftpboot
# ls 
grubx64.efi  shim.efi grub.cfg
# cat grub.cfg
set timeout=5

menuentry 'Red Hat Enterprise Linux Server release 7.0 GA' --class os {
     insmod net 
     insmod efinet
     insmod tftp
     insmod gzio
     insmod part_gpt
     insmod efi_gop
     insmod efi_uga
     #set net_default_server=192.168.0.1
     #net_bootp
     # in case DHCP won't work for grub2
     #net_add_addr eno0 efinet0 192.168.0.9
     #### indeed, neither net_bootp nor net_add_addr helps under the iPXE oprom
     echo 'Network status: '
     net_ls_cards
     net_ls_addr
     net_ls_routes

     echo 'Loading Red Hat Enterprise Linux Server release 7.0 GA kernel ...'
     linuxefi (tftp)/rhel70ga/vmlinuz ip=dhcp \
     inst.repo=nfs:192.168.0.1:/home/installation_source/RHEL7.0GA/RHEL-7.0-20140507.0-Server-x86_64-dvd1.iso \
     console=ttyS0,115200 console=tty

     echo 'Loading Red Hat Enterprise Linux Server release 7.0 GA initial ramdisk ...'
     initrdefi (tftp)/rhel70ga/initrd.img
}


2. Check that the services (dhcpd, tftp, etc.) are up:
systemctl status dhcpd.service tftp.service

3. virsh start pxe-client-vm
and immediately start tcpdump
tcpdump -vvvvn -i vnet1 > tcpdump.txt

virsh qemu-monitor-command pxe-client-vm --hmp --cmd "info qtree" | grep -i romfile
        romfile = ""
        romfile = "vgabios-cirrus.bin"
        romfile = "pxe-virtio.rom"              -> make sure VM is using iPXE oprom
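
The romfile check above can also be scripted. The following is a minimal sketch that scans saved `info qtree` output for `romfile = "..."` lines like the ones shown. The helper names `romfiles` and `uses_ipxe_oprom` are hypothetical, and the default ROM name is simply the one from this report.

```python
import re
from typing import List

# Matches lines such as:   romfile = "pxe-virtio.rom"
ROMFILE_RE = re.compile(r'romfile\s*=\s*"([^"]*)"')


def romfiles(qtree_text: str) -> List[str]:
    """Return every romfile value found in the qtree dump, in order."""
    return ROMFILE_RE.findall(qtree_text)


def uses_ipxe_oprom(qtree_text: str, rom_name: str = "pxe-virtio.rom") -> bool:
    """True if some device in the dump has the given option ROM configured."""
    return rom_name in romfiles(qtree_text)
```

With the builtin driver setup (`<rom file=''/>`), one would expect the NIC's romfile to be empty, so `uses_ipxe_oprom` should return False for that guest.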


The boot sequence looks like this:

1 OVMF -> 2 (builtin DXE driver) or (iPXE oprom) -> 3 DHCP, TFTP (request shim.efi, then grubx64.efi) -> 4 according to grub.cfg, vmlinuz and initrd.img may be requested

Actual results:

After step 3 (DHCP, then the TFTP download of the filename configured in dhcpd.conf), grubx64.efi or shim.efi stops sending requests to the TFTP server for grub.cfg and vmlinuz, and raises an error that the file does not exist (although the files are actually present on the TFTP server).

Expected results:
shim.efi should request grubx64.efi and the later files (when filename is set to shim.efi)
or
grubx64.efi should bring up the network and request grub.cfg, vmlinuz, initrd.img, etc.


Additional info:

*IF* the OVMF guest is changed to use the builtin virtio-net driver instead of the iPXE oprom, then it works. net_bootp still raises an error, but tcpdump shows shim.efi requesting grubx64.efi, then grub.cfg, then vmlinuz.

To use the builtin driver, append <rom file=''/> to the <interface> element in the libvirt VM definition.




VM screen dumps and tcpdumps will be attached later.

Comment 1 Xiaoqing Wei 2015-01-22 03:32:57 UTC
Created attachment 982581 [details]
grub2-could-not-auto-config-IP.png

grub2-could-not-auto-config-IP

net_bootp does not work in either case: builtin driver (can boot) or oprom (can't boot)

Comment 2 Xiaoqing Wei 2015-01-22 03:34:43 UTC
Created attachment 982582 [details]
iPXE-dhcp-correct.png

iPXE obtains an IP address from the DHCP server correctly,
and then requests the filename configured in dhcpd.conf

Comment 4 Xiaoqing Wei 2015-01-22 03:38:44 UTC
Created attachment 982584 [details]
iPXE-direct-load-success.png

Aha, and yet another reason for filing this bz against grub2 is:

loading vmlinuz and initrd.img from tftp directly in iPXE works.

Comment 5 Xiaoqing Wei 2015-01-22 04:35:23 UTC
Created attachment 982598 [details]
shim-fail-to-load-grub.png

With filename set to shim.efi in dhcpd.conf and the VM booted with the iPXE oprom,
shim.efi does not request grubx64.efi and says the files do not exist;
tcpdump shows it did not send a request at all.

Comment 6 Xiaoqing Wei 2015-01-22 04:37:40 UTC
(In reply to Xiaoqing Wei from comment #5)

> set filename = shim.efi in dhcpd.conf and boot vm w/ iPXE oprom,
> shim.efi will not request for grubx64.efi and say files not exist
> tcpdump shows it didn't send request at all

With grubx64.efi set as the filename, it does not request later files such as grub.cfg either.

Comment 7 Xiaoqing Wei 2015-01-22 04:43:02 UTC
Created attachment 982599 [details]
tcpdump-virtio-oprom.txt

grep -i rrq tcpdump-virtio-oprom.txt 
    192.168.0.2.1024 > 192.168.0.1.tftp: [udp sum ok]  38 RRQ "shim.efi" octet blksize 1432 tsize 0


Only shim.efi was requested; grubx64.efi and the later files never appear.

Compare with the later capture of virtio-net using the builtin driver.
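
The `grep -i rrq` check above can be turned into a small comparison between the files that were requested and the files the boot chain is expected to request. This is a minimal sketch that assumes the capture is plain `tcpdump -vvvvn` text, with read requests printed as `RRQ "name"` as in the line above. The helper names `requested_files` and `missing_requests` are hypothetical.

```python
import re
from typing import Iterable, List

# tcpdump prints TFTP read requests as: ... RRQ "shim.efi" octet blksize 1432 tsize 0
RRQ_RE = re.compile(r'RRQ "([^"]+)"')


def requested_files(tcpdump_text: str) -> List[str]:
    """Return the filenames of all TFTP read requests, in capture order."""
    return RRQ_RE.findall(tcpdump_text)


def missing_requests(tcpdump_text: str, expected: Iterable[str]) -> List[str]:
    """Return the expected filenames that were never requested."""
    seen = set(requested_files(tcpdump_text))
    return [name for name in expected if name not in seen]
```

Run against tcpdump-virtio-oprom.txt with an expected list of shim.efi and grubx64.efi, this should report grubx64.efi as missing, which matches the observation that shim.efi never asks for it.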

Comment 8 Xiaoqing Wei 2015-01-22 04:49:35 UTC
Created attachment 982600 [details]
tcpdump-virtio-builtin.txt.bz2

The file was too big to upload, so I compressed it and uploaded it again.

Comment 9 Xiaoqing Wei 2015-01-22 04:51:10 UTC
Created attachment 982601 [details]
tcpdump-virtio-oprom-grubx64.txt.bz2

(In reply to Xiaoqing Wei from comment #7)
> Created attachment 982599 [details]
> tcpdump-virtio-oprom.txt
> 
> grep -i rrq tcpdump-virtio-oprom.txt 
>     192.168.0.2.1024 > 192.168.0.1.tftp: [udp sum ok]  38 RRQ "shim.efi"
> octet blksize 1432 tsize 0
> 
> 
> only requested shim.efi, not seeing grubx64.efi or later files.
> 
> compare to later virtio-net using builtin driver.

The same happens for grubx64.efi, as described in comment 6.

Comment 10 Xiaoqing Wei 2015-01-22 04:52:01 UTC
Created attachment 982602 [details]
tcpdump-iPXE-direct-load-vmlinuz-initrd.img.txt.bz2

iPXE loads vmlinuz and initrd.img directly from the tftp server and boots successfully.

Comment 11 Xiaoqing Wei 2015-01-22 04:54:23 UTC
With e1000 or rtl8139, shim/grubx64 behave like virtio-net with the oprom:
they won't request the later files.

Comment 12 Xiaoqing Wei 2015-01-22 05:24:17 UTC
Hello Peter,

Could you please have a look?
The KVM-QE team is now handling errata for the iPXE and qemu-kvm(-rhev) components,
and we are stuck here, with no clue about the root cause.

If it's an iPXE driver issue, that might impact the virt components' errata process. As our deadline is only a few days away, we need an expert to help analyze this.

Could you please help? Thank you!

I could provide a working DHCP/TFTP with RHEL installation sources.


Best Regards,
Xiaoqing Wei.

Comment 13 Laszlo Ersek 2015-01-22 12:46:45 UTC
Xiaoqing Wei,

I've reproduced the issue on a host-internal libvirtd bridge (virbr0).

First of all, we need to trim the scope here immediately, as you experimented with many different configs, and it's impossible to analyse anything without setting a strict focus first.

So, the *only* use case we care about is comment 5: OVMF loads shim.efi, which loads grubx64.efi, which loads the grub config file, then loads the kernel & initrd images. That's it.

There are currently two issues in this path.

- The first issue is that iPXE has a horrible bug, which I found with bisection this morning. (I'll soon attach a fix for it, and maybe I can even provide you with a Brew scratch build.) This bug is the one that prevents shim.efi from booting grubx64.efi.

- The second (separate) issue can be in either grubx64.efi or still in iPXE, I can't tell. (When OVMF's builtin virtio-net driver is in use, then this 2nd issue disappears too, without changing anything in grub2, so this issue again seems to finger iPXE, but I couldn't find the cause.)

The symptoms here are very similar to those visible in "tcpdump-virtio-oprom-grubx64.txt" (comment 9): namely, after grub is loaded, it falls into an ARP request (or DHCP request) *storm*. It never seems to recover, it just times out and drops to the grub emergency shell. Again, this could be a genuine grub2 bug, but it's also possible that the broken iPXE driver prevents grub2 from seeing ARP and DHCP responses.
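
A rough way to spot such a storm in a text capture is to count repeated ARP requests per (target, sender) pair. The sketch below assumes tcpdump's usual textual form, "Request who-has <target> tell <sender>". The helper names and the threshold of 20 are hypothetical values chosen for illustration, not taken from the analysis in this bug.

```python
import re
from collections import Counter

# Matches e.g.: ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.0.1 tell 192.168.0.2, length 28
# An optional "(Broadcast)" or "(mac)" annotation after the target is skipped.
ARP_REQ_RE = re.compile(r"Request who-has (\S+)(?: \([^)]*\))? tell ([^\s,]+)")


def arp_request_counts(tcpdump_text: str) -> Counter:
    """Count ARP requests per (target IP, sender IP) pair."""
    return Counter(ARP_REQ_RE.findall(tcpdump_text))


def looks_like_storm(tcpdump_text: str, threshold: int = 20) -> bool:
    """True if any single (target, sender) pair repeats at least `threshold` times."""
    return any(n >= threshold for n in arp_request_counts(tcpdump_text).values())
```

A high count only shows that requests keep being sent. It does not tell whether replies were lost on the wire or dropped by the driver, so the replies in the same capture still need to be checked by hand.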

Comment 14 Laszlo Ersek 2015-01-22 12:53:58 UTC
Created attachment 982802 [details]
drop iPXE's own broken EFI_LOAD_FILE_PROTOCOL implementation

This is the patch that fixes issue #1 described in the previous comment.

Comment 16 Laszlo Ersek 2015-01-22 22:18:41 UTC
I analyzed and fixed issue #2 listed in comment 13 as well. It's not a grub2 bug either; grub2 is not at fault. Let's return to bug 1181980 to discuss things further, I'm closing this one because grub2 is innocent.

