Bug 1733198

Summary: PXE booting a UEFI client over IPv6 across a layer 3 (routed) network drops to the grub shell
Product: Red Hat Enterprise Linux 7 Reporter: hzj_smile
Component: ipxe Assignee: Neil Horman <nhorman>
ipxe sub component: ipxe-bootimgs QA Contact: Raviv Bar-Tal <rbartal>
Status: CLOSED DEFERRED Docs Contact:
Severity: urgent    
Priority: unspecified CC: hzj_smile
Version: 7.0   
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-08-09 20:13:31 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description hzj_smile 2019-07-25 12:01:44 UTC
Description of problem:
When PXE booting a client configured for UEFI and IPv6 across a layer 3 (routed) network, the client drops to the grub2 shell instead of displaying the selection menu as configured.

Version-Release number of selected component (if applicable):
Red Hat Enterprise Linux 7.4

How reproducible:
Always; PXE booting the system fails every time.

Steps to Reproduce:
1. Configure a PXE server for UEFI-based clients: set up a RHEL server to host TFTP and DHCPv6.
2. Copy the EFI boot image, grubx64.efi, from <disc>/EFI/BOOT/* to the tftp root directory.
3. Copy vmlinuz and initrd from <disc>/images/pxeboot to the tftp root directory.

4. The tftp root was set to /tftpboot; its contents are as follows.
bash-4.2# ls /tftpboot/ -R
/tftpboot/:
BOOTAA64.EFI  BOOTX64.EFI  EFI  efi  grubaa64.efi  grubx64.efi  initrd.img  pxelinux.0  pxelinux.cfg  vmlinuz

/tftpboot/EFI:
BOOT

/tftpboot/EFI/BOOT:
aarch64  grub.cfg  initrd.img  vmlinuz

/tftpboot/EFI/BOOT/aarch64:
initrd.img  vmlinuz

/tftpboot/efi:

/tftpboot/pxelinux.cfg:
default

5. Configure the DHCPv6 server (dnsmasq).

bash-4.2# cat /etc/dnsmasq.d/dnsmasq.conf
port=0
interface         =  eth_MANA
bind-interfaces
#dhcp-range=99.99.1.100, 99.99.1.200, 255.255.0.0, 12h
#dhcp-range=::fffe:6363:0164, ::fffe:6363:01c8, 112, 12h
log-queries
log-dhcp
conf-dir=/etc/dnsmasq.d
#except-interface=em1

dhcp-match=x86PC, option:client-arch, 0 #BIOS x86
dhcp-match=BC_EFI, option:client-arch, 7 #EFI x86-64
dhcp-match=AARCH64_EFI, option:client-arch, 11 #EFI AARCH64
dhcp-boot=tag:x86PC,pxelinux.0   # for Legacy BIOS detected by dhcp-match above
dhcp-boot=tag:BC_EFI,efi/BOOTX64.EFI # for UEFI arch detected by dhcp-match above
dhcp-boot=tag:AARCH64_EFI,efi/BOOTAA64.EFI # for UEFI arch detected by dhcp-match above
dhcp-option=option6:bootfile-url,tftp://[::fffe:6363:105]/BOOTX64.EFI   # for ipv6 uEFI only
tftp-root=/tftpboot

#dhcp-hostsdir=/etc/dnsmasq.d/hosts # need dnsmasq version >=2.73
dhcp-range=eth_MANA,::fffe:6362:100, ::fffe:6362:200, 112, 12h
dhcp-range=eth_MANA,::fffe:6363:132, ::fffe:6363:196, 112, 12h

6. Configure tftp and enable IPv6.

bash-4.2# cat /etc/xinetd.d/tftp
# default: off
# description: The tftp server serves files using the trivial file transfer \
#       protocol.  The tftp protocol is often used to boot diskless \
#       workstations, download configuration files to network-aware printers, \
#       and to start the installation process for some operating systems.
service tftp
{
    socket_type     = dgram
    protocol        = udp
    wait            = yes
    user            = root
    server          = /usr/sbin/in.tftpd
    server_args     = -s /tftpboot
    disable         = no
    per_source      = 11
    cps             = 100 2
    flags         = IPv6
    bind         =  ::fffe:6363:105
}

7. Configure the client to boot in UEFI mode and select network boot over IPv6 (across the layer 3 network).

Actual results:
The client obtains an IPv6 address from the expected DHCP server and downloads the expected netboot image (BOOTX64.EFI). After 1-2 minutes, the client drops to the grub2 shell. The client never displays the menu from grub.cfg.

Expected results:
The client PXE boots and reaches the OS selection menu.

Additional info:
I found some possible problems, for reference:
1) The PXE client has an IP address, but no layer 3 route can be found.
grub> net_ls_addr
efinet9 28:7b:09:c9:31:6f 0:00:00fffe:6362:1b8

grub> net_ls_routes
efinet0: loca1 O:0:0:0:0:fffe:63621b8/128 
efinet0:efinet0 0:0000:0:0:0/64 efinet0
2) A packet capture shows the following:
13:31:10.539171 IP6 (class 0xc0, hlim 64, next-header UDP (17) payload length: 195) ::fffe:6363:105.547 > ::fffe:6363:104.547: [bad udp cksum 0xc9a1 -> 0x09cf!] dhcp6 relay-reply (linkaddr=::fffe:6362:1 peeraddr=fe80::2a7b:9ff:fec9:316f (interface-ID 0000102b...) (relay-message (dhcp6 reply (xid=6374cf (client-ID type 4) (server-ID hwaddr/time type 1 time 615370196 7ea9a37b5894) (IA_NA IAID:4252860283 T1:21600 T2:37800 (IA_ADDR ::fffe:6362:1b8 pltime:43200 vltime:43200)) (status-code Success) (opt_59))))
13:31:13.234598 IP6 (hlim 127, next-header UDP (17) payload length: 49) ::fffe:6362:1b8.1774 > ::fffe:6363:105.69: [udp sum ok]  41 RRQ "BOOTX64.EFI" octet tsize 0 blksize 1228
13:31:14.893232 IP6 (hlim 127, next-header UDP (17) payload length: 41) ::fffe:6362:1b8.1776 > ::fffe:6363:105.69: [udp sum ok]  33 RRQ "BOOTX64.EFI" octet blksize 1228
13:31:15.055866 IP6 (hlim 127, next-header UDP (17) payload length: 41) ::fffe:6362:1b8.1778 > ::fffe:6363:105.69: [udp sum ok]  33 RRQ "/grubx64.efi" octet blksize 512
13:31:15.296742 IP6 (hlim 64, next-header UDP (17) payload length: 148) ::fffe:6363:104.547 > ::fffe:6363:105.547: [udp sum ok] dhcp6 relay-fwd (linkaddr=::fffe:6362:1 peeraddr=fe80::2a7b:9ff:fec9:316f (interface-ID 0000102b...) (relay-message (dhcp6 release (xid=6474cf (client-ID type 4) (server-ID hwaddr/time type 1 time 615370196 7ea9a37b5894) (elapsed-time 0) (IA_NA IAID:4252860283 T1:4294967295 T2:4294967295 (IA_ADDR ::fffe:6362:1b8 pltime:43200 vltime:43200)))))
13:31:15.298221 IP6 (class 0xc0, hlim 64, next-header UDP (17) payload length: 120) ::fffe:6363:105.547 > ::fffe:6363:104.547: [bad udp cksum 0xc956 -> 0x32b2!] dhcp6 relay-reply (linkaddr=::fffe:6362:1 peeraddr=fe80::2a7b:9ff:fec9:316f (interface-ID 0000102b...) (relay-message (dhcp6 reply (xid=6474cf (client-ID type 4) (server-ID hwaddr/time type 1 time 615370196 7ea9a37b5894) (status-code Success))))
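
As a cross-check that the relayed DHCPv6 exchange in the capture belongs to the same client shown by net_ls_addr, the MAC address can be recovered from the link-local peeraddr, assuming that address was formed with the modified EUI-64 scheme. The following is only a sketch; the helper name mac_from_eui64 is made up for illustration.

```python
import ipaddress

def mac_from_eui64(link_local):
    """Recover a MAC address from a modified EUI-64 IPv6 address.

    Assumes the interface ID was built as MAC[0:3] + ff:fe + MAC[3:6]
    with the universal/local bit (0x02) of the first byte flipped.
    """
    addr = ipaddress.IPv6Address(link_local)
    iid = addr.packed[8:]  # low 64 bits: the interface identifier
    if iid[3:5] != b"\xff\xfe":
        raise ValueError("interface ID is not in modified EUI-64 form")
    mac = bytes([iid[0] ^ 0x02]) + iid[1:3] + iid[5:8]
    return ":".join(f"{b:02x}" for b in mac)

# peeraddr taken from the relay-reply / relay-fwd lines in the capture
print(mac_from_eui64("fe80::2a7b:9ff:fec9:316f"))
```

The result can be compared with the hardware address printed by net_ls_addr at the grub prompt.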

Comment 2 hzj_smile 2019-07-25 12:17:16 UTC
As shown above, tftp fails to download grub.cfg because the layer 3 route is missing. So what causes this problem? Where should the routes come from: DHCP, UEFI (RA), or somewhere else?

Comment 3 Neil Horman 2019-07-26 19:07:34 UTC
dnsmasq can specify your default route if you like, but you shouldn't need it, given that your server and client are on the same subnet (or so it appears; your mask value in the dnsmasq.conf doesn't match what is in the grub route table).  In either case, it's impossible to say why you didn't get your grub.cfg from the information provided.  Do you have a tcpdump of this transaction taken from the dhcp server that you can attach here?
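
The prefix-length mismatch mentioned here can be made concrete with a small check. This is a sketch under two assumptions: the dhcp-range prefix length of 112 from dnsmasq.conf is the real on-link prefix, and the partly garbled net_ls_routes output is read as a /64 route on the client. The helper name on_link is hypothetical.

```python
import ipaddress

client = ipaddress.ip_address("::fffe:6362:1b8")  # address leased to the client (IA_ADDR)
server = ipaddress.ip_address("::fffe:6363:105")  # TFTP / DHCPv6 server address

def on_link(a, b, prefix_len):
    """Return True if a and b fall inside the same prefix of the given length."""
    net = ipaddress.ip_network(f"{a}/{prefix_len}", strict=False)
    return b in net

# Prefix length from the dhcp-range lines in dnsmasq.conf
print("same /112:", on_link(client, server, 112))
# Prefix length as it appears in the grub route table (assumed reading)
print("same /64: ", on_link(client, server, 64))
```

If the two prefix lengths give different answers, a client that believes the /64 route would treat the server as directly reachable and never look for a gateway, even though the server sits behind a router when the /112 ranges describe the real topology.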

Comment 4 hzj_smile 2019-07-30 02:12:50 UTC
Hi Neil,
DHCPv6 does not seem to support specifying a default route; it seems that routes are obtained through RS (client) and RA (server)? I only know a little about this.

A packet capture taken on the DHCPv6 (tftp) server follows:

13:31:10.539171 IP6 (class 0xc0, hlim 64, next-header UDP (17) payload length: 195) ::fffe:6363:105.547 > ::fffe:6363:104.547: [bad udp cksum 0xc9a1 -> 0x09cf!] dhcp6 relay-reply (linkaddr=::fffe:6362:1 peeraddr=fe80::2a7b:9ff:fec9:316f (interface-ID 0000102b...) (relay-message (dhcp6 reply (xid=6374cf (client-ID type 4) (server-ID hwaddr/time type 1 time 615370196 7ea9a37b5894) (IA_NA IAID:4252860283 T1:21600 T2:37800 (IA_ADDR ::fffe:6362:1b8 pltime:43200 vltime:43200)) (status-code Success) (opt_59))))
13:31:13.234598 IP6 (hlim 127, next-header UDP (17) payload length: 49) ::fffe:6362:1b8.1774 > ::fffe:6363:105.69: [udp sum ok]  41 RRQ "BOOTX64.EFI" octet tsize 0 blksize 1228
13:31:14.893232 IP6 (hlim 127, next-header UDP (17) payload length: 41) ::fffe:6362:1b8.1776 > ::fffe:6363:105.69: [udp sum ok]  33 RRQ "BOOTX64.EFI" octet blksize 1228
13:31:15.055866 IP6 (hlim 127, next-header UDP (17) payload length: 41) ::fffe:6362:1b8.1778 > ::fffe:6363:105.69: [udp sum ok]  33 RRQ "/grubx64.efi" octet blksize 512
13:31:15.296742 IP6 (hlim 64, next-header UDP (17) payload length: 148) ::fffe:6363:104.547 > ::fffe:6363:105.547: [udp sum ok] dhcp6 relay-fwd (linkaddr=::fffe:6362:1 peeraddr=fe80::2a7b:9ff:fec9:316f (interface-ID 0000102b...) (relay-message (dhcp6 release (xid=6474cf (client-ID type 4) (server-ID hwaddr/time type 1 time 615370196 7ea9a37b5894) (elapsed-time 0) (IA_NA IAID:4252860283 T1:4294967295 T2:4294967295 (IA_ADDR ::fffe:6362:1b8 pltime:43200 vltime:43200)))))
13:31:15.298221 IP6 (class 0xc0, hlim 64, next-header UDP (17) payload length: 120) ::fffe:6363:105.547 > ::fffe:6363:104.547: [bad udp cksum 0xc956 -> 0x32b2!] dhcp6 relay-reply (linkaddr=::fffe:6362:1 peeraddr=fe80::2a7b:9ff:fec9:316f (interface-ID 0000102b...) (relay-message (dhcp6 reply (xid=6474cf (client-ID type 4) (server-ID hwaddr/time type 1 time 615370196 7ea9a37b5894) (status-code Success))))

Comment 5 Neil Horman 2019-07-30 10:59:58 UTC
That's the exact same dump that you specified above. I was hoping to get the full tcpdump that you took (attached in pcap format), so I can look at it more closely with wireshark.

Comment 6 hzj_smile 2019-08-09 01:43:09 UTC
Thanks, Neil. We eventually found that grub2 does not support PXE installation across an IPv6 layer 3 network, and we will develop that support.

Comment 7 Neil Horman 2019-08-09 20:13:31 UTC
OK, please reopen this when you have a patch.