Bug 1183464

Summary: qemu instance internal-error during PXE booting in OVMF env (iPXE oprom is in use)
Product: Red Hat Enterprise Linux 7
Reporter: Xiaoqing Wei <xwei>
Component: qemu-kvm-rhev
Assignee: Virtualization Maintenance <virt-maint>
Status: CLOSED DUPLICATE
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 7.1
CC: hhuang, juzhang, lersek, virt-maint, xwei
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-01-26 08:00:25 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description Flags
grub2 stand alone image none

Description Xiaoqing Wei 2015-01-19 07:17:29 UTC
Description of problem:
The qemu instance hits an internal error during PXE boot in an OVMF environment (the iPXE oprom is in use).

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.1.2-18.el7.x86_64
ipxe-roms-qemu-20130517-6.gitc4bce43.el7.noarch
ipxe-bootimgs-20130517-6.gitc4bce43.el7.noarch
ipxe-roms-20130517-6.gitc4bce43.el7.noarch
OVMF-20140822-7.git9ece15a.el7.x86_64
kernel-3.10.0-221.el7.x86_64

processor	: 3
vendor_id	: AuthenticAMD
cpu family	: 21
model		: 16
model name	: AMD A10-5800K APU with Radeon(tm) HD Graphics

How reproducible:


Steps to Reproduce:
1. Set up a PXE server and create a standalone grub2 image:

Extract the grub2-efi RPMs into files and copy grubx64.efi to the TFTP server. Here the RPMs were extracted into:
/root/cpio-grub2-efi

mkdir -p /tftproot/non-secboot ; cd /tftproot/non-secboot
mkdir -p boot/grub

--
# cat boot/grub/grub.cfg 
set timeout=5

menuentry 'Red Hat Enterprise Linux Server release 7.0 GA' --class os {
     insmod net 
     insmod efinet
     insmod tftp
     insmod gzio
     insmod part_gpt
     insmod efi_gop
     insmod efi_uga

     # dhcp, tftp server in my network
     set net_default_server=192.168.0.1

     echo 'Network status: '
     net_ls_cards
     net_ls_addr
     net_ls_routes

     echo 'Loading Red Hat Enterprise Linux Server release 7.0 GA kernel ...'
     linuxefi (tftp)/rhel70ga/vmlinuz ip=dhcp \
     inst.repo=nfs:192.168.0.1:/home/installation_source/RHEL7.0GA/RHEL-7.0-20140507.0-Server-x86_64-dvd1.iso

     echo 'Loading Red Hat Enterprise Linux Server release 7.0 GA initial ramdisk ...'
     initrdefi (tftp)/rhel70ga/initrd.img
}
--

grub2-mkstandalone -d /root/cpio-grub2-efi/usr/lib/grub/x86_64-efi/ -O x86_64-efi --fonts="unicode" -o non-secbootx64.efi boot/grub/grub.cfg

and configure dhcpd to use that file:
# cat /etc/dhcp/dhcpd.conf 
#
# DHCP Server Configuration file.
#   see /usr/share/doc/dhcp*/dhcpd.conf.example
#   see dhcpd.conf(5) man page
#
option space PXE;
option PXE.mtftp-ip    code 1 = ip-address;
option PXE.mtftp-cport code 2 = unsigned integer 16;
option PXE.mtftp-sport code 3 = unsigned integer 16;
option PXE.mtftp-tmout code 4 = unsigned integer 8;
option PXE.mtftp-delay code 5 = unsigned integer 8;
option arch code 93 = unsigned integer 16; # RFC4578

subnet 192.168.0.0 netmask 255.255.255.0 {
#  option routers 10.0.0.254;
  range 192.168.0.2 192.168.0.250;

  class "pxeclients" {
    match if substring (option vendor-class-identifier, 0, 9) = "PXEClient";
    next-server 192.168.0.1;
    if option arch = 00:07 {
      filename "non-secboot/non-secbootx64.efi";
    } else if option arch = 00:09 {
      filename "non-secboot/non-secbootx64.efi";
    }
  }
}
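
The class "pxeclients" block above decides which boot file a client is offered. A minimal sketch of that decision in Python follows. It models the configuration and is not dhcpd code; the function name `select_boot_file` and the constant names are made up for illustration. The architecture codes are the RFC 4578 client system architecture types that the config reads through option 93.

```python
from typing import Optional

# Client system architecture types from RFC 4578 (DHCP option 93).
EFI_BC = 0x0007      # "EFI BC" in the RFC
EFI_X86_64 = 0x0009  # "EFI x86-64" in the RFC

# Boot file served by the dhcpd.conf in this report.
BOOT_FILE = "non-secboot/non-secbootx64.efi"


def select_boot_file(vendor_class_id: str, arch: int) -> Optional[str]:
    """Hypothetical model of the 'pxeclients' class in the config above."""
    # substring(option vendor-class-identifier, 0, 9) = "PXEClient"
    if vendor_class_id[:9] != "PXEClient":
        return None
    # Both EFI branches in the config point at the same standalone image.
    if arch in (EFI_BC, EFI_X86_64):
        return BOOT_FILE
    # Any other architecture gets no filename from this class.
    return None


if __name__ == "__main__":
    print(select_boot_file("PXEClient:Arch:00007:UNDI:003016", EFI_BC))
```

Under this model a legacy BIOS PXE client (architecture 0) matches the class but is given no filename, which mirrors the config having no final `else` branch.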


2. PXE boot an OVMF guest (using the iPXE oprom)
2015-01-19 06:53:25.757+0000: starting up
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=spice /usr/libexec/qemu-kvm -name uefi-rhel66 -S -machine pc-i440fx-rhel7.1.0,accel=kvm,usb=off -drive file=/usr/share/OVMF/OVMF_CODE.fd,if=pflash,format=raw,unit=0,readonly=on -drive file=/var/lib/libvirt/qemu/nvram/uefi-rhel6_VARS.fd,if=pflash,format=raw,unit=1 -m 2048 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid a4c10d00-ac50-400f-b05e-c629ffa7a020 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/uefi-rhel66.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot menu=on,strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x5 -device virtio-scsi-pci,id=scsi1,bus=pci.0,addr=0x9 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 -drive file=/var/lib/libvirt/images/uefi-rhel6.img,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,aio=native -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=3 -drive file=/usr/share/OVMF/UefiShell.iso,if=none,id=drive-ide0-1-0,readonly=on,format=raw,cache=none,aio=native -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0,bootindex=2 -netdev tap,fd=23,id=hostnet0,vhost=on,vhostfd=25 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:96:79:19,bus=pci.0,addr=0x3,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -spice port=5901,addr=0.0.0.0,disable-ticketing,seamless-migration=on -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -msg timestamp=on
char device redirected to /dev/pts/2 (label charserial0)
main_channel_link: add main channel client
main_channel_handle_parsed: net test: latency 3.933000 ms, bitrate 616125150 bps (587.582731 Mbps)
inputs_connect: inputs channel client create
red_dispatcher_set_cursor_peer: 

3. The TFTP transfer sat at 71% for a while (1~2 min).
When I then checked whether the VM was still alive, it had hit an internal error.

from the libvirt log:
-------------
KVM internal error. Suberror: 1
emulation failure
RAX=00000000e7ffff8f RBX=000000007e61ddf8 RCX=3ffffffffffb4600 RDX=000000007e750030
RSI=000000007e74c002 RDI=000000007e606e72 RBP=000000007e611a68 RSP=000000007ff67fd8
R8 =0000000000000000 R9 =000000007ff681b0 R10=00000000000003fd R11=0000000000000040
R12=000000007ff681b0 R13=0000000000000000 R14=000000007e612630 R15=000000007ff681a8
RIP=000000007e4d86f9 RFL=00000246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0008 0000000000000000 ffffffff 00c09300 DPL=0 DS   [-WA]
CS =0028 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
SS =0008 0000000000000000 ffffffff 00c09300 DPL=0 DS   [-WA]
DS =0008 0000000000000000 ffffffff 00c09300 DPL=0 DS   [-WA]
FS =0008 0000000000000000 ffffffff 00c09300 DPL=0 DS   [-WA]
GS =0008 0000000000000000 ffffffff 00c09300 DPL=0 DS   [-WA]
LDT=0000 0000000000000000 0000ffff 00008200 DPL=0 LDT
TR =0000 0000000000000000 0000ffff 00008b00 DPL=0 TSS64-busy
GDT=     000000007fef0d98 0000003f
IDT=     000000007f32b018 00000fff
CR0=80000033 CR2=0000000000000000 CR3=000000007ff07000 CR4=00000668
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000500
Code=0c 50 b6 00 00 2d 3a 00 00 0c 50 c7 00 00 2d 3b 00 00 0c 50 <d8> 00 00 2d 3c 00 00 0c 50 eb 00 00 2d 3d 00 00 0c 50 fd 00 00 2d 3e 00 00 0c 51 0e 00 00
Domain id=47 is tainted: custom-monitor
---------------
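
The `Code=` line in the log above lists the bytes around RIP, with the byte at RIP in angle brackets. A minimal sketch, assuming only that line format, for extracting the bytes and measuring how far apart a given byte value recurs. The helper names `parse_code_line` and `gaps_between` are hypothetical.

```python
# Code= line copied from the libvirt log in this report.
CODE_LINE = (
    "Code=0c 50 b6 00 00 2d 3a 00 00 0c 50 c7 00 00 2d 3b 00 00 0c 50 <d8> "
    "00 00 2d 3c 00 00 0c 50 eb 00 00 2d 3d 00 00 0c 50 fd 00 00 2d 3e 00 00 "
    "0c 51 0e 00 00"
)


def parse_code_line(line):
    """Return (bytes, offset of the byte at RIP) from a 'Code=' dump line."""
    tokens = line.split("=", 1)[1].split()
    # The byte at RIP is the only token wrapped in angle brackets.
    rip_offset = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    data = bytes(int(tok.strip("<>"), 16) for tok in tokens)
    return data, rip_offset


def gaps_between(data, value):
    """Distances between consecutive occurrences of one byte value."""
    offsets = [i for i, b in enumerate(data) if b == value]
    return [b - a for a, b in zip(offsets, offsets[1:])]


if __name__ == "__main__":
    data, rip_offset = parse_code_line(CODE_LINE)
    print(len(data), rip_offset, hex(data[rip_offset]))
    print(gaps_between(data, 0x0C))
```

For the line in this report, the bytes 0x0c and 0x2d each recur at a fixed distance of 9 bytes. A fixed stride like that looks more like a table of records than like x86-64 instructions, which would fit the vCPU having started to execute data. That is an inference from the dump and is not established anywhere in this report.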

[root@dhcp-11-50 ~]# virsh qemu-monitor-command uefi-rhel66 --hmp --cmd "info status"
VM status: paused (internal-error)

[root@dhcp-11-50 ~]# virsh qemu-monitor-command uefi-rhel66 --hmp --cmd "info registers"
RAX=00000000e7ffff8f RBX=000000007e61ddf8 RCX=3ffffffffffb4600 RDX=000000007e750030
RSI=000000007e74c002 RDI=000000007e606e72 RBP=000000007e611a68 RSP=000000007ff67fd8
R8 =0000000000000000 R9 =000000007ff681b0 R10=00000000000003fd R11=0000000000000040
R12=000000007ff681b0 R13=0000000000000000 R14=000000007e612630 R15=000000007ff681a8
RIP=000000007e4d86f9 RFL=00000246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0008 0000000000000000 ffffffff 00c09300 DPL=0 DS   [-WA]
CS =0028 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
SS =0008 0000000000000000 ffffffff 00c09300 DPL=0 DS   [-WA]
DS =0008 0000000000000000 ffffffff 00c09300 DPL=0 DS   [-WA]
FS =0008 0000000000000000 ffffffff 00c09300 DPL=0 DS   [-WA]
GS =0008 0000000000000000 ffffffff 00c09300 DPL=0 DS   [-WA]
LDT=0000 0000000000000000 0000ffff 00008200 DPL=0 LDT
TR =0000 0000000000000000 0000ffff 00008b00 DPL=0 TSS64-busy
GDT=     000000007fef0d98 0000003f
IDT=     000000007f32b018 00000fff
CR0=80000033 CR2=0000000000000000 CR3=000000007ff07000 CR4=00000668
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000500
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=00000000000000000000000000000000 XMM01=00000000000000000000000000000000
XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000
XMM04=00000000000000000000000000000000 XMM05=00000000000000000000000000000000
XMM06=00000000000000000000000000000000 XMM07=00000000000000000000000000000000
XMM08=00000000000000000000000000000000 XMM09=00000000000000000000000000000000
XMM10=00000000000000000000000000000000 XMM11=00000000000000000000000000000000
XMM12=00000000000000000000000000000000 XMM13=00000000000000000000000000000000
XMM14=00000000000000000000000000000000 XMM15=00000000000000000000000000000000
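
The CR0, CR4 and EFER values in the dump above can be decoded into named flags to see which mode the vCPU was in when the emulation failure hit. A minimal sketch, assuming the `NAME=hex` layout shown above. The bit tables cover only a subset of the architecturally defined bits, and the helper names are hypothetical.

```python
import re

# Subset of architecturally defined bits (x86-64); not exhaustive.
CR0_BITS = {0: "PE", 1: "MP", 2: "EM", 3: "TS", 4: "ET", 5: "NE",
            16: "WP", 18: "AM", 29: "NW", 30: "CD", 31: "PG"}
CR4_BITS = {0: "VME", 1: "PVI", 2: "TSD", 3: "DE", 4: "PSE", 5: "PAE",
            6: "MCE", 7: "PGE", 8: "PCE", 9: "OSFXSR", 10: "OSXMMEXCPT"}
EFER_BITS = {0: "SCE", 8: "LME", 10: "LMA", 11: "NXE", 12: "SVME"}

# Two lines copied from the register dump in this report.
DUMP = """\
CR0=80000033 CR2=0000000000000000 CR3=000000007ff07000 CR4=00000668
EFER=0000000000000500
"""


def parse_registers(dump):
    """Pull CR0, CR4 and EFER out of an 'info registers' style dump."""
    pairs = re.findall(r"\b(CR0|CR4|EFER)=([0-9a-fA-F]+)", dump)
    return {name: int(value, 16) for name, value in pairs}


def set_flags(value, names):
    """Names of the known bits that are set in value, lowest bit first."""
    return [name for bit, name in sorted(names.items()) if (value >> bit) & 1]


if __name__ == "__main__":
    regs = parse_registers(DUMP)
    print("CR0 ", set_flags(regs["CR0"], CR0_BITS))
    print("CR4 ", set_flags(regs["CR4"], CR4_BITS))
    print("EFER", set_flags(regs["EFER"], EFER_BITS))
```

For the values in this report this gives PE, MP, ET, NE and PG in CR0; DE, PAE, MCE, OSFXSR and OSXMMEXCPT in CR4; and LME plus LMA in EFER. In other words the vCPU was in 64-bit long mode with paging enabled, which agrees with the CS64 code segment shown in the dump.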


Actual results:
The qemu instance stops running: the VM is paused with an internal error.

Expected results:
qemu should stay alive and the guest should be able to finish the PXE boot process.

Additional info:

processor	: 3
vendor_id	: AuthenticAMD
cpu family	: 21
model		: 16
model name	: AMD A10-5800K APU with Radeon(tm) HD Graphics  
stepping	: 1
cpu MHz		: 3800.000
cache size	: 2048 KB
physical id	: 0
siblings	: 4
core id		: 3
cpu cores	: 2
apicid		: 19
initial apicid	: 3
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold bmi1
bogomips	: 7586.08
TLB size	: 1536 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro

[root@dhcp-11-50 ~]# lscpu 
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    2
Core(s) per socket:    2
Socket(s):             1
NUMA node(s):          1
Vendor ID:             AuthenticAMD
CPU family:            21
Model:                 16
Model name:            AMD A10-5800K APU with Radeon(tm) HD Graphics
Stepping:              1
CPU MHz:               3800.000
BogoMIPS:              7586.08
Virtualization:        AMD-V
L1d cache:             16K
L1i cache:             64K
L2 cache:              2048K
NUMA node0 CPU(s):     0-3
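
The host topology reported by lscpu above and the guest's `-smp 4,sockets=4,cores=1,threads=1` setting from step 2 can both be sanity-checked with the same arithmetic: logical CPUs = threads x cores x sockets. A small sketch, with hypothetical helper names, that parses just the fields it needs:

```python
# Fields copied from the lscpu output in this report.
LSCPU = """\
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    2
Core(s) per socket:    2
Socket(s):             1
"""

# -smp value copied from the QEMU command line in this report.
SMP = "4,sockets=4,cores=1,threads=1"


def parse_lscpu(text):
    """Map 'Key:   value' lines to a dict of stripped strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def host_logical_cpus(fields):
    """threads x cores x sockets from parsed lscpu fields."""
    return (int(fields["Thread(s) per core"])
            * int(fields["Core(s) per socket"])
            * int(fields["Socket(s)"]))


def smp_counts(smp):
    """Return (declared vCPU count, sockets x cores x threads) for -smp."""
    first, *rest = smp.split(",")
    props = dict(item.split("=", 1) for item in rest)
    product = int(props["sockets"]) * int(props["cores"]) * int(props["threads"])
    return int(first), product


if __name__ == "__main__":
    fields = parse_lscpu(LSCPU)
    print(host_logical_cpus(fields), fields["CPU(s)"])
    print(smp_counts(SMP))
```

Both products come out to 4 for the values in this report, so neither the host figures nor the guest topology is internally inconsistent.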


# virsh dumpxml uefi-rhel66
<domain type='kvm' id='47'>
  <name>uefi-rhel66</name>
  <uuid>a4c10d00-ac50-400f-b05e-c629ffa7a020</uuid>
  <memory unit='KiB'>2097152</memory>
  <currentMemory unit='KiB'>2097152</currentMemory>
  <vcpu placement='static'>4</vcpu>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.1.0'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE.fd</loader>
    <nvram template='/usr/share/OVMF/OVMF_VARS.fd'>/var/lib/libvirt/qemu/nvram/uefi-rhel6_VARS.fd</nvram>
    <bootmenu enable='yes'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none' io='native'/>
      <source file='/var/lib/libvirt/images/uefi-rhel6.img'/>
      <backingStore/>
      <target dev='vda' bus='virtio'/>
      <boot order='3'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw' cache='none' io='native'/>
      <source file='/usr/share/OVMF/UefiShell.iso'/>
      <backingStore/>
      <target dev='hdc' bus='ide'/>
      <readonly/>
      <shareable/>
      <boot order='2'/>
      <alias name='ide0-1-0'/>
      <address type='drive' controller='0' bus='1' target='0' unit='0'/>
    </disk>
    <controller type='usb' index='0'>
      <alias name='usb0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'>
      <alias name='pci.0'/>
    </controller>
    <controller type='ide' index='0'>
      <alias name='ide0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='ide' index='1'>
      <alias name='ide1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <alias name='virtio-serial0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <controller type='scsi' index='0'>
      <alias name='scsi0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </controller>
    <controller type='scsi' index='1' model='virtio-scsi'>
      <alias name='scsi1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:96:79:19'/>
      <source bridge='uefi-pxe'/>
      <target dev='vnet1'/>
      <model type='virtio'/>
      <boot order='1'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <serial type='pty'>
      <source path='/dev/pts/2'/>
      <target port='0'/>
      <alias name='serial0'/>
    </serial>
    <console type='pty' tty='/dev/pts/2'>
      <source path='/dev/pts/2'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <channel type='spicevmc'>
      <target type='virtio' name='com.redhat.spice.0' state='disconnected'/>
      <alias name='channel0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <graphics type='spice' port='5901' autoport='yes' listen='0.0.0.0'>
      <listen type='address' address='0.0.0.0'/>
    </graphics>
    <video>
      <model type='cirrus' vram='16384' heads='1'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </memballoon>
  </devices>
  <seclabel type='dynamic' model='selinux' relabel='yes'>
    <label>system_u:system_r:svirt_t:s0:c301,c662</label>
    <imagelabel>system_u:object_r:svirt_image_t:s0:c301,c662</imagelabel>
  </seclabel>
</domain>
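
The `<boot order='N'/>` elements in the domain XML above determine which device the firmware tries first. A minimal sketch, using Python's standard `xml.etree.ElementTree`, that lists devices by boot order. The XML literal is a trimmed excerpt of the dump above, and `boot_order` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Trimmed excerpt of the 'virsh dumpxml' output in this report.
DOMAIN_XML = """
<domain type='kvm'>
  <devices>
    <disk type='file' device='disk'>
      <boot order='3'/>
      <alias name='virtio-disk0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <boot order='2'/>
      <alias name='ide0-1-0'/>
    </disk>
    <interface type='bridge'>
      <boot order='1'/>
      <alias name='net0'/>
    </interface>
    <memballoon model='virtio'>
      <alias name='balloon0'/>
    </memballoon>
  </devices>
</domain>
"""


def boot_order(domain_xml):
    """Return (order, element tag, alias) for every device with <boot>."""
    root = ET.fromstring(domain_xml.strip())
    entries = []
    for device in root.find("devices"):
        boot = device.find("boot")
        if boot is None:
            continue  # device does not take part in boot ordering
        alias = device.find("alias")
        name = alias.get("name") if alias is not None else None
        entries.append((int(boot.get("order")), device.tag, name))
    return sorted(entries)


if __name__ == "__main__":
    for order, tag, name in boot_order(DOMAIN_XML):
        print(order, tag, name)
```

For this domain the network interface net0 comes first, then the CD-ROM, then the virtio disk, which lines up with the `bootindex=1`, `bootindex=2` and `bootindex=3` values on the QEMU command line in step 2.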

Comment 1 Xiaoqing Wei 2015-01-19 07:21:58 UTC
Created attachment 981355 [details]
grub2 stand alone image

I searched the BZ database and found some similar bugs:

Bug 1088784 - qemu ' KVM internal error. Suberror: 1' when query cpu frequently during pxe boot in Intel "Q95xx" host
Bug 1078775 - During query cpuinfo during guest boot from ipxe repeatedly in AMD hosts, vm repeatedly reboot.
Bug 1097363 - qemu ' KVM internal error. Suberror: 1' when query cpu frequently during pxe boot in Intel "Q95xx" host

Comment 3 Xiaoqing Wei 2015-01-19 07:58:37 UTC
(In reply to Xiaoqing Wei from comment #1)

Note: the 3 bugs pasted in C#1 are already fixed, and they involve SeaBIOS, not OVMF.

Comment 4 Laszlo Ersek 2015-01-22 22:38:52 UTC
Please try to reproduce this issue against the build in bug 1181980 comment 28. This BZ could turn out to be a duplicate of bug 1181980 -- the patches for bug 1181980 should eliminate the code from iPXE that (apparently) caused an emulation failure here. Thanks.

Comment 5 Xiaoqing Wei 2015-01-26 06:27:18 UTC
Tested with ipxe-roms-qemu-20130517-6.gitc4bce43.el7.efi_fixes_2.noarch

Using virtio-net-pci (oprom), the problem did not happen in 20 PXE boots.
/usr/libexec/qemu-kvm -name uefi-rhel66 -S -machine pc-i440fx-rhel7.1.0,accel=kvm,usb=off -drive file=/usr/share/OVMF/OVMF_CODE.fd,if=pflash,format=raw,unit=0,readonly=on -drive file=/var/lib/libvirt/qemu/nvram/uefi-rhel6_VARS.fd,if=pflash,format=raw,unit=1 -m 2048 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid a4c10d00-ac50-400f-b05e-c629ffa7a020 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/uefi-rhel66.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot menu=on,strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x5 -device virtio-scsi-pci,id=scsi1,bus=pci.0,addr=0x9 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 -drive file=/var/lib/libvirt/images/uefi-rhel6.img,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,aio=native -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=3 -drive file=/usr/share/OVMF/UefiShell.iso,if=none,id=drive-ide0-1-0,readonly=on,format=raw,cache=none,aio=native -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0,bootindex=2 \
\
\
-netdev tap,fd=23,id=hostnet0,vhost=on,vhostfd=25 \
\
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:96:79:19,bus=pci.0,addr=0x3,bootindex=1 \
\
\
-chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -spice port=5901,addr=0.0.0.0,disable-ticketing,seamless-migration=on -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -msg timestamp=on
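
The package named at the top of this comment differs from the one in the original report only in its release field. A minimal sketch, assuming the usual name-version-release.arch layout of RPM package strings, that splits both names and reports the extra release suffix. The helper names `parse_nvra` and `release_suffix` are hypothetical.

```python
# Package names copied from the description and from this comment.
ORIGINAL = "ipxe-roms-qemu-20130517-6.gitc4bce43.el7.noarch"
RETESTED = "ipxe-roms-qemu-20130517-6.gitc4bce43.el7.efi_fixes_2.noarch"


def parse_nvra(package):
    """Split 'name-version-release.arch' into its four parts."""
    nvr, _, arch = package.rpartition(".")
    # The name itself may contain dashes, so split from the right.
    name, version, release = nvr.rsplit("-", 2)
    return name, version, release, arch


def release_suffix(original, retested):
    """Extra text appended to the release field, or None if the two
    packages differ in anything other than a release suffix."""
    o_name, o_ver, o_rel, o_arch = parse_nvra(original)
    r_name, r_ver, r_rel, r_arch = parse_nvra(retested)
    if (o_name, o_ver, o_arch) != (r_name, r_ver, r_arch):
        return None
    if not r_rel.startswith(o_rel):
        return None
    return r_rel[len(o_rel):]


if __name__ == "__main__":
    print(parse_nvra(RETESTED))
    print(release_suffix(ORIGINAL, RETESTED))
```

Here the only difference is the `.efi_fixes_2` suffix on the release, so the retest ran on the same upstream iPXE snapshot (20130517, gitc4bce43) with the test build's patches applied on top.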

Comment 6 Laszlo Ersek 2015-01-26 08:00:25 UTC
(In reply to Xiaoqing Wei from comment #5)
> with ipxe-roms-qemu-20130517-6.gitc4bce43.el7.efi_fixes_2.noarch
> 
> using virtio-net-pci(oprom), 20 pxe, doesn't happen

Cheers!

*** This bug has been marked as a duplicate of bug 1181980 ***