Bug 920021

Summary: qemu-kvm segmentation fault when rebooting guest after hot-unplugging a device with an option ROM
Product: Red Hat Enterprise Linux 7 Reporter: mazhang <mazhang>
Component: qemu-kvm Assignee: Bandan Das <bdas>
Status: CLOSED CURRENTRELEASE QA Contact: Virtualization Bugs <virt-bugs>
Severity: high Docs Contact:
Priority: medium    
Version: 7.0 CC: 451573170, acathrow, alex.williamson, bdas, chayang, hhuang, juzhang, knoel, lijin, mazhang, michen, virt-maint
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: qemu-kvm-1.5.3-15.el7 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-06-13 12:50:49 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Comment 2 Alex Williamson 2013-03-11 13:54:02 UTC
Does the vfio-pci assigned device have a PCI option ROM (lspci -vvv -s 2:00.0)?  If it does, does the segfault still occur if you add the option ",rombar=0" to the vfio-pci device options?  I suspect this is a general problem with option ROM handling and would also occur if you used an emulated device (ex. -device e1000,id=net0,romfile=<e1000.rom>)

Comment 3 mazhang 2013-03-12 03:14:28 UTC
Yes, the vfio-pci assigned device has a PCI option ROM. I retried with the option "rombar=0" added, and the segmentation fault did happen.
Also, using an emulated device with "-device e1000,id=net0,romfile=<e1000.rom>" reproduces this problem.

Comment 4 mazhang 2013-03-12 03:16:07 UTC
(In reply to comment #3)
> Yes, the vfio-pci assigned device have a PCI option ROM, and just retry it,
> with add the option "rombar=0", the segment fault did happened.
> plus, also used an emulated device with "-device
> e1000,id=net0,romfile=<e1000.rom>" got this problem.

Sorry, correction: with the option "rombar=0" added, the segmentation fault did not happen.
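
For anyone repeating this check, here is a minimal sketch of how the "rombar=0" option could be added to a -device string before launching qemu-kvm. The helper name with_rombar_disabled is hypothetical, and the example device string is the vfio-pci one used on the command line in this report.

```python
def with_rombar_disabled(device_arg: str) -> str:
    """Return a -device argument with rombar=0 set.

    Any existing rombar=... property is dropped first so the result
    contains exactly one rombar setting.
    """
    parts = [p for p in device_arg.split(",") if not p.startswith("rombar=")]
    parts.append("rombar=0")
    return ",".join(parts)


# Example: build the argument that would follow "-device" on the command line.
print(with_rombar_disabled("vfio-pci,host=02:00.0,id=pf"))
```

The result is passed as the value of -device; all other options on the command line stay unchanged.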

Comment 6 mazhang 2013-11-01 08:34:26 UTC
Hit this problem with a recent build.

Host:
RHEL-7.0-20131024.0
qemu-kvm-tools-1.5.3-10.el7.x86_64
qemu-img-1.5.3-10.el7.x86_64
qemu-kvm-common-1.5.3-10.el7.x86_64
qemu-kvm-1.5.3-10.el7.x86_64
kernel-3.10.0-33.el7.x86_64

Guest:
RHEL6.5
2.6.32-425.el6.x86_64

Cli:

/usr/libexec/qemu-kvm \
-M pc \
-cpu SandyBridge \
-m 2G \
-smp 2,sockets=1,cores=2,threads=1 \
-name rhel6u4 \
-uuid 990ea161-6b67-47b2-b803-19fb01d30d12 \
-smbios type=1,manufacturer='Red Hat',product='RHEV Hypervisor',version=el6,serial=koTUXQrb,uuid=feebc8fd-f8b0-4e75-abc3-e63fcdb67170 \
-k en-us \
-rtc base=localtime,clock=host,driftfix=slew \
-monitor stdio \
-qmp tcp:0:6666,server,nowait \
-boot menu=on \
-chardev socket,path=/tmp/isa-serial,server,nowait,id=isa1 \
-device isa-serial,chardev=isa1,id=isa-serial1 \
-drive file=/home/rhel6u5-64.qcow2,if=none,id=drive-scsi-disk,format=qcow2,cache=none,werror=stop,rerror=stop \
-device virtio-scsi-pci,id=scsi0,addr=0x5 \
-device scsi-disk,drive=drive-scsi-disk,bus=scsi0.0,scsi-id=0,lun=0,id=scsi-disk,bootindex=1 \
-vga qxl \
-spice port=5900,disable-ticketing \
-nodefaults \
-net none \
-device vfio-pci,host=02:00.0,id=pf \
-enable-kvm

Result:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffeab08700 (LWP 3268)]
0x00007ffff3407040 in __memcmp_sse4_1 () from /lib64/libc.so.6
(gdb) bt full
#0  0x00007ffff3407040 in __memcmp_sse4_1 () from /lib64/libc.so.6
No symbol table info available.
#1  0x0000555555763ff9 in vapic_prepare ()
No symbol table info available.
#2  0x000055555576413e in vapic_write ()
No symbol table info available.
#3  0x0000555555786f12 in access_with_adjusted_size ()
No symbol table info available.
#4  0x00005555557883e7 in memory_region_iorange_write ()
No symbol table info available.
#5  0x0000555555785cc5 in kvm_cpu_exec ()
No symbol table info available.
#6  0x0000555555731005 in qemu_kvm_cpu_thread_fn ()
No symbol table info available.
#7  0x00007ffff6259de3 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#8  0x00007ffff339f1ad in clone () from /lib64/libc.so.6
No symbol table info available.

This problem only occurs with the 82541PI; switching to an 82599EB (on another host) works well.

[root@localhost ~]# lspci -vvv -s 02:00.0
02:00.0 Ethernet controller: Intel Corporation 82541PI Gigabit Ethernet Controller (rev 05)
	Subsystem: Intel Corporation PRO/1000 GT Desktop Adapter
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
	Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 32 (63750ns min), Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 20
	Region 0: Memory at f7d40000 (32-bit, non-prefetchable) [size=128K]
	Region 1: Memory at f7d20000 (32-bit, non-prefetchable) [size=128K]
	Region 2: I/O ports at d000 [size=64]
	Expansion ROM at f7d00000 [disabled] [size=128K]
	Capabilities: [dc] Power Management version 2
		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
	Capabilities: [e4] PCI-X non-bridge device
		Command: DPERE- ERO+ RBC=512 OST=1
		Status: Dev=00:00.0 64bit- 133MHz- SCD- USC- DC=simple DMMRBC=2048 DMOST=1 DMCRS=8 RSCEM- 266MHz- 533MHz-
	Kernel driver in use: e1000
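
Since the crash appears to depend on whether the assigned device exposes an option ROM, a small script can check lspci output for an "Expansion ROM" line. This is only a sketch: the function name expansion_rom is hypothetical, and it assumes the line has the same shape as in the lspci output above (an address followed by bracketed flags).

```python
import re


def expansion_rom(lspci_text: str):
    """Return details of the Expansion ROM line in lspci output, or None.

    The result is a dict with the ROM address, whether the ROM BAR is
    enabled (no "[disabled]" flag), and the size string if present.
    """
    m = re.search(r"Expansion ROM at ([0-9a-fA-F]+)((?:\s*\[[^\]]*\])*)", lspci_text)
    if m is None:
        return None
    flags = re.findall(r"\[([^\]]*)\]", m.group(2))
    size = next((f.split("=", 1)[1] for f in flags if f.startswith("size=")), None)
    return {
        "address": m.group(1),
        "enabled": "disabled" not in flags,
        "size": size,
    }


sample = "\tRegion 2: I/O ports at d000 [size=64]\n\tExpansion ROM at f7d00000 [disabled] [size=128K]\n"
print(expansion_rom(sample))
```

A return value of None means no Expansion ROM line with an assigned address was found in the given text.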

Comment 8 mazhang 2013-11-04 10:00:16 UTC
Sent an email with the host IP and password; please let me know if there are any problems.

Comment 9 Paolo Bonzini 2013-11-06 13:06:50 UTC
*** Bug 988256 has been marked as a duplicate of this bug. ***

Comment 10 danielgao 2013-11-06 13:34:36 UTC
Upstream qemu has fixed this bug.
The fix is in this patch series:
[uq/master][PATCH 0/3] Fix initialization bugs in kvmvapic

Comment 11 Miroslav Rezanina 2013-11-07 08:23:35 UTC
Fix included in qemu-kvm-1.5.3-15.el7

Comment 13 mazhang 2013-11-22 10:19:36 UTC
Reproduced this bug with qemu-kvm-1.5.3-14.el7.x86_64.

host:
qemu-kvm-1.5.3-14.el7.x86_64

guest:
RHEL-7.0-20131121.0
kernel-3.10.0-50.el7.x86_64

Steps: same as comment 0.

Result:
qemu-kvm process core dumped.

(qemu) cmdline: line 25:  2186 Segmentation fault      (core dumped)


Verified this bug with qemu-kvm-1.5.3-19.el7.x86_64.

host:
[root@localhost home]# rpm -qa |grep qemu
qemu-kvm-tools-1.5.3-19.el7.x86_64
qemu-kvm-common-1.5.3-19.el7.x86_64
ipxe-roms-qemu-20130517-1.gitc4bce43.el7.noarch
qemu-img-1.5.3-19.el7.x86_64
libvirt-daemon-driver-qemu-1.1.1-2.el7.x86_64
qemu-kvm-1.5.3-19.el7.x86_64

[root@localhost ~]# lspci -v -s 02:00.0
02:00.0 Ethernet controller: Intel Corporation 82541PI Gigabit Ethernet Controller (rev 05)
	Subsystem: Intel Corporation PRO/1000 GT Desktop Adapter
	Flags: 66MHz, medium devsel, IRQ 20
	Memory at f7d40000 (32-bit, non-prefetchable) [size=128K]
	Memory at f7d20000 (32-bit, non-prefetchable) [size=128K]
	I/O ports at d000 [size=64]
	Expansion ROM at f7d00000 [disabled] [size=128K]
	Capabilities: [dc] Power Management version 2
	Capabilities: [e4] PCI-X non-bridge device
	Kernel driver in use: vfio-pci

guest:
RHEL-7.0-20131121.0
kernel-3.10.0-50.el7.x86_64

Tried about 5 times: after hot-unplugging the VF, the guest works well, and no core dump occurred on reboot or shutdown.
So this bug has been fixed.

Comment 15 Ludek Smid 2014-06-13 12:50:49 UTC
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.