Bug 904917 - core dump while assigning PF to guest by vfio without -enable-kvm
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm
Version: 7.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Alex Williamson
QA Contact: Virtualization Bugs
Depends On:
Blocks:
Reported: 2013-01-28 02:31 EST by Chao Yang
Modified: 2014-06-17 23:21 EDT

See Also:
Fixed In Version: qemu-1.3.1
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-06-13 06:07:39 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:


Attachments
log from booting a guest with PF assigned without -enable-kvm (66.89 KB, application/octet-stream)
2013-03-06 04:47 EST, Chao Yang
lspci dump of this assigned PF (2.23 KB, application/octet-stream)
2013-03-06 04:50 EST, Chao Yang

Description Chao Yang 2013-01-28 02:31:39 EST
Description of problem:
Assigning a dual-port 82576 PF to a RHEL 6.4 guest via vfio with -M q35 causes a core dump.

Version-Release number of selected component (if applicable):
3.7.0-0.32.el7.x86_64
qemu-kvm-1.3.0-3.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. unbind PF from host
2. bind it to vfio-pci
3. assign it to guest with -M q35
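
Steps 1 and 2 can be sketched as a dry-run plan of sysfs writes. This is only an illustration, not the reporter's actual procedure: the helper name `vfio_bind_plan` is hypothetical, the PCI domain `0000` is assumed, and the vendor/device pair `8086 10c9` is an assumed ID for the 82576 that should be confirmed with `lspci -n`. The function only returns the writes; it does not touch sysfs.

```python
from pathlib import PurePosixPath

SYSFS_PCI = PurePosixPath("/sys/bus/pci")


def vfio_bind_plan(bdf, vendor_id, device_id):
    """Return the ordered (path, value) sysfs writes; nothing is written here."""
    device_dir = SYSFS_PCI / "devices" / bdf
    return [
        # Step 1: detach the device from its current host driver.
        (str(device_dir / "driver" / "unbind"), bdf),
        # Step 2: tell vfio-pci to claim devices with this vendor/device ID.
        (str(SYSFS_PCI / "drivers" / "vfio-pci" / "new_id"),
         f"{vendor_id} {device_id}"),
    ]


# Assumed values: domain 0000, and 8086:10c9 as the 82576 ID (verify with lspci -n).
for path, value in vfio_bind_plan("0000:23:00.0", "8086", "10c9"):
    print(f"echo {value} > {path}")
```

On a dual-port card, the second function may sit in the same IOMMU group and may need the same treatment before the guest can start.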
  
Actual results:
Core dump happened

Expected results:
no core dump

Additional info:
#0  kvm_ioctl (s=0x0, type=type@entry=44547) at /usr/src/debug/qemu-1.3.0/kvm-all.c:1629
1629	    ret = ioctl(s->fd, type, arg);

(gdb) bt
#0  kvm_ioctl (s=0x0, type=type@entry=44547) at /usr/src/debug/qemu-1.3.0/kvm-all.c:1629
#1  0x00007f286e8f5c30 in kvm_check_extension (s=<optimized out>, extension=extension@entry=82) at /usr/src/debug/qemu-1.3.0/kvm-all.c:495
#2  0x00007f286e8e0917 in vfio_enable_intx (vdev=0x7f2870d4eb50) at /usr/src/debug/qemu-1.3.0/hw/vfio_pci.c:441
#3  0x00007f286e8e14bd in vfio_initfn (pdev=0x7f2870d4eb50) at /usr/src/debug/qemu-1.3.0/hw/vfio_pci.c:2006
#4  0x00007f286e7cc02a in pci_qdev_init (qdev=0x7f2870d4eb50) at hw/pci.c:1631
#5  0x00007f286e7ded1f in qdev_init (dev=dev@entry=0x7f2870d4eb50) at hw/qdev.c:155
#6  0x00007f286e7d9d29 in qdev_device_add (opts=0x7f2870b39b90) at hw/qdev-monitor.c:481
#7  0x00007f286e89de29 in device_init_func (opts=<optimized out>, opaque=<optimized out>) at vl.c:2052
#8  0x00007f286e85ae53 in qemu_opts_foreach (list=<optimized out>, func=func@entry=0x7f286e89de10 <device_init_func>, opaque=opaque@entry=0x0, 
    abort_on_failure=abort_on_failure@entry=1) at qemu-option.c:1106
#9  0x00007f286e7021e9 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:3885
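
The `type=44547` argument in frame #0 is an ioctl request number. A small sketch, assuming the usual Linux `_IOC` bit layout on x86 (8-bit number, 8-bit type, 14-bit size, 2-bit direction), splits it into its fields. The helper name `decode_ioctl` is hypothetical.

```python
def decode_ioctl(request):
    """Split a Linux ioctl request number into its _IOC fields (x86 layout assumed)."""
    return {
        "nr": request & 0xFF,              # bits 0-7: command number
        "type": (request >> 8) & 0xFF,     # bits 8-15: subsystem "magic" byte
        "size": (request >> 16) & 0x3FFF,  # bits 16-29: argument size
        "dir": (request >> 30) & 0x3,      # bits 30-31: transfer direction
    }


fields = decode_ioctl(44547)
print({name: hex(value) for name, value in fields.items()})
```

The result is type 0xAE with number 0x03 and no size or direction bits, which is how `linux/kvm.h` defines `KVM_CHECK_EXTENSION` (`_IO(KVMIO, 0x03)`) and agrees with `kvm_check_extension` in frame #1. The `extension=82` argument appears to correspond to `KVM_CAP_IRQFD_RESAMPLE`, which would fit a call made from `vfio_enable_intx`. With `s=0x0`, the `s->fd` dereference on line 1629 is the crash.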


Host dmesg:
[ 2049.881255] qemu-kvm[4356]: segfault at 400 ip 00007f286e8f5ba1 sp 00007fff0d331c90 error 4 in qemu-kvm[7f286e684000+3e1000]
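
The dmesg line can be decoded the same way. The sketch below assumes the standard x86 segfault message format and page-fault error-code bits (bit 0 = page present, bit 1 = write, bit 2 = user mode); `decode_segfault` is a hypothetical helper name.

```python
import re

# Host dmesg line copied from the report (timestamp dropped).
DMESG = ("qemu-kvm[4356]: segfault at 400 ip 00007f286e8f5ba1 "
         "sp 00007fff0d331c90 error 4 in qemu-kvm[7f286e684000+3e1000]")

PATTERN = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Extract the fault address, code offset, and access type from a segfault line."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not an x86 segfault line")
    addr, ip, err, base, size = (
        int(m[key], 16) for key in ("addr", "ip", "err", "base", "size")
    )
    return {
        "fault_addr": addr,
        "object": m["obj"],
        "offset_in_object": ip - base,  # usable with addr2line on the binary
        "ip_inside_mapping": base <= ip < base + size,
        "page_present": bool(err & 1),
        "write_access": bool(err & 2),
        "user_mode": bool(err & 4),
    }


info = decode_segfault(DMESG)
print(hex(info["fault_addr"]), hex(info["offset_in_object"]), info)
```

Error code 4 decodes to a user-mode read of a non-present page, and the faulting address 0x400 is consistent with reading a structure member at offset 0x400 through a NULL pointer, which matches `s=0x0` in the backtrace.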
Comment 3 Alex Williamson 2013-01-28 10:31:02 EST
Please always include command lines used to start the guest.

This only happens when -enable-kvm is not specified.  It's already been fixed upstream and should be in qemu-1.3.1.
Comment 4 Chao Yang 2013-01-28 23:01:24 EST
(In reply to comment #3)
> Please always include command lines used to start the guest.
> 
> This only happens when -enable-kvm is not specified.  It's already been
> fixed upstream and should be in qemu-1.3.1.

Retested with -enable-kvm; a core dump happened again, but with a different backtrace:
/usr/libexec/qemu-kvm -M q35 -monitor stdio -drive file=/home/RHEL-Server-6.4-64-virtio.qcow2,if=none,id=drive-ide0-0-0,format=qcow2,cache=none -device virtio-blk-pci,drive=drive-ide0-0-0 -vnc :1 -m 2048 -smp 2 -device vfio-pci,host=23:00.0,id=pf -net none -enable-kvm
BDB2053 Freeing read locks for locker 0x8: 4412/140398590379968
BDB2053 Freeing read locks for locker 0x9: 4412/140398590379968
BDB2053 Freeing read locks for locker 0xa: 4412/140398590379968
BDB2053 Freeing read locks for locker 0xb: 4412/140398590379968
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7fffec998700 (LWP 4430)]
[New Thread 0x7fffe7dfe700 (LWP 4431)]
[New Thread 0x7fffe75fd700 (LWP 4432)]
qemu-kvm: -device vfio-pci,host=23:00.0,id=pf: PCI: Bug - unimplemented PCI INTx routing (q35-pcihost)

[New Thread 0x7fffe5e44700 (LWP 4434)]
qemu-kvm: PCI: Bug - unimplemented PCI INTx routing (q35-pcihost)

QEMU 1.3.0 monitor - type 'help' for more information
(qemu) 
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffe7dfe700 (LWP 4431)]
0x00007ffff2b5053c in __memcmp_sse2 () from /lib64/libc.so.6

(gdb) bt
#0  0x00007ffff2b5053c in __memcmp_sse2 () from /lib64/libc.so.6
#1  0x000055555579d1b2 in patch_hypercalls (s=0x555556660470) at /usr/src/debug/qemu-1.3.0/hw/kvmvapic.c:546
#2  vapic_prepare (s=s@entry=0x555556660470) at /usr/src/debug/qemu-1.3.0/hw/kvmvapic.c:611
#3  0x000055555579d536 in vapic_write (opaque=0x555556660470, addr=<optimized out>, data=<optimized out>, size=<optimized out>)
    at /usr/src/debug/qemu-1.3.0/hw/kvmvapic.c:648
#4  0x00005555557c9322 in access_with_adjusted_size (addr=addr@entry=0, value=value@entry=0x7fffe7dfdb38, size=2, access_size_min=<optimized out>, 
    access_size_max=<optimized out>, access=access@entry=0x5555557c9940 <memory_region_write_accessor>, opaque=opaque@entry=0x555556662798)
    at /usr/src/debug/qemu-1.3.0/memory.c:364
#5  0x00005555557ca997 in memory_region_iorange_write (iorange=<optimized out>, offset=0, width=2, data=32)
    at /usr/src/debug/qemu-1.3.0/memory.c:439
#6  0x00005555557c77c6 in kvm_handle_io (count=1, size=2, direction=1, data=<optimized out>, port=126) at /usr/src/debug/qemu-1.3.0/kvm-all.c:1426
#7  kvm_cpu_exec (env=env@entry=0x55555664d600) at /usr/src/debug/qemu-1.3.0/kvm-all.c:1571
#8  0x00005555557746d1 in qemu_kvm_cpu_thread_fn (arg=0x55555664d600) at /usr/src/debug/qemu-1.3.0/cpus.c:757
#9  0x00007ffff6272d15 in start_thread () from /lib64/libpthread.so.0
#10 0x00007ffff2bba2cd in clone () from /lib64/libc.so.6
Comment 5 Alex Williamson 2013-01-28 23:17:08 EST
Does that stack trace only happen when the vfio-pci device is included?  I don't see anything in the stack trace suggesting vfio.
Comment 6 Chao Yang 2013-01-28 23:43:35 EST
(In reply to comment #5)
> Does that stack trace only happen when the vfio-pci device is included?  I
> don't see anything in the stack trace suggesting vfio.

This also happens without vfio-pci device.
Comment 7 Alex Williamson 2013-01-28 23:51:43 EST
Unrelated to this bug then, but you might try adding -boot c
Comment 8 Chao Yang 2013-01-29 00:18:26 EST
(In reply to comment #7)
> Unrelated to this bug then, but you might try adding -boot c

Still reproducible with -boot c. I will open a new bug for it.
Comment 9 Chao Yang 2013-03-06 04:45:53 EST
Tried to verify this bug on QEMU 1.4.0 with the same steps described in comment #0, but ran into two issues:
# qemu-system-x86_64 -version
QEMU emulator version 1.4.0, Copyright (c) 2003-2008 Fabrice Bellard
# uname -r
3.8.0-0.40.el7.x86_64


ISSUE 1:
----------
Log will be attached.

ISSUE 2:
----------
Hit once: "qemu: fatal: Trying to execute code outside RAM or ROM at 0x00000000000a8000"


CLI:
# qemu-system-x86_64 -M q35 -monitor stdio -drive file=rhel7.0.qcow2,if=none,id=drive-ide0-0-0,format=qcow2,cache=none -device virtio-blk-pci,drive=drive-ide0-0-0,bootindex=1 -vnc :1 -m 2048 -smp 2 -netdev tap,id=hostnet0,vhost=on -device virtio-net-pci,netdev=hostnet0,id=net0,mac=78:2b:cb:6e:41:22 -balloon none -boot menu=on -device vfio-pci,host=22:00.0,id=pf,rombar=0 -serial unix:/tmp/test,server,nowait
QEMU 1.4.0 monitor - type 'help' for more information
(qemu) 
(qemu) sy
system_powerdown  system_reset      system_wakeup     
(qemu) system_reset 
(qemu) qemu: fatal: Trying to execute code outside RAM or ROM at 0x00000000000a8000

EAX=00000002 EBX=000000b2 ECX=000000b2 EDX=000000b2
ESI=00000002 EDI=000000b2 EBP=818e5e78 ESP=818e5e78
EIP=00008000 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=1 HLT=0
ES =0000 00000000 ffffffff 00000000
CS =a000 000a0000 ffffffff 00000000
SS =0000 00000000 ffffffff 00000000
DS =0000 00000000 ffffffff 00000000
FS =0000 00000000 ffffffff 00000000
GS =0000 00000000 ffffffff 00000000
LDT=0000 00000000 00000000 00008200
TR =0040 7fc11d80 00002087 00008900
GDT=     7fc04000 0000007f
IDT=     81b3b000 00000fff
CR0=00050032 CR2=00000000 CR3=018f0000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=0000000000000000 DR7=0000000000000400
CCS=00000000 CCD=00005400 CCO=EFLAGS  
EFER=0000000000000000
FCW=037f FSW=2000 [ST=4] FTW=f0 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=a9d4000000000000 400d FPR5=a9dc000000000000 400d
FPR6=a8e8000000000000 400d FPR7=a96a000000000000 400d
XMM00=00000000000000000000000000000000 XMM01=00000000000000000000000000000000
XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000
XMM04=00000000000000000000000000000000 XMM05=00000000000000000000000000000000
XMM06=00000000000000000000000000000000 XMM07=00000000000000000000000000000000
Aborted (core dumped)
Comment 10 Chao Yang 2013-03-06 04:47:18 EST
Created attachment 705853 [details]
log from booting a guest with PF assigned without -enable-kvm
Comment 11 Chao Yang 2013-03-06 04:50:49 EST
Created attachment 705854 [details]
lspci dump of this assigned PF
Comment 12 Chao Yang 2013-03-06 04:54:08 EST
With or without the PF assigned, booting the guest without -enable-kvm hits the call trace in both cases.
Comment 13 Alex Williamson 2013-03-06 08:53:30 EST
Comments 9 and 10 suggest that the guest now boots and we only start getting into new problems when a system_reset is issued.  Why aren't we opening new bugs for this?  One bug, one issue.  The tombstone in comment 9 suggests a vga problem rather than an assigned device problem, especially since the assigned device doesn't even have a ROM.  In the new bug you could try to narrow the problem using various -vga options.
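
A quick check of the register dump in comment 9 supports reading the failure as unrelated to the assigned device. The sketch below adds the CS base to EIP, both copied from the dump, and tests the result against the conventional legacy VGA window 0xA0000 to 0xBFFFF. That range is a general PC convention assumed here, not something stated in the log.

```python
# Values copied from the register dump in comment 9.
CS_BASE = 0x000A0000
EIP = 0x00008000

# Conventional legacy VGA memory window on PC platforms (assumption).
LEGACY_VGA_WINDOW = range(0xA0000, 0xC0000)


def linear_address(segment_base, offset):
    """Linear address of the next instruction for a 32-bit segment base and offset."""
    return (segment_base + offset) & 0xFFFFFFFF


addr = linear_address(CS_BASE, EIP)
print(hex(addr), addr in LEGACY_VGA_WINDOW)
```

The sum equals the address in the fatal message, 0xa8000, and it falls inside that window rather than in any region belonging to the assigned NIC. The dump also shows SMM=1, so the CPU was in system management mode at the time.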
Comment 15 Chao Yang 2014-01-13 04:36:13 EST
Verified as passed with qemu-kvm-1.5.3-35.el7.x86_64.

Steps:
1. Boot the guest with the device assigned, both with and without KVM support on the command line
# /usr/libexec/qemu-kvm -M q35 -monitor stdio -drive file=/home/chayang/RHEL-Server-6.4-64-virtio.qcow2,if=none,id=drive-ide0-0-0,format=qcow2,cache=none -device virtio-blk-pci,drive=drive-ide0-0-0 -vnc :1 -m 2048 -smp 2 -device vfio-pci,host=03:01.1,id=pf -net none

Result:
Without KVM support, the guest boots successfully. So does the guest with KVM support.

As per the above, this issue has been fixed.
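
The two verification runs differ only in the `-enable-kvm` flag. A minimal sketch that builds both argument lists from the command line above (the helper name `qemu_argv` is hypothetical; every option is copied from this comment):

```python
def qemu_argv(image, host_bdf, enable_kvm):
    """Build the verification command line, optionally with -enable-kvm."""
    argv = [
        "/usr/libexec/qemu-kvm", "-M", "q35", "-monitor", "stdio",
        "-drive",
        f"file={image},if=none,id=drive-ide0-0-0,format=qcow2,cache=none",
        "-device", "virtio-blk-pci,drive=drive-ide0-0-0",
        "-vnc", ":1", "-m", "2048", "-smp", "2",
        "-device", f"vfio-pci,host={host_bdf},id=pf",
        "-net", "none",
    ]
    if enable_kvm:
        argv.append("-enable-kvm")
    return argv


IMAGE = "/home/chayang/RHEL-Server-6.4-64-virtio.qcow2"
for kvm in (False, True):
    print(" ".join(qemu_argv(IMAGE, "03:01.1", kvm)))
```

Each list can be handed to `subprocess.run` on a host where the device is already bound to vfio-pci.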
Comment 17 Ludek Smid 2014-06-13 06:07:39 EDT
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.
