Description of problem:
I have a setup with 4 monitors and connect with a spice client to a 4-screen guest (4 qxl devices). A KVM internal error sometimes occurs when shutting down a multi-screen Windows guest. The guest has virtio-serial installed (as well as the qxl graphics driver and spice-vdagent). The guest ends up in the PAUSED state; the qemu monitor remains responsive. I suspect it happens when I move or click the mouse on the screens during shutdown. I have not seen it on a single-monitor guest setup yet.

qemu-kvm outputs:

KVM internal error. Suberror: 1
rax 0000000000000050 rbx 0000000000000050 rcx 0000000000000050 rdx 00000000fcc39b54
rsi 00000000fcc39b54 rdi 000000009ddeb800 rsp 000000008cc132b8 rbp 000000008cc132c0
r8  0000000000000000 r9  0000000000000000 r10 0000000000000000 r11 0000000000000000
r12 0000000000000000 r13 0000000000000000 r14 0000000000000000 r15 0000000000000000
rip 00000000ac2e69c6 rflags 00010202
cs 0008 (00000000/ffffffff p 1 dpl 0 db 1 s 1 type b l 0 g 1 avl 0)
ds 0023 (00000000/ffffffff p 1 dpl 3 db 1 s 1 type 3 l 0 g 1 avl 0)
es 0023 (00000000/ffffffff p 1 dpl 3 db 1 s 1 type 3 l 0 g 1 avl 0)
ss 0010 (00000000/ffffffff p 1 dpl 0 db 1 s 1 type 3 l 0 g 1 avl 0)
fs 0030 (82744c00/00003748 p 1 dpl 0 db 1 s 1 type 3 l 0 g 0 avl 0)
gs 0000 (00000000/ffffffff p 0 dpl 0 db 0 s 0 type 0 l 0 g 0 avl 0)
tr 0028 (801da000/000020ab p 1 dpl 0 db 0 s 0 type b l 0 g 0 avl 0)
ldt 0000 (00000000/ffffffff p 0 dpl 0 db 0 s 0 type 0 l 0 g 0 avl 0)
gdt 80b95000/3ff
idt 80b95400/7ff
cr0 80010031 cr2 88570840 cr3 7f1973e0 cr4 6f8 cr8 0 efer 800
emulation failure, check dmesg for details

and dmesg | grep kvm shows:

kvm: 20793: cpu0 unimplemented perfctr wrmsr: 0xc1 data 0xabcd
kvm: 21084: cpu0 unimplemented perfctr wrmsr: 0xc1 data 0xabcd
kvm: 23719: cpu0 unhandled wrmsr: 0x198 data 0
kvm: 23719: cpu1 unhandled wrmsr: 0x198 data 0

qemu-kvm cli:

/usr/libexec/qemu-kvm -S -M rhel6.1.0 -enable-kvm -m 2048 -smp 2,sockets=2,cores=1,threads=1 -name Win7 -uuid 807c6eca-0eb0-b4c4-7164-47afd27c036b -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/Win7.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime -no-shutdown -boot order=d,menu=on -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 -drive file=/dev/rootvg/Windows7_test,if=none,id=drive-ide0-0-0,format=raw,cache=none,aio=native -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -drive file=/usr/share/rhev-guest-tools-iso/RHEV-toolsSetup_3.0_29.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,fd=23,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:03:d9:0c,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -usb -spice port=3010,addr=0.0.0.0,disable-ticketing,disable-copy-paste -vga qxl -global qxl-vga.vram_size=67108864 -device qxl,id=video1,vram_size=67108864,bus=pci.0,addr=0x7 -device qxl,id=video2,vram_size=9437184,bus=pci.0,addr=0x8 -device qxl,id=video3,vram_size=9437184,bus=pci.0,addr=0x9 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5

Version-Release number of selected component (if applicable):
$ uname -a
Linux dhcp-29-89.brq.redhat.com 2.6.32-217.el6.x86_64 #1 SMP Sat Nov 5 17:49:25 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
qemu-kvm-0.12.1.2-2.209.el6_2.1.x86_64
spice-server-0.8.2-5.el6.x86_64
Windows7x32 guest

How reproducible:
1/5

Steps to Reproduce:
1. Shut down a multi-monitor guest with an open spice session, possibly clicking or moving the mouse during shutdown (guest with virtio-serial installed).

Actual results:
KVM internal error

Expected results:
Smooth shutdown

Additional info:
Created attachment 560041 [details] backtrace from PAUSED qemu-kvm process
When guest is paused after the error do "x/i $rip" in the monitor.
(In reply to comment #2)
> When guest is paused after the error do "x/i $rip" in the monitor.

virsh # qemu-monitor-command Win7 --hmp "x/i $rip"
unknown register

This?
(In reply to comment #4)
> (In reply to comment #2)
> > When guest is paused after the error do "x/i $rip" in the monitor.
>
> virsh # qemu-monitor-command Win7 --hmp "x/i $rip"
> unknown register
>
> This?

Maybe it is $eip. Or just copy the rip address from the register dump. For the dump from comment #1 it will be: "x/i 0xac2e69c6"
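The second suggestion above (copy the rip value out of the register dump) can be sketched as a small helper. This is only an illustration: the function names are hypothetical, and it assumes the dump keeps the "rip <hex>" layout shown in the description.

```python
import re


def rip_from_dump(dump: str) -> int:
    """Return the rip value found in a qemu-kvm register dump.

    Assumes the dump contains a token pair of the form "rip <hex digits>".
    """
    match = re.search(r"\brip\s+([0-9a-fA-F]+)", dump)
    if match is None:
        raise ValueError("no rip value found in dump")
    return int(match.group(1), 16)


def disassemble_command(dump: str) -> str:
    """Build the monitor command that disassembles the instruction at rip."""
    return f"x/i {rip_from_dump(dump):#x}"


# Fragment of the register dump from the description.
DUMP = "r15 0000000000000000 rip 00000000ac2e69c6 rflags 00010202"
print(disassemble_command(DUMP))
```

The resulting string can be passed to the monitor in place of "x/i $rip" when the register name is not recognised.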
(In reply to comment #5)
> (In reply to comment #4)
> > (In reply to comment #2)
> > > When guest is paused after the error do "x/i $rip" in the monitor.
> >
> > virsh # qemu-monitor-command Win7 --hmp "x/i $rip"
> > unknown register
> >
> > This?
>
> May be it is $eip. Or just copy rip address from the register dump. For the
> dump from comment #1 it will be: "x/i 0xac2e69c6"

virsh # qemu-monitor-command Win7 --hmp "x/i $eip"
0x00000000ac2e69c6: movntdq %xmm0,(%edi)

virsh # qemu-monitor-command Win7 --hmp "x/i 0xac2e69c6"
0x00000000ac2e69c6: movntdq %xmm0,(%edi)
(In reply to comment #6)
> virsh # qemu-monitor-command Win7 --hmp "x/i 0xac2e69c6"
> 0x00000000ac2e69c6: movntdq %xmm0,(%edi)

Can you also provide the output of the "info pci" monitor command for the guest?
(In reply to comment #7)
> (In reply to comment #6)
> > virsh # qemu-monitor-command Win7 --hmp "x/i 0xac2e69c6"
> > 0x00000000ac2e69c6: movntdq %xmm0,(%edi)
>
> Can you also provide output of "info pci" monitor command for the guest?

qemu-monitor-command Win7 --hmp "info pci"
  Bus 0, device 0, function 0:
    Host bridge: PCI device 8086:1237
      id ""
  Bus 0, device 1, function 0:
    ISA bridge: PCI device 8086:7000
      id ""
  Bus 0, device 1, function 1:
    IDE controller: PCI device 8086:7010
      BAR4: I/O at 0xc000 [0xc00f].
      id ""
  Bus 0, device 1, function 2:
    USB controller: PCI device 8086:7020
      IRQ 11.
      BAR4: I/O at 0xc020 [0xc03f].
      id ""
  Bus 0, device 1, function 3:
    Bridge: PCI device 8086:7113
      IRQ 9.
      id ""
  Bus 0, device 2, function 0:
    VGA controller: PCI device 1b36:0100
      IRQ 10.
      BAR0: 32 bit memory at 0xf0000000 [0xf3ffffff].
      BAR1: 32 bit memory at 0xe0000000 [0xe3ffffff].
      BAR2: 32 bit memory at 0xf4000000 [0xf4001fff].
      BAR3: I/O at 0xc040 [0xc05f].
      BAR6: 32 bit memory at 0xffffffffffffffff [0x0000fffe].
      id ""
  Bus 0, device 3, function 0:
    Ethernet controller: PCI device 10ec:8139
      IRQ 5.
      BAR0: I/O at 0xc100 [0xc1ff].
      BAR1: 32 bit memory at 0xf4020000 [0xf40200ff].
      BAR6: 32 bit memory at 0xffffffffffffffff [0x0000fffe].
      id "net0"
  Bus 0, device 4, function 0:
    Class 0403: PCI device 8086:2668
      IRQ 11.
      BAR0: 32 bit memory at 0xffffffffffffffff [0x00003ffe].
      id "sound0"
  Bus 0, device 5, function 0:
    RAM controller: PCI device 1af4:1002
      IRQ 10.
      BAR0: I/O at 0xc200 [0xc21f].
      id "balloon0"
  Bus 0, device 6, function 0:
    Class 0780: PCI device 1af4:1003
      IRQ 10.
      BAR0: I/O at 0xffffffffffffffff [0x001e].
      BAR1: 32 bit memory at 0xffffffffffffffff [0x00000ffe].
      id "virtio-serial0"
  Bus 0, device 7, function 0:
    Display controller: PCI device 1b36:0100
      IRQ 5.
      BAR0: 32 bit memory at 0xffffffffffffffff [0x03fffffe].
      BAR1: 32 bit memory at 0xffffffffffffffff [0x03fffffe].
      BAR2: 32 bit memory at 0xffffffffffffffff [0x00001ffe].
      BAR3: I/O at 0xffffffffffffffff [0x001e].
      id "video1"
  Bus 0, device 8, function 0:
    Display controller: PCI device 1b36:0100
      IRQ 11.
      BAR0: 32 bit memory at 0xec000000 [0xefffffff].
      BAR1: 32 bit memory at 0xf5000000 [0xf5ffffff].
      BAR2: 32 bit memory at 0xf6000000 [0xf6001fff].
      BAR3: I/O at 0xc260 [0xc27f].
      id "video2"
  Bus 0, device 9, function 0:
    Display controller: PCI device 1b36:0100
      IRQ 10.
      BAR0: 32 bit memory at 0xf8000000 [0xfbffffff].
      BAR1: 32 bit memory at 0xfd000000 [0xfdffffff].
      BAR2: 32 bit memory at 0xf6002000 [0xf6003fff].
      BAR3: I/O at 0xc280 [0xc29f].
      id "video3
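In the listing above, several BARs report the address 0xffffffffffffffff, meaning no address is currently assigned. A minimal sketch for spotting such devices in "info pci" text follows. The function name and the sample subset are illustrative assumptions, and BAR6 (the expansion ROM) is skipped because it also appears unassigned on devices that are otherwise configured.

```python
import re

UNASSIGNED = 0xFFFFFFFFFFFFFFFF
ROM_BAR = 6  # BAR6 is the expansion ROM; ignored in this sketch


def unassigned_devices(info_pci: str) -> list:
    """Return device numbers that have a non-ROM BAR at the all-ones address."""
    # Splitting on the device header yields [prefix, dev, body, dev, body, ...].
    chunks = re.split(r"Bus\s+\d+, device\s+(\d+), function\s+\d+:", info_pci)
    found = []
    for dev, body in zip(chunks[1::2], chunks[2::2]):
        bars = re.findall(
            r"BAR(\d+): (?:I/O|\d+ bit memory) at 0x([0-9a-fA-F]+)", body
        )
        if any(
            int(num) != ROM_BAR and int(addr, 16) == UNASSIGNED
            for num, addr in bars
        ):
            found.append(int(dev))
    return found


# Abbreviated subset of the output above (devices 7 and 8 only).
SAMPLE = """
Bus 0, device 7, function 0:
  Display controller: PCI device 1b36:0100
    BAR0: 32 bit memory at 0xffffffffffffffff [0x03fffffe].
    BAR3: I/O at 0xffffffffffffffff [0x001e].
    id "video1"
Bus 0, device 8, function 0:
  Display controller: PCI device 1b36:0100
    BAR0: 32 bit memory at 0xec000000 [0xefffffff].
    BAR3: I/O at 0xc260 [0xc27f].
    id "video2"
"""
print(unassigned_devices(SAMPLE))
```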
Here are the findings after a long IRC debug session:

The instruction that fails is movntdq %xmm0,(%edi). We indeed do not emulate it, but it should not normally be used for mmio. The instruction tries to access the address in %edi (0x9ddeb800). Walking the page tables showed that it maps to physical address 0xeb400000. In the "info pci" output in comment #8, no pci device claims this address, but there is one unconfigured QXL at device 7. After reboot this device looks like:

  Bus 0, device 7, function 0:
    Display controller: PCI device 1b36:0100
      IRQ 5.
      BAR0: 32 bit memory at 0xe8000000 [0xebffffff].
      BAR1: 32 bit memory at 0xe4000000 [0xe7ffffff].
      BAR2: 32 bit memory at 0xf4046000 [0xf4047fff].
      BAR3: I/O at 0xc240 [0xc25f].
      id "video1"

So it claims the address that the movntdq instruction tried to access. It looks like the QXL driver tries to access the device's memory after the device is unconfigured. During normal operation such accesses are not emulated, since QXL bars are memory, not mmio.
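The range check behind this conclusion can be written out as a short sketch. The dictionary and function names are illustrative; the ranges are the memory BARs of device 7 after reboot, and the address is the physical address found by the page table walk.

```python
# Memory BARs of device 7 ("video1") after reboot, as inclusive (start, end) ranges.
BARS_AFTER_REBOOT = {
    "BAR0": (0xE8000000, 0xEBFFFFFF),
    "BAR1": (0xE4000000, 0xE7FFFFFF),
    "BAR2": (0xF4046000, 0xF4047FFF),
}


def claiming_bar(addr, bars):
    """Return the name of the BAR whose range contains addr, or None."""
    for name, (start, end) in bars.items():
        if start <= addr <= end:
            return name
    return None


FAULT_PHYS_ADDR = 0xEB400000  # physical address behind %edi
print(claiming_bar(FAULT_PHYS_ADDR, BARS_AFTER_REBOOT))  # prints "BAR0"
```

While device 7 is unconfigured, its BARs carry no address, so the same lookup against the shutdown-time listing finds no owner for this address.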
Gleb indicates in this IRC chat that the fix should be in the qxl driver.

<gleb_> knoel_wfh: and 788227 is technically is kvm bug since we do not emulate the mmx instruction, but it triggers due to windows driver bug
<gleb_> knoel_wfh: and for rhel6 we'd rather fix it in Windows driver
<knoel_wfh> gleb_: Which windows driver?
<gleb_> knoel_wfh: qxl is our driver
Since RHEL 6.3 External Beta has begun, and this bug remains unresolved, it has been rejected as it is not proposed as exception or blocker. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.
Closing as WONTFIX. This bug concerns multiple qxl devices, a configuration that is no longer supported.