Bug 617119

Summary: Qemu becomes unresponsive during unattended_installation
Product: Red Hat Enterprise Linux 6
Reporter: Amos Kong <akong>
Component: qemu-kvm
Assignee: Gerd Hoffmann <kraxel>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: urgent
Docs Contact:
Priority: high
Version: 6.0
CC: ailan, alexl, ddumas, jasowang, kcao, llim, mjenner, mkenneth, plyons, shuang, snagar, syeghiay, tburke, virt-maint, ypu
Target Milestone: rc
Keywords: TestBlocker, ZStream
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version: qemu-kvm-0.12.1.2-2.114.el6
Doc Type: Bug Fix
Doc Text:
Under certain circumstances, QEMU could stop responding during the installation of an operating system in a virtual machine when the QXL display device was in use. This error no longer occurs, and qemu-kvm now works as expected.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-05-19 11:26:40 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 580954, 653341
Attachments:
Description            Flags
snapshot-winxp-hung    none
gdb-threads-bt-info    none
bugfix                 none

Description Amos Kong 2010-07-22 09:13:44 UTC
Description of problem:
I booted a guest with the option '-vga qxl' and ran an unattended installation from CD,
but qemu became unresponsive.


Version-Release number of selected component (if applicable):
host kernel: 2.6.32-44.1.el6.x86_64
# rpm -qa |grep qemu
qemu-img-0.12.1.2-2.96.el6.x86_64
qemu-kvm-debuginfo-0.12.1.2-2.96.el6.x86_64
gpxe-roms-qemu-0.9.7-6.3.el6.noarch
qemu-kvm-tools-0.12.1.2-2.96.el6.x86_64
qemu-kvm-0.12.1.2-2.96.el6.x86_64

# rpm -qa |grep spice
pixman-spice-0.13.3-5.el6.x86_64
cairo-spice-1.8.7.1-4.el6.x86_64
ffmpeg-spice-libs-0.4.9-0.15.5spice.20080908.el6.x86_64
spice-server-0.4.2-14.el6.x86_64

How reproducible:
not always

Steps to Reproduce:
1. Set up a PXE boot environment using the built-in qemu TFTP server.
2. Copy the bootloader, vmlinuz and initrd.img (from the CD) to a directory that
qemu will serve through TFTP to the VM.
3. Start the VM and run the installation from CD.
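
For reference, the steps above can be condensed into a single invocation. The sketch below reuses only options that appear in the full command line under "Additional info"; the function name start_pxe_guest, the QEMU_BIN variable, the netdev id net0 and the two path arguments are placeholders, and /usr/libexec/qemu-kvm is assumed to be where the qemu-kvm binary lives on the host.

```shell
# Hypothetical wrapper: boot a guest over PXE from qemu's built-in TFTP
# server with the QXL display device enabled.
#   $1 = directory holding pxelinux.0, vmlinuz and initrd.img
#   $2 = qcow2 disk image to install onto
start_pxe_guest() {
    tftp_dir=$1
    disk=$2
    # QEMU_BIN is an assumption; override it if the binary lives elsewhere.
    "${QEMU_BIN:-/usr/libexec/qemu-kvm}" \
        -name vm1 -m 2048 -smp 2 \
        -drive "file=$disk,if=none,id=drive-virtio-disk1,media=disk,cache=none,boot=on,format=qcow2" \
        -device virtio-blk-pci,drive=drive-virtio-disk1,id=virtio-disk1 \
        -net nic,vlan=0,netdev=net0,model=virtio \
        -netdev "user,id=net0,tftp=$tftp_dir" \
        -bootp /pxelinux.0 -boot n \
        -vnc :0 -spice port=8000,disable-ticketing -vga qxl
}
```

The monitor, serial, CD-ROM and floppy options from the original command line are left out here to keep the sketch short; they may still be needed for the kickstart-driven installation itself.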


Actual results:
Qemu becomes unresponsive

Expected results:
installation completed successfully

Additional info:
# strace -p 5973  (blocked)
Process 5973 attached - interrupt to quit
read(46, 

(gdb) bt
#0  0x00007fe432bd150d in read () at ../sysdeps/unix/syscall-template.S:82
#1  0x00007fe4327384ba in read (qxl_worker=0x26852f0) at /usr/include/bits/unistd.h:45
#2  receive_data (qxl_worker=0x26852f0) at red_worker.h:117
#3  read_message (qxl_worker=0x26852f0) at red_worker.h:130
#4  qxl_worker_detach (qxl_worker=0x26852f0) at red_dispatcher.c:233
#5  0x0000000000471c8d in qxl_detach (d=0x26747c0) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/qxl.c:461
#6  0x00000000026747c0 in ?? ()
#7  0x0000000000472749 in qxl_reset (d=0x26747c0) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/qxl.c:544
#8  0x00000000026747c0 in ?? ()
#9  0x0000000000473fd7 in qxl_display_resize (ds=0x1c80fa0) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/qxl.c:821
#10 0x0000000001c80fa0 in ?? ()
#11 0x0000000001cd0730 in ?? ()
#12 0x000000000049f2c6 in dpy_resize (ds=0x26747c0, width=<value optimized out>, height=<value optimized out>) at console.h:218
#13 qemu_console_resize (ds=0x26747c0, width=<value optimized out>, height=<value optimized out>) at console.c:1441
#14 0x0000000002674a58 in ?? ()
#15 0x000000000000004f in ?? ()
#16 0x00000000004438bd in vga_draw_text (opaque=0x26747c0) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/vga.c:1318
#17 vga_update_display (opaque=0x26747c0) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/vga.c:1932
#18 0x0000000000000000 in ?? ()


# kvm_stat -1
efer_reload                    0         0
exits                  228969722         0
fpu_reload                 31059         0
halt_exits             110379078         0
halt_wakeup                51805         0
host_state_reload      110750694         0
hypercalls                     0         0
insn_emulation         113348077         0
insn_emulation_fail            0         0
invlpg                    635389         0
io_exits                 4043229         0
irq_exits                 772869         0
irq_injections         110429110         0
irq_window                     0         0
largepages                    81         0
mmio_exits                 72100         0
mmu_cache_miss              7007         0
mmu_flooded                 3612         0
mmu_pde_zapped              4556         0
mmu_pte_updated            31659         0
mmu_pte_write              54618         0
mmu_recycled                   0         0
mmu_shadow_zapped           8839         0
mmu_unsync                   107         0
nmi_injections                 0         0
nmi_window                     0         0
pf_fixed                  411515         0
pf_guest                   41088         0
remote_tlb_flush           49640         0
request_irq                    0         0
signal_exits                   5         0
tlb_flush                 669489         0

# ps aux|grep qemu
root      5973  2.2  8.3 2568668 671172 pts/1  Tl+  Jul21  42:13 /usr/local/staf/test/RHEV/kvm-new/autotest/client/tests/kvm/qemu -name vm1 -monitor unix:/tmp/monitor-humanmonitor1-20100721-094756-9hQl,server,nowait -drive file=/usr/local/staf/test/RHEV/kvm-new/autotest/client/tests/kvm/isos/linux/RHEL-Server-5.5-i386-DVD.iso,if=none,id=drive-ide0-0-0,media=cdrom,readonly=on,format=raw -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -serial unix:/tmp/serial-20100721-094756-9hQl,server,nowait -drive file=/usr/local/staf/test/RHEV/kvm-new/autotest/client/tests/kvm/images/RHEL-Server-5.5-32-virtio.qcow2,if=none,id=drive-virtio-disk1,media=disk,cache=none,boot=on,format=qcow2 -device virtio-blk-pci,drive=drive-virtio-disk1,id=virtio-disk1 -net nic,vlan=0,netdev=idioAyog,model=virtio,macaddr=02:2D:4B:53:6a:25 -netdev user,id=idioAyog,tftp=/usr/local/staf/test/RHEV/kvm-new/autotest/client/tests/kvm/images/tftpboot,hostfwd=tcp::5000-:22,hostfwd=tcp::5001-:12323 -m 2048 -smp 2 -fda /usr/local/staf/test/RHEV/kvm-new/autotest/client/tests/kvm/images/floppy.img -vnc :0 -spice port=8000,disable-ticketing -vga qxl -rtc base=utc,clock=host -M rhel6.0.0 -usbdevice tablet -cpu qemu64,+sse2 -no-kvm-pit-reinjection -bootp /pxelinux.0 -boot n

Comment 3 Dor Laor 2010-07-22 20:53:29 UTC
If it's the same rhel6 guest, please use 2.6.32-44.2.el6.x86_64 - there is a kvm bug that triggers it.

Comment 4 Amos Kong 2010-07-23 07:18:51 UTC
(In reply to comment #3)
> If it's the same rhel6 guest, please use 2.6.32-44.2.el6.x86_64 - there is a
> kvm bug that triggers it.

Could not reproduce with a rhel6 guest.
I will try with kernel 2.6.32-44.2.el6.x86_64 and a rhel5 guest.

Comment 5 Lawrence Lim 2010-07-29 01:24:01 UTC
Update?

Comment 6 Amos Kong 2010-07-29 04:50:40 UTC
(In reply to comment #4)
> (In reply to comment #3)
> > If it's the same rhel6 guest, please use 2.6.32-44.2.el6.x86_64 - there is a
> > kvm bug that triggers it.
> 
> Could not reproduce with a rhel6 guest.
> I will try with kernel 2.6.32-44.2.el6.x86_64 and a rhel5 guest.

Could not reproduce with kernel 2.6.32-44.2.el6.x86_64 and rhel5/rhel6 guests.

Comment 7 Amos Kong 2010-07-29 08:12:28 UTC
Re-tested 10 times; could not reproduce.
Moving to VERIFIED.

Comment 8 Amos Kong 2010-09-17 05:12:39 UTC
Hello Gerd,

I found that this bug can also be reproduced with qemu-kvm-0.12.1.2-2.113.el6.x86_64, so I am moving it back to ASSIGNED.

It is easy to reproduce with a winxp guest.
host kernel: 2.6.32-71.el6.x86_64

# strace -p `pgrep qemu`
Process 32394 attached - interrupt to quit
read(23, 

# gdb -p `pgrep qemu`
0x0000003b73c0e50d in read () at ../sysdeps/unix/syscall-template.S:82
82      T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
(gdb) bt
#0  0x0000003b73c0e50d in read () at ../sysdeps/unix/syscall-template.S:82
#1  0x0000003b7601046a in ?? () from /usr/lib64/libspice-server.so.0
#2  0x00000000004720bd in qxl_detach (d=0x2dce7c0) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/qxl.c:461
#3  0x0000000000472b79 in qxl_reset (d=0x2dce7c0) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/qxl.c:544
#4  0x0000000000474427 in qxl_display_resize (ds=0x1f34f20) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/qxl.c:821
#5  0x0000000000444697 in dpy_resize (opaque=0x2dcea58) at /usr/src/debug/qemu-kvm-0.12.1.2/console.h:218
#6  vga_draw_graphic (opaque=0x2dcea58) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/vga.c:1726
#7  vga_update_display (opaque=0x2dcea58) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/vga.c:1938
#8  0x000000000049e938 in vga_hw_screen_dump (filename=<value optimized out>) at console.c:182
#9  0x0000000000417829 in handle_user_command (mon=0x1f7f9f0, cmdline=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/monitor.c:3960
#10 0x000000000041787a in monitor_command_cb (mon=0x1f7f9f0, cmdline=<value optimized out>, opaque=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/monitor.c:4506
#11 0x000000000049e2db in readline_handle_byte (rs=0x2e410e0, ch=<value optimized out>) at readline.c:369
#12 0x00000000004178ec in monitor_read (opaque=<value optimized out>, buf=0x7fff05418dc0 "\n", size=1) at /usr/src/debug/qemu-kvm-0.12.1.2/monitor.c:4492
#13 0x00000000004b6b8a in qemu_chr_read (opaque=0x1ec8630) at qemu-char.c:154
#14 tcp_chr_read (opaque=0x1ec8630) at qemu-char.c:2072
#15 0x000000000040b4af in main_loop_wait (timeout=1000) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:4234
#16 0x0000000000428a2a in kvm_main_loop () at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:2133
#17 0x000000000040e5cb in main_loop (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:4444
#18 main (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:6601

# /home/devel/akong/client/tests/kvm/qemu -name vm1 -chardev socket,id=human_monitor_QXFy,path=/tmp/monitor-humanmonitor1-20100910-122054-N12Y,server,nowait -mon chardev=human_monitor_QXFy,mode=readline -chardev socket,id=serial_mNaX,path=/tmp/serial-20100910-122054-N12Y,server,nowait -device isa-serial,chardev=serial_mNaX -drive file=/home/devel/akong/client/tests/kvm/images/winXP-32-virtio.qcow2,index=0,if=none,id=drive-virtio-disk1,media=disk,cache=none,boot=on,format=qcow2,aio=native -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk1,id=virtio-disk1 -device virtio-net-pci,netdev=idY2JzYk,id=ndev00idY2JzYk,mac=02:A9:7C:6C:29:ad,bus=pci.0,addr=0x3 -netdev tap,id=idY2JzYk,ifname=virtio_0_8000,script=/home/devel/akong/client/tests/kvm/scripts/qemu-ifup-switch,downscript=no -m 3000 -smp 2 -drive file=/home/devel/akong/client/tests/kvm/isos/ISO/WinXP/32/en_windows_xp_professional_with_service_pack_3_x86_cd_x14-80428.iso,index=1,if=none,id=drive-ide0-0-0,media=cdrom,readonly=on,format=raw -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -drive file=/home/devel/akong/client/tests/kvm/isos/windows/winutils.iso,index=2,if=none,id=drive-ide0-0-1,media=cdrom,readonly=on,format=raw -device ide-drive,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1 -drive file=/home/devel/akong/client/tests/kvm/isos/windows/virtio-win.iso,index=3,if=none,id=drive-ide0-1-0,media=cdrom,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -cpu cpu64-rhel6,+x2apic -fda /home/devel/akong/client/tests/kvm/images/floppy.img -vnc :0 -spice port=8000,disable-ticketing -vga qxl -rtc base=localtime,clock=host,driftfix=none -M rhel6.0.0 -usbdevice tablet -boot d -enable-kvm

# kvm_stat -1
efer_reload                    0         0
exits                   17298406      1029
fpu_reload                785991         0
halt_exits                   167         0
halt_wakeup                  149         0
host_state_reload        1988878         1
hypercalls                     0         0
insn_emulation           5021484         0
insn_emulation_fail           13         0
invlpg                     82389         0
io_exits                 1902232         0
irq_exits                4691168      1027
irq_injections            160819         0
irq_window                 55841         0
largepages                     0         0
mmio_exits                 38439         0
mmu_cache_miss             99316         0
mmu_flooded                67298         0
mmu_pde_zapped             20913         0
mmu_pte_updated           139146         0
mmu_pte_write             281852         0
mmu_recycled                   0         0
mmu_shadow_zapped         106131         0
mmu_unsync                    47         0
nmi_injections                 0         0
nmi_window                     0         0
pf_fixed                 3468094         0
pf_guest                  828283         0
remote_tlb_flush           35352         0
request_irq                    0         0
signal_exits                  38         0
tlb_flush                 788264         0

Comment 9 Amos Kong 2010-09-17 05:17:33 UTC
Additional info:
1. Boot a winxp guest and start installing the system.
2. Run this command on the host to check whether the guest is alive:
# while true; do echo "info status" | nc -U /tmp/monitor-humanmonitor1-20100910-122054-N12Y; echo ;done

Reproduction rate with a winxp guest: 40%
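
The polling loop above may stall once the monitor stops answering, so it cannot report the hang by itself. A minimal sketch of a bounded probe follows. It assumes GNU coreutils timeout(1), an nc build that supports -U, and that the monitor's reply to "info status" contains the string "VM status"; the function name monitor_alive and the 5 second default are illustrative only.

```shell
# Hypothetical helper: succeed if the QEMU human monitor answers
# "info status" within the time limit, fail otherwise.
#   $1 = path of the monitor unix socket
#   $2 = time limit in seconds (default 5)
monitor_alive() {
    sock=$1
    limit=${2:-5}
    # timeout kills nc if connecting or reading takes longer than $limit.
    # The reply text is checked instead of nc's exit status, because some
    # nc variants keep the connection open after stdin reaches EOF.
    reply=$(echo "info status" | timeout "$limit" nc -U "$sock" 2>/dev/null)
    case $reply in
        *"VM status"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

A possible use is `while monitor_alive "$sock"; do sleep 1; done; date`, which prints the time at which the monitor first failed to reply.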

Comment 13 Gerd Hoffmann 2010-09-20 09:04:21 UTC
Does this trigger outside autotest too?

The comment #8 stacktrace has functions in there which handle screen dumping,
and autotest uses this feature ...

Where in the XP install does it happen?  Early text mode?  Somewhere in the middle (while probing the video card maybe)?

Any chance I can get stack traces for the *other* qemu threads at the point where it hangs?  The main thread just sits there waiting for a reply from the spice server thread, that is perfectly fine.  The spice server thread should answer in a timely manner but doesn't for some reason.

First XP install just finished without problems, trying again ...

Comment 14 Gerd Hoffmann 2010-09-20 09:05:04 UTC
forgot needinfo ...

Comment 15 Amos Kong 2010-09-20 10:11:56 UTC
(In reply to comment #13)
> Does this trigger outside autotest too?

Yes, I reproduced this bug manually.



> The comment #8 stacktrace has functions in there which handle screen dumping,
> and autotest uses this feature ...

Manual steps:
1. Set up pxe + kickstart + dhcp.
2. Boot the guest from the command line.
3. Connect to the vnc port to check the status.
4. Run this command on the host to check whether the guest is alive:
# while true; do echo "info status" | nc -U /tmp/monitor-humanmonitor1-20100910-122054-N12Y; echo ;done

The stack trace was produced when the guest hung;
I could not connect to the monitor through the unix socket.
 
> Where in the XP install does it happen?  Early text mode?  Somewhere in the
> middle (while probing the video card maybe)?

During 'Installing Windows'; I will attach the snapshot.

> Any chance I can get stack traces for the *other* qemu threads at the point
> where it hangs?

> The main thread just sits there waiting for a reply from the
> spice server thread, that is perfectly fine.  The spice server thread should
> answer in a timely manner but doesn't for some reason.
> 
> First XP install just finished without problems, trying again ...

Comment 16 Amos Kong 2010-09-20 10:14:13 UTC
Created attachment 448412 [details]
snapshot-winxp-hung

BTW,

when the guest hung, I tried to get the stack trace with:

# gdb -p `pgrep qemu`
(gdb) c
...
(gdb) bt

If something is wrong, please correct me, thanks.

Comment 17 Gerd Hoffmann 2010-09-20 10:23:51 UTC
Managed to trigger it locally meanwhile so I don't need the traces any more.  Nevertheless for future reference:

Once you are in gdb you can use 'info threads' to get a list of threads.  Then you can switch to the other threads using 'thread <nr>', then use 'bt' again to get the stacktrace printed.
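
The interactive steps above can also be run non-interactively. A minimal sketch using gdb's batch mode follows, where 'thread apply all bt' prints the backtrace of every thread in one go. The function name dump_all_threads and the default output file name are illustrative and not part of any tool.

```shell
# Hypothetical helper: write the thread list and a backtrace of every
# thread of a running process to a file.
#   $1 = pid to attach to
#   $2 = output file (default gdb-threads-bt-<pid>.txt)
dump_all_threads() {
    pid=$1
    out=${2:-gdb-threads-bt-$pid.txt}
    # -batch makes gdb exit once the -ex commands have run, so the
    # process is only stopped for as long as the dump takes.
    gdb -p "$pid" -batch \
        -ex "info threads" \
        -ex "thread apply all bt" > "$out" 2>&1
}
```

For example, `dump_all_threads "$(pgrep qemu)"` would cover the case in this report, provided pgrep matches exactly one process.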

Comment 18 Amos Kong 2010-09-20 11:07:15 UTC
Created attachment 448430 [details]
gdb-threads-bt-info

thread 1 ->  read()
thread 2 ->  __lll_lock_wait ()
thread 3 ->  __lll_lock_wait ()
thread 4 ->  ioctl()

kernel: 2.6.32-72.el6.x86_64
# rpm -qa|grep qemu
qemu-kvm-debuginfo-0.12.1.2-2.113.el6_0.1.x86_64
qemu-img-0.12.1.2-2.113.el6_0.1.x86_64
gpxe-roms-qemu-0.9.7-6.3.el6.noarch
qemu-kvm-0.12.1.2-2.113.el6_0.1.x86_64
qemu-kvm-tools-0.12.1.2-2.113.el6_0.1.x86_64

Comment 20 Amos Kong 2010-09-21 04:58:22 UTC
(In reply to comment #19)
> Created attachment 448437 [details]
> bugfix
> 
> http://brewweb.devel.redhat.com/brew/taskinfo?taskID=2769887

I tested 8 times, bug could not be reproduced.

Comment 23 Amit Shah 2010-10-13 05:32:44 UTC
*** Bug 637703 has been marked as a duplicate of this bug. ***

Comment 25 Amos Kong 2010-11-15 10:48:56 UTC
This bug was not reproduced in our weekly testing, so I am moving it to VERIFIED.
version: qemu-kvm-0.12.1.2-2.114.el6
https://virtlab.englab.nay.redhat.com/job/18274/details/
https://virtlab.englab.nay.redhat.com/job/18275/details/

Comment 26 Jaromir Hradilek 2011-01-10 15:37:57 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Under certain circumstances, QEMU could stop responding during the installation of an operating system in a virtual machine when the QXL display device was in use. This error no longer occurs, and qemu-kvm now works as expected.

Comment 27 errata-xmlrpc 2011-05-19 11:26:40 UTC
An advisory has been issued which should help resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0534.html
