Description of problem: I used screen to manage my terminals, I connected guest serial by nc _before_ guest completely start up, when I entered screen to copy/scrollback mode, guest deadlocked. If I recoverred to normall mode, guest also got back. Version-Release number of selected component (if applicable): host kernel: 2.6.18-236.el5 # rpm -qa |grep kvm kmod-kvm-debug-83-223.el5 etherboot-zroms-kvm-5.4.4-13.el5 kvm-debuginfo-83-223.el5 etherboot-roms-kvm-5.4.4-13.el5 kvm-tools-83-223.el5 kmod-kvm-83-223.el5 kvm-qemu-img-83-223.el5 kvm-83-223.el5 How reproducible: 100% Steps to Reproduce: 1. boot up a guest # qemu-kvm -name 'vm2' -monitor unix:'/tmp/monitor-humanmonitor1-20101213-111527-JNsP1',server,nowait -serial unix:'/tmp/serial-20101213-111527-JNsP1',server,nowait -drive file='/home/devel/autotest-akong/client/tests/kvm/images/RHEL-Server-6.0-64-virtio.qcow2',index=0,if=virtio,media=disk,cache=none,snapshot=on,boot=on,format=qcow2 -net nic,vlan=0,model=virtio,macaddr='9a:72:41:21:4c:15' -net tap,vlan=0,ifname='t0-111527-JNsP1',script='/home/devel/autotest-akong/client/tests/kvm/scripts/qemu-ifup-switch',downscript='no' -m 4096 -smp 2,cores=1,threads=1,sockets=2 -cpu qemu64,+sse2 -soundhw ac97 -vnc :2 -rtc-td-hack -M rhel5.6.0 -usbdevice tablet -no-kvm-pit-reinjection 2. connect guest monitor and check guest status # nc -U /tmp/monitor-humanmonitor1-20101213-111527-JNsP1 (qemu) info status VM status: running 2. use 'nc' to connect guest serial before guest completely start up screen) # nc -U /tmp/serial-20101213-111527-JNsP1 3. set screen to enter copy/scrollback mode screen) ctrl + a + [ 4. connect guest monitor and check guest status # nc -U /tmp/monitor-humanmonitor1-20101213-111527-JNsP1 Actual results: guest deadlocked. Expected results: guest boot up successfully Additional info: warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7fff815fc000 0x000000340b60d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0 (gdb) bt #0 0x000000340b60d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0 #1 0x000000340b608e1a in _L_lock_1034 () from /lib64/libpthread.so.0 #2 0x000000340b608cdc in pthread_mutex_lock () from /lib64/libpthread.so.0 #3 0x00000000004ff62e in kvm_mutex_lock () at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/qemu-kvm.c:1306 #4 0x00000000004097aa in qemu_select (timeout=<value optimized out>) at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/qemu-kvm.h:202 #5 main_loop_wait (timeout=<value optimized out>) at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/vl.c:4042 #6 0x00000000005005fa in kvm_main_loop () at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/qemu-kvm.c:596 #7 0x000000000040e795 in main_loop (argc=29, argv=0x7fff815c4238, envp=<value optimized out>) at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/vl.c:4105 #8 main (argc=29, argv=0x7fff815c4238, envp=<value optimized out>) at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/vl.c:6545 (gdb) info threads 5 Thread 0x40e70940 (LWP 26784) 0x000000340aa31744 in do_sigwaitinfo () from /lib64/libc.so.6 4 Thread 0x41f0b940 (LWP 26785) 0x000000340b60d89b in write () from /lib64/libpthread.so.0 3 Thread 0x4290c940 (LWP 26786) 0x000000340b60d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0 2 Thread 0x4330d940 (LWP 26787) 0x000000340aa31744 in do_sigwaitinfo () from /lib64/libc.so.6 * 1 Thread 0x2b3c5755f150 (LWP 26775) 0x000000340b60d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0 (gdb) thread 2 [Switching to thread 2 (Thread 0x4330d940 (LWP 26787))]#0 0x000000340aa31744 in do_sigwaitinfo () from /lib64/libc.so.6 (gdb) bt #0 0x000000340aa31744 in do_sigwaitinfo () from /lib64/libc.so.6 #1 0x000000340aa317fd in sigwaitinfo () from /lib64/libc.so.6 #2 0x000000000041a991 in sigwait_compat (opaque=<value optimized out>) at compatfd.c:38 #3 0x000000340b60673d in start_thread () from /lib64/libpthread.so.0 #4 0x000000340aad40cd in clone () from /lib64/libc.so.6 (gdb) thread 3 [Switching to thread 3 (Thread 0x4290c940 (LWP 26786))]#0 0x000000340b60d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0 (gdb) bt #0 0x000000340b60d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0 #1 0x000000340b608e1a in _L_lock_1034 () from /lib64/libpthread.so.0 #2 0x000000340b608cdc in pthread_mutex_lock () from /lib64/libpthread.so.0 #3 0x00000000004fff1e in post_kvm_run (opaque=0x0, data=0x80) at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/qemu-kvm.c:160 #4 0x000000000052bc54 in post_kvm_run (kvm=0x1895e7e0, env=0x1913e010) at libkvm.c:874 #5 0x000000000052be0b in kvm_run (kvm=0x1895e7e0, vcpu=1, env=0x1913e010) at libkvm.c:920 #6 0x0000000000500759 in kvm_cpu_exec (env=0x0) at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/qemu-kvm.c:206 #7 0x00000000005009e3 in kvm_main_loop_cpu (_env=<value optimized out>) at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/qemu-kvm.c:402 #8 ap_main_loop (_env=<value optimized out>) at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/qemu-kvm.c:443 #9 0x000000340b60673d in start_thread () from /lib64/libpthread.so.0 #10 0x000000340aad40cd in clone () from /lib64/libc.so.6 (gdb) thread 4 [Switching to thread 4 (Thread 0x41f0b940 (LWP 26785))]#0 0x000000340b60d89b in write () from /lib64/libpthread.so.0 (gdb) bt #0 0x000000340b60d89b in write () from /lib64/libpthread.so.0 #1 0x0000000000470f09 in unix_write (fd=23, buf=<value optimized out>, len1=1) at qemu-char.c:477 #2 send_all (fd=23, buf=<value optimized out>, len1=1) at qemu-char.c:493 #3 0x000000000043a6a2 in serial_xmit (opaque=<value optimized out>) at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/hw/serial.c:311 #4 0x000000000043ab31 in serial_ioport_write (opaque=0x191559a0, addr=<value optimized out>, val=99) at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/hw/serial.c:366 #5 0x0000000000500018 in kvm_outb (opaque=<value optimized out>, addr=22948, data=1 '\001') at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/qemu-kvm.c:684 #6 0x000000000052b73f in handle_io (kvm=0x1895e7e0, run=0x2aaaaaaac000, vcpu=0) at libkvm.c:738 #7 0x000000000052c029 in kvm_run (kvm=0x1895e7e0, vcpu=0, env=0x19124950) at libkvm.c:967 #8 0x0000000000500759 in kvm_cpu_exec (env=0x1) at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/qemu-kvm.c:206 #9 0x00000000005009e3 in kvm_main_loop_cpu (_env=<value optimized out>) at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/qemu-kvm.c:402 #10 ap_main_loop (_env=<value optimized out>) at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/qemu-kvm.c:443 #11 0x000000340b60673d in start_thread () from /lib64/libpthread.so.0 #12 0x000000340aad40cd in clone () from /lib64/libc.so.6 (gdb) thread 5 [Switching to thread 5 (Thread 0x40e70940 (LWP 26784))]#0 0x000000340aa31744 in do_sigwaitinfo () from /lib64/libc.so.6 (gdb) bt #0 0x000000340aa31744 in do_sigwaitinfo () from /lib64/libc.so.6 #1 0x000000340aa317fd in sigwaitinfo () from /lib64/libc.so.6 #2 0x000000000041a991 in sigwait_compat (opaque=<value optimized out>) at compatfd.c:38 #3 0x000000340b60673d in start_thread () from /lib64/libpthread.so.0 #4 0x000000340aad40cd in clone () from /lib64/libc.so.6
# qemu-kvm ... -serial stdio ... When I redirected serial output to stdio, and entered screen to copy/scrollback mode, guest also deadlocked. ===> After recoverred screen to normal mode, guest outputted: BUG: soft lockup - CPU#1 stuck for 84s! [kauditd:793] Modules linked in: ipv6 dm_mirror dm_region_hash dm_log ppdev parport_pc parport virtio_balloon snd_intel8x0 snd_ac97_codec ac97_bus snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc i2c_piix4 i2c_core sg ext4 mbcache jbd2 sr_mod cdrom virtio_blk virtio_net pata_acpi ata_generic ata_piix virtio_pci virtio_ring virtio dm_mod [last unloaded: speedstep_lib] CPU 1: Modules linked in: ipv6 dm_mirror dm_region_hash dm_log ppdev parport_pc parport virtio_balloon snd_intel8x0 snd_ac97_codec ac97_bus snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc i2c_piix4 i2c_core sg ext4 mbcache jbd2 sr_mod cdrom virtio_blk virtio_net pata_acpi ata_generic ata_piix virtio_pci virtio_ring virtio dm_mod [last unloaded: speedstep_lib] Pid: 793, comm: kauditd Not tainted 2.6.32-71.7.1.el6.x86_64 #1 KVM RIP: 0010:[<ffffffff8107de86>] [<ffffffff8107de86>] run_timer_softirq+0x176/0x340 RSP: 0018:ffff880028223e50 EFLAGS: 00000246 RAX: ffff880028223e90 RBX: ffff880028223ed0 RCX: 0000000000000430 RDX: ffff880028223e90 RSI: dead000000200200 RDI: 0000000000000000 RBP: ffffffff81013c93 R08: ffff88013b7d4450 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000400 R12: ffff880028223dd0 R13: ffffffff81c9a9d0 R14: ffff88013b7d4000 R15: ffffffff814cfdac FS: 0000000000000000(0000) GS:ffff880028220000(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 00007f67d35ffa60 CR3: 00000001391d0000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Call Trace: <IRQ> [<ffffffff8102f52d>] ? lapic_next_event+0x1d/0x30 [<ffffffff81073c77>] ? __do_softirq+0xb7/0x1e0 [<ffffffff81095af0>] ? hrtimer_interrupt+0x140/0x250 [<ffffffff810142cc>] ? call_softirq+0x1c/0x30 [<ffffffff81015f35>] ? do_softirq+0x65/0xa0 [<ffffffff81073a75>] ? irq_exit+0x85/0x90 [<ffffffff814cfdb1>] ? smp_apic_timer_interrupt+0x71/0x9c [<ffffffff81013c93>] ? apic_timer_interrupt+0x13/0x20 <EOI> [<ffffffff8106c611>] ? vprintk+0x1d1/0x4f0 [<ffffffff81059db2>] ? finish_task_switch+0x42/0xd0 [<ffffffff814c7fd3>] ? printk+0x41/0x46 [<ffffffff810cad5b>] ? audit_printk_skb+0x6b/0x70 [<ffffffff810cae48>] ? kauditd_thread+0xe8/0x180 [<ffffffff8105c530>] ? default_wake_function+0x0/0x20 [<ffffffff810cad60>] ? kauditd_thread+0x0/0x180 [<ffffffff810919d6>] ? kthread+0x96/0xa0 [<ffffffff810141ca>] ? child_rip+0xa/0x20 [<ffffffff81091940>] ? kthread+0x0/0xa0 [<ffffffff810141c0>] ? child_rip+0x0/0x20
Bug can be reproduced in UPSTREAM, so add this keyword. qemu-kvm: commit 53b6d3d5c2522e881c8d194f122de3114f6f76eb date: Thu Dec 2 12:15:20 2010 -0200
Created attachment 469908 [details] strace output .... .... read(21, "\16\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 128) = 128 rt_sigaction(SIGALRM, NULL, {0x407a90, ~[KILL STOP RTMIN RT_1], SA_RESTORER, 0x340b60eb10}, 8) = 0 write(5, "\0", 1) = 1 read(21, 0x7fffb2b87580, 128) = -1 EAGAIN (Resource temporarily unavailable) clock_gettime(CLOCK_MONOTONIC, {962736, 763682502}) = 0 clock_gettime(CLOCK_MONOTONIC, {962736, 763707502}) = 0 clock_gettime(CLOCK_MONOTONIC, {962736, 763728502}) = 0 futex(0xb44f20, FUTEX_WAKE_PRIVATE, 1) = 1 select(25, [4 6 10 14 15 19 21 23 24], [], [], {1, 0}) = 2 (in [4 24], left {1, 0}) recvfrom(24, "\3\1\0\0\0\0\2\320\1\220", 4096, 0, NULL, NULL) = 10 read(4, "\0", 512) = 1 read(4, 0x7fffb2b87400, 512) = -1 EAGAIN (Resource temporarily unavailable) clock_gettime(CLOCK_MONOTONIC, {962736, 795117502}) = 0 clock_gettime(CLOCK_MONOTONIC, {962736, 795138502}) = 0 timer_gettime(0, {it_interval={0, 0}, it_value={0, 0}}) = 0 timer_settime(0, 0, {it_interval={0, 0}, it_value={0, 250000}}, NULL) = 0 clock_gettime(CLOCK_MONOTONIC, {962736, 795241502}) = 0 clock_gettime(CLOCK_MONOTONIC, {962736, 795264502}) = 0 clock_gettime(CLOCK_MONOTONIC, {962736, 795286502}) = 0 clock_gettime(CLOCK_MONOTONIC, {962736, 795307502}) = 0 clock_gettime(CLOCK_MONOTONIC, {962736, 795341502}) = 0 clock_gettime(CLOCK_MONOTONIC, {962736, 795362502}) = 0 clock_gettime(CLOCK_MONOTONIC, {962736, 795382502}) = 0 timer_gettime(0, {it_interval={0, 0}, it_value={0, 36000}}) = 0 futex(0xb44f20, FUTEX_WAKE_PRIVATE, 1) = 1 select(25, [4 6 10 14 15 19 21 23 24], [], [], {1, 0}) = 1 (in [21], left {0, 999000}) futex(0xb44f20, FUTEX_WAIT_PRIVATE, 2, NULL
I have checked with rhel5u5 and rhel6 guest from libvirt aspect. For rhel5u5 guest, it works well whenever in unix socket or pty guest serial type. But for rhel6 guest, the guest will be hang in unix socket serial type and pty serial type after several times of guest reboot. Detailed steps: 1 Start a guest (rhel6) with serial type is unix socket ... <serial type="unix"> <source mode="bind" path="/tmp/foo"/> <target port="0"/> </serial> ... 2. Open virt-manager to monitor this guest 3. use 'nc' to connect guest serial before guest completely start up screen) # nc -U /tmp/foo 4. set screen to enter copy/scrollback mode screen) ctrl + a + [ 5. reboot the guest for several times and check guest status in virt-manager Actual result: the guest is hang, then I ctrl+c to close the nc, the guest will be resumed and works well.
Won't fix on RHEL-5. Programs reading from the serial output shouldn't stop reading the data (like screen does) or buffers will fill up and qemu process will block. We could make qemu not block when writing to serial port (and drop data if needed), but it is not something we would implement in RHEL-5. I cloned this bug for RHEL-6.1: bug #670145.