Description of problem:
KVM is stuck in kvm_run.

Version-Release number of selected component (if applicable):
qemu-kvm-0.13.0-1.fc14.i686
Linux cboie 2.6.35.6-48.fc14.i686.PAE #1 SMP Fri Oct 22 15:27:53 UTC 2010 i686 i686 i386 GNU/Linux

How reproducible:
Always.

Steps to Reproduce:
1. qemu-kvm -m 512 -hda f14.img

Actual results:
Machine does not start.

Expected results:
Machine should start.

Additional info:

First thread:
(gdb) bt full
#0  0x0096b424 in __kernel_vsyscall ()
No symbol table info available.
#1  0x002ce581 in select () at ../sysdeps/unix/syscall-template.S:82
No locals.
#2  0x0805f526 in main_loop_wait (nonblocking=0) at /usr/src/debug/qemu-kvm-0.13.0/vl.c:1291
        ioh = 0x0
        rfds = {fds_bits = {20768, 0 <repeats 31 times>}}
        wfds = {fds_bits = {0 <repeats 32 times>}}
        xfds = {fds_bits = {0 <repeats 32 times>}}
        ret = <value optimized out>
        nfds = 14
        tv = {tv_sec = 0, tv_usec = 985899}
        timeout = 1000
#3  0x0807303c in kvm_main_loop () at /usr/src/debug/qemu-kvm-0.13.0/qemu-kvm.c:1710
        fds = {12, 13}
        mask = {__val = {268443712, 0 <repeats 31 times>}}
        sigfd = 14
#4  0x080604a9 in main_loop (argc=5, argv=0xbfec0c04, envp=0xbfec0c1c) at /usr/src/debug/qemu-kvm-0.13.0/vl.c:1343
        r = <value optimized out>
#5  main (argc=5, argv=0xbfec0c04, envp=0xbfec0c1c) at /usr/src/debug/qemu-kvm-0.13.0/vl.c:3097
        gdbstub_dev = 0x0
        i = <value optimized out>
        snapshot = 0
        linux_boot = 0
        icount_option = 0x0
        initrd_filename = 0x0
        kernel_filename = 0x0
        kernel_cmdline = 0x821eccc ""
        boot_devices = "cad", '\000' <repeats 29 times>
        ds = <value optimized out>
        dcl = <value optimized out>
        cyls = 0
        heads = 0
        secs = 0
        translation = 0
        hda_opts = 0x93bd498
        opts = <value optimized out>
        optind = 5
        optarg = 0xbfec1829 "f14.img"
        loadvm = 0x0
        machine = 0x82ae880
        cpu_model = 0x0
        tb_size = 0
        pid_file = 0x0
        incoming = 0x0
        show_vnc_port = 0
        defconfig = <value optimized out>

Second thread:
(gdb) bt full
#0  0x001d1424 in __kernel_vsyscall ()
No symbol table info available.
#1  0x00998be9 in ioctl () at ../sysdeps/unix/syscall-template.S:82
No locals.
#2  0x080714aa in kvm_run (env=0x9ad3598) at /usr/src/debug/qemu-kvm-0.13.0/qemu-kvm.c:610
        r = 0
        kvm = 0x9a9abcc
        run = 0xb77ca000
        fd = 9
#3  0x080724c2 in kvm_cpu_exec (env=0x9ad3598) at /usr/src/debug/qemu-kvm-0.13.0/qemu-kvm.c:1238
        r = <value optimized out>
#4  0x08072753 in kvm_main_loop_cpu (_env=0x9ad3598) at /usr/src/debug/qemu-kvm-0.13.0/qemu-kvm.c:1495
        run_cpu = <value optimized out>
#5  ap_main_loop (_env=0x9ad3598) at /usr/src/debug/qemu-kvm-0.13.0/qemu-kvm.c:1541
        env = 0x9ad3598
        signals = {__val = {2147483647, 4294967294, 4294967295 <repeats 30 times>}}
        data = <value optimized out>
#6  0x00a8bf19 in start_thread (arg=0xb749bb70) at pthread_create.c:301
        pd = 0xb749bb70
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {11137012, 0, 4001536, -1219906472, 1444683819, -1811475644}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <value optimized out>
        pagesize_m1 = <value optimized out>
        sp = <value optimized out>
        freesize = <value optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#7  0x009a1a2e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:133
No locals.

pstack 24964
Thread 2 (Thread 0xb74e3b70 (LWP 24965)):
#0  0x0096b424 in __kernel_vsyscall ()
#1  0x002cdbe9 in ioctl () from /lib/libc.so.6
#2  0x080714aa in kvm_run ()
#3  0x080724c2 in kvm_cpu_exec ()
#4  0x08072753 in ap_main_loop ()
#5  0x00a8bf19 in start_thread () from /lib/libpthread.so.0
#6  0x002d6a2e in clone () from /lib/libc.so.6
Thread 1 (Thread 0xb77e6700 (LWP 24964)):
#0  0x0096b424 in __kernel_vsyscall ()
#1  0x002ce581 in select () from /lib/libc.so.6
#2  0x0805f526 in main_loop_wait ()
#3  0x0807303c in kvm_main_loop ()
#4  0x080604a9 in main ()

qemu-kvm --no-kvm -m 512 -hda f14.img - WORKS
qemu-kvm -m 512 -hda f14.img - DOES NOT WORK
A "me too" post. I have the same issue: F14 does not boot under KVM, and I only get a white screen right after grub. However, it works on my T400 laptop but not on my office server. Could this be an issue with the kvm_amd module?

T400:
model name : Intel(R) Core(TM)2 Duo CPU P8700 @ 2.53GHz

Office server:
model name : AMD Phenom(tm) II X4 945 Processor
The strange thing is that the bug appears with qcow2. I tried with a raw partition and everything is OK. Anyway, let me know if I can help to fix the root cause.
I'm using raw by default, but the guest gets stuck anyway.
AMD or Intel? Guest type? Where is the guest stuck exactly?
Intel. Host is F14 i386. Guest is F14 i386. Please see the "Additional info:" section. If you need more info, please let me know how to get it. Thank you!
Please remove 'rhgb quiet' from the guest command line using grub's built-in editor (assuming it gets past grub), to see what the guest does.

Also post the serial trace:
- start qemu with '-serial stdio'
- add 'console=ttyS0' to the guest grub command line

Interesting qemu commands to run:
- 'info registers'
- 'x/30i $eip - 20'
Update on my issue: Fedora 14 and Ubuntu 10.10 are able to boot if only 1 processor is allocated. Both get stuck right after grub if more than 1 is allocated. I ran both through libvirt.

I'm attaching the console output for both Fedora 14 and Ubuntu 10.10. The qemu parameters were taken from the qemu-kvm command executed by libvirt, with as many extra parameters removed as I could. I couldn't reproduce this on my T400 laptop, however.
Created attachment 458983: qemu console output - fedora14
Created attachment 458984: qemu console output - ubuntu10.10
More info:

info registers:
(qemu) info registers
EAX=00000000 EBX=00ad3067 ECX=00000000 EDX=00000000
ESI=c0ac8fd0 EDI=c0ad4000 EBP=c0991f24 ESP=c0991f1c
EIP=c042803f EFL=00010002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =007b 00000000 ffffffff 00c0f300 DPL=3 DS [-WA]
CS =0060 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
SS =0068 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
DS =007b 00000000 ffffffff 00c0f300 DPL=3 DS [-WA]
FS =00d8 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
GS =00e0 c0ac3d40 00000018 00409100 DPL=0 DS [--A]
LDT=0000 00000000 ffffffff 00000000
TR =0020 00001000 00000067 00008b00 DPL=0 TSS32-busy
GDT= c0abb000 000000ff
IDT= c0997000 000007ff
CR0=80050033 CR2=00000000 CR3=00993000 CR4=00000020
DR0=00000000 DR1=00000000 DR2=00000000 DR3=c05867c3
DR6=ffff0ff0 DR7=00000000
EFER=0000000000000800
FCW=037f

It seems to be stuck at this point.

(qemu) x/30i $eip - 20
0x00000000c042802b: nop
0x00000000c042802c: push %ebp
0x00000000c042802d: mov %esp,%ebp
0x00000000c042802f: push %esi
0x00000000c0428030: push %ebx
0x00000000c0428031: call 0xc0408ff0
0x00000000c0428036: mov %eax,%esi
0x00000000c0428038: mov %edx,%ebx
0x00000000c042803a: mov (%eax),%eax
0x00000000c042803c: mov 0x4(%esi),%edx
0x00000000c042803f: lock cmpxchg8b (%esi)
0x00000000c0428043: jne 0xc042803f
0x00000000c0428045: pop %ebx
0x00000000c0428046: pop %esi
0x00000000c0428047: pop %ebp
0x00000000c0428048: ret
0x00000000c0428049: push %ebp
0x00000000c042804a: mov %esp,%ebp
0x00000000c042804c: call 0xc0408ff0
0x00000000c0428051: pushf
0x00000000c0428052: pop %eax
0x00000000c0428053: pop %ebp
0x00000000c0428054: ret
0x00000000c0428055: push %ebp
0x00000000c0428056: mov %esp,%ebp
0x00000000c0428058: call 0xc0408ff0
0x00000000c042805d: push %eax
0x00000000c042805e: popf
0x00000000c042805f: pop %ebp
0x00000000c0428060: ret

It seems that "lock cmpxchg8b (%esi)" is the problem.

On the screen, only the EDD line appears. I tried with edd=off and made no progress.
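For readers following the disassembly, the loop at 0xc042803a..0xc0428043 can be approximated in C. This is only a sketch under assumptions: `set_64bit_sketch` and `demo_set_64bit` are hypothetical names, not the kernel source, and the GCC/Clang `__atomic_compare_exchange_n` builtin stands in for `lock cmpxchg8b`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the routine disassembled above; not kernel code. */
void set_64bit_sketch(volatile uint64_t *ptr, uint64_t new_value)
{
    /* mov (%eax),%eax ; mov 0x4(%esi),%edx : read the current contents once. */
    uint64_t expected = *ptr;

    /*
     * lock cmpxchg8b (%esi) ; jne <retry>
     * On failure the builtin writes the value found in memory back into
     * 'expected', which is what cmpxchg8b does with edx:eax, so the next
     * attempt compares against fresh data.
     */
    while (!__atomic_compare_exchange_n(ptr, &expected, new_value,
                                        0, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST)) {
        /* retry with the refreshed 'expected' */
    }
}

/* Hypothetical demo: store the value seen in EBX (0x00ad3067) into a slot. */
uint64_t demo_set_64bit(void)
{
    volatile uint64_t slot = 0;
    set_64bit_sketch(&slot, 0x00ad3067ULL);
    assert(slot == 0x00ad3067ULL);
    return slot;
}
```

With nobody else writing to the slot, the loop should exit on the first or second attempt, so an EIP that never leaves 0xc042803f suggests the instruction is not behaving as specified.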
Very similar output to bug #649333. Curiously, I can run Fedora 14 under Fedora 14, but I can't run Fedora 12 or 13. CentOS works well too. Also, after reinstalling the Fedora 13 guest I am unable to boot it. There is no problem with the installation itself; it is only after installation that I can't boot it.
Please also dump the stack: (qemu) x/100x $esp
Note, please verify that $eip has not changed from the previous dump, so we're sure we're debugging the same problem.
(qemu) info registers
EAX=00000000 EBX=00ad3067 ECX=00000000 EDX=00000000
ESI=c0ac8fd0 EDI=c0ad4000 EBP=c0991f24 ESP=c0991f1c
EIP=c042803f EFL=00010002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =007b 00000000 ffffffff 00c0f300 DPL=3 DS [-WA]
CS =0060 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
SS =0068 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
DS =007b 00000000 ffffffff 00c0f300 DPL=3 DS [-WA]
FS =00d8 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
GS =00e0 c0ac3d40 00000018 00409100 DPL=0 DS [--A]
LDT=0000 00000000 ffffffff 00000000
TR =0020 00001000 00000067 00008b00 DPL=0 TSS32-busy
GDT= c0abb000 000000ff
IDT= c0997000 000007ff
CR0=80050033 CR2=00000000 CR3=00993000 CR4=00000020
DR0=00000000 DR1=00000000 DR2=00000000 DR3=c05867c3
DR6=ffff0ff0 DR7=00000000
EFER=0000000000000800
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00000000
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=00000000000000000000000000000000 XMM01=00000000000000000000000000000000
XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000
XMM04=00000000000000000000000000000000 XMM05=00000000000000000000000000000000
XMM06=00000000000000000000000000000000 XMM07=00000000000000000000000000000000

(Same stuck address)

(qemu) x/100x $esp
00000000c0991f1c: 0x00ad3067 0x00000000 0xc0991f2c 0xc04282c5
00000000c0991f2c: 0xc0991f48 0xc0a4dc6a 0x00ad3067 0xc0ac8fd0
00000000c0991f3c: 0x00000000 0xc0a1c9c0 0xc0ad63d0 0xc0991fc0
00000000c0991f4c: 0xc0a3d85c 0x0000000f 0x00000046 0x00000035
00000000c0991f5c: 0xc0991f7d 0x00000000 0x00000000 0x00000000
00000000c0991f6c: 0x205b4aa0 0x30202020 0x3030302e 0xc0991fd8
00000000c0991f7c: 0xc0630020 0xc0991f9e 0x1eed093a 0xc09e8470
00000000c0991f8c: 0xc0428069 0xc0991f98 0xc0464dce 0xc0991fa4
00000000c0991f9c: 0x00000086 0x1eed093a 0x00000000 0x00099800
00000000c0991fac: 0xc0997000 0x1eed093a 0x00000000 0x00099800
00000000c0991fbc: 0xc0997000 0xc0991fe0 0xc0a39562 0xc08dc36b
00000000c0991fcc: 0xc07c0044 0x1eed093a 0xc4b459a5 0x00000000
00000000c0991fdc: 0x07fed000 0xc0991ff8 0xc0a390da 0x07560000
00000000c0991fec: 0x00000000 0x00000800 0x00099800 0x00f84003
00000000c0991ffc: 0x00000000 0x00000000 0x00000000 0x00000000
00000000c099200c: 0x00000000 0x00000000 0x00000000 0x00000000
00000000c099201c: 0x00000000 0x00000000 0x00000000 0x00000000
00000000c099202c: 0x00000000 0x00000000 0x00000000 0x00000000
00000000c099203c: 0x00000000 0x00000000 0x00000000 0x00000000
00000000c099204c: 0x00000000 0x00000000 0x00000000 0x00000000
00000000c099205c: 0x00000000 0x00000000 0x00000000 0x00000000
00000000c099206c: 0x00000000 0x00000000 0x00000000 0x00000000
00000000c099207c: 0x00000000 0x00000000 0x00000000 0x00000000
00000000c099208c: 0x00000000 0x00000000 0x00000000 0x00000000
00000000c099209c: 0x00000000 0x00000000 0x00000000 0x00000000
This is set_64bit in native_set_pmd().

old value = edx:eax = 0
new value = ecx:ebx = 0x00ad3067
ZF = 0 = compare failed

Looks like cmpxchg8b is misemulated on i386.
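To make the reasoning about edx:eax, ecx:ebx and ZF concrete, here is a small software model of what CMPXCHG8B is architecturally expected to do. It is a sketch, not the KVM emulator code: `struct cpu_regs`, `model_cmpxchg8b` and `attempts_until_stored` are hypothetical names, and the initial memory contents passed in are assumptions, since the dump above does not show what (%esi) held.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register file holding only what CMPXCHG8B touches. */
struct cpu_regs {
    uint32_t eax, ebx, ecx, edx;
    bool zf;
};

/* Model of CMPXCHG8B m64: compare edx:eax with memory, then store or load. */
void model_cmpxchg8b(struct cpu_regs *r, uint64_t *mem)
{
    uint64_t expected = ((uint64_t)r->edx << 32) | r->eax;
    uint64_t desired  = ((uint64_t)r->ecx << 32) | r->ebx;

    if (*mem == expected) {
        *mem = desired;                     /* match: store ecx:ebx */
        r->zf = true;
    } else {
        r->eax = (uint32_t)*mem;            /* mismatch: load memory ... */
        r->edx = (uint32_t)(*mem >> 32);    /* ... into edx:eax */
        r->zf = false;
    }
}

/* Count iterations of the "cmpxchg8b; jne" loop, starting from edx:eax = 0. */
int attempts_until_stored(uint64_t initial_mem, uint64_t new_value)
{
    uint64_t mem = initial_mem;             /* assumed memory contents */
    struct cpu_regs r = {
        .eax = 0,
        .ebx = (uint32_t)new_value,
        .ecx = (uint32_t)(new_value >> 32),
        .edx = 0,
        .zf = false,
    };
    int attempts = 0;

    do {
        model_cmpxchg8b(&r, &mem);
        attempts++;
    } while (!r.zf);

    return attempts;
}
```

Under this model the loop needs one attempt if memory already equals edx:eax and two if it does not, because a failed compare refreshes edx:eax. A guest that keeps seeing ZF = 0 with unchanged registers is therefore consistent with the instruction being emulated incorrectly.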
Already fixed in mainline, 16518d5ada690643453eb0aef3cc7841d3623c2d, just needs backporting to 2.6.35. Justin, can you work with the kernel maintainers to cherry-pick this commit ASAP? I'll get this into linux-stable, but that can take a while to percolate.
Created attachment 459498: fix

Tested patch attached.
I have a small (stupid) question: why is set_64bit not inlined even though it is marked 'inline'?
This patch has been added to the F14 kernel and should make the next update.
*** Bug 652373 has been marked as a duplicate of this bug. ***
*** Bug 649333 has been marked as a duplicate of this bug. ***
*** Bug 653696 has been marked as a duplicate of this bug. ***
I think this was fixed in 2.6.35.9-64.fc14.i686.PAE. I have no problems after my host was updated to this kernel. Should we close this bug?
I just experienced this issue on an up-to-date F14 host with an F14 live CD. Both were 64-bit, if it matters. So I would say leave it open.

$ uname -a
Linux jug 2.6.35.9-64.fc14.x86_64 #1 SMP Fri Dec 3 12:19:41 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux
(In reply to comment #25)
> I just experienced this issue on an up to date F14 host with a F14 live cd.
> Both were 64bit if it matters. So I would say leave it open.
>
> $ uname -a
> Linux jug 2.6.35.9-64.fc14.x86_64 #1 SMP Fri Dec 3 12:19:41 UTC 2010 x86_64
> x86_64 x86_64 GNU/Linux

It's a different issue. This is specifically an i386 problem.