Description of problem:
When qemu is given a read-only -drive parameter, then certain operations on that drive (such as mounting partitions from the drive) eventually cause qemu to print this message:
raw_aio_remove: aio request not found!
and segfault. Making the image writable removes the error. *However* note that I want the drive to be read-only.
Version-Release number of selected component (if applicable):
qemu 0.10-12.fc11.x86_64
How reproducible:
Very reliably, particularly with a RHEL guest image.
Steps to Reproduce:
1. chmod -w RHEL.img
2. qemu -drive RHEL.img
3. try mounting a filesystem inside qemu
Actual results:
segfaults with error
Expected results:
should not segfault
Additional info:
Here is a realistic reproducer and stack trace:
$ gdb --args qemu-system-x86_64 -drive file=/dev/mapper/Guests-RHEL39FV32 -m 384 -kernel /usr/lib64/guestfs/vmlinuz.fedora-10.x86_64 -initrd /usr/lib64/guestfs/initramfs.fedora-10.x86_64.img -append "console=ttyS0" -nographic -serial stdio
bash-3.2# mount /dev/sda1 /
[Thread 0x7fa52eafa910 (LWP 31843) exited]
[New Thread 0x7fa52eafa910 (LWP 31845)]
[Thread 0x7fa52eafa910 (LWP 31845) exited]
raw_aio_remove: aio request not found!
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata1.00: cmd ca/00:02:4d:00:00/00:00:00:00:00/e0 tag 0 dma 1024 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata1.00: status: { DRDY }
ata1: link is slow to respond, please be patient (ready=0)
ata1: device not ready (errno=-16), forcing hardreset
ata1: soft resetting link
ata1.00: configured for MWDMA2
ata1: EH complete
raw_aio_remove: aio request not found!
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata1.00: cmd ca/00:02:4d:00:00/00:00:00:00:00/e0 tag 0 dma 1024 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata1.00: status: { DRDY }
ata1: link is slow to respond, please be patient (ready=0)
ata1: device not ready (errno=-16), forcing hardreset
ata1: soft resetting link
ata1.00: configured for MWDMA2
ata1: EH complete
Program received signal SIGSEGV, Segmentation fault.
0x000000000041a6b5 in qemu_paio_cancel (fd=<value optimized out>, aiocb=0x21ef6f0) at posix-aio-compat.c:235
235 TAILQ_REMOVE(&request_list, aiocb, node);
Missing separate debuginfos, use: debuginfo-install SDL-1.2.13-8.fc11.x86_64 gdbm-1.8.0-31.fc11.x86_64 libICE-1.0.4-7.fc11.x86_64 libSM-1.1.0-4.fc11.x86_64 libXext-1.0.99.1-2.fc11.x86_64 libXtst-1.0.3-5.fc11.x86_64 libasyncns-0.7-2.fc11.x86_64 libattr-2.4.43-3.fc11.x86_64 libcap-2.16-2.fc11.x86_64 pulseaudio-libs-0.9.15-3.test5.fc11.x86_64 tcp_wrappers-libs-7.6-54.fc11.x86_64
(gdb) bt
#0 0x000000000041a6b5 in qemu_paio_cancel (fd=<value optimized out>, aiocb=0x21ef6f0) at posix-aio-compat.c:235
#1 0x000000000041b1a8 in raw_aio_cancel (blockacb=<value optimized out>) at block-raw-posix.c:682
#2 0x0000000000432930 in ide_dma_cancel (bm=0x22e5e60) at /usr/src/debug/qemu-kvm-0.10/qemu/hw/ide.c:2973
#3 0x0000000000432998 in bmdma_cmd_writeb (opaque=0x22e5e60, addr=0, val=0) at /usr/src/debug/qemu-kvm-0.10/qemu/hw/ide.c:2987
#4 0x00000000004074db in cpu_outb (env=0x21b0e80, addr=0, val=0) at /usr/src/debug/qemu-kvm-0.10/qemu/vl.c:453
#5 0x000000004271b632 in ?? ()
#6 0x0000000017897108 in ?? ()
#7 0xffffffff81221000 in ?? ()
#8 0x00000000021b0e80 in ?? ()
#9 0x00000000004be275 in phys_page_find (index=<value optimized out>) at /usr/src/debug/qemu-kvm-0.10/qemu/exec.c:389
#10 tlb_set_page_exec (index=<value optimized out>) at /usr/src/debug/qemu-kvm-0.10/qemu/exec.c:1983
#11 0x000000000052bfe3 in tlb_fill (addr=0, is_write=<value optimized out>, mmu_idx=<value optimized out>, retaddr=0x0) at /usr/src/debug/qemu-kvm-0.10/qemu/target-i386/op_helper.c:4774
#12 0x00000000004c0c12 in __ldb_cmmu (addr=18446744071581080137, mmu_idx=0) at /usr/src/debug/qemu-kvm-0.10/qemu/softmmu_template.h:135
#13 0x00000000004c456b in cpu_x86_exec (env1=<value optimized out>) at /usr/src/debug/qemu-kvm-0.10/qemu/cpu-exec.c:626
#14 0x000000000040ca4c in main_loop () at /usr/src/debug/qemu-kvm-0.10/qemu/vl.c:3862
#15 main () at /usr/src/debug/qemu-kvm-0.10/qemu/vl.c:6126
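The trace above faults inside TAILQ_REMOVE, shortly after the "aio request not found!" warnings. What follows is only a minimal sketch, assuming (the report does not confirm this) that the cancel path was reached for a request that was no longer on the pending list. Every name here (struct request, pending, request_submit, request_complete_next, request_cancel, pending_count) is hypothetical and is not qemu's code. It shows a cancel routine that checks membership before unlinking, so that a late cancel becomes a no-op instead of corrupting the list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical request type; this is not qemu's AIO control block. */
struct request {
    int id;
    bool queued;                  /* true only while linked on `pending` */
    TAILQ_ENTRY(request) node;
};

TAILQ_HEAD(request_list, request);

static struct request_list pending = TAILQ_HEAD_INITIALIZER(pending);

/* Queue a request that is not already on the list. */
void request_submit(struct request *req)
{
    assert(req != NULL && !req->queued);
    TAILQ_INSERT_TAIL(&pending, req, node);
    req->queued = true;
}

/* Completion path: unlink the oldest request and mark it as gone. */
struct request *request_complete_next(void)
{
    struct request *req = TAILQ_FIRST(&pending);
    if (req != NULL) {
        TAILQ_REMOVE(&pending, req, node);
        req->queued = false;
    }
    return req;
}

/*
 * Cancel path: unlink only if the request is still queued.
 * Returns true if it was unlinked, false if it had already left the list.
 */
bool request_cancel(struct request *req)
{
    if (req == NULL || !req->queued)
        return false;
    TAILQ_REMOVE(&pending, req, node);
    req->queued = false;
    return true;
}

/* Count the requests still pending (used to check list integrity). */
size_t pending_count(void)
{
    size_t n = 0;
    struct request *req;
    TAILQ_FOREACH(req, &pending, node)
        n++;
    return n;
}
```

With this guard, cancelling a request after request_complete_next() has returned it simply reports false. The guard only helps while the request object itself is still valid; it does nothing for a request whose memory has already been released.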
Good news. I have tested this with the latest qemu from svn (r7228), and the bug appears to be fixed. I have some more testing to do, and if that works I will close the bug.
Yes, this is looking good with qemu from svn.
Reopening and blocking F11VirtTarget.
These patches are waiting to be pulled into the qemu stable branch and should fix it: http://lists.gnu.org/archive/html/qemu-devel/2009-04/msg01276.html
This package has changed ownership in the Fedora Package Database. Reassigning to the new owner of this component.
*** Bug 500185 has been marked as a duplicate of this bug. ***
qemu-0.10.4-2.fc11 has been submitted as an update for Fedora 11. http://admin.fedoraproject.org/updates/qemu-0.10.4-2.fc11
Rich, could you try qemu-0.10.4-2.fc11 and bump its karma if it fixes your problem?
Installing qemu-system-x86-0.10.4-2.fc11.x86_64.rpm gives:
Usage: {start|stop|status|restart|condrestart}
warning: %post(qemu-system-x86-2:0.10.4-2.fc11.x86_64) scriptlet failed, exit status 1
rpm -V qemu-system-x86 gives no output, which I assume means that the failing script (/etc/sysconfig/modules/kvm.modules) is the same as the one in the package. I certainly haven't consciously edited this file ever.
I'm afraid to say this new qemu doesn't solve this problem, although it fails in a different way (it now just abruptly segfaults when I do the same test). Sorry :-(
(In reply to comment #13)
> Installing qemu-system-x86-0.10.4-2.fc11.x86_64.rpm gives:
>
> Usage: {start|stop|status|restart|condrestart}
> warning: %post(qemu-system-x86-2:0.10.4-2.fc11.x86_64) scriptlet failed, exit
> status 1
Thanks Rich, I should have noticed that. Mixup between %{source1} and ${source2}
qemu-0.10.4-3.fc11 is coming:
https://koji.fedoraproject.org/koji/buildinfo?buildID=102024
(In reply to comment #14)
> I'm afraid to say this new qemu doesn't solve this problem,
> although it fails in a different way (it now just abruptly
> segfaults when I do the same test). Sorry :-(
Thanks for trying; backtrace?
qemu-0.10.4-3.fc11 has been submitted as an update for Fedora 11. http://admin.fedoraproject.org/updates/qemu-0.10.4-3.fc11
Here's the command I'm using:
gdb --args /usr/bin/qemu-kvm -drive file=/dev/mapper/Guests-RHEL39FV32 -m 384 -no-reboot -kernel vmlinuz.rawhide.x86_64 -initrd initramfs.rawhide.x86_64.img -append 'panic=1 console=ttyS0 guestfs=10.0.2.4:6666 guestfs_verbose=1' -nographic -serial stdio -net channel,6666:unix:/tmp/sock,server,nowait -net user,vlan=0 -net nic,model=virtio,vlan=0
The initramfs in this case is modified so it gives me a shell inside the guest. At the shell I do:
mount /dev/sda1 /
(note that I only have *read* access, not write access, to the guest block device). KVM segfaults about 10-20 seconds after the mount command.
(gdb) thread apply all bt
Thread 2 (Thread 0x7ff55eee0910 (LWP 3815)):
#0 0x000000000046cb43 in bdrv_aio_cancel (acb=0x1864010) at block.c:1471
#1 0x0000000000434140 in ide_dma_cancel (bm=0x1864e60) at /usr/src/debug/qemu-kvm-0.10.4/hw/ide.c:2973
#2 0x00000000004341a8 in bmdma_cmd_writeb (opaque=0x1864e60, addr=49152, val=0) at /usr/src/debug/qemu-kvm-0.10.4/hw/ide.c:2987
#3 0x000000000051ed88 in kvm_outb (opaque=<value optimized out>, addr=49152, data=0 '\0') at /usr/src/debug/qemu-kvm-0.10.4/qemu-kvm.c:684
#4 0x000000000054c249 in handle_io (vcpu=<value optimized out>, run=<value optimized out>, kvm=<value optimized out>) at libkvm.c:735
#5 kvm_run (vcpu=<value optimized out>, run=<value optimized out>, kvm=<value optimized out>) at libkvm.c:964
#6 0x000000000051f569 in kvm_cpu_exec (env=0x0) at /usr/src/debug/qemu-kvm-0.10.4/qemu-kvm.c:205
#7 0x000000000051f850 in kvm_main_loop_cpu (env=<value optimized out>) at /usr/src/debug/qemu-kvm-0.10.4/qemu-kvm.c:414
#8 ap_main_loop (env=<value optimized out>) at /usr/src/debug/qemu-kvm-0.10.4/qemu-kvm.c:451
#9 0x00000030fca0687a in start_thread () from /lib64/libpthread.so.0
#10 0x00000030fbee04cd in clone () from /lib64/libc.so.6
#11 0x0000000000000000 in ?? ()
Thread 1 (Thread 0x7ff5784f3740 (LWP 3812)):
#0 0x00000030fbed9092 in select () from /lib64/libc.so.6
#1 0x0000000000409c33 in qemu_select (tv=<value optimized out>, xfds=<value optimized out>, wfds=<value optimized out>, rfds=<value optimized out>, max_fd=<value optimized out>) at /usr/src/debug/qemu-kvm-0.10.4/vl.c:3669
#2 main_loop_wait (tv=<value optimized out>, xfds=<value optimized out>, wfds=<value optimized out>, rfds=<value optimized out>, max_fd=<value optimized out>) at /usr/src/debug/qemu-kvm-0.10.4/vl.c:3768
#3 0x000000000051f02a in kvm_main_loop () at /usr/src/debug/qemu-kvm-0.10.4/qemu-kvm.c:596
#4 0x000000000040e9c4 in main_loop () at /usr/src/debug/qemu-kvm-0.10.4/vl.c:3831
#5 main () at /usr/src/debug/qemu-kvm-0.10.4/vl.c:6127
Created attachment 343792 [details] readonly-disk-backtrace.txt stack trace obtained with a build of qemu-kvm-0.10.4 from git
(gdb) p acb
$1 = (BlockDriverAIOCB *) 0xcb3010
(gdb) p *acb
$2 = {pool = 0x280000570108086, bs = 0x400001018000, cb = 0, opaque = 0x0, next = 0xc001}
(gdb) p *acb->pool
Cannot access memory at address 0x280000570108086
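The fields printed above do not look like a live control block. One possible reading, stated here purely as an assumption and not something this report establishes, is that the cancel path used a handle to a request that had already completed and been released. The sketch below illustrates the usual defence: the owner clears its handle at completion time, so a later cancel finds nothing to act on. The types and functions (struct io_request, struct dma_channel, dma_start, dma_complete, dma_cancel) are hypothetical stand-ins, not qemu's BlockDriverAIOCB or IDE code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not qemu's data structures. */
struct io_request {
    int sector;
};

struct dma_channel {
    struct io_request *active;    /* NULL whenever nothing is in flight */
    int completed;
    int cancelled;
};

/* Start a request; this sketch allows only one in flight per channel. */
bool dma_start(struct dma_channel *ch, int sector)
{
    assert(ch != NULL);
    if (ch->active != NULL)
        return false;
    ch->active = malloc(sizeof(*ch->active));
    if (ch->active == NULL)
        return false;
    ch->active->sector = sector;
    return true;
}

/* Completion: clear the handle before freeing, so no stale pointer remains. */
void dma_complete(struct dma_channel *ch)
{
    struct io_request *req = ch->active;
    if (req == NULL)
        return;
    ch->active = NULL;
    free(req);
    ch->completed++;
}

/* Cancel: a no-op (returns false) when the request has already completed. */
bool dma_cancel(struct dma_channel *ch)
{
    struct io_request *req = ch->active;
    if (req == NULL)
        return false;
    ch->active = NULL;
    free(req);
    ch->cancelled++;
    return true;
}
```

In this arrangement a guest that writes the "stop" command after the transfer has finished triggers dma_cancel() on a channel whose handle is already NULL, which returns false without touching freed memory.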
Rich, you mentioned that this does not appear upstream. Is that in qemu upstream, or qemu-kvm? If this is qemu, this might be a kvm specific problem. Otherwise, do you think you can identify the commit that causes the problem? If it gives you too much of a pain, don't bother. bisecting qemu is a PITA ;-(
Hi Glauber ... Yes, this *doesn't* appear in upstream QEMU or KVM, which is what I generally use to test / use libguestfs. I use both qemu from git, and KVM from F-12 (eg. 2:0.10.50-3.kvm85). [Although KVM from F-12 has another annoying boot-time bug (bug 500564).] I understand that bisecting is time-consuming. However, maybe I can have a go at it tomorrow, if you ping me on IRC and talk me through it (I've never used git-bisect before).
qemu-0.10.4-3.fc11 has been pushed to the Fedora 11 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update qemu'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F11/FEDORA-2009-4954
I'm assuming here that -3 will fail in the same way, so I'll put the status back to ASSIGNED.
0.10.4 was supposed to contain fixes for this DMA AIO cancellation stuff, but it turns out only half of them were back-ported. I've submitted the rest of them to qemu-devel for the next stable release and cherry picked them into F-11. They fix the problem for me.
* Thu May 14 2009 Mark McLoughlin <markmc> - 2:0.10.4-4
- Cherry pick more DMA AIO cancellation fixes from upstream (#497170)
qemu-0.10.4-4.fc11 has been submitted as an update for Fedora 11. http://admin.fedoraproject.org/updates/qemu-0.10.4-4.fc11
Mixed results with 0.10.4-4. Good thing is that qemu-kvm doesn't crash. Not so good is that it cannot mount any filesystem if the block device is readonly, instead giving lengthy kernel messages and eventually not being able to even read the superblock. (Note: this does work OK with the QEMU from Rawhide).
(In reply to comment #27)
> Mixed results with 0.10.4-4.
>
> Good thing is that qemu-kvm doesn't crash.
Good. Please bump the update's karma.
> Not so good is that it cannot mount any filesystem if
> the block device is readonly, instead giving lengthy kernel
> messages and eventually not being able to even read the
> superblock. (Note: this does work OK with the QEMU from
> Rawhide).
Hmm, it's working a bit better than that for me - if I boot a guest with a read-only image, the guest hangs late in boot for me, well after mounting filesystems. Please file a new bug for this.
qemu-0.10.4-4.fc11 has been pushed to the Fedora 11 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update qemu'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F11/FEDORA-2009-5050
qemu-0.10.4-5.fc11 has been submitted as an update for Fedora 11. http://admin.fedoraproject.org/updates/qemu-0.10.4-5.fc11
qemu-0.10.4-4.fc11 has been pushed to the Fedora 11 stable repository. If problems still persist, please make note of it in this bug report.