Bug 2029980
Summary: | Failed assertion in IDE emulation with Ceph backend | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 8 | Reporter: | Hanna Czenczek <hreitz> | ||||||
Component: | qemu-kvm | Assignee: | Hanna Czenczek <hreitz> | ||||||
qemu-kvm sub component: | Storage | QA Contact: | Tingting Mao <timao> | ||||||
Status: | CLOSED ERRATA | Docs Contact: | |||||||
Severity: | medium | ||||||||
Priority: | medium | CC: | coli, jinzhao, jsnow, juzhang, kkiwi, qinwang, timao, virt-maint, yfu | ||||||
Version: | 8.4 | Keywords: | Triaged | ||||||
Target Milestone: | rc | ||||||||
Target Release: | --- | ||||||||
Hardware: | Unspecified | ||||||||
OS: | Unspecified | ||||||||
Whiteboard: | |||||||||
Fixed In Version: | qemu-kvm-6.2.0-15.module+el8.7.0+15644+189a21f6 | Doc Type: | If docs needed, set a value | ||||||
Doc Text: | Story Points: | --- | |||||||
Clone Of: | Environment: | ||||||||
Last Closed: | 2022-11-08 09:18:32 UTC | Type: | Bug | ||||||
Regression: | --- | Mount Type: | --- | ||||||
Documentation: | --- | CRM: | |||||||
Verified Versions: | Category: | --- | |||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||
Embargoed: | |||||||||
Attachments: |
|
Description
Hanna Czenczek
2021-12-07 17:48:57 UTC
Moving this to RHEL for resolution. If we need a RHEL-AV bug, then we will have to use a z-stream process. Correct the reproduce steps(The qemu will core dumped after writing a larger data size than the free size in the guest.) 1. Boot from the windows guest with IDE emulation /usr/libexec/qemu-kvm \ -name guest=instance-0000026f,debug-threads=on \ -S \ -machine pc-i440fx-rhel7.6.0,accel=kvm,usb=off,dump-guest-core=off,memory-backend=pc.ram \ -cpu Icelake-Server,ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,avx512ifma=on,sha-ni=on,rdpid=on,fsrm=on,md-clear=on,stibp=on,arch-capabilities=on,xsaves=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,rdctl-no=on,ibrs-all=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on,tsx-ctrl=on,hle=off,rtm=off,mpx=off,intel-pt=off,hv-time,hv-relaxed,hv-vapic,hv-spinlocks=0x1fff \ -m 1024 \ -object memory-backend-ram,id=pc.ram,size=1073741824 \ -overcommit mem-lock=off \ -smp 4,sockets=4,dies=1,cores=1,threads=1 \ -uuid 75d92da9-2652-4632-ac9f-497677076cf5 \ -smbios 'type=1,manufacturer=Red Hat,product=OpenStack Compute,version=20.6.2-2.20210607104828.el8ost.4,serial=75d92da9-2652-4632-ac9f-497677076cf5,uuid=75d92da9-2652-4632-ac9f-497677076cf5,family=Virtual Machine' \ -no-user-config \ -nodefaults \ -rtc base=localtime,driftfix=slew \ -global kvm-pit.lost_tick_policy=delay \ -no-hpet \ -boot strict=on \ -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \ -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x4 \ -blockdev '{"driver":"rbd","pool":"rbd","image":"win2016.img","cache":{"direct":false,"no-flush":false},"node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"}' \ -blockdev '{"node-name":"libvirt-1-format","read-only":false,"discard":"unmap","cache":{"direct":false,"no-flush":false},"driver":"raw","file":"libvirt-1-storage"}' \ -device ide-hd,bus=ide.0,unit=0,drive=libvirt-1-format,id=ide0-0-0,bootindex=1,write-cache=on \ -netdev tap,id=hostnet0,vhost=on \ -device virtio-net-pci,rx_queue_size=512,host_mtu=1500,netdev=hostnet0,id=net0,mac=fa:16:3e:e9:f3:ab,bus=pci.0,addr=0x3 \ -device usb-tablet,id=input0,bus=usb.0,port=1 \ -vnc :0 \ -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 \ -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 \ -object rng-random,id=objrng0,filename=/dev/urandom \ -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x6 \ -sandbox on,obsolete=deny,elevateprivileges=deny,resourcecontrol=deny \ -msg timestamp=on \ -monitor stdio \ -chardev socket,server=on,path=/var/tmp/monitor-qmpmonitor1-20210721-024113-AsZ7KYro,id=qmp_id_qmpmonitor1,wait=off \ -mon chardev=qmp_id_qmpmonitor1,mode=control \ -blockdev node-name=file_winutils,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/iso/windows/winutils.iso,cache.direct=on,cache.no-flush=off \ -blockdev node-name=drive_winutils,driver=raw,read-only=on,cache.direct=on,cache.no-flush=off,file=file_winutils \ -device ide-cd,id=winutils,drive=drive_winutils,write-cache=on,bus=ide.1,unit=0 \ 2. Write data in the guest over the free size in the guest. (guest)# fio.exe -name=test --filename=C:\User\Administrator\Destop\fio --size=100G --rw=write Result: After step2, the qemu coredumped. (gdb) bt #0 0x00007f8591a2f37f in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50 #1 0x00007f8591a19db5 in __GI_abort () at abort.c:79 #2 0x00007f8591a19c89 in __assert_fail_base (fmt=0x7f8591b825f8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x559aabfaab03 "s->bus->dma->aiocb == NULL", file=0x559aabfaaada "../hw/ide/core.c", line=730, function=<optimized out>) at assert.c:92 #3 0x00007f8591a27a76 in __GI___assert_fail (assertion=assertion@entry=0x559aabfaab03 "s->bus->dma->aiocb == NULL", file=file@entry=0x559aabfaaada "../hw/ide/core.c", line=line@entry=730, function=function@entry=0x559aabfab040 <__PRETTY_FUNCTION__.32556> "ide_cancel_dma_sync") at assert.c:101 #4 0x0000559aabbfe17e in ide_cancel_dma_sync (s=0x559aad883be8) at ../hw/ide/core.c:730 #5 0x0000559aabbfe17e in ide_cancel_dma_sync (s=0x559aad883be8) at ../hw/ide/core.c:699 #6 0x0000559aabc8d7c2 in bmdma_cmd_writeb (bm=0x559aad884d40, val=<optimized out>) at /usr/src/debug/qemu-kvm-5.2.0-16.module+el8.4.0+12393+838d9165.8.x86_64/include/hw/ide/internal.h:570 #7 0x0000559aabd9af68 in memory_region_write_accessor (mr=<optimized out>, addr=<optimized out>, value=<optimized out>, size=<optimized out>, shift=<optimized out>, mask=<optimized out>, attrs=...) at ../softmmu/memory.c:491 #8 0x0000559aabd99b5e in access_with_adjusted_size (addr=addr@entry=0, value=value@entry=0x7f850edfc518, size=size@entry=1, access_size_min=<optimized out>, access_size_max=<optimized out>, access_fn= 0x559aabd9af10 <memory_region_write_accessor>, mr=0x559aad884e90, attrs=...) at ../softmmu/memory.c:552 #9 0x0000559aabd9cf03 in memory_region_dispatch_write (mr=mr@entry=0x559aad884e90, addr=0, data=<optimized out>, op=<optimized out>, attrs=attrs@entry=...) at ../softmmu/memory.c:1501 #10 0x0000559aabdfb2c7 in flatview_write_continue (fv=fv@entry=0x7f8504275090, addr=addr@entry=49376, attrs=..., ptr=ptr@entry=0x7f85957b7000, len=len@entry=1, addr1=<optimized out>, l=<optimized out>, mr=0x559aad884e90) at /usr/src/debug/qemu-kvm-5.2.0-16.module+el8.4.0+12393+838d9165.8.x86_64/include/qemu/host-utils.h:164 #11 0x0000559aabdfb4e6 in flatview_write (fv=0x7f8504275090, addr=49376, attrs=..., buf=0x7f85957b7000, len=1) at ../softmmu/physmem.c:2799 #12 0x0000559aabdff07f in address_space_write (as=<optimized out>, addr=<optimized out>, attrs=..., buf=<optimized out>, len=<optimized out>) at ../softmmu/physmem.c:2891 #13 0x0000559aabe11fe4 in kvm_handle_io (count=1, size=1, direction=<optimized out>, data=<optimized out>, attrs=..., port=49376) at ../accel/kvm/kvm-all.c:2297 #14 0x0000559aabe11fe4 in kvm_cpu_exec (cpu=cpu@entry=0x559aad053e60) at ../accel/kvm/kvm-all.c:2543 #15 0x0000559aabdd1445 in kvm_vcpu_thread_fn (arg=0x559aad053e60) at ../accel/kvm/kvm-cpus.c:49 #16 0x0000559aabf46a24 in qemu_thread_start (args=0x559aad07b330) at ../util/qemu-thread-posix.c:521 #17 0x00007f8591dc514a in start_thread (arg=<optimized out>) at pthread_create.c:479 #18 0x00007f8591af4dc3 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95 The complete coredumped file: http://fileshare.englab.nay.redhat.com/pub/section2/kvm/timao/bugs/coredump/core.qemu-kvm.0.1f07eca2af2a43a6bcf3a3e00cd68c56.211561.1638859852000000.lz4 Assigning to Hanna for now, but we can move that to John or someone else if that ends up being a better fit. Here’s what I found from the core dump: Some TRIM request is going on. I’m not sure exactly what the request entails, because from what I can see, the I/O vector only contains metadata (64 separate trim requests, each stored as a u64, consisting of a 48 bit sector number and a 16 bit sector count), but it only contains zeroes, so no discard is actually done. Because the I/O vector only contains zeroes and no discard is done, ide_issue_trim_cb() is done, and has scheduled ide_trim_bh_cb() to run in a bottom half (this BH is the second entry in qemu_aio_context->bh_list). However, ide_trim_bh_cb() has not been run yet. If it were, it would have called dma_blk_cb(), which would have called ide_dma_cb(), which would have seen the end of transfer and thus called ide_set_inactive(), which would have cleared s->bus->dma->aiocb. I believe the BH hasn’t been run because we haven’t polled for it. blk_drain() just drains the BB, but there are no in-flight requests, and so blk_drain() returns basically immediately (the BDRV_POLL_WHILE() loop in bdrv_do_drained_begin() sees in_flight == 0 everywhere, and so does the AIO_WAIT_WHILE() loop in blk_drain()). So my best guess right now is that we need ide_cancel_dma_sync() needs another pool loop to wait for all scheduled BHs to settle. Created attachment 1848924 [details]
Reproducer, attach as IDE-HD image to x86 PC machine
Got a reproducer, works as follows:
$ ./qemu-system-x86_64 -drive if=ide,file=test.bin,format=raw
qemu-system-x86_64: ../hw/ide/core.c:734: ide_cancel_dma_sync: Assertion `s->bus->dma->aiocb == NULL' failed.
[1] 1447352 abort (core dumped) ./qemu-system-x86_64 -drive if=ide,file=test.bin,format=raw
Couldn’t get this to reproduce via qtest, because that means the system emulator is in a separate process from the testing process, and so QEMU gets a chance to poll between each I/O access, which means ide_trim_bh_cb() will always be invoked before we can cancel the ongoing request. Therefore, the reproducer is an x86 boot sector, which:
(1) looks up the IDE PCI device via BIOS functions
(2) sets up DMA
(3) sends a DSM TRIM command
(4) starts the DMA transfer for it (one completely zeroed 512 byte buffer)
(5) immediately cancels it by setting SRST
(Yes, its backtrace goes ide_perform_srst() -> ide_cancel_dma_sync() instead of bmdma_cmd_writeb() -> ide_cancel_dma_sync(), but I don’t think that matters.)
Hanna
Created attachment 1848925 [details]
Reproducer source code (FASM)
Fix has been merged upstream (7e5cdb345f77d76cb4877fe6230c4e17a7d0d0ca, “ide: Increment BB in-flight counter for TRIM BH”), and so will be in the qemu 7.0 release. Verified this bug as below. Tested with: qemu-kvm-6.2.0-15.module+el8.7.0+15644+189a21f6 kernel-4.18.0-400.el8.x86_64 Steps: 1. Boot from the windows guest with IDE emulation # /usr/libexec/qemu-kvm \ -name guest=instance-0000026f,debug-threads=on \ -S \ -machine pc-i440fx-rhel7.6.0,accel=kvm,usb=off,dump-guest-core=off,memory-backend=pc.ram \ -cpu Icelake-Server,ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,avx512ifma=on,sha-ni=on,rdpid=on,fsrm=on,md-clear=on,stibp=on,arch-capabilities=on,xsaves=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,rdctl-no=on,ibrs-all=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on,tsx-ctrl=on,hle=off,rtm=off,mpx=off,intel-pt=off,hv-time,hv-relaxed,hv-vapic,hv-spinlocks=0x1fff \ -m 1024 \ -object memory-backend-ram,id=pc.ram,size=1073741824 \ -overcommit mem-lock=off \ -smp 4,sockets=4,dies=1,cores=1,threads=1 \ -uuid 75d92da9-2652-4632-ac9f-497677076cf5 \ -smbios 'type=1,manufacturer=Red Hat,product=OpenStack Compute,version=20.6.2-2.20210607104828.el8ost.4,serial=75d92da9-2652-4632-ac9f-497677076cf5,uuid=75d92da9-2652-4632-ac9f-497677076cf5,family=Virtual Machine' \ -no-user-config \ -nodefaults \ -rtc base=localtime,driftfix=slew \ -global kvm-pit.lost_tick_policy=delay \ -no-hpet \ -boot strict=on \ -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \ -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x4 \ -blockdev '{"driver":"rbd","pool":"rbd","image":"win2022.img","cache":{"direct":false,"no-flush":false},"node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"}' \ -blockdev '{"node-name":"libvirt-1-format","read-only":false,"discard":"unmap","cache":{"direct":false,"no-flush":false},"driver":"raw","file":"libvirt-1-storage"}' \ -device ide-hd,bus=ide.0,unit=0,drive=libvirt-1-format,id=ide0-0-0,bootindex=1,write-cache=on \ -netdev tap,id=hostnet0,vhost=on \ -device virtio-net-pci,rx_queue_size=512,host_mtu=1500,netdev=hostnet0,id=net0,mac=fa:16:3e:e9:f3:ab,bus=pci.0,addr=0x3 \ -device usb-tablet,id=input0,bus=usb.0,port=1 \ -vnc :0 \ -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 \ -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 \ -object rng-random,id=objrng0,filename=/dev/urandom \ -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x6 \ -sandbox on,obsolete=deny,elevateprivileges=deny,resourcecontrol=deny \ -msg timestamp=on \ -monitor stdio \ -chardev socket,server=on,path=/var/tmp/monitor-qmpmonitor1-20210721-024113-AsZ7KYro,id=qmp_id_qmpmonitor1,wait=off \ -mon chardev=qmp_id_qmpmonitor1,mode=control \ -blockdev node-name=file_winutils,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/iso/windows/winutils.iso,cache.direct=on,cache.no-flush=off \ -blockdev node-name=drive_winutils,driver=raw,read-only=on,cache.direct=on,cache.no-flush=off,file=file_winutils \ -device ide-cd,id=winutils,drive=drive_winutils,write-cache=on,bus=ide.1,unit=0 \ 2. Write data in the guest over the free size in the guest. (guest)# fio.exe --name=test --filename=C:\fiotest --size=100G --rw=write Result: Steps finished without any errors, and the guest works well. Also, tried the steps in comment7, there is no core dumped. # /usr/libexec/qemu-kvm -drive if=ide,file=test.img,format=raw VNC server running on ::1:5900 According to comment15, set this bug as verified. Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Low: virt:rhel and virt-devel:rhel security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:7472 |