Bug 2036193
| Summary: | qemu abort() when creating overlays on top of a RBD disk | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Meina Li <meili> |
| Component: | qemu-kvm | Assignee: | Stefano Garzarella <sgarzare> |
| qemu-kvm sub component: | Ceph | QA Contact: | Virtualization Bugs <virt-bugs> |
| Status: | CLOSED DUPLICATE | Docs Contact: | |
| Severity: | high | ||
| Priority: | high | CC: | coli, hhan, kkiwi, lmen, virt-maint, xuzhang, yicui |
| Version: | 9.0 | Keywords: | TestOnly, Triaged |
| Target Milestone: | rc | Flags: | pm-rhel: mirror+ |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2022-01-04 01:34:59 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 2034791 | ||
| Bug Blocks: | |||
(In reply to Meina Li from comment #0)

> Description of problem:
> Guest shutoff after creating snapshot with rbd disk for a while
>
> Version-Release number of selected component (if applicable):
> libvirt-7.10.0-1.el9.x86_64
> qemu-kvm-6.2.0-1.el9.x86_64
> kernel-5.14.0-39.el9.x86_64

Submitter, can you tell us if this is a new testcase or a regression? If it's a regression, any record of when it was the last time it was tested and worked?

> 2. Start a guest with the rbd disk.
> # virsh start avocado-vt-vm1
> Domain 'avocado-vt-vm1' started
> # virsh dumpxml avocado-vt-vm1 | grep /disk -B8
> <disk type='network' device='disk'>
> <driver name='qemu' type='raw' cache='none'/>

[...]

> # for i in {1..3}; do virsh snapshot-create-as avocado-vt-vm1 s$i --disk-only --diskspec vda,file=/tmp/rbd.s$i; done
> Domain snapshot s1 created
> Domain snapshot s2 created
> Domain snapshot s3 created

[...]

> qemu-kvm: ../block/rbd.c:1355: int qemu_rbd_co_block_status(BlockDriverState *, _Bool, int64_t, int64_t, int64_t *, int64_t *, BlockDriverState **): Assertion `req.bytes <= bytes' failed.
> 2021-12-30 10:49:18.282+0000: shutting down, reason=crashed
> 2)# coredumpctl dump 110922
> ...
> Stack trace of thread 110922:
> #0 0x00007f7a8d3fa7fc __pthread_kill_implementation (libc.so.6 + 0x8f7fc)
> #1 0x00007f7a8d3ad676 __GI_raise (libc.so.6 + 0x42676)
> #2 0x00007f7a8d3977d3 __GI_abort (libc.so.6 + 0x2c7d3)
> #3 0x00007f7a8d3976fb __assert_fail_base (libc.so.6 + 0x2c6fb)
> #4 0x00007f7a8d3a6396 __GI___assert_fail (libc.so.6 + 0x3b396)
> #5 0x00007f7a8a98d021 qemu_rbd_co_block_status (block-rbd.so + 0x5021)
> #6 0x000056340d0fd38e bdrv_co_block_status (qemu-kvm + 0x7d338e)
> #7 0x000056340d0fd545 bdrv_co_block_status (qemu-kvm + 0x7d3545)
> #8 0x000056340d0fceeb bdrv_co_common_block_status_above (qemu-kvm + 0x7d2eeb)
> #9 0x000056340d0b5530 bdrv_common_block_status_above (qemu-kvm + 0x78b530)
> #10 0x000056340d130720 qcow2_co_pwritev_task_entry (qemu-kvm + 0x806720)
> #11 0x000056340d12b177 qcow2_co_pwritev_part (qemu-kvm + 0x801177)

It's interesting that even though the RBD image is in RAW format and using --disk-only to create-snapshot-as, this assertion fail is apparently still going through qcow2 routines? Maybe they are common (to the copy-on-write operation instead of the format)?

Stefano, can you take a look?

(In reply to Klaus Heinrich Kiwi from comment #1)

> It's interesting that even though the RBD image is in RAW format and using
> --disk-only to create-snapshot-as, this assertion fail is apparently still
> going through qcow2 routines? Maybe they are common (to the copy-on-write
> operation instead of the format)?

I guess because the local snapshots are qcow2, while the backend is RBD.

> Stefano, can you take a look?

It seems the same issue of BZ2034791, for now I'm setting this BZ as TestOnly and depending on BZ2034791, but maybe we can close this as DUPLICATE.

This bug can not be reproduced in qemu-kvm-6.1.0-8.el9.

After checking the steps of BZ2034791, this bug does duplicate it. So I directly close this as DUPLICATE.

*** This bug has been marked as a duplicate of bug 2034791 ***
Description of problem:
Guest shutoff after creating snapshot with rbd disk for a while

Version-Release number of selected component (if applicable):
libvirt-7.10.0-1.el9.x86_64
qemu-kvm-6.2.0-1.el9.x86_64
kernel-5.14.0-39.el9.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Prepare a rbd disk image.
# qemu-img convert -f qcow2 -O raw /var/lib/avocado/data/avocado-vt/images/jeos-27-x86_64.qcow2 rbd:blockpull-pool/rbd_blockpull_ktd5.img:mon_host=**IP**

2. Start a guest with the rbd disk.
# virsh start avocado-vt-vm1
Domain 'avocado-vt-vm1' started

# virsh dumpxml avocado-vt-vm1 | grep /disk -B8
<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='none'/>
  <source protocol='rbd' name='blockpull-pool/rbd_blockpull_ktd5.img' index='1'>
    <host name='**IP**' port='6789'/>
  </source>
  <target dev='vda' bus='virtio'/>
  <alias name='virtio-disk0'/>
  <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
</disk>

# virsh list --all
 Id   Name             State
--------------------------------
 2    avocado-vt-vm1   running

3. Create snapshot.
# for i in {1..3}; do virsh snapshot-create-as avocado-vt-vm1 s$i --disk-only --diskspec vda,file=/tmp/rbd.s$i; done
Domain snapshot s1 created
Domain snapshot s2 created
Domain snapshot s3 created

4. Check the status of guest after a while.
# virsh list --all
 Id   Name             State
---------------------------------
 -    avocado-vt-vm1   shut off

Actual results:
The guest will shutoff after creating snapshots for a while.

Expected results:
The guest is running.

Additional info:
1)# cat /var/log/libvirt/qemu/avocado-vt-vm1.log
...
qemu-kvm: ../block/rbd.c:1355: int qemu_rbd_co_block_status(BlockDriverState *, _Bool, int64_t, int64_t, int64_t *, int64_t *, BlockDriverState **): Assertion `req.bytes <= bytes' failed.
2021-12-30 10:49:18.282+0000: shutting down, reason=crashed

2)# coredumpctl dump 110922
...
Stack trace of thread 110922:
#0 0x00007f7a8d3fa7fc __pthread_kill_implementation (libc.so.6 + 0x8f7fc)
#1 0x00007f7a8d3ad676 __GI_raise (libc.so.6 + 0x42676)
#2 0x00007f7a8d3977d3 __GI_abort (libc.so.6 + 0x2c7d3)
#3 0x00007f7a8d3976fb __assert_fail_base (libc.so.6 + 0x2c6fb)
#4 0x00007f7a8d3a6396 __GI___assert_fail (libc.so.6 + 0x3b396)
#5 0x00007f7a8a98d021 qemu_rbd_co_block_status (block-rbd.so + 0x5021)
#6 0x000056340d0fd38e bdrv_co_block_status (qemu-kvm + 0x7d338e)
#7 0x000056340d0fd545 bdrv_co_block_status (qemu-kvm + 0x7d3545)
#8 0x000056340d0fceeb bdrv_co_common_block_status_above (qemu-kvm + 0x7d2eeb)
#9 0x000056340d0b5530 bdrv_common_block_status_above (qemu-kvm + 0x78b530)
#10 0x000056340d130720 qcow2_co_pwritev_task_entry (qemu-kvm + 0x806720)
#11 0x000056340d12b177 qcow2_co_pwritev_part (qemu-kvm + 0x801177)
#12 0x000056340d0fa9ed bdrv_driver_pwritev (qemu-kvm + 0x7d09ed)
#13 0x000056340d0fc360 bdrv_aligned_pwritev (qemu-kvm + 0x7d2360)
#14 0x000056340d0fb753 bdrv_co_pwritev_part (qemu-kvm + 0x7d1753)
#15 0x000056340d0e7362 blk_co_do_pwritev_part (qemu-kvm + 0x7bd362)
#16 0x000056340d0e77d7 blk_aio_write_entry (qemu-kvm + 0x7bd7d7)
#17 0x000056340d2a3016 coroutine_trampoline (qemu-kvm + 0x979016)
#18 0x00007f7a8d3c2810 n/a (libc.so.6 + 0x57810)
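
When re-running the reproduction steps, the crash can be confirmed from the domain log without waiting for `virsh list` to show the guest as shut off. The following is only a minimal sketch: the function name `find_qemu_crash_lines` is hypothetical and not part of libvirt or QEMU, and the two patterns are taken from the log lines quoted in this report, so other crashes may be worded differently.

```shell
#!/bin/sh
# Hypothetical helper (not a libvirt/QEMU tool): print the numbered lines of a
# libvirt domain log that look like an assertion failure or a crash shutdown.
# The patterns mirror the two log lines quoted in this bug report.
find_qemu_crash_lines() {
    log_file="$1"
    grep -n -E "Assertion .* failed|shutting down, reason=crashed" "$log_file"
}

# Example call, using the log path from this report (assumed to exist on the host):
#   find_qemu_crash_lines /var/log/libvirt/qemu/avocado-vt-vm1.log
```

The function returns grep's exit status, so it can be used in a polling loop: a non-zero status means neither pattern has appeared in the log yet.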