Bug 2058459
| Summary: | Qemu core dump when mirror before "STOP" event received that caused by no space left error(iothread enabled) | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | aihua liang <aliang> |
| Component: | qemu-kvm | Assignee: | Hanna Czenczek <hreitz> |
| qemu-kvm sub component: | Block Jobs | QA Contact: | aihua liang <aliang> |
| Status: | CLOSED CURRENTRELEASE | Docs Contact: | |
| Severity: | high | ||
| Priority: | medium | CC: | coli, hreitz, jinzhao, mdeng, ngu, virt-maint |
| Version: | 8.6 | Keywords: | Triaged |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | 2058457 | Environment: | |
| Last Closed: | 2023-06-25 01:47:42 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 2058457 | ||
| Bug Blocks: | |||
qemu-kvm-6.2.0-1.module+el8.6.0+13725+61ae1949 also hits this issue.

Note: We will focus debugging on the RHEL9.0 clone: Bug 2058457

The key point to reproduce is the same as in bz2058457: iothread enabled + the "STOP" event is sent out later than "blockdev_mirror" is executed.

Checked this issue on RHEL8.6, RHEL8.5-av, and RHEL8.4-av; all of them hit it.

qemu-kvm-6.2.0-11.module+el8.6.0+14707+5aa4b42d: 1/20 (the 20th run hit the coredump issue)
qemu-kvm-6.2.0-1.module+el8.6.0+13725+61ae1949: 1/89 (the 89th run hit the coredump issue)
qemu-kvm-6.0.0-33.module+el8.5.0+13514+2c386966.1: 1/29 (the 29th run hit the coredump issue)
qemu-kvm-6.0.0-16.module+el8.5.0+10848+2dccc46d: 1/45 (the 45th run hit the coredump issue)
qemu-kvm-5.2.0-16.module+el8.4.0+11721+c8bbc1be.3: 1/28 (the 28th run hit the coredump issue)
qemu-kvm-5.2.0-0.module+el8.4.0+8855+a9e237a9: 1/58 (the 58th run hit the coredump issue)

So it's not a regression.

Like for BZ 2058457, set the ITR to --- because we don’t have clear-cut plans for how to fix this at this point. (I hope it’ll be fixed by introducing some form of locks for graph-change operations, but we’ll need to see that.)

Tested on qemu-kvm-6.2.0-35.module+el8.9.0+19024+8193e2ac + scsi + iothread with case blockdev_mirror_after_block_error 200 times; all runs pass.

(199/200) repeat199.Host_RHEL.m8.u9.product_rhel.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.8.9.0.x86_64.io-github-autotest-qemu.blockdev_mirror_after_block_error.q35: PASS (99.16 s)
(200/200) repeat200.Host_RHEL.m8.u9.product_rhel.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.8.9.0.x86_64.io-github-autotest-qemu.blockdev_mirror_after_block_error.q35: PASS (98.61 s)
RESULTS : PASS 200 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
JOB HTML : /root/avocado/job-results/job-2023-06-20T22.06-146d4eb/results.html
JOB TIME : 19720.71 s

Tested on qemu-kvm-6.2.0-35.module+el8.9.0+19024+8193e2ac + virtio_blk + iothread with case blockdev_mirror_after_block_error 200 times; all runs pass.

(199/200) repeat199.Host_RHEL.m8.u9.product_rhel.qcow2.virtio_blk.up.virtio_net.Guest.RHEL.8.9.0.x86_64.io-github-autotest-qemu.blockdev_mirror_after_block_error.q35: PASS (97.41 s)
(200/200) repeat200.Host_RHEL.m8.u9.product_rhel.qcow2.virtio_blk.up.virtio_net.Guest.RHEL.8.9.0.x86_64.io-github-autotest-qemu.blockdev_mirror_after_block_error.q35: PASS (98.77 s)
RESULTS : PASS 200 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
JOB HTML : /root/avocado/job-results/job-2023-06-21T05.47-b163a53/results.html
JOB TIME : 19718.78 s

So this bug will be closed as CURRENTRELEASE.
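As a rough sanity check on the 200-run verification above, the following sketch estimates how likely 200 consecutive passes would be if the crash were still present. It assumes that every run fails independently with a fixed probability, and it takes the first-failure run numbers reported earlier in this bug (20, 89, 29, 45, 28, 58) as crude per-run failure estimates of 1/N. Both are assumptions made for illustration; the function names are hypothetical and not part of any test framework.

```python
# Run number at which the first core dump was observed, per build,
# as reported earlier in this bug.
FIRST_FAILURE_RUN = {
    "qemu-kvm-6.2.0-11.module+el8.6.0+14707+5aa4b42d": 20,
    "qemu-kvm-6.2.0-1.module+el8.6.0+13725+61ae1949": 89,
    "qemu-kvm-6.0.0-33.module+el8.5.0+13514+2c386966.1": 29,
    "qemu-kvm-6.0.0-16.module+el8.5.0+10848+2dccc46d": 45,
    "qemu-kvm-5.2.0-16.module+el8.4.0+11721+c8bbc1be.3": 28,
    "qemu-kvm-5.2.0-0.module+el8.4.0+8855+a9e237a9": 58,
}


def prob_all_pass(p_fail: float, runs: int) -> float:
    """Probability that `runs` independent runs all pass,
    given a per-run failure probability `p_fail` (assumed constant)."""
    return (1.0 - p_fail) ** runs


def chance_all_pass_by_build(first_failure: dict, runs: int = 200) -> dict:
    """For each build, assume p_fail = 1/N where N is the run at which the
    first failure was seen, and return the chance of `runs` clean passes."""
    return {
        build: prob_all_pass(1.0 / n, runs)
        for build, n in first_failure.items()
    }


if __name__ == "__main__":
    for build, p in chance_all_pass_by_build(FIRST_FAILURE_RUN).items():
        print(f"{build}: {p:.2%} chance of 200 clean runs")
```

Under these assumptions, even the hardest-to-hit rate (1 in 89) would leave only about a 10% chance of a single 200-run series passing cleanly, and the more frequent rates leave far less. A single first-failure observation is a very noisy estimate, so the numbers are indicative only.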
qemu-kvm-6.2.0-8.module+el8.6.0+14324+050a5215 also hits this issue; core dump info as below:

(gdb) bt
#0 bdrv_parent_can_set_aio_context (errp=0x7fff6f3e2c98, ignore=0x7fff6f3e2c20, ctx=0x559542cafe60, c=0x101010101010101) at ../block.c:7161
#1 bdrv_can_set_aio_context (bs=0x5595435f1800, ctx=0x559542cafe60, ignore=0x7fff6f3e2c20, errp=0x7fff6f3e2c98) at ../block.c:7196
#2 0x0000559541da7913 in bdrv_child_try_set_aio_context (bs=bs@entry=0x5595435f1800, ctx=ctx@entry=0x559542cafe60, ignore_child=ignore_child@entry=0x0, errp=errp@entry=0x7fff6f3e2c98) at ../block.c:7216
#3 0x0000559541da7a37 in bdrv_try_set_aio_context (errp=0x7fff6f3e2c98, ctx=0x559542cafe60, bs=0x5595435f1800) at ../block.c:7233
#4 bdrv_attach_child_common (child_bs=child_bs@entry=0x5595435f1800, child_name=child_name@entry=0x559541f78a07 "root", child_class=child_class@entry=0x559542653160 <child_root>, child_role=child_role@entry=20, perm=perm@entry=0, shared_perm=shared_perm@entry=31, opaque=0x559543a83260, child=0x7fff6f3e2d30, tran=0x559543a83650, errp=0x559542765858 <error_abort>) at ../block.c:2931
#5 0x0000559541da8c7d in bdrv_root_attach_child (child_bs=child_bs@entry=0x5595435f1800, child_name=child_name@entry=0x559541f78a07 "root", child_class=child_class@entry=0x559542653160 <child_root>, child_role=child_role@entry=20, perm=0, shared_perm=31, opaque=0x559543a83260, errp=0x559542765858 <error_abort>) at ../block.c:3055
#6 0x0000559541dc1575 in blk_insert_bs (blk=0x559543a83260, bs=bs@entry=0x5595435f1800, errp=0x559542765858 <error_abort>) at ../block/block-backend.c:862
#7 0x0000559541dd2dfe in mirror_exit_common (job=0x559543955670) at ../block/mirror.c:779
#8 0x0000559541db0121 in job_prepare (job=0x559543955670) at ../job.c:828
#9 0x0000559541db0b61 in job_txn_apply (job=job@entry=0x559543955670, fn=fn@entry=0x559541db0100 <job_prepare>) at ../job.c:158
#10 0x0000559541db15af in job_do_finalize (job=0x559543955670) at ../job.c:845
#11 0x0000559541db1795 in job_exit (opaque=0x559543955670) at ../job.c:932
#12 0x0000559541ea621d in aio_bh_call (bh=0x7f1ea4010200) at ../util/async.c:169
#13 aio_bh_poll (ctx=ctx@entry=0x559542b26170) at ../util/async.c:169
#14 0x0000559541e94562 in aio_dispatch (ctx=0x559542b26170) at ../util/aio-posix.c:381
#15 0x0000559541ea60c2 in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at ../util/async.c:311
#16 0x00007f1eb57a095d in g_main_dispatch (context=0x559542b1e610) at gmain.c:3193
#17 g_main_context_dispatch (context=context@entry=0x559542b1e610) at gmain.c:3873
#18 0x0000559541eb0d60 in glib_pollfds_poll () at ../util/main-loop.c:232
#19 os_host_main_loop_wait (timeout=<optimized out>) at ../util/main-loop.c:255
#20 main_loop_wait (nonblocking=nonblocking@entry=0) at ../util/main-loop.c:531
#21 0x0000559541cab1e9 in qemu_main_loop () at ../softmmu/runstate.c:726
#22 0x0000559541addd82 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at ../softmmu/main.c:50
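To make a backtrace like the one above easier to triage, here is a minimal sketch that splits gdb `bt` output into frames and flags pointer-like arguments that cannot be valid user-space addresses. The regular expressions only target the frame shape seen in this report and are not a general gdb parser. The canonical-address check assumes an x86-64 host with 48-bit virtual addressing, which this bug does not state explicitly. All function names are hypothetical.

```python
import re

# Three frames copied from the backtrace in this bug, used as sample input.
SAMPLE = (
    "#0 bdrv_parent_can_set_aio_context (errp=0x7fff6f3e2c98, "
    "ignore=0x7fff6f3e2c20, ctx=0x559542cafe60, c=0x101010101010101) "
    "at ../block.c:7161\n"
    "#7 0x0000559541dd2dfe in mirror_exit_common (job=0x559543955670) "
    "at ../block/mirror.c:779\n"
    "#18 0x0000559541eb0d60 in glib_pollfds_poll () at ../util/main-loop.c:232\n"
)

# "#N [0xADDR in ]function (args) at file:line"
FRAME_RE = re.compile(
    r"#(\d+)\s+(?:0x[0-9a-fA-F]+ in )?(\w+) \((.*?)\) at (\S+):(\d+)",
    re.DOTALL,
)
# "name=0x..." or "name=name@entry=0x..."
ARG_RE = re.compile(r"(\w+)=(?:\w+@entry=)?(0x[0-9a-fA-F]+)")


def parse_frames(text):
    """Return one dict per frame found in gdb backtrace text."""
    return [
        {
            "index": int(m.group(1)),
            "function": m.group(2),
            "args": m.group(3),
            "file": m.group(4),
            "line": int(m.group(5)),
        }
        for m in FRAME_RE.finditer(text)
    ]


def is_canonical_x86_64(addr):
    """True if bits 63..47 are all equal (assumes 48-bit virtual addresses)."""
    top = addr >> 47
    return top == 0 or top == 0x1FFFF


def suspicious_pointers(frame):
    """List (name, value) pairs whose hex value is a non-canonical address."""
    return [
        (name, value)
        for name, value in ARG_RE.findall(frame["args"])
        if not is_canonical_x86_64(int(value, 16))
    ]


if __name__ == "__main__":
    for frame in parse_frames(SAMPLE):
        flagged = suspicious_pointers(frame)
        print(frame["index"], frame["function"],
              f'{frame["file"]}:{frame["line"]}', flagged)
```

Applied to frame #0, this heuristic flags `c=0x101010101010101`, a repeating byte pattern rather than a plausible pointer. That is consistent with the crash happening while the child `c` is dereferenced, but it does not by itself establish the root cause.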