Bug 1277922
Summary: Aborted (core dumped) for "qemu-kvm: block.c:3332: bdrv_aligned_pwritev: Assertion `!qiov || bytes == qiov->size' failed." after 'lvextend' of the guest primary image and 'cont' of the installing guest for the 2nd time

| Field | Value |
|---|---|
| Product | Red Hat Enterprise Linux 7 |
| Component | qemu-kvm-rhev |
| Version | 7.2 |
| Hardware | All |
| OS | Unspecified |
| Status | CLOSED ERRATA |
| Severity | urgent |
| Priority | high |
| Reporter | Gu Nini <ngu> |
| Assignee | Fam Zheng <famz> |
| QA Contact | FuXiangChun <xfu> |
| Docs Contact | |
| CC | bmcclain, christian.grundmann, cmestreg, dgibson, famz, gklein, hannsj_uhl, huding, jkurik, juzhang, knoel, kwolf, lvivier, mst, mwest, ngu, pbonzini, qzhang, sherold, snagar, thuth, virt-maint, xfu, xqu, xuhan, ylavi, zhengtli |
| Target Milestone | rc |
| Target Release | --- |
| Keywords | Regression, ZStream |
| Whiteboard | |
| Fixed In Version | qemu-kvm-rhev-2.5.0-1.el7 |
| Doc Type | Bug Fix |
| Doc Text | |
| Story Points | --- |
| Clone Of | |
| | 1283987 (view as bug list) |
| Environment | |
| Last Closed | 2016-11-07 20:51:58 UTC |
| Type | Bug |
| Regression | --- |
| Mount Type | --- |
| Documentation | --- |
| CRM | |
| Verified Versions | |
| Category | --- |
| oVirt Team | --- |
| RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- |
| Target Upstream Version | |
| Embargoed | |
| Bug Depends On | |
| Bug Blocks | 1281520, 1283987, 1288337, 1359843 |
| Attachments | |
Description (Gu Nini, 2015-11-04 11:08:12 UTC)
Add another 'Additional info': sometimes, after step 5, the bug did not occur, but the guest installation could not finish even after a long time (once 18+ hours), so I just quit the guest.

---

Way too late for 7.2, moving to 7.2.z.

---

A couple of questions to narrow down where the problem could be.

1) Are you able to reproduce this problem with an x86 host and guest?
2) Are you able to reproduce this problem if you use virtio-blk or virtio-scsi instead of spapr-vscsi?

---

This one is pretty odd; looking at the code, I really can't see how that assertion could fail.

---

(In reply to Gu Nini from comment #0)
> Actual results:
> In step5, the guest 'Aborted (core dumped)' just after issue cmd 'cont':
>
> (qemu) cont
> (qemu) info stqemu-kvm: util/iov.c:319: qemu_iovec_concat_iov: Assertion
> `soffset == 0' failed.
> Aborted (core dumped)

Please note the core dump is actually of the following type; I had made a mistake:

(qemu) cont
(qemu) qemu-kvm: block.c:3332: bdrv_aligned_pwritev: Assertion `!qiov || bytes == qiov->size' failed.
Aborted (core dumped)

---

(In reply to David Gibson from comment #4)
> A couple of questions to narrow down where the problem could be.
>
> 1) Are you able to reproduce this problem with an x86 host and guest?

On a ppc64le host, the problem could be reproduced on LVs created both on a host physical disk and on an iscsi block disk, while it failed to reproduce on an iscsi file disk.

On an x86_64 host, since I failed to find an iscsi server that could provide a block disk, I just used the file disk and failed to reproduce the bug; I will retry it if an iscsi server with a block disk becomes available. However, in about 1 of 5 runs I met the following type of core dump, which had also occurred once in the earlier ppc64le tests:

(qemu) cont
(qemu) qemu-kvm: util/iov.c:319: qemu_iovec_concat_iov: Assertion `soffset == 0' failed.
Aborted (core dumped)

It's a pity the core file was lost, since autotest was running on the same host.

> 2) Are you able to reproduce this problem if you use virtio-blk or
> virtio-scsi instead of spapr-vscsi?
Please note that the primary disk of the guest for this bug is really virtio-blk.

---

I'm really surprised by the use of qemu-img on a Logical Volume (without a FS). Extending the size of the LV is like extending the size of a qcow2 file while a guest is using it. Kevin, can we really do that (if "yes", I'm really impressed...)?

---

Laurent, I believe extending the LV should be ok (though reducing would not be ok). IIUC, the way qcow2 works, it allocates space from the front of the file/device.

More to the point, I'm pretty sure putting a qcow2 directly on an LV which gets extended on ENOSPC is something that RHEV does all the time, so it's an important case.

I have a reproducer script on ibm-p8-virt-02, in ~dwg/bz1277922.sh. You need to manually run through the installer prompts, and run the lvextend when it hits the enospc pause, but it does the VG/LV setup and so forth for you. I've hit a couple of different assertion failures so far, but haven't gotten far with analyzing them yet.

---

I've tested this procedure on an x86_64 system with upstream qemu and a real Logical Volume, and I get the following error (and crash) after the first lvextend and continue:

qemu-system-x86_64: /home/lvivier/Projects/qemu/util/iov.c:331: qemu_iovec_concat_iov: Assertion `soffset == 0' failed.

upstream qemu is:

ce27861 configure: Don't disable optimization for non-fortify builds

---

(In reply to Laurent Vivier from comment #6)
> I'm really surprised by the use of qemu-img on a Logical Volume (without FS).
>
> Extending the size of the LV is like extending the size of a qcow2 file
> while a guest is using it.
>
> Kevin, can we really do that (if "yes", I'm really impressed...) ?

Yes. And sorry if you can't sleep well any more now. :-)

At first I was surprised to learn about such use in RHEV, but by now I've mostly just accepted it as a part of reality. As long as the management tool makes sure that the LV is always large enough, there is nothing preventing the use of qcow2 on a block device.
(In reply to Laurent Vivier from comment #9)
> I've tested this procedure on an x86_64 system with upstream qemu and real
> Logical Volume, and I've the following error (and crash) after the first
> lvextend and continue:
>
> qemu-system-x86_64: /home/lvivier/Projects/qemu/util/iov.c:331:
> qemu_iovec_concat_iov: Assertion `soffset == 0' failed.
>
> upstream qemu is:
>
> ce27861 configure: Don't disable optimization for non-fortify builds

Laurent,

The crash you met is of the same type as the one I met in comment #5; I have met it on both x86_64 and ppc64le hosts. However, since it is not of the same type as the one in the original bug, and it always occurs after the first lvextend/guest cont (in comment #5 I also met it there), WHILE the one in the bug occurs after the second time, are you sure it has the same cause as the bug? If not, we should not change Hardware to All. BTW, I only met the crash in the bug on ppc64le.

---

On x86_64, I also have some crashes without an assertion:

Program received signal SIGSEGV, Segmentation fault.
0x00007f25b57a93aa in tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int) () from /lib64/libtcmalloc.so.4
(gdb) bt
#0  0x00007f25b57a93aa in tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int) () from /lib64/libtcmalloc.so.4
#1  0x00007f25b57a96bf in tcmalloc::ThreadCache::Scavenge() () from /lib64/libtcmalloc.so.4
#2  0x00007f25b57bb58f in tc_deletearray () from /lib64/libtcmalloc.so.4
#3  0x00007f25b529579f in g_free () from /lib64/libglib-2.0.so.0
#4  0x0000560f9d049559 in virtio_blk_free_request (req=0x560fa17a6000) at /home/lvivier/Projects/qemu/hw/block/virtio-blk.c:45
#5  virtio_blk_rw_complete (opaque=<optimized out>, ret=0) at /home/lvivier/Projects/qemu/hw/block/virtio-blk.c:121
#6  0x0000560f9d281eee in bdrv_co_complete (acb=0x560fa17e7730) at /home/lvivier/Projects/qemu/block/io.c:2097
#7  0x0000560f9d2cca4a in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at /home/lvivier/Projects/qemu/util/coroutine-ucontext.c:80
#8  0x00007f25b4aaffa0 in ?? () from /lib64/libc.so.6
#9  0x00007ffc70141c10 in ?? ()
#10 0x0000000000000000 in ?? ()

---

(In reply to Gu Nini from comment #11)
> The crash you met is of the same type as the one I met in comment #5, I have
> met it on both x86_64 and ppc64le hosts; however, since it's not of the same
> type as the one in the original bug, and it always occurs after the first
> lvextend/guest cont WHILE the one in the bug occurs after the second time,
> are you sure it has the same cause as the bug? If not, we should not change
> Hardware to All. BTW, I only met the crash in the bug on ppc64le.

You're right, it's not the same, but I have a crash on x86_64 anyway (comment 12). Feel free to revert the hardware to ppc64le if you think it is better.

---

(In reply to Laurent Vivier from comment #12)
> on x86_64, I also have some crashes without an assertion:

I had also met the Segmentation fault (core dump) issue in both x86_64 and ppc64le tests; again it seems to be a different problem...

(In reply to Laurent Vivier from comment #13)
......
> You're right, it's not the same, but I have a crash on x86_64 anyway (comment
> 12).
> Feel free to revert the hardware to ppc64le if you think it is better.

Ok, we can leave it as it is and wait for deeper investigation. Thanks.

---

(In reply to Gu Nini from comment #14)
> (In reply to Laurent Vivier from comment #12)
> > on x86_64, I also have some crashes without an assertion:
>
> I had also met the Segmentation fault (core dump) issue in both x86_64 and
> ppc64le tests; again it seems to be a different problem...
>
> (In reply to Laurent Vivier from comment #13)
> ......
> > You're right, it's not the same, but I have a crash on x86_64 anyway
> > (comment 12).
> > Feel free to revert the hardware to ppc64le if you think it is better.
>
> Ok, we can leave it as it is and wait for deeper investigation.
I've been able to reproduce the same kind of error on x86_64 with upstream qemu:

qemu-system-x86_64: /home/lvivier/Projects/qemu/block/io.c:1123: bdrv_aligned_pwritev: Assertion `!qiov || bytes == qiov->size' failed.

Previously, the following options were missing from my command line:

-drive ... snapshot=off,aio=native ...

The backtrace is:

Program received signal SIGABRT, Aborted.
0x00007f480678f9c8 in raise () from /lib64/libc.so.6
(gdb) bt
#0  0x00007f480678f9c8 in raise () from /lib64/libc.so.6
#1  0x00007f480679165a in abort () from /lib64/libc.so.6
#2  0x00007f4806788187 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007f4806788232 in __assert_fail () from /lib64/libc.so.6
#4  0x0000559376a7dcc6 in bdrv_aligned_pwritev (bs=bs@entry=0x559377ee2000, req=req@entry=0x55937dc77ca0, offset=offset@entry=1648361472, bytes=bytes@entry=8192, qiov=0x55937dc77db0, flags=flags@entry=0) at /home/lvivier/Projects/qemu/block/io.c:1123
#5  0x0000559376a7e70d in bdrv_co_do_pwritev (bs=0x559377ee2000, offset=<optimized out>, bytes=8192, qiov=qiov@entry=0x55937dc77db0, flags=flags@entry=(unknown: 0)) at /home/lvivier/Projects/qemu/block/io.c:1370
#6  0x0000559376a7f318 in bdrv_co_do_writev (flags=(unknown: 0), qiov=0x55937dc77db0, nb_sectors=<optimized out>, sector_num=<optimized out>, bs=<optimized out>) at /home/lvivier/Projects/qemu/block/io.c:1394
#7  bdrv_co_writev (bs=<optimized out>, sector_num=<optimized out>, nb_sectors=<optimized out>, qiov=qiov@entry=0x55937dc77db0) at /home/lvivier/Projects/qemu/block/io.c:1403
#8  0x0000559376a5a965 in qcow2_co_writev (bs=0x559377ed0800, sector_num=63810432, remaining_sectors=16, qiov=0x559379f88038) at /home/lvivier/Projects/qemu/block/qcow2.c:1584
#9  0x0000559376a7dc6d in bdrv_aligned_pwritev (bs=bs@entry=0x559377ed0800, req=req@entry=0x55937dc77f10, offset=offset@entry=32670433280, bytes=bytes@entry=516096, qiov=0x559379f88038, flags=<optimized out>, flags@entry=0) at /home/lvivier/Projects/qemu/block/io.c:1148
#10 0x0000559376a7e70d in bdrv_co_do_pwritev (bs=0x559377ed0800, offset=<optimized out>, bytes=516096, qiov=0x559379f88038, flags=(unknown: 0)) at /home/lvivier/Projects/qemu/block/io.c:1370
#11 0x0000559376a7f3c0 in bdrv_co_do_writev (flags=<optimized out>, qiov=<optimized out>, nb_sectors=<optimized out>, sector_num=<optimized out>, bs=<optimized out>) at /home/lvivier/Projects/qemu/block/io.c:1394
#12 bdrv_co_do_rw (opaque=0x559377e57e30) at /home/lvivier/Projects/qemu/block/io.c:2132
#13 0x0000559376ac7a4a in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at /home/lvivier/Projects/qemu/util/coroutine-ucontext.c:80
#14 0x00007f48067a3fa0 in ?? () from /lib64/libc.so.6
#15 0x00007ffe0049f150 in ?? ()
#16 0x0000000000000000 in ?? ()

---

Nini, I've seen several different assertion failures with no changes in setup. I think the exact crash is non-deterministic, rather than it being a difference between platforms.

Laurent, good start on tracking this down.

---

I've now tried reproducing the problem with both spapr-vscsi and virtio-scsi disks, and wasn't able to. That's another datapoint supporting it being the virtio-blk patch which caused this problem.

Since this doesn't appear to be Power specific, I'm reassigning to the normal pool.
Here's some deeper analysis of a single failure:

(gdb) bt
#0  0x00003fffb6c6e578 in __GI_raise (sig=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00003fffb6c706fc in __GI_abort () at abort.c:90
#2  0x00003fffb7a988c4 in g_assertion_message (domain=<optimized out>, file=0x2772af20 "qom/object.c", line=<optimized out>, func=<optimized out>, message=0x28514260 "assertion failed: (obj->ref > 0)") at gtestutils.c:2292
#3  0x00003fffb7a989cc in g_assertion_message_expr (domain=0x0, file=0x2772af20 "qom/object.c", line=<optimized out>, func=0x2772ae90 <__FUNCTION__.29928> "object_unref", expr=<optimized out>) at gtestutils.c:2307
#4  0x00000000275dec44 in object_unref (obj=0x28510540) at qom/object.c:716
#5  0x000000002737064c in memory_region_unref (mr=<optimized out>) at /usr/src/debug/qemu-2.3.0/memory.c:1369
#6  0x000000002737315c in memory_region_del_subregion (mr=0x28544000, subregion=0x28544200) at /usr/src/debug/qemu-2.3.0/memory.c:1802
#7  0x0000000027373288 in memory_region_finalize (obj=<optimized out>) at /usr/src/debug/qemu-2.3.0/memory.c:1327
#8  0x00000000275ded3c in object_deinit (type=0x28680300, obj=<optimized out>) at qom/object.c:399
#9  object_finalize (data=0x28544000) at qom/object.c:413
#10 object_unref (obj=0x28544000) at qom/object.c:720
#11 0x00000000275decac in object_property_del_all (obj=0x28510540) at qom/object.c:367
#12 object_finalize (data=0x28510540) at qom/object.c:412
#13 object_unref (obj=0x28510540) at qom/object.c:720
#14 0x000000002737064c in memory_region_unref (mr=<optimized out>) at /usr/src/debug/qemu-2.3.0/memory.c:1369
#15 0x00000000273166a0 in address_space_unmap (as=<optimized out>, buffer=<optimized out>, len=<optimized out>, is_write=<optimized out>, access_len=16) at /usr/src/debug/qemu-2.3.0/exec.c:2652
#16 0x00000000273aacf0 in virtqueue_fill (vq=0x29670000, elem=0x297fa010, len=<optimized out>, idx=0) at /usr/src/debug/qemu-2.3.0/hw/virtio/virtio.c:266
#17 0x00000000273aafe4 in virtqueue_push (vq=0x29670000, elem=<optimized out>, len=<optimized out>) at /usr/src/debug/qemu-2.3.0/hw/virtio/virtio.c:294
#18 0x000000002738408c in virtio_blk_complete_request (req=0x297fa000, status=<optimized out>) at /usr/src/debug/qemu-2.3.0/hw/block/virtio-blk.c:58
#19 0x0000000027384774 in virtio_blk_req_complete (status=0 '\000', req=0x297fa000) at /usr/src/debug/qemu-2.3.0/hw/block/virtio-blk.c:64
#20 virtio_blk_rw_complete (opaque=<optimized out>, ret=0) at /usr/src/debug/qemu-2.3.0/hw/block/virtio-blk.c:119
#21 0x00000000276212a4 in bdrv_co_em_bh (opaque=0x2971a340) at block.c:4981
#22 0x0000000027616e64 in aio_bh_poll (ctx=0x28760840) at async.c:85
#23 0x000000002762cf3c in aio_dispatch_clients (ctx=<optimized out>, client_mask=<optimized out>) at aio-posix.c:139
#24 0x0000000027616b90 in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at async.c:219
#25 0x00003fffb7a626d0 in g_main_dispatch (context=0x28610240) at gmain.c:3109
#26 g_main_context_dispatch (context=0x28610240) at gmain.c:3708
#27 0x000000002762b02c in glib_pollfds_poll () at main-loop.c:209
#28 os_host_main_loop_wait (timeout=<optimized out>) at main-loop.c:254
#29 main_loop_wait (nonblocking=<optimized out>) at main-loop.c:503
#30 0x000000002730c630 in main_loop () at vl.c:1818
#31 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4394

Looking into what's going on in the memory_region_unref at #14:

(gdb) frame 15
#15 0x00000000273166a0 in address_space_unmap (as=<optimized out>, buffer=<optimized out>, len=<optimized out>, is_write=<optimized out>, access_len=16) at /usr/src/debug/qemu-2.3.0/exec.c:2652
2652            memory_region_unref(mr);
(gdb) p mr->name
$30 = 0x28507090 "ppc_spapr.ram"

That's the main RAM memory region, so it looks like something is going wrong with the refcounting of regions mapped into the virtqueue.

Created attachment 1092912 [details]
Reproducer script
Here's the script I'm using to reproduce on ibm-p8-virt-02.
Obviously it will need adjustment for local environment and further adjustment to run on an x86 host.
Realised the information Laurent found from a bisect didn't make it into the BZ itself.

From: Laurent Vivier <lvivier>
To: David Gibson <dgibson>
Subject: BZ1277922 ("Assertion `!qiov || bytes == qiov->size'")
Date: Tue, 10 Nov 2015 19:03:16 +0100

David,

As I've been able to reproduce the problem on x86_64, I've tried to bisect (on upstream, v2.2 was good, v2.3 was bad).

I have a commit id, but we can't be sure it is the right one, as I marked a commit as "bad" if QEMU crashed (for any reason) during the install and as "good" if not (on one try).

commit 95f7142abc86a916682bd735aecd90172ffa0d30
Author: Peter Lieven <pl>
Date:   Mon Feb 2 14:52:21 2015 +0100

    virtio-blk: introduce multiread

    this patch finally introduces multiread support to virtio-blk. While
    multiwrite support was there for a long time, read support was missing.

    The complete merge logic is moved into virtio-blk.c which has been the
    only user of request merging ever since. This is required to be able to
    merge chunks of requests and immediately invoke callbacks for those
    requests. Secondly, this is required to switch to direct invocation of
    coroutines which is planned at a later stage.

    The following benchmarks show the performance of running fio with 4
    worker threads on a local ram disk. The numbers show the average of 10
    test runs after 1 run as warmup phase.

                  |        4k        |       64k        |       4k
    MB/s          | rd seq | rd rand | rd seq | rd rand | wr seq | wr rand
    --------------+--------+---------+--------+---------+--------+--------
    master        |  1221  |  1187   |  4178  |  4114   |  1745  |  1213
    multiread     |  1829  |  1189   |  4639  |  4110   |  1894  |  1216

    Signed-off-by: Peter Lieven <pl>
    Signed-off-by: Kevin Wolf <kwolf>

---

As you posted an entirely different backtrace that doesn't crash in the block layer, I'm not sure what Laurent actually ended up testing.
But if we assume that it's the assertion failure in bdrv_aligned_pwritev() as he posted in comment 15, the result suggests that the problem (which is that the request size and the size of the passed qiov don't match) is in virtio-blk.

However, looking at the virtio-blk code actually issuing the requests, I see that this is already explicitly asserted there for the num_reqs > 1 case:

    assert(nb_sectors == qiov->size / BDRV_SECTOR_SIZE);

For the other case, nb_sectors is already initialised with the right value and never changed:

    int nb_sectors = mrb->reqs[start]->qiov.size / BDRV_SECTOR_SIZE;

Looking closer at the backtrace, the failing request doesn't directly come from virtio-blk anyway, but from qcow2. There qiov is reset and then filled in the non-encrypted case with the following code (the encrypted one is obviously correct, too, but it's not our case):

    qemu_iovec_reset(&hd_qiov);
    qemu_iovec_concat(&hd_qiov, qiov, bytes_done, cur_nr_sectors * 512);

After this, hd_qiov must be consistent with cur_nr_sectors if the qiov passed to qcow2 was large enough (which was asserted in the first bdrv_aligned_pwritev() call in the chain). The only code I can see that touches qiov between here and there is the alignment code in bdrv_co_do_pwritev(), but I don't expect it to be used in our case.

Laurent, if this is reproducible for you, can you check whether the alignment code triggers for you, and also dump the qiov when the assertion fails? We probably want both the qiov of the failing bdrv_co_do_pwritev() and the one that was passed to qcow2 a few calls down the stack trace. So far I only see that we have an 8192-byte request, but not what size the qiov ended up. We have remaining_sectors=16 in qcow2_co_writev(), though, so bytes looks like a plausible value at least.

---

Kevin, sorry, I forgot to summarize this for you earlier.
We've seen a number of different aborts / assertion failures with a similar reproducer; it doesn't seem to be deterministic (even with guest SMP disabled, although it seems to trigger a bit more rarely in that case).

There are several factors pointing towards virtio-blk, though none of them are conclusive: 1) the asserts have usually had virtio-blk functions *somewhere* in the backtrace, 2) we haven't been able to reproduce with virtio-scsi or spapr-vscsi disks instead of virtio-blk, and 3) Laurent did a bisect (on ppc) to a virtio-blk patch (see comment 21). The last isn't entirely conclusive because the problem isn't 100% reproducible, although for me the reproducer script I've attached (on ppc64le) seems to trip it on >50% of attempts.

Because of the variety of crash symptoms, I suspect internal memory corruption in qemu (perhaps a buffer overrun). Unfortunately some console problems meant I've lost the details, but I did have one run where one of the structures seemed to have been overwritten with some 0s followed by some repeated 0xfffffffc values, which also suggests some sort of memory overwrite (or maybe use-after-free).

---

Kevin,

I have bisected on x86_64. The problem can take different forms; the last ones I got are:

* qemu-system-x86_64: /home/lvivier/Projects/qemu/block/accounting.c:41: block_acct_done: Assertion `cookie->type < BLOCK_MAX_IOTYPE'

or

* Program received signal SIGSEGV, Segmentation fault.
  0x00007fb7e43c0000 in ?? ()
  (gdb) bt
  #0  0x00007fb7e43c0000 in ?? ()
  #1  0x0000562d1d01ebfe in virtio_blk_req_complete (status=0 '\000', req=0x562d23469330) at /home/lvivier/Projects/qemu/hw/block/virtio-blk.c:63
  #2  virtio_blk_rw_complete (opaque=<optimized out>, ret=0) at /home/lvivier/Projects/qemu/hw/block/virtio-blk.c:110
  ...

As David said, I also think we have memory corruption in the block sub-system. We can imagine something like: on the io-error, the "req" is freed, but as we add space on the disk and ask to continue, it is used anyway.
With valgrind, the following errors come with commit 95f7142 (virtio-blk: introduce multiread):

==2867== Invalid read of size 4
==2867==    at 0x1F0160: virtio_blk_rw_complete (virtio-blk.c:95)
==2867==    by 0x3D291D: bdrv_co_em_bh (block.c:4744)
==2867==    by 0x3CBED6: aio_bh_poll (async.c:84)
==2867==    by 0x3DA94F: aio_dispatch (aio-posix.c:137)
==2867==    by 0x3CBD5D: aio_ctx_dispatch (async.c:223)
==2867==    by 0x64A5A89: g_main_context_dispatch (in /usr/lib64/libglib-2.0.so.0.4400.1)
==2867==    by 0x3D9582: glib_pollfds_poll (main-loop.c:200)
==2867==    by 0x3D9582: os_host_main_loop_wait (main-loop.c:245)
==2867==    by 0x3D9582: main_loop_wait (main-loop.c:494)
==2867==    by 0x1A4F6B: main_loop (vl.c:1794)
==2867==    by 0x1A4F6B: main (vl.c:4350)
==2867==  Address 0x14725174 is 49,220 bytes inside a block of size 49,272 free'd
==2867==    at 0x4C29D6A: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==2867==    by 0x64AB79E: g_free (in /usr/lib64/libglib-2.0.so.0.4400.1)
==2867==    by 0x64C344C: g_slice_free1 (in /usr/lib64/libglib-2.0.so.0.4400.1)
==2867==    by 0x1F01F9: virtio_blk_free_request (virtio-blk.c:44)
==2867==    by 0x1F01F9: virtio_blk_rw_complete (virtio-blk.c:112)
==2867==    by 0x3D291D: bdrv_co_em_bh (block.c:4744)
==2867==    by 0x3CBED6: aio_bh_poll (async.c:84)
==2867==    by 0x3DA94F: aio_dispatch (aio-posix.c:137)
==2867==    by 0x3CBD5D: aio_ctx_dispatch (async.c:223)
==2867==    by 0x64A5A89: g_main_context_dispatch (in /usr/lib64/libglib-2.0.so.0.4400.1)
==2867==    by 0x3D9582: glib_pollfds_poll (main-loop.c:200)
==2867==    by 0x3D9582: os_host_main_loop_wait (main-loop.c:245)
==2867==    by 0x3D9582: main_loop_wait (main-loop.c:494)
==2867==    by 0x1A4F6B: main_loop (vl.c:1794)
==2867==    by 0x1A4F6B: main (vl.c:4350)

[line numbers are the ones from commit 95f7142]

virtio-blk.c:

 86 static void virtio_blk_rw_complete(void *opaque, int ret)
 87 {
 88     VirtIOBlockReq *next = opaque;
 89
 90     while (next) {
 91         VirtIOBlockReq *req = next;
 92         next = req->mr_next;
 93         trace_virtio_blk_rw_complete(req, ret);
 94
 95         if (req->qiov.nalloc != -1) {
 96             /* If nalloc is != 1 req->qiov is a local copy of the original
 97              * external iovec. It was allocated in submit_merged_requests
 98              * to be able to merge requests. */
 99             qemu_iovec_destroy(&req->qiov);
100         }
101
102         if (ret) {
103             int p = virtio_ldl_p(VIRTIO_DEVICE(req->dev), &req->out.type);
104             bool is_read = !(p & VIRTIO_BLK_T_OUT);
105             if (virtio_blk_handle_rw_error(req, -ret, is_read)) {
106                 continue;
107             }
108         }
109
110         virtio_blk_req_complete(req, VIRTIO_BLK_S_OK);
111         block_acct_done(blk_get_stats(req->dev->blk), &req->acct);
112         virtio_blk_free_request(req);
113     }
114 }

In line 95 we try to use "req", which has been released previously (in another loop) in line 112.

On x86, disabling request merging seems to work around the problem:

--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -405,6 +405,7 @@ void virtio_blk_submit_multireq(BlockBackend *blk, MultiReqBuffer *mrb)
         VirtIOBlockReq *req = mrb->reqs[i];
         if (num_reqs > 0) {
             bool merge = true;
+            merge = false;

             /* merge would exceed maximum number of IOVs */
             if (niov + req->qiov.niov > IOV_MAX) {

---

The valgrind information is a good clue, but I don't think the problem is right there. At line 95, req should already have been advanced to the next thing in the chain, which should still be valid (line 91). So that suggests that the mr_next chain is somehow ending up with a cycle.

---

The problem is that at the time of the I/O error, req was also added to the DMA restart queue, so there is a double completion. I'm working on a fix right now.

A fix was posted to the upstream list: http://marc.info/?l=qemu-devel&m=144740716414450&w=2

*** Bug 1253200 has been marked as a duplicate of this bug. ***

*** Bug 1281520 has been marked as a duplicate of this bug. ***

---

According to comment 0, reproduced the bug with qemu-kvm-rhev-2.3.0-31:
(qemu) cli: line 1: 9040 Killed /usr/libexec/qemu-kvm -name lvtest -machine pc,accel=kvm,usb=off -m 4096 -realtime mlock=off -smp 4,sockets=2,cores=2,threads=1 -uuid 95346a10-1828-403a-a610-ac5a52a29411 -no-user-config -nodefaults -monitor stdio -rtc base=utc,clock=host -no-shutdown -boot dc -device nec-usb-xhci,id=xhci -drive id=drive_image1,if=none,cache=none,snapshot=off,aio=native,format=qcow2,file=/dev/xfu_vg/lv1 -device virtio-blk-pci,id=image1,drive=drive_image1,bus=pci.0,addr=06,bootindex=1 -cdrom RHEL-7.3-20160908.3-Server-x86_64-dvd1.iso -netdev tap,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:c4:e7:11,bus=pci.0,addr=0x9 -device usb-kbd,id=input0 -device usb-mouse,id=input1 -device usb-tablet,id=input2 -vnc :1 -msg timestamp=on -vga qxl

---

Verified the bug with qemu-kvm-rhev-2.6.0-23.

---

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2673.html

---

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days.