Bug 807894
| Summary: | mirroring leaves bad backing file | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Shaolong Hu <shu> |
| Component: | qemu-kvm | Assignee: | Paolo Bonzini <pbonzini> |
| Status: | CLOSED WONTFIX | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 6.3 | CC: | acathrow, areis, bsarathy, dyasny, juzhang, michen, mkenneth, pbonzini, tburke, virt-maint |
| Target Milestone: | rc | Keywords: | TestBlocker |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2012-04-04 08:23:42 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 801449 | | |
| Bug Blocks: | 806280, 806432 | | |
| Attachments: | | | |

Description
Shaolong Hu
2012-03-29 05:08:18 UTC
Instead of s4, can you try to do lots of disk IO while streaming?

*** Bug 807898 has been marked as a duplicate of this bug. ***

(In reply to comment #2)
> Instead of s4, can you try to do lots of disk IO while streaming?

Add stress in guest:

stress --cpu 8 --io 4 --vm 2 --vm-bytes 128M
dd if=/dev/zero of=/root/tmp bs=1G count=10

Do mirroring + streaming:

(qemu) __com.redhat_drive-mirror drive-virtio-disk0 /root/sn1
(qemu) block_stream drive-virtio-disk0

Streaming finishes correctly and the guest works correctly. After exiting qemu-kvm and booting the guest with /root/sn1, the guest kernel panics; screenshot in attachment.

[root@shu ~]# qemu-img info sn1
image: sn1
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 20G
cluster_size: 65536
backing file: /root/RHEL-Server-6.3-64-virtio.qcow2 (actual path: /root/RHEL-Server-6.3-64-virtio.qcow2)

The backing file is still /root/RHEL-Server-6.3-64-virtio.qcow2. Is this the problem? Is there a way to modify a qcow2 file's backing file manually? I want to confirm whether setting sn1's backing file to null will solve the problem.

FYI, I also tried quitting qemu-kvm during block streaming (before it finished): booting with sn1 always fails with a file system crash. Is this a problem? When we designed mirroring, which side did we plan to pick when this situation happens?

Created attachment 573566 [details]
guest kernel panic
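
The check done by eye on the `qemu-img info sn1` output above (is a backing file still recorded after streaming?) can be scripted. The following is only a minimal sketch: it parses the human-readable text in the layout quoted in this report, the function name `backing_file` is hypothetical, and other qemu-img versions may print the field differently.

```python
import re
from typing import Optional

# Sample text copied from the `qemu-img info sn1` output quoted in this report.
SAMPLE_INFO = """\
image: sn1
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 20G
cluster_size: 65536
backing file: /root/RHEL-Server-6.3-64-virtio.qcow2 (actual path: /root/RHEL-Server-6.3-64-virtio.qcow2)
"""


def backing_file(info_text: str) -> Optional[str]:
    """Return the backing file named in `qemu-img info` text, or None.

    Hypothetical helper: assumes the "backing file: PATH (actual path: PATH)"
    line layout shown in this bug report.
    """
    pattern = re.compile(r"backing file:\s*(.+?)(?:\s+\(actual path: .*\))?$")
    for line in info_text.splitlines():
        match = pattern.match(line)
        if match:
            return match.group(1)
    return None


if __name__ == "__main__":
    leftover = backing_file(SAMPLE_INFO)
    if leftover is None:
        print("no backing file recorded")
    else:
        print("image still references backing file:", leftover)
```

A result other than `None` after a completed stream job matches the symptom in the summary: the image still depends on its backing file.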
Just came across another situation. With the same command line as in comment 0, simply:

(qemu) __com.redhat_drive-mirror drive-virtio-disk0 /root/sn1
Formatting '/root/sn1', fmt=qcow2 size=21474836480 backing_file='/root/RHEL-Server-6.3-64-virtio.qcow2' backing_fmt='qcow2' encryption=off cluster_size=65536
(qemu) block_stream drive-virtio-disk0
(qemu) quit

Program terminated with signal 11, Segmentation fault.
#0 0x00007f889bcacc75 in bswap16 (bs=0x7f889cdb2480, cluster_index=0) at ./bswap.h:54
54 return bswap_16(x);
(gdb) bt
#0 0x00007f889bcacc75 in bswap16 (bs=0x7f889cdb2480, cluster_index=0) at ./bswap.h:54
#1 be16_to_cpu (bs=0x7f889cdb2480, cluster_index=0) at ./bswap.h:127
#2 get_refcount (bs=0x7f889cdb2480, cluster_index=0) at block/qcow2-refcount.c:109
#3 0x00007f889bcae855 in alloc_clusters_noref (bs=0x7f889cdb2480, size=65536) at block/qcow2-refcount.c:549
#4 qcow2_alloc_clusters (bs=0x7f889cdb2480, size=65536) at block/qcow2-refcount.c:571
#5 0x00007f889bcaf0a6 in l2_allocate (bs=0x7f889cdb2480, offset=307757056, new_l2_table=0x7f889e6db998, new_l2_offset=0x7f889e6db9a0, new_l2_index=0x7f889e6db9ac) at block/qcow2-cluster.c:168
#6 get_cluster_table (bs=0x7f889cdb2480, offset=307757056, new_l2_table=0x7f889e6db998, new_l2_offset=0x7f889e6db9a0, new_l2_index=0x7f889e6db9ac) at block/qcow2-cluster.c:512
#7 0x00007f889bcaf536 in qcow2_alloc_cluster_offset (bs=0x7f889cdb2480, offset=307757056, n_start=0, n_end=1024, num=0x7f889e6dbabc, m=0x7f889e6dba50) at block/qcow2-cluster.c:714
#8 0x00007f889bcab20f in qcow2_co_writev (bs=0x7f889cdb2480, sector_num=<value optimized out>, remaining_sectors=1024, qiov=0x7f889bbb0e90) at block/qcow2.c:555
#9 0x00007f889bc964df in bdrv_co_do_writev (bs=0x7f889cdb2480, sector_num=601088, nb_sectors=1024, qiov=0x7f889bbb0e90, flags=<value optimized out>) at block.c:1700
#10 0x00007f889bc96581 in bdrv_co_do_rw (opaque=<value optimized out>) at block.c:3000
#11 0x00007f889bc9b9bb in coroutine_trampoline (i0=<value optimized out>, i1=<value optimized out>) at coroutine-ucontext.c:129
#12 0x00007f88995ef610 in ?? () from /lib64/libc-2.12.so
#13 0x00007f889bbb0a30 in ?? ()
#14 0x0000000000000000 in ?? ()

Complementary to comment 0:

Program terminated with signal 11, Segmentation fault.
#0 0x00007f31e4bbd63d in bdrv_co_io_em (bs=0x7f31e55cd010, sector_num=2575872, nb_sectors=1024, iov=0x7f30bcfe1e90, is_write=true) at block.c:3147
3147 acb = bs->drv->bdrv_aio_writev(bs, sector_num, iov, nb_sectors,
(gdb) bt
#0 0x00007f31e4bbd63d in bdrv_co_io_em (bs=0x7f31e55cd010, sector_num=2575872, nb_sectors=1024, iov=0x7f30bcfe1e90, is_write=true) at block.c:3147
#1 0x00007f31e4bbe222 in bdrv_co_do_copy_on_readv (bs=0x7f31e55cd010, sector_num=2575872, nb_sectors=1024, qiov=0x7f30bcfe1f50, flags=<value optimized out>) at block.c:1550
#2 bdrv_co_do_readv (bs=0x7f31e55cd010, sector_num=2575872, nb_sectors=1024, qiov=0x7f30bcfe1f50, flags=<value optimized out>) at block.c:1611
#3 0x00007f31e4bdd577 in stream_populate (opaque=0x7f31e6bf7a20) at block/stream.c:76
#4 stream_run (opaque=0x7f31e6bf7a20) at block/stream.c:198
#5 0x00007f31e4bc39bb in coroutine_trampoline (i0=<value optimized out>, i1=<value optimized out>) at coroutine-ucontext.c:129
#6 0x00007f31e2517610 in ?? () from /lib64/libc-2.12.so
#7 0x00007fff86aa6130 in ?? ()
#8 0x0000000000000000 in ?? ()

Note: the core dump in comment 6 does not happen every time.

OK, without mirroring:

(qemu) snapshot_blkdev drive-virtio-disk0 /root/sn1 qcow2
(qemu) block_stream drive-virtio-disk0
(qemu) quit

qemu-kvm core dump:

Program terminated with signal 6, Aborted.
#0 0x00007f5400ae1885 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64 return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
(gdb) bt
#0 0x00007f5400ae1885 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 0x00007f5400ae3065 in abort () at abort.c:92
#2 0x00007f5400ada9fe in __assert_fail_base (fmt=<value optimized out>, assertion=0x7f540330b286 "c->entries[i].ref == 0", file=0x7f540330b25b "block/qcow2-cache.c", line=<value optimized out>, function=<value optimized out>) at assert.c:96
#3 0x00007f5400adaac0 in __assert_fail (assertion=0x7f540330b286 "c->entries[i].ref == 0", file=0x7f540330b25b "block/qcow2-cache.c", line=69, function=0x7f540330b2b0 "qcow2_cache_destroy") at assert.c:105
#4 0x00007f54031b4324 in qcow2_cache_destroy (bs=<value optimized out>, c=0x7f54049dcd70) at block/qcow2-cache.c:69
#5 0x00007f54031ae34a in qcow2_close (bs=0x7f54047e4010) at block/qcow2.c:628
#6 0x00007f5403197f21 in bdrv_close (bs=0x7f54047e4010) at block.c:693
#7 0x00007f5403198068 in bdrv_close_all () at block.c:717
#8 0x00007f5403184985 in kvm_main_loop () at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:2270
#9 0x00007f5403165cec in main_loop (argc=20, argv=<value optimized out>, envp=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:4202
#10 main (argc=20, argv=<value optimized out>, envp=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:6427

This one is similar to Bug 798499 - Guest aborted sometimes when quit it after a savevm. The ones in comment 0 and comment 6 seem to be different; however, I don't know whether these three are related, so I am keeping them here temporarily.

This is a test blocker, raising priority.

When using QMP, after block_stream finishes:

{"timestamp": {"seconds": 1333017106, "microseconds": 862664}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "drive-virtio-disk0", "len": 21474836480, "offset": 21474836480, "speed": 0, "type": "stream", "error": "Operation not supported"}}

Both ide-drive and virtio-blk-pci hit this. Booting the guest with sn1 gives a guest kernel panic like the one in the attachment.

Comment 6 and comment 9 are fixed by attachment 572353 [details]. It worked in my tests, but I'll be brewing and posting this today. If streaming does not finish and qemu-kvm exits, and the destination fails to boot, it should be considered a minor bug.

Sorry, comment 6 and comment 12. I reopened bug 807898 for comment 9.

Finally, comment 0 seems to be similar to comment 9 but specific to mirroring.

Created attachment 573718 [details]
patch to fix the bug, RHEL version
Created attachment 573719 [details]
patch to fix the bug
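
The BLOCK_JOB_COMPLETED event quoted earlier in this report has offset equal to len, yet it also carries an "error" field, which is why the HMP session looked like a clean finish while the image was not usable. A minimal sketch of checking such an event follows; the function name `stream_job_result` is hypothetical, and it assumes that "error" is present only when the job failed.

```python
import json
from typing import Tuple

# Event text copied from the QMP output quoted in this report.
EVENT_LINE = (
    '{"timestamp": {"seconds": 1333017106, "microseconds": 862664}, '
    '"event": "BLOCK_JOB_COMPLETED", '
    '"data": {"device": "drive-virtio-disk0", "len": 21474836480, '
    '"offset": 21474836480, "speed": 0, "type": "stream", '
    '"error": "Operation not supported"}}'
)


def stream_job_result(event: dict) -> Tuple[bool, str]:
    """Return (succeeded, reason) for a parsed QMP event.

    Hypothetical helper: treats the job as successful only when the event is
    BLOCK_JOB_COMPLETED, carries no "error" field, and offset equals len.
    """
    if event.get("event") != "BLOCK_JOB_COMPLETED":
        return False, "not a BLOCK_JOB_COMPLETED event"
    data = event.get("data", {})
    error = data.get("error")
    if error is not None:
        return False, error
    if data.get("offset") != data.get("len"):
        return False, "job stopped before reaching the end of the device"
    return True, ""


if __name__ == "__main__":
    ok, reason = stream_job_result(json.loads(EVENT_LINE))
    print("stream succeeded" if ok else "stream failed: " + reason)
```

Checking the "error" field rather than only comparing offset with len is the point of the sketch: in the quoted event the two numbers match even though the job reported a failure.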
Closing as WONTFIX. The current blkmirror cannot be repaired for the mirror+stream case. It is not yet clear which solution we will implement, but it will not have this problem because it doesn't use block_stream.