Bug 807894 - mirroring leaves bad backing file
Summary: mirroring leaves bad backing file
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
Version: 6.3
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Paolo Bonzini
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On: 801449
Blocks: 806280 806432
 
Reported: 2012-03-29 05:08 UTC by Shaolong Hu
Modified: 2013-01-10 00:49 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-04-04 08:23:42 UTC
Target Upstream Version:
Embargoed:


Attachments
guest kernel panic (18.60 KB, image/png)
2012-03-29 08:15 UTC, Shaolong Hu
patch to fix the bug, RHEL version (2.68 KB, patch)
2012-03-29 15:56 UTC, Paolo Bonzini
patch to fix the bug (1.15 KB, patch)
2012-03-29 15:57 UTC, Paolo Bonzini

Description Shaolong Hu 2012-03-29 05:08:18 UTC
Description of problem:
------------------------
qemu-kvm dumps core when mirroring and streaming are combined with guest S4 (suspend to disk)


Version-Release number of selected component (if applicable):
--------------------------------------------------------------
qemu-kvm 267rhev
guest kernel 258


How reproducible:
-------------------
1/1


Steps to Reproduce:
--------------------
1. boot guest:
/usr/libexec/qemu-kvm -enable-kvm -M rhel6.3.0 -m 4G -name rhel6.3-64 -rtc base=utc,clock=host,driftfix=slew -no-kvm-pit-reinjection -uuid 3f2ea5cd-3d29-48ff-aab2-23df1b6ae213 -drive file=/root/RHEL-Server-6.3-64-virtio.qcow2,cache=none,if=none,rerror=stop,werror=stop,id=drive-virtio-disk0,format=qcow2 -device virtio-blk-pci,drive=drive-virtio-disk0,id=device-virtio-disk0 -netdev tap,script=/etc/qemu-ifup,id=netdev0 -device virtio-net-pci,netdev=netdev0,id=device-net0 -boot order=cd -monitor stdio -usb -device usb-tablet,id=input0 -chardev socket,id=s1,path=/tmp/s1,server,nowait -device isa-serial,chardev=s1 -vnc :10 -monitor tcp::1234,server,nowait -smp 4 -qmp tcp:0:5555,server,nowait -chardev socket,id=qmp_monitor_id_qmpmonitor1,path=/tmp/qmp,server,nowait -mon chardev=qmp_monitor_id_qmpmonitor1,mode=control

2. in qemu monitor:
(qemu) __com.redhat_drive-mirror drive-virtio-disk0 /root/sn1

3. in qemu monitor:
(qemu) block_stream drive-virtio-disk0

4. in guest:
echo disk > /sys/power/state

  
Actual results:
----------------
qemu-kvm core dump:
(gdb) bt
#0  0x00007f31e4bbd63d in bdrv_co_io_em (bs=0x7f31e55cd010, sector_num=2575872, nb_sectors=1024, iov=0x7f30bcfe1e90, is_write=true) at block.c:3147
#1  0x00007f31e4bbe222 in bdrv_co_do_copy_on_readv (bs=0x7f31e55cd010, sector_num=2575872, nb_sectors=1024, qiov=0x7f30bcfe1f50, flags=<value optimized out>) at block.c:1550
#2  bdrv_co_do_readv (bs=0x7f31e55cd010, sector_num=2575872, nb_sectors=1024, qiov=0x7f30bcfe1f50, flags=<value optimized out>) at block.c:1611
#3  0x00007f31e4bdd577 in stream_populate (opaque=0x7f31e6bf7a20) at block/stream.c:76
#4  stream_run (opaque=0x7f31e6bf7a20) at block/stream.c:198
#5  0x00007f31e4bc39bb in coroutine_trampoline (i0=<value optimized out>, i1=<value optimized out>) at coroutine-ucontext.c:129
#6  0x00007f31e2517610 in ?? () from /lib64/libc-2.12.so
#7  0x00007fff86aa6130 in ?? ()
#8  0x0000000000000000 in ?? ()


I installed the glibc debug package, but there are still unknown symbols. Please let me know which package is missing if more detail is needed.

Comment 2 Dor Laor 2012-03-29 07:15:45 UTC
Instead of S4, can you try doing lots of disk I/O while streaming?

Comment 3 Dor Laor 2012-03-29 07:17:53 UTC
*** Bug 807898 has been marked as a duplicate of this bug. ***

Comment 4 Shaolong Hu 2012-03-29 08:12:58 UTC
(In reply to comment #2)
> Instead of S4, can you try doing lots of disk I/O while streaming?

add stress in guest:

stress --cpu 8 --io 4 --vm 2 --vm-bytes 128M
dd if=/dev/zero of=/root/tmp bs=1G count=10

do mirroring + streaming:
(qemu) __com.redhat_drive-mirror drive-virtio-disk0 /root/sn1      
(qemu) block_stream drive-virtio-disk0 

Streaming finishes correctly and the guest keeps working correctly.

After exiting qemu-kvm and booting the guest from /root/sn1, the guest kernel panics (screenshot in attachment).

[root@shu ~]# qemu-img info sn1
image: sn1
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 20G
cluster_size: 65536
backing file: /root/RHEL-Server-6.3-64-virtio.qcow2 (actual path: /root/RHEL-Server-6.3-64-virtio.qcow2)

The backing file is still /root/RHEL-Server-6.3-64-virtio.qcow2. Is this the problem?

Is there a way to modify a qcow2 file's backing file manually? I want to confirm whether setting sn1's backing file to null will solve the problem.


FYI, I also tried quitting qemu-kvm while block streaming was still in progress. Booting from sn1 then always fails and the file system crashes. Is this a problem? In the mirroring design, which side do we plan to pick when this situation happens?
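
The backing-file check on the `qemu-img info` output above can also be done mechanically. The following is a minimal Python sketch, not part of the original report: the helper name `backing_file` is hypothetical, the sample text is copied from the output in this comment, and the idea that a fully streamed image should report no backing file is only the hypothesis raised in this comment, not a confirmed diagnosis.

```python
import re

# Sample text copied from the `qemu-img info sn1` output in this comment.
INFO_TEXT = """\
image: sn1
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 20G
cluster_size: 65536
backing file: /root/RHEL-Server-6.3-64-virtio.qcow2 (actual path: /root/RHEL-Server-6.3-64-virtio.qcow2)
"""


def backing_file(info_text):
    """Return the backing file path from `qemu-img info` text, or None.

    Hypothetical helper: it only understands the human-readable layout
    shown in this comment.
    """
    for line in info_text.splitlines():
        if line.startswith("backing file:"):
            value = line.split(":", 1)[1].strip()
            # Drop the optional "(actual path: ...)" suffix.
            return re.sub(r"\s*\(actual path: .*\)$", "", value)
    return None


BACKING = backing_file(INFO_TEXT)
if BACKING is None:
    print("image is standalone (no backing file)")
else:
    print("image still depends on backing file:", BACKING)
```

With the output quoted above, the sketch reports that sn1 still depends on the original image.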

Comment 5 Shaolong Hu 2012-03-29 08:15:20 UTC
Created attachment 573566 [details]
guest kernel panic

Comment 6 Shaolong Hu 2012-03-29 08:50:22 UTC
Just ran into another situation:

with the same command line as in comment 0, simply:

(qemu) __com.redhat_drive-mirror drive-virtio-disk0 /root/sn1
Formatting '/root/sn1', fmt=qcow2 size=21474836480 backing_file='/root/RHEL-Server-6.3-64-virtio.qcow2' backing_fmt='qcow2' encryption=off cluster_size=65536 
(qemu) block_stream drive-virtio-disk0
(qemu) quit


Program terminated with signal 11, Segmentation fault.
#0  0x00007f889bcacc75 in bswap16 (bs=0x7f889cdb2480, cluster_index=0) at ./bswap.h:54
54	    return bswap_16(x);

(gdb) bt
#0  0x00007f889bcacc75 in bswap16 (bs=0x7f889cdb2480, cluster_index=0) at ./bswap.h:54
#1  be16_to_cpu (bs=0x7f889cdb2480, cluster_index=0) at ./bswap.h:127
#2  get_refcount (bs=0x7f889cdb2480, cluster_index=0) at block/qcow2-refcount.c:109
#3  0x00007f889bcae855 in alloc_clusters_noref (bs=0x7f889cdb2480, size=65536) at block/qcow2-refcount.c:549
#4  qcow2_alloc_clusters (bs=0x7f889cdb2480, size=65536) at block/qcow2-refcount.c:571
#5  0x00007f889bcaf0a6 in l2_allocate (bs=0x7f889cdb2480, offset=307757056, new_l2_table=0x7f889e6db998, new_l2_offset=0x7f889e6db9a0, new_l2_index=0x7f889e6db9ac) at block/qcow2-cluster.c:168
#6  get_cluster_table (bs=0x7f889cdb2480, offset=307757056, new_l2_table=0x7f889e6db998, new_l2_offset=0x7f889e6db9a0, new_l2_index=0x7f889e6db9ac) at block/qcow2-cluster.c:512
#7  0x00007f889bcaf536 in qcow2_alloc_cluster_offset (bs=0x7f889cdb2480, offset=307757056, n_start=0, n_end=1024, num=0x7f889e6dbabc, m=0x7f889e6dba50) at block/qcow2-cluster.c:714
#8  0x00007f889bcab20f in qcow2_co_writev (bs=0x7f889cdb2480, sector_num=<value optimized out>, remaining_sectors=1024, qiov=0x7f889bbb0e90) at block/qcow2.c:555
#9  0x00007f889bc964df in bdrv_co_do_writev (bs=0x7f889cdb2480, sector_num=601088, nb_sectors=1024, qiov=0x7f889bbb0e90, flags=<value optimized out>) at block.c:1700
#10 0x00007f889bc96581 in bdrv_co_do_rw (opaque=<value optimized out>) at block.c:3000
#11 0x00007f889bc9b9bb in coroutine_trampoline (i0=<value optimized out>, i1=<value optimized out>) at coroutine-ucontext.c:129
#12 0x00007f88995ef610 in ?? () from /lib64/libc-2.12.so
#13 0x00007f889bbb0a30 in ?? ()
#14 0x0000000000000000 in ?? ()

Comment 7 Shaolong Hu 2012-03-29 08:52:53 UTC
Complementary to comment 0:

Program terminated with signal 11, Segmentation fault.
#0  0x00007f31e4bbd63d in bdrv_co_io_em (bs=0x7f31e55cd010, sector_num=2575872, nb_sectors=1024, iov=0x7f30bcfe1e90, is_write=true) at block.c:3147
3147	        acb = bs->drv->bdrv_aio_writev(bs, sector_num, iov, nb_sectors,

(gdb) bt
#0  0x00007f31e4bbd63d in bdrv_co_io_em (bs=0x7f31e55cd010, sector_num=2575872, nb_sectors=1024, iov=0x7f30bcfe1e90, is_write=true) at block.c:3147
#1  0x00007f31e4bbe222 in bdrv_co_do_copy_on_readv (bs=0x7f31e55cd010, sector_num=2575872, nb_sectors=1024, qiov=0x7f30bcfe1f50, flags=<value optimized out>) at block.c:1550
#2  bdrv_co_do_readv (bs=0x7f31e55cd010, sector_num=2575872, nb_sectors=1024, qiov=0x7f30bcfe1f50, flags=<value optimized out>) at block.c:1611
#3  0x00007f31e4bdd577 in stream_populate (opaque=0x7f31e6bf7a20) at block/stream.c:76
#4  stream_run (opaque=0x7f31e6bf7a20) at block/stream.c:198
#5  0x00007f31e4bc39bb in coroutine_trampoline (i0=<value optimized out>, i1=<value optimized out>) at coroutine-ucontext.c:129
#6  0x00007f31e2517610 in ?? () from /lib64/libc-2.12.so
#7  0x00007fff86aa6130 in ?? ()
#8  0x0000000000000000 in ?? ()

Comment 8 Shaolong Hu 2012-03-29 09:00:03 UTC
Note: core dump in comment 6 does not happen every time.

Comment 9 Shaolong Hu 2012-03-29 09:09:37 UTC
ok, without mirroring:

(qemu) snapshot_blkdev drive-virtio-disk0 /root/sn1 qcow2  
(qemu) block_stream drive-virtio-disk0 
(qemu) quit

qemu-kvm core dump:

Program terminated with signal 6, Aborted.
#0  0x00007f5400ae1885 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64	  return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);

(gdb) bt
#0  0x00007f5400ae1885 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007f5400ae3065 in abort () at abort.c:92
#2  0x00007f5400ada9fe in __assert_fail_base (fmt=<value optimized out>, assertion=0x7f540330b286 "c->entries[i].ref == 0", file=0x7f540330b25b "block/qcow2-cache.c", line=<value optimized out>, 
    function=<value optimized out>) at assert.c:96
#3  0x00007f5400adaac0 in __assert_fail (assertion=0x7f540330b286 "c->entries[i].ref == 0", file=0x7f540330b25b "block/qcow2-cache.c", line=69, function=0x7f540330b2b0 "qcow2_cache_destroy") at assert.c:105
#4  0x00007f54031b4324 in qcow2_cache_destroy (bs=<value optimized out>, c=0x7f54049dcd70) at block/qcow2-cache.c:69
#5  0x00007f54031ae34a in qcow2_close (bs=0x7f54047e4010) at block/qcow2.c:628
#6  0x00007f5403197f21 in bdrv_close (bs=0x7f54047e4010) at block.c:693
#7  0x00007f5403198068 in bdrv_close_all () at block.c:717
#8  0x00007f5403184985 in kvm_main_loop () at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:2270
#9  0x00007f5403165cec in main_loop (argc=20, argv=<value optimized out>, envp=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:4202
#10 main (argc=20, argv=<value optimized out>, envp=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:6427



This one is similar to Bug 798499 - Guest aborted sometimes when quit it after a savevm.


The crashes in comment 0 and comment 6 seem to be different. However, I don't know whether these three are related, so I am keeping them all here for now.

This is a test blocker, so I am raising the priority.

Comment 11 Shaolong Hu 2012-03-29 10:08:34 UTC
Note: the tests in comment 6 and comment 9 used "ide-drive", not "virtio-blk-pci".

Comment 12 Shaolong Hu 2012-03-29 10:39:37 UTC
When using QMP, after block_stream finishes:

{"timestamp": {"seconds": 1333017106, "microseconds": 862664}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "drive-virtio-disk0", "len": 21474836480, "offset": 21474836480, "speed": 0, "type": "stream", "error": "Operation not supported"}}

Both ide-drive and virtio-blk-pci hit this.

Booting the guest from sn1 afterwards results in a guest kernel panic like the one in the attachment.
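
For anyone scripting this test, the QMP event above can be checked for failure automatically. The following is a minimal Python sketch, not part of the original report: the function name `block_job_result` is hypothetical, the event text is copied verbatim from this comment, and the success criterion (no "error" field and "offset" equal to "len") is an assumption based only on the fields visible in this one event.

```python
import json

# Event text copied verbatim from this comment.
EVENT_LINE = (
    '{"timestamp": {"seconds": 1333017106, "microseconds": 862664}, '
    '"event": "BLOCK_JOB_COMPLETED", "data": {"device": "drive-virtio-disk0", '
    '"len": 21474836480, "offset": 21474836480, "speed": 0, "type": "stream", '
    '"error": "Operation not supported"}}'
)


def block_job_result(line):
    """Return (device, succeeded, error) for a BLOCK_JOB_COMPLETED event.

    Returns None for any other QMP message. Hypothetical helper.
    """
    msg = json.loads(line)
    if msg.get("event") != "BLOCK_JOB_COMPLETED":
        return None
    data = msg["data"]
    error = data.get("error")
    # Assumption: a job succeeded only if no error is reported and the
    # whole length was processed.
    succeeded = error is None and data["offset"] == data["len"]
    return data["device"], succeeded, error


RESULT = block_job_result(EVENT_LINE)
print(RESULT)
```

Note that in the event above "offset" equals "len" even though an error is reported, so a script that only compares those two fields would wrongly treat the job as successful.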

Comment 13 Paolo Bonzini 2012-03-29 13:57:27 UTC
Comment 6 and comment 9 are fixed by attachment 572353 [details].  It worked in my tests, but I'll be brewing and posting this today.

If streaming does not finish and qemu-kvm exits, and the destination fails to boot, it should be considered a minor bug.

Comment 14 Paolo Bonzini 2012-03-29 14:18:50 UTC
Sorry, that should read comment 6 and comment 12.

I reopened bug 807898 for comment 9.

Finally, comment 0 seems to be similar to comment 9 but specific to mirroring.

Comment 15 Paolo Bonzini 2012-03-29 15:56:26 UTC
Created attachment 573718 [details]
patch to fix the bug, RHEL version

Comment 16 Paolo Bonzini 2012-03-29 15:57:24 UTC
Created attachment 573719 [details]
patch to fix the bug

Comment 17 Paolo Bonzini 2012-04-04 08:23:42 UTC
Closing as WONTFIX.  The current blkmirror cannot be repaired for the mirror+stream case.  It is not yet clear what solution we will implement, but it will not have this problem because it will not use block_stream.

