Bug 1529209
| Field | Value |
|---|---|
| Summary | QEMU core dumped when creating internal snapshots when guest boots up from luks-inside-qcow2 image |
| Product | Red Hat Enterprise Linux 7 |
| Component | qemu-kvm-rhev |
| Version | 7.5 |
| Status | CLOSED WONTFIX |
| Severity | medium |
| Priority | low |
| Reporter | yilzhang |
| Assignee | Eric Blake <eblake> |
| QA Contact | Tingting Mao <timao> |
| CC | ailan, areis, berrange, chayang, coli, eblake, jferlan, juzhang, michen, ngu, pingl, qzhang, timao, virt-maint |
| Target Milestone | rc |
| Hardware | All |
| OS | Linux |
| Type | Bug |
| Last Closed | 2018-12-14 16:07:18 UTC |
Description
yilzhang
2017-12-27 06:56:53 UTC
Power8 and x86_64 give different results for this test case.

1. Power8:
Test result after step 3 (QMP):
{"execute":"human-monitor-command","arguments":{"command-line":"savevm sn1"}}
{"timestamp": {"seconds": 1514363870, "microseconds": 420042}, "event": "STOP"}
{"timestamp": {"seconds": 1514363870, "microseconds": 443402}, "event": "RESUME"}
{"return": "Error while writing VM state: Input/output error\r\n"}

2. x86: qemu-kvm crashed when creating an internal snapshot
Host kernel: 3.10.0-823.el7.x86_64
Guest kernel: 3.10.0-823.el7.x86_64
qemu-kvm-rhev: qemu-kvm-rhev-2.10.0-13.el7

Test result after step 3: qemu-kvm crashed with a core dump.
(qemu) qemu-kvm: block/qcow2-cluster.c:403: do_perform_cow_encrypt: Assertion `(offset_in_cluster & ~~((1ULL << 9) - 1)) == 0' failed.
bug1529209.sh: line 23: 19918 Aborted (core dumped) /usr/libexec/qemu-kvm -name virt5-yilzhang-vm -smp 8,sockets=2,cores=4,threads=1 -m 8192 -serial unix:/tmp/lq-serial.log,server,nowait -nodefaults -vnc :9 -vga std -rtc base=localtime,clock=host -boot menu=on -monitor stdio -qmp tcp:0:9991,server,nowait -device nec-usb-xhci -device usb-kbd -device usb-tablet -device usb-mouse --object secret,id=sec0,data=backing -device virtio-scsi-pci,id=scsi0 -drive file=/home/yilzhang/LUKS/base.qcow2,encrypt.key-secret=sec0,if=none,format=qcow2,rerror=stop,werror=stop,cache=none,id=drive_sysdisk -device scsi-hd,drive=drive_sysdisk,bus=scsi0.0,id=sysdisk,bootindex=0 -netdev tap,id=net0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown,vhost=on -device virtio-net-pci,netdev=net0,id=nic0,mac=52:54:00:c3:e7:8f

GDB output:
[New LWP 20218]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/libexec/qemu-kvm -name virt5-yilzhang-vm -smp 8,sockets=2,cores=4,threads='.
Program terminated with signal 6, Aborted.
#0 0x00007fef770861d7 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install boost-system-1.53.0-27.el7.x86_64 boost-thread-1.53.0-27.el7.x86_64 bzip2-libs-1.0.6-13.el7.x86_64 celt051-0.5.1.3-8.el7.x86_64 cyrus-sasl-gssapi-2.1.26-22.el7.x86_64 cyrus-sasl-lib-2.1.26-22.el7.x86_64 cyrus-sasl-md5-2.1.26-22.el7.x86_64 cyrus-sasl-plain-2.1.26-22.el7.x86_64 elfutils-libelf-0.170-2.el7.x86_64 elfutils-libs-0.170-2.el7.x86_64 glib2-2.54.2-1.el7.x86_64 glibc-2.17-217.el7.x86_64 glusterfs-api-3.8.4-52.el7.x86_64 glusterfs-libs-3.8.4-52.el7.x86_64 gmp-6.0.0-15.el7.x86_64 gnutls-3.3.26-9.el7.x86_64 gperftools-libs-2.6.1-1.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-17.el7.x86_64 libacl-2.2.51-14.el7.x86_64 libaio-0.3.109-13.el7.x86_64 libattr-2.4.46-13.el7.x86_64 libblkid-2.23.2-46.el7.x86_64 libcacard-2.5.2-2.el7.x86_64 libcap-2.22-9.el7.x86_64 libcom_err-1.42.9-10.el7.x86_64 libcurl-7.29.0-45.el7.x86_64 libdb-5.3.21-22.el7.x86_64 libffi-3.0.13-18.el7.x86_64 libgcc-4.8.5-22.el7.x86_64 libgcrypt-1.5.3-14.el7.x86_64 libgpg-error-1.12-3.el7.x86_64 libibverbs-15-1.el7.x86_64 libidn-1.28-4.el7.x86_64 libiscsi-1.9.0-7.el7.x86_64 libjpeg-turbo-1.2.90-5.el7.x86_64 libmount-2.23.2-46.el7.x86_64 libnl3-3.2.28-4.el7.x86_64 libpng-1.5.13-7.el7_2.x86_64 librados2-0.94.5-2.el7.x86_64 librbd1-0.94.5-2.el7.x86_64 librdmacm-15-1.el7.x86_64 libseccomp-2.3.1-3.el7.x86_64 libselinux-2.5-12.el7.x86_64 libssh2-1.4.3-10.el7_2.1.x86_64 libstdc++-4.8.5-22.el7.x86_64 libtasn1-4.10-1.el7.x86_64 libusbx-1.0.21-1.el7.x86_64 libuuid-2.23.2-46.el7.x86_64 lz4-1.7.5-2.el7.x86_64 lzo-2.06-8.el7.x86_64 nettle-2.7.1-8.el7.x86_64 nspr-4.17.0-1.el7.x86_64 nss-3.34.0-0.1.beta1.el7.x86_64 nss-softokn-freebl-3.34.0-0.2.beta1.el7.x86_64 nss-util-3.34.0-0.1.beta1.el7.x86_64 numactl-libs-2.0.9-7.el7.x86_64 openldap-2.4.44-9.el7.x86_64 openssl-libs-1.0.2k-8.el7.x86_64 opus-1.0.2-6.el7.x86_64 p11-kit-0.23.5-3.el7.x86_64 pcre-8.32-17.el7.x86_64 pixman-0.34.0-1.el7.x86_64 snappy-1.1.0-3.el7.x86_64 spice-server-0.14.0-2.el7.x86_64 systemd-libs-219-46.el7.x86_64 usbredir-0.7.1-2.el7.x86_64 xz-libs-5.2.2-1.el7.x86_64 zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0 0x00007fef770861d7 in raise () at /lib64/libc.so.6
#1 0x00007fef770878d0 in abort () at /lib64/libc.so.6
#2 0x00007fef7707efcc in __assert_fail_base () at /lib64/libc.so.6
#3 0x00007fef7707f088 in () at /lib64/libc.so.6
#4 0x00005589fcda72d2 in do_perform_cow_encrypt (src_cluster_offset=<optimized out>, cluster_offset=<optimized out>, offset_in_cluster=<optimized out>, buffer=buffer@entry=0x558a00aea000 "", bytes=<optimized out>, bs=<optimized out>, bs=<optimized out>) at block/qcow2-cluster.c:403
#5 0x00005589fcda8b6d in qcow2_alloc_cluster_link_l2 (bs=<optimized out>, bs=<optimized out>, bytes=<optimized out>, buffer=0x558a00aea000 "", offset_in_cluster=<optimized out>, cluster_offset=<optimized out>, src_cluster_offset=<optimized out>) at block/qcow2-cluster.c:780
#6 0x00005589fcda8b6d in qcow2_alloc_cluster_link_l2 (m=0x5589ffd103c0, bs=0x5589ff8c6000) at block/qcow2-cluster.c:797
#7 0x00005589fcda8b6d in qcow2_alloc_cluster_link_l2 (bs=bs@entry=0x5589ff8c6000, m=0x5589ffd103c0) at block/qcow2-cluster.c:868
#8 0x00005589fcd9c8a0 in qcow2_co_pwritev (bs=0x5589ff8c6000, offset=21474836480, bytes=132384, qiov=0x7ffc08901d70, flags=<optimized out>) at block/qcow2.c:2003
#9 0x00005589fcdc8c80 in bdrv_co_rw_vmstate (bs=0x5589ff8c6000, qiov=<optimized out>, pos=<optimized out>, is_read=<optimized out>) at block/io.c:2051
#10 0x00005589fcdc8ce8 in bdrv_co_rw_vmstate_entry (opaque=0x7ffc08901d20) at block/io.c:2064
#11 0x00005589fce545da in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:79
#12 0x00007fef77097fa0 in __start_context () at /lib64/libc.so.6
#13 0x00007ffc08902670 in ()
#14 0x0000000000000000 in ()
(gdb) bt full
#0 0x00007fef770861d7 in raise () at /lib64/libc.so.6
#1 0x00007fef770878d0 in abort () at /lib64/libc.so.6
#2 0x00007fef7707efcc in __assert_fail_base () at /lib64/libc.so.6
#3 0x00007fef7707f088 in () at /lib64/libc.so.6
#4 0x00005589fcda72d2 in do_perform_cow_encrypt (src_cluster_offset=<optimized out>, cluster_offset=<optimized out>, offset_in_cluster=<optimized out>, buffer=buffer@entry=0x558a00aea000 "", bytes=<optimized out>, bs=<optimized out>, bs=<optimized out>) at block/qcow2-cluster.c:403
        s = <optimized out>
        sector = <optimized out>
#5 0x00005589fcda8b6d in qcow2_alloc_cluster_link_l2 (bs=<optimized out>, bs=<optimized out>, bytes=<optimized out>, buffer=0x558a00aea000 "", offset_in_cluster=<optimized out>, cluster_offset=<optimized out>, src_cluster_offset=<optimized out>) at block/qcow2-cluster.c:780
        end = 0x5589ffd103f0
        merge_reads = <optimized out>
        buffer_size = <optimized out>
        data_bytes = 132384
        start_buffer = 0x558a00aea000 ""
        ret = <optimized out>
        s = 0x5589ff826000
        start = 0x5589ffd103e8
        end_buffer = 0x558a00aea000 ""
        qiov = {iov = 0x558a00150d80, niov = 1, nalloc = 3, size = 64224}
        s = 0x5589ff826000
        i = <optimized out>
        j = 0
        l2_index = 512
        ret = <optimized out>
        old_cluster = 0x558a00587480
        l2_table = 0x558a00150d80
        cluster_offset = 4122148864
        __PRETTY_FUNCTION__ = "qcow2_alloc_cluster_link_l2"
#6 0x00005589fcda8b6d in qcow2_alloc_cluster_link_l2 (m=0x5589ffd103c0, bs=0x5589ff8c6000) at block/qcow2-cluster.c:797
        end = 0x5589ffd103f0
        merge_reads = <optimized out>
        buffer_size = <optimized out>
        data_bytes = 132384
        start_buffer = 0x558a00aea000 ""
        ret = <optimized out>
        s = 0x5589ff826000
        start = 0x5589ffd103e8
        end_buffer = 0x558a00aea000 ""
        qiov = {iov = 0x558a00150d80, niov = 1, nalloc = 3, size = 64224}
        s = 0x5589ff826000
        i = <optimized out>
        j = 0
        l2_index = 512
        ret = <optimized out>
        old_cluster = 0x558a00587480
        l2_table = 0x558a00150d80
        cluster_offset = 4122148864
        __PRETTY_FUNCTION__ = "qcow2_alloc_cluster_link_l2"
#7 0x00005589fcda8b6d in qcow2_alloc_cluster_link_l2 (bs=bs@entry=0x5589ff8c6000, m=0x5589ffd103c0) at block/qcow2-cluster.c:868
        s = 0x5589ff826000
        i = <optimized out>
        j = 0
        l2_index = 512
        ret = <optimized out>
        old_cluster = 0x558a00587480
        l2_table = 0x558a00150d80
        cluster_offset = 4122148864
        __PRETTY_FUNCTION__ = "qcow2_alloc_cluster_link_l2"
#8 0x00005589fcd9c8a0 in qcow2_co_pwritev (bs=0x5589ff8c6000, offset=21474836480, bytes=132384, qiov=0x7ffc08901d70, flags=<optimized out>) at block/qcow2.c:2003
        next = <optimized out>
        s = 0x5589ff826000
        offset_in_cluster = <optimized out>
        ret = <optimized out>
        cur_bytes = 132384
        cluster_offset = 4122148864
        hd_qiov = {iov = 0x558a011ef800, niov = 1, nalloc = 64, size = 132384}
        bytes_done = 0
        cluster_data = 0x558a019e4000 "s\261\317\342t\342\234~\365e\261\352\354M\024$5\v$o\a\300:\033\205\060\317|\032\265\227\211I[\204\315\003\206\266Wp\234\200w\263\223\302Űi!\262\264ɸ\360&<F\333\377P\223\033\223iza\016x\031I\373\350\326O'\234\240 L烜A\260\353d\234\232Pt\365̼-\315a\317L\027\250\200x\037ں.\276\071\357uA\222\024\216\272\273\v\375Kb\314W8)\374\274B[\364\306j}Fi\rI\376\215\277\067\352\273\350R\315\006u+Futj\017\376\031WI:\255\325\365\255h\200\237\237\071\311\021\001\035\t\326\323\374\227;\244\061\234?\365D\370Y\204\364\201W\364\301\002\342]\221", <incomplete sequence \332>
        l2meta = 0x5589ffd103c0
        __PRETTY_FUNCTION__ = "qcow2_co_pwritev"
#9 0x00005589fcdc8c80 in bdrv_co_rw_vmstate (bs=0x5589ff8c6000, qiov=<optimized out>, pos=<optimized out>, is_read=<optimized out>) at block/io.c:2051
        drv = <optimized out>
        ret = -95
#10 0x00005589fcdc8ce8 in bdrv_co_rw_vmstate_entry (opaque=0x7ffc08901d20) at block/io.c:2064
        co = 0x7ffc08901d20
#11 0x00005589fce545da in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:79
        self = 0x5589ffc44500
        co = 0x5589ffc44500
#12 0x00007fef77097fa0 in __start_context () at /lib64/libc.so.6
#13 0x00007ffc08902670 in ()
#14 0x0000000000000000 in ()
(gdb)

3. The QEMU command line is the same when testing on Power and x86.
qemu cli:
/usr/libexec/qemu-kvm \
-name virt5-yilzhang-vm \
-smp 8,sockets=2,cores=4,threads=1 -m 8192 \
-serial unix:/tmp/lq-serial.log,server,nowait \
-nodefaults \
-vnc :9 \
-vga std \
-rtc base=localtime,clock=host \
-boot menu=on \
-monitor stdio \
-qmp tcp:0:9991,server,nowait \
-device nec-usb-xhci \
-device usb-kbd \
-device usb-tablet -device usb-mouse \
\
--object secret,id=sec0,data=backing \
-device virtio-scsi-pci,id=scsi0 \
-drive file=/home/yilzhang/LUKS/base.qcow2,encrypt.key-secret=sec0,if=none,format=qcow2,rerror=stop,werror=stop,cache=none,id=drive_sysdisk \
-device scsi-hd,drive=drive_sysdisk,bus=scsi0.0,id=sysdisk,bootindex=0 \
\
-netdev tap,id=net0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown,vhost=on \
-device virtio-net-pci,netdev=net0,id=nic0,mac=52:54:00:c3:e7:8f

This bug only happens with a luks-inside-qcow2 image. If the guest boots from a plain (unencrypted) qcow2 image, creating an internal snapshot succeeds.

Created attachment 1375108 [details]
backtrace
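The values in the backtrace are enough to see why the assertion fires. The following is a minimal sketch of the arithmetic, not QEMU code. It assumes the default qcow2 cluster size of 64 KiB, which the report does not state, and takes the 512-byte sector size from the assertion text. The helper name `cow_tail` is hypothetical.

```python
SECTOR_SIZE = 1 << 9       # 512 bytes, from "(1ULL << 9)" in the assertion
CLUSTER_SIZE = 64 * 1024   # assumption: qcow2 default, not stated in the report


def cow_tail(offset, nbytes, cluster_size=CLUSTER_SIZE):
    """Return (offset_in_cluster, length) of the gap between the end of a
    write and the end of its last cluster. Hypothetical helper."""
    offset_in_cluster = (offset + nbytes) % cluster_size
    if offset_in_cluster == 0:
        return 0, 0
    return offset_in_cluster, cluster_size - offset_in_cluster


# Values from frame #8: qcow2_co_pwritev(offset=21474836480, bytes=132384)
offset, nbytes = 21474836480, 132384
tail_offset, tail_len = cow_tail(offset, nbytes)

print("write start % 512 :", offset % SECTOR_SIZE)
print("write length % 512:", nbytes % SECTOR_SIZE)
print("tail region       :", tail_offset, tail_len)
# ~~x == x, so the assertion requires the low 9 bits of the offset to be zero.
print("asserted value    :", tail_offset & (SECTOR_SIZE - 1))
```

Under that cluster-size assumption the tail region is 64224 bytes long, which matches the `size = 64224` shown for `qiov` in frame #5. That suggests the vmstate write ends in the middle of a sector, so the copy-on-write region that follows it starts at an offset that is not a multiple of 512.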
Reproduced the issue when creating an internal snapshot on both soft-mounted and hard-mounted NFS.
Packages tested:
kernel-3.10.0-824.el7.x86_64
qemu-kvm-rhev-2.10.0-13.el7
With hard-mounted NFS, the following error is returned:
(qemu) savevm s1
Error while writing VM state: Input/output error
With soft-mounted NFS, qemu-kvm aborts:
(qemu) savevm s1
qemu-kvm: block/qcow2-cluster.c:403: do_perform_cow_encrypt: Assertion `(offset_in_cluster & ~~((1ULL << 9) - 1)) == 0' failed.
boot.sh: line 39: 13600 Aborted (core dumped)
How is this a regression? LUKS-in-qcow2 is a new feature that isn't very well tested yet, and internal snapshots are not supported in downstream RHEL. It may well be that upstream qemu needs to fix stuff in this area, but it should not be driving RHEL business decisions.
Created attachment 1456082 [details]
gdb backtrace
Reproduced this issue on RHEL 7.6.
Tested packages:
qemu-kvm-rhev-2.12.0-6.el7
kernel-3.10.0-915.el7
Steps:
1. Boot the installed luks-inside-qcow2 image with the qemu.sh script, which is listed below:
#!/bin/bash
/usr/libexec/qemu-kvm \
-name 'guest-rhel7.5' \
-machine pc \
-nodefaults \
-vga qxl \
-object secret,id=sec0,data=base \
-object secret,id=secA,data=snA \
-object secret,id=secB,data=snB \
-object secret,id=sec1,data=datadisk \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=0x8 \
-drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=qcow2,file=$1,encrypt.key-secret=sec0 \
-device scsi-hd,id=image1,drive=drive_image1 \
-vnc :0 \
-monitor stdio \
-m 1024 \
-smp 4 \
-device virtio-net-pci,mac=9a:b5:b6:b1:b2:b9,id=idMmq1jH,vectors=4,netdev=idxgXAlm,bus=pci.0,addr=0x9 \
-netdev tap,id=idxgXAlm \
-chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/timao/monitor-qmpmonitor1-20180220-094308-h9I6hRsI,server,nowait \
-mon chardev=qmp_id_qmpmonitor1,mode=control \
2. Create an internal snapshot via QMP -----> QEMU core dumped [1]
{"QMP": {"version": {"qemu": {"micro": 0, "minor": 12, "major": 2}, "package": "qemu-kvm-rhev-2.12.0-6.el7"}, "capabilities": []}}
{"execute":"qmp_capabilities"}
{"return": {}}
{"execute":"human-monitor-command","arguments":{"command-line":"savevm sn1"}}
{"timestamp": {"seconds": 1530530445, "microseconds": 818259}, "event": "STOP"}
Ncat: Connection reset by peer.
[1]
QEMU 2.12.0 monitor - type 'help' for more information
(qemu) qemu-kvm: crypto/block-luks.c:1424: qcrypto_block_luks_encrypt: Assertion `(((len) % (512LL)) == 0)' failed.
qemu.sh: line 23: 1359 Aborted (core dumped) /usr/libexec/qemu-kvm -name 'guest-rhel7.5' -machine pc -nodefaults -vga qxl -object secret,id=sec0,data=base -object secret,id=secA,data=snA -object secret,id=secB,data=snB -object secret,id=sec1,data=datadisk -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=0x8 -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=qcow2,file=$1,encrypt.key-secret=sec0 -device scsi-hd,id=image1,drive=drive_image1 -vnc :0 -monitor stdio -m 1024 -smp 4 -device virtio-net-pci,mac=9a:b5:b6:b1:b2:b9,id=idMmq1jH,vectors=4,netdev=idxgXAlm,bus=pci.0,addr=0x9 -netdev tap,id=idxgXAlm -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/timao/monitor-qmpmonitor1-20180220-094308-h9I6hRsI,server,nowait -mon chardev=qmp_id_qmpmonitor1,mode=control
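The QMP exchange in step 2 can be scripted so that the three outcomes seen in this report (success, an I/O error string, or a dropped connection when QEMU aborts) are easy to tell apart. This is a minimal sketch that assumes QMP is exposed on a TCP socket, as in the original report's `-qmp tcp:0:9991,server,nowait`. The helper names are hypothetical and not part of any QEMU library. It also assumes that a successful `savevm` produces no HMP output.

```python
import json
import socket


def qmp_command(name, arguments=None):
    """Serialize one QMP command as a JSON string."""
    msg = {"execute": name}
    if arguments is not None:
        msg["arguments"] = arguments
    return json.dumps(msg)


def savevm_command(tag):
    """Wrap the HMP 'savevm' command the same way the report does."""
    return qmp_command("human-monitor-command",
                       {"command-line": f"savevm {tag}"})


def read_reply(stream):
    """Return the next command reply, skipping the greeting and events."""
    while True:
        line = stream.readline()
        if not line:
            # EOF: the peer went away, for example because QEMU aborted.
            raise ConnectionError("QMP connection closed before a reply")
        msg = json.loads(line)
        if "return" in msg or "error" in msg:
            return msg


def hmp_error(reply):
    """Return the error text carried by a reply, or None if there is none."""
    if "error" in reply:                      # QMP-level failure
        return reply["error"].get("desc", "unknown QMP error")
    output = reply.get("return", "")
    # Assumption: savevm prints nothing on success, so any text is an error.
    if isinstance(output, str) and output.strip():
        return output.strip()
    return None


def run_savevm(host, port, tag):
    """Negotiate capabilities, issue savevm, return the error text or None."""
    with socket.create_connection((host, port)) as sock:
        stream = sock.makefile("rw", encoding="utf-8")
        for command in (qmp_command("qmp_capabilities"), savevm_command(tag)):
            stream.write(command + "\n")
            stream.flush()
            reply = read_reply(stream)
        return hmp_error(reply)
```

With a guest started using the `-qmp tcp:0:9991,server,nowait` option, `run_savevm("localhost", 9991, "sn1")` would return the error string in the Power8 case and raise `ConnectionError` in the x86 case, where the QEMU process aborts before replying.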