Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.
RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.

Bug 1522752

Summary: [data-plane]qemu core dump when live snapshots two disks together within a transaction, with one "snapshot-file" specified to a non-existed path
Product: Red Hat Enterprise Linux 7 Reporter: Qianqian Zhu <qizhu>
Component: qemu-kvm-rhevAssignee: Stefan Hajnoczi <stefanha>
Status: CLOSED DUPLICATE QA Contact: Qianqian Zhu <qizhu>
Severity: high Docs Contact:
Priority: high    
Version: 7.5CC: chayang, juzhang, knoel, qizhu, virt-maint, xfu
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-01-04 10:12:50 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Qianqian Zhu 2017-12-06 11:29:02 UTC
Description of problem:
Launch guest with iothreads, then create live snapshots simultaneously for two disks with transaction, make the second snapshot fail, qemu will core dump.

Without iothreads, it works as expected: qemu will give an error: "Could not create file: Not a directory", and all jobs in that transaction will be cancelled.

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.10.0-11.el7.x86_64
kernel-3.10.0-808.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Launch guest:
/usr/libexec/qemu-kvm -M q35,accel=kvm,kernel-irqchip=split \
-m 4G \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie.0,iothread=iothread1 \
-drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=writethrough,format=qcow2,file=gluster://10.66.8.133/gv2/rhel69-64-virtio-scsi.qcow2 \
 -device scsi-hd,id=image1,drive=drive_image1 \
-device virtio-net-pci,mac=9a:89:8a:8b:8c:8d,id=idTqeiKU,netdev=idU17IIx,bus=pcie.0 \
-object iothread,id=iothread1  \
-netdev tap,id=idU17IIx  \
-cpu 'SandyBridge',+kvm_pv_unhalt  \
-vnc :0   \
-enable-kvm  \
-qmp stdio \
-object iothread,id=iothread2 \
-object iothread,id=iothread3 \
-device virtio-scsi-pci,id=virtio_scsi_pci1,bus=pcie.0,iothread=iothread2 -drive id=drive_image2,if=none,snapshot=off,aio=threads,cache=writethrough,format=qcow2,file=gluster://10.66.8.133/gv2/data1.qcow2 \
-device scsi-hd,id=image2,drive=drive_image2

2. live snapshot transaction, the second job has a invalid path: ["snapshot-file": "/home/test/test/test/test"]

{ "execute": "transaction", "arguments": { "actions": [ { 'type': 'blockdev-snapshot-sync', 'data' : { "device": "drive_image2","snapshot-file": "gluster://10.66.8.119/gv0/sndd","mode": "absolute-paths", "format": "qcow2" } }, { 'type': 'blockdev-snapshot-sync', 'data' : { "device": "drive_image1", "snapshot-file": "/home/test/test/test/test","mode": "absolute-paths", "format": "qcow2" } }]}}


Actual results:
qemu core dump:
{ "execute": "transaction", "arguments": { "actions": [ { 'type': 'blockdev-snapshot-sync', 'data' : { "device": "drive_image2","snapshot-file": "gluster://10.66.8.119/gv0/sndd","mode": "absolute-paths", "format": "qcow2" } }, { 'type': 'blockdev-snapshot-sync', 'data' : { "device": "drive_image1", "snapshot-file": "/home/test/test/test/test","mode": "absolute-paths", "format": "qcow2" } }]}}
Formatting 'gluster://10.66.8.119/gv0/sndd', fmt=qcow2 size=2147483648 backing_file=gluster://10.66.8.133/gv2/data1.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16
Formatting '/home/test/test/test/test', fmt=qcow2 size=21474836480 backing_file=gluster://10.66.8.133/gv2/rhel69-64-virtio-scsi.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16
qemu: qemu_mutex_unlock: Operation not permitted
Aborted (core dumped)


Expected results:
All jobs in that transaction should be cancelled, guest still operate on initial images.

Additional info:

Comment 2 Qianqian Zhu 2017-12-07 05:28:28 UTC
Formatting '/home/test/test/test/test', fmt=qcow2 size=21474836480 backing_file=gluster://10.66.8.133/gv2/rhel69-64-virtio-scsi.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16
qemu: qemu_mutex_unlock: Operation not permitted

Program received signal SIGABRT, Aborted.
0x00007fffed9f01f7 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install elfutils-libelf-0.168-5.el7.x86_64 elfutils-libs-0.168-5.el7.x86_64 glibc-2.17-194.el7.x86_64 glusterfs-3.12.2-1.el7rhgs.x86_64 glusterfs-api-3.12.2-1.el7rhgs.x86_64 glusterfs-client-xlators-3.12.2-1.el7rhgs.x86_64 glusterfs-libs-3.12.2-1.el7rhgs.x86_64 libcurl-7.29.0-35.el7.x86_64 libgcc-4.8.5-14.el7.x86_64 libibverbs-13-5.el7.x86_64 librdmacm-13-5.el7.x86_64 libstdc++-4.8.5-14.el7.x86_64 nss-softokn-freebl-3.28.3-4.el7.x86_64 nss-util-3.28.4-2.el7.x86_64 spice-server-0.12.8-1.el7.x86_64 systemd-libs-219-38.el7.x86_64
(gdb) 
(gdb) bt
#0  0x00007fffed9f01f7 in raise () at /lib64/libc.so.6
#1  0x00007fffed9f18e8 in abort () at /lib64/libc.so.6
#2  0x00005555557989eb in error_exit (err=<optimized out>, msg=msg@entry=0x555555baae90 <__func__.23806> "qemu_mutex_unlock") at util/qemu-thread-posix.c:36
#3  0x0000555555abe816 in qemu_mutex_unlock (mutex=mutex@entry=0x555556d35c60) at util/qemu-thread-posix.c:96
#4  0x0000555555aba3c9 in aio_context_release (ctx=ctx@entry=0x555556d35c00) at util/async.c:494
#5  0x0000555555a45958 in bdrv_flush (bs=bs@entry=0x55555783d400) at block/io.c:2288
#6  0x00005555559fbb0b in bdrv_unref (bs=0x55555783d400) at block.c:3157
#7  0x00005555559fbb0b in bdrv_unref (bs=0x55555783d400) at block.c:3347
#8  0x00005555559fbb0b in bdrv_unref (bs=0x55555783d400) at block.c:4332
#9  0x00005555558acbc9 in qmp_transaction (dev_list=<optimized out>, has_props=<optimized out>, props=<optimized out>, errp=errp@entry=0x7fffffffc5a8) at blockdev.c:2272
#10 0x00005555558c40fd in qmp_marshal_transaction (args=<optimized out>, ret=<optimized out>, errp=0x7fffffffc618) at qmp-marshal.c:6118
#11 0x0000555555ab0a69 in qmp_dispatch (errp=0x7fffffffc610, request=0x555558086800, cmds=0x555556081c40 <qmp_commands>) at qapi/qmp-dispatch.c:104
#12 0x0000555555ab0a69 in qmp_dispatch (cmds=0x555556081c40 <qmp_commands>, request=request@entry=0x555558086800) at qapi/qmp-dispatch.c:131
#13 0x00005555557dafe5 in handle_qmp_command (parser=<optimized out>, tokens=<optimized out>) at /usr/src/debug/qemu-2.10.0/monitor.c:3881
#14 0x0000555555ab60b8 in json_message_process_token (lexer=0x555556dc27f8, input=0x555556cd8720, type=JSON_RCURLY, x=395, y=1) at qobject/json-streamer.c:105
#15 0x0000555555ad319b in json_lexer_feed_char (lexer=lexer@entry=0x555556dc27f8, ch=125 '}', flush=flush@entry=false) at qobject/json-lexer.c:323
#16 0x0000555555ad325e in json_lexer_feed (lexer=0x555556dc27f8, buffer=<optimized out>, size=<optimized out>) at qobject/json-lexer.c:373
#17 0x0000555555ab6179 in json_message_parser_feed (parser=<optimized out>, buffer=<optimized out>, size=<optimized out>) at qobject/json-streamer.c:124
#18 0x00005555557d99de in monitor_qmp_read (opaque=<optimized out>, buf=<optimized out>, size=<optimized out>) at /usr/src/debug/qemu-2.10.0/monitor.c:3923
#19 0x0000555555a65c4f in fd_chr_read (chan=0x555556cefb00, cond=<optimized out>, opaque=0x555556d02820) at chardev/char-fd.c:66
#20 0x00007fffef4a4d7a in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
#21 0x0000555555abbd5c in main_loop_wait () at util/main-loop.c:213
#22 0x0000555555abbd5c in main_loop_wait (timeout=<optimized out>) at util/main-loop.c:261
#23 0x0000555555abbd5c in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:515
#24 0x000055555579d86a in main () at vl.c:1917
#25 0x000055555579d86a in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4805

Comment 3 Ademar Reis 2017-12-29 20:05:50 UTC
Maybe a duplicate or sharing the same root cause of Bug 1519721

Comment 4 Stefan Hajnoczi 2018-01-03 11:13:21 UTC
Please retest with qemu-kvm-rhev-2.10.0-14.el7.  It contains fixes for the QMP 'transaction' command with dataplane.

Thanks,
Stefan

Comment 5 Qianqian Zhu 2018-01-04 02:28:41 UTC
Test with qemu-kvm-rhev-2.10.0-14.el7.x86_64, the issue has been fixed.

Steps same as comment 0.

Result:
Qmp prompts correct warning, and all the block jobs within the transaction are cancelled. Both qemu and geust works well after then.

{ "execute": "transaction", "arguments": { "actions": [ { 'type': 'blockdev-snapshot-sync', 'data' : { "device": "drive_image2","snapshot-file": "gluster://10.66.8.119/gv0/sndd","mode": "absolute-paths", "format": "qcow2" } }, { 'type': 'blockdev-snapshot-sync', 'data' : { "device": "drive_image1", "snapshot-file": "/home/test/test/test/test","mode": "absolute-paths", "format": "qcow2" } }]}}
Formatting 'gluster://10.66.8.119/gv0/sndd', fmt=qcow2 size=2147483648 backing_file=gluster://10.66.8.133/gv2/data1.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16
Formatting '/home/test/test/test/test', fmt=qcow2 size=21474836480 backing_file=gluster://10.66.8.133/gv2/rhel69-64-virtio-scsi.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16
{"error": {"class": "GenericError", "desc": "Could not create file: No such file or directory"}}

{ "execute": "query-block"}
{"return": [{"io-status": "ok", "device": "drive_image1", "locked": false, "removable": false, "inserted": {"iops_rd": 0, "detect_zeroes": "off", "image": {"virtual-size": 21474836480, "filename": "gluster://10.66.8.133/gv2/rhel69-64-virtio-scsi.qcow2", "cluster-size": 65536, "format": "qcow2", "actual-size": 4457431040, "format-specific": {"type": "qcow2", "data": {"compat": "1.1", "lazy-refcounts": false, "refcount-bits": 16, "corrupt": false}}, "dirty-flag": false}, "iops_wr": 0, "ro": false, "node-name": "#block117", "backing_file_depth": 0, "drv": "qcow2", "iops": 0, "bps_wr": 0, "write_threshold": 0, "encrypted": false, "bps": 0, "bps_rd": 0, "cache": {"no-flush": false, "direct": false, "writeback": false}, "file": "gluster://10.66.8.133/gv2/rhel69-64-virtio-scsi.qcow2", "encryption_key_missing": false}, "qdev": "image1", "type": "unknown"}, {"io-status": "ok", "device": "drive_image2", "locked": false, "removable": false, "inserted": {"iops_rd": 0, "detect_zeroes": "off", "image": {"virtual-size": 2147483648, "filename": "gluster://10.66.8.133/gv2/data1.qcow2", "cluster-size": 65536, "format": "qcow2", "actual-size": 2094534656, "format-specific": {"type": "qcow2", "data": {"compat": "1.1", "lazy-refcounts": false, "refcount-bits": 16, "corrupt": false}}, "dirty-flag": false}, "iops_wr": 0, "ro": false, "node-name": "#block308", "backing_file_depth": 0, "drv": "qcow2", "iops": 0, "bps_wr": 0, "write_threshold": 0, "encrypted": false, "bps": 0, "bps_rd": 0, "cache": {"no-flush": false, "direct": false, "writeback": false}, "file": "gluster://10.66.8.133/gv2/data1.qcow2", "encryption_key_missing": false}, "qdev": "image2", "type": "unknown"}, {"device": "sd0", "locked": false, "removable": true, "type": "unknown"}]}

Comment 6 Stefan Hajnoczi 2018-01-04 10:12:50 UTC

*** This bug has been marked as a duplicate of bug 1519721 ***