Created attachment 1671313 [details] All threads backtrace Description of problem: As subject Version-Release number of selected component (if applicable): libvirt-6.0.0-13.virtcov.el8.x86_64 qemu-kvm-4.2.0-15.module+el8.2.0+6029+618ef2ec.x86_64 How reproducible: 90% Steps to Reproduce: 1. Prepare a VM. Inactively attach a second disk to it # qemu-img create /tmp/disk 100M # virsh attach-disk new /tmp/disk sda --config 2. Prepare a luks on gluster # qemu-img create -f luks --object secret,data=redhat,id=sec0 -o key-secret=sec0 gluster://XXXX/gv/new-dest 100M 3. Prepare a luks secret, referring to https://libvirt.org/formatsecret.html 4. Prepare the gluster luks dest xml: # cat gluster-dest.xml <disk type="network" device="disk"> <driver name="qemu" type="raw"/> <source protocol='gluster' name='gv/new-dest'> <encryption format="luks"> <secret type="passphrase" usage="luks"/> </encryption> <host name='XXX'/> <host name='XXX'/> </source> <target dev="sda" bus="scsi"/> </disk> 5. Start VM and do blockcopy with --reuse-external # virsh blockcopy new sda --xml gluster-dest.xml --pivot --transient-job --verbose --wait --reuse-external error: failed to connect to the hypervisor error: End of file while reading data: Input/output error Sometimes libvirt will not crash. Then you can stop vm the check the new disk xml: # virsh destroy new Domain new destroyed # virsh dumpxml new --inactive|xmllint --xpath //disk - <disk type="file" device="disk"> <driver name="qemu" type="qcow2"/> <source file="/var/lib/libvirt/images/new.qcow2"/> <target dev="vda" bus="virtio"/> <address type="pci" domain="0x0000" bus="0x07" slot="0x00" function="0x0"/> </disk><disk type="network" device="disk"> <driver name="qemu" type="raw"/> <source protocol="gluster" name="gv/new-dest"> <host name="gls1.usersys.redhat.com" port="24007"/> <host name="gls2.usersys.redhat.com" port="24007"/> </source> <backingStore/> <target dev="sda" bus="scsi"/> <encryption format="luks"> <secret type="passphrase" usage="pu"/> </encryption> <address type="drive" controller="0" bus="0" target="0" unit="0"/> </disk> It looks like the memory is corrupted here. In case that it is a security problem. I set it private first. The name of secret usage is changed. Actual results: As subject Expected results: NO SIGABRT Additional info: (gdb) bt #0 0x00007f952651a70f in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50 #1 0x00007f9526504b25 in __GI_abort () at abort.c:79 #2 0x00007f952655d897 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7f952666a057 "%s\n") at ../sysdeps/posix/libc_fatal.c:181 #3 0x00007f9526563fdc in malloc_printerr (str=str@entry=0x7f952666c0c8 "malloc(): smallbin double linked list corrupted") at malloc.c:5366 #4 0x00007f95265672a4 in _int_malloc (av=av@entry=0x7f952689fba0 <main_arena>, bytes=bytes@entry=24) at malloc.c:3656 #5 0x00007f95265692c6 in __libc_calloc (n=n@entry=1, elem_size=elem_size@entry=24) at malloc.c:3444 #6 0x00007f952736b1fe in g_malloc0 (n_bytes=n_bytes@entry=24) at gmem.c:129 #7 0x00007f952a52c732 in virAlloc (ptrptr=ptrptr@entry=0x7ffd6f1c2410, size=size@entry=24) at ../../src/util/viralloc.c:48 #8 0x00007f952a5915ba in virJSONValueNewObject () at ../../src/util/virjson.c:577 #9 0x00007f952a591f87 in virJSONParserHandleStartMap (ctx=0x7ffd6f1c24c0) at ../../src/util/virjson.c:1723 #10 0x00007f9529c4ca07 in yajl_do_parse (hand=hand@entry=0x5583aeabf220, jsonText=jsonText@entry=0x5583aeab3500 "{\"return\": [{\"iops_rd\": 0, \"detect_zeroes\": \"off\", \"image\": {\"backing-image\": {\"virtual-size\": 104857600, \"filename\": \"/tmp/disk\", \"format\": \"raw\", \"actual-size\": 91770880, \"dirty-flag\": false}, \"back"..., jsonTextLen=jsonTextLen@entry=5878) at /usr/src/debug/yajl-2.1.0-10.el8.x86_64/src/yajl_parser.c:269 #11 0x00007f9529c4a69a in yajl_parse (hand=hand@entry=0x5583aeabf220, jsonText=jsonText@entry=0x5583aeab3500 "{\"return\": [{\"iops_rd\": 0, \"detect_zeroes\": \"off\", \"image\": {\"backing-image\": {\"virtual-size\": 104857600, \"filename\": \"/tmp/disk\", \"format\": \"raw\", \"actual-size\": 91770880, \"dirty-flag\": false}, \"back"..., jsonTextLen=jsonTextLen@entry=5878) at /usr/src/debug/yajl-2.1.0-10.el8.x86_64/src/yajl.c:130 #12 0x00007f952a59436a in virJSONValueFromString (jsonstring=<optimized out>, jsonstring@entry=0x5583aeab3500 "{\"return\": [{\"iops_rd\": 0, \"detect_zeroes\": \"off\", \"image\": {\"backing-image\": {\"virtual-size\": 104857600, \"filename\": \"/tmp/disk\", \"format\": \"raw\", \"actual-size\": 91770880, \"dirty-flag\": false}, \"back"...) at ../../src/util/virjson.c:1854 #13 0x00007f94f15e6fa6 in qemuMonitorJSONIOProcessLine (mon=mon@entry=0x7f9504031290, line=0x5583aeab3500 "{\"return\": [{\"iops_rd\": 0, \"detect_zeroes\": \"off\", \"image\": {\"backing-image\": {\"virtual-size\": 104857600, \"filename\": \"/tmp/disk\", \"format\": \"raw\", \"actual-size\": 91770880, \"dirty-flag\": false}, \"back"..., msg=msg@entry=0x7f951e8ac620) at ../../src/qemu/qemu_monitor_json.c:222 #14 0x00007f94f15e76ca in qemuMonitorJSONIOProcess (mon=mon@entry=0x7f9504031290, data=0x5583aeab1cf0 "{\"return\": [{\"iops_rd\": 0, \"detect_zeroes\": \"off\", \"image\": {\"backing-image\": {\"virtual-size\": 104857600, \"filename\": \"/tmp/disk\", \"format\": \"raw\", \"actual-size\": 91770880, \"dirty-flag\": false}, \"back"..., len=5880, msg=0x7f951e8ac620) at ../../src/qemu/qemu_monitor_json.c:277 #15 0x00007f94f15cfa45 in qemuMonitorIOProcess (mon=0x7f9504031290) at ../../src/qemu/qemu_monitor.c:352 #16 0x00007f94f15cfa45 in qemuMonitorIO (watch=watch@entry=27, fd=<optimized out>, events=0, events@entry=1, opaque=opaque@entry=0x7f9504031290) at ../../src/qemu/qemu_monitor.c:611 #17 0x00007f952a568c47 in virEventPollDispatchHandles (fds=0x5583aeac2680, nfds=<optimized out>) at ../../src/util/vireventpoll.c:503 #18 0x00007f952a568c47 in virEventPollRunOnce () at ../../src/util/vireventpoll.c:658 #19 0x00007f952a5666dd in virEventRunDefaultImpl () at ../../src/util/virevent.c:353 #20 0x00007f952a766f2d in virNetDaemonRun (dmn=0x5583aea2d4a0) at ../../src/rpc/virnetdaemon.c:836 #21 0x00005583ad6c171a in main (argc=<optimized out>, argv=<optimized out>) at ../../src/remote/remote_daemon.c:1430
For the usage corrupted issue, it can be reproduced by file, iscsi backend with luks usage. Not reproduced by gluster with luks uuid.
Does the crash happen if you start a clean libvirtd and retry the steps just once? I've identified a double-free regarding the 'usage' string, but that manifests itself only after you destroy the VM and start it again.
(In reply to Peter Krempa from comment #2) > Does the crash happen if you start a clean libvirtd and retry the steps just > once? I've identified a double-free regarding the 'usage' string, but that > manifests itself only after you destroy the VM and start it again. The crash will not happen when following 1~5 with a fresh libvirtd: ➜ luks virsh start new Domain new started ➜ luks virsh blockcopy new sda --xml file-dest.xml --pivot --transient-job --verbose --wait --reuse-external Block Copy: [100 %] Successfully pivoted ➜ luks virsh destroy new Domain new destroyed ➜ luks virsh dumpxml new |xmllint --xpath //disk - -:48: parser error : Input is not proper UTF-8, indicate encoding ! Bytes: 0xC0 0x94 0xBB 0x7F <secret type='passphrase' usage=''/>
I have installed rpm of v6.1.0-190-gaeb909bf9b, which includes: commit 38bc76bcc1e1bccb7f3265e15b60b0f6f8fe6dfa Author: Peter Krempa <pkrempa> Date: Fri Mar 6 14:44:43 2020 +0100 qemu: Don't take double pointer in qemuDomainSecretInfoFree Using a double pointer prevents the function from being used as the automatic cleanup function for the given type. Remove the double pointer use by replacing the calls with g_clear_pointer which ensures that the pointer is cleared. Signed-off-by: Peter Krempa <pkrempa> Reviewed-by: Ján Tomko <jtomko> But the error in comment3 could be reproduced again.
(In reply to Han Han from comment #3) > (In reply to Peter Krempa from comment #2) > > Does the crash happen if you start a clean libvirtd and retry the steps just > > once? I've identified a double-free regarding the 'usage' string, but that > > manifests itself only after you destroy the VM and start it again. > > The crash will not happen when following 1~5 with a fresh libvirtd: > ➜ luks virsh start new > Domain new started > > ➜ luks virsh blockcopy new sda --xml file-dest.xml --pivot --transient-job > --verbose --wait --reuse-external > Block Copy: [100 %] > Successfully pivoted > ➜ luks virsh destroy new Thank you for confirming. > > Domain new destroyed > > ➜ luks virsh dumpxml new |xmllint --xpath //disk - > -:48: parser error : Input is not proper UTF-8, indicate encoding ! > Bytes: 0xC0 0x94 0xBB 0x7F > <secret type='passphrase' usage=''/> I've seen this symptom while debugging. (In reply to Han Han from comment #4) > I have installed rpm of v6.1.0-190-gaeb909bf9b, which includes: > commit 38bc76bcc1e1bccb7f3265e15b60b0f6f8fe6dfa > Author: Peter Krempa <pkrempa> > Date: Fri Mar 6 14:44:43 2020 +0100 > > qemu: Don't take double pointer in qemuDomainSecretInfoFree > > Using a double pointer prevents the function from being used as the > automatic cleanup function for the given type. > > Remove the double pointer use by replacing the calls with > g_clear_pointer which ensures that the pointer is cleared. > > Signed-off-by: Peter Krempa <pkrempa> > Reviewed-by: Ján Tomko <jtomko> > > > But the error in comment3 could be reproduced again. That one was just a simplification of the code. The proper fix for the issue should be the patch I've posted for a wrong copy of the 'usage' string: https://www.redhat.com/archives/libvir-list/2020-March/msg00693.html
Fixed upstream: commit 299796328c34a30295d6cdc7ebce5d65843e921f Author: Peter Krempa <pkrempa> Date: Thu Mar 19 15:38:06 2020 +0100 virStorageEncryptionSecretCopy: Properly copy internals virStorageEncryptionSecretPtr may have a string inside it, thus we must copy the string too. Use virSecretLookupDefCopy to do that. Caused by non-obvious code introduced in 756b46ddd24 and later 47e88b33b which added a string that needed to be copied. https://bugzilla.redhat.com/show_bug.cgi?id=1814923 Signed-off-by: Peter Krempa <pkrempa> Reviewed-by: Ján Tomko <jtomko> Reviewed-by: Michal Privoznik <mprivozn> commit 02f909b8a6ad8c271693fed9ceab606f0dda2294 Author: Peter Krempa <pkrempa> Date: Thu Mar 19 15:27:40 2020 +0100 virSecretLookupDefCopy: Remove return value The function always returns succes so there's no need for a return value. Signed-off-by: Peter Krempa <pkrempa> Reviewed-by: Ján Tomko <jtomko> Reviewed-by: Michal Privoznik <mprivozn>
Vefified on libvirt-6.0.0-17.module+el8.2.0+6257+0d066c28.x86_64 qemu-kvm-4.2.0-17.module+el8.2.0+6141+0f540f16.x86_64: 1. Start a VM 2. Prepare luks secret and set the password 3. Blockcopy to a luks disk: # cat /tmp/file-dest.xml <disk type="file" device="disk"> <driver name="qemu" type="qcow2"/> <source file="/var/lib/libvirt/images/pc-luks"> <encryption format="luks"> <secret type="passphrase" usage="luks"/> </encryption> </source> <target dev="sda" bus="scsi"/> </disk> ➜ ~ virsh blockcopy pc sda --xml /tmp/file-dest.xml --wait --verbose --transient-job Block Copy: [100 %] ➜ ~ virsh blockjob pc sda --pivot ➜ ~ virsh dumpxml pc | xmllint --xpath //disk - <disk type="file" device="disk"> <driver name="qemu" type="qcow2"/> <source file="/var/lib/libvirt/images/pc-luks" index="6"> <encryption format="luks"> <secret type="passphrase" usage="luks"/> </encryption> </source> <backingStore/> <target dev="sda" bus="scsi"/> <alias name="scsi0-0-0-0"/> <address type="drive" controller="0" bus="0" target="0" unit="0"/> </disk> 3. Destroy and start vm, check the vm xml again ➜ ~ virsh destroy pc Domain pc destroyed ➜ ~ virsh start pc Domain pc started ➜ ~ virsh dumpxml pc | xmllint --xpath //disk - <disk type="file" device="disk"> <driver name="qemu" type="qcow2"/> <source file="/var/lib/libvirt/images/pc-luks" index="2"/> <backingStore/> <target dev="sda" bus="scsi"/> <encryption format="luks"> <secret type="passphrase" usage="luks"/> </encryption> <alias name="scsi0-0-0-0"/> <address type="drive" controller="0" bus="0" target="0" unit="0"/> </disk>
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2017