Description of problem:
As subject

Version-Release number of selected component (if applicable):
libvirt-6.0.0-10.virtcov.el8.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Prepare a guest and a disk XML with the "reservations" element:
# cat disk.xml
<disk device="lun" type="block">
  <target bus="scsi" dev="sdb" />
  <driver name="qemu" type="raw" />
  <source dev="/dev/sdh">
    <reservations managed="yes" />
  </source>
</disk>

2. Hot-plug the disk to the guest:
# virsh attach-device avocado-vt-vm1 disk.xml
error: Disconnected from qemu:///system due to end of file
error: Failed to attach device from disk.xml
error: End of file while reading data: Input/output error

3. Backtrace:
(gdb) bt
#0  0x00007f809c98970f in raise () from /lib64/libc.so.6
#1  0x00007f809c973b25 in abort () from /lib64/libc.so.6
#2  0x00007f809c9cc897 in __libc_message () from /lib64/libc.so.6
#3  0x00007f809c9d2fdc in malloc_printerr () from /lib64/libc.so.6
#4  0x00007f809c9d39a8 in malloc_consolidate () from /lib64/libc.so.6
#5  0x00007f809c9d5d58 in _int_malloc () from /lib64/libc.so.6
#6  0x00007f809c9d7662 in malloc () from /lib64/libc.so.6
#7  0x00007f809d7da25e in g_realloc () from /lib64/libglib-2.0.so.0
#8  0x00007f80a099b3da in virReallocN (ptrptr=ptrptr@entry=0x7f80897e33c0, size=size@entry=1, count=count@entry=8193) at ../../src/util/viralloc.c:91
#9  0x00007f80a09dce30 in saferead_lim (fd=43, max_len=max_len@entry=1024, length=length@entry=0x7f80897e3410) at ../../src/util/virfile.c:1328
#10 0x00007f80a09dd595 in virFileReadHeaderFD (fd=<optimized out>, maxlen=maxlen@entry=1024, buf=buf@entry=0x7f80897e3438) at ../../src/util/virfile.c:1368
#11 0x00007f80a0a3d9c4 in virProcessRunInFork (cb=cb@entry=0x7f80a0a3b08f <virProcessNamespaceHelper>, opaque=opaque@entry=0x7f80897e3490) at ../../src/util/virprocess.c:1154
#12 0x00007f80a0a3dba9 in virProcessRunInMountNamespace (pid=pid@entry=523987, cb=cb@entry=0x7f80a0b92721 <virSecuritySELinuxTransactionRun>, opaque=opaque@entry=0x7f8068002050) at ../../src/util/virprocess.c:1083
#13 0x00007f80a0b92cc2 in virSecuritySELinuxTransactionCommit (mgr=<optimized out>, pid=523987, lock=<optimized out>) at ../../src/security/security_selinux.c:1172
#14 0x00007f80a0b857c1 in virSecurityManagerTransactionCommit (mgr=0x7f80281c2fd0, pid=pid@entry=523987, lock=lock@entry=true) at ../../src/security/security_manager.c:299
#15 0x00007f80a0b7f4ae in virSecurityStackTransactionCommit (mgr=<optimized out>, pid=523987, lock=true) at ../../src/security/security_stack.c:174
#16 0x00007f80a0b857c1 in virSecurityManagerTransactionCommit (mgr=0x7f80281c35c0, pid=pid@entry=523987, lock=<optimized out>) at ../../src/security/security_manager.c:299
#17 0x00007f804ce2a04f in qemuSecuritySetImageLabel (driver=driver@entry=0x7f8028173e80, vm=vm@entry=0x7f802825cb20, src=src@entry=0x7f80680096e0, backingChain=backingChain@entry=true, chainTop=chainTop@entry=true) at ../../src/qemu/qemu_security.c:125
#18 0x00007f804cd2adae in qemuDomainStorageSourceAccessModify (driver=driver@entry=0x7f8028173e80, vm=vm@entry=0x7f802825cb20, src=0x7f80680096e0, flags=flags@entry=(QEMU_DOMAIN_STORAGE_SOURCE_ACCESS_CHAIN | QEMU_DOMAIN_STORAGE_SOURCE_ACCESS_CHAIN_TOP)) at ../../src/qemu/qemu_domain.c:11899
#19 0x00007f804cd2b346 in qemuDomainStorageSourceChainAccessAllow (driver=driver@entry=0x7f8028173e80, vm=vm@entry=0x7f802825cb20, src=<optimized out>) at ../../src/qemu/qemu_domain.c:11965
#20 0x00007f804cd48d83 in qemuDomainAttachDiskGeneric (driver=driver@entry=0x7f8028173e80, vm=vm@entry=0x7f802825cb20, disk=disk@entry=0x7f8068009500) at ../../src/qemu/qemu_hotplug.c:686
#21 0x00007f804cd4bcc8 in qemuDomainAttachSCSIDisk (disk=0x7f8068009500, vm=0x7f802825cb20, driver=0x7f8028173e80) at ../../src/qemu/qemu_hotplug.c:981
#22 qemuDomainAttachDeviceDiskLiveInternal (dev=0x7f8068003c40, vm=0x7f802825cb20, driver=0x7f8028173e80) at ../../src/qemu/qemu_hotplug.c:1055
#23 qemuDomainAttachDeviceDiskLive (driver=driver@entry=0x7f8028173e80, vm=vm@entry=0x7f802825cb20, dev=dev@entry=0x7f8068003c40) at ../../src/qemu/qemu_hotplug.c:1111
#24 0x00007f804ce00cf2 in qemuDomainAttachDeviceLive (driver=0x7f8028173e80, dev=0x7f8068003c40, vm=0x7f802825cb20) at ../../src/qemu/qemu_driver.c:7809
#25 qemuDomainAttachDeviceLiveAndConfig (flags=<optimized out>, xml=0x7f8068003ea0 "<disk device=\"lun\" type=\"block\">\n<target bus=\"scsi\" dev=\"sdb\" />\n<driver name=\"qemu\" type=\"raw\" />\n<source dev=\"/dev/sdh\">\n<reservations managed=\"yes\" />\n</source>\n</disk>\n", driver=0x7f8028173e80, vm=0x7f802825cb20) at ../../src/qemu/qemu_driver.c:8712
#26 qemuDomainAttachDeviceFlags (dom=<optimized out>, xml=0x7f8068003ea0 "<disk device=\"lun\" type=\"block\">\n<target bus=\"scsi\" dev=\"sdb\" />\n<driver name=\"qemu\" type=\"raw\" />\n<source dev=\"/dev/sdh\">\n<reservations managed=\"yes\" />\n</source>\n</disk>\n", flags=<optimized out>, flags@entry=1) at ../../src/qemu/qemu_driver.c:8764
#27 0x00007f804ce013f9 in qemuDomainAttachDevice (dom=<optimized out>, xml=<optimized out>) at ../../src/qemu/qemu_driver.c:8780
#28 0x00007f80a0cf4c00 in virDomainAttachDevice (domain=domain@entry=0x7f80680035c0, xml=0x7f8068003ea0 "<disk device=\"lun\" type=\"block\">\n<target bus=\"scsi\" dev=\"sdb\" />\n<driver name=\"qemu\" type=\"raw\" />\n<source dev=\"/dev/sdh\">\n<reservations managed=\"yes\" />\n</source>\n</disk>\n") at ../../src/libvirt-domain.c:8179
#29 0x000055da96e2667f in remoteDispatchDomainAttachDevice (args=0x7f8068003600, rerr=0x7f80897e38c0, msg=0x55da98a90c40, client=<optimized out>, server=0x55da98a12050) at ./remote/remote_daemon_dispatch_stubs.h:3660
#30 remoteDispatchDomainAttachDeviceHelper (server=0x55da98a12050, client=<optimized out>, msg=0x55da98a90c40, rerr=0x7f80897e38c0, args=0x7f8068003600, ret=0x0) at ./remote/remote_daemon_dispatch_stubs.h:3639
#31 0x00007f80a0bcccf0 in virNetServerProgramDispatchCall (msg=0x55da98a90c40, client=0x55da98ab3c40, server=0x55da98a12050, prog=0x55da98a91bc0) at ../../src/rpc/virnetserverprogram.c:430
#32 virNetServerProgramDispatch (prog=0x55da98a91bc0, server=server@entry=0x55da98a12050, client=client@entry=0x55da98ab3c40, msg=msg@entry=0x55da98a90c40) at ../../src/rpc/virnetserverprogram.c:302
#33 0x00007f80a0bd47e7 in virNetServerProcessMsg (srv=srv@entry=0x55da98a12050, client=0x55da98ab3c40, prog=<optimized out>, msg=0x55da98a90c40) at ../../src/rpc/virnetserver.c:136
#34 0x00007f80a0bd4c54 in virNetServerHandleJob (jobOpaque=<optimized out>, opaque=0x55da98a12050) at ../../src/rpc/virnetserver.c:153
#35 0x00007f80a0a693c0 in virThreadPoolWorker (opaque=opaque@entry=0x55da98a131f0) at ../../src/util/virthreadpool.c:163
#36 0x00007f80a0a6801e in virThreadHelper (data=<optimized out>) at ../../src/util/virthread.c:196
#37 0x00007f809cd1c2de in start_thread () from /lib64/libpthread.so.0
#38 0x00007f809ca4de83 in clone () from /lib64/libc.so.6

Actual results:
Attaching the device fails and a core dump is generated.

Expected results:
The device should attach successfully without error.

Additional info:
I can't reproduce this upstream, but downstream it happens reliably. Looks like memory corruption though. Also reliably does not happen if persistent reservations are not used.
==3621175== Thread 6:
==3621175== Invalid free() / delete / delete[] / realloc()
==3621175==    at 0x483AA0C: free (vg_replace_malloc.c:540)
==3621175==    by 0x57B544C: g_free (in /usr/lib64/libglib-2.0.so.0.6200.5)
==3621175==    by 0x16340AE0: g_autoptr_cleanup_generic_gfree (glib-autocleanups.h:28)
==3621175==    by 0x16340AE0: qemuDomainNamespaceSetupDisk (qemu_domain.c:15916)
==3621175==    by 0x16340CCB: qemuDomainStorageSourceAccessModify (qemu_domain.c:12050)
==3621175==    by 0x16353B06: qemuDomainAttachDiskGeneric (qemu_hotplug.c:686)
==3621175==    by 0x163551D4: qemuDomainAttachSCSIDisk (qemu_hotplug.c:981)
==3621175==    by 0x163551D4: qemuDomainAttachDeviceDiskLiveInternal (qemu_hotplug.c:1055)
==3621175==    by 0x163551D4: qemuDomainAttachDeviceDiskLive (qemu_hotplug.c:1111)
==3621175==    by 0x163D1848: qemuDomainAttachDeviceLive (qemu_driver.c:7809)
==3621175==    by 0x163D1848: qemuDomainAttachDeviceLiveAndConfig (qemu_driver.c:8712)
==3621175==    by 0x163D1848: qemuDomainAttachDeviceFlags (qemu_driver.c:8764)
==3621175==    by 0x4B4AF22: virDomainAttachDeviceFlags (libvirt-domain.c:8239)
==3621175==    by 0x1478A8: remoteDispatchDomainAttachDeviceFlags (remote_daemon_dispatch_stubs.h:3713)
==3621175==    by 0x1478A8: remoteDispatchDomainAttachDeviceFlagsHelper (remote_daemon_dispatch_stubs.h:3692)
==3621175==    by 0x4A6ED7F: virNetServerProgramDispatchCall (virnetserverprogram.c:430)
==3621175==    by 0x4A6ED7F: virNetServerProgramDispatch (virnetserverprogram.c:302)
==3621175==    by 0x4A73EE7: virNetServerProcessMsg (virnetserver.c:136)
==3621175==    by 0x4A73EE7: virNetServerHandleJob (virnetserver.c:153)
==3621175==    by 0x4991DB6: virThreadPoolWorker (virthreadpool.c:163)
==3621175==  Address 0x1ae191b0 is 0 bytes inside a block of size 20 free'd
==3621175==    at 0x483AA0C: free (vg_replace_malloc.c:540)
==3621175==    by 0x57B544C: g_free (in /usr/lib64/libglib-2.0.so.0.6200.5)
==3621175==    by 0x490DDDA: virFree (viralloc.c:348)
==3621175==    by 0x4989C11: virStringListFreeCount (virstring.c:341)
==3621175==    by 0x16340ACE: qemuDomainNamespaceSetupDisk (qemu_domain.c:15960)
==3621175==    by 0x16340CCB: qemuDomainStorageSourceAccessModify (qemu_domain.c:12050)
==3621175==    by 0x16353B06: qemuDomainAttachDiskGeneric (qemu_hotplug.c:686)
==3621175==    by 0x163551D4: qemuDomainAttachSCSIDisk (qemu_hotplug.c:981)
==3621175==    by 0x163551D4: qemuDomainAttachDeviceDiskLiveInternal (qemu_hotplug.c:1055)
==3621175==    by 0x163551D4: qemuDomainAttachDeviceDiskLive (qemu_hotplug.c:1111)
==3621175==    by 0x163D1848: qemuDomainAttachDeviceLive (qemu_driver.c:7809)
==3621175==    by 0x163D1848: qemuDomainAttachDeviceLiveAndConfig (qemu_driver.c:8712)
==3621175==    by 0x163D1848: qemuDomainAttachDeviceFlags (qemu_driver.c:8764)
==3621175==    by 0x4B4AF22: virDomainAttachDeviceFlags (libvirt-domain.c:8239)
==3621175==    by 0x1478A8: remoteDispatchDomainAttachDeviceFlags (remote_daemon_dispatch_stubs.h:3713)
==3621175==    by 0x1478A8: remoteDispatchDomainAttachDeviceFlagsHelper (remote_daemon_dispatch_stubs.h:3692)
==3621175==    by 0x4A6ED7F: virNetServerProgramDispatchCall (virnetserverprogram.c:430)
==3621175==    by 0x4A6ED7F: virNetServerProgramDispatch (virnetserverprogram.c:302)
==3621175==  Block was alloc'd at
==3621175==    at 0x483980B: malloc (vg_replace_malloc.c:309)
==3621175==    by 0x57B5358: g_malloc (in /usr/lib64/libglib-2.0.so.0.6200.5)
==3621175==    by 0x57CF613: g_strdup (in /usr/lib64/libglib-2.0.so.0.6200.5)
==3621175==    by 0x16340A75: qemuDomainNamespaceSetupDisk (qemu_domain.c:15944)
==3621175==    by 0x16340CCB: qemuDomainStorageSourceAccessModify (qemu_domain.c:12050)
==3621175==    by 0x16353B06: qemuDomainAttachDiskGeneric (qemu_hotplug.c:686)
==3621175==    by 0x163551D4: qemuDomainAttachSCSIDisk (qemu_hotplug.c:981)
==3621175==    by 0x163551D4: qemuDomainAttachDeviceDiskLiveInternal (qemu_hotplug.c:1055)
==3621175==    by 0x163551D4: qemuDomainAttachDeviceDiskLive (qemu_hotplug.c:1111)
==3621175==    by 0x163D1848: qemuDomainAttachDeviceLive (qemu_driver.c:7809)
==3621175==    by 0x163D1848: qemuDomainAttachDeviceLiveAndConfig (qemu_driver.c:8712)
==3621175==    by 0x163D1848: qemuDomainAttachDeviceFlags (qemu_driver.c:8764)
==3621175==    by 0x4B4AF22: virDomainAttachDeviceFlags (libvirt-domain.c:8239)
==3621175==    by 0x1478A8: remoteDispatchDomainAttachDeviceFlags (remote_daemon_dispatch_stubs.h:3713)
==3621175==    by 0x1478A8: remoteDispatchDomainAttachDeviceFlagsHelper (remote_daemon_dispatch_stubs.h:3692)
==3621175==    by 0x4A6ED7F: virNetServerProgramDispatchCall (virnetserverprogram.c:430)
==3621175==    by 0x4A6ED7F: virNetServerProgramDispatch (virnetserverprogram.c:302)
==3621175==    by 0x4A73EE7: virNetServerProcessMsg (virnetserver.c:136)
==3621175==    by 0x4A73EE7: virNetServerHandleJob (virnetserver.c:153)
==3621175==
The issue is that the '*dmPath' string is freed twice: the pointer is not cleared when the string is made part of the string list in qemuDomainNamespaceSetupDisk, so both the list cleanup and the automatic cleanup of the variable free it. This was caused by upstream commit ce36e33c105 and later fixed accidentally by a30078cb8326461.
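A minimal sketch of the ownership problem described in this comment, written in plain C rather than with libvirt's own helpers. StrList, dup_string, str_list_append_steal, str_list_free and setup_paths_demo are hypothetical names made up for illustration; only the idea (clear the caller's pointer once the list owns the string) is taken from the analysis above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a string list that owns its elements. */
typedef struct {
    char **items;
    size_t count;
} StrList;

/* Portable replacement for strdup(); returns NULL on allocation failure. */
static char *dup_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy)
        memcpy(copy, s, len);
    return copy;
}

/* Append *strp to the list and take ownership of it. On success *strp is
 * set to NULL, so a later free(*strp) in the caller is a harmless no-op. */
static int str_list_append_steal(StrList *list, char **strp)
{
    char **tmp = realloc(list->items, (list->count + 1) * sizeof(char *));
    if (!tmp)
        return -1;
    list->items = tmp;
    list->items[list->count++] = *strp;
    *strp = NULL; /* the step whose absence leads to a double free */
    return 0;
}

/* Deep free: releases every element and then the array itself. */
static void str_list_free(StrList *list)
{
    for (size_t i = 0; i < list->count; i++)
        free(list->items[i]);
    free(list->items);
    list->items = NULL;
    list->count = 0;
}

/* Returns 0 when the string ended up owned by the list only. */
static int setup_paths_demo(void)
{
    StrList paths = { NULL, 0 };
    char *dmPath = dup_string("/dev/mapper/control");
    int ret = -1;

    if (!dmPath)
        goto cleanup;
    if (str_list_append_steal(&paths, &dmPath) < 0)
        goto cleanup;
    if (dmPath == NULL && paths.count == 1)
        ret = 0;

 cleanup:
    str_list_free(&paths); /* frees the string exactly once */
    free(dmPath);          /* NULL after the steal, so nothing happens */
    return ret;
}
```

If str_list_append_steal() copied the pointer without clearing *strp, the two calls under cleanup would both free the same block, which is the pattern valgrind reports above.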
Fixed upstream as:

a30078cb83 qemu: Create multipath targets for PRs

v6.1.0-79-ga30078cb83
commit ce36e33c10579c4e6d0816ce6f154884891fc247
    qemu: use g_strdup instead of VIR_STRDUP

merely altered the allocation function; the (lack of) pointer zeroing and the odd freeing functions were untouched.

commit a80ebd2a2a8afa3b85c0d207d5500af6dd28f95d
    qemu: Create NVMe disk in domain namespace

is the one that switched the string free function from a shallow free to freeing the whole list.
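To illustrate why switching the free function matters, here is a small sketch in plain C. list_free_shallow, list_free_deep and caller_must_free are hypothetical helpers, not libvirt functions, and "example" is a placeholder string. It assumes a list that holds a copied (not stolen) pointer and shows which side is responsible for freeing the string under each policy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Shallow free: releases only the array; the elements stay owned by
 * whoever else holds a pointer to them. */
static void list_free_shallow(char **items)
{
    free(items);
}

/* Deep free: releases every element and then the array, so the list must
 * be the sole owner of each element. */
static void list_free_deep(char **items, size_t count)
{
    for (size_t i = 0; i < count; i++)
        free(items[i]);
    free(items);
}

/* Returns true when the caller still has to free its own copy of the
 * pointer after the list has been released. */
static bool caller_must_free(bool deep)
{
    char *path = malloc(8);
    char **items = malloc(sizeof(char *));
    bool must_free;

    if (!path || !items) {
        free(path);
        free(items);
        return false;
    }
    strcpy(path, "example"); /* 7 characters plus the terminating NUL */
    items[0] = path;         /* pointer copied: two references now exist */

    if (deep) {
        list_free_deep(items, 1);
        path = NULL;         /* string already released through the list */
    } else {
        list_free_shallow(items);
    }

    must_free = (path != NULL);
    free(path);              /* real free (shallow) or no-op (deep) */
    return must_free;
}
```

With the shallow free, a caller that keeps its own pointer and frees it later is correct. Once the list is freed deeply, the same caller code releases the string a second time unless its pointer is cleared first.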
Note to QE: The patch linked in comment 7 does more than just fix the crasher. It also fixes the problem mentioned in bug 1711045#c61. Long story short, libvirt does not only need to allow all devices that a multipath target consists of in CGroups (fixed in bug 1557769) but also should create the nodes in namespaces (per Paolo's comment). And the patch I've backported does that. Plus fixes the crasher.
One more case associated with this crash: copy to a SCSI reservations destination.

pr-dest.xml:
<disk type="block" device="lun">
  <driver name="qemu" type="raw"/>
  <source dev="/dev/sdb">
    <encryption format="luks">
      <secret type="passphrase" usage="luks"/>
    </encryption>
    <reservations managed="yes"/>
  </source>
  <target dev="sdb" bus="scsi"/>
</disk>

# virsh blockcopy pc sdb --xml pr-dest.xml --wait --pivot --verbose --transient-job
(In reply to Han Han from comment #10)
> One more case associated with this crash -- Copy to scsi reservations
> destination
> pr-dest.xml:
> <disk type="block" device="lun">
>   <driver name="qemu" type="raw"/>
>   <source dev="/dev/sdb">
>     <encryption format="luks">
>       <secret type="passphrase" usage="luks"/>
>     </encryption>
>     <reservations managed="yes"/>
>   </source>
>   <target dev="sdb" bus="scsi"/>
> </disk>
>
>
> # virsh blockcopy pc sdb --xml pr-dest.xml --wait --pivot --verbose
> --transient-job

I guess this is the same bug; it just manifests differently. In the case of blockcopy we also need to create the node in the namespace, which exercises the problematic code.
Verified on:
libvirt-6.0.0-14.module+el8.2.0+6069+78a1cb09.x86_64

Steps:
1. Prepare a guest and a disk XML with the "reservations" element:
# cat disk.xml
<disk device="lun" type="block">
  <target bus="scsi" dev="sdb" />
  <driver name="qemu" type="raw" />
  <source dev="/dev/sdh">
    <reservations managed="yes" />
  </source>
</disk>

2. Hot-plug the disk to the guest:
# virsh attach-device test1 disk.xml
Device attached successfully

3. Check the qemu cmd:
2020-03-30 02:22:12.294+0000: 630435: info : qemuMonitorIOWrite:453 : QEMU_MONITOR_IO_WRITE: mon=0x7f260c2fa6f0 buf={"execute":"object-add","arguments":{"qom-type":"pr-manager-helper","id":"pr-helper0","props":{"path":"/var/lib/libvirt/qemu/domain-4-test1/pr-helper0.sock"}},"id":"libvirt-369"}^M len=180 ret=180 errno=0
...
2020-03-30 02:22:12.296+0000: 630438: info : qemuMonitorSend:996 : QEMU_MONITOR_SEND_MSG: mon=0x7f260c2fa6f0 msg={"execute":"blockdev-add","arguments":{"driver":"host_device","filename":"/dev/sdh","pr-manager":"pr-helper0","node-name":"libvirt-2-storage","auto-read-only":true,"discard":"unmap"},"id":"libvirt-370"}

4. Do blockcopy and check the qemu cmd:
# virsh blockcopy test1 sdb --xml disk.xml --wait --pivot --verbose --transient-job
Block Copy: [100 %]
Successfully pivoted

qemu cmd:
2020-03-30 02:33:34.707+0000: 630439: info : qemuMonitorSend:996 : QEMU_MONITOR_SEND_MSG: mon=0x7f260c2fa6f0 msg={"execute":"blockdev-add","arguments":{"driver":"host_device","filename":"/dev/sdh","pr-manager":"pr-helper0","node-name":"libvirt-3-storage","auto-read-only":true,"discard":"unmap"},"id":"libvirt-375"}^M fd=-1
2020-03-30 02:33:34.707+0000: 630435: info : virObjectRef:386 : OBJECT_REF: obj=0x7f260c2fa6f0
2020-03-30 02:33:34.708+0000: 630435: info : qemuMonitorIOWrite:453 : QEMU_MONITOR_IO_WRITE: mon=0x7f260c2fa6f0 buf={"execute":"blockdev-add","arguments":{"driver":"host_device","filename":"/dev/sdh","pr-manager":"pr-helper0","node-name":"libvirt-3-storage","auto-read-only":true,"discard":"unmap"},"id":"libvirt-375"}^M len=204 ret=204 errno=0
...
2020-03-30 02:33:34.709+0000: 630439: info : qemuMonitorSend:996 : QEMU_MONITOR_SEND_MSG: mon=0x7f260c2fa6f0 msg={"execute":"blockdev-add","arguments":{"node-name":"libvirt-3-format","read-only":false,"driver":"raw","file":"libvirt-3-storage"},"id":"libvirt-376"}^M fd=-1
2020-03-30 02:33:34.709+0000: 630435: info : virObjectRef:386 : OBJECT_REF: obj=0x7f260c2fa6f0
2020-03-30 02:33:34.709+0000: 630435: info : qemuMonitorIOWrite:453 : QEMU_MONITOR_IO_WRITE: mon=0x7f260c2fa6f0 buf={"execute":"blockdev-add","arguments":{"node-name":"libvirt-3-format","read-only":false,"driver":"raw","file":"libvirt-3-storage"},"id":"libvirt-376"}^M len=152 ret=152 errno=0

The node name has been added successfully.
Works as expected.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2017