Description of problem:
The guest crashes after libvirtd is restarted when the guest was started with an iscsi-direct volume.

Version-Release number of selected component (if applicable):
libvirt-4.10.0-1.module+el8+2317+367e35b5.x86_64
qemu-kvm-3.0.0-2.module+el8+2246+78080371.x86_64

How reproducible:
100%

Steps to Reproduce:

1. Prepare an iscsi-direct pool.
# virsh pool-dumpxml iscsi-direct
<pool type='iscsi-direct'>
  <name>iscsi-direct</name>
  <uuid>0799697a-94dd-4115-9601-8714b1931248</uuid>
  <capacity unit='bytes'>524287488</capacity>
  <allocation unit='bytes'>524287488</allocation>
  <available unit='bytes'>0</available>
  <source>
    <host name='10.66.144.87'/>
    <device path='iqn.2017-12.com.virttest:emulated-iscsi-noauth.target2'/>
    <initiator>
      <iqn name='iqn.2017-12.com.example:client'/>
    </initiator>
  </source>
</pool>

# virsh vol-list iscsi-direct
 Name         Path
-------------------------------------------------------------------------------
 unit:0:0:0   ip-10.66.144.87:3260-iscsi-iqn.2017-12.com.virttest:emulated-iscsi-noauth.target2-lun-0

2. Start the guest with the iscsi-direct volume.
# virsh dumpxml q35 | grep disk -a8
...
    <disk type='volume' device='disk'>
      <driver name='qemu' type='raw'/>
      <source pool='iscsi-direct' volume='unit:0:0:0' mode='direct'/>
      <target dev='vdb' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </disk>

# virsh start q35
Domain q35 started

# virsh list --all
 Id    Name    State
---------------------------------
 37    lmn     running
 49    q35     running

3. Restart libvirtd.
# systemctl restart libvirtd
# virsh list --all
 Id    Name    State
---------------------------------
 37    lmn     running
 -     q35     shut off

Actual results:
As shown in step 3, the guest crashes after libvirtd is restarted.

Expected results:
The guest should still be running.

Additional info:
1) qemu.log:
2018-12-12 02:51:39.178+0000: 22786: debug : virFileClose:109 : Closed fd 38
2018-12-12 02:51:39.178+0000: 22786: debug : virFileClose:109 : Closed fd 39
2018-12-12 02:51:39.178+0000: 22786: debug : virCommandHandshakeChild:460 : Handshake with parent is done
2018-12-12T02:51:39.218693Z qemu-kvm: -chardev pty,id=charserial0: char device redirected to /dev/pts/4 (label charserial0)
2018-12-12 02:51:49.103+0000: shutting down, reason=crashed
2018-12-12T02:51:49.106079Z qemu-kvm: terminating on signal 15 from pid 22843 (<unknown process>)

2) debug.log:
2018-12-12 02:51:49.103+0000: 22910: info : virObjectUnref:344 : OBJECT_UNREF: obj=0x5580600d95b0
2018-12-12 02:51:49.103+0000: 22910: info : virObjectUnref:344 : OBJECT_UNREF: obj=0x7f5c680f1f00
2018-12-12 02:51:49.103+0000: 22910: error : virDomainDiskTranslateSourcePool:30383 : XML error: disk source mode is only valid when storage pool is of iscsi type
2018-12-12 02:51:49.103+0000: 22910: info : virObjectUnref:344 : OBJECT_UNREF: obj=0x7f5c540049e0
2018-12-12 02:51:49.103+0000: 22910: info : virObjectUnref:344 : OBJECT_UNREF: obj=0x7f5c54006640
2018-12-12 02:51:49.103+0000: 22910: info : virObjectUnref:346 : OBJECT_DISPOSE: obj=0x7f5c54006640
2018-12-12 02:51:49.103+0000: 22910: debug : virStoragePoolDispose:517 : release pool 0x7f5c54006640 iscsi-direct 0799697a-94dd-4115-9601-8714b1931248
(In reply to Meina Li from comment #0)
...
> 2. Start guest with iscsi volume.
> # virsh dumpxml q35 | grep disk -a8
> ...
>     <disk type='volume' device='disk'>
>       <driver name='qemu' type='raw'/>
>       <source pool='iscsi-direct' volume='unit:0:0:0' mode='direct'/>
>       <target dev='vdb' bus='virtio'/>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
>     </disk>
>
> # virsh start q35
> Domain q35 started

This is where I have trouble. When I try to start such a domain, I get the error immediately:

virsh # start fedora
error: Failed to start domain fedora
error: XML error: disk source mode is only valid when storage pool is of iscsi type

Is it possible that the domain was started with an older libvirt, and that what we see here only happens across a libvirt upgrade?
(In reply to Michal Privoznik from comment #2)
...
>
> This is where I have trouble. When I try to start such a domain, I get the
> error immediately:
> virsh # start fedora
> error: Failed to start domain fedora
> error: XML error: disk source mode is only valid when storage pool is of
> iscsi type
>
> Is it possible that the domain was started with an older libvirt, and that
> what we see here only happens across a libvirt upgrade?

I also hit this error with the latest version. However, the domain starts successfully when the disk source in the domain XML has no mode='direct':

# virsh dumpxml q35 | grep disk -a8
...
    <disk type='volume' device='disk'>
      <driver name='qemu' type='raw'/>
      <source pool='iscsi-direct' volume='unit:0:0:0'/>   <!-- no mode='direct' in disk source -->
      <target dev='vdb' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </disk>

# virsh start q35
Domain q35 started

I think this may be a separate new bug, unless it is the result of a new design. Please review it again, thanks.

Test Version:
libvirt-5.0.0-4.module+el8+2835+faae67de.x86_64
qemu-kvm-3.1.0-18.module+el8+2834+fa8bb6e2.x86_64
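The behaviour described above (start fails when the disk source carries mode='direct' against an iscsi-direct pool, and succeeds without it) can be checked ahead of time on the XML alone. The following is a minimal sketch in Python, not libvirt code: the function name `check_disk_source_mode` is hypothetical, and the rule it applies is inferred only from the error message quoted in this bug.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Error text copied from the libvirt message quoted in this bug.
MODE_ERROR = "disk source mode is only valid when storage pool is of iscsi type"


def check_disk_source_mode(disk_xml: str, pool_xml: str) -> Optional[str]:
    """Hypothetical pre-flight check, not a libvirt API.

    Returns the error text if a <disk type='volume'> sets a 'mode'
    attribute on <source> while the pool type is not 'iscsi';
    returns None otherwise.
    """
    disk = ET.fromstring(disk_xml)
    pool = ET.fromstring(pool_xml)

    source = disk.find("source")
    if disk.get("type") != "volume" or source is None:
        return None  # not a pool-backed disk, nothing to check

    if source.get("mode") is not None and pool.get("type") != "iscsi":
        return MODE_ERROR
    return None


if __name__ == "__main__":
    pool = "<pool type='iscsi-direct'><name>iscsi-direct</name></pool>"
    disk = (
        "<disk type='volume' device='disk'>"
        "<source pool='iscsi-direct' volume='unit:0:0:0' mode='direct'/>"
        "</disk>"
    )
    print(check_disk_source_mode(disk, pool))
```

The sketch deliberately ignores whether the pool name in the disk matches the pool XML passed in; it only illustrates the type/mode combination that this comment reports as failing.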
Patch proposed upstream: https://www.redhat.com/archives/libvir-list/2019-March/msg00093.html
I've just merged the commit upstream:

commit e89694735011fad95bf9fc61221744e69d695137
Author:     Michal Privoznik <mprivozn>
AuthorDate: Fri Mar 1 16:05:16 2019 +0100
Commit:     Michal Privoznik <mprivozn>
CommitDate: Mon Mar 4 16:54:11 2019 +0100

    virDomainDiskTranslateSourcePool: Don't set @mode of iscsi-direct

    https://bugzilla.redhat.com/show_bug.cgi?id=1658504

    This function is called when a domain is starting up (in qemu
    driver that is when qemu cmd line is generated). It is used to
    translate <disk type='volume'/> to something usable by filling in
    virStorageSource (e.g. fetching disk path, or some connection URI
    for a network FS). But some of these info are not stored in
    status XML and thus the function is called on
    qemuProcessReconnect too to reconstruct runtime data. But this
    poses a problem because after the first run the mode is set to
    'direct', but in the second run this triggers a failure because
    mode is valid only for 'iscsi' volumes and not 'iscsi-direct'
    ones.

    Signed-off-by: Michal Privoznik <mprivozn>
    Reviewed-by: Erik Skultety <eskultet>

v5.1.0-20-ge896947350
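The failure mode in the commit message is a translation step that is not safe to run twice. Below is a toy model in Python, written only from the commit message and not from the libvirt sources: `DiskSource`, `translate_source_pool` and the `set_mode_for_iscsi_direct` switch are hypothetical names, with the switch standing in for the behaviour before (True) and after (False) the commit.

```python
from dataclasses import dataclass
from typing import Optional


class XMLError(Exception):
    """Stands in for libvirt's 'XML error' in this toy model."""


@dataclass
class DiskSource:
    pool_type: str               # e.g. 'iscsi' or 'iscsi-direct'
    mode: Optional[str] = None   # None means the attribute is not set


def translate_source_pool(src: DiskSource, set_mode_for_iscsi_direct: bool) -> None:
    """Toy model of the translate step (assumption, not libvirt code).

    It runs once at domain start and again when the daemon reconnects
    to a running domain after a restart.
    """
    # Validity check: 'mode' is accepted only for 'iscsi' pools.
    if src.mode is not None and src.pool_type != "iscsi":
        raise XMLError(
            "disk source mode is only valid when storage pool is of iscsi type"
        )
    # Pre-fix behaviour: the first run writes mode='direct' back into
    # the source, which the check above rejects on the next run.
    if src.pool_type == "iscsi-direct" and set_mode_for_iscsi_direct:
        src.mode = "direct"


def survives_restart(set_mode_for_iscsi_direct: bool) -> bool:
    """Return True if start followed by reconnect both succeed."""
    src = DiskSource(pool_type="iscsi-direct")
    translate_source_pool(src, set_mode_for_iscsi_direct)      # domain start
    try:
        translate_source_pool(src, set_mode_for_iscsi_direct)  # reconnect
    except XMLError:
        return False
    return True


if __name__ == "__main__":
    print("before fix:", survives_restart(True))
    print("after fix: ", survives_restart(False))
```

In this model the fix amounts to making the step idempotent: the second call sees the same input as the first, so the reconnect path no longer trips the validity check.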
Verified Version:
libvirt-5.4.0-1.module+el8.1.0+3304+7eb41d5f.x86_64
qemu-kvm-4.0.0-4.module+el8.1.0+3356+cda7f1ee.x86_64
kernel-4.18.0-107.el8.x86_64

Verified Steps:

Scenario 1: Start the guest with an iscsi-direct volume and check the domain list after restarting libvirtd.

1. Prepare an iscsi-direct pool.
# virsh pool-dumpxml iscsi-direct
<pool type='iscsi-direct'>
  <name>iscsi-direct</name>
  <uuid>2b621385-f734-4b98-8131-0fc17ed29e67</uuid>
  <capacity unit='bytes'>64424704512</capacity>
  <allocation unit='bytes'>64424704512</allocation>
  <available unit='bytes'>0</available>
  <source>
    <host name='**IP**'/>
    <device path='iqn.2017-12.com.virttest:emulated-iscsi-noauth.target2'/>
    <initiator>
      <iqn name='iqn.2017-12.com.example:client'/>
    </initiator>
  </source>
</pool>

# virsh vol-list iscsi-direct
 Name         Path
------------------------------------------------------------------------------------------------------
 unit:0:0:0   ip-10.66.4.109:3260-iscsi-iqn.2017-12.com.virttest:emulated-iscsi-noauth.target2-lun-0

2. Start the guest with the following volume disk.
...
    <disk type='volume' device='disk'>
      <driver name='qemu' type='raw'/>
      <source pool='iscsi-direct' volume='unit:0:0:0'/>
      <target dev='vdb' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
    </disk>
...
# virsh start lmn
Domain lmn started

3. Check the guest state after restarting libvirtd and after saving and restoring the guest.
# virsh list --all
 Id   Name   State
-----------------------
 1    lmn    running

# systemctl restart libvirtd
# virsh list --all
 Id   Name   State
-----------------------
 5    lmn    running

# virsh save lmn test.save
Domain lmn saved to test.save

# virsh list --all
 Id   Name   State
-----------------------
 -    lmn    shut off

# virsh restore test.save
Domain restored from test.save

# virsh list --all
 Id   Name   State
-----------------------
 6    lmn    running

Scenario 2: Hotplug and unplug an iscsi-direct pool volume disk.

1. Prepare the disk XML with the pool volume info.
# cat disk.xml
<disk type='volume' device='disk'>
  <driver name='qemu' type='raw'/>
  <source pool='iscsi-direct' volume='unit:0:0:0'/>
  <target dev='vdb' bus='virtio'/>
  <alias name='virtio-disk1'/>
  <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
</disk>

2. Hotplug the disk to the guest.
# virsh attach-device lmn disk.xml
Device attached successfully

# virsh dumpxml lmn | grep disk -a8
...
    <disk type='volume' device='disk'>
      <driver name='qemu' type='raw'/>
      <source pool='iscsi-direct' volume='unit:0:0:0'/>
      <target dev='vdb' bus='virtio'/>
      <alias name='virtio-disk1'/>
      <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
    </disk>
...

3. Detach the disk from the guest.
# virsh detach-device lmn disk.xml
Device detached successfully

# virsh dumpxml lmn | grep disk -a8
...
(the volume disk is no longer present)
...

Moving this bug to VERIFIED.
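The state checks in Scenario 1 compare `virsh list --all` output before and after each operation. A small helper for doing that comparison in a script could look like the sketch below. It is an illustration only: `parse_virsh_list` is a hypothetical name, and the parser assumes the three-column layout shown in this report and domain names without spaces.

```python
from typing import Dict


def parse_virsh_list(output: str) -> Dict[str, str]:
    """Map domain name to state from 'virsh list --all' text output.

    Assumes the 'Id  Name  State' layout shown in this bug and
    domain names that contain no whitespace.
    """
    states: Dict[str, str] = {}
    for line in output.splitlines():
        # Split into at most three fields so 'shut off' stays intact.
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue  # blank line or the dashed separator
        if parts[0] == "Id":
            continue  # header row
        _dom_id, name, state = parts
        states[name] = state.strip()
    return states


if __name__ == "__main__":
    sample = (
        " Id   Name   State\n"
        "-----------------------\n"
        " 37   lmn    running\n"
        " -    q35    shut off\n"
    )
    print(parse_virsh_list(sample))
```

To use it against a live host, the text could be captured with Python's `subprocess.run(["virsh", "list", "--all"], capture_output=True, text=True)` before and after `systemctl restart libvirtd`, and the two resulting dictionaries compared for the guest under test.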
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:3723