Description of problem:
Failed to start a VM after taking a snapshot of the shut-down VM in OSP 16.1.

Version-Release number of selected component (if applicable):
libvirt-daemon-kvm-6.0.0-25.module+el8.2.1+7154+47ffd890.x86_64
qemu-kvm-core-4.2.0-29.module+el8.2.1+7297+a825794d.x86_64
openstack-nova-compute-20.2.1-0.20200528080027.1e95025.el8ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Create an image and a volume in OSP, boot a VM from the volume:
   Project->Instances->Launch Instance->Source:Volume r8-vol
$ openstack image create r8 --disk-format qcow2 --container-format bare --file /tmp/RHEL-8.2-x86_64-latest.qcow2
$ openstack volume create r8-vol --size 10 --image r8

2. Shut down the VM, check the XML:
---------------------------------------------------------------
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native'/>
      <source file='/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457'/>
      <target dev='vda' bus='virtio'/>
      <serial>5fd5a675-1171-4dbc-af73-6b2a8a1dc457</serial>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>
----------------------------------------------------------------

3. Take VM snapshot r8-vol-shutdown-s1, check the XML:
----------------------------------------------------------------
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none' io='native'/>
      <source file='/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.362d13e8-9f6f-4f61-9e34-bf44ea626e7a'/>
      <target dev='vda' bus='virtio'/>
      <serial>5fd5a675-1171-4dbc-af73-6b2a8a1dc457</serial>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>
----------------------------------------------------------------

4. Delete VM snapshot r8-vol-shutdown-s1 successfully. Check "openstack volume snapshot list": the snapshot r8-vol-shutdown-s1 no longer exists. The libvirt XML is unchanged:
-------------------------------------------------------------
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none' io='native'/>
      <source file='/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.362d13e8-9f6f-4f61-9e34-bf44ea626e7a'/>
      <target dev='vda' bus='virtio'/>
      <serial>5fd5a675-1171-4dbc-af73-6b2a8a1dc457</serial>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>
-------------------------------------------------------------

5. Take VM snapshot r8-vol-shutdown-s2 successfully; it creates a 0-size image and a volume snapshot.
$ openstack volume snapshot list
+--------------------------------------+------------------------------------+-------------+----------------+------+
| ID                                   | Name                               | Description | Status         | Size |
+--------------------------------------+------------------------------------+-------------+----------------+------+
| b6a06c69-57f7-4fbe-a48b-b251fc550d87 | snapshot for r8-vol-shutdown-s2    | None        | available      |   10 |
+--------------------------------------+------------------------------------+-------------+----------------+------+

()[root@overcloud-novacompute-1 libvirt]# virsh list --all
 Id   Name                State
------------------------------------
 -    instance-00000004   shut off

Check the XML:
---------------------------------------------------------------
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none' io='native'/>
      <source file='/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.b6a06c69-57f7-4fbe-a48b-b251fc550d87'/>
      <target dev='vda' bus='virtio'/>
      <serial>5fd5a675-1171-4dbc-af73-6b2a8a1dc457</serial>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>
---------------------------------------------------------------
6. Try to start the VM; the VM stays SHUTOFF in OSP 16.1:
------------------------------------------------------------
$ openstack server list
+--------------------------------------+-----------+---------+-----------------------+-------+--------+
| ID                                   | Name      | Status  | Networks              | Image | Flavor |
+--------------------------------------+-----------+---------+-----------------------+-------+--------+
| 416b3517-84d9-4f93-85de-e395c3333c3e | r8-vol    | SHUTOFF | default=192.168.35.64 |       |        |
+--------------------------------------+-----------+---------+-----------------------+-------+--------+

But the VM has disappeared from "virsh list", and libvirtd.log shows this error:
---------------------------------------------------------------
error : virStorageFileReportBrokenChain:5214 : Cannot access storage file '/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457': No such file or directory
---------------------------------------------------------------

Actual results:
In step 6, the VM fails to start.

Expected results:
In step 6, the VM starts successfully.

Additional info:
- libvirtd.log
- nova_compute.log
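A quick way to confirm the broken chain from step 6 is to inspect the overlay file left behind by the snapshot on the compute node. The check below is a hypothetical sketch (not output captured from the affected host); the paths are the ones quoted above:

# hypothetical verification on the compute node
$ cd /var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5
$ qemu-img info --backing-chain volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.b6a06c69-57f7-4fbe-a48b-b251fc550d87
# this should fail with a "Could not open backing file"-style error, because the
# qcow2 overlay still references the base file
# volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457, which no longer exists on the share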
Created attachment 1700262 [details] libvirtd.log, nova-compute.log
I think this is an OSP bug; adding more information here:

1. Create an image and a volume in OSP, boot a VM from the volume:
   Project->Instances->Launch Instance->Source:Volume r8t-vol
$ openstack image create r8t --disk-format qcow2 --container-format bare --file /tmp/RHEL-8.2-x86_64-latest.qcow2
$ openstack volume create r8t-vol --size 10 --image r8t

Check the cinder NFS storage:
[root@dell-per730-** cinder]# ls -lh
-rw-rw-rw-. 1 qemu qemu 10G Jul 9 05:35 volume-2750c079-b5f5-48ec-8e71-b470c41abf08

2. Shut down the VM
[root@dell-per730-** cinder]# ls -lh
-rw-rw-rw-. 1 root root 10G Jul 9 05:35 volume-2750c079-b5f5-48ec-8e71-b470c41abf08

3. Take VM snapshot s1-down; a volume snapshot is created
[root@dell-per730-** cinder]# ls -lh
-rw-rw-rw-. 1 root root 10G  Jul 9 05:35 volume-2750c079-b5f5-48ec-8e71-b470c41abf08
-rw-rw-rw-. 1 root root 193K Jul 9 05:36 volume-2750c079-b5f5-48ec-8e71-b470c41abf08.52fd1a14-4f33-48a7-9ca3-5c159cb3c660

4. Delete VM snapshot s1-down: the base file volume-**bf08 is deleted. It should have deleted the overlay volume-**bf08.**3c660 instead.
[root@dell-per730-** cinder]# ls -lh
-rw-rw-rw-. 1 root root 193K Jul 9 05:36 volume-2750c079-b5f5-48ec-8e71-b470c41abf08.52fd1a14-4f33-48a7-9ca3-5c159cb3c660

5. Take VM snapshot s2-down; volume-**bf08.**c1bb is created
[root@dell-per730-64 cinder]# ls -lh
-rw-rw-rw-. 1 root root 1.6G Jul 9 05:38 volume-2750c079-b5f5-48ec-8e71-b470c41abf08.52fd1a14-4f33-48a7-9ca3-5c159cb3c660
-rw-rw-rw-. 1 root root 193K Jul 9 05:40 volume-2750c079-b5f5-48ec-8e71-b470c41abf08.bb18f197-7bce-47ca-91e4-f666048fc1bb

6. Try to start the VM, hit error in libvirtd.log:
--------------------------------------------------------------------------------------------------
error : virStorageFileReportBrokenChain:5214 : Cannot access storage file '/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-2750c079-b5f5-48ec-8e71-b470c41abf08': No such file or directory
--------------------------------------------------------------------------------------------------
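For context on step 4: the snapshot file is a qcow2 overlay whose "backing file" field points at the base volume, so the base cannot simply be removed while an overlay still depends on it. The commands below are an illustrative, hypothetical sketch of the relationship and of one way an overlay could be made standalone before the base is removed; they are not a claim about what the cinder NFS driver actually runs:

# illustrative only -- inspect the overlay's backing reference
[root@dell-per730-** cinder]# qemu-img info volume-2750c079-b5f5-48ec-8e71-b470c41abf08.52fd1a14-4f33-48a7-9ca3-5c159cb3c660
# the "backing file:" line points at volume-2750c079-b5f5-48ec-8e71-b470c41abf08

# illustrative only -- a safe rebase onto an empty backing file copies the base's
# data into the overlay, so the overlay survives removal of the base (this must
# run while the base still exists)
[root@dell-per730-** cinder]# qemu-img rebase -b "" volume-2750c079-b5f5-48ec-8e71-b470c41abf08.52fd1a14-4f33-48a7-9ca3-5c159cb3c660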
So the libvirtd log shows: 'instance-00000004' is defined as:

2020-07-08 08:32:23.025+0000: 61778: debug : virDomainDefineXML:6146 : conn=0x7fafa40015d0, xml=<domain type="kvm">
  <uuid>416b3517-84d9-4f93-85de-e395c3333c3e</uuid>
  <name>instance-00000004</name>
  <memory>2097152</memory>
  <vcpu>2</vcpu>
  <metadata>
    <nova:instance xmlns:nova="http://openstack.org/xmlns/libvirt/nova/1.0">
      <nova:package version="20.2.1-0.20200528080027.1e95025.el8ost"/>
      <nova:name>r8-vol</nova:name>
      <nova:creationTime>2020-07-08 08:32:22</nova:creationTime>
      <nova:flavor name="m1">
        <nova:memory>2048</nova:memory>
        <nova:disk>10</nova:disk>
        <nova:swap>0</nova:swap>
        <nova:ephemeral>0</nova:ephemeral>
        <nova:vcpus>2</nova:vcpus>
      </nova:flavor>
      <nova:owner>
        <nova:user uuid="286e173a35bc42158a88973fdaf6e564">admin</nova:user>
        <nova:project uuid="a26754324ca249c590505e1f796d5f0f">admin</nova:project>
      </nova:owner>
    </nova:instance>
  </metadata>
  <sysinfo type="smbios">
    <system>
      <entry name="manufacturer">Red Hat</entry>
      <entry name="product">OpenStack Compute</entry>
      <entry name="version">20.2.1-0.20200528080027.1e95025.el8ost</entry>
      <entry name="serial">416b3517-84d9-4f93-85de-e395c3333c3e</entry>
      <entry name="uuid">416b3517-84d9-4f93-85de-e395c3333c3e</entry>
      <entry name="family">Virtual Machine</entry>
    </system>
  </sysinfo>
  <os>
    <type machine="pc-i440fx-rhel7.6.0">hvm</type>
    <boot dev="hd"/>
    <smbios mode="sysinfo"/>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cputune>
    <shares>2048</shares>
  </cputune>
  <clock offset="utc">
    <timer name="pit" tickpolicy="delay"/>
    <timer name="rtc" tickpolicy="catchup"/>
    <timer name="hpet" present="no"/>
  </clock>
  <cpu mode="host-model" match="exact">
    <topology sockets="2" cores="1" threads="1"/>
  </cpu>
  <devices>
    <disk type="file" device="disk">
      <driver name="qemu" type="raw" cache="none" io="native"/>
      <source file="/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457"/>
      <target bus="virtio" dev="vda"/>
      <serial>5fd5a675-1171-4dbc-af73-6b2a8a1dc457</serial>
    </disk>
    <interface type="bridge">
      <mac address="fa:16:3e:77:27:46"/>
      <model type="virtio"/>
      <driver name="vhost" rx_queue_size="512"/>
      <source bridge="br-int"/>
      <mtu size="1442"/>
      <target dev="tapa82ddf70-7d"/>
      <virtualport type="openvswitch">
        <parameters interfaceid="a82ddf70-7d05-4f83-9f26-07ad91cdc87e"/>
      </virtualport>
    </interface>
    <serial type="pty">
      <log file="/var/lib/nova/instances/416b3517-84d9-4f93-85de-e395c3333c3e/console.log" append="off"/>
    </serial>
    <input type="tablet" bus="usb"/>
    <graphics type="vnc" autoport="yes" listen="192.168.24.11"/>
    <video>
      <model type="cirrus"/>
    </video>
    <memballoon model="virtio">
      <stats period="10"/>
    </memballoon>
  </devices>
</domain>

2020-07-08 08:32:23.033+0000: 61778: info : qemuDomainDefineXMLFlags:7636 : Creating domain 'instance-00000004'

The VM is then started:

2020-07-08 08:32:23.069+0000: 61775: debug : virDomainCreateWithFlags:6589 : dom=0x7fafb4024930, (VM: name=instance-00000004, uuid=416b3517-84d9-4f93-85de-e395c3333c3e), flags=0x1
2020-07-08 08:32:23.069+0000: 61775: debug : qemuDomainObjBeginJobInternal:9784 : Starting job: job=none agentJob=none asyncJob=start (vm=0x7fafa8029f70 name=instance-00000004, current job=none agentJob=none async=none)
2020-07-08 08:32:23.069+0000: 61775: debug : qemuDomainObjBeginJobInternal:9841 : Started async job: start (vm=0x7fafa8029f70 name=instance-00000004)
2020-07-08 08:32:23.069+0000: 61775: debug : qemuProcessStart:7077 : conn=0x7fafa40015d0 driver=0x7faf640f1980 vm=0x7fafa8029f70 name=instance-00000004 id=-1 asyncJob=start migrateFrom=<null> migrateFd=-1 migratePath=<null> snapshot=(nil) vmop=0 flags=0x3
2020-07-08 08:32:23.069+0000: 61775: debug : qemuProcessInit:5526 : vm=0x7fafa8029f70 name=instance-00000004 id=-1 migration=0

[...]

2020-07-08 08:32:23.148+0000: 61775: debug : virCommandRunAsync:2588 : About to run LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin HOME=/var/lib/libvirt/qemu/domain-7-instance-00000004 XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-7-instance-00000004/.local/share XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-7-instance-00000004/.cache XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-7-instance-00000004/.config QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name guest=instance-00000004,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-7-instance-00000004/master-key.aes -machine pc-i440fx-rhel7.6.0,accel=kvm,usb=off,dump-guest-core=off -cpu Skylake-Server-IBRS,ss=on,vmx=on,hypervisor=on,tsc-adjust=on,clflushopt=on,umip=on,pku=on,md-clear=on,stibp=on,arch-capabilities=on,ssbd=on,xsaves=on,ibpb=on,amd-ssbd=on,skip-l1dfl-vmentry=on,pschange-mc-no=on -m 2048 -overcommit mem-lock=off -smp 2,sockets=2,dies=1,cores=1,threads=1 -uuid 416b3517-84d9-4f93-85de-e395c3333c3e -smbios 'type=1,manufacturer=Red Hat,product=OpenStack Compute,version=20.2.1-0.20200528080027.1e95025.el8ost,serial=416b3517-84d9-4f93-85de-e395c3333c3e,uuid=416b3517-84d9-4f93-85de-e395c3333c3e,family=Virtual Machine' -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=33,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -blockdev '{"driver":"file","filename":"/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457","aio":"native","node-name":"libvirt-1-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"}' -blockdev '{"node-name":"libvirt-1-format","read-only":false,"cache":{"direct":true,"no-flush":false},"driver":"raw","file":"libvirt-1-storage"}' -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=libvirt-1-format,id=virtio-disk0,bootindex=1,write-cache=on,serial=5fd5a675-1171-4dbc-af73-6b2a8a1dc457 -netdev tap,fd=35,id=hostnet0,vhost=on,vhostfd=36 -device virtio-net-pci,rx_queue_size=512,host_mtu=1442,netdev=hostnet0,id=net0,mac=fa:16:3e:77:27:46,bus=pci.0,addr=0x3 -add-fd set=3,fd=38 -chardev pty,id=charserial0,logfile=/dev/fdset/3,logappend=on -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0,bus=usb.0,port=1 -vnc 192.168.24.11:0 -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on

CPUs are started:

2020-07-08 08:32:26.123+0000: 61777: debug : virDomainResume:666 : dom=0x7fafa40075c0, (VM: name=instance-00000004, uuid=416b3517-84d9-4f93-85de-e395c3333c3e)
2020-07-08 08:32:26.125+0000: 61770: debug : qemuProcessHandleResume:728 : Transitioned guest instance-00000004 into running state, reason 'unpaused', event detail 0

And the VM is then asked to shut down a minute later:

2020-07-08 08:33:31.688+0000: 61778: debug : virDomainShutdown:1444 : dom=0x7fafa8019b40, (VM: name=instance-00000004, uuid=416b3517-84d9-4f93-85de-e395c3333c3e)
2020-07-08 08:33:31.688+0000: 61778: info : qemuMonitorSend:996 :
QEMU_MONITOR_SEND_MSG: mon=0x7fafb4046e80 msg={"execute":"system_powerdown","id":"libvirt-367"}

The guest OS shuts down a moment later:

2020-07-08 08:33:33.422+0000: 61770: debug : qemuProcessHandleShutdown:594 : Transitioned guest instance-00000004 to shutdown state
2020-07-08 08:33:33.422+0000: 61770: debug : virObjectEventNew:631 : obj=0x561796c81000
2020-07-08 08:33:33.423+0000: 61770: debug : qemuProcessKill:7200 : vm=0x7fafa8029f70 name=instance-00000004 pid=103949 flags=0x2

[...]

2020-07-08 08:33:33.457+0000: 61770: debug : qemuProcessHandleMonitorEOF:293 : Received EOF on 0x7fafa8029f70 'instance-00000004'

[...]

2020-07-08 08:33:33.692+0000: 62484: debug : qemuRemoveCgroup:1333 : Failed to terminate cgroup for instance-00000004
2020-07-08 08:33:33.701+0000: 62484: error : virProcessRunInFork:1161 : internal error: child reported (status=125): internal error: child reported (status=125): internal error: child reported (status=125): internal error: End of file from qemu monitor
2020-07-08 08:33:33.701+0000: 62484: warning : qemuBlockRemoveImageMetadata:2774 : Unable to remove disk metadata on vm instance-00000004 from /var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457 (disk target vda)

We fail to remove the image metadata for some reason, but that shouldn't be fatal in this instance. I'll ask Michal, who implemented this, to comment on why that could have happened.

The VM is then destroyed, but it's not running, so this is a no-op:

2020-07-08 08:33:33.716+0000: 61777: debug : virDomainDestroy:466 : dom=0x7fafa4057620, (VM: name=instance-00000004, uuid=416b3517-84d9-4f93-85de-e395c3333c3e)
2020-07-08 08:33:33.716+0000: 61777: debug : virDomainObjCheckActive:7204 : Requested operation is not valid: domain is not running

An external snapshot is then created with the VM in the inactive state:

2020-07-08 08:35:14.804+0000: 61776: debug : virDomainSnapshotCreateXML:221 : dom=0x7fafac039850, (VM: name=instance-00000004, uuid=416b3517-84d9-4f93-85de-e395c3333c3e), xmlDesc=<domainsnapshot>
  <disks>
    <disk name="/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457" snapshot="external" type="file">
      <source file="/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.362d13e8-9f6f-4f61-9e34-bf44ea626e7a"/>
    </disk>
  </disks>
</domainsnapshot>
, flags=0x34

The flags used are:
VIR_DOMAIN_SNAPSHOT_CREATE_NO_METADATA (0x04)
VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY (0x10)
VIR_DOMAIN_SNAPSHOT_CREATE_REUSE_EXT (0x20)

Since the VM is inactive at this point and _REUSE_EXT along with _NO_METADATA is used to create an external snapshot, libvirt effectively just replaces the disk source with the one specified by the snapshot. Libvirt doesn't even touch the image files, apart from checking whether they exist.

Another snapshot is then created ...

2020-07-08 08:43:17.458+0000: 61775: debug : virDomainSnapshotCreateXML:221 : dom=0x7fafb401eb80, (VM: name=instance-00000004, uuid=416b3517-84d9-4f93-85de-e395c3333c3e), xmlDesc=<domainsnapshot>
  <disks>
    <disk name="/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.362d13e8-9f6f-4f61-9e34-bf44ea626e7a" snapshot="external" type="file">
      <source file="/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.b6a06c69-57f7-4fbe-a48b-b251fc550d87"/>
    </disk>
  </disks>
</domainsnapshot>
, flags=0x34

... with the same configuration (the flag value is decomposed below).
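Just as a quick sanity check that flags=0x34 is exactly those three bits (the constant values above come from libvirt's public API; the arithmetic below is only illustrative):

$ printf '0x%02x\n' $(( 0x04 | 0x10 | 0x20 ))
0x34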
So at this point the 'vda' disk points to:
"/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.b6a06c69-57f7-4fbe-a48b-b251fc550d87"

Another spurious destroy is attempted, but the VM still isn't running:

2020-07-08 08:47:38.979+0000: 61779: debug : virDomainDestroy:466 : dom=0x7fafa0038710, (VM: name=instance-00000004, uuid=416b3517-84d9-4f93-85de-e395c3333c3e)
2020-07-08 08:47:38.979+0000: 61779: debug : virDomainObjCheckActive:7204 : Requested operation is not valid: domain is not running

At this point the VM is undefined ...

2020-07-08 08:47:39.051+0000: 61775: debug : virDomainUndefineFlags:6304 : dom=0x7fafb4005400, (VM: name=instance-00000004, uuid=416b3517-84d9-4f93-85de-e395c3333c3e), flags=0x1

... and defined again:

2020-07-08 08:47:39.190+0000: 61780: debug : virDomainDefineXML:6146 : conn=0x7fafa40015d0, xml=<domain type="kvm">
  <uuid>416b3517-84d9-4f93-85de-e395c3333c3e</uuid>
  <name>instance-00000004</name>
  <memory>2097152</memory>
  <vcpu>2</vcpu>
  <metadata>
    <nova:instance xmlns:nova="http://openstack.org/xmlns/libvirt/nova/1.0">
      <nova:package version="20.2.1-0.20200528080027.1e95025.el8ost"/>
      <nova:name>r8-vol</nova:name>
      <nova:creationTime>2020-07-08 08:47:39</nova:creationTime>
      <nova:flavor name="m1">
        <nova:memory>2048</nova:memory>
        <nova:disk>10</nova:disk>
        <nova:swap>0</nova:swap>
        <nova:ephemeral>0</nova:ephemeral>
        <nova:vcpus>2</nova:vcpus>
      </nova:flavor>
      <nova:owner>
        <nova:user uuid="286e173a35bc42158a88973fdaf6e564">admin</nova:user>
        <nova:project uuid="a26754324ca249c590505e1f796d5f0f">admin</nova:project>
      </nova:owner>
    </nova:instance>
  </metadata>
  <sysinfo type="smbios">
    <system>
      <entry name="manufacturer">Red Hat</entry>
      <entry name="product">OpenStack Compute</entry>
      <entry name="version">20.2.1-0.20200528080027.1e95025.el8ost</entry>
      <entry name="serial">416b3517-84d9-4f93-85de-e395c3333c3e</entry>
      <entry name="uuid">416b3517-84d9-4f93-85de-e395c3333c3e</entry>
      <entry name="family">Virtual Machine</entry>
    </system>
  </sysinfo>
  <os>
    <type machine="pc-i440fx-rhel7.6.0">hvm</type>
    <boot dev="hd"/>
    <smbios mode="sysinfo"/>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cputune>
    <shares>2048</shares>
  </cputune>
  <clock offset="utc">
    <timer name="pit" tickpolicy="delay"/>
    <timer name="rtc" tickpolicy="catchup"/>
    <timer name="hpet" present="no"/>
  </clock>
  <cpu mode="host-model" match="exact">
    <topology sockets="2" cores="1" threads="1"/>
  </cpu>
  <devices>
    <disk type="file" device="disk">
      <driver name="qemu" type="raw" cache="none" io="native"/>
      <source file="/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457"/>

The disk source is thus overwritten by the define back to the original path.

      <target bus="virtio" dev="vda"/>
      <serial>5fd5a675-1171-4dbc-af73-6b2a8a1dc457</serial>
    </disk>
    <interface type="bridge">
      <mac address="fa:16:3e:77:27:46"/>
      <model type="virtio"/>
      <driver name="vhost" rx_queue_size="512"/>
      <source bridge="br-int"/>
      <mtu size="1442"/>
      <target dev="tapa82ddf70-7d"/>
      <virtualport type="openvswitch">
        <parameters interfaceid="a82ddf70-7d05-4f83-9f26-07ad91cdc87e"/>
      </virtualport>
    </interface>
    <serial type="pty">
      <log file="/var/lib/nova/instances/416b3517-84d9-4f93-85de-e395c3333c3e/console.log" append="off"/>
    </serial>
    <input type="tablet" bus="usb"/>
    <graphics type="vnc" autoport="yes" listen="192.168.24.11"/>
    <video>
      <model type="cirrus"/>
    </video>
    <memballoon model="virtio">
      <stats period="10"/>
    </memballoon>
  </devices>
</domain>

2020-07-08 08:47:39.198+0000: 61780: info : qemuDomainDefineXMLFlags:7636 : Creating domain 'instance-00000004'

The VM is then started:

2020-07-08 08:47:39.200+0000: 61778: debug : virDomainCreateWithFlags:6589 : dom=0x7fafa80016c0, (VM: name=instance-00000004, uuid=416b3517-84d9-4f93-85de-e395c3333c3e), flags=0x0
2020-07-08 08:47:39.200+0000: 61778: debug : qemuProcessStart:7077 : conn=0x7fafa40015d0 driver=0x7faf640f1980 vm=0x7faf9801c430 name=instance-00000004 id=-1 asyncJob=start migrateFrom=<null> migrateFd=-1 migratePath=<null> snapshot=(nil) vmop=0 flags=0x1
2020-07-08 08:47:39.200+0000: 61778: debug : qemuProcessInit:5526 : vm=0x7faf9801c430 name=instance-00000004 id=-1 migration=0

And fails to find the image:

2020-07-08 08:47:39.229+0000: 61778: debug : qemuProcessPrepareHost:6535 : Preparing disks (host)
2020-07-08 08:47:39.230+0000: 61778: error : virStorageFileReportBrokenChain:5214 : Cannot access storage file '/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457': No such file or directory
    <source file="/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457"/>
2020-07-08 08:47:39.230+0000: 61778: debug : qemuProcessStop:7286 : Shutting down vm=0x7faf9801c430 name=instance-00000004 id=8 pid=0, reason=failed, asyncJob=start, flags=0x2

And it's undefined:

2020-07-08 08:47:39.417+0000: 61780: debug : virDomainUndefineFlags:6304 : dom=0x7faf98018f90, (VM: name=instance-00000004, uuid=416b3517-84d9-4f93-85de-e395c3333c3e), flags=0x1
2020-07-08 08:47:39.417+0000: 61780: info : qemuDomainUndefineFlags:7772 : Undefining domain 'instance-00000004'

As the above log shows, libvirt didn't actually modify or even touch the original image: _REUSE_EXT was used when creating the snapshots, and the original path was supplied back to us in the newly defined XML. There therefore doesn't seem to be a libvirt problem related to snapshots here; the image was removed outside of our control. Please re-assign this to the OpenStack team for further investigation.
(In reply to chhu from comment #2)
> I think this is an OSP bug, add more information here:
>
> 1. Create a image and volume in OSP, boot VM from volume:
> Project->Instances->Launch Instance->Source:Volume r8t-vol
> $ openstack image create r8t --disk-format qcow2 --container-format bare
> --file /tmp/RHEL-8.2-x86_64-latest.qcow2
> $ openstack volume create r8t-vol --size 10 --image r8t
>
> Check the cinder on nfs storage:
> [root@dell-per730-** cinder]# ls -lh
> -rw-rw-rw-. 1 qemu qemu 10G Jul 9 05:35
> volume-2750c079-b5f5-48ec-8e71-b470c41abf08
>
> 2. Shutdown the VM
> [root@dell-per730-** cinder]# ls -lh
> -rw-rw-rw-. 1 root root 10G Jul 9 05:35
> volume-2750c079-b5f5-48ec-8e71-b470c41abf08
>
> 3. Do VM snapshot s1-down, volume snapshot is created
> [root@dell-per730-** cinder]# ls -lh
> -rw-rw-rw-. 1 root root 10G Jul 9 05:35
> volume-2750c079-b5f5-48ec-8e71-b470c41abf08
> -rw-rw-rw-. 1 root root 193K Jul 9 05:36
> volume-2750c079-b5f5-48ec-8e71-b470c41abf08.52fd1a14-4f33-48a7-9ca3-
> 5c159cb3c660
>
> 4. Delete VM snapshot s1-down, the volume-**bf08 is deleted. It should
> delete volume-**bf08.**3c660
> [root@dell-per730-** cinder]# ls -lh
> -rw-rw-rw-. 1 root root 193K Jul 9 05:36
> volume-2750c079-b5f5-48ec-8e71-b470c41abf08.52fd1a14-4f33-48a7-9ca3-
> 5c159cb3c660
>
> 5. Do VM snapshot s2-down, volume-**bf08.**c1bb is created
> [root@dell-per730-64 cinder]# ls -lh
> -rw-rw-rw-. 1 root root 1.6G Jul 9 05:38
> volume-2750c079-b5f5-48ec-8e71-b470c41abf08.52fd1a14-4f33-48a7-9ca3-
> 5c159cb3c660
> -rw-rw-rw-. 1 root root 193K Jul 9 05:40
> volume-2750c079-b5f5-48ec-8e71-b470c41abf08.bb18f197-7bce-47ca-91e4-
> f666048fc1bb
>
> 6. Try to start the VM, hit error in libvirtd.log:
>
> -----------------------------------------------------------------------------
> ---------------------
> error : virStorageFileReportBrokenChain:5214 : Cannot access storage file
> '/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-2750c079-b5f5-
> 48ec-8e71-b470c41abf08': No such file or directory

Yes, that's consistent with my findings from the log. The file is removed somewhere in between, which is not recorded in the logs. Thanks for the update from the openstack side.
If this is assigned to cinder, we need at least the cinder configuration and the various cinder logs (volume, api, etc.).
Created attachment 1700428 [details] cinder.conf, cinder-*.log
Let's involve a nova-compute developer for further debugging. I'll change the Component to nova; please feel free to change it back if that's not correct. Thank you!
Created attachment 1700773 [details] nfs-config.yaml
Hi, Tzach

1. Boot a VM from a volume: Project->Instances->Launch Instance->Source:Volume r8-vol

2. Project->Instances-> Select the Instance -> Click "Shut Off Instance"

3. Project->Instances-> Select the Instance -> Take VM snapshot: r8-vol-shutdown-s1. The volume snapshot is created and can be found under Cinder -> Volume Snapshots, and the overlay file volume-**.** appears on the storage:
[root@dell-per730-** cinder]# ls -lh
-rw-rw-rw-. 1 root root 10G  Jul 9 05:35 volume-2750c079-b5f5-48ec-8e71-b470c41abf08
-rw-rw-rw-. 1 root root 193K Jul 9 05:36 volume-2750c079-b5f5-48ec-8e71-b470c41abf08.52fd1a14-4f33-48a7-9ca3-5c159cb3c660

4. Project->Cinder->Volume Snapshots, select the snapshot and click "Delete": the base file volume-**bf08 is deleted. It should have deleted volume-**bf08.**c660 instead.

5. Repeat step 3 to take another snapshot; another volume-**.** overlay is created on the cinder storage (same as in comment 2).

6. Project->Instances-> Select the Instance -> Click "Start Instance": the VM can't start because the volume base file was deleted in step 4.

Rough CLI equivalents of these steps are sketched below.
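For anyone reproducing this without Horizon, approximately equivalent openstackclient commands (treat this as a sketch of the GUI steps above rather than the exact commands that were run; <snapshot-uuid> is a placeholder):

$ openstack server stop r8-vol                                    # step 2
$ openstack server image create --name r8-vol-shutdown-s1 r8-vol  # step 3: creates a 0-size image plus a volume snapshot
$ openstack volume snapshot list                                  # note the snapshot UUID
$ openstack volume snapshot delete <snapshot-uuid>                # step 4
$ openstack server image create --name r8-vol-shutdown-s2 r8-vol  # step 5
$ openstack server start r8-vol                                   # step 6: fails with the broken-chain error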
https://bugs.launchpad.net/cinder/+bug/1860913 seems to be the same bug upstream, and has some info about why this happens.
(In reply to Eric Harney from comment #16)
> https://bugs.launchpad.net/cinder/+bug/1860913 seems to be the same bug
> upstream, and has some info about why this happens.

Yup, correct, this looks like the same issue. I'm tracking it under the following bug, which I'll mark this one as a duplicate of for now while I work on getting the series updated upstream:

Instance uses base image file when it is rebooted after snapshot creation if cinder nfs backend is used
https://bugzilla.redhat.com/show_bug.cgi?id=1757691

*** This bug has been marked as a duplicate of bug 1757691 ***