Bug 1854826 - Failed to delete VM snapshot in OSP16.1 env after mixed operations
Summary: Failed to delete VM snapshot in OSP16.1 env after mixed operations
Keywords:
Status: CLOSED DUPLICATE of bug 1757691
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 16.1 (Train)
Hardware: x86_64
OS: Linux
Importance: urgent high
Target Milestone: ---
Assignee: OSP DFG:Compute
QA Contact: OSP DFG:Compute
URL:
Whiteboard: libvirt_OSP_INT
Depends On:
Blocks:
 
Reported: 2020-07-08 10:28 UTC by chhu
Modified: 2023-03-21 19:35 UTC (History)
15 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-09 13:44:57 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
libvirtd.log, nova-compute.log (440.51 KB, application/gzip)
2020-07-08 10:34 UTC, chhu
cinder.conf, cinder-*.log (521.56 KB, application/gzip)
2020-07-10 01:14 UTC, chhu


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker OSP-23542 0 None None None 2023-03-21 19:35:57 UTC

Description chhu 2020-07-08 10:28:34 UTC
Description of problem:
Failed to delete VM snapshot in OSP16.1 env after mixed operations

Version-Release number of selected component (if applicable):
libvirt-daemon-kvm-6.0.0-25.module+el8.2.1+7154+47ffd890.x86_64
qemu-kvm-core-4.2.0-29.module+el8.2.1+7297+a825794d.x86_64
openstack-nova-compute-20.2.1-0.20200528080027.1e95025.el8ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Create an image and a volume in OSP, then boot the VM from the volume: Project->Instances->Launch Instance->Source: Volume r8-vol
   $ openstack image create r8 --disk-format qcow2 --container-format bare --file /tmp/RHEL-8.2-x86_64-latest.qcow2
   $ openstack volume create r8-vol --size 10 --image r8

2. Create VM snapshots s1 and s2; each creates a 0-size image and a volume snapshot.
   Check the VM XML:
   -----------------------------------------------------------------------------
       <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none' io='native'/>
      <source file='/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.a1ebcb3f-c1af-4daa-81d3-d59a71ea67f4' index='3'/>
      <backingStore type='file' index='2'>
        <format type='qcow2'/>
        <source file='/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.869f99ea-3ba0-46dc-ae35-69bc987ddc23'/>
        <backingStore type='file' index='1'>
          <format type='qcow2'/>
          <source file='/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.362d13e8-9f6f-4f61-9e34-bf44ea626e7a'/>
          <backingStore/>
        </backingStore>
      </backingStore>
      <target dev='vda' bus='virtio'/>
      <serial>5fd5a675-1171-4dbc-af73-6b2a8a1dc457</serial>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>

3. Delete the volume snapshot "snapshot for s2", then check the VM XML:
   -----------------------------------------------------------------------------
   <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none' io='native'/>
      <source file='/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.a1ebcb3f-c1af-4daa-81d3-d59a71ea67f4' index='3'/>
      <backingStore type='file' index='1'>
        <format type='qcow2'/>
        <source file='/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.362d13e8-9f6f-4f61-9e34-bf44ea626e7a'/>
        <backingStore/>
      </backingStore>
      <target dev='vda' bus='virtio'/>
      <serial>5fd5a675-1171-4dbc-af73-6b2a8a1dc457</serial>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>


4. Shut down the VM, then check the disk XML:
   -----------------------------------------------------------------------------
   <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none' io='native'/>
      <source file='/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.a1ebcb3f-c1af-4daa-81d3-d59a71ea67f4'/>
      <backingStore type='file'>
        <format type='qcow2'/>
        <source file='/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.362d13e8-9f6f-4f61-9e34-bf44ea626e7a'/>
      </backingStore>
      <target dev='vda' bus='virtio'/>
      <serial>5fd5a675-1171-4dbc-af73-6b2a8a1dc457</serial>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>

5. Create VM snapshot s4-down; it creates a 0-size image and the volume snapshot "snapshot for s4-down".
   Check the disk XML:
   -----------------------------------------------------------------------------
   <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none' io='native'/>
      <source file='/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.8c2c3147-dd21-4d6c-acc8-b6ce786c0301'/>
      <target dev='vda' bus='virtio'/>
      <serial>5fd5a675-1171-4dbc-af73-6b2a8a1dc457</serial>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>

6. Start the VM successfully, then check the disk XML:
   ()[root@overcloud-novacompute-1 libvirt]# virsh list --all
     Id   Name                State
     ------------------------------------
     11   instance-00000006   running
   -----------------------------------------------------------------------------
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none' io='native'/>
      <source file='/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.362d13e8-9f6f-4f61-9e34-bf44ea626e7a' index='1'/>
      <backingStore/>
      <target dev='vda' bus='virtio'/>
      <serial>5fd5a675-1171-4dbc-af73-6b2a8a1dc457</serial>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>

7. Log in to the VM successfully and touch a file in the VM.
8. Try to delete the volume snapshot "snapshot for s1"; an error appears in libvirtd.log:
  -------------------------------------------------------------------------------------------------------
  error : virStorageFileChainLookup:1691 : invalid argument: could not find image 'volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.a1ebcb3f-c1af-4daa-81d3-d59a71ea67f4' in chain for '/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.362d13e8-9f6f-4f61-9e34-bf44ea626e7a'

9. Try to delete the volume snapshot "snapshot for s4-down"; an error appears in libvirtd.log:
  -------------------------------------------------------------------------------------------------------
  error : virStorageFileChainLookup:1687 : invalid argument: could not find image 'volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.362d13e8-9f6f-4f61-9e34-bf44ea626e7a' beneath '/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.362d13e8-9f6f-4f61-9e34-bf44ea626e7a' in chain for '/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.362d13e8-9f6f-4f61-9e34-bf44ea626e7a'

Actual results:
In steps 8 and 9, deleting the snapshot fails.

Expected results:
In steps 8 and 9, the snapshots are deleted successfully.

Additional info:
- libvirtd.log, nova-compute.log
- virDomainBlockCommit is called when deleting a volume snapshot (one created by snapshotting a running VM booted from a volume)

Comment 1 chhu 2020-07-08 10:34:54 UTC
Created attachment 1700271 [details]
libvirtd.log, nova-compute.log

Comment 2 Peter Krempa 2020-07-09 14:13:10 UTC
The debug logs show improper libvirt API usage:

The two instances recorded in the description are:

2020-07-08 10:11:06.889+0000: 61775: debug : virDomainBlockCommit:10517 : dom=0x7fafb405c910, (VM: name=instance-00000006, uuid=0eac9439-3876-4f2b-8026-20554a13137e), disk=vda, base=volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.362d13e8-9f6f-4f61-9e34-bf44ea626e7a, top=volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.a1ebcb3f-c1af-4daa-81d3-d59a71ea67f4, bandwidth=0, flags=0x8
2020-07-08 10:11:06.889+0000: 61775: error : virStorageFileChainLookup:1691 : invalid argument: could not find image 'volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.a1ebcb3f-c1af-4daa-81d3-d59a71ea67f4' in chain for '/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.362d13e8-9f6f-4f61-9e34-bf44ea626e7a'

and

2020-07-08 10:13:51.966+0000: 61779: debug : virDomainBlockRebase:10247 : dom=0x7fafa0025210, (VM: name=instance-00000006, uuid=0eac9439-3876-4f2b-8026-20554a13137e), disk=vda, base=volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.362d13e8-9f6f-4f61-9e34-bf44ea626e7a, bandwidth=0, flags=0x10
2020-07-08 10:13:51.966+0000: 61779: error : virStorageFileChainLookup:1687 : invalid argument: could not find image 'volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.362d13e8-9f6f-4f61-9e34-bf44ea626e7a' beneath '/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.362d13e8-9f6f-4f61-9e34-bf44ea626e7a' in chain for '/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.362d13e8-9f6f-4f61-9e34-bf44ea626e7a'


The '@base' and '@top' arguments of the virDomainBlockCommit and virDomainBlockRebase API calls are neither full paths to the image nor the indexed backing chain specifiers (vda[3]) required by the documentation:

The @disk parameter is either an unambiguous source name of the block device (the <source file='...'/> sub-element, such as "/path/to/image"), or the device target shorthand (the <target dev='...'/> sub-element, such as "vda"). Valid names can be found by calling virDomainGetXMLDesc() and inspecting elements within //domain/devices/disk.

The @base and @top parameters can be either paths to files within the backing chain, or the device target shorthand (the <target dev='...'/> sub-element, such as "vda") followed by an index to the backing chain enclosed in square brackets. Backing chain indexes can be found by inspecting //disk//backingStore/@index in the domain XML. Thus, for example, "vda[3]" refers to the backing store with index equal to "3" in the chain of disk "vda".

Note that these requirements did not change in libvirt recently.
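
The indexed specifier format described above can be derived mechanically from the domain XML. The following is a minimal Python sketch (not OSP or libvirt code) that walks the <source>/<backingStore> elements and builds the "vda[N]" strings; the disk XML is a trimmed copy of the one in step 2 of the description, with the long volume paths shortened as placeholders:

```python
import xml.etree.ElementTree as ET

# Trimmed disk XML from step 2 of the description; file names shortened.
DISK_XML = """
<disk type='file' device='disk'>
  <source file='vol.a1ebcb3f' index='3'/>
  <backingStore type='file' index='2'>
    <source file='vol.869f99ea'/>
    <backingStore type='file' index='1'>
      <source file='vol.362d13e8'/>
      <backingStore/>
    </backingStore>
  </backingStore>
  <target dev='vda' bus='virtio'/>
</disk>
"""

def chain_specifiers(disk_xml):
    """Walk the <source>/<backingStore> chain and return the 'vda[N]'
    indexed specifiers that virDomainBlockCommit/virDomainBlockRebase
    accept for @base and @top, paired with each image's file name."""
    disk = ET.fromstring(disk_xml)
    dev = disk.find('target').get('dev')
    specs = []
    node = disk
    while node is not None:
        src = node.find('source')
        if src is None or src.get('file') is None:
            break  # empty <backingStore/> terminates the chain
        # The top image carries index on <source>, backing images on
        # their <backingStore> element.
        idx = src.get('index') or node.get('index')
        specs.append((f"{dev}[{idx}]", src.get('file')))
        node = node.find('backingStore')
    return specs
```

For the chain in step 2 this yields vda[3] for the active image, vda[2] and vda[1] for the two backing images; passing such a specifier (or the full path from the domain XML) as @base/@top would have satisfied the lookup that fails in steps 8 and 9.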

Comment 3 Luigi Toscano 2020-07-09 14:32:18 UTC
Is this related to bug 1854796, in the same environment? Please also share the cinder logs.

Comment 4 chhu 2020-07-10 01:14:01 UTC
Created attachment 1700505 [details]
cinder.conf, cinder-*.log

Comment 5 chhu 2020-07-10 01:21:01 UTC
(In reply to Luigi Toscano from comment #3)
> It this related to bug 1854796 , in the same environment? Please share also
> the cinder logs.

Yes, it's related to bug 1854796. In bug 1854796, the volume gets deleted when trying to delete a volume snapshot (one created by snapshotting a shut-off VM).
In this bug, without any delete-volume-snapshot operation, we create a running-VM snapshot, shut down the VM, take a shut-down-VM snapshot, and start the VM;
deleting the running/shut-down VM snapshots then fails.

Comment 6 Peter Krempa 2020-07-10 06:12:18 UTC
From libvirt's point of view they are not related at all. In this one, wrong arguments are passed to the block job APIs, while in bug 1854796 a storage file goes missing.

Comment 7 chhu 2020-07-11 02:48:29 UTC
Let's involve a nova-compute developer to do further debugging.
I'll change the Component to nova-compute; please feel free to change it back if that's not correct. Thank you!

Comment 8 Peter Krempa 2020-07-13 13:09:12 UTC
(In reply to Peter Krempa from comment #2)
> The debug logs show improper libvirt API usage:
> 
> The two instances recorded in the description are:
> 
> 2020-07-08 10:11:06.889+0000: 61775: debug : virDomainBlockCommit:10517 :
> dom=0x7fafb405c910, (VM: name=instance-00000006,
> uuid=0eac9439-3876-4f2b-8026-20554a13137e), disk=vda,
> base=volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.362d13e8-9f6f-4f61-9e34-
> bf44ea626e7a,
> top=volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.a1ebcb3f-c1af-4daa-81d3-
> d59a71ea67f4, bandwidth=0, flags=0x8
> 2020-07-08 10:11:06.889+0000: 61775: error : virStorageFileChainLookup:1691
> : invalid argument: could not find image
> 'volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.a1ebcb3f-c1af-4daa-81d3-
> d59a71ea67f4' in chain for
> '/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-
> 4dbc-af73-6b2a8a1dc457.362d13e8-9f6f-4f61-9e34-bf44ea626e7a'
> 
> and
> 
> 2020-07-08 10:13:51.966+0000: 61779: debug : virDomainBlockRebase:10247 :
> dom=0x7fafa0025210, (VM: name=instance-00000006,
> uuid=0eac9439-3876-4f2b-8026-20554a13137e), disk=vda,
> base=volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.362d13e8-9f6f-4f61-9e34-
> bf44ea626e7a, bandwidth=0, flags=0x10
> 2020-07-08 10:13:51.966+0000: 61779: error : virStorageFileChainLookup:1687
> : invalid argument: could not find image
> 'volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.362d13e8-9f6f-4f61-9e34-
> bf44ea626e7a' beneath
> '/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-
> 4dbc-af73-6b2a8a1dc457.362d13e8-9f6f-4f61-9e34-bf44ea626e7a' in chain for
> '/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-
> 4dbc-af73-6b2a8a1dc457.362d13e8-9f6f-4f61-9e34-bf44ea626e7a'
> 
> 
> The '@base' and '@top' arguments of the virDomainBlockCommit and
> virDomainBlockRebase API call are neither full paths to the image nor the
> indexed backing chain (vda[3]) specifiers as required by the documentation:
> 
> The @disk parameter is either an unambiguous source name of the block device
> (the <source file='...'/> sub-element, such as "/path/to/image"), or the
> device target shorthand (the <target dev='...'/> sub-element, such as
> "vda"). Valid names can be found by calling virDomainGetXMLDesc() and
> inspecting elements within //domain/devices/disk.

The argument may actually also be the full relative path present in the qemu image metadata.

The problem actually lies in the image itself. The qemu process is started with:

-blockdev '{"driver":"file","filename":"/var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.362d13e8-9f6f-4f61-9e34-bf44ea626e7a","aio":"native","node-name":"libvirt-1-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"}' 
-blockdev '{"node-name":"libvirt-1-format","read-only":false,"cache":{"direct":true,"no-flush":false},"driver":"qcow2","file":"libvirt-1-storage","backing":null}' 
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=libvirt-1-format,id=virtio-disk0,bootindex=1,write-cache=on,serial=5fd5a675-1171-4dbc-af73-6b2a8a1dc457 

and no other -blockdev command line. This means that the image doesn't even have a backing image
(detected from the qcow2 metadata; it can be queried manually via:

 qemu-img info /var/lib/nova/mnt/07b80119d40eda06c63650e0d74e0ba5/volume-5fd5a675-1171-4dbc-af73-6b2a8a1dc457.362d13e8-9f6f-4f61-9e34-bf44ea626e7a).

As such, there is nothing for libvirt to commit.


> 
> The @base and @top parameters can be either paths to files within the
> backing chain, or the device target shorthand (the <target dev='...'/>
> sub-element, such as "vda") followed by an index to the backing chain
> enclosed in square brackets. Backing chain indexes can be found by
> inspecting //disk//backingStore/@index in the domain XML. Thus, for example,
> "vda[3]" refers to the backing store with index equal to "3" in the chain of
> disk "vda".
> 
> Note that these requirements did not change in libvirt recently.
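
The backing-image detection Peter describes (what qemu-img info reports and what libvirt reads from the qcow2 metadata) can be sketched in a few lines. This is an illustration only, not libvirt's actual code; the header field layout (magic "QFI\xfb" at byte 0, backing_file_offset as a big-endian u64 at byte 8, backing_file_size as a u32 at byte 16) comes from the QCOW2 image format specification, and the function name is hypothetical:

```python
import struct

def qcow2_backing_file(header_bytes):
    """Return the backing file name recorded in a qcow2 header, or None
    if the image has no backing image (the situation in this bug, where
    there is nothing for libvirt to commit).

    Per the QCOW2 spec: bytes 0-3 magic 'QFI\\xfb', bytes 8-15
    backing_file_offset (u64), bytes 16-19 backing_file_size (u32),
    all big-endian."""
    magic, _version = struct.unpack_from('>4sI', header_bytes, 0)
    if magic != b'QFI\xfb':
        raise ValueError('not a qcow2 image')
    offset, size = struct.unpack_from('>QI', header_bytes, 8)
    if offset == 0 or size == 0:
        return None  # no backing image recorded in the metadata
    return header_bytes[offset:offset + size].decode()
```

A backing_file_offset of zero is exactly the "backing":null case visible on the -blockdev command line above: the active image's metadata names no parent, so the chain libvirt builds has depth 1 and any @base/@top pointing below it cannot resolve.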

