Bug 1305793 - After failed external snapshot successive VM operations fail
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.8
Hardware: x86_64 Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Peter Krempa
QA Contact: Virtualization Bugs
Depends On: 1304579
Blocks:
Reported: 2016-02-09 04:05 EST by Peter Krempa
Modified: 2016-05-10 15:26 EDT
CC List: 8 users

See Also:
Fixed In Version: libvirt-0.10.2-57.el6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1304579
Environment:
Last Closed: 2016-05-10 15:26:08 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Peter Krempa 2016-02-09 04:05:34 EST
+++ This bug was initially created as a clone of Bug #1304579 +++

Description of problem:
As described in the summary.

Version-Release number of selected component (if applicable):
libvirt-0.10.2-56.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.487.el6.x86_64
libvirt-lock-sanlock-0.10.2-56.el6.x86_64
sanlock-2.8-2.el6_5.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Set up libvirt with sanlock on a host:
On host A
# setsebool sanlock_use_nfs 1 && setsebool virt_use_nfs 1 && setsebool virt_use_sanlock 1
# cat /etc/libvirt/qemu-sanlock.conf
auto_disk_leases = 1
disk_lease_dir = "/var/lib/libvirt/sanlock"
host_id = 1
user = "sanlock"
group = "sanlock"

# cat /etc/libvirt/qemu.conf
lock_manager = "sanlock"

# cat /etc/sysconfig/sanlock                                              
SANLOCKOPTS="-w 0"

# service wdmd restart; service sanlock restart; service libvirtd restart
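
The settings from step 1 can be sanity-checked before the services are restarted. The following is only a sketch: conf_get and check_sanlock_setup are hypothetical helper names, and the choice of which keys to check (lock_manager, auto_disk_leases, host_id) is an assumption taken from the files shown above, not a complete validation of a sanlock setup.

```shell
# conf_get FILE KEY: print the value of a `key = value` setting,
# with surrounding double quotes removed. Commented-out lines are ignored.
conf_get() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" \
        | sed 's/^"\(.*\)"[[:space:]]*$/\1/' \
        | tail -n 1
}

# check_sanlock_setup [DIR]: compare the files in DIR (default /etc/libvirt)
# with the values used in this report.
check_sanlock_setup() {
    dir=${1:-/etc/libvirt}
    [ "$(conf_get "$dir/qemu.conf" lock_manager)" = "sanlock" ] \
        || { echo "lock_manager is not sanlock"; return 1; }
    [ "$(conf_get "$dir/qemu-sanlock.conf" auto_disk_leases)" = "1" ] \
        || { echo "auto_disk_leases is not enabled"; return 1; }
    [ -n "$(conf_get "$dir/qemu-sanlock.conf" host_id)" ] \
        || { echo "host_id is not set"; return 1; }
    echo "sanlock configuration looks consistent"
}
```

Running check_sanlock_setup with no argument reads the files under /etc/libvirt; a different directory can be passed to check a staged copy of the configuration.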

2. Create a guest and take an external snapshot:
# cat guest.xml
<domain type='kvm' id='1'>
...
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <source file='/var/lib/libvirt/images/c2.qcow2'/>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </disk>
...
</domain>

# virsh create guest.xml
Domain cc created from guest.xml
# virsh list 
 Id    Name                           State
----------------------------------------------------
 6     cc                             running

# virsh snapshot-create-as cc s1 --disk-only --diskspec vda,file=/tmp/cc.s1                                                    
error: Failed to acquire lock: File exists
Snapshot file created but the snapshot failed.
# ll /tmp/cc.s1
-rw-------. 1 qemu qemu 2.2M Feb  4 10:17 /tmp/cc.s1
# virsh snapshot-list cc
 Name                 Creation Time             State
------------------------------------------------------------

When I remove the file and try to create the snapshot again:
# rm /tmp/cc.s1;virsh snapshot-create-as cc s1 --disk-only --diskspec vda,file=/tmp/cc.s1;
error: Timed out during operation: cannot acquire state change lock

At this point no other operation on the given VM is possible.
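
The two error messages above point at different conditions: the first is the lock manager refusing the lease, the second is the domain job never being released. A minimal sketch for telling them apart in a test script follows. classify_virsh_error is a hypothetical helper, and the matching relies only on the exact message texts quoted in this report.

```shell
# classify_virsh_error MESSAGE: map a virsh error message to a short label.
#   job-stuck    -> the domain job was not released (the state reported here)
#   lock-failure -> the lock manager refused the lease
#   other        -> anything else
classify_virsh_error() {
    case $1 in
        *"cannot acquire state change lock"*) echo "job-stuck" ;;
        *"Failed to acquire lock"*)           echo "lock-failure" ;;
        *)                                    echo "other" ;;
    esac
}

# Possible usage (not executed here):
#   err=$(virsh snapshot-create-as cc s1 --disk-only \
#             --diskspec vda,file=/tmp/cc.s1 2>&1 >/dev/null)
#   classify_virsh_error "$err"
```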
Comment 4 yangyang 2016-02-18 01:47:43 EST
Peter,
After a failed external system checkpoint snapshot, the VM is not resumed. Is that an acceptable result?

Verified on libvirt-0.10.2-57.el6.x86_64.

Steps are as follows:
1. Set up libvirt with sanlock on a host:
On host A
# setsebool sanlock_use_nfs 1 && setsebool virt_use_nfs 1 && setsebool virt_use_sanlock 1
# cat /etc/libvirt/qemu-sanlock.conf
auto_disk_leases = 1
disk_lease_dir = "/var/lib/libvirt/sanlock"
host_id = 1
user = "sanlock"
group = "sanlock"

# cat /etc/libvirt/qemu.conf
lock_manager = "sanlock"

# cat /etc/sysconfig/sanlock                                              
SANLOCKOPTS="-w 0"

Mount the NFS server on /mnt:
# mount
10.73.194.27:/vol/S3/libvirtauto/yy on /mnt

# service wdmd restart; service sanlock restart; service libvirtd restart

2. Start a guest with the following disk:
<disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <source file='/mnt/rhel6.qcow2'>
        <seclabel model='selinux' relabel='no'/>
      </source>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </disk>

3. Create an external disk-only snapshot:
# virsh snapshot-create-as yy s1 --disk-only --diskspec vda,file=/var/lib/libvirt/images/yy.s1
error: Failed to acquire lock: File exists

# ll /var/lib/libvirt/images/yy.s1
ls: cannot access /var/lib/libvirt/images/yy.s1: No such file or directory

# virsh dumpxml yy | grep disk -a6
 <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <source file='/mnt/rhel6.qcow2'>
        <seclabel model='selinux' relabel='no'/>
      </source>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </disk>
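
The check in step 3 (the snapshot fails and no overlay file is left behind) could be scripted roughly as below. This is a sketch under assumptions: try_disk_snapshot and the VIRSH variable are hypothetical names introduced here, and the return codes are arbitrary. Only the snapshot-create-as invocation itself is taken from the steps above.

```shell
# Command used to talk to libvirt; overridable so the logic can be
# exercised without a real hypervisor.
VIRSH=${VIRSH:-virsh}

# try_disk_snapshot DOMAIN SNAPNAME TARGET OVERLAY
# Returns 0 on success, 1 if the snapshot failed cleanly,
# 2 if it failed and the overlay file was left on disk.
try_disk_snapshot() {
    if $VIRSH snapshot-create-as "$1" "$2" --disk-only --diskspec "$3,file=$4"; then
        echo "snapshot $2 created"
    elif [ -e "$4" ]; then
        echo "snapshot failed, leftover overlay at $4"
        return 2
    else
        echo "snapshot failed, no overlay left behind"
        return 1
    fi
}

# Possible usage (not executed here):
#   try_disk_snapshot yy s1 vda /var/lib/libvirt/images/yy.s1
```

With the original report's behaviour the helper would return 2 (the file /tmp/cc.s1 existed after the failure); with the behaviour seen in step 3 it would return 1.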

4. Create the external disk-only snapshot once more:
# virsh snapshot-create-as yy s1 --disk-only --diskspec vda,file=/var/lib/libvirt/images/yy.s1
error: Failed to acquire lock: File exists

### At this point the domain job is unlocked ###

5. Create an external system checkpoint snapshot:
# virsh snapshot-create-as yy s1 --diskspec vda,file=/mnt/yy.s1 --memspec file=/mnt/yy.mem
error: Failed to acquire lock: File exists

# virsh dumpxml yy | grep disk -a6
<disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <source file='/mnt/yy.s1'>
        <seclabel model='selinux' relabel='no'/>
      </source>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </disk>

# virsh snapshot-list yy
 Name                 Creation Time             State
------------------------------------------------------------

# virsh list
 Id    Name                           State
----------------------------------------------------
 7     yy                             paused

### At this point the VM is not resumed after the failed snapshot ###
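
The dumpxml output in step 5 shows the disk source pointing at /mnt/yy.s1 even though the snapshot command reported a failure. A small sketch for extracting that value so it can be compared before and after a snapshot attempt is given below. disk_sources is a hypothetical helper, and it assumes the single-quoted attribute style seen in the dumpxml output above.

```shell
# disk_sources: read domain XML on stdin and print every file-backed
# disk source path, one per line.
disk_sources() {
    sed -n "s/.*<source file='\([^']*\)'.*/\1/p"
}

# Possible usage (not executed here):
#   before=$(virsh dumpxml yy | disk_sources)
#   ... attempt the snapshot ...
#   after=$(virsh dumpxml yy | disk_sources)
#   [ "$before" = "$after" ] || echo "disk source changed: $before -> $after"
```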
Comment 5 Peter Krempa 2016-02-18 01:57:53 EST
(In reply to yangyang from comment #4)
> Peter,
> After a failed external system checkpoint snapshot, the VM is not resumed.
> Is that an acceptable result?

Yes, that is expected. The failure was triggered by not being able to resume the VM due to a failed locking attempt. At that point the VM can either be restarted later or killed via the 'destroy' API.
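
As an illustration of the reply above, domains left paused can be picked out of the virsh list output and then resumed or destroyed by hand. This is only a sketch: paused_domains is a hypothetical helper that assumes the three-column table layout shown in this report, while virsh resume and virsh destroy are the standard virsh commands for the two options.

```shell
# paused_domains: read `virsh list` output on stdin and print the names
# of domains whose state is "paused". The first two lines (column header
# and separator) are skipped.
paused_domains() {
    awk 'NR > 2 && $NF == "paused" { print $2 }'
}

# Possible usage (not executed here):
#   for dom in $(virsh list | paused_domains); do
#       virsh resume "$dom" \
#           || echo "resume of $dom failed; 'virsh destroy $dom' would kill it"
#   done
```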
Comment 6 yangyang 2016-02-18 02:16:06 EST
Thanks for the quick response, Peter. Moving this to VERIFIED.
Comment 8 errata-xmlrpc 2016-05-10 15:26:08 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0738.html
