Bug 1191901 - live external disk-only snapshot deadlocks with lock manager
Summary: live external disk-only snapshot deadlocks with lock manager
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.1
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Peter Krempa
QA Contact: Han Han
URL:
Whiteboard:
Duplicates: 1272384
Depends On:
Blocks: 1288337 1304579 1401400
 
Reported: 2015-02-12 08:24 UTC by Yang Yang
Modified: 2017-08-02 01:25 UTC
CC List: 11 users

Fixed In Version: libvirt-3.0.0-1.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-08-01 17:06:41 UTC
Target Upstream Version:
Embargoed:


Attachments
/var/log/libvirt/libvirtd.log (10.40 KB, text/plain)
2015-02-12 08:24 UTC, Yang Yang


Links
Red Hat Bugzilla 1403691 (high, CLOSED): Snapshot fail trying to add an existing sanlock lease (last updated 2021-02-22 00:41:40 UTC)
Red Hat Product Errata RHEA-2017:1846 (normal, SHIPPED_LIVE): libvirt bug fix and enhancement update (last updated 2017-08-01 18:02:50 UTC)

Internal Links: 1403691

Description Yang Yang 2015-02-12 08:24:45 UTC
Created attachment 990797 [details]
/var/log/libvirt/libvirtd.log

Description of problem:
Libvirtd hangs when creating a live external disk-only snapshot with lockd enabled.

Version-Release number of selected component (if applicable):
libvirt-1.2.8-16.el7.x86_64
qemu-kvm-rhev-2.1.2-23.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. # grep lock_ /etc/libvirt/qemu.conf
lock_manager = "lockd"

# vim /etc/libvirt/qemu-lockd.conf
auto_disk_leases = 1
require_lease_for_disks = 1
file_lockspace_dir = "/var/lib/libvirt/lockd/files"

# systemctl start virtlockd
# systemctl restart libvirtd
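
Before starting the guest it is worth a quick sanity check that both daemons came up; a minimal check, assuming the systemd units used above:

# systemctl is-active virtlockd libvirtd
active
active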

2. Prepare a VM with the following disk XML:
<disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/vm1.qcow2'/>
      <backingStore/>
      <target dev='hda' bus='ide'/>
      <alias name='ide0-0-0'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>

3. Start the VM
# virsh start vm3
Domain vm3 started

4. Create a snapshot
# virsh snapshot-create-as vm3 s1 --disk-only
2015-02-12 07:19:33.047+0000: 8722: info : libvirt version: 1.2.8, package: 16.el7 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2015-01-28-08:29:06, x86-020.build.eng.bos.redhat.com)
2015-02-12 07:19:33.047+0000: 8722: warning : virKeepAliveTimerInternal:143 : No response from client 0x7fb1de34a6c0 after 6 keepalive messages in 35 seconds
2015-02-12 07:19:33.047+0000: 8721: warning : virKeepAliveTimerInternal:143 : No response from client 0x7fb1de34a6c0 after 6 keepalive messages in 35 seconds
error: internal error: received hangup / error event on socket

5. Check the snapshot file
# ll /var/lib/libvirt/images/vm1.s1
-rw-------. 1 qemu qemu 2097152 Feb 12 15:18 /var/lib/libvirt/images/vm1.s1

# qemu-img info /var/lib/libvirt/images/vm1.s1 --backing-chain
image: /var/lib/libvirt/images/vm1.s1
file format: qcow2
virtual size: 5.0G (5368709120 bytes)
disk size: 2.3M
cluster_size: 65536
backing file: /var/lib/libvirt/images/vm1.qcow2
backing file format: qcow2
Format specific information:
    compat: 1.1
    lazy refcounts: false

image: /var/lib/libvirt/images/vm1.qcow2
file format: qcow2
virtual size: 5.0G (5368709120 bytes)
disk size: 2.5G
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false

6. Check the locks
# ll /var/lib/libvirt/lockd/files/
total 0
-rw-------. 1 root root 0 Feb 12 15:18 dd69b405390b9e0827b36699faef298e24d025410e6d4e43f762e3e7077489b4
-rw-------. 1 root root 0 Feb 12 15:17 f2f4417b8fe812d19e40421ca4d46765c702e950c54a3a74e967e48dc9812492
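
For reference, these lease file names can be mapped back to disk paths by hand. A minimal sketch, assuming lockd's default behaviour of naming each lease after the SHA-256 digest of the fully qualified image path (as the 64-hex-digit names above suggest):

# echo -n /var/lib/libvirt/images/vm1.qcow2 | sha256sum
# echo -n /var/lib/libvirt/images/vm1.s1 | sha256sum

The two digests should correspond to the two lease files listed above, one for the original image and one for the new snapshot overlay.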


Actual results:
Libvirtd hangs when creating the external disk-only snapshot; however, the snapshot file is created and locked.

Expected results:
The live external snapshot is created successfully.

Additional info:
It seems that the snapshot file's lock is acquired in the wrong order:

2015-02-12 07:18:03.162+0000: 7187: error : virNetClientProgramDispatchError:177 : resource busy Lockspace resource 'dd69b405390b9e0827b36699faef298e24d025410e6d4e43f762e3e7077489b4' is locked
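
To see which lease files virtlockd is actually holding at the moment of the hang, the kernel's advisory-lock table can be inspected; a hedged sketch (lslocks is part of util-linux, and the -p filter limits the output to the virtlockd process):

# lslocks -p $(pidof virtlockd)

A lease file that shows up here with a write lock is currently held; the "resource busy" error above indicates an attempt to acquire a lease that is already held.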

Comment 1 piotr.rybicki 2015-09-22 09:58:59 UTC
+1 to this bug report.

The bug still exists in libvirt-1.2.19.

Comment 3 Han Han 2015-11-20 07:41:45 UTC
The bug can also be reproduced with sanlock.
Version:
sanlock-3.2.4-1.el7.x86_64
qemu-kvm-rhev-2.3.0-31.el7.x86_64
libvirt-1.2.17-13.el7.x86_64

Steps:
1. # grep lock_ /etc/libvirt/qemu.conf
lock_manager = "sanlock"
# vim /etc/libvirt/qemu-sanlock.conf
auto_disk_leases = 1
disk_lease_dir = "/var/lib/libvirt/sanlock"
host_id = 1
user = "sanlock"
group = "sanlock"
# vim /etc/sysconfig/sanlock
SANLOCKOPTS="-w 0"

2. Restart sanlock and libvirtd
# systemctl restart sanlock; systemctl restart libvirtd

3. Prepare a guest n1 like this:
...
<disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/n1.qcow2'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </disk>
...
4. Create external snapshots:
# for i in s{1..3} ;do virsh snapshot-create-as n1 $i --disk-only --diskspec vda,file=/tmp/n1.$i;sleep 2;done
Libvirtd hangs when creating the external disk-only snapshot, and the snapshot file is created.
# ll /tmp/n1.s1 -SH
-rw-------. 1 qemu qemu 4.7M Nov 20 15:40 /tmp/n1.s1
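
While libvirtd is stuck, the sanlock daemon itself can report which lockspaces and resources it currently holds; a quick check using the client interface shipped with the sanlock package:

# sanlock client status

The resources listed for the qemu process should include the image paths involved in the snapshot.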

Comment 4 Jaroslav Suchanek 2015-11-26 20:21:18 UTC
*** Bug 1272384 has been marked as a duplicate of this bug. ***

Comment 5 Matthias Leuffen 2016-02-11 14:10:46 UTC
Same here. Any suggestions and/or workarounds (other than disabling locking entirely)?

Comment 7 Han Han 2017-01-05 02:30:01 UTC
Hi Peter, the bug seems to be fixed in libvirt-2.0.0-10.el7_3.3.x86_64, since I cannot reproduce it with either comment 0 or comment 3.
Please check whether the patches for BZ1406765 fix this bug. Thanks.

Comment 8 Peter Krempa 2017-01-05 09:05:21 UTC
There are a few more cases that need fixing, notably with automatic disk locking. I've posted a series to fix some of them: https://www.redhat.com/archives/libvir-list/2016-December/msg00809.html

Comment 9 Peter Krempa 2017-01-10 18:13:10 UTC
commit f61e40610d790836f3af2393ae7d77843f03f378
Author: Peter Krempa <pkrempa>
Date:   Fri Dec 16 15:45:26 2016 +0100

    qemu: snapshot: Properly handle image locking
    
    Images that became the backing chain of the current image due to the
    snapshot need to be unlocked in the lock manager. Also if qemu was
    paused during the snapshot the current top level images need to be
    released until qemu is resumed so that they can be acquired properly.
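
A hedged way to exercise the paused-domain path the commit mentions, reusing the vm3 reproducer from comment 0 (the snapshot name s2 is arbitrary):

# virsh suspend vm3
# virsh snapshot-create-as vm3 s2 --disk-only
# virsh resume vm3

With the fix applied, the leases on the current top-level images should be released while the domain stays paused and reacquired when it resumes, as the commit message describes.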

Comment 11 Han Han 2017-04-13 07:30:10 UTC
Verified on libvirt-3.2.0-2.el7.x86_64 and sanlock-3.4.0-1.el7.x86_64:
1. Set up the virtlockd environment as in comment 0
2. Start the VM and create snapshots
3. Set up the sanlock environment as in comment 3
4. Start the VM and create snapshots

Detailed steps:
1. For virtlockd:
+ DOM=V
+ prep_virtlockd
+ augtool set /files/etc/libvirt/qemu.conf/lock_manager lockd
Saved 1 file(s)
+ augtool set /files/etc/libvirt/qemu-lockd.conf/auto_disk_leases 1
Saved 1 file(s)
+ augtool set /files/etc/libvirt/qemu-lockd.conf/require_lease_for_disks 1
Saved 1 file(s)
+ augtool set /files/etc/libvirt/qemu-lockd.conf/file_lockspace_dir /var/lib/libvirt/lockd/files
Saved 1 file(s)
+ systemctl restart virtlockd
+ systemctl restart libvirtd
+ virsh start V
Domain V started

+ sleep 5
+ for i in 's{1..5}'
+ virsh snapshot-create-as V s1 --disk-only
Domain snapshot s1 created
+ for i in 's{1..5}'
+ virsh snapshot-create-as V s2 --disk-only
Domain snapshot s2 created
+ for i in 's{1..5}'
+ virsh snapshot-create-as V s3 --disk-only
Domain snapshot s3 created
+ for i in 's{1..5}'
+ virsh snapshot-create-as V s4 --disk-only
Domain snapshot s4 created
+ for i in 's{1..5}'
+ virsh snapshot-create-as V s5 --disk-only
Domain snapshot s5 created

2. For sanlock:
+ prep_sanlock
+ augtool set /files/etc/libvirt/qemu.conf/lock_manager sanlock
Saved 1 file(s)
+ augtool set /files/etc/libvirt/qemu-sanlock.conf/auto_disk_leases 1
Saved 1 file(s)
+ augtool set /files/etc/libvirt/qemu-sanlock.conf/host_id 1
Saved 1 file(s)
+ augtool set /files/etc/libvirt/qemu-sanlock.conf/disk_lease_dir /var/lib/libvirt/sanlock
Saved 1 file(s)
+ augtool set /files/etc/libvirt/qemu-sanlock.conf/user sanlock
Saved 1 file(s)
+ augtool set /files/etc/libvirt/qemu-sanlock.conf/group sanlock
Saved 1 file(s)
+ systemctl restart wdmd
Job for wdmd.service failed because a timeout was exceeded. See "systemctl status wdmd.service" and "journalctl -xe" for details.
+ systemctl restart sanlock
+ systemctl restart libvirtd
+ '[' 0 -ne 0 ']'
+ virsh start V
Domain V started

+ sleep 5
+ for i in 's{1..5}'
+ virsh snapshot-create-as V s1 --disk-only
Domain snapshot s1 created
+ for i in 's{1..5}'
+ virsh snapshot-create-as V s2 --disk-only
Domain snapshot s2 created
+ for i in 's{1..5}'
+ virsh snapshot-create-as V s3 --disk-only
Domain snapshot s3 created
+ for i in 's{1..5}'
+ virsh snapshot-create-as V s4 --disk-only
Domain snapshot s4 created
+ for i in 's{1..5}'
+ virsh snapshot-create-as V s5 --disk-only
Domain snapshot s5 created

Snapshots created. No deadlock.
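
As an optional follow-up, mirroring step 5 of the original report, the resulting backing chain can be inspected. The image path below is illustrative and assumes the default --disk-only naming seen in comment 0, where each overlay is created next to the original image with the snapshot name as the suffix:

# qemu-img info /var/lib/libvirt/images/V.s5 --backing-chain

The output should show the s5 overlay backed by s4, down through s1 to the original base image.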

Comment 12 errata-xmlrpc 2017-08-01 17:06:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1846

