Bug 1165119

Summary: libvirt should provide more specific error when failed to acquire the lock using sanlock
Product: Red Hat Enterprise Linux 7
Reporter: Jiri Moskovcak <jmoskovc>
Component: libvirt
Assignee: Jiri Denemark <jdenemar>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: unspecified
Priority: unspecified
Version: 7.1
CC: dfediuck, dyuan, lhuang, rbalakri, shyu, xuzhang, yanyang, zhwang
Target Milestone: rc
Hardware: Unspecified
OS: Unspecified
Fixed In Version: libvirt-1.2.15-1.el7
Doc Type: Bug Fix
Last Closed: 2015-11-19 05:56:13 UTC
Type: Bug
Bug Blocks: 1093704

Description Jiri Moskovcak 2014-11-18 11:38:38 UTC
Description of problem:
When libvirt fails to start a VM because it fails to acquire the lock, it returns VIR_ERR_INTERNAL_ERROR, which is too broad to be handled in any meaningful way.

Version-Release number of selected component (if applicable):
libvirt-0.10.2-46.el6_6.1.x86_64

How reproducible:
100%


Comment 4 Jiri Denemark 2015-04-14 14:20:17 UTC
Currently, libvirt would report "internal error Failed to acquire lock: error -243"

Comment 5 Jiri Denemark 2015-04-14 14:41:45 UTC
Patch sent upstream for review: https://www.redhat.com/archives/libvir-list/2015-April/msg00585.html

Comment 6 Jiri Denemark 2015-04-15 07:49:32 UTC
Fixed upstream by v1.2.14-161-g4864e37:

commit 4864e377c9a6ef08cd65672775e520751a27f6d7
Author: Jiri Denemark <jdenemar>
Date:   Tue Apr 14 16:27:37 2015 +0200

    sanlock: Use VIR_ERR_RESOURCE_BUSY if sanlock_acquire fails
    
    When acquiring resource via sanlock fails, we would report it as
    VIR_ERR_INTERNAL_ERROR, which is not very friendly to applications using
    libvirt. Moreover, the lockd driver would report the same failure as
    VIR_ERR_RESOURCE_BUSY, which looks better.
    
    Unfortunately, in sanlock driver we don't really know if acquiring the
    resource failed because it was already locked or there was another
    reason behind. But the end result is the same and I think using
    VIR_ERR_RESOURCE_BUSY reason for all acquire failures is still better
    than what we have now.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1165119
    Signed-off-by: Jiri Denemark <jdenemar>
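
For applications that consume this error through the libvirt Python bindings, the distinction could be handled roughly as follows. This is a minimal sketch, not part of the patch: the helper name classify_start_failure and its return strings are hypothetical, and the numeric code is passed in by the caller so the sketch itself does not need the libvirt module.

```python
def classify_start_failure(exc, busy_code):
    """Return "lock-busy" if exc carries libvirt's resource-busy code, else "other".

    exc       -- exception raised while starting the domain
    busy_code -- numeric value of VIR_ERR_RESOURCE_BUSY, supplied by the
                 caller (hypothetical parameter, keeps this sketch standalone)
    """
    # libvirt.libvirtError exposes get_error_code(); other exceptions do not.
    get_code = getattr(exc, "get_error_code", None)
    code = get_code() if callable(get_code) else None
    if code is not None and code == busy_code:
        return "lock-busy"
    return "other"


# Hypothetical wiring with the libvirt Python bindings (not executed here):
#   import libvirt
#   dom = libvirt.open("qemu:///system").lookupByName("simple")
#   try:
#       dom.create()
#   except libvirt.libvirtError as exc:
#       kind = classify_start_failure(exc, libvirt.VIR_ERR_RESOURCE_BUSY)
```

As the commit message notes, the sanlock driver uses this code for every acquire failure, so "lock-busy" here only means that acquiring the lock failed, not necessarily that another host holds the lease.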

Comment 8 Yang Yang 2015-06-15 09:56:58 UTC
I can reproduce it on libvirt-1.2.8-10.el7.x86_64

Verified on libvirt-1.2.16-1.el7.x86_64

Steps
1. Prepare 2 hosts and edit the following configuration files on each
# grep "lock_manager" /etc/libvirt/qemu.conf 
lock_manager = "sanlock"

# vim /etc/libvirt/qemu-sanlock.conf

auto_disk_leases = 1
disk_lease_dir = "/var/lib/libvirt/sanlock"
host_id = 1    # use host_id = 2 on the second host
user = "sanlock"
group = "sanlock"

# getsebool -a 
sanlock_use_nfs --> on
virt_use_sanlock --> on
virt_use_nfs --> on

# mount 10.66.4.164:/yy/nfs /var/lib/libvirt/sanlock

# service sanlock start
# service libvirtd restart

2. Define and start a guest on both hosts with the following disk XML
<disk type='file' device='disk'>
  <driver name='qemu' type='raw'/>
  <source file='/var/lib/libvirt/sanlock/test.img'/>
  <target dev='vda' bus='virtio'/>
  <boot order='1'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
</disk>

Start the guest on host 1
# virsh start simple
Domain simple started

Start the guest on host 2
# virsh start simple
error: Failed to start domain simple
error: resource busy: Failed to acquire lock: error -243
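
A script that wraps virsh rather than calling the API could check for the new error prefix in the captured stderr. The sketch below is an assumption-laden illustration: the function name lock_busy_in_virsh_output is hypothetical, and it matches the English message text shown above, which may differ under other locales or libvirt versions.

```python
def lock_busy_in_virsh_output(stderr):
    """Return True if virsh stderr contains the resource-busy lock failure."""
    for line in stderr.splitlines():
        text = line.strip()
        # Match the prefix virsh prints for VIR_ERR_RESOURCE_BUSY together
        # with the lock failure message seen in this bug.
        if text.startswith("error: resource busy:") and "Failed to acquire lock" in text:
            return True
    return False


# Output copied from the verification run on host 2.
new_output = (
    "error: Failed to start domain simple\n"
    "error: resource busy: Failed to acquire lock: error -243\n"
)
print(lock_busy_in_virsh_output(new_output))
```

The API error code is the more robust signal; matching message text is only a fallback for tooling that has nothing but the virsh output.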

Comment 10 errata-xmlrpc 2015-11-19 05:56:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2202.html