Bug 905282

Summary: Lockfailure action Restart can shutdown the guest but fail to start it
Product: Red Hat Enterprise Linux 6
Reporter: Luwen Su <lsu>
Component: libvirt
Assignee: Jiri Denemark <jdenemar>
Status: CLOSED WONTFIX
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Docs Contact:
Priority: medium
Version: 6.4
CC: acathrow, ajia, cwei, dyuan, fsimonce, jdenemar, mzhan, shyu, teigland
Target Milestone: rc
Keywords: Upstream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 967494
Environment:
Last Closed: 2014-04-04 20:58:58 UTC
Type: Bug
Bug Depends On:    
Bug Blocks: 967494    

Description Luwen Su 2013-01-29 03:41:18 UTC
Description of problem:
This bug tracks the issue left unresolved in Bug 832156:
the lockfailure action Restart can shut down the guest but fails to start it again.
I have rewritten the reproduction steps for further testing, since both libvirt and sanlock have changed a lot.


Version-Release number of selected component (if applicable):
libvirt-0.10.2-17.el6.x86_64
sanlock.x86_64 0:2.6-2.el6      


How reproducible:
100%
Steps to Reproduce:
----------------------------
1. Libvirt configuration
----------------------------
# service libvirtd stop

# getsebool -a | grep sanlock
sanlock_use_fusefs --> off
sanlock_use_nfs --> on
sanlock_use_samba --> off
virt_use_sanlock --> on

# tail -5 /etc/libvirt/qemu-sanlock.conf 
user = "sanlock"
group = "sanlock"
host_id = 1
auto_disk_leases = 0
/* Automatic disk leases affect the lock-failure action, so disable them and create the lockspace and lease files manually. */
disk_lease_dir = "/var/lib/libvirt/sanlock"

#tail -1 /etc/libvirt/qemu.conf 
lock_manager = "sanlock"
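
The two settings that matter for this reproduction (lock_manager in qemu.conf and auto_disk_leases in qemu-sanlock.conf) can also be checked without reading the file tails by eye. A minimal sketch, assuming a helper named conf_value (hypothetical, not part of libvirt) and the simple `key = value` layout shown above:

```shell
#!/bin/sh
# conf_value FILE KEY
# Hypothetical helper: prints the value of the last uncommented
# "KEY = VALUE" line in FILE, with double quotes stripped.
# Lines that start with '#' do not match, so commented defaults are ignored.
conf_value() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" \
        | tail -n 1 \
        | tr -d '"'
}
```

With the configuration above, `conf_value /etc/libvirt/qemu.conf lock_manager` should print sanlock and `conf_value /etc/libvirt/qemu-sanlock.conf auto_disk_leases` should print 0.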
--------------------------------
2. Sanlock configuration
-----------------------------------
/*create lockspace file*/
#truncate -s 1M /var/lib/libvirt/sanlock/TEST_LS
#sanlock direct init -s TEST_LS:0:/var/lib/libvirt/sanlock/TEST_LS:0

/*change ownership to sanlock, since the default owner is root*/
#chown sanlock:sanlock /var/lib/libvirt/sanlock/TEST_LS

/*add lockspace to sanlock */
#service wdmd start
#service sanlock start
#sanlock client add_lockspace -s TEST_LS:1:/var/lib/libvirt/sanlock/TEST_LS:0

/*you can use this command to check the status*/
#sanlock client host_status -s TEST_LS

/*create lease file*/
#truncate -s 1M /var/lib/libvirt/sanlock/test-disk-resource-lock
#sanlock direct init -r TEST_LS:test-disk-resource-lock:/var/lib/libvirt/sanlock/test-disk-resource-lock:0

/*change permission to sanlock*/
#chown sanlock:sanlock /var/lib/libvirt/sanlock/test-disk-resource-lock
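
The colon-separated arguments used in this step are easy to mistype. The sketch below builds them from their parts, using only the layout visible in the commands above (NAME:HOST_ID:PATH:OFFSET for a lockspace, LOCKSPACE:RESOURCE:PATH:OFFSET for a resource). The function names lockspace_spec and resource_spec are hypothetical, and rejecting relative paths is a guard chosen for this sketch, not a documented sanlock rule.

```shell
#!/bin/sh
# Hypothetical helpers; only the colon-separated layout is taken from
# the sanlock commands in this report.

# require_abs_path PATH: fail unless PATH starts with '/'.
require_abs_path() {
    case "$1" in
        /*) return 0 ;;
        *)  echo "path must be absolute: $1" >&2; return 1 ;;
    esac
}

# lockspace_spec NAME HOST_ID PATH [OFFSET]
# Prints NAME:HOST_ID:PATH:OFFSET (OFFSET defaults to 0).
lockspace_spec() {
    require_abs_path "$3" || return 1
    printf '%s:%s:%s:%s\n' "$1" "$2" "$3" "${4:-0}"
}

# resource_spec LOCKSPACE RESOURCE PATH [OFFSET]
# Prints LOCKSPACE:RESOURCE:PATH:OFFSET (OFFSET defaults to 0).
resource_spec() {
    require_abs_path "$3" || return 1
    printf '%s:%s:%s:%s\n' "$1" "$2" "$3" "${4:-0}"
}
```

For example, `sanlock client add_lockspace -s "$(lockspace_spec TEST_LS 1 /var/lib/libvirt/sanlock/TEST_LS)"` passes the same kind of string as the add_lockspace call in this step.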
-----------------------
3. Guest XML setting
---------------------
Take a guest that has been shut down normally and add the following to its XML:
....
<on_lockfailure>restart</on_lockfailure>
....
<lease>
<lockspace>TEST_LS</lockspace>
<key>test-disk-resource-lock</key>
<target path='/var/lib/libvirt/sanlock/test-disk-resource-lock'/>
</lease>
...

#service libvirtd restart
#virsh start $domain

-----------------------------------------------------
4. Remove the lockspace while the guest is using it
-------------------------------------------------------
#sanlock client rem_lockspace -s TEST_LS:1:/var/lib/libvirt/sanlock/TEST_LS:0
/*If there is a better way to make the lockspace "lost", please correct me. Thanks.*/

#virsh domstate $domain
shutdown

The guest does not start again.


Actual results:
The guest is shut down and does not come back up.

Expected results:
The guest is restarted and comes back up.

Additional info:
For the discussion, refer to Bug 905280 and Bug 832156.

Comment 5 Jiri Denemark 2014-03-25 08:24:46 UTC
This is fixed upstream by v1.2.2-341-g2cc27c3:

commit 2cc27c34befd1922878be724f540b5578c3d492c
Author: Jiri Denemark <jdenemar>
Date:   Mon Mar 24 14:23:09 2014 +0100

    sanlock: Forbid VIR_DOMAIN_LOCK_FAILURE_RESTART
    
    https://bugzilla.redhat.com/show_bug.cgi?id=905282
    https://bugzilla.redhat.com/show_bug.cgi?id=967494
    
    When lock failure is detected by sanlock, our sanlock_helper kill script
    will try to restart (shutdown followed by start) the affected domain
    when RESTART action is configured for it. While shutting down kills QEMU
    and removes all its leases (which is what sanlock wants to happen),
    trying to start it again just hangs because libvirt tries reacquire the
    locks in the failed lock space. Hence, this action cannot be supported
    by sanlock driver.
    
    Signed-off-by: Jiri Denemark <jdenemar>
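
Because the commit above forbids the restart action for the sanlock driver, guests that still carry it presumably need a different on_lockfailure value. A minimal sketch for spotting them, assuming a helper named uses_restart_on_lockfailure (hypothetical) that reads domain XML on standard input and matches the element with a plain pattern rather than a real XML parser:

```shell
#!/bin/sh
# uses_restart_on_lockfailure
# Hypothetical helper: reads domain XML on stdin and exits 0 if it
# contains <on_lockfailure>restart</on_lockfailure>, non-zero otherwise.
# This is a text match, so it assumes the element sits on a single line,
# as it does in the guest XML shown in this report.
uses_restart_on_lockfailure() {
    grep -Eq '<on_lockfailure>[[:space:]]*restart[[:space:]]*</on_lockfailure>'
}
```

A possible use is `virsh dumpxml "$domain" | uses_restart_on_lockfailure && echo "$domain uses restart"`, run once per domain name.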

Comment 7 RHEL Program Management 2014-04-04 20:58:58 UTC
Development Management has reviewed and declined this request.
You may appeal this decision by reopening this request.