Red Hat Bugzilla – Bug 799986
libvirtd should explicitly check for existence of configured sanlock directory before trying to register lockspace
Last modified: 2013-02-21 02:08:32 EST
Created attachment 567652
Patch adding dependency on sanlock
Description of problem:
If libvirtd is configured to use sanlock locking, libvirtd must be started after sanlock. Otherwise libvirtd fails to start.
The attached patch is for the following package version ...
Version-Release number of selected component (if applicable):
libvirtd daemon fails to start:
15:10:17.036: 3004: info : libvirt version: 0.9.4, package: 23.el6_2.6 (CentOS BuildSystem <http://bugs.centos.org>, 2012-03-01-10:07:12, c6b6.bsys.dev.centos.org)
15:10:17.036: 3004: error : virLockManagerSanlockSetupLockspace:241 : Unable to add lockspace /mnt/shared/sanlock/__LIBVIRT__DISKS__: No such file or directory
15:10:17.196: 3004: error : qemudLoadDriverConfig:457 : Failed to load lock manager sanlock
15:10:17.197: 3004: error : qemudStartup:566 : Missing lock manager implementation
15:10:17.198: 3004: error : virStateInitialize:849 : Initialization of QEMU state driver failed
15:10:17.274: 3004: error : daemonRunStateInit:1149 : Driver state initialization failed
Expected results: sanlock starts first, then libvirtd, and everything succeeds.
I can confirm this problem.
1. My proposal (tested in my environment) for the init.d start/stop priorities is:
wdmd: # chkconfig: 2345 90 10
sanlock: # chkconfig: 2345 91 09
2. sanlock has to touch "/var/lock/subsys/sanlock".
Without it, a clean shutdown fails: sanlock keeps running (/etc/rc.d/rc will not kill sanlock), so the gfs2 service can't unmount.
3. /etc/init.d/libvirtd has to check that the shared filesystem is mounted:
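# Assumes the default config path /etc/libvirt/qemu.conf
if grep -Eq '^[[:space:]]*lock_manager[[:space:]]*=[[:space:]]*"sanlock"' /etc/libvirt/qemu.conf; then
    # The marker file 'shared' is only visible when the shared filesystem is mounted
    if [ ! -f /var/lib/libvirt/sanlock/shared ]; then
        echo "SHARED FILESYSTEM IS NOT MOUNTED or is not marked by file 'shared'!!!"
        echo "Check it and, if mounted, mark it with the command: touch shared"
    fi
fi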
Sorry for my poor English :-)
Please escalate this bug!
There are multiple problems in this BZ report that relate to sanlock itself:
- sanlock not using /var/lock/subsys/sanlock
- sanlock & wdmd initscript priority is wrong
While I don't think the libvirtd initscript should be checking for the existence of /var/lib/libvirt/sanlock itself, we could improve the lock manager startup code to check this so you get a better error message in syslog.
Leaving this BZ to track improved libvirt directory checking. The other new BZs I mentioned above will track the sanlock initscript ordering problems.
I can reproduce this problem; it looks like a duplicate of bug 820173.
This issue is now fixed upstream by v0.10.0-rc0-205-g2560a51:
Author: Jiri Denemark <firstname.lastname@example.org>
Date: Tue Aug 21 15:27:10 2012 +0200
sanlock: Provide better error if lockspace directory is missing
Generating "Unable to add lockspace /lock/space/dir/__LIBVIRT__DISKS__:
No such file or directory" is correct but not exactly clear. This patch
changes the error message to "Unable to create lockspace
/lock/space/dir/__LIBVIRT__DISKS__: parent directory does not exist or
is not a directory".
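For readers who want to see the logic of that check in isolation, here is a minimal shell sketch. It is not the C code from the commit; the function name check_lockspace_dir is hypothetical, and only the error text is taken from the commit message above.

```shell
#!/bin/sh
# Sketch only: mirrors the check described in the commit message, not the
# actual libvirt implementation. check_lockspace_dir is a hypothetical name.
check_lockspace_dir() {
    lockspace_path=$1
    parent_dir=$(dirname "$lockspace_path")

    # An existing lockspace file needs no further checking in this sketch.
    if [ -e "$lockspace_path" ]; then
        return 0
    fi

    # The lockspace is missing, so it would have to be created. Verify that
    # its parent directory exists and really is a directory first.
    if [ ! -d "$parent_dir" ]; then
        echo "Unable to create lockspace $lockspace_path: parent directory does not exist or is not a directory" >&2
        return 1
    fi

    return 0
}
```

Calling check_lockspace_dir /var/lib/libvirt/sanlock/__LIBVIRT__DISKS__ on a host where /var/lib/libvirt/sanlock is absent should print the clearer message to stderr and return a non-zero status.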
I can reproduce this issue on libvirt-0.10.0-0rc0.el6, and it is fixed on libvirt-0.10.0-0rc1.el6, where I get the expected error in libvirtd.log:
2012-08-23 09:17:42.331+0000: 26264: debug : virLockManagerSanlockSetupLockspace:178 : Lockspace /var/lib/libvirt/sanlock/__LIBVIRT__DISKS__ does not yet exist
2012-08-23 09:17:42.331+0000: 26264: error : virLockManagerSanlockSetupLockspace:188 : internal error Unable to create lockspace /var/lib/libvirt/sanlock/__LIBVIRT__DISKS__: parent directory does not exist or is not a directory
A simple verification method is to remove '/var/lib/libvirt/sanlock' if it exists, then set up a basic lock manager configuration in qemu.conf and qemu-sanlock.conf, and restart the libvirtd service.
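The final step of that verification, checking the log for the new message, could be scripted roughly as follows. This is only a sketch: log_has_lockspace_error is a hypothetical helper, and the caller supplies the log path, since the location of libvirtd.log depends on the local logging configuration.

```shell
#!/bin/sh
# Sketch only: log_has_lockspace_error is a hypothetical helper name.
# The log file path is passed in by the caller.
log_has_lockspace_error() {
    log_file=$1
    # -F matches the message as a fixed string; -q reports only via exit status.
    grep -Fq "parent directory does not exist or is not a directory" "$log_file"
}
```

After restarting libvirtd, something like log_has_lockspace_error /path/to/libvirtd.log && echo "new error message found" would confirm that the improved message was logged.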
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.