Red Hat Bugzilla – Bug 187812
clvmd init script should time out and fail if there's no quorate cluster
Last modified: 2010-01-11 23:04:45 EST
Description of problem:
This is somewhat related to other init script bugs (168698 and 181817).
If clvmd can talk to cman but there's no quorate cluster, then the clvmd init
script will hang indefinitely when starting OR stopping.
[root@link-08 ~]# cat /proc/cluster/nodes
Node Votes Exp Sts Name
1 1 3 X link-01
2 1 3 M link-08
[root@link-08 ~]# service clvmd start
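For anyone reproducing this, a rough way to tell from a node table like the one above whether the cluster is quorate. This is only a sketch: the function name is_quorate is made up, and the rule it applies (votes of nodes in state M must reach half the expected votes plus one) is an assumption about how cman computes quorum, not something taken from this report.

```shell
#!/bin/sh
# Hypothetical helper, not part of any shipped init script.
# Reads a node table (Node Votes Exp Sts Name) on stdin and exits 0
# if the member votes reach the assumed quorum threshold.
is_quorate() {
    awk 'NR > 1 && NF >= 5 {
             expected = $3                  # expected votes column
             if ($4 == "M") votes += $2     # count votes of members only
         }
         END {
             # Assumed rule: quorum = int(expected / 2) + 1
             exit !(expected > 0 && votes >= int(expected / 2) + 1)
         }'
}
```

On a node with the /proc interface shown above this could be called as `is_quorate < /proc/cluster/nodes`. For the table in this report (1 member vote, 3 expected) it exits non-zero.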
Version-Release number of selected component (if applicable):
[root@link-08 ~]# rpm -q lvm2-cluster
cfeist: is there any chance you could do some magic in the init scripts for this?
Actually fixing it in clvmd is going to require rather more code than would be
healthy in an update release. The need to support cman & gulm makes it rather
Added to which, it's actually impossible to shut down clvmd on an inquorate
cluster (and why would you want to?) because it needs to talk to the lock manager.
More thoughts on this:
Failing the script because of an inquorate cluster is really the wrong thing to
do. If you start clvmd on an inquorate cluster then it stays started. Returning
a failure is incorrect because the daemon /is/ started. (Surely failure of the
start script implies the daemon has not started.)
Similarly with shutdown. Initiating shutdown of clvmd has succeeded even if the
daemon has not actually stopped yet, because it /will/ stop when the cluster
IMHO it would be more appropriate to return success for both these situations
because they have succeeded in doing what was requested of them.
As long as it doesn't hang forever, I'm on board. :)
Oh good grief, the init script also enables & disables volumes! There's /no/ way
that's going to work on an inquorate cluster.
This is going to need some serious init-script mucking about if it's even
possible to do anything sensible at all. What should happen? Do we not
activate/deactivate the volumes? If not then when do they get
IMHO the best thing to do with this is just to background the whole damn script
- if that's possible with an init script.
I had a brainwave.
clvmd now has an optional startup timeout (switch -T, see the man page for
details). In the script I have set this to 20 seconds. If clvmd doesn't get
started within this time then the script will exit without attempting to
activate any logical volumes.
It's important to note that if this does happen clvmd HAS been started and will
start operations as soon as the cluster is quorate...but this does NOT include
activating the logical volumes that got missed by this script.
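Roughly, the start logic described above looks like the sketch below. This is an illustration, not the checked-in clvmd_init_rhel4: the function and variable names are invented, it assumes clvmd exits non-zero when the -T timeout expires, and it uses a plain "vgchange -ay" to stand in for whatever activation the real script performs.

```shell
#!/bin/sh
# Hypothetical sketch of the timeout handling, not the shipped script.
CLVMD_TIMEOUT=20               # seconds, the value mentioned above
CLVMD=${CLVMD:-clvmd}          # overridable so the logic can be tried without a cluster
VGCHANGE=${VGCHANGE:-vgchange}

start_clvmd() {
    # -T is the startup timeout switch; a non-zero exit status on
    # timeout is an assumption of this sketch.
    if ! "$CLVMD" -T"$CLVMD_TIMEOUT"; then
        echo "clvmd startup timed out" >&2
        # The daemon may still be running and waiting for quorum, but
        # the logical volumes are NOT activated from here.
        return 1
    fi
    # Daemon is up on a quorate cluster: activate the volumes.
    "$VGCHANGE" -ay
}
```

Whether the timeout path should return failure or success is the judgement call discussed earlier in this bug; the sketch returns 1 only to make the skipped activation visible to the caller.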
Checking in daemons/clvmd/clvmd.c;
/cvs/lvm2/LVM2/daemons/clvmd/clvmd.c,v <-- clvmd.c
new revision: 1.30; previous revision: 1.29
Checking in scripts/clvmd_init_rhel4;
/cvs/lvm2/LVM2/scripts/clvmd_init_rhel4,v <-- clvmd_init_rhel4
new revision: 1.11; previous revision: 1.10
fix verified in lvm2-cluster-2.02.16-1
[root@link-08 bin]# service ccsd start start
Starting ccsd: [ OK ]
[root@link-08 bin]# service cman start
Starting cman: [FAILED]
[root@link-08 bin]# cman_tool nodes
Node Votes Exp Sts Name
1 1 4 M link-08
[root@link-08 bin]# service clvmd start
Starting clvmd: clvmd startup timed out
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.