Red Hat Bugzilla – Bug 595725
cman init script is not consistent in checking daemons startup and introduces possible race conditions
Last modified: 2011-05-19 09:03:41 EDT
When executed via init scripts:
cluster starts OK
clvmd starts fast enough that some of the cluster's internal bits are not set up yet, resulting in clvmd failing to start.
There are 2 approaches to fix this issue:
1) The clvmd init script could verify that cman is actually running and that dlm has completed its setup before invoking the clvmd daemon.
2) The clvmd daemon would need to do the same as #1, but internally.
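For illustration only, approach #1 could look roughly like the sketch below: a generic polling helper that an init script might call before launching the daemon. The function name `wait_until_ready` and the idea of passing the readiness check as a command are assumptions made here, not taken from the actual patch. Which command reliably shows that cman is running and dlm has finished its setup is left to the caller.

```shell
#!/bin/sh
# Sketch only (hypothetical helper, not the real patch).
# Usage: wait_until_ready TIMEOUT_SECONDS COMMAND [ARGS...]
# Runs COMMAND once per second until it succeeds or TIMEOUT_SECONDS elapse.
wait_until_ready() {
    wur_timeout=$1
    shift
    wur_elapsed=0
    # Keep polling while the readiness check fails.
    while ! "$@" >/dev/null 2>&1; do
        if [ "$wur_elapsed" -ge "$wur_timeout" ]; then
            return 1    # dependency never became ready in time
        fi
        sleep 1
        wur_elapsed=$((wur_elapsed + 1))
    done
    return 0            # dependency is ready
}
```

An init script could then run something like `wait_until_ready 30 some_readiness_check || exit 1` before starting clvmd, where `some_readiness_check` is a hypothetical placeholder for the real check.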
A way to reproduce this issue (note that it doesn't trigger often):
node1: start cman && sleep 10 && start clvmd <- stop here.
node2: start cman && start clvmd && sleep 5 && stop clvmd && stop cman <- loop forever.
Repeat the script on node2 until clvmd fails to start. There is no exact number of loops before it happens, but it does eventually happen.
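The node2 loop could be scripted along these lines. This is a rough sketch that assumes the `service` wrapper and init scripts named `cman` and `clvmd`. The function name `reproduce_on_node2` and the optional iteration cap are additions for convenience, not part of the original reproducer.

```shell
#!/bin/sh
# Sketch of the node2 loop; run as root after node1 has cman and clvmd up.
# Usage: reproduce_on_node2 [MAX_ITERATIONS]   (0 or omitted = loop forever)
reproduce_on_node2() {
    rep_max=${1:-0}
    rep_iter=0
    while :; do
        rep_iter=$((rep_iter + 1))
        # If cman itself cannot start, this is a different failure.
        service cman start || return 2
        if ! service clvmd start; then
            echo "clvmd failed to start on iteration $rep_iter"
            return 0    # race reproduced
        fi
        sleep 5
        service clvmd stop
        service cman stop
        # Optional cap so the loop can be bounded.
        if [ "$rep_max" -gt 0 ] && [ "$rep_iter" -ge "$rep_max" ]; then
            echo "no failure after $rep_iter iterations"
            return 1
        fi
    done
}
```

Calling `reproduce_on_node2` with no argument matches the "loop forever" behaviour described above; passing a number merely bounds the run.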
I have a test patch right now to address the issue via #1.
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release. Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release. This request is not yet committed for
This issue has been proposed when we are only considering blocker
issues in the current Red Hat Enterprise Linux release. It has
been denied for the current Red Hat Enterprise Linux release.
** If you would still like this issue considered for the current
release, ask your support representative to file as a blocker on
your behalf. Otherwise ask that it be considered for the next
Red Hat Enterprise Linux release. **
For consistency: either the scripts wait, or the daemons wait.
If the scripts wait, then the cman startup script shouldn't exit until it's ready.
(It's not the job of the clvmd script to include waiting loops for things it depends on: it shouldn't get started until they are ready for it!)
If the daemons do the waiting, then it's fine for the scripts to exit before everything is ready. (That's probably more consistent with the future systemd approach too.)
(In reply to comment #4)
> For consistency: either the scripts wait, or the daemons wait.
> If the scripts wait, then the cman startup script shouldn't exit until it's
> ready.
> (It's not the job of the clvmd script to include waiting loops for things it
> depends on: it shouldn't get started until they are ready for it!)
> If the daemons do the waiting, then it's fine for the scripts to exit before
> everything is ready. (That's probably more consistent with the future systemd
> approach too.)
I'll fix this one in the cman init script.
As for the daemon solution, I don't think it's worth doing right now, because cman is going away and most of the daemons will change the way they start or are started.
As extra information, I am not able to reproduce the original problem anymore. It was probably fixed as a side effect of rhbz#639018.
Patches to fix cman init and dlm_controld are being tested right now.
Moving to 6.2.
Fixes are upstream.
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.