Description of problem:
Sometimes during cluster_cleanup, shutting down cmirror fails, stopping our regression tests in their tracks. The failing call to `service cmirror stop` looks like this when run through sh -x:
+ echo -n 'Stopping clustered mirror log server:'
Stopping clustered mirror log server:+ killall clogd
+ ps -C clogd
+ failure shutdown
+ local rc=0
The init script looks like so:
echo -n "Stopping clustered mirror log server:"
killall clogd >& /dev/null
if ps -C clogd >& /dev/null; then
If clogd doesn't exit immediately after the signal handler returns, ps can still find clogd running, even though it will exit a few moments later. This check should probably be replaced by a call to killproc, which waits for the daemon to exit.
Version-Release number of selected component (if applicable):
How reproducible:
About 10% of the time; more frequently on ppc, ia64, or SMP systems
Steps to Reproduce:
1. /etc/init.d/cmirror stop
Expected results:
The cmirror initscript should wait for clogd to exit before continuing.
Also check the lines that remove the kernel module. I see this step fail quite frequently during our tests because the module is not loaded. The script should check whether the module is loaded and only then try to remove it.
This would also make it possible to call the script twice with the stop parameter without it failing.
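A minimal sketch of the check suggested above, not the actual fix. The function names are hypothetical, and the module name is passed in as an argument because this report does not name it. Note that /proc/modules lists module names with underscores rather than dashes.

```shell
# Hypothetical helper: return 0 if the named module appears in a
# /proc/modules-style listing.
# $1 = module name, $2 = listing file (defaults to /proc/modules)
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}"
}

# Hypothetical helper: remove the module only when it is loaded, so that
# a second 'stop' finds nothing to do and still succeeds.
remove_module_if_loaded() {
    if module_loaded "$1"; then
        modprobe -r "$1"
    fi
}
```

With this shape, calling remove_module_if_loaded for a module that is not loaded returns 0 without invoking modprobe.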
Ok, I'll try to fix these.
However, reading the killproc man page, I'm not sure that it is appropriate. clogd will refuse to honor SIGTERM if there are still active cluster mirrors. If the SIGTERM does not stop the daemon, killproc will issue a SIGKILL, which we do not want. So we want to wait for the SIGTERM to take effect, but not to issue the subsequent SIGKILL; rather, the stop should fail in that event.
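The behaviour described above (send SIGTERM, wait, never escalate to SIGKILL) could be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the eventual fix: term_and_wait is a hypothetical helper, and the one-second polling interval and the timeout value are arbitrary choices.

```shell
# Hypothetical helper: send SIGTERM to each PID, then wait up to
# $1 seconds for all of them to exit. Never sends SIGKILL.
# Returns 0 if every process exited, 1 if any is still alive.
term_and_wait() {
    timeout=$1; shift
    [ $# -gt 0 ] || return 0            # nothing running: stop is a no-op
    kill -TERM "$@" 2>/dev/null || true # a process that is already gone is fine
    while :; do
        alive=0
        for pid in "$@"; do
            # kill -0 sends no signal; it only tests that the PID exists
            kill -0 "$pid" 2>/dev/null && alive=1
        done
        [ "$alive" -eq 0 ] && return 0
        [ "$timeout" -le 0 ] && return 1
        sleep 1
        timeout=$((timeout - 1))
    done
}
```

In the init script this might be called as `term_and_wait 10 $(pidof clogd)`, reporting failure when it returns non-zero, for example because clogd declined the SIGTERM while cluster mirrors were still active.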
Author: Jonathan Brassow <email@example.com>
Date: Thu Dec 3 15:48:04 2009 -0600
cmirror: Fix-up init script behaviour (bug 520915)
init script was throwing errors on 'stop' when it
shouldn't have been.
I have not run into this during recent RHEL 5.5 regression runs.
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.