Description of Problem:
The /etc/init.d/amd script uses the function 'killproc' from
/etc/init.d/functions to terminate amd. The not-so-wonderful feature of
'killproc' is that it sends a TERM signal to amd and then, about five or six
seconds later, sends a KILL signal.
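The escalation described above can be sketched as follows. This is an illustration only, not the code in /etc/init.d/functions; the function name term_then_kill and its grace-period argument are hypothetical.

```shell
#!/bin/sh
# Hypothetical illustration; this is NOT the real killproc.
# term_then_kill PID GRACE: send TERM, wait up to GRACE seconds, then KILL.
# Prints "term" if the process went away by itself, "kill" if KILL was sent.
term_then_kill() {
    pid=$1
    grace=$2
    kill -TERM "$pid" 2>/dev/null
    waited=0
    while [ "$waited" -lt "$grace" ]; do
        # kill -0 sends no signal; it only tests whether the process exists
        kill -0 "$pid" 2>/dev/null || { echo term; return 0; }
        sleep 1
        waited=$((waited + 1))
    done
    if kill -0 "$pid" 2>/dev/null; then
        # The process is still cleaning up (amd: still unmounting), but it
        # is killed anyway. This is the step that leaves stale state behind.
        kill -KILL "$pid" 2>/dev/null
        echo kill
    else
        echo term
    fi
}
```

A daemon that needs longer than the grace period to finish its cleanup always ends up on the "kill" path, which is what happens to amd once there are enough toplvl nodes to unmount.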
This can leave things in a state where amd cannot be restarted. Since the
amd process was aggressively killed before the unmount of all the toplvl
nodes completed, you get mtab entries left behind, and toplvl nodes
'connected' to nonexistent processes. When you restart amd, you either get
a stale filehandle or, more likely, the formerly toplvl node gets restarted
with type 'link' and is useless. The restart as type 'link' happens because
the amd process is killed before /etc/mtab is updated. The only way to
clear this state is a system reboot.
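One way to spot the leftover entries is sketched below. It assumes that amd's own mtab entries carry the daemon's pid in the device field in the form "host:(pidNNN)"; check that against your /etc/mtab before relying on it. The function name stale_amd_entries is hypothetical.

```shell
#!/bin/sh
# Sketch under an assumption: amd mtab entries look like "host:(pidNNN) /dir ...".
# stale_amd_entries: read mtab-style lines on stdin and print "device mountpoint"
# for every entry whose pid no longer refers to a running process.
# Run as root (as the init script is), so kill -0 can see root-owned processes.
stale_amd_entries() {
    while read -r device mountpoint rest; do
        # Extract NNN from "(pidNNN)"; empty if the entry is not an amd one
        pid=$(printf '%s\n' "$device" | sed -n 's/.*(pid\([0-9][0-9]*\)).*/\1/p')
        [ -n "$pid" ] || continue
        kill -0 "$pid" 2>/dev/null || printf '%s %s\n' "$device" "$mountpoint"
    done
}
```

Usage: stale_amd_entries < /etc/mtab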
This can make amd look unreliable, when the real culprit is the
/etc/init.d/amd script. For example, on our systems with only three toplvl
nodes, the invocation of /etc/init.d/amd restart would work about 95% of
the time. We recently changed a number of systems such that they now have
six toplvl nodes, so it now takes longer for amd to die. In this
configuration, the 'last' of the six gets hit with the bug 99% of the time.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. create an /etc/amd.conf with six or more top level nodes
2. run /etc/init.d/amd start then /etc/init.d/amd stop
3. look in /etc/mtab for nodes linked to dead processes
4. run /etc/init.d/amd start
5. run amq and look for nodes with type 'link' that should be toplvl
Actual Results:
amq shows nodes of type 'link' that should be of type 'toplvl'.
Expected Results:
Immediately after a start, the output of amq should show only the root
node and the toplvl nodes.
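The check in step 5 can be scripted roughly as follows. The sketch assumes that amq prints one node per line with the mount point in the first field and the node type in the second; the function name link_nodes is hypothetical.

```shell
#!/bin/sh
# Sketch under an assumption: amq output is "mountpoint type info mounted-on".
# link_nodes: read amq-style output on stdin and print the mount point of
# every node whose type field is "link".
link_nodes() {
    awk '$2 == "link" { print $1 }'
}
```

Usage: /usr/sbin/amq | link_nodes
Right after a clean start this should print nothing.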
New /etc/init.d/amd script (fix) attached.
Created attachment 33238
/etc/init.d/amd which correctly waits for amd to shut down
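echo -n $"Stopping $prog: "
# An explicit signal makes killproc send only that signal (no follow-up KILL)
killproc $amd -TERM
# this part is from wait4amd2die
# delay, count and i are set on lines missing from this excerpt; the values
# here are assumptions (believed to match the wait4amd2die defaults)
delay=5; count=6; i=1
maxcount=`expr $count + 1`
while [ $i != $maxcount ]; do
    # run amq
    /usr/sbin/amq > /dev/null 2>&1
    RETVAL=$?
    if [ $RETVAL != 0 ]; then
        # amq failed to run (because amd is dead)
        rm -f /var/lock/subsys/amd /var/run/amd.pid
        break
    fi
    sleep $delay
    i=`expr $i + 1`
done
if [ $RETVAL = 0 ]; then
    failure $"amd shutdown"
    echo "amd is still up"
fi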
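The polling loop above can also be written as a small reusable helper, which makes it easier to try out without a running amd. This is a sketch, not part of the attachment; the name wait_until_fails and its arguments are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not part of the attached script.
# wait_until_fails COUNT DELAY COMMAND...
# Run COMMAND up to COUNT times, DELAY seconds apart. Return 0 as soon as
# COMMAND fails (for amq: the daemon is gone), or 1 if COMMAND still
# succeeds after COUNT attempts.
wait_until_fails() {
    count=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$count" ]; do
        "$@" > /dev/null 2>&1 || return 0
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}
```

Usage: wait_until_fails 6 5 /usr/sbin/amq && rm -f /var/lock/subsys/amd /var/run/amd.pid
The lock and pid files are removed only once amq can no longer reach the daemon, so a slow shutdown is never mistaken for a finished one.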