Bug 54246

Summary:

/etc/init.d/amd script improper shutdown of amd

Product:

[Retired] Red Hat Linux

Reporter:

Need Real Name <jgd>

Component:

am-utils

Assignee:

Peter Vrabec <pvrabec>

Status:

CLOSED RAWHIDE

QA Contact:

Aaron Brown <abrown>

Severity:

medium

Docs Contact:

Priority:

medium

Version:

7.1

CC:

dennis.brylow

Target Milestone:

---

Target Release:

---

Hardware:

i386

OS:

Linux

Whiteboard:

Fixed In Version:

Doc Type:

Bug Fix

Last Closed:

2005-10-05 19:33:43 UTC

Attachments:

Description	Flags
/etc/init.d/amd which correctly waits for amd to shut down	none

Description Need Real Name 2001-10-02 15:40:40 UTC

Description of Problem:

The /etc/init.d/amd script uses the 'killproc' function from
/etc/init.d/functions to terminate amd.  The not-so-wonderful feature of
'killproc' is that it sends a TERM signal to amd and then, about five or six
seconds later, sends a KILL signal.
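
To make the TERM-then-KILL behavior concrete, here is a rough sketch of the pattern.  It is not the code from /etc/init.d/functions; the function name 'term_then_kill' and the default grace period of 5 seconds are assumptions chosen for illustration.

```shell
# Hypothetical helper (not the real killproc): send TERM, wait up to
# $2 seconds (assumed default: 5), then send KILL unconditionally.
term_then_kill() {
    pid=$1
    grace=${2:-5}
    kill -TERM "$pid" 2>/dev/null || return 0   # process already gone
    waited=0
    while [ "$waited" -lt "$grace" ]; do
        kill -0 "$pid" 2>/dev/null || return 0  # exited within the grace period
        sleep 1
        waited=$((waited + 1))
    done
    # Grace period expired: the process is killed whether or not its
    # cleanup (for amd, unmounting the toplvl nodes) has finished.
    kill -KILL "$pid" 2>/dev/null
    return 1
}
```

With a fixed grace period like this, a daemon that needs longer than the grace period to clean up is always cut off partway through.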

This can leave things in a state where amd cannot be restarted.  Since the
amd process was aggressively killed before the unmount of all the toplvl
nodes completed, you get mtab entries left behind, and toplvl nodes
'connected' to non-extant processes.  When you restart amd, you either get
a stale filehandle, or more likely, the formerly toplvl node gets restarted
with type 'link' and is useless.  The restart as type link happens because
the amd process is killed before /etc/mtab is updated.  The only way to
clear this state is a system reboot.
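
One way to spot the leftover entries might be to scan the mount table for amd-style entries whose owning process no longer exists.  The sketch below assumes the device field of such entries has the form 'host:(pidNNN)' and that fields are separated by single spaces; the function name 'stale_amd_mounts' is hypothetical, and the /proc check is Linux-specific.

```shell
# Hypothetical helper: list mount points in an mtab-style file whose
# device field names a pid, assumed format "host:(pidNNN)", that is no
# longer running.
stale_amd_mounts() {
    mtab=${1:-/etc/mtab}
    # Extract "pid mountpoint" pairs from entries that carry a (pidNNN) tag.
    sed -n 's/^[^ ]*(pid\([0-9][0-9]*\))[^ ]* \([^ ]*\) .*/\1 \2/p' "$mtab" |
    while read -r pid mountpoint; do
        # Linux-specific liveness check; works without signal permissions.
        [ -d "/proc/$pid" ] || echo "$mountpoint"
    done
}
```

Running 'stale_amd_mounts' after '/etc/init.d/amd stop' should print nothing if amd shut down cleanly.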

This can make amd look unreliable, when the real culprit is the
/etc/init.d/amd script. For example, on our systems with only three toplvl
nodes, the invocation of /etc/init.d/amd restart would work about 95% of
the time. We recently changed a number of systems such that they now have
six toplvl nodes, so it now takes longer for amd to die.  In this
configuration, the 'last' of the six gets hit with the bug 99% of the time.

Version-Release number of selected component (if applicable):


How Reproducible:
Easily

Steps to Reproduce:
1.   create an /etc/amd.conf with six or more top level nodes
2.   run /etc/init.d/amd start then /etc/init.d/amd stop
3.   look in /etc/mtab for nodes linked to dead processes
4.   run /etc/init.d/amd start
5.   run amq and look for nodes with type 'link' that should be toplvl
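
Step 5 can be scripted roughly as follows.  This is a sketch that assumes the default amq listing puts the mount point in the first column and the node type in the second; the function name 'find_link_nodes' is hypothetical.

```shell
# Hypothetical filter: read an amq-style listing on stdin and print the
# mount points whose type column (assumed to be field 2) is "link".
find_link_nodes() {
    awk '$2 == "link" { print $1 }'
}
```

Usage would be 'amq | find_link_nodes'; immediately after a start, any output indicates a node that came back as type 'link'.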

Actual Results:
amq will show nodes of type 'link' that should be of type 'toplvl'

Expected Results:
Immediately after a start, the output of amq should show only the root
node and toplvl nodes.

Additional Information:
new /etc/init.d/amd script (fix) attached

Comment 1 Need Real Name 2001-10-02 15:42:09 UTC

Created attachment 33238 [details]
/etc/init.d/amd which correctly waits for amd to shut down

Comment 2 Peter Vrabec 2005-10-05 13:47:13 UTC

I suggest:

stop() {
        echo -n $"Stopping $prog: "
        # explicit -TERM: ask amd to shut down and then wait for it below
        killproc $amd -TERM
        # this part is from wait4amd2die
        delay=3
        count=10
        i=1
        maxcount=`expr $count + 1`
        while [ $i != $maxcount ]; do
                # run amq; it fails once amd no longer answers
                /usr/sbin/amq > /dev/null 2>&1
                if [ $? != 0 ]
                then
                        # amq failed to run (because amd is dead)
                        rm -f /var/lock/subsys/amd /var/run/amd.pid
                        # amd is down, so the stop succeeded
                        RETVAL=0
                        echo
                        return $RETVAL
                fi
                sleep $delay
                i=`expr $i + 1`
        done
        # amd still answers amq after count * delay seconds
        RETVAL=1
        failure $"amd shutdown"
        echo
        echo "amd is still up"
        return $RETVAL
}
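
The wait loop above is a poll-until-failure pattern.  As a standalone sketch of the same idea, with a hypothetical helper name 'wait_until_fails' that is not part of the init script or of /etc/init.d/functions:

```shell
# Hypothetical helper: run a probe command up to $1 times, sleeping $2
# seconds between attempts.  Returns 0 as soon as the probe fails (the
# daemon has stopped answering), or 1 if it still succeeds after all tries.
wait_until_fails() {
    tries=$1
    delay=$2
    shift 2
    i=0
    while [ "$i" -lt "$tries" ]; do
        "$@" >/dev/null 2>&1 || return 0
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}
```

With the values from the suggestion above, the call would be 'wait_until_fails 10 3 /usr/sbin/amq', which gives amd up to roughly 30 seconds to finish unmounting.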