Bug 579631 - autofs5 init script times out before automount exits and incorrectly shows that autofs5 stop failed
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: autofs5
Version: 4.8
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Ian Kent
QA Contact: qe-baseos-daemons
URL:
Whiteboard:
Depends On:
Blocks: 585058 585059
Reported: 2010-04-06 05:23 UTC by Jatin Nansi
Modified: 2018-11-14 20:11 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 585058 585059
Environment:
Last Closed: 2011-01-25 21:05:10 UTC
Target Upstream Version:
Embargoed:


Attachments
Patch - improve shutdown wait (3.00 KB, patch)
2010-04-21 03:03 UTC, Ian Kent
Patch - improve shutdown wait (rev 2) (3.75 KB, patch)
2010-04-30 05:25 UTC, Ian Kent

Description Jatin Nansi 2010-04-06 05:23:41 UTC
Description of problem:
When automount manages a large number of automounts, unmounting all of them at exit can take longer than anticipated. As shipped, the autofs5 init script waits 45 seconds for the automount daemon to exit, then reports that the daemon failed to stop and exits itself. Meanwhile, the daemon is still busy unmounting NFS mounts before it exits.

Version-Release number of selected component (if applicable):
autofs5-5.0.1-0.rc2.106.el4_8.2
RHEL4U8 x86_64

How reproducible:
Always (for the customer)

Steps to Reproduce:
1. Configure an autofs map with wildcards that match a large number of directories on an NFS export.
2. Access the directories.
3. Shut down the automount daemon without waiting for the mounts to expire.
  
Actual results:
'service autofs5 stop' will report FAIL.

Expected results:
The init script should wait for a longer time interval to allow autofs to exit.

Additional info:

Comment 2 Jatin Nansi 2010-04-06 05:32:28 UTC
Created attachment 404612 [details]
autofs5 debug log

Comment 3 Jatin Nansi 2010-04-06 06:03:54 UTC
The customer works around this bug by changing line 80 of /etc/init.d/autofs5 from:

 [ $RETVAL = 0 -a -z "`pidof $DAEMON`" ] || sleep 3
to:
 [ $RETVAL = 0 -a -z "`pidof $DAEMON`" ] || sleep 60
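
A fixed sleep waits too long when the daemon exits quickly and not long enough when it does not. The following is a rough sketch of the alternative, polling until the daemon is gone or a limit is reached. It is not the change in the attached patches: the function names wait_until and daemon_gone are hypothetical, and the 60-second limit in the example simply mirrors the workaround above.

```shell
# Hypothetical helper: poll until a command succeeds or a time limit passes.
# usage: wait_until LIMIT_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, 1 if LIMIT_SECONDS elapse first.
wait_until() {
    wu_limit=$1
    shift
    wu_waited=0
    while ! "$@"; do
        [ "$wu_waited" -ge "$wu_limit" ] && return 1
        sleep 1
        wu_waited=$((wu_waited + 1))
    done
    return 0
}

# Succeeds once no process with the given name remains (relies on pidof).
daemon_gone() {
    [ -z "$(pidof "$1")" ]
}

# Example (not run here): wait up to 60 seconds for automount to exit.
# wait_until 60 daemon_gone automount || echo "automount is still running"
```

Because the loop returns as soon as the check succeeds, a generous limit only costs time in the slow case.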

Comment 6 Ian Kent 2010-04-21 03:03:12 UTC
Created attachment 407970 [details]
Patch - improve shutdown wait

Comment 7 Ian Kent 2010-04-21 03:04:32 UTC
A test package with the above patch has been built and
is available at:
http://people.redhat.com/~ikent/autofs5-debuginfo-5.0.1-0.rc2.109.bz579631.3

Please test this package and report results.

Comment 10 Ian Kent 2010-04-30 05:25:25 UTC
Created attachment 410307 [details]
Patch - improve shutdown wait (rev 2)

Comment 11 Ian Kent 2010-04-30 05:32:26 UTC
A test package with the above patch has been built and
is available at:
http://people.redhat.com/~ikent/autofs5-5.0.1-0.rc2.109.bz579631.4

This change only increases the time to wait for autofs to
exit, and the wait still isn't as long as originally requested.
So, hopefully, it won't attract criticism from those who don't
like long waits during shutdown.

But the most recent trace of the init script stop action may
be showing a problem with autofs not exiting at all at shutdown
so please pay attention to this and report the results of your
testing. We have had another report of this but I have been
unable to reproduce the issue. If this is happening here I'm
keen to get more information.
 
Please test this package and report results.

Comment 13 Ian Kent 2010-05-11 03:48:04 UTC
How many entries are present in /proc/mounts and /etc/mtab when
the shutdown is initiated?
During the shutdown, when we are seeing umounts take a long time,
what is the CPU usage of automount?
If it is high, how long does it stay high and what is the reported
percentage of CPU usage?
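
These figures can be gathered with standard tools. The following is a minimal sketch, assuming a procps-style ps (for the -C and -o options) and that the daemon's process name is automount; the function names are made up for illustration.

```shell
# Count all entries in a mount table such as /proc/mounts or /etc/mtab.
count_mounts() {
    awk 'END { print NR }' "$1"
}

# Count entries of one filesystem type (third field), e.g. nfs or autofs.
count_fstype() {
    awk -v fstype="$2" '$3 == fstype { n++ } END { print n + 0 }' "$1"
}

# Print a timestamp plus PID, %CPU and elapsed time for automount once per
# second. Note: ps reports %CPU averaged over the life of the process, so
# watch how the figure changes between samples rather than its absolute value.
sample_automount_cpu() {
    samples=${1:-10}
    i=0
    while [ "$i" -lt "$samples" ]; do
        date '+%H:%M:%S'
        ps -C automount -o pid=,pcpu=,etime=
        sleep 1
        i=$((i + 1))
    done
}

# Example (not run here):
# count_mounts /proc/mounts; count_mounts /etc/mtab
# count_fstype /proc/mounts nfs
# sample_automount_cpu 30
```

Running the counts just before 'service autofs5 stop' and the sampler during it should cover all three questions.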

Comment 18 Ian Kent 2010-05-14 08:02:27 UTC
The log from comment #14 shows that the first umount occurred
after 6 seconds and the second umount took 4 minutes
46 seconds. The remaining umounts completed at a rate of
5 to 8 per second.

So the server providing patools3:/t1sw4/sopt7/icds was either
down for a time or responding unreasonably slowly.
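
To set these figures against the 45-second wait from the description, here is a back-of-the-envelope calculation. The helper names are hypothetical, and the first function assumes the umount rate stays steady for the whole window.

```shell
# Number of umounts that complete in a wait window at a steady rate.
# usage: umounts_within WINDOW_SECONDS RATE_PER_SECOND
umounts_within() {
    echo $(( $1 * $2 ))
}

# Convert minutes and seconds to seconds.
# usage: to_seconds MINUTES SECONDS
to_seconds() {
    echo $(( $1 * 60 + $2 ))
}

umounts_within 45 5    # prints 225
umounts_within 45 8    # prints 360
to_seconds 4 46        # prints 286
```

At the observed rate, 45 seconds covers roughly 225 to 360 umounts, while the single stalled umount alone took 286 seconds, more than six times the whole wait.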

The reason the init script gives up and returns an error is
to stop long waits like this from preventing system shutdown.

The problem here, though, is that this behaviour isn't appropriate
for a restart, where we need to wait for autofs to exit before
starting again.

Not sure which way I will resolve this yet.
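
One way to separate the two cases, sketched here as an assumption and not as the change in the attached patches, is to let the restart path wait for the daemon to go away before starting it again, while the plain stop path keeps its bounded wait. The function name restart_after_exit and its three caller-supplied command arguments are hypothetical.

```shell
# Hypothetical restart helper. The caller supplies three command names:
# one that requests the stop, one that succeeds once the daemon has gone,
# and one that starts the daemon again.
# usage: restart_after_exit STOP_CMD GONE_CMD START_CMD
restart_after_exit() {
    stop_cmd=$1 gone_cmd=$2 start_cmd=$3
    "$stop_cmd"
    # Unlike a shutdown-time stop, keep waiting here: the new daemon must
    # not be started while the old one is still unmounting.
    until "$gone_cmd"; do
        sleep 1
    done
    "$start_cmd"
}
```

This version waits without a limit, so a hung daemon would block the restart indefinitely; a real script would probably want a generous upper bound and an error message instead.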

We also see from the log that, after all mounts have gone away,
it takes a further 2 minutes 37 seconds before the automount
process finishes. This seems longer than it should be, but it
is very difficult to work out what is causing it. It may be due
to latency in thread termination together with the state queue
manager thread not being responsive enough to thread terminations.
I don't think this is worth chasing specifically as it may be
more difficult to resolve than we expect and will not yield
sufficient benefit to warrant the amount of effort. Also, this
may be improved with coming RHEL-4 changes.

Comment 21 RHEL Program Management 2011-01-25 21:05:10 UTC
Development Management has reviewed and declined this request.  You may appeal
this decision by reopening this request.

