Bug 90330 - System hangs in autofs, but without any NFS errors
Summary: System hangs in autofs, but without any NFS errors
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: autofs
Version: 7.3
Hardware: i386
OS: Linux
Priority: medium
Severity: low
Target Milestone: ---
Assignee: Jeff Moyer
QA Contact: Brock Organ
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2003-05-06 23:52 UTC by Matthew Braithwaite
Modified: 2007-04-18 16:53 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2004-05-11 21:22:21 UTC
Embargoed:


Attachments
Stack trace of a system having this problem (75.00 KB, text/plain)
2003-05-06 23:53 UTC, Matthew Braithwaite

Description Matthew Braithwaite 2003-05-06 23:52:05 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 Galeon/1.2.6 (X11; FreeBSD i386; U;) Gecko/0

Description of problem:
We have a number of systems that automount user home directories and a few other
filesystems with the following options:

  rw,nosuid,tcp,intr,nfsvers=2

With some frequency these systems become useless, in a way that suggests the
automounter is at fault.  `df' hangs, and nobody (including root!) can log in via
ssh.  The only way we can get in is root login on the console, and that one
login session is extremely fragile, because many commands that root might wish
to run to investigate the problem will hang the session.

The best lead we have is the stack trace for every process on the system,
obtained through the SysRq feature.  I'll append this.
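
For anyone wanting to collect the same kind of trace, the following is a minimal sketch of requesting an all-task dump from a shell instead of pressing Alt+SysRq+T on the console.  It assumes the running kernel provides /proc/sys/kernel/sysrq and /proc/sysrq-trigger, which may not be true of every kernel shipped with this release; the function name and its optional path arguments are hypothetical.

```shell
# Sketch: ask the kernel to log the state and stack of every task.
# Assumption: /proc/sysrq-trigger exists on this kernel. The function
# name and the two optional path arguments are made up for illustration.
dump_all_tasks() {
    trigger="${1:-/proc/sysrq-trigger}"
    enable="${2:-/proc/sys/kernel/sysrq}"

    # Enable SysRq if the control file is writable (normally needs root).
    if [ -w "$enable" ]; then
        echo 1 > "$enable"
    fi

    if [ ! -w "$trigger" ]; then
        echo "cannot write to $trigger (not root, or SysRq unavailable)" >&2
        return 1
    fi

    # 't' requests the task dump; the output goes to the kernel log
    # (console, dmesg, or syslog), not to this shell.
    echo t > "$trigger"
}
```

Run as root with no arguments, `dump_all_tasks` writes to the real /proc files; the result then has to be read back from the console log or dmesg.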

One might suspect that this is at root an NFS problem -- e.g. some NFS server
has gone away.  There are two difficulties with this hypothesis:  first, we have
only four NFS servers, and we know that none of them are having problems.  (They
were not to our knowledge down at the onset of the problem, and in any case were
definitely up by the time we started investigating.)  And second, the consoles
of machines that suffer from the problem, which we log, did not record any NFS
errors.

Version-Release number of selected component (if applicable):


How reproducible:
Didn't try

Steps to Reproduce:
We've got NFS and autofs activity going on all day.  We just have to wait for it
to happen.
    

Additional info:

Comment 1 Matthew Braithwaite 2003-05-06 23:53:00 UTC
Created attachment 91527
Stack trace of a system having this problem

Comment 2 David Alden 2003-06-12 11:37:01 UTC
Hi,
  Just wanted to add that I'm having a similar (maybe the same?) problem,
except I'm running Red Hat 9 on the client that's having a problem.  For me,
I can log in so long as I hit ^C.  So far, the problem has always occurred
on the same automount point -- /home/mail, which happens to be the only map
entry that contains mount options.  The entry looks like:

mail  -rw,soft,bg  mail:/var/spool/mail

If I run strace against the child automount daemon, all I get is "read(4,".
I can mount the partition without any problems:

# mount -o rw,soft,bg mail:/var/spool/mail /mnt
# umount /mnt

I'm going to wait until it locks up one more time, then I'm going to try
removing the mount options to see if that makes a difference. 

...dave alden


Comment 3 David Alden 2003-07-08 12:50:00 UTC
Hi,
  Quick followup -- I changed the mount options in the auto.home file to
just rw,soft (I took out bg) and it hasn't locked up since.
...dave
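
Since the lockups stopped after bg was dropped, it may be worth checking whether other map entries carry the same option.  The following is a rough sketch that lists the keys of entries whose option field includes bg.  It assumes the simple "key -options location" layout of the entry quoted in comment 2 and does not handle continuation lines or multi-mount entries; the function name is hypothetical.

```shell
# Sketch: print the key of every autofs map entry whose options include "bg".
# Assumption: one entry per line, in the form "key -opt1,opt2 location".
# The function name is made up for illustration.
list_bg_entries() {
    awk '
        /^[ \t]*#/ { next }                 # skip comment lines
        NF == 0    { next }                 # skip blank lines
        $2 ~ /^-/ {
            opts = substr($2, 2)            # drop the leading "-"
            n = split(opts, o, ",")
            for (i = 1; i <= n; i++)
                if (o[i] == "bg") { print $1; break }
        }
    ' "$1"
}
```

Pass the map file as the only argument, for example the auto.home file mentioned above (its full path is not given in this report).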


Comment 4 Jeff Moyer 2004-03-22 20:10:25 UTC
Please try with the latest distribution.  If you still have problems,
be sure to include the kernel version and the version of the autofs
user space.

Thanks!
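
A minimal sketch of gathering the two version strings requested above, assuming an RPM-based system where the user-space package is named autofs; the function name is hypothetical.

```shell
# Sketch: report the running kernel version and the installed autofs package.
# Assumption: the user-space package is called "autofs" and is managed by rpm.
# The function name is made up for illustration.
report_versions() {
    echo "kernel: $(uname -r)"
    if command -v rpm >/dev/null 2>&1; then
        # rpm -q prints either the package version or a "not installed" note.
        echo "autofs: $(rpm -q autofs 2>/dev/null)"
    else
        echo "autofs: unknown (rpm not found)"
    fi
}
```

Calling `report_versions` prints one line for each, which can be pasted into a comment on this bug.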

