Bug 63602 - NFS client won't shut down if server is down
Status: CLOSED WONTFIX
Product: Red Hat Linux
Classification: Retired
Component: initscripts
Version: 9
Hardware: i386 Linux
Priority: medium
Severity: medium
Assigned To: Steve Dickson
QA Contact: Brian Brock
Duplicates: 69802, 82795
Reported: 2002-04-16 00:44 EDT by Alexandre Oliva
Modified: 2007-04-18 12:42 EDT

Doc Type: Bug Fix
Last Closed: 2004-11-27 17:59:08 EST

Attachments
Patch to /etc/init.d/netfs that fixes the problem (946 bytes, patch)
2003-02-21 23:43 EST, Alexandre Oliva

Description Alexandre Oliva 2002-04-16 00:44:22 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.9) Gecko/20020408

Description of problem:
I shut down an internal NFS server, and then shut down its client.  The client
was left 4 hours trying to shut down, but it wouldn't.  I had to power it off.

Version-Release number of selected component (if applicable):
2.4.18-0.22

How reproducible:
Didn't try

Steps to Reproduce:
1. Automount (with amd) a filesystem from a remote NFS server.
2. Shut the server down.
3. With the server down, shut the client down.

Actual Results:  the client's /var/log/messages says:

Apr 15 06:53:53 free ntpd: ntpd shutdown succeeded
Apr 15 06:53:53 free umount: umount: can't get address for libero
Apr 15 06:53:53 free umount: umount2: Device or resource busy
Apr 15 06:53:53 free umount: umount: /.automount/libero/root/l: device is busy
Apr 15 06:53:53 free umount: Cannot MOUNTPROG RPC: RPC: Program not registered
Apr 15 06:53:54 free netfs: Unmounting NFS filesystems:  failed
Apr 15 06:54:06 free kernel: nfs: server libero not responding, still trying

When I woke up, I noticed the machine had not shut down, and rebooted it:

Apr 15 10:51:45 free shutdown: shutting down for system reboot
Apr 15 10:51:45 free init: Switching to runlevel: 6
Apr 15 10:51:47 free umount: umount: can't get address for libero
Apr 15 10:51:47 free umount: umount2: Device or resource busy
Apr 15 10:51:47 free umount: umount: /.automount/libero/root/l: device is busy
Apr 15 10:51:47 free netfs: Unmounting NFS filesystems:  failed
Apr 15 10:52:00 free kernel: nfs: server libero not responding, still trying
Apr 15 10:52:22 free shutdown: shutting down for system reboot

and it remained like that for a few more minutes.  I gave up, powered the
machine off and went back to bed for a while longer :-)

Expected Results:  I'd expected the shutdown to time out and give up on waiting
for the server to come back.

Additional info:

AFAIK, there were no pending writes to the NFS server that might have caused the
kernel to play safe and not reboot.  In any case, it would still be nice to have
some way to tell it to really "shut down, the server is not coming back."  In
general, when you get to that point, you can no longer get a shell or log in
remotely, which makes this tricky.

I don't know whether this makes any difference, but at the time the client was
going down, the only DNS server configured to resolve names for it (127.0.0.1)
had already gone down.
Comment 1 Alexandre Oliva 2003-02-21 23:39:25 EST
*** Bug 69802 has been marked as a duplicate of this bug. ***
Comment 2 Alexandre Oliva 2003-02-21 23:43:34 EST
Created attachment 90275
Patch to /etc/init.d/netfs that fixes the problem

This patch seems to fix the problem for me.  It waits for fuser to complete,
but if fuser remains blocked in disk wait for about 5 seconds, it stops
waiting and carries on.
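
The attached patch itself is not reproduced in this report. As a rough sketch of the idea described above (start a command, poll it, and stop waiting after a deadline), something like the following could be used. The helper name run_with_timeout, the one-second polling interval, and the 124 return code are assumptions made for illustration, not taken from the patch or from initscripts.

```shell
#!/bin/sh
# Hypothetical helper: run a command in the background and stop waiting
# for it once a deadline (in seconds) has passed.
run_with_timeout() {
    limit=$1
    shift
    "$@" &
    pid=$!
    elapsed=0
    # Poll once per second while the background process still exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$elapsed" -ge "$limit" ]; then
            # A process stuck in uninterruptible disk wait may not die
            # right away, so send the signal and return without waiting.
            kill -KILL "$pid" 2>/dev/null
            return 124
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    # The command finished in time: report its own exit status.
    wait "$pid"
}

# Possible use in a shutdown script (the mount point is a placeholder):
#   run_with_timeout 5 /sbin/fuser -k -m /mnt/example
```

Polling rather than calling wait directly is the point of the sketch: a plain wait would block for as long as the child stays stuck on the dead server, which is the hang this bug describes.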
Comment 3 Bill Nottingham 2003-09-03 21:57:47 EDT
*** Bug 82795 has been marked as a duplicate of this bug. ***
Comment 4 Douglas Furlong 2004-10-01 07:44:29 EDT
Correct me if I am wrong, but isn't this part of the point of hard
mounting over soft mounting?

The purpose of hard mounting is that the file system stays up
indefinitely, waiting for the server to come back online.

If the server has disappeared for whatever reason, then it is a
"serious" situation. Do we want to alter the scripts to just drop this
connection?

Would it not be better, if this problem is frequent, for users to
soft-mount these file systems, so that the kernel can receive the
failure messages and give up on the mount point?

Just my two pence worth, but I am a mere amateur :)

Doug
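
For reference, the soft-mount alternative raised in this comment might look like the sketch below. The export path, the mount point, and the timeo and retrans values are placeholders chosen for illustration; only the server name libero comes from the logs in this report. The script only prints the command so it can be checked before being run as root.

```shell
#!/bin/sh
# Hypothetical export and mount point; adjust to the real ones.
server_export="libero:/export"
mount_point="/mnt/libero"

# soft:    fail a request with an I/O error once the retries are used up,
#          instead of retrying forever as a hard mount does
# timeo:   wait before retrying a request, in tenths of a second
# retrans: number of retries before the request is failed
opts="soft,timeo=50,retrans=3"

# Dry run: print the mount command rather than executing it.
echo mount -t nfs -o "$opts" "$server_export" "$mount_point"
```

The trade-off is the one the comment points at: with a soft mount, applications see errors when the server goes away, so data being written at that moment can be lost.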
