Bug 100537 - loopback-mounted NFS hangs shutdown
Status: CLOSED DEFERRED
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: initscripts
Version: 3.0
Hardware: i386 Linux
Priority: medium Severity: medium
Assigned To: Bill Nottingham
QA Contact: Brock Organ
Keywords: Security
Depends On:
Blocks:
Reported: 2003-07-23 03:35 EDT by Alexandre Oliva
Modified: 2014-03-16 22:37 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2005-09-21 13:59:19 EDT

Attachments: None

Description Alexandre Oliva 2003-07-23 03:35:07 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030703

Description of problem:
Since NFS servers are stopped before NFS filesystems are unmounted, a
loopback-mounted NFS filesystem, such as that created by amd when referencing
/net/localhost, causes shutdown to hang indefinitely, while umount waits for the
server to come back.  It obviously never will before shutdown completes => deadlock.
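
A minimal sketch of how the affected mounts could be spotted before shutdown, assuming the mount table uses the usual /proc/mounts layout (device, mount point, filesystem type, ...). The function name, the list of local host names, and the sample table are all hypothetical and do not come from this report; amd may place its real mount points elsewhere.

```python
def loopback_nfs_mounts(mounts_text, local_names=("localhost", "127.0.0.1")):
    """Return mount points of NFS filesystems served by this same host."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        device, mount_point, fs_type = fields[:3]
        if fs_type not in ("nfs", "nfs4"):
            continue
        # An NFS device is written as "server:/exported/path".
        server, sep, _path = device.partition(":")
        if sep and server in local_names:
            found.append(mount_point)
    return found


# Made-up mount table for illustration only.
sample = """\
/dev/hda1 / ext3 rw 0 0
localhost:/export /net/localhost/export nfs rw,hard,intr 0 0
fileserver:/home /home nfs rw 0 0
"""

print(loopback_nfs_mounts(sample))  # ['/net/localhost/export']
```

Any mount point this returns depends on the local NFS server, so it would have to be unmounted while that server is still running.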

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Export some local filesystem.
2. Start nfs and amd.
3. Access /net/localhost to get it mounted.
4. Shut down or reboot.

Actual Results:  NFS server stops before amd and before netfs, so we can't
umount it, and we hang indefinitely.
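
The ordering problem can be illustrated with a small sketch, assuming SysV-style K links (K plus a two-digit priority plus the service name) that run in ascending order at shutdown. The link names and priority numbers below are illustrative assumptions, not values taken from this report, and the helper names are hypothetical.

```python
import re


def stop_order(link_names):
    """Return service names in the order their K (kill) links would run."""
    kills = []
    for name in link_names:
        match = re.fullmatch(r"K(\d\d)(.+)", name)
        if match:  # ignore S (start) links and anything else
            kills.append((int(match.group(1)), match.group(2)))
    return [service for _priority, service in sorted(kills)]


def stops_before(link_names, first, second):
    """True if service `first` is stopped earlier than service `second`."""
    order = stop_order(link_names)
    return order.index(first) < order.index(second)


# Illustrative runlevel directory listing (priorities are assumptions).
links = ["K75netfs", "K20nfs", "K28amd", "S10network"]

print(stop_order(links))                     # ['nfs', 'amd', 'netfs']
print(stops_before(links, "nfs", "netfs"))   # True: server dies before unmount
```

With an ordering like this one, the NFS server is already gone by the time amd and netfs try to unmount the loopback mount.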

Expected Results:  It should give up at some point.  nfs server should probably
be stopped very late in the game, such that, even in the case of cross-mounted
NFS servers shutting down, both of them could succeed.
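
One way to read "give up at some point" is a deadline around the unmount command. The following is only a sketch of that idea, not what initscripts does; `run_with_deadline` is a hypothetical helper. It also assumes the child can actually be killed, which may not hold for a process blocked in the kernel on a hard NFS mount, so the sketch deliberately does not wait for the child after killing it.

```python
import subprocess
import sys


def run_with_deadline(cmd, seconds):
    """Run cmd; return True if it exited within `seconds`, else False."""
    proc = subprocess.Popen(cmd)
    try:
        proc.wait(timeout=seconds)
        return True
    except subprocess.TimeoutExpired:
        # Best effort only: do not wait again, since a process stuck in the
        # kernel might never exit and we would hang just like shutdown does.
        proc.kill()
        return False


# Stand-in for a hanging umount: a child process that sleeps too long.
slow_child = [sys.executable, "-c", "import time; time.sleep(30)"]
print(run_with_deadline(slow_child, 0.5))  # False
```

In a real shutdown script the command would be something like an unmount of the loopback mount point, and the caller would continue with the remaining shutdown steps when the helper returns False.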

Additional info:

Getting a process to keep an open file in /net/localhost could be used as a
denial of service attack: if the server is supposed to reboot or shut down
immediately, it won't, and manual intervention is required to power it down.
That may cause a lot of inconvenience or even loss of data (consider a RAID 5
system that has to be powered down, and a disk is lost between the update of a
block and the update of its checksum block).
Comment 1 Alexandre Oliva 2003-07-23 03:36:49 EDT
The patch in bug 63602 could help solve this problem, even though it's not the
ideal solution, as we'd better umount the filesystem before the server goes
down, otherwise we might lose data.
Comment 2 Alexandre Oliva 2003-10-19 16:15:26 EDT
Fixed in Fedora Core test3.  Even if I start a screen session, cd to
/net/localhost/<dir> in it and disconnect, then request a reboot, the machine
comes down, even though there are RPC sendmsg errors logged to the console just
before the machine goes down.  This is probably as good as it gets.
Comment 3 Alexandre Oliva 2003-10-25 16:47:23 EDT
Whatever fix it was, it didn't make it to RHEL 3 :-(
Comment 4 Bill Nottingham 2003-10-26 15:22:01 EST
There actually aren't any changes in that area between Taroon and Cambridge.
Comment 5 Alexandre Oliva 2003-10-29 11:45:52 EST
I noticed there hadn't been changes to initscripts, so fuser was my prime
suspect for having fixed it. Now I see fuser is unchanged too, yet the problem
is definitely gone.  Something must have fixed it, even if it's just that the
kernel is responding differently to accesses to broken NFS mounts.
Comment 8 Bill Nottingham 2005-01-12 15:42:24 EST
Are you still seeing this? You may also want to see bug 138788.
Comment 9 Alexandre Oliva 2005-09-21 03:24:35 EDT
Not on Fedora devel, no.  The problem is gone there.  Dunno about RHEL3.
Comment 10 Bill Nottingham 2005-09-21 13:59:19 EDT
Closing as DEFERRED for a later RHEL release, then.
