Bug 808816

Summary: Umounting exported filesystems fails during shutdown
Product: Red Hat Enterprise Linux 5
Reporter: o.h.weiergraeber
Component: nfs-utils
Assignee: Steve Dickson <steved>
Status: CLOSED CURRENTRELEASE
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 5.8
CC: bfields, jlayton, standifm
Target Milestone: rc   
Target Release: ---   
Hardware: i686   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-02-05 21:45:28 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description o.h.weiergraeber 2012-04-01 08:07:04 UTC
Description of problem:
A few weeks ago, several of my EL5 systems started throwing error messages during shutdown indicating that certain filesystems cannot be unmounted because they are busy. These filesystems are exported via NFS, but the error occurs even if they are definitely not mounted by any client at that time. Commenting them out in /etc/exports makes the messages disappear.

Version-Release number of selected component (if applicable):
nfs-utils-1.0.9-60.el5.i386


How reproducible:
quite reproducible


Steps to Reproduce:
1. define nfs exports
2. shutdown the system


Actual results:
error messages "device busy" during shutdown


Expected results:
system unmounts filesystems without errors


Additional info:

Comment 1 Steve Dickson 2012-04-03 16:02:41 UTC
What kernel version does this happen on? I'm not seeing this problem with the 2.6.18-308.el5 kernel...

Comment 2 o.h.weiergraeber 2012-04-04 10:36:40 UTC
The systems are fully patched, running kernel 2.6.18-308.1.1.el5 (one of them 32 bit, the other 64 bit).
Is there any possibility to get more verbose information about *why* exactly the file system is considered busy?

In my original post, I was intentionally referring to "EL5" since the affected systems are running CentOS, not RHEL. I don't have access to a RHEL5 machine but was confident that the issue should be relevant to RedHat as well.
In case the problem really cannot be reproduced on genuine RHEL, it might also be specific to CentOS, although chances seem quite low...

Comment 3 Steve Dickson 2012-04-04 13:02:09 UTC
(In reply to comment #2)
> The systems are fully patched, running kernel 2.6.18-308.1.1.el5 (one of them
> 32 bit, the other 64 bit).
> Is there any possibility to get more verbose information about *why* exactly
> the file system is considered busy?
'lsof <file system>' should show any processes running on that file system.

> 
> In my original post, I was intentionally referring to "EL5" since the affected
> systems are running CentOS, not RHEL. I don't have access to a RHEL5 machine
> but was confident that the issue should be relevant to RedHat as well.
> In case the problem really cannot be reproduced on genuine RHEL, it might also
> be specific to CentOS, although chances seem quite low...
True... they should be the same bits...
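
For anyone wanting to run the 'lsof <file system>' check over every export in one go, here is a minimal sketch. It is only an illustration: the helper names (exported_paths, open_files_on, report) are made up for this example, the /etc/exports parsing is simplified (it assumes the first field of each entry is the exported directory and does not handle a '#' inside a quoted path), and each path is assumed to be a mount point so that lsof reports on the whole filesystem. Note that lsof only lists user-space processes, so an empty result does not rule out a reference held inside the kernel NFS server.

```python
import shlex
import subprocess


def exported_paths(exports_text):
    """Return the directory paths listed in /etc/exports-style text."""
    # Join entries that are continued onto the next line with a backslash.
    text = exports_text.replace("\\\n", " ")
    paths = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        # The first whitespace-separated field is the exported directory;
        # shlex copes with double-quoted paths that contain spaces.
        paths.append(shlex.split(line)[0])
    return paths


def open_files_on(path):
    """Run `lsof <path>` and return its output lines (empty if nothing is open)."""
    result = subprocess.run(["lsof", path], capture_output=True, text=True)
    return result.stdout.splitlines()


def report(exports_file="/etc/exports"):
    """Print, per exported path, how many open files lsof reports."""
    with open(exports_file) as fh:
        for path in exported_paths(fh.read()):
            lines = open_files_on(path)
            # The first line of lsof output is a column header.
            print(path, "->", max(len(lines) - 1, 0), "open file(s)")
```

Calling report() as root on the exporting machine shortly before shutdown would show whether any ordinary process still has files open on the exported filesystems.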

Comment 5 o.h.weiergraeber 2012-04-06 06:57:12 UTC
After some systematic testing, the following picture emerges:

If only the machine exporting the file system is booted and then shut down again, no problems are reported.

If the machine exporting the file system is booted together with the machine configured to import the file system (via autofs, without actually mounting), no problems are reported during shutdown.

However, after the file system has been mounted and unmounted, shutting down the exporting machine results in the "file system busy" error. This error even persists across reboots, i.e. without new mounts of the filesystem!!!
It seems like the filesystem is somehow flagged as "in use", and this flag is never removed, not even by rebooting the machine. It only disappears after manually restarting the nfs service. So the latter does not seem to be equivalent to what happens during a reboot.

Hope this helps to resolve this mysterious issue ;-)

Comment 6 J. Bruce Fields 2012-04-09 17:01:16 UTC
I wonder if this might be the same inode leak as seen in bug 712054.  In which case, https://bugzilla.redhat.com/attachment.cgi?id=559319 (in -310) would be worth testing.

Comment 7 Matt Standiford 2012-10-27 16:54:30 UTC
I'm almost positive that this is the same bug as https://bugzilla.redhat.com/show_bug.cgi?id=717283.

The /var/lock/subsys lockfile changed from version 54 to 60, and now refers to nfsd rather than nfs. Therefore, rc will not pick it up on shutdown.
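
To illustrate why the lockfile name matters, here is a simplified Python model of the check that the EL5 rc script appears to perform at shutdown: for each K* link it strips the "K" and the two-digit priority to get the subsystem name, and only runs the stop script if /var/lock/subsys/<name> (or <name>.init) exists. This is a sketch, not the actual rc code; the function names and the "K20nfs" link name are assumptions chosen for the example.

```python
import os
import re
import tempfile


def stop_scripts_to_run(kill_links, lock_dir):
    """Return the K* links whose subsystem has a lock file in lock_dir.

    Simplified model: the subsystem name is the link name without the
    leading "K" and two-digit priority; a stop script is selected only if
    lock_dir/<name> or lock_dir/<name>.init exists.
    """
    selected = []
    for link in kill_links:
        match = re.fullmatch(r"K\d\d(.+)", link)
        if not match:
            continue
        subsys = match.group(1)
        candidates = (subsys, subsys + ".init")
        if any(os.path.isfile(os.path.join(lock_dir, c)) for c in candidates):
            selected.append(link)
    return selected


def demo():
    """Compare a lock file named "nfsd" with one named "nfs" (hypothetical setup)."""
    with tempfile.TemporaryDirectory() as lock_dir:
        # Only "nfsd" exists: the name does not match the K20nfs link.
        open(os.path.join(lock_dir, "nfsd"), "w").close()
        with_nfsd_only = stop_scripts_to_run(["K20nfs"], lock_dir)
        # Add "nfs": now the name matches and the stop script is selected.
        open(os.path.join(lock_dir, "nfs"), "w").close()
        with_nfs = stop_scripts_to_run(["K20nfs"], lock_dir)
    return with_nfsd_only, with_nfs
```

In this model, a lock file named nfsd leaves the nfs stop script unselected, so the NFS server would never be stopped before the filesystems are unmounted. That would be consistent with the "busy" errors going away after a manual restart of the nfs service.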

Comment 8 RHEL Program Management 2014-01-29 10:36:56 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux release.  Product Management has
requested further review of this request by Red Hat Engineering, for
potential inclusion in a Red Hat Enterprise Linux release for currently
deployed products.  This request is not yet committed for inclusion in
a release.