From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.6) Gecko/20050317 Firefox/1.0.6

Description of problem:
When trying to unmount a file system which is exported via NFS to a large number of heterogeneous clients, the unmount is denied:

umount: /d1: device is busy

/var/log/messages contains the following traces at the time of the umount:

Aug 31 08:24:36 ganlxsr3 rpc.statd[921]: Received erroneous SM_UNMON request from ganlxsr3 for 193.x.y.z
Aug 31 08:24:36 ganlxsr3 kernel: lockd: couldn't shutdown host module!

No processes seem to be using that filesystem at the moment we tried to unmount it:

# lsof | grep d1
# Gives nothing.

A related problem we also see is that the [lockd] daemon, which had been running correctly until then, was no longer in the process list:

# ps auxww | grep lockd
# Gives nothing

Version-Release number of selected component (if applicable):
kernel 2.4.21-32.0.1.ELsmp

How reproducible:
Sometimes

Steps to Reproduce:
1. Wait a sufficient amount of time (some days/weeks)
2. unexport /d1
3. try to umount /d1
4. "busy" message as above

Actual Results: file system remains mounted

Expected Results: file system should have been unmounted without error

Additional info:
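A minimal sketch of the failing sequence, assuming exportfs is used for the unexport step (the report only says "unexport /d1"):

# Sketch of the reproducer; /d1 is the export from this report.
exportfs -u -a                  # assumed form; report just says "unexport /d1"
umount /d1                      # fails: "umount: /d1: device is busy"
lsof | grep d1                  # shows no local users of the filesystem
ps auxww | grep '\[lockd\]'     # lockd kernel thread is unexpectedly absent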
What happens if you stop nfs (i.e. service nfs stop)? Will that allow you to unmount the filesystem?
*** Bug 167896 has been marked as a duplicate of this bug. ***
The last time I saw the problem on the customer's system, I tried killing nfsd to see if it would help, without result. It doesn't occur all the time; after a certain amount of time (10/20 days, sometimes more), it becomes impossible to umount the FS. Current status on the system is the same (lsof /d1 doesn't report anything, so it shouldn't be in use). My concern is why lockd tries to exit in the first place (the error message above refers to an exit path in the code). Could this be a hint to the problem?
It appears lockd thinks there is an outstanding lock, which might be the reason you can't unmount the filesystem... I'm not sure, but does lsof report on locks?
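One thing worth keeping in mind: lockd holds NLM locks inside the kernel on behalf of remote clients, so they won't necessarily show up against any local process in lsof. /proc/locks should list them either way:

# All kernel file locks, including ones taken on behalf of NFS clients;
# field 5 of each line is the PID recorded for the lock holder.
cat /proc/locks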
Created attachment 118757 [details] result of lsof command when umount was impossible
Well, as you stated before, lsof shows nothing... So to be clear: you're unable to umount /d1 after you bring down the NFS server using the 'service nfs stop' command, correct?

Hmm... I wonder if there are any orphan locks. To check, use the following script:

for i in `cat /proc/locks | grep POSIX | awk '{print $5}'`
do
    # Field 5 of /proc/locks is the PID recorded for the lock holder;
    # if that process still has a /proc/<pid>/stat, the lock is not orphaned.
    [ -f /proc/$i/stat ] && continue
    echo "$i: has an orphan lock"
done
I used that line, in fact with duplicates removed, on the currently running system, on which I do not want to test the umount for now:

# for i in `cat /proc/locks | grep POSIX | awk '{print $5}' | sort -u`; do [ -f /proc/$i/stat ] && continue; echo "$i: has an orphan lock"; done | wc -l
58

So we clearly seem to have some orphaned locks. I then tried these:

# for i in `cat /proc/locks | grep POSIX | awk '{print $5}' | sort -u`; do [ ! -d /proc/$i ] && echo "$i: no entry in /proc"; [ -f /proc/$i/stat ] && continue; echo "$i: has an orphan lock"; done | grep orphan | wc -l
56

# for i in `cat /proc/locks | grep POSIX | awk '{print $5}' | sort -u`; do [ ! -d /proc/$i ] && echo "$i: no entry in /proc"; [ -f /proc/$i/stat ] && continue; echo "$i: has an orphan lock"; done | grep entry | wc -l
56

I then diffed the two result lists: they matched. So all of the orphaned locks refer to process numbers which no longer exist.
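For what it's worth, the two checks can be combined into a single pass; a restatement of the same logic, assuming the same /proc/locks format:

# For each unique PID holding a POSIX lock, report whether the process
# is gone entirely and whether its lock is therefore orphaned.
for i in $(awk '/POSIX/ {print $5}' /proc/locks | sort -u); do
    [ -d /proc/$i ] || echo "$i: no entry in /proc"
    [ -f /proc/$i/stat ] || echo "$i: has an orphan lock"
done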
RHEL3 is now closed.
Hi Wendy, The approach here to add a new export flag seems generally sound to me, but I'm a little concerned about the kernel piece of this patch. If we call nfsd_lockd_unexport(clp) here, won't that invalidate all locks that this client is holding, including ones on filesystems other than the one we're unexporting? I think we may need a more selective kernel routine for dropping the locks here that takes into account which filesystem is being unexported.
Created attachment 130250 [details] proposed kernel patch based on wendy's concept This patch merges Wendy's work with the earlier patch that I had for bz 180524. This adds the nlmsvc_release_device function that my old patch had, and uses her NFSEXP_FOLOCK flag to cue calling it on unexport. This should keep us from killing locks that are on other filesystems. I'm posting this just for discussion. I've not tested this yet to see if it will even build, so it may need some more work. We may also want to consider some more indirection (similar to how nfsd_lockd_unexport wraps nlmsvc_invalidate_client).
No, Jeff, the locks that I drop are associated with one particular export entry, i.e. the pair of (host, its mounted directory), obtained from /proc/fs/nfs/exports. It has a much finer granularity than you think (I've tested this out). More specifically, the patch:

1. Piggybacks on the "exportfs -u" logic, which calls the nfsctl system call *repeatedly*, once for *each* entry in /proc/fs/nfs/exports, and that file is structured as host + (its own exported top directory) pairs.

2. Completely matches how the kernel stores these export entries, i.e. keyed on the host + exported-directory pair. This allows me to drop only the locks associated with that pair, nothing else! The atomic operation also lets us avoid various race conditions.

3. Paves the way for the (future) dynamic load balancing work that I'm looking into for GFS at this moment.

4. Syncs with the 2.6 kernel and upstream direction.

The extra code you add will *not* work well with my intention and the "exportfs -u" logic that I piggyback on. BTW, we're also looking into using different nfsd(s)/lockd(s) to solve the issue for RHEL 4.
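To illustrate the per-entry flow from userspace (a sketch; the client name is hypothetical):

# Each line of /proc/fs/nfs/exports is one (directory, client) pair;
# "exportfs -u" issues one nfsctl call per matching entry.
cat /proc/fs/nfs/exports
exportfs -u client1.example.com:/d1   # hypothetical client; drops just this one entry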
Hmm, wait ... Jeff is right ... The NLM part will close all the files associated with the particular host. I'll check more.
Looking at this further... I think this is what happens: nlmsvc_users is not explicitly initialized anywhere in RHEL 3 (at least I can't find it). So say the memory happened to contain 0xffffffff on an i686 server after bootup and before the nfs service is turned on. Then nfsd brings up lockd via lockd_up(), where nlmsvc_users++ is executed (now nlmsvc_users is 0). After the lockd thread comes to life, it happily serves lock requests in the following loop (2.4.21-43.EL kernel):

while ((nlmsvc_users || !signalled()) && nlmsvc_pid == current->pid)
{
	long timeout = MAX_SCHEDULE_TIMEOUT;
	if (signalled()) {
		spin_lock_irq(&current->sighand->siglock);
		flush_signals(current);
		spin_unlock_irq(&current->sighand->siglock);
		if (nlmsvc_ops) {
			nlmsvc_ops->detach();
			grace_period_expire = set_grace_period();
		}
	}
	...........
	svc_process(serv, rqstp);

	/* Unlock export hash tables */
	if (nlmsvc_ops)
		nlmsvc_ops->exp_unlock();
}

Then if someone sends lockd a signal, !signalled() is no longer true; with nlmsvc_users at 0, lockd falls through the while loop and dies. It doesn't release its locks (for failover) and it also disappears. Could this customer be hitting this?
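To spell out the arithmetic in this theory (a shell illustration, assuming a 32-bit counter):

# 32-bit unsigned wraparound: a counter holding 0xffffffff increments to 0.
printf '%u\n' $(( (0xffffffff + 1) & 0xffffffff ))   # prints 0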
As Ernie explained, the scenario in comment #25 is not possible: Note that adding initializers of 0 to global data has no functional effect, but it does change the addresses where such variables reside. Without initializers, global variables reside in the ".bss" section, which is zeroed *by the kernel* when it first starts up (because space for the variables doesn't exist in the ELF file). With initializers, global variables reside in the ".data" section, which is loaded by the bootstrap into memory from the ELF file.
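A quick userspace way to see the section placement (a sketch, assuming gcc and objdump are available; a nonzero initializer is used since newer GCCs may put zero-initialized globals in .bss regardless):

# Show which ELF section each global lands in.
cat > /tmp/bss_demo.c <<'EOF'
int no_init;        /* no initializer: placed in .bss, zeroed at load time */
int with_init = 1;  /* initialized: placed in .data, stored in the ELF file */
EOF
gcc -c /tmp/bss_demo.c -o /tmp/bss_demo.o
objdump -t /tmp/bss_demo.o | grep -E 'no_init|with_init'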
Created attachment 130791 [details] 2.6 patch that may be applicable here too This patch is from 2.6.12-ish, and was apparently applied to the 2.4 series around 2.4.30. It seems to resolve a similar problem on RHEL4 (reproducible with the connectathon lock test as described earlier). A similar patch may be what's needed here.
Reposting info from a lost BZ update. The above patch does seem to resolve the issue that we've replicated so far with connectathon lock test 7. That reproducer is documented in bug 194367. It's too late for U8, but if there is a U9 we'll try to get this in there (or maybe in an errata).
A fix for this problem has just been committed to the RHEL3 U8 patch pool this evening (in kernel version 2.4.21-45.EL).
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2006-0437.html