From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.6) Gecko/20050317 Firefox/1.0.6

Description of problem:
When trying to unmount a file system which is exported via NFS to a large number of heterogeneous clients, the unmount is denied:

umount: /d1: device is busy

/var/log/messages contains the following traces at the time of the umount:

Aug 31 08:24:36 ganlxsr3 rpc.statd[921]: Received erroneous SM_UNMON request from ganlxsr3 for 193.x.y.z
Aug 31 08:24:36 ganlxsr3 kernel: lockd: couldn't shutdown host module!

No processes seem to be using that filesystem at the moment we tried to unmount it:

# lsof | grep d1
# Gives nothing.

A related problem we also see is that the [lockd] daemon, which had been running correctly until then, was no longer in the process list:

# ps auxww | grep lockd
# Gives nothing

Version-Release number of selected component (if applicable):
kernel 2.4.21-32.0.1.ELsmp

How reproducible:
Sometimes

Steps to Reproduce:
1. Wait a sufficient amount of time (some days/weeks)
2. unexport /d1
3. try to umount /d1
4. "busy" message as above

Actual Results: file system remains mounted

Expected Results: file system should have been unmounted without error

Additional info:
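A minimal sketch of the failing sequence, assuming exportfs is used for the unexport step (the report only says "unexport /d1"):

# Sketch of the reproducer; /d1 is the export from this report.
exportfs -u -a                  # assumed form; report just says "unexport /d1"
umount /d1                      # fails: "umount: /d1: device is busy"
lsof | grep d1                  # shows no local users of the filesystem
ps auxww | grep '\[lockd\]'     # lockd kernel thread is unexpectedly absent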
What happens if you stop nfs (i.e. service nfs stop)? Will that allow you to unmount the filesystem?
*** Bug 167896 has been marked as a duplicate of this bug. ***
The last time I saw the problem on the customer's system, I tried killing nfsd to see if it would help, without result. It doesn't occur all the time; after a certain amount of time (10/20 days, sometimes more), it becomes impossible to umount the FS. Current status on the system is the same (lsof /d1 doesn't report anything, so it shouldn't be in use). My concern is why lockd tries to exit in the first place (the error message above refers to an exit path in the code). Could this be a hint to the problem?
It appears lockd thinks there is an outstanding lock, which might be the reason you can't unmount the filesystem... I'm not sure, but does lsof report on locks?
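One thing worth keeping in mind: lockd holds NLM locks inside the kernel on behalf of remote clients, so they won't necessarily show up against any local process in lsof. /proc/locks should list them either way:

# All kernel file locks, including ones taken on behalf of NFS clients;
# field 5 of each line is the PID recorded for the lock holder.
cat /proc/locks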
Created attachment 118757 [details] result of lsof command when umount was impossible
Well, as you stated before, lsof shows nothing... So to be clear: you're unable to umount /d1 after you bring down the NFS server using the 'service nfs stop' command, correct?

Hmm... I wonder if there are any orphan locks. To check, use the following script:

for i in `cat /proc/locks | grep POSIX | awk '{print $5}'`
do
    # Field 5 of /proc/locks is the PID recorded for the lock holder;
    # if that process still has a /proc/<pid>/stat, the lock is not orphaned.
    [ -f /proc/$i/stat ] && continue
    echo "$i: has an orphan lock"
done
I used that line, in fact with duplicates removed, on the currently running system, on which I do not want to test the umount for now:

# for i in `cat /proc/locks | grep POSIX | awk '{print $5}' | sort -u`; do [ -f /proc/$i/stat ] && continue; echo "$i: has an orphan lock"; done | wc -l
58

So we clearly seem to have some orphaned locks. I then tried these:

# for i in `cat /proc/locks | grep POSIX | awk '{print $5}' | sort -u`; do [ ! -d /proc/$i ] && echo "$i: no entry in /proc"; [ -f /proc/$i/stat ] && continue; echo "$i: has an orphan lock"; done | grep orphan | wc -l
56

# for i in `cat /proc/locks | grep POSIX | awk '{print $5}' | sort -u`; do [ ! -d /proc/$i ] && echo "$i: no entry in /proc"; [ -f /proc/$i/stat ] && continue; echo "$i: has an orphan lock"; done | grep entry | wc -l
56

I then diffed the two result lists: they matched. So all of the orphaned locks refer to process numbers which no longer exist.
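For what it's worth, the two checks can be combined into a single pass; a restatement of the same logic, assuming the same /proc/locks format:

# For each unique PID holding a POSIX lock, report whether the process
# is gone entirely and whether its lock is therefore orphaned.
for i in $(awk '/POSIX/ {print $5}' /proc/locks | sort -u); do
    [ -d /proc/$i ] || echo "$i: no entry in /proc"
    [ -f /proc/$i/stat ] || echo "$i: has an orphan lock"
done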
RHEL3 is now closed.
Hi Wendy, The approach here to add a new export flag seems generally sound to me, but I'm a little concerned about the kernel piece of this patch. If we call nfsd_lockd_unexport(clp) here, won't that invalidate all locks that this client is holding, including ones on filesystems other than the one we're unexporting? I think we may need a more selective kernel routine for dropping the locks here that takes into account which filesystem is being unexported.
Created attachment 130250 [details] proposed kernel patch based on wendy's concept This patch merges Wendy's work with the earlier patch that I had for bz 180524. This adds the nlmsvc_release_device function that my old patch had, and uses her NFSEXP_FOLOCK flag to cue calling it on unexport. This should keep us from killing locks that are on other filesystems. I'm posting this just for discussion. I've not tested this yet to see if it will even build, so it may need some more work. We may also want to consider some more indirection (similar to how nfsd_lockd_unexport wraps nlmsvc_invalidate_client).
No, Jeff, the locks that I drop are associated with one particular export entry, i.e. the pair of (host, its mounted directory), obtained from /proc/fs/nfs/exports. It has a much finer granularity than you think (I've tested this out). More specifically, the patch:

1. Piggybacks on the "exportfs -u" logic, which calls the nfsctl system call *repeatedly*, once for *each* entry in /proc/fs/nfs/exports, and that file is structured as host + (its own exported top directory) pairs.

2. Completely matches how the kernel stores these export entries, i.e. keyed on the host + exported-directory pair. This allows me to drop only the locks associated with that pair, nothing else! The atomic operation also lets us avoid various race conditions.

3. Paves the way for the (future) dynamic load balancing work that I'm looking into for GFS at this moment.

4. Syncs with the 2.6 kernel and upstream direction.

The extra code you add will *not* work well with my intention and the "exportfs -u" logic that I piggyback on. BTW, we're also looking into using different nfsd(s)/lockd(s) to solve the issue for RHEL 4.
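To illustrate the per-entry flow from userspace (a sketch; the client name is hypothetical):

# Each line of /proc/fs/nfs/exports is one (directory, client) pair;
# "exportfs -u" issues one nfsctl call per matching entry.
cat /proc/fs/nfs/exports
exportfs -u client1.example.com:/d1   # hypothetical client; drops just this one entry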
Hmm, wait ... Jeff is right ... The NLM part will close all the files associated with the particular host. I'll check more.
Looking at this further... I think this is what happens: nlmsvc_users is not explicitly initialized anywhere in RHEL 3 (at least I can't find it). So say the memory happened to contain 0xffffffff on an i686 server after bootup and before the nfs service is turned on. Then nfsd brings up lockd via lockd_up(), where nlmsvc_users++ is executed (now nlmsvc_users is 0). After the lockd thread comes to life, it happily serves lock requests in the following loop (2.4.21-43.EL kernel):

while ((nlmsvc_users || !signalled()) && nlmsvc_pid == current->pid)
{
	long timeout = MAX_SCHEDULE_TIMEOUT;
	if (signalled()) {
		spin_lock_irq(&current->sighand->siglock);
		flush_signals(current);
		spin_unlock_irq(&current->sighand->siglock);
		if (nlmsvc_ops) {
			nlmsvc_ops->detach();
			grace_period_expire = set_grace_period();
		}
	}
	...........
	svc_process(serv, rqstp);

	/* Unlock export hash tables */
	if (nlmsvc_ops)
		nlmsvc_ops->exp_unlock();
}

Then if someone sends lockd a signal, !signalled() is no longer true; with nlmsvc_users at 0, lockd falls through the while loop and dies. It doesn't release its locks (for failover) and it also disappears. Could this customer be hitting this?
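To spell out the arithmetic in this theory (a shell illustration, assuming a 32-bit counter):

# 32-bit unsigned wraparound: a counter holding 0xffffffff increments to 0.
printf '%u\n' $(( (0xffffffff + 1) & 0xffffffff ))   # prints 0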
As Ernie explained, the scenario in comment #25 is not possible: Note that adding initializers of 0 to global data has no functional effect, but it does change the addresses where such variables reside. Without initializers, global variables reside in the ".bss" section, which is zeroed *by the kernel* when it first starts up (because space for the variables doesn't exist in the ELF file). With initializers, global variables reside in the ".data" section, which is loaded by the bootstrap into memory from the ELF file.
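A quick userspace way to see the section placement (a sketch, assuming gcc and objdump are available; a nonzero initializer is used since newer GCCs may put zero-initialized globals in .bss regardless):

# Show which ELF section each global lands in.
cat > /tmp/bss_demo.c <<'EOF'
int no_init;        /* no initializer: placed in .bss, zeroed at load time */
int with_init = 1;  /* initialized: placed in .data, stored in the ELF file */
EOF
gcc -c /tmp/bss_demo.c -o /tmp/bss_demo.o
objdump -t /tmp/bss_demo.o | grep -E 'no_init|with_init'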
Created attachment 130791 [details] 2.6 patch that may be applicable here too This patch is from 2.6.12-ish, and was apparently applied to the 2.4 series around 2.4.30. It seems to resolve a similar problem on RHEL4 (reproducible with the connectathon lock test as described earlier). A similar patch may be what's needed here.
Reposting info from a lost BZ update. The above patch does seem to resolve the issue that we've replicated so far with connectathon lock test 7. That reproducer is documented in bug 194367. It's too late for U8, but if there is a U9 we'll try to get this in there (or maybe in an errata).
A fix for this problem has just been committed to the RHEL3 U8 patch pool this evening (in kernel version 2.4.21-45.EL).
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2006-0437.html