Bug 655726
Summary: | /etc/init.d/nfs doesn't create proper subsys lock file | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Sandro Bonazzola <sandro.bonazzola> |
Component: | nfs-utils | Assignee: | Steve Dickson <steved> |
Status: | CLOSED DUPLICATE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | urgent | Docs Contact: | |
Priority: | low | ||
Version: | 14 | CC: | dougsland, gansalmon, iarlyy, itamar, jlayton, jonathan, kernel-maint, madhu.chinakonda, notting, plautrba, rwahl, steved |
Target Milestone: | --- | Keywords: | Regression |
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2010-12-01 17:07:35 UTC | Type: | --- |
Description
Sandro Bonazzola
2010-11-22 10:02:52 UTC
I've added some debug commands to /etc/init.d/functions to check whether /var/lib and /var/log are unmounted and, using lsof, whether something is still using /var. /var/lib and /var/log are unmounted correctly, so the awk scripts seem to work fine. However, lsof doesn't show anything still using /var, yet umount keeps saying it's busy.

Mmm... I've just added the line:

    fuser -mv /var >/root/out.txt

and after the reboot out.txt contains just "kernel".

After a deeper investigation it seems that this only happens if something in /var is exported over NFS. The user-space programs are all terminated, but knfsd still tries to mount the exported paths even during shutdown. So in the end it doesn't seem to be an initscript issue but a kernel issue. For reference: kernel-2.6.35.6-48.fc14.i686

It is also reproducible when switching to init 1. After calling telinit 1, I had to stop cups, nfs, nfslock, rpcbind, rpcidmapd and rsyslog. I unmounted /var/lib and /var/log. fuser -mv /var shows PID:kernel ACCESS:mount, and umount -v /var says /var is busy. After waiting some minutes I saw a message saying that the last nfs server thread had exited, and then umount /var worked fine.

lsof and fuser can't detect any process that claims access to any file in /var. Just to be sure, I repeated the test calling exportfs -u -a before unmounting /var, with the same result. My exports include /var/lib, but /var/lib unmounted successfully. I can't explain why /var is still busy.

I've extended the test by exporting other paths on other partitions. The problem exists on every single partition that contains an NFS-exported path. However, adding the command:

    exportfs -u -a

just before __umount_loop in /etc/init.d/halt allows a clean umount of all the partitions, with the sole exception of /var. Maybe it could be caused by something at the kernel nfsd level that holds a reference to something in /var/lib/nfs?
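The manual checks described above can be collected into one small script. This is only a sketch: the names run, diagnose_busy_mount and DRY_RUN are hypothetical and not part of any Fedora package. By default it only prints the commands; running them for real requires root and an actual busy mount point.

```shell
#!/bin/sh
# Sketch of the manual diagnosis from the report.
# DRY_RUN=1 (the default) only prints each command; DRY_RUN=0 executes it.
DRY_RUN=${DRY_RUN:-1}

# run: hypothetical helper that prints or executes a command.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# diagnose_busy_mount: hypothetical helper repeating the reporter's steps.
diagnose_busy_mount() {
    mnt=$1
    run fuser -mv "$mnt"     # list users of the mount ("kernel" in the report)
    run exportfs -u -a       # unexport all NFS exports
    run umount -v "$mnt"     # retry the umount verbosely
}

diagnose_busy_mount /var
```

In dry-run mode this prints the three commands in order, which makes it easy to paste the same sequence into /etc/init.d/halt for debugging.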
Looking carefully at the shutdown procedure, it seems that the NFS service is never stopped before /etc/init.d/halt is executed. I added an explicit call to /etc/init.d/nfs stop as the first action in /etc/init.d/halt. Now I can see the NFS service stopping, but the nfsd kernel server is still up when the umount loop begins.

nfs-utils-1.2.3-1.fc14.i686
initscripts-9.20.1-1.fc14.i686

I also tried adding

    exportfs -u -a
    rpc.nfsd -- 0

just before killproc nfsd -2 in /etc/init.d/nfs, with no effect. So it seems there are two bugs:
- the shutdown procedure doesn't stop all the services before calling halt
- the nfsd kernel server doesn't stop.

OK, part 1: the NFS service is not stopped during the shutdown procedure. In initscripts-9.20.1-1.fc14.i686, file /etc/rc:

    for i in /etc/rc$runlevel.d/K* ; do
        subsys=${i#/etc/rc$runlevel.d/K??}
        [ -f /var/lock/subsys/$subsys ] || [ -f /var/lock/subsys/$subsys.init ] || continue
        ...

So during shutdown the script finds /etc/rc0.d/K60nfs, sets subsys to nfs, and then checks for /var/lock/subsys/nfs. Looking at /etc/init.d/nfs in nfs-utils-1.2.3-1.fc14.i686, the script creates

    /var/lock/subsys/rpc.mountd
    /var/lock/subsys/nfsd

but doesn't create /var/lock/subsys/nfs, so the nfs service is never stopped. Adding the following line to the start case:

    touch /var/lock/subsys/nfs

allows the service to stop properly. Adding this to the stop case:

    rm /var/lock/subsys/nfs

removes the lock file. I hope this will be fixed as soon as possible.

Verifying the patch from comment #8: the shutdown sequence is now properly restored, with the nfs service stopped at step 60 and rpc at step 87. /var is no longer busy and can be unmounted properly.

In the end this is only an nfs-utils bug that could be fixed in a few minutes. Please push the update as soon as possible.

Any additional info needed?

Added keyword Regression, since this bug is not present in, for example, Fedora 10.

*** Bug 656003 has been marked as a duplicate of this bug. ***

I see this on my F14 machines as well.
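The lock-file check quoted from /etc/rc can be reproduced in isolation to show why K60nfs is skipped. The sketch below is a simulation under stated assumptions: it builds a throwaway tree with mktemp instead of touching the real /etc and /var, and the function name would_stop is hypothetical.

```shell
#!/bin/sh
# Simulate the K-script selection loop from /etc/rc in a scratch directory.
root=$(mktemp -d)
runlevel=0
mkdir -p "$root/etc/rc$runlevel.d" "$root/var/lock/subsys"
touch "$root/etc/rc$runlevel.d/K60nfs"

# Lock files that /etc/init.d/nfs creates according to the report.
touch "$root/var/lock/subsys/rpc.mountd" "$root/var/lock/subsys/nfsd"

# would_stop (hypothetical): print each subsystem whose K script would run.
would_stop() {
    for i in "$root"/etc/rc$runlevel.d/K* ; do
        # Strip the directory and the K?? prefix: K60nfs -> nfs
        subsys=${i#"$root"/etc/rc$runlevel.d/K??}
        [ -f "$root/var/lock/subsys/$subsys" ] || \
            [ -f "$root/var/lock/subsys/$subsys.init" ] || continue
        echo "$subsys"
    done
}

before=$(would_stop)                  # no "nfs" lock file yet
touch "$root/var/lock/subsys/nfs"     # the proposed one-line fix
after=$(would_stop)

echo "before fix: '$before'"
echo "after fix:  '$after'"
rm -rf "$root"
```

Before the lock file exists the loop selects nothing; once /var/lock/subsys/nfs is present, nfs is selected for stopping, which matches the behaviour reported after the fix.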
A workmate told me of similar problems with his F13 box. It looks like the hard-coded /var/lock/subsys/XXX paths and some generic code that makes assumptions about the names do not play well together. The init scripts should probably not deal with these lock files directly at all, since that seems to be error-prone; the handling should be abstracted into functions.

I'm not sure this bug is correctly assigned. Can anyone assign it to the correct person?

*** This bug has been marked as a duplicate of bug 652786 ***
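The suggestion made in the comments, to move lock-file handling out of individual init scripts and into shared functions, could look roughly like the sketch below. All helper names (subsys_name, subsys_lock, subsys_unlock, subsys_is_locked) and the LOCKDIR variable are hypothetical; nothing here exists in initscripts. LOCKDIR defaults to a scratch directory so the sketch runs without touching /var/lock/subsys.

```shell
#!/bin/sh
# Hypothetical shared helpers for subsys lock files.
LOCKDIR=${LOCKDIR:-$(mktemp -d)}

# Derive the lock name from the script name, i.e. the same name the rc loop
# derives from the K??name link, so the two can no longer disagree.
subsys_name()      { basename "$1"; }
subsys_lock()      { touch "$LOCKDIR/$(subsys_name "$1")"; }
subsys_unlock()    { rm -f "$LOCKDIR/$(subsys_name "$1")"; }
subsys_is_locked() { [ -f "$LOCKDIR/$(subsys_name "$1")" ]; }

# Usage as it might appear in the start and stop cases of an init script.
subsys_lock /etc/init.d/nfs
subsys_is_locked /etc/init.d/nfs && echo "nfs locked"
subsys_unlock /etc/init.d/nfs
subsys_is_locked /etc/init.d/nfs || echo "nfs unlocked"
```

Because the lock name is computed rather than typed by hand, a script cannot create rpc.mountd and nfsd locks while forgetting the nfs lock that the rc loop looks for.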