Bug 150116
Summary: | autofs removed all files in mounted directory! | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Brian J. Murrell <brian> |
Component: | autofs | Assignee: | Chris Feist <cfeist> |
Status: | CLOSED ERRATA | QA Contact: | |
Severity: | high | Docs Contact: | |
Priority: | medium | ||
Version: | 3 | CC: | cfeist, jay.hilliard, jmoyer |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | i386 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2006-04-19 21:05:52 UTC | Type: | --- |
Description
Brian J. Murrell
2005-03-02 16:57:21 UTC
I don't see any obvious errors in the code. To clear things up for me, do you export the home directories with no_root_squash? The automount daemon runs as root, and if the files in your home directory are owned by you, then the daemon should not be able to unlink them. Oh, and bug number 134399 does not exist. Was that a typo? Thanks!

I didn't see anything obviously wrong either. Could be some subtle bug with the mtab parsing perhaps. If I see this happen (or attempt to happen) again, I will put the little mtab parsing in a loop and see what it's doing. As for bug 134399:

    https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=134399

You didn't answer my question about your NFS server setup.

Ah. Yes. My apologies: We do export no_root_squash:

    /export/home *(rw,sync,no_root_squash)

I think there are requirements that root be able to write into some /home places if not other places that we also export and automount. So while this could be a band-aid to prevent "devastation", so could my short-circuit out of the removal process. Both are just workarounds for the bug though of course.

The reason for my query was not to try to get around the problem. I was trying to make sure that autofs actually was able to do such a thing in the first place. Could you enable debugging? You can do that by adding a --debug for the entry in your auto.master:

    /home yp:auto.home --debug

Then, configure syslog to capture debug logs. You can do this by simply adding this line to the end of /etc/syslog.conf:

    *.* /var/log/debug

You will, of course, have to either HUP or restart syslogd. autofs will need to be restarted as well. These logs will be necessary if the problem rears its ugly head again.

My initial guess at what is happening is that it is a race between mount and umount. Autofs expires a directory, and then another process accesses the directory while we're doing the post-expiry cleanup. I'll look into this further. Thanks!
Jeff

Oh, and the bugzilla you mentioned above is not the same problem.

After inspecting the code, it seems remotely possible that this bug may have been encountered due to a locking issue with autofs. The autofs locking has changed with 4.1.4, and should resolve this issue. We will be updating our package to this version. When this happens, I will post information on where to obtain the package to this bugzilla.

Have you seen this issue crop up again in your environment? We are currently working on some patches that should a) keep autofs from unlinking files, and b) provide more debugging information if we run into this bug. Would you be interested in running a debug version of autofs? Thanks.

I'm afraid in the environment that this was seen in (and is no longer being seen for whatever reason) I can't drop debug versions of autofs in. :-(

We've experienced data loss with autofs-4.1.3-131 on RHEL4 U1 x86_64. strace showed automount doing

    rmdir("/data/ada83/CHIC/char/.....etc")

It was walking the path and unlinking files. The mount point in this case was /data/ada83 over nfs3. Here's the NIS automount entry for /data/ADA98

    ada83 -rw,intr,hard,timeo=600,nfsvers=3,tcp,rsize=32768,wsize=32768 fa:/panfs/fa/ada83

We've looked at the recent autofs-4.1.3-149.src.rpm and are still concerned about the following:

    ap.ioctlfd = open(path, O_RDONLY);
    if (ap.ioctlfd < 0) {
        umount_autofs(1);
        return -1;
    }

    stat(path, &st);
    ap.dev = st.st_dev;

Here's a case where umount_autofs() is called, which eventually tries to check whether we're on the same device (via ap.dev), but ap.dev hasn't been assigned yet. So it ends up comparing against an uninitialized variable. I'm really concerned about the safety of our data before using RHEL4 U1 in our environment. Your help is appreciated.

Unfortunately, we've been unable to replicate the problems you've seen. But we have removed the code which unlinks files in -149 and beyond.
So, if autofs does get confused it will error out and should not accidentally start removing files. I'm investigating the code snippet you found to see if I can cause autofs to fail.

In the case mentioned above, state will be equal to ST_INIT. In this case, we do not attempt to unmount anything.

Moving bug to MODIFIED state. (We were unable to replicate, but have modified the code to prevent accidental deletion of files.)
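The thread raises two safeguards: never unlink files during post-expiry cleanup, and never compare against a device number that was not recorded yet. Below is a minimal sketch of how such a guard could look. It is not taken from the autofs source; `struct mount_ctx`, `record_device` and `cleanup_dir` are hypothetical names used only for illustration, and the same-device check is an assumption about what the cleanup is meant to verify.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for per-mount daemon state (not the autofs struct). */
struct mount_ctx {
    dev_t dev;       /* device number of the mount point */
    int   dev_valid; /* nonzero only after dev has been recorded */
};

/* Record the device of the mount point. Returns 0 on success, -1 on error.
 * On failure dev_valid stays 0, so later cleanup refuses to run. */
static int record_device(struct mount_ctx *ctx, const char *path)
{
    struct stat st;

    assert(ctx != NULL && path != NULL);
    ctx->dev_valid = 0;
    if (stat(path, &st) == -1)
        return -1;
    ctx->dev = st.st_dev;
    ctx->dev_valid = 1;
    return 0;
}

/* Remove one leftover directory after expiry.
 * Refuses if the device was never recorded, if path is not a directory,
 * or if path lives on a different device. Only rmdir() is used, so a
 * directory that still has entries is left alone (rmdir fails). */
static int cleanup_dir(const struct mount_ctx *ctx, const char *path)
{
    struct stat st;

    assert(ctx != NULL && path != NULL);
    if (!ctx->dev_valid) {
        errno = EINVAL;   /* device not recorded: do nothing */
        return -1;
    }
    if (lstat(path, &st) == -1)
        return -1;        /* cannot inspect it, so do not touch it */
    if (!S_ISDIR(st.st_mode)) {
        errno = ENOTDIR;  /* regular files are never removed */
        return -1;
    }
    if (st.st_dev != ctx->dev) {
        errno = EXDEV;    /* something else is mounted here */
        return -1;
    }
    return rmdir(path);
}
```

A caller would run `record_device()` once, before any error path that might trigger cleanup, and treat every `-1` from `cleanup_dir()` as "stop and log" rather than retrying with a more aggressive removal.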