It appears that on F16 the mounts made by autofs do not expire. I am using a configuration unchanged since the stone age; /etc/auto.master just has entries for /misc and /net, and ends with "+auto.master". nsswitch.conf has "automount: files ldap". In ldap is a simple nisObject configuration; no expiry times are specified. /etc/sysconfig/autofs specifies TIMEOUT=300.

Mounting things works fine, and on older releases (F14, at least) the same configuration had no problems expiring mounts. Just dug out an F15 machine and tested it to find it appears to have the same problem.

Trying to understand what's happening, I set LOGGING="debug" and see that it does at least attempt to expire a mount:

Feb 22 13:37:58 epithumia automount[3696]: st_expire: state 1 path /home
Feb 22 13:37:58 epithumia automount[3696]: expire_proc: exp_proc = 140160062736128 path /home
Feb 22 13:37:58 epithumia automount[3696]: expire_proc_indirect: expire /home/tibbs
Feb 22 13:37:58 epithumia automount[3696]: expire_proc_indirect: expire /home/dave
Feb 22 13:37:58 epithumia automount[3696]: 2 remaining in /home
Feb 22 13:37:58 epithumia automount[3696]: expire_cleanup: got thid 140160062736128 path /home stat 5
Feb 22 13:37:58 epithumia automount[3696]: expire_cleanup: sigchld: exp 140160062736128 finished, switching from 2 to 1
Feb 22 13:37:58 epithumia automount[3696]: st_ready: st_ready(): state = 2 path /home

/home/dave is completely unreferenced; I just did df /home/dave five minutes before that message. But nothing seems to be unmounted and I'm not sure why. Is there a way to get any more information about why it decided not to umount that filesystem?

I suppose it is possible that something is doing a stat on it and thus keeping it active, but the entries persist even when nobody is logged in at all. (Which still doesn't rule out the possibility, I guess.)
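When comparing many machines, it can help to pull just the "N remaining in PATH" lines out of the debug log, since those show how many mounts each expire run left behind. The following is a minimal sketch that assumes the syslog line format shown in this report; the function name `remaining_after_expire` is made up for illustration and is not part of autofs.

```python
import re

# Matches debug lines such as:
#   "... automount[3696]: 2 remaining in /home"
# (format assumed from the log excerpt in this report)
REMAINING_RE = re.compile(r"automount\[\d+\]: (\d+) remaining in (\S+)")


def remaining_after_expire(log_text):
    """Return a list of (path, count) for each 'N remaining in PATH' line."""
    return [(m.group(2), int(m.group(1)))
            for m in REMAINING_RE.finditer(log_text)]


sample = (
    "Feb 22 13:37:58 epithumia automount[3696]: "
    "expire_proc_indirect: expire /home/dave\n"
    "Feb 22 13:37:58 epithumia automount[3696]: 2 remaining in /home\n"
)
print(remaining_after_expire(sample))
```

A count that never drops across successive expire runs suggests that something keeps the mounts looking busy or recently used.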
Ugh, I'm a terrible reporter. Currently running kernel 3.2.6-3.fc16.x86_64 (though this was an issue on previous F16 kernels as well) and autofs-5.0.6-5.fc16.x86_64. Not sure if it matters, but util-linux-2.20.1-2.2.fc16.x86_64 (for mount/umount) and nfs-utils-1.2.5-4.fc16.x86_64.
(In reply to comment #1)
> Ugh, I'm a terrible reporter. Currently running kernel 3.2.6-3.fc16.x86_64
> (though this was an issue on previous F16 kernels as well) and
> autofs-5.0.6-5.fc16.x86_64. Not sure if it matters, but
> util-linux-2.20.1-2.2.fc16.x86_64 (for mount/umount) and
> nfs-utils-1.2.5-4.fc16.x86_64.

I don't know what's going on there. I have F16 with kernel-3.2.6-3 and I don't see a problem with expires, even if I install autofs-5.0.6-5. But at the moment I have a timeout of 60 seconds. I'll try later with the default of 300; maybe something is scanning file systems.
Just to show it's not just one weird machine, I'm seeing this on about 120 machines running various F16 kernels, and on the 15 or so F15 machines I still have around as well, so it must be something specific about my setup instead of some unfortunate random set of circumstances. I found someone else on IRC who appears to be having the same problem. However, I know that at some point mounts must expire somehow, because on some machines I can see that people logged in, say, a week ago and their home directories aren't mounted even though there's been no intervening reboot or autofs update. But it certainly doesn't happen after five minutes as configured. I'll configure the timeout down a bit and see what happens.
Problem solved.

I set timeouts down to 30 seconds and did an experiment. On two machines I ran "df /home/dave" to mount it, then on one machine ran

watch grep dave /proc/mounts

and on the other

watch df\|grep dave

On the former, /home/dave unmounts properly after about 30 seconds. On the latter it never unmounts. So it seems that simply running df is sufficient to mark the filesystem as "accessed" and prevent it from being unmounted.

Why is this important? Because I have something scanning every machine on the network every two minutes, and one of the things it does is pull a list of filesystems. Now, this system has been in place for many years now, so at some time in the not too distant past running df turned into enough of an "access" to reset the filesystem expiry.

No big deal; I'll just crank the timeout way down. I just wonder if this was intentional.
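For a monitoring script that only needs to know what is mounted, the experiment above suggests reading /proc/mounts instead of running df, since the grep on /proc/mounts did not stop the mount from expiring. A minimal sketch under that assumption follows; `mounted_paths` and `is_mounted` are hypothetical helper names, not part of any existing tool.

```python
def mounted_paths(mounts_text):
    """Return the set of mount points listed in /proc/mounts-style text."""
    points = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            # The second field is the mount point; the kernel writes
            # spaces inside it as the octal escape \040.
            points.add(fields[1].replace("\\040", " "))
    return points


def is_mounted(path, mounts_file="/proc/mounts"):
    """Check for a mount point by reading the mount table only.

    This opens mounts_file and nothing else; it does not stat or
    statfs the path being asked about.
    """
    with open(mounts_file) as f:
        return path in mounted_paths(f.read())
```

Usage would be something like `is_mounted("/home/dave")`. This reports presence only; it cannot supply the disk usage figures that df provides.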
(In reply to comment #4)
> Problem solved.
>
> I set timeouts down to 30 seconds and did an experiment. On two machines I ran
> "df /home/dave" to mount it, then on one machine ran
> watch grep dave /proc/mounts
> and on the other
> watch df\|grep dave
>
> On the former, /home/dave unmounts properly after about 30 seconds. On the
> latter it never unmounts. So it seems that simply running df is sufficient to
> mark the filesystem as "accessed" and prevent it from being unmounted.

Yes, that has changed back to what it used to be (quite a long time ago), from about 2.6.39. So any access will prevent the mount from expiring.

This reduces expire/mount activity quite a bit and, well, I had some complaints about the change originally as well. I'm reluctant to revert that change because the way it is now is I believe the way it should be and is the way it originally was.

Perhaps a kernel module load parameter to enable use of the previous semantic would be sufficient?
Man, I've been running this homebrew monitoring system for a really long time. Back in the Red Hat 7.0, 2.2 kernel days, even. I don't recall df ever preventing autofs unmounting like that, but who knows. It's a perfectly reasonable behavior, just unexpected and I'm certainly not going to worry about getting the old behavior back.
(In reply to comment #6)
> Man, I've been running this homebrew monitoring system for a really long time.
> Back in the Red Hat 7.0, 2.2 kernel days, even. I don't recall df ever
> preventing autofs unmounting like that, but who knows. It's a perfectly
> reasonable behavior, just unexpected and I'm certainly not going to worry about
> getting the old behavior back.

Phew, that's a relief, thanks. A lot has changed since 2.2, of course.

It is still puzzling though. It's the actual traversal of a path that will update the expire counter and I don't think df by itself will do that unless you supply a path to it.
Just a plain 'df' is sufficient to reset the expiry counter as far as I can tell. If I run it more frequently than the autofs expiry time, nothing will ever unmount. Now, maybe df is doing more than just calling statfs, but a quick strace doesn't show that. So I guess simply calling statfs is indeed sufficient to reset the expiry counter. I'm not really sure if that's the expected behavior, but it does seem a bit counterintuitive.
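For reference, this is roughly the kind of call a homebrew disk-usage monitor ends up making for each filesystem. The sketch uses Python's `os.statvfs`, which wraps statvfs(3); on Linux with glibc that is implemented on top of statfs(2). The function name `usage_percent` is an illustrative assumption, and the percentage is a simplification of what df prints.

```python
import os


def usage_percent(path):
    """Return the percentage of blocks in use on the filesystem at path.

    Note: per the observations in this thread, making this call on a
    path under an autofs mount is enough to count as an access.
    """
    st = os.statvfs(path)
    total = st.f_blocks
    if total == 0:
        # Pseudo file systems can report zero blocks.
        return 0.0
    used = total - st.f_bfree
    return 100.0 * used / total


print(usage_percent("/"))
```

If the observation above holds, running something like this against every automounted path more often than the autofs timeout would keep all of them mounted.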
(In reply to comment #8)
> Just a plain 'df' is sufficient to reset the expiry counter as far as I can
> tell. If I run it more frequently than the autofs expiry time, nothing will
> ever unmount.
>
> Now, maybe df is doing more than just calling statfs, but a quick strace
> doesn't show that. So I guess simply calling statfs is indeed sufficient to
> reset the expiry counter. I'm not really sure if that's the expected behavior,
> but it does seem a bit counterintuitive.

I'll have a look at old and new code and see if I can understand why this has changed.
(In reply to comment #9)
> (In reply to comment #8)
> > Just a plain 'df' is sufficient to reset the expiry counter as far as I can
> > tell. If I run it more frequently than the autofs expiry time, nothing will
> > ever unmount.

It is, and looking as far back as 2.6.9, the expire counter would also be updated for every path walk. But that would have changed at about 2.6.18 to only updating the counter if the dentry was really busy, meaning it belonged to an open file or was a process's working directory.

> > Now, maybe df is doing more than just calling statfs, but a quick strace
> > doesn't show that. So I guess simply calling statfs is indeed sufficient to
> > reset the expiry counter. I'm not really sure if that's the expected behavior,
> > but it does seem a bit counterintuitive.

It could be a combination of things.

Perhaps, somehow, the symlinking of /etc/mtab to /proc/mounts is causing this, possibly in combination with a change that made statfs(2) trigger automounts, which it didn't do before.

When I saw the statfs(2) patch I thought it was reasonable, since a statfs(2) of an autofs file system is not really useful and you want the info of the file system that would be mounted. Obviously someone had a problem like that, since there was a patch posted.

Neither of the above changes was my doing, so if we want to make further changes we will need clear evidence of problems and reasons why we want it changed, especially the mtab symlink change.

Ian
I did a yum update the other day and now I see I have the same problem. I don't see the problem at run level 3 and the fact that I did a yum update has to mean it's something in the GUI that is causing this.
Somehow the component got changed from Fedora to Entitlements and all sorts of things changed with it. Trying to get it set back properly.
(In reply to comment #12)
> Somehow the component got changed from Fedora to Entitlements and all sorts of
> things changed with it. Trying to get it set back properly.

Oops, I didn't notice, but I didn't change it myself either....
The ticket status seems good now.

In any case, I think everyone can agree that software (perhaps the desktop environment, perhaps some monitoring system) is quite justified in occasionally calling statfs to keep track of disk usage. A reasonable frequency for this call is up for debate, of course, but if the interval between calls happens to be any shorter than the autofs expiry time then no mount will ever go away.

I guess it then remains for someone to decide whether there is any benefit to statfs resetting the expiry time, especially in light of the above. Personally I don't see it, but it is certain that there are plenty of facts I'm not aware of.
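The timing argument can be written down as a one-line model. This is a deliberately simplified sketch: it assumes every statfs call resets the idle clock and that a mount expires once it has been idle for longer than the timeout, and it ignores how often the daemon actually runs its expire checks. The function name `can_expire` is made up for illustration.

```python
def can_expire(poll_interval_s, timeout_s):
    """Simplified model: each poll resets the idle clock, so the mount
    can only expire if the gap between polls exceeds the timeout."""
    return poll_interval_s > timeout_s


# Figures from this report: a scan every two minutes.
print(can_expire(120, 300))  # False: with TIMEOUT=300 nothing unmounts
print(can_expire(120, 30))   # True: with the timeout cranked down to 30
```

Under this model the only ways to let mounts go away are to poll less often than the timeout, lower the timeout below the poll interval, or stop the poll from counting as an access.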
(In reply to comment #14)
> The ticket status seems good now.
>
> In any case, I think everyone can agree that software (perhaps the desktop
> environment, perhaps some monitoring system) is quite justified in occasionally
> calling statfs to keep track of disk usage. A reasonable frequency for this
> call is up for debate, of course, but if it happens to be any lower than the
> autofs expiry time then no mount will ever go away.

That's the way it has been since the statfs(2) kernel change and probably isn't unreasonable.

Although it also means that if you use the browse option and statfs(2) a mount point path, it will cause it to mount. That's pretty much the stat(2) mount storm problem all over again. Fortunately statfs(2) is not normally called in this way, and when it is called you probably do want to know about the mount that is mounted, since the autofs entry information is from a pseudo file system and generally isn't useful.

OTOH many system monitoring systems probably use statfs(2) a lot, and if they run frequently and cause many mounts or prevent mounts from being umounted, that could be enough to warrant the statfs(2) change being reverted.

> I guess it then remains for someone to decide whether there is any benefit to
> statfs resetting the expiry time, especially in light of the above. Personally
> I don't see it, but it is certain that there are plenty of facts I'm not aware
> of.

All we need is a couple of bugs with a root cause of the statfs(2) change and I can post a revert and see who complains.

But there's something else going on here.

The testing that I've done due to comment #11 shows that there are no frequent path walks occurring, and the last_used counter doesn't get checked because the dentry looks busy before it even gets to it.

Using a simple single indirect automount that should have a single open file handle on it, when I do an lsof I see 4 occurrences of the file handle. That can't be due to a thread not closing the file handle, because that particular one is opened in a thread created "after" the other three threads.

Now, I'm not sure why things appear to work in run level three; I'll have to check that again. At the moment I'm trying to go back in glibc revisions to see if that changes anything.

This is a really weird problem.
Similar issue here (see below for detail on system info): Autofs sysconfig has been set to TIMEOUT=4 and NEGATIVE_TIMEOUT=1 to make sure the DVD is unmounted and allows for ejection using the button on the drive shortly after reading. In some conditions, autofs does not timeout and never releases the mount. The consequence of this problem for us is that the physical eject button never ejects the disk. Restarting autofs or issuing an eject works but is not an option for us at the moment. With Autofs 5.0.7 (compiled and installed using rpmbuild and yum on Fedora 15), problem occurs only when trying to access DVD mount folder shortly after DVD is inserted. With 5.0.5-38 or -39, occurs sporadically, even after fresh restart of autofs service. I have attached a script testautofs to reproduce as well as a debug output of automount in testautofs.log . Note that you can see the proper expiration happening after 4s for the scenario without eject around line 69 (handle_packet: type = 6), however, no such thing for the second test with eject prior to accessing the DVD. In that case, we see the system being stuck in a loop ("expire_proc_direct: send expire to trigger /usr/BDV/Interfaces/DVD" remains without response?) Any idea? 
------------------------- INFO --------------------------------

$ cat /etc/auto.master
/- /etc/auto.misc
+auto.master

$ cat /etc/auto.misc
/usr/BDV/Interfaces/DVD -fstype=iso9660,ro,nosuid,nodev :/dev/sr0

$ cat /etc/sysconfig/autofs|grep -v "#"
TIMEOUT=4
NEGATIVE_TIMEOUT=1
BROWSE_MODE="no"
MOUNT_NFS_DEFAULT_PROTOCOL=4
LOGGING="debug"
USE_MISC_DEVICE="yes"

$ cat /etc/fedora-release
Fedora release 15 (Lovelock)

$ automount -V
Linux automount version 5.0.7-1
Directories:
config dir: /etc/sysconfig
maps dir: /etc
modules dir: /usr/lib64/autofs
Compile options:
DISABLE_MOUNT_LOCKING ENABLE_IGNORE_BUSY_MOUNTS WITH_HESIOD WITH_LDAP WITH_SASL LIBXML2_WORKAROUND

$ ./ver_linux
Linux doris 2.6.39.1 #1 SMP PREEMPT Wed Oct 5 17:26:29 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
Gnu C 4.6.1
Gnu make 3.82
binutils 2.21.51.0.6
util-linux 2.19.1
mount support
module-init-tools 3.16
e2fsprogs 1.41.14
xfsprogs 3.1.4
pcmciautils 017
quota-tools 4.00-pre1.
PPP 2.4.5
Linux C Library 2.14
Dynamic linker (ldd) 2.14
Procps 3.2.8
Net-tools 1.60
Kbd 1.15.2
Sh-utils 8.10
wireless-tools 29
Modules Loaded nls_utf8 ppdev parport_pc lp parport sunrpc snd_hda_codec_realtek nvidia snd_hda_intel snd_hda_codec snd_hwdep usblp snd_seq ftdi_sio snd_seq_device snd_pcm usbserial snd_timer snd i7core_edac mxser iTCO_wdt serio_raw soundcore edac_core iTCO_vendor_support blackmagic snd_page_alloc wmi i2c_i801 pcspkr microcode ipv6 usb_storage
Created attachment 654513 [details]
Script to reproduce "autofs never expire" issue

See also testautofs.log for an excerpt of /var/log/messages. Here is the output on my system:

$ ~/testautofs.sh
---------- test without eject ---------
Closing tray...
Waiting for DVD to be recognized.case1_Anonymous
/dev/sr0 on /usr/BDV/Interfaces/DVD type iso9660 (ro,nosuid,nodev,relatime)
Restarting autofs...
Redirecting to /bin/systemctl restart autofs.service
case1_Anonymous
Waiting for 5s (which is greater than autofs configured timeout of 4 sec)
YEAH!!!
---------- test with eject ---------
Restarting autofs...
Redirecting to /bin/systemctl restart autofs.service
Ejecting...
Closing tray
Waiting for DVD to be recognized...........case1_Anonymous
Waiting for 5s (which is greater than autofs configured timeout of 4 sec)
/dev/sr0 on /usr/BDV/Interfaces/DVD type iso9660 (ro,nosuid,nodev,relatime)
Waiting another 5s just in case
/dev/sr0 on /usr/BDV/Interfaces/DVD type iso9660 (ro,nosuid,nodev,relatime)
sr0 should be unmounted!
Created attachment 654514 [details] Excerpt from /var/log/messages for failing expiration see other attachment script producing this output.
This message is a reminder that Fedora 16 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 16. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '16'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 16's end of life. Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 16 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to click on "Clone This Bug" and open it against that version of Fedora. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Fedora 16 changed to end-of-life (EOL) status on 2013-02-12. Fedora 16 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.