+++ This bug was initially created as a clone of Bug #203277 +++

Description of problem:
Filesystems that should be mounted under /net are often not accessible until
after autofs is restarted. Then works OK for a while.

Version-Release number of selected component (if applicable):
autofs-5.0.1-0.rc1.15

How reproducible:
Consistently.

Steps to Reproduce:
1. Boot rawhide system
2. Wait a few minutes
3. Attempt to access NFS filesystem via /net

Actual results:
Access fails.

Expected results:
Filesystem automounted.

Additional info:
Looks a lot like BZ 20516 but that is closed, so starting a new one.

Example:

[root@ping0 ~]# ls /net/tabb1/share
ls: /net/tabb1/share: No such file or directory
You have new mail in /var/spool/mail/root
[root@ping0 ~]# ls /net/tabb1
home  share
[root@ping0 ~]# service autofs restart
Stopping automount:                                        [  OK  ]
Starting automount:                                        [  OK  ]
[root@ping0 ~]# ls /net/tabb1/share
Avast_tabb2.reg  CentOS  Download  Fedora  jeremy  Kubuntu  lost+found  Mandriva  Music  prs  root  ssh  tabb1  tabb2  tabb3  vmware
[root@ping0 ~]# tail -20 /var/log/messages
Aug 20 07:03:46 ping0 syslogd 1.4.1: restart.
Aug 20 07:06:25 ping0 automount[4080]: umount_autofs_indirect: ask umount returned busy /net
Aug 20 07:06:57 ping0 automount[30566]: lookup_read_master: lookup(nisplus): couldn't locat nis+ table auto.master
Aug 20 07:06:57 ping0 kernel: SELinux: initialized (dev autofs, type autofs), uses genfs_contexts
Aug 20 07:06:59 ping0 last message repeated 3 times
Aug 20 07:06:59 ping0 kernel: SELinux: initialized (dev 0:19, type nfs), uses genfs_contexts
Aug 20 07:20:10 ping0 init: Trying to re-exec init
Aug 20 07:27:27 ping0 automount[4427]: lookup_read_master: lookup(nisplus): couldn't locat nis+ table auto.master
Aug 20 07:27:27 ping0 kernel: SELinux: initialized (dev autofs, type autofs), uses genfs_contexts
Aug 20 07:27:30 ping0 last message repeated 3 times
Aug 20 07:27:31 ping0 kernel: SELinux: initialized (dev 0:19, type nfs), uses genfs_contexts

Another attempt to access a file after a few minutes fails again. Restart of
autofs again fixes it temporarily.

-- Additional comment from Philip.R.Schaffner on 2006-08-20 07:30 EST --

Typo on the BZ reference. Should have been 202516.

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=202516

-- Additional comment from ikent on 2006-08-20 09:44 EST --

(In reply to comment #1)
> Typo on the BZ reference. Should have been 202516.
>
> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=202516
>

No, I don't think this is the same as 202516, I'll investigate.

Ian

-- Additional comment from ikent on 2006-08-20 09:54 EST --

(In reply to comment #2)
> (In reply to comment #1)
> > Typo on the BZ reference. Should have been 202516.
> >
> > https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=202516
> >
>
> No, I don't think this is the same as 202516, I'll investigate.
>

I can't seem to duplicate this.
Can you post a "showmount -e" for the server please.

Ian

-- Additional comment from redhat-bugzilla-f on 2006-08-20 17:43 EST --

This happens to me as well: automounted NFS file systems become inaccessible
after a few minutes.
Example, server behemoth exports the following filesystems:

$ showmount -e behemoth
/opt             192.168.1.0/255.255.255.0
/usr             192.168.0.0/255.255.0.0
/var             192.168.0.0/255.255.0.0
/mnt/ext2/4      192.168.0.0/255.255.0.0
/mnt/ext2/1      192.168.0.0/255.255.0.0
/mnt/iso9660/3   192.168.0.0/255.255.0.0
/mnt/iso9660/2   192.168.0.0/255.255.0.0
/mnt/iso9660/1   192.168.0.0/255.255.0.0
/var/share/media 192.168.0.0/255.255.0.0

$ ls /net/behemoth
mnt  opt  usr  var

$ ls /net/behemoth/var
ls: /net/behemoth/var: No such file or directory

$ s /etc/init.d/autofs restart
Stopping automount:                                        [  OK  ]
Starting automount:                                        [  OK  ]

$ ls /net/behemoth/var
account   fiction   local       net-snmp  scrollkeeper  state     www
arpwatch  gdm       lock        nis       share         tmp       yp
cache     home      log         opt       shm           tomcat4
db        kerberos  lost+found  preserve  spool         tpm
empty     lib       mail        run       ssl           ucd-snmp

This has been happening for at least the last 3 days (I yum upgrade just about
every day on this (test-)system)

-- Additional comment from ikent on 2006-08-20 23:05 EST --

Created an attachment (id=134547)
Prevent autofs4 follow_link method returning false negative

Could someone try this kernel patch and see if it resolves
the issue please.

Ian

-- Additional comment from Philip.R.Schaffner on 2006-08-22 07:25 EST --

For the record...

[root@tabb1 ~]# showmount -e
Export list for tabb1.tabb:
/home  192.168.1.255/24
/share 192.168.1.255/24

Workaround seems to be commenting out the last line of /etc/auto.master
"+auto.master" and/or getting rid of the nisplus entry for automount in
/etc/nsswitch.conf.

-- Additional comment from ikent on 2006-08-22 08:08 EST --

(In reply to comment #6)
> For the record...
>
> [root@tabb1 ~]# showmount -e
> Export list for tabb1.tabb:
> /home  192.168.1.255/24
> /share 192.168.1.255/24
>
> Workaround seems to be commenting out the last line of /etc/auto.master
> "+auto.master" and/or getting rid of the nisplus entry for automount in
> /etc/nsswitch.conf.
>

Aaha.
An obvious problem with my nsswitch processing.
Or maybe that's the way nsswitch is supposed to work.
I'll review that bit of the code.

Thanks.

Ian

-- Additional comment from ikent on 2007-03-14 06:45 EST --

(In reply to comment #7)
> (In reply to comment #6)
> > For the record...
> >
> > [root@tabb1 ~]# showmount -e
> > Export list for tabb1.tabb:
> > /home  192.168.1.255/24
> > /share 192.168.1.255/24
> >
> > Workaround seems to be commenting out the last line of /etc/auto.master
> > "+auto.master" and/or getting rid of the nisplus entry for automount in
> > /etc/nsswitch.conf.
> >
>
> Aaha.
> An obvious problem with my nsswitch processing.
> Or maybe that's the way nsswitch is supposed to work.
> I'll review that bit of the code.

This bug seems to have fallen through the cracks, sorry.
I know a lot of work has been done in this area so can you
check if this is still a problem with the current package please.

Ian

-- Additional comment from Philip.R.Schaffner on 2007-03-14 09:59 EST --

For EL5 Beta - had to make the following changes to /etc/auto.master to get
automount to work:

diff -u auto.master~ auto.master
--- auto.master~        2007-01-07 17:14:35.000000000 -0500
+++ auto.master 2007-03-14 04:56:49.000000000 -0400
@@ -7,7 +7,8 @@
 # For details of the format look at autofs(5).
 #
 /misc  /etc/auto.misc
-/net   -hosts
+#/net  -host
+/net   /etc/auto.net
 #
 # Include central master map if it can be found using
 # nsswitch sources.
@@ -17,4 +18,4 @@
 # same will not be seen as the first read key seen takes
 # precedence.
 #
-+auto.master
+#+auto.master

-- Additional comment from ikent on 2007-03-14 11:39 EST --

(In reply to comment #9)
> For EL5 Beta - had to make the following changes to /etc/auto.master to get
> automount to work:
>
> diff -u auto.master~ auto.master
> --- auto.master~        2007-01-07 17:14:35.000000000 -0500
> +++ auto.master 2007-03-14 04:56:49.000000000 -0400
> @@ -7,7 +7,8 @@
>  # For details of the format look at autofs(5).
>  #
>  /misc  /etc/auto.misc
> -/net   -hosts
> +#/net  -host
> +/net   /etc/auto.net
>  #
>  # Include central master map if it can be found using
>  # nsswitch sources.
> @@ -17,4 +18,4 @@
>  # same will not be seen as the first read key seen takes
>  # precedence.
>  #
> -+auto.master
> +#+auto.master
>

Are the NFS servers you have problems with Solaris based?

Ian

-- Additional comment from Philip.R.Schaffner on 2007-03-14 13:07 EST --

No - CentOS 4.4

-- Additional comment from ikent on 2007-03-14 13:18 EST --

(In reply to comment #11)
> No - CentOS 4.4

Thanks.
I'll see if I can reproduce this.

Ian

-- Additional comment from ikent on 2007-03-14 13:25 EST --

(In reply to comment #12)
> (In reply to comment #11)
> > No - CentOS 4.4
>
> Thanks.
> I'll see if I can reproduce this.

Sorry to bug you again but what is the revision of autofs
that you're using.

Ian

-- Additional comment from Philip.R.Schaffner on 2007-03-14 14:32 EST --

Should have said: autofs-5.0.1-0.rc2.15

-- Additional comment from ikent on 2007-03-15 01:02 EST --

(In reply to comment #12)
> (In reply to comment #11)
> > No - CentOS 4.4
>
> Thanks.
> I'll see if I can reproduce this.

I've tried to duplicate this without success.

I tested revision 0.rc2.15 and the current RHEL5
revision 0.rc2.43.0.2.

I don't have a CentOS server but I tried with Solaris9,
an old Debian server and an FC6 machine and they worked
OK. I also tested using the network broadcast address in
the export instead of the network address, as you have in
your exports above.

So, we need more information to take this further.
You will need to update to the current RHEL5 release
revision and check that it is still a problem as quite
a few updates have been applied. We would need to change
the Product in this bug to RHEL5 also or log a new bug.

Ian

-- Additional comment from Philip.R.Schaffner on 2007-03-20 11:13 EST --

Don't have a RHEL5 release install to test; however, this is still a problem on
FC6 with all current updates.
With the default auto.master nfs directories fail to mount. With the patch
shown above everything works fine. Changed version to fc6.

-- Additional comment from ikent on 2007-03-20 13:02 EST --

(In reply to comment #16)
> Don't have a RHEL5 release install to test; however, this is still a problem on
> FC6 with all current updates. With the default auto.master nfs directories fail
> to mount. With the patch shown above everything works fine. Changed version to
> fc6.
>

What revision of autofs?

-- Additional comment from Philip.R.Schaffner on 2007-03-20 13:29 EST --

autofs-5.0.1-0.rc3.26

-- Additional comment from ikent on 2007-03-20 21:59 EST --

(In reply to comment #18)
> autofs-5.0.1-0.rc3.26

Yes, that's the latest revision.

As I wasn't able to reproduce this could you provide a
debug log of this happening please. Information on how
to do this can be found at http://people.redhat.com/jmoyer.

Clearly there is some difference between how my test
environment and your system is setup which we need to
work out.

Also, is Selinux in enforcing mode?
If so could you disable it and try to reproduce the problem.

Ian

-- Additional comment from Philip.R.Schaffner on 2007-04-11 10:28 EST --

OK - here's a summary of recent tests.

1. Install FC6. Disable selinux during firstboot, configure networking for
   DHCP, add local servers (including wx1) to /etc/hosts.

2. Update to latest autofs.

3. Attempt to use autofs to mount NFS share /home on server wx1:

[ggg@fc6 ~]$ ls /net/wx1/home
ls: /net/wx1/home: No such file or directory

4. Change one line in /etc/auto.master and restart autofs:

[root@fc6 etc]# diff auto.master.orig auto.master
10c10,11
< /net  -hosts
---
> #/net -hosts
> /net  /etc/auto.net

5. Try again:

[ggg@fc6 ~]$ ls /net/wx1/home
CentOS5beta  ggg     LARC    LiveCDtools  lost+found  prs  tsd
gewet        gustaf  LiveCD  LiveCD_v1    phil        rtn
[ggg@fc6 ~]$

This is the easiest problem to reproduce. Will attach debug log.
Last entry with failure before change to modified auto.master is:

Apr 11 10:20:52 fc6 automount[3590]: failed to mount /net/wx1

-- Additional comment from Philip.R.Schaffner on 2007-04-11 10:34 EST --

Created an attachment (id=152277)
/var/log/debug with and without /net problem

The attached debug log shows the failures with the out-of-the-box FC6 files,
followed by correct automount in /net after one-line change to auto.master.
Only other change was enabling logging as requested. Testing done in a VMware
WorkStation 5.5 VM. Only update to original FC6 was autofs. Will run all FC6
updates and repeat.

-- Additional comment from Philip.R.Schaffner on 2007-04-11 17:11 EST --

The problem with autofs consistently failing to mount via /net with default
auto.master file persists with all current FC6 updates installed -

autofs-5.0.1-0.rc3.26
kernel-2.6.20-1.2933.fc6

Have not yet reproduced the intermittent problem with the /net mounts
disappearing once they are active with the "+auto.master" entry present in
/etc/auto.master and logging enabled. Will report again if I can capture that
behavior.

Should have noted - not using NIS. No changes to /etc/nsswitch.conf

-- Additional comment from ikent on 2007-04-12 00:59 EST --

(In reply to comment #21)

I must be missing something really simple, but what.

> Created an attachment (id=152277)
> /var/log/debug with and without /net problem
>
> The attached debug log shows the failures with the out-of-the-box FC6 files,
> followed by correct automount in /net after one-line change to auto.master.
> Only other change was enabling logging as requested. Testing done in a VMware
> WorkStation 5.5 VM. Only update to original FC6 was autofs. Will run all FC6
> updates and repeat.
>

Does the client machine match either of these network addresses?
Apr 11 10:20:49 fc6 automount[3590]: match_network: pcnet 146.165.204.0 pmask 24
Apr 11 10:20:49 fc6 automount[3590]: match_network: pcnet 198.119.136.0 pmask 24

Are these the entries you expect to see in the export list of
host wx1?

Ian

-- Additional comment from ikent on 2007-04-12 01:56 EST --

(In reply to comment #23)
>
> Does the client machine match either of these network addresses?
> Apr 11 10:20:49 fc6 automount[3590]: match_network: pcnet 146.165.204.0 pmask 24
> Apr 11 10:20:49 fc6 automount[3590]: match_network: pcnet 198.119.136.0 pmask 24
>
> Are these the entries you expect to see in the export list of
> host wx1?

And could you post the output of ifconfig for the matching
interface please.

Ian

-- Additional comment from Philip.R.Schaffner on 2007-04-12 10:19 EST --

> Does the client machine match either of these network addresses?
> Apr 11 10:20:49 fc6 automount[3590]: match_network: pcnet 146.165.204.0 pmask 24
> Apr 11 10:20:49 fc6 automount[3590]: match_network: pcnet 198.119.136.0 pmask 24

The client in this case is on a VMware NAT subnet, so it appears to hosts as
being on the 146.165.204.0 network.

> Are these the entries you expect to see in the export list of
> host wx1?

Yes...

[root@wx1 ~]# cat /etc/exports
/home 198.119.136.0/24(rw,no_root_squash,insecure,async) 146.165.204.0/24(rw,no_root_squash,insecure,async)

146.165.204.0 - Building Ethernet Subnet
198.119.136.0 - Wifi Subnet

> And could you post the output of ifconfig for the matching
> interface please.
On the Host OS:

[root@wx1 ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:E0:81:2C:B7:56
          inet addr:146.165.204.75  Bcast:146.165.204.255  Mask:255.255.255.0
          inet6 addr: fe80::2e0:81ff:fe2c:b756/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:193238408 errors:0 dropped:0 overruns:0 frame:0
          TX packets:364018867 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1433591889 (1.3 GiB)  TX bytes:2246610319 (2.0 GiB)
          Interrupt:193

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:3227098 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3227098 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:1343272733 (1.2 GiB)  TX bytes:1343272733 (1.2 GiB)

vmnet1    Link encap:Ethernet  HWaddr 00:50:56:C0:00:01
          inet addr:192.168.3.1  Bcast:192.168.3.255  Mask:255.255.255.0
          inet6 addr: fe80::250:56ff:fec0:1/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

vmnet8    Link encap:Ethernet  HWaddr 00:50:56:C0:00:08
          inet addr:192.168.2.1  Bcast:192.168.2.255  Mask:255.255.255.0
          inet6 addr: fe80::250:56ff:fec0:8/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:339 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

On the VMware Guest OS:

[root@fc6 ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:0C:29:1B:21:FD
          inet addr:192.168.2.108  Bcast:192.168.2.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe1b:21fd/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:2230 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1907 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:920987 (899.4 KiB)  TX bytes:509890 (497.9 KiB)
          Interrupt:18 Base address:0x1424

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:91 errors:0 dropped:0 overruns:0 frame:0
          TX packets:91 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:24732 (24.1 KiB)  TX bytes:24732 (24.1 KiB)

NAT (although not necessarily VMware) does seem to be relevant as I can no
longer duplicate the problem on a physical FC6 machine on the same
146.165.204.0 network. That one now works with the default auto.master file
restored, although it did not previously.

-- Additional comment from ikent on 2007-04-12 21:54 EST --

(In reply to comment #25)
> > Does the client machine match either of these network addresses?
> > Apr 11 10:20:49 fc6 automount[3590]: match_network: pcnet 146.165.204.0 pmask 24
> > Apr 11 10:20:49 fc6 automount[3590]: match_network: pcnet 198.119.136.0 pmask 24
>
> The client in this case is on a VMware NAT subnet, so it appears to hosts as
> being on the 146.165.204.0 network.

I can see the NAT being a problem for sure.
I expect I'll be able to reproduce this problem now.

This is the first valid reason I've had so far to drop the
exports access validation from the hosts module and just
deal with the mount fail instead. I'll need to do a fair
bit of testing before I actually do that though.

Ian

-- Additional comment from ikent on 2007-04-19 04:06 EST --

(In reply to comment #26)
> (In reply to comment #25)
> > > Does the client machine match either of these network addresses?
> > > Apr 11 10:20:49 fc6 automount[3590]: match_network: pcnet 146.165.204.0 pmask 24
> > > Apr 11 10:20:49 fc6 automount[3590]: match_network: pcnet 198.119.136.0 pmask 24
> >
> > The client in this case is on a VMware NAT subnet, so it appears to hosts as
> > being on the 146.165.204.0 network.
>
> I can see the NAT being a problem for sure.
> I expect I'll be able to reproduce this problem now.
>
> This is the first valid reason I've had so far to drop the
> exports access validation from the hosts module and just
> deal with the mount fail instead. I'll need to do a fair
> bit of testing before I actually do that though.

I've removed the exports access control check from
autofs-5.0.1-0.rc3.29 which is in updates/testing.

Could you try this out and see if this update resolves
the problem you're seeing please.

Ian

-- Additional comment from Philip.R.Schaffner on 2007-04-20 14:27 EST --

Updated to autofs-5.0.1-0.rc3.29 on a fully up to date FC6 system and could not
replicate the problem. Seems to be fixed for FC6 by the test version. The
problem still exists in CentOS5 with autofs-5.0.1-0.rc2.43.0.2 and thus very
likely in RHEL5.

-- Additional comment from ikent on 2007-04-23 04:28 EST --

(In reply to comment #28)
> Updated to autofs-5.0.1-0.rc3.29 on a fully up to date FC6 system and could not
> replicate the problem. Seems to be fixed for FC6 by the test version. The
> problem still exists in CentOS5 with autofs-5.0.1-0.rc2.43.0.2 and thus very
> likely in RHEL5.
>

Yes, there's no doubt of that.

I'm going to actually remove the code used for the checking
(instead of just disabling it) and then clone this bug so I
can fix it in RHEL 5.1. That's about all I can do for the
moment.

Ian
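
Illustration of the failure mode discussed above. This is not the autofs code:
the hosts module is written in C, and its match_network function is visible in
this report only through the debug log. The sketch below is a hypothetical
Python approximation of a client-side exports access check, using the
addresses posted in this thread. The function name and the choice of the
host's eth0 address as the NAT-visible address are assumptions made for the
example.

```python
import ipaddress

# Networks from /etc/exports on wx1, as posted in this report.
EXPORT_NETWORKS = ["198.119.136.0/24", "146.165.204.0/24"]


def client_matches_exports(client_addr, export_networks):
    """Return True if client_addr lies inside any exported network.

    Hypothetical helper: it only approximates the idea of comparing a
    client address against the export list before attempting a mount.
    """
    addr = ipaddress.ip_address(client_addr)
    return any(addr in ipaddress.ip_network(net) for net in export_networks)


# Address configured on the guest's own eth0 (behind VMware NAT).
guest_local = "192.168.2.108"
# Assumption: after NAT the server sees the host's eth0 address instead.
nat_visible = "146.165.204.75"

print(client_matches_exports(guest_local, EXPORT_NETWORKS))  # False
print(client_matches_exports(nat_visible, EXPORT_NETWORKS))  # True
```

A check of this kind can only see the guest's private 192.168.2.108 address,
so it rejects an export that the server itself would grant to the translated
address. That is consistent with the decision in this thread to drop the check
and let the mount attempt decide.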
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2007-0621.html