Bug 203277
Summary:          Fails to mount filesystems via /net
Product:          Fedora
Component:        autofs
Version:          6
Hardware:         i386
OS:               Linux
Status:           CLOSED WONTFIX
Severity:         medium
Priority:         medium
Reporter:         Phil Schaffner <philip.r.schaffner>
Assignee:         Ian Kent <ikent>
QA Contact:       Brock Organ <borgan>
CC:               ikent, jmoyer, redhat-bugzilla-f, triage
Whiteboard:       bzcl34nup
Doc Type:         Bug Fix
Last Closed:      2008-05-06 16:14:47 UTC
Bug Blocks:       238534
Description
Phil Schaffner
2006-08-20 11:27:42 UTC
Typo on the BZ reference. Should have been 202516.
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=202516

(In reply to comment #1)
> Typo on the BZ reference. Should have been 202516.
> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=202516

No, I don't think this is the same as 202516, I'll investigate.
Ian

(In reply to comment #2)
> No, I don't think this is the same as 202516, I'll investigate.

I can't seem to duplicate this. Can you post a "showmount -e" for the
server please.
Ian

This happens to me as well: automounted NFS file systems become
inaccessible after a few minutes. For example, server behemoth exports
the following filesystems:

$ showmount -e behemoth
/opt             192.168.1.0/255.255.255.0
/usr             192.168.0.0/255.255.0.0
/var             192.168.0.0/255.255.0.0
/mnt/ext2/4      192.168.0.0/255.255.0.0
/mnt/ext2/1      192.168.0.0/255.255.0.0
/mnt/iso9660/3   192.168.0.0/255.255.0.0
/mnt/iso9660/2   192.168.0.0/255.255.0.0
/mnt/iso9660/1   192.168.0.0/255.255.0.0
/var/share/media 192.168.0.0/255.255.0.0

$ ls /net/behemoth
mnt  opt  usr  var
$ ls /net/behemoth/var
ls: /net/behemoth/var: No such file or directory
# /etc/init.d/autofs restart
Stopping automount:                                        [  OK  ]
Starting automount:                                        [  OK  ]
$ ls /net/behemoth/var
account   fiction   local       net-snmp  scrollkeeper  state     www
arpwatch  gdm       lock        nis       share         tmp       yp
cache     home      log         opt       shm           tomcat4
db        kerberos  lost+found  preserve  spool         tpm
empty     lib       mail        run       ssl           ucd-snmp

This has been happening for at least the last 3 days (I yum upgrade just
about every day on this (test-)system).

Created attachment 134547 [details]
Prevent autofs4 follow_link method returning false negative
Could someone try this kernel patch and see if it resolves
the issue please.
Ian
For the record...

[root@tabb1 ~]# showmount -e
Export list for tabb1.tabb:
/home  192.168.1.255/24
/share 192.168.1.255/24

The workaround seems to be commenting out the last line of
/etc/auto.master ("+auto.master") and/or getting rid of the nisplus
entry for automount in /etc/nsswitch.conf.

(In reply to comment #6)
> Workaround seems to be commenting out the last line of /etc/auto.master
> "+auto.master" and/or getting rid of the nisplus entry for automount in
> /etc/nsswitch.conf.

Aaha. An obvious problem with my nsswitch processing. Or maybe that's
the way nsswitch is supposed to work. I'll review that bit of the code.
Thanks.
Ian

This bug seems to have fallen through the cracks, sorry. I know a lot
of work has been done in this area, so can you check whether this is
still a problem with the current package please.
Ian

For EL5 Beta I had to make the following changes to /etc/auto.master to
get automount to work:

diff -u auto.master~ auto.master
--- auto.master~  2007-01-07 17:14:35.000000000 -0500
+++ auto.master   2007-03-14 04:56:49.000000000 -0400
@@ -7,7 +7,8 @@
 # For details of the format look at autofs(5).
 #
 /misc /etc/auto.misc
-/net -hosts
+#/net -host
+/net /etc/auto.net
 #
 # Include central master map if it can be found using
 # nsswitch sources.
@@ -17,4 +18,4 @@
 # same will not be seen as the first read key seen takes
 # precedence.
 #
-+auto.master
+#+auto.master

(In reply to comment #9)
Are the NFS servers you have problems with Solaris based?
Ian

No - CentOS 4.4

(In reply to comment #11)
> No - CentOS 4.4

Thanks. I'll see if I can reproduce this.
Ian

Sorry to bug you again, but what is the revision of autofs that you're
using?
Ian

Should have said: autofs-5.0.1-0.rc2.15

I've tried to duplicate this without success. I tested revision
0.rc2.15 and the current RHEL5 revision 0.rc2.43.0.2. I don't have a
CentOS server, but I tried with Solaris 9, an old Debian server and an
FC6 machine, and they worked OK. I also tested using the network
broadcast address in the export instead of the network address, as you
have in your exports above.

So we need more information to take this further. You will need to
update to the current RHEL5 release revision and check that it is still
a problem, as quite a few updates have been applied. We would also need
to change the Product in this bug to RHEL5, or log a new bug.
Ian

Don't have a RHEL5 release install to test; however, this is still a
problem on FC6 with all current updates. With the default auto.master,
NFS directories fail to mount. With the change shown above everything
works fine. Changed version to fc6.

(In reply to comment #16)
What revision of autofs?

autofs-5.0.1-0.rc3.26

(In reply to comment #18)
> autofs-5.0.1-0.rc3.26

Yes, that's the latest revision. As I wasn't able to reproduce this,
could you provide a debug log of this happening please? Information on
how to do this can be found at http://people.redhat.com/jmoyer. Clearly
there is some difference between how my test environment and your
system are set up which we need to work out.

Also, is SELinux in enforcing mode? If so, could you disable it and try
to reproduce the problem.
Ian

OK - here's a summary of recent tests.
1. Install FC6. Disable selinux during firstboot, configure networking for
DHCP, add local servers (including wx1) to /etc/hosts.
2. Update to latest autofs.
3. Attempt to use autofs to mount NFS share /home on server wx1:
[ggg@fc6 ~]$ ls /net/wx1/home
ls: /net/wx1/home: No such file or directory
4. Change one line in /etc/auto.master and restart autofs:
[root@fc6 etc]# diff auto.master.orig auto.master
10c10,11
< /net -hosts
---
> #/net -hosts
> /net /etc/auto.net
5. Try again:
[ggg@fc6 ~]$ ls /net/wx1/home
CentOS5beta ggg LARC LiveCDtools lost+found prs tsd
gewet gustaf LiveCD LiveCD_v1 phil rtn
[ggg@fc6 ~]$
This is the easiest problem to reproduce. Will attach debug log. Last entry
with failure before change to modified auto.master is:
Apr 11 10:20:52 fc6 automount[3590]: failed to mount /net/wx1
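The one-line change in step 4 can also be applied non-interactively. The sketch below performs the same edit on a scratch copy of auto.master (the map contents and the temp path are illustrative; on a live system the target is /etc/auto.master and autofs must be restarted afterwards):

```shell
# Build a scratch copy of a stock FC6-style auto.master (contents illustrative).
tmpdir=$(mktemp -d)
cat > "$tmpdir/auto.master" <<'EOF'
/misc /etc/auto.misc
/net -hosts
+auto.master
EOF

# Swap the built-in -hosts map for the auto.net script map (the step 4
# change), and comment out the +auto.master include (the earlier workaround).
sed -i \
    -e 's|^/net[[:space:]]*-hosts$|#/net -hosts\n/net /etc/auto.net|' \
    -e 's|^+auto.master$|#+auto.master|' \
    "$tmpdir/auto.master"

cat "$tmpdir/auto.master"
# On a live system, follow up with: /etc/init.d/autofs restart
```

Note that `\n` in the replacement text is a GNU sed extension; the sketch assumes GNU sed as shipped on Fedora.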
Created attachment 152277 [details]
/var/log/debug with and without /net problem
The attached debug log shows the failures with the out-of-the-box FC6 files,
followed by correct automount in /net after one-line change to auto.master.
Only other change was enabling logging as requested. Testing done in a VMware
WorkStation 5.5 VM. Only update to original FC6 was autofs. Will run all FC6
updates and repeat.
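For readers trying to produce a similar log: on autofs 5 packages of this era, debug logging is typically enabled by raising LOGGING in /etc/sysconfig/autofs and routing daemon-level messages to a file via syslog. This is a hedged sketch against a scratch copy (the file contents and option values are my assumption of the period defaults; verify against the jmoyer page referenced above):

```shell
# Scratch file standing in for /etc/sysconfig/autofs (contents illustrative).
conf=$(mktemp)
cat > "$conf" <<'EOF'
TIMEOUT=300
LOGGING="none"
EOF

# Raise the daemon's log level to debug.
sed -i 's|^LOGGING=.*|LOGGING="debug"|' "$conf"
grep '^LOGGING' "$conf"

# On a live system one would also add a line such as
#   daemon.*    /var/log/debug
# to /etc/syslog.conf, then restart syslog and autofs and reproduce
# the failing /net access.
```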
The problem with autofs consistently failing to mount via /net with the
default auto.master file persists with all current FC6 updates
installed:

autofs-5.0.1-0.rc3.26
kernel-2.6.20-1.2933.fc6

Have not yet reproduced the intermittent problem with the /net mounts
disappearing once they are active with the "+auto.master" entry present
in /etc/auto.master and logging enabled. Will report again if I can
capture that behavior.

Should have noted - not using NIS. No changes to /etc/nsswitch.conf.

(In reply to comment #21)
I must be missing something really simple, but what?

Does the client machine match either of these network addresses?
Apr 11 10:20:49 fc6 automount[3590]: match_network: pcnet 146.165.204.0 pmask 24
Apr 11 10:20:49 fc6 automount[3590]: match_network: pcnet 198.119.136.0 pmask 24

Are these the entries you expect to see in the export list of host wx1?
Ian

And could you post the output of ifconfig for the matching interface
please.
Ian

(In reply to comment #24)
> Does the client machine match either of these network addresses?

The client in this case is on a VMware NAT subnet, so it appears to
hosts as being on the 146.165.204.0 network.

> Are these the entries you expect to see in the export list of
> host wx1?

Yes...

[root@wx1 ~]# cat /etc/exports
/home 198.119.136.0/24(rw,no_root_squash,insecure,async) 146.165.204.0/24(rw,no_root_squash,insecure,async)

146.165.204.0 - Building Ethernet Subnet
198.119.136.0 - Wifi Subnet

> And could you post the output of ifconfig for the matching
> interface please.

On the Host OS:

[root@wx1 ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:E0:81:2C:B7:56
          inet addr:146.165.204.75  Bcast:146.165.204.255  Mask:255.255.255.0
          inet6 addr: fe80::2e0:81ff:fe2c:b756/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:193238408 errors:0 dropped:0 overruns:0 frame:0
          TX packets:364018867 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1433591889 (1.3 GiB)  TX bytes:2246610319 (2.0 GiB)
          Interrupt:193

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:3227098 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3227098 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:1343272733 (1.2 GiB)  TX bytes:1343272733 (1.2 GiB)

vmnet1    Link encap:Ethernet  HWaddr 00:50:56:C0:00:01
          inet addr:192.168.3.1  Bcast:192.168.3.255  Mask:255.255.255.0
          inet6 addr: fe80::250:56ff:fec0:1/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

vmnet8    Link encap:Ethernet  HWaddr 00:50:56:C0:00:08
          inet addr:192.168.2.1  Bcast:192.168.2.255  Mask:255.255.255.0
          inet6 addr: fe80::250:56ff:fec0:8/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:339 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

On the VMware Guest OS:

[root@fc6 ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:0C:29:1B:21:FD
          inet addr:192.168.2.108  Bcast:192.168.2.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe1b:21fd/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:2230 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1907 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:920987 (899.4 KiB)  TX bytes:509890 (497.9 KiB)
          Interrupt:18 Base address:0x1424

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:91 errors:0 dropped:0 overruns:0 frame:0
          TX packets:91 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:24732 (24.1 KiB)  TX bytes:24732 (24.1 KiB)

NAT (although not necessarily VMware) does seem to be relevant, as I can
no longer duplicate the problem on a physical FC6 machine on the same
146.165.204.0 network. That one now works with the default auto.master
file restored, although it did not previously.

(In reply to comment #25)
> The client in this case is on a VMware NAT subnet, so it appears to
> hosts as being on the 146.165.204.0 network.

I can see the NAT being a problem for sure. I expect I'll be able to
reproduce this problem now.

This is the first valid reason I've had so far to drop the exports
access validation from the hosts module and just deal with the mount
fail instead.
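To see why the guest fails the exports check, compare its address against the two networks from the match_network log lines. The helper below reimplements a plain prefix match for illustration (the function names are made up; this is not automount's actual code):

```shell
# Convert a dotted quad to a 32-bit integer.
ip2int() {
  local IFS=.
  set -- $1
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# in_net <ip> <network> <prefixlen>: succeed if ip lies in network/prefixlen.
in_net() {
  local ip net mask
  ip=$(ip2int "$1"); net=$(ip2int "$2")
  mask=$(( (0xffffffff << (32 - $3)) & 0xffffffff ))
  [ $(( ip & mask )) -eq $(( net & mask )) ]
}

# The NAT'd guest (192.168.2.108) matches neither exported network,
# while the host's eth0 address (146.165.204.75) matches the first.
for net in 146.165.204.0 198.119.136.0; do
  in_net 192.168.2.108 "$net" 24 \
    && echo "192.168.2.108 is in $net/24" \
    || echo "192.168.2.108 is NOT in $net/24"
done
```

Both checks fail for the guest, mirroring the two match_network misses in the debug log: automount concluded the client had no access and refused the mount, even though the server would have seen the NAT gateway's 146.165.204.x source address.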
I'll need to do a fair bit of testing before I actually do that though.
Ian

(In reply to comment #26)
I've removed the exports access control check from
autofs-5.0.1-0.rc3.29, which is in updates/testing. Could you try this
out and see if this update resolves the problem you're seeing please.
Ian

Updated to autofs-5.0.1-0.rc3.29 on a fully up-to-date FC6 system and
could not replicate the problem. Seems to be fixed for FC6 by the test
version. The problem still exists in CentOS5 with
autofs-5.0.1-0.rc2.43.0.2, and thus very likely in RHEL5.

(In reply to comment #28)
> The problem still exists in CentOS5 with autofs-5.0.1-0.rc2.43.0.2 and
> thus very likely in RHEL5.

Yes, there's no doubt of that. I'm going to actually remove the code
used for the checking (instead of just disabling it) and then clone
this bug so I can fix it in RHEL 5.1. That's about all I can do for the
moment.
Ian

Fedora apologizes that these issues have not been resolved yet. We're
sorry it's taken so long for your bug to be properly triaged and acted
on.
We appreciate the time you took to report this issue and want to make
sure no important bugs slip through the cracks. If you're currently
running a version of Fedora Core between 1 and 6, please note that
Fedora no longer maintains these releases. We strongly encourage you to
upgrade to a current Fedora release. In order to refocus our efforts as
a project, we are flagging all of the open bugs for releases which are
no longer maintained and closing them.

http://fedoraproject.org/wiki/LifeCycle/EOL

If this bug is still open against Fedora Core 1 through 6 thirty days
from now, it will be closed 'WONTFIX'. If you can reproduce this bug in
the latest Fedora version, please change to the respective version. If
you are unable to do this, please add a comment to this bug requesting
the change.

Thanks for your help, and we apologize again that we haven't handled
these issues to this point. The process we are following is outlined
here: http://fedoraproject.org/wiki/BugZappers/F9CleanUp

We will be following the process here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
to ensure this doesn't happen again. And if you'd like to join the bug
triage team to help make things better, check out
http://fedoraproject.org/wiki/BugZappers

This bug is open for a Fedora version that is no longer maintained and
will not be fixed by Fedora. Therefore we are closing this bug. If you
can reproduce this bug against a currently maintained version of Fedora,
please feel free to reopen this bug against that version. Thank you for
reporting this bug, and we are sorry it could not be fixed.