From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20030225

Description of problem:
RHN System ID 1002961687 sees the following behaviour: netfs does a 'mount -a -t nfs' more or less directly after the network is brought up. Although the interface reports OK (static config, BTW), we are unable to mount NFS filesystems. Adding 'sleep 15' makes the problem less severe (only the first mount fails); changing the priority from 25 to 90 makes the problem go away.

If the machine tries to resolve the NFS server name via DNS, we get a 'cannot resolve'; if we have said server in /etc/hosts, we cannot route to it. It looks to me like the network is not fully up at that point. No idea whether it is the local network infrastructure or not.

What I'd like to see is either netfs starting later (is there a problem with my dirty hack to start it at prio 90? BTW, the script with changed prio is named netfs-patched) or some test to see if the network is really working.

This is being entered for a customer who will be on cc; please address all questions (regarding tests to be done on that specific machine) to him.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. have an nfs type entry in /etc/fstab
2. have the netfs service enabled
3. reboot

Actual Results: at least the first nfs mount will fail

Expected Results: all nfs type mounts to succeed.

Additional info:
doing a 'service netfs stop;service netfs start' after boot mounts all NFS filesystems just fine.
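For readers who want to reproduce the priority change described above, here is a minimal sketch. It assumes the init script carries a header of the form '# chkconfig: <runlevels> <start> <stop>'; the function name set_start_priority and the sample header values in the comments are illustrative assumptions, not taken from the reporter's netfs-patched script.

```shell
#!/bin/sh
# Hypothetical helper: print a copy of an init script with the start
# priority in its "# chkconfig:" header replaced.
#   $1 = path to the init script
#   $2 = new start priority (e.g. 90)
# Assumes a header such as "# chkconfig: 345 25 75" (illustrative values).
set_start_priority() {
    sed "s/^\(# chkconfig: [0-9-]* \)[0-9]*\( [0-9]*\)/\1$2\2/" "$1"
}
```

A possible use would be 'set_start_priority /etc/init.d/netfs 90 > /etc/init.d/netfs-patched', followed by 'chkconfig --add netfs-patched' so the runlevel symlinks are created from the new header.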
Note: see also Issue Tracker #21184, event posted 06-18-2003 06:57am (if you have access, that is). The Philips tech who was on the phone for the manipulation is Mr Lunev.
What network card, attached to what sort of switch? It's probably spending a very large amount of time negotiating.
Bill, it's a 'Dell PowerEdge 2650', which should make the NIC a 'BROADCOM Corporation NetXtreme BCM5701 Gigabit Ethernet (rev 15)'; I can check with the customer if you need this info. My question was actually aimed at why netfs is started so shortly after network. Spending a very large amount of time negotiating is not that uncommon, so if there is no reason to start netfs this early after network, I'd like this bug to be considered an RFE to start network-dependent services a tad later. Is that acceptable? RU PCFE
Not really. Too many other things rely on netfs being finished. I presume you're using static IPs?
Yes, you are using static IPs, after rereading.
How are you handling DNS on these boxes?
Bill, "If the machine tries to resolve the NFS server name via DNS, we get a 'cannot resolve', if we have said server in /etc/hosts, we cannot route to it", so the DNS part was covered. Or is there something else you'd like to check? As we cannot start netfs later, can we add a test to the script to make sure the network really is up before attempting to mount network filesystems?
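A test of the kind asked for here could look roughly like the sketch below. It is only an illustration, not part of the shipped netfs script: the function name wait_for_network and the PROBE override are assumptions, and the default probe (a single 'ping -c 1') may itself block for several seconds when the host does not answer.

```shell
#!/bin/sh
# Hypothetical helper: poll until a host answers, or give up.
#   $1 = host to probe (e.g. the NFS server)
#   $2 = maximum number of attempts (default 15)
#   $3 = seconds to sleep between attempts (default 1)
# PROBE is an assumed override for the probe command; the default
# sends one ICMP echo request with ping.
wait_for_network() {
    wfn_host=$1
    wfn_tries=${2:-15}
    wfn_delay=${3:-1}
    wfn_i=0
    while [ "$wfn_i" -lt "$wfn_tries" ]; do
        if ${PROBE:-ping -c 1} "$wfn_host" >/dev/null 2>&1; then
            return 0    # host answered: network is usable
        fi
        wfn_i=$((wfn_i + 1))
        sleep "$wfn_delay"
    done
    return 1            # host never answered within the allowed attempts
}
```

In an init script this might be called as 'wait_for_network 130.143.87.243 15 1 && mount -a -t nfs'. Probing by IP address keeps DNS out of the check, so it only confirms that the host can be reached.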
Yes, it would be good to know how your DNS is configured. The fact that you get a can't resolve error implies something else is wrong entirely outside of initscripts/autonegotiation interactions.
Bill, so, as things do work later in bootup you mean to say that network may be up but DNS problematic?!? (Even though taking DNS out of the equation by using an /etc/hosts entry gave me routing errors). Anyway, I'll make a separate posting for the admin to post resolv.conf and details on the DNS.
uxadm: can you please post your /etc/resolv.conf and all details you know about your DNS for Mr Nottingham to this bug please?
[root@bblxc12a rhn]# cat /etc/resolv.conf
nameserver 130.143.87.243
search bbl.ms.philips.com philips.com
-----------------------------------------
[root@bblxc12a rhn]# dig depbblhps1ms000.bbl.ms.philips.com.

; <<>> DiG 9.2.1 <<>> depbblhps1ms000.bbl.ms.philips.com.
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 35039
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;depbblhps1ms000.bbl.ms.philips.com. IN A

;; ANSWER SECTION:
depbblhps1ms000.bbl.ms.philips.com. 86400 IN A 130.143.87.243

;; Query time: 0 msec
;; SERVER: 130.143.87.243#53(130.143.87.243)
;; WHEN: Thu Jul 3 12:38:00 2003
;; MSG SIZE rcvd: 68
-----------------------------------------------------------
[root@bblxc12a rhn]# dig 130.143.87.243

; <<>> DiG 9.2.1 <<>> 130.143.87.243
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 34867
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0

;; QUESTION SECTION:
;130.143.87.243. IN A

;; AUTHORITY SECTION:
. 86400 IN SOA ns0.philips.com. dns.philips.com. 2003021700 10800 3600 604800 604800

;; Query time: 394 msec
;; SERVER: 130.143.87.243#53(130.143.87.243)
;; WHEN: Thu Jul 3 12:38:30 2003
;; MSG SIZE rcvd: 86
What I'm saying is that if the machine is autonegotiating, you should *not* get 'cannot resolve', or associated errors. It should just wait. The problem implies some more fundamental issue with the network config.
Bill, so the next step would be a tcpdump, I guess. Or would you like other info prior to doing a tcpdump from another machine that sits on a hub with the problematic server? RU PCFE
Can you post your /etc/resolv.conf and /etc/nsswitch.conf?
Created attachment 92874 [details] cluster host 1 hi, here is my nsswitch.conf
Created attachment 92875 [details] cluster host 1 here is my resolv.conf
Are you actually using nisplus?
We never use nisplus.
What happens if you remove all the nisplus entries from /etc/nsswitch.conf?
I removed all nisplus entries from /etc/nsswitch.conf. No success.
Hm, this is still something we've never seen in any other testing or reports. What does your /etc/sysconfig/network and /etc/sysconfig/network-scripts/ifcfg-* look like?
[root@bblxc11a root]# cat /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=bblxc11a.bbl.ms.philips.com
[root@bblxc11a root]# cat /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=static
IPADDR=130.143.87.181
NETMASK=255.255.255.0
GATEWAY=130.143.87.1
[root@bblxc11a root]# cat /etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE=eth1
ONBOOT=yes
BOOTPROTO=static
IPADDR=10.0.0.1
NETMASK=255.255.255.0
I believe this is another duplicate of 107999.
*** Bug 107999 has been marked as a duplicate of this bug. ***
*** Bug 116711 has been marked as a duplicate of this bug. ***
Created attachment 99129 [details] patch to /etc/init.d/netfs This patch from ticket 107999 seems to have been overlooked.
Created attachment 116444 [details] simple retry nfs mount on startup
Retries mounting; adds up to 25 sec to boot time. On my boxes I bumped sshd's priority, in case 'mount' hangs netfs for too long. Thus, netfs announces itself in /etc/motd. To be included in a distro, it should probably look for options in sysconfig and be turned off by default. I believe this is based on netfs from initscripts-6.40-1. Happy hacking.
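The attachment itself is not reproduced in this report. As a rough illustration of the retry idea only, a loop of the following shape would bound the added boot time; the function name retry and the attempt/delay values are assumptions chosen to match the "up to 25 sec" figure, not taken from the patch.

```shell
#!/bin/sh
# Hypothetical helper: run a command until it succeeds or attempts run out.
#   $1 = maximum number of attempts
#   $2 = seconds to sleep between attempts
#   remaining arguments = the command to run
retry() {
    rt_attempts=$1
    rt_delay=$2
    shift 2
    rt_n=1
    while :; do
        "$@" && return 0                          # command succeeded
        [ "$rt_n" -ge "$rt_attempts" ] && return 1  # out of attempts
        rt_n=$((rt_n + 1))
        sleep "$rt_delay"
    done
}
```

With this sketch, 'retry 6 5 mount -a -t nfs' would sleep at most five times for five seconds each, which is 25 seconds of added delay in the worst case, not counting the time mount itself takes.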
With the goal of minimizing risk of change for deployed systems, and in response to customer and partner requirements, Red Hat takes a conservative approach when evaluating changes for inclusion in maintenance updates for currently deployed products. The primary objectives of update releases are to enable new hardware platform support and to resolve critical defects. At this stage, this behavior isn't going to be changed for RHEL2.1/RHEL 3/RHEL 4. Closing as deferred.