Bug 29139
| Summary: | kernel hangs using nfs | | |
|---|---|---|---|
| Product: | [Retired] Red Hat Linux | Reporter: | Joshua Buysse <buysse> |
| Component: | am-utils | Assignee: | Nalin Dahyabhai <nalin> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Brock Organ <borgan> |
| Severity: | high | Priority: | high |
| Version: | 7.1 | CC: | twaugh |
| Hardware: | i386 | OS: | Linux |
| Doc Type: | Bug Fix | Last Closed: | 2001-03-28 09:59:22 UTC |
Description
Joshua Buysse
2001-02-23 21:09:34 UTC
Did you set up a firewall in the install?

This defect is considered MUST-FIX for Florence Release-Candidate #2.

No firewall. It works fine for a while, then the kernel emits the error about no request slot, and it's all over. No NFS anymore, which generally makes the system unusable for me (automounted home directories, and X hangs on a blocking NFS operation at some point as well, so the console goes away). I'm going to try to reproduce this on my home machine tonight -- much simpler setup, not using amd.

Another clue, which suggests this might be more related to amd. (Background information about the local environment.) The automounter setup is a little bit funky; it's been around for many years here. Basically, all exported disks are automounted on /NFS, like /NFS/zeus/d1 (host zeus, disk d1). Those filesystems are accessed as /nfs/zeus/d1, with a symbolic link from /nfs/zeus -> /NFS/zeus. So, my home directory might be /nfs/zeus/d1/home/buysse. There are also mappings for /home/username. In this case, that's mapped to the same amd mount from zeus, with a sublink of home/buysse.

At this point, I can access my home directory as /nfs/zeus/d1/home/buysse, but not as /home/buysse. Attempting to access /home/buysse gives "bash: cd: /home/buysse: Input/Output error". The kernel is also generating an error: "nfs_stat_to_errno: bad nfs status return value: 116". Is this an amd error or kernel nfs code? I can't tell -- everything on this box is either local or automounted. I may have misfiled this -- should it be am-utils? There may be more than one problem...

Which ethernet card are you using?

Ethernet card is an eepro100. lspci output:

    00:09.0 Class 0200: 8086:1229 (rev 08)
    00:09.0 Ethernet controller: Intel Corporation 82557 [Ethernet Pro 100] (rev 08)
            Subsystem: Intel Corporation EtherExpress PRO/100+ Management Adapter

It's really looking likely that am-utils is the culprit -- I can catch it "early", when only one or two filesystems are hung: kill amd, manually umount -f the automount points, and start amd with the -r flag (restart mounts), and things will work well again for a while.

I've got a fat pipe -- I can upgrade to newer revs of anything easily on this box.

Your analysis looks right to me; am-utils looks more likely to be the home for this bug report, so I'm moving it there. It is still possible that it's a kernel bug or that it's a bad interaction between the kernel and am-utils.

Using current rawhide kernel (2.4.2-0.1.28), am-utils-6.0.4-7, and glibc-2.2.2-6, this problem still exists. Input/Output error.

I'll keep updating this as I test: qa0322; problem still exists. I've backed glibc off to 2.2.2-7 due to bug 32749 (glibc-2.2.2-8 breaks am-utils completely). kernel-2.4.2-0.1.32, am-utils-6.0.4-7.

Falling back to the version of am-utils shipped with RH7 corrects the problem.

And again... kernel-2.4.2-0.1.35, glibc-2.2.2-9. Still broken. Does anyone want a strace or ltrace of the failure?

Yes please. Also, to make sure that I understand the setup, do you have automount config files I could use to reproduce the problem?

I didn't find time to grab the traces from this package, but based on testing tonight, the problem is fixed in seawolf.

6.0.5-1 fixes the bug in am-utils.
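
For readers puzzling over the "nfs_stat_to_errno: bad nfs status return value: 116" message quoted above, the sketch below may help. It is an illustration in Python, not the kernel's code: the table lists the NFSv2 status codes defined in RFC 1094, and the fallback to EIO for an unrecognized status is an assumption chosen to match the "Input/Output error" seen in this report. On Linux/i386, 116 is the errno value of ESTALE, while the NFS protocol's own "stale file handle" status is 70. One reading consistent with the symptoms is that an errno value reached the kernel where an NFS status code was expected, but nothing in this report confirms that.

```python
import errno

# NFSv2 status codes (RFC 1094) mapped to the matching errno values.
# This table is an illustrative sketch, not a copy of the kernel's table.
NFS_STAT_TO_ERRNO = {
    1: errno.EPERM,          # NFSERR_PERM
    2: errno.ENOENT,         # NFSERR_NOENT
    5: errno.EIO,            # NFSERR_IO
    6: errno.ENXIO,          # NFSERR_NXIO
    13: errno.EACCES,        # NFSERR_ACCES
    17: errno.EEXIST,        # NFSERR_EXIST
    19: errno.ENODEV,        # NFSERR_NODEV
    20: errno.ENOTDIR,       # NFSERR_NOTDIR
    21: errno.EISDIR,        # NFSERR_ISDIR
    27: errno.EFBIG,         # NFSERR_FBIG
    28: errno.ENOSPC,        # NFSERR_NOSPC
    30: errno.EROFS,         # NFSERR_ROFS
    63: errno.ENAMETOOLONG,  # NFSERR_NAMETOOLONG
    66: errno.ENOTEMPTY,     # NFSERR_NOTEMPTY
    69: errno.EDQUOT,        # NFSERR_DQUOT
    70: errno.ESTALE,        # NFSERR_STALE
}


def nfs_stat_to_errno(status):
    """Translate an NFS status code into (errno_value, recognized).

    Unknown codes fall back to EIO; that fallback is an assumption of
    this sketch, picked to mirror the I/O error described in the report.
    """
    if status in NFS_STAT_TO_ERRNO:
        return NFS_STAT_TO_ERRNO[status], True
    return errno.EIO, False


if __name__ == "__main__":
    for status in (70, 116):
        value, recognized = nfs_stat_to_errno(status)
        note = "recognized" if recognized else "bad nfs status return value"
        print(f"status {status}: {errno.errorcode[value]} ({note})")
```

Under these assumptions, status 70 resolves to ESTALE, whereas 116 is not a valid NFS status at all, so the caller sees only a generic I/O error.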