Bug 491177
Summary: Heavy NFS load to one disk causes all I/O on a system to hang

Product: Red Hat Enterprise Linux 5
Component: kernel
Version: 5.1
Hardware: x86_64
OS: Linux
Status: CLOSED DUPLICATE
Severity: high
Priority: low
Target Milestone: rc
Target Release: ---
Reporter: Skylar Thompson <skylar2>
Assignee: Peter Staubach <staubach>
QA Contact: Red Hat Kernel QE team <kernel-qe>
CC: bmcnally, grendelmans, jlayton, joseph.breu, prgarcial, rwheeler, sprabhu, steved, tao
Doc Type: Bug Fix
Last Closed: 2009-06-11 20:34:38 UTC
I should point out that this doesn't happen with all local I/O; if I point the dd writers at a local disk rather than NFS, I never run into problems.

We are also encountering this bug. System: Red Hat 5 update 3 (x86_64). The problem occurs using the default number of nfsd processes (8). We are writing two (Oracle RMAN) streams over a single gigabit connection, at about 800 Mbit/s. The array we are writing to should be able to handle this amount of I/O.

I am encountering the same error with a Dell MD3000 mounted locally on /mnt/DAS and mounted on the same server via NFS. Steps to reproduce:

/usr/sbin/bonnie++ -m direct-das1 -d /cms/fileserver/iozone-test/ -n 20:1m:4m:400 -p 2 -u 0:0
/usr/sbin/bonnie++ -m direct-das1 -d /cms/fileserver/iozone-test/ -n 20:1m:4m:400 -y -u 0:0

In another window:

/usr/sbin/bonnie++ -m direct-das1 -d /cms/fileserver/iozone-test/ -n 20:1m:4m:400 -y -u 0:0

Using 32 nfsd threads. All nfsd threads hang and I/O to the DAS drops to 0. We are running Red Hat Enterprise Linux Server release 5.2 (Tikanga).

Cleaned up one particular stack trace from c#1 which could be causing the problem:

nfsd          S ffff810001036500     0  7960      1      7962  7956 (L-TLB)
 ffff810425fb3980 0000000000000046 ffff81042d738b30 ffff81043fc64000
 0000000000000286 0000000000000009 ffff810425f957a0 ffff81043fc26080
 0000006888690e53 000000000000a3ee ffff810425f95988 0000000619bfcc00
Call Trace:
 [<ffffffff884010c7>] :nfs:nfs_wait_bit_interruptible+0x22/0x28
 [<ffffffff80063ac7>] __wait_on_bit+0x40/0x6e
 [<ffffffff80063b61>] out_of_line_wait_on_bit+0x6c/0x78
 [<ffffffff8009db4f>] wake_bit_function+0x0/0x23
 [<ffffffff8840108b>] :nfs:nfs_wait_on_request+0x56/0x70
 [<ffffffff88404a96>] :nfs:nfs_wait_on_requests_locked+0x70/0xca
 [<ffffffff88405a81>] :nfs:nfs_sync_inode_wait+0x60/0x1db
 [<ffffffff883fbe7b>] :nfs:nfs_release_page+0x2c/0x4d
 [<ffffffff800c7606>] shrink_inactive_list+0x4e1/0x7f9
 [<ffffffff80012d02>] shrink_zone+0xf6/0x11c
 [<ffffffff800c801b>] try_to_free_pages+0x197/0x2b9
 [<ffffffff8000f271>] __alloc_pages+0x1cb/0x2ce
 [<ffffffff8804ff1d>] :ext3:ext3_ordered_commit_write+0xa1/0xc7
 [<ffffffff8000fb8a>] generic_file_buffered_write+0x1b0/0x6d3
 [<ffffffff80016196>] __generic_file_aio_write_nolock+0x36c/0x3b8
 [<ffffffff800c2c0a>] __generic_file_write_nolock+0x8f/0xa8
 [<ffffffff80063bb6>] mutex_lock+0xd/0x1d
 [<ffffffff800c2c6b>] generic_file_writev+0x48/0xa2
 [<ffffffff800dbae0>] do_readv_writev+0x176/0x295
 [<ffffffff884905f4>] :nfsd:nfsd_vfs_write+0xf2/0x2e1
 [<ffffffff88490e68>] :nfsd:nfsd_write+0xb5/0xd5
 [<ffffffff88497986>] :nfsd:nfsd3_proc_write+0xea/0x109
 [<ffffffff8848d1db>] :nfsd:nfsd_dispatch+0xd8/0x1d6
 [<ffffffff8834c48b>] :sunrpc:svc_process+0x454/0x71b
 [<ffffffff8848d746>] :nfsd:nfsd+0x1a5/0x2cb

Skylar,

From the stack trace in c#5, nfsd tries to write pages onto the disk. The ext3 handler requests memory, which the system then tries to free by syncing cached pages owned by the NFS share. This could easily result in a deadlock. Do you have any stack traces from a similarly hung system which doesn't mount NFS shares over loopback?

Sachin Prabhu

I've attached information to https://bugzilla.redhat.com/show_bug.cgi?id=489889
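The loopback configuration implicated in the comment above can be spotted from userspace. A minimal sketch (not from the original report) that checks whether a host has mounted one of its own exports over loopback:

```shell
#!/bin/sh
# Sketch: detect NFS mounts served by the local host over loopback.
# On such a setup, memory reclaim on the server can end up waiting on
# dirty NFS pages that only its own (already stuck) nfsd threads can
# write out -- the cycle described in the comment above.
loop=$(grep -E '^(127\.0\.0\.1|localhost):' /proc/mounts || true)
if [ -n "$loop" ]; then
    echo "loopback NFS mounts found:"
    echo "$loop"
else
    echo "no loopback NFS mounts"
fi
```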
Created attachment 335899 [details]
Sysrq stack trace when NFS was hung on an RHEL 5.1 x86_64 system

Description of problem:
Heavy NFS load (over 1 Gbps) to an RHEL 5 NFS server can cause all I/O to hang, even to disks not receiving NFS traffic. I've isolated two failure modes: at high I/O rates, all I/O stops, while at more moderate rates the nfsd threads just hang.

Version-Release number of selected component (if applicable):
Reproduced on RHEL 5.1-5.3, kernels 2.6.18-92 through 2.6.18-128

How reproducible:
This problem is very reproducible. It is easiest to reproduce by mounting the NFS filesystem over the loopback interface, but it can also be reproduced by bonding two gigE NICs and sending traffic from multiple client nodes. This is also reproducible on RHEL 4, but at higher I/O rates; I have only generated enough I/O by using the loopback mount method.

Steps to Reproduce:
1. Create and mount an ext3 filesystem on separate disk(s) from the system disk(s).
2. Add an exports rule for 127.0.0.1:
   /fu 127.0.0.1(rw,async,root_squash)
3. Mount that filesystem over loopback:
   # mount 127.0.0.1:/fu /mnt/fu -o nfsvers=3,tcp,hard,intr,rsize=32768,wsize=32768
4. Up the number of NFS threads to 512 in /etc/sysconfig/nfs:
   RPCNFSDCOUNT=512
5. Start/restart NFS:
   # service nfs restart
6. Make a directory writable by a non-root user:
   # mkdir -p /fu/user
   # chown user:user /fu/user
7. Run this script to start some number of dd writers. I've managed to hang nfsd with just 128 writers, and to hang the entire system with 512 writers.
===
#!/bin/bash
NUM=$1
if test -z $NUM; then
    echo "Provide number of dd's!"
    exit 2
fi
for ((i = 0; i < $NUM; i++)); do
    dd if=/dev/zero of=/fu/user/`hostname -s`.$i bs=1M &
done
wait
===
8. Wait a few minutes for nfsd or the system to hang. I've seen the problem occur with as little as 4 GB of data written out.

Actual results:
Depending on the number of writers, nfsd will hang, or all I/O on the system will hang.
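The writer script above runs unbounded dd processes until the filesystem fills or the hang occurs. A hedged variant for shorter test runs, with the target directory, writer count, and per-writer size as parameters (the defaults are illustrative, chosen only so the sketch runs standalone; in the real test the directory would be the NFS mount):

```shell
#!/bin/sh
# Variant of the reproduction script: each writer is capped with
# count= so a run terminates instead of writing until the hang or a
# full disk. POSIX sh, so it does not need bash.
DIR=${1:-$(mktemp -d)}   # target directory (the NFS mount in the real test)
NUM=${2:-4}              # number of concurrent dd writers
MB=${3:-8}               # megabytes written by each writer
i=0
while [ "$i" -lt "$NUM" ]; do
    dd if=/dev/zero of="$DIR/$(hostname -s).$i" bs=1M count="$MB" 2>/dev/null &
    i=$((i + 1))
done
wait
echo "wrote $NUM files of ${MB}MB each to $DIR"
```

Raising NUM back toward 128-512 and pointing DIR at the loopback NFS mount reproduces the original workload.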
If vmstat and iostat are started before starting dd, vmstat will report no I/O, while iostat will still report I/O. When the system hangs, I/O wait on the system starts out normal, but eventually every CPU will be at 100% I/O wait and will not come down unless the dd processes are killed. It is at the point that I/O wait hits 100% that the system becomes unusable.

Expected results:
System performance should degrade gracefully as I/O is increased.

Additional info:
I have replicated this with DASD disks on a MegaRAID card, and against an EMC CX380 connected with Fibre Channel and PowerPath multipathing. It is reproducible at higher I/O rates on RHEL 4 using the same hardware. The system disks are completely separate from the disks on which the dd tests are run. I've attached a sysrq stack trace from the system when nfsd was hung.
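The attached trace was produced via sysrq. A sketch of how such a dump can be captured on a hung server (this assumes root and access to the kernel log, and is guarded so it degrades gracefully when run unprivileged):

```shell
#!/bin/sh
# Sketch: ask the kernel for a stack dump of every task (sysrq-t),
# then pull nfsd-related lines out of the kernel log. This is the
# mechanism behind the sysrq trace attached to this report.
if [ -w /proc/sysrq-trigger ]; then
    echo t > /proc/sysrq-trigger || true          # dump all task states
    dmesg 2>/dev/null | grep -A 15 'nfsd' || true # extract nfsd traces
else
    echo "need root (and sysrq enabled) to trigger a task dump"
fi
```

On a console, the same dump can be produced with Alt-SysRq-T, which still works when the machine is too far gone for a shell.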