Description of problem:
We have a relatively repeatable case where a decent NFS load will cause an NFS mount to get stuck on a client, and any new process that then tries to access that mount also gets stuck. These machines are running Fedora 11, but we've caused it to happen with Fedora 13 as well. We are currently running a Fedora 14 kernel (2.6.35.9-64) backported to Fedora 11.

One machine got stuck on a mount, and I found a task in the following state:

[154061.428023] flush-0:188 D 0000000000000002 0 17043 2 0x00000080
[154061.428023] ffff88025fd21c40 0000000000000046 0000000000000400 0000000000000000
[154061.428023] ffff88025fd21fd8 ffff88042ace5d00 0000000000015500 ffff88025fd21fd8
[154061.428023] 0000000000015500 0000000000015500 0000000000015500 0000000000015500
[154061.428023] Call Trace:
[154061.428023] [<ffffffff811295f4>] ? inode_wait+0x0/0x12
[154061.428023] [<ffffffff81129602>] inode_wait+0xe/0x12
[154061.428023] [<ffffffff814675eb>] __wait_on_bit+0x48/0x7b
[154061.428023] [<ffffffff81132b75>] ? writeback_sb_inodes+0x10b/0x142
[154061.428023] [<ffffffff81132585>] inode_wait_for_writeback+0x8a/0xaa
[154061.428023] [<ffffffff81065f20>] ? wake_bit_function+0x0/0x33
[154061.428023] [<ffffffff81133378>] wb_writeback+0x1f4/0x212
[154061.428023] [<ffffffff811334c7>] wb_do_writeback+0x131/0x147
[154061.428023] [<ffffffff81133520>] bdi_writeback_task+0x43/0x11b
[154061.428023] [<ffffffff81065df8>] ? bit_waitqueue+0x17/0xa9
[154061.428023] [<ffffffff810e746d>] ? bdi_start_fn+0x0/0xca
[154061.428023] [<ffffffff810e74d0>] bdi_start_fn+0x63/0xca
[154061.428023] [<ffffffff810e746d>] ? bdi_start_fn+0x0/0xca
[154061.428023] [<ffffffff81065a4d>] kthread+0x7f/0x87
[154061.428023] [<ffffffff8100aa64>] kernel_thread_helper+0x4/0x10
[154061.428023] [<ffffffff810659ce>] ? kthread+0x0/0x87
[154061.428023] [<ffffffff8100aa60>] ? kernel_thread_helper+0x0/0x10

I hadn't seen this before, so I looked online and found this post:
http://kerneltrap.org/mailarchive/linux-kernel/2010/5/22/4573770/thread

The traceback looks almost the same, but it looks like 2.6.35.9-64 already includes the patch Trond posted. That patch is 0522f6ad in the linux-stable git repo. Could our issues be another case of that problem?

Version-Release number of selected component (if applicable):
kernel 2.6.35.9-64

How reproducible:
High workload.
Created attachment 473610 [details]
The system backtrace of the hang

It appears the following two processes are waiting on the same lock:

[958985.445126] dfa D 0000000000000000 0 11141 1 0x00000084
[958985.445126] ffff8802b5f43c38 0000000000000082 ffff8802b5f43ba8 ffff880119c8a018
[958985.445126] ffff8802b5f43fd8 ffff8801e86b5d00 0000000000015500 ffff8802b5f43fd8
[958985.445126] 0000000000015500 0000000000015500 0000000000015500 0000000000015500
[958985.445126] Call Trace:
[958985.445126] [<ffffffff810d33ed>] ? sync_page+0x0/0x4a
[958985.445126] [<ffffffff810d33ed>] ? sync_page+0x0/0x4a
[958985.445126] [<ffffffff8146706d>] io_schedule+0x43/0x5d
[958985.445126] [<ffffffff810d3433>] sync_page+0x46/0x4a
[958985.445126] [<ffffffff814675eb>] __wait_on_bit+0x48/0x7b
[958985.445126] [<ffffffff810d3605>] wait_on_page_bit+0x72/0x79
[958985.445126] [<ffffffff81065f20>] ? wake_bit_function+0x0/0x33
[958985.445126] [<ffffffff810db921>] ? pagevec_lookup_tag+0x25/0x2e
[958985.445126] [<ffffffff810d3f0a>] filemap_fdatawait_range+0xa4/0x178
[958985.445126] [<ffffffff810d3fff>] filemap_fdatawait+0x21/0x23
[958985.445126] [<ffffffff811326dc>] writeback_single_inode+0xc6/0x1f8
[958985.445126] [<ffffffff8113283b>] sync_inode+0x2d/0x3b
[958985.445126] [<ffffffffa0417fe8>] nfs_wb_all+0x42/0x44 [nfs]
[958985.445126] [<ffffffffa040b63a>] nfs_do_fsync+0x20/0x3d [nfs]
[958985.445126] [<ffffffffa040b823>] nfs_file_flush+0x75/0x7d [nfs]
[958985.445126] [<ffffffff81115a22>] filp_close+0x43/0x72
[958985.445126] [<ffffffff81115aee>] sys_close+0x9d/0xd2
[958985.445126] [<ffffffff81009c72>] system_call_fastpath+0x16/0x1b

[958985.445126] flush-0:202 D ffff8801004b2d80 0 11142 2 0x00000080
[958985.445126] ffff880216045c40 0000000000000046 0000000000000400 0000000000000800
[958985.445126] ffff880216045fd8 ffff88042b2d0000 0000000000015500 ffff880216045fd8
[958985.445126] 0000000000015500 0000000000015500 0000000000015500 0000000000015500
[958985.445126] Call Trace:
[958985.445126] [<ffffffff81065f61>] ? spin_unlock_irqrestore+0xe/0x10
[958985.445126] [<ffffffff811295f4>] ? inode_wait+0x0/0x12
[958985.445126] [<ffffffff81129602>] inode_wait+0xe/0x12
[958985.445126] [<ffffffff814675eb>] __wait_on_bit+0x48/0x7b
[958985.445126] [<ffffffff81132b75>] ? writeback_sb_inodes+0x10b/0x142
[958985.445126] [<ffffffff81132585>] inode_wait_for_writeback+0x8a/0xaa
[958985.445126] [<ffffffff81065f20>] ? wake_bit_function+0x0/0x33
[958985.445126] [<ffffffff81133378>] wb_writeback+0x1f4/0x212
[958985.445126] [<ffffffff811334c7>] wb_do_writeback+0x131/0x147
[958985.445126] [<ffffffff81133520>] bdi_writeback_task+0x43/0x11b
[958985.445126] [<ffffffff81065df8>] ? bit_waitqueue+0x17/0xa9
[958985.445126] [<ffffffff810e746d>] ? bdi_start_fn+0x0/0xca
[958985.445126] [<ffffffff810e74d0>] bdi_start_fn+0x63/0xca
[958985.445126] [<ffffffff810e746d>] ? bdi_start_fn+0x0/0xca
[958985.445126] [<ffffffff81065a4d>] kthread+0x7f/0x87
[958985.445126] [<ffffffff8100aa64>] kernel_thread_helper+0x4/0x10
[958985.445126] [<ffffffff810659ce>] ? kthread+0x0/0x87
[958985.445126] [<ffffffff8100aa60>] ? kernel_thread_helper+0x0/0x10
FWIW, I am seeing a similar problem.
Can someone get the held locks when this happens? (sysrq-D)
I have some great news. The customer was able to reproduce this issue on RHEL6 (kernel-2.6.32-71.el6.x86_64) and has captured the full sysrq-t traceback. The sosreport and traceback will be attached shortly.
I have made a clone of this BZ for RHEL6:
https://bugzilla.redhat.com/show_bug.cgi?id=672305

The sosreport and sysrq-t output are attached there.
Created attachment 477117 [details]
rpc_debug output from Fedora
It appears this is showing up upstream as well, and is being reproduced with iozone:
http://marc.info/?t=129685592100001&r=1&w=2

I'll see if I can reproduce it using one of our NetApps...
Keep us in the loop :)
Created attachment 480877 [details]
The SystemTap probe used to debug the problem at Pixar
Created attachment 481359 [details]
patch -- never reuse the socket port after an xs_close

Our best analysis was that the sunrpc layer was reusing local ports inappropriately. The main symptom was that the RPC engine would get into a state where the kernel_connect call would immediately return EADDRNOTAVAIL. I *think* this was due to the connection already being hashed, but I'm not as well versed in the tcp connection code.

If the socket becomes disconnected, the sunrpc state machine is designed to try to reuse the port. Unfortunately, when it got that error back, it treated it as temporary and kept trying to reuse that port even though that was never going to succeed. This patch from Trond simply zeroes out the port once it has been through xs_close.

What I'm less clear on is why this situation never seems to rectify itself. It seems like maybe we have something sitting in this hashtable indefinitely, but Trond seems to think that this patch is necessary anyway to prevent unwanted delays in reconnect.
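For reference, here's a rough sketch of what the patch does. The function body and field names are assumed from the net/sunrpc/xprtsock.c code of that era and trimmed for clarity; see the attached patch for the real change:

static void xs_close(struct rpc_xprt *xprt)
{
	struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);

	xs_reset_transport(transport);
	xprt->reestablish_timeout = 0;

	/* the fix: forget the cached source port so the next connect
	 * attempt asks the kernel for a fresh one instead of retrying
	 * the stale port forever */
	transport->srcport = 0;

	xprt_disconnect_done(xprt);
}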
The error seemed to be coming from __inet_hash_connect(). It looks like it's searching for a slot to hash the new connection but that's failing, probably due to something already being in that spot. I still have a feeling that something is not quite right at the socket level -- it doesn't seem like we should be getting this error back indefinitely. That said, this was against a 2.6.35.9 kernel so it's possible that it's already fixed in more recent kernels.
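To illustrate how that error can arise, here is a hypothetical user-space analogue (not taken from the sunrpc code): with SO_REUSEADDR you can bind a source port whose old connection is still in TIME_WAIT, but a connect() to the same destination then fails with EADDRNOTAVAIL because the identical 4-tuple is still hashed.

#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Bind a fixed local port (with SO_REUSEADDR) and connect to dst:dstport. */
static int bind_and_connect(uint16_t srcport, const char *dst, uint16_t dstport)
{
	int fd = socket(AF_INET, SOCK_STREAM, 0);
	int one = 1;
	struct sockaddr_in src = { .sin_family = AF_INET,
				   .sin_port = htons(srcport),
				   .sin_addr.s_addr = htonl(INADDR_ANY) };
	struct sockaddr_in peer = { .sin_family = AF_INET,
				    .sin_port = htons(dstport) };

	inet_pton(AF_INET, dst, &peer.sin_addr);
	setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
	if (bind(fd, (struct sockaddr *)&src, sizeof(src)) < 0 ||
	    connect(fd, (struct sockaddr *)&peer, sizeof(peer)) < 0) {
		fprintf(stderr, "port %u: %s\n", srcport, strerror(errno));
		close(fd);
		return -1;
	}
	return fd;
}

int main(void)
{
	/* assumes something is listening on 127.0.0.1:2049 (the NFS port) */
	int fd = bind_and_connect(40000, "127.0.0.1", 2049);

	if (fd >= 0)
		close(fd);	/* active close: the 4-tuple lingers in TIME_WAIT */

	/* reusing the same source port toward the same destination now
	 * typically fails with EADDRNOTAVAIL until TIME_WAIT expires --
	 * the same error the RPC reconnect logic kept hitting */
	bind_and_connect(40000, "127.0.0.1", 2049);
	return 0;
}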
We suffer from the same kind of problem and tried to apply "patch -- never reuse the socket port after an xs_close" as a workaround, but it does not seem to work: the same port is still reused on each failing reconnect attempt, in a loop.

BTW, this BZ has been duplicated to 672305, itself duplicated to 692315, itself duplicated to 701788, itself ... (I cannot follow the chain any further since access to that last BZ has been denied to me).

Anyway, is there finally a known, reliable fix for this problem?
f14 isn't going to get much beyond security fixes at this point in its lifecycle. Is this bug still relevant on f15 or the f16 beta?