Bug 429925

Summary: Many processes hang in D state doing a blk_congestion_wait
Product: Red Hat Enterprise Linux 4
Component: kernel
Version: 4.5
Hardware: i686
OS: Linux
Status: CLOSED DUPLICATE
Severity: high
Priority: low
Target Milestone: rc
Target Release: ---
Reporter: Sev Binello <sev>
Assignee: Jeff Layton <jlayton>
QA Contact: Martin Jenner <mjenner>
CC: steved
Doc Type: Bug Fix
Last Closed: 2008-04-03 14:13:37 UTC

Description Sev Binello 2008-01-23 20:14:12 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.13pre) Gecko/20071018 Red Hat/1.0.9-6.el4 SeaMonkey/1.0.9

Description of problem:
We are experiencing numerous hangs and slowdowns on our WS4 NFS client
machines. The affected processes are in the D (uninterruptible sleep) state.

The time spent in this state varies and can be quite lengthy. We are not
sure what triggers the processes to come out of it.
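
For anyone trying to catch this condition as it happens, the following is a minimal sketch of how processes in the D state could be listed by reading /proc/<pid>/stat. It assumes a Python 3 interpreter is available on the client (the stock RHEL 4 Python is older), and the function names task_state and d_state_tasks are hypothetical helpers, not part of any existing tool.

```python
import os


def task_state(stat_text):
    """Return (pid, comm, state) parsed from the text of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so locate the last ')' instead of splitting on spaces.
    open_paren = stat_text.index("(")
    close_paren = stat_text.rindex(")")
    pid = int(stat_text[:open_paren])
    comm = stat_text[open_paren + 1:close_paren]
    state = stat_text[close_paren + 1:].split()[0]
    return pid, comm, state


def d_state_tasks(proc_root="/proc"):
    """List (pid, comm) for every process currently in uninterruptible sleep."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = task_state(f.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited or its stat file was unreadable
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in d_state_tasks():
        print(pid, comm)
```

Running this periodically (for example from cron) and logging the output would give a rough picture of how long each process stays blocked. It only reports the state, not the reason for the wait.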

Here's a sample trace:

Jan 23 12:39:27 acnmcr4p kernel: RampEditor    D C02D6EB4  1288 21855      1         21857 22139 (NOTLB)
Jan 23 12:39:27 acnmcr4p kernel: f6e5ec94 00200086 65cc8b98 c02d6eb4 f56a9430 e3923830 00200200 00000000 
Jan 23 12:39:27 acnmcr4p kernel:        c35c4860 00000000 c35bc780 c35bbde0 00000000 00000000 0d1d4140 001553b9 
Jan 23 12:39:27 acnmcr4p kernel:        e3923830 e8fd93f0 e8fd955c 00000000 00200246 65cef381 65cef381 f6e5ed04 
Jan 23 12:39:27 acnmcr4p kernel: Call Trace:
Jan 23 12:39:27 acnmcr4p kernel:  [<c02d6eb4>] common_interrupt+0x18/0x20
Jan 23 12:39:27 acnmcr4p kernel:  [<c02d4c5a>] schedule_timeout+0x139/0x154
Jan 23 12:39:27 acnmcr4p kernel:  [<c012a73a>] process_timeout+0x0/0x5
Jan 23 12:39:27 acnmcr4p kernel:  [<c02d4b17>] io_schedule_timeout+0x26/0x30
Jan 23 12:39:27 acnmcr4p kernel:  [<c02267f8>] blk_congestion_wait+0x64/0x78
Jan 23 12:39:27 acnmcr4p kernel:  [<c012052d>] autoremove_wake_function+0x0/0x2d
Jan 23 12:39:27 acnmcr4p kernel:  [<c0145038>] get_writeback_state+0x30/0x35
Jan 23 12:39:27 acnmcr4p kernel:  [<c012052d>] autoremove_wake_function+0x0/0x2d
Jan 23 12:39:27 acnmcr4p kernel:  [<c0145291>] balance_dirty_pages+0xbe/0x11c
Jan 23 12:39:27 acnmcr4p kernel:  [<f8f86105>] nfs_commit_write+0x43/0x72 [nfs]
Jan 23 12:39:27 acnmcr4p kernel:  [<c0142517>] generic_file_buffered_write+0x41f/0x501
Jan 23 12:39:27 acnmcr4p kernel:  [<c01ac299>] inode_has_perm+0x4c/0x54
Jan 23 12:39:27 acnmcr4p kernel:  [<c011dd34>] move_tasks+0x19d/0x202
Jan 23 12:39:27 acnmcr4p kernel:  [<c0142982>] __generic_file_aio_write_nolock+0x389/0x3b7
Jan 23 12:39:27 acnmcr4p kernel:  [<c01429e9>] generic_file_aio_write_nolock+0x39/0x7f
Jan 23 12:39:27 acnmcr4p kernel:  [<c0142bd3>] generic_file_aio_write+0x72/0xc6
Jan 23 12:39:27 acnmcr4p kernel:  [<f8f86212>] nfs_file_write+0xde/0xf9 [nfs]
Jan 23 12:39:27 acnmcr4p kernel:  [<c015b5b8>] do_sync_write+0x9e/0xcb
Jan 23 12:39:27 acnmcr4p kernel:  [<c016be58>] poll_freewait+0x33/0x38
Jan 23 12:39:27 acnmcr4p kernel:  [<c01adf3a>] selinux_file_permission+0x117/0x120
Jan 23 12:39:27 acnmcr4p kernel:  [<c012052d>] autoremove_wake_function+0x0/0x2d
Jan 23 12:39:27 acnmcr4p kernel:  [<c015b69b>] vfs_write+0xb6/0xe2
Jan 23 12:39:27 acnmcr4p kernel:  [<c015b765>] sys_write+0x3c/0x62
Jan 23 12:39:27 acnmcr4p kernel:  [<c02d64db>] syscall_call+0x7/0xb
Jan 23 12:39:27 acnmcr4p kernel:  [<c02d007b>] unix_dgram_sendmsg+0x23c/0x45d

Version-Release number of selected component (if applicable):
2.6.9-55.0.9.ELsmp

How reproducible:
Couldn't Reproduce


Steps to Reproduce:
1. Can't reproduce intentionally, but it is happening often.

Actual Results:


Expected Results:


Additional info:
The problem has come up since upgrading the clients to WS4.
The NFS servers are still running WS3 and appear to be functioning okay.
Here are a few more traces that demonstrate the problem:

Jan 23 12:39:28 acnmcr4p kernel: tape          D C02D6EB4  1268  5170      1          5174  5139 (NOTLB)
Jan 23 12:39:28 acnmcr4p kernel: ec153c94 00200082 65cd4e2a c02d6eb4 d14ae1b0 e5d453b0 00200200 00000000 
Jan 23 12:39:28 acnmcr4p kernel:        c35bc860 00000000 c35bc780 c35bbde0 00000000 00000000 0d1d4140 001553b9 
Jan 23 12:39:28 acnmcr4p kernel:        e5d453b0 ef381330 ef38149c 00000000 00200246 65cef381 65cef381 ec153d04 
Jan 23 12:39:28 acnmcr4p kernel: Call Trace:
Jan 23 12:39:28 acnmcr4p kernel:  [<c02d6eb4>] common_interrupt+0x18/0x20
Jan 23 12:39:28 acnmcr4p kernel:  [<c02d4c5a>] schedule_timeout+0x139/0x154
Jan 23 12:39:28 acnmcr4p kernel:  [<c012a73a>] process_timeout+0x0/0x5
Jan 23 12:39:28 acnmcr4p kernel:  [<c02d4b17>] io_schedule_timeout+0x26/0x30
Jan 23 12:39:28 acnmcr4p kernel:  [<c02267f8>] blk_congestion_wait+0x64/0x78
Jan 23 12:39:28 acnmcr4p kernel:  [<c012052d>] autoremove_wake_function+0x0/0x2d
Jan 23 12:39:28 acnmcr4p kernel:  [<c0145038>] get_writeback_state+0x30/0x35
Jan 23 12:39:28 acnmcr4p kernel:  [<c012052d>] autoremove_wake_function+0x0/0x2d
Jan 23 12:39:28 acnmcr4p kernel:  [<c0145291>] balance_dirty_pages+0xbe/0x11c
Jan 23 12:39:28 acnmcr4p kernel:  [<f8f86105>] nfs_commit_write+0x43/0x72 [nfs]
Jan 23 12:39:28 acnmcr4p kernel:  [<c0142517>] generic_file_buffered_write+0x41f/0x501
Jan 23 12:39:28 acnmcr4p kernel:  [<c01444a1>] __pagevec_free+0x15/0x1a
Jan 23 12:39:28 acnmcr4p kernel:  [<c014941a>] release_pages+0x13b/0x143
Jan 23 12:39:28 acnmcr4p kernel:  [<c0142982>] __generic_file_aio_write_nolock+0x389/0x3b7
Jan 23 12:39:28 acnmcr4p kernel:  [<c01429e9>] generic_file_aio_write_nolock+0x39/0x7f
Jan 23 12:39:28 acnmcr4p kernel:  [<c0142bd3>] generic_file_aio_write+0x72/0xc6
Jan 23 12:39:28 acnmcr4p kernel:  [<f8f86212>] nfs_file_write+0xde/0xf9 [nfs]
Jan 23 12:39:28 acnmcr4p kernel:  [<c015b5b8>] do_sync_write+0x9e/0xcb
Jan 23 12:39:28 acnmcr4p kernel:  [<c01adf3a>] selinux_file_permission+0x117/0x120
Jan 23 12:39:28 acnmcr4p kernel:  [<c012052d>] autoremove_wake_function+0x0/0x2d
Jan 23 12:39:28 acnmcr4p kernel:  [<c015b69b>] vfs_write+0xb6/0xe2
Jan 23 12:39:28 acnmcr4p kernel:  [<c015b765>] sys_write+0x3c/0x62
Jan 23 12:39:28 acnmcr4p kernel:  [<c02d64db>] syscall_call+0x7/0xb
Jan 23 12:39:28 acnmcr4p kernel:  [<c02d007b>] unix_dgram_sendmsg+0x23c/0x45d

Jan 23 12:39:29 acnmcr4p kernel: BoosterMainMa D C02D6EB4  1484 11566      1         11623 10722 (NOTLB)
Jan 23 12:39:29 acnmcr4p kernel: da677c94 00200082 65ccc646 c02d6eb4 e5d453b0 f5cf4370 00200200 00000000 
Jan 23 12:39:29 acnmcr4p kernel:        c35c4860 00000000 c35bc780 c35bbde0 00000000 00000000 0d1d4140 001553b9 
Jan 23 12:39:29 acnmcr4p kernel:        f5cf4370 ec3574b0 ec35761c 00000000 00200246 65cef381 65cef381 da677d04 
Jan 23 12:39:29 acnmcr4p kernel: Call Trace:
Jan 23 12:39:29 acnmcr4p kernel:  [<c02d6eb4>] common_interrupt+0x18/0x20
Jan 23 12:39:29 acnmcr4p kernel:  [<c02d4c5a>] schedule_timeout+0x139/0x154
Jan 23 12:39:29 acnmcr4p kernel:  [<c012a73a>] process_timeout+0x0/0x5
Jan 23 12:39:29 acnmcr4p kernel:  [<c02d4b17>] io_schedule_timeout+0x26/0x30
Jan 23 12:39:29 acnmcr4p kernel:  [<c02267f8>] blk_congestion_wait+0x64/0x78
Jan 23 12:39:29 acnmcr4p kernel:  [<c012052d>] autoremove_wake_function+0x0/0x2d
Jan 23 12:39:29 acnmcr4p kernel:  [<c0145038>] get_writeback_state+0x30/0x35
Jan 23 12:39:29 acnmcr4p kernel:  [<c012052d>] autoremove_wake_function+0x0/0x2d
Jan 23 12:39:29 acnmcr4p kernel:  [<c0145291>] balance_dirty_pages+0xbe/0x11c
Jan 23 12:39:29 acnmcr4p kernel:  [<f8f86105>] nfs_commit_write+0x43/0x72 [nfs]
Jan 23 12:39:29 acnmcr4p kernel:  [<c0142517>] generic_file_buffered_write+0x41f/0x501
Jan 23 12:39:29 acnmcr4p kernel:  [<c0144154>] buffered_rmqueue+0x17d/0x1a5
Jan 23 12:39:29 acnmcr4p kernel:  [<c0142982>] __generic_file_aio_write_nolock+0x389/0x3b7
Jan 23 12:39:29 acnmcr4p kernel:  [<c01429e9>] generic_file_aio_write_nolock+0x39/0x7f
Jan 23 12:39:29 acnmcr4p kernel:  [<c0142bd3>] generic_file_aio_write+0x72/0xc6
Jan 23 12:39:29 acnmcr4p kernel:  [<f8f86212>] nfs_file_write+0xde/0xf9 [nfs]
Jan 23 12:39:29 acnmcr4p kernel:  [<c015b5b8>] do_sync_write+0x9e/0xcb
Jan 23 12:39:29 acnmcr4p kernel:  [<c01adf3a>] selinux_file_permission+0x117/0x120
Jan 23 12:39:29 acnmcr4p kernel:  [<c012052d>] autoremove_wake_function+0x0/0x2d
Jan 23 12:39:29 acnmcr4p kernel:  [<c015b69b>] vfs_write+0xb6/0xe2
Jan 23 12:39:29 acnmcr4p kernel:  [<c015b765>] sys_write+0x3c/0x62
Jan 23 12:39:29 acnmcr4p kernel:  [<c02d64db>] syscall_call+0x7/0xb
Jan 23 12:39:29 acnmcr4p kernel:  [<c02d007b>] unix_dgram_sendmsg+0x23c/0x45d

Jan 23 12:39:30 acnmcr4p kernel: CheckMyMounts D C02D6EB4  1492 12320  12318                     (NOTLB)
Jan 23 12:39:30 acnmcr4p kernel: c61d7c94 00000082 65c283e8 c02d6eb4 f5caedf0 e19470b0 00000200 00000000 
Jan 23 12:39:30 acnmcr4p kernel:        00000001 00000000 c35bc780 c35bbde0 00000000 00000000 0d1d4140 001553b9 
Jan 23 12:39:30 acnmcr4p kernel:        e19470b0 e3fea930 e3feaa9c 00000000 00000246 65cef381 65cef381 c61d7d04 
Jan 23 12:39:30 acnmcr4p kernel: Call Trace:
Jan 23 12:39:30 acnmcr4p kernel:  [<c02d6eb4>] common_interrupt+0x18/0x20
Jan 23 12:39:30 acnmcr4p kernel:  [<c02d4c5a>] schedule_timeout+0x139/0x154
Jan 23 12:39:30 acnmcr4p kernel:  [<c012a73a>] process_timeout+0x0/0x5
Jan 23 12:39:30 acnmcr4p kernel:  [<c02d4b17>] io_schedule_timeout+0x26/0x30
Jan 23 12:39:30 acnmcr4p kernel:  [<c02267f8>] blk_congestion_wait+0x64/0x78
Jan 23 12:39:30 acnmcr4p kernel:  [<c012052d>] autoremove_wake_function+0x0/0x2d
Jan 23 12:39:30 acnmcr4p kernel:  [<c0145038>] get_writeback_state+0x30/0x35
Jan 23 12:39:30 acnmcr4p kernel:  [<c012052d>] autoremove_wake_function+0x0/0x2d
Jan 23 12:39:30 acnmcr4p kernel:  [<c0145291>] balance_dirty_pages+0xbe/0x11c
Jan 23 12:39:30 acnmcr4p kernel:  [<f8f86105>] nfs_commit_write+0x43/0x72 [nfs]
Jan 23 12:39:30 acnmcr4p kernel:  [<c0142517>] generic_file_buffered_write+0x41f/0x501
Jan 23 12:39:30 acnmcr4p kernel:  [<f8f87b94>] nfs_setattr+0x142/0x151 [nfs]
Jan 23 12:39:30 acnmcr4p kernel:  [<c01ab14b>] avc_has_perm_noaudit+0x8d/0xda
Jan 23 12:39:30 acnmcr4p kernel:  [<c0142982>] __generic_file_aio_write_nolock+0x389/0x3b7
Jan 23 12:39:30 acnmcr4p kernel:  [<c01429e9>] generic_file_aio_write_nolock+0x39/0x7f
Jan 23 12:39:30 acnmcr4p kernel:  [<c0142bd3>] generic_file_aio_write+0x72/0xc6
Jan 23 12:39:30 acnmcr4p kernel:  [<f8f86212>] nfs_file_write+0xde/0xf9 [nfs]
Jan 23 12:39:30 acnmcr4p kernel:  [<c015b5b8>] do_sync_write+0x9e/0xcb
Jan 23 12:39:30 acnmcr4p kernel:  [<f8f87f46>] nfs_file_set_open_context+0x24/0x43 [nfs]
Jan 23 12:39:30 acnmcr4p kernel:  [<c01adf3a>] selinux_file_permission+0x117/0x120
Jan 23 12:39:30 acnmcr4p kernel:  [<c012052d>] autoremove_wake_function+0x0/0x2d
Jan 23 12:39:30 acnmcr4p kernel:  [<c015b69b>] vfs_write+0xb6/0xe2
Jan 23 12:39:30 acnmcr4p kernel:  [<c015b765>] sys_write+0x3c/0x62
Jan 23 12:39:30 acnmcr4p kernel:  [<c02d64db>] syscall_call+0x7/0xb
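
All four traces above share the same path: nfs_file_write, then balance_dirty_pages, then blk_congestion_wait. When many such dumps are collected, a short script can confirm that pattern without reading each trace by hand. The following is a rough sketch, assuming the syslog format shown above and assuming that the second decimal column of the task header line is the pid. The names parse_task_dump and blocked_in are hypothetical, and task names containing spaces are not handled.

```python
import re

# Task header, e.g. "kernel: RampEditor    D C02D6EB4  1288 21855      1 ..."
# Captures the task name, the state letter, and the (assumed) pid column.
HEADER = re.compile(r"kernel: (\S+)\s+([A-Z])\s+[0-9A-Fa-f]{8}\s+\d+\s+(\d+)\s")

# Stack frame, e.g. "kernel:  [<c02267f8>] blk_congestion_wait+0x64/0x78"
FRAME = re.compile(r"kernel:\s+\[<[0-9a-f]{8}>\]\s+(\w+)\+0x")


def parse_task_dump(lines):
    """Group syslog lines into one dict per task: comm, state, pid, frames."""
    tasks = []
    for line in lines:
        header = HEADER.search(line)
        if header:
            tasks.append({
                "comm": header.group(1),
                "state": header.group(2),
                "pid": int(header.group(3)),
                "frames": [],
            })
            continue
        frame = FRAME.search(line)
        if frame and tasks:
            # Attach the frame to the most recently seen task header.
            tasks[-1]["frames"].append(frame.group(1))
    return tasks


def blocked_in(tasks, symbol="blk_congestion_wait"):
    """Return names of D-state tasks whose trace contains the given symbol."""
    return [t["comm"] for t in tasks
            if t["state"] == "D" and symbol in t["frames"]]


if __name__ == "__main__":
    import sys
    with open(sys.argv[1]) as log:
        print(blocked_in(parse_task_dump(log)))
```

Passing the path to a saved copy of /var/log/messages as the first argument prints the names of the tasks whose traces include blk_congestion_wait.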

Comment 1 Jeff Layton 2008-04-03 14:13:37 UTC
Closing this as a dupe of bug 396081. I suspect that it's the same problem. You
may wish to test with some of the test kernels on Vivek's people page. 

*** This bug has been marked as a duplicate of 396081 ***