Problem description: ==================== Setup the San blast test and left the system overnight. The next day the file system was locked and the box had to be rebooted dmesg could no be pulled:(. The box was not hung but the / files system was Read only as best I can tell. In var log messages Jul 23 18:39:31 elm3c24 kernel: WARNING: at kernel/rtmutex.c:1732 rt_read_slowunlock() Jul 23 18:39:31 elm3c24 kernel: Pid: 9733, comm: pdflush Not tainted 2.6.24.7-72ibmrt2.5 #1 Jul 23 18:39:31 elm3c24 kernel: Jul 23 18:39:31 elm3c24 kernel: Call Trace: Jul 23 18:39:31 elm3c24 kernel: [<ffffffff81285ef1>] rt_read_slowunlock+0x14c/0x43e Jul 23 18:39:31 elm3c24 kernel: [<ffffffff8105dc1d>] rt_rwlock_read_unlock+0x258/0x25d Jul 23 18:39:31 elm3c24 kernel: [<ffffffff81287257>] rt_read_unlock+0x9/0xb Jul 23 18:39:31 elm3c24 kernel: [<ffffffff8819ce80>] :dm_mod:dm_get_table+0x35/0x3d Jul 23 18:39:31 elm3c24 kernel: [<ffffffff8819d045>] :dm_mod:dm_any_congested+0x18/0x53 Jul 23 18:39:31 elm3c24 kernel: [<ffffffff8108b3ee>] write_cache_pages+0x207/0x2f7 Jul 23 18:39:31 elm3c24 kernel: [<ffffffff8108acb2>] ? __writepage+0x0/0x36 Jul 23 18:39:31 elm3c24 kernel: [<ffffffff8108b500>] generic_writepages+0x22/0x28 Jul 23 18:39:31 elm3c24 kernel: [<ffffffff8108b538>] do_writepages+0x32/0x3a Jul 23 18:39:31 elm3c24 kernel: [<ffffffff810cce87>] __writeback_single_inode+0x1a4/0x30a Jul 23 18:39:31 elm3c24 kernel: [<ffffffff81287257>] ? rt_read_unlock+0x9/0xb Jul 23 18:39:31 elm3c24 kernel: [<ffffffff810cd38e>] sync_sb_inodes+0x1be/0x27c Jul 23 18:39:31 elm3c24 kernel: [<ffffffff810cd661>] writeback_inodes+0x74/0xd1 Jul 23 18:39:31 elm3c24 kernel: [<ffffffff8108bd38>] background_writeout+0x87/0xb9 Jul 23 18:39:31 elm3c24 kernel: [<ffffffff8108c549>] ? pdflush+0x0/0x1fc Jul 23 18:39:31 elm3c24 kernel: [<ffffffff8108c69a>] pdflush+0x151/0x1fc Jul 23 18:39:31 elm3c24 kernel: [<ffffffff8108bcb1>] ? background_writeout+0x0/0xb9 Jul 23 18:39:31 elm3c24 kernel: [<ffffffff8105143b>] kthread+0x49/0x76 Jul 23 18:39:31 elm3c24 kernel: [<ffffffff8100d048>] child_rip+0xa/0x12 Jul 23 18:39:31 elm3c24 kernel: [<ffffffff810513f2>] ? kthread+0x0/0x76 Jul 23 18:39:31 elm3c24 kernel: [<ffffffff8100d03e>] ? child_rip+0x0/0x12 Jul 23 18:39:31 elm3c24 kernel: Jul 23 18:47:01 elm3c24 kernel: WARNING: at kernel/rtmutex.c:1732 rt_read_slowunlock() Jul 23 18:47:01 elm3c24 kernel: Pid: 9862, comm: pdflush Not tainted 2.6.24.7-72ibmrt2.5 #1 Jul 23 18:47:01 elm3c24 kernel: Jul 23 18:47:01 elm3c24 kernel: Call Trace: Jul 23 18:47:01 elm3c24 kernel: [<ffffffff81285ef1>] rt_read_slowunlock+0x14c/0x43e Jul 23 18:47:01 elm3c24 kernel: [<ffffffff8105dc1d>] rt_rwlock_read_unlock+0x258/0x25d Jul 23 18:47:01 elm3c24 kernel: [<ffffffff81287257>] rt_read_unlock+0x9/0xb Jul 23 18:47:01 elm3c24 kernel: [<ffffffff8819ce80>] :dm_mod:dm_get_table+0x35/0x3d Jul 23 18:47:01 elm3c24 kernel: [<ffffffff8819d0d2>] :dm_mod:__split_bio+0x20/0x3b4 Jul 23 18:47:01 elm3c24 kernel: [<ffffffff8112b509>] ? __make_request+0x57c/0x5c7 Jul 23 18:47:01 elm3c24 kernel: [<ffffffff8819dcc0>] :dm_mod:dm_request+0x1fd/0x223 Jul 23 18:47:01 elm3c24 kernel: [<ffffffff811283aa>] generic_make_request+0x332/0x36d Jul 23 18:47:01 elm3c24 kernel: [<ffffffff81086b57>] ? mempool_alloc_slab+0x11/0x13 Jul 23 18:47:01 elm3c24 kernel: [<ffffffff88071ce7>] ? :ext3:ext3_get_block+0x0/0xfc Jul 23 18:47:01 elm3c24 kernel: [<ffffffff8112850a>] submit_bio+0x125/0x12e Jul 23 18:47:01 elm3c24 kernel: [<ffffffff88071ce7>] ? 
:ext3:ext3_get_block+0x0/0xfc Jul 23 18:47:01 elm3c24 kernel: [<ffffffff810d0cf3>] submit_bh+0xed/0x111 Jul 23 18:47:01 elm3c24 kernel: [<ffffffff810d2921>] __block_write_full_page+0x1cb/0x2b7 Jul 23 18:47:01 elm3c24 kernel: [<ffffffff88071ce7>] ? :ext3:ext3_get_block+0x0/0xfc Jul 23 18:47:01 elm3c24 kernel: [<ffffffff88071ce7>] ? :ext3:ext3_get_block+0x0/0xfc Jul 23 18:47:01 elm3c24 kernel: [<ffffffff810d2ace>] block_write_full_page+0xc1/0xcd Jul 23 18:47:01 elm3c24 kernel: [<ffffffff880733a0>] :ext3:ext3_ordered_writepage+0xc6/0x171 Jul 23 18:47:01 elm3c24 kernel: [<ffffffff8108acc7>] __writepage+0x15/0x36 Jul 23 18:47:01 elm3c24 kernel: [<ffffffff8108b38c>] write_cache_pages+0x1a5/0x2f7 Jul 23 18:47:01 elm3c24 kernel: [<ffffffff8108acb2>] ? __writepage+0x0/0x36 Jul 23 18:47:01 elm3c24 kernel: [<ffffffff81287193>] ? rt_spin_lock+0x9/0xb Jul 23 18:47:01 elm3c24 kernel: [<ffffffff81045f67>] ? lock_timer_base+0x26/0x4b Jul 23 18:47:01 elm3c24 kernel: [<ffffffff8108b500>] generic_writepages+0x22/0x28 Jul 23 18:47:01 elm3c24 kernel: [<ffffffff8108b538>] do_writepages+0x32/0x3a Jul 23 18:47:01 elm3c24 kernel: [<ffffffff810cce87>] __writeback_single_inode+0x1a4/0x30a Jul 23 18:47:01 elm3c24 kernel: [<ffffffff81287257>] ? rt_read_unlock+0x9/0xb Jul 23 18:47:01 elm3c24 kernel: [<ffffffff810cd38e>] sync_sb_inodes+0x1be/0x27c Jul 23 18:47:01 elm3c24 kernel: [<ffffffff810cd661>] writeback_inodes+0x74/0xd1 Jul 23 18:47:01 elm3c24 kernel: [<ffffffff8108bd38>] background_writeout+0x87/0xb9 Jul 23 18:47:01 elm3c24 kernel: [<ffffffff8108c549>] ? pdflush+0x0/0x1fc Jul 23 18:47:01 elm3c24 kernel: [<ffffffff8108c69a>] pdflush+0x151/0x1fc Jul 23 18:47:01 elm3c24 kernel: [<ffffffff8108bcb1>] ? background_writeout+0x0/0xb9 Jul 23 18:47:01 elm3c24 kernel: [<ffffffff8105143b>] kthread+0x49/0x76 Jul 23 18:47:01 elm3c24 kernel: [<ffffffff8100d048>] child_rip+0xa/0x12 Jul 23 18:47:01 elm3c24 kernel: [<ffffffff810513f2>] ? kthread+0x0/0x76 Jul 23 18:47:01 elm3c24 kernel: [<ffffffff8100d03e>] ? child_rip+0x0/0x12 2.6.24.7-72ibmrt2.5 kerenl version.... logs show not that much time before repro. Again DMESG was not avaible so this just means the disk was rw. =Comment: #1================================================= KEITH MANNTHEY <mannthey.com> - 2008-07-24 19:51 EDT The SAN swith topology in this blade center is still unstable :( might be related. Working on it.... =Comment: #2================================================= KEITH MANNTHEY <mannthey.com> - 2008-07-24 19:52 EDT Kicked the tests off again on the box.. A few min into the test. krcupreemptd new prio is 115?? WARNING: at kernel/rtmutex.c:1732 rt_read_slowunlock() Pid: 9478, comm: pdflush Not tainted 2.6.24.7-72ibmrt2.5 #1 Call Trace: [<ffffffff81287257>] ? rt_read_unlock+0x9/0xb [<ffffffff81285ef1>] rt_read_slowunlock+0x14c/0x43e [<ffffffff8105f455>] rt_mutex_up_read+0x25b/0x260 [<ffffffff8105fd52>] rt_up_read+0x9/0xb [<ffffffff810cd675>] writeback_inodes+0x88/0xd1 [<ffffffff8108bd38>] background_writeout+0x87/0xb9 [<ffffffff8108c549>] ? pdflush+0x0/0x1fc [<ffffffff8108c69a>] pdflush+0x151/0x1fc [<ffffffff8108bcb1>] ? background_writeout+0x0/0xb9 [<ffffffff8105143b>] kthread+0x49/0x76 [<ffffffff8100d048>] child_rip+0xa/0x12 [<ffffffff810513f2>] ? kthread+0x0/0x76 [<ffffffff8100d03e>] ? child_rip+0x0/0x12 =Comment: #3================================================= KEITH MANNTHEY <mannthey.com> - 2008-07-24 20:05 EDT Grabed a dmesg. IPMI kcs interface initialized ipmi device interface krcupreemptd new prio is 115?? 
WARNING: at kernel/rtmutex.c:1732 rt_read_slowunlock() Pid: 9478, comm: pdflush Not tainted 2.6.24.7-72ibmrt2.5 #1 Call Trace: [<ffffffff81287257>] ? rt_read_unlock+0x9/0xb [<ffffffff81285ef1>] rt_read_slowunlock+0x14c/0x43e [<ffffffff8105f455>] rt_mutex_up_read+0x25b/0x260 [<ffffffff8105fd52>] rt_up_read+0x9/0xb [<ffffffff810cd675>] writeback_inodes+0x88/0xd1 [<ffffffff8108bd38>] background_writeout+0x87/0xb9 [<ffffffff8108c549>] ? pdflush+0x0/0x1fc [<ffffffff8108c69a>] pdflush+0x151/0x1fc [<ffffffff8108bcb1>] ? background_writeout+0x0/0xb9 [<ffffffff8105143b>] kthread+0x49/0x76 [<ffffffff8100d048>] child_rip+0xa/0x12 [<ffffffff810513f2>] ? kthread+0x0/0x76 [<ffffffff8100d03e>] ? child_rip+0x0/0x12 APIC error on CPU1: 00(40) APIC error on CPU5: 00(40) APIC error on CPU0: 00(40) APIC error on CPU3: 00(40) APIC error on CPU4: 00(40) APIC error on CPU6: 00(40) APIC error on CPU7: 00(40) APIC error on CPU2: 00(40) Machine check events logged I was able to log in via ssh it appears the SAN is up but the system is unusable. Not sure what is going on but it is bad. Can't get updtime or top or tab in a bash shell. After a while can't do anything. It feels like oom or something like that. Need to get an strace off when in this state. Rebooting box. =Comment: #4================================================= KEITH MANNTHEY <mannthey.com> - 2008-07-24 20:26 EDT more pain... eth2: no IPv6 routers present krcupreemptd new prio is 115?? WARNING: at kernel/rtmutex.c:1732 rt_read_slowunlock() Pid: 9470, comm: pdflush Not tainted 2.6.24.7-72ibmrt2.5 #1 Call Trace: [<ffffffff81285ef1>] rt_read_slowunlock+0x14c/0x43e [<ffffffff8105dc1d>] rt_rwlock_read_unlock+0x258/0x25d [<ffffffff81287257>] rt_read_unlock+0x9/0xb [<ffffffff8819ce80>] :dm_mod:dm_get_table+0x35/0x3d [<ffffffff8819d045>] :dm_mod:dm_any_congested+0x18/0x53 [<ffffffff8108b3ee>] write_cache_pages+0x207/0x2f7 [<ffffffff8108acb2>] ? __writepage+0x0/0x36 [<ffffffff8108b500>] generic_writepages+0x22/0x28 [<ffffffff8108b538>] do_writepages+0x32/0x3a [<ffffffff810cce87>] __writeback_single_inode+0x1a4/0x30a [<ffffffff81287257>] ? rt_read_unlock+0x9/0xb [<ffffffff810cd38e>] sync_sb_inodes+0x1be/0x27c [<ffffffff810cd661>] writeback_inodes+0x74/0xd1 [<ffffffff8108bd38>] background_writeout+0x87/0xb9 [<ffffffff8108c549>] ? pdflush+0x0/0x1fc [<ffffffff8108c69a>] pdflush+0x151/0x1fc [<ffffffff8108bcb1>] ? background_writeout+0x0/0xb9 [<ffffffff8105143b>] kthread+0x49/0x76 [<ffffffff8100d048>] child_rip+0xa/0x12 [<ffffffff810513f2>] ? kthread+0x0/0x76 [<ffffffff8100d03e>] ? child_rip+0x0/0x12 WARNING: at kernel/rtmutex.c:1732 rt_read_slowunlock() Pid: 9491, comm: pdflush Not tainted 2.6.24.7-72ibmrt2.5 #1 Call Trace: [<ffffffff881c0d64>] ? :dm_multipath:map_io+0x130/0x140 [<ffffffff81285ef1>] rt_read_slowunlock+0x14c/0x43e [<ffffffff8105dc1d>] rt_rwlock_read_unlock+0x258/0x25d [<ffffffff81287257>] rt_read_unlock+0x9/0xb [<ffffffff8819ce80>] :dm_mod:dm_get_table+0x35/0x3d [<ffffffff8819d0d2>] :dm_mod:__split_bio+0x20/0x3b4 [<ffffffff8112b509>] ? __make_request+0x57c/0x5c7 [<ffffffff8819dcc0>] :dm_mod:dm_request+0x1fd/0x223 [<ffffffff811283aa>] generic_make_request+0x332/0x36d [<ffffffff81086b57>] ? mempool_alloc_slab+0x11/0x13 [<ffffffff88071ce7>] ? :ext3:ext3_get_block+0x0/0xfc [<ffffffff8112850a>] submit_bio+0x125/0x12e [<ffffffff88071ce7>] ? :ext3:ext3_get_block+0x0/0xfc [<ffffffff810d0cf3>] submit_bh+0xed/0x111 [<ffffffff810d2921>] __block_write_full_page+0x1cb/0x2b7 [<ffffffff88071ce7>] ? :ext3:ext3_get_block+0x0/0xfc [<ffffffff88071ce7>] ? 
:ext3:ext3_get_block+0x0/0xfc [<ffffffff810d2ace>] block_write_full_page+0xc1/0xcd [<ffffffff880733a0>] :ext3:ext3_ordered_writepage+0xc6/0x171 [<ffffffff8108acc7>] __writepage+0x15/0x36 [<ffffffff8108b38c>] write_cache_pages+0x1a5/0x2f7 [<ffffffff8108acb2>] ? __writepage+0x0/0x36 [<ffffffff8108b500>] generic_writepages+0x22/0x28 [<ffffffff8108b538>] do_writepages+0x32/0x3a [<ffffffff810cce87>] __writeback_single_inode+0x1a4/0x30a [<ffffffff81287257>] ? rt_read_unlock+0x9/0xb [<ffffffff810cd38e>] sync_sb_inodes+0x1be/0x27c [<ffffffff810cd661>] writeback_inodes+0x74/0xd1 [<ffffffff8108bd38>] background_writeout+0x87/0xb9 [<ffffffff8108c549>] ? pdflush+0x0/0x1fc [<ffffffff8108c69a>] pdflush+0x151/0x1fc [<ffffffff8108bcb1>] ? background_writeout+0x0/0xb9 [<ffffffff8105143b>] kthread+0x49/0x76 [<ffffffff8100d048>] child_rip+0xa/0x12 [<ffffffff810513f2>] ? kthread+0x0/0x76 [<ffffffff8100d03e>] ? child_rip+0x0/0x12 =Comment: #5================================================= KEITH MANNTHEY <mannthey.com> - 2008-07-24 20:27 EDT After the above warning box as just fine. =Comment: #6================================================= Darren V. Hart <dvhltc.com> - 2008-07-24 20:44 EDT Keith, krcupreemptd new prio is 115?? should be fixed in current R2. What are you testing on here? =Comment: #7================================================= KEITH MANNTHEY <mannthey.com> - 2008-07-24 21:11 EDT Umm lets see R2 from about a week ago. This box was setup for SAN testing a while ago. Latest.tar was linux-rt-Developer_Release-1122. I am moving to a current svn built kernel that includes a small fix (I don't suspect it is related). Check out the weird top output.... This is weird. the box has bad cpus and a pre-ga SAN card. I am setting up on a k"suppored" platfrom to reproduce there. 
top - 18:27:17 up 57 min, 4 users, load average: 20.03, 19.68, 16.40 Tasks: 272 total, 1 running, 271 sleeping, 0 stopped, 0 zombie Cpu(s): 0.0%us, 0.1%sy, 0.0%ni, 25.0%id, 74.9%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 8053068k total, 1869144k used, 6183924k free, 227428k buffers Swap: 2031608k total, 0k used, 2031608k free, 1299264k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 37 root -31 -5 0 0 0 S 0 0.0 0:01.11 sirq-sched/2 8267 root 39 19 0 0 0 S 0 0.0 0:25.45 kipmi0 9373 root 20 0 12852 1224 820 R 0 0.0 0:28.73 top 1 root 20 0 10332 700 588 S 0 0.0 0:05.44 init 2 root 15 -5 0 0 0 S 0 0.0 0:00.00 kthreadd 3 root RT -5 0 0 0 S 0 0.0 0:00.00 migration/0 4 root RT -5 0 0 0 S 0 0.0 0:00.00 posix_cpu_timer 5 root -31 -5 0 0 0 S 0 0.0 0:00.00 sirq-high/0 6 root -31 -5 0 0 0 S 0 0.0 0:03.77 sirq-timer/0 7 root -91 -5 0 0 0 S 0 0.0 0:00.00 sirq-net-tx/0 8 root -91 -5 0 0 0 S 0 0.0 0:00.06 sirq-net-rx/0 9 root -31 -5 0 0 0 S 0 0.0 0:42.42 sirq-block/0 10 root -31 -5 0 0 0 S 0 0.0 0:00.00 sirq-tasklet/0 11 root -31 -5 0 0 0 S 0 0.0 0:01.12 sirq-sched/0 12 root -93 -5 0 0 0 S 0 0.0 0:00.00 sirq-hrtimer/0 13 root -31 -5 0 0 0 S 0 0.0 0:00.79 sirq-rcu/0 14 root RT -5 0 0 0 S 0 0.0 0:00.00 watchdog/0 15 root 10 -10 0 0 0 S 0 0.0 0:00.00 desched/0 16 root RT -5 0 0 0 S 0 0.0 0:00.00 migration/1 17 root RT -5 0 0 0 S 0 0.0 0:00.00 posix_cpu_timer 18 root -31 -5 0 0 0 S 0 0.0 0:00.00 sirq-high/1 19 root -31 -5 0 0 0 S 0 0.0 0:04.41 sirq-timer/1 20 root -91 -5 0 0 0 S 0 0.0 0:00.00 sirq-net-tx/1 21 root -91 -5 0 0 0 S 0 0.0 0:00.01 sirq-net-rx/1 22 root -31 -5 0 0 0 S 0 0.0 0:27.28 sirq-block/1 23 root -31 -5 0 0 0 S 0 0.0 0:00.00 sirq-tasklet/1 24 root -31 -5 0 0 0 S 0 0.0 0:00.95 sirq-sched/1 25 root -93 -5 0 0 0 S 0 0.0 0:00.04 sirq-hrtimer/1 26 root -31 -5 0 0 0 S 0 0.0 0:00.74 sirq-rcu/1 27 root RT -5 0 0 0 S 0 0.0 0:00.00 watchdog/1 28 root 10 -10 0 0 0 S 0 0.0 0:00.00 desched/1 29 root RT -5 0 0 0 S 0 0.0 0:00.00 migration/2 30 root RT -5 0 0 0 S 0 0.0 0:00.00 posix_cpu_timer 31 root -31 -5 0 0 0 S 0 0.0 0:00.00 sirq-high/2 32 root -31 -5 0 0 0 S 0 0.0 0:03.36 sirq-timer/2 =Comment: #8================================================= KEITH MANNTHEY <mannthey.com> - 2008-07-24 21:18 EDT Ok asked blast to stop. My top shell was ok. 3 other shells were dead in the water. Called sync on one. and pressed tab in the other 2. Totally dead shells but my top shell was alive. I was able to quit top and run a few commands (tab even). This lasted for a few min then pressed tab again and the shell locked up. Once a shell lockes up I don't ever see it come back. I am running tests as root. =Comment: #11================================================= KEITH MANNTHEY <mannthey.com> - 2008-07-25 12:13 EDT This issue has only been seen with code that is in R2. I don't know the root issue yet but as this is a SAN box I don't think we should head to MRG yet. =Comment: #12================================================= KEITH MANNTHEY <mannthey.com> - 2008-07-25 13:01 EDT Hmm moved to 2.6.24.7-74ibmrt2.5 and am still seeing the weird hang state. =Comment: #13================================================= JEFFREY D. LANE <jdlane.com> - 2008-07-29 10:18 EDT (In reply to comment #12) > Hmm moved to 2.6.24.7-74ibmrt2.5 and am still seeing the weird hang state. Just to add a second datapoint, I am seeing this as well on HS21XM running R2 RC1 in a SAN boot configuration. 
IN my case, however, I am using MPx and not blast, and my system remains up and stable, but /var/log/messages is full of similar rt_read_slowunlock() traces: (also, mine seem to mostly be centered around sirq-net-rx) Jul 29 06:17:21 HS21XM-R2RC1 kernel: rport-1:0-32: blocked FC remote port time out: removing target and saving binding Jul 29 06:17:21 HS21XM-R2RC1 kernel: rport-0:0-32: blocked FC remote port time out: removing target and saving binding Jul 29 09:36:11 HS21XM-R2RC1 kernel: WARNING: at kernel/rtmutex.c:1732 rt_read_slowunlock() Jul 29 09:36:11 HS21XM-R2RC1 kernel: Pid: 8, comm: sirq-net-rx/0 Not tainted 2.6.24.7-68ibmrt2.5 #1 Jul 29 09:36:11 HS21XM-R2RC1 kernel: Jul 29 09:36:11 HS21XM-R2RC1 kernel: Call Trace: Jul 29 09:36:11 HS21XM-R2RC1 kernel: [<ffffffff81288591>] rt_read_slowunlock+0x14c/0x43e Jul 29 09:36:11 HS21XM-R2RC1 kernel: [<ffffffff810a8c9a>] ? kmem_cache_free+0x63/0x70 Jul 29 09:36:11 HS21XM-R2RC1 kernel: [<ffffffff8105db19>] rt_rwlock_read_unlock+0x258/0x25d Jul 29 09:36:11 HS21XM-R2RC1 kernel: [<ffffffff812898f7>] rt_read_unlock+0x9/0xb Jul 29 09:36:11 HS21XM-R2RC1 kernel: [<ffffffff8125753b>] __udp4_lib_rcv+0x4a0/0x7d4 Jul 29 09:36:11 HS21XM-R2RC1 kernel: [<ffffffff81252e3a>] ? tcp_v4_rcv+0x707/0xab5 Jul 29 09:36:11 HS21XM-R2RC1 kernel: [<ffffffff81257884>] udp_rcv+0x15/0x17 Jul 29 09:36:11 HS21XM-R2RC1 kernel: [<ffffffff81235c73>] ip_local_deliver_finish+0x169/0x231 Jul 29 09:36:11 HS21XM-R2RC1 kernel: [<ffffffff81236157>] ip_local_deliver+0x72/0x7b Jul 29 09:36:11 HS21XM-R2RC1 kernel: [<ffffffff81235ad9>] ip_rcv_finish+0x331/0x362 Jul 29 09:36:11 HS21XM-R2RC1 kernel: [<ffffffff81289833>] ? rt_spin_lock+0x9/0xb Jul 29 09:36:11 HS21XM-R2RC1 kernel: [<ffffffff810a9d89>] ? kmem_cache_alloc_node+0xe6/0x10f Jul 29 09:36:11 HS21XM-R2RC1 kernel: [<ffffffff812360b1>] ip_rcv+0x274/0x2a8 Jul 29 09:36:11 HS21XM-R2RC1 kernel: [<ffffffff81212fdd>] netif_receive_skb+0x3a9/0x462 Jul 29 09:36:11 HS21XM-R2RC1 kernel: [<ffffffff81143222>] ? swiotlb_map_single+0x3b/0xb4 Jul 29 09:36:11 HS21XM-R2RC1 kernel: [<ffffffff882826ce>] :bnx2:bnx2_poll+0xc3a/0xe17 Jul 29 09:36:11 HS21XM-R2RC1 kernel: [<ffffffff81036769>] ? finish_task_switch+0x4c/0xdc Jul 29 09:36:11 HS21XM-R2RC1 kernel: [<ffffffff81215402>] net_rx_action+0xab/0x1e9 Jul 29 09:36:11 HS21XM-R2RC1 kernel: [<ffffffff81042a18>] ksoftirqd+0x16a/0x26f Jul 29 09:36:11 HS21XM-R2RC1 kernel: [<ffffffff810428ae>] ? ksoftirqd+0x0/0x26f Jul 29 09:36:11 HS21XM-R2RC1 kernel: [<ffffffff810428ae>] ? ksoftirqd+0x0/0x26f Jul 29 09:36:11 HS21XM-R2RC1 kernel: [<ffffffff810513e7>] kthread+0x49/0x76 Jul 29 09:36:11 HS21XM-R2RC1 kernel: [<ffffffff8100d048>] child_rip+0xa/0x12 Jul 29 09:36:11 HS21XM-R2RC1 kernel: [<ffffffff8105139e>] ? kthread+0x0/0x76 Jul 29 09:36:11 HS21XM-R2RC1 kernel: [<ffffffff8100d03e>] ? child_rip+0x0/0x12 I am also seeing these pop up as well: Jul 28 05:21:50 HS21XM-R2RC1 kernel: sirq-net-rx/0: page allocation failure. 
order:0, mode:0x20 Jul 28 05:21:50 HS21XM-R2RC1 kernel: Pid: 8, comm: sirq-net-rx/0 Not tainted 2.6.24.7-68ibmrt2.5 #1 Jul 28 05:21:50 HS21XM-R2RC1 kernel: Jul 28 05:21:50 HS21XM-R2RC1 kernel: Call Trace: Jul 28 05:21:50 HS21XM-R2RC1 kernel: [<ffffffff8108a3b3>] __alloc_pages+0x2fa/0x312 Jul 28 05:21:50 HS21XM-R2RC1 kernel: [<ffffffff810a9bd2>] kmem_getpages+0x9d/0x16e Jul 28 05:21:50 HS21XM-R2RC1 kernel: [<ffffffff810aa1aa>] fallback_alloc+0x138/0x1dc Jul 28 05:21:50 HS21XM-R2RC1 kernel: [<ffffffff810aa36f>] ____cache_alloc_node+0x121/0x136 Jul 28 05:21:50 HS21XM-R2RC1 kernel: [<ffffffff810a9d6a>] kmem_cache_alloc_node+0xc7/0x10f Jul 28 05:21:50 HS21XM-R2RC1 kernel: [<ffffffff810a9dd6>] __kmalloc_node+0x24/0x29 Jul 28 05:21:50 HS21XM-R2RC1 kernel: [<ffffffff8120df57>] __alloc_skb+0x69/0x133 Jul 28 05:21:50 HS21XM-R2RC1 kernel: [<ffffffff8120ebb6>] __netdev_alloc_skb+0x31/0x4d Jul 28 05:21:50 HS21XM-R2RC1 kernel: [<ffffffff882822b8>] :bnx2:bnx2_poll+0x824/0xe17 Jul 28 05:21:50 HS21XM-R2RC1 kernel: [<ffffffff81033431>] ? try_to_wake_up+0x20d/0x21e Jul 28 05:21:50 HS21XM-R2RC1 kernel: [<ffffffff81215402>] net_rx_action+0xab/0x1e9 Jul 28 05:21:50 HS21XM-R2RC1 kernel: [<ffffffff81042a18>] ksoftirqd+0x16a/0x26f Jul 28 05:21:50 HS21XM-R2RC1 kernel: [<ffffffff810428ae>] ? ksoftirqd+0x0/0x26f Jul 28 05:21:50 HS21XM-R2RC1 kernel: [<ffffffff810428ae>] ? ksoftirqd+0x0/0x26f Jul 28 05:21:50 HS21XM-R2RC1 kernel: [<ffffffff810513e7>] kthread+0x49/0x76 Jul 28 05:21:50 HS21XM-R2RC1 kernel: [<ffffffff8100d048>] child_rip+0xa/0x12 Jul 28 05:21:50 HS21XM-R2RC1 kernel: [<ffffffff8105139e>] ? kthread+0x0/0x76 Jul 28 05:21:50 HS21XM-R2RC1 kernel: [<ffffffff8100d03e>] ? child_rip+0x0/0x12 Jul 28 05:21:50 HS21XM-R2RC1 kernel: Jul 28 05:21:50 HS21XM-R2RC1 kernel: Mem-info: Jul 28 05:21:50 HS21XM-R2RC1 kernel: Node 0 DMA per-cpu: Jul 28 05:21:50 HS21XM-R2RC1 kernel: CPU 0: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0 Jul 28 05:21:50 HS21XM-R2RC1 kernel: CPU 1: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0 Jul 28 05:21:50 HS21XM-R2RC1 kernel: CPU 2: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0 Jul 28 05:21:50 HS21XM-R2RC1 kernel: CPU 3: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0 Jul 28 05:21:50 HS21XM-R2RC1 kernel: CPU 4: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0 Jul 28 05:21:50 HS21XM-R2RC1 kernel: CPU 5: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0 Jul 28 05:21:50 HS21XM-R2RC1 kernel: CPU 6: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0 Jul 28 05:21:50 HS21XM-R2RC1 kernel: CPU 7: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0 Jul 28 05:21:50 HS21XM-R2RC1 kernel: Node 0 DMA32 per-cpu: Jul 28 05:21:50 HS21XM-R2RC1 kernel: CPU 0: Hot: hi: 186, btch: 31 usd: 30 Cold: hi: 62, btch: 15 usd: 1 Jul 28 05:21:50 HS21XM-R2RC1 kernel: CPU 1: Hot: hi: 186, btch: 31 usd: 169 Cold: hi: 62, btch: 15 usd: 0 Jul 28 05:21:50 HS21XM-R2RC1 kernel: CPU 2: Hot: hi: 186, btch: 31 usd: 2 Cold: hi: 62, btch: 15 usd: 0 Jul 28 05:21:50 HS21XM-R2RC1 kernel: CPU 3: Hot: hi: 186, btch: 31 usd: 76 Cold: hi: 62, btch: 15 usd: 0 Jul 28 05:21:50 HS21XM-R2RC1 kernel: CPU 4: Hot: hi: 186, btch: 31 usd: 31 Cold: hi: 62, btch: 15 usd: 19 Jul 28 05:21:50 HS21XM-R2RC1 kernel: CPU 5: Hot: hi: 186, btch: 31 usd: 161 Cold: hi: 62, btch: 15 usd: 17 Jul 28 05:21:50 HS21XM-R2RC1 kernel: CPU 6: Hot: hi: 186, btch: 31 usd: 20 Cold: hi: 62, btch: 15 usd: 0 Jul 28 05:21:50 HS21XM-R2RC1 kernel: CPU 7: Hot: hi: 186, btch: 31 usd: 32 Cold: hi: 62, btch: 15 usd: 0 Jul 28 05:21:50 
HS21XM-R2RC1 kernel: Node 0 Normal per-cpu: Jul 28 05:21:50 HS21XM-R2RC1 kernel: CPU 0: Hot: hi: 186, btch: 31 usd: 42 Cold: hi: 62, btch: 15 usd: 42 Jul 28 05:21:50 HS21XM-R2RC1 kernel: CPU 1: Hot: hi: 186, btch: 31 usd: 157 Cold: hi: 62, btch: 15 usd: 48 Jul 28 05:21:50 HS21XM-R2RC1 kernel: CPU 2: Hot: hi: 186, btch: 31 usd: 131 Cold: hi: 62, btch: 15 usd: 6 Jul 28 05:21:50 HS21XM-R2RC1 kernel: CPU 3: Hot: hi: 186, btch: 31 usd: 163 Cold: hi: 62, btch: 15 usd: 52 Jul 28 05:21:50 HS21XM-R2RC1 kernel: CPU 4: Hot: hi: 186, btch: 31 usd: 157 Cold: hi: 62, btch: 15 usd: 48 Jul 28 05:21:50 HS21XM-R2RC1 kernel: CPU 5: Hot: hi: 186, btch: 31 usd: 184 Cold: hi: 62, btch: 15 usd: 57 Jul 28 05:21:50 HS21XM-R2RC1 kernel: CPU 6: Hot: hi: 186, btch: 31 usd: 105 Cold: hi: 62, btch: 15 usd: 57 Jul 28 05:21:50 HS21XM-R2RC1 kernel: CPU 7: Hot: hi: 186, btch: 31 usd: 164 Cold: hi: 62, btch: 15 usd: 49 Jul 28 05:21:50 HS21XM-R2RC1 kernel: Active:1538732 inactive:205001 dirty:8868 writeback:1263 unstable:4019 Jul 28 05:21:50 HS21XM-R2RC1 kernel: free:8849 slab:190850 mapped:15042 pagetables:6819 bounce:0 Jul 28 05:21:50 HS21XM-R2RC1 kernel: Node 0 DMA free:10320kB min:12kB low:12kB high:16kB active:0kB inactive:0kB present:9628kB pages_scanned:0 all_unreclaim able? yes Jul 28 05:21:50 HS21XM-R2RC1 kernel: lowmem_reserve[]: 0 2704 7954 7954 Jul 28 05:21:50 HS21XM-R2RC1 kernel: Node 0 DMA32 free:22436kB min:3876kB low:4844kB high:5812kB active:2305936kB inactive:20kB present:2769008kB pages_scann ed:4684 all_unreclaimable? no Jul 28 05:21:50 HS21XM-R2RC1 kernel: lowmem_reserve[]: 0 0 5250 5250 Jul 28 05:21:50 HS21XM-R2RC1 kernel: Node 0 Normal free:2640kB min:7528kB low:9408kB high:11292kB active:3848992kB inactive:819984kB present:5376000kB pages_ scanned:1196 all_unreclaimable? no Jul 28 05:21:50 HS21XM-R2RC1 kernel: lowmem_reserve[]: 0 0 0 0 Jul 28 05:21:50 HS21XM-R2RC1 kernel: Node 0 DMA: 4*4kB 2*8kB 5*16kB 3*32kB 4*64kB 3*128kB 1*256kB 0*512kB 1*1024kB 0*2048kB 2*4096kB = 10320kB Jul 28 05:21:50 HS21XM-R2RC1 kernel: Node 0 DMA32: 90*4kB 57*8kB 47*16kB 25*32kB 15*64kB 5*128kB 2*256kB 3*512kB 2*1024kB 3*2048kB 2*4096kB = 22400kB Jul 28 05:21:50 HS21XM-R2RC1 kernel: Node 0 Normal: 235*4kB 1*8kB 0*16kB 0*32kB 0*64kB 1*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 3124kB Jul 28 05:21:50 HS21XM-R2RC1 kernel: Swap cache: add 12122, delete 10774, find 35740/35832, race 0+3 Jul 28 05:21:50 HS21XM-R2RC1 kernel: Free swap = 1985760kB Jul 28 05:21:50 HS21XM-R2RC1 kernel: Total swap = 2031608kB Jul 28 05:21:50 HS21XM-R2RC1 kernel: Free swap: 1985760kB and I'm seeing some from pdflush as well. I am NOT seeing the APIC failure messages though. =Comment: #14================================================= KEITH MANNTHEY <mannthey.com> - 2008-07-30 15:04 EDT Thanks for the data point. Looks like the box is nearing oom? =Comment: #15================================================= JOSHUA A. TRIPLETT <josht.com> - 2008-08-13 11:53 EDT Changing the bug summary; the kernel WARNING message doesn't seem to relate to the hang. =Comment: #16================================================= KEITH MANNTHEY <mannthey.com> - 2008-08-13 11:59 EDT Ok there have not been many updates to this bug in the last little bit: At a hight level. JV and Josh are driving the bug. The hang seems related to the kind of IO the "blast" runs. I think it is something about lots small files but not 100% sure about that. Josh and JV this is a good place to keep technical. It is an easy place for others to look at. 
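For readers trying to picture the IO pattern being blamed here: the "lots of small files" workload described above (continuous create/write/read/delete of small files) corresponds roughly to the sketch below. This is only an illustration, not the actual blast or iopuser code; the directory, file count, and file size are made-up parameters.

```python
#!/usr/bin/env python
# Illustrative small-file churn loop, similar in spirit to the workload
# described above. NOT the real blast or iopuser tool; the path, file count
# and file size below are invented for the example. Run several copies in
# parallel (one per directory/LUN) to approximate the multi-threaded runs.
import os
import random

def churn(workdir, nfiles=10000, size=4096):
    if not os.path.isdir(workdir):
        os.makedirs(workdir)
    data = os.urandom(size)
    while True:
        name = os.path.join(workdir, "f%06d" % random.randrange(nfiles))
        with open(name, "wb") as f:        # create or overwrite a small file
            f.write(data)
            f.flush()
            os.fsync(f.fileno())           # push it through the writeback path
        with open(name, "rb") as f:        # read it back
            f.read()
        if random.random() < 0.5:          # delete about half of the files
            os.unlink(name)

if __name__ == "__main__":
    churn("/mnt/san0/churn")               # hypothetical SAN mount point
```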
=Comment: #17================================================= JOSHUA A. TRIPLETT <josht.com> - 2008-08-13 12:04 EDT
jvrao and I currently have various tests going, trying to narrow down the list of possible variables here. We can fairly consistently reproduce the problem on elm3c24 by running blast on the SAN partitions. We think we also managed to reproduce the problem on elm3c24 by running iopuser (a simple Python filesystem test we have run in the past), though it took longer; tweaking iopuser's parameters to write smaller files and more of them would make it behave more like blast, which might reproduce the problem faster. Currently trying to reproduce the problem on elm3c29.
=Comment: #18================================================= Venkateswarara Jujjuri <jvrao.com> - 2008-08-13 22:23 EDT
In an effort to reproduce the problem on another machine, Josh started blast on one SAN disk. That single-threaded blast ran fine overnight. Later, on Keith's advice, I created 3 more LUNs, added them to the machine, and started a 4-thread blast. After 20 minutes we observed that the system was hung, so at this point we believe we *DID* reproduce the problem on elm3c29. As per the plan, we need to retry the same test on a local disk: create 4 dirs on the local disk and restart the 4 blast threads. We have been trying to reboot the system (elm3c29) for the last hour; it is still trying to reboot, and I can see lots of IO errors on the console. I checked the DS4k and its status looks fine from the DS4k's point of view. Will continue to investigate and try to run blast on the local disk.
=Comment: #19================================================= Venkateswarara Jujjuri <jvrao.com> - 2008-08-14 01:21 EDT
elm3c29 took a long time to reboot but finally came up. Fired off 4 threads of blast onto the local disk. We only have one local disk available on this machine, so I directed all 4 threads to the same disk but to different directories. We will know the status by tomorrow morning.
=Comment: #20================================================= Venkateswarara Jujjuri <jvrao.com> - 2008-08-14 12:47 EDT
A single LOCAL disk with 4 blast threads ran overnight without any problems. The machine was responsive in the morning. Next step: to make an apples-to-apples comparison, I am starting a test on elm3c29 with 1 SAN disk and 4 blast threads on that disk.
=Comment: #21================================================= Darren V. Hart <dvhltc.com> - 2008-08-15 18:55 EDT
online/offline test: On a whim I tried looking at the Red Hat-reported multipathd bugs, such as https://bugzilla.redhat.com/show_bug.cgi?id=428338. I ran the test.sh script from that bug's attachment (attached here as san-fault-test.sh). It seemed to online/offline the drive just fine for 100 iterations, but when I killed the test the shells all hung on an ls command (even as prio 99 shells, sshd, and the interrupt and softirq threads). Will run again with a fixed 100 iterations rather than ctrl+c'ing to exit.
=Comment: #22================================================= Darren V. Hart <dvhltc.com> - 2008-08-15 20:42 EDT
I worked with JV and niv, and we collected a lot of information from the system as it slowly became less responsive. Given the quantity of data, I have copied it all over to kbic here: http://kernel.beaverton.ibm.com/realtime/kdump_cores/bz46744/elm3c24/ It includes a crashdump core (FINALLY), a sysrq-t trace, and ps output.
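(As an aside on the data collection just mentioned: the sysrq-t/sysrq-w/sysrq-m dumps quoted later in this bug can be captured from any root shell that still responds, before the box goes completely dark. A minimal sketch, assuming CONFIG_MAGIC_SYSRQ is built in and /proc is mounted; the key list is just an example:)

```python
#!/usr/bin/env python
# Minimal sketch: force task-state dumps into the kernel log (dmesg /
# /var/log/messages) while a root shell still works. Needs root.
def sysrq(keys):
    with open("/proc/sys/kernel/sysrq", "w") as f:
        f.write("1")                       # make sure the sysrq interface is on
    for key in keys:
        with open("/proc/sysrq-trigger", "w") as f:
            f.write(key)                   # 't' = all tasks, 'w' = blocked tasks,
                                           # 'm' = memory info

if __name__ == "__main__":
    sysrq("twm")
```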
We had several commands hung at the time each of these was collected, including some straced rm commands, an ls, and a sync. They all appear hung IN the kernel, on getdents(). We also observed 9MB or so of dirty pages (per meminfo and vmstat) that we could not flush out with sync, which is presumably why sync never returned. Plan is to get everything uploaded tonight and commence crashdump analysis on Monday.
=Comment: #23================================================= Venkateswarara Jujjuri <jvrao.com> - 2008-08-18 12:27 EDT
Took a dump when the system was about to hang. Will look into the dump. In parallel, the plan is to run tests on a non-RT kernel.
=Comment: #24================================================= Venkateswarara Jujjuri <jvrao.com> - 2008-08-19 00:32 EDT
Keith and I started analyzing the crash. Here are a few observations.
- 3 blast threads running. 2 of them are in __wait_on_bit_lock -> sync_buffer
- Many bash threads; most of them are waiting to get scheduled. But one thread is:
#0 [ffff8101105a1d48] schedule at ffffffff8128565f
#1 [ffff8101105a1e00] rt_mutex_slowlock at ffffffff81286dd7
#2 [ffff8101105a1ed0] rt_mutex_lock at ffffffff812862c6
#3 [ffff8101105a1ee0] _mutex_lock at ffffffff81287739
#4 [ffff8101105a1ef0] vfs_readdir at ffffffff810bcbea
#5 [ffff8101105a1f30] sys_getdents at ffffffff810bce86
It will be interesting to see who is holding the rt_mutex that this thread is blocked on in rt_mutex_slowlock. The other thing we need to see is what is holding up the scheduler and why so many bash threads are waiting to get scheduled. It is normal for a stack to show a thread waiting to get scheduled, but we know the system was not responding and shells were hanging when we took the crash.
* In addition, our attempt to build a vanilla 2.6.24.7 kernel with scsi_dh_rdac has not succeeded yet. We will keep working on creating that kernel so we can try to reproduce this on a non-RT system.
=Comment: #25================================================= Venkateswarara Jujjuri <jvrao.com> - 2008-08-20 02:01 EDT
Continuing to investigate the problem. This appears to be a tricky and involved issue. We analyzed the crash more deeply; here is what we have learned.
1. Most of the CPUs are idle. This is puzzling, especially when the system is not responsive; one would expect the CPUs to be busy.
2. Most of the threads are blocked on IO. This explains why some bash shells are responsive and some are not: any command or operation that causes a page fault or touches the disk hangs.
Chandra, Darren, Keith, and I spent time on this. Here is our plan of attack.
1. Run the same test on a machine with NO SAN. With Keith's help I installed R2 on elm3b210 and ran blast. Blast is still running after 6 hours, which tells us non-SAN RT is fine.
2. Run the SAN test on a vanilla 2.6.24 kernel. It ran for more than 4 hours, which tells us that stack is fine too.
At this point we are confident the problem exists only with the RT + SAN combination. We believe this problem could have been there for a long time but is getting exposed now because of the way blast does its testing: blast creates many small files and keeps reading/writing/deleting/creating them. Per Keith and Darren we have never tested this scenario before.
Now the target is to isolate the driver in the SAN IO stack. We are attempting to take the dm-rdac and dh modules out of the stack. With Keith's help I am installing a non-LVM, single-path SAN on elm3c29. This machine just finished installing base RH; tomorrow we will put the RT kernel on it and test with blast.
If blast runs fine on this, it tells us that the problem is there in the new dm-rdac/dh modules. If not we need to continue investigating the IO stack. Will update more info .. as we learn. =Comment: #26================================================= TIMOTHY R. CHAVEZ <chavezt.com> - 2008-08-20 16:51 EDT Apparently elm3b210 which is an LS41 with a local disk configuration seized after running the blaster test for some time. The system was still responsive to SysRq so I tried to get as much information from the hung machine as I possibly could. Unfortunately, I was not able to trigger a crashdump :/ Here's what I got (I have not analyze yet): SysRq : Show State init S [ffff81013fa8b480] ffffffff8129ebe0 0 1 0 ffff81013fa8d9d8 0000000000000086 0000000000000000 ffff81013fb57540 ffff81013fa8d968 ffffffff812872bb ffff81013fa8b480 ffff81013fb61580 ffff81013fa8b7c8 0000000201bbf9ff ffffffffffffffff 0000000101bbf9ff Call Trace: [<ffffffff812872bb>] ? rt_spin_lock+0x9/0xb [<ffffffff8128559a>] schedule+0xdf/0xff [<ffffffff8128586c>] schedule_timeout+0x8d/0xb4 [<ffffffff81045994>] ? process_timeout+0x0/0xb [<ffffffff81285867>] ? schedule_timeout+0x88/0xb4 [<ffffffff810bd561>] do_select+0x47f/0x4f2 [<ffffffff810bda4e>] ? __pollwait+0x0/0xdf [<ffffffff81033423>] ? default_wake_function+0x0/0x14 [<ffffffff8107edca>] ? __rcu_read_unlock+0x5a/0x5c [<ffffffff810c18f0>] ? __d_lookup+0x116/0x128 [<ffffffff810b781d>] ? do_lookup+0x63/0x1b1 [<ffffffff810c0ddd>] ? dput+0x22/0x11a [<ffffffff810b9d8b>] ? __link_path_walk+0xc66/0xdbe [<ffffffff812872bb>] ? rt_spin_lock+0x9/0xb [<ffffffff810a8bc3>] ? _slab_irq_disable+0x37/0x5c [<ffffffff810c5edf>] ? mntput_no_expire+0x1e/0x98 [<ffffffff810b9fcb>] ? link_path_walk+0xe8/0xfc [<ffffffff810bd79a>] core_sys_select+0x1c6/0x275 [<ffffffff810c5edf>] ? mntput_no_expire+0x1e/0x98 [<ffffffff810b72e6>] ? path_release+0x2c/0x30 [<ffffffff810b338c>] ? cp_new_stat+0xf6/0x10f [<ffffffff810bdc7b>] sys_select+0x14e/0x176 [<ffffffff8100c1fe>] system_call_ret+0x0/0x5 kthreadd S [ffff81013fa8aa60] ffffffff8129ebe0 0 2 0 ffff81013fa8ff00 0000000000000046 0000000000000000 ffffffff8105df69 ffffffff814cc260 0000000000000286 ffff81013fa8aa60 ffff81013f063540 ffff81013fa8ada8 00000005957bdd80 ffffffffffffffff ffff8100957bdd10 Call Trace: [<ffffffff8105df69>] ? __rt_mutex_adjust_prio+0x11/0x24 [<ffffffff8128559a>] schedule+0xdf/0xff [<ffffffff8105120c>] kthreadd+0x7b/0x142 [<ffffffff8100d048>] child_rip+0xa/0x12 [<ffffffff81051191>] ? kthreadd+0x0/0x142 [<ffffffff8100d03e>] ? child_rip+0x0/0x12 migration/0 S [ffff81013fa8a040] ffff81000100c158 0 3 2 ffff81013fa93ed0 0000000000000046 ffff810001016980 0000000000000001 ffff81013f12c040 ffff810001016980 ffff81013fa8a040 ffff81012e303540 ffff81013fa8a388 0000000081030e71 0000000000000001 ffff81013f12c040 Call Trace: [<ffffffff8128559a>] schedule+0xdf/0xff [<ffffffff81037bab>] migration_thread+0x16d/0x20d [<ffffffff81037a3e>] ? migration_thread+0x0/0x20d [<ffffffff81037a3e>] ? migration_thread+0x0/0x20d [<ffffffff81051443>] kthread+0x49/0x76 [<ffffffff8100d048>] child_rip+0xa/0x12 [<ffffffff810513fa>] ? kthread+0x0/0x76 [<ffffffff8100d03e>] ? child_rip+0x0/0x12 posix_cpu_tim S [ffff81013fa954c0] 0000000000000000 0 4 2 ffff81013fa97ee0 0000000000000046 0000000000000000 ffff81013fa8b480 ffffffff813aa300 ffff81000100b980 ffff81013fa954c0 ffff81013fa8b480 ffff81013fa95808 0000000000000000 0000000000000000 ffff81013fa8b480 Call Trace: [<ffffffff81053627>] ? 
posix_cpu_timers_thread+0x0/0x125 [<ffffffff8128559a>] schedule+0xdf/0xff [<ffffffff810536b2>] posix_cpu_timers_thread+0x8b/0x125 [<ffffffff81053627>] ? posix_cpu_timers_thread+0x0/0x125 [<ffffffff81051443>] kthread+0x49/0x76 [<ffffffff8100d048>] child_rip+0xa/0x12 [<ffffffff810513fa>] ? kthread+0x0/0x76 [<ffffffff8100d03e>] ? child_rip+0x0/0x12 sirq-high/0 S [ffff81013fa94aa0] ffff810001008240 0 5 2 ffff81013fa99eb0 0000000000000046 ffff81013fa95278 0000000000000001 ffffffff8129ebe0 ffff81013fa99e50 ffff81013fa94aa0 ffff81013fa94080 ffff81013fa94de8 0000000000000001 ffff81013fa99e70 ffffffff8107edca Call Trace: [<ffffffff8107edca>] ? __rcu_read_unlock+0x5a/0x5c [<ffffffff81036667>] ? do_sched_setscheduler+0x6f/0x7c [<ffffffff8128559a>] schedule+0xdf/0xff [<ffffffff810429c6>] ksoftirqd+0xbc/0x26f [<ffffffff8104290a>] ? ksoftirqd+0x0/0x26f [<ffffffff8104290a>] ? ksoftirqd+0x0/0x26f [<ffffffff81051443>] kthread+0x49/0x76 [<ffffffff8100d048>] child_rip+0xa/0x12 [<ffffffff810513fa>] ? kthread+0x0/0x76 [<ffffffff8100d03e>] ? child_rip+0x0/0x12 sirq-timer/0 S [ffff81013fa94080] ffff8100010085e8 0 6 2 ffff81013fa9deb0 0000000000000046 0000000600000001 ffffffff81545740 0000000000000000 0000000000000001 ffff81013fa94080 ffff81013fab3580 ffff81013fa943c8 000000008103982e 0000000600000001 ffff8100010082c0 Call Trace: [<ffffffff8128559a>] schedule+0xdf/0xff [<ffffffff810429c6>] ksoftirqd+0xbc/0x26f [<ffffffff8104290a>] ? ksoftirqd+0x0/0x26f [<ffffffff8104290a>] ? ksoftirqd+0x0/0x26f [<ffffffff81051443>] kthread+0x49/0x76 [<ffffffff8100d048>] child_rip+0xa/0x12 [<ffffffff810513fa>] ? kthread+0x0/0x76 [<ffffffff8100d03e>] ? child_rip+0x0/0x12 sirq-net-tx/0 S [ffff81013fa9f500] ffff810001008310 0 7 2 ffff81013faa1eb0 0000000000000046 ffff81013fa9fcd8 0000000000000001 ffffffff8129ebe0 ffff81013faa1e50 ffff81013fa9f500 ffff81013fa9eae0 ffff81013fa9f848 0000000000000001 ffff81013faa1e70 ffffffff8107edca Call Trace: [<ffffffff8107edca>] ? __rcu_read_unlock+0x5a/0x5c [<ffffffff81036667>] ? do_sched_setscheduler+0x6f/0x7c [<ffffffff8128559a>] schedule+0xdf/0xff [<ffffffff810429c6>] ksoftirqd+0xbc/0x26f [<ffffffff8104290a>] ? ksoftirqd+0x0/0x26f [<ffffffff8104290a>] ? ksoftirqd+0x0/0x26f [<ffffffff81051443>] kthread+0x49/0x76 [<ffffffff8100d048>] child_rip+0xa/0x12 [<ffffffff810513fa>] ? kthread+0x0/0x76 [<ffffffff8100d03e>] ? child_rip+0x0/0x12 sirq-net-rx/0 S [ffff81013fa9eae0] ffffffff8129ebe0 0 8 2 ffff81013faa3eb0 0000000000000046 0000000000000000 0000000000000082 ffff81013faa3e60 00000040b3d2c100 ffff81013fa9eae0 ffffffff813aa7a0 ffff81013fa9ee28 00000000010094f8 ffffffffffffffff ffff810001008390 Call Trace: [<ffffffff8128559a>] schedule+0xdf/0xff [<ffffffff810429c6>] ksoftirqd+0xbc/0x26f [<ffffffff8104290a>] ? ksoftirqd+0x0/0x26f [<ffffffff8104290a>] ? ksoftirqd+0x0/0x26f [<ffffffff81051443>] kthread+0x49/0x76 [<ffffffff8100d048>] child_rip+0xa/0x12 [<ffffffff810513fa>] ? kthread+0x0/0x76 [<ffffffff8100d03e>] ? child_rip+0x0/0x12 sirq-block/0 S [ffff81013fa9e0c0] ffff8100010085e8 0 9 2 ffff81013faa7eb0 0000000000000046 ffff81009569b380 ffff81013d0ce000 0000000000000005 0000000000000010 ffff81013fa9e0c0 ffff81013b1eb480 ffff81013fa9e408 000000009569b380 0000000600000001 ffff8100010083f8 Call Trace: [<ffffffff8128559a>] schedule+0xdf/0xff [<ffffffff810429c6>] ksoftirqd+0xbc/0x26f [<ffffffff8104290a>] ? ksoftirqd+0x0/0x26f [<ffffffff8104290a>] ? ksoftirqd+0x0/0x26f [<ffffffff81051443>] kthread+0x49/0x76 [<ffffffff8100d048>] child_rip+0xa/0x12 [<ffffffff810513fa>] ? kthread+0x0/0x76 [<ffffffff8100d03e>] ? 
child_rip+0x0/0x12 sirq-tasklet/ S [ffff81013faa9540] ffff8100010085e8 0 10 2 ffff81013faabeb0 0000000000000046 ffff81013faa9d18 0000000000000001 ffffffff8129ebe0 ffff81013fa8b480 ffff81013faa9540 ffff81013fa8b480 ffff81013faa9888 0000000000000001 0000000600000001 ffff810001008460 Call Trace: [<ffffffff8128559a>] schedule+0xdf/0xff [<ffffffff810429c6>] ksoftirqd+0xbc/0x26f [<ffffffff8104290a>] ? ksoftirqd+0x0/0x26f [<ffffffff8104290a>] ? ksoftirqd+0x0/0x26f [<ffffffff81051443>] kthread+0x49/0x76 [<ffffffff8100d048>] child_rip+0xa/0x12 [<ffffffff810513fa>] ? kthread+0x0/0x76 [<ffffffff8100d03e>] ? child_rip+0x0/0x12 sirq-sched/0 S [ffff81013faa8b20] ffff8100010085e8 0 11 2 ffff81013faadeb0 0000000000000046 ffff81000100b980 0000000101bbf1da 000000018129ebe0 0000000000000400 ffff81013faa8b20 ffff81013fa94080 ffff81013faa8e68 0000000000000000 0000000600000001 ffff8100010084c8 Call Trace: [<ffffffff8128559a>] schedule+0xdf/0xff [<ffffffff810429c6>] ksoftirqd+0xbc/0x26f [<ffffffff8104290a>] ? ksoftirqd+0x0/0x26f [<ffffffff8104290a>] ? ksoftirqd+0x0/0x26f [<ffffffff81051443>] kthread+0x49/0x76 [<ffffffff8100d048>] child_rip+0xa/0x12 [<ffffffff810513fa>] ? kthread+0x0/0x76 [<ffffffff8100d03e>] ? child_rip+0x0/0x12 sirq-hrtimer/ S [ffff81013faa8100] ffff8100010085e8 0 12 2 ffff81013fab1eb0 0000000000000046 ffff81013fab1e40 ffffffff8105dac5 0000000600000001 ffff810001008710 ffff81013faa8100 ffff81013ccc4aa0 ffff81013faa8448 0000000000000080 0000000600000001 ffff810001008530 Call Trace: [<ffffffff8105dac5>] ? rt_rwlock_read_unlock+0xf8/0x25d [<ffffffff8128559a>] schedule+0xdf/0xff [<ffffffff810429c6>] ksoftirqd+0xbc/0x26f [<ffffffff8104290a>] ? ksoftirqd+0x0/0x26f [<ffffffff8104290a>] ? ksoftirqd+0x0/0x26f [<ffffffff81051443>] kthread+0x49/0x76 [<ffffffff8100d048>] child_rip+0xa/0x12 [<ffffffff810513fa>] ? kthread+0x0/0x76 [<ffffffff8100d03e>] ? child_rip+0x0/0x12 sirq-rcu/0 S [ffff81013fab3580] ffff8100010085e8 0 13 2 ffff81013fab5eb0 0000000000000046 0000000000000000 ffff8100b68c3870 ffff8100b68c38f0 ffff81013fa94080 ffff81013fab3580 ffff81013b1eb480 ffff81013fab38c8 0000000000000000 0000000600000001 ffff810001008598 Call Trace: [<ffffffff8128559a>] schedule+0xdf/0xff [<ffffffff810429c6>] ksoftirqd+0xbc/0x26f [<ffffffff8104290a>] ? ksoftirqd+0x0/0x26f [<ffffffff8104290a>] ? ksoftirqd+0x0/0x26f [<ffffffff81051443>] kthread+0x49/0x76 [<ffffffff8100d048>] child_rip+0xa/0x12 [<ffffffff810513fa>] ? kthread+0x0/0x76 [<ffffffff8100d03e>] ? child_rip+0x0/0x12 watchdog/0 S [ffff81013fab2b60] 0000000000000000 0 14 2 ffff81013fab7ef0 0000000000000046 0000000000000000 0000000000000246 ffff81013fab7ea0 ffffffff8102ff0a ffff81013fab2b60 ffff81013fa8aa60 ffff81013fab2ea8 00000000814cc260 ffff81013fab7eb0 ffffffff8102ffee Call Trace: [<ffffffff8102ff0a>] ? __update_rq_clock+0x1a/0xe5 [<ffffffff8102ffee>] ? update_rq_clock+0x19/0x1b [<ffffffff8107add4>] ? watchdog+0x0/0x58 [<ffffffff8128559a>] schedule+0xdf/0xff [<ffffffff8107ae1c>] watchdog+0x48/0x58 [<ffffffff8128559a>] ? schedule+0xdf/0xff [<ffffffff81051443>] kthread+0x49/0x76 [<ffffffff8100d048>] child_rip+0xa/0x12 [<ffffffff810513fa>] ? kthread+0x0/0x76 [<ffffffff8100d03e>] ? child_rip+0x0/0x12 desched/0 S [ffff81013fab2140] ffffffff8103bf65 0 15 2 ffff81013fabbef0 0000000000000046 ffff81013fabbe90 ffff81013fabbeb4 ffff81013da71540 ffff81013fc290c0 ffff81013fab2140 ffff81012e1c9500 ffff81013fab2488 00000000810a9052 ffff81013fabbeb0 ffffffff81089c85 Call Trace: [<ffffffff81089c85>] ? __free_pages+0x18/0x21 [<ffffffff8103bf65>] ? 
desched_thread+0x0/0x146 [<ffffffff8128559a>] schedule+0xdf/0xff [<ffffffff8103c076>] desched_thread+0x111/0x146 [<ffffffff81051443>] kthread+0x49/0x76 [<ffffffff8100d048>] child_rip+0xa/0x12 [<ffffffff810513fa>] ? kthread+0x0/0x76 [<ffffffff8100d03e>] ? child_rip+0x0/0x12 migration/1 S [ffff81013fabd480] ffff810001017158 0 16 2 ffff81013fabfed0 0000000000000046 ffff8100010 SysRq : Show Memory Node 0 DMA per-cpu: CPU 0: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0 CPU 1: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0 CPU 2: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0 CPU 3: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0 CPU 4: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0 CPU 5: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0 CPU 6: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0 CPU 7: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0 Node 0 DMA32 per-cpu: CPU 0: Hot: hi: 186, btch: 31 usd: 172 Cold: hi: 62, btch: 15 usd: 60 CPU 1: Hot: hi: 186, btch: 31 usd: 29 Cold: hi: 62, btch: 15 usd: 52 CPU 2: Hot: hi: 186, btch: 31 usd: 30 Cold: hi: 62, btch: 15 usd: 53 CPU 3: Hot: hi: 186, btch: 31 usd: 51 Cold: hi: 62, btch: 15 usd: 57 CPU 4: Hot: hi: 186, btch: 31 usd: 70 Cold: hi: 62, btch: 15 usd: 55 CPU 5: Hot: hi: 186, btch: 31 usd: 42 Cold: hi: 62, btch: 15 usd: 55 CPU 6: Hot: hi: 186, btch: 31 usd: 165 Cold: hi: 62, btch: 15 usd: 56 CPU 7: Hot: hi: 186, btch: 31 usd: 22 Cold: hi: 62, btch: 15 usd: 61 Node 0 Normal per-cpu: CPU 0: Hot: hi: 186, btch: 31 usd: 86 Cold: hi: 62, btch: 15 usd: 52 CPU 1: Hot: hi: 186, btch: 31 usd: 153 Cold: hi: 62, btch: 15 usd: 61 CPU 2: Hot: hi: 186, btch: 31 usd: 76 Cold: hi: 62, btch: 15 usd: 59 CPU 3: Hot: hi: 186, btch: 31 usd: 50 Cold: hi: 62, btch: 15 usd: 49 CPU 4: Hot: hi: 186, btch: 31 usd: 29 Cold: hi: 62, btch: 15 usd: 54 CPU 5: Hot: hi: 186, btch: 31 usd: 62 Cold: hi: 62, btch: 15 usd: 47 CPU 6: Hot: hi: 186, btch: 31 usd: 173 Cold: hi: 62, btch: 15 usd: 60 CPU 7: Hot: hi: 186, btch: 31 usd: 49 Cold: hi: 62, btch: 15 usd: 48 Active:124620 inactive:802981 dirty:91167 writeback:697 unstable:0 free:5928 slab:77502 mapped:2680 pagetables:566 bounce:0 Node 0 DMA free:10388kB min:16kB low:20kB high:24kB active:0kB inactive:0kB present:9668kB pages_scanned:0 all_unreclaimable? yes lowmem_reserve[]: 0 2960 3960 3960 Node 0 DMA32 free:11104kB min:6012kB low:7512kB high:9016kB active:342056kB inactive:2405708kB present:3031040kB pages_scanned:0 all_unreclaimable? no lowmem_reserve[]: 0 0 1000 1000 Node 0 Normal free:2220kB min:2028kB low:2532kB high:3040kB active:156424kB inactive:806216kB present:1024000kB pages_scanned:0 all_unreclaimable? no lowmem_reserve[]: 0 0 0 0 Node 0 DMA: 3*4kB 3*8kB 3*16kB 4*32kB 3*64kB 2*128kB 2*256kB 0*512kB 1*1024kB 0*2048kB 2*4096kB = 10388kB Node 0 DMA32: 842*4kB 39*8kB 28*16kB 24*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 1*4096kB = 11040kB Node 0 Normal: 54*4kB 14*8kB 11*16kB 1*32kB 3*64kB 4*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 2008kB Swap cache: add 35, delete 35, find 0/0, race 0+0 Free swap = 16458444kB Total swap = 16458584kB =Comment: #27================================================= TIMOTHY R. CHAVEZ <chavezt.com> - 2008-08-20 17:01 EDT Here is a list of all the blocked tasks (generated with sysrq+w). I've not looked this data yet... 
SysRq : Show Blocked State khubd D [ffff81013f3a9580] ffffffff8129ebe0 0 324 2 ffff81013ec55d90 0000000000000046 0000000000000000 ffff81013d9b5b60 ffff81013ec55d20 ffffffff812872bb ffff81013f3a9580 ffff81013f063540 ffff81013f3a98c8 0000000501bbf1b1 ffffffffffffffff 0000000101bbf1b1 Call Trace: [<ffffffff812872bb>] ? rt_spin_lock+0x9/0xb [<ffffffff8128559a>] schedule+0xdf/0xff [<ffffffff8128586c>] schedule_timeout+0x8d/0xb4 [<ffffffff81045994>] ? process_timeout+0x0/0xb [<ffffffff81285867>] ? schedule_timeout+0x88/0xb4 [<ffffffff812858ac>] schedule_timeout_uninterruptible+0x19/0x1b [<ffffffff8104696b>] msleep+0x14/0x1e [<ffffffff811c3968>] hub_thread+0x49e/0xc8f [<ffffffff81051567>] ? autoremove_wake_function+0x0/0x38 [<ffffffff811c34ca>] ? hub_thread+0x0/0xc8f [<ffffffff811c34ca>] ? hub_thread+0x0/0xc8f [<ffffffff81051443>] kthread+0x49/0x76 [<ffffffff8100d048>] child_rip+0xa/0x12 [<ffffffff810513fa>] ? kthread+0x0/0x76 [<ffffffff8100d03e>] ? child_rip+0x0/0x12 kswapd0 D [ffff81013db20a60] ffffffff8129ebe0 0 436 2 ffff81013db25910 0000000000000046 0000000000000000 ffff81013db20a60 ffff81013a405388 ffff81013d1d9838 ffff81013db20a60 ffffffff813aa7a0 ffff81013db20da8 000000003d1d9818 ffffffffffffffff ffff81013d1d9818 Call Trace: [<ffffffff8128559a>] schedule+0xdf/0xff [<ffffffff81285789>] io_schedule+0x5d/0x9f [<ffffffff8112a020>] get_request_wait+0x122/0x170 [<ffffffff81051567>] ? autoremove_wake_function+0x0/0x38 [<ffffffff81126d80>] ? elv_merge+0x1df/0x21f [<ffffffff8112b5ae>] __make_request+0x50d/0x5c7 [<ffffffff81139155>] ? radix_tree_delete+0x5a/0x456 [<ffffffff810a8bc3>] ? _slab_irq_disable+0x37/0x5c [<ffffffff811284be>] generic_make_request+0x332/0x36d [<ffffffff81086bdf>] ? mempool_alloc_slab+0x11/0x13 [<ffffffff810d667e>] ? blkdev_get_block+0x0/0x4d [<ffffffff8112861e>] submit_bio+0x125/0x12e [<ffffffff810d667e>] ? blkdev_get_block+0x0/0x4d [<ffffffff810d0e07>] submit_bh+0xed/0x111 [<ffffffff810d2a35>] __block_write_full_page+0x1cb/0x2b7 [<ffffffff810d667e>] ? blkdev_get_block+0x0/0x4d [<ffffffff810d667e>] ? blkdev_get_block+0x0/0x4d [<ffffffff810d2be2>] block_write_full_page+0xc1/0xcd [<ffffffff810d5723>] blkdev_writepage+0x13/0x15 [<ffffffff8108f383>] shrink_page_list+0x351/0x5bb [<ffffffff8108f751>] shrink_inactive_list+0x164/0x4cb [<ffffffff8108fbcd>] shrink_zone+0x115/0x13b [<ffffffff8109016e>] kswapd+0x347/0x512 [<ffffffff81051567>] ? autoremove_wake_function+0x0/0x38 [<ffffffff8108fe27>] ? kswapd+0x0/0x512 [<ffffffff8108fe27>] ? kswapd+0x0/0x512 [<ffffffff81051443>] kthread+0x49/0x76 [<ffffffff8100d048>] child_rip+0xa/0x12 [<ffffffff810513fa>] ? kthread+0x0/0x76 [<ffffffff8100d03e>] ? child_rip+0x0/0x12 kjournald D [ffff81013cec6100] ffff810001037980 0 1037 2 ffff81013c9bbd10 0000000000000046 0000000007023848 ffff81013f1c87d8 0000000000000000 ffff81013c9bbdb0 ffff81013cec6100 ffff81013f2c0040 ffff81013cec6448 0000000481129d76 ffff81013c9bbcd0 ffff81013f1c87d8 Call Trace: [<ffffffff81056a4c>] ? getnstimeofday+0x31/0x88 [<ffffffff8128559a>] schedule+0xdf/0xff [<ffffffff81285789>] io_schedule+0x5d/0x9f [<ffffffff810d1f55>] sync_buffer+0x3e/0x42 [<ffffffff81285a01>] __wait_on_bit+0x45/0x77 [<ffffffff810d1f17>] ? sync_buffer+0x0/0x42 [<ffffffff810d1f17>] ? sync_buffer+0x0/0x42 [<ffffffff81285aa1>] out_of_line_wait_on_bit+0x6e/0x7b [<ffffffff8105159f>] ? wake_bit_function+0x0/0x2a [<ffffffff88024197>] ? :jbd:__journal_file_buffer+0xb5/0x161 [<ffffffff810d1ed5>] __wait_on_buffer+0x20/0x22 [<ffffffff880260e7>] :jbd:journal_commit_transaction+0x44c/0xd81 [<ffffffff812872bb>] ? 
rt_spin_lock+0x9/0xb [<ffffffff81045fe6>] ? try_to_del_timer_sync+0x52/0x5c [<ffffffff88029981>] :jbd:kjournald+0xca/0x23f [<ffffffff81051567>] ? autoremove_wake_function+0x0/0x38 [<ffffffff880298b7>] ? :jbd:kjournald+0x0/0x23f [<ffffffff81051443>] kthread+0x49/0x76 [<ffffffff8100d048>] child_rip+0xa/0x12 [<ffffffff810513fa>] ? kthread+0x0/0x76 [<ffffffff8100d03e>] ? child_rip+0x0/0x12 syslogd D [ffff81013b216aa0] ffff81013d12b000 0 2871 1 ffff81013a4e1d18 0000000000000086 ffff81013a4e1ce8 ffffffff81033412 ffff81013a4e1de8 000000008103982e ffff81013b216aa0 ffff81013cec6100 ffff81013b216de8 000000043c9bbee0 0000000600000001 ffff81013d12b1d0 Call Trace: [<ffffffff81033412>] ? try_to_wake_up+0x20d/0x21e [<ffffffff8128559a>] schedule+0xdf/0xff [<ffffffff88028e21>] :jbd:log_wait_commit+0xa9/0xfe [<ffffffff81051567>] ? autoremove_wake_function+0x0/0x38 [<ffffffff8103982e>] ? __wake_up+0x3a/0x5b [<ffffffff88024569>] :jbd:journal_stop+0x1a4/0x1d7 [<ffffffff88024f5f>] :jbd:journal_force_commit+0x23/0x25 [<ffffffff8803f209>] :ext3:ext3_force_commit+0x26/0x28 [<ffffffff8803905e>] :ext3:ext3_write_inode+0x39/0x3f [<ffffffff810ccf9f>] __writeback_single_inode+0x1e0/0x30a [<ffffffff810cd0ed>] sync_inode+0x24/0x39 [<ffffffff8803543b>] :ext3:ext3_sync_file+0x8f/0xa0 [<ffffffff810cff1b>] do_fsync+0x54/0xaa [<ffffffff810cff9f>] __do_fsync+0x2e/0x44 [<ffffffff810cffd0>] sys_fsync+0xb/0xd [<ffffffff8100c37e>] traceret+0x0/0x5 kjournald D [ffff81013cf9cb20] ffffffff8129ebe0 0 10317 2 ffff810123499d10 0000000000000046 0000000000000000 ffffffff8112b61d 0000000000001000 ffffffff81286cae ffff81013cf9cb20 ffffffff813aa7a0 ffff81013cf9ce68 0000000081129d69 ffffffffffffffff ffff81013d1d9818 Call Trace: [<ffffffff8112b61d>] ? __make_request+0x57c/0x5c7 [<ffffffff81286cae>] ? rt_spin_lock_slowlock+0x20f/0x235 [<ffffffff8128559a>] schedule+0xdf/0xff [<ffffffff81285789>] io_schedule+0x5d/0x9f [<ffffffff810d1f55>] sync_buffer+0x3e/0x42 [<ffffffff81285a01>] __wait_on_bit+0x45/0x77 [<ffffffff810d1f17>] ? sync_buffer+0x0/0x42 [<ffffffff810d1f17>] ? sync_buffer+0x0/0x42 [<ffffffff81285aa1>] out_of_line_wait_on_bit+0x6e/0x7b [<ffffffff8105159f>] ? wake_bit_function+0x0/0x2a [<ffffffff810d1ed5>] __wait_on_buffer+0x20/0x22 [<ffffffff8802649b>] :jbd:journal_commit_transaction+0x800/0xd81 [<ffffffff81045fe6>] ? try_to_del_timer_sync+0x52/0x5c [<ffffffff88029981>] :jbd:kjournald+0xca/0x23f [<ffffffff81051567>] ? autoremove_wake_function+0x0/0x38 [<ffffffff880298b7>] ? :jbd:kjournald+0x0/0x23f [<ffffffff81051443>] kthread+0x49/0x76 [<ffffffff8100d048>] child_rip+0xa/0x12 [<ffffffff810513fa>] ? kthread+0x0/0x76 [<ffffffff8100d03e>] ? child_rip+0x0/0x12 blast D [ffff81013b452100] ffffffff8129ebe0 0 10364 10337 ffff810123461458 0000000000000086 0000000000000000 000000061ca7bc30 ffff81013a405388 ffff81013d1d9838 ffff81013b452100 ffff81013f0b00c0 ffff81013b452448 000000063d1d9818 ffffffffffffffff ffff81013d1d9818 Call Trace: [<ffffffff8128559a>] schedule+0xdf/0xff [<ffffffff81285789>] io_schedule+0x5d/0x9f [<ffffffff8112a020>] get_request_wait+0x122/0x170 [<ffffffff81051567>] ? autoremove_wake_function+0x0/0x38 [<ffffffff81126d80>] ? elv_merge+0x1df/0x21f [<ffffffff8112b5ae>] __make_request+0x50d/0x5c7 [<ffffffff810a8bc3>] ? _slab_irq_disable+0x37/0x5c [<ffffffff811284be>] generic_make_request+0x332/0x36d [<ffffffff81086bdf>] ? mempool_alloc_slab+0x11/0x13 [<ffffffff810d667e>] ? blkdev_get_block+0x0/0x4d [<ffffffff8112861e>] submit_bio+0x125/0x12e [<ffffffff810d667e>] ? 
blkdev_get_block+0x0/0x4d [<ffffffff810d0e07>] submit_bh+0xed/0x111 [<ffffffff810d2a35>] __block_write_full_page+0x1cb/0x2b7 [<ffffffff810d667e>] ? blkdev_get_block+0x0/0x4d [<ffffffff810d667e>] ? blkdev_get_block+0x0/0x4d [<ffffffff810d2be2>] block_write_full_page+0xc1/0xcd [<ffffffff810d5723>] blkdev_writepage+0x13/0x15 [<ffffffff8108f383>] shrink_page_list+0x351/0x5bb [<ffffffff88036a30>] ? :ext3:ext3_mark_iloc_dirty+0x2bb/0x333 [<ffffffff8108f751>] shrink_inactive_list+0x164/0x4cb [<ffffffff8108fbcd>] shrink_zone+0x115/0x13b [<ffffffff81090aa1>] try_to_free_pages+0x1df/0x2f6 [<ffffffff8108a5b2>] __alloc_pages+0x1ce/0x312 [<ffffffff812872bb>] ? rt_spin_lock+0x9/0xb [<ffffffff810a4466>] alloc_pages_current+0xa8/0xb1 [<ffffffff810845ad>] __page_cache_alloc+0x8e/0x92 [<ffffffff81084834>] __grab_cache_page+0x36/0x74 [<ffffffff880391a1>] :ext3:ext3_write_begin+0x65/0x19e [<ffffffff8105f377>] ? rt_mutex_up_read+0xf8/0x260 [<ffffffff810854e0>] generic_file_buffered_write+0x151/0x5d6 [<ffffffff81041979>] ? current_fs_time+0x22/0x29 [<ffffffff88045da2>] ? :ext3:ext3_xattr_security_get+0x21/0x23 [<ffffffff810c4366>] ? file_update_time+0x30/0xad [<ffffffff81085cbe>] __generic_file_aio_write_nolock+0x359/0x3c3 [<ffffffff810a9052>] ? kmem_cache_free+0x49/0x54 [<ffffffff8128570e>] ? preempt_schedule_irq+0x52/0x70 [<ffffffff81085d8c>] generic_file_aio_write+0x64/0xc0 [<ffffffff880352ae>] :ext3:ext3_file_write+0x1e/0x9e [<ffffffff810afb23>] do_sync_write+0xe2/0x126 [<ffffffff81051567>] ? autoremove_wake_function+0x0/0x38 [<ffffffff8128570e>] ? preempt_schedule_irq+0x52/0x70 [<ffffffff8100c846>] ? retint_kernel+0x26/0x30 [<ffffffff8107781d>] ? audit_syscall_entry+0x148/0x17e [<ffffffff810b0369>] vfs_write+0xc7/0x170 [<ffffffff810b0979>] sys_write+0x4a/0x76 [<ffffffff8100c37e>] traceret+0x0/0x5 blast D [ffff810129384040] ffffffff8129ebe0 0 10365 10337 ffff8101280df458 0000000000000086 0000000000000000 ffff810129384040 ffff81013a405388 ffff81013d1d9838 ffff810129384040 ffffffff813aa7a0 ffff810129384388 000000003d1d9818 ffffffffffffffff ffff81013d1d9818 Call Trace: [<ffffffff8128559a>] schedule+0xdf/0xff [<ffffffff81285789>] io_schedule+0x5d/0x9f [<ffffffff8112a020>] get_request_wait+0x122/0x170 [<ffffffff81051567>] ? autoremove_wake_function+0x0/0x38 [<ffffffff81126d80>] ? elv_merge+0x1df/0x21f [<ffffffff8112b5ae>] __make_request+0x50d/0x5c7 [<ffffffff81139524>] ? radix_tree_delete+0x429/0x456 [<ffffffff810a8bc3>] ? _slab_irq_disable+0x37/0x5c [<ffffffff811284be>] generic_make_request+0x332/0x36d [<ffffffff81086bdf>] ? mempool_alloc_slab+0x11/0x13 [<ffffffff810d667e>] ? blkdev_get_block+0x0/0x4d [<ffffffff8112861e>] submit_bio+0x125/0x12e [<ffffffff810d667e>] ? blkdev_get_block+0x0/0x4d [<ffffffff810d0e07>] submit_bh+0xed/0x111 [<ffffffff810d2a35>] __block_write_full_page+0x1cb/0x2b7 [<ffffffff810d667e>] ? blkdev_get_block+0x0/0x4d [<ffffffff810d667e>] ? blkdev_get_block+0x0/0x4d [<ffffffff810d2be2>] block_write_full_page+0xc1/0xcd [<ffffffff810d5723>] blkdev_writepage+0x13/0x15 [<ffffffff8108f383>] shrink_page_list+0x351/0x5bb [<ffffffff812872bb>] ? rt_spin_lock+0x9/0xb [<ffffffff88025864>] ? :jbd:journal_dirty_metadata+0xe8/0xf7 [<ffffffff8108f751>] shrink_inactive_list+0x164/0x4cb [<ffffffff81051482>] ? bit_waitqueue+0x12/0xa6 [<ffffffff8108fbcd>] shrink_zone+0x115/0x13b [<ffffffff81090aa1>] try_to_free_pages+0x1df/0x2f6 [<ffffffff8108a5b2>] __alloc_pages+0x1ce/0x312 [<ffffffff812872bb>] ? 
rt_spin_lock+0x9/0xb [<ffffffff810a4466>] alloc_pages_current+0xa8/0xb1 [<ffffffff810845ad>] __page_cache_alloc+0x8e/0x92 [<ffffffff81084834>] __grab_cache_page+0x36/0x74 [<ffffffff880391a1>] :ext3:ext3_write_begin+0x65/0x19e [<ffffffff8105f377>] ? rt_mutex_up_read+0xf8/0x260 [<ffffffff810854e0>] generic_file_buffered_write+0x151/0x5d6 [<ffffffff81041979>] ? current_fs_time+0x22/0x29 [<ffffffff88045da2>] ? :ext3:ext3_xattr_security_get+0x21/0x23 [<ffffffff810c4366>] ? file_update_time+0x30/0xad [<ffffffff81085cbe>] __generic_file_aio_write_nolock+0x359/0x3c3 [<ffffffff810a9052>] ? kmem_cache_free+0x49/0x54 [<ffffffff8802458d>] ? :jbd:journal_stop+0x1c8/0x1d7 [<ffffffff81085d8c>] generic_file_aio_write+0x64/0xc0 [<ffffffff880352ae>] :ext3:ext3_file_write+0x1e/0x9e [<ffffffff810afb23>] do_sync_write+0xe2/0x126 [<ffffffff810b0c14>] ? file_move+0x5e/0xa6 [<ffffffff81051567>] ? autoremove_wake_function+0x0/0x38 [<ffffffff810ae55e>] ? nameidata_to_filp+0x2d/0x3f [<ffffffff810ae5a6>] ? do_filp_open+0x36/0x46 [<ffffffff8107781d>] ? audit_syscall_entry+0x148/0x17e [<ffffffff810b0369>] vfs_write+0xc7/0x170 [<ffffffff810b0979>] sys_write+0x4a/0x76 [<ffffffff8100c37e>] traceret+0x0/0x5 blast D [ffff81013b300040] ffffffff8129ebe0 0 10366 10337 ffff81011454d9d8 0000000000000086 0000000000000000 0000000000200020 ffff81013fa58d00 ffff81011454d984 ffff81013b300040 ffffffff813aa7a0 ffff81013b300388 0000000081138a45 ffffffffffffffff ffffffff81138714 Call Trace: [<ffffffff81138714>] ? radix_valid_always+0x0/0xb [<ffffffff8108424e>] ? sync_page+0x0/0x45 [<ffffffff8128559a>] schedule+0xdf/0xff [<ffffffff81285789>] io_schedule+0x5d/0x9f [<ffffffff8108428f>] sync_page+0x41/0x45 [<ffffffff8128590b>] __wait_on_bit_lock+0x42/0x78 [<ffffffff8108423f>] __lock_page+0xc2/0xc4 [<ffffffff8105159f>] ? wake_bit_function+0x0/0x2a [<ffffffff81084448>] find_lock_page+0x39/0x5d [<ffffffff8108481a> =Comment: #28================================================= TIMOTHY R. CHAVEZ <chavezt.com> - 2008-08-20 17:12 EDT This may or may not be helpful, but I'm adding for the elm3b210 hang... Sched Debug Version: v0.07, 2.6.24.7-74ibmrt2.5 #1 now at 29394874.636514 msecs .sysctl_sched_latency : 80.000000 .sysctl_sched_min_granularity : 16.000000 .sysctl_sched_wakeup_granularity : 40.000000 .sysctl_sched_batch_wakeup_granularity : 40.000000 .sysctl_sched_child_runs_first : 0.000001 .sysctl_sched_features : 39 cpu#0, 1600.057 MHz .nr_running : 5 .load : 358116 .nr_switches : 88879043 .nr_load_updates : 29394296 .nr_uninterruptible : -96802 .jiffies : 4324061591 .next_balance : 4324.061658 .curr->pid : 10368 .clock : 29389828.078136 .idle_clock : 0.000000 .prev_clock_raw : 29433077.040565 .clock_warps : 0 .clock_overflows : 195920 .clock_deep_idle_events : 0 .clock_max_delta : 0.999028 .cpu_load[0] : 1024 .cpu_load[1] : 518 .cpu_load[2] : 1234 .cpu_load[3] : 5535 .cpu_load[4] : 9570 .yld_exp_empty : 0 .yld_act_empty : 0 .yld_both_empty : 0 .yld_count : 183 .sched_switch : 0 .sched_count : 90154834 .sched_goidle : 28092341 .ttwu_count : 54625679 .ttwu_local : 50002910 .bkl_count : 698 .rto_schedule : 35718833 .rto_schedule_tail : 48115 .rto_wakeup : 3824567 .rto_pulled : 24 .rto_pushed : 38054
Created attachment 315191 [details] online/offline test
*** Continuation of the previous comment *** .lb_breaks : 32332 .rt.rt_nr_running : 2 .rt.rt_nr_uninterruptible : 4079 cfs_rq .exec_clock : 4898525.373802 .MIN_vruntime : 34189622.761175 .min_vruntime : 34189702.761175 .max_vruntime : 34189622.761175 .spread : 0.000000 .spread0 : 0.000000 .nr_running : 3 .load : 3072 .bkl_count : 698 .nr_spread_over : 3069086 runnable tasks: task PID tree-key switches prio exec-runtime sum-exec sum-sleep ---------------------------------------------------------------------------------------------------------- IRQ-19 746 -12494417.560092 10062067 4 -12494417.560092 143536.971248 0.005545 IRQ-17 2798 -2244852.954366 181099 4 -2244852.954366 2411.567408 0.020340 ntpd 4379 34189622.761175 34907 120 34189622.761175 1547.736658 29332812.052997 avahi-daemon 4598 34189622.761175 5 120 34189622.761175 0.317794 29333393.056235 R blast 10368 34189702.761175 6477830 120 34189702.761175 3658908.920682 23218972.614414 cpu#1, 1600.057 MHz .nr_running : 0 .load : 0 .nr_switches : 84617285 .nr_load_updates : 29398387 .nr_uninterruptible : -12763 .jiffies : 4324061591 .next_balance : 4324.061614 .curr->pid : 0 .clock : 29393919.445091 .idle_clock : 0.000000 .prev_clock_raw : 75270672.883051 .clock_warps : 0 .clock_overflows : 128687 .clock_deep_idle_events : 0 .clock_max_delta : 0.996688 .cpu_load[0] : 0 .cpu_load[1] : 0 .cpu_load[2] : 0 .cpu_load[3] : 0 .cpu_load[4] : 0 .yld_exp_empty : 0 .yld_act_empty : 0 .yld_both_empty : 0 .yld_count : 416 .sched_switch : 0 .sched_count : 85639915 .sched_goidle : 26518251 .ttwu_count : 51411965 .ttwu_local : 48234063 .bkl_count : 527 .rto_schedule : 33245639 .rto_schedule_tail : 27157 .rto_wakeup : 2347621 .rto_pulled : 45 .rto_pushed : 30951 .lb_breaks : 20406 .rt.rt_nr_running : 0 .rt.rt_nr_uninterruptible : -4098 cfs_rq .exec_clock : 5381010.622923 .MIN_vruntime : 0.000001 .min_vruntime : 30962011.829487 .max_vruntime : 0.000001 .spread : 0.000000 .spread0 : -3227690.931688 .nr_running : 0 .load : 0 .bkl_count : 527 .nr_spread_over : 2184127 runnable tasks: task PID tree-key switches prio exec-runtime sum-exec sum-sleep ---------------------------------------------------------------------------------------------------------- cpu#2, 1600.057 MHz .nr_running : 0 .load : 0 .nr_switches : 76753549 .nr_load_updates : 29398316 .nr_uninterruptible : 63109 .jiffies : 4324061591 .next_balance : 4324.061674 .curr->pid : 0 .clock : 29393848.455904 .idle_clock : 0.000000 .prev_clock_raw : 75270685.700782 .clock_warps : 0 .clock_overflows : 108048 .clock_deep_idle_events : 0 .clock_max_delta : 0.996108 .cpu_load[0] : 0 .cpu_load[1] : 0 .cpu_load[2] : 0 .cpu_load[3] : 0 .cpu_load[4] : 0 .yld_exp_empty : 0 .yld_act_empty : 0 .yld_both_empty : 0 .yld_count : 445 .sched_switch : 0 .sched_count : 76966943 .sched_goidle : 25457885 .ttwu_count : 45870808 .ttwu_local : 45203771 .bkl_count : 65 .rto_schedule : 29527971 .rto_schedule_tail : 955 .rto_wakeup : 31174 .rto_pulled : 18 .rto_pushed : 11261 .lb_breaks : 16155 .rt.rt_nr_running : 0 .rt.rt_nr_uninterruptible : -210 cfs_rq .exec_clock : 4532559.248778 .MIN_vruntime : 0.000001 .min_vruntime : 38661343.821066 .max_vruntime : 0.000001 .spread : 0.000000 .spread0 : 4471641.059891 .nr_running : 0 .load : 0 .bkl_count : 65 .nr_spread_over : 89403 runnable tasks: task PID tree-key switches prio exec-runtime sum-exec sum-sleep ---------------------------------------------------------------------------------------------------------- cpu#3, 1600.057 MHz .nr_running : 0 .load : 0 .nr_switches : 75597502 
.nr_load_updates : 29398246 .nr_uninterruptible : 50434 .jiffies : 4324061591 .next_balance : 4324.061681 .curr->pid : 0 .clock : 29393778.466530 .idle_clock : 0.000000 .prev_clock_raw : 75270693.534770 .clock_warps : 0 .clock_overflows : 102878 .clock_deep_idle_events : 0 .clock_max_delta : 0.996048 .cpu_load[0] : 0 .cpu_load[1] : 0 .cpu_load[2] : 0 .cpu_load[3] : 0 .cpu_load[4] : 0 .yld_exp_empty : 0 .yld_act_empty : 0 .yld_both_empty : 0 .yld_count : 972 .sched_switch : 0 .sched_count : 75735418 .sched_goidle : 27355032 .ttwu_count : 45267919 .ttwu_local : 44694884 .bkl_count : 44 .rto_schedule : 29463844 .rto_schedule_tail : 341 .rto_wakeup : 388 .rto_pulled : 13 .rto_pushed : 6784 .lb_breaks : 20677 .rt.rt_nr_running : 0 .rt.rt_nr_uninterruptible : -87 cfs_rq .exec_clock : 2558515.929605 .MIN_vruntime : 0.000001 .min_vruntime : 37206459.647282 .max_vruntime : 0.000001 .spread : 0.000000 .spread0 : 3016756.886107 .nr_running : 0 .load : 0 .bkl_count : 44 .nr_spread_over : 73282 runnable tasks: task PID tree-key switches prio exec-runtime sum-exec sum-sleep ---------------------------------------------------------------------------------------------------------- cpu#4, 1600.057 MHz .nr_running : 1 .load : 177522 .nr_switches : 75246847 .nr_load_updates : 29398175 .nr_uninterruptible : 40081 .jiffies : 4324061591 .next_balance : 4324.061595 .curr->pid : 587 .clock : 29393707.511111 .idle_clock : 0.000000 .prev_clock_raw : 75418323.767358 .clock_warps : 0 .clock_overflows : 138647 .clock_deep_idle_events : 0 .clock_max_delta : 0.996031 .cpu_load[0] : 0 .cpu_load[1] : 0 .cpu_load[2] : 0 .cpu_load[3] : 0 .cpu_load[4] : 0 .yld_exp_empty : 0 .yld_act_empty : 0 .yld_both_empty : 0 .yld_count : 1426 .sched_switch : 0 .sched_count : 75333824 .sched_goidle : 28395180 .ttwu_count : 45180035 .ttwu_local : 44719289 .bkl_count : 29 .rto_schedule : 29460959 .rto_schedule_tail : 198 .rto_wakeup : 333 .rto_pulled : 8 .rto_pushed : 3890 .lb_breaks : 23273 .rt.rt_nr_running : 1 .rt.rt_nr_uninterruptible : -62 cfs_rq .exec_clock : 1437848.398469 .MIN_vruntime : 0.000001 .min_vruntime : 28071407.036211 .max_vruntime : 0.000001 .spread : 0.000000 .spread0 : -6118295.724964 .nr_running : 0 .load : 0 .bkl_count : 29 .nr_spread_over : 69496 runnable tasks: task PID tree-key switches prio exec-runtime sum-exec sum-sleep ---------------------------------------------------------------------------------------------------------- R IRQ-3 587 179.969964 444 4 179.969964 9.079176 0.005109 cpu#5, 1600.057 MHz .nr_running : 0 .load : 0 .nr_switches : 75376770 .nr_load_updates : 29398104 .nr_uninterruptible : 26728 .jiffies : 4324061591 .next_balance : 4324.061595 .curr->pid : 0 .clock : 29393636.488096 .idle_clock : 0.000000 .prev_clock_raw : 54153304.160724 .clock_warps : 0 .clock_overflows : 97144 .clock_deep_idle_events : 0 .clock_max_delta : 0.998701 .cpu_load[0] : 0 .cpu_load[1] : 0 .cpu_load[2] : 0 .cpu_load[3] : 0 .cpu_load[4] : 0 .yld_exp_empty : 0 .yld_act_empty : 0 .yld_both_empty : 0 .yld_count : 1677 .sched_switch : 0 .sched_count : 75458943 .sched_goidle : 28789756 .ttwu_count : 45282745 .ttwu_local : 44965719 .bkl_count : 99 .rto_schedule : 29459623 .rto_schedule_tail : 119 .rto_wakeup : 455 .rto_pulled : 8 .rto_pushed : 2446 .lb_breaks : 18517 .rt.rt_nr_running : 0 .rt.rt_nr_uninterruptible : -41 cfs_rq .exec_clock : 909053.116367 .MIN_vruntime : 0.000001 .min_vruntime : 25311238.380281 .max_vruntime : 0.000001 .spread : 0.000000 .spread0 : -8878464.380894 .nr_running : 0 .load : 0 .bkl_count : 99 
.nr_spread_over : 69037 runnable tasks: task PID tree-key switches prio exec-runtime sum-exec sum-sleep ---------------------------------------------------------------------------------------------------------- cpu#6, 1600.057 MHz .nr_running : 0 .load : 0 .nr_switches : 89140520 .nr_load_updates : 29398032 .nr_uninterruptible : -156266 .jiffies : 4324061591 .next_balance : 4324.061751 .curr->pid : 0 .clock : 29393564.500525 .idle_clock : 0.000000 .prev_clock_raw : 75270686.526764 .clock_warps : 0 .clock_overflows : 167827 .clock_deep_idle_events : 0 .clock_max_delta : 0.998390 .cpu_load[0] : 0 .cpu_load[1] : 0 .cpu_load[2] : 0 .cpu_load[3] : 0 .cpu_load[4] : 0 .yld_exp_empty : 0 .yld_act_empty : 0 .yld_both_empty : 0 .yld_count : 1774 .sched_switch : 0 .sched_count : 90431997 .sched_goidle : 30559564 .ttwu_count : 55066830 .ttwu_local : 54451675 .bkl_count : 636 .rto_schedule : 36481231 .rto_schedule_tail : 77343 .rto_wakeup : 4217166 .rto_pulled : 58 .rto_pushed : 27291 .lb_breaks : 5601 .rt.rt_nr_running : 0 .rt.rt_nr_uninterruptible : -52 cfs_rq .exec_clock : 2604299.438836 .MIN_vruntime : 0.000001 .min_vruntime : 34814678.176308 .max_vruntime : 0.000001 .spread : 0.000000 .spread0 : 624975.415133 .nr_running : 0 .load : 0 .bkl_count : 636 .nr_spread_over : 2998327 runnable tasks: task PID tree-key switches prio exec-runtime sum-exec sum-sleep ---------------------------------------------------------------------------------------------------------- cpu#7, 1600.057 MHz .nr_running : 0 .load : 0 .nr_switches : 76842021 .nr_load_updates : 29397960 .nr_uninterruptible : 85523 .jiffies : 4324061591 .next_balance : 4324.061594 .curr->pid : 0 .clock : 29393492.510016 .idle_clock : 0.000000 .prev_clock_raw : 75270695.181804 .clock_warps : 0 .clock_overflows : 101562 .clock_deep_idle_events : 0 .clock_max_delta : 0.995775 .cpu_load[0] : 0 .cpu_load[1] : 0 .cpu_load[2] : 0 .cpu_load[3] : 0 .cpu_load[4] : 0 .yld_exp_empty : 0 .yld_act_empty : 0 .yld_both_empty : 0 .yld_count : 2428 .sched_switch : 0 .sched_count : 76994234 .sched_goidle : 27556332 .ttwu_count : 45937928 .ttwu_local : 45374789 .bkl_count : 52 .rto_schedule : 29464909 .rto_schedule_tail : 374 .rto_wakeup : 1307 .rto_pulled : 7 .rto_pushed : 4268 .lb_breaks : 101688 .rt.rt_nr_running : 0 .rt.rt_nr_uninterruptible : -73 cfs_rq .exec_clock : 2344576.091583 .MIN_vruntime : 0.000001 .min_vruntime : 34734213.666132 .max_vruntime : 0.000001 .spread : 0.000000 .spread0 : 544510.904957 .nr_running : 0 .load : 0 .bkl_count : 52 .nr_spread_over : 89724 runnable tasks: task PID tree-key switches prio exec-runtime sum-exec sum-sleep ---------------------------------------------------------------------------------------------------------- ------- Additional Comment #34 From Venkateswarara Jujjuri 2008-08-21 01:48 EDT Today I did three experiments. 1. Blast to local disks (No SAN configuration): I my previous post; I said this configuration working fine. Well my statement was too early. I let it run overnight and it hung the system. So this confirms that the problem is not SAN related. 2. Made elm3c29 without multipath drivers. I started blast on it. I will let it run overnight. 3. On elm3c24 with multipath driver. I started the blast. As usual after 3 hours it practically hung the system. Through the console I took system back trace twice with 1hr and 24 hours gap. Plan is to analyze these two stacks and see which processes got hung and what are making progress. I am hoping this gives us some indication on lock contention. 
I will update the defect with my findings. ------- Additional Comment #35 From Sripathi Kodi 2008-08-21 04:46 EDT Tim, the list of processes in comment #26 seems to be incomplete. Copy-paste error, by any chance? ------- Additional Comment #36 From Sripathi Kodi 2008-08-21 07:39 EDT [reply] I am trying to make sense out of the dump mentioned in comment #22. However, it is very difficult to get information about function parameters. Looks like we are not passing them on the stack any more. ------- Additional Comment #37 From Vernon Mauery 2008-08-21 09:11 EDT [reply] For some reason, I think we went back to register passed function arguments. Even as far back as SR3, we are using register passed arguments. It sure does make it a bear to debug. ------- Additional Comment #38 From TIMOTHY R. CHAVEZ 2008-08-21 11:40 EDT (In reply to comment #35) Yeah, I noticed that last night... I couldn't keep the console alive long enough to get it all... I'll try again today. There are some interesting stack traces for blast and pdflush, for example. ------- Additional Comment #40 From Venkateswarara Jujjuri 2008-08-22 01:03 EDT This defect is giving conflicting data points. From the latest investigation: - SAN configured with dm_multipath hung in 1 hour. - SAN configured without dm_multipath ran overnight w/o any problem. This may tell us that.. it could be two different problems. The one we have seen on the local disks could be a different issue. I took the crashdump on elm3c24 when it is hung. Dump is available on KBIC. The stack shows that most of the threads have submitted IO and waiting for it to complete. Or waiting for a buffer to submit IO. I could not find anyone holding lock and got preempted. This might give an indication that the bio_done notification could be an issue. Another interesting thing is; I took 3 back traces with an hour gap ..while the system is hung. And then dump after 10 hours. Stack on all these three instances are identical. This tells us that the stacks pointed by dump are that of the hung processes...not a point in time. Next step is to instrument dm_scsi module. I will be doing that tomorrow. ------- Additional Comment #41 From Sripathi Kodi 2008-08-22 06:27 EDT [reply] Google led me to http://readlist.com/lists/vger.kernel.org/linux-kernel/88/443749.html which reports a problem with processes getting stuck in sync_buffer. We see some bash and blast threads in our dumps which are stuck with similar backtraces. The next mail on the thread : http://readlist.com/lists/vger.kernel.org/linux-kernel/88/443762.html says the problem is fixed by commit 149a051f82d2b3860fe32fa182dbc83a66274894. In fact the code around this area has changed again since then. This patch is not in MRG kernels. Has this problem been recreated with latest kernels? 2.6.26-rt1? Can the above be the patch we need? ------- Additional Comment #42 From TIMOTHY R. CHAVEZ 2008-08-25 10:13 EDT (In reply to comment #41) I couldn't get this elmb3b210 (the LS41 w/ local disk we were able to repo the failure on) to boot 2,6,23,3-rt2 for some reason... blargh... trying with an LS21 this morning. ------- Additional Comment #43 From Venkateswarara Jujjuri 2008-08-25 12:37 EDT Over the wk-end; I booted elm3c24 with 2.6.26.3-rt2 and was able to reproduce the problem. Configuration of the system: - AMD 2 Quad Cores - fastT San disks. - Single Path. I reproduced the problem on this kernel with and without multipath driver in the stack. i.e on mpath devices and on sd devices. Now next plan is: 1. 
Configure the system with non-ext3 fs( ReiserFS??) and rerun the blast. 2. Walk through the code and instrument it. With the help of Paul Mckenney's debugfs. 3. Analyze the dump. ------- Additional Comment #44 From Venkateswarara Jujjuri 2008-08-26 01:04 EDT To eliminate the FS type variable; I have configured the system with JFS filesystem and started the blast test on multipath (mpath)devices. Test are running for the last 10 hours.!! Ext3 multipath combination consistently hung the system in 2 hours. Interesting that JFS+mpath has been running for the last 10 hours. We may have to wait until tomorrow morning before making any conclusions. ------- Additional Comment #47 From Venkateswarara Jujjuri 2008-08-26 13:56 EDT My testing on JFS+mpath+RT went fine for almost 20 hours. and then blast failed with "Bus error" and on the terminal : Message from syslogd@ at Tue Aug 26 10:36:43 2008 ... elm3c24 kernel: journal commit I/O error More digging showed that the problem is with the root fs running ext3 dmesg shows: EXT3-fs error (device dm-0): ext3_journal_start_sb: Detected aborted journal Remounting filesystem read-only __journal_remove_journal_head: freeing b_frozen_data __journal_remove_journal_head: freeing b_frozen_data journal commit I/O error Buffer I/O error on device dm-0, logical block 15466498 lost page write due to I/O error on dm-0 Buffer I/O error on device dm-0, logical block 15466499 lost page write due to I/O error on dm-0 Buffer I/O error on device dm-0, logical block 15892561 lost page write due to I/O error on dm-0 Buffer I/O error on device dm-0, logical block 15892605 lost page write due to I/O error on dm-0 __ratelimit: 103 messages suppressed Buffer I/O error on device dm-0, logical block 0 lost page write due to I/O error on dm-0 EXT3-fs error (device dm-0): ext3_find_entry: reading directory #13598721 offset 0 EXT3-fs error (device dm-0): ext3_find_entry: reading directory #13598721 offset 0 EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=17465504, block=17465350 EXT3-fs error (device dm-0): ext3_find_entry: reading directory #13598721 offset 0 metapage_write_end_io: I/O error ------- Additional Comment #48 From Venkateswarara Jujjuri 2008-08-26 14:57 EDT Ext3 issue reported in Comment #47 is because of SAN s/w reset. Please ignore it. ------- Additional Comment #49 From Venkateswarara Jujjuri 2008-08-27 01:19 EDT Tried on ext2 fs and got reproduced in around 3 hours. To make sure; I have started JFS test again. Me and Keith spent most of the afternoon cscoping and code walking the IO path. Crash shows number io waits on each processor. We are trying to got to the bio buffer head to findout the status of pending IOs. 
crash> struct rq ffff810009016980 | grep -e iowait -e counter nr_iowait = { counter = 6 crash> struct rq ffff8101500a2980 | grep -e iowait -e counter nr_iowait = { counter = 2 crash> struct rq ffff810009021980 | grep -e iowait -e counter nr_iowait = { counter = 0 crash> struct rq ffff8101500ad980 | grep -e iowait -e counter nr_iowait = { counter = 4 crash> struct rq ffff81000902c980 | grep -e iowait -e counter nr_iowait = { counter = 0 crash> struct rq ffff8101500b8980 | grep -e iowait -e counter nr_iowait = { counter = 2 crash> struct rq ffff81000900b980 | grep -e iowait -e counter nr_iowait = { counter = 27 crash> struct task_struct ffff81014d141480 > blast_task_struct crash> struct task_struct ffff81014d141480 |grep -e blast -e bio comm = "blast\000d\000\000\000\000\000\000\000\000", bio_list = 0x0, bio_tail = 0x0, crash> foreach struct task_struct |less foreach: unknown argument: "struct" foreach: unknown argument: "task_struct" foreach: no keywords specified crash> foreach task|less ------- Additional Comment #50 From TIMOTHY R. CHAVEZ 2008-08-27 09:47 EDT (In reply to comment #49) Interesting... I left blast running for about 19 hours and 20 minutes with ext2 / about 835 loops (I left my notes in the office so these are approximations) without problems. That said, I did notice I was only testing on 3 mount points and not 4 (doh!). I restarted this morning with ext3 mounted and made sure it was running on 4 mounts. Is the dumpfile available on kbic? Were you able to get any other information about the system via sysrq? ------- Additional Comment #51 From TIMOTHY R. CHAVEZ 2008-08-27 16:41 EDT (In reply to comment #41) I was skimming through LKML and I came across this conversation which reports a problem with a stack trace that seems awfully similar... I do not yet know what the conclusion of this dialog was. Adding link for documented reference... http://linux.derkeiler.com/Mailing-Lists/Kernel/2008-03/msg02480.html
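For reference, the nr_iowait counter read from each runqueue above counts tasks currently sleeping in io_schedule() on that CPU. A conceptual sketch of what maintains it (struct rq is private to kernel/sched.c, so the type below is a stand-in holding only the field the crash session reads; this is the editor's illustration, not the literal scheduler source):

#include <linux/sched.h>
#include <asm/atomic.h>

struct rq_iowait_sketch {
        atomic_t nr_iowait;          /* the counter dumped per runqueue above */
};

static void io_wait_sketch(struct rq_iowait_sketch *rq)
{
        atomic_inc(&rq->nr_iowait);  /* task enters io_schedule() */
        schedule();                  /* sleep until the IO completes */
        atomic_dec(&rq->nr_iowait);  /* woken up, off the iowait count */
}

So a counter of 27 on one runqueue means roughly 27 tasks were parked there waiting for IO completions at the time of the dump.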
A 36 hour run reproduced the problem on JFS too, which takes ext3 off the suspect list. I am analyzing the dump... so far nothing that points to a smoking gun, but one thing looks suspicious. From the dump:

struct mapped_device {
    ....
    /*
     * A list of ios that arrived while we were suspended.
     */
    atomic_t pending;          = 0000000000001fa1
    wait_queue_head_t wait;
    struct bio_list deferred;  = 0000000000000000 0000000000000000
    struct bio_list pushback;  = 0000000000000000 0000000000000000
    ....
}

I am still checking the code to see whether this is a valid/sane state: the pending count is quite high (0x1fa1), yet the deferred and pushback lists are both NULL. In the next comment I will add the complete crash analysis of how I got to this data structure.
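For reference, the accounting behind that pending field follows this pattern (a simplified sketch in the spirit of that era's drivers/md/dm.c; the struct and function names here are illustrative, not the literal source):

#include <linux/bio.h>
#include <linux/wait.h>
#include <asm/atomic.h>

/* Simplified view of the fields inspected in the dump above. */
struct md_accounting {
        atomic_t pending;            /* bios handed to lower layers, not yet completed */
        wait_queue_head_t wait;      /* suspend/flush sleeps here until pending hits 0 */
        struct bio_list deferred;    /* bios queued while the device is suspended */
        struct bio_list pushback;
};

/* Submission side: every bio mapped by dm bumps the counter. */
static void io_submitted(struct md_accounting *md)
{
        atomic_inc(&md->pending);
}

/* Completion side: the last completion wakes anyone waiting to quiesce. */
static void io_completed(struct md_accounting *md)
{
        if (atomic_dec_and_test(&md->pending))
                wake_up(&md->wait);
}

Under that pattern, pending = 0x1fa1 (8097) with empty deferred/pushback lists would mean roughly 8097 bios were passed down and their completions never came back, which fits the earlier suspicion that the bio_done notification is the issue.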
Here is the detailed analysis on how I came to the device_mapper data structure. Took one of the stuck'd blast thread. PID: 6840 TASK: ffff81014d141480 CPU: 1 COMMAND: "blast" #0 [ffff8101330bb898] schedule at ffffffff8128565f #1 [ffff8101330bb950] io_schedule at ffffffff81285b9c #2 [ffff8101330bb970] sync_buffer at ffffffff810d21ca #3 [ffff8101330bb980] __wait_on_bit at ffffffff81285e14 #4 [ffff8101330bb9c0] out_of_line_wait_on_bit at ffffffff81285eb4 #5 [ffff8101330bba30] __wait_on_buffer at ffffffff810d214a #6 [ffff8101330bba40] sync_dirty_buffer at ffffffff810d2ee7 ffff8101330bba48: ffff81019dd42e98 ffff8102348272d0 ffff8101330bba58: ffff8101330bbaa8 ffffffff88025685 #7 [ffff8101330bba60] journal_dirty_data at ffffffff88025685 ffff8101330bba68: ffff8102348272d0 ffff81014c4cb800 ffff8101330bba78: ffff81019dd42e98 0000000000001000 ffff8101330bba88: ffff8102348272d0 ffff81019dd42df8 ffff8101330bba98: ffff81019dd42df8 ffff81019dd42df8 ffff8101330bbaa8: ffff8101330bbad8 ffffffff8803720c #8 [ffff8101330bbab0] ext3_journal_dirty_data at ffffffff8803720c #9 [ffff8101330bbae0] walk_page_buffers at ffffffff8803659b #10 [ffff8101330bbb40] ext3_ordered_write_end at ffffffff88038dcd #11 [ffff8101330bbba0] generic_file_buffered_write at ffffffff81085771 #12 [ffff8101330bbca0] __generic_file_aio_write_nolock at ffffffff81085eef #13 [ffff8101330bbd50] generic_file_aio_write at ffffffff81085fbd #14 [ffff8101330bbda0] ext3_file_write at ffffffff880352ae #15 [ffff8101330bbdd0] do_sync_write at ffffffff810afd93 #16 [ffff8101330bbf10] vfs_write at ffffffff810b05d9 #17 [ffff8101330bbf40] sys_write at ffffffff810b0be9 crash> struct buffer_head ffff81019dd42df8 struct buffer_head { b_state = 16413, 100000000011101 BH_Uptodate|BH_Lock|BH_Req|BH_Mapped Indicating that, it is a locked buffer with valid disk mapping and has been suubmitted for IO. b_this_page = 0xffff81019dd42df8, b_page = 0xffffe2000c8bc200, b_blocknr = 61191, b_size = 4096, b_data = 0xffff8102174b0000 "w\003\035", b_bdev = 0xffff81014f4b99c0, ==> This is block_device b_end_io = 0xffffffff810d36f2 <end_buffer_write_sync>, b_private = 0xffff8101897081f0, b_assoc_buffers = { next = 0xffff81019dd42e40, prev = 0xffff81019dd42e40 }, ... .. } crash> struct block_device 0xffff81014f4b99c0 struct block_device { bd_dev = 265289733, bd_inode = 0xffff81014f4b9ab8, bd_openers = 1, .... ... bd_holder = 0xffffffff880505a0, bd_holders = 2, bd_holder_list = { next = 0xffff81014f4b9a60, prev = 0xffff81014f4b9a60 }, bd_contains = 0xffff81014f4b99c0, bd_block_size = 4096, bd_part = 0x0, bd_part_count = 0, bd_invalidated = 0, bd_disk = 0xffff81014e0a2800, ==> This is the gendisk. bd_list = { next = 0xffff81014f430258, prev = 0xffff81014f47a658 }, bd_inode_backing_dev_info = 0x0, bd_private = 0 } crash> struct gendisk 0xffff81014e0a2800 struct gendisk { major = 253, first_minor = 5, minors = 1, disk_name = "dm-5\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\ 000\000\000\000\000\000\000\000\000\000\000\000", ==> My disk name. part = 0x0, part_uevent_suppress = 0, fops = 0xffffffff881a87a0, queue = 0xffff81014e4a1858, ==> This is the request_qaueue private_data = 0xffff81014e174400, capacity = 2097152, flags = 16, driverfs_dev = 0x0, ... ... } crash> struct request_queue 0xffff81014e4a1858 struct request_queue { queue_head = { next = 0x0, prev = 0x0 }, .... .... 
make_request_fn = 0xffffffff8819dac3, <dm_request> prep_rq_fn = 0, unplug_fn = 0xffffffff8819d080, < dm_unplug_all > merge_bvec_fn = 0, prepare_flush_fn = 0, softirq_done_fn = 0, end_sector = 0, boundary_rq = 0x0, ... ... backing_dev_info = { ra_pages = 32, state = 0, capabilities = 4, congested_fn = 0xffffffff8819d02d, <dm_any_congested> congested_data = 0xffff81014e174400, unplug_io_fn = 0xffffffff81127e85 <blk_backing_dev_unplug>, unplug_io_data = 0xffff81014e4a1858, bdi_stat = {{ lock = { raw_lock = { slock = 43433 }, break_lock = 0 }, count = 54880, list = { next = 0xffff81024d5a8158, prev = 0xffff81014e4a1a78 }, counters = 0x7efeb0ecfebf }, { lock = { raw_lock = { slock = 57054 }, break_lock = 0 }, count = 2688, list = { next = 0xffff81014e4a1a50, prev = 0xffff81014e4a1aa0 }, counters = 0x7efeb172233f }}, completions = { events = { lock = { raw_lock = { slock = 39321 }, break_lock = 0 }, count = 169288, list = { next = 0xffff81014e4a1a78, prev = 0xffff81024d95d2b0 }, counters = 0x7efeb1f4bebf }, shift = 19, period = 24903680, lock = { raw_lock = { slock = 24158 }, break_lock = 0 } }, dirty_exceeded = 0 }, queuedata = 0xffff81014e174400, ==> This is mapped_device bounce_pfn = 2424832, bounce_gfp = 16, crash> struct mapped_device 0xffff81014e174400 struct: invalid data structure reference: mapped_device crash> crash> rd 0xffff81014e174400 100 // Intrpreting it as mapped_device ffff81014e174400: 0000000000000000 0000000000005e5e ........^^...... ffff81014e174410: ffff81014e174410 ffff81014e174410 .D.N.....D.N.... ffff81014e174420: ffff81014e174420 ffff81014e174420 D.N.... D.N.... ffff81014e174430: 0000000000000000 0000000000000000 ................ ffff81014e174440: ffff81014e174440 ffff81014e174440 @D.N........ ffff81014e174450: 000000000000008c 0000000000000001 ................ ffff81014e174460: 0000000000000000 ffff81014e174468 ........hD.N.... ffff81014e174470: ffff81014e174468 ffff81014e174478 hD.N....xD.N.... ffff81014e174480: ffff81014e174478 0000000000000000 xD.N............ ffff81014e174490: 0000000000000000 ffff81014e174498 .........D.N.... ffff81014e1744a0: ffff81014e174498 ffff81014e1744a8 .D.N.....D.N.... ffff81014e1744b0: ffff81014e1744a8 0000000000000000 .D.N............ ffff81014e1744c0: 0000000000000000 0000000000000000 ................ ffff81014e1744d0: 0000000000006969 ffff81014e1744d8 ii.......D.N.... ffff81014e1744e0: ffff81014e1744d8 ffff81014e1744e8 .D.N.....D.N.... ffff81014e1744f0: ffff81014e1744e8 0000000000000000 .D.N............ ffff81014e174500: 0000000000000000 ffff81014e174508 .........E.N.... ffff81014e174510: ffff81014e174508 000000000000008c .E.N............ ffff81014e174520: 0000000000000000 0000000100000003 ................ ffff81014e174530: 0000000000000000 ffff81014e4a1858 ........X.JN.... ^^^^^ request_queue ffff81014e174540: ffff81014e0a2800 000000353a333532 .(.N....253:5... ^^^^ gendisk ^^^^ name ffff81014e174550: 0000000000000000 ffff81014e0c5b40 ........@[.N.... ^^^^^^^^^^^interface_ptr ffff81014e174560: 0000000000001fa1 0000000000000000 ................ ^^^^^^^^ pending?? this is a huge number. ffff81014e174570: ffff81014e174570 ffff81014e174570 pE.N....pE.N.... ffff81014e174580: ffff81014e174580 ffff81014e174580 .E.N.....E.N.... ffff81014e174590: 0000000000000000 0000000000000000 ................ ffff81014e1745a0: ffff81014e1745a0 ffff81014e1745a0 .E.N.....E.N.... ^^^ wait (__wait_queue_head) ends here ffff81014e1745b0: 0000000000000000 0000000000000000 ................ 
struct bio_list deferred; ffff81014e1745c0: 0000000000000000 0000000000000000 ................ struct bio_list pushback; ffff81014e1745d0: ffff81014e56d200 ffff81014e99c480 ..VN.......N.... ^^^^ map (struct dm_table *map;) ffff81014e1745e0: ffff81014e99c080 ffff81014e8dd3c0 ...N.......N.... ^^ *bs(bio_set) ffff81014e1745f0: 0000000000000001 0000000000000000 ................ ffff81014e174600: ffff81014e174600 ffff81014e174600 .F.N.....F.N.... ffff81014e174610: ffff81014e174610 ffff81014e174610 .F.N.....F.N.... ffff81014e174620: 0000000000000000 0000000000000000 ................ ffff81014e174630: ffff81024a9a3c90 ffff81024a9a3c90 .<.J.....<.J.... ffff81014e174640: 0000000000000000 ffff81014e174648 ........HF.N.... ffff81014e174650: ffff81014e174648 0000000000000000 HF.N............ ffff81014e174660: ffff81014e174660 ffff81014e174660 `F.N....`F.N.... ffff81014e174670: ffff81014e174670 ffff81014e174670 pF.N....pF.N....
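The walk performed by hand above can be summarized as the following pointer chain (editor's sketch; field names as in that kernel generation, and for a device-mapper queue ->queuedata points back at the struct mapped_device):

#include <linux/buffer_head.h>
#include <linux/blkdev.h>
#include <linux/genhd.h>

/* From a stuck buffer_head back to the owning dm device:
 *   bh -> block_device -> gendisk -> request_queue -> queuedata */
static void *md_from_buffer_head(struct buffer_head *bh)
{
        struct block_device *bdev = bh->b_bdev;   /* e.g. 0xffff81014f4b99c0 above */
        struct gendisk *disk = bdev->bd_disk;     /* "dm-5" */
        struct request_queue *q = disk->queue;    /* make_request_fn == dm_request */

        return q->queuedata;                      /* the mapped_device */
}

The return type is void * on purpose: struct mapped_device is private to drivers/md/dm.c, which is also why crash reported "invalid data structure reference: mapped_device" and the raw rd dump had to be interpreted by hand.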
Changed the default IO scheduler from "cfq" to "anticipatory" and started the blast on mpath devices. It is running for the last 11 hours. In this configuration with "cfq" hung the machine consistently within 3 hours. I can see machine response is slow; but is responsive and blast is running. I will let it run.... (In reply to comment #60) > Changed the default IO scheduler from "cfq" to "anticipatory" and started the > blast on mpath devices. It is running for the last 11 hours. In this > configuration with "cfq" hung the machine consistently within 3 hours. > What was Ming Ming's rationale for switching to the anticipatory I/O scheduler? (In reply to comment #61) > (In reply to comment #60) > > Changed the default IO scheduler from "cfq" to "anticipatory" and started the > > blast on mpath devices. It is running for the last 11 hours. In this > > configuration with "cfq" hung the machine consistently within 3 hours. > > > > > What was Ming Ming's rationale for switching to the anticipatory I/O scheduler? CFQ is relatively new. Became default from 2.6.18. So the idea is to try anticipatory which was the default one before CFQ...and is supposed to be a stable one. BTW, tests are still running fine. ext3 on mpath with AS io scheduler ran longer .. but finally hung the machine after 15 hours of the run. I am trying to reproduce this bug on non-rt system. Compiled 2.6.26-3 kernel; booted-up; and started the blast. Tests are running for 60+ hours. I will be stopping these tests sometime soon...Looks like the bug is only on RT system or.. may be it is not getting exposed on non-rt systems. Blast on non-rt kernel ran for 75 hours. Stopped the tests. As per Mike's request, I will be collecting dump on rt kernel tonight. On non-rt kernel, I tried to stop blast. 1 out of 4 threads did not end properly... and on the console: INFO: task blast:10604 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. ffff81022b54ba28 0000000000000082 0000000000000000 ffff810170e6cf18 ffff81024d17d340 ffff81024faa2b70 ffff81024d17d578 0000000300000001 00000000ffffffff 0000000000000003 0000000000000000 0000000000000000 Call Trace: [<ffffffff802f6aab>] log_wait_commit+0x9f/0xed [<ffffffff8023d90a>] autoremove_wake_function+0x0/0x2e [<ffffffff802f5605>] log_do_checkpoint+0xdf/0x348 [<ffffffff8029c3ae>] __find_get_block+0x148/0x158 [<ffffffff802f2985>] journal_stop+0x181/0x18d [<ffffffff8034c569>] __next_cpu+0x19/0x26 [<ffffffff80225401>] find_busiest_group+0x25e/0x6c6 [<ffffffff80209df6>] __switch_to+0x2ea/0x2f9 [<ffffffff8055a09f>] thread_return+0x3d/0x9c [<ffffffff80234ce2>] lock_timer_base+0x26/0x4b [<ffffffff802f5944>] __log_wait_for_space+0x7d/0xa1 [<ffffffff802f304a>] start_this_handle+0x2c3/0x304 [<ffffffff8023d90a>] autoremove_wake_function+0x0/0x2e [<ffffffff8029199d>] mntput_no_expire+0x20/0x107 [<ffffffff802f3126>] journal_start+0x9b/0xd2 [<ffffffff802ea2f8>] ext3_create+0x30/0xe9 [<ffffffff8028d77b>] d_alloc+0x15b/0x1a8 [<ffffffff802850fb>] vfs_create+0x75/0xba [<ffffffff80287a13>] do_filp_open+0x1dd/0x773 [<ffffffff8023d90a>] autoremove_wake_function+0x0/0x2e [<ffffffff802e2835>] ext3_discard_reservation+0x52/0x61 [<ffffffff8034c5f9>] _atomic_dec_and_lock+0x39/0x54 [<ffffffff8027c4d6>] do_sys_open+0x46/0xca [<ffffffff8020b07b>] system_call_after_swapgs+0x7b/0x80 After sometime blast killed itself for the timeout with the message: Waiting for tests to end. 
BLAST Ended on /SAN/test2 RC = 0 at 09/01/2008 21:46:38 *=== Run statistics ============================================* | Elapsed time = 3 days, 3 hours, 38 minutes and 20 seconds *===============================================================* BLAST Ended on /SAN/test4 RC = 0 at 09/01/2008 21:46:42 *=== Run statistics ============================================* | Elapsed time = 3 days, 3 hours, 38 minutes and 13 seconds *===============================================================* BLAST Ended on /SAN/test3 RC = 0 at 09/01/2008 21:47:21 *=== Run statistics ============================================* | Elapsed time = 3 days, 3 hours, 38 minutes and 58 seconds *===============================================================* q Forcing test termination issuing KILL for thread 1120500032 Killed Ran blast on the 2.6.24-ibmrt2.6 with the "noop" io scheduler... at some point /SAN/test1 (pointing at /dev/dm-6) went "read-only"... device-mapper: multipath: Failing path 8:32. end_request: I/O error, dev sdh, sector 9910550 device-mapper: multipath: Failing path 8:112. printk: 8 messages suppressed. Buffer I/O error on device dm-4, logical block 12991 lost page write due to I/O error on dm-4 Aborting journal on device dm-4. Buffer I/O error on device dm-4, logical block 841 lost page write due to I/O error on dm-4 Buffer I/O error on device dm-4, logical block 1179668 lost page write due to I/O error on dm-4 Buffer I/O error on device dm-4, logical block 1179669 lost page write due to I/O error on dm-4 Buffer I/O error on device dm-4, logical block 1179670 lost page write due to I/O error on dm-4 Buffer I/O error on device dm-4, logical block 1179671 lost page write due to I/O error on dm-4 ext3_abort called. EXT3-fs error (device dm-4): ext3_journal_start_sb: Detected aborted journal Remounting filesystem read-only ... this could be because the system is running without dm-mp enabled...
I tried configuring bonnie to do something like blast using the following script:

#!/bin/sh
set -x

FILES_X1024=100
NUM_DIRS=1
MAX_SIZE=524288
MIN_SIZE=1024
BONNIE_LOOPS=10
BONNIE_USER="-u rtuser"
BONNIE_ARGS="-x $BONNIE_LOOPS -s 0 -n $FILES_X1024:$MAX_SIZE:$MIN_SIZE:$NUM_DIRS"

rm -rf b{1,2,3,4,5,6,7,8}
mkdir b{1,2,3,4,5,6,7,8}
if [ ! -d logs ]; then
    mkdir logs
fi
chmod a+rwx b{1,2,3,4,5,6,7,8} logs

bonnie++ $BONNIE_USER -p8 -x$BONNIE_LOOPS

for ((i=1;i<9;i++)); do
    bonnie++ -y -d b$i $BONNIE_USER $BONNIE_ARGS > logs/`hostname`-`date +%Y%M%d-%H:%M:%S`-bonnie++-b$i.log &
done
wait

It ran until the disk was full, but no hangs. I didn't have separate partitions. I am going to need to repartition a /test partition I think in order to be able to test multiple partitions, or maybe stick this spare drive in my cube into 102... I think I'll try that tomorrow.
Currently running iozone on elm3c19, using both / and /test.
Reproduced on elm3c29 on /dev/sd devices. Got the dump. Moving the vmcore to KBIC.
I have saved the core on KBIC. The defect is reproduced directly on /dev/sd devices. This should make the debugging process easier as we are avoiding the multipath layers. The dump can be accessed on KBIC in the following way:

jvrao@kernel:~/bz46744/elm3c29$ pwd -P
/home/services/opensource/realtime/kdump_cores/bz46744/elm3c29
jvrao@kernel:~/bz46744/elm3c29$ ./crash ./vmlinux 2008-09-02-22\:25/vmcore

Please contact me if you can't access the dump for any reason.
Started analyzing the elm3c29 dump. I am recording my findings in the following file: /home/services/opensource/realtime/kdump_cores/bz46744/elm3c29/jv-bt-all If anyone is interested, please feel free to look into it.
I reviewed the 2008-09-02-22:25/vmcore. It appears that the SCSI host adapter is in a recovery state after a number of IOs timed out. There are still some questions outstanding. - Why did the IOs time out? Since it was indicated that a failure has been seen on non-SAN configurations, we may need to look at a non-SAN failure vmcore to determine whether the signatures are similar, to help focus on the cause of the timeouts. - The qla2x00_issue_iocb function is not returning during the abort sequence. It appears to be in a mod_timer call, but that does not match the source I have, so I will need a pointer to the source tree for this vmcore. I copied a script file and an edited version of the crash info I used to the URL shown below. http://kernel.beaverton.ibm.com/~andmike/bugs/ibmbz_46744/
Yesterday I gave Mike access to the code. The source is on KBIC: /home/jvrao/public/linux-2.6.24.7-ibmrt2.6-view
The box I was running blast on with local disks hung, but I have been unable to collect a dump via sysrq. JV pressed the NMI, but the system did not respond. I wasn't able to raise anyone to press the button late this afternoon. Given that, I am planning to reboot the machine and kick the tests off again. On another note, I was able to crash a SAN machine using bonnie++ instead of blast with mixed_load.sh as a background load. I collected the crash and uploaded to kbic here (IBM only for now, I can share if others are interested): http://kernel.beaverton.ibm.com/realtime/kdump_cores/bz46744/elm3c29/bonnie-crash/ A README and the script I used to execute bonnie++ are there as well.
On Live machine/crash confirmed that scsi_eh_0 is not making progress. Waiting on the semaphore. This analysis shows how I tracked down that semaphore. Still need to figure out who is holding that semaphore. crash> bt 524 PID: 524 TASK: ffff81007eb494c0 CPU: 0 COMMAND: "scsi_eh_0" #0 [ffff81007ebcfb10] schedule at ffffffff8127a86c #1 [ffff81007ebcfbc8] __compat_down at ffffffff8127c6d3 #2 [ffff81007ebcfbf8] lock_timer_base at ffffffff81045f7b #3 [ffff81007ebcfc28] __compat_down_failed at ffffffff8127c329 #4 [ffff81007ebcfc78] qla2x00_mailbox_command at ffffffff880a44fc #5 [ffff81007ebcfd28] qla2x00_issue_iocb at ffffffff880a5add #6 [ffff81007ebcfdb8] qla24xx_abort_command at ffffffff880a63e3 #7 [ffff81007ebcfe08] qla2xxx_eh_abort at ffffffff8809d127 #8 [ffff81007ebcfe78] __scsi_try_to_abort_cmd at ffffffff88056f31 #9 [ffff81007ebcfe88] scsi_error_handler at ffffffff8805870d #10 [ffff81007ebcff28] kthread at ffffffff8105144f #11 [ffff81007ebcff48] kernel_thread at ffffffff8100d048 crash> crash> bt -f 524 PID: 524 TASK: ffff81007eb494c0 CPU: 0 COMMAND: "scsi_eh_0" #0 [ffff81007ebcfb10] schedule at ffffffff8127a86c ... #3 [ffff81007ebcfc28] __compat_down_failed at ffffffff8127c329 ffff81007ebcfc30: ffff81007ff97e90 0000000000000000 ffff81007ebcfc40: 0000000000000000 000000000000752f ffff81007ebcfc50: ffff81007eb494c0 0000000000000000 ffff81007ebcfc60: ffff81007eb494c0 ffff81007ebcfca0 ffff81007ebcfc70: ffff81014ece69d0 ffffffff880a44fc "include/linux/semaphore.h" down((struct compat_semaphore *sem) #4 [ffff81007ebcfc78] qla2x00_mailbox_command at ffffffff880a44fc ffff81007ebcfc80: 0000000000000000 0000000000000000 ffff81007ebcfc90: 0000000000000000 0054000100000000 ffff81007ebcfca0: 0000000000000000 0000000000200200 ffff81007ebcfcb0: 0000000102ec7256 ffffffff880a42e8 ffff81007ebcfcc0: ffff81014ece69d0 ffffffff81534700 ........................^^^^^^^^^^^ sema ffff81007ebcfcd0: ffffffff880a44b2 5f68655f69736373 ffff81007ebcfce0: 0000000000000030 000000000000020c ffff81007ebcfcf0: ffff810009005a40 ffff81014e9d2200 ffff81007ebcfd00: ffff81014ece4610 ffff81014e9d2200 ffff81007ebcfd10: 0000000000000332 ffff81014d0f94c0 ffff81007ebcfd20: ffff81007ebcfdb0 ffffffff880a5add #5 [ffff81007ebcfd28] qla2x00_issue_iocb at ffffffff880a5add ffff81007ebcfd30: 00000005000000cf 22004e9d00000054 ffff81007ebcfd40: 000100007ebcfdb0 ffffffff811b6bdb ffff81007ebcfd50: 0000000100000000 0000000000000000 ffff81007ebcfd60: 0000000000000000 ffffffffffffffff ffff81007ebcfd70: 0000000000000000 0000000000000000 ffff81007ebcfd80: 0000000000000000 ffff81000000001e ffff81007ebcfd90: ffff81014e8b8180 ffff81014e9d2200 ffff81007ebcfda0: ffff81014e8b8180 ffff81014ece4610 .......................................^^^^^^^^^^^^^^scsi_qla_host_t ffff81007ebcfdb0: ffff81007ebcfe00 ffffffff880a63e3 crash> struct scsi_qla_host ffff81014ece4610|grep host_str host_str = "qla2xxx_0\000\000\000\000\000\000", crash> struct scsi_qla_host ffff81014ece4610|grep model_number model_number = "QMI2472\000\000\000\000\000\000\000\000\000", crash> struct scsi_qla_host ffff81014ece4610|grep parent parent = 0x0, // NO parent. 
outstanding_cmds = {0x0, 0xffff810148028300, 0xffff81014d0f9cc0, 0xffff8100633e2dc0, 0xffff8100789abe00, 0xffff81014d0f9700, 0xffff8100789aba00, 0xffff81014d0f9480, 0xffff8100633e2900, 0xffff81014d0f9e00, 0xffff8100a95916c0, 0xffff8100a9591640, 0xffff81014e1bc340, 0xffff8101480284c0, 0xffff810148028fc0, 0xffff8100a9591380, 0xffff81014d0f91c0, 0xffff81014d0f9380, 0xffff8100a9591dc0, 0xffff81014e1bc2c0, 0xffff81014e1bc200, 0xffff8100345c1d00, 0xffff8100789ab880, 0xffff8100789abdc0, 0xffff810079056a40, 0xffff8100345c1680, 0xffff8100633e2a80, 0xffff81014d0f96c0, 0xffff8101480286c0, 0xffff81014d0f9980, 0xffff810148028d00, 0xffff810148028900, 0xffff81014d0f9280, 0xffff810148028600, 0xffff81014d0f95c0, 0xffff81014e1bc480, 0xffff8100a95913c0, 0xffff81014d0f9fc0, 0xffff81014d0f9900, 0xffff8100633e2f80, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0...}, current_outstanding_cmd = 39, <= Outstanding Commands. crash> struct srb 0xffff810148028300 struct srb { ha = 0xffff81014ece4610, // Validation -> pointing back to scsi_qla_host. Confirmed that we are looking at valid structure. fcport = 0xffff81014e8b8180, cmd = 0xffff81014a539940, flags = 2, dma_handle = 18446744071579714837, request_sense_length = 0, request_sense_ptr = 0x0 } crash> struct scsi_qla_host.mbx_intr_sem ffff81014ece4610 mbx_intr_sem = { count = { counter = 0 }, sleepers = 1, wait = { lock = { lock = { wait_lock = { raw_lock = { slock = 0 }, break_lock = 0 }, wait_list = { prio_list = { next = 0xffff81014ece69e0, prev = 0xffff81014ece69e0 }, node_list = { next = 0xffff81014ece69f0, prev = 0xffff81014ece69f0 } }, owner = 0x0 }, break_lock = 0 }, task_list = { next = 0xffff81007ebcfbe8, prev = 0xffff81007ebcfbe8 } } }, Apparently one this process is sleeping. Not sure if there is any easy way of finding who is holding this semaphore. Any help on how to find that out? (In reply to comment #77) > On Live machine/crash confirmed that scsi_eh_0 is not making progress. > Waiting on the semaphore. This analysis shows how I tracked down that semaphore. > Still need to figure out who is holding that semaphore. 
forgot to add:

crash> struct scsi_qla_host.mbx_intr_sem
struct scsi_qla_host {
  [9152] struct compat_semaphore mbx_intr_sem;
}

ffff81014ece4610+23C0 = ffff81014ece69d0 -> location of the semaphore within the structure. We can see this pointer above on the stack (marked as "sema").
crash> struct scsi_qla_host ffff81014ece4610|grep mbx_cmd_flags
  mbx_cmd_flags = 4,

unsigned long mbx_cmd_flags;
#define MBX_INTERRUPT           1
#define MBX_INTR_WAIT           2
#define MBX_UPDATE_FLASH_ACTIVE 3

From the stack trace MBX_INTR_WAIT should be set, indicating that we are waiting
for the interrupt. But the dump is showing the flag value MBX_UPDATE_FLASH_ACTIVE.

If we look at the interrupt handler:

        if (test_bit(MBX_INTR_WAIT, &ha->mbx_cmd_flags) &&
            (status & MBX_INTERRUPT) && ha->flags.mbox_int) {
                set_bit(MBX_INTERRUPT, &ha->mbx_cmd_flags);
                up(&ha->mbx_intr_sem);
        }

the interrupt handler won't call up() if the MBX_INTR_WAIT bit is not set. This
explains why io_done is not being called and why all IOs are waiting. But we still
need to figure out how and why this bit got set.

Another interesting factor: looking at the code, MBX_UPDATE_FLASH_ACTIVE is set in
suspend_hba(), but if the code goes down that path, interrupts_on should be set
to 0. The dump tells otherwise:

crash> struct scsi_qla_host ffff81014ece4610|grep interrupts_on
  interrupts_on = 1 '\001',

So it is puzzling how mbx_cmd_flags got set to MBX_UPDATE_FLASH_ACTIVE.

Another minor mistake (no problem in the functionality) found in the code:

        tmp_intr_timer.data = (unsigned long)&ha->mbx_intr_sem;
        tmp_intr_timer.expires = jiffies + mcp->tov * HZ;
        tmp_intr_timer.function =
            (void (*)(unsigned long))qla2x00_mbx_sem_timeout;

where mbx_intr_sem is a compat_semaphore, but

        static void
        qla2x00_mbx_sem_timeout(unsigned long data)
        {
                struct semaphore *sem_ptr = (struct semaphore *)data;

typecasts the data to (struct semaphore *).
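For clarity, a condensed sketch of the timeout path just described. The timer-arming lines follow the quoted driver code; the callback body (an up() on the semaphore) is assumed from the later comments, and the helper name arm_mailbox_timeout() is invented for the sketch, not taken from the driver (in the driver this appears to live inside qla2x00_mailbox_command()):

	static void qla2x00_mbx_sem_timeout(unsigned long data)
	{
		/* The minor mistake described above: the object armed into
		 * the timer is a compat_semaphore, but it is cast to
		 * struct semaphore here, so the up() below resolves to the
		 * wrong implementation on the -rt kernel. */
		struct semaphore *sem_ptr = (struct semaphore *)data;

		if (sem_ptr != NULL)
			up(sem_ptr);
	}

	/* Hypothetical wrapper for the quoted setup code. */
	static void arm_mailbox_timeout(scsi_qla_host_t *ha, mbx_cmd_t *mcp)
	{
		struct timer_list tmp_intr_timer;

		init_timer(&tmp_intr_timer);
		/* &ha->mbx_intr_sem is a compat_semaphore on -rt */
		tmp_intr_timer.data = (unsigned long)&ha->mbx_intr_sem;
		tmp_intr_timer.expires = jiffies + mcp->tov * HZ;
		tmp_intr_timer.function =
			(void (*)(unsigned long))qla2x00_mbx_sem_timeout;
		add_timer(&tmp_intr_timer);
	}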
Today's Progress:

crash> struct scsi_qla_host.mbx_intr_sem ffff81007d2a8610
  mbx_intr_sem = {
    count = {
      counter = 0        <<< counter should be < 0 if anyone is waiting.
    },
    sleepers = 1,        <<< Indicates that one process is waiting.
                         <<< This is inconsistent: either a wakeup was missed,
                         <<< or there is some kind of race between down() and up().
    wait = {
      lock = {
        lock = {
          wait_lock = {
            raw_lock = {
              slock = 0
            },
            break_lock = 0
          },
          wait_list = {
            prio_list = {
              next = 0xffff81007d2aa9e0,
              prev = 0xffff81007d2aa9e0
            },
            node_list = {
              next = 0xffff81007d2aa9f0,
              prev = 0xffff81007d2aa9f0
            }
          },
          owner = 0x0
        },
        break_lock = 0
      },
      task_list = {
        next = 0xffff81014d021be8,
        prev = 0xffff81014d021be8
      }
    }
  },

crash> struct __wait_queue.task_list
struct __wait_queue {
   [24] struct list_head task_list;
}

0xffff81014d021be8 - 0x18 = ffff81014d021bd0

crash> struct __wait_queue ffff81014d021bd0
struct __wait_queue {
  flags = 1,
  private = 0xffff81014d6423c0,   <<< task_struct of the thread waiting on the
                                      semaphore, so the thread in question is scsi_eh_1.
  func = 0xffffffff81034f27 <default_wake_function>,
  task_list = {
    next = 0xffff81007d2aaa10,
    prev = 0xffff81007d2aaa10
  }
}

PID: 544   TASK: ffff81014d6423c0   CPU: 3   COMMAND: "scsi_eh_1"
 #0 [ffff81014d021b20] schedule at ffffffff8129f4fe
 #1 [ffff81014d021bc8] __compat_down at ffffffff812a14c6
 #2 [ffff81014d021c28] __compat_down_failed at ffffffff812a108d
 #3 [ffff81014d021c78] qla2x00_mailbox_command at ffffffff880a56ae
 #4 [ffff81014d021d28] qla2x00_issue_iocb at ffffffff880a6d11
 #5 [ffff81014d021db8] qla24xx_abort_command at ffffffff880a7644
 #6 [ffff81014d021e08] qla2xxx_eh_abort at ffffffff8809e1e0
 #7 [ffff81014d021e78] __scsi_try_to_abort_cmd at ffffffff88058077
 #8 [ffff81014d021e88] scsi_error_handler at ffffffff880598da
 #9 [ffff81014d021f28] kthread at ffffffff81053ee5
#10 [ffff81014d021f48] kernel_thread at ffffffff8100d418
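The pointer walk done above with crash (wait-queue head -> first wait-queue entry -> sleeping task) can also be written as kernel C. A minimal sketch, assuming the 2.6.24-era wait-queue layout; first_waiter() is an invented helper name and locking is omitted:

	#include <linux/wait.h>
	#include <linux/list.h>
	#include <linux/sched.h>

	/* Sketch only: recover the first task sleeping on a wait queue,
	 * mirroring the crash arithmetic above.  Real code would hold the
	 * wait-queue lock around this. */
	static struct task_struct *first_waiter(wait_queue_head_t *wqh)
	{
		wait_queue_t *wq;

		if (list_empty(&wqh->task_list))
			return NULL;

		/* wqh->task_list.next is 0x...be8 in the dump; list_entry()
		 * subtracts offsetof(wait_queue_t, task_list) == 0x18,
		 * giving the wait_queue_t at 0x...bd0. */
		wq = list_entry(wqh->task_list.next, wait_queue_t, task_list);

		/* ->private is the sleeping task_struct: 0xffff81014d6423c0,
		 * i.e. scsi_eh_1 (PID 544) above. */
		return (struct task_struct *)wq->private;
	}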
(In reply to comment #79)
> crash> struct scsi_qla_host ffff81014ece4610|grep mbx_cmd_flags
>   mbx_cmd_flags = 4,
>
> unsigned long mbx_cmd_flags;
> #define MBX_INTERRUPT           1
> #define MBX_INTR_WAIT           2
> #define MBX_UPDATE_FLASH_ACTIVE 3
>
> From the stack trace MBX_INTR_WAIT should be set, indicating that we are
> waiting for the interrupt. But the dump is showing the flag value
> MBX_UPDATE_FLASH_ACTIVE.
>
> If we look at the interrupt handler:
>
>         if (test_bit(MBX_INTR_WAIT, &ha->mbx_cmd_flags) &&
>             (status & MBX_INTERRUPT) && ha->flags.mbox_int) {
>                 set_bit(MBX_INTERRUPT, &ha->mbx_cmd_flags);
>                 up(&ha->mbx_intr_sem);
>         }
>
> the interrupt handler won't call up() if the MBX_INTR_WAIT bit is not set.
> This explains why io_done is not being called and why all IOs are waiting.
> But we still need to figure out how and why this bit got set.
>
> Another interesting factor: looking at the code, MBX_UPDATE_FLASH_ACTIVE is
> set in suspend_hba(), but if the code goes down that path, interrupts_on
> should be set to 0. The dump tells otherwise:
>
> crash> struct scsi_qla_host ffff81014ece4610|grep interrupts_on
>   interrupts_on = 1 '\001',
>
> So it is puzzling how mbx_cmd_flags got set to MBX_UPDATE_FLASH_ACTIVE.
>
> Another minor mistake (no problem in the functionality) found in the code:
>
>         tmp_intr_timer.data = (unsigned long)&ha->mbx_intr_sem;
>         tmp_intr_timer.expires = jiffies + mcp->tov * HZ;
>         tmp_intr_timer.function =
>             (void (*)(unsigned long))qla2x00_mbx_sem_timeout;
>
> where mbx_intr_sem is a compat_semaphore, but
>
>         static void
>         qla2x00_mbx_sem_timeout(unsigned long data)
>         {
>                 struct semaphore *sem_ptr = (struct semaphore *)data;
>
> typecasts the data to (struct semaphore *).

As I mentioned in this comment, the minor issue is appearing to be THE issue, because of the rt switch in include/linux/rt_lock.h: PICK_SEM_OP() picks the implementation depending on the type of the semaphore passed:

	#define up(sem) PICK_SEM_OP(compat_up, rt_up, sem)

So the fix is to correct the typecast to compat_semaphore. Changed, and tests have been running since morning. Rostedt gave a patch with a magic check to find hidden mistakes like this. I will be testing with that patch tomorrow.
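To make the type dispatch concrete, here is a small user-space model of a type-dispatching up(). It only illustrates the idea; it is not the real PICK_SEM_OP from include/linux/rt_lock.h, and the struct layouts and helper names are stand-ins:

	#include <stdio.h>

	/* Stand-in types, NOT the kernel's definitions. */
	struct compat_semaphore { int count; };
	struct semaphore        { int rt_mutex_stuff; };

	static void compat_up(struct compat_semaphore *sem) { (void)sem; puts("compat_up()"); }
	static void rt_up(struct semaphore *sem)            { (void)sem; puts("rt_up()"); }

	/* Dispatch on the *static* type of the argument expression, roughly
	 * what a PICK_SEM_OP-style macro does. */
	#define up(sem)							\
		__builtin_choose_expr(					\
			__builtin_types_compatible_p(typeof(sem),	\
					struct compat_semaphore *),	\
			compat_up((struct compat_semaphore *)(sem)),	\
			rt_up((struct semaphore *)(sem)))

	int main(void)
	{
		struct compat_semaphore s = { 0 };

		up(&s);                         /* correct type   -> compat_up() */
		up((struct semaphore *)&s);     /* the driver bug -> rt_up()     */
		return 0;
	}

Built with gcc, the first call prints compat_up() and the second prints rt_up(): once the pointer has been cast to the wrong type, the macro happily picks the rt path, which is exactly the wrong-function call described above.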
Created attachment 316488 [details]
patch to fix compat semaphore in qla2x00 driver

The qla2x00 driver passes the semaphore to a timeout routine via a pointer. The semaphore that is passed is really a compat_semaphore, but the callback function typecasts it to a semaphore. The up() logic calls the rt code if the type is a semaphore, and calls the old semaphore code if it is a compat_semaphore. Thus, we were calling the wrong function.

This patch is a one-liner that simply makes the type of the semaphore match the original one being passed in. This should solve the issue.
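The attachment itself is not inlined in this comment, but based on the description the one-liner presumably just changes the callback's local type so that up() resolves to the compat implementation. A rough sketch of the fixed callback (the if/up() body is assumed, not quoted from the patch):

	/* Sketch of the fixed callback -- the actual one-liner in
	 * attachment 316488 is not reproduced here. */
	static void qla2x00_mbx_sem_timeout(unsigned long data)
	{
		struct compat_semaphore *sem_ptr =
			(struct compat_semaphore *)data;

		if (sem_ptr != NULL)
			up(sem_ptr);	/* now dispatches to compat_up() */
	}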
Patch added to 2.6.24.7-80.el5rt.
Created attachment 316618 [details]
patch to catch misuses of rtmutexes

This patch adds "magic" numbers to the rtmutexes, and these are checked every time they are used. If a compat_semaphore or raw_spinlock is passed to these routines, then a warning will be printed.
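An illustrative sketch of the approach described above (this is not attachment 316618 itself; the field and constant names are invented): stamp rt-mutex-based locks at init time and warn when an object without the stamp reaches the rt-mutex code.

	#define RT_MUTEX_MAGIC	0x52544d58UL	/* arbitrary value for this sketch */

	struct rt_mutex_sketch {
		unsigned long	magic;
		/* ... the real rt_mutex fields would follow ... */
	};

	/* Called at the top of the rt-mutex lock/unlock paths: anything that
	 * was never initialized as an rt-mutex (e.g. a miscast
	 * compat_semaphore or a raw spinlock) fails the check and warns. */
	static inline void rt_mutex_check_magic(struct rt_mutex_sketch *lock)
	{
		WARN_ON_ONCE(lock->magic != RT_MUTEX_MAGIC);
	}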
The patch to fix the compat semaphore in the qla2x00 driver was found in mrg-rt.git as commit b95c0db0f90e16c458100256f3c4f0daecb6d4eb. The patch to catch misuses of rtmutexes was found in mrg-rt.git as commit 12538e0ad48ca5cc5f861863b178c03b7d4d28f5. I have not yet managed to find a suitable RHTS test system with the right driver available; only code review has been done. Both patches are in mrg-rt-2.6.24.7-81.
Tried to run the "online/offline" test (https://bugzilla.redhat.com/attachment.cgi?id=315191) on veritas2.rhts.bos.redhat.com. Ran the test with 250 iterations on 2.6.24.7-74rt without triggering anything close to a kernel crash. Ran the same test on 2.6.24.7-81rt, with the same results. Well, I did get one result ... the root filesystem got screwed up quite badly and I had to reinstall the OS between the runs ... but that's not so surprising, considering what the test actually does ... and it does it quite brutally. Not sure if I've found a setup which is close enough to reproduce this bug.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2008-0857.html
I've verified the compat_sem fix is in -85.