Description of problem:
After upgrading from a 2.6.18-93.2-based kernel to one based on 2.6.18-107, we are getting a lot of syslog messages like:

2011-09-07 12:15:46 rdoss-scratch1 INFO: task kjournald:8265 blocked for more than 120 seconds. <kern.err>
2011-09-07 12:15:46 rdoss-scratch1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. <kern.err>
2011-09-07 12:15:46 rdoss-scratch1 kjournald D ffff810001004440 0 8265 243 8264 (L-TLB) <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 ffff8106672edd70 0000000000000046 0000000000000000 ffff81035ce7d0c0 <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 ffffffff800a822d 000000000000000a ffff81035ce7d0c0 ffffffff80324b60 <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 000000a21df87f6d 00000000000008f4 ffff81035ce7d2a8 0000000000000282 <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 Call Trace: <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 [<ffffffff800a822d>] wake_bit_function+0x0/0x2a <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 [<ffffffff8004bfd8>] __wait_on_buffer+0x20/0x22 <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 [<ffffffff88a06a76>] :jbd:journal_commit_transaction+0x1ba/0x1354 <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 [<ffffffff80066031>] thread_return+0x5e/0xf6 <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 [<ffffffff8003eee8>] lock_timer_base+0x25/0x4a <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 [<ffffffff800a81f5>] autoremove_wake_function+0x0/0x38 <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 [<ffffffff88a0b668>] :jbd:kjournald+0x0/0x22f <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 [<ffffffff88a0b72d>] :jbd:kjournald+0xc5/0x22f <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 [<ffffffff800a81f5>] autoremove_wake_function+0x0/0x38 <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 [<ffffffff88a0b668>] :jbd:kjournald+0x0/0x22f <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 [<ffffffff80033905>] kthread+0x100/0x136 <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 [<ffffffff80028196>] schedule_tail+0x44/0xbe <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 [<ffffffff8006101d>] child_rip+0xa/0x11 <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 [<ffffffff80033805>] kthread+0x0/0x136 <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 [<ffffffff80061013>] child_rip+0x0/0x11 <kern.warning>

and:

2011-09-07 12:08:24 rdoss-scratch1 INFO: task modprobe:1969 blocked for more than 120 seconds. <kern.err>
2011-09-07 12:08:24 rdoss-scratch1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. <kern.err>
2011-09-07 12:08:24 rdoss-scratch1 modprobe D ffff81038a881440 0 1969 1917 (NOTLB) <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 ffff81037f085d58 0000000000000046 00000007fe933fff 0000000000000002 <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 000000047f085d78 0000000000000007 ffff81035e311830 ffff81038aa0d7b0 <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 0000004ffb133d98 00000000000055cb ffff81035e311a18 0000000400000001 <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 Call Trace: <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 [<ffffffff80067b32>] __down+0xc5/0xd9 <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 [<ffffffff80093b5a>] default_wake_function+0x0/0xf <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 [<ffffffff800677dd>] __down_failed+0x35/0x3a <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 [<ffffffff801dc2c9>] __driver_attach+0xa6/0xb6 <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 [<ffffffff801dc223>] __driver_attach+0x0/0xb6 <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 [<ffffffff801db9e9>] bus_for_each_dev+0x49/0x7a <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 [<ffffffff801dc051>] driver_attach+0x1c/0x1e <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 [<ffffffff801db5ee>] bus_add_driver+0x78/0x116 <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 [<ffffffff801dc4fc>] driver_register+0x8f/0x93 <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 [<ffffffff8016dff8>] __pci_register_driver+0x5a/0xb7 <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 [<ffffffff883ec074>] :i7core_edac:i7core_init+0x74/0x98 <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 [<ffffffff800aeb64>] sys_init_module+0xb0/0x1f5 <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 [<ffffffff80060116>] system_call+0x7e/0x83 <kern.warning>

We know that the newer kernel introduces a hung-process patch that produces these messages. This explains why we weren't seeing these messages before, but we're trying to understand whether it makes sense for processes like modprobe and kjournald to block for 120 seconds. This doesn't make sense to us.

The nodes that experience these issues the most are connected to SAS JBODs via an LSI HBA, which we're using with the following driver:

[root@rdadmin2 ~]# modinfo mptsas
filename:       /lib/modules/2.6.18-107redsky_chaos/extra/mptsas.ko
version:        4.18.20.04
license:        GPL
description:    Fusion MPT SAS Host driver
author:         LSI Corporation
srcversion:     BD0E830D9AB6B1C6F2748B6
alias:          pci:v00001000d00000062sv*sd*bc*sc*i*
alias:          pci:v00001000d00000058sv*sd*bc*sc*i*
alias:          pci:v00001000d00000056sv*sd*bc*sc*i*
alias:          pci:v00001000d00000054sv*sd*bc*sc*i*
alias:          pci:v00001000d00000050sv*sd*bc*sc*i*
depends:        mptscsih,scsi_mod,scsi_transport_sas,mptbase,mptbase,mptscsih
vermagic:       2.6.18-107redsky_chaos SMP mod_unload gcc-4.1
parm:           mpt_pt_clear: Clear persistency table: enable=1 (default=MPTSCSIH_PT_CLEAR=0) (int)
parm:           mpt_cmd_retry_count: Device discovery TUR command retry count: default=300 (int)
parm:           mpt_disable_hotplug_remove: Disable hotpug remove events: default=0 (int)
parm:           mpt_sdev_queue_depth: Max Device Queue Depth (default=64)
parm:           max_lun: max lun, default=16895 (int)

Version-Release number of selected component (if applicable):
Kernel 2.6.18-107.

How reproducible:
100%. Just reboot the node: modprobe, kjournald, and mount.lustre processes hang for 120 seconds or more, and we are notified of this via syslog.

Steps to Reproduce:
1. Reboot
2. Watch syslog for hung-process notices.

Actual results:
Hung processes called out in syslog as shown above.

Expected results:
No hung processes called out in syslog.
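As an aside (a generic triage sketch, not part of the original report): the warnings above follow a fixed pattern, so the flagged task names and PIDs are easy to pull out of syslog with a small filter. The function name hung_task_id is ours, purely for illustration.

```shell
# hung_task_id: extract "taskname pid" from a hung-task warning line.
hung_task_id() {
    sed -n 's/.*INFO: task \([^:]*\):\([0-9]*\) blocked.*/\1 \2/p'
}

# Example, using a line copied from the report above:
echo '2011-09-07 12:15:46 rdoss-scratch1 INFO: task kjournald:8265 blocked for more than 120 seconds. <kern.err>' | hung_task_id
# -> kjournald 8265
```

Piping a whole syslog file through the same filter gives a quick inventory of which tasks are being flagged and how often.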
This request was not resolved in time for the current release. Red Hat invites you to ask your support representative to propose this request, if still desired, for consideration in the next release of Red Hat Enterprise Linux.
(In reply to jbogden from comment #0)
> Description of problem:
> After upgrading from a 2.6.18-93.2-based kernel to one based on 2.6.18-107,
> we are getting a lot of syslog messages like:

In stock RHEL kernels, the "hung task" message was added in 2.6.18-177, not -107:

commit 46c0c4ef2c74bc861383fe359c74435737e320a2
Author: Don Zickus <dzickus>
Date:   Thu Dec 3 21:09:25 2009 -0500
tag: kernel-2.6.18-177.el5

commit 2ada6549f448462cc5657cbb3da62961999d464b
Author: Amerigo Wang <amwang>
Date:   Mon Aug 31 09:00:38 2009 -0400

    [sched] enable CONFIG_DETECT_HUNG_TASK support

commit d931cc4c8954661a830c086945d3e8583427d0b9
Author: Don Zickus <dzickus>
Date:   Tue Dec 1 21:35:07 2009 -0500
tag: kernel-2.6.18-176.el5

Looking at the traces:

> 2011-09-07 12:08:24 modprobe D ffff81038a881440 0 1969 1917 (NOTLB)
> 2011-09-07 12:08:24 ffff81037f085d58 0000000000000046 00000007fe933fff 0000000000000002
> 2011-09-07 12:08:24 000000047f085d78 0000000000000007 ffff81035e311830 ffff81038aa0d7b0
> 2011-09-07 12:08:24 0000004ffb133d98 00000000000055cb ffff81035e311a18 0000000400000001
> 2011-09-07 12:08:24 Call Trace:
> 2011-09-07 12:08:24 [<ffffffff80067b32>] __down+0xc5/0xd9
> 2011-09-07 12:08:24 [<ffffffff80093b5a>] default_wake_function+0x0/0xf
> 2011-09-07 12:08:24 [<ffffffff800677dd>] __down_failed+0x35/0x3a
> 2011-09-07 12:08:24 [<ffffffff801dc2c9>] __driver_attach+0xa6/0xb6
> 2011-09-07 12:08:24 [<ffffffff801dc223>] __driver_attach+0x0/0xb6

static int __driver_attach(struct device * dev, void * data)
{
        struct device_driver * drv = data;
        ...
        if (dev->parent)        /* Needed for USB */
                down(&dev->parent->sem);
        down(&dev->sem);
        if (!dev->driver)
                driver_probe_device(drv, dev);
        up(&dev->sem);
        if (dev->parent)
                up(&dev->parent->sem);
        ...
}

> 2011-09-07 12:08:24 [<ffffffff801db9e9>] bus_for_each_dev+0x49/0x7a
> 2011-09-07 12:08:24 [<ffffffff801dc051>] driver_attach+0x1c/0x1e
> 2011-09-07 12:08:24 [<ffffffff801db5ee>] bus_add_driver+0x78/0x116
> 2011-09-07 12:08:24 [<ffffffff801dc4fc>] driver_register+0x8f/0x93
> 2011-09-07 12:08:24 [<ffffffff8016dff8>] __pci_register_driver+0x5a/0xb7
> 2011-09-07 12:08:24 [<ffffffff883ec074>] :i7core_edac:i7core_init+0x74/0x98
> 2011-09-07 12:08:24 [<ffffffff800aeb64>] sys_init_module+0xb0/0x1f5
> 2011-09-07 12:08:24 [<ffffffff80060116>] system_call+0x7e/0x83

Synopsis: modprobe waits on a device's (or its parent's) semaphore.

2011-09-07 12:15:46 kjournald D ffff810001004440 0 8265 243 8264 (L-TLB)
2011-09-07 12:15:46 ffff8106672edd70 0000000000000046 0000000000000000 ffff81035ce7d0c0
2011-09-07 12:15:46 ffffffff800a822d 000000000000000a ffff81035ce7d0c0 ffffffff80324b60
2011-09-07 12:15:46 000000a21df87f6d 00000000000008f4 ffff81035ce7d2a8 0000000000000282
2011-09-07 12:15:46 Call Trace:
2011-09-07 12:15:46 [<ffffffff800a822d>] wake_bit_function+0x0/0x2a
2011-09-07 12:15:46 [<ffffffff8004bfd8>] __wait_on_buffer+0x20/0x22

void __wait_on_buffer(struct buffer_head * bh)
{
        wait_on_bit(&bh->b_state, BH_Lock, sync_buffer, TASK_UNINTERRUPTIBLE);
}

2011-09-07 12:15:46 [<ffffffff88a06a76>] :jbd:journal_commit_transaction+0x1ba/0x1354

fs/jbd/commit.c:

journal_commit_transaction()
{
        ...
        while (commit_transaction->t_updates) {
                DEFINE_WAIT(wait);

                prepare_to_wait(&journal->j_wait_updates, &wait,
                                TASK_UNINTERRUPTIBLE);
                if (commit_transaction->t_updates) {
                        spin_unlock(&commit_transaction->t_handle_lock);
                        spin_unlock(&journal->j_state_lock);
                        schedule();
                        spin_lock(&journal->j_state_lock);
                        spin_lock(&commit_transaction->t_handle_lock);
                }
                finish_wait(&journal->j_wait_updates, &wait);
        }
        ...

2011-09-07 12:15:46 [<ffffffff80066031>] thread_return+0x5e/0xf6
2011-09-07 12:15:46 [<ffffffff8003eee8>] lock_timer_base+0x25/0x4a
2011-09-07 12:15:46 [<ffffffff800a81f5>] autoremove_wake_function+0x0/0x38
2011-09-07 12:15:46 [<ffffffff88a0b668>] :jbd:kjournald+0x0/0x22f
2011-09-07 12:15:46 [<ffffffff88a0b72d>] :jbd:kjournald+0xc5/0x22f
2011-09-07 12:15:46 [<ffffffff800a81f5>] autoremove_wake_function+0x0/0x38
2011-09-07 12:15:46 [<ffffffff88a0b668>] :jbd:kjournald+0x0/0x22f
2011-09-07 12:15:46 [<ffffffff80033905>] kthread+0x100/0x136
2011-09-07 12:15:46 [<ffffffff80028196>] schedule_tail+0x44/0xbe
2011-09-07 12:15:46 [<ffffffff8006101d>] child_rip+0xa/0x11
2011-09-07 12:15:46 [<ffffffff80033805>] kthread+0x0/0x136
2011-09-07 12:15:46 [<ffffffff80061013>] child_rip+0x0/0x11

And kjournald waits on a locked buffer head.

These two hung processes don't look obviously related. Based on "we are getting *a lot* of syslog messages" from the reporter, and on the 7-minute difference between the timestamps of these two traces, I guess there may be other hung (D state) process(es) on this machine which aren't shown in this BZ.
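One generic way to check for such other D-state processes (a sketch, not something from this bug; the function name d_state_filter is ours) is to filter a ps listing on the state column; the wchan column then shows the kernel function each task is sleeping in:

```shell
# d_state_filter: keep only tasks in uninterruptible sleep, i.e. lines whose
# first column (the state from "ps -eo state,pid,wchan:30,comm") is "D".
d_state_filter() {
    awk '$1 == "D"'
}

# Usage on a live system:
#   ps -eo state,pid,wchan:30,comm | d_state_filter
```

Repeating that every few seconds shows whether the same tasks stay stuck on the same wait channel, which is exactly the situation the hung-task detector reports.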
Can you provide a complete dmesg from a machine which experienced such stalls? There should be more warnings that aren't in the partial logs you quote in your problem description.
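Once a full dmesg is saved, the hung-task reports can be pulled out with a simple filter (a generic sketch, not from the original comments; the function name, log path, and context size of 25 lines are just examples):

```shell
# extract_hung_tasks: print each hung-task warning in the given log file
# together with the 25 lines that follow it (enough for a typical backtrace).
extract_hung_tasks() {
    grep -A 25 'blocked for more than 120 seconds' "$1"
}

# Usage: extract_hung_tasks /var/log/dmesg
```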
If you can catch the machine in the state where it has just reported that some task is "blocked for more than 120 seconds", it would be very useful to collect kernel stack traces of all existing processes: this will let us find the *other process which holds the mutex*. Running "echo t > /proc/sysrq-trigger" does this kernel stack trace dumping. Note that unless you have only a few processes in the system, this generates quite a lot of kernel log (see an example from an old kernel bug at https://bugzilla.kernel.org/attachment.cgi?id=61222). Booting with a bigger log buffer (log_buf_len=256k) may be a good idea.
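The collection step above can be sketched as a small helper (our own wrapper around the standard /proc interfaces, shown only for illustration; the paths are parameterized solely so the function can be exercised without root):

```shell
# dump_all_stacks: enable the magic SysRq interface, then ask the kernel to
# dump the stacks of all tasks (SysRq-t) into the kernel ring buffer.
dump_all_stacks() {
    sysrq_ctl=${1:-/proc/sys/kernel/sysrq}
    trigger=${2:-/proc/sysrq-trigger}
    echo 1 > "$sysrq_ctl"   # 1 enables all SysRq functions
    echo t > "$trigger"     # 't' = dump kernel stacks of all tasks
}

# Usage (as root), saving the ring buffer before it wraps:
#   dump_all_stacks && dmesg > /tmp/sysrq-t.txt
```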
Denys,

Thank you for the additional commentary and kernel detail, which was interesting to see. At the time of this bug, we were basically unfamiliar with the new patterns of blocked tasks/processes we would see as a result of the nice "hung task" stack trace feature. As the feature spread across our fleet of HPC servers, we discovered more than a few situations where tasks can become temporarily "hung" for more than 120 seconds, most often in D state, and usually for explainable reasons. We just didn't realize at the time of this bug that it was not uncommon for us to see that. In this particular case the hung tasks boiled down to a subpar SAS storage subsystem and related drivers. Feel free to close this bug!
(In reply to jbogden from comment #6) > You can feel free to absolutely close this bug! Was glad to help however I could!