Description of problem:
After upgrading from a 2.6.18-93.2-based kernel to one based on 2.6.18-107, we are getting a lot of syslog messages like:

2011-09-07 12:15:46 rdoss-scratch1 INFO: task kjournald:8265 blocked for more than 120 seconds. <kern.err>
2011-09-07 12:15:46 rdoss-scratch1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. <kern.err>
2011-09-07 12:15:46 rdoss-scratch1 kjournald D ffff810001004440 0 8265 243 8264 (L-TLB) <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 ffff8106672edd70 0000000000000046 0000000000000000 ffff81035ce7d0c0 <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 ffffffff800a822d 000000000000000a ffff81035ce7d0c0 ffffffff80324b60 <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 000000a21df87f6d 00000000000008f4 ffff81035ce7d2a8 0000000000000282 <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 Call Trace: <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 [<ffffffff800a822d>] wake_bit_function+0x0/0x2a <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 [<ffffffff8004bfd8>] __wait_on_buffer+0x20/0x22 <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 [<ffffffff88a06a76>] :jbd:journal_commit_transaction+0x1ba/0x1354 <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 [<ffffffff80066031>] thread_return+0x5e/0xf6 <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 [<ffffffff8003eee8>] lock_timer_base+0x25/0x4a <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 [<ffffffff800a81f5>] autoremove_wake_function+0x0/0x38 <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 [<ffffffff88a0b668>] :jbd:kjournald+0x0/0x22f <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 [<ffffffff88a0b72d>] :jbd:kjournald+0xc5/0x22f <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 [<ffffffff800a81f5>] autoremove_wake_function+0x0/0x38 <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 [<ffffffff88a0b668>] :jbd:kjournald+0x0/0x22f <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 [<ffffffff80033905>] kthread+0x100/0x136 <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 [<ffffffff80028196>] schedule_tail+0x44/0xbe <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 [<ffffffff8006101d>] child_rip+0xa/0x11 <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 [<ffffffff80033805>] kthread+0x0/0x136 <kern.warning>
2011-09-07 12:15:46 rdoss-scratch1 [<ffffffff80061013>] child_rip+0x0/0x11 <kern.warning>

and:

2011-09-07 12:08:24 rdoss-scratch1 INFO: task modprobe:1969 blocked for more than 120 seconds. <kern.err>
2011-09-07 12:08:24 rdoss-scratch1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. <kern.err>
2011-09-07 12:08:24 rdoss-scratch1 modprobe D ffff81038a881440 0 1969 1917 (NOTLB) <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 ffff81037f085d58 0000000000000046 00000007fe933fff 0000000000000002 <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 000000047f085d78 0000000000000007 ffff81035e311830 ffff81038aa0d7b0 <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 0000004ffb133d98 00000000000055cb ffff81035e311a18 0000000400000001 <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 Call Trace: <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 [<ffffffff80067b32>] __down+0xc5/0xd9 <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 [<ffffffff80093b5a>] default_wake_function+0x0/0xf <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 [<ffffffff800677dd>] __down_failed+0x35/0x3a <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 [<ffffffff801dc2c9>] __driver_attach+0xa6/0xb6 <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 [<ffffffff801dc223>] __driver_attach+0x0/0xb6 <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 [<ffffffff801db9e9>] bus_for_each_dev+0x49/0x7a <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 [<ffffffff801dc051>] driver_attach+0x1c/0x1e <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 [<ffffffff801db5ee>] bus_add_driver+0x78/0x116 <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 [<ffffffff801dc4fc>] driver_register+0x8f/0x93 <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 [<ffffffff8016dff8>] __pci_register_driver+0x5a/0xb7 <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 [<ffffffff883ec074>] :i7core_edac:i7core_init+0x74/0x98 <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 [<ffffffff800aeb64>] sys_init_module+0xb0/0x1f5 <kern.warning>
2011-09-07 12:08:24 rdoss-scratch1 [<ffffffff80060116>] system_call+0x7e/0x83 <kern.warning>

We know that the newer kernel introduces a hung-process patch that produces these messages. This explains why we weren't seeing these messages before, but we're trying to understand whether it makes sense for processes like modprobe and kjournald to block for 120 seconds. This doesn't make sense to us.

The nodes that experience these issues the most are connected to SAS JBODs via an LSI HBA, which we're using with the following driver:

[root@rdadmin2 ~]# modinfo mptsas
filename:       /lib/modules/2.6.18-107redsky_chaos/extra/mptsas.ko
version:        4.18.20.04
license:        GPL
description:    Fusion MPT SAS Host driver
author:         LSI Corporation
srcversion:     BD0E830D9AB6B1C6F2748B6
alias:          pci:v00001000d00000062sv*sd*bc*sc*i*
alias:          pci:v00001000d00000058sv*sd*bc*sc*i*
alias:          pci:v00001000d00000056sv*sd*bc*sc*i*
alias:          pci:v00001000d00000054sv*sd*bc*sc*i*
alias:          pci:v00001000d00000050sv*sd*bc*sc*i*
depends:        mptscsih,scsi_mod,scsi_transport_sas,mptbase,mptbase,mptscsih
vermagic:       2.6.18-107redsky_chaos SMP mod_unload gcc-4.1
parm:           mpt_pt_clear: Clear persistency table: enable=1 (default=MPTSCSIH_PT_CLEAR=0) (int)
parm:           mpt_cmd_retry_count: Device discovery TUR command retry count: default=300 (int)
parm:           mpt_disable_hotplug_remove: Disable hotpug remove events: default=0 (int)
parm:           mpt_sdev_queue_depth: Max Device Queue Depth (default=64)
parm:           max_lun: max lun, default=16895 (int)

Version-Release number of selected component (if applicable):
Kernel 2.6.18-107.

How reproducible:
100%. Just reboot the node: modprobe, kjournald, and mount.lustre processes hang for 120 seconds or more, and we are notified of this via syslog.

Steps to Reproduce:
1. Reboot
2. Watch syslog for hung-process notices.

Actual results:
Hung processes called out in syslog as shown above.

Expected results:
No hung processes called out in syslog.
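As an aside (a generic triage sketch, not part of the original report): the warnings above follow a fixed pattern, so the flagged task names and PIDs are easy to pull out of syslog with a small filter. The function name hung_task_id is ours, purely for illustration.

```shell
# hung_task_id: extract "taskname pid" from a hung-task warning line.
hung_task_id() {
    sed -n 's/.*INFO: task \([^:]*\):\([0-9]*\) blocked.*/\1 \2/p'
}

# Example, using a line copied from the report above:
echo '2011-09-07 12:15:46 rdoss-scratch1 INFO: task kjournald:8265 blocked for more than 120 seconds. <kern.err>' | hung_task_id
# -> kjournald 8265
```

Piping a whole syslog file through the same filter gives a quick inventory of which tasks are being flagged and how often.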
This request was not resolved in time for the current release. Red Hat invites you to ask your support representative to propose this request, if still desired, for consideration in the next release of Red Hat Enterprise Linux.
(In reply to jbogden from comment #0)
> Description of problem:
> After upgrading from a 2.6.18-93.2-based kernel to one based on 2.6.18-107,
> we are getting a lot of syslog messages like:

In stock RHEL kernels, the "hung task" message was added in 2.6.18-177, not -107:

commit 46c0c4ef2c74bc861383fe359c74435737e320a2
Author: Don Zickus <dzickus>
Date:   Thu Dec 3 21:09:25 2009 -0500
tag: kernel-2.6.18-177.el5

commit 2ada6549f448462cc5657cbb3da62961999d464b
Author: Amerigo Wang <amwang>
Date:   Mon Aug 31 09:00:38 2009 -0400

    [sched] enable CONFIG_DETECT_HUNG_TASK support

commit d931cc4c8954661a830c086945d3e8583427d0b9
Author: Don Zickus <dzickus>
Date:   Tue Dec 1 21:35:07 2009 -0500
tag: kernel-2.6.18-176.el5

Looking at the traces:

> 2011-09-07 12:08:24 modprobe D ffff81038a881440 0 1969 1917 (NOTLB)
> 2011-09-07 12:08:24 ffff81037f085d58 0000000000000046 00000007fe933fff 0000000000000002
> 2011-09-07 12:08:24 000000047f085d78 0000000000000007 ffff81035e311830 ffff81038aa0d7b0
> 2011-09-07 12:08:24 0000004ffb133d98 00000000000055cb ffff81035e311a18 0000000400000001
> 2011-09-07 12:08:24 Call Trace:
> 2011-09-07 12:08:24 [<ffffffff80067b32>] __down+0xc5/0xd9
> 2011-09-07 12:08:24 [<ffffffff80093b5a>] default_wake_function+0x0/0xf
> 2011-09-07 12:08:24 [<ffffffff800677dd>] __down_failed+0x35/0x3a
> 2011-09-07 12:08:24 [<ffffffff801dc2c9>] __driver_attach+0xa6/0xb6
> 2011-09-07 12:08:24 [<ffffffff801dc223>] __driver_attach+0x0/0xb6

static int __driver_attach(struct device * dev, void * data)
{
        struct device_driver * drv = data;
        ...
        if (dev->parent)        /* Needed for USB */
                down(&dev->parent->sem);
        down(&dev->sem);
        if (!dev->driver)
                driver_probe_device(drv, dev);
        up(&dev->sem);
        if (dev->parent)
                up(&dev->parent->sem);
        ...
}

> 2011-09-07 12:08:24 [<ffffffff801db9e9>] bus_for_each_dev+0x49/0x7a
> 2011-09-07 12:08:24 [<ffffffff801dc051>] driver_attach+0x1c/0x1e
> 2011-09-07 12:08:24 [<ffffffff801db5ee>] bus_add_driver+0x78/0x116
> 2011-09-07 12:08:24 [<ffffffff801dc4fc>] driver_register+0x8f/0x93
> 2011-09-07 12:08:24 [<ffffffff8016dff8>] __pci_register_driver+0x5a/0xb7
> 2011-09-07 12:08:24 [<ffffffff883ec074>] :i7core_edac:i7core_init+0x74/0x98
> 2011-09-07 12:08:24 [<ffffffff800aeb64>] sys_init_module+0xb0/0x1f5
> 2011-09-07 12:08:24 [<ffffffff80060116>] system_call+0x7e/0x83

Synopsis: modprobe waits on a device's (or its parent's) semaphore.

2011-09-07 12:15:46 kjournald D ffff810001004440 0 8265 243 8264 (L-TLB)
2011-09-07 12:15:46 ffff8106672edd70 0000000000000046 0000000000000000 ffff81035ce7d0c0
2011-09-07 12:15:46 ffffffff800a822d 000000000000000a ffff81035ce7d0c0 ffffffff80324b60
2011-09-07 12:15:46 000000a21df87f6d 00000000000008f4 ffff81035ce7d2a8 0000000000000282
2011-09-07 12:15:46 Call Trace:
2011-09-07 12:15:46 [<ffffffff800a822d>] wake_bit_function+0x0/0x2a
2011-09-07 12:15:46 [<ffffffff8004bfd8>] __wait_on_buffer+0x20/0x22

void __wait_on_buffer(struct buffer_head * bh)
{
        wait_on_bit(&bh->b_state, BH_Lock, sync_buffer, TASK_UNINTERRUPTIBLE);
}

2011-09-07 12:15:46 [<ffffffff88a06a76>] :jbd:journal_commit_transaction+0x1ba/0x1354

fs/jbd/commit.c:

journal_commit_transaction()
{
        ...
        while (commit_transaction->t_updates) {
                DEFINE_WAIT(wait);

                prepare_to_wait(&journal->j_wait_updates, &wait,
                                TASK_UNINTERRUPTIBLE);
                if (commit_transaction->t_updates) {
                        spin_unlock(&commit_transaction->t_handle_lock);
                        spin_unlock(&journal->j_state_lock);
                        schedule();
                        spin_lock(&journal->j_state_lock);
                        spin_lock(&commit_transaction->t_handle_lock);
                }
                finish_wait(&journal->j_wait_updates, &wait);
        }
        ...

2011-09-07 12:15:46 [<ffffffff80066031>] thread_return+0x5e/0xf6
2011-09-07 12:15:46 [<ffffffff8003eee8>] lock_timer_base+0x25/0x4a
2011-09-07 12:15:46 [<ffffffff800a81f5>] autoremove_wake_function+0x0/0x38
2011-09-07 12:15:46 [<ffffffff88a0b668>] :jbd:kjournald+0x0/0x22f
2011-09-07 12:15:46 [<ffffffff88a0b72d>] :jbd:kjournald+0xc5/0x22f
2011-09-07 12:15:46 [<ffffffff800a81f5>] autoremove_wake_function+0x0/0x38
2011-09-07 12:15:46 [<ffffffff88a0b668>] :jbd:kjournald+0x0/0x22f
2011-09-07 12:15:46 [<ffffffff80033905>] kthread+0x100/0x136
2011-09-07 12:15:46 [<ffffffff80028196>] schedule_tail+0x44/0xbe
2011-09-07 12:15:46 [<ffffffff8006101d>] child_rip+0xa/0x11
2011-09-07 12:15:46 [<ffffffff80033805>] kthread+0x0/0x136
2011-09-07 12:15:46 [<ffffffff80061013>] child_rip+0x0/0x11

And kjournald waits on a locked buffer head.

These two hung processes don't look obviously related. Based on "we are getting *a lot* of syslog messages" from the reporter, and on the 7-minute difference between the timestamps of these two traces, I guess there may be other hung (D state) process(es) on this machine which aren't shown in this BZ.
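One generic way to check for such other D-state processes (a sketch, not something from this bug; the function name d_state_filter is ours) is to filter a ps listing on the state column; the wchan column then shows the kernel function each task is sleeping in:

```shell
# d_state_filter: keep only tasks in uninterruptible sleep, i.e. lines whose
# first column (the state from "ps -eo state,pid,wchan:30,comm") is "D".
d_state_filter() {
    awk '$1 == "D"'
}

# Usage on a live system:
#   ps -eo state,pid,wchan:30,comm | d_state_filter
```

Repeating that every few seconds shows whether the same tasks stay stuck on the same wait channel, which is exactly the situation the hung-task detector reports.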
Can you provide a complete dmesg from a machine which experienced such stalls? There should be more warnings that aren't in the partial logs you quote in your problem description.
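Once a full dmesg is saved, the hung-task reports can be pulled out with a simple filter (a generic sketch, not from the original comments; the function name, log path, and context size of 25 lines are just examples):

```shell
# extract_hung_tasks: print each hung-task warning in the given log file
# together with the 25 lines that follow it (enough for a typical backtrace).
extract_hung_tasks() {
    grep -A 25 'blocked for more than 120 seconds' "$1"
}

# Usage: extract_hung_tasks /var/log/dmesg
```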
If you can catch the machine in the state where it has just reported that some task is "blocked for more than 120 seconds", it would be very useful to collect kernel stack traces of all existing processes: this will let us find the *other process which holds the mutex*. Running "echo t > /proc/sysrq-trigger" does this kernel stack trace dumping. Note that unless you have only a few processes in the system, this generates quite a lot of kernel log (see an example from an old kernel bug at https://bugzilla.kernel.org/attachment.cgi?id=61222). Booting with a bigger log buffer (log_buf_len=256k) may be a good idea.
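The collection step above can be sketched as a small helper (our own wrapper around the standard /proc interfaces, shown only for illustration; the paths are parameterized solely so the function can be exercised without root):

```shell
# dump_all_stacks: enable the magic SysRq interface, then ask the kernel to
# dump the stacks of all tasks (SysRq-t) into the kernel ring buffer.
dump_all_stacks() {
    sysrq_ctl=${1:-/proc/sys/kernel/sysrq}
    trigger=${2:-/proc/sysrq-trigger}
    echo 1 > "$sysrq_ctl"   # 1 enables all SysRq functions
    echo t > "$trigger"     # 't' = dump kernel stacks of all tasks
}

# Usage (as root), saving the ring buffer before it wraps:
#   dump_all_stacks && dmesg > /tmp/sysrq-t.txt
```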
Denys,

Thank you for the additional commentary and kernel detail, which was interesting to see. At the time of this bug, we were basically unfamiliar with the new patterns of blocked tasks/processes we would see as a result of the nice "hung task" stack trace feature. As the feature spread across our fleet of HPC servers, we discovered more than a few situations where tasks can become temporarily "hung" for more than 120 seconds, most often in D state, and usually for explainable reasons. We just didn't realize at the time of this bug that it was not uncommon for us to see that. In this particular case the hung tasks boiled down to a subpar SAS storage subsystem and related drivers. Feel free to close this bug!
(In reply to jbogden from comment #6) > You can feel free to absolutely close this bug! Was glad to help however I could!