Bug 677064 - INFO: task httpd:9144 blocked for more than 120 seconds. [NEEDINFO]
Status: CLOSED WONTFIX
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.5
Hardware: i686 Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Red Hat Kernel Manager
QA Contact: Red Hat Kernel QE team
Reported: 2011-02-12 20:19 EST by Sameer Syed
Modified: 2014-06-02 09:18 EDT

Doc Type: Bug Fix
Last Closed: 2014-06-02 09:18:26 EDT
mikolaj: needinfo? (sameer.syed)
pm-rhel: needinfo? (sameer.syed)


Attachments: None
Description Sameer Syed 2011-02-12 20:19:37 EST
Description of problem: I am using RHEL 5.5 with OCFS2 as a cluster file system between 2 nodes. I am not sure what triggers this, but the server froze and stopped responding to all HTTP requests, and I cannot log in either from a remote session (ssh) or through the console.


Version-Release number of selected component (if applicable): Kernel version is 2.6.18-194.3.1.el5PAE, httpd 2.2.3-43, ocfs2-2.6.18-194.3.1.el5PAE-1.4.7-1


How reproducible: Not sure how this can be reproduced

Additional info: This is what I see in the message log:

kernel: INFO: task httpd:9144 blocked for more than 120 seconds.
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kernel: httpd         D 00225265  1540  9144   6915          9157 15002 (NOTLB)
kernel:        e3cc9e14 00200082 56e1bfcb 00225265 c54cc9f0 e157c080 c05eecfb 0000000a
kernel:        e1813550 56e6771a 00225265 0004b74f 00000001 e181365c c4619bc4 f7853040
kernel:        00000000 00000000 dfe965c0 00000000 c061cf87 dfe965c0 c05b8528 ffffffff
kernel: Call Trace:
kernel:  [<c05eecfb>] __tcp_push_pending_frames+0x474/0x752
kernel:  [<c061cf87>] _spin_lock_bh+0x8/0x18
kernel:  [<c05b8528>] release_sock+0xc/0x91
kernel:  [<c061c265>] __mutex_lock_slowpath+0x4d/0x7c
kernel:  [<c061c2a3>] .text.lock.mutex+0xf/0x14
kernel:  [<f90a6ab3>] ocfs2_file_aio_write+0x1a4/0xb48 [ocfs2]
kernel:  [<c05b627b>] kernel_sendpage+0x35/0x3c
kernel:  [<c05b62c6>] sock_sendpage+0x44/0x81
kernel:  [<c0456005>] file_send_actor+0x32/0x4b
kernel:  [<c04572a7>] do_generic_mapping_read+0x373/0x37b
kernel:  [<c04744ca>] do_sync_write+0xb6/0xf1
kernel:  [<c05b863b>] lock_sock+0x8e/0x96
kernel:  [<c04363ff>] autoremove_wake_function+0x0/0x2d
kernel:  [<c05b6730>] sys_setsockopt+0x76/0x95
kernel:  [<c0474414>] do_sync_write+0x0/0xf1
kernel:  [<c0474d53>] vfs_write+0xa1/0x143
kernel:  [<c0475345>] sys_write+0x3c/0x63
kernel:  [<c0404ead>] sysenter_past_esp+0x56/0x79
kernel:  =======================
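
A report like the one above can be summarised mechanically. The following is a minimal sketch, assuming the log lines have the same shape as this excerpt; the function name `parse_hung_tasks` and both regular expressions are illustrative and not part of any kernel or Red Hat tool.

```python
import re

# Header printed by the hung-task detector, e.g.
# "kernel: INFO: task httpd:9144 blocked for more than 120 seconds."
HUNG_RE = re.compile(r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds")

# One call-trace frame, e.g.
# "kernel:  [<f90a6ab3>] ocfs2_file_aio_write+0x1a4/0xb48 [ocfs2]"
# The trailing "[module]" part is optional.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\S+?)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[([^\]]+)\])?"
)


def parse_hung_tasks(lines):
    """Group hung-task headers with the call-trace frames that follow them."""
    reports = []
    current = None
    for line in lines:
        header = HUNG_RE.search(line)
        if header:
            current = {
                "task": header.group(1),
                "pid": int(header.group(2)),
                "timeout": int(header.group(3)),
                "frames": [],  # list of (symbol, module or None)
            }
            reports.append(current)
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None:
            current["frames"].append((frame.group(1), frame.group(2)))
    return reports


if __name__ == "__main__":
    # Assumed log location; adjust for the system being inspected.
    with open("/var/log/messages") as log:
        for report in parse_hung_tasks(log):
            modules = sorted({m for _, m in report["frames"] if m})
            print(report["task"], report["pid"], report["timeout"], modules)
```

Run against the trace in this description, the sketch would report task httpd, PID 9144, a 120 second timeout, and ocfs2 as the only loadable module named in the frames. The trace may contain stale stack entries, so the frame list is a hint about where the task is waiting rather than an exact call chain.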
Comment 1 Mikolaj Kucharski 2011-10-18 11:53:03 EDT
Do you see the message below in dmesg(8)?

megasas_register_aen[0]: already registered
megasas_register_aen[0]: already registered
megasas_register_aen[0]: already registered
megasas_register_aen[0]: already registered
megasas_register_aen[0]: already registered
megasas_register_aen[0]: already registered
megasas_register_aen[0]: already registered
megasas_register_aen[0]: already registered
megasas_register_aen[0]: already registered
megasas_register_aen[0]: already registered
megasas_register_aen[0]: already registered
megasas_register_aen[0]: already registered
megasas_register_aen[0]: already registered
megasas_register_aen[0]: already registered
megasas_register_aen[0]: already registered
megasas_register_aen[0]: already registered
megasas_register_aen[0]: already registered
megasas_register_aen[0]: already registered
megasas_register_aen[0]: already registered
megasas_register_aen[0]: already registered

By any chance, is this happening on a server with a PERC5 (or PERC6) controller?
Comment 2 erik.j.clark 2011-12-14 10:14:15 EST
I am not the OP, but _YES_!!! My box has been rebooting spontaneously (usually some time in the morning), and my messages log is full of this.

Dec 13 21:31:24 thunderbolt kernel: megasas_register_aen[0]: already registered
Dec 13 22:16:23 thunderbolt last message repeated 3 times
Dec 13 23:01:25 thunderbolt last message repeated 3 times
Dec 13 23:46:30 thunderbolt last message repeated 3 times
Dec 14 00:31:32 thunderbolt last message repeated 3 times
Dec 14 01:16:35 thunderbolt last message repeated 3 times

Linux thunderbolt 2.6.18-274.12.1.el5 #1 SMP Tue Nov 8 21:37:35 EST 2011 x86_64 x86_64 x86_64 GNU/Linux

02:0e.0 RAID bus controller: Dell PowerEdge Expandable RAID controller 5

Help please! This was the only relevant thread I could find on Google! I am not sure whether this is related to my spontaneous reboots, and won't know until I sort this out. Thank you very much!
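
To judge how often the controller message really occurs, the "last message repeated N times" lines in the syslog excerpt above have to be expanded. This is a minimal sketch, assuming each repeat line refers to the most recent ordinary line in the same input (which may not hold if the log was filtered before counting); `count_occurrences` is a hypothetical helper name.

```python
import re

# syslogd collapses duplicates into "last message repeated N times".
REPEAT_RE = re.compile(r"last message repeated (\d+) times?")


def count_occurrences(lines, needle):
    """Count lines containing needle, including collapsed repeats."""
    total = 0
    last_matched = False  # did the most recent ordinary line contain needle?
    for line in lines:
        repeat = REPEAT_RE.search(line)
        if repeat:
            # A repeat line stands for N more copies of the previous message.
            if last_matched:
                total += int(repeat.group(1))
            continue
        last_matched = needle in line
        if last_matched:
            total += 1
    return total
```

For the six log lines quoted in this comment, one explicit message plus five repeat lines of 3 each, the count comes to 16 occurrences of megasas_register_aen between 21:31 and 01:16.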
Comment 3 Mikolaj Kucharski 2011-12-14 13:52:27 EST
Can you install the megacli package from lsi.com?

Search for ``4.00.11_Linux_MegaCLI.zip'' on the Internet, as the LSI homepage is really not customer friendly. Then please give me the output of:


megacli -AdpAllInfo -aALL -NoLog | \
	grep -e '^[A-Z]' | \
	sed -n -e '/^Adapter/,/^Ctrl/p' | \
	tr -cd '[\040-\176\t\n]'


This is what I have:

# uname -rm
2.6.18-274.12.1.el5PAE i686

# megacli -AdpAllInfo -aALL -NoLog | ...
Adapter #0
Product Name    : PERC 5/i Integrated
Serial No       : 12345
FW Package Build: 5.2.2-0072
Mfg. Date       : 00/00/00
Rework Date     : 00/00/00
Revision No     : @A
Battery FRU     : N/A
Boot Block Version : R.2.3.12
BIOS Version       : MT28-9
MPT Version        : MPTFW-00.10.62.00-IT
FW Version         : 1.03.50-0461
WebBIOS Version    : 1.03-04
Ctrl-R Version     : 1.04-019A


I've seen those locks and hangups in the past on my systems, but I think they went away after I upgraded the PERC5/i firmware to 5.2.2-0072.

The firmware was downloaded from the Dell.com website for the PowerEdge 1950.
Comment 4 erik.j.clark 2011-12-14 14:08:36 EST
Here is my output from megacli:


Adapter #0
Product Name    : PERC 5/i Integrated
Serial No       : 12345
FW Package Build: 5.1.1-0040
Mfg. Date       : 00/00/00
Rework Date     : 00/00/00
Revision No     : @A
Battery FRU     : N/A
Boot Block Version : R.2.3.12
BIOS Version       : MT28
MPT Version        : MPTFW-00.10.47.00-IT
FW Version         : 1.03.10-0216
WebBIOS Version    : 1.03-04
Ctrl-R Version     : 1.04-017A

I am going to find the perc5 firmware update, in case you think this is the best way to go. Thanks for the quick response!
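
The two adapter listings can also be compared programmatically instead of by eye. This is a minimal sketch, assuming the filtered megacli output keeps the "Key : Value" layout shown above and that package build strings such as 5.1.1-0040 order correctly when compared field by field as integers (an assumption about the numbering scheme, not something the vendor documents here). Both function names are illustrative.

```python
import re


def parse_adapter_info(text):
    """Turn 'Key : Value' lines of the filtered megacli output into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # lines without a colon, such as "Adapter #0", are skipped
            info[key.strip()] = value.strip()
    return info


def build_tuple(build):
    """Split a package build such as '5.2.2-0072' into comparable integers."""
    return tuple(int(part) for part in re.split(r"[.-]", build))


def is_older(build, reference):
    """Return True if build sorts before reference (assumed ordering)."""
    return build_tuple(build) < build_tuple(reference)
```

With the values from this thread, `is_older("5.1.1-0040", "5.2.2-0072")` evaluates to True, which matches the suggestion that the second machine is running an older firmware package.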
Comment 5 erik.j.clark 2011-12-14 14:41:53 EST
I went ahead and updated to the version suggested, and got the below. Could it be something else?

Dec 14 14:34:55 thunderbolt kernel: megasas_register_aen[0]: already registered
Dec 14 14:35:01 thunderbolt last message repeated 4 times

Adapter #0
Product Name    : PERC 5/i Integrated
Serial No       : 12345
FW Package Build: 5.2.2-0072
Mfg. Date       : 00/00/00
Rework Date     : 00/00/00
Revision No     : @A
Battery FRU     : N/A
Boot Block Version : R.2.3.12
BIOS Version       : MT28-9
MPT Version        : MPTFW-00.10.62.00-IT
FW Version         : 1.03.50-0461
WebBIOS Version    : 1.03-04
Ctrl-R Version     : 1.04-019A

Linux thunderbolt 2.6.18-274.12.1.el5 #1 SMP Tue Nov 8 21:37:35 EST 2011 x86_64 x86_64 x86_64 GNU/Linux
Comment 6 Mikolaj Kucharski 2011-12-14 15:03:47 EST
My servers still have this in dmesg(8):

megasas_register_aen[0]: already registered

but they don't freeze anymore. However, the issue happened only a few times, so I don't know how to reproduce it.

I think you will still see the above messages in dmesg(8), but (hopefully) your system will be stable now. I'm really curious, so if you don't mind, please let me know whether the firmware update helped you. Thanks.
Comment 7 RHEL Product and Program Management 2014-03-07 07:48:33 EST
This bug/component is not included in scope for RHEL-5.11.0 which is the last RHEL5 minor release. This Bugzilla will soon be CLOSED as WONTFIX (at the end of RHEL5.11 development phase (Apr 22, 2014)). Please contact your account manager or support representative in case you need to escalate this bug.
Comment 8 RHEL Product and Program Management 2014-06-02 09:18:26 EDT
Thank you for submitting this request for inclusion in Red Hat Enterprise Linux 5. We've carefully evaluated the request, but are unable to include it in RHEL5 stream. If the issue is critical for your business, please provide additional business justification through the appropriate support channels (https://access.redhat.com/site/support).
