Bug 790862 - kernel 2.6.32-220: processes block in rwsem_down_failed_common
Summary: kernel 2.6.32-220: processes block in rwsem_down_failed_common
Keywords:
Status: CLOSED DUPLICATE of bug 669418
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.2
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Red Hat Kernel Manager
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-02-15 15:23 UTC by lit-cs-sysadmin
Modified: 2016-08-26 01:48 UTC
CC List: 1 user

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-08-26 01:48:18 UTC
Target Upstream Version:
Embargoed:


Attachments
earlgrey-1 magic sysrq blocked states 2011-12-09 (66.71 KB, application/octet-stream)
2012-02-15 15:23 UTC, lit-cs-sysadmin


Links
System: Red Hat Bugzilla
ID: 669418
Private: 0
Priority: medium
Status: CLOSED
Summary: khugepaged blocking on page locks
Last Updated: 2021-02-22 00:41:40 UTC

Description lit-cs-sysadmin 2012-02-15 15:23:50 UTC
Created attachment 562245
earlgrey-1 magic sysrq blocked states 2011-12-09

Description of problem:

We have seen a couple of strange incidents on two different machines running the following kernels:

2.6.32-220.4.1.el6.x86_64
2.6.32-220.el6.x86_64

The symptom is that many processes run normally, but processes that attempt to get process info (ps, top, catting process info from /proc) just hang.

On the system running 2.6.32-220.4.1 there were several messages like this in the system log:

Feb  8 11:35:00 moxie-2 kernel: INFO: task khugepaged:181 blocked for more than 120 seconds.
Feb  8 11:35:00 moxie-2 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb  8 11:35:00 moxie-2 kernel: khugepaged    D 0000000000000000     0   181      2 0x00000000
Feb  8 11:35:00 moxie-2 kernel: ffff88062591dc90 0000000000000046 ffff88062591dc58 ffff88062591dc54
Feb  8 11:35:00 moxie-2 kernel: 0000000000015f80 ffff88062fc28400 ffff88033ac95f80 0000000000000400
Feb  8 11:35:00 moxie-2 kernel: ffff88062591bb38 ffff88062591dfd8 000000000000f4e8 ffff88062591bb38
Feb  8 11:35:00 moxie-2 kernel: Call Trace:
Feb  8 11:35:00 moxie-2 kernel: [<ffffffff814eef25>] rwsem_down_failed_common+0x95/0x1d0
Feb  8 11:35:00 moxie-2 kernel: [<ffffffff814ef083>] rwsem_down_write_failed+0x23/0x30
Feb  8 11:35:00 moxie-2 kernel: [<ffffffff81276d83>] call_rwsem_down_write_failed+0x13/0x20
Feb  8 11:35:00 moxie-2 kernel: [<ffffffff814ee582>] ? down_write+0x32/0x40
Feb  8 11:35:00 moxie-2 kernel: [<ffffffff8116f140>] khugepaged+0x790/0x12c0
Feb  8 11:35:00 moxie-2 kernel: [<ffffffff81090a90>] ? autoremove_wake_function+0x0/0x40
Feb  8 11:35:00 moxie-2 kernel: [<ffffffff8116e9b0>] ? khugepaged+0x0/0x12c0
Feb  8 11:35:00 moxie-2 kernel: [<ffffffff81090726>] kthread+0x96/0xa0
Feb  8 11:35:00 moxie-2 kernel: [<ffffffff8100c14a>] child_rip+0xa/0x20
Feb  8 11:35:00 moxie-2 kernel: [<ffffffff81090690>] ? kthread+0x0/0xa0
Feb  8 11:35:00 moxie-2 kernel: [<ffffffff8100c140>] ? child_rip+0x0/0x20
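
For anyone comparing several reports like the one above, the hung-task messages can be summarized mechanically. The following is a minimal Python sketch and is not part of the original report: the function name `parse_hung_tasks`, the regular expressions, and the shortened `SAMPLE` input are assumptions based only on the line format shown in the log excerpt.

```python
import re

# Matches: "... kernel: INFO: task khugepaged:181 blocked for more than 120 seconds."
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)
# Matches call-trace frames such as "[<ffffffff814ee582>] ? down_write+0x32/0x40".
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\] (?P<unreliable>\? )?(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

# Shortened copy of the log lines quoted in the report, used as example input.
SAMPLE = """\
Feb  8 11:35:00 moxie-2 kernel: INFO: task khugepaged:181 blocked for more than 120 seconds.
Feb  8 11:35:00 moxie-2 kernel: Call Trace:
Feb  8 11:35:00 moxie-2 kernel: [<ffffffff814eef25>] rwsem_down_failed_common+0x95/0x1d0
Feb  8 11:35:00 moxie-2 kernel: [<ffffffff814ee582>] ? down_write+0x32/0x40
Feb  8 11:35:00 moxie-2 kernel: [<ffffffff8116f140>] khugepaged+0x790/0x12c0
""".splitlines()


def parse_hung_tasks(lines):
    """Return one dict per hung-task report: comm, pid, secs, and frames."""
    reports = []
    current = None
    for line in lines:
        header = HUNG_RE.search(line)
        if header:
            current = {
                "comm": header.group("comm"),
                "pid": int(header.group("pid")),
                "secs": int(header.group("secs")),
                "frames": [],
            }
            reports.append(current)
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None:
            # Each frame is (symbol, reliable); frames printed with a leading
            # "?" are recorded with reliable=False.
            current["frames"].append(
                (frame.group("sym"), frame.group("unreliable") is None)
            )
    return reports


if __name__ == "__main__":
    for report in parse_hung_tasks(SAMPLE):
        symbols = [sym for sym, reliable in report["frames"] if reliable]
        print(report["comm"], report["pid"], " <- ".join(symbols))
```

Grouping reports by their first reliable frame (here `rwsem_down_failed_common`) is one way to check whether hangs on different machines share the same wait point.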

I also captured some info using magic sysrq on the other system (attached). A number of tasks seem to be blocked in rwsem_down_failed_common.
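
As a complement to the sysrq capture, blocked tasks can also be listed from user space by looking for state `D` (uninterruptible sleep), the same state letter shown for khugepaged in the log. The following is a minimal Python sketch, not something used in this report: the helper names `parse_stat` and `blocked_tasks` are hypothetical, and it only inspects `/proc/<pid>/stat` for each process, not individual threads. Whether these reads would themselves hang on an affected system is not established here.

```python
import os


def parse_stat(text):
    """Split a /proc/<pid>/stat line into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ")",
    # so split on the first "(" and the last ")".
    left = text.index("(")
    right = text.rindex(")")
    pid = int(text[:left])
    comm = text[left + 1:right]
    state = text[right + 1:].split()[0]
    return pid, comm, state


def blocked_tasks(proc_root="/proc"):
    """Yield (pid, comm) for every process whose state is D."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "sys"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited, or the line was not in the expected form
        if state == "D":
            yield pid, comm


if __name__ == "__main__":
    for pid, comm in blocked_tasks():
        print(pid, comm)
```

The `proc_root` parameter exists only so the function can be pointed at a saved copy of the directory layout for testing.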

From my investigation so far I wonder if it might be related to either or both of these issues:

https://bugzilla.redhat.com/show_bug.cgi?id=669418
https://lkml.org/lkml/2011/6/14/163

Both of these systems are Dell PowerEdge R610s: one (moxie-2) has dual Xeon X5647s with 24GB of RAM, and the other (earlgrey-1) has dual Xeon X5690s with 96GB of RAM.

Version-Release number of selected component (if applicable):

2.6.32-220.4.1.el6.x86_64
2.6.32-220.el6.x86_64

Comment 2 RHEL Program Management 2012-05-03 05:26:18 UTC
Since RHEL 6.3 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 3 Tore H. Larsen 2013-01-22 17:25:23 UTC
Kernel 2.6.32-279.9.1.el6.x86_64. Seeing this a lot on FhGFS over IB, and on NFS version 3 over TCP over DIS (PCIe network). Deadlock in rwsem.c?

Comment 4 Linda Wang 2016-08-26 01:48:18 UTC

*** This bug has been marked as a duplicate of bug 669418 ***

