Bug 669418 - khugepaged blocking on page locks
Summary: khugepaged blocking on page locks
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.0
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Larry Woodman
QA Contact: Chao Ye
URL:
Whiteboard:
Duplicates: 790862 (view as bug list)
Depends On:
Blocks:
 
Reported: 2011-01-13 16:13 UTC by Jeremy Agee
Modified: 2018-11-14 15:53 UTC
CC List: 5 users

Fixed In Version: kernel-2.6.32-104.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-05-23 20:37:35 UTC
Target Upstream Version:
Embargoed:


Attachments: none


Links
System ID:    Red Hat Product Errata RHSA-2011:0542
Private:      0
Priority:     normal
Status:       SHIPPED_LIVE
Summary:      Important: Red Hat Enterprise Linux 6.1 kernel security, bug fix and enhancement update
Last Updated: 2011-05-19 11:58:07 UTC

Internal Links: 790862

Description Jeremy Agee 2011-01-13 16:13:18 UTC
khugepaged blocking on page locks


vmcore info:

      KERNEL: /cores/20110110101743/work/vmlinux
    DUMPFILE: /cores/20110110101743/work/vmcore_debug_kernel_Iointerceptor_crash  [PARTIAL DUMP]
        CPUS: 24
        DATE: Thu Jan  6 22:13:43 2011
      UPTIME: 00:59:10
LOAD AVERAGE: 13.13, 13.58, 16.83
       TASKS: 1071
    NODENAME: node3
     RELEASE: 2.6.32-71.7.1.el6.x86_64.debug
     VERSION: #1 SMP Wed Oct 27 03:01:55 EDT 2010
     MACHINE: x86_64  (2800 Mhz)
      MEMORY: 48 GB
       PANIC: "Oops: 0002 [#1] SMP " (check log for details)

Current analysis by Lachlan McIlroy:

The khugepaged thread is the one causing all the problems.  In the following stack trace khugepaged has the mmap_sem semaphore locked and is currently trying to allocate and compact pages.  It has found a page that is currently locked due to I/O and khugepaged is waiting for the I/O to complete and the page to be unlocked.  Why has it been blocked for 2 minutes?  There could be severe I/O congestion holding up the page or it could be locked by another process.

Note that a hugepage is not just one page but 512 pages. Waiting for that many pages to be unlocked means the mmap_sem can be held for a long time, which holds up other processes, and then those processes report being blocked too.

INFO: task khugepaged:212 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
khugepaged    D 0000000000000000  4296   212      2 0x00000000
 ffff880c0386f8e0 0000000000000046 0000000000000000 0000000000000007
 0000000000000006 ffffffff81120a30 ffff8800581d6fd8 0000000100221784
 ffff880c03880cc0 ffff880c0386ffd8 0000000000010608 ffff880c03880cc0
Call Trace:
 [<ffffffff81120a30>] ? sync_page+0x0/0x60
 [<ffffffff81120a30>] ? sync_page+0x0/0x60
 [<ffffffff814f8d63>] io_schedule+0x73/0xc0
 [<ffffffff81120a74>] sync_page+0x44/0x60
 [<ffffffff814f94ea>] __wait_on_bit_lock+0x5a/0xc0
 [<ffffffff81120a07>] __lock_page+0x67/0x70
 [<ffffffff81096e50>] ? wake_bit_function+0x0/0x50
 [<ffffffff811751a3>] lock_page+0x43/0x50
 [<ffffffff81175938>] migrate_pages+0x6b8/0x6e0
 [<ffffffff81169920>] ? compaction_alloc+0x0/0x370
 [<ffffffff8116938d>] compact_zone+0x4cd/0x600
 [<ffffffff810ae43d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff8116974e>] compact_zone_order+0x7e/0xb0
 [<ffffffff811698a1>] try_to_compact_pages+0x121/0x1a0
 [<ffffffff811345d7>] __alloc_pages_nodemask+0x5a7/0x8d0
 [<ffffffff8128cf91>] ? debug_object_free+0xc1/0x140
 [<ffffffff8101b145>] ? native_sched_clock+0x15/0x70
 [<ffffffff8109d7cf>] ? cpu_clock+0x6f/0x80
 [<ffffffff81167d84>] alloc_pages_vma+0x84/0x110
 [<ffffffff814fa31a>] ? down_write+0x9a/0xb0
 [<ffffffff811811a9>] ? khugepaged+0x9d9/0x1320
 [<ffffffff811813a8>] khugepaged+0xbd8/0x1320
 [<ffffffff814fb8c0>] ? _spin_unlock_irq+0x30/0x40
 [<ffffffff81096e10>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff811807d0>] ? khugepaged+0x0/0x1320
 [<ffffffff81096ac6>] kthread+0x96/0xa0
 [<ffffffff810142ca>] child_rip+0xa/0x20
 [<ffffffff81013c10>] ? restore_args+0x0/0x30
 [<ffffffff81096a30>] ? kthread+0x0/0xa0
 [<ffffffff810142c0>] ? child_rip+0x0/0x20
1 lock held by khugepaged/212:
 #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff811811a9>] khugepaged+0x9d9/0x1320
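
To make the pile-up mechanism concrete, here is a minimal userspace analogy (illustrative only; the rwlock and the names below stand in for the kernel's mm->mmap_sem and page locks, and none of this is actual kernel code). One writer takes the lock and then does slow, blocking work, and every later task stalls behind it, which is exactly the shape of the traces above and below. Build with gcc -pthread.

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_rwlock_t mmap_sem = PTHREAD_RWLOCK_INITIALIZER;

static void *khugepaged_like(void *arg)
{
        (void)arg;
        pthread_rwlock_wrlock(&mmap_sem);  /* down_write(&mm->mmap_sem) */
        /* Collapsing one 2 MiB hugepage can mean waiting on up to
         * 2 MiB / 4 KiB = 512 individual page locks; model the
         * cumulative I/O wait as a sleep: */
        sleep(5);                          /* lock_page() blocked on I/O */
        pthread_rwlock_unlock(&mmap_sem);
        return NULL;
}

static void *blocked_task(void *arg)
{
        pthread_rwlock_rdlock(&mmap_sem);  /* down_read: mmap, /proc reads */
        printf("task %ld finally got mmap_sem\n", (long)arg);
        pthread_rwlock_unlock(&mmap_sem);
        return NULL;
}

int main(void)
{
        pthread_t writer, tasks[4];
        long i;

        pthread_create(&writer, NULL, khugepaged_like, NULL);
        sleep(1);                          /* let the writer win the lock */
        for (i = 0; i < 4; i++)
                pthread_create(&tasks[i], NULL, blocked_task, (void *)i);

        pthread_join(writer, NULL);
        for (i = 0; i < 4; i++)
                pthread_join(tasks[i], NULL);
        return 0;
}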

This IOInterceptor process is trying to acquire the mmap_sem semaphore, so it is probably stuck behind khugepaged:

INFO: task IOInterceptor.u:6499 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
IOInterceptor D 00000000ffffffff  3872  6499   6461 0x00000080
 ffff880c065ffde8 0000000000000046 0000000000000000 ffff880c04bdc240
 0000000000000000 0000000000000007 ffff8800593d6fd8 0000000100221799
 ffff880c04bdc800 ffff880c065fffd8 0000000000010608 ffff880c04bdc800
Call Trace:
 [<ffffffff814fb10d>] rwsem_down_failed_common+0x8d/0x1d0
 [<ffffffff814fb273>] rwsem_down_write_failed+0x23/0x30
 [<ffffffff811811a9>] ? khugepaged+0x9d9/0x1320
 [<ffffffff812875d3>] call_rwsem_down_write_failed+0x13/0x20
 [<ffffffff814fa30e>] ? down_write+0x8e/0xb0
 [<ffffffff811427fc>] ? sys_mmap_pgoff+0x5c/0x2a0
 [<ffffffff811427fc>] sys_mmap_pgoff+0x5c/0x2a0
 [<ffffffff814fb3d2>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff810182b9>] sys_mmap+0x29/0x30
 [<ffffffff81013172>] system_call_fastpath+0x16/0x1b
1 lock held by IOInterceptor.u/6499:
 #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff811427fc>] sys_mmap_pgoff+0x5c/0x2a0

These three IOInterceptor processes are also trying to acquire the mmap_sem semaphore, so they are probably also stuck behind khugepaged:

INFO: task IOInterceptor.u:20167 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
IOInterceptor D 0000000000000000  4880 20167   6461 0x00000080
 ffff880acc8b7b60 0000000000000046 0000000000000000 ffff880102481100
 0000000000000000 0000000000000007 ffff8800587d6fd8 0000000100221ba4
 ffff8801024816c0 ffff880acc8b7fd8 0000000000010608 ffff8801024816c0
Call Trace:
 [<ffffffff814fb10d>] rwsem_down_failed_common+0x8d/0x1d0
 [<ffffffff8101b145>] ? native_sched_clock+0x15/0x70
 [<ffffffff8101a739>] ? sched_clock+0x9/0x10
 [<ffffffff8101b145>] ? native_sched_clock+0x15/0x70
 [<ffffffff814fb2a6>] rwsem_down_read_failed+0x26/0x30
 [<ffffffff812875a4>] call_rwsem_down_read_failed+0x14/0x30
 [<ffffffff814fa3ca>] ? down_read+0x9a/0xa0
 [<ffffffffa0344438>] ? fuse_copy_fill+0x98/0x1f0 [fuse]
 [<ffffffff814fb97b>] ? _spin_unlock+0x2b/0x40
 [<ffffffffa0344438>] fuse_copy_fill+0x98/0x1f0 [fuse]
 [<ffffffffa03445cf>] fuse_copy_one+0x3f/0x70 [fuse]
 [<ffffffffa034552e>] fuse_dev_read+0x23e/0x320 [fuse]
 [<ffffffff81188a7a>] do_sync_read+0xfa/0x140
 [<ffffffff810adaed>] ? lock_release_holdtime+0x3d/0x190
 [<ffffffff81096e10>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8109d71d>] ? sched_clock_cpu+0xcd/0x110
 [<ffffffff810aa7ad>] ? trace_hardirqs_off+0xd/0x10
 [<ffffffff810adaed>] ? lock_release_holdtime+0x3d/0x190
 [<ffffffff8121f7c6>] ? security_file_permission+0x16/0x20
 [<ffffffff811894a5>] vfs_read+0xb5/0x1a0
 [<ffffffff81189ff6>] ? fget_light+0x66/0x100
 [<ffffffff811895e1>] sys_read+0x51/0x90
 [<ffffffff81013172>] system_call_fastpath+0x16/0x1b
1 lock held by IOInterceptor.u/20167:
 #0:  (&mm->mmap_sem){++++++}, at: [<ffffffffa0344438>] fuse_copy_fill+0x98/0x1f0 [fuse]

INFO: task IOInterceptor.u:20201 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
IOInterceptor D 00000000ffffffff  5648 20201   6461 0x00000080
 ffff880120927b60 0000000000000046 0000000000000000 ffff8804c6c248c0
 0000000000000000 0000000000000007 ffff88005a3d6fd8 0000000100221b92
 ffff8804c6c24e80 ffff880120927fd8 0000000000010608 ffff8804c6c24e80
Call Trace:
 [<ffffffff814fb10d>] rwsem_down_failed_common+0x8d/0x1d0
 [<ffffffff8101b145>] ? native_sched_clock+0x15/0x70
 [<ffffffff8101a739>] ? sched_clock+0x9/0x10
 [<ffffffff8101b145>] ? native_sched_clock+0x15/0x70
 [<ffffffff814fb2a6>] rwsem_down_read_failed+0x26/0x30
 [<ffffffff812875a4>] call_rwsem_down_read_failed+0x14/0x30
 [<ffffffff814fa3ca>] ? down_read+0x9a/0xa0
 [<ffffffffa0344438>] ? fuse_copy_fill+0x98/0x1f0 [fuse]
 [<ffffffff814fb97b>] ? _spin_unlock+0x2b/0x40
 [<ffffffffa0344438>] fuse_copy_fill+0x98/0x1f0 [fuse]
 [<ffffffffa03445cf>] fuse_copy_one+0x3f/0x70 [fuse]
 [<ffffffffa034552e>] fuse_dev_read+0x23e/0x320 [fuse]
 [<ffffffff81188a7a>] do_sync_read+0xfa/0x140
 [<ffffffff810adaed>] ? lock_release_holdtime+0x3d/0x190
 [<ffffffff81096e10>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8109d71d>] ? sched_clock_cpu+0xcd/0x110
 [<ffffffff810aa7ad>] ? trace_hardirqs_off+0xd/0x10
 [<ffffffff810adaed>] ? lock_release_holdtime+0x3d/0x190
 [<ffffffff8121f7c6>] ? security_file_permission+0x16/0x20
 [<ffffffff811894a5>] vfs_read+0xb5/0x1a0
 [<ffffffff81189ff6>] ? fget_light+0x66/0x100
 [<ffffffff811895e1>] sys_read+0x51/0x90
 [<ffffffff81013172>] system_call_fastpath+0x16/0x1b
1 lock held by IOInterceptor.u/20201:
 #0:  (&mm->mmap_sem){++++++}, at: [<ffffffffa0344438>] fuse_copy_fill+0x98/0x1f0 [fuse]

INFO: task IOInterceptor.u:20377 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
IOInterceptor D 00000000ffffffff  3920 20377   6461 0x00000080
 ffff8806f94b3b60 0000000000000046 0000000000000000 ffff8806fcb7c140
 0000000000000000 0000000000000007 ffff880058dd6fd8 0000000100221ba6
 ffff8806fcb7c700 ffff8806f94b3fd8 0000000000010608 ffff8806fcb7c700
Call Trace:
 [<ffffffff814fb10d>] rwsem_down_failed_common+0x8d/0x1d0
 [<ffffffff8101b145>] ? native_sched_clock+0x15/0x70
 [<ffffffff8101a739>] ? sched_clock+0x9/0x10
 [<ffffffff8101b145>] ? native_sched_clock+0x15/0x70
 [<ffffffff814fb2a6>] rwsem_down_read_failed+0x26/0x30
 [<ffffffff812875a4>] call_rwsem_down_read_failed+0x14/0x30
 [<ffffffff814fa3ca>] ? down_read+0x9a/0xa0
 [<ffffffffa0344438>] ? fuse_copy_fill+0x98/0x1f0 [fuse]
 [<ffffffff814fb97b>] ? _spin_unlock+0x2b/0x40
 [<ffffffffa0344438>] fuse_copy_fill+0x98/0x1f0 [fuse]
 [<ffffffffa03445cf>] fuse_copy_one+0x3f/0x70 [fuse]
 [<ffffffffa034552e>] fuse_dev_read+0x23e/0x320 [fuse]
 [<ffffffff81188a7a>] do_sync_read+0xfa/0x140
 [<ffffffff810adaed>] ? lock_release_holdtime+0x3d/0x190
 [<ffffffff81096e10>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8109d71d>] ? sched_clock_cpu+0xcd/0x110
 [<ffffffff810aa7ad>] ? trace_hardirqs_off+0xd/0x10
 [<ffffffff810adaed>] ? lock_release_holdtime+0x3d/0x190
 [<ffffffff8121f7c6>] ? security_file_permission+0x16/0x20
 [<ffffffff811894a5>] vfs_read+0xb5/0x1a0
 [<ffffffff81189ff6>] ? fget_light+0x66/0x100
 [<ffffffff811895e1>] sys_read+0x51/0x90
 [<ffffffff81013172>] system_call_fastpath+0x16/0x1b
1 lock held by IOInterceptor.u/20377:
 #0:  (&mm->mmap_sem){++++++}, at: [<ffffffffa0344438>] fuse_copy_fill+0x98/0x1f0 [fuse]

These two python processes are trying to acquire the mmap_sem semaphore, so they are probably stuck behind khugepaged:

INFO: task python:7585 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
python        D 0000000000000000  4608  7585   7576 0x00000080
 ffff880c06e5bca0 0000000000000046 0000000000000000 ffff880bc534d640
 0000000000000000 0000000000000007 ffff880058bd6fd8 00000001002233be
 ffff880bc534dc00 ffff880c06e5bfd8 0000000000010608 ffff880bc534dc00
Call Trace:
 [<ffffffff814fb10d>] rwsem_down_failed_common+0x8d/0x1d0
 [<ffffffff8101b145>] ? native_sched_clock+0x15/0x70
 [<ffffffff8101a739>] ? sched_clock+0x9/0x10
 [<ffffffff8101b145>] ? native_sched_clock+0x15/0x70
 [<ffffffff814fb2a6>] rwsem_down_read_failed+0x26/0x30
 [<ffffffff812875a4>] call_rwsem_down_read_failed+0x14/0x30
 [<ffffffff814fa3ca>] ? down_read+0x9a/0xa0
 [<ffffffff8114d35d>] ? access_process_vm+0x4d/0x200
 [<ffffffff8114d35d>] access_process_vm+0x4d/0x200
 [<ffffffff811eef7d>] proc_pid_cmdline+0x6d/0x120
 [<ffffffff81167cb7>] ? alloc_pages_current+0x87/0xd0
 [<ffffffff811efdad>] proc_info_read+0xad/0xf0
 [<ffffffff811894a5>] vfs_read+0xb5/0x1a0
 [<ffffffff81189ff6>] ? fget_light+0x66/0x100
 [<ffffffff811895e1>] sys_read+0x51/0x90
 [<ffffffff81013172>] system_call_fastpath+0x16/0x1b
1 lock held by python/7585:
 #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff8114d35d>] access_process_vm+0x4d/0x200

INFO: task python:7599 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
python        D 0000000000000000  4608  7599   7576 0x00000080
 ffff880bc18bbca0 0000000000000046 0000000000000000 ffff880c00850180
 0000000000000000 0000000000000007 ffff880058bd6fd8 0000000100230158
 ffff880c00850740 ffff880bc18bbfd8 0000000000010608 ffff880c00850740
Call Trace:
 [<ffffffff814fb10d>] rwsem_down_failed_common+0x8d/0x1d0
 [<ffffffff8101b145>] ? native_sched_clock+0x15/0x70
 [<ffffffff8101a739>] ? sched_clock+0x9/0x10
 [<ffffffff8101b145>] ? native_sched_clock+0x15/0x70
 [<ffffffff814fb2a6>] rwsem_down_read_failed+0x26/0x30
 [<ffffffff812875a4>] call_rwsem_down_read_failed+0x14/0x30
 [<ffffffff814fa3ca>] ? down_read+0x9a/0xa0
 [<ffffffff8114d35d>] ? access_process_vm+0x4d/0x200
 [<ffffffff8114d35d>] access_process_vm+0x4d/0x200
 [<ffffffff811eef7d>] proc_pid_cmdline+0x6d/0x120
 [<ffffffff81167cb7>] ? alloc_pages_current+0x87/0xd0
 [<ffffffff811efdad>] proc_info_read+0xad/0xf0
 [<ffffffff811894a5>] vfs_read+0xb5/0x1a0
 [<ffffffff81189ff6>] ? fget_light+0x66/0x100
 [<ffffffff811895e1>] sys_read+0x51/0x90
 [<ffffffff81013172>] system_call_fastpath+0x16/0x1b
1 lock held by python/7599:
 #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff8114d35d>] access_process_vm+0x4d/0x200
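
The reason a simple /proc read hangs here: /proc/<pid>/cmdline is produced by copying the argv strings out of the target process's address space, and that copy has to take the target's mmap_sem for read. A condensed sketch of the 2.6.32-era path, from memory and for illustration only (not the exact RHEL 6 source):

/* fs/proc/base.c, condensed: proc_info_read -> proc_pid_cmdline */
static int proc_pid_cmdline_sketch(struct task_struct *task, char *buffer)
{
        int res = 0;
        struct mm_struct *mm = get_task_mm(task);

        if (!mm)
                return 0;
        if (mm->arg_end) {
                unsigned int len = mm->arg_end - mm->arg_start;

                if (len > PAGE_SIZE)
                        len = PAGE_SIZE;
                /* access_process_vm() does down_read(&mm->mmap_sem),
                 * so the python tasks above sleep right here while
                 * khugepaged holds the semaphore for write. */
                res = access_process_vm(task, mm->arg_start, buffer, len, 0);
        }
        mmput(mm);
        return res;
}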

These three postgres processes are all blocked on an inode's mutex while trying to lseek on a file in a fuse filesystem:

INFO: task postgres:11939 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
postgres      D 0000000000000246  4304 11939  11672 0x00000084
 ffff880ab924de08 0000000000000046 ffff880ab924dd68 ffffffff8101a739
 ffff880ab924dd88 ffffffff8109d71d ffff880ab924dd88 ffff880b8d91c800
 ffff880b8d91cdc0 ffff880ab924dfd8 0000000000010608 ffff880b8d91cdc0
Call Trace:
 [<ffffffff8101a739>] ? sched_clock+0x9/0x10
 [<ffffffff8109d71d>] ? sched_clock_cpu+0xcd/0x110
 [<ffffffff814f9cad>] ? __mutex_lock_common+0x24d/0x400
 [<ffffffff814f9bff>] __mutex_lock_common+0x19f/0x400
 [<ffffffffa034aca3>] ? fuse_file_llseek+0x43/0xe0 [fuse]
 [<ffffffffa034aca3>] ? fuse_file_llseek+0x43/0xe0 [fuse]
 [<ffffffff814f9f68>] mutex_lock_nested+0x48/0x60
 [<ffffffffa034aca3>] fuse_file_llseek+0x43/0xe0 [fuse]
 [<ffffffff811877fa>] vfs_llseek+0x3a/0x40
 [<ffffffff81188fa6>] sys_lseek+0x66/0x80
 [<ffffffff81013172>] system_call_fastpath+0x16/0x1b
1 lock held by postgres/11939:
 #0:  (&sb->s_type->i_mutex_key#15){+.+.+.}, at: [<ffffffffa034aca3>] fuse_file_llseek+0x43/0xe0 [fuse]

INFO: task postgres:11944 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
postgres      D 00000000ffffffff  4032 11944  11652 0x00000084
 ffff880b9df4de08 0000000000000046 0000000000000000 ffffffff8101a739
 ffff880b9df4dd88 ffffffff8109d71d ffff880059dd6fd8 000000010022170b
 ffff880b95e68f80 ffff880b9df4dfd8 0000000000010608 ffff880b95e68f80
Call Trace:
 [<ffffffff8101a739>] ? sched_clock+0x9/0x10
 [<ffffffff8109d71d>] ? sched_clock_cpu+0xcd/0x110
 [<ffffffff814f9bff>] __mutex_lock_common+0x19f/0x400
 [<ffffffffa034aca3>] ? fuse_file_llseek+0x43/0xe0 [fuse]
 [<ffffffff8109d7cf>] ? cpu_clock+0x6f/0x80
 [<ffffffffa034aca3>] ? fuse_file_llseek+0x43/0xe0 [fuse]
 [<ffffffff814f9f68>] mutex_lock_nested+0x48/0x60
 [<ffffffffa034aca3>] fuse_file_llseek+0x43/0xe0 [fuse]
 [<ffffffff811877fa>] vfs_llseek+0x3a/0x40
 [<ffffffff81188fa6>] sys_lseek+0x66/0x80
 [<ffffffff81013172>] system_call_fastpath+0x16/0x1b
1 lock held by postgres/11944:
 #0:  (&sb->s_type->i_mutex_key#15){+.+.+.}, at: [<ffffffffa034aca3>] fuse_file_llseek+0x43/0xe0 [fuse]

INFO: task postgres:11946 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
postgres      D 0000000000000246  4304 11946  11639 0x00000084
 ffff880a3b6b5e08 0000000000000046 ffff880a3b6b5d68 ffffffff8101a739
 ffff880a3b6b5d88 ffffffff8109d71d ffff880a3b6b5d88 ffff880b8dcb4440
 ffff880b8dcb4a00 ffff880a3b6b5fd8 0000000000010608 ffff880b8dcb4a00
Call Trace:
 [<ffffffff8101a739>] ? sched_clock+0x9/0x10
 [<ffffffff8109d71d>] ? sched_clock_cpu+0xcd/0x110
 [<ffffffff814f9cad>] ? __mutex_lock_common+0x24d/0x400
 [<ffffffff814f9bff>] __mutex_lock_common+0x19f/0x400
 [<ffffffffa034aca3>] ? fuse_file_llseek+0x43/0xe0 [fuse]
 [<ffffffffa034aca3>] ? fuse_file_llseek+0x43/0xe0 [fuse]
 [<ffffffff814f9f68>] mutex_lock_nested+0x48/0x60
 [<ffffffffa034aca3>] fuse_file_llseek+0x43/0xe0 [fuse]
 [<ffffffff811877fa>] vfs_llseek+0x3a/0x40
 [<ffffffff81188fa6>] sys_lseek+0x66/0x80
 [<ffffffff81013172>] system_call_fastpath+0x16/0x1b
1 lock held by postgres/11946:
 #0:  (&sb->s_type->i_mutex_key#15){+.+.+.}, at: [<ffffffffa034aca3>] fuse_file_llseek+0x43/0xe0 [fuse]

It's not clear why these three postgres processes are stuck on inode mutexes.  I think the first one is stuck behind this process:

PID: 11714  TASK: ffff880ad9631240  CPU: 15  COMMAND: "postgres"
 #0 [ffff880903841cc8] schedule at ffffffff814f852d
 #1 [ffff880903841d90] wait_answer_interruptible at ffffffffa0344cb1
 #2 [ffff880903841e00] fuse_request_send at ffffffffa0344eeb
 #3 [ffff880903841e70] fuse_fsync_common at ffffffffa034be6a
 #4 [ffff880903841ed0] fuse_fsync at ffffffffa034bed0
 #5 [ffff880903841ee0] vfs_fsync_range at ffffffff811b7fa3
 #6 [ffff880903841f30] vfs_fsync at ffffffff811b804d
 #7 [ffff880903841f40] do_fsync at ffffffff811b808e
 #8 [ffff880903841f70] sys_fsync at ffffffff811b80e0
 #9 [ffff880903841f80] system_call_fastpath at ffffffff81013172
    RIP: 00007f136424ea30  RSP: 00007fff30f9c608  RFLAGS: 00000246
    RAX: 000000000000004a  RBX: ffffffff81013172  RCX: 0000000000000000
    RDX: 0000000002302000  RSI: 00000000000000ae  RDI: 0000000000000038
    RBP: 00000000022e6178   R8: 0000000000000000   R9: 000000004ce4f9ab
    R10: 00000000bca7d16a  R11: 0000000000000246  R12: ffffffff811b80e0
    R13: ffff880903841f78  R14: 0000000000000fad  R15: 0000000000000000
    ORIG_RAX: 000000000000004a  CS: 0033  SS: 002b

This process appears to be waiting for a fuse request to be answered, and based on the last run times it has been waiting quite a while.
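
If that guess is right, the chain closes as follows: in this kernel, vfs_fsync_range() takes the inode mutex around the filesystem's ->fsync() method, so postgres 11714 holds i_mutex for the whole time it waits for the fuse daemon to answer the FSYNC request, and the three lseek'ing postgres tasks queue behind that same i_mutex in fuse_file_llseek(). The daemon's reader threads are plausibly the IOInterceptor tasks above, which are themselves stuck on mmap_sem, bringing everything back to khugepaged. A condensed sketch of the locking (2.6.32-era VFS, illustrative rather than verbatim):

/* fs/sync.c, condensed: sys_fsync -> do_fsync -> vfs_fsync_range */
int vfs_fsync_range_sketch(struct file *file, struct dentry *dentry,
                           loff_t start, loff_t end, int datasync)
{
        struct inode *inode = dentry->d_inode;
        int err;

        mutex_lock(&inode->i_mutex);    /* the i_mutex the lseek'ers want */
        /* For fuse this is fuse_fsync -> fuse_fsync_common ->
         * fuse_request_send(), which sleeps in
         * wait_answer_interruptible() until the userspace daemon
         * replies.  A stuck daemon means i_mutex is never released. */
        err = file->f_op->fsync(file, dentry, datasync);
        mutex_unlock(&inode->i_mutex);
        return err;
}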

So it definitely looks like we have a problem with khugepaged blocking on page locks when it shouldn't be and holding up many other processes.  We may also have a secondary problem with fuse not responding but that could be a further consequence of the khugepaged problem.


Discussion about the issue:
http://groups.google.com/group/fa.linux.kernel/browse_thread/thread/465007e8e5bfee15

Possible patch:
http://git.kernel.org/?p=linux/kernel/git/andrea/aa.git;a=commitdiff;h=06e5d52f1815848da1647c8021150db037cf366e
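
The linked commit is not reproduced here, but the direction discussed in the thread is to stop migration from sleeping on page locks when compaction is running asynchronously: trylock the page and skip it if it is busy, rather than waiting out its I/O while mmap_sem is held. A rough sketch of that pattern (hypothetical simplification, not the actual patch):

/* mm/migrate.c, hypothetical shape of the fix */
static int unmap_and_move_sketch(struct page *page, bool sync)
{
        if (!trylock_page(page)) {
                if (!sync)
                        return -EAGAIN; /* async compaction: skip the page */
                lock_page(page);        /* only sync callers may sleep here */
        }
        /* ... migrate page contents and mappings ... */
        unlock_page(page);
        return 0;
}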

Comment 4 RHEL Program Management 2011-01-14 04:31:16 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has 
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed 
products. This request is not yet committed for inclusion in an Update release.

Comment 9 errata-xmlrpc 2011-05-23 20:37:35 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0542.html

Comment 10 Linda Wang 2016-08-26 01:48:18 UTC
*** Bug 790862 has been marked as a duplicate of this bug. ***

