435734 – page fault handling may keep mmap_sem for a long time

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Red Hat Enterprise MRG
Classification:	Red Hat
Component:	realtime-kernel
Sub Component:
Version:	1.0
Hardware:	i386
OS:	Linux
Priority:	low
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	Peter Zijlstra
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2008-03-03 16:37 UTC by Roland Westrelin
Modified:	2014-08-11 05:40 UTC
CC List:	3 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2012-01-05 21:10:14 UTC
Target Upstream Version:
Embargoed:

Description Roland Westrelin 2008-03-03 16:37:08 UTC

Description of problem:

Page fault handling may end up in file system code with mmap_sem held. It can
then keep mmap_sem locked for a long time and delay other threads that need
mmap_sem.

For instance, we found threads blocked while holding mmap_sem with the following
stack (on the 2.6.21-57.el5rt kernel).
do_page_fault() takes mmap_sem and then calls:
handle_mm_fault()
__handle_mm_fault()
handle_pte_fault()
do_wp_page()
file_update_time()
mark_inode_dirty_sync()
__mark_inode_dirty()
ext3_dirty_inode()
ext3_mark_inode_dirty()
ext3_reserve_inode_write()
ext3_journal_get_write_access()
__ext3_journal_get_write_access()
journal_get_write_access()
do_get_write_access()
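
One plausible way to exercise a path like the one above from user space is a
write to a page of a shared, file-backed mapping that has so far only been
read. This is a minimal sketch, not taken from the report: the function name
shared_mapping_write_demo() and the temporary file under /tmp are hypothetical,
and whether the store actually reaches do_wp_page() and file_update_time()
depends on the kernel version and the file system in use.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical demo: read, then write, one page of a MAP_SHARED file mapping.
 * Returns the byte read back after the write (1 for a fresh zero-filled file),
 * or -1 on error.
 */
int shared_mapping_write_demo(void)
{
    char path[] = "/tmp/wpfault-XXXXXX";   /* assumed scratch location */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                          /* file lives until fd is closed */

    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0 || ftruncate(fd, page_size) != 0) {
        close(fd);
        return -1;
    }

    char *map = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    volatile char *p = map;
    char before = p[0];          /* read access: page is faulted in */
    p[0] = (char)(before + 1);   /* first write: may take a write-protect fault,
                                    handled with mmap_sem held */
    int result = p[0];

    munmap(map, (size_t)page_size);
    close(fd);
    return result;
}
```

On a file system such as ext3, the first store is the point where the fault
handler may have to dirty the inode while the faulting thread still holds
mmap_sem.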


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:

Comment 1 Peter Zijlstra 2008-03-05 10:56:59 UTC

What workload are you seeing problems in? Typically paging is discouraged in RT
workloads due to its non-deterministic character. Also, what contenders for
mmap_sem do you have: other page faults, or something else?

Comment 2 Roland Westrelin 2008-03-19 15:08:05 UTC

The workload is a real-time Java program running on top of a real-time JVM. We
are seeing an RT thread delayed by a non-real-time thread performing mmap I/O.
When that happens the RT thread is in the kernel, handling a write to a
write-protected memory area (a mechanism we use to have the thread stop
executing its Java code and then, from the SIGSEGV handler, do some work on
behalf of the VM).
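
The write-protected-area mechanism described above can be sketched roughly as
follows. This is an illustration under stated assumptions, not the JVM's actual
code: the names guard_page, on_segv() and run_guard_page_demo() are
hypothetical, and calling mprotect() from a signal handler is a common practice
on Linux rather than something POSIX guarantees to be async-signal-safe.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static char *guard_page;                    /* hypothetical write-protected page */
static size_t guard_len;
static volatile sig_atomic_t faults_seen;

/* SIGSEGV handler: the place where work "on behalf of the VM" would run. */
static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    char *addr = (char *)info->si_addr;
    if (addr < guard_page || addr >= guard_page + guard_len)
        _exit(1);                           /* unrelated fault: give up */
    faults_seen++;
    /* Make the page writable so the faulting store can be retried. */
    mprotect(guard_page, guard_len, PROT_READ | PROT_WRITE);
}

/* Returns the number of faults taken by one store (expected 1), -1 on error. */
int run_guard_page_demo(void)
{
    long page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);
    guard_len = (size_t)page_size;

    guard_page = mmap(NULL, guard_len, PROT_READ,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (guard_page == MAP_FAILED)
        return -1;

    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0) {
        munmap(guard_page, guard_len);
        return -1;
    }

    faults_seen = 0;
    /* This store enters the kernel's page fault path, which needs mmap_sem
       before it can decide to deliver SIGSEGV. */
    *(volatile char *)guard_page = 1;

    int count = (int)faults_seen;
    sigaction(SIGSEGV, &old, NULL);         /* restore previous handler */
    munmap(guard_page, guard_len);
    return count;
}
```

The store itself never completes until the fault has been handled, so any
delay in acquiring mmap_sem in the fault path shows up directly as latency in
the thread performing the write.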