Red Hat Bugzilla – Bug 435734
page fault handling may keep mmap_sem for a long time
Last modified: 2014-08-11 01:40:43 EDT
Description of problem:
Page fault handling may end up in file system code with mmap_sem held, keeping
mmap_sem locked for a long time and thus delaying other threads that need the mmap_sem.
For instance, we found threads blocked owning the mmap_sem with the following
stack (with 2.6.21-57.el5rt kernel):
do_page_fault() grabs the mmap_sem
Version-Release number of selected component (if applicable):
Steps to Reproduce:
What workload are you seeing problems in? Typically paging is discouraged in RT
workloads due to its non-deterministic character. Also, what contenders for the
mmap_sem do you have: other page faults, or something else?
The workload is a real-time Java program running on top of a real-time JVM. We
are seeing an RT thread delayed by a non-real-time thread performing mmap I/O.
When that happens, the RT thread is in the kernel, handling a write to a
write-protected memory area (a mechanism we use to make the thread stop
executing its Java code and then, from the SIGSEGV handler, do some work on behalf of the VM).
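
For readers unfamiliar with the mechanism described above, here is a minimal user-space sketch of it. It is not the JVM's actual code: the names poll_page, on_segv and trigger_protected_write are hypothetical, and the "work on behalf of the VM" is reduced to a counter. The point is that the deliberate write to a read-only page enters the kernel's page fault path, which takes mmap_sem for reading before the fault is resolved into a SIGSEGV, so the thread can be held up by whoever holds the mmap_sem at that moment.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical names, for illustration only. */
static char *poll_page;                      /* page kept write-protected */
static size_t page_size;
static volatile sig_atomic_t handler_runs;   /* counts handled faults */

/* SIGSEGV handler: stands in for the "work on behalf of the VM". */
static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    char *addr = (char *)info->si_addr;

    (void)sig;
    (void)ctx;

    /* A fault outside our page is a real bug: give up immediately. */
    if (addr < poll_page || addr >= poll_page + page_size)
        _exit(2);

    handler_runs++;

    /* Make the page writable so the faulting store succeeds when it is
     * restarted on return from the handler. mprotect() is not on the
     * POSIX async-signal-safe list; it is used here as a plain system
     * call, which is an assumption of this sketch. */
    if (mprotect(poll_page, page_size, PROT_READ | PROT_WRITE) != 0)
        _exit(3);
}

/* Maps a read-only page, installs the handler, then writes to the page.
 * Returns the number of faults handled so far, or -1 on setup failure. */
int trigger_protected_write(void)
{
    struct sigaction sa;

    page_size = (size_t)sysconf(_SC_PAGESIZE);
    poll_page = mmap(NULL, page_size, PROT_READ,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (poll_page == MAP_FAILED)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;

    /* This store faults; the kernel delivers SIGSEGV to this thread. */
    *(volatile char *)poll_page = 1;

    return (int)handler_runs;
}
```

Calling trigger_protected_write() once from main() should take exactly one fault and leave the page writable afterwards. In the real workload the interesting part is the latency between the store and the entry into the handler, since that window is where the fault path waits for the mmap_sem.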