Bug 449381 - System hangs when using /proc/sys/vm/drop_caches under heavy load on large system.
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.7
Hardware: All
OS: Linux
Target Milestone: rc
Assignee: Larry Woodman
QA Contact: Martin Jenner
Depends On:
Reported: 2008-06-02 15:04 UTC by Larry Woodman
Modified: 2009-02-12 18:56 UTC

Fixed In Version: RHSA-2008-0665
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2008-07-24 19:30:01 UTC
Target Upstream Version:

Attachments
Patch that fixes this problem. (2.04 KB, patch)
2008-06-02 15:04 UTC, Larry Woodman

System: Red Hat Product Errata
ID: RHSA-2008:0665
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: Updated kernel packages for Red Hat Enterprise Linux 4.7
Last Updated: 2008-07-24 16:41:06 UTC

Description Larry Woodman 2008-06-02 15:04:35 UTC
Description of problem:

Eliminate hang when using /proc/sys/vm/drop_caches under heavy load on large system.

Version-Release number of selected component (if applicable):


How reproducible:

Frequent, but requires a large system (~64GB) with multiple CPUs (~8) running
several (more than the CPU count) file system exercisers.

Steps to Reproduce:
1. Start several file system exercisers that create and/or read files large
enough to exhaust memory in the pagecache.
2. "echo 3 > /proc/sys/vm/drop_caches" until the system hangs.
3. Capture Alt-SysRq-W output and verify that all CPUs are stuck on the inode_lock.
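
The steps above could be approximated with a small C helper. This is only a
sketch, not the exerciser used for this report: the names exercise_file() and
drop_caches() are hypothetical, the 1 MiB chunk size is arbitrary, and the
file path and size are left to the caller. To approach the reported conditions
one would presumably run more exerciser processes than there are CPUs, with
files large enough to fill the pagecache, while another process calls
drop_caches() as root in a loop.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical exerciser (step 1): write total_bytes to path in 1 MiB
 * chunks, then read the file back so its pages pass through the pagecache.
 * Returns the number of bytes read back, or -1 on error.
 * Note: on 32-bit builds, files over 2GB need -D_FILE_OFFSET_BITS=64.
 */
long long exercise_file(const char *path, long long total_bytes)
{
    enum { CHUNK = 1 << 20 };
    long long done = 0, seen = 0;
    size_t n;
    FILE *f;
    char *buf = malloc(CHUNK);

    if (buf == NULL)
        return -1;
    memset(buf, 0xA5, CHUNK);

    f = fopen(path, "wb");
    if (f == NULL) {
        free(buf);
        return -1;
    }
    while (done < total_bytes) {
        size_t want = (total_bytes - done < CHUNK)
                          ? (size_t)(total_bytes - done) : (size_t)CHUNK;
        if (fwrite(buf, 1, want, f) != want) {
            fclose(f);
            free(buf);
            return -1;
        }
        done += (long long)want;
    }
    if (fclose(f) != 0) {
        free(buf);
        return -1;
    }

    f = fopen(path, "rb");
    if (f == NULL) {
        free(buf);
        return -1;
    }
    while ((n = fread(buf, 1, CHUNK, f)) > 0)
        seen += (long long)n;
    fclose(f);
    free(buf);
    return seen;
}

/*
 * Step 2: same effect as "echo 3 > /proc/sys/vm/drop_caches".
 * Requires root. Returns 0 on success, -1 on failure.
 */
int drop_caches(void)
{
    FILE *f = fopen("/proc/sys/vm/drop_caches", "w");
    int ok;

    if (f == NULL)
        return -1;
    ok = (fputs("3\n", f) >= 0);
    if (fclose(f) != 0)   /* the buffered write is flushed here */
        ok = 0;
    return ok ? 0 : -1;
}
```

Step 3 (capturing Alt-SysRq-W) still has to be done from the console once the
system stops responding.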
Actual results:

System Hang.

Expected results:

Pagecache memory is freed without system hanging.

Additional info:

Back in RHEL4-U6 we backported the /proc/sys/vm/drop_caches
functionality from upstream to RHEL4.  Recently I encountered a hang in
this code while creating 256GB files on a 64GB 4-core system and
dropping the pagecache at the same time.  The cause of the hang is that
invalidate_list() calls invalidate_inode_pages(), which calls
invalidate_mapping_pages() with the inode_lock held.  Since
invalidate_mapping_pages() calls cond_resched(), the process writing
to /proc/sys/vm/drop_caches can be scheduled out with the inode_lock
still held when its time quantum expires, and every CPU can then get
stuck trying to acquire the inode_lock.  So far I have only been able to
reproduce this problem when writing multiple huge files on every CPU and
running "echo 3 > /proc/sys/vm/drop_caches" from a shell, but it can happen

The attached patch fixes this problem by creating and calling a new
function, invalidate_all_mapping_pages(), which does not reschedule.  I
could not backport the upstream solution to RHEL4 because
invalidate_mapping_pages() is exported and the fix would break the kABI,
but the fix is basically the same logic that is upstream.
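
The idea behind the fix can be illustrated with a user-space model. This is
a sketch under stated assumptions, not the attached patch: every toy_* name
is hypothetical, the "mapping" is just a page counter, toy_cond_resched()
only counts how often a reschedule point would be reached, and calling it
once per page is a simplification. The shared helper with a may_resched flag
is one plausible way to structure such a change; the actual patch may differ.

```c
/* Toy stand-in for an address space; not a kernel type. */
struct toy_mapping {
    unsigned long nr_pages;
};

/* Counts simulated reschedule points instead of actually yielding. */
static unsigned long resched_calls;

static void toy_cond_resched(void)
{
    resched_calls++;
}

/* Shared loop: drop every page, optionally hitting a reschedule point. */
static unsigned long toy_invalidate(struct toy_mapping *m, int may_resched)
{
    unsigned long freed = 0;

    while (m->nr_pages > 0) {
        m->nr_pages--;
        freed++;
        if (may_resched)
            toy_cond_resched();
    }
    return freed;
}

/*
 * Models the existing exported function: its signature and behavior stay
 * the same, so existing callers are unaffected. It may reschedule.
 */
unsigned long toy_invalidate_mapping_pages(struct toy_mapping *m)
{
    return toy_invalidate(m, 1);
}

/*
 * Models the new function for callers that hold a spinlock such as the
 * inode_lock: it never reaches a reschedule point.
 */
unsigned long toy_invalidate_all_mapping_pages(struct toy_mapping *m)
{
    return toy_invalidate(m, 0);
}
```

In this model the drop_caches path would call the non-rescheduling variant,
while all other callers keep using the original entry point.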

The original BZ is 205722.

Comment 1 Larry Woodman 2008-06-02 15:04:35 UTC
Created attachment 307376
Patch that fixes this problem.

Comment 2 Linda Wang 2008-06-09 22:02:34 UTC
How big is the impact on the customer base without the patch?

Comment 5 Vivek Goyal 2008-06-12 18:32:13 UTC
Committed in 73.EL. RPMs are available at http://people.redhat.com/vgoyal/rhel4/

Comment 8 errata-xmlrpc 2008-07-24 19:30:01 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

