Bug 453811

Summary:	[RFE] Backport per device dirty thresholds
Product:	Red Hat Enterprise Linux 5	Reporter:	Bryn M. Reeves <bmr>
Component:	kernel	Assignee:	Peter Zijlstra <pzijlstr>
Status:	CLOSED WONTFIX	QA Contact:	Red Hat Kernel QE team <kernel-qe>
Severity:	medium	Docs Contact:
Priority:	medium
Version:	5.2	CC:	bubrown, fhirtz, jwest, lwang, rwheeler, ssaha, tao
Target Milestone:	rc	Keywords:	FutureFeature, Triaged
Target Release:	---
Hardware:	All
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Enhancement
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2011-08-12 17:15:05 UTC	Type:	---
Bug Depends On:
Bug Blocks:	533192, 554476, 729758

Description Bryn M. Reeves 2008-07-02 18:05:07 UTC

Description of problem:
Upstream moved to having separate per-device dirty limit thresholds in 2.6.24.

Having per-BDI limits avoids a number of problematic situations; for example,
with a single global limit an unresponsive network file system can cause
writes to other file systems to stall.

This was merged in commit 04fbfdc14e5f48463820d6b9807daa5e9c92c51f:

Author: Peter Zijlstra <a.p.zijlstra>
Date:   Tue Oct 16 23:25:50 2007 -0700

    mm: per device dirty threshold
    
    Scale writeback cache per backing device, proportional to its writeout speed.
    
    By decoupling the BDI dirty thresholds a number of problems we currently have
    will go away, namely:
    
     - mutual interference starvation (for any number of BDIs);
     - deadlocks with stacked BDIs (loop, FUSE and local NFS mounts).
    
    It might be that all dirty pages are for a single BDI while other BDIs are
    idling. By giving each BDI a 'fair' share of the dirty limit, each one can have
    dirty pages outstanding and make progress.
    
    A global threshold also creates a deadlock for stacked BDIs; when A writes to
    B, and A generates enough dirty pages to get throttled, B will never start
    writeback until the dirty pages go away. Again, by giving each BDI its own
    'independent' dirty limit, this problem is avoided.
    
    So the problem is to determine how to distribute the total dirty limit across
    the BDIs fairly and efficiently. A BDI that has a large dirty limit but does
    not have any dirty pages outstanding is a waste.
    
    What is done is to keep a floating proportion between the BDIs based on
    writeback completions. This way faster/more active devices get a larger share
    than slower/idle devices.
    
    [akpm: fix warnings]
    [hugh: Fix occasional hang when a task couldn't get out of
     balance_dirty_pages]
    Signed-off-by: Peter Zijlstra <a.p.zijlstra>
    Signed-off-by: Hugh Dickins <hugh>
    Signed-off-by: Andrew Morton <akpm>
    Signed-off-by: Linus Torvalds <torvalds>
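
As a rough illustration of the "floating proportion" idea described in the commit message above, here is a minimal Python sketch. It is a toy model, not the kernel implementation: the class name, the halving step in age(), and the even split used when there is no history are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class DirtyLimitModel:
    """Toy model (hypothetical): split a global dirty limit by recent
    writeback completions per backing device."""
    total_limit: int                                  # global dirty limit, in pages
    completions: dict = field(default_factory=dict)   # BDI name -> weighted count

    def writeback_done(self, bdi: str, pages: int = 1) -> None:
        # Credit a device each time it finishes writing pages out.
        self.completions[bdi] = self.completions.get(bdi, 0.0) + pages

    def age(self) -> None:
        # Halve every counter so old activity fades; this is what makes
        # the proportion "float" in this model (an assumed decay rule).
        for bdi in self.completions:
            self.completions[bdi] /= 2

    def limit(self, bdi: str) -> int:
        total = sum(self.completions.values())
        if total == 0:
            # No history yet: assume an even split across known devices.
            return self.total_limit // max(len(self.completions), 1)
        return int(self.total_limit * self.completions.get(bdi, 0.0) / total)


model = DirtyLimitModel(total_limit=1000)
model.writeback_done("sda", 300)   # local disk completing writeback
model.writeback_done("nfs", 100)   # network mount, slower
print(model.limit("sda"), model.limit("nfs"))

# The NFS server stops responding: only the local disk keeps completing.
for _ in range(3):
    model.age()
    model.writeback_done("sda", 300)
print(model.limit("sda"), model.limit("nfs"))
```

In this model the stalled device's share shrinks with each aging step, so the device that is still completing writeback keeps most of the limit instead of being throttled behind the stalled one.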



Version-Release number of selected component (if applicable):
2.6.18-*

How reproducible:
100%

Steps to Reproduce:
(One way...)
1. Mount a writable NFS export somewhere
2. Create 20 threads doing writes on this file system
3. Stop the NFS server
4. dd if=/dev/zero of=/some/local/path bs=1024 count=10000
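
Step 2 above could be driven by something like the following Python sketch. The helper names, the 4 KiB block size and the file naming are assumptions for illustration only; the original report does not say how the writer threads were created.

```python
import os
import threading


def writer(path: str, stop: threading.Event, block: bytes) -> None:
    # Write blocks until told to stop. Once the NFS server is down,
    # write() is expected to block as dirty pages accumulate.
    with open(path, "wb") as f:
        while not stop.is_set():
            f.write(block)


def start_writers(directory: str, stop: threading.Event, count: int = 20):
    """Start `count` writer threads, one file each, under `directory`."""
    threads = []
    for i in range(count):
        path = os.path.join(directory, f"writer-{i}.dat")
        t = threading.Thread(
            target=writer, args=(path, stop, b"\0" * 4096), daemon=True
        )
        t.start()
        threads.append(t)
    return threads
```

A run would call start_writers("/mnt/nfs", threading.Event()), where /mnt/nfs is a hypothetical mount point for the NFS file system from step 1, then carry on with steps 3 and 4. The writers run until the event is set, so they will fill the target file system if left unattended.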
  
Actual results:
dd eventually stalls due to I/O throttling.

Expected results:
dd is not affected by throttling due to the unavailable NFS volume.
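
One way to watch the throttling build up while reproducing is to sample the Dirty and Writeback counters in /proc/meminfo. The helper below is a small sketch for that; the function name is made up for this example and the sample values are invented, not taken from an affected system.

```python
def dirty_counters(meminfo_text: str) -> dict:
    """Return the Dirty and Writeback values (in kB) from /proc/meminfo text."""
    wanted = {"Dirty", "Writeback"}
    counters = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        if name in wanted:
            counters[name] = int(rest.split()[0])
    return counters


# Invented sample in the /proc/meminfo layout; on a live system pass
# open("/proc/meminfo").read() instead.
sample = "MemTotal: 2048000 kB\nDirty: 81234 kB\nWriteback: 512 kB\n"
print(dirty_counters(sample))
```

If Dirty stays pinned near the global limit while the NFS server is down and the local dd makes no progress, that is consistent with the stall described under "Actual results".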

Additional info:

Comment 8 RHEL Program Management 2009-02-16 15:28:05 UTC

Updating PM score.

Comment 15 RHEL Program Management 2010-12-07 09:51:08 UTC

This request was evaluated by Red Hat Product Management for inclusion in Red Hat Enterprise Linux 5.6 and Red Hat does not plan to fix this issue in the currently developed update.

Contact your manager or support representative in case you need to escalate this bug.

Comment 16 RHEL Program Management 2011-06-20 21:10:12 UTC

This request was evaluated by Red Hat Product Management for inclusion in Red Hat Enterprise Linux 5.7 and Red Hat does not plan to fix this issue in the currently developed update.

Contact your manager or support representative in case you need to escalate this bug.

Comment 17 Sayan Saha 2011-08-12 17:15:05 UTC

Thank you for submitting a feature request to be considered for inclusion in the Red Hat Enterprise Linux enterprise operating system. After consideration, Red Hat does not plan to incorporate the suggested capability in a future release of Red Hat Enterprise Linux. If you would like Red Hat to reconsider your feature request, please re-open the feature request via appropriate support channels and provide additional supporting details about the importance of this feature.