Bug 740887

Summary: Tune dirty_ratio and dirty_background_ratio
Product: Red Hat Enterprise Linux 6
Component: vdsm
Version: 6.2
Hardware: Unspecified
OS: Unspecified
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
Target Milestone: rc
Target Release: 6.2
Reporter: Federico Simoncelli <fsimonce>
Assignee: Federico Simoncelli <fsimonce>
QA Contact: Pavel Stehlik <pstehlik>
CC: abaron, bazulay, bengland, danken, iheim, perfbz, rcyriac, s.kieske, ykaul
Doc Type: Bug Fix
Fixed In Version: vdsm-4.9-107
Last Closed: 2011-12-06 07:28:43 UTC

Description Federico Simoncelli 2011-09-23 16:20:04 UTC
Description of problem:
Tune the dirty_ratio and dirty_background_ratio according to the values suggested by the performance team:

vm.dirty_ratio=5
vm.dirty_background_ratio=2
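
A minimal sketch of how these values could be applied at runtime, assuming they are written directly to the procfs tunables (illustrative only, not the actual vdsm change):

    # Illustrative sketch only -- not the actual vdsm patch.
    # Writes the suggested ratios to the procfs tunables (requires root).
    SETTINGS = {
        "/proc/sys/vm/dirty_ratio": "5",
        "/proc/sys/vm/dirty_background_ratio": "2",
    }

    def apply_dirty_settings():
        for path, value in SETTINGS.items():
            with open(path, "w") as f:
                f.write(value)

    if __name__ == "__main__":
        apply_dirty_settings()

The same settings can be made persistent by adding the two vm.dirty_* lines above to /etc/sysctl.conf.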

Comment 2 Federico Simoncelli 2011-09-26 15:33:46 UTC
commit 9c4a571d3db43bcb16d73ed7289049699df9ff66
Author: Federico Simoncelli <fsimonce>
Date:   Mon Sep 26 15:28:36 2011 +0000

    BZ#740887 Tune cache dirty ratio
    
    Change-Id: Ibf5c8e4c0637c60092b89fba103b96b37bdafaa0

http://10.35.18.144/970

Comment 6 Ben England 2011-10-04 16:33:05 UTC
Here is a presentation summarizing a performance analysis of RHEV with an NFS storage pool backed by a 10-Gbps link to a NetApp filer.  This study focused not just on throughput but also on response time, in order to address the I/O timeout problems reported earlier.  It also describes the improvements in response time and throughput resulting from the above tuning.  The presentation is available at:
 
http://perf1.lab.bos.redhat.com/bengland/laptop/rhev/rhev-vm-rsptime.pdf

I don't claim that these values are optimal for all configurations and workloads.  The ratio of storage throughput to physical memory, as well as the workload itself, will most likely change the optimal settings.  I only suggest that the proposed values above are better for a KVM hypervisor than the defaults of vm.dirty_ratio=20 (40 with tuned) and vm.dirty_background_ratio=10 at providing a balance of low response time and high throughput in a typical customer environment using NFS storage.  Furthermore, when VMs are consuming much of physical memory we have to be careful not to oversubscribe the remaining physical memory, or we can get cache thrashing and worse; the proposed change makes this less likely.  Note that vm.dirty_ratio is calculated against the entire amount of physical memory in the KVM host, even though some of that memory is locked up by VMs and only the remainder is really available for dirty pages.
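
As a rough, hypothetical illustration of how much dirty data these percentages allow to accumulate (the 128 GB host size below is an assumption, not a figure from this study):

    # Hypothetical example: maximum dirty page cache allowed on a 128 GB
    # host under the default, tuned-profile and proposed dirty_ratio values.
    GIB = 1024 ** 3
    total_ram = 128 * GIB

    for ratio in (40, 20, 10, 5, 2):
        limit = total_ram * ratio // 100
        print("vm.dirty_ratio=%2d -> up to %.1f GiB of dirty pages" % (ratio, limit / GIB))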

For KVM hosts with multiple TB of RAM, percentages will probably not work.  Until we get better at estimating how much dirty page space is needed, we'll just have to explain how to use vm.dirty_bytes and vm.dirty_background_bytes in this situation.
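
A hypothetical sketch of what a byte-based configuration could look like on such a host (the 4 GiB and 1 GiB caps are assumptions for illustration, not values recommended in this bug):

    # Hypothetical example: cap dirty memory at absolute sizes instead of
    # percentages, which scale poorly on multi-TB hosts.
    import os

    GIB = 1024 ** 3
    total_ram = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")

    # Assumed caps, for illustration only.
    dirty_bytes = min(4 * GIB, total_ram // 20)
    dirty_background_bytes = min(1 * GIB, total_ram // 50)

    print("vm.dirty_bytes = %d" % dirty_bytes)
    print("vm.dirty_background_bytes = %d" % dirty_background_bytes)

Note that the kernel treats the *_bytes and *_ratio tunables as alternatives: setting one of them resets its percentage counterpart to 0.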

Note that a separate bug was created to add a tuned profile for RHEL6 KVM guests.

Dave Chinner recently ran a similar workload with XFS on bare metal and came to a similar tuning suggestion -- see bug 736224, comment 17.

Comment 10 errata-xmlrpc 2011-12-06 07:28:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2011-1782.html