Bug 2255436

Summary: [RHCS 5] RFE: change default value of "mds_bal_interval" to "0", aka false
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Patrick Donnelly <pdonnell>
Component: CephFS
Assignee: Patrick Donnelly <pdonnell>
Status: CLOSED ERRATA
QA Contact: Hemanth Kumar <hyelloji>
Severity: medium
Docs Contact: Ranjini M N <rmandyam>
Priority: unspecified
Version: 5.3
CC: ceph-eng-bugs, cephqe-warriors, hyelloji, mcaldeir, rmandyam, tserlin, vereddy, vshankar
Target Milestone: ---
Keywords: FutureFeature
Target Release: 5.3z6
Hardware: All
OS: All
Whiteboard:
Fixed In Version: ceph-16.2.10-231.el8cp
Doc Type: Enhancement
Doc Text:
.The MDS default balancer is now disabled by default
With this release, the MDS default balancer, also known as the automatic dynamic subtree balancer, is disabled by default. This prevents accidental subtree migrations. Subtree migrations can be expensive to undo when the operator increases the file system `max_mds` setting without planning subtree delegations, such as with pinning.
Last Closed: 2024-02-08 16:57:20 UTC Type: ---
Bug Depends On: 2227309, 2255435    
Bug Blocks: 2258797    

Description Patrick Donnelly 2023-12-20 18:37:18 UTC
This bug was initially created as a copy of Bug #2227309

I am copying this bug because: 

rhcs 5 backport

Description of problem:  RFE: change default value of "mds_bal_interval" to "0", aka false

From case 03492882 and BZ 2203258 (https://bugzilla.redhat.com/show_bug.cgi?id=2203258) we see that leaving "mds_bal_interval" enabled results in performance issues. Beyond the log evidence in BZ 2203258, the customer reports that CephFS latency has dropped by 75 percent and I/O in the cluster has doubled since "mds_bal_interval" was set to false on their system.
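
For clusters that have not yet picked up the new default, the workaround described above amounts to setting "mds_bal_interval" to 0 for the MDS daemons. The following is a minimal sketch of that step, assuming the standard `ceph config set` and `ceph config get` commands are available on an admin node. The helper names (`build_set_cmd`, `build_get_cmd`, `apply`) are hypothetical and exist only for this illustration. By default the sketch only prints the commands (dry run) rather than executing them.

```python
import shlex
import subprocess
from typing import List

# Option discussed in this bug; the bug treats 0 as disabling the balancer.
OPTION = "mds_bal_interval"


def build_set_cmd(value: int = 0, who: str = "mds") -> List[str]:
    """Build argv for `ceph config set <who> mds_bal_interval <value>`."""
    if value < 0:
        raise ValueError("mds_bal_interval must be >= 0")
    return ["ceph", "config", "set", who, OPTION, str(value)]


def build_get_cmd(who: str = "mds") -> List[str]:
    """Build argv for `ceph config get <who> mds_bal_interval`."""
    return ["ceph", "config", "get", who, OPTION]


def apply(cmd: List[str], dry_run: bool = True) -> str:
    """Return the command line in dry-run mode; otherwise run it and return stdout."""
    if dry_run:
        return shlex.join(cmd)
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout.strip()


if __name__ == "__main__":
    # Dry run: show what would be executed on the admin node.
    print(apply(build_set_cmd(0)))
    print(apply(build_get_cmd()))
```

Passing `dry_run=False` to `apply` would execute the command on a host with a working `ceph` client; that path is not exercised here.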

We'd like to see the change in 5.3z-whatever and 6.1z-whatever and beyond

See also KCS #7026124 (https://access.redhat.com/solutions/7026124).

The same customer commented that if this feature hampers performance this much and there is no intention of fixing it, the default value should be changed.



Comment 9 errata-xmlrpc 2024-02-08 16:57:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat Ceph Storage 5.3 Security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:0745