Bug 2087527

Summary: [RFE] Limit the Health Detail MSG log size in cluster logs
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Vikhyat Umrao <vumrao>
Component: RADOS
Assignee: Prashant Dhange <pdhange>
Status: ASSIGNED
QA Contact: Tintu Mathew <tmathew>
Severity: medium
Priority: unspecified
Version: 6.0
CC: akupczyk, amathuri, bhubbard, ceph-eng-bugs, choffman, ksirivad, lflores, nojha, pdhange, rfriedma, rzarzyns, sseshasa, vumrao
Target Milestone: ---
Keywords: FutureFeature
Target Release: 7.1
Hardware: Unspecified
OS: Unspecified
Type: Bug

Description Vikhyat Umrao 2022-05-17 23:18:18 UTC
Description of problem:
[RFE] Limit the Health Detail MSG log size in cluster logs

Version-Release number of selected component (if applicable):
Upstream quincy

In a recent incident, one of the upstream Ceph clusters (the LRC) went into DU mode because of the bug tracked at https://tracker.ceph.com/issues/54132.

Issues caused by this problem:

- SSH errors logged massive hexdump output (a cephadm bug, already fixed)
- These logs were also stored as part of the 'ceph health detail' output
- We periodically dumped 'ceph health detail' to the cluster log (which goes through paxos, each mon db, etc.), causing the mons to lose quorum and become unresponsive
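
The kind of limit this RFE asks for could look roughly like the sketch below. It is only an illustration of the idea, not Ceph code: the function name limit_health_detail and the parameters max_lines and max_line_len are hypothetical, and the actual change may bound the message differently (for example by total bytes).

```python
def limit_health_detail(lines, max_lines=50, max_line_len=256):
    """Return a bounded copy of health detail lines, suitable for logging.

    Hypothetical helper: the name and both limits are assumptions made
    for illustration and do not correspond to existing Ceph options.
    """
    limited = []
    # Keep at most max_lines lines, and cap the length of each one.
    for line in lines[:max_lines]:
        if len(line) > max_line_len:
            dropped = len(line) - max_line_len
            line = f"{line[:max_line_len]}... [{dropped} chars truncated]"
        limited.append(line)
    # Record how much was left out so the log still shows that more exists.
    omitted = len(lines) - len(limited)
    if omitted > 0:
        limited.append(f"... [{omitted} more lines omitted]")
    return limited


if __name__ == "__main__":
    # Stand-in for an oversized detail message such as a hexdump.
    oversized = ["x" * 10000] * 200
    bounded = limit_health_detail(oversized)
    print(len(bounded), "lines,", sum(len(l) for l in bounded), "chars")
```

With a cap like this, the full detail would still be available on demand from 'ceph health detail', while the copy written periodically to the cluster log stays small regardless of what a misbehaving module puts into the health message.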