Bug 1849478

Summary: [RADOS] Backport changes related to bluefs log not being compacted and possibly getting corrupted after growing to extreme size
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Prashant Dhange <pdhange>
Component: RADOS
Assignee: Neha Ojha <nojha>
Status: CLOSED ERRATA
QA Contact: Manohar Murthy <mmurthy>
Severity: high
Docs Contact: Amrita <asakthiv>
Priority: high    
Version: 3.3
CC: akupczyk, asakthiv, bhubbard, ceph-eng-bugs, ceph-qe-bugs, dzafman, gsitlani, kchai, nojha, rzarzyns, sseshasa, tchandra, tserlin
Target Milestone: z6
Flags: pdhange: automate_bug?
Target Release: 3.3   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: RHEL: ceph-12.2.12-120.el7cp Ubuntu: ceph_12.2.12-108redhat1
Doc Type: Bug Fix
Doc Text:
Previously, BlueFS failed to replay a log that had been corrupted during an earlier log write. The corruption occurred because the BlueFS log grew to an extreme size: in some situations the OSD never invoked sync_metadata, and even when sync_metadata was invoked, the BlueFS log was not compacted if there was no new log data to flush. The corrupted log prevented BlueStore from mounting and caused data loss on multiple OSDs. With this update, sync_metadata is invoked as expected, the BlueFS log is compacted even when there is no new log data to flush, and log corruption is avoided when the log is expanded. This prevents OSDs from filling up due to runaway BlueFS log growth and prevents BlueFS corruption. As a result, the log replays successfully and there is no data loss. (An illustrative sketch of the compaction check follows the header fields below.)
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-08-18 18:05:58 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
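
The following is a minimal, self-contained C++ sketch of the decision flow described in the Doc Text above. It is NOT the actual Ceph BlueFS code; the type, field, and helper names (FakeBlueFS, flush_dirty, compact_log, compact_threshold) are hypothetical stand-ins. It only illustrates the key idea of the fix: the compaction check runs on every sync_metadata call, even when there is no new dirty metadata to flush.

// Illustrative sketch only (hypothetical names, not Ceph's BlueFS implementation).
#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

struct FakeBlueFS {
  // Hypothetical state standing in for BlueFS internals.
  uint64_t log_size_bytes = 0;            // current size of the replay log
  uint64_t compact_threshold = 16 << 20;  // compact once the log exceeds 16 MiB
  std::vector<std::string> dirty_files;   // metadata not yet flushed

  void flush_dirty() {
    // Write out pending metadata updates (placeholder: just grows the log).
    log_size_bytes += dirty_files.size() * 4096;
    dirty_files.clear();
  }

  void compact_log() {
    // Rewrite the log as a compact snapshot of current state (placeholder).
    std::cout << "compacting log, old size=" << log_size_bytes << " bytes\n";
    log_size_bytes = 4096;  // the snapshot is small again
  }

  // Analogue of the fixed sync_metadata(): before the fix, the compaction
  // check could be skipped when there was nothing to flush, so the log
  // kept growing until it corrupted on replay.
  void sync_metadata() {
    if (!dirty_files.empty()) {
      flush_dirty();
    }
    // Key change: check for compaction unconditionally, not only after a flush.
    if (log_size_bytes > compact_threshold) {
      compact_log();
    }
  }
};

int main() {
  FakeBlueFS fs;
  fs.log_size_bytes = 64ull << 20;  // simulate a log that has already grown large
  fs.sync_metadata();               // compacts even though dirty_files is empty
  std::cout << "log size after sync: " << fs.log_size_bytes << " bytes\n";
}

The actual backport changes BlueFS's internal sync and log-compaction paths; the sketch only mirrors the behavioral change described in the Doc Text.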

Comment 14 errata-xmlrpc 2020-08-18 18:05:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat Ceph Storage 3.3 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:3504

Comment 15 Neha Ojha 2020-08-18 18:15:11 UTC
Requested info has already been provided.