Bug 2003212

Summary: [Bluestore] Remove the possibility of replay log and file inconsistency
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Vikhyat Umrao <vumrao>
Component: RADOS
Assignee: Adam Kupczyk <akupczyk>
Status: CLOSED ERRATA
QA Contact: skanta
Severity: high
Priority: medium
Version: 5.0
CC: akupczyk, bhubbard, ceph-eng-bugs, gsitlani, nojha, pdhiran, rzarzyns, sseshasa, tserlin, vereddy, vumrao
Keywords: CodeChange
Target Release: 4.3
Hardware: Unspecified
OS: Unspecified
Fixed In Version: ceph-14.2.22-10.el8cp, ceph-14.2.22-10.el7cp
Doc Type: Bug Fix
Last Closed: 2022-05-05 07:54:05 UTC

Description Vikhyat Umrao 2021-09-10 15:59:16 UTC
This bug was initially created as a copy of Bug #2003207

I am copying this bug because we want to backport the fix to the RHCS 4.x release.


Description of problem:
[Bluestore] Remove the possibility of replay log and file inconsistency
https://tracker.ceph.com/issues/50965

Under power-loss conditions, BlueFS can create corrupted files.

It is possible to create a condition in which BlueFS contains a corrupted file. This can happen when the BlueFS replay log resides on device A while the file data was just written to device B.

Scenario:
1) write to file h1 on a SLOW device
2) flush h1 (this marks h1 for addition to the BlueFS replay log, but no fdatasync is issued yet)
3) write to file h2 on DB
4) fsync h2 (forces the replay log to be written, after an fdatasync to the DB device only)
5) power off

As a result, file h1 is properly declared in the replay log but its content is uninitialized.
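The ordering above can be sketched with a hypothetical in-memory device model (not real BlueFS code): each `Device` has a volatile cache and durable storage, `fdatasync` makes cached writes durable, and a power cut discards the cache. The names `Device` and `pending_log` are illustrative assumptions, not Ceph APIs.

```python
class Device:
    """Toy block device: writes land in a volatile cache until fdatasync."""
    def __init__(self, name):
        self.name = name
        self.cache = {}      # writes not yet made durable
        self.durable = {}    # what survives a power cut

    def write(self, path, data):
        self.cache[path] = data

    def fdatasync(self):
        # Persist everything cached on THIS device only.
        self.durable.update(self.cache)
        self.cache.clear()

    def power_off(self):
        # Volatile writes are lost.
        self.cache.clear()


slow = Device("SLOW")        # holds file data for h1
db = Device("DB")            # holds file data for h2 AND the replay log

# 1) write to file h1 on the SLOW device
slow.write("h1", b"h1 payload")
# 2) flush h1: h1's declaration is queued for the replay log (no fdatasync yet)
pending_log = [("declare", "h1", len(b"h1 payload"))]
# 3) write to file h2 on DB
db.write("h2", b"h2 payload")
# 4) fsync h2: forces the replay log out, with an fdatasync on DB only
db.write("replay_log", repr(pending_log).encode())
db.fdatasync()               # DB is now durable; SLOW was never synced
# 5) power off
slow.power_off()
db.power_off()

# Replay: the log durably declares h1, yet h1's data never reached SLOW.
assert "replay_log" in db.durable   # h1 is declared in the replay log
assert "h1" not in slow.durable     # ...but its content is uninitialized
```

The key point the sketch shows is that the fdatasync in step 4 covers only the DB device, so the replay log becomes durable while the data it describes on the SLOW device does not.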


Version-Release number of selected component (if applicable):
RHCS 4.x, 5.x

Comment 11 errata-xmlrpc 2022-05-05 07:54:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat Ceph Storage 4.3 Security and Bug Fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1716