Bug 1392135

Summary: Possible data corruption when in-memory cache is enabled
Product: Red Hat Ceph Storage
Reporter: Jason Dillaman <jdillama>
Component: RBD
Assignee: Jason Dillaman <jdillama>
Status: CLOSED WONTFIX
QA Contact: ceph-qe-bugs <ceph-qe-bugs>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 1.3.3
CC: ceph-eng-bugs
Target Milestone: rc
Target Release: 1.3.4
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1392136
Environment:
Last Closed: 2018-03-07 23:52:26 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Bug Depends On:
Bug Blocks: 1392136

Description Jason Dillaman 2016-11-05 00:44:29 UTC
Description of problem:
With the RBD writeback cache enabled, read requests serviced through the cache can return corrupted data.

Version-Release number of selected component (if applicable):
1.2, 1.3

How reproducible:
Reports indicate nearly 100% reproducibility given enough time and a high-IOPS workload (e.g. a database load simulator).

Steps to Reproduce:
1. Reported to fail with Windows Server 2012R2 and SQL Server backed by an RBD image with the RBD writeback cache enabled.
2. Use SQLIOSim to inject load into the database.

Actual results:
Expected FileId: 0x0
Received FileId: 0x0
Expected PageId: 0xCB19C
Received PageId: 0xCB19A (does not match expected)
Received CheckSum: 0x9F444071
Calculated CheckSum: 0x89603EC9 (does not match expected)
Received Buffer Length: 0x2000
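
For readers unfamiliar with this kind of report: the load simulator stamps each page with its own identity and a checksum, then flags any read whose contents disagree with what was requested. The sketch below illustrates that style of check only. The header layout, the use of CRC32, and the names `build_page` and `verify_page` are assumptions made for illustration and are not SQLIOSim's actual on-disk format; only the 0x2000 page size and the example page IDs come from the output above.

```python
import struct
import zlib

PAGE_SIZE = 0x2000  # 8 KiB, matching "Received Buffer Length" above
# Hypothetical header layout: file_id, page_id, checksum (little-endian u32 each)
HEADER = struct.Struct("<III")


def build_page(file_id: int, page_id: int) -> bytes:
    """Build a page whose header records its identity and a CRC32 of its payload."""
    payload = bytes((page_id + i) & 0xFF for i in range(PAGE_SIZE - HEADER.size))
    return HEADER.pack(file_id, page_id, zlib.crc32(payload)) + payload


def verify_page(buf: bytes, expected_file_id: int, expected_page_id: int) -> list[str]:
    """Return mismatch descriptions; an empty list means the page looks intact."""
    if len(buf) != PAGE_SIZE:
        return [f"buffer length {len(buf):#x} != {PAGE_SIZE:#x}"]
    problems = []
    file_id, page_id, stored = HEADER.unpack_from(buf)
    calculated = zlib.crc32(buf[HEADER.size:])
    if file_id != expected_file_id:
        problems.append(f"FileId {file_id:#x} != expected {expected_file_id:#x}")
    if page_id != expected_page_id:
        problems.append(f"PageId {page_id:#x} != expected {expected_page_id:#x}")
    if stored != calculated:
        problems.append(f"CheckSum {stored:#x} != calculated {calculated:#x}")
    return problems


if __name__ == "__main__":
    # A read that returns a neighbouring page instead of the requested one.
    wrong_page = build_page(0, 0xCB19A)
    for line in verify_page(wrong_page, 0, 0xCB19C):
        print(line)

    # A read that returns the right page with one flipped payload bit.
    damaged = bytearray(build_page(0, 0xCB19C))
    damaged[HEADER.size + 100] ^= 0x01
    for line in verify_page(bytes(damaged), 0, 0xCB19C):
        print(line)
```

In the failure reported here both the PageId and the checksum disagree, which is what such a check would show if the returned buffer mixed data belonging to more than one page or write.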

Expected results:
No IO errors

Additional info:

Comment 3 Jason Dillaman 2016-11-05 00:48:24 UTC
Upstream backport PR against the hammer branch: https://github.com/ceph/ceph/pull/11618