Bug 1447959

Summary: Mismatch in checksum of the image file after copying to a new image file
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: SATHEESARAN <sasundar>
Component: sharding
Assignee: Krutika Dhananjay <kdhananj>
Status: CLOSED ERRATA
QA Contact: SATHEESARAN <sasundar>
Severity: medium
Priority: unspecified
Version: rhgs-3.2
CC: amukherj, kdhananj, rhinduja, rhs-bugs, storage-qa-internal
Target Release: RHGS 3.3.0
Hardware: x86_64
OS: Linux
Fixed In Version: glusterfs-3.8.4-26
Doc Type: Bug Fix
Doc Text:
The checksum of a file could change when it was copied from a local file system to a volume with sharding enabled. If write and truncate operations were in progress simultaneously, the aggregated size was calculated incorrectly, resulting in a changed checksum. Aggregated file size is now calculated correctly in this circumstance.
Clones: 1448299 (view as bug list)
Last Closed: 2017-09-21 04:41:45 UTC
Type: Bug
Bug Depends On: 1448299
Bug Blocks: 1411323, 1417151, 1485863

Description SATHEESARAN 2017-05-04 10:17:02 UTC
Description of problem:
-----------------------
With sharding enabled on a replica 3 volume, when an image file is copied from the local filesystem ( say /home/vm1.img ) to the FUSE-mounted sharded replica 3 volume, the checksums of the source file and the copied file no longer match.

Version-Release number of selected component (if applicable):
--------------------------------------------------------------
RHGS 3.2.0 ( glusterfs-3.8.4-18.el7rhgs )

How reproducible:
-----------------
Always

Steps to Reproduce:
--------------------
1. Enable sharding on the replica 3 volume and FUSE-mount the volume (a command sketch of these steps follows the list)
2. Create a VM image file locally on the machine ( /home/vm1.img ) and install an OS on it
3. Calculate the sha256sum of the image file
4. Copy the file to the FUSE-mounted gluster volume
5. Calculate the sha256sum of the copied image file
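
A minimal shell sketch of these steps, assuming a replica 3 volume named repvol served from server1 and mounted at /mnt/repvol (all names are illustrative, not taken from this report):

  # Enable sharding on the replica 3 volume and FUSE-mount it
  gluster volume set repvol features.shard on
  mount -t glusterfs server1:/repvol /mnt/repvol

  # Checksum the locally prepared image, copy it over, checksum the copy
  sha256sum /home/vm1.img
  cp /home/vm1.img /mnt/repvol/vm1.img
  sha256sum /mnt/repvol/vm1.img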

Actual results:
---------------
The sha256sums of the source file and the copied file no longer match

Expected results:
-----------------
The sha256sums of the source file and the copied file should be the same

Comment 1 Krutika Dhananjay 2017-05-05 06:51:30 UTC
The issue is a size mismatch between the src and dst files upon `cp`, where the dst file is on a gluster mount and sharded, leading to the checksum mismatch. This particular bug is in shard's aggregated-size accounting and is exposed when there are parallel writes and an extending truncate on the file.
`cp` does a truncate on the dst file before writing to it; the parallelism comes in when write-behind flushes its cached writes concurrently with that extending truncate.
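
Given that interaction, one quick way to test the hypothesis on a scratch volume is to take write-behind out of the picture and repeat the copy (a diagnostic sketch under the same illustrative names, not a step from this report):

  # Disable write-behind so cached writes can no longer race the truncate
  gluster volume set repvol performance.write-behind off
  cp /home/vm1.img /mnt/repvol/vm1-nowb.img
  sha256sum /home/vm1.img /mnt/repvol/vm1-nowb.img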

Note that the data integrity of the vm image is *not* affected by this bug. What is affected is the size of the file.
To confirm this, I truncated the extra bytes off the dst file to make its size the same as the src file's and computed the checksum again.
In this case the checksums did match. I asked Satheesaran to verify the same as well, and he confirmed it works.
Basically, tools like md5sum and sha256sum fetch the file size and read till the end of it. So in the dst file, the excess portion is filled with zeroes and the checksum is calculated over this region too.
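
A quick way to repeat that verification (paths are illustrative):

  # Shrink the oversized dst back to the src's size, then re-compare
  truncate -s "$(stat -c %s /home/vm1.img)" /mnt/repvol/vm1.img
  sha256sum /home/vm1.img /mnt/repvol/vm1.img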

FWIW, the same checksum test exists in the upstream master regression test suite - https://github.com/gluster/glusterfs/blob/master/tests/bugs/shard/bug-1272986.t. The reason it passes there consistently is that the script performs the copy through `dd` as opposed to `cp`.
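
For comparison, this is roughly what a `dd`-based copy looks like (a sketch with illustrative paths; the block size is an assumption, not taken from the test script):

  dd if=/home/vm1.img of=/mnt/repvol/vm1-dd.img bs=1M
  sha256sum /home/vm1.img /mnt/repvol/vm1-dd.img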

Comment 2 Atin Mukherjee 2017-05-09 04:08:03 UTC
upstream patch : https://review.gluster.org/#/c/17184/

Comment 7 SATHEESARAN 2017-07-22 13:15:32 UTC
Tested with glusterfs-3.8.4-35.el7rhgs using the following steps:

1. Created a raw image and got its sha256sum value
2. Copied the same image from the local filesystem onto the FUSE-mounted filesystem
3. Recalculated the sha256sum of the file

The sha256sums calculated at steps 1 and 3 remained the same.

Comment 9 Krutika Dhananjay 2017-08-16 05:14:26 UTC
Laura,

I'm not sure the edited doc text correctly captures the issue and the fix (unless I'm reading too much into every single detail).

Let me provide some inline comments in any case:

"The checksum of a file changed when sharding was enabled."
>> So this happened after a file was copied from a local file system to a sharded gluster volume.

"This occurred because the file's shards were not correctly truncated, which meant that the sharded files had a greater aggregate size than the original file, which affected the checksum."
>> This wasn't related to shards not getting truncated. Shard does its own accounting of the aggregated file size because the file is split into multiple pieces, and it is shard's responsibility to present an aggregated view of the file (including its size) to the application. The aggregated size is maintained in the form of an extended attribute on the base (zeroth) shard. This accounting logic had a bug whenever the application sent a truncate file operation on a sharded file while writes were in progress. The incorrect accounting caused the sharding translator to assign a higher aggregated-size extended attribute value to the destination copy, so its checksum did not match that of the src file on the local disk.
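
For reference, that aggregated size can be inspected directly on a brick as an extended attribute of the base shard; a sketch (the brick path and file name are illustrative, and the hex values below are made-up placeholders):

  # Run on a brick; the shard xlator keeps its metadata on the zeroth shard
  getfattr -d -m . -e hex /bricks/brick1/vm1.img
  # trusted.glusterfs.shard.block-size=0x0000000004000000
  # trusted.glusterfs.shard.file-size=0x...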

"Files are now truncated correctly and the checksum of a sharded file matches its checksum before sharding."
>> The bug in the shard truncate fop is now fixed, and as a result the aggregated size is accounted correctly even when there are parallel writes and truncates.

Feel free to ping me if you need any more clarifications.

-Krutika

Comment 14 errata-xmlrpc 2017-09-21 04:41:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2774