Bug 1448299 - Mismatch in checksum of the image file after copying to a new image file
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: sharding
Version: mainline
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Krutika Dhananjay
QA Contact: bugs@gluster.org
URL:
Whiteboard:
Depends On:
Blocks: 1447959
Reported: 2017-05-05 06:36 UTC by Krutika Dhananjay
Modified: 2017-09-05 17:29 UTC (History)

Fixed In Version: glusterfs-3.12.0
Doc Text:
Clone Of: 1447959
Environment:
Last Closed: 2017-09-05 17:29:02 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Krutika Dhananjay 2017-05-05 06:36:40 UTC
+++ This bug was initially created as a clone of Bug #1447959 +++

Description of problem:
-----------------------
With sharding enabled on a replica 3 volume, copying an image file from the local filesystem (say, /home/vm1.img) to the FUSE-mounted volume produces a copy whose checksum no longer matches that of the source file.

Version-Release number of selected component (if applicable):
--------------------------------------------------------------
RHGS 3.2.0 ( glusterfs-3.8.4-18.el7rhgs )

How reproducible:
-----------------
Always

Steps to Reproduce:
--------------------
1. Enable sharding on the replica 3 volume and FUSE-mount the volume
2. Create a VM image file locally on the machine (/home/vm1.img) and install an OS on it
3. Calculate the sha256sum of the image file
4. Copy the file to the fuse mounted gluster volume.
5. Calculate the checksum of the copied image file

Actual results:
---------------
sha256sum of the source file and the copied file no longer match

Expected results:
-----------------
sha256sum of source file and the copied file should be the same
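
The comparison in steps 3 and 5 can be sketched in Python as below. This is only an illustrative helper, not part of the original report; the function names `sha256_of_file` and `compare_copy` are made up for this sketch. It also reports both file sizes, since a size difference is the first thing worth checking when the checksums differ.

```python
import hashlib
import os


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read until EOF; EOF is determined by the size the filesystem reports.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_copy(src, dst):
    """Report the sizes of src and dst and whether their checksums agree."""
    return {
        "src_size": os.path.getsize(src),
        "dst_size": os.path.getsize(dst),
        "checksums_match": sha256_of_file(src) == sha256_of_file(dst),
    }
```

For example, `compare_copy("/home/vm1.img", "/mnt/glustervol/vm1.img")`, where `/mnt/glustervol` is an assumed mount point and should be replaced with the actual FUSE mount path.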

Comment 1 Krutika Dhananjay 2017-05-05 06:50:41 UTC
The issue is a size mismatch between the src and dst files after `cp`, where the dst file is on a gluster mount and sharded; the size mismatch in turn causes the checksum mismatch. This particular bug is in shard's aggregated size accounting and is exposed when writes and an extending truncate run in parallel on the file.
`cp` truncates the dst file before writing to it. The parallelism comes in when write-behind flushes its cached writes in parallel with an extending truncate.
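
To make the idea of a size-accounting race concrete, here is a deliberately simplified toy model in Python. It is NOT the shard translator's code, and the exact mechanism inside shard is an assumption on my part: the model supposes that each operation computes a size delta against a cached size and adds that delta to a stored aggregate size. The class `ToySizeAccounting` and its fields are hypothetical. The parallel interleaving is serialized here so the outcome is deterministic.

```python
class ToySizeAccounting:
    """Hypothetical model: a stored size updated by deltas computed
    against a cached size. Not the actual shard implementation."""

    def __init__(self, refresh_cache_on_truncate):
        self.stored_size = 0  # authoritative aggregated file size
        self.cached_size = 0  # in-memory view used to compute deltas
        self.refresh_cache_on_truncate = refresh_cache_on_truncate

    def write(self, offset, length):
        end = offset + length
        delta = max(0, end - self.cached_size)
        # Writes refresh the cached size before applying the delta.
        self.cached_size += delta
        self.stored_size += delta

    def extending_truncate(self, new_size):
        delta = max(0, new_size - self.cached_size)
        # Assumed bug: the cache is left stale unless this flag is set.
        if self.refresh_cache_on_truncate:
            self.cached_size += delta
        self.stored_size += delta


def run_scenario(refresh_cache_on_truncate):
    """Extending truncate to 100 bytes, then a write covering bytes 0..99."""
    acct = ToySizeAccounting(refresh_cache_on_truncate)
    acct.extending_truncate(100)
    acct.write(0, 100)
    return acct.stored_size
```

In this model `run_scenario(False)` yields a stored size of 200 for a file that really holds 100 bytes, because the write computes its delta against the stale cached size, while `run_scenario(True)` yields 100.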

Note that the data integrity of the VM image is *not* affected by this bug. What is affected is the size of the file.
To confirm this, I truncated the extra bytes off the dst file so that its size matched that of the src file and computed the checksum again.
In this case the checksums did match. I also asked Satheesaran to verify this, and he confirmed it works.
Basically, md5sum, sha256sum, etc. fetch the file size and read up to that size. In the dst file the excess portion is filled with zeroes, and the checksum is calculated over this region too.
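
The truncate check described above can be reproduced on any local filesystem with a short Python sketch. This does not touch gluster at all: the inflated size is simulated by extending the copy with `os.truncate`, which pads the file with zeroes. The function name `demo` and the 4096-byte excess are arbitrary choices for illustration.

```python
import hashlib
import os
import shutil
import tempfile


def sha256_of_file(path):
    """Return the hex SHA-256 digest of the whole file."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def demo(extra_bytes=4096):
    """Return (match_with_excess, match_after_trim) for a simulated copy."""
    workdir = tempfile.mkdtemp()
    src = os.path.join(workdir, "src.img")
    dst = os.path.join(workdir, "dst.img")
    try:
        with open(src, "wb") as f:
            f.write(os.urandom(64 * 1024))
        shutil.copyfile(src, dst)

        # Simulate the inflated size: extend dst with a zero-filled tail.
        os.truncate(dst, os.path.getsize(src) + extra_bytes)
        match_with_excess = sha256_of_file(src) == sha256_of_file(dst)

        # Trim dst back to the size of src, as done in the check above.
        os.truncate(dst, os.path.getsize(src))
        match_after_trim = sha256_of_file(src) == sha256_of_file(dst)
        return match_with_excess, match_after_trim
    finally:
        shutil.rmtree(workdir)
```

The checksums differ while the zero-filled tail is present and agree again once the copy is trimmed to the source size, which is consistent with the data itself being intact.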

FWIW, the same checksum test exists in the upstream master regression test suite - https://github.com/gluster/glusterfs/blob/master/tests/bugs/shard/bug-1272986.t. It passes there consistently because the script performs the copy through `dd` as opposed to `cp`.

Comment 2 Worker Ant 2017-05-05 10:17:46 UTC
REVIEW: https://review.gluster.org/17184 (features/shard: Set size in inode ctx before size update for truncate too) posted (#2) for review on master by Krutika Dhananjay (kdhananj)

Comment 3 Worker Ant 2017-05-08 09:01:36 UTC
REVIEW: https://review.gluster.org/17184 (features/shard: Set size in inode ctx before size update for truncate too) posted (#3) for review on master by Krutika Dhananjay (kdhananj)

Comment 4 Worker Ant 2017-05-09 04:05:01 UTC
REVIEW: https://review.gluster.org/17184 (features/shard: Set size in inode ctx before size update for truncate too) posted (#4) for review on master by Krutika Dhananjay (kdhananj)

Comment 5 Worker Ant 2017-05-10 06:07:40 UTC
REVIEW: https://review.gluster.org/17184 (features/shard: Set size in inode ctx before size update for truncate too) posted (#5) for review on master by Krutika Dhananjay (kdhananj)

Comment 6 Worker Ant 2017-05-10 12:33:55 UTC
COMMIT: https://review.gluster.org/17184 committed in master by Pranith Kumar Karampuri (pkarampu) 
------
commit 9df83b504a01e86d3b73af6c40df0c94cd2cd97a
Author: Krutika Dhananjay <kdhananj>
Date:   Fri May 5 14:30:49 2017 +0530

    features/shard: Set size in inode ctx before size update for truncate too
    
    Change-Id: I7e984bb0f50c7d42764c0648e697d94d6c768dc7
    BUG: 1448299
    Signed-off-by: Krutika Dhananjay <kdhananj>
    Reviewed-on: https://review.gluster.org/17184
    CentOS-regression: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    Smoke: Gluster Build System <jenkins.org>

Comment 7 Shyamsundar 2017-09-05 17:29:02 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.12.0, please open a new bug report.

glusterfs-3.12.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-September/000082.html
[2] https://www.gluster.org/pipermail/gluster-users/

