Bug 1476870 - [EC]: md5sum mismatches every time for a file from the fuse client on EC volume
Summary: [EC]: md5sum mismatches every time for a file from the fuse client on EC volume
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: disperse
Version: 3.11
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Ashish Pandey
QA Contact:
URL:
Whiteboard:
Depends On: 1476205
Blocks:
 
Reported: 2017-07-31 16:31 UTC by Ashish Pandey
Modified: 2017-08-24 14:46 UTC (History)
9 users

Fixed In Version: glusterfs-3.11.3
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1476205
Environment:
Last Closed: 2017-08-24 14:46:21 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Ashish Pandey 2017-07-31 16:31:54 UTC
+++ This bug was initially created as a clone of Bug #1476205 +++

+++ This bug was initially created as a clone of Bug #1472764 +++

Description of problem:
-----------------------

Some files give different md5sums when checked from different mounts of the same volume.
Example:
10.70.43.4:/master on /mnt/glusterfs type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
10.70.43.4:/master on /mnt/master type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
10.70.43.27:/slave on /mnt/slave type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
[root@fan ~]# md5sum /mnt/glusterfs/thread0/level01/level11/level21/level31/level41/level51/level61/level71/596c62e1%%HPJLBVY86T
9c0de9e06e88ddba88248e9c8e502f82  /mnt/glusterfs/thread0/level01/level11/level21/level31/level41/level51/level61/level71/596c62e1%%HPJLBVY86T
[root@fan ~]# ls -l /mnt/glusterfs/thread0/level01/level11/level21/level31/level41/level51/level61/level71/596c62e1%%HPJLBVY86T
-rwxrwxr--. 3 42023 13883 13526 Jul 18 05:29 /mnt/glusterfs/thread0/level01/level11/level21/level31/level41/level51/level61/level71/596c62e1%%HPJLBVY86T
[root@fan ~]# ls -l /mnt/master/thread0/level01/level11/level21/level31/level41/level51/level61/level71/596c62e1%%HPJLBVY86T
-rwxrwxr--. 3 42023 13883 13526 Jul 18 05:29 /mnt/master/thread0/level01/level11/level21/level31/level41/level51/level61/level71/596c62e1%%HPJLBVY86T
[root@fan ~]# md5sum /mnt/master/thread0/level01/level11/level21/level31/level41/level51/level61/level71/596c62e1%%HPJLBVY86T
16f92bf8f96094a214b1d0293c1accda  /mnt/master/thread0/level01/level11/level21/level31/level41/level51/level61/level71/596c62e1%%HPJLBVY86T

Further analysis by Ashish shows the following:

The files look corrupted on the backend. To confirm, the volume was mounted on 4 mount points and different md5sums were obtained; this is likely because each mount point reads the file from a different set of 4 bricks. To verify, 2 bricks from each subvolume were killed and the md5sums were calculated again; this time all md5sums matched.
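To illustrate why a single stale fragment produces this behaviour in a 4+2 disperse set: a read needs only 4 of the 6 fragments, so mounts whose chosen read set avoids the stale fragment decode the original data while the others do not. A rough standalone sketch follows (plain C, not GlusterFS code; the stale brick index is a hypothetical placeholder):

#include <stdio.h>

/* Count how many 4-of-6 read sets include a single stale fragment.
 * Only the sets that avoid it decode the data the client originally wrote,
 * which is why different mounts can report different md5sums. */
int main(void)
{
    const int bricks = 6, needed = 4;
    const int bad_brick = 4;            /* hypothetical index of the stale fragment */
    int clean = 0, tainted = 0;

    for (int mask = 0; mask < (1 << bricks); mask++) {
        if (__builtin_popcount(mask) != needed)
            continue;                   /* not a valid 4-brick read set */
        if (mask & (1 << bad_brick))
            tainted++;                  /* decodes using the stale fragment */
        else
            clean++;
    }
    printf("%d read sets decode correctly, %d include the stale fragment\n",
           clean, tainted);             /* prints: 5 correct, 10 stale */
    return 0;
}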



[root@fan tmp]# md5sum /mnt/glusterfs/thread0/level01/level11/level21/level31/level41/level51/level61/level71/596c62e1%%HPJLBVY86T
0bc19eac0693355a2e954ba3d66d4ef1  /mnt/glusterfs/thread0/level01/level11/level21/level31/level41/level51/level61/level71/596c62e1%%HPJLBVY86T

[root@fan tmp]# md5sum /mnt/master/thread0/level01/level11/level21/level31/level41/level51/level61/level71/596c62e1%%HPJLBVY86T
0bc19eac0693355a2e954ba3d66d4ef1  /mnt/master/thread0/level01/level11/level21/level31/level41/level51/level61/level71/596c62e1%%HPJLBVY86T

[root@fan tmp]# md5sum /mnt/temp_master/thread0/level01/level11/level21/level31/level41/level51/level61/level71/596c62e1%%HPJLBVY86T
0bc19eac0693355a2e954ba3d66d4ef1  /mnt/temp_master/thread0/level01/level11/level21/level31/level41/level51/level61/level71/596c62e1%%HPJLBVY86T

[root@fan tmp]# md5sum /mnt/temp-2/thread0/level01/level11/level21/level31/level41/level51/level61/level71/596c62e1%%HPJLBVY86T
0bc19eac0693355a2e954ba3d66d4ef1  /mnt/temp-2/thread0/level01/level11/level21/level31/level41/level51/level61/level71/596c62e1%%HPJLBVY86T


Version-Release number of selected component (if applicable):
==============================================================

glusterfs-3.8.4-18.5.el7rhgs.x86_64


How reproducible:
=================
Will retry


Steps carried:
==============

The steps were carried out in a geo-rep setup; however, the checksum mismatches on the master. Providing only the steps that are needed, to avoid confusion.

1. Create 6 node Master Cluster
2. Create EC volume on Master (2x(4+2))
3. Mount the volume (Fuse)
4. Using crefi, create the data set. The fops carried out, in order, were:

create, chmod, chown, chgrp, hardlink, symlink, truncate

5. After every fop, calculate the checksum of the mount (at least twice)
6. After truncate, the checksums never matched

Actual results:
===============

The checksum for a file mismatches (possibly data corruption) every time it is calculated from a different mount of the same volume.


Expected results:
=================

The checksum should match every time, from all the different mounts.

--- Additional comment from Red Hat Bugzilla Rules Engine on 2017-07-19 07:32:52 EDT ---

This bug is automatically being proposed for the current release of Red Hat Gluster Storage 3 under active development, by setting the release flag 'rhgs-3.3.0' to '?'.

If this bug should be proposed for a different release, please manually change the proposed release flag.

--- Additional comment from Rahul Hinduja on 2017-07-19 08:53:05 EDT ---

sosreports at: 

http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1472764/

--- Additional comment from Ashish Pandey on 2017-07-19 12:45:52 EDT ---

(In reply to Rahul Hinduja from comment #0)
> Description of problem:
> -----------------------
> 
> some files are giving different md5sum when executed from different mounts
> on the same volume.
> Example:
> 10.70.43.4:/master on /mnt/glusterfs type fuse.glusterfs
> (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,
> max_read=131072)
> 10.70.43.4:/master on /mnt/master type fuse.glusterfs
> (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,
> max_read=131072)
> 10.70.43.27:/slave on /mnt/slave type fuse.glusterfs
> (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,
> max_read=131072)
> [root@fan ~]# md5sum
> /mnt/glusterfs/thread0/level01/level11/level21/level31/level41/level51/
> level61/level71/596c62e1%%HPJLBVY86T
> 9c0de9e06e88ddba88248e9c8e502f82 
> /mnt/glusterfs/thread0/level01/level11/level21/level31/level41/level51/
> level61/level71/596c62e1%%HPJLBVY86T
> [root@fan ~]# ls -l
> /mnt/glusterfs/thread0/level01/level11/level21/level31/level41/level51/
> level61/level71/596c62e1%%HPJLBVY86T
> -rwxrwxr--. 3 42023 13883 13526 Jul 18 05:29
> /mnt/glusterfs/thread0/level01/level11/level21/level31/level41/level51/
> level61/level71/596c62e1%%HPJLBVY86T
> [root@fan ~]# ls -l
> /mnt/master/thread0/level01/level11/level21/level31/level41/level51/level61/
> level71/596c62e1%%HPJLBVY86T
> -rwxrwxr--. 3 42023 13883 13526 Jul 18 05:29
> /mnt/master/thread0/level01/level11/level21/level31/level41/level51/level61/
> level71/596c62e1%%HPJLBVY86T
> [root@fan ~]# md5sum
> /mnt/master/thread0/level01/level11/level21/level31/level41/level51/level61/
> level71/596c62e1%%HPJLBVY86T
> 16f92bf8f96094a214b1d0293c1accda 
> /mnt/master/thread0/level01/level11/level21/level31/level41/level51/level61/
> level71/596c62e1%%HPJLBVY86T
> 
> Further Analysis by Ashish shows the following:
> 
> Looks like files are corrupted at backend, to confirm mounted the volume on
> 4 mount points and got the different md5sum. This could be because all mount
> points are reading file from different sets of 4 bricks. To confirm this,
> killed 2 bricks from each subvolume and again calculated md5sum. This time
> all md5sums are same.
> 
> 
> 
> [root@fan tmp]# md5sum
> /mnt/glusterfs/thread0/level01/level11/level21/level31/level41/level51/
> level61/level71/596c62e1%%HPJLBVY86T
> 0bc19eac0693355a2e954ba3d66d4ef1 
> /mnt/glusterfs/thread0/level01/level11/level21/level31/level41/level51/
> level61/level71/596c62e1%%HPJLBVY86T
> 
> [root@fan tmp]# md5sum
> /mnt/master/thread0/level01/level11/level21/level31/level41/level51/level61/
> level71/596c62e1%%HPJLBVY86T
> 0bc19eac0693355a2e954ba3d66d4ef1 
> /mnt/master/thread0/level01/level11/level21/level31/level41/level51/level61/
> level71/596c62e1%%HPJLBVY86T
> 
> [root@fan tmp]# md5sum
> /mnt/temp_master/thread0/level01/level11/level21/level31/level41/level51/
> level61/level71/596c62e1%%HPJLBVY86T
> 0bc19eac0693355a2e954ba3d66d4ef1 
> /mnt/temp_master/thread0/level01/level11/level21/level31/level41/level51/
> level61/level71/596c62e1%%HPJLBVY86T
> 
> [root@fan tmp]# md5sum
> /mnt/temp-2/thread0/level01/level11/level21/level31/level41/level51/level61/
> level71/596c62e1%%HPJLBVY86T
> 0bc19eac0693355a2e954ba3d66d4ef1 
> /mnt/temp-2/thread0/level01/level11/level21/level31/level41/level51/level61/
> level71/596c62e1%%HPJLBVY86T
> 
> 
> Version-Release number of selected component (if applicable):
> ==============================================================
> 
> glusterfs-3.8.4-18.5.el7rhgs.x86_64
> 
> 
> How reproducible:
> =================
> Will retry
> 
> 
> Steps carried:
> ==============
> 
> The steps carried were in geo-rep setup, however the checksum missmatches on
> master. Providing only the steps that is needed to avoid confusion. 
> 
> 1. Create 6 node Master Cluster
> 2. Create EC volume on Master (2x(4+2))
> 3. Mount the volume (Fuse)
> 4. Using crefi, create the data set. fops carried inorder were:
> 
> create, chmod, chown, chgrp, hardlink, symlink, truncate
> 
> 5. After every fop, calculate the checksum of mount (At least twice)
> 6. After truncate, checksum never matched
> 

Just to clarify - 

- The fops order does not mention when you (the script) wrote data to the file.
- Is it possible that you truncated the file while IO was going on?
- Can we somehow find out which truncate command actually truncated the file in
  question? I mean the "truncate -s <??>" value.
- Also, what was the original size of the file, if the truncate happened after the write completed?




> Actual results:
> ===============
> 
> Checksum mismatches for a file (Might be data corruption) every time
> calculated from different mount for the same volume. 
> 
> 
> Expected results:
> =================
> 
> Checksum should match all the time from all the different mounts.

--- Additional comment from Rahul Hinduja on 2017-07-20 02:25:03 EDT ---

> 
> Just to clarify - 
> 
> - Fops order does not mention when did you (script) write data on the file
> - Is it possible that while IO is going on, you truncated the file?

The next fop starts only after the earlier fop has completed.

> - Can we anyhow find out what truncate command actually truncated the file
> in   
>   question? I mean the "truncate -s <??>" 

AFAIK, Crefi does not have a way to tell this, and its output does not help here. I will keep a watch when I try next time.

> - Also, what was the original size of the file if it happened after write
> completed?

It is a random workload; each file has a different size between min=1K and max=5K. After the truncate, I cannot provide the original size at the moment.

--- Additional comment from Ashish Pandey on 2017-07-20 07:10:33 EDT ---


I tried to reproduce it on my laptop but did not succeed.
I tried different sets of options, like write and chown with truncate, but nothing helped.
I went through the brick logs and did not find any significant error message except a quota inode_link issue.

Tried on the latest 3.2 as well as the latest upstream.

Rahul, could you please give us the exact Crefi command you used, so that I can run it for one file and see what happens?

--- Additional comment from Rahul Hinduja on 2017-07-22 13:20:42 EDT ---

> Rahul, could you please give us the exact CREFI command you used so that I
> can run it for one file and see what happens.

How to setup and run crefi:
---------------------------

Setup:

1. git clone using:

git clone https://github.com/vijaykumar-koppad/Crefi.git

2. Ensure you have the following packages installed. If not, please subscribe to the base channel and install them:

a. gcc
b. python-setuptools
c. pyxattr

pyxattr can be downloaded directly from: https://brewweb.engineering.redhat.com/brew/

3. cd Crefi

4. python setup.py install

Run:

[root@fan ~]# crefi
usage: crefi [-h] [-V] [-n FILES] [--size SIZE] [--random] [--max MAX]
             [--min MIN] [--single] [--multi] [-b BRDTH] [-d DEPTH] [-l FLEN]
             [-t {text,sparse,binary,tar}] [-I INTER]
             [--fop {create,rename,chmod,chown,chgrp,symlink,hardlink,truncate,setxattr}]
             [-R] [-T THREADS]
             mntpnt
crefi: error: too few arguments
[root@fan ~]#

Example:

crefi --multi -n 5 -b 5 -d 5 --max=10k --min=5k --random -T 5 -t text --fop=create <mnt>

--- Additional comment from Rahul Hinduja on 2017-07-22 13:24:55 EDT ---

I have hit this issue again with the same steps as mentioned in the bug. I have carried out these same steps on earlier releases more than 10 times but never hit this. I am marking this as a regression (via use case) and proposing it as a blocker now due to the data corruption.

[root@dj ~]# df | grep master
10.70.37.189:/master     104857600 1623040 103234560   2% /mnt/glusterfs
10.70.37.189:/master     104857600 1623040 103234560   2% /mnt/master
10.70.37.189:/master     104857600 1623040 103234560   2% /mnt/new
10.70.37.189:/master     104857600 1623040 103234560   2% /mnt/new2
10.70.37.189:/master     104857600 1623040 103234560   2% /mnt/latestgluster
[root@dj ~]#

[root@dj ~]# md5sum /mnt/master/thread0/level05/level15/59729275%%OTYJP4U0L5
d061eb37da89f099e22b8e7a267ba4e0  /mnt/master/thread0/level05/level15/59729275%%OTYJP4U0L5
[root@dj ~]# mkdir /mnt/new
[root@dj ~]# md5sum /mnt/glusterfs/thread0/level05/level15/59729275%%OTYJP4U0L5
b8e8eaabbbea12a160477f56bcb4485d  /mnt/glusterfs/thread0/level05/level15/59729275%%OTYJP4U0L5
[root@dj ~]# mount -t glusterfs 10.70.37.189:/master /mnt/new/
[root@dj ~]# md5sum /mnt/new/thread0/level05/level15/59729275%%OTYJP4U0L5
d061eb37da89f099e22b8e7a267ba4e0  /mnt/new/thread0/level05/level15/59729275%%OTYJP4U0L5
[root@dj ~]#
[root@dj ~]# md5sum /mnt/new2/thread0/level05/level15/59729275%%OTYJP4U0L5
d061eb37da89f099e22b8e7a267ba4e0  /mnt/new2/thread0/level05/level15/59729275%%OTYJP4U0L5
[root@dj ~]# md5sum /mnt/glusterfs/thread0/level05/level15/59729275%%OTYJP4U0L5
b8e8eaabbbea12a160477f56bcb4485d  /mnt/glusterfs/thread0/level05/level15/59729275%%OTYJP4U0L5
[root@dj ~]# mkdir /mnt/latestgluster
[root@dj ~]# mount -t glusterfs 10.70.37.189:/master /mnt/latestgluster/
[root@dj ~]# md5sum /mnt/latestgluster/thread0/level05/level15/59729275%%OTYJP4U0L5
330c99082ff298c3b75766be571cb4f7  /mnt/latestgluster/thread0/level05/level15/59729275%%OTYJP4U0L5
[root@dj ~]#

I have hit this issue during the rename fop carried out after the truncate fop.

Exact crefi commands are as follows:
[geo_rahul@skywalker ~]$ grep -ri "crefi" /home/geo_rahul/regression/3.3.0-RHEL7.4-reg/FUSE-RSYNC-EC.log | grep -i "RETCODE is 0"
2017-07-22 05:02:13,986 INFO run "crefi --multi -n 5 -b 10 -d 10 --max=200K --min=10K --random -T 10 -t text --fop=create /mnt/glusterfs 1>/dev/null 2>&1" on dj.lab.eng.blr.redhat.com: RETCODE is 0
2017-07-22 05:12:32,798 INFO run "crefi --multi -n 5 -b 10 -d 10 --max=200K --min=10K --random -T 10 -t text --fop=chmod /mnt/glusterfs 1>/dev/null 2>&1" on dj.lab.eng.blr.redhat.com: RETCODE is 0
2017-07-22 05:18:09,531 INFO run "crefi --multi -n 5 -b 10 -d 10 --max=200K --min=10K --random -T 10 -t text --fop=chown /mnt/glusterfs 1>/dev/null 2>&1" on dj.lab.eng.blr.redhat.com: RETCODE is 0
2017-07-22 05:23:24,575 INFO run "crefi --multi -n 5 -b 10 -d 10 --max=200K --min=10K --random -T 10 -t text --fop=chgrp /mnt/glusterfs 1>/dev/null 2>&1" on dj.lab.eng.blr.redhat.com: RETCODE is 0
2017-07-22 05:29:21,515 INFO run "crefi --multi -n 5 -b 10 -d 10 --max=200K --min=10K --random -T 10 -t text --fop=symlink /mnt/glusterfs 1>/dev/null 2>&1" on dj.lab.eng.blr.redhat.com: RETCODE is 0
2017-07-22 05:42:05,076 INFO run "crefi --multi -n 5 -b 10 -d 10 --max=200K --min=10K --random -T 10 -t text --fop=hardlink /mnt/glusterfs 1>/dev/null 2>&1" on dj.lab.eng.blr.redhat.com: RETCODE is 0
2017-07-22 05:57:25,711 INFO run "crefi --multi -n 5 -b 10 -d 10 --max=200K --min=10K --random -T 10 -t text --fop=truncate /mnt/glusterfs 1>/dev/null 2>&1" on dj.lab.eng.blr.redhat.com: RETCODE is 0
2017-07-22 06:15:38,047 INFO run "crefi --multi -n 5 -b 10 -d 10 --max=200K --min=10K --random -T 10 -t text --fop=rename /mnt/glusterfs 1>/dev/null 2>&1" on dj.lab.eng.blr.redhat.com: RETCODE is 0
2017-07-22 06:51:36,294 INFO run "crefi --multi -n 5 -b 10 -d 10 --max=200K --min=10K --random -T 10 -t text --fop=create /mnt/glusterfs 1>/dev/null 2>&1" on dj.lab.eng.blr.redhat.com: RETCODE is 0
2017-07-22 07:02:11,336 INFO run "crefi --multi -n 5 -b 10 -d 10 --max=200K --min=10K --random -T 10 -t text --fop=chmod /mnt/glusterfs 1>/dev/null 2>&1" on dj.lab.eng.blr.redhat.com: RETCODE is 0
2017-07-22 07:12:21,204 INFO run "crefi --multi -n 5 -b 10 -d 10 --max=200K --min=10K --random -T 10 -t text --fop=chown /mnt/glusterfs 1>/dev/null 2>&1" on dj.lab.eng.blr.redhat.com: RETCODE is 0
2017-07-22 07:22:41,819 INFO run "crefi --multi -n 5 -b 10 -d 10 --max=200K --min=10K --random -T 10 -t text --fop=chgrp /mnt/glusterfs 1>/dev/null 2>&1" on dj.lab.eng.blr.redhat.com: RETCODE is 0
2017-07-22 07:33:42,973 INFO run "crefi --multi -n 5 -b 10 -d 10 --max=200K --min=10K --random -T 10 -t text --fop=symlink /mnt/glusterfs 1>/dev/null 2>&1" on dj.lab.eng.blr.redhat.com: RETCODE is 0
2017-07-22 07:47:40,119 INFO run "crefi --multi -n 5 -b 10 -d 10 --max=200K --min=10K --random -T 10 -t text --fop=hardlink /mnt/glusterfs 1>/dev/null 2>&1" on dj.lab.eng.blr.redhat.com: RETCODE is 0
2017-07-22 08:03:19,522 INFO run "crefi --multi -n 5 -b 10 -d 10 --max=200K --min=10K --random -T 10 -t text --fop=truncate /mnt/glusterfs 1>/dev/null 2>&1" on dj.lab.eng.blr.redhat.com: RETCODE is 0
2017-07-22 08:22:11,782 INFO run "crefi --multi -n 5 -b 10 -d 10 --max=200K --min=10K --random -T 10 -t text --fop=create /mnt/glusterfs 1>/dev/null 2>&1" on dj.lab.eng.blr.redhat.com: RETCODE is 0
2017-07-22 08:38:03,876 INFO run "crefi --multi -n 5 -b 10 -d 10 --max=200K --min=10K --random -T 10 -t text --fop=chmod /mnt/glusterfs 1>/dev/null 2>&1" on dj.lab.eng.blr.redhat.com: RETCODE is 0
2017-07-22 08:58:24,953 INFO run "crefi --multi -n 5 -b 10 -d 10 --max=200K --min=10K --random -T 10 -t text --fop=chown /mnt/glusterfs 1>/dev/null 2>&1" on dj.lab.eng.blr.redhat.com: RETCODE is 0
2017-07-22 09:19:06,560 INFO run "crefi --multi -n 5 -b 10 -d 10 --max=200K --min=10K --random -T 10 -t text --fop=chgrp /mnt/glusterfs 1>/dev/null 2>&1" on dj.lab.eng.blr.redhat.com: RETCODE is 0
2017-07-22 09:40:13,765 INFO run "crefi --multi -n 5 -b 10 -d 10 --max=200K --min=10K --random -T 10 -t text --fop=symlink /mnt/glusterfs 1>/dev/null 2>&1" on dj.lab.eng.blr.redhat.com: RETCODE is 0
2017-07-22 09:58:55,441 INFO run "crefi --multi -n 5 -b 10 -d 10 --max=200K --min=10K --random -T 10 -t text --fop=hardlink /mnt/glusterfs 1>/dev/null 2>&1" on dj.lab.eng.blr.redhat.com: RETCODE is 0
2017-07-22 10:20:12,530 INFO run "crefi --multi -n 5 -b 10 -d 10 --max=200K --min=10K --random -T 10 -t text --fop=truncate /mnt/glusterfs 1>/dev/null 2>&1" on dj.lab.eng.blr.redhat.com: RETCODE is 0
2017-07-22 18:03:04,933 INFO run "crefi --multi -n 5 -b 10 -d 10 --max=200K --min=10K --random -T 10 -t text --fop=rename /mnt/glusterfs 1>/dev/null 2>&1" on dj.lab.eng.blr.redhat.com: RETCODE is 0
[geo_rahul@skywalker ~]$ 


Where /mnt/glusterfs is the fuse mount

--- Additional comment from Rejy M Cyriac on 2017-07-24 08:55:17 EDT ---

At the 'RHGS 3.3.0 - Release Blocker Bug Triage and Status Check' meeting on 24 July, it was decided to ACCEPT this BZ for a fix in the RHGS 3.3.0 release.

--- Additional comment from Red Hat Bugzilla Rules Engine on 2017-07-24 08:55:23 EDT ---

Since this bug has been approved for the RHGS 3.3.0 release of Red Hat Gluster Storage 3, through release flag 'rhgs-3.3.0+', and through the Internal Whiteboard entry of '3.3.0', the Target Release is being automatically set to 'RHGS 3.3.0'

--- Additional comment from Pranith Kumar K on 2017-07-27 19:49:12 EDT ---

Some notes; I haven't found the RCA yet.

Only the last fragment of the file is corrupted. On closer inspection, it was found that the file on the 5th brick is the bad one. It is as if the write did not happen on the 5th brick but only on the rest of the bricks. So the next step is to find out why the write was skipped.

Rahul,
    For more debugging we need the /etc/services file from the machine where you ran crefi. Could you attach it to the BZ?

Pranith

--- Additional comment from Pranith Kumar K on 2017-07-27 21:51:51 EDT ---

(In reply to Pranith Kumar K from comment #10)
> Some notes, didn't find the RCA yet.
> 
> Only the last fragment of the file is corrupted. On closer inspection, it is
> found that the file on 5th brick is the bad-one. It is as if the write
> didn't happen on the 5th brick but only happened on the rest of the bricks.
> So the next step is to find why was the write got skipped.
> 
> Rahul,
>     For more debugging we need /etc/services file from the machine where you
> ran crefi. Could you attach it to the bz?
> 
> Pranith

I found it on one of the RHGS machines and its md5sum matches what we have in the BZ. I will use that.

--- Additional comment from Pranith Kumar K on 2017-07-27 22:12:20 EDT ---

Some notes; still no root cause.
07:24:01 :) ⚡ md5sum /home/gfs/ec_?/5*
ecb9f6f29ebbb72e3e612b0aec775b36  /home/gfs/ec_0/596c62e1%%HPJLBVY86T
6fe5b21f27a5c9dc72ae3422102327b2  /home/gfs/ec_1/596c62e1%%HPJLBVY86T
6f82c16ed99b901929737ddf11dd920b  /home/gfs/ec_2/596c62e1%%HPJLBVY86T
bcac001cf1cac8860e4ae233a35cd7ef  /home/gfs/ec_3/596c62e1%%HPJLBVY86T
6921a33bd9c1083aaec582cde9973b61  /home/gfs/ec_4/596c62e1%%HPJLBVY86T <<<---- this is the bad fragment
b9d70081b5b41c9f980e20bf40e1fd13  /home/gfs/ec_5/596c62e1%%HPJLBVY86T

I did the following to find the file size whose fragment corresponds to the bad md5sum we found.

07:24:53 :) ⚡ for i in {12288..14336}; do Crefi/crefi.py --multi -n 1 -b 1 -d 1 --size $i -t text --fop=create /mnt/ec2; done

root@dhcp35-190 - ~ 
07:26:49 :) ⚡ md5sum /home/gfs/ec_4/level00/* | grep 6921a33bd9c1083aaec582cde9973b61
6921a33bd9c1083aaec582cde9973b61  /home/gfs/ec_4/level00/597a99d1%%4SKACGWCDK

root@dhcp35-190 - ~ 
07:27:07 :) ⚡ ls -l /mnt/ec2/level00/597a99d1%%4SKACGWCDK
-rw-r--r--. 1 root root 14336 Jul 28 07:26 /mnt/ec2/level00/597a99d1%%4SKACGWCDK

07:39:48 :) ⚡ truncate -s 13526 /mnt/ec2/level00/597a99d1%%4SKACGWCDK

root@dhcp35-190 - ~ 
07:40:10 :) ⚡ md5sum /home/gfs/ec_?/level00/597a99d1%%4SKACGWCDK
ecb9f6f29ebbb72e3e612b0aec775b36  /home/gfs/ec_0/level00/597a99d1%%4SKACGWCDK
6fe5b21f27a5c9dc72ae3422102327b2  /home/gfs/ec_1/level00/597a99d1%%4SKACGWCDK
6f82c16ed99b901929737ddf11dd920b  /home/gfs/ec_2/level00/597a99d1%%4SKACGWCDK
bcac001cf1cac8860e4ae233a35cd7ef  /home/gfs/ec_3/level00/597a99d1%%4SKACGWCDK
f039f8c213c0b31b2d16da87de8da6b7  /home/gfs/ec_4/level00/597a99d1%%4SKACGWCDK<------ This fragment is correct.
b9d70081b5b41c9f980e20bf40e1fd13  /home/gfs/ec_5/level00/597a99d1%%4SKACGWCDK

So this means that either the write was never wound to that brick when this bug happened, or it failed silently.

This will be the next thing to look into.

--- Additional comment from Pranith Kumar K on 2017-07-28 01:27:26 EDT ---

Ashish,
     I did not find any evidence of how it failed, but there is a bug in EC: when the writev issued as part of truncate fails, EC does not mark that brick as bad. I verified this in gdb. We may need more help from Rahul to find out how exactly the write failed. Whether the quota errors contribute to any failures is also something we may need to check.

I put one of the brick processes under gdb and made it return an error without performing any write:
Thread 7 "glusterfsd" hit Breakpoint 3, server3_3_writev (req=0x7fbf100ca140) at server-rpc-fops.c:4100
4100	        server_state_t      *state  = NULL;
(gdb) n
...
(gdb) 
4119	        frame = get_frame_from_request (req);
(gdb) 
4120	        if (!frame) {
(gdb) p frame=0
$3 = (call_frame_t *) 0x0
(gdb) n
4122	                SERVER_REQ_SET_ERROR (req, ret);
(gdb) 
4123	                goto out;
(gdb) 
4172	        free (args.xdata.xdata_val);
(gdb) 
4174	        if (op_errno)
(gdb) c

I got the same scenario mentioned in this bug:
root@dhcp35-190 - ~ 
10:34:11 :) ⚡ getfattr -d -m. -e hex /home/gfs/ec_?/services
getfattr: Removing leading '/' from absolute path names
# file: home/gfs/ec_0/services
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a686f6d655f726f6f745f743a733000
trusted.ec.config=0x0000080602000200
trusted.ec.dirty=0x00000000000000000000000000000000
trusted.ec.size=0x00000000000034d6
trusted.ec.version=0x00000000000000270000000000000027
trusted.gfid=0xeacb572971f5457a9a5706020255f312

# file: home/gfs/ec_1/services
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a686f6d655f726f6f745f743a733000
trusted.ec.config=0x0000080602000200
trusted.ec.dirty=0x00000000000000000000000000000000
trusted.ec.size=0x00000000000034d6
trusted.ec.version=0x00000000000000270000000000000027
trusted.gfid=0xeacb572971f5457a9a5706020255f312

# file: home/gfs/ec_2/services
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a686f6d655f726f6f745f743a733000
trusted.ec.config=0x0000080602000200
trusted.ec.dirty=0x00000000000000000000000000000000
trusted.ec.size=0x00000000000034d6
trusted.ec.version=0x00000000000000270000000000000027
trusted.gfid=0xeacb572971f5457a9a5706020255f312

# file: home/gfs/ec_3/services
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a686f6d655f726f6f745f743a733000
trusted.ec.config=0x0000080602000200
trusted.ec.dirty=0x00000000000000000000000000000000
trusted.ec.size=0x00000000000034d6
trusted.ec.version=0x00000000000000270000000000000027
trusted.gfid=0xeacb572971f5457a9a5706020255f312

# file: home/gfs/ec_4/services
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a686f6d655f726f6f745f743a733000
trusted.ec.config=0x0000080602000200
trusted.ec.dirty=0x00000000000000000000000000000000
trusted.ec.size=0x00000000000034d6
trusted.ec.version=0x00000000000000270000000000000027
trusted.gfid=0xeacb572971f5457a9a5706020255f312

# file: home/gfs/ec_5/services
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a686f6d655f726f6f745f743a733000
trusted.ec.config=0x0000080602000200
trusted.ec.dirty=0x00000000000000000000000000000000
trusted.ec.size=0x00000000000034d6
trusted.ec.version=0x00000000000000270000000000000027
trusted.gfid=0xeacb572971f5457a9a5706020255f312


root@dhcp35-190 - ~ 
10:34:13 :) ⚡ md5sum  /home/gfs/ec_?/services
ecb9f6f29ebbb72e3e612b0aec775b36  /home/gfs/ec_0/services
6fe5b21f27a5c9dc72ae3422102327b2  /home/gfs/ec_1/services
6f82c16ed99b901929737ddf11dd920b  /home/gfs/ec_2/services
bcac001cf1cac8860e4ae233a35cd7ef  /home/gfs/ec_3/services
6921a33bd9c1083aaec582cde9973b61  /home/gfs/ec_4/services <<<<------ same md5sum as the bad file in Rahul's setup and all xattrs match as if everything is fine when it is not.
b9d70081b5b41c9f980e20bf40e1fd13  /home/gfs/ec_5/services
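
A quick worked check on the xattrs above (added for clarity): trusted.ec.size is identical on every brick and matches the truncated file size, so the metadata looks perfectly healthy even though the data fragment on ec_4 is stale:

    0x34d6 = 3*16^3 + 4*16^2 + 13*16 + 6 = 13526 bytes (the same size shown by ls -l in the original report)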

Ashish,
    Can you send out a patch? In both ec_truncate_open_cbk and ec_truncate_writev's callback, we need to update lock->good_mask correctly. I don't see any other way for the file to end up in the state it did on Rahul's setup; we just need to confirm this is indeed the case. The only other pathological case would be the brick returning success without performing the write.

Rahul,
    Will it be possible to get a tcpdump of a run where this happens? That will confirm how this bug happened. The bug we found, based on the information from your setup and explained above, is a day-1 bug.

Pranith

--- Additional comment from Worker Ant on 2017-07-28 06:08:03 EDT ---

REVIEW: https://review.gluster.org/17906 (ec/cluster: Update failure of fop on a brick properly) posted (#1) for review on master by Ashish Pandey (aspandey)

--- Additional comment from Worker Ant on 2017-07-31 02:25:15 EDT ---

REVIEW: https://review.gluster.org/17906 (ec/cluster: Update failure of fop on a brick properly) posted (#2) for review on master by Ashish Pandey (aspandey)

--- Additional comment from Worker Ant on 2017-07-31 07:55:31 EDT ---

COMMIT: https://review.gluster.org/17906 committed in master by Pranith Kumar Karampuri (pkarampu) 
------
commit 12906467f0f8df2edc07ea8412c0c6be6fae11b3
Author: Ashish Pandey <aspandey>
Date:   Fri Jul 28 15:18:29 2017 +0530

    ec/cluster: Update failure of fop on a brick properly
    
    Problem:
    In case of truncate, if writev or open fails on a brick,
    in some cases it does not mark the failure on lock->good_mask.
    This causes the update of size and version on all the bricks
    even if it has failed on one of the bricks. That ultimately
    causes data corruption.
    
    Solution:
    In callback of such writev and open calls, mark fop->good
    for parent too.
    
    Thanks Pranith Kumar K <pkarampu> for finding the
    root cause.
    
    Change-Id: I8a1da2888bff53b91a0d362b8c44fcdf658e7466
    BUG: 1476205
    Signed-off-by: Ashish Pandey <aspandey>
    Reviewed-on: https://review.gluster.org/17906
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    Smoke: Gluster Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
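
To make the failure-propagation issue concrete, here is a minimal, self-contained model of the bookkeeping described in the commit message above. This is illustrative only, not the actual patch: the real fix lives in ec's ec_truncate_open_cbk / ec_truncate_writev callbacks and operates on fop->good and lock->good_mask, and the brick index used here is a placeholder.

#include <stdio.h>
#include <stdint.h>

#define BRICKS 6

typedef struct {
    uintptr_t good;              /* bitmask of bricks where this fop succeeded */
} fop_t;

/* Child (writev/open) callback: the fix is to propagate a per-brick failure
 * to the parent truncate fop; without it the parent still believes all
 * bricks are good and updates size/version everywhere. */
static void child_cbk(fop_t *child, fop_t *parent, int propagate_failure)
{
    if (propagate_failure)
        parent->good &= child->good;
}

int main(void)
{
    const uintptr_t all = (1u << BRICKS) - 1;
    fop_t truncate_fop = { .good = all };
    fop_t writev_fop   = { .good = all & ~(1u << 4) };  /* writev failed on brick 4 */

    child_cbk(&writev_fop, &truncate_fop, 0);            /* old behaviour */
    printf("without fix: size/version updated on bricks 0x%02lx (stale brick included)\n",
           (unsigned long)truncate_fop.good);

    truncate_fop.good = all;
    child_cbk(&writev_fop, &truncate_fop, 1);            /* fixed behaviour */
    printf("with fix:    size/version updated on bricks 0x%02lx (stale brick excluded)\n",
           (unsigned long)truncate_fop.good);
    return 0;
}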

Comment 1 Worker Ant 2017-07-31 16:41:14 UTC
REVIEW: https://review.gluster.org/17933 (ec/cluster: Update failure of fop on a brick properly) posted (#1) for review on release-3.11 by Ashish Pandey (aspandey)

Comment 2 Worker Ant 2017-08-12 13:39:52 UTC
COMMIT: https://review.gluster.org/17933 committed in release-3.11 by Shyamsundar Ranganathan (srangana) 
------
commit 49ac61ae11508a16a56534a656d77be47f3fe6b8
Author: Ashish Pandey <aspandey>
Date:   Fri Jul 28 15:18:29 2017 +0530

    ec/cluster: Update failure of fop on a brick properly
    
    Problem:
    In case of truncate, if writev or open fails on a brick,
    in some cases it does not mark the failure on lock->good_mask.
    This causes the update of size and version on all the bricks
    even if it has failed on one of the bricks. That ultimately
    causes data corruption.
    
    Solution:
    In callback of such writev and open calls, mark fop->good
    for parent too.
    
    Thanks Pranith Kumar K <pkarampu> for finding the
    root cause.
    
    >Change-Id: I8a1da2888bff53b91a0d362b8c44fcdf658e7466
    >BUG: 1476205
    >Signed-off-by: Ashish Pandey <aspandey>
    >Reviewed-on: https://review.gluster.org/17906
    >Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    >Smoke: Gluster Build System <jenkins.org>
    >CentOS-regression: Gluster Build System <jenkins.org>
    >Signed-off-by: Ashish Pandey <aspandey>
    
    Change-Id: I8a1da2888bff53b91a0d362b8c44fcdf658e7466
    BUG: 1476870
    Signed-off-by: Ashish Pandey <aspandey>
    Reviewed-on: https://review.gluster.org/17933
    Smoke: Gluster Build System <jenkins.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    CentOS-regression: Gluster Build System <jenkins.org>

Comment 3 Shyamsundar 2017-08-24 14:46:21 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.11.3, please open a new bug report.

glusterfs-3.11.3 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-August/000081.html
[2] https://www.gluster.org/pipermail/gluster-users/

