+++ This bug was initially created as a clone of Bug #1235964 +++

Description of problem:

On a 3 x (4 + 2) = 18 distributed-disperse volume, some files return Input/output errors on the FUSE mount after the following scenario:
1. Simulate a disk failure by killing the brick process for a disk and re-adding the same disk after formatting the drive.
2. Read the recovered/healed file after 2 bricks/nodes have been brought down.

Version-Release number of selected component (if applicable):

glusterfs 3.7.2 built on Jun 19 2015 16:33:27
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License

How reproducible:

Steps to Reproduce:
1. Create a 3 x (4+2) disperse volume across the nodes.
2. FUSE mount it on a client and start creating files/directories with mkdir and rsync/dd.
3. Simulate a disk failure by killing the brick PID of one disk on one node, then add the same disk back after formatting the drive.
4. Start the volume with force.
5. Self-heal creates the file with 0 bytes on the newly formatted drive.
6. Even after waiting a long time, self-heal does not progress; the file stays at 0 bytes.
7. Read the same file from the client; the 0-byte copy now gets recovered and the recovery completes. With all nodes up, md5sum of the file gives the expected result.
8. Bring down 2 of the nodes.
9. Get the md5sum of the same recovered file again; the client throws an I/O error.

(A command-level sketch of these steps is given after the shell output below.)

Actual results:
I/O error on the recovered file.

Expected results:
There should not be any I/O error.

Additional info:

admin@node001:~$ sudo gluster volume info

Volume Name: vaulttest21
Type: Distributed-Disperse
Volume ID: ac6a374d-a0a2-405c-823d-0672fd92f0af
Status: Started
Number of Bricks: 3 x (4 + 2) = 18
Transport-type: tcp
Bricks:
Brick1: 10.1.2.1:/media/disk1
Brick2: 10.1.2.2:/media/disk1
Brick3: 10.1.2.3:/media/disk1
Brick4: 10.1.2.4:/media/disk1
Brick5: 10.1.2.5:/media/disk1
Brick6: 10.1.2.6:/media/disk1
Brick7: 10.1.2.1:/media/disk2
Brick8: 10.1.2.2:/media/disk2
Brick9: 10.1.2.3:/media/disk2
Brick10: 10.1.2.4:/media/disk2
Brick11: 10.1.2.5:/media/disk2
Brick12: 10.1.2.6:/media/disk2
Brick13: 10.1.2.1:/media/disk3
Brick14: 10.1.2.2:/media/disk3
Brick15: 10.1.2.3:/media/disk3
Brick16: 10.1.2.4:/media/disk3
Brick17: 10.1.2.5:/media/disk3
Brick18: 10.1.2.6:/media/disk3
Options Reconfigured:
performance.readdir-ahead: on

After simulating the disk failure (node3, disk2) and adding the disk back after formatting the drive:

admin@node003:~$ date
Thu Jun 25 16:21:58 IST 2015
admin@node003:~$ ls -l -h /media/disk2
total 1.6G
drwxr-xr-x 3 root root   22 Jun 25 16:18 1
-rw-r--r-- 2 root root    0 Jun 25 16:17 up1
-rw-r--r-- 2 root root    0 Jun 25 16:17 up2
-rw-r--r-- 2 root root 797M Jun 25 16:03 up3
-rw-r--r-- 2 root root 797M Jun 25 16:04 up4
--
admin@node003:~$ date
Thu Jun 25 16:25:09 IST 2015
admin@node003:~$ ls -l -h /media/disk2
total 1.6G
drwxr-xr-x 3 root root   22 Jun 25 16:18 1
-rw-r--r-- 2 root root    0 Jun 25 16:17 up1
-rw-r--r-- 2 root root    0 Jun 25 16:17 up2
-rw-r--r-- 2 root root 797M Jun 25 16:03 up3
-rw-r--r-- 2 root root 797M Jun 25 16:04 up4

admin@node003:~$ date
Thu Jun 25 16:41:25 IST 2015
admin@node003:~$ ls -l -h /media/disk2
total 1.6G
drwxr-xr-x 3 root root   22 Jun 25 16:18 1
-rw-r--r-- 2 root root    0 Jun 25 16:17 up1
-rw-r--r-- 2 root root    0 Jun 25 16:17 up2
-rw-r--r-- 2 root root 797M Jun 25 16:03 up3
-rw-r--r-- 2 root root 797M Jun 25 16:04 up4

After waiting nearly 20 minutes, self-heal has still not recovered the data chunk. Then the file is read from the client using md5sum:

root@mas03:/mnt/gluster# time md5sum up1
4650543ade404ed5a1171726e76f8b7c  up1

real    1m58.010s
user    0m6.243s
sys     0m0.778s

The previously 0-byte chunk now starts growing on the reformatted brick:

admin@node003:~$ ls -l -h /media/disk2
total 2.6G
drwxr-xr-x 3 root root   22 Jun 25 16:18 1
-rw-r--r-- 2 root root 797M Jun 25 15:57 up1
-rw-r--r-- 2 root root    0 Jun 25 16:17 up2
-rw-r--r-- 2 root root 797M Jun 25 16:03 up3
-rw-r--r-- 2 root root 797M Jun 25 16:04 up4

To verify the healed file after taking two nodes (node5 and node6) offline:

root@mas03:/mnt/gluster# time md5sum up1
md5sum: up1: Input/output error
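For reference, here is a command-level sketch of the reproduction steps above. The volume name, node addresses, brick paths and mount point are taken from this report; the brick PID, block device and file sizes are placeholders, and the exact "gluster volume create" syntax may vary slightly between releases, so treat this as an illustrative outline rather than a verified script.

# Step 1: create and start the 3 x (4+2) distributed-disperse volume
# (brace expansion yields the 18 bricks in the same order as the volume info above)
gluster volume create vaulttest21 disperse 6 redundancy 2 \
    10.1.2.{1..6}:/media/disk1 \
    10.1.2.{1..6}:/media/disk2 \
    10.1.2.{1..6}:/media/disk3
gluster volume start vaulttest21

# Step 2: FUSE mount on the client and create data
mount -t glusterfs 10.1.2.1:/vaulttest21 /mnt/gluster
dd if=/dev/urandom of=/mnt/gluster/up1 bs=1M count=3200   # size is illustrative

# Step 3: simulate a disk failure on node3/disk2: find the brick PID,
# kill it, reformat the backing device and remount it
gluster volume status vaulttest21          # note the PID of 10.1.2.3:/media/disk2
kill <brick-pid>                           # placeholder
mkfs.xfs -f /dev/<brick-device>            # placeholder device
mount /dev/<brick-device> /media/disk2

# Step 4: restart the volume so the reformatted brick comes back
gluster volume start vaulttest21 force

# Steps 5-6: watch self-heal progress (in this report it stalls at 0 bytes)
gluster volume heal vaulttest21 info

# Step 7: trigger recovery via a client read and verify the checksum
md5sum /mnt/gluster/up1

# Steps 8-9: take node5 and node6 offline, then read the healed file again;
# this is where the report sees the Input/output error
md5sum /mnt/gluster/up1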
REVIEW: http://review.gluster.org/11844 (cluster/ec: Fix tracking of good bricks) posted (#1) for review on master by Xavier Hernandez (xhernandez)
REVIEW: http://review.gluster.org/11844 (cluster/ec: Fix tracking of good bricks) posted (#2) for review on master by Xavier Hernandez (xhernandez)
COMMIT: http://review.gluster.org/11844 committed in master by Pranith Kumar Karampuri (pkarampu)
------
commit 7298b622ab39c2e78d6d745ae8b6e8413e1d9f1a
Author: Xavier Hernandez <xhernandez>
Date:   Wed Aug 5 23:42:41 2015 +0200

    cluster/ec: Fix tracking of good bricks

    The bitmask of good and bad bricks was kept in the context of the
    corresponding inode or fd. This was problematic when an external
    process (another client or the self-heal process) healed the bricks
    but no one changed the bitmask of the other clients.

    This patch removes the bitmask stored in the context and calculates
    which bricks are healthy after locking them and doing the initial
    xattrop. After that, it's updated using the result of each fop.

    Change-Id: I225e31cd219a12af4ca58871d8a4bb6f742b223c
    BUG: 1236065
    Signed-off-by: Xavier Hernandez <xhernandez>
    Reviewed-on: http://review.gluster.org/11844
    Tested-by: NetBSD Build System <jenkins.org>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
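As a side note for anyone reproducing this: the per-brick health state that the initial xattrop consults is kept in trusted.ec.* extended attributes on each brick file, and it can be inspected directly on the brick backend. The brick path and file name below are taken from this report; the exact set of attributes shown may differ between releases, so this is only a quick way to see which fragment is out of date, not part of the fix itself.

# Run on the storage node, against the brick backend (not the FUSE mount).
# A fragment whose trusted.ec.version / trusted.ec.dirty values disagree with
# the other bricks of the same subvolume is the one the ec translator treats
# as bad and schedules for heal.
getfattr -d -m . -e hex /media/disk2/up1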
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.8.0, please open a new bug report. glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user