Description of problem:
=======================
Self-heal of hard-linked files failed when a disk-replacement scenario was simulated and then healed with "heal full". After the brick directories were removed and recreated with the same names, followed by a volume start force and a full heal, some files got created on the replaced bricks as regular empty files. The detailed case is below.

Version-Release number of selected component (if applicable):
=============================================================
glusterfs-3.4.0.9rhs-1.el6rhs.x86_64
glusterfs-geo-replication-3.4.0.9rhs-1.el6rhs.x86_64
glusterfs-fuse-3.4.0.9rhs-1.el6rhs.x86_64
glusterfs-rdma-3.4.0.9rhs-1.el6rhs.x86_64
glusterfs-devel-3.4.0.9rhs-1.el6rhs.x86_64
glusterfs-server-3.4.0.9rhs-1.el6rhs.x86_64
glusterfs-debuginfo-3.4.0.9rhs-1.el6rhs.x86_64

Steps Carried:
==============
1. Created a 6x2 distributed-replicate volume from 4 servers (server1-server4).
2. Mounted the volume on a client (FUSE and NFS).
3. Created directories "f" and "n" from the mount.
4. cd to "f" from the FUSE mount.
5. cd to "n" from the NFS mount.
6. Created a "test_hardlink_self_heal" directory under both "f" and "n".
7. Created files and directories under "test_hardlink_self_heal" from mount points f and n:
   cd test_hardlink_self_heal ; for i in `seq 1 5` ; do mkdir dir.$i ; for j in `seq 1 10` ; do dd if=/dev/input_file of=dir.$i/file.$j bs=1k count=$j ; done ; done ; cd ../
8. Brought down the bricks on server2 and powered off server4.
9. Created hard links to files under "test_hardlink_self_heal/dir.*" from mount points f and n:
   cd test_hardlink_self_heal ; for i in `seq 1 5` ; do for j in `seq 1 10` ; do ln dir.$i/file.$j dir.$i/link_file.$j ; done ; done ; cd ../
10. Brought back server4 and started the volume forcefully.
11. Self-heal started and completed successfully.
12. Verified that the hard links were self-healed, from the mount point:
    ( cd test_hardlink_self_heal ; for i in `seq 1 5` ; do for j in `seq 1 10` ; do if [ `stat -c %i dir.$i/file.$j` != `stat -c %i dir.$i/link_file.$j` ] ; then exit 1 ; fi ; done ; done ; cd ../ )
    echo $? shows exit status 0.
13. Brought down the bricks on server1 and server3 (kill -9).
14. Created hard links to the files under "test_hardlink_self_heal/dir.*" in new directories, from the mount point:
    cd test_hardlink_self_heal ; for i in `seq 1 5` ; do mkdir new_dir.$i ; for j in `seq 1 10` ; do ln dir.$i/file.$j new_dir.$i/new_file.$j ; done ; done ; cd ../
15. Brought back all offlined brick processes with volume start force.
16. Verified that the hard links were self-healed, from the mount point:
    ( cd test_hardlink_self_heal ; for i in `seq 1 5` ; do for j in `seq 1 10` ; do if [ `stat -c %i dir.$i/file.$j` != `stat -c %i new_dir.$i/new_file.$j` ] ; then exit 1 ; fi ; done ; done ; cd ../ )
    echo $? shows exit status 0.
17. Brought down all brick processes on server2 and server4 (kill -9).
18. Removed one brick directory from every replicate subvolume to simulate a disk replacement (removed b2, b4, b6 from server2 and b8, b10, b12 from server4).
19. Recreated the removed directories with the same names (b2, b4, b6 on server2 and b8, b10, b12 on server4).
20. Started the volume forcefully, which succeeded.
21. Started a full heal with "gluster volume heal <vol-name> full".
22. A few of the files got created on server2 and server4 as regular empty files.

Actual results:
===============
Arequal did not match between the bricks of the same replica pair. A few files got created on server2 and server4 as 0-byte files, and the number of links differs between the bricks of a replica pair.
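The size and link-count divergence described above can be confirmed directly on the bricks. Below is a minimal sketch (not part of the original report) that walks two brick directories of a replica pair and prints every regular file whose size or hard-link count differs; the `compare_bricks` helper name and the demo layout are illustrative:

```shell
#!/bin/sh
# Report regular files whose size or hard-link count differs between
# two bricks of the same replica pair (the .glusterfs store is skipped).
compare_bricks() {
    good=$1 bad=$2
    ( cd "$good" && find . -path ./.glusterfs -prune -o -type f -print ) |
    while read -r rel; do
        if [ -e "$bad/$rel" ]; then
            g=$(stat -c '%s bytes, %h links' "$good/$rel")
            b=$(stat -c '%s bytes, %h links' "$bad/$rel")
            [ "$g" != "$b" ] && printf '%s: %s vs %s\n' "$rel" "$g" "$b"
        else
            printf '%s: missing on bad brick\n' "$rel"
        fi
    done
}

# Demo on a throwaway layout mimicking the symptom: a full file with a
# hard link on one brick, a 0-byte file with fewer links on the other.
t=$(mktemp -d)
mkdir -p "$t/b1/dir.2" "$t/b2/dir.2"
dd if=/dev/zero of="$t/b1/dir.2/file.1" bs=1k count=1 2>/dev/null
ln "$t/b1/dir.2/file.1" "$t/b1/dir.2/link_file.1"
: > "$t/b2/dir.2/file.1"
compare_bricks "$t/b1" "$t/b2"
```

On real bricks the same comparison would run against, e.g., /rhs/brick1/b1 and /rhs/brick1/b2.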
Files that are in question:
===========================
b1/b2
< /rhs/brick1/b1/n/test_hardlink_self_heal/dir.2/file.1 0x0577a082ec314310ad5a33f1a0189977 1024
---
> /rhs/brick1/b2/n/test_hardlink_self_heal/dir.2/file.1 0x0577a082ec314310ad5a33f1a0189977 0

b3/b4
17c17
< /rhs/brick1/b3/f/test_hardlink_self_heal/dir.2/file.6 0x0be124943ec64ad39e83a5f84b950855 6144
---
> /rhs/brick1/b4/f/test_hardlink_self_heal/dir.2/file.6 0x0be124943ec64ad39e83a5f84b950855 0
20c20
< /rhs/brick1/b3/f/test_hardlink_self_heal/dir.3/file.6 0x0845cb18d7314d6a90abe4912324eff5 6144
---
> /rhs/brick1/b4/f/test_hardlink_self_heal/dir.3/file.6 0x0845cb18d7314d6a90abe4912324eff5 0
81c81
< /rhs/brick1/b3/n/test_hardlink_self_heal/dir.3/file.1 0x166b00abee30437699da3d1a8a9cc484 1024
---
> /rhs/brick1/b4/n/test_hardlink_self_heal/dir.3/file.1 0x166b00abee30437699da3d1a8a9cc484 0

b5/b6
5c5
< /rhs/brick1/b5/f/test_hardlink_self_heal/dir.1/file.6 0x77f745a0f5cf4efca2da40a75de76069 6144
---
> /rhs/brick1/b6/f/test_hardlink_self_heal/dir.1/file.6 0x77f745a0f5cf4efca2da40a75de76069 0
20c20
< /rhs/brick1/b5/f/test_hardlink_self_heal/dir.4/file.6 0xb6ef74f3fed74971a6cd352acb69fca8 6144
---
> /rhs/brick1/b6/f/test_hardlink_self_heal/dir.4/file.6 0xb6ef74f3fed74971a6cd352acb69fca8 0
64c64
< /rhs/brick1/b5/n/test_hardlink_self_heal/dir.2/file.6 0xd11ade94f7084a778d6e20e4aff9a860 6144
---
> /rhs/brick1/b6/n/test_hardlink_self_heal/dir.2/file.6 0xd11ade94f7084a778d6e20e4aff9a860 0

b7/b8
23c23
< /rhs/brick1/b7/f/test_hardlink_self_heal/dir.5/file.6 0xece85a9a422247fa9836ee1aeec8117b 6144
---
> /rhs/brick1/b8/f/test_hardlink_self_heal/dir.5/file.6 0xece85a9a422247fa9836ee1aeec8117b 0
48c48
< /rhs/brick1/b7/n/test_hardlink_self_heal/dir.1/file.1 0x582860c6ed31459fb9c913d60b8530e5 1024
---
> /rhs/brick1/b8/n/test_hardlink_self_heal/dir.1/file.1 0x582860c6ed31459fb9c913d60b8530e5 0
60c60
< /rhs/brick1/b7/n/test_hardlink_self_heal/dir.3/file.6 0x26c5ab665e8149e49719e054e30a3dc6 6144
---
> /rhs/brick1/b8/n/test_hardlink_self_heal/dir.3/file.6 0x26c5ab665e8149e49719e054e30a3dc6 0
75c75
< /rhs/brick1/b7/n/test_hardlink_self_heal/dir.5/file.6 0x49ce57de4a8746949a98a5fcccc6df60 6144
---
> /rhs/brick1/b8/n/test_hardlink_self_heal/dir.5/file.6 0x49ce57de4a8746949a98a5fcccc6df60 0

b9/b10
< /rhs/brick1/b9/n/test_hardlink_self_heal/dir.4/file.6 0x5d14b14e4dd44ed885af89b6a2045fa5 6144
---
> /rhs/brick1/b10/n/test_hardlink_self_heal/dir.4/file.6 0x5d14b14e4dd44ed885af89b6a2045fa5 0

b11/b12
8c8
< /rhs/brick1/b11/f/test_hardlink_self_heal/dir.2/file.1 0x268ebce2c8a64d46bacc55bb9722142d 1024
---
> /rhs/brick1/b12/f/test_hardlink_self_heal/dir.2/file.1 0x268ebce2c8a64d46bacc55bb9722142d 0
46c46
< /rhs/brick1/b11/n/test_hardlink_self_heal/dir.1/file.6 0xbd6bfd4c5df04ef9b21359aec3f2913b 6144
---
> /rhs/brick1/b12/n/test_hardlink_self_heal/dir.1/file.6 0xbd6bfd4c5df04ef9b21359aec3f2913b 0

Stat on the file in question for b1/b2:
=======================================
# stat /rhs/brick1/b1/n/test_hardlink_self_heal/dir.2/file.1
  File: `/rhs/brick1/b1/n/test_hardlink_self_heal/dir.2/file.1'
  Size: 1024        Blocks: 8       IO Block: 4096   regular file
Device: fd02h/64770d    Inode: 1744831047    Links: 4
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2013-06-19 11:54:54.846039000 -0400
Modify: 2013-06-19 11:54:54.849039000 -0400
Change: 2013-06-19 12:51:20.620036657 -0400

# stat /rhs/brick1/b2/n/test_hardlink_self_heal/dir.2/file.1
  File: `/rhs/brick1/b2/n/test_hardlink_self_heal/dir.2/file.1'
  Size: 0           Blocks: 0       IO Block: 4096   regular empty file
Device: fd02h/64770d    Inode: 1006633866    Links: 2
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2013-06-19 12:42:40.095142205 -0400
Modify: 2013-06-19 12:42:40.095142205 -0400
Change: 2013-06-19 12:42:40.095142205 -0400

Getfattr:
=========
# getfattr -d -e hex -m . /rhs/brick1/b1/n/test_hardlink_self_heal/dir.2/file.1
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/b1/n/test_hardlink_self_heal/dir.2/file.1
trusted.afr.vol-dis-rep-client-0=0x000000000000000000000000
trusted.afr.vol-dis-rep-client-1=0x000000000000000000000000
trusted.gfid=0x0577a082ec314310ad5a33f1a0189977

# getfattr -d -e hex -m . /rhs/brick1/b2/n/test_hardlink_self_heal/dir.2/file.1
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/b2/n/test_hardlink_self_heal/dir.2/file.1
trusted.gfid=0x0577a082ec314310ad5a33f1a0189977

arequal mismatch between servers:
=================================
server1:
========
# ./areequal-checksum
----------------------- Subvolume: 1 ----------------------
Entry counts:       Regular files: 83    Directories: 25    Symbolic links: 0    Other: 0    Total: 108
Metadata checksums: Regular files: 48a1e9    Directories: 24d74c    Symbolic links: 3e9    Other: 3e9
Checksums:          Regular files: 42e2a6e4d9c94a782ce440b6c62f0572    Directories: 656c6e666f7b7c48    Symbolic links: 0    Other: 0    Total: b6a8834709d3342
----------------------- Subvolume: 2 ----------------------
Entry counts:       Regular files: 90    Directories: 25    Symbolic links: 0    Other: 0    Total: 115
Metadata checksums: Regular files: 20c85    Directories: 24d74c    Symbolic links: 3e9    Other: 3e9
Checksums:          Regular files: 7172e858de313ff41f7951c0bc5dbf80    Directories: 6c595f710e040667    Symbolic links: 0    Other: 0    Total: 252e6e96c688613
----------------------- Subvolume: 3 ----------------------
Entry counts:       Regular files: 96    Directories: 25    Symbolic links: 0    Other: 0    Total: 121
Metadata checksums: Regular files: 3e9    Directories: 24d74c    Symbolic links: 3e9    Other: 3e9
Checksums:          Regular files: fd09adca15b119833cb5c676fa4d5d38    Directories: 300f0000332f08    Symbolic links: 0    Other: 0    Total: c18c64bcefcf6bb3

server2:
========
# ./areequal-checksum
----------------------- Subvolume: 1 ----------------------
Entry counts:       Regular files: 83    Directories: 25    Symbolic links: 0    Other: 0    Total: 108
Metadata checksums: Regular files: 48a1e9    Directories: 24d74c    Symbolic links: 3e9    Other: 3e9
Checksums:          Regular files: 99c291b322a39148acb823bf0869c70c    Directories: 656c6e666f7b7c48    Symbolic links: 0    Other: 0    Total: 5016dc6a45b12a0c
----------------------- Subvolume: 2 ----------------------
Entry counts:       Regular files: 90    Directories: 25    Symbolic links: 0    Other: 0    Total: 115
Metadata checksums: Regular files: 20c85    Directories: 24d74c    Symbolic links: 3e9    Other: 3e9
Checksums:          Regular files: aa52df0f255be4c49f2532c9721b7dfe    Directories: 6c595f710e040667    Symbolic links: 0    Other: 0    Total: 592eb2b759449f5d
----------------------- Subvolume: 3 ----------------------
Entry counts:       Regular files: 96    Directories: 25    Symbolic links: 0    Other: 0    Total: 121
Metadata checksums: Regular files: 3e9    Directories: 24d74c    Symbolic links: 3e9    Other: 3e9
Checksums:          Regular files: 7172e858de313ff41f7951c0bc5dbf80    Directories: 300f0000332f08    Symbolic links: 0    Other: 0    Total: 6e3bb698625faf7c

server3:
========
# ./areequal-checksum
----------------------- Subvolume: 4 ----------------------
Entry counts:       Regular files: 72    Directories: 25    Symbolic links: 0    Other: 0    Total: 97
Metadata checksums: Regular files: 3e9    Directories: 24d74c    Symbolic links: 3e9    Other: 3e9
Checksums:          Regular files: 8079246448e1d641dd27b85a7e692906    Directories: 9350f39041e3c41    Symbolic links: 0    Other: 0    Total: 546b93073296c306
----------------------- Subvolume: 5 ----------------------
Entry counts:       Regular files: 64    Directories: 25    Symbolic links: 0    Other: 0    Total: 89
Metadata checksums: Regular files: 3e9    Directories: 24d74c    Symbolic links: 3e9    Other: 3e9
Checksums:          Regular files: 6bf687f74378e1fff9d18f58d09a1a34    Directories: 300800300f2808    Symbolic links: 0    Other: 0    Total: 921700afa3edd3c3
----------------------- Subvolume: 6 ----------------------
Entry counts:       Regular files: 53    Directories: 25    Symbolic links: 0    Other: 0    Total: 78
Metadata checksums: Regular files: 486e85    Directories: 24d74c    Symbolic links: 3e9    Other: 3e9
Checksums:          Regular files: 251640e519d05bb12bdee0022eccd4f8    Directories: 392e555d4a6e    Symbolic links: 0    Other: 0    Total: ec899c96241c527

server4:
========
# ./areequal-checksum
----------------------- Subvolume: 4 ----------------------
Entry counts:       Regular files: 72    Directories: 25    Symbolic links: 0    Other: 0    Total: 97
Metadata checksums: Regular files: 3e9    Directories: 24d74c    Symbolic links: 3e9    Other: 3e9
Checksums:          Regular files: d72256a1780b2b067eb74ce5f63f09c0    Directories: 9350f39041e3c41    Symbolic links: 0    Other: 0    Total: a0a0157d8a2a1e87
----------------------- Subvolume: 5 ----------------------
Entry counts:       Regular files: 64    Directories: 25    Symbolic links: 0    Other: 0    Total: 89
Metadata checksums: Regular files: 3e9    Directories: 24d74c    Symbolic links: 3e9    Other: 3e9
Checksums:          Regular files: e78dc26588f8c788da1d18ee968af88c    Directories: 300800300f2808    Symbolic links: 0    Other: 0    Total: 3da0d28b2e7d170c
----------------------- Subvolume: 6 ----------------------
Entry counts:       Regular files: 53    Directories: 25    Symbolic links: 0    Other: 0    Total: 78
Metadata checksums: Regular files: 486e85    Directories: 24d74c    Symbolic links: 3e9    Other: 3e9
Checksums:          Regular files: 724d3220293aa6f6884e14bda69af43e    Directories: 392e555d4a6e    Symbolic links: 0    Other: 0    Total: fa031fb3dafd18a6

Expected results:
=================
arequal should match across each replica pair, and the hard-linked files should heal successfully.
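The hard-link verification used in steps 12 and 16 reduces to comparing each link's inode number with its target's. Below is a self-contained local sketch of that check on a throwaway directory (the layout mirrors the test, with /dev/zero standing in for the input file; it is an illustration, not the original test script):

```shell
#!/bin/sh
# Minimal local version of the hard-link check from steps 12/16:
# a hard link must share its target's inode number (stat -c %i).
set -e
tmp=$(mktemp -d)
cd "$tmp"
mkdir dir.1
dd if=/dev/zero of=dir.1/file.1 bs=1k count=1 2>/dev/null
ln dir.1/file.1 dir.1/link_file.1

status=0
for f in dir.1/file.*; do
    j=${f##*.}                       # file index, e.g. "1"
    if [ "$(stat -c %i "$f")" != "$(stat -c %i "dir.1/link_file.$j")" ]; then
        status=1                     # inode mismatch: hard link not preserved
    fi
done
echo "hardlink check exit status: $status"
```

On the mounts in steps 12 and 16 the same loop runs over dir.$i/file.$j against dir.$i/link_file.$j (or new_dir.$i/new_file.$j) for i in 1..5 and j in 1..10.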
https://code.engineering.redhat.com/gerrit/#/c/10556/
Verified the fix on the build:
==============================
root@rhs-client11 [Jul-25-2013-16:45:16] >rpm -qa | grep glusterfs-server
glusterfs-server-3.4.0.12rhs.beta6-1.el6rhs.x86_64
root@rhs-client11 [Jul-25-2013-16:45:22] >gluster --version
glusterfs 3.4.0.12rhs.beta6 built on Jul 23 2013 16:20:03

Bug is fixed.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2013-1262.html