Description of problem:
.glusterfs is missing entries after a volume heal and a lookup.

Version-Release number of selected component (if applicable):
glusterfs 3.3.0rhs built on Jul 25 11:21:58 (glusterfs 3.3.0rhs-25.el6rhs.x86_64)

How reproducible:
1/1

Steps to Reproduce:
1. Create a distribute volume with 1 brick.
2. Populate a large amount of data on the volume. (We had a total of 85 directories holding 1200 files.)
3. Convert the volume to replicate (1x2) using "gluster volume add-brick <vol-name> replica 2 <brick>".
4. Run "gluster volume heal <vol-name> full" to trigger self-heal.
5. Wait until the heal is done and "gluster volume heal <vol-name> info" reports "Number of entries: 0".
6. Use arequal-checksum to calculate the checksum of the two bricks (brick1 and brick2).

Actual results:
The arequal checksums do not match for brick1 and brick2. The newly added brick has fewer directories under ".glusterfs". Even after performing a lookup on the mount point using "find . | xargs stat", the .glusterfs directories still mismatch between brick1 and brick2.

Expected results:
The arequal checksums should match and should report an equal number of directories and files for brick1 and brick2.

Additional info:
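For reference, a minimal shell sketch of the reproduction steps. The volume name, hosts, and brick paths (testvol, server1, server2, /bricks/brick1, /bricks/brick2, /mnt/testvol) are placeholders, not the paths from the original setup, and the arequal-checksum invocation is assumed from the arequal tool and may differ by version:
------------------------------------------
# 1. Create and start a single-brick distribute volume.
gluster volume create testvol server1:/bricks/brick1
gluster volume start testvol

# 2. Mount the volume and populate a large directory tree
#    (85 dirs x 14 files, roughly matching the original data set).
mkdir -p /mnt/testvol
mount -t glusterfs server1:/testvol /mnt/testvol
for d in $(seq 1 85); do
    mkdir "/mnt/testvol/dir$d"
    for f in $(seq 1 14); do
        dd if=/dev/urandom of="/mnt/testvol/dir$d/file$f" bs=64k count=1
    done
done

# 3. Add a second brick, turning the volume into a 1x2 replicate.
gluster volume add-brick testvol replica 2 server2:/bricks/brick2

# 4./5. Trigger a full self-heal, then poll info until it shows
#       "Number of entries: 0".
gluster volume heal testvol full
gluster volume heal testvol info

# 6. Checksum each brick on its host (the -p flag is assumed from
#    the arequal tool; adjust for your version).
arequal-checksum -p /bricks/brick1
arequal-checksum -p /bricks/brick2
------------------------------------------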
I have reproduced the problem and found that temporary files (editor swap files such as .sh1.sh.swp) can be created and removed during the run; when such a file is removed, only its .glusterfs hard link is removed, while the parent directories remain even after that.

For example, the difference between the .glusterfs trees of brick1 and brick2:
------------------------------------------
3747d3746
< ./.glusterfs/61/61
3970d3968
< ./.glusterfs/67/5d
9765,9766d9762
< ./.glusterfs/indices
< ./.glusterfs/indices/xattrop
------------------------------------------

From the log files it can be seen that some temporary files were created and removed, so the hard links were created and removed only for those swap files.

For gfid 61612913-fd9b-4fcb-85cd-749d9aae49f3:

creation of hard link:
--------------------------
[2012-09-28 07:50:43.032358] I [posix-handle.c:593:posix_handle_hard] (-->/usr/local/lib/glusterfs/3git/xlator/features/access-control.so(posix_acl_create+0x25d) [0x7f3b8323a75d] (-->/usr/local/lib/glusterfs/3git/xlator/storage/posix.so(posix_create+0x2a0) [0x7f3b8344b4d0] (-->/usr/local/lib/glusterfs/3git/xlator/storage/posix.so(posix_gfid_set+0x132) [0x7f3b83457202]))) 0-volume3-posix: /temp_disk/bricks/volume3/brick1/.glusterfs/61/61/61612913-fd9b-4fcb-85cd-749d9aae49f3 <--> /temp_disk/bricks/volume3/brick1/.sh1.sh.swp <--> hard

SWAP FILE: .sh1.sh.swp

unlink:
---------------------------------------------
[2012-09-28 07:51:43.727594] I [posix-handle.c:711:posix_handle_unset] (-->/usr/local/lib/libglusterfs.so.0(default_unlink+0x124) [0x7f3b8721cea4] (-->/usr/local/lib/glusterfs/3git/xlator/features/access-control.so(posix_acl_unlink+0x218) [0x7f3b83238b68] (-->/usr/local/lib/glusterfs/3git/xlator/storage/posix.so(posix_unlink+0x575) [0x7f3b8344f8d5]))) 0-volume3-posix: 61612913-fd9b-4fcb-85cd-749d9aae49f3 <--> unset

For gfid 675d18a5-2e71-4881-91b0-309a48e23ed7:

creation of hard link:
--------------------------
[2012-09-28 07:50:43.027252] I [posix-handle.c:593:posix_handle_hard] (-->/usr/local/lib/glusterfs/3git/xlator/features/access-control.so(posix_acl_create+0x25d) [0x7f3b8323a75d] (-->/usr/local/lib/glusterfs/3git/xlator/storage/posix.so(posix_create+0x2a0) [0x7f3b8344b4d0] (-->/usr/local/lib/glusterfs/3git/xlator/storage/posix.so(posix_gfid_set+0x132) [0x7f3b83457202]))) 0-volume3-posix: /temp_disk/bricks/volume3/brick1/.glusterfs/67/5d/675d18a5-2e71-4881-91b0-309a48e23ed7 <--> /temp_disk/bricks/volume3/brick1/.sh1.sh.swp <--> hard

SWAP FILE: .sh1.sh.swp

unlink:
------------------------------
[2012-09-28 07:50:43.029697] I [posix-handle.c:711:posix_handle_unset] (-->/usr/local/lib/libglusterfs.so.0(default_unlink+0x124) [0x7f3b8721cea4] (-->/usr/local/lib/glusterfs/3git/xlator/features/access-control.so(posix_acl_unlink+0x218) [0x7f3b83238b68] (-->/usr/local/lib/glusterfs/3git/xlator/storage/posix.so(posix_unlink+0x575) [0x7f3b8344f8d5]))) 0-volume3-posix: 675d18a5-2e71-4881-91b0-309a48e23ed7 <--> unset

So this is not a bug as long as the arequal checksums (excluding .glusterfs) are the same for both bricks of the replicate volume.
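For anyone re-checking the analysis, a sketch of how the diff above can be produced and how to confirm that the extra entries on brick1 are just empty parent directories. It relies on the fact that a GFID is hard-linked at .glusterfs/<first 2 hex chars>/<next 2 hex chars>/<gfid>; the brick paths are taken from the logs above, and the output file names under /tmp are arbitrary:
------------------------------------------
# List the .glusterfs tree on each brick and diff the sorted listings;
# '<' lines exist only on brick1 (as in the diff shown above).
( cd /temp_disk/bricks/volume3/brick1 && find ./.glusterfs | sort ) > /tmp/b1.list
( cd /temp_disk/bricks/volume3/brick2 && find ./.glusterfs | sort ) > /tmp/b2.list
diff /tmp/b1.list /tmp/b2.list

# The hard link for gfid 61612913-... lived at .glusterfs/61/61/<gfid>;
# after the swap file was unlinked, only the empty parent directory
# .glusterfs/61/61 should remain on brick1.
ls -la /temp_disk/bricks/volume3/brick1/.glusterfs/61/61
------------------------------------------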