Bug 855856 - .glusterfs is missing entries after self heal and lookup
Summary: .glusterfs is missing entries after self heal and lookup
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterfs
Version: 2.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: vsomyaju
QA Contact: Sudhir D
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2012-09-10 12:11 UTC by Rahul Hinduja
Modified: 2015-03-05 00:06 UTC
CC List: 4 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-09-28 09:23:35 UTC
Embargoed:



Description Rahul Hinduja 2012-09-10 12:11:06 UTC
Description of problem:

.glusterfs is missing entries after volume heal and lookup 

Version-Release number of selected component (if applicable):

glusterfs 3.3.0rhs built on Jul 25 11:21:58

(glusterfs 3.3.0rhs-25.el6rhs.x86_64)

How reproducible:

1/1

Steps to Reproduce:
1. Create a distribute volume with 1 brick.
2. Populate a large amount of data on the volume. (We had 85 directories containing 1200 files in total.)
3. Convert the volume to replicate (1x2) using "gluster volume add-brick <vol-name> replica 2 <brick>".
4. Run "gluster volume heal <vol-name> full" to perform self-heal.
5. Wait until "gluster volume heal <vol-name> info" reports "Number of entries : 0".
6. Use arequal-checksum to calculate the checksum of the bricks (brick1 and brick2). (Steps 1-5 are sketched as commands after this list.)
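
A minimal command sketch of steps 1-5 (the volume name "testvol", host "server1", and brick paths are hypothetical):

------------------------------------------
# create a single-brick distribute volume and start it
gluster volume create testvol server1:/bricks/brick1
gluster volume start testvol
# (populate data through a mount point here)
# convert to replicate (1x2) by adding a second brick
gluster volume add-brick testvol replica 2 server1:/bricks/brick2
# trigger a full self-heal and poll until it finishes
gluster volume heal testvol full
gluster volume heal testvol info    # wait for "Number of entries : 0"
------------------------------------------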
  
Actual results:

Arequal doesn't match for brick1 and brick2. The newly added brick has fewer directories under ".glusterfs". Even after performing a lookup on the mount point using "find . | xargs stat", the ".glusterfs" directories still differ between brick1 and brick2.

Expected results:

arequal should match and should report an equal number of directories and files for brick1 and brick2.

Additional info:

Comment 2 vsomyaju 2012-09-28 09:23:35 UTC
I have reproduced the problem and found that temporary files (e.g. editor swap files) can be created on the volume; when they are removed, only the gfid hard links are removed, while the parent directories under .glusterfs remain even after that.

For example, the difference between the .glusterfs trees of brick1 and brick2:
------------------------------------------
3747d3746
< ./.glusterfs/61/61
3970d3968
< ./.glusterfs/67/5d
9765,9766d9762
< ./.glusterfs/indices
< ./.glusterfs/indices/xattrop
------------------------------------------

From the log files it can be seen that some temporary files were created and removed, so the gfid hard links were created and removed only for those swap files.
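
For context: the posix translator stores one hard link per file under .glusterfs, at a path built from the first four hex characters of the gfid, which is why the leftover directories above are .glusterfs/61/61 and .glusterfs/67/5d. A minimal shell sketch of the mapping (gfid taken from the logs below):

------------------------------------------
gfid=61612913-fd9b-4fcb-85cd-749d9aae49f3
# handle path is .glusterfs/<gfid[0:2]>/<gfid[2:4]>/<gfid>
echo ".glusterfs/${gfid:0:2}/${gfid:2:2}/${gfid}"
# -> .glusterfs/61/61/61612913-fd9b-4fcb-85cd-749d9aae49f3
# unlink removes only this hard link; the 61/61 parent
# directories stay behind, which produces the diff above
------------------------------------------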


for gfid:61612913-fd9b-4fcb-85cd-749d9aae49f3


creation of hard link:61612913-fd9b-4fcb-85cd-749d9aae49f3
--------------------------
[2012-09-28 07:50:43.032358] I [posix-handle.c:593:posix_handle_hard] (-->/usr/local/lib/glusterfs/3git/xlator/features/access-control.so(posix_acl_create+0x25d) [0x7f3b8323a75d] (-->/usr/local/lib/glusterfs/3git/xlator/storage/posix.so(posix_create+0x2a0) [0x7f3b8344b4d0] (-->/usr/local/lib/glusterfs/3git/xlator/storage/posix.so(posix_gfid_set+0x132) [0x7f3b83457202]))) 0-volume3-posix: /temp_disk/bricks/volume3/brick1/.glusterfs/61/61/61612913-fd9b-4fcb-85cd-749d9aae49f3 <--> /temp_disk/bricks/volume3/brick1/.sh1.sh.swp <--> hard

SWAP FILE:.sh1.sh.swp

unlink:61612913-fd9b-4fcb-85cd-749d9aae49f3
---------------------------------------------
[2012-09-28 07:51:43.727594] I [posix-handle.c:711:posix_handle_unset] (-->/usr/local/lib/libglusterfs.so.0(default_unlink+0x124) [0x7f3b8721cea4] (-->/usr/local/lib/glusterfs/3git/xlator/features/access-control.so(posix_acl_unlink+0x218) [0x7f3b83238b68] (-->/usr/local/lib/glusterfs/3git/xlator/storage/posix.so(posix_unlink+0x575) [0x7f3b8344f8d5]))) 0-volume3-posix: 61612913-fd9b-4fcb-85cd-749d9aae49f3 <--> unset



for gfid:675d18a5-2e71-4881-91b0-309a48e23ed7


creation of hard link:
--------------------------
[2012-09-28 07:50:43.027252] I [posix-handle.c:593:posix_handle_hard] (-->/usr/local/lib/glusterfs/3git/xlator/features/access-control.so(posix_acl_create+0x25d) [0x7f3b8323a75d] (-->/usr/local/lib/glusterfs/3git/xlator/storage/posix.so(posix_create+0x2a0) [0x7f3b8344b4d0] (-->/usr/local/lib/glusterfs/3git/xlator/storage/posix.so(posix_gfid_set+0x132) [0x7f3b83457202]))) 0-volume3-posix: /temp_disk/bricks/volume3/brick1/.glusterfs/67/5d/675d18a5-2e71-4881-91b0-309a48e23ed7 <--> /temp_disk/bricks/volume3/brick1/.sh1.sh.swp <--> hard

SWAP FILE:.sh1.sh.swp


unlink:
------------------------------
[2012-09-28 07:50:43.029697] I [posix-handle.c:711:posix_handle_unset] (-->/usr/local/lib/libglusterfs.so.0(default_unlink+0x124) [0x7f3b8721cea4] (-->/usr/local/lib/glusterfs/3git/xlator/features/access-control.so(posix_acl_unlink+0x218) [0x7f3b83238b68] (-->/usr/local/lib/glusterfs/3git/xlator/storage/posix.so(posix_unlink+0x575) [0x7f3b8344f8d5]))) 0-volume3-posix: 675d18a5-2e71-4881-91b0-309a48e23ed7 <--> unset

So it is not a bug, as long as the arequal-checksums (excluding .glusterfs) are the same for both bricks of the replicate volume.
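
One way to verify this directly is to diff the directory listings of the two bricks with .glusterfs pruned; a minimal sketch (brick paths are hypothetical):

------------------------------------------
cd /bricks/brick1 && find . -path ./.glusterfs -prune -o -print | sort > /tmp/brick1.list
cd /bricks/brick2 && find . -path ./.glusterfs -prune -o -print | sort > /tmp/brick2.list
diff /tmp/brick1.list /tmp/brick2.list   # empty output => bricks match outside .glusterfs
------------------------------------------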

