Description of problem:
I was seeing errors self-healing "/". Upon checking the .glusterfs/00/00/00000000-0000-0000-0000-000000000001 stats, I discovered that some of my bricks had directories instead of symlinks. I replaced the directories with symlinks to ../../.. and set the gfid on those symlinks to 0x00000000000000000000000000000001 and healing was able to return to normal.

Version-Release number of selected component (if applicable):
3.3.0

How reproducible:
Unsure

Steps to Reproduce:
Sorry, there were a lot of things happening all at once so I'm not sure which one of them caused this to happen. I do replica 3 volumes so that may be a variable in this.
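The check described above can be scripted per brick. The following is a minimal read-only sketch; the function name `check_root_handle` is hypothetical, and it only inspects the handle type and link target. It does not read or set the gfid xattr.

```shell
# Report the state of the root gfid handle on one brick.
# Prints "ok" only when the handle is a symlink to ../../..
check_root_handle() {
    handle="$1/.glusterfs/00/00/00000000-0000-0000-0000-000000000001"
    # Test for a symlink first: -d follows symlinks and would also
    # be true for a healthy handle.
    if [ -L "$handle" ]; then
        target=$(readlink "$handle")
        if [ "$target" = "../../.." ]; then
            echo "ok"
        else
            echo "symlink to unexpected target: $target"
        fi
    elif [ -d "$handle" ]; then
        echo "directory"
    elif [ -e "$handle" ]; then
        echo "other"
    else
        echo "missing"
    fi
}
```

It would be called with the brick root as its only argument, for example `check_root_handle /bricks/b01`. A result of "directory" corresponds to the broken state reported here.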
Something affecting self-heal like this would normally make it urgent, but it looks like chance/frequency of occurrence might be low so I'll step it down one notch.
Unable to reproduce this problem. Please feel free to re-open with more details (logs) if you happen to notice this problem again.
I've had two more reports of this problem in IRC. Still no repro though.
I've also suffered this problem. As per pranithk's request on IRC, I am posting some information from a bad directory.

root@server:/pool/c/.glusterfs/1d/c2# ls -l
drwx------ 2 root root 4.0K Mar 20 16:39 1dc2745b-4e1b-41a1-ba9d-59bceb06809c
root@server:/pool/c/.glusterfs/1d/c2# getfattr -m ".*" -e hex -d 1dc2745b-4e1b-41a1-ba9d-59bceb06809c
root@server:/pool/c/.glusterfs/1d/c2# ls -l 1dc2745b-4e1b-41a1-ba9d-59bceb06809c
lrwxrwxrwx 1 root root 55 Mar 20 16:39 BACKUP -> ../../1d/c2/1dc2745b-4e1b-41a1-ba9d-59bceb06809c/BACKUP
root@server:/pool/c/<path to real directory># getfattr -m ".*" -e hex -d .
# file: .
trusted.afr.vol01-client-4=0x000000000000000000000000
trusted.afr.vol01-client-5=0x000000000000000000000000
trusted.gfid=0x1dc2745b4e1b41a1ba9d59bceb06809c
trusted.glusterfs.dht=0x0000000100000000b6db6db4db6db6d7

On another brick without problems:
aff395fe-2d22-49eb-afa1-85c6b70c600f -> ../../1d/c2/1dc2745b-4e1b-41a1-ba9d-59bceb06809c/BACKUP
1dc2745b-4e1b-41a1-ba9d-59bceb06809c -> ../../65/03/650342d0-58cf-48eb-927f-856698b9fff9/<parent directory of BACKUP>

Another example. It's not exactly the same case, and unfortunately I don't have the extended attributes of the real directory:

root@server:/pool/c/.glusterfs/cd/80# ls -l
drwx------ 2 root root 22 Mar 21 15:14 cd8019dd-880f-40d4-a18a-9a6e45ef0510
root@server:/pool/c/.glusterfs/cd/80# stat cd8019dd-880f-40d4-a18a-9a6e45ef0510
  File: `cd8019dd-880f-40d4-a18a-9a6e45ef0510'
  Size: 22 Blocks: 0 IO Block: 4096 directory
Device: 10302h/66306d Inode: 2480359527 Links: 2
Access: (0700/drwx------) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2013-03-21 20:37:13.314935893 +0100
Modify: 2013-03-21 15:14:26.416891579 +0100
Change: 2013-03-21 15:14:26.416891579 +0100
root@server:/pool/c/.glusterfs/cd/80# getfattr -m ".*" -e hex -d cd8019dd-880f-40d4-a18a-9a6e45ef0510
root@server:/pool/c/.glusterfs/cd/80# ls -l cd8019dd-880f-40d4-a18a-9a6e45ef0510
total 0
lrwxrwxrwx 1 root root 58 Mar 19 16:45 T02-19_20 -> ../../f0/96/f0968474-b319-4073-a453-eafe3bd7e60f/T02-19_20

On another brick where there is no problem:
b731f16a-d361-4e83-9e03-11f04d51ee08 -> ../../f0/96/f0968474-b319-4073-a453-eafe3bd7e60f/T02-19_20

It seems that some kind of split-brain has been resolved incorrectly.
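To find other affected handles on a brick, one option is to list every entry at the gfid level of .glusterfs that is a real directory rather than a symlink. This is a rough sketch, assuming the two-level hex layout visible in the paths above (.glusterfs/<xx>/<yy>/<gfid>); the function name `list_directory_handles` is hypothetical, and on a large brick the glob may be slow.

```shell
# Print every gfid handle on a brick that is a real directory.
# At this level, directories are expected to be represented by
# symlinks, so any hit is worth inspecting by hand.
list_directory_handles() {
    brick=$1
    for handle in "$brick"/.glusterfs/[0-9a-f][0-9a-f]/[0-9a-f][0-9a-f]/*; do
        # -d follows symlinks, so exclude symlinks explicitly.
        if [ -d "$handle" ] && [ ! -L "$handle" ]; then
            printf '%s\n' "$handle"
        fi
    done
}
```

For the brick shown above it would be invoked as `list_directory_handles /pool/c`. The function only reads the brick and changes nothing.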
I have been able to reproduce the problem. I had to modify the contents of one brick directly. I'm not sure how, or if, these modifications can happen without direct access to the brick.

[root@glnas01 ~]# gluster volume create vol01 replica 2 glnas01:/bricks/b01 glnas02:/bricks/b01
Creation of volume vol01 has been successful. Please start the volume to access data.
[root@glnas01 ~]# gluster volume start vol01
Starting volume vol01 has been successful
[root@glnas01 ~]# mount -t glusterfs glnas01:/vol01 /vol01
[root@glnas01 ~]# mkdir -p /vol01/dir1/dir2
[root@glnas01 ~]# getfattr -m. -e hex -d /bricks/b01/dir1
getfattr: Removing leading '/' from absolute path names
# file: bricks/b01/dir1
trusted.gfid=0x43e7a966ce8944e7ba8f2cb00fc0a16f
[root@glnas01 ~]# getfattr -m. -e hex -d /bricks/b01/dir1/dir2
getfattr: Removing leading '/' from absolute path names
# file: bricks/b01/dir1/dir2
trusted.gfid=0x923114807a9445819e1f38ae427a8b95
[root@glnas01 ~]# rm -f /bricks/b01/.glusterfs/43/e7/43e7a966-ce89-44e7-ba8f-2cb00fc0a16f
[root@glnas01 ~]# rmdir /bricks/b01/dir1/dir2
[root@glnas01 ~]# gluster volume heal vol01 full
Launching Heal operation on volume vol01 has been successful
Use heal info commands to check status
[root@glnas01 ~]# ls -l /bricks/b01/.glusterfs/43/e7
total 4
drwx------ 2 root root 4096 28 mar 14:53 43e7a966-ce89-44e7-ba8f-2cb00fc0a16f

The problem occurs when self-heal tries to regenerate dir2, whose gfid already exists inside .glusterfs, while the gfid handle of at least one of dir2's parents does not exist.

In posix_handle_soft(), newpath is built using MAKE_HANDLE_PATH(), which returns
/bricks/b01/.glusterfs/43/e7/43e7a966-ce89-44e7-ba8f-2cb00fc0a16f/dir2
instead of the expected
/bricks/b01/.glusterfs/92/31/92311480-7a94-4581-9e1f-38ae427a8b95
because this last symbolic link exists and MAKE_HANDLE_PATH() tries to resolve it. However, as 43e7a966-ce89-44e7-ba8f-2cb00fc0a16f does not exist, it can't resolve it.

After that, a call to posix_handle_mkdir_hashes() creates the last two levels of the dirname of the path, in this case 'e7' and '43e7a966-ce89-44e7-ba8f-2cb00fc0a16f'.
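The effect of that last step can be imitated in a scratch directory with plain shell commands. This is only a simulation of the sequence described above, not GlusterFS code: the function name `simulate_bad_handle` is hypothetical, and `mkdir -p` stands in for posix_handle_mkdir_hashes() (it creates every missing level, not just the last two).

```shell
# Simulation only: reproduces the resulting on-disk state, without
# calling any GlusterFS code. Run it against a scratch directory.
simulate_bad_handle() {
    brick=$1
    parent_gfid=43e7a966-ce89-44e7-ba8f-2cb00fc0a16f

    # Starting state: .glusterfs exists, the parent's handle does not.
    mkdir -p "$brick/.glusterfs"

    # The path reported for newpath goes through the parent's handle.
    newpath="$brick/.glusterfs/43/e7/$parent_gfid/dir2"

    # Stand-in for posix_handle_mkdir_hashes(): create the dirname of
    # newpath. The parent's gfid becomes a real directory, not a symlink.
    mkdir -p "$(dirname "$newpath")"

    ls -ld "$(dirname "$newpath")"
}
```

Running `simulate_bad_handle "$(mktemp -d)"` prints a `drwx` entry named after the parent gfid, the same kind of entry shown in the `ls -l` output above.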
REVIEW: http://review.gluster.org/5075 (storage/posix: do not dereference gfid symlinks before posix_handle_mkdir_hashes()) posted (#1) for review on master by Xavier Hernandez (xhernandez)
I have the same problem when glusterfs is killed by the kernel OOM killer; I still think glusterfs has memory allocation issues. It took me a long time to figure out the issue with the 00000000-0000-0000-0000-000000000001 link/directory. This is a serious issue for me as well, and it caused a few hours of downtime to clean up the mess.
REVIEW: http://review.gluster.org/5075 (storage/posix: do not dereference gfid symlinks before posix_handle_mkdir_hashes()) posted (#2) for review on master by Xavier Hernandez (xhernandez)
REVIEW: http://review.gluster.org/6736 (storage/posix: do not dereference gfid symlinks before posix_handle_mkdir_hashes()) posted (#1) for review on release-3.5 by Xavier Hernandez (xhernandez)
REVIEW: http://review.gluster.org/6737 (storage/posix: do not dereference gfid symlinks before posix_handle_mkdir_hashes()) posted (#1) for review on release-3.4 by Xavier Hernandez (xhernandez)
REVIEW: http://review.gluster.org/5075 (storage/posix: do not dereference gfid symlinks before posix_handle_mkdir_hashes()) posted (#3) for review on master by Xavier Hernandez (xhernandez)
REVIEW: http://review.gluster.org/6736 (storage/posix: do not dereference gfid symlinks before posix_handle_mkdir_hashes()) posted (#2) for review on release-3.5 by Xavier Hernandez (xhernandez)
REVIEW: http://review.gluster.org/6737 (storage/posix: do not dereference gfid symlinks before posix_handle_mkdir_hashes()) posted (#2) for review on release-3.4 by Xavier Hernandez (xhernandez)
COMMIT: http://review.gluster.org/5075 committed in master by Vijay Bellur (vbellur)
------
commit c7838fbd6afd876c922e1ec681bbbcf73be653e5
Author: Xavier Hernandez <xhernandez>
Date: Thu May 23 11:13:25 2013 +0200

    storage/posix: do not dereference gfid symlinks before posix_handle_mkdir_hashes()

    Whenever a new directory is created, its corresponding gfid file must also be created. This was done first calling MAKE_HANDLE_PATH() to get the path of the gfid file, then calling posix_handle_mkdir_hashes() to create the parent directories of the gfid, and finally creating the soft-link.

    In normal circumstances, the gfid we want to create won't exist and MAKE_HANDLE_PATH() will return a simple path to the new gfid. However if the volume is damaged and a self-heal is running, it is possible that we try to create an already existing gfid. In this case, MAKE_HANDLE_PATH() will return a path to the directory instead of the path to the gfid.

    To solve this problem, every time a path to a gfid is needed, a call to MAKE_HANDLE_ABSPATH() is made instead of the call to MAKE_HANDLE_PATH().

    Change-Id: Ic319cc38c170434db8e86e2f89f0b8c28c0d611a
    BUG: 859581
    Signed-off-by: Xavier Hernandez <xhernandez>
    Reviewed-on: http://review.gluster.org/5075
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    Reviewed-by: Vijay Bellur <vbellur>
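For checking bricks by hand, the "simple path" to a gfid can be derived from the gfid alone, with no symlink lookup. The sketch below follows the layout visible in the paths quoted earlier in this report (.glusterfs/<first two hex digits>/<next two>/<gfid>). Both function names are hypothetical helpers; they are not part of GlusterFS and are not claimed to match the internals of MAKE_HANDLE_ABSPATH().

```shell
# Convert a trusted.gfid value as printed by "getfattr -e hex"
# (0x followed by 32 hex digits) into canonical UUID text.
gfid_hex_to_uuid() {
    printf '%s\n' "$1" |
        sed -E 's/^0x//; s/^(.{8})(.{4})(.{4})(.{4})(.{12})$/\1-\2-\3-\4-\5/'
}

# Build the handle path from the brick root and the gfid text only.
gfid_handle_path() {
    brick=$1
    gfid=$2
    first=$(printf '%s' "$gfid" | cut -c1-2)
    second=$(printf '%s' "$gfid" | cut -c3-4)
    printf '%s/.glusterfs/%s/%s/%s\n' "$brick" "$first" "$second" "$gfid"
}
```

With the values from the reproduction, `gfid_handle_path /bricks/b01 "$(gfid_hex_to_uuid 0x923114807a9445819e1f38ae427a8b95)"` yields /bricks/b01/.glusterfs/92/31/92311480-7a94-4581-9e1f-38ae427a8b95, the path the report calls the expected one.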
COMMIT: http://review.gluster.org/6736 committed in release-3.5 by Niels de Vos (ndevos)
------
commit b3fd7004a4a579c64ed29ee7eeb7e0fa57a3591f
Author: Xavier Hernandez <xhernandez>
Date: Thu May 23 11:13:25 2013 +0200

    storage/posix: do not dereference gfid symlinks before posix_handle_mkdir_hashes()

    Whenever a new directory is created, its corresponding gfid file must also be created. This was done first calling MAKE_HANDLE_PATH() to get the path of the gfid file, then calling posix_handle_mkdir_hashes() to create the parent directories of the gfid, and finally creating the soft-link.

    In normal circumstances, the gfid we want to create won't exist and MAKE_HANDLE_PATH() will return a simple path to the new gfid. However if the volume is damaged and a self-heal is running, it is possible that we try to create an already existing gfid. In this case, MAKE_HANDLE_PATH() will return a path to the directory instead of the path to the gfid.

    To solve this problem, every time a path to a gfid is needed, a call to MAKE_HANDLE_ABSPATH() is made instead of the call to MAKE_HANDLE_PATH().

    BUG: 859581
    Change-Id: I84405bf04562e647fc02445f45358e9451f9b479
    Signed-off-by: Xavier Hernandez <xhernandez>
    Reviewed-on: http://review.gluster.org/6736
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Kaleb KEITHLEY <kkeithle>
    Reviewed-by: Raghavendra Bhat <raghavendra>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    Reviewed-by: Niels de Vos <ndevos>
Moving to POST, still waiting for the merging of http://review.gluster.org/6737.
The first (and last?) Beta for GlusterFS 3.5.1 has been released [1]. Please verify whether the release solves this bug report for you. In case the glusterfs-3.5.1beta release does not have a resolution for this issue, leave a comment in this bug and move the status to ASSIGNED. If this release fixes the problem for you, leave a note and change the status to VERIFIED.

Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update (possibly an "updates-testing" repository) infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-May/040377.html
[2] http://supercolony.gluster.org/pipermail/gluster-users/
The problem seems to be fixed in this version.
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.5.1, please reopen this bug report.

glusterfs-3.5.1 has been announced on the Gluster Users mailing list [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-June/040723.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user