+++ This bug was initially created as a clone of Bug #1564071 +++

Description of problem:

Some directories and files can't be seen by the clients. They exist and can be entered, but they don't show up in ls. If moved, they appear; if moved back to the old name, they disappear again.

On the server:

> The directory exists on all bricks and contains some files.
$ cd /$path_to_brick/data/dir1/
$ ls
dir2
$ find /$path_to_brick/data/dir1/
/$path_to_brick/data/dir1/
/$path_to_brick/data/dir1/dir2
/$path_to_brick/data/dir1/dir2/file1
/$path_to_brick/data/dir1/dir2/file2

On all clients:

> The directory seems empty.
$ cd /$path_to_mount/dir1/
$ ls
$
> We can list the files in the invisible directory.
$ ls dir2
file1  file2
> We can move the directory to make it reappear.
$ mv dir2 dir2_new
$ ls
dir2_new
> But after moving it back, it disappears again.
$ mv dir2_new dir2
$ ls
$
> We can't remove the affected directory.
$ cd; rm -rf /$path_to_mount/dir1/
rm: cannot remove ‘/$path_to_mount/dir1/dir2’: Directory not empty
rm: cannot remove ‘/$path_to_mount/dir1/dir3’: Directory not empty

Some files in the affected directory show the same behaviour.

What we tried so far:
Setting "performance.readdir-ahead" to off didn't change the behaviour.

Version-Release number of selected component (if applicable):

$ gluster --version
glusterfs 3.12.7
Repository revision: git://git.gluster.org/glusterfs.git
Copyright (c) 2006-2016 Red Hat, Inc. <https://www.gluster.org/>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
It is licensed to you under your choice of the GNU Lesser General Public License, version 3 or any later version (LGPLv3 or later), or the GNU General Public License, version 2 (GPLv2), in all cases as published by the Free Software Foundation.
$ dpkg -l | grep glusterfs-server
ii  glusterfs-server  3.12.7-1  amd64  clustered file-system (server package)

$ cat /etc/apt/sources.list.d/gluster.list
deb [arch=amd64] https://download.gluster.org/pub/gluster/glusterfs/3.12/LATEST/Debian/stretch/amd64/apt stretch main

How reproducible:
Seems to be random; we weren't able to break a directory on purpose. An affected directory did not return to normal on its own.

Additional info:
Gluster is installed on Debian Stretch from https://download.gluster.org. It was upgraded from Gluster 3.8 a while ago, and one day before the first observed occurrence it was updated from 3.12.6 to 3.12.7. A clean reboot of all GlusterFS servers happened during this update. According to the user who owns the affected directory, it was created after the update. The volume is mounted via FUSE.

$ gluster volume info $volume
Volume Name: $volume
Type: Distribute
Volume ID: 0d210c70-e44f-46f1-862c-ef260514c9f1
Status: Started
Snapshot Count: 0
Number of Bricks: 23
Transport-type: tcp
Bricks:
Brick1: gluster02:/srv/glusterfs/bricks/DATA201/data
Brick2: gluster02:/srv/glusterfs/bricks/DATA202/data
Brick3: gluster02:/srv/glusterfs/bricks/DATA203/data
Brick4: gluster02:/srv/glusterfs/bricks/DATA204/data
Brick5: gluster02:/srv/glusterfs/bricks/DATA205/data
Brick6: gluster02:/srv/glusterfs/bricks/DATA206/data
Brick7: gluster02:/srv/glusterfs/bricks/DATA207/data
Brick8: gluster02:/srv/glusterfs/bricks/DATA208/data
Brick9: gluster01:/srv/glusterfs/bricks/DATA110/data
Brick10: gluster01:/srv/glusterfs/bricks/DATA111/data
Brick11: gluster01:/srv/glusterfs/bricks/DATA112/data
Brick12: gluster01:/srv/glusterfs/bricks/DATA113/data
Brick13: gluster01:/srv/glusterfs/bricks/DATA114/data
Brick14: gluster02:/srv/glusterfs/bricks/DATA209/data
Brick15: gluster01:/srv/glusterfs/bricks/DATA101/data
Brick16: gluster01:/srv/glusterfs/bricks/DATA102/data
Brick17: gluster01:/srv/glusterfs/bricks/DATA103/data
Brick18: gluster01:/srv/glusterfs/bricks/DATA104/data
Brick19: gluster01:/srv/glusterfs/bricks/DATA105/data
Brick20: gluster01:/srv/glusterfs/bricks/DATA106/data
Brick21: gluster01:/srv/glusterfs/bricks/DATA107/data
Brick22: gluster01:/srv/glusterfs/bricks/DATA108/data
Brick23: gluster01:/srv/glusterfs/bricks/DATA109/data
Options Reconfigured:
nfs.addr-namelookup: off
transport.address-family: inet
nfs.disable: on
diagnostics.brick-log-level: ERROR
performance.readdir-ahead: on
auth.allow: $OUR_IP_SPACE
features.quota: on
features.inode-quota: on
features.quota-deem-statfs: on

Nothing interesting appeared in the brick logs; just the normal boot messages as always.
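One thing that may be worth checking on the bricks, given the "stale gfid handle" angle this bug took: each directory's gfid (readable via getfattr) maps to a symlink under the brick's .glusterfs handle tree, and a missing or dangling handle would fit the "present on the brick but invisible to clients" symptom. Below is a sketch of how to locate that handle, assuming the usual `.glusterfs/<first two hex chars>/<next two>/<full gfid>` layout; the gfid value used here is purely illustrative, not taken from this report.

```shell
# On a real brick, the gfid of the affected directory would be read with:
#   getfattr -n trusted.gfid -e hex /$path_to_brick/data/dir1/dir2
# The value below is made up for illustration only.
gfid="d0c34321-1234-5678-9abc-def012345678"

# Assumed handle layout: .glusterfs/<gfid[0:2]>/<gfid[2:4]>/<full gfid>
first=$(printf '%s' "$gfid" | cut -c1-2)
second=$(printf '%s' "$gfid" | cut -c3-4)
handle=".glusterfs/$first/$second/$gfid"
echo "$handle"   # .glusterfs/d0/c3/d0c34321-1234-5678-9abc-def012345678
```

For a directory, that handle is expected to be a symlink; `ls -l /$path_to_brick/$handle` on the owning brick would show whether it exists and still resolves.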
REVIEW: https://review.gluster.org/20535 (posix: check before removing stale symlink) posted (#1) for review on release-4.1 by Ravishankar N
COMMIT: https://review.gluster.org/20535 committed in release-4.1 by "Shyamsundar Ranganathan" <srangana> with a commit message:

posix: check before removing stale symlink

BZ 1564071 complains of directories with missing gfid symlinks and corresponding "Found stale gfid handle" messages in the logs. Hence add a check to see if the symlink points to an actual directory before removing it.

Note: Removing stale symlinks was added via commit 3e9a9c029fac359477fb26d9cc7803749ba038b2

Change-Id: I5d91fab8e5f3a621a9ecad4a1f9c898a3c2d346a
Updates: bz#1603099
Signed-off-by: Ravishankar N <ravishankar>
(cherry picked from commit 9ab218be5e69b9f71fe4eea9ca8d114b78cafd25)
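The logic the patch adds can be illustrated in shell terms. This is only a sketch of the decision rule described in the commit message, not the actual posix-xlator C code, and all paths here are throwaway examples: a handle symlink is treated as stale, and removed, only if following it does not reach a real directory.

```shell
# Sketch: decide whether a gfid handle symlink is stale.
# A symlink that still resolves to a directory must be kept;
# only a dangling one may safely be unlinked.
is_stale_handle() {
    link=$1
    if [ -h "$link" ] && [ ! -d "$link" ]; then
        return 0    # dangling symlink: stale, safe to remove
    fi
    return 1        # resolves to a directory (or is not a symlink): keep
}

# Demonstration with throwaway paths:
tmp=$(mktemp -d)
mkdir "$tmp/realdir"
ln -s "$tmp/realdir" "$tmp/good"      # points at an existing directory
ln -s "$tmp/gone"    "$tmp/dangling"  # points nowhere

is_stale_handle "$tmp/good"     && echo "good: stale" || echo "good: keep"
is_stale_handle "$tmp/dangling" && echo "dangling: stale" || echo "dangling: keep"
# prints: "good: keep" then "dangling: stale"
rm -rf "$tmp"
```

The pre-patch behaviour, as described above, amounted to removing such symlinks without the `[ ! -d ]` guard, which could delete handles for perfectly valid directories.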
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-4.1.2, please open a new bug report. glusterfs-4.1.2 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution. [1] https://lists.gluster.org/pipermail/announce/2018-July/000106.html [2] https://www.gluster.org/pipermail/gluster-users/