Created attachment 1288382 [details] statedump as requested Version-Release number of selected component (if applicable): CentOS Linux release 7.3.1611 (Core) glusterfs 3.8.12 built on May 11 2017 18:46:22 3.10.0-514.21.1.el7.x86_64 glusterfs.x86_64 3.8.12-1.el7 @centos-gluster38 glusterfs-api.x86_64 3.8.12-1.el7 @centos-gluster38 glusterfs-cli.x86_64 3.8.12-1.el7 @centos-gluster38 glusterfs-client-xlators.x86_64 3.8.12-1.el7 @centos-gluster38 glusterfs-fuse.x86_64 3.8.12-1.el7 @centos-gluster38 glusterfs-libs.x86_64 3.8.12-1.el7 @centos-gluster38 glusterfs-server.x86_64 3.8.12-1.el7 @centos-gluster38 It's worth noting that we do run SElinux on these servers, however we have the necessary rules in place and running 'sudo audit2allow -a' does not show any newly required rules. As a test, we also disabled Selinux, but the problem was still reproducible. These servers have been running glusterd for a few months, without issue and we're confident that it's working correctly on about 4 other client sites. How reproducible: We have a single volume 'gvol0' which is split across two nodes (replicated), server1 and server2: Here is the sanitised volume info: Volume Name: gvol0 Type: Replicate Volume ID: 434237432743274327473247 Status: Started Snapshot Count: 0 Number of Bricks: 1 x 2 = 2 Transport-type: tcp Bricks: Brick1: server1:/var/lib/gvol0/brick1 Brick2: server2:/var/lib/gvol0/brick2 Options Reconfigured: nfs.rpc-auth-allow: x.x.x.* nfs.export-volumes: on nfs.addr-namelookup: off auth.allow: x.x.x.* transport.address-family: inet performance.readdir-ahead: on nfs.disable: off Each brick is then mounted on the server at the following mount point location: /mnt/gluster/gvol0/ The issue we are facing is as follows: On server1, there is a particular folder on the mount point, let's call it "_data" that doesn't display any subfolder, nor files within it. It is however possible to stat these files/folders and even edit files and successfully write to the disk: File: ‘_data’ Size: 4096 Blocks: 8 IO Block: 131072 directory Device: 27h/39d Inode: 10981725783047031592 Links: 35 Access: (2775/drwxrwsr-x) Uid: ( 48/ apache) Gid: ( 1024/teamgroup) Context: system_u:object_r:fusefs_t:s0 Access: 2017-06-13 15:42:18.653306032 +0000 Modify: 2017-06-13 15:42:22.461305492 +0000 Change: 2017-06-13 15:42:22.461305492 +0000 Birth: - The weird thing is, that this problem doesn't exist on the mount point on server2 AND I am able to successfully list folders and files within _data on the brick itself at /var/lib/gvol0/brick1/_data/ All volumes are running, including the self-healing daemons: Status of volume: gvol0 Gluster process TCP Port RDMA Port Online Pid ------------------------------------------------------------------------------ Brick server1:/var/lib/gvol0/brick1 49156 0 Y 997 Brick server2:/var/lib/gvol0/brick2 49154 0 Y 1109 NFS Server on localhost 2049 0 Y 17085 Self-heal Daemon on localhost N/A N/A Y 17093 NFS Server on server 1 2049 0 Y 4971 Self-heal Daemon on server 2 N/A N/A Y 4979 Task Status of Volume gvol0 Running the self heal command doesn't return any issues: Brick server1:/var/lib/gvol0/brick1 Status: Connected Number of entries: 0 Brick server2:/var/lib/gvol0/brick2 Status: Connected Number of entries: 0 The workaround: The workaround I found was to rename the _data folder on the mount point in web1. All files and folders then showed up as normal. However I'm conscious that this could be a bug and I'd like to avoid this in the future. If you need any other information, please let me know. Additional Information... Brick information: xfs_info /dev/mapper/vgglus1-gbrick1 meta-data=/dev/mapper/vgglus1-gbrick1 isize=512 agcount=4, agsize=26214144 blks = sectsz=4096 attr=2, projid32bit=1 = crc=1 finobt=0 spinodes=0 data = bsize=4096 blocks=104856576, imaxpct=25 = sunit=0 swidth=0 blks naming =version 2 bsize=4096 ascii-ci=0 ftype=1 log =internal bsize=4096 blocks=51199, version=2 = sectsz=4096 sunit=1 blks, lazy-count=1 realtime =none extsz=4096 blocks=0, rtextents=0 getfattr -d -m. -ehex /var/lib/gvol0/brick1 getfattr: Removing leading '/' from absolute path names # file: var/lib/gvol0/brick1 security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a756e6c6162656c65645f743a733000 [will_emmerson@web-1 ~]$ df -Th Filesystem Type Size Used Avail Use% Mounted on /dev/sda1 xfs 15G 12G 3.2G 79% / devtmpfs devtmpfs 3.5G 0 3.5G 0% /dev tmpfs tmpfs 3.6G 0 3.6G 0% /dev/shm tmpfs tmpfs 3.6G 219M 3.3G 7% /run tmpfs tmpfs 3.6G 0 3.6G 0% /sys/fs/cgroup /dev/mapper/vgglus1-gbrick1 xfs 400G 185G 216G 47% /var/lib/gvol0 tmpfs tmpfs 718M 0 718M 0% /run/user/1002 web-1:/gvol0 fuse.glusterfs 400G 369G 32G 93% /mnt/gluster/gvol0 Statedump attached!
This bug is getting closed because the 3.8 version is marked End-Of-Life. There will be no further updates to this version. Please open a new bug against a version that still receives bugfixes if you are still facing this issue in a more current release.