Description of problem:

We found that our Gluster clients couldn't see directories when running `ls` or `find`.

- They could create directories (which could not be seen after the fact).
- They could enter the directories they couldn't see with `cd`.
- They could create and see files.
- The hosts could see the directories.

After disabling `performance.parallel-readdir` on each volume, the problem went away.

As per the docs, prior to enabling `performance.parallel-readdir` I had enabled `performance.readdir-ahead`.

We suspect that, as our topology is replica 3 arbiter 1, _perhaps_ the read operations are also happening on the arbiter node, where of course the data doesn't _really_ exist, only the metadata?

Version-Release number of selected component (if applicable):
- CentOS 7 x64
- Gluster versions 3.12.1 and 3.12.2
- Gluster client versions 3.12.1 and 3.12.2

How reproducible:
- Always

Steps to Reproduce (a CLI sketch follows at the end of this comment):
1. Set up a CentOS 7, replica 3 arbiter 1 cluster running 3.12.2 (I'm assuming .3 will also have the problem)
2. Create a volume for use by the native Gluster FUSE client
3. Enable performance.parallel-readdir on the volume
4. Mount the volume on a client using the native FUSE client
5. Create a directory within the volume

Actual results:
- The directory cannot be seen with ls, find etc.

Expected results:
- The directory should show up

Additional info:

Example volume (in the broken state):

# gluster volume info dev_static

Volume Name: dev_static
Type: Replicate
Volume ID: e5042a4d-9ee8-42e4-a4b2-fd66c3e8cb39
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: int-gluster-01.fqdn:/mnt/gluster-storage/dev_static
Brick2: int-gluster-02.fqdn:/mnt/gluster-storage/dev_static
Brick3: int-gluster-03.fqdn:/mnt/gluster-storage/dev_static
Options Reconfigured:
performance.parallel-readdir: true
performance.cache-refresh-timeout: 2
performance.write-behind-window-size: 2MB
server.event-threads: 10
performance.stat-prefetch: true
performance.io-thread-count: 32
performance.cache-size: 128MB
network.ping-timeout: 10
diagnostics.client-log-level: ERROR
diagnostics.brick-log-level: ERROR
cluster.use-compound-fops: true
cluster.readdir-optimize: true
cluster.lookup-optimize: true
cluster.favorite-child-policy: size
client.event-threads: 10
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: true
cluster.brick-multiplex: enable
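A minimal CLI sketch of the reproduction steps above. The host names (gluster-0N.example), volume name (testvol), brick path (/mnt/gluster-storage/testvol) and client mount point (/mnt/testvol) are placeholders I've made up, not values from this report:

# gluster volume create testvol replica 3 arbiter 1 \
    gluster-01.example:/mnt/gluster-storage/testvol \
    gluster-02.example:/mnt/gluster-storage/testvol \
    gluster-03.example:/mnt/gluster-storage/testvol
# gluster volume start testvol
# gluster volume set testvol performance.readdir-ahead on
# gluster volume set testvol performance.parallel-readdir on

Then, on a client:

# mount -t glusterfs gluster-01.example:/testvol /mnt/testvol
# mkdir /mnt/testvol/somedir
# ls /mnt/testvol
(somedir is not listed, although "cd /mnt/testvol/somedir" still works)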
I tried this on my local setup and couldn't reproduce the issue; the directories were listed for me. I will need more details:
- From which version did you upgrade to 3.12.1?
- With which version were these directories created?
- Does ls fail to see even newly created directories?
- From which version did you start enabling parallel-readdir?
- Is it possible to unmount and remount the volume? This is not strictly required; creating another mount point locally and trying ls there will also do (see the sketch below).
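A sketch of that second, throwaway mount; the server address, volume name and mount point below are placeholders:

# mkdir -p /mnt/gluster-test
# mount -t glusterfs <server>:/<volname> /mnt/gluster-test
# ls /mnt/gluster-test
# umount /mnt/gluster-test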
I was able to hit the issue on a single-brick distribute volume as well, so this is not related to AFR or arbiter as such. When parallel-readdir is enabled, the skip_dirs flag for posix_fill_readdir is set to true.

My volinfo:

Volume Name: testvol
Type: Distribute
Volume ID: 0c3b3c49-db17-4c14-95f8-e0e3f8b3f071
Status: Started
Snapshot Count: 0
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: 127.0.0.2:/bricks/brick1
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
performance.parallel-readdir: off
performance.client-io-threads: true
client.event-threads: 10
cluster.favorite-child-policy: size
cluster.lookup-optimize: true
cluster.readdir-optimize: true
cluster.use-compound-fops: true
performance.cache-size: 128MB
performance.io-thread-count: 32
performance.stat-prefetch: true
server.event-threads: 10
performance.write-behind-window-size: 2MB
performance.cache-refresh-timeout: 2
transport.address-family: inet
nfs.disable: on
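For anyone wanting to reproduce this with a similar single-brick distribute volume, a sketch (the brick path matches the volinfo above; the mount point is assumed, and "force" may be needed if the brick sits on the root partition):

# gluster volume create testvol 127.0.0.2:/bricks/brick1 force
# gluster volume start testvol
# gluster volume set testvol performance.parallel-readdir on
# mount -t glusterfs 127.0.0.2:/testvol /mnt/testvol
# mkdir /mnt/testvol/dir1
# ls /mnt/testvol
(dir1 is not listed when the bug is hit)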
REVIEW: https://review.gluster.org/18723 (dht: Fill first_up_subvol before use in dht_opendir) posted (#1) for review on master by Poornima G
REVISION POSTED: https://review.gluster.org/18723 (dht: Fill first_up_subvol before use in dht_opendir) posted (#2) for review on master by Poornima G
I just met someone else who was caught unaware by this bug. I was wondering, until it's fixed - is there something you can set on the Gluster server side to prevent clients from ever trying to use the arbiter node?
(In reply to Sam McLeod from comment #5)
> I just met someone else who was caught unaware by this bug.
>
> I was wondering, until it's fixed - is there something you can set on the
> Gluster server side to prevent clients from ever trying to use the arbiter
> node?

Sam, this is not related to arbiter (see comment #2). Setting performance.parallel-readdir to off on the volumes should serve as a workaround.
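A sketch of the workaround, with <volname> standing in for each affected volume:

# gluster volume set <volname> performance.parallel-readdir off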
This is hit only when parallel-readdir is enabled. In your use case there is only one distribute subvolume, so there is not much to parallelize; I would suggest disabling parallel-readdir in this case. Did you actually see any performance improvement after enabling parallel-readdir on your setup? You can check the current setting and the distribute layout as sketched below.
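A sketch of how to check both, with <volname> as a placeholder:

# gluster volume get <volname> performance.parallel-readdir
# gluster volume info <volname> | grep 'Number of Bricks'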
The fix for this is part of releases 4.1 and 5. As there will be no more 3.12.x releases once version 5 is released, I will close this BZ once 5 is out.
REVIEW: https://review.gluster.org/21364 (dht: Fill first_up_subvol before use in dht_opendir) posted (#1) for review on release-3.12 by N Balachandran
COMMIT: https://review.gluster.org/21364 committed in release-3.12 by "jiffin tony Thottan" <jthottan> with a commit message- dht: Fill first_up_subvol before use in dht_opendir

Reported by: Sam McLeod

Change-Id: Ic8f9b46b173796afd70aff1042834b03ac3e80b2
BUG: 1512371
Signed-off-by: Poornima G <pgurusid>
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.12.15, please open a new bug report.

glusterfs-3.12.15 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2018-October/000114.html
[2] https://www.gluster.org/pipermail/gluster-users/
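For CentOS 7 the updated packages normally come through the CentOS Storage SIG; a sketch, assuming the centos-release-gluster312 repository package is the one in use (check your own repository setup first):

# yum install centos-release-gluster312
# yum update 'glusterfs*'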