Description of problem:
-----------------------
2x2 volume, trying to mount via FUSE. Enable parallel readdir, then set the cache limit to > 1GB. Turn off parallel readdir and try to mount the volume. The mount fails.

Snippet from mount logs:

[2017-04-25 10:45:06.698688] I [MSGID: 100030] [glusterfsd.c:2417:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.8.4 (args: /usr/sbin/glusterfs --volfile-server=gqas013.sbu.lab.eng.bos.redhat.com --volfile-id=/testvol /gluster-mount)
[2017-04-25 10:45:06.706458] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-04-25 10:45:06.760460] E [MSGID: 101028] [options.c:168:xlator_option_validate_sizet] 0-testvol-readdir-ahead: '2147483648' in 'option rda-cache-limit 2GB' is out of range [0 - 1073741824]
[2017-04-25 10:45:06.760476] W [MSGID: 101029] [options.c:945:xl_opt_validate] 0-testvol-readdir-ahead: validate of rda-cache-limit returned -1
[2017-04-25 10:45:06.760484] E [MSGID: 101090] [graph.c:301:glusterfs_graph_validate_options] 0-testvol-readdir-ahead: validation failed: '2147483648' in 'option rda-cache-limit 2GB' is out of range [0 - 1073741824]
[2017-04-25 10:45:06.760490] E [MSGID: 101090] [graph.c:672:glusterfs_graph_activate] 0-graph: validate options failed
[2017-04-25 10:45:06.760779] W [glusterfsd.c:1288:cleanup_and_exit] (-->/usr/sbin/glusterfs(mgmt_getspec_cbk+0x3c1) [0x7f5c85fe6471] -->/usr/sbin/glusterfs(glusterfs_process_volfp+0x1b1) [0x7f5c85fe0831] -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x7f5c85fdfd6b] ) 0-: received signum (1), shutting down
[2017-04-25 10:45:06.760798] I [fuse-bridge.c:5803:fini] 0-fuse: Unmounting '/gluster-mount'.

The mount succeeds when the option is set back to on.

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
3.8.4-23

How reproducible:
-----------------
Every time.
Additional info:
----------------
[root@gqas013 ~]# gluster v info

Volume Name: testvol
Type: Distributed-Replicate
Volume ID: 7f5ae046-00d8-428c-a3f4-75e4f7515a82
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: gqas013.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick0
Brick2: gqas005.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick1
Brick3: gqas006.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick2
Brick4: gqas008.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick3
Options Reconfigured:
performance.rda-cache-limit: 1GB
performance.parallel-readdir: on
server.allow-insecure: on
performance.stat-prefetch: off
transport.address-family: inet
nfs.disable: on
[root@gqas013 ~]#
The bug reproduces only when the rda cache limit is set to > 1GB.
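For reference, the reproduction described above corresponds to the following CLI sequence. Volume and mount-point names are taken from this report; this assumes an existing 2x2 distributed-replicate volume and a GlusterFS cluster, so it is a sketch of the steps rather than something runnable standalone:

```shell
# Enable parallel readdir, then raise the rda cache limit past 1GB:
gluster volume set testvol performance.parallel-readdir on
gluster volume set testvol performance.rda-cache-limit 2GB

# Turn parallel readdir off while the 2GB limit is still configured:
gluster volume set testvol performance.parallel-readdir off

# This FUSE mount now fails readdir-ahead option validation:
mount -t glusterfs gqas013.sbu.lab.eng.bos.redhat.com:/testvol /gluster-mount
```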
Fixing BZ: 1438245 will fix this issue as well.
When parallel readdir is enabled, the effective cache limit is (dist count * 1GB). Say the cache limit was set to 2GB and parallel readdir was then disabled: the mount fails because without parallel readdir there is only a single rda instance, and its configured cache limit of 2GB exceeds the per-instance maximum of 1GB.
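The size check described above can be sketched as follows. This is an illustrative model only, not GlusterFS source; the function and variable names are assumptions made for the sketch:

```shell
# Illustrative model of the rda-cache-limit ceiling described above.
GB=$((1024 * 1024 * 1024))
PER_INSTANCE_MAX=$GB    # readdir-ahead's per-instance ceiling: 1073741824

# effective_max <dist-count> <parallel-readdir on|off>
effective_max() {
    if [ "$2" = "on" ]; then
        # one rda instance per DHT subvolume, so the ceiling scales
        echo $(( $1 * PER_INSTANCE_MAX ))
    else
        # single rda instance: ceiling stays at 1GB
        echo "$PER_INSTANCE_MAX"
    fi
}

limit=$((2 * GB))       # rda-cache-limit 2GB on a 2x2 (dist count 2) volume

if [ "$limit" -le "$(effective_max 2 on)" ]; then
    echo "parallel-readdir on: 2GB accepted"
fi
if [ "$limit" -gt "$(effective_max 2 off)" ]; then
    echo "parallel-readdir off: 2GB out of range [0 - $PER_INSTANCE_MAX]"
fi
```

The same 2GB value passes validation while parallel readdir is on (ceiling 2 * 1GB) and fails once it is turned off (ceiling 1GB), which matches the out-of-range error in the mount log.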
This is reproducible with CIFS as well: enable parallel readdir, set the cache limit to > 2GB, turn off parallel readdir, and try a CIFS mount. The mount fails.
upstream patch : https://review.gluster.org/#/c/17338/
downstream patch : https://code.engineering.redhat.com/gerrit/#/c/106815/
Verified on 3.8.4-27. Subsequent mounts succeed after disabling parallel readdir, even with the rda cache limit set to a high value.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:2774