Bug 1445246 - [Parallel Readdir] : Mounts fail when performance.parallel-readdir is set to "off"
Summary: [Parallel Readdir] : Mounts fail when performance.parallel-readdir is set to ...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterfs
Version: rhgs-3.3
Hardware: x86_64
OS: Linux
unspecified
high
Target Milestone: ---
: RHGS 3.3.0
Assignee: Poornima G
QA Contact: Ambarish
URL:
Whiteboard:
Depends On: 1438245
Blocks: 1417151 1446516 1453152
 
Reported: 2017-04-25 10:47 UTC by Ambarish
Modified: 2017-09-21 04:39 UTC (History)

Fixed In Version: glusterfs-3.8.4-26
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1446516
Environment:
Last Closed: 2017-09-21 04:39:40 UTC
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2017:2774 0 normal SHIPPED_LIVE glusterfs bug fix and enhancement update 2017-09-21 08:16:29 UTC

Description Ambarish 2017-04-25 10:47:45 UTC
Description of problem:
-----------------------

2 x 2 volume, trying to mount via FUSE.

Enable parallel readdir, then set the rda cache limit to > 1GB, turn off parallel readdir, and try to mount the volume.

Mount fails.

Snippet from mount logs :

[2017-04-25 10:45:06.698688] I [MSGID: 100030] [glusterfsd.c:2417:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.8.4 (args: /usr/sbin/glusterfs --volfile-server=gqas013.sbu.lab.eng.bos.redhat.com --volfile-id=/testvol /gluster-mount)
[2017-04-25 10:45:06.706458] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-04-25 10:45:06.760460] E [MSGID: 101028] [options.c:168:xlator_option_validate_sizet] 0-testvol-readdir-ahead: '2147483648' in 'option rda-cache-limit 2GB' is out of range [0 - 1073741824]
[2017-04-25 10:45:06.760476] W [MSGID: 101029] [options.c:945:xl_opt_validate] 0-testvol-readdir-ahead: validate of rda-cache-limit returned -1
[2017-04-25 10:45:06.760484] E [MSGID: 101090] [graph.c:301:glusterfs_graph_validate_options] 0-testvol-readdir-ahead: validation failed: '2147483648' in 'option rda-cache-limit 2GB' is out of range [0 - 1073741824]
[2017-04-25 10:45:06.760490] E [MSGID: 101090] [graph.c:672:glusterfs_graph_activate] 0-graph: validate options failed
[2017-04-25 10:45:06.760779] W [glusterfsd.c:1288:cleanup_and_exit] (-->/usr/sbin/glusterfs(mgmt_getspec_cbk+0x3c1) [0x7f5c85fe6471] -->/usr/sbin/glusterfs(glusterfs_process_volfp+0x1b1) [0x7f5c85fe0831] -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x7f5c85fdfd6b] ) 0-: received signum (1), shutting down
[2017-04-25 10:45:06.760798] I [fuse-bridge.c:5803:fini] 0-fuse: Unmounting '/gluster-mount'.


The mount succeeds when performance.parallel-readdir is set to "on" again.
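
The out-of-range error in the mount log can be illustrated with a small Python sketch. This is not the glusterfs code (the real check lives in options.c, per the log); the helper names parse_size and in_range are hypothetical, and the binary units are an assumption inferred from the logged values ('2GB' shown as 2147483648, upper bound 1073741824).

```python
# Illustrative sketch only; NOT the glusterfs implementation.
# Assumption: sizes use binary units, as the logged values suggest.
UNITS = {"KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def parse_size(text):
    """Convert a size string such as '2GB' to bytes (hypothetical helper)."""
    text = text.strip().upper()
    for suffix, factor in UNITS.items():
        if text.endswith(suffix):
            return int(text[: -len(suffix)]) * factor
    return int(text)  # plain byte count, no suffix


def in_range(value, low=0, high=1024 ** 3):
    """Range check using the bounds from the log: [0 - 1073741824]."""
    return low <= value <= high


limit = parse_size("2GB")
print(limit, in_range(limit))
```

With the value from the log, parse_size("2GB") gives 2147483648, which is above the 1073741824 upper bound, so the check fails in the same way the validation error reports.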






Version-Release number of selected component (if applicable):
-------------------------------------------------------------

3.8.4-23

How reproducible:
-----------------

Every which way I try.

Additional info:
---------------

[root@gqas013 ~]# gluster v info
 
Volume Name: testvol
Type: Distributed-Replicate
Volume ID: 7f5ae046-00d8-428c-a3f4-75e4f7515a82
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: gqas013.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick0
Brick2: gqas005.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick1
Brick3: gqas006.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick2
Brick4: gqas008.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick3
Options Reconfigured:
performance.rda-cache-limit: 1GB
performance.parallel-readdir: on
server.allow-insecure: on
performance.stat-prefetch: off
transport.address-family: inet
nfs.disable: on
[root@gqas013 ~]#

Comment 2 Ambarish 2017-04-25 10:49:35 UTC
The bug reproduces only when the rda cache limit is set to > 1GB.

Comment 4 Poornima G 2017-05-08 09:22:07 UTC
Fixing BZ: 1438245 will fix this issue as well.

Comment 5 Poornima G 2017-05-08 09:27:46 UTC
When parallel readdir is enabled, the cache limit is (dist count * 1GB). Let's say the cache limit was set to 2GB and parallel readdir was then disabled: the mount fails because, without parallel readdir, there is only one rda instance, and the configured 2GB cache limit is more than that instance's limit (1GB).
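
A minimal Python sketch of the arithmetic described in this comment, for illustration only. It models the explanation above, not the actual glusterfs code or the fix; the function names max_cache_limit and mount_validates are hypothetical, and the dist count of 2 is taken from the 2 x 2 volume in the description.

```python
# Model of the explanation in this comment; NOT the glusterfs implementation.
GB = 1024 ** 3


def max_cache_limit(parallel_readdir, dist_count):
    """Upper bound for rda-cache-limit as described in the comment:
    dist count * 1GB with parallel readdir, otherwise 1GB."""
    return dist_count * GB if parallel_readdir else GB


def mount_validates(cache_limit, parallel_readdir, dist_count):
    """True if the configured limit falls inside the allowed range."""
    return 0 <= cache_limit <= max_cache_limit(parallel_readdir, dist_count)


dist_count = 2          # 2 x 2 volume from the description
configured = 2 * GB     # limit set while parallel readdir was on

print(mount_validates(configured, True, dist_count))   # parallel readdir on
print(mount_validates(configured, False, dist_count))  # parallel readdir off
```

Under these assumptions the 2GB value is accepted while parallel readdir is on and rejected once it is turned off, which matches the failure sequence reported in the description.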

Comment 6 Vivek Das 2017-05-16 14:40:57 UTC
This is reproducible for CIFS as well.
Enable parallel readdir, then set the cache limit to > 2G, turn off parallel readdir, and try to do a CIFS mount.

Mount fails.

Comment 9 Atin Mukherjee 2017-05-19 06:45:30 UTC
upstream patch : https://review.gluster.org/#/c/17338/

Comment 10 Atin Mukherjee 2017-05-22 11:47:04 UTC
downstream patch : https://code.engineering.redhat.com/gerrit/#/c/106815/

Comment 12 Ambarish 2017-06-09 13:51:54 UTC
Verified on 3.8.4-27.

Subsequent mounts succeed after disabling parallel readdir, even after setting the rda cache limit to a high value.

Comment 14 errata-xmlrpc 2017-09-21 04:39:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2774

