Description of problem:
-----------------------
2x2 volume, trying to mount via FUSE. Enable parallel readdir, then set the cache limit to > 1GB. Turn off parallel readdir and try to mount the volume. The mount fails.

Snippet from mount logs:

[2017-04-25 10:45:06.698688] I [MSGID: 100030] [glusterfsd.c:2417:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.8.4 (args: /usr/sbin/glusterfs --volfile-server=gqas013.sbu.lab.eng.bos.redhat.com --volfile-id=/testvol /gluster-mount)
[2017-04-25 10:45:06.706458] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-04-25 10:45:06.760460] E [MSGID: 101028] [options.c:168:xlator_option_validate_sizet] 0-testvol-readdir-ahead: '2147483648' in 'option rda-cache-limit 2GB' is out of range [0 - 1073741824]
[2017-04-25 10:45:06.760476] W [MSGID: 101029] [options.c:945:xl_opt_validate] 0-testvol-readdir-ahead: validate of rda-cache-limit returned -1
[2017-04-25 10:45:06.760484] E [MSGID: 101090] [graph.c:301:glusterfs_graph_validate_options] 0-testvol-readdir-ahead: validation failed: '2147483648' in 'option rda-cache-limit 2GB' is out of range [0 - 1073741824]
[2017-04-25 10:45:06.760490] E [MSGID: 101090] [graph.c:672:glusterfs_graph_activate] 0-graph: validate options failed
[2017-04-25 10:45:06.760779] W [glusterfsd.c:1288:cleanup_and_exit] (-->/usr/sbin/glusterfs(mgmt_getspec_cbk+0x3c1) [0x7f5c85fe6471] -->/usr/sbin/glusterfs(glusterfs_process_volfp+0x1b1) [0x7f5c85fe0831] -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x7f5c85fdfd6b] ) 0-: received signum (1), shutting down
[2017-04-25 10:45:06.760798] I [fuse-bridge.c:5803:fini] 0-fuse: Unmounting '/gluster-mount'.

The mount succeeds when the option is set back to on.

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
3.8.4-23

How reproducible:
-----------------
Every time.
Additional info:
----------------
[root@gqas013 ~]# gluster v info

Volume Name: testvol
Type: Distributed-Replicate
Volume ID: 7f5ae046-00d8-428c-a3f4-75e4f7515a82
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: gqas013.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick0
Brick2: gqas005.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick1
Brick3: gqas006.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick2
Brick4: gqas008.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick3
Options Reconfigured:
performance.rda-cache-limit: 1GB
performance.parallel-readdir: on
server.allow-insecure: on
performance.stat-prefetch: off
transport.address-family: inet
nfs.disable: on
[root@gqas013 ~]#
The bug reproduces only when the rda cache limit is set to > 1GB.
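For reference, the reproduction described above corresponds to the following CLI sequence. Volume and mount-point names are taken from this report; this assumes an existing 2x2 distributed-replicate volume and a GlusterFS cluster, so it is a sketch of the steps rather than something runnable standalone:

```shell
# Enable parallel readdir, then raise the rda cache limit past 1GB:
gluster volume set testvol performance.parallel-readdir on
gluster volume set testvol performance.rda-cache-limit 2GB

# Turn parallel readdir off while the 2GB limit is still configured:
gluster volume set testvol performance.parallel-readdir off

# This FUSE mount now fails readdir-ahead option validation:
mount -t glusterfs gqas013.sbu.lab.eng.bos.redhat.com:/testvol /gluster-mount
```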
Fixing BZ: 1438245 will fix this issue as well.
When parallel readdir is enabled, the effective cache limit is (dist count * 1GB). Say the cache limit was set to 2GB and parallel readdir was then disabled: the mount fails because without parallel readdir there is only a single rda instance, and its configured cache limit of 2GB exceeds the per-instance maximum of 1GB.
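The size check described above can be sketched as follows. This is an illustrative model only, not GlusterFS source; the function and variable names are assumptions made for the sketch:

```shell
# Illustrative model of the rda-cache-limit ceiling described above.
GB=$((1024 * 1024 * 1024))
PER_INSTANCE_MAX=$GB    # readdir-ahead's per-instance ceiling: 1073741824

# effective_max <dist-count> <parallel-readdir on|off>
effective_max() {
    if [ "$2" = "on" ]; then
        # one rda instance per DHT subvolume, so the ceiling scales
        echo $(( $1 * PER_INSTANCE_MAX ))
    else
        # single rda instance: ceiling stays at 1GB
        echo "$PER_INSTANCE_MAX"
    fi
}

limit=$((2 * GB))       # rda-cache-limit 2GB on a 2x2 (dist count 2) volume

if [ "$limit" -le "$(effective_max 2 on)" ]; then
    echo "parallel-readdir on: 2GB accepted"
fi
if [ "$limit" -gt "$(effective_max 2 off)" ]; then
    echo "parallel-readdir off: 2GB out of range [0 - $PER_INSTANCE_MAX]"
fi
```

The same 2GB value passes validation while parallel readdir is on (ceiling 2 * 1GB) and fails once it is turned off (ceiling 1GB), which matches the out-of-range error in the mount log.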
This is reproducible with CIFS as well: enable parallel readdir, set the cache limit to > 2GB, turn off parallel readdir, and try a CIFS mount. The mount fails.
upstream patch : https://review.gluster.org/#/c/17338/
downstream patch : https://code.engineering.redhat.com/gerrit/#/c/106815/
Verified on 3.8.4-27. Subsequent mounts succeed after disabling parallel readdir, even with the rda cache limit set to a high value.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:2774