Bug 1215385 - rmtab file is a bottleneck when lot of clients are accessing a volume through NFS
Summary: rmtab file is a bottleneck when lot of clients are accessing a volume through...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: nfs
Version: 3.7.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Niels de Vos
QA Contact:
URL:
Whiteboard:
Depends On: 1169317
Blocks: glusterfs-3.7.0
 
Reported: 2015-04-26 08:41 UTC by Niels de Vos
Modified: 2015-05-14 17:46 UTC (History)

Fixed In Version: glusterfs-3.7.0
Doc Type: Bug Fix
Doc Text:
Clone Of: 1169317
Environment:
Last Closed: 2015-05-14 17:29:30 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Niels de Vos 2015-04-26 08:41:48 UTC
+++ This bug was initially created as a clone of Bug #1169317 +++

+++ This bug was initially created as a clone of Bug #1166862 +++

Description of problem:

This feature (http://review.gluster.org/#/c/4430/) creates a bottleneck when several clients access an NFS volume.

Our setup:

Gluster 3.5.2 on CentOS 7.

Hardware:

	dual Xeon® CPU E5-2640
	64GB RAM
	SSD for rootfs
	10Gb NIC

Context:

	Around 700 NFS clients, used for small files or VM images.


Version-Release number of selected component (if applicable):

3.5.2

How reproducible:

Always as long as you have enough NFS clients

Steps to Reproduce:
1. Create a volume accessible through Gluster NFS
2. Make it accessible to 700 clients
3. Observe the hangs

Actual results:

NFS clients hang intermittently (every minute, for about 10 s each time). Even an "rpcinfo -t server nfs 3" hangs.

The Gluster NFS process eats up the server's CPU.

Expected results:

No hanging

Additional info:

The cause:

The rmtab file located in /var/lib/glusterd/nfs/ is flushed from memory to /var/lib/glusterd/nfs/rmtab.tmp. While this flush is in progress, the NFS server hangs.
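
To illustrate why such a flush can get expensive with many clients, here is a minimal Python sketch of the pattern described above: the whole client table is written to a ".tmp" file and renamed into place after every change. This is an illustrative model, not the GlusterFS C code. The function names, the "host:export" line format, the fsync call and the client counts are all assumptions made for the example.

```python
import os
import tempfile


def flush_rmtab(entries, path):
    """Rewrite the whole table to path + '.tmp', then rename it into place.

    Returns the number of characters written by this single flush.
    """
    tmp_path = path + ".tmp"
    data = "".join(f"{host}:{export}\n" for host, export in entries)
    with open(tmp_path, "w") as tmp:
        tmp.write(data)
        tmp.flush()
        os.fsync(tmp.fileno())  # assumption: the store syncs before renaming
    os.replace(tmp_path, path)  # atomic rename over the previous file
    return len(data)


def simulate_mounts(num_clients, path):
    """Mount num_clients one by one, flushing the full table after each."""
    entries = []
    total_written = 0
    for i in range(num_clients):
        entries.append((f"client{i:04d}", "/vol"))  # hypothetical names
        total_written += flush_rmtab(entries, path)
    return total_written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        rmtab = os.path.join(workdir, "rmtab")
        for n in (70, 700):
            print(n, "clients:", simulate_mounts(n, rmtab), "characters written")
```

Because every change rewrites all entries, the total amount written grows roughly with the square of the number of clients: ten times the clients means about a hundred times the data, plus one sync per change.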

Workaround:

Move the file to memory-backed storage (/dev/shm) for faster I/O using this option:

set nfs.mount-rmtab: /dev/shm/glusterfs.rmtab
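
For reference, a small Python sketch of how this option could be applied from a script, assuming the usual "gluster volume set <VOLNAME> <KEY> <VALUE>" command form. The helper name rmtab_set_command and the volume name "myvol" are hypothetical; the sketch only builds and prints the command rather than running it.

```python
import shlex


def rmtab_set_command(volume, rmtab_path):
    """Build the argv for: gluster volume set <VOLNAME> nfs.mount-rmtab <PATH>."""
    if not volume:
        raise ValueError("volume name must not be empty")
    return ["gluster", "volume", "set", volume, "nfs.mount-rmtab", rmtab_path]


if __name__ == "__main__":
    # "myvol" is a hypothetical volume name; substitute your own.
    argv = rmtab_set_command("myvol", "/dev/shm/glusterfs.rmtab")
    print(shlex.join(argv))
    # To apply it on a Gluster server: subprocess.run(argv, check=True)
```

Note that anything under /dev/shm is lost on reboot, so the client list kept there does not survive a server restart.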

Result:

We still see some hangs, but they now last only ~300 ms, and the server's load average is much better.

Personal thought:

This feature is not usable and should be disabled by default.

Load average and disk usage before and after using SHM for rmtab are attached.

--- Additional comment from Anand Avati on 2014-12-01 11:32:14 CET ---

REVIEW: http://review.gluster.org/9223 (nfs: make it possible to disable nfs.mount-rmtab) posted (#2) for review on master by Niels de Vos (ndevos)

--- Additional comment from Anand Avati on 2014-12-01 15:18:03 CET ---

REVIEW: http://review.gluster.org/9223 (nfs: make it possible to disable nfs.mount-rmtab) posted (#4) for review on master by Niels de Vos (ndevos)

--- Additional comment from Anand Avati on 2014-12-02 12:14:50 CET ---

REVIEW: http://review.gluster.org/9223 (nfs: make it possible to disable nfs.mount-rmtab) posted (#5) for review on master by Niels de Vos (ndevos)

--- Additional comment from Anand Avati on 2014-12-05 22:30:27 CET ---

REVIEW: http://review.gluster.org/9223 (nfs: make it possible to disable nfs.mount-rmtab) posted (#6) for review on master by Niels de Vos (ndevos)

--- Additional comment from Anand Avati on 2015-04-26 10:40:25 CEST ---

COMMIT: http://review.gluster.org/9223 committed in master by Niels de Vos (ndevos) 
------
commit 331ef6e1a86bfc0a93f8a9dec6ad35c417873849
Author: Niels de Vos <ndevos>
Date:   Tue Dec 2 10:54:53 2014 +0100

    nfs: make it possible to disable nfs.mount-rmtab
    
    When there are many NFS-clients doing very often mount/unmount actions,
    the updating of the 'rmtab' can become a bottleneck and cause delays. In
    these situations, the output of 'showmount' may be less important than
    the responsiveness of the (un)mounting.
    
    By setting 'nfs.mount-rmtab' to the value "/-", the cache file is not
    updated anymore, and the entries are only kept in memory.
    
    BUG: 1169317
    Change-Id: I40c4d8d754932f86fb2b1b2588843390464c773d
    Reported-by: Cyril Peponnet <cyril>
    Signed-off-by: Niels de Vos <ndevos>
    Reviewed-on: http://review.gluster.org/9223
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: soumya k <skoduri>
    Reviewed-by: jiffin tony Thottan <jthottan>
    Reviewed-by: Kaleb KEITHLEY <kkeithle>
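
The behaviour described in this commit message can be pictured with a toy Python model: when the configured path is "/-", changes to the mount list are kept in memory and no cache file is written. This is only a sketch of the idea, not the actual patch. The class MountList, its methods and the "host:export" line format are hypothetical.

```python
DISABLED = "/-"  # value named in the commit message


class MountList:
    """Toy model of a mount list with an optional on-disk cache file."""

    def __init__(self, rmtab_path):
        self.rmtab_path = rmtab_path
        self.entries = []
        self.flushes = 0  # number of times the cache file was rewritten

    def _persist(self):
        if self.rmtab_path == DISABLED:
            return  # cache file disabled: entries stay in memory only
        with open(self.rmtab_path, "w") as cache:
            for host, export in self.entries:
                cache.write(f"{host}:{export}\n")
        self.flushes += 1

    def mount(self, host, export):
        self.entries.append((host, export))
        self._persist()

    def unmount(self, host, export):
        self.entries.remove((host, export))
        self._persist()
```

With MountList("/-"), mount and unmount calls never touch the disk, which is the trade-off the commit message names: faster (un)mounting at the cost of a client list that lives only in that process's memory.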

Comment 1 Anand Avati 2015-04-26 08:43:37 UTC
REVIEW: http://review.gluster.org/10379 (nfs: make it possible to disable nfs.mount-rmtab) posted (#1) for review on release-3.7 by Niels de Vos (ndevos)

Comment 2 Anand Avati 2015-05-03 15:18:43 UTC
REVIEW: http://review.gluster.org/10379 (nfs: make it possible to disable nfs.mount-rmtab) posted (#2) for review on release-3.7 by Niels de Vos (ndevos)

Comment 3 Anand Avati 2015-05-03 18:16:22 UTC
COMMIT: http://review.gluster.org/10379 committed in release-3.7 by Vijay Bellur (vbellur) 
------
commit 40407afb529f6e5fa2f79e9778c2f527122d75eb
Author: Niels de Vos <ndevos>
Date:   Sun Apr 26 10:42:53 2015 +0200

    nfs: make it possible to disable nfs.mount-rmtab
    
    When there are many NFS-clients doing very often mount/unmount actions,
    the updating of the 'rmtab' can become a bottleneck and cause delays. In
    these situations, the output of 'showmount' may be less important than
    the responsiveness of the (un)mounting.
    
    By setting 'nfs.mount-rmtab' to the value "/-", the cache file is not
    updated anymore, and the entries are only kept in memory.
    
    Cherry picked from commit 331ef6e1a86bfc0a93f8a9dec6ad35c417873849:
    > BUG: 1169317
    > Change-Id: I40c4d8d754932f86fb2b1b2588843390464c773d
    > Reported-by: Cyril Peponnet <cyril>
    > Signed-off-by: Niels de Vos <ndevos>
    > Reviewed-on: http://review.gluster.org/9223
    > Tested-by: Gluster Build System <jenkins.com>
    > Reviewed-by: soumya k <skoduri>
    > Reviewed-by: jiffin tony Thottan <jthottan>
    > Reviewed-by: Kaleb KEITHLEY <kkeithle>
    
    This change also contains the fixes to the test-case from:
    >
    > nfs: fix spurious failure in bug-1166862.t
    >
    > In some environments, "showmount" could return an NFS-client that does
    > not start with "1". This would cause the test-case to fail. The check is
    > incorrect, the number of lines should get counted instead.
    >
    > Also moving the test-case to the .../nfs/... subdirectory.
    >
    > Cherry picked from commit ee9b35a780607daddc2832b9af5ed6bf414aebc0:
    > BUG: 1166862
    > Change-Id: Ic03aa8145ca57d78aea01564466e924b03bb302a
    > Signed-off-by: Niels de Vos <ndevos>
    > Reviewed-on: http://review.gluster.org/10419
    > Tested-by: Gluster Build System <jenkins.com>
    > Reviewed-by: Vijay Bellur <vbellur>
    >
    
    Change-Id: I40c4d8d754932f86fb2b1b2588843390464c773d
    BUG: 1215385
    Signed-off-by: Niels de Vos <ndevos>
    Reviewed-on: http://review.gluster.org/10379
    Tested-by: NetBSD Build System
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>
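
The test-case fix quoted above replaces a check on the first character of the client address with a line count. A short Python sketch of the difference, assuming header-free output with one client per line; both function names and the sample addresses are made up for illustration and are not taken from the test script.

```python
def fragile_check(showmount_output):
    """Flawed idea: assume every client address starts with '1'."""
    lines = [line for line in showmount_output.splitlines() if line.strip()]
    return any(line.startswith("1") for line in lines)


def count_clients(showmount_output):
    """Robust idea: count the non-empty lines, whatever they look like."""
    return sum(1 for line in showmount_output.splitlines() if line.strip())


if __name__ == "__main__":
    for sample in ("192.168.0.5\n", "2001:db8::1\n", "client.example.com\n"):
        print(repr(sample), fragile_check(sample), count_clients(sample))
```

An IPv6 address or a hostname fails the first check even though exactly one client is mounted, while the line count is the same for all three samples.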

Comment 4 Niels de Vos 2015-05-14 17:29:30 UTC
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.7.0, please open a new bug report.

glusterfs-3.7.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/10939
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

