Bug 1234096 - rmtab file is a bottleneck when lot of clients are accessing a volume through NFS
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: nfs
Version: 3.6.3
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Niels de Vos
QA Contact:
URL:
Whiteboard:
Depends On: 1169317
Blocks: glusterfs-3.6.4 glusterfs-3.6.5
 
Reported: 2015-06-21 10:06 UTC by Niels de Vos
Modified: 2015-12-01 16:45 UTC (History)

Fixed In Version: glusterfs-3.6.5
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-08-27 13:06:38 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Niels de Vos 2015-06-21 10:06:34 UTC
+++ This bug was initially created as a clone of Bug #1166862 +++

Description of problem:

The feature introduced by http://review.gluster.org/#/c/4430/ creates a bottleneck when several clients access an NFS volume.

Our setup:

GlusterFS 3.5.2 on CentOS 7.

Hardware:

	dual Xeon® CPU E5-2640
	64GB RAM
	SSD for rootfs
	10Gb NIC

Context:

	Around 700 NFS clients, used for small files or VM images.


Version-Release number of selected component (if applicable):

3.5.2

How reproducible:

Always, as long as you have enough NFS clients.

Steps to Reproduce:
1. Create a volume accessible through Gluster NFS
2. Make it accessible to 700 clients
3. Observe the hangs

Actual results:

NFS clients hang intermittently (every minute, for about 10 s each time). Even "rpcinfo -t server nfs 3" hangs.

The Gluster NFS process eats up the CPU of the server.

Expected results:

No hanging

Additional info:

The cause:

The rmtab file located in /var/lib/glusterd/nfs/ is flushed from memory to /var/lib/glusterd/nfs/rmtab.tmp. During this time, the NFS server literally hangs.
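
To illustrate why this kind of flush can become expensive with many clients, here is a minimal sketch in Python. It is a simplified model, not the Gluster NFS code: it assumes the whole mount list is rewritten to a ".tmp" file and then moved into place on every mount, and the function names and the "host:export" line format are made up for the example.

```python
import os
import tempfile


def rewrite_rmtab(path, entries):
    """Rewrite the full mount list to a temp file, then move it into place.

    Returns the number of lines written (hypothetical helper).
    """
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        for host, export in entries:
            f.write(f"{host}:{export}\n")
    os.replace(tmp_path, path)
    return len(entries)


def simulate_mounts(path, n_clients):
    """Mount n_clients one by one, rewriting the whole file each time."""
    entries = []
    lines_written = 0
    for i in range(n_clients):
        entries.append((f"client{i}", "/vol"))
        lines_written += rewrite_rmtab(path, entries)
    return lines_written


with tempfile.TemporaryDirectory() as d:
    # 700 clients, as in the setup described in this report
    print(simulate_mounts(os.path.join(d, "rmtab"), 700))  # prints 245350
```

Under this assumption the total work grows quadratically with the number of clients: 700 mounts write 245350 lines in total, even though the final file holds only 700.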

Workaround:

Move the file to memory for faster I/O using this option:

set nfs.mount-rmtab: /dev/shm/glusterfs.rmtab
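
For reference, volume options are normally applied with "gluster volume set VOLNAME OPTION VALUE". The sketch below only builds and prints that command line; the volume name "myvol" and the helper name are placeholders that do not come from this report.

```python
import shlex


def rmtab_set_command(volume, rmtab_path):
    """Build the argv for setting nfs.mount-rmtab on a volume (hypothetical helper)."""
    return ["gluster", "volume", "set", volume, "nfs.mount-rmtab", rmtab_path]


# "myvol" is a placeholder volume name
print(shlex.join(rmtab_set_command("myvol", "/dev/shm/glusterfs.rmtab")))
# prints: gluster volume set myvol nfs.mount-rmtab /dev/shm/glusterfs.rmtab
```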

Result:

We still have some hangs, but they last only ~300 ms now, and the load average of the server is much better.

Personal thought:

This feature is not usable and should be disabled by default.

Attached are the load average and disk usage before and after using SHM for rmtab.

Comment 1 Anand Avati 2015-06-21 10:22:27 UTC
REVIEW: http://review.gluster.org/11335 (nfs: make it possible to disable nfs.mount-rmtab) posted (#1) for review on release-3.6 by Niels de Vos (ndevos)

Comment 2 Anand Avati 2015-08-14 09:13:13 UTC
REVIEW: http://review.gluster.org/11335 (nfs: make it possible to disable nfs.mount-rmtab) posted (#2) for review on release-3.6 by Raghavendra Bhat (raghavendra)

Comment 3 Anand Avati 2015-08-18 10:40:30 UTC
REVIEW: http://review.gluster.org/11335 (nfs: make it possible to disable nfs.mount-rmtab) posted (#3) for review on release-3.6 by Raghavendra Bhat (raghavendra)

Comment 4 Anand Avati 2015-08-18 13:20:03 UTC
COMMIT: http://review.gluster.org/11335 committed in release-3.6 by Raghavendra Bhat (raghavendra) 
------
commit 21643f8427be22ab7e512acf6c6368eb8af1ec9d
Author: Niels de Vos <ndevos>
Date:   Sun Jun 21 12:00:52 2015 +0200

    nfs: make it possible to disable nfs.mount-rmtab
    
    When there are many NFS-clients doing very often mount/unmount actions,
    the updating of the 'rmtab' can become a bottleneck and cause delays. In
    these situations, the output of 'showmount' may be less important than
    the responsiveness of the (un)mounting.
    
    By setting 'nfs.mount-rmtab' to the value "/-", the cache file is not
    updated anymore, and the entries are only kept in memory.
    
    Cherry picked from commit 40407afb529f6e5fa2f79e9778c2f527122d75eb:
    > Cherry picked from commit 331ef6e1a86bfc0a93f8a9dec6ad35c417873849:
    >> BUG: 1169317
    >> Change-Id: I40c4d8d754932f86fb2b1b2588843390464c773d
    >> Reported-by: Cyril Peponnet <cyril>
    >> Signed-off-by: Niels de Vos <ndevos>
    >> Reviewed-on: http://review.gluster.org/9223
    >> Tested-by: Gluster Build System <jenkins.com>
    >> Reviewed-by: soumya k <skoduri>
    >> Reviewed-by: jiffin tony Thottan <jthottan>
    >> Reviewed-by: Kaleb KEITHLEY <kkeithle>
    >
    > This change also contains the fixes to the test-case from:
    >>
    >> nfs: fix spurious failure in bug-1166862.t
    >>
    >> In some environments, "showmount" could return an NFS-client that does
    >> not start with "1". This would cause the test-case to fail. The check is
    >> incorrect, the number of lines should get counted instead.
    >>
    >> Also moving the test-case to the .../nfs/... subdirectory.
    >>
    >> Cherry picked from commit ee9b35a780607daddc2832b9af5ed6bf414aebc0:
    >> BUG: 1166862
    >> Change-Id: Ic03aa8145ca57d78aea01564466e924b03bb302a
    >> Signed-off-by: Niels de Vos <ndevos>
    >> Reviewed-on: http://review.gluster.org/10419
    >> Tested-by: Gluster Build System <jenkins.com>
    >> Reviewed-by: Vijay Bellur <vbellur>
    >>
    >
    > Change-Id: I40c4d8d754932f86fb2b1b2588843390464c773d
    > BUG: 1215385
    > Signed-off-by: Niels de Vos <ndevos>
    > Reviewed-on: http://review.gluster.org/10379
    > Tested-by: NetBSD Build System
    > Tested-by: Gluster Build System <jenkins.com>
    > Reviewed-by: Vijay Bellur <vbellur>
    
    Change-Id: I40c4d8d754932f86fb2b1b2588843390464c773d
    BUG: 1234096
    Signed-off-by: Niels de Vos <ndevos>
    Reviewed-on: http://review.gluster.org/11335
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Raghavendra Bhat <raghavendra>
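
The behaviour described in the commit message, where the value "/-" stops updates to the cache file while entries stay in memory, can be sketched roughly as follows. This is an illustrative Python model under that reading, not the actual C implementation; the class, its method names, and the "client:export" line format are hypothetical.

```python
import os
import tempfile

RMTAB_DISABLED = "/-"  # value named in the commit message


class MountList:
    """In-memory list of (client, export) pairs with an optional cache file."""

    def __init__(self, rmtab_path):
        self.rmtab_path = rmtab_path
        self.entries = []

    def add(self, client, export):
        """Record a mount; return True if the cache file was rewritten."""
        self.entries.append((client, export))
        return self._persist()

    def _persist(self):
        if self.rmtab_path == RMTAB_DISABLED:
            return False  # entries are kept in memory only
        with open(self.rmtab_path, "w") as f:
            for client, export in self.entries:
                f.write(f"{client}:{export}\n")
        return True


with tempfile.TemporaryDirectory() as d:
    cached = MountList(os.path.join(d, "rmtab"))
    print(cached.add("client1", "/vol"))    # prints True
    disabled = MountList(RMTAB_DISABLED)
    print(disabled.add("client1", "/vol"))  # prints False
```

In this model the trade-off is the one the commit message names: with the cache file disabled, mounts no longer pay for a file rewrite, but the list cannot be read back from disk, so "showmount" output may be less complete.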

Comment 5 Raghavendra Bhat 2015-08-27 13:06:38 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.6.5, please open a new bug report.

glusterfs-3.6.5 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://www.gluster.org/pipermail/gluster-devel/2015-August/046570.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

