Bug 1169320

Summary: rmtab file is a bottleneck when lot of clients are accessing a volume through NFS
Product: [Community] GlusterFS Reporter: Niels de Vos <ndevos>
Component: nfs Assignee: Jiffin <jthottan>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: urgent Docs Contact:
Priority: unspecified    
Version: mainline CC: bugs
Target Milestone: --- Keywords: Triaged
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1166862 Environment:
Last Closed: 2018-11-20 06:25:33 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1169317    
Bug Blocks: 1163723    

Description Niels de Vos 2014-12-01 10:35:53 UTC
+++ This bug was initially created as a clone of Bug #1166862 +++

Description of problem:

This feature: http://review.gluster.org/#/c/4430/

creates a bottleneck when several clients are accessing an NFS volume.

On our setup:

Gluster 3.5.2 on CentOS 7.

Hardware:

	dual Xeon® CPU E5-2640
	64GB RAM
	SSD for rootfs
	10Gb NIC

Context:

	Around 700 NFS clients, used for small files or VM images.


Version-Release number of selected component (if applicable):

3.5.2

How reproducible:

Always, as long as you have enough NFS clients.

Steps to Reproduce:
1. Create a volume accessible through gluster nfs
2. Make it accessible for 700 clients
3. See how it hangs

Actual results:

NFS clients hang intermittently (every minute, for 10 s each time). Even an "rpcinfo -t server nfs 3" will hang.

The Gluster NFS process literally eats the CPU of the server.
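
To make hangs like these visible from a client, a small timing wrapper around the rpcinfo call can help. This is only a sketch: the helper name time_cmd and the host name "server" are placeholders, and the one-second resolution is enough for multi-second hangs but not for sub-second ones.

```shell
# Time one invocation of a command and report how long it took.
# Resolution is one second, enough to spot hangs of several seconds.
time_cmd() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    status=$?
    end=$(date +%s)
    echo "elapsed=$((end - start))s status=$status"
}

# Example (not run here): probe once per second; "server" is a placeholder.
# while true; do time_cmd rpcinfo -t server nfs 3; sleep 1; done
```

Any probe that takes several seconds to return points at one of the hang windows described above.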

Expected results:

No hanging

Additional info:

The cause:

The rmtab file located in /var/lib/glusterd/nfs/ is flushed from memory to /var/lib/glusterd/nfs/rmtab.tmp. During this time, the NFS server literally hangs.

Workaround:

Move the file to memory for faster I/O using this option:

set nfs.mount-rmtab: /dev/shm/glusterfs.rmtab
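
As a sketch, the option above could be applied with the gluster CLI roughly as follows. The volume name myvolume is a placeholder, and the DRY_RUN switch exists only for illustration: by default the snippet prints the command instead of running it.

```shell
# Hypothetical volume name; replace with your own.
VOLNAME="${VOLNAME:-myvolume}"
# Path taken from the report; /dev/shm is a tmpfs on most Linux systems,
# so the file does not survive a reboot.
RMTAB_PATH="/dev/shm/glusterfs.rmtab"

# Print the command by default; set DRY_RUN=0 to run it on a Gluster server.
set_rmtab() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "gluster volume set $VOLNAME nfs.mount-rmtab $RMTAB_PATH"
    else
        gluster volume set "$VOLNAME" nfs.mount-rmtab "$RMTAB_PATH"
    fi
}

set_rmtab
```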

Result:

We still see some hangs, but only for ~300 ms now, and the load average of the server is much better.

Personal thought:

This feature is not usable and should be disabled by default.

You can find attached the load average and disk usage before and after using SHM for rmtab.

--- Additional comment from Cyril Peponnet on 2014-11-21 19:53:50 CET ---



--- Additional comment from Anand Avati on 2014-12-01 11:29:51 CET ---

REVIEW: http://review.gluster.org/9223 (nfs: make it possible to disable nfs.mount-rmtab) posted (#1) for review on master by Niels de Vos (ndevos)

--- Additional comment from Niels de Vos on 2014-12-01 11:34:37 CET ---

Note that the proposed change in comment #2 is actually for the master branch, not for release-3.5. When it has been merged in the master branch, we can backport the fix to release-3.5 and release-3.6.

Comment 1 Muthu Vigneshwaran 2016-08-23 13:01:19 UTC
Hi,

'GlusterFS-3.6 is nearing its End-Of-Life; only important security bugs still stand a chance of getting fixed. Moving this to the mainline 'version'. If this needs to get fixed in 3.7 or 3.8, this bug should get cloned.'