Bug 1480315 - [EC+NFS]: nfs mount or lookup (via df or ls) taking longer time on EC volume
Status: NEW
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: gluster-nfs
Version: 3.3
Hardware: x86_64 Linux
Priority: unspecified
Severity: urgent
Assigned To: Niels de Vos
QA Contact: Manisha Saini
Depends On:
Blocks:
 
Reported: 2017-08-10 12:34 EDT by Rahul Hinduja
Modified: 2017-09-28 13:00 EDT
CC: 6 users

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
fusepackettrace (35.38 KB, application/octet-stream)
2017-08-11 01:08 EDT, Jiffin
nfspackettrace (347.57 KB, application/octet-stream)
2017-08-11 01:09 EDT, Jiffin

Description Rahul Hinduja 2017-08-10 12:34:49 EDT
Description of problem:
=======================

After running my general regression sanity test (geo-rep) on an EC volume via an NFS mount, I observed that the NFS mount takes a long time during mount and lookup operations.

Following is a comparison between NFS and FUSE:

Fuse:
=====

[root@dhcp37-115 ~]# time mount -t glusterfs 10.70.42.79:/master /mnt/fuse

real	0m0.270s
user	0m0.038s
sys	0m0.080s
[root@dhcp37-115 ~]#

[root@dhcp37-115 ~]# time df
Filesystem                        1K-blocks    Used Available Use% Mounted on
/dev/mapper/rhel_dhcp37--115-root  17811456 1700664  16110792  10% /
devtmpfs                            3994208       0   3994208   0% /dev
tmpfs                               4005152       0   4005152   0% /dev/shm
tmpfs                               4005152    8552   3996600   1% /run
tmpfs                               4005152       0   4005152   0% /sys/fs/cgroup
/dev/sda1                           1038336  189360    848976  19% /boot
tmpfs                                801032       0    801032   0% /run/user/0
10.70.42.79:/master               104857600       0 104857600   0% /mnt/fuse

real	0m0.034s
user	0m0.000s
sys	0m0.005s
[root@dhcp37-115 ~]# 


NFS:
====

[root@dhcp37-115 ~]# time mount -t nfs -o vers=3 10.70.42.79:/master /mnt/nfs/

real	0m25.786s
user	0m0.002s
sys	0m0.024s
[root@dhcp37-115 ~]# time df
Filesystem                        1K-blocks    Used Available Use% Mounted on
/dev/mapper/rhel_dhcp37--115-root  17811456 1700180  16111276  10% /
devtmpfs                            3994208       0   3994208   0% /dev
tmpfs                               4005152       0   4005152   0% /dev/shm
tmpfs                               4005152    8556   3996596   1% /run
tmpfs                               4005152       0   4005152   0% /sys/fs/cgroup
/dev/sda1                           1038336  189360    848976  19% /boot
tmpfs                                801032       0    801032   0% /run/user/0
10.70.42.79:/master               104857600       0 104857600   0% /mnt/fuse
10.70.42.79:/master               104857600       0 104857600   0% /mnt/nfs

real	0m14.890s
user	0m0.000s
sys	0m0.007s
[root@dhcp37-115 ~]# 

The gluster nfs.log does not show any messages during these operations. I have also seen this issue twice on the 38 build.

The first time I observed this, disabling eager-lock improved the performance. Alternatively, rebooting the client resolved the issue.
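
For reference, a minimal sketch of the eager-lock workaround mentioned above, using the standard gluster CLI against the `master` volume from this report. On dispersed (EC) volumes the relevant option is `disperse.eager-lock`; verify the exact option name on your build with `gluster volume set help` before applying:

```shell
# Disable eager-lock on the EC volume (sketch; option name assumed
# to be disperse.eager-lock on dispersed volumes).
gluster volume set master disperse.eager-lock off

# Confirm the current value of the option.
gluster volume get master disperse.eager-lock
```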


Version-Release number of selected component (if applicable):
=============================================================

glusterfs-3.8.4-38.el7rhgs.x86_64


How reproducible:
=================

2/2, but only after the complete test automation suite on EC over NFS has finished.

Automation sanity does the following:

1. Creates the distributed-dispersed master volume from 6 nodes (2x(4+2))
2. Creates the DR volume (2x2)
3. Creates the geo-rep session and mounts the master volume
4. Creates data using crefi (create, chmod, chown, chgrp, hardlink, symlink, truncate, rename, remove) during changelog, xsync, and history crawls

After every fop at step 4, a checksum is calculated on the master and slave volumes. Once they match, the suite moves on to the next fop.
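
The setup in steps 1-3 can be sketched with standard gluster commands. This is an illustration only: the node and brick names below are placeholders, not taken from this report, and the slave host/volume names are assumptions.

```shell
# Step 1: distributed-dispersed master volume, 2 x (4+2) across 6 nodes
# (12 bricks total, two bricks per node in this sketch).
gluster volume create master disperse 6 redundancy 2 \
    node{1..6}:/bricks/b1 node{1..6}:/bricks/b2
gluster volume start master

# Step 2: DR (slave) volume, 2x2 distributed-replicate (4 bricks).
gluster volume create slave replica 2 node{1..4}:/bricks/s1
gluster volume start slave

# Step 3: geo-rep session from master to slave, then an NFS v3 mount
# of the master volume (as in the transcripts above).
gluster volume geo-replication master slavehost::slave create push-pem
gluster volume geo-replication master slavehost::slave start
mount -t nfs -o vers=3 node1:/master /mnt/nfs
```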
Comment 5 Jiffin 2017-08-11 01:08 EDT
Created attachment 1311954 [details]
fusepackettrace
Comment 6 Jiffin 2017-08-11 01:09 EDT
Created attachment 1311956 [details]
nfspackettrace
