Bug 1480315 - [EC+NFS]: nfs mount or lookup (via df or ls) taking longer time on EC volume
Status: NEW
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: gluster-nfs
Version: 3.3
Hardware: x86_64 Linux
Priority: unspecified
Severity: urgent
Assigned To: Niels de Vos
QA Contact: Manisha Saini
Depends On:
Blocks:
 
Reported: 2017-08-10 12:34 EDT by Rahul Hinduja
Modified: 2017-09-28 13:00 EDT
CC: 6 users

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
fusepackettrace (35.38 KB, application/octet-stream)
2017-08-11 01:08 EDT, Jiffin
nfspackettrace (347.57 KB, application/octet-stream)
2017-08-11 01:09 EDT, Jiffin

Description Rahul Hinduja 2017-08-10 12:34:49 EDT
Description of problem:
=======================

After running my general regression sanity test (geo-rep) on an EC volume via an NFS mount, I observed that the NFS mount takes a long time during mount and lookup operations.

Following is a comparison between NFS and FUSE:

Fuse:
=====

[root@dhcp37-115 ~]# time mount -t glusterfs 10.70.42.79:/master /mnt/fuse

real	0m0.270s
user	0m0.038s
sys	0m0.080s
[root@dhcp37-115 ~]#

[root@dhcp37-115 ~]# time df
Filesystem                        1K-blocks    Used Available Use% Mounted on
/dev/mapper/rhel_dhcp37--115-root  17811456 1700664  16110792  10% /
devtmpfs                            3994208       0   3994208   0% /dev
tmpfs                               4005152       0   4005152   0% /dev/shm
tmpfs                               4005152    8552   3996600   1% /run
tmpfs                               4005152       0   4005152   0% /sys/fs/cgroup
/dev/sda1                           1038336  189360    848976  19% /boot
tmpfs                                801032       0    801032   0% /run/user/0
10.70.42.79:/master               104857600       0 104857600   0% /mnt/fuse

real	0m0.034s
user	0m0.000s
sys	0m0.005s
[root@dhcp37-115 ~]# 


NFS:
====

[root@dhcp37-115 ~]# time mount -t nfs -o vers=3 10.70.42.79:/master /mnt/nfs/

real	0m25.786s
user	0m0.002s
sys	0m0.024s
[root@dhcp37-115 ~]# time df
Filesystem                        1K-blocks    Used Available Use% Mounted on
/dev/mapper/rhel_dhcp37--115-root  17811456 1700180  16111276  10% /
devtmpfs                            3994208       0   3994208   0% /dev
tmpfs                               4005152       0   4005152   0% /dev/shm
tmpfs                               4005152    8556   3996596   1% /run
tmpfs                               4005152       0   4005152   0% /sys/fs/cgroup
/dev/sda1                           1038336  189360    848976  19% /boot
tmpfs                                801032       0    801032   0% /run/user/0
10.70.42.79:/master               104857600       0 104857600   0% /mnt/fuse
10.70.42.79:/master               104857600       0 104857600   0% /mnt/nfs

real	0m14.890s
user	0m0.000s
sys	0m0.007s
[root@dhcp37-115 ~]# 

The gluster nfs.log does not show any messages during these operations. I have also seen this issue twice on the 38 build.

The first time I observed this, disabling eager-lock improved the performance. Alternatively, rebooting the client resolved the issue.
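
For reference, a minimal sketch of the eager-lock workaround mentioned above, using the standard gluster CLI against the `master` volume from this report. On dispersed (EC) volumes the relevant option is `disperse.eager-lock`; verify the exact option name on your build with `gluster volume set help` before applying:

```shell
# Disable eager-lock on the EC volume (sketch; option name assumed
# to be disperse.eager-lock on dispersed volumes).
gluster volume set master disperse.eager-lock off

# Confirm the current value of the option.
gluster volume get master disperse.eager-lock
```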


Version-Release number of selected component (if applicable):
=============================================================

glusterfs-3.8.4-38.el7rhgs.x86_64


How reproducible:
=================

2/2, but only after the complete test automation suite on EC over NFS has finished.

Automation sanity does the following:

1. Creates the distributed-dispersed master volume from 6 nodes (2x(4+2))
2. Creates the DR volume (2x2)
3. Creates the geo-rep session and mounts the master volume
4. Creates data using crefi (create, chmod, chown, chgrp, hardlink, symlink, truncate, rename, remove) during changelog, xsync, and history crawls

After every fop at step 4, a checksum is calculated on the master and slave volumes. Once they match, the suite moves on to the next fop.
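
The setup in steps 1-3 can be sketched with standard gluster commands. This is an illustration only: the node and brick names below are placeholders, not taken from this report, and the slave host/volume names are assumptions.

```shell
# Step 1: distributed-dispersed master volume, 2 x (4+2) across 6 nodes
# (12 bricks total, two bricks per node in this sketch).
gluster volume create master disperse 6 redundancy 2 \
    node{1..6}:/bricks/b1 node{1..6}:/bricks/b2
gluster volume start master

# Step 2: DR (slave) volume, 2x2 distributed-replicate (4 bricks).
gluster volume create slave replica 2 node{1..4}:/bricks/s1
gluster volume start slave

# Step 3: geo-rep session from master to slave, then an NFS v3 mount
# of the master volume (as in the transcripts above).
gluster volume geo-replication master slavehost::slave create push-pem
gluster volume geo-replication master slavehost::slave start
mount -t nfs -o vers=3 node1:/master /mnt/nfs
```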
Comment 5 Jiffin 2017-08-11 01:08 EDT
Created attachment 1311954 [details]
fusepackettrace
Comment 6 Jiffin 2017-08-11 01:09 EDT
Created attachment 1311956 [details]
nfspackettrace
