Bug 1224923

Summary: nfs-ganesha: Getting error with SPECsfs2014 SWBUILD workload
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Manoj Pillai <mpillai>
Component: nfs-ganesha
Assignee: Bug Updates Notification Mailing List <rhs-bugs>
Status: CLOSED CURRENTRELEASE
QA Contact: storage-qa-internal <storage-qa-internal>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: unspecified
CC: jthottan, mpillai, ndevos, nlevinki, psuriset, shberry, skoduri, smohan
Target Milestone: ---
Keywords: Performance
Target Release: ---
Hardware: x86_64   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2017-05-03 11:43:10 UTC
Type: Bug
Attachments:
sfs_rc file for SPECsfs2014 SWBUILD workload (flags: none)

Description Manoj Pillai 2015-05-26 08:42:00 UTC
Created attachment 1029806
sfs_rc file for SPECsfs2014 SWBUILD workload

Description of problem:

The SWBUILD workload in the SPECsfs2014 benchmark fails when running on nfs-ganesha. Runs with kernel NFS complete without any issues. Run failures are seen with both ganesha-gluster and ganesha-vfs.

This is a software build workload (a large number of small files, metadata-intensive). The Ganesha logs show the following message repeatedly:

ganesha.nfsd-1955[cache_lru] lru_run :INODE LRU :CRIT :Futility count exceeded.  The LRU thread is unable to make progress in reclaiming FDs.  Disabling FD cache.
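
To quantify how often this message appears during a run, the log can be scanned with a short script. The following is only a sketch in Python: the regular expression is derived from the single log line quoted above, and the function name and sample lines are illustrative, not part of nfs-ganesha or SPECsfs2014.

```python
import re
from collections import Counter

# Pattern assumed from the one log line quoted in this report;
# other ganesha versions may format the message differently.
FUTILITY_RE = re.compile(
    r"ganesha\.nfsd-(\d+)\[cache_lru\].*Futility count exceeded"
)


def count_futility_messages(lines):
    """Count matching log lines per ganesha.nfsd PID (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = FUTILITY_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Illustrative input: the message from this report, repeated, plus an unrelated line.
sample_lines = [
    "ganesha.nfsd-1955[cache_lru] lru_run :INODE LRU :CRIT :Futility count exceeded.  "
    "The LRU thread is unable to make progress in reclaiming FDs.  Disabling FD cache.",
    "ganesha.nfsd-1955[main] some unrelated message",
    "ganesha.nfsd-1955[cache_lru] lru_run :INODE LRU :CRIT :Futility count exceeded.  "
    "The LRU thread is unable to make progress in reclaiming FDs.  Disabling FD cache.",
]

futility_counts = count_futility_messages(sample_lines)
for pid, count in futility_counts.items():
    print(f"ganesha.nfsd PID {pid}: {count} futility message(s)")
```

For a real run, the same function can be fed an open log file object instead of the sample list; the log file location depends on the local ganesha configuration.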

Version-Release number of selected component (if applicable):

glusterfs-libs-3.7.0-1.el7.x86_64
glusterfs-api-3.7.0-1.el7.x86_64
glusterfs-3.7.0-1.el7.x86_64
glusterfs-fuse-3.7.0-1.el7.x86_64
glusterfs-cli-3.7.0-1.el7.x86_64
nfs-ganesha-gluster-2.2.0-0.el7.centos.x86_64
glusterfs-client-xlators-3.7.0-1.el7.x86_64
glusterfs-server-3.7.0-1.el7.x86_64
glusterfs-ganesha-3.7.0-1.el7.x86_64

nfs-ganesha-vfs-2.2.0-0.el7.centos.x86_64
nfs-ganesha-2.2.0-0.el7.centos.x86_64

kernel-3.10.0-229.el7.x86_64 (RHEL 7.1)

How reproducible:

Consistently.

Steps to Reproduce:
1. Create a single-brick, single-server gluster volume and export it with nfs-ganesha using FSAL gluster or FSAL vfs. Mount it over NFSv4 on the clients used in testing. Install SPECsfs2014 on the clients as per the user guide instructions.
2. Run the SPECsfs2014 benchmark with the attached sfs_rc file, as follows (this rc file is for 6 clients):
python SfsManager -r sfs_rc -s ${RUN_TAG}


Actual results:

Benchmark reports error and exits.

Expected results:

Benchmark runs to completion.

Additional info:

Comment 2 Anand Subramanian 2015-05-27 06:21:55 UTC
Assigning to Niels, as he is currently looking at the ganesha perf issues. Hopefully this is related to the other SPECsfs issue he is already looking at, in which case it will save some debugging effort.

Comment 3 Vivek Agarwal 2015-06-04 07:46:20 UTC
team-nfs

Comment 4 Niels de Vos 2016-06-15 13:55:35 UTC
I am not working on this at the moment, moving back to NEW.

Manoj, could you check if a newer version of RHGS (updated NFS-Ganesha and Gluster) happens to fix this problem?

Comment 5 Manoj Pillai 2016-07-14 06:07:11 UTC
I am hoping to find some time soon to get to this. Don't have an ETA yet.

Comment 9 Shekhar Berry 2017-04-05 10:17:11 UTC
The test was repeated with RHGS 3.2, and the SWBUILD workload of SPECsfs2014 ran.

Version:

glusterfs-libs-3.8.4-18.el7rhgs.x86_64
glusterfs-3.8.4-18.el7rhgs.x86_64
glusterfs-api-3.8.4-18.el7rhgs.x86_64
glusterfs-fuse-3.8.4-18.el7rhgs.x86_64
glusterfs-server-3.8.4-18.el7rhgs.x86_64
glusterfs-client-xlators-3.8.4-18.el7rhgs.x86_64
glusterfs-cli-3.8.4-18.el7rhgs.x86_64
nfs-ganesha-gluster-2.4.1-9.el7rhgs.x86_64
glusterfs-ganesha-3.8.4-18.el7rhgs.x86_64


The same sfs_rc file that is attached to the BZ description above was used for this test as well.
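
To see which packages changed between the original report and this retest, the two version lists can be compared mechanically. The following Python sketch splits each package string into name, version, release and architecture, assuming the usual name-version-release.arch layout. The helper names are hypothetical, and only a subset of the packages listed in this bug is included for brevity.

```python
def split_nvra(pkg):
    """Split 'name-version-release.arch' into its four parts (hypothetical helper)."""
    nvr, arch = pkg.rsplit(".", 1)
    name, version, release = nvr.rsplit("-", 2)
    return name, version, release, arch


def version_changes(old_pkgs, new_pkgs):
    """Map package name to (old, new) version-release for packages in both lists."""
    old = {n: f"{v}-{r}" for n, v, r, _ in map(split_nvra, old_pkgs)}
    new = {n: f"{v}-{r}" for n, v, r, _ in map(split_nvra, new_pkgs)}
    return {n: (old[n], new[n]) for n in old if n in new and old[n] != new[n]}


# Subset of the package lists from the description and from the retest comment.
original = [
    "glusterfs-3.7.0-1.el7.x86_64",
    "glusterfs-server-3.7.0-1.el7.x86_64",
    "nfs-ganesha-gluster-2.2.0-0.el7.centos.x86_64",
]
retest = [
    "glusterfs-3.8.4-18.el7rhgs.x86_64",
    "glusterfs-server-3.8.4-18.el7rhgs.x86_64",
    "nfs-ganesha-gluster-2.4.1-9.el7rhgs.x86_64",
]

changes = version_changes(original, retest)
for name, (before, after) in sorted(changes.items()):
    print(f"{name}: {before} -> {after}")
```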