Created attachment 1029806
sfs_rc file for SPECsfs2014 SWBUILD workload

Description of problem:

The SWBUILD workload in the SPECsfs2014 benchmark fails when running on nfs-ganesha. Runs with kernel NFS complete without any issues. Run failures are seen both with ganesha-gluster and with ganesha-vfs.

This is a software build workload (many small files, metadata-intensive).

The Ganesha logs show the following message repeatedly:

ganesha.nfsd-1955[cache_lru] lru_run :INODE LRU :CRIT :Futility count exceeded. The LRU thread is unable to make progress in reclaiming FDs. Disabling FD cache.

Version-Release number of selected component (if applicable):

glusterfs-libs-3.7.0-1.el7.x86_64
glusterfs-api-3.7.0-1.el7.x86_64
glusterfs-3.7.0-1.el7.x86_64
glusterfs-fuse-3.7.0-1.el7.x86_64
glusterfs-cli-3.7.0-1.el7.x86_64
nfs-ganesha-gluster-2.2.0-0.el7.centos.x86_64
glusterfs-client-xlators-3.7.0-1.el7.x86_64
glusterfs-server-3.7.0-1.el7.x86_64
glusterfs-ganesha-3.7.0-1.el7.x86_64
nfs-ganesha-vfs-2.2.0-0.el7.centos.x86_64
nfs-ganesha-2.2.0-0.el7.centos.x86_64
kernel-3.10.0-229.el7.x86_64 (RHEL 7.1)

How reproducible:

Consistently.

Steps to Reproduce:

1. Set up a single-brick, single-server gluster volume, exported with nfs-ganesha using FSAL gluster or FSAL vfs. The clients used in testing mount it over NFSv4. Install SPECsfs2014 on the clients as per the user guide instructions.
2. Run the SPECsfs2014 benchmark with the attached sfs_rc file (this rc file is for 6 clients), as follows:
   python SfsManager -r sfs_rc -s ${RUN_TAG}

Actual results:

Benchmark reports an error and exits.

Expected results:

Benchmark runs to completion.

Additional info:
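To gauge how often the "Futility count exceeded" message recurs during a run, the Ganesha log can be scanned with a short script. The following is a minimal sketch, not part of the original report: the function name count_futility_events is hypothetical, and the script assumes every relevant log line contains the message text and the "ganesha.nfsd-<pid>[<thread>]" prefix exactly as quoted above.

```python
import re
from collections import Counter

# Message text copied from the log line quoted in this report.
FUTILITY_MARKER = "Futility count exceeded"

# Assumed prefix layout: "ganesha.nfsd-<pid>[<thread>]", as in the quoted line.
PREFIX_RE = re.compile(r"ganesha\.nfsd-(\d+)\[([^\]]+)\]")


def count_futility_events(lines):
    """Count matching log lines, keyed by (pid, thread name).

    Lines that contain the message but not the expected prefix are
    counted under the key ("?", "?").
    """
    counts = Counter()
    for line in lines:
        if FUTILITY_MARKER not in line:
            continue
        match = PREFIX_RE.search(line)
        key = (match.group(1), match.group(2)) if match else ("?", "?")
        counts[key] += 1
    return counts


# Example input: the message quoted in the report plus one unrelated line.
sample_lines = [
    "ganesha.nfsd-1955[cache_lru] lru_run :INODE LRU :CRIT :Futility count "
    "exceeded. The LRU thread is unable to make progress in reclaiming FDs. "
    "Disabling FD cache.",
    "an unrelated log line",
]

for (pid, thread), n in count_futility_events(sample_lines).items():
    print(f"pid={pid} thread={thread} occurrences={n}")
```

To use it on a real log, pass an open file object instead of sample_lines; the location of the Ganesha log file depends on the installation and is not given in this report.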
Assigning to Niels as he is currently looking at the ganesha perf issues. Hopefully this is related to the other SPECsfs issue he is already looking at, in which case it will save some debug effort.
team-nfs
I am not working on this at the moment, moving back to NEW. Manoj, could you check if a newer version of RHGS (updated NFS-Ganesha and Gluster) happens to fix this problem?
I am hoping to find some time soon to get to this. Don't have an ETA yet.
The test was repeated with RHGS 3.2 and it was running for the SWBUILD workload of SPECsfs2014.

Version:

glusterfs-libs-3.8.4-18.el7rhgs.x86_64
glusterfs-3.8.4-18.el7rhgs.x86_64
glusterfs-api-3.8.4-18.el7rhgs.x86_64
glusterfs-fuse-3.8.4-18.el7rhgs.x86_64
glusterfs-server-3.8.4-18.el7rhgs.x86_64
glusterfs-client-xlators-3.8.4-18.el7rhgs.x86_64
glusterfs-cli-3.8.4-18.el7rhgs.x86_64
nfs-ganesha-gluster-2.4.1-9.el7rhgs.x86_64
glusterfs-ganesha-3.8.4-18.el7rhgs.x86_64

The same sfs_rc file given in the BZ description above was used for this test as well.